Statistics The Art and Science of Learning from Data

Alan Agresti, Christine A. Franklin, Bernhard Klingenberg

4th Edition

Chapter 10: Problem 90

The following data refer to a random sample of prize money earned by male and female skiers racing in the 2014/2015 FIS world cup season (in Swiss francs).

Males: 89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800

Females: 73000, 95000, 32400, 4000, 2000, 57100, 4500

Enter the observations (separated by spaces) into the Permutation Test web app. Let male skiers be Group 1 and female skiers be Group 2 and let the test statistic be the difference in sample means. Is there evidence that male skiers earn more, on average, than female skiers?

a. What are the observed group means and their difference? (The subtitle of the dot plot shows this information.)
b. Why would we prefer to run a permutation test over a $t$ test?
c. Press the Generate Random Permutation(s) button once to generate one permutation of the original data. What are the two group means and their difference under this permutation?
d. Did this one permutation lead to a difference that is less extreme or more extreme than the observed difference?
e. Click Generate Random Permutation(s) nine more times, for a total of 10 permutations. How many of them resulted in a test statistic at least as extreme as the observed one?
f. Select to generate 10,000 random permutations. How many of them resulted in a test statistic as or more extreme?
g. Find and interpret the permutation P-value.
h. Select the option to generate all possible permutations. Do you notice a big difference in the histogram and P-value based on 10,000 randomly sampled permutations and on all possible permutations?

Short Answer

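The observed means are 50637 (males) and about 38286 (females), a difference of about 12351 Swiss francs. A permutation test is preferred because the samples are small and strongly right-skewed, so the normality assumption behind the t test is doubtful. The permutation P-value is large (roughly 0.3 in a typical run of 10,000 permutations), so there is little evidence that male skiers earn more, on average, than female skiers.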

Step by step solution

Calculating Observed Means

First, calculate the mean of each group. For males, the data are $ 89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800 $. Sum these values and divide by 10 to get the average: $ \frac{89000 + 179000 + 8820 + 12000 + 10750 + 66000 + 6700 + 3300 + 74000 + 56800}{10} = \frac{506370}{10} = 50637 $. For females, the data are $ 73000, 95000, 32400, 4000, 2000, 57100, 4500 $. Sum these values and divide by 7 to get the average: $ \frac{73000 + 95000 + 32400 + 4000 + 2000 + 57100 + 4500}{7} = \frac{268000}{7} \approx 38285.71 $. The observed difference in means is $ 50637 - 38285.71 \approx 12351.29 $.
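The arithmetic above can be checked with a few lines of Python. This is only a sketch for verifying the hand calculation; the variable names are illustrative and are not part of the web app.

```python
# Prize money in Swiss francs, copied from the problem statement.
males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

# Group means and the test statistic (Group 1 minus Group 2).
mean_males = sum(males) / len(males)
mean_females = sum(females) / len(females)
observed_diff = mean_males - mean_females

print(round(mean_males, 2), round(mean_females, 2), round(observed_diff, 2))
# prints: 50637.0 38285.71 12351.29
```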

Preference for Permutation Test

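A permutation test is preferred when the data may not meet the assumptions of a t test. Here both samples are small (10 and 7 observations) and right-skewed, with one very large value (179000), so the assumption of approximately normal populations is doubtful and the sample sizes are too small to rely on the central limit theorem. A permutation test does not assume a particular shape for the distribution; it only uses the fact that, if group membership is unrelated to prize money, every reassignment of the observed values to the two groups is equally likely.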
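One informal way to see the skewness is to compare each group's mean with its median; a mean well above the median suggests a long right tail. The sketch below uses only the Python standard library and is an illustrative check, not a formal test of normality.

```python
from statistics import mean, median

# Prize money in Swiss francs, copied from the problem statement.
males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

for label, data in (("males", males), ("females", females)):
    # A mean far above the median points to right skew.
    print(label, "n =", len(data),
          "mean =", round(mean(data), 2),
          "median =", median(data),
          "max =", max(data))
```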

First Permutation Generation

Generating one random permutation pools all 17 observations and randomly reassigns them to two groups of the original sizes (10 and 7). The result differs from run to run. One possible permutation is: Group 1: $ 89000, 66000, 6700, 3300, 74000, 56800, 95000, 2000, 57100, 4500 $. Group 2: $ 179000, 73000, 4000, 8820, 12000, 10750, 32400 $. The group means are then $ 45440 $ and $ 45710 $, for a difference of $ 45440 - 45710 = -270 $.

Comparing Extremeness of Permutation

Compare the difference of $ -270 $ from this permutation with the observed difference of about $ 12351 $. Because the alternative hypothesis is that male skiers earn more, "extreme" here means a large positive difference. The permuted difference is far below the observed one, so it is less extreme.
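A single random permutation can be imitated in Python as a rough sketch of what the app's button does. The helper names `mean_difference` and `one_permutation` and the seed are hypothetical choices for this illustration; the app's own implementation is not shown in the source.

```python
import random

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

def mean_difference(group1, group2):
    """Test statistic: mean of Group 1 minus mean of Group 2."""
    return sum(group1) / len(group1) - sum(group2) / len(group2)

def one_permutation(group1, group2, rng):
    """Pool the data, shuffle it, and split it back into the original sizes."""
    pooled = group1 + group2
    shuffled = rng.sample(pooled, len(pooled))  # shuffled copy of the pooled data
    return shuffled[:len(group1)], shuffled[len(group1):]

rng = random.Random(2015)  # arbitrary seed so the run is repeatable
new_group1, new_group2 = one_permutation(males, females, rng)

observed = mean_difference(males, females)
permuted = mean_difference(new_group1, new_group2)
print("permuted difference:", round(permuted, 2))
# One-sided comparison: only large positive differences count as extreme.
print("more extreme" if permuted >= observed else "less extreme")
```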

Generating 10 Permutations

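Generate 9 more permutations, for a total of 10. Because the alternative is one-sided (males earn more), count how many permutations give a difference in means of at least the observed value, that is, $ \geq 12351 $. The count depends on the random draws; for example, 3 of the 10 permutations might have a difference at least as large as the observed one.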

Simulating 10,000 Permutations

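Use the app to simulate 10,000 permutations and read off how many produced a difference of at least 12351. The count changes slightly from run to run. As a rough guide, the observed difference is only about half a standard deviation of the permutation distribution above zero, so expect a count of roughly 3,000 rather than a few hundred. For illustration, suppose 3,000 permutations qualified.

Calculating P-value

The permutation P-value is the number of permutations at least as extreme as the observed one divided by the total number of permutations: $ \frac{3000}{10000} = 0.30 $ for the illustrative count. It estimates the probability of seeing a difference in sample means of 12351 or more if prize money were unrelated to the skier's sex. A P-value this large gives little evidence that male skiers earn more, on average, than female skiers.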
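The counting in the last three steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the app's code: `scaled_stat` and `count_extreme` are hypothetical helper names, the seed is arbitrary, and the statistic is multiplied by $ n_1 n_2 = 70 $ so that comparisons use exact integers instead of rounded decimals.

```python
import random

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

def scaled_stat(group1, group2):
    """n1 * n2 * (mean1 - mean2), kept as an integer to avoid rounding issues."""
    return len(group2) * sum(group1) - len(group1) * sum(group2)

def count_extreme(group1, group2, n_perm, rng):
    """Count random permutations whose difference is at least the observed one."""
    observed = scaled_stat(group1, group2)
    pooled = group1 + group2          # new list, so the inputs are not modified
    n1 = len(group1)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        if scaled_stat(pooled[:n1], pooled[n1:]) >= observed:
            count += 1
    return count

rng = random.Random(90)  # arbitrary seed so the run is repeatable
for n_perm in (10, 10_000):
    count = count_extreme(males, females, n_perm, rng)
    print(n_perm, "permutations:", count, "extreme, P-value", count / n_perm)
```

With only 10 permutations the estimated P-value is very coarse; with 10,000 it settles down to within about a percentage point between runs.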

Evaluating All Possible Permutations

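When generating all possible permutations, the app enumerates every way of splitting the 17 observations into a group of 10 and a group of 7, which gives $ \binom{17}{10} = 19448 $ distinct splits. Compare the resulting histogram and exact P-value with those from the 10,000 random permutations. They should be very similar, because 10,000 random draws already approximate the exact permutation distribution well; the random version only adds a small amount of simulation error.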
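For a data set this small, the exact permutation P-value can also be computed by brute force. The sketch below enumerates every choice of 10 observations for Group 1 with `itertools.combinations`. It relies on the fact that, for fixed group sizes, the difference in means increases with the Group 1 sum, so comparing integer sums is equivalent to comparing differences. The variable names are illustrative.

```python
from itertools import combinations
from math import comb

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

pooled = males + females
n1 = len(males)
observed_sum = sum(males)  # Group 1 sum for the observed split

total = 0
extreme = 0
# Choose positions (not values) so that any tied amounts are handled correctly.
for positions in combinations(range(len(pooled)), n1):
    total += 1
    if sum(pooled[i] for i in positions) >= observed_sum:
        extreme += 1

exact_p_value = extreme / total
print(total, "splits,", extreme, "at least as extreme, exact P-value",
      round(exact_p_value, 4))
```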

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Random Sampling

Random sampling is a fundamental concept in statistics, crucial for conducting valid hypothesis tests, including permutation tests. It involves selecting individuals from a population such that each individual has an equal chance of being chosen.

In the skiing prize money example, two kinds of randomness are involved. The data are a random sample of skiers, and the permutation test then randomly reshuffles the 17 prize amounts between the male and female groups. Here's why this matters:

**Equal Probability:** Under the null hypothesis, every way of assigning the 17 amounts to a group of 10 and a group of 7 is equally likely, which is what makes the test valid.
**Representative Samples:** Random samples tend to represent the larger population without systematic bias.
**Reliable Conclusions:** Random sampling provides a foundation for making statistical inferences about the population, ensuring any observed differences are not due to selection bias.

By randomly permuting the data, we assess how likely a difference as large as the observed one would be under the null hypothesis, which here states that prize money has the same distribution for male and female skiers.

Nonparametric Statistics

Nonparametric statistics do not rely on data fitting a normal distribution or any other specific parametric distributions. This approach is particularly useful when data do not meet the assumptions required for parametric tests, like a t-test.

A permutation test is a nonparametric method because it makes minimal assumptions about the distribution of the data.

**Flexibility:** Works with a variety of data types, whether continuous, discrete, ranked, or unranked.
**Robustness:** The validity of the P-value does not depend on normality, so it remains trustworthy with skewed data or outliers, although a statistic such as the difference in means is itself still sensitive to outliers.
**Applicability:** Ideal for small sample sizes or data that include ordinal variables.

In this exercise, the permutation test embodies nonparametric statistics by redistributing the original data across the groups. It allows us to test the hypothesis without assuming specific underlying population parameters, making it a powerful tool when normality is questionable, or datasets are limited.

P-Value

The P-value is a metric used to determine the significance of results in hypothesis testing. It answers the question: **How likely is it to observe such results if the null hypothesis is true?**

In the context of permutation tests, the P-value measures the extremity of our test statistic under the assumption that male and female skiers earn the same.

**Calculation Process:** After generating a large number of random permutations, we calculate the proportion whose test statistic is as extreme as or more extreme than the observed statistic.
**Interpretation:** A smaller P-value implies stronger evidence against the null hypothesis, suggesting the results are not due merely to chance.
**Threshold:** Typically, a P-value lower than a significance level, such as 0.05, suggests rejecting the null hypothesis.
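Written as a formula, the calculation for this exercise takes the following form, where the symbols are introduced here for illustration: $ B $ is the number of permutations generated, $ d_b $ is the difference in group means for permutation $ b $, and $ d_{\text{obs}} $ is the observed difference. The inequality is one-sided because the alternative hypothesis is that males earn more.

$$ \text{P-value} = \frac{\#\{\, b : d_b \geq d_{\text{obs}} \,\}}{B}, \qquad b = 1, \dots, B $$

When all possible permutations are used, $ B = \binom{17}{10} = 19448 $ and the P-value is exact rather than simulated.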

In our case, the P-value from 10,000 permutations indicates whether the observed difference in mean prize money is larger than what random reassignment of the 17 amounts would typically produce. A large P-value means the difference is consistent with chance variation; a small one would point to a real difference in average earnings.
