Problem 90

The following data refer to a random sample of prize money (in Swiss francs) earned by male and female skiers racing in the \(2014 / 2015\) FIS World Cup season.

Males: \(89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800\)
Females: \(73000, 95000, 32400, 4000, 2000, 57100, 4500\)

Enter the observations (separated by spaces) into the Permutation Test web app. Let male skiers be Group 1 and female skiers be Group 2 and let the test statistic be the difference in sample means. Is there evidence that male skiers earn more, on average, than female skiers?

a. What are the observed group means and their difference? (The subtitle of the dot plot shows this information.)
b. Why would we prefer to run a permutation test over a \(t\) test?
c. Press the Generate Random Permutation(s) button once to generate one permutation of the original data. What are the two group means and their difference under this permutation?
d. Did this one permutation lead to a difference that is less extreme or more extreme than the observed difference?
e. Click Generate Random Permutation(s) nine more times, for a total of 10 permutations. How many of them resulted in a test statistic at least as extreme as the observed one?
f. Select to generate 10,000 random permutations. How many of them resulted in a test statistic as or more extreme?
g. Find and interpret the permutation P-value.
h. Select the option to generate all possible permutations. Do you notice a big difference in the histogram and P-value based on 10,000 randomly sampled permutations and on all possible permutations?

Short Answer

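The observed means are 50637 (males) and about 38286 (females), a difference of about 12351 Swiss francs. A permutation test is preferred because the samples are small and right-skewed, so the normality assumption behind the \(t\) test is doubtful. The permutation P-value is the proportion of permutations whose difference in means is at least as large as the observed one; the steps below use a hypothetical count of 500 out of 10,000 (a P-value of 0.05) only to show the calculation.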

Step by step solution

01

Calculating Observed Means

First, calculate the means for both groups. For males, the data are \( 89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800 \). Sum these values and divide by 10 to get the average: \( \frac{89000 + 179000 + 8820 + 12000 + 10750 + 66000 + 6700 + 3300 + 74000 + 56800}{10} = \frac{506370}{10} = 50637 \). For females, the data are \( 73000, 95000, 32400, 4000, 2000, 57100, 4500 \). Sum these values and divide by 7 to get the average: \( \frac{73000 + 95000 + 32400 + 4000 + 2000 + 57100 + 4500}{7} = \frac{268000}{7} \approx 38286 \). The observed difference in means is \( 50637 - 38286 = 12351 \).
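The arithmetic above can be checked with a few lines of Python. This is only a sketch for verification; the variable names are illustrative and are not part of the web app.

```python
from statistics import mean

# Prize money in Swiss francs, copied from the problem statement.
males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

mean_males = mean(males)
mean_females = mean(females)
observed_diff = mean_males - mean_females  # Group 1 minus Group 2

print(f"Male mean:   {mean_males:.2f}")
print(f"Female mean: {mean_females:.2f}")
print(f"Difference:  {observed_diff:.2f}")
```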
02

Preference for Permutation Test

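A \(t\) test relies on each group coming from an approximately normal population, or on samples large enough for the sample means to be approximately normal. Here the samples are small (10 and 7 skiers) and prize money is strongly right-skewed, with one male skier earning 179000 while several earn under 10000. A permutation test does not need the normality assumption. It only requires that, under the null hypothesis, the observations could be exchanged between the two groups, so it is more trustworthy for data like these.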
03

First Permutation Generation

Generating one random permutation reassigns the 17 original values at random to two new groups with the same sizes as the original groups (10 and 7). The result differs from run to run. One possible permutation is: Group 1: \( 89000, 8820, 12000, 66000, 3300, 74000, 73000, 32400, 2000, 57100 \). Group 2: \( 179000, 10750, 6700, 56800, 95000, 4000, 4500 \). The group means are \( \frac{417620}{10} = 41762 \) and \( \frac{356750}{7} \approx 50964 \), so the difference under this permutation is \( 41762 - 50964 = -9202 \).
04

Comparing Extremeness of Permutation

Compare the \( -9202 \) difference from the example permutation with the observed difference of \( 12351 \). The alternative hypothesis is that male skiers earn more, so "more extreme" means a larger positive difference. The permutation difference is smaller than the observed one (and also closer to zero), so it is less extreme.
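A single permutation and its comparison with the observed difference can be sketched in Python as follows. The function name `permuted_difference` and the seed are arbitrary choices for illustration; the web app's own random draws will give different groups.

```python
import random
from statistics import mean

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]
observed_diff = mean(males) - mean(females)


def permuted_difference(group1, group2, rng):
    """Shuffle the pooled values and return (mean1, mean2, mean1 - mean2)."""
    pooled = group1 + group2          # new list, so the inputs are not modified
    rng.shuffle(pooled)
    new1 = pooled[:len(group1)]       # first 10 values become the new Group 1
    new2 = pooled[len(group1):]       # remaining 7 values become the new Group 2
    m1, m2 = mean(new1), mean(new2)
    return m1, m2, m1 - m2


rng = random.Random(1)  # arbitrary seed, only so the run is repeatable
m1, m2, diff = permuted_difference(males, females, rng)

# One-sided comparison: "at least as extreme" means diff >= observed_diff.
at_least_as_extreme = diff >= observed_diff
print(f"Group means: {m1:.2f}, {m2:.2f}; difference: {diff:.2f}")
print("At least as extreme as observed:", at_least_as_extreme)
```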
05

Generating 10 Permutations

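Generate 9 more permutations, for a total of 10. Because the alternative hypothesis is that male skiers earn more, count the permutations whose difference in means is \( \geq 12351 \). The count varies from run to run; as an illustration, suppose 1 of the 10 permutations gave a difference at least as large as the observed one.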
06

Simulating 10,000 Permutations

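Use the app to simulate 10,000 permutations and note how many have a difference in means of at least 12351. The count depends on the random draws, so read it from the app. As a purely illustrative figure, suppose 500 permutations met this criterion.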
07

Calculating P-value

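The permutation P-value is the number of permutations at least as extreme as the observed result divided by the total number of permutations. With the illustrative count above: \( \frac{500}{10000} = 0.05 \). The P-value is the probability, assuming prize money is unrelated to gender, of getting a difference in sample means at least as large as the one observed. A small P-value is evidence that male skiers earn more on average; a large one means the observed difference is consistent with chance alone.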
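The counting and division described above can be sketched as a small Monte Carlo routine. This is a minimal illustration, not the web app's implementation; `permutation_p_value` is a hypothetical helper name, the seed is arbitrary, and the test is one-sided because the question asks whether male skiers earn more.

```python
import random
from statistics import mean

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]


def permutation_p_value(group1, group2, n_permutations, seed=0):
    """Return (count, p_value) for the one-sided test mean1 - mean2 >= observed."""
    rng = random.Random(seed)
    observed = mean(group1) - mean(group2)
    pooled = group1 + group2
    n1 = len(group1)
    count = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = mean(pooled[:n1]) - mean(pooled[n1:])
        if diff >= observed:
            count += 1
    return count, count / n_permutations


count, p_value = permutation_p_value(males, females, n_permutations=10_000)
print(f"{count} of 10000 permutations were at least as extreme; P-value = {p_value:.4f}")
```

Calling the same function with `n_permutations=10` mimics the ten-permutation step; the result is then much noisier, which is why the exercise moves on to 10,000.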
08

Evaluating All Possible Permutations

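There are \( \binom{17}{10} = 19448 \) ways to split the 17 observations into groups of 10 and 7, so the app can enumerate all of them and produce the exact permutation distribution. Compare its histogram and P-value with those from the 10,000 random permutations. They should be very close, because 10,000 random draws already approximate the full set of 19448 splits well.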
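Enumerating every split is small enough to do directly. The sketch below is one way to compute the exact one-sided P-value, under the assumption that "all possible permutations" means all distinct assignments of 10 of the 17 values to Group 1. Because the group sizes are fixed, the difference in means increases with the Group 1 total, so the code compares integer sums to avoid rounding issues.

```python
from itertools import combinations
from math import comb

males = [89000, 179000, 8820, 12000, 10750, 66000, 6700, 3300, 74000, 56800]
females = [73000, 95000, 32400, 4000, 2000, 57100, 4500]

values = males + females
n1 = len(males)
observed_sum = sum(males)  # Group 1 total for the observed data

n_splits = 0
count = 0
# Choose which 10 positions go to Group 1; the other 7 form Group 2.
for idx in combinations(range(len(values)), n1):
    n_splits += 1
    if sum(values[i] for i in idx) >= observed_sum:
        count += 1

exact_p = count / n_splits
print(f"Number of splits: {n_splits} (math.comb gives {comb(len(values), n1)})")
print(f"Exact one-sided P-value: {exact_p:.4f}")
```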

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Random Sampling
Random sampling is a fundamental concept in statistics, crucial for conducting valid hypothesis tests, including permutation tests. It involves selecting individuals from a population such that each individual has an equal chance of being chosen.

In the skiing prize money example, randomness enters in two ways: the skiers are a random sample from the population, and the permutation test then randomly reassigns the 17 observed amounts to the two groups. Here's why both matter:
  • **Equal Probability:** Under the null hypothesis, every way of splitting the 17 values into groups of 10 and 7 is equally likely, which is what makes the test valid.
  • **Representative Samples:** Random samples tend to represent the larger population, which reduces selection bias.
  • **Reliable Conclusions:** Random sampling is the basis for generalizing from the sample to the population of skiers.
By repeatedly shuffling the group labels, the permutation test shows which differences are typical under the null hypothesis that prize money has the same distribution for male and female skiers.
Nonparametric Statistics
Nonparametric statistics do not rely on data fitting a normal distribution or any other specific parametric distributions. This approach is particularly useful when data do not meet the assumptions required for parametric tests, like a t-test.

A permutation test is a nonparametric method: it makes minimal assumptions about the shape of the data's distribution.
  • **Flexibility:** Works with a variety of data types, whether continuous, discrete, ranked, or unranked.
  • **Robustness:** Less affected by outliers or nonnormal distribution, leading to more reliable results under these conditions.
  • **Applicability:** Ideal for small sample sizes or data that include ordinal variables.
In this exercise, the permutation test embodies nonparametric statistics by redistributing the original data across the groups. It allows us to test the hypothesis without assuming specific underlying population parameters, making it a powerful tool when normality is questionable, or datasets are limited.
P-Value
The P-value is a metric used to determine the significance of results in hypothesis testing. It answers the question: **How likely is it to observe such results if the null hypothesis is true?**

In the context of permutation tests, the P-value measures the extremity of our test statistic under the assumption that male and female skiers earn the same.
  • **Calculation Process:** After generating a large number of random permutations, we calculate the proportion whose test statistic was as extreme as or more extreme than the observed statistic.
  • **Interpretation:** A smaller P-value implies stronger evidence against the null hypothesis, suggesting the results are not due merely to chance.
  • **Threshold:** Typically, a P-value lower than a significance level, such as 0.05, suggests rejecting the null hypothesis.
In our case, the P-value from 10,000 simulated permutations tells us whether the observed difference in mean earnings is larger than chance reassignment of the amounts would typically produce, and therefore whether the data give evidence that male skiers earn more on average.

Most popular questions from this chapter

Internet book prices Anna's project for her introductory statistics course was to compare the selling prices of textbooks at two Internet bookstores. She first took a random sample of 10 textbooks used that term in courses at her college, based on the list of texts compiled by the college bookstore. The prices of those textbooks at the two Internet sites were Site \(A: \$ 115, \$ 79, \$ 43, \$ 140, \$ 99, \$ 30, \$ 80, \$ 99, \$ 119, \$ 69\) Site \(B: \$ 110, \$ 79, \$ 40, \$ 129, \$ 99, \$ 30, \$ 69, \$ 99, \$ 109, \$ 66\) a. Are these independent samples or dependent samples? Justify your answer. b. Find the mean for each sample. Find the mean of the difference scores. Compare and interpret. c. Using software or a calculator, construct a \(90 \%\) confidence interval comparing the population mean prices of all textbooks used that term at her college. Interpret.

A study \(^{14}\) compared personality characteristics between 49 children of alcoholics and a control group of 49 children of nonalcoholics who were matched on age and gender. On a measure of well-being, the 49 children of alcoholics had a mean of \(26.1(s=7.2)\) and the 49 subjects in the control group had a mean of \(28.8(s=6.4) .\) The difference scores between the matched subjects from the two groups had a mean of \(2.7(s=9.7)\). a. Are the groups to be compared independent samples or dependent samples? Why? b. Show all steps of a test of equality of the two population means for a two- sided alternative hypothesis. Report the P-value and interpret. c. What assumptions must you make for the inference in part b to be valid?

Energy drinks: health risks and toxicity A study was carried out in Saudi Arabia in which 31 male university students (18 overweight/obese and 13 having normal weight) were enrolled from December 2013 to December 2014 (www.annsaudimed.net). The heart rate variability was significantly less in obese subjects as compared to subjects with normal weight at 60 minutes after consuming an energy drink as indicated by the mean heart rate range MHRR (P-value \(=0.012\) ). a. The conclusion was based on a significance test comparing means. Define notation in context, identify the groups and the population means and state the null hypothesis for the test. b. What information you are not able to obtain from the P-value approach which you could learn if the confidence interval comparing the means was provided?

In Western Australia, handheld cell phone use while driving has been banned since \(2001,\) but hands-free devices are legal. A study (published in the British Medical Journal in 2005 ) of 456 drivers in Perth who had been in a crash observed if they were using a cell phone before the crash and if they were using a cell phone during an earlier period when no accident occurred. Thus, each driver served as his or her own control group in the study. a. In comparing rates of cell phone use before the crash and the earlier accident-free period, should we use methods for independent samples or for dependent samples? Explain. b. Identify a test you can use to see whether the proportion of drivers using a cell phone differs between the period before the crash and the earlier accident-free period.

To increase Barack Obama's visibility and to raise money for the campaign leading up to the 2008 presidential election, Obama's analytics team conducted an \(\mathrm{A} / \mathrm{B}\) test with his website. In the original version, the button to join the campaign read "Sign Up". In an alternative version, it read "Learn More". Of 77,858 visitors to the original version, 5851 clicked the button. Of 77,729 visitors to the alternative version, 6927 clicked the button. Is there evidence that one version was more successful than the other in recruiting campaign members? a. Sketch an appropriate graph to compare the sample proportions visually. b. Show all steps of a significance test, using the computer output. Define any parameters you are using when specifying the hypotheses. Mention whether there is a significant difference at the 0.05 significance level. c. Interpret the confidence interval shown in the output. Why is this interval more informative than just reporting the P-value?
