Problem 6

In Example \(32.3\) we compared two insect repellants using a permutation test for a matched pairs experiment. Because of the small sample size, we were able to obtain the exact permutation distribution as:

$$ \begin{array}{l|cccccc} \hline \text { Mean difference } & -2.0 & -1.0 & -0.5 & 0.5 & 1.0 & 2.0 \\ \hline \text { Probability } & 0.125 & 0.125 & 0.250 & 0.250 & 0.125 & 0.125 \\ \hline \end{array} $$

In this example, the observed mean difference in treatments (DEET - oil of lemon eucalyptus) is \(-2\). Using this permutation distribution, we have shown that the two-sided \(P\)-value, the chance of observing a difference this extreme, is \(0.25\).

(a) Simulate the permutation distribution using 100 simulations and give the estimated \(P\)-value. Repeat this with a second simulation. How close are the answers to the exact permutation distribution and \(P\)-value?

(b) Simulate the permutation distribution using 10,000 simulations and give the estimated \(P\)-value. Repeat this with a second simulation. How close are the answers to the exact permutation distribution and \(P\)-value?

(c) What do the results in parts (a) and (b) show about the effect of the number of simulations on the estimated permutation distribution and \(P\)-value? Explain briefly.

Short Answer

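More simulations give a more accurate and precise estimate of the exact permutation distribution and \(P\)-value: with 100 simulations the estimated \(P\)-value can easily miss the exact value of \(0.25\) by several hundredths, while with 10,000 simulations it typically lands within about \(0.01\) of it.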

Step by step solution

01

Set up simulation parameters

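To simulate the permutation distribution, we randomly swap the two treatment labels within each matched pair (equivalently, randomly flip the sign of each within-pair difference) and compute the mean difference, 100 times for part (a) and 10,000 times for part (b). In each batch, we record how often the simulated mean difference is at least as extreme as the observed \(-2.0\). Because the test is two-sided, "as extreme" means \(-2\) or smaller, or \(2\) or larger; in the exact distribution these two outcomes have total probability \(0.125 + 0.125 = 0.25\).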
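The setup above can be sketched in Python. Since the raw data from Example \(32.3\) are not reproduced here, this sketch assumes that drawing from the exact distribution in the table is an acceptable stand-in for one random relabelling of the pairs. The function name `estimate_p_value` and the seed values are illustrative choices, not part of the exercise.

```python
import random

# Exact permutation distribution, copied from the table in the problem statement.
MEAN_DIFFS = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
PROBS = [0.125, 0.125, 0.250, 0.250, 0.125, 0.125]
OBSERVED = -2.0  # observed mean difference (DEET - oil of lemon eucalyptus)


def estimate_p_value(n_sims, seed=None):
    """Estimate the two-sided P-value from n_sims simulated mean differences."""
    rng = random.Random(seed)
    # Assumption: each weighted draw stands in for one random relabelling of the pairs.
    sims = rng.choices(MEAN_DIFFS, weights=PROBS, k=n_sims)
    # Two-sided: count draws at least as far from 0 as the observed value.
    extreme = sum(1 for d in sims if abs(d) >= abs(OBSERVED))
    return extreme / n_sims


# Exact two-sided P-value for comparison: P(-2.0) + P(2.0).
exact_p = sum(p for d, p in zip(MEAN_DIFFS, PROBS) if abs(d) >= abs(OBSERVED))

print("exact P-value:", exact_p)
print("estimate from 100 simulations:", estimate_p_value(100, seed=1))
print("estimate from 10,000 simulations:", estimate_p_value(10_000, seed=1))
```

Changing the seed gives a fresh, independent batch, which is what "repeat this with a second simulation" asks for.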
02

Conduct 100 simulations for part (a)

Randomly swap the labels within each pair and compute the mean difference, 100 times. Count how many of the 100 mean differences are \(-2\) or the equally extreme \(2\). Divide this count by 100 to obtain the estimated \(P\)-value, and tabulate the relative frequency of each mean difference to get the estimated permutation distribution.
03

Repeat 100 simulations for part (a)

Run another independent set of 100 simulations and calculate the estimated \(P\)-value in the same way. Compare the two estimates with each other and with the exact value \(0.25\) to gauge how much they vary from run to run.
04

Conduct 10,000 simulations for part (b)

Increase the number of simulations to 10,000, swapping labels within pairs and computing the mean difference each time. Count how often a difference as extreme as \(-2\) appears, and divide the count by 10,000 to estimate the \(P\)-value.
05

Repeat 10,000 simulations for part (b)

Repeat the simulation with a second, independent batch of 10,000. Estimate the \(P\)-value again and compare the two 10,000-simulation runs to see how closely they agree.
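The four runs described in the steps above (two batches of 100 and two of 10,000) could be organised as in the following sketch. It again assumes that sampling from the exact table is an acceptable substitute for relabelling the raw pairs. The helper names `simulate_distribution` and `two_sided_p` and the seeds 1 and 2 are hypothetical.

```python
import random
from collections import Counter

# Exact permutation distribution from the problem statement.
EXACT = {-2.0: 0.125, -1.0: 0.125, -0.5: 0.250, 0.5: 0.250, 1.0: 0.125, 2.0: 0.125}


def simulate_distribution(n_sims, seed):
    """Return the relative frequency of each mean difference in n_sims draws."""
    rng = random.Random(seed)
    values = list(EXACT)
    draws = rng.choices(values, weights=[EXACT[v] for v in values], k=n_sims)
    counts = Counter(draws)
    return {v: counts[v] / n_sims for v in values}


def two_sided_p(dist, observed=-2.0):
    """Sum the probabilities of mean differences at least as extreme as observed."""
    return sum(p for v, p in dist.items() if abs(v) >= abs(observed))


for n_sims in (100, 10_000):
    for seed in (1, 2):  # two independent runs per simulation size
        est = simulate_distribution(n_sims, seed)
        worst = max(abs(est[v] - EXACT[v]) for v in EXACT)
        print(f"n={n_sims:>6} seed={seed}: estimated P-value = {two_sided_p(est):.4f}, "
              f"largest error in any probability = {worst:.4f}")
```

The "largest error" column is one simple way to answer "how close are the answers to the exact permutation distribution"; other summaries would work equally well.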
06

Analyze the simulation results

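Compare the estimated \(P\)-values from parts (a) and (b) with the exact value of \(0.25\), and compare the estimated relative frequencies with the exact probabilities in the table. Note how increasing the number of simulations from 100 to 10,000 reduces the run-to-run variation and brings the estimates closer to the exact permutation distribution.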
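A rough way to quantify this comparison, assuming the \(N\) simulations are independent so that the number of extreme results is Binomial with success probability \(p = 0.25\), is the standard error of the estimated \(P\)-value \(\hat{P}\):

$$ \mathrm{SE}(\hat{P}) = \sqrt{\frac{p(1-p)}{N}}, \qquad p = 0.25 $$

$$ N = 100: \ \sqrt{\frac{0.25 \times 0.75}{100}} \approx 0.043 \qquad\qquad N = 10{,}000: \ \sqrt{\frac{0.25 \times 0.75}{10{,}000}} \approx 0.0043 $$

Under this assumption, multiplying the number of simulations by 100 shrinks the typical error by a factor of 10, so two runs of 100 may plausibly differ by \(0.05\) or more, while two runs of 10,000 should usually agree to within about \(0.01\).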
07

Conclusion on the effect of simulations

More simulations give a more stable and accurate estimate of the exact permutation distribution, and therefore estimated \(P\)-values that lie closer to the exact \(P\)-value of \(0.25\).

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Matched Pairs Experiment
When an experiment compares two treatments, such as two insect repellants, a matched pairs design is often used. Either each subject receives both treatments, or subjects are grouped into closely matched pairs and the two treatments are assigned within each pair. The two sets of measurements are therefore well matched in their characteristics. In the example of DEET versus oil of lemon eucalyptus, each difference (DEET minus oil of lemon eucalyptus) is computed within a pair, which removes the variability that would otherwise arise from differences between subjects.
The result is a more robust comparison because the main source of variation is the treatment itself rather than differences between groups. This method provides control over confounding variables, allowing for a more reliable conclusion about treatment effects.
Simulation
Simulation in statistical analysis means using a computer to repeat a random process many times and recording the outcomes. In a permutation test for matched pairs, each simulation randomly swaps the treatment labels within the pairs and recalculates the test statistic, here the mean difference. The exercise uses this to approximate the permutation distribution, first with 100 simulations and then with 10,000.
The goal is to understand how often different outcomes arise just by chance. In the exercise, the simulations estimate the probability of each possible mean difference. By simulating 100 times and then 10,000 times, students can observe how the estimates stabilize and become more accurate as the number of simulations increases.
P-value Estimation
A core goal of a permutation test is to obtain the P-value: the probability, under the null hypothesis, of observing a statistic at least as extreme as the one computed from the actual data. In the exercise, the exact P-value is 0.25. Simulation re-estimates it as the proportion of reshuffles whose mean difference is at least as extreme as -2.
With 100 simulations, the estimates vary noticeably from run to run but give a rough approximation. With 10,000 simulations, the estimate is expected to fall much closer to the exact value. This illustrates a general principle: increasing the number of simulations yields more stable and more trustworthy P-value estimates.
Randomization Methods
Randomization is a crucial part of permutation tests. In each repetition, the treatment labels are randomly reassigned, which in a matched pairs design means randomly swapping the two labels within each pair. In the original experiment, random assignment ensures that observed effects are not due to pre-existing differences.
In permutation tests, randomization allows researchers to create a distribution of the test statistic under the null hypothesis. The exercise does this by generating different rearrangements of the data and counting how often results as extreme as the observed difference occur by chance.
Randomization is therefore the backbone of the permutation distribution, which is what the observed data are compared against under the assumption of no treatment effect. Because the reference distribution comes from the randomization itself, the conclusions do not depend on assuming a particular distribution, such as the Normal, for the data.
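To make the within-pair randomization concrete, the sketch below enumerates every possible relabelling of a small matched pairs data set. The differences in `diffs` are HYPOTHETICAL: they are not the data from Example \(32.3\), which are not reproduced in this text. They were chosen for illustration, and with these made-up values the enumeration happens to give the same probabilities as the table in the problem. The function name `exact_permutation_distribution` is also an illustrative choice.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# HYPOTHETICAL within-pair differences (treatment A minus treatment B),
# made up for illustration; NOT the data from Example 32.3.
diffs = [-3, -3, -2, 0]


def exact_permutation_distribution(diffs):
    """Enumerate all 2^n within-pair label swaps (sign flips) of the differences."""
    n = len(diffs)
    counts = Counter()
    for signs in product((1, -1), repeat=n):
        # Swapping the labels in a pair flips the sign of that pair's difference.
        mean = Fraction(sum(s * d for s, d in zip(signs, diffs)), n)
        counts[mean] += 1
    total = 2 ** n  # every sign pattern is equally likely under the null hypothesis
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}


dist = exact_permutation_distribution(diffs)
observed = Fraction(sum(diffs), len(diffs))
# Two-sided P-value: total probability of means at least as extreme as observed.
p_value = sum(p for m, p in dist.items() if abs(m) >= abs(observed))

for mean, prob in dist.items():
    print(f"mean difference {float(mean):5.1f}: probability {float(prob):.3f}")
print("two-sided P-value:", float(p_value))
```

Full enumeration is only practical for small samples, because the number of sign patterns doubles with each added pair; this is why larger problems fall back on simulation.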

Most popular questions from this chapter

We plan to use the bootstrap method to construct a confidence interval for a population median from a sample of 43 subjects from the population. An important assumption for using the bootstrap method is (a) the sample is a random sample from the population. (b) the sampling distribution for the sample median must not be well approximated by the Normal distribution. (c) there are no outliers in the sample.

"Durable press" cotton fabrics are treated to improve their recovery from wrinkles after washing. Unfortunately, the treatment also reduces the strength of the fabric. A study compared the breaking strengths of fabrics treated by two commercial durable press processes. Five swatches of the same fabric were assigned at random to each process. Here are the data, in pounds of pull needed to tear the fabric: $$ \begin{array}{l|lllll} \hline \text { Permafresh } & 29.9 & 30.7 & 30.0 & 29.5 & 27.6 \\ \hline \text { Hylite } & 28.8 & 23.9 & 27.0 & 22.1 & 24.2 \\ \hline \end{array} $$ There is a mild outlier in the Permafresh group. Perhaps we should use a permutation test to test the hypothesis of no difference in median pounds of pull needed to tear the fabric. Assume a two-sided alternative and estimate the \(P\)-value.

We select a random sample of six freshman students from the University of California at Santa Cruz and find that their verbal GREs are 480, 510, 590, 670, 520, and 630. Which of the following is not a possible bootstrap sample? (a) \(480,480,480,480,480,480\) (b) \(480,480,480,670,670,670\) (c) \(480,630,630,740,590,510\)

In a study of exhaust emissions from school buses, the pollution intake by passengers was determined for a sample of nine school buses used in the Southern California Air Basin. The pollution intake is the amount of exhaust emissions, in grams per person, that would be breathed in while travelling on the bus during its usual 18-mile trip on congested freeways from South Central LA to a magnet school in West LA. (As a reference, the average intake of motor emissions of carbon monoxide in the LA area is estimated to be about \(0.000046\) gram per person.) Here are the amounts for the nine buses when driven with the windows open: $$ \begin{array}{lllllllll} 1.15 & 0.33 & 0.40 & 0.33 & 1.35 & 0.38 & 0.25 & 0.40 & 0.35 \end{array} $$ (a) Make a stemplot. Are there outliers or strong skewness that would forbid use of the \(t\) procedures? (b) Construct a \(95 \%\) bootstrap confidence interval for the mean pollution intake among all school buses used in the Southern California Air Basin that travel the route investigated in the study.

Does taking notes by hand in a statistics course improve performance? Some recent research suggests that this may be the case. To explore this, six volunteers (Doug, Elizabeth, Oksana, Sebastian, Vishal, and Xinyi) agree to take part in an experiment. Four are assigned completely at random to take handwritten notes in class, and the other two are assigned to take notes on their laptops. Total points earned on the two in-class exams and final exam are used to determine course performance. The results are (out of a possible total of 500 points): $$ \begin{array}{ll} \hline \text { Handwritten Notes (Person) } & \text { Notes on Laptop (Person) } \\ \hline 380 \text { (Doug) } & 370 \text { (Elizabeth) } \\ \hline 400 \text { (Oksana) } & 310 \text { (Xinyi) } \\ \hline 420 \text { (Sebastian) } & \\ \hline 360 \text { (Vishal) } & \\ \hline \end{array} $$ (a) There are 15 possible ways the six subjects can be assigned to the two groups, with the handwritten notes group having size 4 and the laptop notes group size 2. List these. (b) For each, determine the difference in mean points (mean number of points for the handwritten notes group minus mean number of points for the laptop notes group). Combine any duplicates and make a table of the possible mean differences and the corresponding probability of each under the null hypothesis of no difference in the effect of the treatments on total points earned. (Each of the 15 possible assignments of subjects to treatments has probability \(1 / 15\) under the null hypothesis.) This is the permutation distribution. (c) Compute the \(P\)-value of the data. Assume the two-sided alternative hypothesis is that the mean number of points is different for the two groups. (d) In this example, is it possible to demonstrate significance at the \(5 \%\) level using the permutation test? Explain. (e) Assume that total number of points is Normally distributed for both groups. Use the two-sample \(t\) procedure to test the hypotheses. Use Option 1 if you have access to software.
