Problem 38

A report in USA TODAY described an experiment to explore the accuracy of wearable devices designed to measure heart rate ("Wearable health monitors not always reliable, study shows," USA TODAY, October 12, 2016). The researchers found that when 50 volunteers wore an Apple Watch to track heart rate as they walked, jogged, and ran quickly on a treadmill for three minutes, the results were accurate compared with an EKG 92% of the time. When 50 volunteers wore a Fitbit Charge, the heart rate results were accurate 84% of the time.

a. Explain why the data from this study should not be analyzed using a large-sample hypothesis test for a difference in two population proportions.

b. Carry out a hypothesis test to determine if there is convincing evidence that the proportion of accurate results for people wearing an Apple Watch is greater than this proportion for those wearing a Fitbit Charge. Use the Shiny app "Randomization Test for Two Proportions" to report an approximate P-value and use it to reach a decision in the hypothesis test. Remember to interpret the results of the test in context.

c. Use the Shiny app "Bootstrap Confidence Interval for Difference in Two Proportions" to obtain a 95% bootstrap confidence interval for the difference in the population proportions of accurate results for people wearing an Apple Watch and those wearing a Fitbit Charge. Interpret the interval in the context of the research.

Short Answer

The data from this study should not be analyzed using a large-sample hypothesis test for a difference in two population proportions because the sample sizes are not large enough: the large-sample test needs at least 10 successes and 10 failures in each sample, but only 4 of the 50 Apple Watch results and 8 of the 50 Fitbit Charge results were inaccurate. With 46 of 50 accurate results for the Apple Watch and 42 of 50 for the Fitbit Charge, a randomization test gives an approximate p-value of about 0.18, which is greater than 0.05, so there is not convincing evidence that the proportion of accurate results for people wearing an Apple Watch is greater than for those wearing a Fitbit Charge. A 95% bootstrap confidence interval for the difference in population proportions (Apple Watch minus Fitbit Charge) runs from roughly -0.04 to 0.20. Because this interval includes 0, it is plausible that the two devices are equally accurate, even though the Apple Watch had the higher sample proportion.
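The sample-size concern in part (a) can be made concrete with a short check. The sketch below assumes the commonly used rule of thumb that each sample should contain at least 10 successes and 10 failures; the threshold and the helper name `large_sample_ok` are illustrative choices, not something given in the problem.

```python
# Rule-of-thumb check for the large-sample two-proportion test.
# Assumption: each sample needs at least `threshold` successes and failures.
def large_sample_ok(successes, n, threshold=10):
    failures = n - successes
    return successes >= threshold and failures >= threshold

samples = {
    "Apple Watch": (46, 50),    # 92% of 50 results accurate
    "Fitbit Charge": (42, 50),  # 84% of 50 results accurate
}

for name, (x, n) in samples.items():
    print(f"{name}: successes={x}, failures={n - x}, "
          f"condition met={large_sample_ok(x, n)}")
```

Both samples have fewer than 10 inaccurate results, so the condition fails for each device.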

Step by step solution

01

Obtaining the p-value from the Shiny app

Input the data into the Shiny app as follows:
  • Successes for Group 1 (Apple Watch): 46 (since 92% of the 50 results were accurate)
  • Total observations for Group 1: 50
  • Successes for Group 2 (Fitbit Charge): 42 (since 84% of the 50 results were accurate)
  • Total observations for Group 2: 50

The app will provide an approximate p-value. Because the app works by random shuffling, the value changes slightly from run to run; for these counts it should be close to 0.18. Since this p-value is greater than the significance level of 0.05, we fail to reject the null hypothesis. There is not convincing evidence that the proportion of accurate results for people wearing an Apple Watch is greater than this proportion for those wearing a Fitbit Charge. The observed difference of 0.08 in the sample proportions could plausibly have occurred by chance alone.

c. Bootstrap Confidence Interval
To get the 95% bootstrap confidence interval for the difference in population proportions of accurate results, we will use the Shiny app "Bootstrap Confidence Interval for Difference in Two Proportions".
02

Obtaining the interval from the Shiny app

Input the same data as before:
  • Successes for Group 1 (Apple Watch): 46
  • Total observations for Group 1: 50
  • Successes for Group 2 (Fitbit Charge): 42
  • Total observations for Group 2: 50

The app will provide the bootstrap confidence interval. Because the interval is based on random resampling, the endpoints change slightly from run to run; for these counts the interval should be roughly (-0.04, 0.20). This means that we are 95% confident that the difference in population proportions of accurate results (Apple Watch minus Fitbit Charge) lies between about -0.04 and 0.20. Since this interval includes 0, it is plausible that there is no difference between the two devices, so the data do not provide convincing evidence that the Apple Watch gives accurate heart rate results more often than the Fitbit Charge.
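For readers without access to the Shiny app, the bootstrap step can be sketched in a few lines of Python. The source does not say how the app builds its interval, so this sketch assumes the percentile method; the function name `bootstrap_diff_ci`, the number of resamples, and the seed are all illustrative choices.

```python
import random

def bootstrap_diff_ci(x1, n1, x2, n2, reps=10_000, level=0.95, seed=1):
    """Percentile bootstrap interval for p1 - p2 (a sketch, not the app's code)."""
    rng = random.Random(seed)
    # Rebuild each sample as 1s (accurate) and 0s (not accurate).
    sample1 = [1] * x1 + [0] * (n1 - x1)
    sample2 = [1] * x2 + [0] * (n2 - x2)
    diffs = []
    for _ in range(reps):
        # Resample each group separately, with replacement.
        p1 = sum(rng.choices(sample1, k=n1)) / n1
        p2 = sum(rng.choices(sample2, k=n2)) / n2
        diffs.append(p1 - p2)
    diffs.sort()
    alpha = 1 - level
    lower = diffs[int((alpha / 2) * reps)]
    upper = diffs[int((1 - alpha / 2) * reps) - 1]
    return lower, upper

lower, upper = bootstrap_diff_ci(46, 50, 42, 50)
print(f"95% bootstrap interval for p1 - p2: ({lower:.2f}, {upper:.2f})")
```

Changing the seed shifts the endpoints slightly, which mirrors the run-to-run variation you would see in the app.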


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Statistical Hypothesis Testing
Statistical hypothesis testing is a method used to decide whether to support or reject a hypothesis based on sample data.
In our wearable device example, the question is whether the proportion of accurate heart rate readings from an Apple Watch is greater than that of a Fitbit Charge. A hypothesis test for a difference in two proportions is used to compare the two groups.
  • The null hypothesis (H_0) typically states that there is no effect or difference. In this context, it might state that the accuracy proportions of the two devices are equal.
  • The alternative hypothesis (H_A) suggests a difference exists. Here, it might claim that the proportion for the Apple Watch is greater than the Fitbit Charge's.

Based on the p-value obtained from statistical software - an indicator of how extreme the obtained results are assuming the null hypothesis is true - researchers can reject or fail to reject the null hypothesis. A small p-value (typically less than 0.05) suggests that the observed data is unlikely under the null hypothesis, leading to its rejection and the acceptance of the alternative.
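In symbols, the hypotheses for this example can be written as follows. The labels \(p_1\) (population proportion of accurate results for the Apple Watch) and \(p_2\) (the same proportion for the Fitbit Charge) are notation chosen here for illustration; the observed statistic uses the counts 46 of 50 and 42 of 50.

```latex
H_0: p_1 - p_2 = 0 \qquad \text{versus} \qquad H_A: p_1 - p_2 > 0

\hat{p}_1 - \hat{p}_2 = \frac{46}{50} - \frac{42}{50} = 0.92 - 0.84 = 0.08
```

The p-value is then the probability, computed assuming \(H_0\) is true, of observing a difference in sample proportions of 0.08 or larger.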
Bootstrap Confidence Interval
The bootstrap confidence interval is a data-based simulation method for statistical inference. By resampling the original data with replacement and calculating the statistic of interest repeatedly, we obtain a distribution of the statistic.
In the study of wearable device accuracy, a bootstrap confidence interval is used to estimate the true difference in population proportions of accurate results between the two devices. The interval created gives us a range within which we believe the actual difference lies, with a certain level of confidence (commonly 95%).
A 95% bootstrap confidence interval means that if we repeat our study many times, we expect that 95% of the intervals we compute would contain the true population parameter (difference in proportions in this case).
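For the counts in this study (46 of 50 and 42 of 50), the bootstrap interval is roughly (-0.04, 0.20), so the researchers can be 95% confident that the true difference in the accuracy proportions (Apple Watch minus Fitbit Charge) falls within those bounds. Because the interval includes 0, a true difference of zero remains plausible.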
Wearable Device Accuracy
Wearable device accuracy refers to the capability of devices like smartwatches and fitness trackers to measure physiological parameters accurately against a gold standard, such as an electrocardiogram (EKG) for heart rate.
With the growing popularity of health trackers, determining the accuracy of these devices is crucial for consumer safety and confidence. For example, inaccuracies in heart rate measurements could lead to misguided self-assessment and potentially harmful health decisions.
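The study reported in USA TODAY measures this accuracy and provides insight into how consumers can interpret and rely on the data from their wearable devices. In the samples, the Apple Watch was accurate 92% of the time and the Fitbit Charge 84% of the time. Results like these could influence consumer choices or prompt manufacturers to improve their technology, but only statistical inference can say whether the sample difference reflects a real difference between the devices.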
Randomization Test
A randomization test, also known as a permutation test, is a non-parametric approach to hypothesis testing. It involves randomly reassigning the observed outcomes to different groups to test the null hypothesis of no effect or difference.
In the context of the study comparing heart rate accuracy between Apple Watch and Fitbit Charge, a randomization test would involve randomly mixing up the accurate and inaccurate results between the two devices and then calculating the difference in proportions for a large number of random permutations.
The resulting distribution of differences provides an empirical approximation of the sampling distribution under the null hypothesis. Researchers can then compare the observed difference to this distribution to obtain a p-value. This value indicates the likelihood of seeing such a difference if the null hypothesis were true, without relying on the assumptions necessary for traditional parametric tests.
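The shuffling procedure described above can be sketched directly. The code below is a minimal illustration using the counts from the study (46 of 50 and 42 of 50); the function names, number of shuffles, and seed are hypothetical choices, and this is not the Shiny app's implementation. Because the group sizes and the total number of accurate results stay fixed under shuffling, the count of accurate results landing in the first group follows a hypergeometric distribution, so the second function computes the same one-sided p-value exactly as a cross-check.

```python
import random
from math import comb

def randomization_p_value(x1, n1, x2, n2, reps=10_000, seed=1):
    """Approximate one-sided p-value (H_A: p1 > p2) by shuffling outcomes."""
    rng = random.Random(seed)
    total = x1 + x2
    # Pool all outcomes: 1 = accurate, 0 = not accurate.
    pooled = [1] * total + [0] * (n1 + n2 - total)
    observed = x1 / n1 - x2 / n2
    extreme = 0
    for _ in range(reps):
        rng.shuffle(pooled)
        s1 = sum(pooled[:n1])   # accurate results reassigned to group 1
        s2 = total - s1         # the rest go to group 2
        if s1 / n1 - s2 / n2 >= observed - 1e-12:
            extreme += 1
    return extreme / reps

def exact_randomization_p_value(x1, n1, x2, n2):
    """Exact version: sum hypergeometric probabilities for s1 >= x1."""
    total = x1 + x2
    n = n1 + n2
    denom = comb(n, n1)
    p = 0.0
    for s1 in range(x1, min(n1, total) + 1):
        if 0 <= total - s1 <= n2:
            p += comb(total, s1) * comb(n - total, n1 - s1) / denom
    return p

approx_p = randomization_p_value(46, 50, 42, 50)
exact_p = exact_randomization_p_value(46, 50, 42, 50)
print(f"simulated p-value: {approx_p:.3f}")
print(f"exact p-value:     {exact_p:.3f}")
```

The simulated value should land close to the exact one, with the gap shrinking as the number of shuffles grows.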
Population Proportion Difference
When we talk about the difference in population proportions, we mean the difference between the proportion of successes in one population and the proportion of successes in another.
For instance, in comparing the Apple Watch and Fitbit Charge, the population proportion difference is the actual difference in the proportion of accurate results these devices produce in the entire population of their users.
This difference is not known and is estimated from sample data. In hypothesis tests and confidence intervals, researchers use sample data to make inferences about this true difference in proportions. In the wearable device accuracy study, the sample proportion of accurate heart rate readings was higher for the Apple Watch (0.92) than for the Fitbit Charge (0.84), a difference of 0.08. Whether a sample difference of that size reflects a real difference in the population is the question that the hypothesis test and the confidence interval address.


Most popular questions from this chapter

The article "Footwear, Traction, and the Risk of Athletic Injury" (January 2016, www.lermagazine.com/article/footwear-traction-and-the-risk-of-athletic-injury, retrieved December 15, 2016) describes a study in which high school football players were given either a conventional football cleat or a swivel disc shoe. Of 2373 players who wore the conventional cleat, 372 experienced an injury during the study period. Of the 466 players who wore the swivel disc shoe, 24 experienced an injury. The question of interest is whether there is evidence that the injury proportion is smaller for the swivel disc shoe than it is for conventional cleats. a. What are the two treatments in this experiment? b. The article didn't state how the players in the study were assigned to the two groups. Explain why it is important to know if they were assigned to the groups at random. c. For purposes of this example, assume that the players were randomly assigned to the two treatment groups. Carry out a hypothesis test to determine if there is evidence that the injury proportion is smaller for the swivel disc shoe than it is for conventional cleats. Use a significance level of 0.05.

In a test of hypotheses about a difference in treatment proportions, what does it mean when the null hypothesis is not rejected?

Women diagnosed with breast cancer whose tumors have not spread may be faced with a decision between two surgical treatments: mastectomy (removal of the breast) or lumpectomy (only the tumor is removed). In a long-term study of the effectiveness of these two treatments, 701 women with breast cancer were randomly assigned to one of two treatment groups. One group received mastectomies and the other group received lumpectomies and radiation. Both groups were followed for 20 years after surgery. It was reported that there was no statistically significant difference in the proportion surviving for 20 years for the two treatments (Associated Press, October 17, 2002). What hypotheses do you think the researchers tested in order to reach the given conclusion? Did the researchers reject or fail to reject the null hypothesis?

Choice blindness is the term that psychologists use to describe a situation in which a person expresses a preference and then doesn't notice when they receive something different than what they asked for. The authors of the paper "Can Chocolate Cure Blindness? Investigating the Effect of Preference Strength and Incentives on the Incidence of Choice Blindness" (Journal of Behavioral and Experimental Economics [2016]: 1-11) wondered if choice blindness would occur more often if people made their initial selection by looking at pictures of different kinds of chocolate compared with if they made their initial selection by looking at the actual different chocolate candies. Suppose that 200 people were randomly assigned to one of two groups. The 100 people in the first group are shown a picture of eight different kinds of chocolate candy and asked which one they would like to have. After they have selected, the picture is removed and they are given a chocolate candy, but not the one they actually selected. The 100 people in the second group are shown a tray with the eight different kinds of candy and asked which one they would like to receive. Then the tray is removed and they are given a chocolate candy, but not the one they selected. If 20 of the people in the picture group and 12 of the people in the actual candy group failed to detect the switch, would you conclude that there is convincing evidence that the proportion who experience choice blindness is different for the two treatments (choice based on a picture and choice based on seeing the actual candy)? Test the relevant hypotheses using a 0.01 significance level.

The article referenced in the previous exercise also reported that 53% of the Republicans surveyed indicated that they were opposed to making women register for the draft. Would you use the large-sample test for a difference in population proportions to test the hypothesis that a majority of Republicans are opposed to making women register for the draft? Explain why or why not.
