Chapter 3: Problem 56

Explain what's wrong with the way regression is used in each of the following examples:

a. Winning times in the Boston marathon (at www.bostonmarathon.org) have followed a straight-line decreasing trend from 160 minutes in 1927 (when the race was first run at the Olympic distance of about 26 miles) to 128 minutes in 2014. After fitting a regression line to the winning times, you use the equation to predict that the winning time in the year 2300 will be about 13 minutes.

b. Using data for several cities on \(x=\%\) of residents with a college education and \(y=\) median price of home, you get a strong positive correlation. You conclude that having a college education causes you to be more likely to buy an expensive house.

c. A regression between \(x=\) number of years of education and \(y=\) annual income for 100 people shows a modest positive trend, except for one person who dropped out after 10th grade but is now a multimillionaire. It's wrong to ignore any of the data, so we should report all results including this point. For this data, the correlation \(r=-0.28\).

Short Answer

Regression is misused by extrapolating too far, mistaking correlation for causation, and allowing outliers to distort results.

Step by step solution

Understanding Extrapolation in Regression

In example (a), predicting the winning time for the year 2300 assumes that the linear trend observed from 1927 to 2014 will continue for nearly three more centuries. That ignores the physical limits on human running speed: a 13-minute marathon would mean covering about 26 miles at roughly 120 miles per hour. The prediction is an extrapolation far beyond the range of the observed data.

Distinguishing Correlation from Causation

In example (b), the mistake is inferring causation from correlation. The data describe cities, not individuals, so they cannot show what one person's education does to that person's home purchase. A positive correlation between the percentage of college-educated residents and the median home price could also arise from other factors, such as income or socioeconomic status, that influence both variables.

Handling Outliers in Regression Analysis

In example (c), a single outlier (the multimillionaire who left school after 10th grade) is influential enough to turn a modest positive trend into a negative correlation of \(r=-0.28\). Outliers shouldn't be discarded automatically, but their influence on the results must be examined. Reporting only the results that include this point, without noting its effect, gives a misleading picture of the relationship for the other 99 people.

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Extrapolation in Regression

Extrapolation in regression means predicting values outside the range of the observed data by extending a trend seen within that range. It's like trying to guess next month's weather from this week's data. The problem is that a relationship that holds within the observed range, where predictions are interpolation, does not necessarily hold outside it. When you fit a regression line to data like the Boston marathon winning times and extend it far into the future, say to the year 2300, you are assuming that the conditions behind the trend will remain unchanged. However, real-life situations often involve changes not accounted for in historical data. The physical limits on human running speed, together with changes in training, environmental factors, and technology, mean that predicting a 13-minute marathon in 2300 is unrealistic. When using regression, always check whether a prediction is plausible, especially when it involves extrapolation.
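To see how quickly a straight line leaves the realm of the possible, here is a minimal Python sketch. It assumes a line drawn through only the two endpoint values quoted in the problem (160 minutes in 1927 and 128 minutes in 2014); this is a stand-in for the fitted regression line, whose equation the problem does not give, so its numbers differ from the 13 minutes quoted there. The function names are illustrative only.

```python
# Endpoint values quoted in the problem statement (year, winning time in minutes).
YEAR_START, TIME_START = 1927, 160.0
YEAR_END, TIME_END = 2014, 128.0

# Slope of the straight line through the two endpoints, in minutes per year.
# Assumption: this stands in for the fitted regression slope, which is not given.
SLOPE = (TIME_END - TIME_START) / (YEAR_END - YEAR_START)


def predicted_time(year):
    """Winning time in minutes implied by the endpoint line for a given year."""
    return TIME_END + SLOPE * (year - YEAR_END)


def year_time_hits_zero():
    """Year at which the endpoint line predicts a winning time of zero minutes."""
    return YEAR_END - TIME_END / SLOPE


if __name__ == "__main__":
    for year in (1970, 2014, 2300, 2400):
        print(year, round(predicted_time(year), 1))
    print("line reaches zero around", round(year_time_hits_zero()))
```

Under this assumption the line gives roughly 23 minutes for 2300, reaches zero around 2362, and predicts negative times after that. Within 1927 to 2014 the line is a reasonable summary; outside that range it is not.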

Correlation vs Causation

It's easy to confuse correlation with causation. Correlation means that two variables move together, but it doesn't mean that one causes the other. For instance, having a college education and owning an expensive house might show a strong positive correlation, meaning they tend to occur together. However, concluding that education causes one to buy a costly home misses other crucial factors. Income is perhaps a more direct factor linking these variables: people with higher income might both pursue higher education and buy more expensive homes. It's crucial to investigate the underlying factors and remember that a statistical relationship doesn't confirm a direct cause-and-effect link.
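The role of a lurking variable can be illustrated with a small simulation. The sketch below uses purely synthetic data in standardized units (not the cities from the problem) and assumes that income drives both education and home price, while education has no direct effect on price at all. The function names, sample size, and noise level are arbitrary choices for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy, r_xz, r_yz = pearson(xs, ys), pearson(xs, zs), pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=0):
    """Synthetic data: income drives both variables; education never affects price."""
    rng = random.Random(seed)
    income = [rng.gauss(0, 1) for _ in range(n)]
    education = [z + 0.5 * rng.gauss(0, 1) for z in income]
    home_price = [z + 0.5 * rng.gauss(0, 1) for z in income]
    return education, home_price, income


if __name__ == "__main__":
    education, home_price, income = simulate()
    print("r(education, price):", round(pearson(education, home_price), 2))
    print("controlling for income:",
          round(partial_correlation(education, home_price, income), 2))
```

Education and home price come out strongly correlated even though neither influences the other in this setup, and the association largely disappears once income is held fixed. A strong correlation alone therefore cannot distinguish a causal link from a shared cause.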

Outliers in Statistics

In statistics, outliers are data points that lie far from the rest of the dataset. They can have a large impact on results, especially in regression analysis. In the example, the multimillionaire who didn't finish high school but earns an extraordinarily high income is an outlier. Deciding whether to include or exclude outliers is sometimes difficult, because a single extreme point can pull the correlation and the regression slope away from the pattern in the rest of the data. In this case, including the multimillionaire drastically altered the correlation, giving a misleading picture of the relationship between education and income. It's important to put outliers in context: sometimes they reflect genuine variability, and sometimes measurement error or unique conditions. Reporting findings with and without outliers, and examining why they occur, helps deliver a more accurate analysis.
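The effect of one influential point can be checked by computing the correlation twice, with and without it. The following sketch uses a small invented dataset (seven people plus one outlier, incomes in thousands of dollars), not the 100 people from the problem, so the values are hypothetical; only the qualitative behavior matters.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Invented data: years of education and annual income (thousands of dollars).
YEARS = [12, 12, 14, 16, 16, 18, 20]
INCOME = [30, 40, 38, 50, 45, 60, 55]

# Hypothetical outlier: left school after 10th grade, now earns 5,000 thousand.
OUTLIER_YEARS, OUTLIER_INCOME = 10, 5000


def r_without_outlier():
    """Correlation for the seven typical people only."""
    return pearson(YEARS, INCOME)


def r_with_outlier():
    """Correlation after adding the single outlier to the data."""
    return pearson(YEARS + [OUTLIER_YEARS], INCOME + [OUTLIER_INCOME])


if __name__ == "__main__":
    print("r without outlier:", round(r_without_outlier(), 2))
    print("r with outlier:   ", round(r_with_outlier(), 2))
```

Without the outlier the correlation is positive; adding the one extreme point makes it negative. Reporting both values, together with a note on why the point is unusual, describes the data more honestly than either number alone.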
