Chapter 5: Problem 25
Explain why it can be dangerous to use the least-squares line to obtain predictions for \(x\) values that are substantially larger or smaller than those contained in the sample.
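The risk can be illustrated with a minimal sketch. The data here are hypothetical and not from the text: the true relationship is assumed to be \(y=\ln x\), sampled only for \(10 \leq x \leq 20\), where it looks almost linear. The names `x_sample`, `predict`, and `x_far` are illustrative choices, and NumPy is assumed to be available.

```python
import numpy as np

# Hypothetical data: the true relationship is y = ln(x), which looks
# almost linear over the sampled range 10 <= x <= 20.
x_sample = np.linspace(10, 20, 11)
y_sample = np.log(x_sample)

# Least-squares line fitted to the sample only.
# np.polyfit with degree 1 returns (slope, intercept).
slope, intercept = np.polyfit(x_sample, y_sample, 1)


def predict(x):
    """Prediction from the fitted least-squares line."""
    return intercept + slope * x


# Largest prediction error inside the sampled range.
in_range_error = np.max(np.abs(predict(x_sample) - y_sample))

# Prediction error far outside the sampled range.
x_far = 100.0
out_of_range_error = abs(predict(x_far) - np.log(x_far))

print(f"max error within sample range: {in_range_error:.3f}")
print(f"error at x = {x_far:g}: {out_of_range_error:.3f}")
```

Within the sampled range the line tracks the curve closely, but the sample gives no information about whether the linear pattern continues beyond it, so the error at `x_far` is far larger.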
As part of a study of the effects of timber management strategies (Ecological Applications [2003]: IIIOII123), investigators used satellite imagery to study abundance of the lichen Lobaria oregano at different elevations. Abundance of a species was classified as "common" if there were more than 10 individuals in a plot of land. In the table below, approximate proportions of plots in which Lobaria oregano were common are given.

Proportions of Plots Where Lobaria oregano Are Common
\begin{tabular}{lrrrrrrr}
\hline
Elevation (m) & 400 & 600 & 800 & 1000 & 1200 & 1400 & 1600 \\
Proportion of plots with lichen common & \(0.99\) & \(0.96\) & \(0.75\) & \(0.29\) & \(0.077\) & \(0.035\) & \(0.01\) \\
\hline
\end{tabular}

a. As elevation increases, does the proportion of plots for which lichen is common become larger or smaller? What aspect(s) of the table support your answer?
b. Using the techniques introduced in this section, calculate \(y^{\prime}=\ln \left(\frac{p}{1-p}\right)\) for each of the elevations and fit the line \(y^{\prime}=a+b(\text{Elevation})\). What is the equation of the best-fit line?
c. Using the best-fit line from Part (b), estimate the proportion of plots of land on which Lobaria oregano are classified as "common" at an elevation of \(900 \mathrm{~m}\).
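The calculation in Parts (b) and (c) can be sketched as follows. This is one possible approach rather than the textbook's own solution: it assumes NumPy is available and uses `np.polyfit` for the least-squares fit, and the variable names are illustrative.

```python
import numpy as np

# Data from the table: elevation (m) and proportion of plots
# in which the lichen is common.
elevation = np.array([400, 600, 800, 1000, 1200, 1400, 1600], dtype=float)
p = np.array([0.99, 0.96, 0.75, 0.29, 0.077, 0.035, 0.01])

# Part (b): transform each proportion, y' = ln(p / (1 - p)),
# then fit the line y' = a + b * elevation by least squares.
y_prime = np.log(p / (1 - p))
b, a = np.polyfit(elevation, y_prime, 1)  # returns (slope, intercept)

# Part (c): predict y' at 900 m, then invert the transformation:
# p = e^{y'} / (1 + e^{y'}).
y_900 = a + b * 900
p_900 = np.exp(y_900) / (1 + np.exp(y_900))

print(f"y' = {a:.3f} + ({b:.5f}) * elevation")
print(f"estimated proportion at 900 m: {p_900:.3f}")
```

Note that 900 m lies inside the range of sampled elevations, so this prediction does not involve extrapolation.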
In a study of 200 Division I athletes, variables related to academic performance were examined. The paper "Noncognitive Predictors of Student Athletes' Academic Performance" (Journal of College Reading and Learning [2000]: el67) reported that the correlation coefficient for college GPA and a measure of academic self-worth was \(r=0.48\). Also reported were the correlation coefficient for college GPA and high school GPA \((r=0.46)\) and the correlation coefficient for college GPA and a measure of tendency to procrastinate \((r=-0.36)\). Higher scores on the measure of self-worth indicate higher self-worth, and higher scores on the measure of procrastination indicate a higher tendency to procrastinate. Write a few sentences summarizing what these correlation coefficients tell you about the academic performance of the 200 athletes in the sample.
Explain why the slope \(b\) of the least-squares line always has the same sign (positive or negative) as does the sample correlation coefficient \(r\).
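One way to approach this, assuming the usual formula for the least-squares slope and writing \(s_{x}\) and \(s_{y}\) for the sample standard deviations of \(x\) and \(y\), is to note the identity $$ b=\frac{\sum(x-\bar{x})(y-\bar{y})}{\sum(x-\bar{x})^{2}}=r \frac{s_{y}}{s_{x}} $$ Because \(s_{x}\) and \(s_{y}\) are both positive whenever the data vary, the factor \(s_{y} / s_{x}\) cannot change the sign of \(r\).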
With a bit of algebra, we can show that $$ \text{SSResid}=\left(1-r^{2}\right) \sum(y-\bar{y})^{2} $$ from which it follows that $$ s_{e}=\sqrt{\frac{n-1}{n-2}} \sqrt{1-r^{2}}\, s_{y} $$ Unless \(n\) is quite small, \((n-1) /(n-2) \approx 1\), so $$ s_{e} \approx \sqrt{1-r^{2}}\, s_{y} $$

a. For what value of \(r\) is \(s_{e}\) as large as \(s_{y}\)? What is the least-squares line in this case?
b. For what values of \(r\) will \(s_{e}\) be much smaller than \(s_{y}\)?
c. A study by the Berkeley Institute of Human Development (see the book Statistics by Freedman et al. listed in the back of the book) reported the following summary data for a sample of \(n=66\) California boys: \(r \approx 0.80\). At age 6, average height \(\approx 46\) inches, standard deviation \(\approx 1.7\) inches. At age 18, average height \(\approx 70\) inches, standard deviation \(\approx 2.5\) inches. What would \(s_{e}\) be for the least-squares line used to predict 18-year-old height from 6-year-old height?
d. Referring to Part (c), suppose that you wanted to predict the past value of 6-year-old height from knowledge of 18-year-old height. Find the equation for the appropriate least-squares line. What is the corresponding value of \(s_{e}\)?
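The arithmetic for Parts (c) and (d) can be sketched as below. The function name `residual_sd` and the variable names are illustrative, not from the text. Part (d) additionally assumes the standard least-squares relations \(b=r\, s_{y} / s_{x}\) and \(a=\bar{y}-b \bar{x}\), with 6-year-old height in the role of \(y\) and 18-year-old height in the role of \(x\).

```python
import math


def residual_sd(n, r, s_y):
    """s_e = sqrt((n - 1) / (n - 2)) * sqrt(1 - r^2) * s_y."""
    return math.sqrt((n - 1) / (n - 2)) * math.sqrt(1 - r**2) * s_y


# Summary data from Part (c).
n, r = 66, 0.80
mean_6, sd_6 = 46.0, 1.7     # height at age 6 (inches)
mean_18, sd_18 = 70.0, 2.5   # height at age 18 (inches)

# Part (c): predicting 18-year-old height, so s_y is the age-18 SD.
se_18_from_6 = residual_sd(n, r, sd_18)

# Part (d): predicting 6-year-old height from 18-year-old height,
# so the roles of the two variables are swapped.
slope_d = r * sd_6 / sd_18
intercept_d = mean_6 - slope_d * mean_18
se_6_from_18 = residual_sd(n, r, sd_6)

print(f"(c) s_e = {se_18_from_6:.3f} inches")
print(f"(d) predicted height at 6 = {intercept_d:.2f} + {slope_d:.3f} * (height at 18)")
print(f"(d) s_e = {se_6_from_18:.3f} inches")
```

With \(n=66\), the factor \(\sqrt{(n-1)/(n-2)}\) is close to 1, so the approximate formula gives nearly the same values.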
A sample of 548 ethnically diverse students from Massachusetts were followed over a 19-month period from 1995 to 1997 in a study of the relationship between TV viewing and eating habits (Pediatrics [2003]: 1321-1326). For each additional hour of television viewed per day, the number of fruit and vegetable servings per day was found to decrease on average by \(0.14\) serving.

a. For this study, what is the dependent variable? What is the predictor variable?
b. Would the least-squares line for predicting number of servings of fruits and vegetables using number of hours spent watching TV as a predictor have a positive or negative slope? Explain.