Problem 13: Predicting weight

Predicting weight

Let's use multiple regression to predict total body weight (TBW, in pounds) using data from a study of female college athletes. Possible predictors are \(\mathrm{HGT}=\) height (in inches), \(\%\mathrm{BF}=\) percent body fat, and age. The display shows the correlation matrix for these variables.

a. Which explanatory variable gives by itself the best predictions of weight? Explain.

b. With height as the sole predictor, \(\hat{y}=-106+3.65(\mathrm{HGT})\) and \(r^{2}=0.55\). If you add %BF as a predictor, you know that \(R^{2}\) will be at least \(0.55\). Explain why.

c. When you add % body fat to the model, \(\hat{y}=-121+3.50(\mathrm{HGT})+1.35(\%\mathrm{BF})\) and \(R^{2}=0.66\). When you add age to the model, \(\hat{y}=-97.7+3.43(\mathrm{HGT})+1.36(\%\mathrm{BF})-0.960(\mathrm{AGE})\) and \(R^{2}=0.67\). Once you know height and % body fat, does age seem to help you in predicting weight? Explain, based on comparing the \(R^{2}\) values.

Short Answer

a. Height (HGT); b. Adding predictors can't decrease \(R^2\); c. Age provides minimal added predictive value.

Step by step solution

01

Determine the Best Predictor

To find the best single predictor of weight, read the TBW row (or column) of the correlation matrix and compare the correlations between weight and each candidate predictor. With a single predictor, \(r^2\) is the square of that correlation, so the variable with the largest absolute correlation with TBW explains the largest share of the variance in weight and is the best predictor on its own.
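
As a worked check that uses only the numbers given in part b, the correlation between height and weight can be recovered from \(r^2 = 0.55\). The sign is taken as positive because the fitted slope, 3.65, is positive:

$$ r = +\sqrt{r^{2}} = \sqrt{0.55} \approx 0.74 $$

For another variable to be the better single predictor, its correlation with TBW would need an absolute value above roughly 0.74.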
02

Understand Multiple Regression Basics

In multiple regression, adding a new predictor to the model never decreases \(R^2\). \(R^2\) is the proportion of the variance in the response variable that the predictors in the model explain. Least squares chooses the coefficients that make the residual sum of squares as small as possible, and the larger model can always reproduce the smaller model's fit by giving the new predictor a coefficient of 0. The residual sum of squares therefore cannot increase, so \(R^2\) cannot fall below its previous value of 0.55, even if the new predictor is not very helpful.
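
The argument above can be checked numerically. The following is a minimal sketch on synthetic data, not the athletes' data from the study: the sample size, the noise levels, and the helper name `r_squared` are assumptions made for illustration. It fits weight on height alone, then on height plus a predictor that is unrelated to weight by construction, and compares the two \(R^2\) values.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares coefficients
    resid = y - A @ coef
    sse = np.sum(resid ** 2)                       # residual sum of squares
    sst = np.sum((y - y.mean()) ** 2)              # total sum of squares
    return 1.0 - sse / sst

# Synthetic data (assumption): 60 athletes, weight loosely following
# the height-only equation from part b plus random scatter.
rng = np.random.default_rng(0)
n = 60
height = rng.normal(66, 3, n)
noise_predictor = rng.normal(0, 1, n)   # unrelated to weight by construction
weight = -106 + 3.65 * height + rng.normal(0, 10, n)

r2_small = r_squared(height.reshape(-1, 1), weight)
r2_large = r_squared(np.column_stack([height, noise_predictor]), weight)

print(f"height only:            R^2 = {r2_small:.4f}")
print(f"height + noise column:  R^2 = {r2_large:.4f}")
```

Even though the extra column carries no information about weight, the second \(R^2\) is never smaller than the first, which is why a small increase by itself is weak evidence that a predictor is useful.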
03

Analyze Improvement with Added Predictors

When %BF is added, the model becomes \(\hat{y} = -121 + 3.50(\mathrm{HGT}) + 1.35(\%\mathrm{BF})\), and \(R^2\) rises to 0.66. The increase in \(R^2\) from 0.55 to 0.66 indicates that %BF is a valuable predictor, as it explains additional variance in weight.
04

Evaluate the Contribution of Age

Adding AGE changes the model to \(\hat{y} = -97.7 + 3.43(\mathrm{HGT}) + 1.36(\%\mathrm{BF}) - 0.960(\mathrm{AGE})\) with \(R^2 = 0.67\). \(R^2\) increases only marginally, from 0.66 to 0.67. Because \(R^2\) cannot decrease when a predictor is added, a gain this small suggests that age provides minimal additional predictive power once height and %BF are in the model.
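
To see how little the three fitted equations differ in practice, they can be evaluated for a single athlete. This is a small sketch: the coefficients come from the exercise, while the function names and the example athlete (66 inches, 18% body fat, age 20) are hypothetical values chosen for illustration.

```python
def predict_height_only(hgt):
    """Predicted TBW (pounds) from height alone (part b)."""
    return -106 + 3.65 * hgt

def predict_height_bf(hgt, bf):
    """Predicted TBW (pounds) from height and percent body fat (part c)."""
    return -121 + 3.50 * hgt + 1.35 * bf

def predict_height_bf_age(hgt, bf, age):
    """Predicted TBW (pounds) from height, percent body fat, and age (part c)."""
    return -97.7 + 3.43 * hgt + 1.36 * bf - 0.960 * age

# Hypothetical athlete (assumption, not taken from the study's data).
hgt, bf, age = 66, 18, 20

print(f"height only:        {predict_height_only(hgt):.1f} lb")
print(f"height + %BF:       {predict_height_bf(hgt, bf):.1f} lb")
print(f"height + %BF + age: {predict_height_bf_age(hgt, bf, age):.1f} lb")
```

For this hypothetical athlete the three predictions fall within about one pound of each other. A single case says nothing about overall fit, though, so the \(R^2\) comparison remains the basis for the answer to part c.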

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Correlation Coefficient
In the context of multiple regression analysis, the correlation coefficient is a key statistic that measures the strength and direction of the linear relationship between two variables. It ranges from -1 to 1, where a value close to 1 or -1 indicates a strong linear relationship, and a value close to 0 indicates a weak linear relationship.
For example, if we look at the correlation between height (HGT) and total body weight (TBW) among female college athletes, we can determine how effectively height alone can predict weight. If the correlation coefficient between these two variables is close to 1 or -1, height is a good predictor on its own.
However, when you work with several predictors, a single high correlation does not tell the full story. What matters in a multiple regression model is how much each predictor adds beyond the others. A variable with a modest correlation with the response can still improve the model if it captures variation the other predictors miss, while a variable that is strongly correlated with a predictor already in the model may add very little.
  • The closer the correlation coefficient to ±1, the stronger the relationship.
  • A positive coefficient indicates both variables increase together.
  • A negative coefficient shows that one variable decreases as the other increases.
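
The correlation matrix from the exercise is not reproduced here, so the following sketch builds one from synthetic data to show how the comparison in part a works. The data, sample size, and variable names are assumptions for illustration only. It computes the matrix with `numpy.corrcoef` and picks the predictor whose correlation with TBW has the largest absolute value.

```python
import numpy as np

# Synthetic data (assumption): values only loosely resemble the exercise.
rng = np.random.default_rng(1)
n = 200
hgt = rng.normal(66, 3, n)      # height in inches
bf = rng.normal(18, 4, n)       # percent body fat
age = rng.normal(20, 1.5, n)    # age in years
tbw = -121 + 3.50 * hgt + 1.35 * bf + rng.normal(0, 8, n)

names = ["TBW", "HGT", "%BF", "AGE"]
corr = np.corrcoef([tbw, hgt, bf, age])   # each row is one variable

# Correlation of each candidate predictor with TBW (row 0 of the matrix).
with_tbw = {name: corr[0, i] for i, name in enumerate(names) if i != 0}
best = max(with_tbw, key=lambda k: abs(with_tbw[k]))

for name, r in with_tbw.items():
    print(f"r(TBW, {name}) = {r:+.2f}")
print("best single predictor:", best)
```

The same lookup applies to the printed matrix in the textbook: read along the TBW row and choose the entry farthest from 0.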
Coefficient of Determination (\(R^2\))
The coefficient of determination, denoted as \(R^2\), tells us how well the predictor variables can explain the variance in the dependent variable (in our case, total body weight).
In simple terms, \(R^2\) is the proportion of the variance in the dependent variable that is predictable from the independent variables. For example, \(R^2 = 0.55\) means that 55% of the variance in weight can be explained by height alone. This summarizes the predictive power of the model with height as the only predictor.
As more predictors are added, \(R^2\) never decreases. In our exercise, adding %BF increases \(R^2\) to 0.66, showing that the second variable improves the predictions.
Yet when age is added, the increase from 0.66 to 0.67 is minimal, which suggests that age offers little additional information once height and body fat are taken into account.
  • \(R^2\) values range from 0 to 1.
  • Higher \(R^2\) values mean a better fit of the model to data.
  • \(R^2\) will not decrease with the addition of more predictors.
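
One way to put the two increases on a common scale is to express each gain as a share of the variance the previous model left unexplained. This is a supplementary calculation from the \(R^2\) values in the exercise, not a step the exercise requires:

$$ \frac{0.66-0.55}{1-0.55}=\frac{0.11}{0.45}\approx 0.24, \qquad \frac{0.67-0.66}{1-0.66}=\frac{0.01}{0.34}\approx 0.03 $$

Adding %BF accounts for about 24% of the variance that height alone left unexplained, while adding age accounts for only about 3% of what height and %BF left unexplained.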
Predictor Variables
Predictor variables are the independent variables used in a regression model to predict the outcome of the dependent variable. In our exercise, the predictor variables include height (\(HGT\)), percent body fat (%BF), and age. These are evaluated to see how well they predict total body weight among female college athletes.
Each predictor variable can contribute to the predictive power of the regression model, but not equally. Height alone explains 55% of the variance in weight, and the model improves when percent body fat is included, raising \(R^2\) to 0.66. Age, by contrast, raises \(R^2\) only to 0.67, so it adds little once the other two predictors are in the model.
When choosing predictor variables:
  • Select variables based on strong theoretical backgrounds or previous research.
  • Be cautious of multicollinearity, which occurs when predictor variables are highly correlated among themselves. It makes the individual coefficient estimates unstable and hard to interpret, even when the model's overall predictions remain reasonable.
  • Evaluate the incremental value of each predictor using the change in \(R^2\) and other statistical tests.
Understanding and choosing the right set of predictor variables is integral to building a robust multiple regression model that accurately forecasts the dependent variable.
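
A common diagnostic for the multicollinearity mentioned in the list above is the variance inflation factor (VIF), which the exercise itself does not use. The sketch below is one way to compute it with NumPy, under the assumption that each predictor is regressed on the remaining predictors and \(\mathrm{VIF}_j = 1/(1 - R_j^2)\). The function name `vif` and the synthetic predictors are hypothetical.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n rows, p columns)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)             # all predictors except j
        A = np.column_stack([np.ones(n), others])    # add intercept
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2_j = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        factors.append(1.0 / (1.0 - r2_j))
    return np.array(factors)

# Synthetic predictors (assumption): x2 is nearly a copy of x1, x3 is independent.
rng = np.random.default_rng(2)
n = 300
x1 = rng.normal(0, 1, n)
x2 = x1 + 0.3 * rng.normal(0, 1, n)
x3 = rng.normal(0, 1, n)

vifs = vif(np.column_stack([x1, x2, x3]))
print("VIF for x1, x2, x3:", np.round(vifs, 2))
```

A VIF near 1 means a predictor is nearly uncorrelated with the others. Large values, as for the two nearly identical columns here, signal that the coefficients of those predictors are estimated imprecisely.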

Most popular questions from this chapter

Price, age, and horsepower In the previous exercise, \(r^{2}=0.66\) when age is the predictor and \(R^{2}=0.69\) when both age and HP are predictors. Why do you think that the predictions of price don't improve much when HP is added to the model? (The correlation between HP and price is \(r=0.56\), and the correlation between HP and age is \(r=-0.51\).)

Controlling has an effect The slope of \(x_{1}\) is not the same for multiple linear regression of \(y\) on \(x_{1}\) and \(x_{2}\) as compared to simple linear regression of \(y\) on \(x_{1},\) where \(x_{1}\) is the only predictor. Explain why you would expect this to be true. Does the statement change when \(x_{1}\) and \(x_{2}\) are uncorrelated?

Does leg press help predict body strength? Chapter 12 analyzed strength data for 57 female high school athletes. Upper body strength was summarized by the maximum number of pounds the athlete could bench press (denoted maxBP). This was predicted well by the number of times she could do a 60-pound bench press (denoted BP60). Can we predict maxBP even better if we also know how many times an athlete can perform a 200-pound leg press? The table shows results after adding this second predictor (denoted LP200) to the model. $$ \begin{array}{lcccc} \text { Term } & \text { Coef } & \text { SE Coef } & \text { T-Value } & \text { P-Value } \\ \text { Constant } & 60.60 & 2.87 & 21.10 & 0.000 \\ \text { BP60 } & 1.332 & 0.188 & 7.10 & 0.000 \\ \text { LP200 } & 0.211 & 0.152 & 1.39 & 0.171 \end{array} $$

An entrepreneur owns two filling stations - one at an inner city location and the other at an interstate exit location. He wants to compare the regressions of \(y=\) total daily revenue on \(x=\) number of customers who visit the filling station, for total revenue listed on a daily basis at the inner city location and at the interstate exit location. Explain how you can do this using regression modeling a. With a single model, having an indicator variable for location that assumes the slopes are the same for each location. b. With separate models for each location, permitting the slopes to be different.

Suppose you fit a straight-line regression model to \(x=\) time and \(y=\) population. Sketch what you would expect to observe for (a) the scatterplot of \(x\) and \(y\) and (b) a plot of the residuals against the values of time.
