Question: Adverse effects of hot-water runoff. The Environmental Protection Agency (EPA) wants to determine whether the hot-water runoff from a particular power plant located near a large gulf is having an adverse effect on the marine life in the area. The goal is to acquire a prediction equation for the number of marine animals located at certain designated areas, or stations, in the gulf. Based on past experience, the EPA considered the following environmental factors as predictors for the number of animals at a particular station:

x1 = Temperature of water (TEMP)

x2 = Salinity of water (SAL)

x3 = Dissolved oxygen content of water (DO)

x4 = Turbidity index, a measure of the turbidity of the water (TI)

x5 = Depth of the water at the station (ST_DEPTH)

x6 = Total weight of sea grasses in sampled area (TGRSWT)

As a preliminary step in the construction of this model, the EPA used a stepwise regression procedure to identify the most important of these six variables. A total of 716 samples were taken at different stations in the gulf, producing the SPSS printout shown below. (The response measured was y, the logarithm of the number of marine animals found in the sampled area.)

a. According to the SPSS printout, which of the six independent variables should be used in the model? (Use α = .10.)

b. Are we able to assume that the EPA has identified all the important independent variables for the prediction of y? Why?

c. Using the variables identified in part a, write the first-order model with interaction that may be used to predict y.

d. How would the EPA determine whether the model specified in part c is better than the first-order model?

e. Note the small value of R². What action might the EPA take to improve the model?

Short Answer

Answer

a. The variables which should be used in the model are ST_DEPTH, TGRSWT, and TI.

b. No. The EPA should not assume that it has identified all the important independent variables for predicting y. Stepwise regression performs a large number of t-tests, which inflates the overall probability of a Type I error, and it does not automatically consider higher-order terms (e.g., interactions and squared terms). The final model may therefore omit terms that are important for prediction.

c. Using the variables identified in part a, the first-order model with interaction can be written as E(y) = β0 + β1(ST_DEPTH) + β2(TGRSWT) + β3(TI) + β4(ST_DEPTH)(TGRSWT) + β5(TGRSWT)(TI) + β6(ST_DEPTH)(TI).

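d. To determine whether the model in part c is better than the first-order model, test H0: β4 = β5 = β6 = 0 against the alternative that at least one interaction parameter is nonzero. This is a nested-model (partial) F-test comparing the complete (interaction) model with the reduced (first-order) model; rejecting H0 indicates that the interaction terms contribute to the prediction of y.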

e. The R² values for the three models are 0.122, 0.182, and 0.187. These values are low: even the best model explains less than 19% of the sample variation in y, so the fit is poor. To improve the model, the EPA could consider other independent variables, or higher-order terms such as interactions and squared terms, that explain more of the variation in the data.

Step by step solution

01

Variable selection

From the SPSS printout, the p-values for ST_DEPTH, TGRSWT, and TI are all less than 0.050. At α = .10, the null hypothesis H0: β = 0 (the variable does not contribute to the prediction of y) is rejected whenever the p-value < α. All three p-values are below α, so all three β parameters are statistically significant.

The variables which should be used in the model are ST_DEPTH, TGRSWT, and TI.

02

Drawbacks of stepwise regression model

The EPA should not assume that it has identified all the important independent variables for predicting y. Stepwise regression performs a large number of t-tests, which inflates the overall probability of a Type I error, and it does not automatically consider higher-order terms (e.g., interactions and squared terms). The final model may therefore omit terms that are important for prediction.
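To see why many t-tests inflate the overall Type I error rate, the short Python sketch below computes the probability of at least one false rejection across k tests. It assumes the tests are independent, which is not strictly true in stepwise regression, so the numbers are illustrative only; the function name `familywise_error` is hypothetical.

```python
def familywise_error(alpha: float, num_tests: int) -> float:
    """Probability of at least one Type I error in num_tests tests.

    Assumption (for illustration only): the tests are independent,
    each carried out at significance level alpha.
    """
    return 1.0 - (1.0 - alpha) ** num_tests


# With six candidate predictors, step 1 of a stepwise procedure alone
# involves six t-tests at alpha = 0.10.
for k in (1, 3, 6):
    print(f"{k} tests: P(at least one Type I error) = {familywise_error(0.10, k):.3f}")
```

Even a handful of tests at α = .10 pushes the overall error rate well above .10, which is why the stepwise result should be treated as a screening step rather than a final model.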

03

First-order model with interaction

Using the variables identified in part a, the first-order model with interaction can be written as E(y) = β0 + β1(ST_DEPTH) + β2(TGRSWT) + β3(TI) + β4(ST_DEPTH)(TGRSWT) + β5(TGRSWT)(TI) + β6(ST_DEPTH)(TI).
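As a rough illustration of how this model is set up for fitting, the Python sketch below builds the seven-column design matrix (intercept, three main effects, three pairwise interactions) and fits it by least squares with NumPy. The data are synthetic and the coefficient values are made up for the example; they are not the EPA measurements, and the function name `interaction_design_matrix` is hypothetical.

```python
import numpy as np


def interaction_design_matrix(st_depth, tgrswt, ti):
    """Columns: 1, ST_DEPTH, TGRSWT, TI, and the three pairwise products."""
    st_depth = np.asarray(st_depth, dtype=float)
    tgrswt = np.asarray(tgrswt, dtype=float)
    ti = np.asarray(ti, dtype=float)
    return np.column_stack([
        np.ones_like(st_depth),   # beta0 (intercept)
        st_depth,                 # beta1
        tgrswt,                   # beta2
        ti,                       # beta3
        st_depth * tgrswt,        # beta4
        tgrswt * ti,              # beta5
        st_depth * ti,            # beta6
    ])


# Synthetic, noise-free data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 50
st_depth = rng.uniform(1, 10, n)
tgrswt = rng.uniform(1, 10, n)
ti = rng.uniform(1, 10, n)

X = interaction_design_matrix(st_depth, tgrswt, ti)
true_beta = np.array([1.0, 0.5, -0.2, 0.3, 0.05, -0.01, 0.02])
y = X @ true_beta

# Least squares estimates of beta0..beta6.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(np.round(beta_hat, 4))
```

Because the synthetic response has no noise, the least squares estimates reproduce the made-up coefficients; with real data the estimates would carry sampling error.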

04

Significance of interaction term

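To determine whether the model in part c is better than the first-order model, test H0: β4 = β5 = β6 = 0 against the alternative that at least one interaction parameter is nonzero. This is a nested-model (partial) F-test: fit both the reduced (first-order) model and the complete (interaction) model, then compute F = [(SSE_R - SSE_C)/3] / [SSE_C/(n - 7)], where n = 716 and 7 is the number of β parameters in the complete model. If F exceeds the critical value (or its p-value is less than α), reject H0 and conclude that the interaction terms contribute to the prediction of y.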
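The comparison of the two models can be sketched in Python as below. This is a minimal illustration on synthetic data, not the EPA data set: the sample size, coefficients, and helper names `sse` and `nested_f` are all assumptions made for the example. It computes the nested-model F statistic from the error sums of squares of the reduced and complete fits.

```python
import numpy as np


def sse(X, y):
    """Sum of squared errors from a least squares fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta
    return float(residuals @ residuals)


def nested_f(X_reduced, X_complete, y):
    """Nested-model F statistic with its numerator and denominator df."""
    n, p_complete = X_complete.shape           # p counts the intercept
    num_tested = p_complete - X_reduced.shape[1]
    df_error = n - p_complete
    sse_r = sse(X_reduced, y)
    sse_c = sse(X_complete, y)
    f_stat = ((sse_r - sse_c) / num_tested) / (sse_c / df_error)
    return f_stat, num_tested, df_error


# Synthetic data with one real interaction effect (illustration only).
rng = np.random.default_rng(1)
n = 200
depth = rng.uniform(1, 10, n)
grass = rng.uniform(1, 10, n)
turb = rng.uniform(1, 10, n)

X_reduced = np.column_stack([np.ones(n), depth, grass, turb])
X_complete = np.column_stack(
    [X_reduced, depth * grass, grass * turb, depth * turb]
)
y = (1 + 0.5 * depth - 0.2 * grass + 0.3 * turb
     + 0.4 * depth * grass + rng.normal(0, 1, n))

f_stat, df_num, df_den = nested_f(X_reduced, X_complete, y)
print(f"F = {f_stat:.1f} on {df_num} and {df_den} df")
```

The resulting F would be compared with the upper-α critical value of the F distribution with the printed degrees of freedom; a large F favors the interaction model.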

05

Interpretation of R²

The R² values for the three models are 0.122, 0.182, and 0.187. These values are low: even the best model explains less than 19% of the sample variation in y, so the fit is poor. To improve the model, the EPA could consider other independent variables, or higher-order terms such as interactions and squared terms, that explain more of the variation in the data.
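For reference, the interpretation of the largest reported value can be worked out from the usual definition of the coefficient of determination. The symbols SSE (sum of squared errors) and SS_yy (total sum of squares of y) are standard textbook notation and do not appear in the original solution.

```latex
R^2 = 1 - \frac{SSE}{SS_{yy}}, \qquad
R^2 = 0.187 \;\Rightarrow\; \frac{SSE}{SS_{yy}} = 1 - 0.187 = 0.813
```

In other words, roughly 81% of the sample variation in y remains unexplained by that model.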

Most popular questions from this chapter

Question: Accuracy of software effort estimates. Periodically, software engineers must provide estimates of their effort in developing new software. In the Journal of Empirical Software Engineering (Vol. 9, 2004), multiple regression was used to predict the accuracy of these effort estimates. The dependent variable, defined as the relative error in estimating effort, y = (Actual effort - Estimated effort)/ (Actual effort) was determined for each in a sample of n = 49 software development tasks. Eight independent variables were evaluated as potential predictors of relative error using stepwise regression. Each of these was formulated as a dummy variable, as shown in the table.

Company role of estimator: x1 = 1 if developer, 0 if project leader

Task complexity: x2 = 1 if low, 0 if medium/high

Contract type: x3 = 1 if fixed price, 0 if hourly rate

Customer importance: x4 = 1 if high, 0 if low/medium

Customer priority: x5 = 1 if time of delivery, 0 if cost or quality

Level of knowledge: x6 = 1 if high, 0 if low/medium

Participation: x7 = 1 if estimator participates in work, 0 if not

Previous accuracy: x8 = 1 if more than 20% accurate, 0 if less than 20% accurate

a. In step 1 of the stepwise regression, how many different one-variable models are fit to the data?

b. In step 1, the variable x1 is selected as the best one- variable predictor. How is this determined?

c. In step 2 of the stepwise regression, how many different two-variable models (where x1 is one of the variables) are fit to the data?

d. The only two variables selected for entry into the stepwise regression model were x1 and x8. The stepwise regression yielded the following prediction equation:

Give a practical interpretation of the β estimates multiplied by x1 and x8.

e. Why should a researcher be wary of using the model, part d, as the final model for predicting effort (y)?

Reality TV and cosmetic surgery. Refer to the Body Image: An International Journal of Research (March 2010) study of the impact of reality TV shows on a college student's decision to undergo cosmetic surgery, Exercise 12.17 (p. 725). Recall that the data for the study (simulated based on statistics reported in the journal article) are saved in the file. Consider the interaction model, E(y) = β0 + β1x1 + β2x4 + β3x1x4, where y = desire to have cosmetic surgery (25-point scale), x1 = {1 if male, 0 if female}, and x4 = impression of reality TV (7-point scale). The model was fit to the data and the resulting SPSS printout appears below.

a. Give the least squares prediction equation.

b. Find the predicted level of desire (y) for a male college student with an impression-of-reality-TV-scale score of 5.

c. Conduct a test of overall model adequacy. Use α = 0.10.

d. Give a practical interpretation of the adjusted R² (R²a).

e. Give a practical interpretation of s.

f. Conduct a test (at α = 0.10) to determine if gender (x1) and impression of reality TV show (x4) interact in the prediction of level of desire for cosmetic surgery (y).

Going for it on fourth down in the NFL. Refer to the Chance (Winter 2009) study of fourth-down decisions by coaches in the National Football League (NFL), Exercise 11.69 (p. 679). Recall that statisticians at California State University, Northridge, fit a straight-line model for predicting the number of points scored (y) by a team that has a first-down with a given number of yards (x) from the opposing goal line. A second model fit to data collected on five NFL teams from a recent season was the quadratic regression model, E(y) = β0 + β1x + β2x². The regression yielded the following results: ŷ = 6.13 + 0.141x - 0.0009x², R² = 0.226.

a) If possible, give a practical interpretation of each of the β estimates in the model.

b) Give a practical interpretation of the coefficient of determination, R².

c) In Exercise 11.63, the coefficient of determination for the straight-line model was reported as R² = 0.18. Does this statistic alone indicate that the quadratic model is a better fit than the straight-line model? Explain.

d) What test of hypothesis would you conduct to determine if the quadratic model is a better fit than the straight-line model?

Suppose the mean value E(y) of a response y is related to the quantitative independent variables x1 and x2

E(y) = 2 + x1 - 3x2 - x1x2

a) Identify and interpret the slope for x2.

b) Plot the linear relationship between E(y) and x2 for x1 = 0, 1, 2, where 1 ≤ x2 ≤ 3.

c) How would you interpret the estimated slopes?

d) Use the lines you plotted in part b to determine the changes in E(y) for each x1 = 0, 1, 2.

e) Use your graph from part b to determine how much E(y) changes when 3 ≤ x1 ≤ 5 and 1 ≤ x2 ≤ 3.

Cooling method for gas turbines. Refer to the Journal of Engineering for Gas Turbines and Power (January 2005) study of a high-pressure inlet fogging method for a gas turbine engine, Exercise 12.19 (p. 726). Recall that you fit a first-order model for heat rate (y) as a function of speed (x1), inlet temperature (x2), exhaust temperature (x3), cycle pressure ratio (x4), and airflow rate (x5). A Minitab printout with both a 95% confidence interval for E(y) and prediction interval for y for selected values of the x's is shown below.

a. Interpret the 95% prediction interval for y in the words of the problem.

b. Interpret the 95% confidence interval for E(y) in the words of the problem.

c. Will the confidence interval for E(y) always be narrower than the prediction interval for y? Explain.
