Problem 34


An article in the Journal of the American Ceramics Society (1992, Vol. 75, pp. 112-116) describes a process for immobilizing chemical or nuclear wastes in soil by dissolving the contaminated soil into a glass block. The authors mix \(\mathrm{CaO}\) and \(\mathrm{Na}_{2}\mathrm{O}\) with soil and model viscosity and electrical conductivity. The electrical conductivity model involves six regressors, and the sample consists of \(n=14\) observations. (a) For the six-regressor model, suppose that \(SS_r=0.50\) and \(R^2=0.94\). Find \(SS_E\) and \(SS_R\), and use this information to test for significance of regression with \(\alpha=0.05\). What are your conclusions? (b) Suppose that one of the original regressors is deleted from the model, resulting in \(R^2=0.92\). What can you conclude about the contribution of the variable that was removed? Answer this question by calculating an \(F\)-statistic. (c) Does deletion of the regressor variable in part (b) result in a smaller value of \(MS_E\) for the five-variable model, in comparison to the original six-variable model? Comment on the significance of your answer.

Short Answer

Expert verified
a) Significant regression (\(F \approx 18.3 > 3.87\)); b) removed variable's contribution is not significant (\(F \approx 2.33 < 5.59\)); c) \(MS_E\) increases slightly for the five-variable model.

Step by step solution

01

Understanding the Setup

We have a model with \( p = 6 \) regressors and a sample size of \( n = 14 \). The regression sum of squares is \( SS_R = 0.50 \) and \( R^2 = 0.94 \), meaning the model explains 94% of the total variability in the response.
02

Calculate Sum of Squares Errors (SSE)

Using the formula for \( R^2 \), \( R^2 = \frac{SS_R}{SS_T} \), where \( SS_T = SS_R + SS_E \). Solving for the total sum of squares gives \( SS_T = \frac{SS_R}{R^2} = \frac{0.50}{0.94} = 0.5319 \), and therefore \( SS_E = SS_T - SS_R = 0.5319 - 0.50 = 0.0319 \).
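The bookkeeping above can be checked with a few lines of code. This is a minimal sketch using the values from part (a); the helper name `sums_of_squares` is ours, not from the text.

```python
# Recover SS_T and SS_E from SS_R and R^2 = SS_R / SS_T.
# Assumed inputs from part (a): SS_R = 0.50, R^2 = 0.94.

def sums_of_squares(ss_r: float, r2: float):
    """Return (SS_T, SS_E) implied by SS_R and R^2."""
    ss_t = ss_r / r2       # SS_T = SS_R / R^2
    ss_e = ss_t - ss_r     # SS_E = SS_T - SS_R
    return ss_t, ss_e

ss_t, ss_e = sums_of_squares(0.50, 0.94)
print(round(ss_t, 4), round(ss_e, 4))   # 0.5319 0.0319
```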
03

Test Significance of Regression

To test the significance of regression, we use the \( F \)-statistic \( F = \frac{SS_R / p}{SS_E / (n - p - 1)} \), where \( p = 6 \) is the number of regressors, so the error degrees of freedom are \( n - p - 1 = 7 \). Calculate \( MS_R = 0.50 / 6 = 0.0833 \) and \( MS_E = 0.0319 / 7 = 0.00456 \). Thus \( F = \frac{0.0833}{0.00456} \approx 18.3 \). Since \( F_{0.05,6,7} = 3.87 \) and the computed \( F \)-statistic is greater, we reject the null hypothesis: the regression is significant.
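As a sketch, the overall F-test reduces to a few arithmetic steps; the function name `overall_f` is illustrative, and the critical value is taken from an F table rather than computed.

```python
# Overall F-test for significance of regression, using part (a) values.
def overall_f(ss_r: float, ss_e: float, p: int, n: int) -> float:
    """F = MS_R / MS_E for a model with p regressors and n observations."""
    ms_r = ss_r / p               # regression mean square
    ms_e = ss_e / (n - p - 1)     # error mean square
    return ms_r / ms_e

ss_e = 0.03 / 0.94                # SS_E from part (a), about 0.0319
f_stat = overall_f(0.50, ss_e, p=6, n=14)
F_CRIT = 3.87                     # tabulated F_{0.05,6,7}

print(round(f_stat, 2), f_stat > F_CRIT)   # 18.28 True -> reject H0
```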
04

Evaluate the Impact of Removing a Regressor

After removing one regressor, \( R^2 = 0.92 \). The partial \( F \)-statistic for the deleted variable compares the drop in \( R^2 \) to the full model's unexplained variation per error degree of freedom: \( F = \frac{(R^2_{full} - R^2_{red})/1}{(1 - R^2_{full})/(n - p - 1)} = \frac{0.02}{0.06/7} \approx 2.33 \). Equivalently, in sums of squares, \( F = \frac{SS_E(\text{red}) - SS_E(\text{full})}{MS_E(\text{full})} = \frac{0.0426 - 0.0319}{0.00456} \approx 2.33 \). With \( F_{0.05,1,7} = 5.59 \), the calculated \( F \) is below the critical value, so the contribution of the removed variable is not statistically significant.
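The partial F-test in its \( R^2 \) form can be sketched directly; `partial_f` is our name for the helper, with the problem's values assumed.

```python
# Partial F-test for r deleted regressors, R^2 form:
# F = ((R2_full - R2_red)/r) / ((1 - R2_full)/(n - p_full - 1))
def partial_f(r2_full: float, r2_red: float, p_full: int, n: int, r: int = 1) -> float:
    num = (r2_full - r2_red) / r               # drop in explained fraction per variable
    den = (1.0 - r2_full) / (n - p_full - 1)   # unexplained fraction per error df
    return num / den

f_partial = partial_f(0.94, 0.92, p_full=6, n=14)
print(round(f_partial, 2))   # 2.33, below the tabulated F_{0.05,1,7} = 5.59
```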
05

Comparison of MSE Values

For the six-variable model, \( MS_E = 0.0319/7 = 0.00456 \). For the five-variable model, \( SS_E = (1 - 0.92)(0.5319) = 0.0426 \) with \( 14 - 5 - 1 = 8 \) error degrees of freedom, so \( MS_E = 0.0426/8 = 0.00532 \). Deleting the regressor therefore increases \( MS_E \) slightly: although its contribution was not statistically significant in part (b), the variable explained enough variability to offset the error degree of freedom it consumed.
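The comparison comes down to dividing each model's \( SS_E \) by its own error degrees of freedom; a short sketch, assuming the sums of squares derived above.

```python
# MS_E for the six- and five-variable models (values from parts (a)-(b) assumed).
ss_t = 0.50 / 0.94                              # total sum of squares, ~0.5319

ms_e_full = (1 - 0.94) * ss_t / (14 - 6 - 1)    # six regressors, 7 error df
ms_e_red  = (1 - 0.92) * ss_t / (14 - 5 - 1)    # five regressors, 8 error df

# Deleting the regressor slightly increases the error mean square.
print(round(ms_e_full, 5), round(ms_e_red, 5))  # 0.00456 0.00532
```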


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Significance Testing in Multiple Linear Regression
Significance testing in multiple linear regression allows us to determine if our model explains enough variability to be statistically meaningful. We aim to decide whether the relationship observed between the dependent and independent variables is significant or just due to random chance.

When testing for significance, we formulate a null hypothesis (usually denoted as \( H_0 \)) that suggests no relationship between the variables, implying that all regression coefficients are equal to zero. The alternative hypothesis (\( H_1 \)) indicates that at least one variable has a meaningful contribution.

To test these hypotheses, we use the \( F \)-statistic, which is a ratio of two variances: the variance explained by the model to the variance within the residual errors. This statistic follows an \( F \)-distribution. If the calculated \( F \)-value is greater than the critical value from the \( F \)-distribution table, we reject the null hypothesis, indicating that the model significantly accounts for the variability in the dependent variable.
Model Comparison and the Role of Regression Variables
Model comparison involves evaluating different versions of a model to determine which one explains the data most adequately. By comparing models, we can assess the contribution of each variable to the overall model performance.

Suppose in a regression model, we start with a set of predictors and evaluate its potential by examining changes in the \( R^2 \) value (which reflects the proportion of variance explained by the model). When a variable is removed from a regression model, a change in the \( R^2 \) value occurs. If this change is small, it suggests the variable might not have made a significant contribution.

To quantify this, an \( F \)-test for variable inclusion/exclusion helps determine if removing a predictor significantly deteriorates the model fit. If the calculated \( F \)-statistic for the change in \( R^2 \) is lower than the critical value, the variable removal is not considered statistically significant. This systematic approach ensures the simplest model without sacrificing predictive accuracy.
Understanding R-squared Analysis
R-squared, denoted as \( R^2 \), is a measure used to assess the goodness of fit of a regression model. It represents the proportion of variance in the dependent variable that is predictable from the independent variables.

With values ranging from 0 to 1, an \( R^2 \) of 1 indicates a perfectly fitting model, while an \( R^2 \) of 0 indicates the model does not explain any of the variability. In simpler terms, \( R^2 \) tells us how well the independent variables predict the outcome variable.

In practice, adding more variables can increase \( R^2 \), but it's essential not to add variables that do not improve the model significantly. This is where the adjusted \( R^2 \) becomes useful, as it accounts for the number of predictors relative to the sample size, ensuring only meaningful improvements in the model.
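The adjusted \( R^2 \) formula is \( R^2_{adj} = 1 - (1 - R^2)\frac{n-1}{n-p-1} \). A minimal sketch, applied to the two models from this problem (the function name is ours):

```python
# Adjusted R^2 penalizes extra regressors by charging for degrees of freedom:
# R2_adj = 1 - (1 - R^2) * (n - 1) / (n - p - 1)
def adjusted_r2(r2: float, n: int, p: int) -> float:
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

print(round(adjusted_r2(0.94, 14, 6), 3))   # 0.889  six-variable model
print(round(adjusted_r2(0.92, 14, 5), 3))   # 0.87   five-variable model
```

For the numbers in this problem, the adjusted \( R^2 \) of the six-variable model is higher, so the penalty for the sixth regressor is outweighed by the variability it explains.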
The Role of the F-statistic in Model Significance
The \( F \)-statistic plays a crucial role in determining the overall significance of a regression model. It helps us understand whether the variations accounted for by the model are substantial enough to be of statistical importance.

The \( F \)-statistic is calculated by dividing the mean square of the regression (MSR) by the mean square of the error (MSE). This ratio measures the model's ability to explain variance against the variability left unexplained.

A higher \( F \)-value indicates a more statistically significant model. By comparing this value against a critical value from the \( F \)-distribution table, we determine if the model's explanatory power is meaningful. If the calculated \( F \)-statistic exceeds the table value, we conclude that the model as a whole has significant predictive power over the response variable.


Most popular questions from this chapter

An article in Optical Engineering ["Operating Curve Extraction of a Correlator's Filter" (2004, Vol. 43, pp. 2775-2779)] reported on use of an optical correlator to perform an experiment by varying brightness and contrast. The resulting modulation is characterized by the useful range of gray levels. The data are shown below: \(\begin{array}{l|rrrrrrrrr}\text{Brightness (\%):} & 54 & 61 & 65 & 100 & 100 & 100 & 50 & 57 & 54 \\ \text{Contrast (\%):} & 56 & 80 & 70 & 50 & 65 & 80 & 25 & 35 & 26 \\ \text{Useful range (ng):} & 96 & 50 & 50 & 112 & 96 & 80 & 155 & 144 & 255\end{array}\) (a) Fit a multiple linear regression model to these data. (b) Estimate \(\sigma^{2}\). (c) Compute the standard errors of the regression coefficients. (d) Predict the useful range when brightness \(=80\) and contrast \(=75\).

A multiple regression model was used to relate \(y=\) viscosity of a chemical product to \(x_{1}=\) temperature and \(x_{2}=\) reaction time. The data set consisted of \(n=15\) observations. (a) The estimated regression coefficients were \(\hat{\beta}_{0}=300.00\), \(\hat{\beta}_{1}=0.85,\) and \(\hat{\beta}_{2}=10.40 .\) Calculate an estimate of mean viscosity when \(x_{1}=100^{\circ} \mathrm{F}\) and \(x_{2}=2\) hours. (b) The sums of squares were \(S S_{T}=1230.50\) and \(S S_{E}=\) \(120.30 .\) Test for significance of regression using \(\alpha=\) 0.05. What conclusion can you draw? (c) What proportion of total variability in viscosity is accounted for by the variables in this model? (d) Suppose that another regressor, \(x_{3}=\) stirring rate, is added to the model. The new value of the error sum of squares is \(S S_{E}=117.20 .\) Has adding the new variable resulted in a smaller value of \(M S_{E}\) ? Discuss the significance of this result. (e) Calculate an \(F\) -statistic to assess the contribution of \(x_{3}\) to the model. Using \(\alpha=0.05,\) what conclusions do you reach?

A regression model is to be developed for predicting the ability of soil to absorb chemical contaminants. Ten observations have been taken on a soil absorption index \((y)\) and two regressors: \(x_{1}=\) amount of extractable iron ore and \(x_{2}=\) amount of bauxite. We wish to fit the model \(Y=\beta_{0}+\beta_{1} x_{1}+\) \(\beta_{2} x_{2}+\epsilon\). Some necessary quantities are. $$ \begin{array}{c} \left(\mathbf{X}^{\prime} \mathbf{X}\right)^{-1}=\left[\begin{array}{ccc} 1.17991 & -7.30982 \mathrm{E}-3 & 7.3006 \mathrm{E}-4 \\ -7.30982 \mathrm{E}-3 & 7.9799 \mathrm{E}-5 & -1.23713 \mathrm{E}-4 \\ 7.3006 \mathrm{E}-4 & -1.23713 \mathrm{E}-4 & 4.6576 \mathrm{E}-4 \end{array}\right] \\ \mathbf{X}^{\prime} \mathbf{y}=\left[\begin{array}{r} 220 \\ 36,768 \\ 9,965 \end{array}\right] \end{array} $$ (a) Estimate the regression coefficients in the model specified above. (b) What is the predicted value of the absorption index \(y\) when \(x_{1}=200\) and \(x_{2}=50 ?\)

You have fit a regression model with two regressors to a data set that has 20 observations. The total sum of squares is 1000 and the model sum of squares is 750 . (a) What is the value of \(R^{2}\) for this model? (b) What is the adjusted \(R^{2}\) for this model? (c) What is the value of the \(F\) -statistic for testing the significance of regression? What conclusions would you draw about this model if \(\alpha=0.05 ?\) What if \(\alpha=0.01 ?\) (d) Suppose that you add a third regressor to the model and as a result the model sum of squares is now \(785 .\) Does it seem to you that adding this factor has improved the model?

An article entitled "A Method for Improving the Accuracy of Polynomial Regression Analysis" in the Journal of Quality Technology (1971, pp. 149-155) reported the following data on \(y=\) ultimate shear strength of a rubber compound (psi) and \(x=\) cure temperature \(\left({ }^{\circ} F\right)\). \( \begin{array}{c|c|c|c|c} y & 770 & 800 & 840 & 810 \\ \hline x & 280 & 284 & 292 & 295 \\ y & 735 & 640 & 590 & 560 \\ \hline x & 298 & 305 & 308 & 315 \end{array} \) (a) Fit a second-order polynomial to these data. (b) Test for significance of regression using \(\alpha=0.05\). (c) Test the hypothesis that \(\beta_{11}=0\) using \(\alpha=0.05\). (d) Compute the residuals from part (a) and use them to evaluate model adequacy.
