Q170SE Question: Household food consump...

Question: Household food consumption. The data in the table below were collected for a random sample of 26 households in Washington, D.C. An economist wants to relate household food consumption, $y$, to household income, $x_1$, and household size, $x_2$, with the first-order model

$E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$

  1. Fit the model to the data. Do you detect any signs of multicollinearity in the data? Explain.
  2. Is there visual evidence (from a residual plot) that a second-order model may be more appropriate for predicting household food consumption? Explain.
  3. Comment on the assumption of constant error variance, using a residual plot. Does it appear to be satisfied?
  4. Are there any outliers in the data? If so, identify them.
  5. Based on a graph of the residuals, does the assumption of normal errors appear to be reasonably satisfied? Explain.

Short Answer

Answers

  1. The estimated coefficient on household income is negative, although food consumption would logically be expected to increase with income. This counterintuitive sign might indicate multicollinearity.
  2. From the residual plot, it can be seen that a second-order model is more appropriate for the data.
  3. The error variance does not look constant in the residual plot: the residuals are close together for the early observations, while the spread increases for the later observations.
  4. Observation 26 is an outlier, as its residual value is 2.789.
  5. The assumption of normal errors does not appear to be satisfied, as the graph shows that the error variance is not constant.

Step by step solution

01

Given information  

The sample contains 26 households, and the first-order model is given as $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2$.

02

Model fitting 

a.

The question provides data for 26 households on food consumption, $y$, household income, $x_1$, and household size, $x_2$. The Excel summary output is attached below. As a sign of multicollinearity, the estimated coefficient on household income is negative, although logically a household's consumption would increase with its income. This might indicate the existence of multicollinearity.
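A counterintuitive sign is only an indirect symptom, so it can help to look at the correlation between the two predictors directly. The sketch below is not part of the original solution: it computes the Pearson correlation between $x_1$ and $x_2$ and the corresponding variance inflation factor, which for a two-predictor model is $1/(1 - r^2)$. The function names and the data values are hypothetical, since the textbook table is not reproduced here.

```python
from math import sqrt

def pearson_r(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    return sab / sqrt(saa * sbb)

def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical values for illustration only; NOT the textbook data.
income = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
size = [1, 2, 2, 3, 4, 5]

r = pearson_r(income, size)
vif = vif_two_predictors(income, size)
print(f"r = {r:.3f}, VIF = {vif:.2f}")
```

A correlation near 1 in absolute value, or a large VIF, would support the multicollinearity explanation; values near 0 and 1 respectively would argue against it.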

The model can be fitted using the Data Analysis function in Excel. The values of $y$, $x_1$, and $x_2$ are taken from the Excel table, and the regression model is fitted using the Data Analysis function on the Data tab. This function automatically gives the summary output of the model once the dependent variable, $y$, and the independent variables, $x_1$ and $x_2$, have been specified.
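For readers without Excel, the same least-squares fit can be sketched in plain Python by solving the normal equations $(X'X)b = X'y$. This is a minimal illustration under stated assumptions, not the method used in the original solution: the helper names are hypothetical, and the data are generated from a known equation so that the result can be checked, because the textbook table is not reproduced here.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations apply to both sides at once.
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_first_order(y, x1, x2):
    """Least-squares estimates (b0, b1, b2) for E(y) = b0 + b1*x1 + b2*x2."""
    rows = [(1.0, u, v) for u, v in zip(x1, x2)]
    # Normal equations: (X'X) b = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)

# Hypothetical data generated exactly from y = 1 + 0.05*x1 + 0.8*x2
# (NOT the textbook data), so the fit should recover these coefficients.
x1 = [20.0, 35.0, 50.0, 65.0, 80.0, 95.0]
x2 = [1.0, 3.0, 2.0, 5.0, 4.0, 2.0]
y = [1 + 0.05 * u + 0.8 * v for u, v in zip(x1, x2)]

b0, b1, b2 = fit_first_order(y, x1, x2)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

In practice a numerical library would be the usual choice; the hand-rolled solver is used here only to keep the sketch dependency-free.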

For the ANOVA table, first calculate the mean of the dependent variable, then SSR, SSE, and SST, followed by the degrees of freedom, the mean squares, and the F statistic.

SSR is calculated as $\sum_i (\hat{y}_i - \bar{y})^2$, and SSE is calculated by squaring each residual and adding them all, $\sum_i (y_i - \hat{y}_i)^2$. SST is the sum of SSR and SSE. MS regression is calculated by dividing SSR by the regression degrees of freedom (here $k = 2$), MS residual is calculated by dividing SSE by the residual degrees of freedom (here $n - k - 1 = 26 - 2 - 1 = 23$), and F is calculated by dividing MS regression by MS residual.
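The ANOVA quantities can be computed directly from the observed and fitted values. The sketch below is an illustration only: the function name `anova_f` is hypothetical, and the toy data use a single predictor ($k = 1$) with hand-computed least-squares fitted values rather than the household data, for which $k = 2$ and $n = 26$.

```python
def anova_f(y, y_hat, k):
    """Return (SSR, SSE, SST, MSR, MSE, F) for a least-squares fit with k predictors."""
    n = len(y)
    y_bar = sum(y) / n
    ssr = sum((fit - y_bar) ** 2 for fit in y_hat)         # explained variation
    sse = sum((yi - fit) ** 2 for yi, fit in zip(y, y_hat))  # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
    msr = ssr / k                # regression df = k
    mse = sse / (n - k - 1)      # residual df = n - k - 1
    return ssr, sse, sst, msr, mse, msr / mse

# Toy data (NOT the textbook data): y regressed on x = 1..5,
# whose least-squares line is y_hat = 0.3 + 0.9 * x.
y = [1.0, 2.0, 4.0, 3.0, 5.0]
y_hat = [0.3 + 0.9 * x for x in (1, 2, 3, 4, 5)]

ssr, sse, sst, msr, mse, f_stat = anova_f(y, y_hat, k=1)
print(f"SSR={ssr:.3f} SSE={sse:.3f} SST={sst:.3f} F={f_stat:.3f}")
```

The identity SST = SSR + SSE holds only when the fitted values come from a least-squares fit that includes an intercept, which is the case in this toy example.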

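For a single predictor $x$, the slope coefficient is calculated with the formula $b_1 = \frac{n\sum xy - \sum x \sum y}{n\sum x^2 - (\sum x)^2}$, whereas the intercept is calculated as $b_0 = \frac{\sum y \sum x^2 - \sum x \sum xy}{n\sum x^2 - (\sum x)^2}$. With two predictors, as in this model, the coefficients are estimated jointly by least squares, so these single-predictor formulas do not apply to $x_1$ and $x_2$ separately.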
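As a quick check of the single-predictor formulas, the sketch below evaluates them on a small toy data set. The function name `simple_ols` and the data are hypothetical and serve only to show the arithmetic.

```python
def simple_ols(x, y):
    """Closed-form intercept and slope for a one-predictor least-squares line."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_xx = sum(a * a for a in x)
    denom = n * sum_xx - sum_x ** 2
    slope = (n * sum_xy - sum_x * sum_y) / denom
    intercept = (sum_y * sum_xx - sum_x * sum_xy) / denom
    return intercept, slope

# Toy data for illustration only (NOT the textbook data).
x = [1, 2, 3, 4, 5]
y = [1, 2, 4, 3, 5]

intercept, slope = simple_ols(x, y)
print(intercept, slope)
```

Here $n = 5$, $\sum x = 15$, $\sum y = 15$, $\sum xy = 54$, and $\sum x^2 = 55$, so the slope is $(270 - 225)/50 = 0.9$ and the intercept is $(825 - 810)/50 = 0.3$.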

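The standard error of a sample mean is calculated by dividing the standard deviation by the square root of the sample size. The standard errors of the coefficient estimates are reported in the Excel summary output.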

The Excel summary output is attached here.

03

Residual plot

b.

Residual plots are used to check the following assumptions about the random error:

  • Mean $\varepsilon$ = 0 - A residual plot can detect a model in which the hypothesized relationship between $E(y)$ and an independent variable $x$ is misspecified. The assumption of a mean error of 0 is violated in these types of models.
  • Constant error variance - Residual plots can also be used to detect violations of the assumption of constant error variance.
  • Errors normally distributed - Several graphical methods are available for assessing whether the random error $\varepsilon$ has an approximately normal distribution. If the assumption of normally distributed errors is satisfied, then we expect approximately 95% of the residuals to fall within 2 standard deviations of the mean of 0, and almost all of the residuals to lie within 3 standard deviations of the mean of 0.
  • Errors independent - The assumption of independent errors is violated when successive errors are correlated.
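The independence assumption in the last bullet can also be examined numerically. One common choice, which is an addition here and is not used in the original solution, is the Durbin-Watson statistic $d = \sum_{t=2}^{n}(e_t - e_{t-1})^2 / \sum_{t=1}^{n} e_t^2$. Values near 2 suggest little lag-1 correlation, values near 0 suggest positive correlation, and values near 4 suggest negative correlation. The residual sequences below are made up for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic for residuals listed in observation order."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e * e for e in residuals)
    return numerator / denominator

# Made-up residual sequences (NOT from the textbook data).
alternating = [1.0, -1.0, 1.0, -1.0]   # successive errors flip sign
clustered = [1.0, 1.0, -1.0, -1.0]     # successive errors tend to agree

d_alternating = durbin_watson(alternating)
d_clustered = durbin_watson(clustered)
print(d_alternating, d_clustered)
```

The statistic is only meaningful when the observations have a natural order, such as time.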

From the residual plot, it can be seen that a second-order model is more appropriate for the data.

The graph can be drawn by plotting the residuals, calculated as $y - \hat{y}$, on the y-axis against the observation number on the x-axis. After plotting the individual points, a line can be drawn to reflect the relationship between the two.
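The points for such a plot can be prepared as pairs of observation number and residual. The sketch below is illustrative only: the function name is hypothetical, and the toy data (a response following $y = x^2$ for $x = 1, \dots, 5$, with its hand-computed least-squares line $\hat{y} = -7 + 6x$) are chosen to show the U-shaped pattern that suggests a missing second-order term. Drawing the actual chart would need a plotting tool, which is not shown.

```python
def residuals_by_observation(y, y_hat):
    """Pair each observation number (starting at 1) with its residual y - y_hat."""
    return [(i, yi - fit) for i, (yi, fit) in enumerate(zip(y, y_hat), start=1)]

# Toy data (NOT the textbook data): curved response and straight-line fit.
y = [1.0, 4.0, 9.0, 16.0, 25.0]        # y = x^2 for x = 1..5
y_hat = [-1.0, 5.0, 11.0, 17.0, 23.0]  # least-squares line -7 + 6x

pairs = residuals_by_observation(y, y_hat)
for obs, resid in pairs:
    print(obs, resid)
```

In this toy case the residuals are positive at both ends and negative in the middle, the kind of systematic curvature that a first-order model leaves behind.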

04

Constant error variance assumption

c.

The error variance does not look constant in the residual plot: the residuals are close together for the early observations, while the spread of the residuals increases for the later observations.
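A rough numerical companion to this visual check is to compare the average squared residual in the earlier half of the observations with that in the later half. This is an informal sketch under stated assumptions, not a formal test and not part of the original solution; the function name and the residual values are made up.

```python
def spread_ratio(residuals):
    """Ratio of mean squared residual in the later half to the earlier half."""
    half = len(residuals) // 2
    early = residuals[:half]
    late = residuals[-half:]
    ms_early = sum(e * e for e in early) / len(early)
    ms_late = sum(e * e for e in late) / len(late)
    return ms_late / ms_early

# Made-up residuals in observation order (NOT from the textbook data):
# tight at the start, widely spread at the end.
residuals = [0.1, -0.1, 0.2, -0.2, 1.0, -1.0, 2.0, -2.0]

ratio = spread_ratio(residuals)
print(ratio)
```

A ratio far above 1 is consistent with spread that grows across the observations; a ratio near 1 is consistent with roughly constant variance.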

05

Outlier

d.

Observation 26 is an outlier, as its residual value is 2.789, and the outlier is also visible in the graph.
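Outlier screening of this kind can be sketched by scaling each residual by the estimated standard deviation of the errors, $s = \sqrt{SSE/(n - k - 1)}$, and flagging those beyond a chosen cut-off. The source does not state whether 2.789 is a raw or a scaled residual, nor which cut-off it applies, so the threshold of 2 below is an assumption, as are the function name and the made-up residuals.

```python
from math import sqrt

def flag_outliers(residuals, n_params, threshold):
    """Return (observation number, residual / s) for residuals beyond threshold * s."""
    n = len(residuals)
    # s estimates the error standard deviation; n_params counts the
    # intercept plus the slopes (3 for the two-predictor model).
    s = sqrt(sum(e * e for e in residuals) / (n - n_params))
    return [
        (i, e / s)
        for i, e in enumerate(residuals, start=1)
        if abs(e / s) > threshold
    ]

# Made-up residuals (NOT from the textbook data); the last one is unusually large.
residuals = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, 4.0]

flagged = flag_outliers(residuals, n_params=3, threshold=2.0)
print(flagged)
```

A stricter cut-off, such as 3, would flag fewer observations.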

06

Assumption of normal errors 

e.

If the assumption of normally distributed errors is satisfied, then we expect approximately 95% of the residuals to fall within 2 standard deviations of the mean of 0, and almost all of the residuals to lie within 3 standard deviations of the mean of 0.

Here the assumption of normal errors does not appear to be satisfied, because the graph shows that the error variance is not constant. Some residuals lie close to the regression line, indicating small variance, while others lie far from it, indicating large differences between the observed y values and the fitted y values. This indicates that the error variance is not the same across observations.
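The 2 and 3 standard deviation rule of thumb can be checked by counting. The sketch below computes the share of residuals within $2s$ and $3s$ of 0, where $s = \sqrt{SSE/(n - k - 1)}$. It is an illustration only: the function name and the residuals are made up, and with a small sample the shares are coarse, so they should be read alongside a plot.

```python
from math import sqrt

def share_within(residuals, n_params):
    """Return the shares of residuals within 2s and within 3s of zero."""
    n = len(residuals)
    # n_params counts the intercept plus the slopes (3 for the two-predictor model).
    s = sqrt(sum(e * e for e in residuals) / (n - n_params))
    within_2s = sum(abs(e) <= 2 * s for e in residuals) / n
    within_3s = sum(abs(e) <= 3 * s for e in residuals) / n
    return within_2s, within_3s

# Made-up residuals (NOT from the textbook data).
residuals = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, 4.0]

within_2s, within_3s = share_within(residuals, n_params=3)
print(within_2s, within_3s)
```

Shares well below about 95% within $2s$, or several residuals beyond $3s$, would count against the normality assumption.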

Most popular questions from this chapter

Question: Reality TV and cosmetic surgery. Refer to the Body Image: An International Journal of Research (March 2010) study of the impact of reality TV shows on one's desire to undergo cosmetic surgery, Exercise 12.17 (p. 725). Recall that psychologists used multiple regression to model desire to have cosmetic surgery ($y$) as a function of gender ($x_1$), self-esteem ($x_2$), body satisfaction ($x_3$), and impression of reality TV ($x_4$). The SPSS printout below shows a confidence interval for $E(y)$ for each of the first five students in the study.

  1. Interpret the confidence interval for $E(y)$ for student 1.
  2. Interpret the confidence interval for $E(y)$ for student 4.

Question: Suppose the mean value $E(y)$ of a response $y$ is related to the quantitative independent variables $x_1$ and $x_2$:

$E(y) = 2 + x_1 - 3x_2 - x_1 x_2$

a. Identify and interpret the slope for $x_2$.

b. Plot the linear relationship between $E(y)$ and $x_2$ for $x_1 = 0, 1, 2$, where.

c. How would you interpret the estimated slopes?

d. Use the lines you plotted in part b to determine the changes in $E(y)$ for each $x_1 = 0, 1, 2$.

e. Use your graph from part b to determine how much $E(y)$ changes when $3 \le x_1 \le 5$ and $1 \le x_2 \le 3$.

Buy-side vs. sell-side analysts' earnings forecasts. Refer to the Financial Analysts Journal (July/August 2008) comparison of earnings forecasts of buy-side and sell-side analysts, Exercise 2.86 (p. 112). The Harvard Business School professors used regression to model the relative optimism ($y$) of the analysts' 3-month horizon forecasts. One of the independent variables used to model forecast optimism was the dummy variable x = {1 if the analyst worked for a buy-side firm, 0 if the analyst worked for a sell-side firm}.

a) Write the equation of the model for $E(y)$ as a function of type of firm.

b) Interpret the value of $\beta_0$ in the model, part a.

c) The professors write that the value of $\beta_1$ in the model, part a, "represents the mean difference in relative forecast optimism between buy-side and sell-side analysts." Do you agree?

d) The professors also argue that "if buy-side analysts make less optimistic forecasts than their sell-side counterparts, the [estimated value of $\beta_1$] will be negative." Do you agree?

Question: Chemical plant contamination. Refer to Exercise 12.18 (p. 725) and the U.S. Army Corps of Engineers study. You fit the first-order model, $E(y) = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$, to the data, where $y$ = DDT level (parts per million), $x_1$ = number of miles upstream, $x_2$ = length (centimeters), and $x_3$ = weight (grams). Use the Excel/XLSTAT printout below to predict, with 90% confidence, the DDT level of a fish caught 300 miles upstream with a length of 40 centimeters and a weight of 1,000 grams. Interpret the result.

Question: Suppose you fit the first-order multiple regression model $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$ to $n = 25$ data points and obtain the prediction equation $\hat{y} = 6.4 + 3.1x_1 + 0.92x_2$. The estimated standard deviations of the sampling distributions of $\hat{\beta}_1$ and $\hat{\beta}_2$ are 2.3 and .27, respectively.
