Problem 75

An investigation was carried out to study the relationship between speed (ft/sec) and stride rate (number of steps taken/sec) among female marathon runners. Resulting summary quantities included \(n=11\), \(\sum(\text{speed})=205.4\), \(\sum(\text{speed}^{2})=3880.08\), \(\sum(\text{rate})=35.16\), \(\sum(\text{rate}^{2})=112.681\), and \(\sum(\text{speed})(\text{rate})=660.130\).

a. Calculate the equation of the least squares line that you would use to predict stride rate from speed.

b. Calculate the equation of the least squares line that you would use to predict speed from stride rate.

c. Calculate the coefficient of determination for the regression of stride rate on speed of part (a) and for the regression of speed on stride rate of part (b). How are these related?

Short Answer

(a) Rate = 1.694 + 0.0805 × Speed; (b) Speed = -20.057 + 12.117 × Rate; (c) Both regressions give R² ≈ 0.975. The two values are equal because R² is the squared correlation coefficient, which is symmetric in the two variables.

Step by step solution

01

Calculate the Slope and Intercept for Predicting Stride Rate from Speed

For the least squares line \( \text{rate} = a + b \times \text{speed} \), we need to find the slope \(b\) and intercept \(a\). The formula for the slope \(b\) is: \[ b = \frac{n \sum (\text{speed} \times \text{rate}) - \sum (\text{speed}) \sum (\text{rate})}{n \sum (\text{speed}^2) - (\sum (\text{speed}))^2} \]Substitute the given values: \[ b = \frac{11 \times 660.130 - 205.4 \times 35.16}{11 \times 3880.08 - 205.4^2} = \frac{7261.43 - 7221.864}{42680.88 - 42189.16} = \frac{39.566}{491.72} \approx 0.0805 \]The intercept \(a\) is calculated using the slope with extra digits carried (\(b \approx 0.080465\)) to limit rounding error:\[ a = \frac{\sum (\text{rate}) - b \sum (\text{speed})}{n} = \frac{35.16 - 0.080465 \times 205.4}{11} \approx 1.694 \]Therefore, the equation is \( \text{rate} = 1.694 + 0.0805 \times \text{speed} \).
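The arithmetic in this step can be checked with a few lines of Python. This is a minimal sketch, not part of the original solution: the function name `least_squares_from_sums` and its parameter names are assumptions chosen for illustration, and it works only from the summary sums given in the problem.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    """Return (intercept, slope) of the least squares line y = a + b*x.

    Works from summary sums only, using the same formulas as the text.
    """
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Summary quantities from the problem statement (x = speed, y = rate).
a, b = least_squares_from_sums(
    n=11, sum_x=205.4, sum_y=35.16, sum_xx=3880.08, sum_xy=660.130
)
print(f"rate = {a:.3f} + {b:.4f} * speed")
```

Computing the intercept from the unrounded slope avoids the small discrepancy that appears when the slope is first rounded to four decimals.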
02

Calculate the Slope and Intercept for Predicting Speed from Stride Rate

Similarly, calculate the slope for predicting speed from stride rate. The numerator is unchanged; the denominator now uses the sums for rate, because rate is the predictor:\[ b = \frac{n \sum (\text{speed} \times \text{rate}) - \sum (\text{speed}) \sum (\text{rate})}{n \sum (\text{rate}^2) - (\sum (\text{rate}))^2} \]Substitute the given values:\[ b = \frac{11 \times 660.130 - 205.4 \times 35.16}{11 \times 112.681 - 35.16^2} = \frac{7261.43 - 7221.864}{1239.491 - 1236.2256} = \frac{39.566}{3.2654} \approx 12.117 \]The intercept \(a\), again carrying extra digits in the slope (\(b \approx 12.1167\)), is:\[ a = \frac{\sum (\text{speed}) - b \sum (\text{rate})}{n} = \frac{205.4 - 12.1167 \times 35.16}{11} \approx -20.057 \]Therefore, the equation is \( \text{speed} = -20.057 + 12.117 \times \text{rate} \).
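The same check works for this step by swapping the roles of the two variables. The sketch below assumes a hypothetical helper `least_squares_from_sums` (not from the original solution) and passes rate as the predictor and speed as the response.

```python
def least_squares_from_sums(n, sum_x, sum_y, sum_xx, sum_xy):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


# Roles swapped: x = rate (predictor), y = speed (response).
a_speed, b_speed = least_squares_from_sums(
    n=11, sum_x=35.16, sum_y=205.4, sum_xx=112.681, sum_xy=660.130
)
print(f"speed = {a_speed:.3f} + {b_speed:.3f} * rate")
```

Note that this line is not the algebraic inverse of the line that predicts rate from speed: each regression minimizes squared deviations in its own response variable.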
03

Calculate the Coefficient of Determination for Each Regression

The coefficient of determination \(R^2\) for each regression is computed as:\[ R^2 = \frac{\left(\sum (\text{speed} \times \text{rate}) - \frac{\sum (\text{speed}) \sum (\text{rate})}{n} \right)^2}{\left(\sum (\text{speed}^2) - \frac{(\sum (\text{speed}))^2}{n}\right) \left(\sum (\text{rate}^2) - \frac{(\sum (\text{rate}))^2}{n} \right)} \]Substitute to find \(R^2\) for both lines:\[ R^2 = \frac{(660.130 - 656.533)^2}{(3880.08 - 3835.378) \times (112.681 - 112.3841)} = \frac{3.597^2}{44.702\times 0.2969} \approx \frac{12.938}{13.272} \approx 0.975 \]Thus, for both regressions, \(R^2 \approx 0.975\), indicating a strong linear relationship. The two values are identical because the formula is symmetric in speed and rate: in simple linear regression \(R^2\) equals the squared correlation coefficient \(r^2\), which does not depend on which variable is treated as the predictor. Equivalently, the product of the slope from part (a) and the slope from part (b) equals \(R^2\).
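A short Python sketch can confirm both the value of \(R^2\) and its relationship to the two slopes. The variable names (`s_xx`, `s_yy`, `s_xy`) are assumptions introduced here for the centered sums; they do not appear in the original solution.

```python
# Summary quantities from the problem statement (x = speed, y = rate).
n = 11
sum_x, sum_xx = 205.4, 3880.08
sum_y, sum_yy = 35.16, 112.681
sum_xy = 660.130

# Centered sums of squares and cross products.
s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Symmetric in x and y, so it is the same for both regressions.
r_squared = s_xy ** 2 / (s_xx * s_yy)

# Slopes of the two regression lines.
slope_rate_on_speed = s_xy / s_xx
slope_speed_on_rate = s_xy / s_yy

print(f"R^2 = {r_squared:.3f}")
print(f"product of slopes = {slope_rate_on_speed * slope_speed_on_rate:.3f}")
```

Because the two slopes are \(S_{xy}/S_{xx}\) and \(S_{xy}/S_{yy}\), their product is exactly the expression for \(R^2\).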

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Least Squares Method
The Least Squares Method is a fundamental technique commonly used in regression analysis. Its main goal is to find the line of best fit through a set of data points. This "best fit" minimizes the sum of the squares of the vertical distances of the points from the line. This method is beneficial when you want to make predictions based on a linear relationship between two variables, like in our exercise: speed and stride rate.

In the context of predicting stride rate from speed, each data point is one runner's observed speed and stride rate, and its residual is the vertical distance between the observed stride rate and the stride rate the line predicts at that speed. The Least Squares Method chooses the slope and intercept that make the sum of these squared residuals as small as possible. The resulting line predicts stride rate for a given speed. To predict speed from stride rate, a separate line must be fitted with the roles of the variables swapped, because that fit minimizes squared deviations in speed instead.
Coefficient of Determination
The Coefficient of Determination, denoted as \(R^2\), is a statistical measure that helps us understand how well the regression line fits the data. Always ranging between 0 and 1, it represents the proportion of the variance in the dependent variable that is predictable from the independent variable. In simpler terms, it tells us how much of the change in our outcome can be explained by changes in our predictor.

For example, in our exercise, we derived an \(R^2\) value of approximately 0.975. This means that about 97.5% of the observed variation in stride rate can be attributed to its linear relationship with speed, demonstrating a strong correlation. The remaining 2.5% is variation the line does not account for. While an \(R^2\) close to 1 suggests a strong linear relationship, it's essential to consider the context of the data, as certain datasets inherently possess high correlation due to their nature.

It's important to note, though, that a high \(R^2\) doesn't necessarily imply a causative relationship between the two variables, just that they move together in a systematic way. \(R^2\) measures the strength of the linear relationship but not its direction; the direction is given by the sign of the slope. Understanding \(R^2\)'s role helps analysts assess how well a fitted line describes the data.
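The "proportion of variance explained" reading can be illustrated directly from the summary quantities of the exercise. The sketch below is an illustration under stated assumptions: it uses the standard identity that the residual sum of squares equals \(S_{yy} - b\,S_{xy}\), and the variable names are hypothetical.

```python
# Summary quantities from the exercise (x = speed, y = rate).
n = 11
sum_x, sum_xx = 205.4, 3880.08
sum_y, sum_yy = 35.16, 112.681
sum_xy = 660.130

s_xx = sum_xx - sum_x ** 2 / n
s_yy = sum_yy - sum_y ** 2 / n      # total variation in rate (SST)
s_xy = sum_xy - sum_x * sum_y / n

slope = s_xy / s_xx                 # slope of rate on speed
sse = s_yy - slope * s_xy           # residual (unexplained) variation

# Proportion of total variation accounted for by the line.
r_squared = 1 - sse / s_yy
print(f"SST = {s_yy:.4f}, SSE = {sse:.4f}, R^2 = {r_squared:.3f}")
```

This form makes the interpretation concrete: the closer the residual variation is to zero relative to the total variation, the closer \(R^2\) is to 1.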
Slope and Intercept Calculation
Calculating the slope and intercept is a core part of finding the equation of the regression line. The slope \(b\) tells us how much the dependent variable (e.g., stride rate) is expected to increase or decrease given a one-unit increase in the independent variable (e.g., speed). On the other hand, the intercept \(a\) indicates the value of the dependent variable when the independent variable is zero.

For calculating the slope, we use: \( b = \frac{n \sum (\text{speed} \times \text{rate}) - \sum (\text{speed}) \sum (\text{rate})}{n \sum (\text{speed}^2) - (\sum (\text{speed}))^2} \). This formula comes from minimizing the sum of squared vertical deviations of the points from the line.

The intercept is calculated as: \( a = \frac{\sum (\text{rate}) - b \sum (\text{speed})}{n} \), which is the same as \( a = \overline{\text{rate}} - b \times \overline{\text{speed}} \). This choice forces the line to pass through the point of means. Together, the slope and intercept fully determine the regression line.
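For readers who want to see where the two formulas come from, here is a short derivation sketch. The shorthand \(x_i\) for speed and \(y_i\) for rate is introduced here for compactness and is not used in the original solution. The least squares criterion minimizes
\[ \mathrm{SSE}(a, b) = \sum_{i=1}^{n} (y_i - a - b x_i)^2 . \]
Setting the two partial derivatives to zero gives the normal equations:
\[ \frac{\partial\, \mathrm{SSE}}{\partial a} = -2 \sum (y_i - a - b x_i) = 0 \;\Rightarrow\; \sum y_i = n a + b \sum x_i , \]
\[ \frac{\partial\, \mathrm{SSE}}{\partial b} = -2 \sum x_i (y_i - a - b x_i) = 0 \;\Rightarrow\; \sum x_i y_i = a \sum x_i + b \sum x_i^2 . \]
Solving the first equation for \(a\) and substituting into the second yields
\[ b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left(\sum x_i\right)^2}, \qquad a = \frac{\sum y_i - b \sum x_i}{n} , \]
which are the slope and intercept formulas used in the solution.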

For the scenario where we predict stride rate from speed, our calculations resulted in the line \( \text{rate} = 1.694 + 0.0805 \times \text{speed} \). This means that each 1 ft/sec increase in speed is associated with an increase of about 0.0805 steps/sec in predicted stride rate. The intercept 1.694 is the predicted stride rate at a speed of zero, which lies far outside the range of observed speeds (the mean speed is about 18.7 ft/sec), so it should be read as a fitting constant and not as a meaningful prediction.
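To see the fitted line used for prediction, the sketch below refits it from the summary sums and evaluates it at a hypothetical speed of 18 ft/sec. That speed is an illustrative value chosen near the sample mean, not a value from the exercise, and the helper names `fit_line` and `predict_rate` are likewise assumptions.

```python
def fit_line(n, sum_x, sum_y, sum_xx, sum_xy):
    """Return (intercept, slope) of the least squares line y = a + b*x."""
    slope = (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x ** 2)
    intercept = (sum_y - slope * sum_x) / n
    return intercept, slope


def predict_rate(speed, intercept, slope):
    """Predicted stride rate (steps/sec) at a given speed (ft/sec)."""
    return intercept + slope * speed


# Summary quantities from the exercise (x = speed, y = rate).
intercept, slope = fit_line(11, 205.4, 35.16, 3880.08, 660.130)

hypothetical_speed = 18.0  # illustrative value, not from the exercise
predicted = predict_rate(hypothetical_speed, intercept, slope)

# A one-unit increase in speed changes the prediction by exactly the slope.
increase_per_unit = (
    predict_rate(19.0, intercept, slope) - predict_rate(18.0, intercept, slope)
)

print(f"predicted rate at {hypothetical_speed} ft/sec: {predicted:.3f} steps/sec")
print(f"change per 1 ft/sec: {increase_per_unit:.4f}")
```

Predictions are most trustworthy for speeds inside the range of the data; evaluating the line at speeds far from the sample mean is extrapolation.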

Most popular questions from this chapter

A study to assess the capability of subsurface flow wetland systems to remove biochemical oxygen demand (BOD) and various other chemical constituents resulted in the accompanying data on \(x=\) BOD mass loading \((\mathrm{kg} / \mathrm{ha} / \mathrm{d})\) and \(y=\) BOD mass removal \((\mathrm{kg} / \mathrm{ha} / \mathrm{d})\) ("Subsurface Flow Wetlands-A Performance Evaluation," Water Envir. Res., 1995: 244–247). $$ \begin{array}{c|cccccccccccccc} x & 3 & 8 & 10 & 11 & 13 & 16 & 27 & 30 & 35 & 37 & 38 & 44 & 103 & 142 \\ \hline y & 4 & 7 & 8 & 8 & 10 & 11 & 16 & 26 & 21 & 9 & 31 & 30 & 75 & 90 \end{array} $$ a. Construct boxplots of both mass loading and mass removal, and comment on any interesting features. b. Construct a scatter plot of the data, and comment on any interesting features.

Calcium phosphate cement is gaining increasing attention for use in bone repair applications. The article "Short-Fibre Reinforcement of Calcium Phosphate Bone Cement" (J. of Engr. in Med., 2007: 203-211) reported on a study in which polypropylene fibers were used in an attempt to improve fracture behavior. The following data on \(x=\) fiber weight (\%) and \(y=\) compressive strength (MPa) was provided by the article's authors. $$ \begin{array}{l|ccccccccc} x & 0.00 & 0.00 & 0.00 & 0.00 & 0.00 & 1.25 & 1.25 & 1.25 & 1.25 \\ \hline y & 9.94 & 11.67 & 11.00 & 13.44 & 9.20 & 9.92 & 9.79 & 10.99 & 11.32 \\ x & 2.50 & 2.50 & 2.50 & 2.50 & 2.50 & 5.00 & 5.00 & 5.00 & 5.00 \\ \hline y & 12.29 & 8.69 & 9.91 & 10.45 & 10.25 & 7.89 & 7.61 & 8.07 & 9.04 \\ x & 7.50 & 7.50 & 7.50 & 7.50 & 10.00 & 10.00 & 10.00 & 10.00 & \\ \hline y & 6.63 & 6.43 & 7.03 & 7.63 & 7.35 & 6.94 & 7.02 & 7.67 \end{array} $$ a. Fit the simple linear regression model to this data. Then determine the proportion of observed variation in strength that can be attributed to the model relationship between strength and fiber weight. Finally, obtain a point estimate of the standard deviation of \(\epsilon\), the random deviation in the model equation. b. The average strength values for the six different levels of fiber weight are \(11.05,10.51,10.32,8.15,6.93\), and \(7.24\), respectively. The cited paper included a figure in which the average strength was regressed against fiber weight. Obtain the equation of this regression line and calculate the corresponding coefficient of determination. Explain the difference between the \(r^{2}\) value for this regression and the \(r^{2}\) value obtained in (a).

Torsion during hip external rotation and extension may explain why acetabular labral tears occur in professional athletes. The article "Hip Rotational Velocities During the Full Golf Swing" (J. of Sports Science and Med., 2009: 296-299) reported on an investigation in which lead hip internal peak rotational velocity \((x)\) and trailing hip peak external rotational velocity \((y)\) were determined for a sample of 15 golfers. Data provided by the article's authors was used to calculate the following summary quantities: $$ \begin{array}{r} \sum\left(x_{i}-\bar{x}\right)^{2}=64,732.83, \sum\left(y_{i}-\bar{y}\right)^{2}=130,566.96, \\ \sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)=44,185.87 \end{array} $$ Separate normal probability plots showed very substantial linear patterns. a. Calculate a point estimate for the population correlation coefficient. b. Carry out a test at significance level \(.01\) to decide whether there is a linear relationship between the two velocities in the sampled population; your conclusion should be based on a \(P\)-value. c. Would the conclusion of (b) have changed if you had tested appropriate hypotheses to decide whether there is a positive linear association in the population? What if a significance level of \(.05\) rather than \(.01\) had been used?

The catch basin in a storm-sewer system is the interface between surface runoff and the sewer. The catch-basin insert is a device for retrofitting catch basins to improve pollutant-removal properties. The article "An Evaluation of the Urban Stormwater Pollutant Removal Efficiency of Catch Basin Inserts" (Water Envir. Res., 2005: 500-510) reported on tests of various inserts under controlled conditions for which inflow is close to what can be expected in the field. Consider the following data, read from a graph in the article, for one particular type of insert on \(x=\) amount filtered (1000s of liters) and \(y=\%\) total suspended solids removed. $$ \begin{array}{l|cccccccccc} x & 23 & 45 & 68 & 91 & 114 & 136 & 159 & 182 & 205 & 228 \\ \hline y & 53.3 & 26.9 & 54.8 & 33.8 & 29.9 & 8.2 & 17.2 & 12.2 & 3.2 & 11.1 \end{array} $$ Summary quantities are $$ \begin{aligned} &\sum x_{i}=1251, \sum x_{i}^{2}=199,365, \sum y_{i}=250.6 \\ &\sum y_{i}^{2}=9249.36, \sum x_{i} y_{i}=21,904.4 \end{aligned} $$ a. Does a scatter plot support the choice of the simple linear regression model? Explain. b. Obtain the equation of the least squares line. c. What proportion of observed variation in \% removed can be attributed to the model relationship? d. Does the simple linear regression model specify a useful relationship? Carry out an appropriate test of hypotheses using a significance level of \(.05\). e. Is there strong evidence for concluding that there is at least a \(2 \%\) decrease in true average suspended solid removal associated with a 10,000 liter increase in the amount filtered? Test appropriate hypotheses using \(\alpha=.05\). f. Calculate and interpret a \(95 \% \mathrm{CI}\) for true average \(\%\) removed when amount filtered is 100,000 liters. How does this interval compare in width to a CI when amount filtered is 200,000 liters? g. Calculate and interpret a \(95 \%\) PI for \% removed when amount filtered is 100,000 liters. How does this interval compare in width to the CI calculated in (f) and to a PI when amount filtered is 200,000 liters?

Mist (airborne droplets or aerosols) is generated when metal-removing fluids are used in machining operations to cool and lubricate the tool and workpiece. Mist generation is a concern to OSHA, which has recently lowered substantially the workplace standard. The article "Variables Affecting Mist Generation from Metal Removal Fluids" (Lubrication Engr., 2002: 10-17) gave the accompanying data on \(x=\) fluid-flow velocity for a \(5 \%\) soluble oil \((\mathrm{cm} / \mathrm{sec})\) and \(y=\) the extent of mist droplets having diameters smaller than \(10 \mu \mathrm{m}\left(\mathrm{mg} / \mathrm{m}^{3}\right)\): $$ \begin{array}{l|ccccccc} x & 89 & 177 & 189 & 354 & 362 & 442 & 965 \\ \hline y & .40 & .60 & .48 & .66 & .61 & .69 & .99 \end{array} $$ a. The investigators performed a simple linear regression analysis to relate the two variables. Does a scatter plot of the data support this strategy? b. What proportion of observed variation in mist can be attributed to the simple linear regression relationship between velocity and mist? c. The investigators were particularly interested in the impact on mist of increasing velocity from 100 to 1000 (a factor of 10 corresponding to the difference between the smallest and largest \(x\) values in the sample). When \(x\) increases in this way, is there substantial evidence that the true average increase in \(y\) is less than .6? d. Estimate the true average change in mist associated with a \(1 \mathrm{~cm} / \mathrm{sec}\) increase in velocity, and do so in a way that conveys information about precision and reliability.
