Problem 23: Fuming because you are stuck in ...

Understandable Statistics : Concepts and Methods

Charles Henry Brase

10 Edition

Chapter 9: Problem 23

Fuming because you are stuck in traffic? Roadway congestion is a costly item, in both time wasted and fuel wasted. Let $x$ represent the average annual hours per person spent in traffic delays and let $y$ represent the average annual gallons of fuel wasted per person in traffic delays. A random sample of eight cities showed the following data (Reference: Statistical Abstract of the United States, 122nd Edition).

$$ \begin{array}{l|llllllll} \hline x(\mathrm{hr}) & 28 & 5 & 20 & 35 & 20 & 23 & 18 & 5 \\ \hline y(\mathrm{gal}) & 48 & 3 & 34 & 55 & 34 & 38 & 28 & 9 \\ \hline \end{array} $$

(a) Draw a scatter diagram for the data. Verify that $\Sigma x=154$, $\Sigma x^{2}=3712$, $\Sigma y=249$, $\Sigma y^{2}=9959$, and $\Sigma x y=6067$. Compute $r$.

The data in part (a) represent average annual hours lost per person and average annual gallons of fuel wasted per person in traffic delays. Suppose that instead of using average data for different cities, you selected one person at random from each city and measured the annual number of hours lost $x$ for that person and the annual gallons of fuel wasted $y$ for the same person.

$$ \begin{array}{l|cccccccc} \hline x(\mathrm{hr}) & 20 & 4 & 18 & 42 & 15 & 25 & 2 & 35 \\ \hline y(\mathrm{gal}) & 60 & 8 & 12 & 50 & 21 & 30 & 4 & 70 \\ \hline \end{array} $$

(b) Compute $\bar{x}$ and $\bar{y}$ for both sets of data pairs and compare the averages. Compute the sample standard deviations $s_{x}$ and $s_{y}$ for both sets of data pairs and compare the standard deviations. In which set are the standard deviations for $x$ and $y$ larger? Look at the defining formula for $r$, Equation 1. Why do smaller standard deviations $s_{x}$ and $s_{y}$ tend to increase the value of $r$?

(c) Make a scatter diagram for the second set of data pairs. Verify that $\Sigma x=161$, $\Sigma x^{2}=4583$, $\Sigma y=255$, $\Sigma y^{2}=12{,}565$, and $\Sigma x y=7071$. Compute $r$.

(d) Compare $r$ from part (a) with $r$ from part (c). Do the data for averages have a higher correlation coefficient than the data for individual measurements? List some reasons why you think hours lost per individual and fuel wasted per individual might vary more than the same quantities averaged over all the people in a city.

Short Answer

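All five sums given for the city data check out. Substituting them into the computation formula for $r$ gives $r \approx 0.991$ for the city averages in part (a). The same steps with the sums in part (c) give $r \approx 0.794$ for the individual measurements, so the averaged data have the higher correlation coefficient.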

Step by step solution

Verify Given Sums for Part (a)

Check the sums for the given city data set:
- For hours $x$: $\Sigma x = 28 + 5 + 20 + 35 + 20 + 23 + 18 + 5 = 154$ (verified)
- $\Sigma x^2 = 28^2 + 5^2 + 20^2 + 35^2 + 20^2 + 23^2 + 18^2 + 5^2 = 3712$ (verified)
- For gallons $y$: $\Sigma y = 48 + 3 + 34 + 55 + 34 + 38 + 28 + 9 = 249$ (verified)
- $\Sigma y^2 = 48^2 + 3^2 + 34^2 + 55^2 + 34^2 + 38^2 + 28^2 + 9^2 = 9959$ (verified)
- $\Sigma xy = 28\times48 + 5\times3 + 20\times34 + 35\times55 + 20\times34 + 23\times38 + 18\times28 + 5\times9 = 6067$ (verified)
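The same check can be done mechanically. The sketch below recomputes the five sums for both data sets in the problem and then applies the usual computation formula for $r$, $r = \dfrac{n\Sigma xy - \Sigma x\,\Sigma y}{\sqrt{n\Sigma x^2 - (\Sigma x)^2}\,\sqrt{n\Sigma y^2 - (\Sigma y)^2}}$. The helper names `sums` and `r_from_sums` are illustrative choices, not taken from the textbook.

```python
from math import sqrt

# City-average data from part (a) and individual data from part (c).
city_x = [28, 5, 20, 35, 20, 23, 18, 5]
city_y = [48, 3, 34, 55, 34, 38, 28, 9]
person_x = [20, 4, 18, 42, 15, 25, 2, 35]
person_y = [60, 8, 12, 50, 21, 30, 4, 70]


def sums(xs, ys):
    """Return (n, sum x, sum x^2, sum y, sum y^2, sum xy) for paired data."""
    return (
        len(xs),
        sum(xs),
        sum(x * x for x in xs),
        sum(ys),
        sum(y * y for y in ys),
        sum(x * y for x, y in zip(xs, ys)),
    )


def r_from_sums(n, sx, sxx, sy, syy, sxy):
    """Computation formula for the sample correlation coefficient."""
    numerator = n * sxy - sx * sy
    denominator = sqrt(n * sxx - sx ** 2) * sqrt(n * syy - sy ** 2)
    return numerator / denominator


city_sums = sums(city_x, city_y)
person_sums = sums(person_x, person_y)
r_city = r_from_sums(*city_sums)
r_person = r_from_sums(*person_sums)

print(city_sums, round(r_city, 3))      # (8, 154, 3712, 249, 9959, 6067) 0.991
print(person_sums, round(r_person, 3))  # (8, 161, 4583, 255, 12565, 7071) 0.794
```

Working from the sums keeps the arithmetic in integers until the final division, which makes it easy to compare each intermediate value with the figures quoted in the problem.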

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Scatter Diagram

A scatter diagram, also known as a scatter plot, is a graphical representation used to visualize the relationship between two quantitative variables. Each point on the diagram corresponds to a pair of values from the two datasets. For this specific exercise, one axis represents the average annual hours spent in traffic delays (variable $x$), and the other axis represents the average annual gallons of fuel wasted (variable $y$).

To create a scatter plot, you would plot each city's data point as a coordinate on the diagram. For example, if city A has an $x$-value of 28 hours and a $y$-value of 48 gallons, you would place a point at (28, 48) on the chart. Doing this for all the data points allows us to visually inspect patterns or correlations between the variables.

By analyzing a scatter diagram, you can easily see if there's a positive or negative correlation between variables. A positive trend will show data points moving upwards as you go along the x-axis, suggesting an increase in one variable tends to correlate with an increase in the other variable.

Correlation Coefficient

The correlation coefficient, denoted as $r$, measures the strength and direction of a linear relationship between two variables on a scatter plot. This coefficient ranges from -1 to 1.

An $r$ value close to 1 implies a strong positive correlation, meaning as one variable increases, the other also increases.
An $r$ value close to -1 indicates a strong negative correlation, where one variable decreases as the other increases.
An $r$ value around 0 suggests no linear correlation between the variables.

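To calculate $r$, you can use the computation formula built from the sums $\Sigma x$, $\Sigma y$, $\Sigma x^2$, $\Sigma y^2$, and $\Sigma xy$, or the defining formula that the problem calls Equation 1. In its standard form, the defining formula is
\[ r = \frac{1}{n-1} \sum \left(\frac{x - \bar{x}}{s_x}\right)\left(\frac{y - \bar{y}}{s_y}\right) \]
Because $s_x$ and $s_y$ sit in the denominators, dividing the same deviations by smaller standard deviations produces larger terms, which tends to increase $r$. Averaged data vary less than individual measurements, so averages usually have smaller standard deviations and points that cluster more tightly around a line.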
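As a rough illustration, the sketch below evaluates the standard defining formula for $r$ (the sum of products of standardized deviations, divided by $n-1$) on both data sets from the problem. It assumes this standard form matches the textbook's Equation 1; the function name `r_defining` is hypothetical.

```python
from statistics import mean, stdev

city_x = [28, 5, 20, 35, 20, 23, 18, 5]
city_y = [48, 3, 34, 55, 34, 38, 28, 9]
person_x = [20, 4, 18, 42, 15, 25, 2, 35]
person_y = [60, 8, 12, 50, 21, 30, 4, 70]


def r_defining(xs, ys):
    """Sample correlation via standardized deviations (z-scores)."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations (n - 1)
    total = sum(
        ((x - x_bar) / s_x) * ((y - y_bar) / s_y)
        for x, y in zip(xs, ys)
    )
    return total / (n - 1)


r_city = r_defining(city_x, city_y)
r_person = r_defining(person_x, person_y)
print(round(r_city, 3), round(r_person, 3))  # 0.991 0.794
```

Both routes to $r$ agree, and the city averages show the stronger linear relationship, which answers part (d).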

Standard Deviation

Standard deviation is a measure of the amount of variation or dispersion present in a set of values. A low standard deviation means that the data points are close to the mean, whereas a high standard deviation indicates that the data points are spread out over a large range of values.

To compute the standard deviation for a sample, you first calculate the sample variance. This involves taking each data point's deviation from the sample mean, squaring it, summing these squared deviations, and dividing the total by $n-1$ (not by $n$). The standard deviation $s$ is the square root of this variance. Mathematically, it is expressed as:
\[ s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n-1}} \]
where $x_i$ represents each value in the dataset, $\bar{x}$ is the sample mean, and $n$ is the number of observations. Calculating $s$ for both $x$ and $y$ shows how spread out the hours lost in traffic and the gallons of fuel wasted are across the dataset.
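For part (b), the formula above can be applied directly to both data sets. The following sketch is one way to do it; `sample_std` is an illustrative helper written out step by step rather than a library call.

```python
from math import sqrt

city_x = [28, 5, 20, 35, 20, 23, 18, 5]
city_y = [48, 3, 34, 55, 34, 38, 28, 9]
person_x = [20, 4, 18, 42, 15, 25, 2, 35]
person_y = [60, 8, 12, 50, 21, 30, 4, 70]


def sample_std(values):
    """Sample standard deviation: sqrt(sum((v - mean)^2) / (n - 1))."""
    n = len(values)
    v_bar = sum(values) / n
    squared_deviations = sum((v - v_bar) ** 2 for v in values)
    return sqrt(squared_deviations / (n - 1))


city_sx, city_sy = sample_std(city_x), sample_std(city_y)
person_sx, person_sy = sample_std(person_x), sample_std(person_y)

print(round(city_sx, 2), round(city_sy, 2))      # 10.33 17.76
print(round(person_sx, 2), round(person_sy, 2))  # 13.85 25.18
```

Both standard deviations are larger for the individual measurements than for the city averages.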

Sample Mean

The sample mean, denoted by $\bar{x}$ or $\bar{y}$, represents the average of a set of values. It's calculated by summing all the values and then dividing by the number of observations.

For example, if you have a set of data representing hours spent in traffic delays $[28, 5, 20, 35, 20, 23, 18, 5]$, you sum these values to get 154. To find the mean, you then divide by the number of observations, which is 8. So, the sample mean $\bar{x}$ is $154/8 = 19.25$ hours.

The same principle applies for finding the mean of gallons of fuel wasted, $\bar{y}$. The mean provides a central value of the dataset, helping to understand the average trend within the data. It is pivotal for both descriptive statistics and drawing conclusions about populations and data behavior.
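To compare the averages asked for in part (b), the same calculation can be repeated for all four columns of data. This is a minimal sketch using Python's standard `statistics.mean`; the variable names are illustrative.

```python
from statistics import mean

city_x = [28, 5, 20, 35, 20, 23, 18, 5]
city_y = [48, 3, 34, 55, 34, 38, 28, 9]
person_x = [20, 4, 18, 42, 15, 25, 2, 35]
person_y = [60, 8, 12, 50, 21, 30, 4, 70]

# Sample means for the city averages and for the individual measurements.
city_means = (mean(city_x), mean(city_y))
person_means = (mean(person_x), mean(person_y))

print(city_means)    # (19.25, 31.125)
print(person_means)  # (20.125, 31.875)
```

The two sets have similar means, so the difference in $r$ between them comes from the spread of the data rather than from its center.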
