Chapter 3: Problem 24

The following data set lists the number of women from each of 10 different countries who were on the Rolex Women's World Golf Rankings Top 25 list as of March 31, 2009. The data, entered in that order, are for the following countries: Australia, Brazil, England, Japan, Korea, Mexico, Norway, Sweden, Taiwan, and United States.

\[\begin{array}{llllllllll}2 & 1 & 1 & 2 & 9 & 1 & 1 & 2 & 2 & 4\end{array}\]

a. Calculate the mean and median for these data.
b. Identify the outlier in this data set. Drop the outlier and recalculate the mean and median. Which of these two summary measures changes by a larger amount when you drop the outlier?
c. Which is the better summary measure for these data, the mean or the median? Explain.

Short Answer

The mean and median of the data are 2.5 and 2.0, respectively. After dropping the outlier 9, the mean of the remaining nine values is \(16 / 9 \approx 1.78\) and the median is still 2.0. The mean therefore changes by the larger amount (about 0.72, versus no change in the median). The median is the better summary measure for these data because it is less sensitive to outliers.

Step by step solution

Calculation of mean and median

The mean is calculated by summing all the values and dividing the result by the number of values. The median is the middle value of the sorted list; when the list has an even number of values, it is the average of the two middle values.

Given this data set: \([2, 1, 1, 2, 9, 1, 1, 2, 2, 4]\). Sum of data = \(2 + 1 + 1 + 2 + 9 + 1 + 1 + 2 + 2 + 4 = 25\) and number of values = 10, so Mean = \(25 / 10 = 2.5\).

For the median, ordering the data gives \([1, 1, 1, 1, 2, 2, 2, 2, 4, 9]\). With 10 values, the median is the average of the 5th and 6th values, so Median = \((2+2)/2 = 2.0\).
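The arithmetic above can be checked with a short sketch using Python's standard `statistics` module. The variable names (`counts`, `data_mean`, `data_median`) are illustrative and not part of the original problem.

```python
from statistics import mean, median

# Counts in the order given: Australia, Brazil, England, Japan, Korea,
# Mexico, Norway, Sweden, Taiwan, United States
counts = [2, 1, 1, 2, 9, 1, 1, 2, 2, 4]

data_mean = mean(counts)      # sum of the values divided by how many there are
data_median = median(counts)  # average of the two middle values of the sorted list

print(data_mean, data_median)
```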

Identification of outlier and recalculation

The outlier in this data set is 9, from Korea, which is clearly larger than the rest of the data points. After dropping the outlier, the data set becomes \([2, 1, 1, 2, 1, 1, 2, 2, 4]\). The new sum is \(2 + 1 + 1 + 2 + 1 + 1 + 2 + 2 + 4 = 16\) and the count is 9, so the new mean is \(16 / 9 \approx 1.78\). Sorting the data gives \([1, 1, 1, 1, 2, 2, 2, 2, 4]\). With nine values, the median is the 5th value, which is 2.0, the same as before.
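As a rough check on this step, the sketch below removes the largest value and recomputes both measures. It assumes the single largest value (Korea's 9) is the only outlier; the names `trimmed`, `mean_change`, and `median_change` are illustrative.

```python
from statistics import mean, median

counts = [2, 1, 1, 2, 9, 1, 1, 2, 2, 4]

# Assumption: the single largest value is the only outlier to drop.
trimmed = counts.copy()
trimmed.remove(max(counts))  # removes one occurrence of 9

# How far each summary measure moves once the outlier is gone
mean_change = mean(counts) - mean(trimmed)
median_change = median(counts) - median(trimmed)

print(len(trimmed), sum(trimmed))
print(round(mean(trimmed), 2), median(trimmed))
print(round(mean_change, 2), median_change)
```

Dropping one value leaves nine observations, so the divisor for the new mean is 9, not 10.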

Comparing change in mean and median

After dropping the outlier, the mean changes by about \(2.5 - 1.78 = 0.72\), while the median stays at 2.0. The mean is therefore more heavily influenced by the outlier.

Choosing better measure

Considering the outlier and its effect on the mean, it would be more appropriate to use the median as the measure of central tendency for this data. The median is less sensitive to extreme values (outliers), and hence provides a better summary measure for this data.

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Mean Calculation

Calculating the mean, or average, is one of the foundational tasks in data analysis.
It gives a central value for a data set. To find it, sum up all the numbers and then divide by how many numbers there are.

For our golf ranking data: \[\frac{2 + 1 + 1 + 2 + 9 + 1 + 1 + 2 + 2 + 4}{10} = \frac{25}{10} = 2.5\]

The mean tells us what each value would be if the total were spread evenly across all observations. It is a useful measure but can be heavily influenced by outliers: a single value that is very high or very low compared to the others pulls the mean toward it.
This is why understanding the mean's limitations is also important when analyzing data.

Median Calculation

The median is a useful way to find the "middle" of a data set.
Unlike the mean, it does not get skewed by extremely high or low values, making it a robust measure of central tendency.
To calculate it:

Sort the data set in order: \[ [1, 1, 1, 1, 2, 2, 2, 2, 4, 9] \]
Find the middle number(s).
With an even number of observations, take the average of the two middle values.
In our example: \[\frac{2+2}{2} = 2.0\]

The median is an excellent representation of a data set's central point, especially when the data contains outliers.
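The steps above can be sketched as a small function. This is a minimal illustration, not a library routine; `median_by_hand` is a hypothetical name introduced here.

```python
def median_by_hand(values):
    """Return the median by sorting and picking the middle value(s)."""
    if not values:
        raise ValueError("median of an empty data set is undefined")
    ordered = sorted(values)           # step 1: sort the data
    n = len(ordered)
    mid = n // 2                       # step 2: locate the middle
    if n % 2 == 1:
        return float(ordered[mid])     # odd count: the single middle value
    # even count: average the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


golf_counts = [2, 1, 1, 2, 9, 1, 1, 2, 2, 4]
print(median_by_hand(golf_counts))
```

For the ten golf ranking counts, the two middle values of the sorted list are both 2, so the function returns 2.0.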

Outlier Analysis

Outliers are values that differ significantly from other observations in a data set.
They can lead to misinterpretations if not handled properly.
In our data, the number 9 is much higher than the others, making it an outlier.
When analyzing data:

Check for any values that do not appear to fit the pattern of the rest.
Outliers can skew the mean.
They often point to special circumstances worth further investigation.

Removing outliers can sometimes provide a clearer view of the data, but understanding their origin is also crucial.
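The solution identifies the outlier by inspection. One common convention for making that check explicit is the 1.5 × IQR rule, sketched below. This rule is an assumption added for illustration and is not stated in the original problem; the quartiles come from `statistics.quantiles` with its default method, and other quartile conventions can flag different points.

```python
from statistics import quantiles

counts = [2, 1, 1, 2, 9, 1, 1, 2, 2, 4]

# Quartiles using the default ("exclusive") method of statistics.quantiles
q1, q2, q3 = quantiles(counts, n=4)
iqr = q3 - q1

# Assumed convention: flag values more than 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [x for x in counts if x < lower_fence or x > upper_fence]

print(q1, q3, iqr)
print(outliers)
```

Under these assumptions only the value 9 falls outside the fences, which agrees with the visual judgment above.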

Central Tendency

Measures of central tendency are statistical tools that describe a central value for a data set.
They summarize data with a single number that represents the "center" of the data.
Common measures include mean, median, and mode.

The mean uses every data point, so a single extreme value can shift it noticeably.
The median is less sensitive to outliers, providing a more stable measure.
The mode (not covered here explicitly) shows the most frequently occurring value.

Choosing the right measure depends on the data's distribution and the presence of outliers.
For asymmetric data or data with outliers, the median is often preferred.
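For completeness, the sketch below computes all three measures for the golf ranking data using Python's standard `statistics` module. The mode is not part of the original solution; `multimode` is used here because more than one value can tie for most frequent.

```python
from statistics import mean, median, multimode

counts = [2, 1, 1, 2, 9, 1, 1, 2, 2, 4]

center_mean = mean(counts)
center_median = median(counts)
modes = multimode(counts)  # every value tied for the highest frequency

print(center_mean, center_median, sorted(modes))
```

In this data set the values 1 and 2 each occur four times, so there are two modes.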

Data Set Analysis

Analyzing a data set involves understanding its distribution and identifying key characteristics like central tendencies and outliers.
The overall goal is to derive insights from the data that inform decisions or contribute to a deeper understanding.

Is the data symmetric or skewed?
Which values are most common?
Are there outliers that need handling?

For our golf ranking data, we examined how the presence of an outlier affected the mean more than the median.
The process of data set analysis helps in choosing the appropriate statistical measures to accurately represent the data.
