Problem 73

The three measures of center introduced in this chapter are the mean, median, and trimmed mean. Two additional measures of center that are occasionally used are the midrange, which is the average of the smallest and largest observations, and the midfourth, which is the average of the two fourths. Which of these five measures of center are resistant to the effects of outliers and which are not? Explain your reasoning.

Short Answer

Median, trimmed mean, and midfourth are resistant to outliers; mean and midrange are not.

Step by step solution

01

Define the Measures of Center

Begin by defining each measure of center:

1. **Mean**: The arithmetic average of all values in the data set.
2. **Median**: The middle value of the ordered data set (the average of the two middle values when the number of observations is even).
3. **Trimmed Mean**: The mean calculated after removing a specified percentage of the smallest and largest values.
4. **Midrange**: The average of the smallest and largest values in the data set.
5. **Midfourth**: The average of the lower and upper fourths, which are the medians of the smaller and larger halves of the ordered data and so sit close to the first and third quartiles.
02

Analyze Resistance to Outliers: Mean and Trimmed Mean

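The mean is **not resistant** to outliers: every observation enters the sum, so a single extreme value can shift the mean by an arbitrarily large amount. The trimmed mean is **resistant** as long as the outliers fall within the trimmed portion: it discards a fixed percentage of the smallest and largest values, so extreme values in that portion have no effect on the result.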
03

Analyze Resistance to Outliers: Median and Midrange

The median is **resistant** to outliers because it depends only on the middle value (or the two middle values) of the ordered data, not on the extremes. The midrange is **not resistant** to outliers because it is calculated directly from the smallest and largest values, which are exactly where outliers appear.
04

Evaluate Midfourth Resistance

The midfourth is **resistant** to outliers because the two fourths are medians of the lower and upper halves of the data. Like the median, they depend on values near the 25% and 75% positions of the ordered data rather than on the extreme ends.
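The classification in the steps above can be checked numerically. The following is a minimal sketch, not part of the textbook solution: the sample data, the helper names, and the choice to trim one value from each end are all assumptions made for illustration. It replaces the largest observation with an extreme value and reports which measures change.

```python
from statistics import mean, median

def trimmed_mean(values, k=1):
    """Mean after dropping the k smallest and k largest values (k is an assumed choice)."""
    ordered = sorted(values)
    return mean(ordered[k:len(ordered) - k])

def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2

def midfourth(values):
    """Average of the medians of the lower and upper halves.
    The overall median is included in both halves when n is odd."""
    ordered = sorted(values)
    half = (len(ordered) + 1) // 2
    lower_fourth = median(ordered[:half])
    upper_fourth = median(ordered[len(ordered) - half:])
    return (lower_fourth + upper_fourth) / 2

MEASURES = {
    "mean": mean,
    "median": median,
    "trimmed mean": trimmed_mean,
    "midrange": midrange,
    "midfourth": midfourth,
}

def changed_by_outlier(values, outlier):
    """Replace the largest value with `outlier`; report which measures move."""
    contaminated = sorted(values)[:-1] + [outlier]
    return {name: f(values) != f(contaminated) for name, f in MEASURES.items()}

data = [3, 5, 7, 8, 10, 12, 13, 15]  # hypothetical sample, not from the textbook
result = changed_by_outlier(data, 1000)
for name, moved in result.items():
    print(f"{name}: {'changes' if moved else 'unchanged'}")
```

Only the mean and the midrange move in this experiment, which matches the short answer. A single contaminated value is the simplest test; with several outliers on one side, the trimmed mean stays fixed only if the trimming removes all of them.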


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Mean
The mean is often what people first think of when considering the average. Calculating the mean involves adding up all the numbers in a data set and then dividing by the total number of values. For example, if you have the data set \( \{3, 7, 8, 5, 10\} \), the mean would be \( (3+7+8+5+10) / 5 = 6.6 \).

However, the mean has a significant downside: it is not resistant to outliers. Outliers are observations that lie far from the rest of the data. Because every value enters the sum, a single extreme value can pull the mean away from where most of the data sit. For instance, if the number 50 were added to the previous data set, the mean would jump to \( (3+7+8+5+10+50) / 6 \approx 13.83 \), which is larger than five of the six observations and does not represent the center well.
Median
The median offers a more robust measure of center compared to the mean. It refers to the midpoint of a data set when arranged in ascending order. For example, in the arranged dataset \( \{3, 5, 7, 8, 10\} \), the median is 7. When dealing with an even number of values, the median is calculated by taking the average of the two middle numbers.

The median is resistant to outliers. Even if an extreme value is added, it generally does not affect the median significantly, making it an excellent choice for data sets with outliers. In fact, if the 50 were added to the previous set, the ordered set becomes \( \{3, 5, 7, 8, 10, 50\} \), where the median is still a stable \( (7 + 8) / 2 = 7.5 \).
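The contrast between the two measures can be reproduced with Python's standard `statistics` module. This short sketch uses the data set from the examples above; the variable names are illustrative only.

```python
from statistics import mean, median

original = [3, 7, 8, 5, 10]
with_outlier = original + [50]

# How far each measure moves when the value 50 is added
mean_shift = mean(with_outlier) - mean(original)
median_shift = median(with_outlier) - median(original)

print(f"mean:   {mean(original):.2f} -> {mean(with_outlier):.2f}")
print(f"median: {median(original):.2f} -> {median(with_outlier):.2f}")
```

The mean moves by about 7.23 while the median moves by only 0.5, and that small change comes from the sample size changing from odd to even, not from the size of the added value.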
Trimmed Mean
The trimmed mean is a twist on the concept of mean, designed to enhance its resistance to outliers. It does this by trimming away a certain percentage of the smallest and largest values from the data set before calculating the mean. A 10% trimmed mean, for example, removes the smallest 10% and the largest 10% of the values, which requires a data set large enough for 10% to correspond to at least one observation. With the six values \( \{3, 7, 8, 5, 10, 50\} \), trimming one observation from each end (about 16.7%) removes 3 and 50 and gives \( (5+7+8+10) / 4 = 7.5 \).

Trimmed means balance out the influence of outliers, offering more stability than a simple mean. This makes it a powerful tool for data analysts dealing with datasets prone to extreme values. This feature gives the trimmed mean a unique place in statistical analysis, providing a compromise between the simplicity of the mean and the robustness of the median.
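A minimal sketch of the calculation follows. The function name and the rule for converting a trimming proportion into a count (rounding down to a whole number of observations per end) are assumptions; some textbooks instead interpolate between neighboring trimming percentages.

```python
def trimmed_mean(values, proportion):
    """Mean after dropping floor(proportion * n) values from each end.

    The floor rule is an assumed convention, not the only one in use.
    """
    if not 0 <= proportion < 0.5:
        raise ValueError("proportion must be in [0, 0.5)")
    ordered = sorted(values)
    k = int(proportion * len(ordered))   # number of values trimmed per end
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

data = [3, 7, 8, 5, 10, 50]

# 10% of 6 observations is less than one value, so nothing is trimmed
ten_percent = trimmed_mean(data, 0.10)

# 20% of 6 rounds down to one value per end: 3 and 50 are dropped
twenty_percent = trimmed_mean(data, 0.20)

print(ten_percent, twenty_percent)
```

With this small sample, the 10% version is identical to the ordinary mean and still reflects the outlier, while the 20% version drops it entirely. This is why the trimming percentage has to be chosen with the sample size in mind.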
Midrange
The midrange measure of center is calculated by taking the average of the smallest and largest values in a data set. To illustrate, with the data set \( \{3, 7, 8, 5, 10, 50\} \), the midrange would be \( (3+50)/2 = 26.5 \).

Midrange is highly vulnerable to outliers, as it relies directly on the extremities of the data set. An unusually large or small value can drastically skew the midrange, making it unreliable for datasets with significant outliers. Therefore, while the midrange can offer quick insights in consistent datasets, it's not ideal when extremes are present.
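To see how directly the midrange follows the extremes, the sketch below keeps the five ordinary values fixed and varies only the largest observation. The values 500 and 5000 are hypothetical and chosen only to exaggerate the effect.

```python
def midrange(values):
    """Average of the smallest and largest values."""
    return (min(values) + max(values)) / 2

base = [3, 7, 8, 5, 10]

# Midrange when one additional, increasingly extreme value is appended
results = {largest: midrange(base + [largest]) for largest in (50, 500, 5000)}

for largest, value in results.items():
    print(f"largest = {largest:>4}: midrange = {value}")
```

Each increase in the largest value moves the midrange by half that increase, so there is no limit to how far one observation can drag it.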
Midfourth
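Midfourth is a measure less commonly used but still valuable in understanding data. It is the average of the two fourths: the lower fourth is the median of the smallest half of the ordered data, and the upper fourth is the median of the largest half. The fourths are close to the first and third quartiles, so the midfourth is roughly the average of the two quartiles. For the data set \( \{3, 5, 7, 8, 10, 50\} \), the lower fourth is 5, the upper fourth is 10, and the midfourth is \( (5+10)/2 = 7.5 \).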

The midfourth is resistant to outliers, similar to the median. It depends only on values near the 25% and 75% positions of the ordered data, so changing the smallest or largest few observations leaves it unchanged. This makes the midfourth a robust measure of central tendency in datasets where extremes may distort measures like the mean or midrange.
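The fourths can be computed as medians of the two halves of the ordered data. The sketch below assumes the common textbook convention that the overall median is included in both halves when the number of observations is odd; the function names are illustrative.

```python
from statistics import median

def fourths(values):
    """Return (lower fourth, upper fourth) as medians of the two halves.

    Assumed convention: for odd n, the median belongs to both halves.
    """
    ordered = sorted(values)
    half = (len(ordered) + 1) // 2
    return median(ordered[:half]), median(ordered[-half:])

def midfourth(values):
    """Average of the lower and upper fourths."""
    lower, upper = fourths(values)
    return (lower + upper) / 2

data = [3, 7, 8, 5, 10, 50]
extreme = [3, 7, 8, 5, 10, 5000]   # same data with a far more extreme maximum
odd_sample = [3, 5, 7, 8, 10]      # median 7 is shared by both halves

print(fourths(data), midfourth(data))
print(midfourth(extreme))
print(midfourth(odd_sample))
```

The first two results agree: replacing 50 with 5000 changes neither fourth, because the largest value never becomes the median of the upper half.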


Most popular questions from this chapter

In a study of warp breakage during the weaving of fabric (Technometrics, 1982: 63), 100 specimens of yarn were tested. The number of cycles of strain to breakage was determined for each yarn specimen, resulting in the following data: \(\begin{array}{rrrrrrrrrr}86 & 146 & 251 & 653 & 98 & 249 & 400 & 292 & 131 & 169 \\ 175 & 176 & 76 & 264 & 15 & 364 & 195 & 262 & 88 & 264 \\ 157 & 220 & 42 & 321 & 180 & 198 & 38 & 20 & 61 & 121 \\ 282 & 224 & 149 & 180 & 325 & 250 & 196 & 90 & 229 & 166 \\ 38 & 337 & 65 & 151 & 341 & 40 & 40 & 135 & 597 & 246 \\ 211 & 180 & 93 & 315 & 353 & 571 & 124 & 279 & 81 & 186 \\ 497 & 182 & 423 & 185 & 229 & 400 & 338 & 290 & 398 & 71 \\ 246 & 185 & 188 & 568 & 55 & 55 & 61 & 244 & 20 & 284 \\ 393 & 396 & 203 & 829 & 239 & 236 & 286 & 194 & 277 & 143 \\ 198 & 264 & 105 & 203 & 124 & 137 & 135 & 350 & 193 & 188\end{array}\) a. Construct a relative frequency histogram based on the class intervals \(0-100, 100-200, \ldots\), and comment on features of the distribution. b. Construct a histogram based on the following class intervals: \(0-50, 50-100, 100-150, 150-200, 200-300, 300-400, 400-500, 500-600, 600-900\). c. If weaving specifications require a breaking strength of at least 100 cycles, what proportion of the yarn specimens in this sample would be considered satisfactory?

A study carried out to investigate the distribution of total braking time (reaction time plus accelerator-to-brake movement time, in msec) during real driving conditions at \(60 \mathrm{~km} / \mathrm{h}\) gave the following summary information on the distribution of times ("A Field Study on Braking Responses during Driving," Ergonomics, 1995: 1903-1910): mean \(= 535\), median \(= 500\), mode \(= 500\), sd \(= 96\), minimum \(= 220\), maximum \(= 925\), 5th percentile \(= 400\), 10th percentile \(= 430\), 90th percentile \(= 640\), 95th percentile \(= 720\). What can you conclude about the shape of a histogram of this data? Explain your reasoning.

Unlike most packaged food products, alcohol beverage container labels are not required to show calorie or nutrient content. The article "What Am I Drinking? The Effects of Serving Facts Information on Alcohol Beverage Containers" (J. of Consumer Affairs, 2008: 81-99) reported on a pilot study in which each individual in a sample was asked to estimate the calorie content of a 12 oz can of light beer known to contain 103 cal. The following information appeared in the article: \(\begin{array}{lr} \hline \text{Class} & \text{Percentage} \\ \hline 0-<50 & 7 \\ 50-<75 & 9 \\ 75-<100 & 23 \\ 100-<125 & 31 \\ 125-<150 & 12 \\ 150-<200 & 3 \\ 200-<300 & 12 \\ 300-<500 & 3 \\ \hline \end{array}\) a. Construct a histogram of the data and comment on any interesting features. b. What proportion of the estimates were at least 100? Less than 200?

The value of Young's modulus (GPa) was determined for cast plates consisting of certain intermetallic substrates, resulting in the following sample observations ("Strength and Modulus of a Molybdenum-Coated Ti-25Al-10Nb-3U-1Mo Intermetallic," J. Mater. Engrg. Perform., 1997: 46-50): 116.4, 115.9, 114.6, 115.2, 115.8. a. Calculate \(\bar{x}\) and the deviations from the mean. b. Use the deviations calculated in part (a) to obtain the sample variance and the sample standard deviation. c. Calculate \(s^{2}\) by using the computational formula for the numerator \(S_{xx}\). d. Subtract 100 from each observation to obtain a sample of transformed values. Now calculate the sample variance of these transformed values, and compare it to \(s^{2}\) for the original data. State the general principle.

Every score in the following batch of exam scores is in the 60s, 70s, 80s, or 90s. A stem-and-leaf display with only the four stems 6, 7, 8, and 9 would not give a very detailed description of the distribution of scores. In such situations, it is desirable to use repeated stems. Here we could repeat the stem 6 twice, using 6L for scores in the low 60s (leaves 0, 1, 2, 3, and 4) and 6H for scores in the high 60s (leaves 5, 6, 7, 8, and 9). Similarly, the other stems can be repeated twice to obtain a display consisting of eight rows. Construct such a display for the given scores. What feature of the data is highlighted by this display? \(\begin{array}{lllllllllllll}74 & 89 & 80 & 93 & 64 & 67 & 72 & 70 & 66 & 85 & 89 & 81 & 81 \\ 71 & 74 & 82 & 85 & 63 & 72 & 81 & 81 & 95 & 84 & 81 & 80 & 70 \\ 69 & 66 & 60 & 83 & 85 & 98 & 84 & 68 & 90 & 82 & 69 & 72 & 87 \\ 88 & & & & & & & & & & & & \end{array}\)
