Problem 34

The three measures of center introduced in this chapter are the mean, median, and trimmed mean. Two additional measures of center that are occasionally used are the midrange, which is the average of the smallest and largest observations, and the midfourth, which is the average of the two fourths. Which of these five measures of center are resistant to the effects of outliers and which are not? Explain your reasoning.

Short Answer

Resistant: Median, Trimmed Mean, and Midfourth. Not resistant: Mean and Midrange.

Step by step solution

01

Understanding Resistance to Outliers

A measure of center is resistant if it is not heavily influenced by extreme values (outliers). We will evaluate each measure to determine its resistance.
02

Evaluate Mean

The mean is calculated as the sum of all values divided by the number of values. This measure is not resistant to outliers, as a single extreme value can significantly affect it.
03

Evaluate Median

The median is the middle value of the ordered data set (or the average of the two middle values when the number of observations is even). It is resistant to outliers because it depends only on the values in the middle of the ordering, not on how far the extremes lie from them.
04

Evaluate Trimmed Mean

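A trimmed mean removes a fixed percentage of the smallest and largest observations and averages what remains. It is resistant to outliers as long as the trimming percentage is large enough to discard them; a 10% trimmed mean, for example, is unaffected by extreme values confined to the lowest or highest 10% of the data.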
05

Evaluate Midrange

The midrange is calculated as the average of the smallest and largest observations. It is not resistant to outliers, as these extreme values directly determine the midrange.
06

Evaluate Midfourth

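The midfourth is the average of the lower and upper fourths, which are essentially the first and third quartiles. Because the fourths are located about a quarter of the way in from each end of the ordered data, a few extreme observations at either end do not change them, so the midfourth is resistant to outliers.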
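The midfourth step can be checked numerically. The sketch below assumes one common convention for the fourths (the lower fourth is the median of the smaller half of the ordered data, the upper fourth is the median of the larger half, and the middle value belongs to both halves when the sample size is odd). The function names and the two data sets are made up for illustration.

```python
from statistics import median

def fourths(data):
    """Return (lower_fourth, upper_fourth).

    Assumed convention: each fourth is the median of one half of the
    ordered data; the middle value is in both halves when n is odd.
    """
    xs = sorted(data)
    n = len(xs)
    half = (n + 1) // 2  # size of each half
    return median(xs[:half]), median(xs[n - half:])

def midfourth(data):
    """Average of the lower and upper fourths."""
    lower, upper = fourths(data)
    return (lower + upper) / 2

clean = [2, 4, 6, 8, 10, 12, 14, 16]          # illustrative data
with_outlier = [2, 4, 6, 8, 10, 12, 14, 160]  # largest value made extreme

print(midfourth(clean))         # 9.0
print(midfourth(with_outlier))  # 9.0
```

Replacing the largest value with an extreme one leaves both fourths, and therefore the midfourth, unchanged.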

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Mean
The mean is one of the most commonly used measures of central tendency. It is calculated by adding up all the values in a data set and then dividing by the number of values. This gives you the average. For instance, if you have data points like 2, 4, 6, 8, and 10, the mean would be \( \frac{2 + 4 + 6 + 8 + 10}{5} = 6 \).

A key feature of the mean is that it uses all the data points in its computation. However, this means that it is sensitive to outliers. An outlier is a value that "lies outside" most of the other values in your data set. For example, if your data set is 2, 4, 6, 8, and 50, the mean becomes \( \frac{2 + 4 + 6 + 8 + 50}{5} = 14 \).

As you can see, the outlier 50 raises the mean from 6 to 14, which shows that the mean is not resistant to outliers.
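The two calculations above can be reproduced in a few lines of Python. This is only a sketch using the data sets from the text; the variable names are illustrative.

```python
original = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 50]  # the value 10 replaced by the outlier 50

# Mean = sum of the values divided by the number of values.
mean_original = sum(original) / len(original)
mean_outlier = sum(with_outlier) / len(with_outlier)

print(mean_original)                 # 6.0
print(mean_outlier)                  # 14.0
print(mean_outlier - mean_original)  # 8.0, the shift caused by one value
```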
Median
The median is another important measure of central tendency, which represents the middle value when a data set is arranged in ascending order. If the number of observations is even, the median is the average of the two middle numbers. For instance, in the data set 3, 5, 7, the median is 5. For an even set, such as 3, 5, 7, 9, the median would be \( \frac{5 + 7}{2} = 6 \).

Unlike the mean, the median is a positional measure: it depends on which value sits in the middle of the ordered data, not on the magnitudes of the values at the ends. As a result, it is largely unaffected by outliers. If the largest value in the data set 3, 5, 7 is replaced by 50, giving 3, 5, 50, the median remains 5. Therefore, the median is considered a resistant measure of center.
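As a quick check of the median examples, the sketch below uses `statistics.median` from the Python standard library on the data sets from the text. The variable names are illustrative.

```python
from statistics import median

odd_set = [3, 5, 7]
even_set = [3, 5, 7, 9]
odd_set_with_outlier = [3, 5, 50]  # largest value replaced by an outlier

print(median(odd_set))               # 5
print(median(even_set))              # 6.0, the average of 5 and 7
print(median(odd_set_with_outlier))  # 5, unchanged by the outlier
```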
Resistance to Outliers
Resistance to outliers is a valuable property for a measure of central tendency, as it prevents the measure from being unduly affected by extreme outliers. Outliers can distort statistical analyses by pulling the measure of center towards them.

  • The mean is not resistant as it incorporates all values, including outliers, in its calculation.
  • The median is resistant because its calculation is based on the position of data points, not their value.
  • The trimmed mean, by discarding a percentage of the highest and lowest data points, gains resistance to outliers.
  • The midrange, on the other hand, relies directly on the two extremes, so it is not resistant.
  • The midfourth averages the lower and upper fourths, which lie inside the bulk of the data, so it is resistant as well.

Understanding resistance helps in selecting the best measure when summarizing data, especially in the presence of outliers.
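One informal way to see resistance is to make a single observation more and more extreme and watch which measures move. The sketch below does this for all five measures. The data, the choice of trimming one value from each end, and the convention for the fourths (medians of the lower and upper halves) are assumptions made for illustration, and three trials are a demonstration rather than a proof.

```python
from statistics import mean, median

def trimmed_mean(xs, k=1):
    """Mean after dropping the k smallest and k largest values (k is an assumed choice)."""
    s = sorted(xs)
    return mean(s[k:len(s) - k])

def midrange(xs):
    """Average of the smallest and largest observations."""
    return (min(xs) + max(xs)) / 2

def midfourth(xs):
    # Assumed convention: fourths are the medians of the lower and upper
    # halves, with the middle value in both halves when n is odd.
    s = sorted(xs)
    half = (len(s) + 1) // 2
    return (median(s[:half]) + median(s[len(s) - half:])) / 2

measures = {
    "mean": mean,
    "median": median,
    "trimmed mean": trimmed_mean,
    "midrange": midrange,
    "midfourth": midfourth,
}

base = [2, 4, 6, 8, 10, 12, 14]  # made-up data
extremes = [16, 160, 1600]       # increasingly extreme largest value

# For each measure, its value as the largest observation grows.
results = {name: [f(base + [e]) for e in extremes] for name, f in measures.items()}

for name, values in results.items():
    verdict = "unchanged" if len(set(values)) == 1 else "shifts with the outlier"
    print(f"{name:13s} {values} -> {verdict}")
```

In this run the median, trimmed mean, and midfourth stay at 9 for every choice of the largest value, while the mean and midrange grow with it.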
Trimmed Mean
The trimmed mean offers a balance between the mean and the median. This measure is calculated by removing a certain percentage of the smallest and largest values in the data set and then determining the mean of the remaining data.

For example, in a data set of 1, 3, 5, 7, and 100, a trimmed mean with 20% trimming would remove 1 and 100, calculating the mean as \( \frac{3 + 5 + 7}{3} = 5 \).

By eliminating the extremes, the trimmed mean reduces the distortion caused by outliers and so gains resistance. It is a compromise between the mean, which uses every data point, and the median, which in effect trims away everything except the middle one or two values.
Midrange
The midrange is a measure of central tendency that considers only the smallest and largest observations in a data set. It is calculated by averaging these two extremes. For example, in the data set 2, 3, 10, the midrange would be \( \frac{2 + 10}{2} = 6 \).

Due to its reliance on the most extreme values, the midrange is highly sensitive to outliers. If an outlier is present, it can significantly skew the midrange. For example, if the data set changes to 2, 3, 100, the midrange becomes \( \frac{2 + 100}{2} = 51 \), demonstrating a substantial shift.

This sensitivity makes the midrange a less reliable measure in the presence of outliers, compared to more resistant measures like the median or trimmed mean.
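The midrange examples can be reproduced with a short sketch; the function name is illustrative.

```python
def midrange(data):
    """Average of the smallest and largest observations."""
    return (min(data) + max(data)) / 2

print(midrange([2, 3, 10]))   # 6.0
print(midrange([2, 3, 100]))  # 51.0
```

Because `min` and `max` are the only inputs, changing either extreme changes the result directly.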

Most popular questions from this chapter

Temperature transducers of a certain type are shipped in batches of 50. A sample of 60 batches was selected, and the number of transducers in each batch not conforming to design specifications was determined, resulting in the following data: \(\begin{array}{llllllllllllllllllll}2 & 1 & 2 & 4 & 0 & 1 & 3 & 2 & 0 & 5 & 3 & 3 & 1 & 3 & 2 & 4 & 7 & 0 & 2 & 3 \\ 0 & 4 & 2 & 1 & 3 & 1 & 1 & 3 & 4 & 1 & 2 & 3 & 2 & 2 & 8 & 4 & 5 & 1 & 3 & 1 \\ 5 & 0 & 2 & 3 & 2 & 1 & 0 & 6 & 4 & 2 & 1 & 6 & 0 & 3 & 3 & 3 & 6 & 1 & 2 & 3\end{array}\) a. Determine frequencies and relative frequencies for the observed values of \(x=\) number of nonconforming transducers in a batch. b. What proportion of batches in the sample have at most five nonconforming transducers? What proportion have fewer than five? What proportion have at least five nonconforming units? c. Draw a histogram of the data using relative frequency on the vertical scale, and comment on its features.

The accompanying frequency distribution of fracture strength (MPa) observations for ceramic bars fired in a particular kiln appeared in the article "Evaluating Tunnel Kiln Performance" (Amer. Ceramic Soc. Bull., Aug. 1997: 59-63). \(\begin{array}{lccccc}\text { Class } & 81-<83 & 83-<85 & 85-<87 & 87-<89 & 89-<91 \\ \text { Frequency } & 6 & 7 & 17 & 30 & 43 \\ \text { Class } & 91-<93 & 93-<95 & 95-<97 & 97-<99 \\ \text { Frequency } & 28 & 22 & 13 & 3\end{array}\) a. Construct a histogram based on relative frequencies, and comment on any interesting features. b. What proportion of the strength observations are at least 85? Less than 95? c. Roughly what proportion of the observations are less than 90?

a. For what value of \(c\) is the quantity \(\sum\left(x_{i}-c\right)^{2}\) minimized? [Hint: Take the derivative with respect to \(c\), set equal to 0, and solve.] b. Using the result of part (a), which of the two quantities \(\sum\left(x_{i}-\bar{x}\right)^{2}\) and \(\sum\left(x_{i}-\mu\right)^{2}\) will be smaller than the other (assuming that \(\bar{x} \neq \mu\))?

Exposure to microbial products, especially endotoxin, may have an impact on vulnerability to allergic diseases. The article "Dust Sampling Methods for Endotoxin-An Essential, But Underestimated Issue" (Indoor Air, 2006: 20-27) considered various issues associated with determining endotoxin concentration. The following data on concentration (EU/mg) in settled dust for one sample of urban homes and another of farm homes was kindly supplied by the authors of the cited article. \(\begin{array}{lrrrrrrrrrrr}\mathrm{U}: & 6.0 & 5.0 & 11.0 & 33.0 & 4.0 & 5.0 & 80.0 & 18.0 & 35.0 & 17.0 & 23.0 \\ \mathrm{~F}: & 4.0 & 14.0 & 11.0 & 9.0 & 9.0 & 8.0 & 4.0 & 20.0 & 5.0 & 8.9 & 21.0 \\ & 9.2 & 3.0 & 2.0 & 0.3 & & & & & & & \end{array}\) a. Determine the sample mean for each sample. How do they compare? b. Determine the sample median for each sample. How do they compare? Why is the median for the urban sample so different from the mean for that sample? c. Calculate the trimmed mean for each sample by deleting the smallest and largest observation. What are the corresponding trimming percentages? How do the values of these trimmed means compare to the corresponding means and medians?

Consider numerical observations \(x_{1}, \ldots, x_{n}\). It is frequently of interest to know whether the \(x_{i}\)'s are (at least approximately) symmetrically distributed about some value. If \(n\) is at least moderately large, the extent of symmetry can be assessed from a stem-and-leaf display or histogram. However, if \(n\) is not very large, such pictures are not particularly informative. Consider the following alternative. Let \(y_{1}\) denote the smallest \(x_{i}\), \(y_{2}\) the second smallest \(x_{i}\), and so on. Then plot the following pairs as points on a two-dimensional coordinate system: \(\left(y_{n}-\tilde{x}, \tilde{x}-y_{1}\right),\left(y_{n-1}-\tilde{x}, \tilde{x}-y_{2}\right)\), \(\left(y_{n-2}-\tilde{x}, \tilde{x}-y_{3}\right), \ldots\). There are \(n / 2\) points when \(n\) is even and \((n-1) / 2\) when \(n\) is odd. a. What does this plot look like when there is perfect symmetry in the data? What does it look like when observations stretch out more above the median than below it (a long upper tail)? b. The accompanying data on rainfall (acre-feet) from 26 seeded clouds is taken from the article "A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification" (Technometrics, 1975: 161-166). Construct the plot and comment on the extent of symmetry or nature of departure from symmetry. \(\begin{array}{rrrrrrr}4.1 & 7.7 & 17.5 & 31.4 & 32.7 & 40.6 & 92.4 \\ 115.3 & 118.3 & 119.0 & 129.6 & 198.6 & 200.7 & 242.5 \\ 255.0 & 274.7 & 274.7 & 302.8 & 334.1 & 430.0 & 489.1 \\ 703.4 & 978.0 & 1656.0 & 1697.8 & 2745.6 & & \end{array}\)
