Problem 34

Exposure to microbial products, especially endotoxin, may have an impact on vulnerability to allergic diseases. The article "Dust Sampling Methods for Endotoxin-An Essential, But Underestimated Issue" (Indoor Air, 2006: 20–27) considered various issues associated with determining endotoxin concentration. The following data on concentration (EU/mg) in settled dust for one sample of urban homes and another of farm homes was kindly supplied by the authors of the cited article.

\[\begin{array}{lrrrrrrrrrrr}\mathrm{U}: & 6.0 & 5.0 & 11.0 & 33.0 & 4.0 & 5.0 & 80.0 & 18.0 & 35.0 & 17.0 & 23.0 \\ \mathrm{F}: & 4.0 & 14.0 & 11.0 & 9.0 & 9.0 & 8.0 & 4.0 & 20.0 & 5.0 & 8.9 & 21.0 \\ & 9.2 & 3.0 & 2.0 & 0.3 & & & & & & & \end{array}\]

a. Determine the sample mean for each sample. How do they compare?

b. Determine the sample median for each sample. How do they compare? Why is the median for the urban sample so different from the mean for that sample?

c. Calculate the trimmed mean for each sample by deleting the smallest and largest observation. What are the corresponding trimming percentages? How do the values of these trimmed means compare to the corresponding means and medians?

Short Answer

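The urban mean (about 21.55) is pulled well above the urban median (17) by the extreme value 80, while the farm mean (8.56) and median (8.9) nearly agree. Trimming one observation from each end brings the urban mean down to the median (17) and moves the farm mean only slightly (to about 8.24).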

Step by step solution

01

Calculate Sample Mean for Each Sample

First, sum all the values in the urban sample \(U\) and divide by the number of observations.

\[\text{Mean for U} = \frac{6 + 5 + 11 + 33 + 4 + 5 + 80 + 18 + 35 + 17 + 23}{11} = \frac{237}{11} \approx 21.55\]

Similarly, sum all the values in the farm sample \(F\) and divide by the number of observations.

\[\text{Mean for F} = \frac{4 + 14 + 11 + 9 + 9 + 8 + 4 + 20 + 5 + 8.9 + 21 + 9.2 + 3 + 2 + 0.3}{15} = \frac{128.4}{15} = 8.56\]
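
The arithmetic above can be checked with a short Python sketch. The variable and function names (`urban`, `farm`, `sample_mean`) are chosen here for illustration; the data are copied from the problem statement.

```python
# Endotoxin concentrations (EU/mg) copied from the problem statement.
urban = [6.0, 5.0, 11.0, 33.0, 4.0, 5.0, 80.0, 18.0, 35.0, 17.0, 23.0]
farm = [4.0, 14.0, 11.0, 9.0, 9.0, 8.0, 4.0, 20.0, 5.0, 8.9, 21.0,
        9.2, 3.0, 2.0, 0.3]


def sample_mean(values):
    """Sum of the observations divided by the number of observations."""
    return sum(values) / len(values)


urban_mean = sample_mean(urban)
farm_mean = sample_mean(farm)

print(f"urban mean: {urban_mean:.2f}")  # 237 / 11
print(f"farm mean:  {farm_mean:.2f}")   # 128.4 / 15
```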
02

Calculate Sample Median for Each Sample

The median is the middle value when the data are ordered.

For \(U\), arrange the values and find the middle one. Values in \(U\): 4, 5, 5, 6, 11, 17, 18, 23, 33, 35, 80. Median for \(U\) = 17 (6th value in the ordered list of 11).

For \(F\), arrange the values and find the middle one. Values in \(F\): 0.3, 2, 3, 4, 4, 5, 8, 8.9, 9, 9, 9.2, 11, 14, 20, 21. Median for \(F\) = 8.9 (8th value in the ordered list of 15).
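
As a minimal sketch of the same procedure, the function below sorts the data and picks the middle value, averaging the two middle values when the count is even (a case that does not arise for these two samples). The name `sample_median` is an illustrative choice.

```python
# Endotoxin concentrations (EU/mg) copied from the problem statement.
urban = [6.0, 5.0, 11.0, 33.0, 4.0, 5.0, 80.0, 18.0, 35.0, 17.0, 23.0]
farm = [4.0, 14.0, 11.0, 9.0, 9.0, 8.0, 4.0, 20.0, 5.0, 8.9, 21.0,
        9.2, 3.0, 2.0, 0.3]


def sample_median(values):
    """Middle value of the ordered data (mean of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


urban_median = sample_median(urban)
farm_median = sample_median(farm)

print("urban median:", urban_median)
print("farm median: ", farm_median)
```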
03

Compare Means and Medians

For \(U\), the mean is much larger than the median (21.55 vs. 17), largely because of the value 80, a large outlier that pulls the mean upward.

For \(F\), the mean and median are quite similar (8.56 vs. 8.9), indicating a more symmetric distribution with no extreme values distorting the mean.
04

Calculate Trimmed Means

Remove the smallest and largest values from each dataset.

For \(U\), remove 4 and 80. Remaining values: 5, 5, 6, 11, 17, 18, 23, 33, 35.

\[\text{Trimmed Mean for U} = \frac{5 + 5 + 6 + 11 + 17 + 18 + 23 + 33 + 35}{9} = \frac{153}{9} = 17.00\]

For \(F\), remove 0.3 and 21. Remaining values: 2, 3, 4, 4, 5, 8, 8.9, 9, 9, 9.2, 11, 14, 20.

\[\text{Trimmed Mean for F} = \frac{2 + 3 + 4 + 4 + 5 + 8 + 8.9 + 9 + 9 + 9.2 + 11 + 14 + 20}{13} = \frac{107.1}{13} \approx 8.24\]
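
The trimming step can be sketched in Python as follows. The function name `trimmed_mean` and its parameter `k` (the number of observations dropped from each end, 1 in this exercise) are assumptions made for illustration.

```python
# Endotoxin concentrations (EU/mg) copied from the problem statement.
urban = [6.0, 5.0, 11.0, 33.0, 4.0, 5.0, 80.0, 18.0, 35.0, 17.0, 23.0]
farm = [4.0, 14.0, 11.0, 9.0, 9.0, 8.0, 4.0, 20.0, 5.0, 8.9, 21.0,
        9.2, 3.0, 2.0, 0.3]


def trimmed_mean(values, k=1):
    """Mean of the data after dropping the k smallest and k largest values."""
    ordered = sorted(values)
    if 2 * k >= len(ordered):
        raise ValueError("k is too large: no observations would remain")
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)


urban_trimmed = trimmed_mean(urban)
farm_trimmed = trimmed_mean(farm)

print(f"urban trimmed mean: {urban_trimmed:.2f}")
print(f"farm trimmed mean:  {farm_trimmed:.2f}")
```

Sorting first means the function does not depend on the order in which the observations were recorded.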
05

Compare Trimmed Means, Means, and Medians

The trimmed mean for \(U\) (17.00) coincides with the median (17) and is well below the original mean (21.55), showing that removing the extreme value 80 offsets most of the skew.

For \(F\), the trimmed mean (8.24) stays close to both the mean (8.56) and the median (8.9), so trimming had little effect because the original data are less skewed.

Trimming percentages, measured as the share of observations removed from each end:
- \(U\): 1 of 11 values from each end, about 9.09% (2 of 11, about 18.18%, in total),
- \(F\): 1 of 15 values from each end, about 6.67% (2 of 15, about 13.33%, in total).
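
Because a trimming percentage may be quoted either per end or as the total share of observations removed, the following sketch reports both figures. The function name `trimming_percentages` is hypothetical, and `k` is again the number of values dropped from each end.

```python
def trimming_percentages(n, k=1):
    """Return (percent removed from each end, percent removed in total)."""
    if 2 * k >= n:
        raise ValueError("k is too large: no observations would remain")
    per_end = 100 * k / n
    return per_end, 2 * per_end


# Sample sizes from the exercise: 11 urban homes and 15 farm homes.
for label, n in (("urban", 11), ("farm", 15)):
    per_end, total = trimming_percentages(n)
    print(f"{label}: {per_end:.2f}% per end, {total:.2f}% in total")
```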

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Sample Mean
The sample mean is a basic statistical measure that provides an average of all the values in a dataset. It's calculated by summing all the numbers and then dividing by the count of those numbers.

In this exercise, we looked at endotoxin concentrations in settled dust from both urban and farm homes. For the urban sample, the 11 values total 237, which gives a mean of about 21.55. For the farm sample, the 15 observations total 128.4, which gives a mean of 8.56.

The urban mean is roughly 2.5 times the farm mean, suggesting a higher average endotoxin concentration in urban homes. Because a single extreme urban value (80) contributes heavily to that mean, the comparison should be read alongside the medians.
Sample Median
The median provides a 'middle' value in a dataset, offering an alternative to the mean that is less affected by extremely high or low values. To find it, you must first arrange numbers in order.

For the urban sample, the ordered values gave a median of 17, while for the farm sample, it was 8.9. Unlike the mean, the sample median can often resist the influence of outliers, providing a better sense of the data's central tendency when anomalies exist.

Notably, in the urban data the median (17) is well below the mean (21.55), a sign of positive skew caused largely by the high value of 80. This gap is why it helps to report more than one measure of center.
Trimmed Mean
The trimmed mean is a way to calculate the average by excluding a set percentage of the highest and lowest data points. This method reduces the potential impact of extreme outliers.
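
As a sketch in symbols, assume the ordered observations are written \(x_{(1)} \le \dots \le x_{(n)}\) and that \(k\) values are removed from each end (this notation is introduced here for illustration and is not taken from the exercise). The trimmed mean is then

\[\bar{x}_{\mathrm{tr}} = \frac{1}{n-2k}\sum_{i=k+1}^{n-k} x_{(i)}\]

With \(k = 0\) this reduces to the ordinary sample mean; in this exercise \(k = 1\).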

In this case, both datasets had their smallest and largest values removed: 4 and 80 from the urban set, 0.3 and 21 from the farm set. This gave a trimmed mean of 17.00 for urban homes and about 8.24 for farm homes.

For urban homes the trimmed mean equals the median (17) and sits well below the full mean (21.55), because trimming removes the skewing influence of the value 80. For farm homes the trimmed mean, the full mean (8.56), and the median (8.9) all lie within about 0.7 of one another, so trimming changes little.
Data Distribution
Data distribution describes how values are spread within a dataset. It helps identify patterns, trends, and potential anomalies. Common shapes include symmetric, positively (right) skewed, and negatively (left) skewed distributions.

In examining the urban and farm datasets, the urban data appear positively skewed because of the high value of 80, which pushes the mean upward but leaves the median largely unaffected. In contrast, the farm data look closer to symmetric, with the mean and median lying near each other.

Understanding a dataset's distribution is crucial to choosing the appropriate statistical tools and interpreting results effectively without being misled by unusual data points.
Outliers
Outliers are data points that diverge significantly from other observations. They can heavily influence statistical measures like the mean, leading to misleading conclusions if not properly addressed.

In the urban data, the value 80 acts as a prominent outlier, affecting the mean far more than the median. The farm data contain no value that extreme: the largest observations (20 and 21) sit above the bulk of the sample but not far enough to separate the mean from the median.

Identifying and understanding outliers can provide insight into data quirks or errors and is crucial in discerning true trends from anomalies. In practice, excluding or adjusting for outliers via methods like the trimmed mean can offer a more representative analysis.
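
One way to see how differently the mean and median react to the urban outlier is to replace 80 with a smaller value and recompute both. In the sketch below, the replacement value 40 is purely hypothetical and chosen only for illustration; any value above 35 would leave the ordering, and hence the median, unchanged.

```python
from statistics import mean, median

# Urban endotoxin concentrations (EU/mg) copied from the problem statement.
urban = [6.0, 5.0, 11.0, 33.0, 4.0, 5.0, 80.0, 18.0, 35.0, 17.0, 23.0]

# Hypothetical variant: the extreme value 80 is replaced by 40 (an assumption).
urban_capped = [40.0 if x == 80.0 else x for x in urban]

mean_change = mean(urban) - mean(urban_capped)
median_change = median(urban) - median(urban_capped)

print(f"mean:   {mean(urban):.2f} -> {mean(urban_capped):.2f}")
print(f"median: {median(urban):.2f} -> {median(urban_capped):.2f}")
```

The mean drops by 40/11 (about 3.64) while the median does not move, which is the sense in which the median is resistant to the size of an extreme value.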

Most popular questions from this chapter

Let \(\bar{x}_{n}\) and \(s_{n}^{2}\) denote the sample mean and variance for the sample \(x_{1}, \ldots, x_{n}\) and let \(\bar{x}_{n+1}\) and \(s_{n+1}^{2}\) denote these quantities when an additional observation \(x_{n+1}\) is added to the sample.

a. Show how \(\bar{x}_{n+1}\) can be computed from \(\bar{x}_{n}\) and \(x_{n+1}\).

b. Show that $$ n s_{n+1}^{2}=(n-1) s_{n}^{2}+\frac{n}{n+1}\left(x_{n+1}-\bar{x}_{n}\right)^{2} $$ so that \(s_{n+1}^{2}\) can be computed from \(x_{n+1}, \bar{x}_{n}\), and \(s_{n}^{2}\).

c. Suppose that a sample of 15 strands of drapery yarn has resulted in a sample mean thread elongation of \(12.58 \mathrm{~mm}\) and a sample standard deviation of \(.512 \mathrm{~mm}\). A 16th strand results in an elongation value of \(11.8\). What are the values of the sample mean and sample standard deviation for all 16 elongation observations?

The article "Snow Cover and Temperature Relationships in North America and Eurasia" (J. Climate and Applied Meteorology, 1983: 460-469) used statistical techniques to relate the amount of snow cover on each continent to average continental temperature. Data presented there included the following ten observations on October snow cover for Eurasia during the years 1970-1979 (in million \(\mathrm{km}^{2}\)): \(\begin{array}{llllllllll}6.5 & 12.0 & 14.9 & 10.0 & 10.7 & 7.9 & 21.9 & 12.5 & 14.5 & 9.2\end{array}\) What would you report as a representative, or typical, value of October snow cover for this period, and what prompted your choice?

Observations on burst strength (\(\mathrm{lb}/\mathrm{in}^{2}\)) were obtained both for test nozzle closure welds and for production cannister nozzle welds ("Proper Procedures Are the Key to Welding Radioactive Waste Cannisters," Welding J., Aug. 1997: 61-67): \(\begin{array}{lllllll}\text { Test } & 7200 & 6100 & 7300 & 7300 & 8000 & 7400 \\ & 7300 & 7300 & 8000 & 6700 & 8300 & \\ \text { Cannister } & 5250 & 5625 & 5900 & 5900 & 5700 & 6050 \\ & 5800 & 6000 & 5875 & 6100 & 5850 & 6600\end{array}\) Construct a comparative boxplot and comment on interesting features (the cited article did not include such a picture, but the authors commented that they had looked at one).

The article "The Pedaling Technique of Elite Endurance Cyclists" (Int. J. of Sport Biomechanics, 1991: 29-53) reported the accompanying data on single-leg power at a high workload: \(\begin{array}{lllllll}244 & 191 & 160 & 187 & 180 & 176 & 174 \\ 205 & 211 & 183 & 211 & 180 & 194 & 200\end{array}\)

a. Calculate and interpret the sample mean and median.

b. Suppose that the first observation had been 204 rather than 244. How would the mean and median change?

c. Calculate a trimmed mean by eliminating the smallest and largest sample observations. What is the corresponding trimming percentage?

d. The article also reported values of single-leg power for a low workload. The sample mean for \(n=13\) observations was \(\bar{x}=119.8\) (actually 119.7692), and the 14th observation, somewhat of an outlier, was 159. What is the value of \(\bar{x}\) for the entire sample?

The accompanying data set consists of observations on shower-flow rate (L/min) for a sample of \(n=129\) houses in Perth, Australia ("An Application of Bayes Methodology to the Analysis of Diary Records in a Water Use Study," J. Amer. Stat. Assoc., 1987: 705-711): $$ \begin{array}{rrrrrrrrrr} 4.6 & 12.3 & 7.1 & 7.0 & 4.0 & 9.2 & 6.7 & 6.9 & 11.5 & 5.1 \\ 11.2 & 10.5 & 14.3 & 8.0 & 8.8 & 6.4 & 5.1 & 5.6 & 9.6 & 7.5 \\ 7.5 & 6.2 & 5.8 & 2.3 & 3.4 & 10.4 & 9.8 & 6.6 & 3.7 & 6.4 \\ 8.3 & 6.5 & 7.6 & 9.3 & 9.2 & 7.3 & 5.0 & 6.3 & 13.8 & 6.2 \\ 5.4 & 4.8 & 7.5 & 6.0 & 6.9 & 10.8 & 7.5 & 6.6 & 5.0 & 3.3 \\ 7.6 & 3.9 & 11.9 & 2.2 & 15.0 & 7.2 & 6.1 & 15.3 & 18.9 & 7.2 \\ 5.4 & 5.5 & 4.3 & 9.0 & 12.7 & 11.3 & 7.4 & 5.0 & 3.5 & 8.2 \\ 8.4 & 7.3 & 10.3 & 11.9 & 6.0 & 5.6 & 9.5 & 9.3 & 10.4 & 9.7 \\ 5.1 & 6.7 & 10.2 & 6.2 & 8.4 & 7.0 & 4.8 & 5.6 & 10.5 & 14.6 \\ 10.8 & 15.5 & 7.5 & 6.4 & 3.4 & 5.5 & 6.6 & 5.9 & 15.0 & 9.6 \\ 7.8 & 7.0 & 6.9 & 4.1 & 3.6 & 11.9 & 3.7 & 5.7 & 6.8 & 11.3 \\ 9.3 & 9.6 & 10.4 & 9.3 & 6.9 & 9.8 & 9.1 & 10.6 & 4.5 & 6.2 \\ 8.3 & 3.2 & 4.9 & 5.0 & 6.0 & 8.2 & 6.3 & 3.8 & 6.0 & \end{array} $$ a. Construct a stem-and-leaf display of the data. b. What is a typical, or representative, flow rate? c. Does the display appear to be highly concentrated or spread out? d. Does the distribution of values appear to be reasonably symmetric? If not, how would you describe the departure from symmetry? e. Would you describe any observation as being far from the rest of the data (an outlier)?
