Problem 83

Consider numerical observations \(x_{1}, \ldots, x_{n}\). It is frequently of interest to know whether the \(x_{i}\)s are (at least approximately) symmetrically distributed about some value. If \(n\) is at least moderately large, the extent of symmetry can be assessed from a stem-and-leaf display or histogram. However, if \(n\) is not very large, such pictures are not particularly informative. Consider the following alternative. Let \(y_{1}\) denote the smallest \(x_{i}\), \(y_{2}\) the second smallest \(x_{i}\), and so on. Then plot the following pairs as points on a two-dimensional coordinate system: \(\left(y_{n}-\tilde{x}, \tilde{x}-y_{1}\right), \left(y_{n-1}-\tilde{x}, \tilde{x}-y_{2}\right), \left(y_{n-2}-\tilde{x}, \tilde{x}-y_{3}\right), \ldots\) There are \(n / 2\) points when \(n\) is even and \((n-1) / 2\) when \(n\) is odd.

a. What does this plot look like when there is perfect symmetry in the data? What does it look like when observations stretch out more above the median than below it (a long upper tail)?

b. The accompanying data on rainfall (acre-feet) from 26 seeded clouds is taken from the article "A Bayesian Analysis of a Multiplicative Treatment Effect in Weather Modification" (Technometrics, 1975: 161-166). Construct the plot and comment on the extent of symmetry or nature of departure from symmetry.

\(\begin{array}{rrrrrrr}4.1 & 7.7 & 17.5 & 31.4 & 32.7 & 40.6 & 92.4 \\ 115.3 & 118.3 & 119.0 & 129.6 & 198.6 & 200.7 & 242.5 \\ 255.0 & 274.7 & 274.7 & 302.8 & 334.1 & 430.0 & 489.1 \\ 703.4 & 978.0 & 1656.0 & 1697.8 & 2745.6 & & \end{array}\)

Short Answer

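With perfect symmetry, the plotted points lie on the line through the origin with slope 1. For the rainfall data, the outer points fall far below that line, which indicates a long upper tail (strong positive skewness).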

Step by step solution

01

Arrange Data in Order

First, we arrange the data from smallest to largest to identify each \(y_i\). The sorted rainfall data is: 4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4, 115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5, 255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1, 703.4, 978.0, 1656.0, 1697.8, 2745.6.
02

Determine the Median

Since there are 26 observations, the median \(\tilde{x}\) is the average of the 13th and 14th ordered values. Calculate the median as follows: \(\tilde{x} = \frac{200.7 + 242.5}{2} = 221.6.\)
03

Calculate Difference Pairs

For each \(i = 1, \ldots, 13\), compute the pair \((y_{n-i+1} - \tilde{x}, \tilde{x} - y_i)\). There are 13 pairs because \(n = 26\) is even. With \(\tilde{x} = 221.6\), the pairs are: (2524.0, 217.5), (1476.2, 213.9), (1434.4, 204.1), (756.4, 190.2), (481.8, 188.9), (267.5, 181.0), (208.4, 129.2), (112.5, 106.3), (81.2, 103.3), (53.1, 102.6), (53.1, 92.0), (33.4, 23.0), (20.9, 20.9).
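The pair calculation is easy to get wrong by hand, so a short script is a useful check. The following is a minimal Python sketch, not part of the original solution. The function name `symmetry_pairs` is a hypothetical helper, and rounding to one decimal place is an assumption chosen to match the precision of the data.

```python
# Rainfall (acre-feet) for the 26 seeded clouds, as listed in the exercise.
rainfall = [
    4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4,
    115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5,
    255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1,
    703.4, 978.0, 1656.0, 1697.8, 2745.6,
]


def symmetry_pairs(data):
    """Return (y_{n-i+1} - median, median - y_i) pairs, outermost pair first.

    Hypothetical helper for illustration. It yields n/2 pairs for even n
    and (n-1)/2 pairs for odd n (the middle value is skipped).
    """
    y = sorted(data)
    n = len(y)
    half = n // 2
    median = y[half] if n % 2 else (y[half - 1] + y[half]) / 2
    return [
        (round(y[n - 1 - i] - median, 1), round(median - y[i], 1))
        for i in range(half)
    ]


pairs = symmetry_pairs(rainfall)
for upper, lower in pairs:
    print(f"({upper}, {lower})")
```

The first coordinate of each pair is the distance of an upper observation from the median and the second is the distance of the matching lower observation, so both are nonnegative.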
04

Analyze Symmetry in Plot

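Plot the first coordinate of each pair on the horizontal axis and the second on the vertical axis. If the data are perfectly symmetric about \(\tilde{x}\), the points lie on the line through the origin with slope 1, because \(y_{n-i+1} - \tilde{x} = \tilde{x} - y_i\) for every \(i\). With a long upper tail, the upper deviations exceed the lower ones, so the points fall below this line.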
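The two cases in part (a) can be written out explicitly. This is a short sketch of the reasoning, where \(d_i\) is a symbol introduced here for the common distance and the first coordinate is taken as the horizontal axis. Under perfect symmetry about \(\tilde{x}\), the \(i\)th largest and \(i\)th smallest observations are equally far from the median:

$$ y_{n-i+1} - \tilde{x} = \tilde{x} - y_{i} = d_{i} \geq 0, \qquad i = 1, 2, \ldots $$

Each plotted point is then \((d_i, d_i)\), which lies on the line through \((0, 0)\) with slope 1. With a long upper tail, the outer pairs instead satisfy

$$ y_{n-i+1} - \tilde{x} > \tilde{x} - y_{i}, $$

so the horizontal coordinate exceeds the vertical one and those points fall below that line.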
05

Evaluate Data Plot

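Plot the 13 pairs. The points nearest the median lie close to the line with slope 1, but moving outward the points fall farther and farther below it: the largest observation (2745.6) is about 2524 above the median, whereas the smallest (4.1) is only 217.5 below it. The data therefore stretch out much more above the median than below it, which indicates a long upper tail (strong positive skewness).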
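The position of each point relative to the line with slope 1 can also be tallied without drawing anything. The sketch below is an illustration only, assuming the first coordinate is plotted horizontally and that deviations are rounded to one decimal place before comparison; the variable names are not from the original solution.

```python
# Rainfall (acre-feet) for the 26 seeded clouds, as listed in the exercise.
rainfall = [
    4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4,
    115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5,
    255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1,
    703.4, 978.0, 1656.0, 1697.8, 2745.6,
]

y = sorted(rainfall)
n = len(y)
median = (y[n // 2 - 1] + y[n // 2]) / 2  # n = 26 is even

below = on = above = 0
for i in range(n // 2):
    upper = round(y[n - 1 - i] - median, 1)  # horizontal coordinate
    lower = round(median - y[i], 1)          # vertical coordinate
    if upper > lower:
        below += 1   # below the slope-1 line: upper deviation is larger
    elif upper < lower:
        above += 1   # above the slope-1 line: lower deviation is larger
    else:
        on += 1      # exactly on the line

print(below, on, above)  # prints: 9 1 3
```

The three points above the line all come from pairs near the middle of the data, where the deviations are small. Every one of the eight outermost pairs lies below the line, which is the signature of a long upper tail.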

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

median analysis
The median is a crucial concept in statistics, especially when discussing data symmetry. It represents the midpoint of a data set, meaning half the values are below it and half are above. For a perfectly symmetric distribution, the median not only splits the data into two equal halves, but it also suggests the central pivot point around which the data is balanced.

To find the median of a numerical data set, arrange the observations in ascending order. If the number of observations \(n\) is odd, the median is the middle value. If \(n\) is even, the median is the average of the two middle values. In this exercise, with 26 rainfall observations, we average the 13th and 14th ordered values to obtain \(\tilde{x} = 221.6\). The symmetry plot is then built from the deviations of the observations from \(\tilde{x}\).
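The odd and even cases can be captured in a few lines. This is a minimal sketch; `sample_median` is a hypothetical name, and the function assumes a non-empty list of numbers.

```python
def sample_median(values):
    """Median of a non-empty list: middle value (odd n) or mean of the two middle values (even n)."""
    y = sorted(values)
    n = len(y)
    mid = n // 2
    if n % 2 == 1:
        return y[mid]
    return (y[mid - 1] + y[mid]) / 2


print(sample_median([7, 1, 4]))      # odd n: prints 4
print(sample_median([7, 1, 4, 10]))  # even n: prints 5.5
```

Python's standard library offers the same calculation as `statistics.median`, which can serve as a cross-check.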

Having a solid grasp of how to find and interpret the median makes it easier to understand the distribution's shape and detect asymmetry.
stem-and-leaf display
A stem-and-leaf display, though not used in the solution itself, provides a quick visual snapshot of data distribution. It organizes data to show its shape and distribution, making it easier to see patterns, such as clusters and gaps, as well as outliers. Each number is split into a "stem," usually the leading digit(s), and a "leaf," representing the trailing digits.

If we were to use a stem-and-leaf plot for the rainfall data in this exercise, we'd see each number represented in such a way that maintains the original data points. This is advantageous for smaller datasets because it retains the raw data while showing distribution.
  • Stems are written once, while leaves are aligned as trailing digits.
  • This format emphasizes frequency and order.
  • Helps detect symmetry by providing an intuitive view of data cluster distribution on either side of the median.

When data are symmetrically distributed, the rows of leaves above and below the stem that contains the median have similar lengths. For asymmetric data, the leaves trail off over many more stems on one side, indicating skewness.
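To see what such a display would look like for the rainfall data, here is a small text-based sketch. The choice of stems (hundreds) and leaves (tens digit, truncated rather than rounded) is an assumption made for this illustration, and `stem_and_leaf` is a hypothetical helper name.

```python
from collections import defaultdict

# Rainfall (acre-feet) for the 26 seeded clouds, as listed in the exercise.
rainfall = [
    4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4,
    115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5,
    255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1,
    703.4, 978.0, 1656.0, 1697.8, 2745.6,
]


def stem_and_leaf(data, stem_unit=100, leaf_unit=10):
    """Return display lines; stem = hundreds, leaf = tens digit (truncated)."""
    rows = defaultdict(list)
    for x in sorted(data):
        stem = int(x // stem_unit)
        leaf = int((x % stem_unit) // leaf_unit)
        rows[stem].append(leaf)
    lines = []
    # Include empty stems so that gaps in the data remain visible.
    for stem in range(min(rows), max(rows) + 1):
        leaves = "".join(str(d) for d in rows.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines


for line in stem_and_leaf(rainfall):
    print(line)
```

Most of the leaves sit on the first five stems, while a handful of observations are scattered over the remaining stems up to 27. That long run of nearly empty stems is how a long upper tail appears in this kind of display.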
histogram interpretation
While the actual exercise focuses more on plotting pairs, understanding histograms is essential for visualizing larger data sets. A histogram displays the distribution of data by grouping values into "bins" along the x-axis and showing the frequency of values in those bins with bars on the y-axis.

When interpreting a histogram, symmetry is identified if the shape is roughly identical on both sides of the center point. A histogram can directly show if more data points are trailing toward the lower or upper end of the set, which hints at the presence of skewness.
  • A symmetrical histogram suggests a balanced spread, like a bell curve.
  • A skewed histogram, where one tail is longer, indicates that data is stretched toward that end.
  • The tallest bars show where the values are most heavily concentrated.

By comparing the heights of the histogram's bars, you may infer where values are concentrated and whether the dataset has long tails to either side, which directly affects symmetry and helps with asymmetry detection.
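The binning step behind a histogram can be sketched in plain Python. The bin width of 500 acre-feet and the starting point of 0 are assumptions chosen for this illustration, and `bin_counts` is a hypothetical helper that expects every value to be at least `start`.

```python
# Rainfall (acre-feet) for the 26 seeded clouds, as listed in the exercise.
rainfall = [
    4.1, 7.7, 17.5, 31.4, 32.7, 40.6, 92.4,
    115.3, 118.3, 119.0, 129.6, 198.6, 200.7, 242.5,
    255.0, 274.7, 274.7, 302.8, 334.1, 430.0, 489.1,
    703.4, 978.0, 1656.0, 1697.8, 2745.6,
]


def bin_counts(data, width, start=0.0):
    """Count observations in equal-width bins [start + k*width, start + (k+1)*width)."""
    n_bins = int((max(data) - start) // width) + 1
    counts = [0] * n_bins
    for x in data:
        counts[int((x - start) // width)] += 1
    return counts


counts = bin_counts(rainfall, width=500)
for k, c in enumerate(counts):
    lo, hi = k * 500, (k + 1) * 500
    print(f"{lo:>4}-<{hi:<4} {'#' * c}")
```

With these bins, 21 of the 26 observations fall in the first class and the remaining five are spread over the next five classes, so the bars drop sharply and then trail off to the right.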
asymmetry detection
Detecting asymmetry is an important part of understanding a distribution. Asymmetric (skewed) data are not mirrored about the median: the observations extend farther on one side than on the other. Recognizing skewness and long tails matters because they can strongly influence summary statistics and their interpretation.

Plotting the pairs of deviations described in the exercise, with the upper deviation on the horizontal axis and the lower deviation on the vertical axis, provides a straightforward visual check for asymmetry. If the points fall along the line through the origin with slope 1, the data are symmetric about the median. Data with a long tail on one side deviate from this line.
  • Data with a long upper tail shows points below the line with slope 1, because the upper deviations exceed the lower ones.
  • Conversely, a long lower tail places points above this line.
  • This visualization quickly highlights how far and in which direction data strays from symmetry.

This approach makes asymmetry easy to see even when the sample is too small for a histogram to be informative. For the exercise's data, the plot reveals a pronounced upper tail, that is, strong positive skewness: a few seeded clouds produced far more rainfall than the rest.

Most popular questions from this chapter

Observations on burst strength \(\left(\mathrm{lb} / \mathrm{in}^{2}\right)\) were obtained both for test nozzle closure welds and for production cannister nozzle welds ("Proper Procedures Are the Key to Welding Radioactive Waste Cannisters," Welding J., Aug. 1997: 61-67). \(\begin{array}{lllllll}\text { Test } & 7200 & 6100 & 7300 & 7300 & 8000 & 7400 \\ & 7300 & 7300 & 8000 & 6700 & 8300 & \\ \text { Cannister } & 5250 & 5625 & 5900 & 5900 & 5700 & 6050 \\ & 5800 & 6000 & 5875 & 6100 & 5850 & 6600\end{array}\) Construct a comparative boxplot and comment on interesting features (the cited article did not include such a picture, but the authors commented that they had looked at one).

In a study of author productivity ("Lotka's Test," Collection Mgmt., 1982: 111-118), a large number of authors were classified according to the number of articles they had published during a certain period. The results were presented in the accompanying frequency distribution: $$ \begin{array}{lrrrrrrrrr} \text { Number of papers } & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & \\ \text { Frequency } & 784 & 204 & 127 & 50 & 33 & 28 & 19 & 19 & \\ \text { Number of papers } & 9 & 10 & 11 & 12 & 13 & 14 & 15 & 16 & 17 \\ \text { Frequency } & 6 & 7 & 6 & 7 & 4 & 4 & 5 & 3 & 3 \end{array} $$ a. Construct a histogram corresponding to this frequency distribution. What is the most interesting feature of the shape of the distribution? b. What proportion of these authors published at least five papers? At least ten papers? More than ten papers? c. Suppose the five 15s, three 16s, and three 17s had been lumped into a single category displayed as "\(\geq 15\)." Would you be able to draw a histogram? Explain. d. Suppose that instead of the values 15, 16, and 17 being listed separately, they had been combined into a 15-17 category with frequency 11. Would you be able to draw a histogram? Explain.

Automated electron backscattered diffraction is now being used in the study of fracture phenomena. The following information on misorientation angle (degrees) was extracted from the article "Observations on the Faceted Initiation Site in the Dwell-Fatigue Tested Ti-6242 Alloy: Crystallographic Orientation and Size Effects" (Metallurgical and Materials Trans., 2006: 1507-1518). $$ \begin{array}{lcccc} \text { Class: } & 0-<5 & 5-<10 & 10-<15 & 15-<20 \\ \text { Rel freq: } & .177 & .166 & .175 & .136 \\ \text { Class: } & 20-<30 & 30-<40 & 40-<60 & 60-<90 \\ \text { Rel freq: } & .194 & .078 & .044 & .030 \end{array} $$ a. Is it true that more than \(50 \%\) of the sampled angles are smaller than \(15^{\circ}\), as asserted in the paper? b. What proportion of the sampled angles are at least \(30^{\circ}\)? c. Roughly what proportion of angles are between \(10^{\circ}\) and \(25^{\circ}\)? d. Construct a histogram and comment on any interesting features.

The accompanying data set consists of observations on shear strength (lb) of ultrasonic spot welds made on a certain type of alclad sheet. Construct a relative frequency histogram based on ten equal-width classes with boundaries \(4000,4200, \ldots\). [The histogram will agree with the one in "Comparison of Properties of Joints Prepared by Ultrasonic Welding and Other Means" (J. of Aircraft, 1983: 552-556).] Comment on its features. $$ \begin{array}{lllllll} 5434 & 4948 & 4521 & 4570 & 4990 & 5702 & 5241 \\ 5112 & 5015 & 4659 & 4806 & 4637 & 5670 & 4381 \\ 4820 & 5043 & 4886 & 4599 & 5288 & 5299 & 4848 \\ 5378 & 5260 & 5055 & 5828 & 5218 & 4859 & 4780 \\ 5027 & 5008 & 4609 & 4772 & 5133 & 5095 & 4618 \\ 4848 & 5089 & 5518 & 5333 & 5164 & 5342 & 5069 \\ 4755 & 4925 & 5001 & 4803 & 4951 & 5679 & 5256 \\ 5207 & 5621 & 4918 & 5138 & 4786 & 4500 & 5461 \\ 5049 & 4974 & 4592 & 4173 & 5296 & 4965 & 5170 \\ 4740 & 5173 & 4568 & 5653 & 5078 & 4900 & 4968 \\ 5248 & 5245 & 4723 & 5275 & 5419 & 5205 & 4452 \\ 5227 & 5555 & 5388 & 5498 & 4681 & 5076 & 4774 \\ 4931 & 4493 & 5309 & 5582 & 4308 & 4823 & 4417 \\ 5364 & 5640 & 5069 & 5188 & 5764 & 5273 & 5042 \\ 5189 & 4986 & & & & & \end{array} $$

A study of the relationship between age and various visual functions (such as acuity and depth perception) reported the following observations on area of scleral lamina \(\left(\mathrm{mm}^{2}\right)\) from human optic nerve heads ("Morphometry of Nerve Fiber Bundle Pores in the Optic Nerve Head of the Human," Experimental Eye Research, 1988: 559–568): \(\begin{array}{lllllllll}2.75 & 2.62 & 2.74 & 3.85 & 2.34 & 2.74 & 3.93 & 4.21 & 3.88 \\ 4.33 & 3.46 & 4.52 & 2.43 & 3.65 & 2.78 & 3.56 & 3.01 & \end{array}\) a. Calculate \(\sum x_{i}\) and \(\sum x_{i}^{2}\). b. Use the values calculated in part (a) to compute the sample variance \(s^{2}\) and then the sample standard deviation \(s\).
