Problem 41

Annual sales, in millions of dollars, for 21 pharmaceutical companies follow.
\[\begin{array}{rrrrrr} 8408 & 1374 & 1872 & 8879 & 2459 & 11413 \\ 608 & 14138 & 6452 & 1850 & 2818 & 1356 \\ 10498 & 7478 & 4019 & 4341 & 739 & 2127 \\ 3653 & 5794 & 8305 & & & \end{array}\]
a. Provide a five-number summary.
b. Compute the lower and upper limits.
c. Do the data contain any outliers?
d. Johnson \& Johnson's sales are the largest on the list at \(\$14,138\) million. Suppose a data entry error (a transposition) had been made and the sales had been entered as \(\$41,138\) million. Would the method of detecting outliers in part (c) identify this problem and allow for correction of the data entry error?
e. Show a box plot.

Short Answer

a. Min: 608, Q1: 1872, Median: 4019, Q3: 8305, Max: 14138. b. Lower Limit: -7777.5, Upper Limit: 17954.5. c. No, every value lies within the limits, so the data contain no outliers. d. Yes, 41138 exceeds the upper limit and would be flagged as an outlier, so the entry could be reviewed and corrected.

Step by step solution

01

Arrange the Data

First, list all the annual sales data in increasing order. The data in millions of dollars is: \[608, 739, 1356, 1374, 1850, 1872, 2127, 2459, 2818, 3653, 4019, 4341, 5794, 6452, 7478, 8305, 8408, 8879, 10498, 11413, 14138.\]
02

Calculate the Five-Number Summary

The five-number summary consists of the minimum, first quartile (Q1), median, third quartile (Q3), and maximum.
- **Minimum**: 608
- **Q1**: The median of the lower half of the data. With 21 data points, the lower half is the first 11 values (up to and including the overall median), so Q1 is the 6th value: 1872.
- **Median**: The 11th of the 21 ordered values: 4019.
- **Q3**: The median of the upper half of the data (the 11th through 21st values), which is the 16th value: 8305.
- **Maximum**: 14138.
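The summary can be checked with a short script. This is a minimal sketch, assuming the "inclusive halves" convention (the overall median counts in both halves when the number of values is odd), which reproduces the 6th and 16th positions used in this step. The function name `five_number_summary` is illustrative and not taken from the textbook.

```python
from statistics import median

# Annual sales in millions of dollars, as listed in the problem.
sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using inclusive halves (an assumed convention)."""
    ordered = sorted(values)
    n = len(ordered)
    lower_half = ordered[:(n + 1) // 2]  # includes the median when n is odd
    upper_half = ordered[n // 2:]        # includes the median when n is odd
    return (ordered[0], median(lower_half), median(ordered),
            median(upper_half), ordered[-1])


summary = five_number_summary(sales)
print(summary)  # (608, 1872, 4019, 8305, 14138)
```

Other quartile conventions can give slightly different Q1 and Q3 for the same data, so results from statistical software may not match these values exactly.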
03

Compute the Interquartile Range (IQR)

Calculate the IQR, which is Q3 - Q1. \[IQR = 8305 - 1872 = 6433.\]
04

Calculate the Lower and Upper Limits

The lower and upper limits are computed to detect outliers, with \(1.5 \times IQR = 1.5 \times 6433 = 9649.5\).
- **Lower Limit**: Q1 - 1.5 * IQR = 1872 - 9649.5 = -7777.5.
- **Upper Limit**: Q3 + 1.5 * IQR = 8305 + 9649.5 = 17954.5.
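The arithmetic for the limits is easy to get wrong by hand, so a quick check may help. This is a minimal sketch that takes Q1 = 1872 and Q3 = 8305 from the five-number summary; the variable names are illustrative.

```python
# Quartiles from the five-number summary (millions of dollars).
q1, q3 = 1872, 8305

iqr = q3 - q1                 # interquartile range
lower_limit = q1 - 1.5 * iqr  # values below this are flagged as outliers
upper_limit = q3 + 1.5 * iqr  # values above this are flagged as outliers

print(iqr, lower_limit, upper_limit)  # 6433 -7777.5 17954.5
```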
05

Identify Outliers

Outliers are data points outside the lower and upper limits. No sales figure is below the lower limit (-7777.5), and none exceeds the upper limit (17954.5): the largest value, 14138, is well inside it. The data therefore contain no outliers.
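The outlier check can be sketched as a filter over the data. The helper `iqr_outliers` and its parameter `k` (the 1.5 multiplier) are hypothetical names introduced here for illustration; the quartiles are passed in rather than recomputed.

```python
sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]


def iqr_outliers(values, q1, q3, k=1.5):
    """Return the values lying outside [q1 - k*IQR, q3 + k*IQR]."""
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


outliers = iqr_outliers(sales, q1=1872, q3=8305)
print(outliers)  # [] : no value falls outside the limits
```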
06

Consider a Data Entry Error

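If Johnson & Johnson's sales were erroneously entered as 41138 instead of 14138, the quartiles, and therefore the limits, would not change, because only the largest value is altered. Since 41138 exceeds the upper limit of 17954.5, the method in Step 5 would flag it as an outlier, prompting a review of the entry and allowing the error to be corrected.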
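The effect of the transposition can be simulated directly. This sketch replaces 14138 with 41138 and reads Q1 and Q3 from the 6th and 16th ordered values, a shortcut that is valid only for this 21-value data set; the names `corrupted` and `flagged` are illustrative.

```python
sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]

# Simulate the data entry error: 14138 mistyped as 41138.
corrupted = [41138 if v == 14138 else v for v in sales]

ordered = sorted(corrupted)
q1, q3 = ordered[5], ordered[15]  # 6th and 16th values (n = 21 only)
iqr = q3 - q1
upper_limit = q3 + 1.5 * iqr
lower_limit = q1 - 1.5 * iqr

flagged = [v for v in corrupted if v < lower_limit or v > upper_limit]
print(q1, q3, upper_limit)  # 1872 8305 17954.5 (unchanged by the error)
print(flagged)              # [41138]
```

The method only flags the value as suspicious. Deciding that it is an entry error rather than a genuine extreme still requires checking the source record.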
07

Draw a Box Plot

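To create a box plot, draw a number line that accommodates the data range. Draw a box from Q1 (1872) to Q3 (8305) with a line inside it at the median (4019). Extend the whiskers from the box to the smallest and largest values that lie inside the limits, which here are the minimum (608) and the maximum (14138). Because the data contain no outliers, no individual points are plotted; any value beyond the limits would be marked with an asterisk or a different marker.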
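Before drawing, it can help to compute the pieces of the plot. The sketch below is not a plotting routine; it only derives the box edges, whisker ends, and outlier points from the quartiles, and the helper `box_plot_elements` is a hypothetical name.

```python
sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]


def box_plot_elements(values, q1, med, q3, k=1.5):
    """Return the box, median, whisker ends, and outliers for a box plot."""
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    inside = [v for v in values if lower <= v <= upper]
    return {
        "box": (q1, q3),
        "median": med,
        # Whiskers stop at the most extreme values inside the limits.
        "whiskers": (min(inside), max(inside)),
        "outliers": sorted(v for v in values if v < lower or v > upper),
    }


elements = box_plot_elements(sales, q1=1872, med=4019, q3=8305)
print(elements)
```

For this data the whiskers run from 608 to 14138 and the outlier list is empty.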


Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Box Plot
A box plot, sometimes called a box-and-whisker plot, is a compact way to visualize the distribution of a dataset. It gives a snapshot of the data's spread and center. To create a box plot, you need five key pieces of information: the minimum, first quartile (Q1), median, third quartile (Q3), and maximum values of the dataset.
This set of values is known as the five-number summary.
  • The box itself shows the interquartile range (IQR), spanning from Q1 to Q3.
  • The line inside the box is the median, which divides the data into two equal parts.
  • The "whiskers" extend from the box to the smallest and largest values that are not outliers.
Outliers can be displayed as individual points beyond the whiskers, making them easily identifiable.
Interquartile Range
The interquartile range (IQR) is a vital statistic in descriptive statistics, which helps understand data variability. It is calculated as the difference between the third quartile (Q3) and the first quartile (Q1).
The IQR focuses on the central 50% of a dataset, offering insight into its density and spread.
  • The IQR is a robust measure: because it depends only on Q1 and Q3, it is largely unaffected by extreme values or outliers, unlike the range, which depends entirely on the two most extreme values.
  • In this particular exercise, the IQR is computed as \[IQR = Q3 - Q1 = 8305 - 1872 = 6433\].
Understanding the IQR helps in identifying the range of the "middle" part of the data, ensuring that the analysis isn't skewed by atypical data points.
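The robustness of the IQR can be illustrated with the data entry error from part (d). This is a minimal sketch comparing the range and the IQR before and after replacing 14138 with 41138; the quartiles are read from the 6th and 16th ordered values, which holds only for this 21-value data set, and the function names are illustrative.

```python
sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]
corrupted = [41138 if v == 14138 else v for v in sales]


def value_range(values):
    """Range: largest value minus smallest value."""
    return max(values) - min(values)


def iqr_21(values):
    """IQR from the 6th and 16th ordered values (valid for n = 21 only)."""
    ordered = sorted(values)
    return ordered[15] - ordered[5]


print(value_range(sales), value_range(corrupted))  # 13530 40530
print(iqr_21(sales), iqr_21(corrupted))            # 6433 6433
```

One mistyped value triples the range while leaving the IQR unchanged.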
Outliers
Outliers are data points that differ significantly from the rest of the dataset. They can potentially skew and mislead statistical analysis. The detection and handling of outliers are crucial for obtaining reliable results.
To identify outliers using the IQR, you calculate the lower and upper limits as follows:
  • Lower Limit: \( Q1 - 1.5 \times IQR \)
  • Upper Limit: \( Q3 + 1.5 \times IQR \)
In this data set the limits are -7777.5 and 17954.5. Every sales figure, including the largest (14138), lies between them, so none is deemed an outlier.
Outliers in data can suggest anomalies or errors, and, as such, should be examined closely to determine their validity.
Quartiles
Quartiles are an essential aspect of data analysis, helping to break down a dataset into four equal parts. They offer a clearer picture of the data's spread and center.
  • **First Quartile (Q1)**: Represents the 25th percentile, where one-quarter of the data falls below this value.
  • **Median**: Also known as the second quartile, it is the midpoint of the dataset.
  • **Third Quartile (Q3)**: Marks the 75th percentile, indicating that three-quarters of the data are below this value.
In our context, the key quartile values are 1872 for Q1, 4019 for the median, and 8305 for Q3.
Quartiles are highly informative, especially when building a five-number summary and describing a distribution. They show where most of the data points lie, make plotting straightforward, and allow quick comparison across datasets.
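Quartile values depend on the convention used to locate them, which is worth keeping in mind when comparing a hand calculation with software. As a sketch, Python's standard-library `statistics.quantiles` offers two methods: for this data its `inclusive` method matches the values used here, while its default `exclusive` method interpolates between neighboring values.

```python
from statistics import quantiles

sales = [8408, 1374, 1872, 8879, 2459, 11413,
         608, 14138, 6452, 1850, 2818, 1356,
         10498, 7478, 4019, 4341, 739, 2127,
         3653, 5794, 8305]

# n=4 asks for the three cut points that split the data into quarters.
inclusive = quantiles(sales, n=4, method="inclusive")
exclusive = quantiles(sales, n=4, method="exclusive")

print(inclusive)  # [1872.0, 4019.0, 8305.0]
print(exclusive)  # [1861.0, 4019.0, 8356.5]
```

Under either convention the largest sales figure stays inside the upper limit, so the conclusion about outliers does not change for this data set.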


Most popular questions from this chapter

Small business owners often look to payroll service companies to handle their employee payroll. Reasons are that small business owners face complicated tax regulations and penalties for employment tax errors are costly. According to the Internal Revenue Service, \(26\%\) of all small business employment tax returns contained errors that resulted in a tax penalty to the owner (The Wall Street Journal, January 30, 2006). The tax penalties for a sample of 20 small business owners follow: \[\begin{array}{rrrrrrrrrr} 820 & 270 & 450 & 1010 & 890 & 700 & 1350 & 350 & 300 & 1200 \\ 390 & 730 & 2040 & 230 & 640 & 350 & 420 & 270 & 370 & 620 \end{array}\] a. What is the mean tax penalty for improperly filed employment tax returns? b. What is the standard deviation? c. Is the highest penalty, \(\$2040\), an outlier? d. What are some of the advantages of a small business owner hiring a payroll service company to handle employee payroll services, including the employment tax returns?

The grade point average for college students is based on a weighted mean computation. For most colleges, the grades are given the following data values: \(A(4)\), \(B(3)\), \(C(2)\), \(D(1)\), and \(F(0)\). After 60 credit hours of course work, a student at State University earned 9 credit hours of \(A\), 15 credit hours of \(B\), 33 credit hours of \(C\), and 3 credit hours of \(D\). a. Compute the student's grade point average. b. Students at State University must maintain a 2.5 grade point average for their first 60 credit hours of course work in order to be admitted to the business college. Will this student be admitted?

The National Association of Realtors reported the median home price in the United States and the increase in median home price over a five-year period (The Wall Street Journal, January 16, 2006). Use the sample home prices shown here to answer the following questions. \[\begin{array}{rrrrrrr} 995.9 & 48.8 & 175.0 & 263.5 & 298.0 & 218.9 & 209.0 \\ 628.3 & 111.0 & 212.9 & 92.6 & 2325.0 & 958.0 & 212.5 \end{array}\] a. What is the sample median home price? b. In January 2001, the National Association of Realtors reported a median home price of \(\$139,300\) in the United States. What was the percentage increase in the median home price over the five-year period? c. What are the first quartile and the third quartile for the sample data? d. Provide a five-number summary for the home prices. e. Do the data contain any outliers? f. What is the mean home price for the sample? Why does the National Association of Realtors prefer to use the median home price in its reports?

A sample of 10 NCAA college basketball game scores provided the following data (USA Today, January 26, 2004). \[\begin{array}{lclcc} \text{Winning Team} & \text{Points} & \text{Losing Team} & \text{Points} & \text{Winning Margin} \\ \text{Arizona} & 90 & \text{Oregon} & 66 & 24 \\ \text{Duke} & 85 & \text{Georgetown} & 66 & 19 \\ \text{Florida State} & 75 & \text{Wake Forest} & 70 & 5 \\ \text{Kansas} & 78 & \text{Colorado} & 57 & 21 \\ \text{Kentucky} & 71 & \text{Notre Dame} & 63 & 8 \\ \text{Louisville} & 65 & \text{Tennessee} & 62 & 3 \\ \text{Oklahoma State} & 72 & \text{Texas} & 66 & 6 \\ \text{Purdue} & 76 & \text{Michigan State} & 70 & 6 \\ \text{Stanford} & 77 & \text{Southern Cal} & 67 & 10 \\ \text{Wisconsin} & 76 & \text{Illinois} & 56 & 20 \end{array}\] a. Compute the mean and standard deviation for the points scored by the winning team. b. Assume that the points scored by the winning teams for all NCAA games follow a bell-shaped distribution. Using the mean and standard deviation found in part (a), estimate the percentage of all NCAA games in which the winning team scores 84 or more points. Estimate the percentage of NCAA games in which the winning team scores more than 90 points. c. Compute the mean and standard deviation for the winning margin. Do the data contain outliers? Explain.

The cost of consumer purchases such as housing, gasoline, Internet services, tax preparation, and hospitalization were provided in The Wall Street Journal, January 2, 2007. Sample data typical of the cost of tax-return preparation by services such as H\&R Block are shown here. \[\begin{array}{rrrrr} 120 & 230 & 110 & 115 & 160 \\ 130 & 150 & 105 & 195 & 155 \\ 105 & 360 & 120 & 120 & 140 \\ 100 & 115 & 180 & 235 & 255 \end{array}\] a. Compute the mean, median, and mode. b. Compute the first and third quartiles. c. Compute and interpret the 90th percentile.
