Problem 13

Our number system consists of the digits \(0, 1, 2, 3, 4, 5, 6, 7, 8,\) and \(9\). The first significant digit in any number must be \(1, 2, 3, 4, 5, 6, 7, 8,\) or \(9\) because we do not write numbers such as 12 as \(012\). Although we may think that each first digit appears with equal frequency so that each digit has a \(\frac{1}{9}\) probability of being the first significant digit, this is not true. In 1881, Simon Newcomb discovered that first digits do not occur with equal frequency. This same result was discovered again in 1938 by physicist Frank Benford. After studying much data, he was able to assign probabilities of occurrence to the first digit in a number as shown.

$$ \begin{array}{lccccccccc} \hline \text{Digit} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text{Probability} & 0.301 & 0.176 & 0.125 & 0.097 & 0.079 & 0.067 & 0.058 & 0.051 & 0.046 \\ \hline \end{array} $$

The probability distribution is now known as Benford's Law and plays a major role in identifying fraudulent data on tax returns and accounting books. For example, the following distribution represents the first digits in 200 allegedly fraudulent checks written to a bogus company by an employee attempting to embezzle funds from his employer.

$$ \begin{array}{lrrrrrrrrr} \hline \text{First digit} & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline \text{Frequency} & 36 & 32 & 28 & 26 & 23 & 17 & 15 & 16 & 7 \\ \hline \end{array} $$

(a) Because these data are meant to prove that someone is guilty of fraud, what would be an appropriate level of significance when performing a goodness-of-fit test?
(b) Using the level of significance chosen in part (a), test whether the first digits in the allegedly fraudulent checks obey Benford's Law.
(c) Based on the results of part (b), do you think that the employee is guilty of embezzlement?

Short Answer

Using a 0.01 significance level, the chi-squared statistic (21.69) exceeds the critical value (20.09), so the first digits do not follow Benford's Law. This is significant evidence that fraud may have occurred.

Step by step solution

01

- Choosing the Level of Significance

Since this test is meant to prove fraud, a common choice of significance level is 0.01 (1%). This stringent level keeps the probability of incorrectly accusing someone of fraud (a Type I error) at no more than 1%.
02

- State the Hypotheses

Formulate the null hypothesis (H_0) and the alternative hypothesis (H_A):
  • H_0: The first digit distribution follows Benford's Law.
  • H_A: The first digit distribution does not follow Benford's Law.
03

- Calculate Expected Frequencies

Using Benford's probabilities and the total sample size (200 checks), calculate the expected frequency for each digit:
  • 1: 0.301 x 200 = 60.2
  • 2: 0.176 x 200 = 35.2
  • 3: 0.125 x 200 = 25
  • 4: 0.097 x 200 = 19.4
  • 5: 0.079 x 200 = 15.8
  • 6: 0.067 x 200 = 13.4
  • 7: 0.058 x 200 = 11.6
  • 8: 0.051 x 200 = 10.2
  • 9: 0.046 x 200 = 9.2
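
The expected frequencies can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and the "at least 5" check is a common rule of thumb for chi-squared tests that the solution itself does not state.

```python
# Benford probabilities for first digits 1..9, copied from the problem statement.
benford = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
           6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}
n_checks = 200  # total number of checks in the sample

# Expected count for each digit is n * p; rounding suppresses float noise.
expected = {d: round(n_checks * p, 1) for d, p in benford.items()}

for d, e in expected.items():
    print(f"digit {d}: expected {e}")

# Rule-of-thumb check (an assumption, not part of the original solution):
# every expected count should be at least 5 for the chi-squared
# approximation to be reasonable.
all_large_enough = all(e >= 5 for e in expected.values())
print("all expected counts >= 5:", all_large_enough)
```

The smallest expected count is 9.2 (digit 9), so under that rule of thumb the sample of 200 checks is large enough for the test.
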
04

- Chi-Squared Goodness-of-Fit Test

Calculate the chi-squared statistic using the formula \(\chi^2 = \sum (\text{observed} - \text{expected})^2 / \text{expected}\), with df = k - 1 degrees of freedom, where k = 9 is the number of digit categories.
Chi-squared sum: ((36-60.2)^2/60.2) + ((32-35.2)^2/35.2) + ((28-25)^2/25) + ((26-19.4)^2/19.4) + ((23-15.8)^2/15.8) + ((17-13.4)^2/13.4) + ((15-11.6)^2/11.6) + ((16-10.2)^2/10.2) + ((7-9.2)^2/9.2) = 21.69
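
Because the sum has nine terms, it is easy to slip when adding them by hand. The following sketch recomputes the statistic from the observed counts and Benford's probabilities and prints each digit's contribution; the variable names are illustrative only.

```python
# Observed first-digit counts for digits 1..9, from the problem statement.
observed = [36, 32, 28, 26, 23, 17, 15, 16, 7]
# Benford probabilities for digits 1..9, from the problem statement.
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)                      # 200 checks
expected = [n * p for p in benford]    # expected count per digit

# Per-digit contribution (O - E)^2 / E, then the total statistic.
contributions = [(o - e) ** 2 / e for o, e in zip(observed, expected)]
chi_squared = sum(contributions)
df = len(observed) - 1                 # k - 1 degrees of freedom

for digit, c in enumerate(contributions, start=1):
    print(f"digit {digit}: {c:.4f}")
print(f"chi-squared = {chi_squared:.2f}, df = {df}")
```

Run as written, this should report a statistic of about 21.69 with 8 degrees of freedom. The largest single contribution (about 9.73) comes from digit 1, which appears far less often than Benford's Law predicts.
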
05

- Compare to the Critical Value

Check the chi-squared distribution table for 8 degrees of freedom (df) at the 0.01 significance level. The critical value is 20.09. Since 21.69 > 20.09, we reject the null hypothesis.
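
An equivalent way to reach the same decision is to compare a P-value with the significance level instead of comparing the statistic with a table value. The sketch below is not part of the original solution. It relies on a closed form for the upper-tail probability of a chi-squared variable that holds only when the degrees of freedom are even (here df = 8), and the helper name `chi2_sf_even_df` is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper-tail probability P(X > x) for a chi-squared variable with even df.

    For df = 2m the tail equals exp(-x/2) * sum_{j=0}^{m-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even df")
    half = x / 2.0
    term = 1.0   # current term (x/2)^j / j!, starting at j = 0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Data from the problem statement.
observed = [36, 32, 28, 26, 23, 17, 15, 16, 7]
benford = [0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046]

n = sum(observed)
chi_squared = sum((o - n * p) ** 2 / (n * p) for o, p in zip(observed, benford))

alpha = 0.01
p_value = chi2_sf_even_df(chi_squared, df=8)
reject_h0 = p_value < alpha

print(f"chi-squared = {chi_squared:.2f}, P-value = {p_value:.4f}")
print("reject H_0:", reject_h0)
```

As a sanity check on the helper, `chi2_sf_even_df(20.09, 8)` evaluates to roughly 0.01, which agrees with the table's critical value at the 0.01 level.
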
06

- Conclusion on Fraud

Since the null hypothesis is rejected, there is significant evidence that the first digit distribution does not follow Benford's Law, which suggests that fraud may have occurred.

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Probability Distribution
A probability distribution represents how likely it is for different outcomes to occur in a random experiment. In the context of Benford's Law, the probability distribution specifies how frequently each digit (1 through 9) is expected to appear as the first significant digit in a large set of naturally occurring numbers. For example:
  • The digit '1' appears as the first digit about 30.1% of the time.
  • The digit '2' appears approximately 17.6% of the time.
This distribution is counterintuitive because we might expect each digit to appear with the same probability, around 1/9 or roughly 11.11%.
Benford's Law has been observed in a wide variety of datasets, including financial records, census data, and even the lengths of rivers. This makes it a valuable tool for detecting anomalies and potential fraud.
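
The probabilities in the exercise's table agree with the closed form commonly given for Benford's Law, \(P(d) = \log_{10}(1 + 1/d)\). The exercise itself does not state this formula, so the sketch below is only a cross-check of the table, with illustrative variable names.

```python
import math

# Probabilities as printed in the problem statement.
table = {1: 0.301, 2: 0.176, 3: 0.125, 4: 0.097, 5: 0.079,
         6: 0.067, 7: 0.058, 8: 0.051, 9: 0.046}

# Closed form commonly stated for Benford's Law (not given in the exercise).
formula = {d: math.log10(1 + 1 / d) for d in range(1, 10)}

for d in range(1, 10):
    print(f"digit {d}: formula {formula[d]:.3f}, table {table[d]:.3f}")

# The table should match the formula to three decimal places.
matches_table = all(abs(formula[d] - table[d]) < 0.0005 for d in range(1, 10))
print("table matches formula to 3 decimals:", matches_table)
```

The nine formula values sum to \(\log_{10} 10 = 1\), as a probability distribution must.
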
Goodness-of-Fit Test
A goodness-of-fit test helps determine whether a sample matches a specified distribution. In this exercise, it is used to see whether the distribution of first digits in the allegedly fraudulent checks matches the distribution expected under Benford's Law.
Key steps to perform a goodness-of-fit test:
  • **Null Hypothesis (H0):** The observed data follows Benford's Law.
  • **Alternative Hypothesis (HA):** The observed data does not follow Benford's Law.
  • Calculate expected frequencies by multiplying the total number of observations by the probabilities given by Benford's Law.
  • Use a chi-squared test to compare the observed and expected frequencies.
This test is significant in contexts like auditing and forensic accounting, where identifying deviations from expected distributions can flag potential fraud or errors.
Chi-Squared Statistic
The chi-squared statistic is a measure used in statistics to compare observed data with data we would expect to obtain according to a specific hypothesis. In this exercise, it is used to test if the observed frequencies of first digits in the checks match those expected under Benford's Law.
Steps to compute the chi-squared statistic:
  • Calculate the difference between observed and expected frequencies for each digit.
  • Square these differences to avoid negative values.
  • Divide each squared difference by the expected frequency.
  • Sum all these values to get the chi-squared statistic.
Mathematically, this can be expressed as: \[ \text{Chi-squared} = \sum \frac{(O_i - E_i)^2}{E_i} \] where \( O_i \) is the observed frequency and \( E_i \) is the expected frequency for each digit.
Once computed, the chi-squared statistic is compared to a critical value from the chi-squared distribution table to decide whether to reject the null hypothesis. If the calculated statistic is higher than the critical value, the null hypothesis is rejected, indicating that the observed distribution does not follow Benford's Law, as seen in our exercise.
