Problem 64

The article "Increases in Steroid Binding Globulins Induced by Tamoxifen in Patients with Carcinoma of the Breast" (J. of Endocrinology, 1978: 219-226) reports data on the effects of the drug tamoxifen on change in the level of cortisol-binding globulin (CBG) of patients during treatment. With age \(=x\) and \(\Delta \mathrm{CBG}=y\), summary values are \(n=26\), \(\sum x_{i}=1613\), \(\sum\left(x_{i}-\bar{x}\right)^{2}=3756.96\), \(\sum y_{i}=281.9\), \(\sum\left(y_{i}-\bar{y}\right)^{2}=465.34\), and \(\sum x_{i} y_{i}=16,731\). a. Compute a \(90 \%\) CI for the true correlation coefficient \(\rho\). b. Test \(H_{0}: \rho=-.5\) versus \(H_{\mathrm{a}}: \rho<-.5\) at level \(.05\). c. In a regression analysis of \(y\) on \(x\), what proportion of variation in change of cortisol-binding globulin level could be explained by variation in patient age within the sample? d. If you decide to perform a regression analysis with age as the dependent variable, what proportion of variation in age is explainable by variation in \(\Delta \mathrm{CBG}\)?

Short Answer

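a. The approximate 90% CI for \(\rho\) is \((-0.759, -0.300)\), based on \(r \approx -0.573\). b. \(z \approx -0.49 > -1.645\), so do not reject \(H_0: \rho = -0.5\). c. About 32.8% of the variation in \(\Delta \text{CBG}\) is explained by age. d. The same proportion, about 32.8%, of the variation in age is explained by \(\Delta \text{CBG}\).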

Step by step solution

01

Calculate sample means

To find the sample means, we use the given sums for age \(x\) and change in cortisol-binding globulin \(y\):\[\bar{x} = \frac{\sum x_i}{n} = \frac{1613}{26} \approx 62.04\] \[\bar{y} = \frac{\sum y_i}{n} = \frac{281.9}{26} \approx 10.84\]
02

Calculate \(S_{xy}\) and the correlation coefficient

The sum of cross-products of deviations, \(S_{xy}\), follows from the computing formula. Use the unrounded sums, because rounding the means first shifts the result noticeably:\[S_{xy} = \sum x_i y_i - \frac{\left(\sum x_i\right)\left(\sum y_i\right)}{n} = 16731 - \frac{(1613)(281.9)}{26} \approx 16731 - 17488.64 = -757.64\] The sums of squared deviations are given directly: \[S_{xx} = \sum (x_i - \bar{x})^2 = 3756.96, \qquad S_{yy} = \sum (y_i - \bar{y})^2 = 465.34\]The sample correlation coefficient \(r\) is then:\[r = \frac{S_{xy}}{\sqrt{S_{xx}S_{yy}}} = \frac{-757.64}{\sqrt{3756.96 \times 465.34}} \approx \frac{-757.64}{1322.22} \approx -0.573\]
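As a check on the arithmetic, the computation from the summary values can be sketched in Python. The function and variable names are illustrative choices, not part of the original solution; the inputs are the sums quoted in the problem.

```python
import math

# Summary values quoted in the problem statement.
n = 26
sum_x, sum_y, sum_xy = 1613.0, 281.9, 16731.0
s_xx, s_yy = 3756.96, 465.34


def cross_product_sum(sum_xy, sum_x, sum_y, n):
    """S_xy = sum(x*y) - (sum x)(sum y)/n, using unrounded sums."""
    return sum_xy - sum_x * sum_y / n


def sample_correlation(s_xy, s_xx, s_yy):
    """r = S_xy / sqrt(S_xx * S_yy)."""
    return s_xy / math.sqrt(s_xx * s_yy)


s_xy = cross_product_sum(sum_xy, sum_x, sum_y, n)
r = sample_correlation(s_xy, s_xx, s_yy)
print(f"S_xy = {s_xy:.2f}, r = {r:.3f}")
```

Working from the raw sums rather than the rounded means avoids the rounding error that comes from multiplying \(26 \times 62.04 \times 10.84\).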
03

Compute 90% confidence interval for \(\rho\)

Use Fisher's transformation of the sample correlation \(r = S_{xy}/\sqrt{S_{xx}S_{yy}} \approx -0.573\) to compute the confidence interval:\[z' = \frac{1}{2} \ln \left(\frac{1+r}{1-r}\right)\approx \frac{1}{2} \ln \left(\frac{1-0.573}{1+0.573}\right) \approx -0.652\] The standard error is \(\text{SE} = \frac{1}{\sqrt{n-3}} = \frac{1}{\sqrt{23}} \approx 0.2085\). For a 90% CI, the standard normal critical value is 1.645:\[ z' \pm 1.645 \times \text{SE} \approx -0.652\pm 0.343 = (-0.995, -0.309)\]Back-transform the limits to the scale of \(\rho\):\[\text{CI} = \left(\tanh(-0.995), \tanh(-0.309)\right) \approx (-0.759, -0.300)\]
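The interval calculation can be sketched in a few lines of Python using only the standard library. The helper name `fisher_ci` is hypothetical, and the sketch assumes the usual large-sample normal approximation for \(\text{atanh}(r)\).

```python
import math
from statistics import NormalDist


def fisher_ci(r, n, confidence=0.90):
    """Approximate CI for rho via Fisher's transformation v = atanh(r)."""
    v = math.atanh(r)                      # transformed sample correlation
    se = 1.0 / math.sqrt(n - 3)            # approximate standard error of v
    z_crit = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    # Back-transform both limits to the correlation scale with tanh.
    return math.tanh(v - z_crit * se), math.tanh(v + z_crit * se)


# r recomputed from the summary values quoted in the problem.
s_xy = 16731.0 - 1613.0 * 281.9 / 26
r = s_xy / math.sqrt(3756.96 * 465.34)
low, high = fisher_ci(r, n=26, confidence=0.90)
print(f"r = {r:.3f}, 90% CI for rho: ({low:.3f}, {high:.3f})")
```

The interval is not symmetric about \(r\) because the back-transformation \(\tanh\) is nonlinear.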
04

Perform hypothesis test for \(H_0: \rho = -0.5\)

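Transform the hypothesized value \(\rho_0 = -0.5\) using Fisher's transformation:\[z_{\rho} = \text{atanh}(-0.5) \approx -0.5493\] With \(r \approx -0.573\), the transformed sample value is \(z' = \text{atanh}(-0.573) \approx -0.652\), and the standard error is \(1/\sqrt{23} \approx 0.2085\), so the test statistic is \[ z = \frac{-0.652 - (-0.5493)}{0.2085} \approx -0.49\] Because \(H_a: \rho < -0.5\) is lower-tailed, reject \(H_0\) at level \(0.05\) only if \(z \le -1.645\). Since \(-0.49 > -1.645\) (one-tailed \(P\)-value \(\approx 0.31\)), do not reject \(H_0\).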
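A minimal sketch of this lower-tailed test in Python follows. The function name `fisher_z_test_lower` is an illustrative assumption, and the \(P\)-value relies on the same normal approximation as the confidence interval.

```python
import math
from statistics import NormalDist


def fisher_z_test_lower(r, rho0, n):
    """Lower-tailed test of H0: rho = rho0 against Ha: rho < rho0."""
    z = (math.atanh(r) - math.atanh(rho0)) * math.sqrt(n - 3)
    p_value = NormalDist().cdf(z)          # area to the left of z
    return z, p_value


# r recomputed from the summary values quoted in the problem.
s_xy = 16731.0 - 1613.0 * 281.9 / 26
r = s_xy / math.sqrt(3756.96 * 465.34)
z, p_value = fisher_z_test_lower(r, rho0=-0.5, n=26)
reject = p_value <= 0.05
print(f"z = {z:.2f}, P-value = {p_value:.2f}, reject H0: {reject}")
```

Comparing the \(P\)-value with \(0.05\) is equivalent to comparing \(z\) with the critical value \(-1.645\).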
05

Determine variation explained in regression of \(y\) on \(x\)

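The proportion of variation in \(y\) explained by the regression on \(x\) is the coefficient of determination, which in simple linear regression equals the square of the sample correlation coefficient \(r = S_{xy}/\sqrt{S_{xx}S_{yy}} \approx -0.573\):\[ R^2 = r^2 = (-0.573)^2 \approx 0.328\]So about \(32.8\%\) of the observed variation in \(\Delta \text{CBG}\) is explained by variation in age.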
06

Variation explained if switching dependent variable

If age is the dependent variable, the explained proportion is unchanged, because the sample correlation coefficient \(r \approx -0.573\) is symmetric in \(x\) and \(y\):\[ R^2 = r^2 = (-0.573)^2 \approx 0.328\] Thus about \(32.8\%\) of the variation in age is explained by variation in \(\Delta \text{CBG}\).
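The symmetry can be seen directly from the least squares formulas. The short derivation below uses the standard slope estimates, written here as \(\hat{\beta}_{y \mid x}\) and \(\hat{\beta}_{x \mid y}\) (notation introduced only for this illustration):\[\hat{\beta}_{y \mid x} = \frac{S_{xy}}{S_{xx}}, \qquad R^2_{y \mid x} = \frac{\hat{\beta}_{y \mid x}\, S_{xy}}{S_{yy}} = \frac{S_{xy}^2}{S_{xx} S_{yy}}\]\[\hat{\beta}_{x \mid y} = \frac{S_{xy}}{S_{yy}}, \qquad R^2_{x \mid y} = \frac{\hat{\beta}_{x \mid y}\, S_{xy}}{S_{xx}} = \frac{S_{xy}^2}{S_{xx} S_{yy}}\]Both expressions equal \(r^2\). With the summary values, \(S_{xy} = 16731 - (1613)(281.9)/26 \approx -757.64\), so \[r^2 = \frac{(-757.64)^2}{3756.96 \times 465.34} \approx 0.328\]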

Key Concepts

These are the key concepts you need to understand to accurately answer the question.

Confidence Interval
In statistics, a confidence interval (CI) is a range of values that is used to estimate a population parameter. This range is constructed so that, with a specified degree of confidence, it contains the true parameter value. In the context of correlation analysis, the confidence interval is used to estimate the true correlation coefficient \(\rho\) of the population.

In our problem, we calculated a 90% confidence interval for the true correlation coefficient between age and changes in cortisol-binding globulin levels. Using Fisher's transformation with \(r \approx -0.573\) and \(n = 26\), the interval is approximately (-0.759, -0.300).

This interval tells us that we are 90% confident that the true correlation coefficient lies within this range. The interval lies entirely below zero, so the data suggest a negative linear association between age and \(\Delta \text{CBG}\). It also contains -0.5, which agrees with the failure to reject \(H_0: \rho = -0.5\).
Regression Analysis
Regression analysis is a statistical method used to examine the relationship between two or more variables. In this context, we are looking at a simple linear regression where one variable \(y\) is predicted by another variable \(x\).

When conducting regression analysis on the cortisol-binding globulin levels with age as the independent variable, we used the correlation coefficient to estimate how much of the variation in globulin levels can be explained by age. This is achieved through the calculation of the coefficient of determination, which is denoted as \(R^2\).

In this exercise, \(r \approx -0.573\), so the calculated \(R^2\) value is about 0.328. This indicates that roughly 32.8% of the variation in the change in cortisol-binding globulin level can be accounted for by a linear relationship with age. The remaining two-thirds of the variation is left unexplained by age, which suggests that other factors also influence these levels.
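The link between the fitted line and \(R^2\) can be sketched numerically. The snippet below is an illustration under the usual least squares formulas, using only the summary values quoted in the problem; the variable names are arbitrary and the fitted coefficients are not reported in the original solution.

```python
# Summary values quoted in the problem statement.
n = 26
sum_x, sum_y, sum_xy = 1613.0, 281.9, 16731.0
s_xx, s_yy = 3756.96, 465.34

s_xy = sum_xy - sum_x * sum_y / n          # sum of cross-products of deviations

# Least squares line for y (change in CBG) on x (age).
b1 = s_xy / s_xx                           # slope
b0 = sum_y / n - b1 * sum_x / n            # intercept

# Regression sum of squares over total sum of squares.
ssr = b1 * s_xy
r_squared = ssr / s_yy

print(f"slope = {b1:.4f}, intercept = {b0:.2f}, R^2 = {r_squared:.3f}")
```

The ratio `ssr / s_yy` reduces algebraically to \(S_{xy}^2/(S_{xx}S_{yy})\), which is why squaring \(r\) gives the same proportion.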
Hypothesis Testing
Hypothesis testing is a statistical method used to make decisions about a population based on sample data. It involves testing an assumption (called the null hypothesis) against an alternate hypothesis.

In the context of this exercise, the null hypothesis \(H_0\) was that the population correlation coefficient \(\rho\) equals -0.5. The alternate hypothesis \(H_a\) suggests that \(\rho < -0.5\).

Using Fisher's transformation and our sample data, we calculated the test statistic \(z \approx -0.49\) and compared it with the lower-tailed critical value \(-1.645\) for level \(0.05\). Because the statistic does not fall in the rejection region \(z \le -1.645\), we fail to reject the null hypothesis: the data do not provide sufficient evidence that the population correlation is less than -0.5.
Fisher's Z-Transformation
Fisher's Z-Transformation is an essential technique in correlation analysis that stabilizes the variance of the correlation coefficient, making it usable for constructing confidence intervals and performing hypothesis tests.

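The transformation maps the sample correlation coefficient \(r\) to \(z' = \frac{1}{2}\ln\left(\frac{1+r}{1-r}\right) = \text{atanh}(r)\). When the data come from a bivariate normal population, \(z'\) is approximately normally distributed with mean \(\text{atanh}(\rho)\) and standard deviation \(1/\sqrt{n-3}\), whatever the value of \(\rho\). Standard normal critical values can therefore be used even though the sampling distribution of \(r\) itself is skewed when \(\rho \neq 0\).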

When applying Fisher's Z-transformation in our exercise, we initially transformed the sample correlation coefficient \(r\) to a Z-score to find the confidence interval for the true correlation \(\rho\) and also to perform hypothesis testing. By converting to the Z-form and back, we can interpret statistical results more reliably, especially in instances involving smaller sample sizes.
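The variance-stabilizing property can be illustrated with a small simulation. This is a rough sketch under the assumption of bivariate normal data, not part of the original solution; the function names, the seed, and the number of repetitions are arbitrary choices.

```python
import math
import random
import statistics


def sample_r(n, rho, rng):
    """Pearson r for n draws from a bivariate normal with correlation rho."""
    xs, ys = [], []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        xs.append(z1)
        ys.append(rho * z1 + math.sqrt(1 - rho**2) * z2)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulated_variance_of_v(n, rho, reps=4000, seed=0):
    """Sample variance of atanh(r) over repeated simulated samples."""
    rng = random.Random(seed)
    vs = [math.atanh(sample_r(n, rho, rng)) for _ in range(reps)]
    return statistics.variance(vs)


n = 26
theory = 1.0 / (n - 3)                     # approximate variance of atanh(r)
var_at_zero = simulated_variance_of_v(n, rho=0.0)
var_at_half = simulated_variance_of_v(n, rho=-0.5)
print(f"theory {theory:.4f}, rho=0: {var_at_zero:.4f}, rho=-0.5: {var_at_half:.4f}")
```

If the approximation holds, both simulated variances should land near \(1/(n-3)\) regardless of \(\rho\), which is what makes a single standard error usable in the interval and the test.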

Most popular questions from this chapter

Torsion during hip external rotation and extension may explain why acetabular labral tears occur in professional athletes. The article "Hip Rotational Velocities During the Full Golf Swing" (J. of Sports Science and Med., 2009: 296-299) reported on an investigation in which lead hip internal peak rotational velocity \((x)\) and trailing hip peak external rotational velocity \((y)\) were determined for a sample of 15 golfers. Data provided by the article's authors was used to calculate the following summary quantities: $$ \begin{array}{r} \sum\left(x_{i}-\bar{x}\right)^{2}=64,732.83, \sum\left(y_{i}-\bar{y}\right)^{2}=130,566.96, \\ \sum\left(x_{i}-\bar{x}\right)\left(y_{i}-\bar{y}\right)=44,185.87 \end{array} $$ Separate normal probability plots showed very substantial linear patterns. a. Calculate a point estimate for the population correlation coefficient. b. Carry out a test at significance level \(.01\) to decide whether there is a linear relationship between the two velocities in the sampled population; your conclusion should be based on a \(P\)-value. c. Would the conclusion of (b) have changed if you had tested appropriate hypotheses to decide whether there is a positive linear association in the population? What if a significance level of \(.05\) rather than \(.01\) had been used?

The catch basin in a storm-sewer system is the interface between surface runoff and the sewer. The catch-basin insert is a device for retrofitting catch basins to improve pollutant removal properties. The article "An Evaluation of the Urban Stormwater Pollutant Removal Efficiency of Catch Basin Inserts" (Water Envir. Res., 2005: 500-510) reported on tests of various inserts under controlled conditions for which inflow is close to what can be expected in the field. Consider the following data, read from a graph in the article, for one particular type of insert on \(x=\) amount filtered (1000s of liters) and \(y=\%\) total suspended solids removed. $$ \begin{array}{l|cccccccccc} x & 23 & 45 & 68 & 91 & 114 & 136 & 159 & 182 & 205 & 228 \\ \hline y & 53.3 & 26.9 & 54.8 & 33.8 & 29.9 & 8.2 & 17.2 & 12.2 & 3.2 & 11.1 \end{array} $$ Summary quantities are $$ \begin{aligned} &\sum x_{i}=1251, \sum x_{i}^{2}=199,365, \sum y_{i}=250.6 \\ &\sum y_{i}^{2}=9249.36, \sum x_{i} y_{i}=21,904.4 \end{aligned} $$ a. Does a scatter plot support the choice of the simple linear regression model? Explain. b. Obtain the equation of the least squares line. c. What proportion of observed variation in % removed can be attributed to the model relationship? d. Does the simple linear regression model specify a useful relationship? Carry out an appropriate test of hypotheses using a significance level of \(.05\). e. Is there strong evidence for concluding that there is at least a \(2 \%\) decrease in true average suspended solid removal associated with a 10,000 liter increase in the amount filtered? Test appropriate hypotheses using \(\alpha=.05\). f. Calculate and interpret a \(95 \%\) CI for true average \(\%\) removed when amount filtered is 100,000 liters. How does this interval compare in width to a CI when amount filtered is 200,000 liters? g. Calculate and interpret a \(95 \%\) PI for % removed when amount filtered is 100,000 liters. How does this interval compare in width to the CI calculated in (f) and to a PI when amount filtered is 200,000 liters?

Suppose an investigator has data on the amount of shelf space \(x\) devoted to display of a particular product and sales revenue \(y\) for that product. The investigator may wish to fit a model for which the true regression line passes through \((0,0)\). The appropriate model is \(Y=\beta_{1} x+\epsilon\). Assume that \(\left(x_{1}, y_{1}\right), \ldots,\left(x_{n}, y_{n}\right)\) are observed pairs generated from this model, and derive the least squares estimator of \(\beta_{1}\).

Show that the "point of averages" \((\bar{x}, \bar{y})\) lies on the estimated regression line.

The Turbine Oil Oxidation Test (TOST) and the Rotating Bomb Oxidation Test (RBOT) are two different procedures for evaluating the oxidation stability of steam turbine oils. The article "Dependence of Oxidation Stability of Steam Turbine Oil on Base Oil Composition" (J. of the Society of Tribologists and Lubrication Engrs., Oct. 1997: 19-24) reported the accompanying observations on \(x=\) TOST time (hr) and \(y=\) RBOT time (min) for 12 oil specimens. $$ \begin{array}{lrrrrrr} \text { TOST } & 4200 & 3600 & 3750 & 3675 & 4050 & 2770 \\ \text { RBOT } & 370 & 340 & 375 & 310 & 350 & 200 \\ \text { TOST } & 4870 & 4500 & 3450 & 2700 & 3750 & 3300 \\ \text { RBOT } & 400 & 375 & 285 & 225 & 345 & 285 \end{array} $$ a. Calculate and interpret the value of the sample correlation coefficient (as do the article's authors). b. How would the value of \(r\) be affected if we had let \(x=\) RBOT time and \(y=\) TOST time? c. How would the value of \(r\) be affected if RBOT time were expressed in hours? d. Construct normal probability plots and comment. e. Carry out a test of hypotheses to decide whether RBOT time and TOST time are linearly related.
