3.4 Critical Values

In the preceding section, we learned that the standard error is related to the precision of the interval estimate. A larger standard error yields a less precise estimate, that is, a wider interval estimate.

We are interested in the interval that includes a particular percentage of all samples that can be drawn, usually the 95% of all samples that are closest to the population value. In our current example, these are the 95% of all samples with an average candy weight closest to the average candy weight in the population (2.8 grams).

In theoretical probability distributions like the normal distribution, the percentage of samples within a given distance of the population value is determined by that distance expressed in standard errors. If we know the standard error, we know the interval within which we find the 95% of samples that are closest to the population value.

Figure 3.5: Standardized sample outcomes and the standard error.

Figure 3.5 shows the sampling distribution of average candy weight per sample bag. It contains two horizontal axes, one with average candy weight in grams (bottom) and one with average candy weight in standard errors, also called z scores (top).

In Figure 3.5, we approximate the sampling distribution with a theoretical probability distribution, namely the normal distribution. The theoretical probability distribution links probabilities (areas under the curve) to sample statistic outcome values (scores on the horizontal axis). For example, we have a 2.5% probability of drawing a sample bag with average candy weight below 1.2 grams and a 2.5% probability of drawing a sample bag with average candy weight over 4.4 grams.
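The link between areas under the curve and scores on the horizontal axis can be checked numerically. The sketch below is only an illustration: it takes the mean of 2.8 grams from the text, but the standard error of about 0.816 grams is an assumption, derived from the bounds 1.2 and 4.4 grams rather than stated in the text. It uses NormalDist from Python's standard library.

```python
from statistics import NormalDist

# Mean of the sampling distribution: 2.8 grams (from the text).
# Standard error: NOT given in the text; assumed here to be 1.6 / 1.96,
# which is the value implied by the bounds 1.2 and 4.4 grams.
sampling_distribution = NormalDist(mu=2.8, sigma=1.6 / 1.96)

# Area under the curve to the left of 1.2 grams.
p_below = sampling_distribution.cdf(1.2)
# Area under the curve to the right of 4.4 grams.
p_above = 1 - sampling_distribution.cdf(4.4)

print(f"P(average weight < 1.2) = {p_below:.3f}")
print(f"P(average weight > 4.4) = {p_above:.3f}")
```

Under these assumptions, both tail probabilities come out at about 0.025, matching the 2.5% areas described for Figure 3.5.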

3.4.1 Standardization and z scores

The average candy weights that are associated with 2.5% and 97.5% probabilities in Figure 3.5 depend on the kind of sample that we draw. As you may notice while playing with Figure 3.3, changing the size of the sample, for example, also changes the average candy weights that mark the 2.5% and 97.5% probabilities.

We can simplify the situation if we standardize the sampling distribution: subtract the mean of the sampling distribution from each sample mean in this distribution, and divide the result by the standard error. Thus, we transform the sampling distribution into a distribution of standardized scores (z scores). The mean of the new standardized variable is always zero and its standard deviation is always one.
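As a minimal sketch of this standardization step, the function below subtracts the mean of the sampling distribution and divides by the standard error. The function name standardize, the three sample means, and the standard error of 0.5 are hypothetical values chosen for illustration; only the mean of 2.8 grams comes from the running example.

```python
def standardize(sample_means, sampling_mean, standard_error):
    """Convert sample means to z scores: (value - mean) / standard error."""
    return [(m - sampling_mean) / standard_error for m in sample_means]

# Hypothetical sample means (grams) and a hypothetical standard error of 0.5.
z_scores = standardize([1.8, 2.8, 3.3], sampling_mean=2.8, standard_error=0.5)

for z in z_scores:
    print(f"{z:.2f}")
```

A sample mean equal to the mean of the sampling distribution receives a z score of zero, and a sample mean one standard error above it receives a z score of one.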

If we use the normal distribution for standardized scores, which is called the standard-normal distribution or z distribution, there is a single z value that marks the boundary between the top 2.5% and the bottom 97.5% of all samples, whatever the sampling distribution looked like before standardization. This z value is approximately 1.96. If we combine this value with -1.96, which separates the bottom 2.5% of all samples from the rest, we obtain an interval [-1.96, 1.96] containing the 95% of all samples that are closest to the mean of the sampling distribution.

In a standard-normal or z distribution, 1.96 is called a critical value. Together with its negative (-1.96), it separates the 95% of sample statistic outcomes that are closest to the parameter, and hence most likely to appear, from the 5% that are furthest away and least likely to appear. There are also critical z values for other probabilities, for instance, 1.64 (more precisely, 1.645) for the middle 90% of all samples and 2.58 for the middle 99% in a standard-normal distribution.
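Critical values do not have to be looked up in a table; they can be computed from the inverse cumulative distribution function of the standard-normal distribution. The sketch below uses NormalDist from Python's standard library. The helper name critical_z is an assumption made for this illustration.

```python
from statistics import NormalDist

def critical_z(confidence):
    """Critical z value bounding the middle `confidence` share of a
    standard-normal distribution (for example, 0.95 for the middle 95%)."""
    tail = (1 - confidence) / 2           # probability in each tail
    return NormalDist().inv_cdf(1 - tail)  # z value cutting off the upper tail

for confidence in (0.90, 0.95, 0.99):
    print(f"{confidence:.0%}: {critical_z(confidence):.3f}")
```

The three results round to 1.645, 1.960, and 2.576, in line with the critical values mentioned above.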

3.4.2 Interval estimates from critical values and standard errors

Critical values in a theoretical probability distribution tell us the boundaries, or range, of the interval estimate expressed in standard errors. In a normal distribution, 95% of all sample means are situated no more than 1.96 standard errors from the population mean.

If the standard error is 0.5 and the population mean is 2.8 grams, we have a 95% probability that the mean candy weight in a sample that we draw from this population lies between 1.82 grams (this is 1.96 times 0.5 subtracted from 2.8) and 3.78 grams (this is 1.96 times 0.5 added to 2.8).

Critical values make it easy to calculate an interval estimate if we know the standard error. Just take the population value and add the critical value times the standard error to obtain the upper limit of the interval estimate. Subtract the critical value times the standard error from the population value to obtain the lower limit.

Lower limit of the interval estimate = population value – critical value * standard error.
Upper limit of the interval estimate = population value + critical value * standard error.
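The two formulas can be sketched in a few lines of Python. The function name interval_limits is hypothetical; the numbers are those of the example in this section (population mean 2.8 grams, standard error 0.5, critical value 1.96).

```python
def interval_limits(population_value, standard_error, critical_value=1.96):
    """Return the (lower, upper) limits of the interval estimate."""
    margin = critical_value * standard_error
    return population_value - margin, population_value + margin

lower, upper = interval_limits(population_value=2.8, standard_error=0.5)
print(f"{lower:.2f} to {upper:.2f} grams")  # 1.82 to 3.78 grams
```

Passing a different critical_value, such as 2.58, would give the limits for the middle 99% of all samples instead.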

(Standard) normal distributions make life easier for us, because there is a fixed critical value for each probability, such as 1.96 for 95% probability, which is well worth memorizing.