1.3 A Continuous Random Variable: Overweight And Underweight.

Let us now look at another variable: the weight of candies in a bag. The weight of candies is perhaps more interesting to the average consumer than candy colour because candy weight is related to calories.

1.3.1 Continuous variable

Weight is a continuous variable because we can always think of a new weight between two other weights. For instance, consider two candy weights: 2.8 and 2.81 grams. It is easy to see that there can be a weight in between these two values, for instance, 2.803 grams. Between 2.8 and 2.803 we can discern an intermediate value such as 2.802. In principle, we could continue doing this endlessly, e.g., find a weight between 2.80195661 and 2.80195662 grams even if our scales may not be sufficiently precise to measure any further differences. It is the principle that counts. If we can always think of a new value in between two values, the variable is continuous.

Continuous variable: We can always think of a new value in between two values.

1.3.2 Continuous sample statistic

We are not interested in the weight of a single candy. If a relatively light candy is compensated for by a relatively heavy candy in the same bag, we still get the calories that we want. We are interested in the average weight of all candies in our sample bag, so average candy weight in our sample bag is our key sample statistic. We want to say something about the probabilities of average candy weight in the samples of candies that we can draw. Can we do that?

When we turn to the probabilities of getting samples with a particular average candy weight, we run into problems with a continuous sample statistic. If we want to know the probability of drawing a sample bag with an average candy weight of 2.8 grams, we must exclude sample bags with an average candy weight of 2.81 grams, or 2.801 grams, or 2.8000000001 grams, and so on. In fact, we are very unlikely to draw a sample bag with an average candy weight of exactly 2.8 grams, that is, 2.8 followed by an infinite number of zeros. In other words, the probability of such a sample bag is, for all practical purposes, zero.

This applies to every average candy weight, so all probabilities are virtually zero. The probability distribution of the sampling space, that is, of all possible outcomes, is going to be very boring: just (nearly) zeros. And it will take forever to list all possible outcomes within the sampling space, because we have an infinite number of possible outcomes. After all, we can always find a new average candy weight between two selected weights.
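To make this argument concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that average candy weight in a sample bag follows a normal distribution with mean 2.8 grams and standard deviation 0.05 grams; these numbers and the helper `prob_within` are my assumptions, not values given in this section. The sketch shows how the probability of an average candy weight close to 2.8 grams shrinks towards zero as we narrow the interval around 2.8.

```python
from statistics import NormalDist

# Assumed sampling distribution of average candy weight (illustrative values only).
sampling_dist = NormalDist(mu=2.8, sigma=0.05)

def prob_within(center, half_width, dist=sampling_dist):
    """Probability that average candy weight lies within half_width of center."""
    return dist.cdf(center + half_width) - dist.cdf(center - half_width)

# Narrow the interval around 2.8 grams step by step.
half_widths = [0.05, 0.005, 0.0005, 0.00005]
probs = [prob_within(2.8, h) for h in half_widths]

for h, p in zip(half_widths, probs):
    print(f"2.8 plus or minus {h:g} grams: probability {p:.6f}")
```

Each time the interval becomes ten times narrower, the probability drops by roughly a factor of ten. An interval of width zero, that is, exactly 2.8 grams, is the limit of this process.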

1.3.3 Probability density

With a continuous sample statistic, we must look at a range of values instead of a single value. We can meaningfully talk about the probability of having a sample bag with an average candy weight of at least 2.8 grams or at most 2.8 grams. We choose a threshold, in this example 2.8 grams, and determine the probability of values above or below this threshold. We can also use two thresholds, for example the probability of an average candy weight between 2.75 and 2.85 grams. This is probably what you were thinking of when I referred to a bag with 2.8 grams as average candy weight.

With a discrete sample statistic, we displayed the probability of a single value on the vertical axis of a plot of the sampling distribution. If we cannot determine the probability of a single value and have to link probabilities to a range of values on the horizontal axis, for example, average candy weight above or below 2.8 grams, how can we display probabilities? We have to display a probability as an area between the horizontal axis and a curve. This curve is called a probability density function, so if the vertical axis of a continuous probability distribution has a label, it is “Probability density” instead of “Probability”.

Figure 1.5 shows an example of a continuous probability distribution for the average weight of candies in a sample bag. This is the familiar normal distribution, so we could say that the normal curve is the probability density function here. The total area under this curve is set to one, so the area belonging to a range of sample outcomes (average candy weight) is one or less, as probabilities should be.
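The idea of a probability as an area under a curve can be checked numerically. The sketch below again assumes a normal distribution with mean 2.8 and standard deviation 0.05 grams (illustrative values that I chose, not values taken from Figure 1.5). The helper `area_under_density` is hypothetical: it adds up thin rectangles under the curve between two thresholds and compares the result with the probability that Python's built-in normal distribution reports.

```python
from statistics import NormalDist

# Assumed values for illustration; the text does not state them.
sampling_dist = NormalDist(mu=2.8, sigma=0.05)

def area_under_density(dist, lower, upper, steps=10_000):
    """Approximate the area under dist.pdf between lower and upper (midpoint rule)."""
    width = (upper - lower) / steps
    heights = (dist.pdf(lower + (i + 0.5) * width) for i in range(steps))
    return sum(heights) * width

# Probability of an average candy weight between 2.75 and 2.85 grams.
area = area_under_density(sampling_dist, 2.75, 2.85)
exact = sampling_dist.cdf(2.85) - sampling_dist.cdf(2.75)

# The area over a very wide range approximates the total area under the curve.
total = area_under_density(sampling_dist, 2.3, 3.3)

# The height of the curve is a density, not a probability: here it exceeds one.
peak_density = sampling_dist.pdf(2.8)

print(f"area between thresholds: {area:.4f} (from cdf: {exact:.4f})")
print(f"total area: {total:.4f}")
print(f"density at 2.8 grams: {peak_density:.2f}")
```

Note that the density at 2.8 grams is larger than one under these assumptions. This is why the vertical axis cannot be read as a probability: only areas are probabilities.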

Figure 1.5: How do we display probabilities in a continuous sampling distribution? Tip: Click on a slider handle and use your keyboard arrow keys to make small changes to the slider handle position.

A probability density function can give us the probability of values between two thresholds. It can also give us the probability of values up to (and including) a threshold value, which is known as a left-hand probability, or the probability of values above (and including) a threshold value, which is called a right-hand probability. In a null hypothesis significance test (Chapter 4), right-hand and left-hand probabilities are used to calculate p values.

Why did I put (and including) between parentheses? It does not really matter whether we add the exact boundary value (2.8 grams) to the probability on the left or on the right because the probability of getting a bag with average candy weight at exactly 2.8 grams (with a very long trail of zero decimals) is negligible.
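As a small illustration of left-hand and right-hand probabilities, the sketch below uses an assumed normal distribution (mean 2.8, standard deviation 0.05 grams; my choice for illustration only) and thresholds of 2.75 and 2.85 grams. The variable names are hypothetical. Because the boundary value itself has negligible probability, the code does not need to distinguish between "below" and "below or equal to" a threshold.

```python
from statistics import NormalDist

# Assumed sampling distribution (illustrative values, not given in the text).
sampling_dist = NormalDist(mu=2.8, sigma=0.05)

lower, upper = 2.75, 2.85

# Left-hand probability: average candy weight up to the lower threshold.
left_prob = sampling_dist.cdf(lower)

# Probability between the two thresholds.
between_prob = sampling_dist.cdf(upper) - sampling_dist.cdf(lower)

# Right-hand probability: average candy weight above the upper threshold.
# It is one minus the area to the left because the total area equals one.
right_prob = 1 - sampling_dist.cdf(upper)

print(f"left-hand probability (up to {lower}): {left_prob:.4f}")
print(f"between {lower} and {upper}: {between_prob:.4f}")
print(f"right-hand probability (above {upper}): {right_prob:.4f}")
print(f"sum: {left_prob + between_prob + right_prob:.4f}")
```

The three areas together cover the whole curve, so they add up to one whichever thresholds we pick.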

Are you struggling with the idea of areas instead of heights (values on the vertical axis) as probabilities? Just realize that we could use the area of a bar in a histogram instead of its height as an indication of the probability in discrete probability distributions, for example, Figure 1.4. The bars in that histogram are all equally wide, so bar areas are proportional to bar heights, and relative differences between bar areas are equal to relative differences between bar heights.

1.3.4 Probabilities always sum to 1

While you were playing with Figure 1.5, you may have noticed that displayed probabilities always add up to one. This is true for every probability distribution because it is part of the definition of a probability distribution.

In addition, you may have realized that we can use probability distributions in two ways. We can use them to say how likely or unlikely we are to draw a sample with the sample statistic value in a particular range. For example, what is the chance that we draw a sample bag with average candy weight over 2.9 grams? But we can also use a probability distribution to find the threshold values that separate the top ten per cent or the bottom five per cent in a distribution. If we want a sample bag that belongs to the ten per cent of bags with the highest average candy weight, what is the minimum average candy weight that the sample bag must have?
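The two ways of using a probability distribution can be sketched in a few lines of Python. As before, the normal distribution with mean 2.8 and standard deviation 0.05 grams is an assumption made only for this example, so the resulting numbers are illustrative and not results from the candy data.

```python
from statistics import NormalDist

# Assumed sampling distribution (illustrative values, not given in the text).
sampling_dist = NormalDist(mu=2.8, sigma=0.05)

# Use 1: from a threshold to a probability.
# What is the chance of a sample bag with average candy weight over 2.9 grams?
prob_over_2_9 = 1 - sampling_dist.cdf(2.9)

# Use 2: from a probability to a threshold.
# Minimum average candy weight of the ten per cent heaviest bags:
# ninety per cent of the area lies to the left of this value.
top_10_cutoff = sampling_dist.inv_cdf(0.90)

# Maximum average candy weight of the five per cent lightest bags.
bottom_5_cutoff = sampling_dist.inv_cdf(0.05)

print(f"P(average weight > 2.9 grams): {prob_over_2_9:.4f}")
print(f"top ten per cent starts at:    {top_10_cutoff:.3f} grams")
print(f"bottom five per cent ends at:  {bottom_5_cutoff:.3f} grams")
```

The first use goes from a value on the horizontal axis to an area (`cdf`); the second use goes in the opposite direction, from an area to a value on the horizontal axis (`inv_cdf`).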