Confidence intervals are used to estimate a range within which a population parameter is likely to lie. This topic is generally covered in statistics texts in a chapter titled "estimation". Confidence intervals can be calculated for pretty much any population parameter, but they are most frequently used to estimate population means and proportions. The video demonstrates a spreadsheet model for estimating a population mean from a random sample.

There are two methodologies for calculating confidence intervals, and which one applies depends on what is known about the population standard deviation:

We know or have reason to think we know the population standard deviation

We need to estimate the population standard deviation from the sample data we are using

In practice it is much more common that the population standard deviation is not known. For completeness, both methodologies are demonstrated and the differences between them are discussed.
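For reference, the two cases are usually written as follows in textbook treatments. This is a sketch in standard notation rather than the notation used in the video: x-bar is the sample mean, n the sample size, sigma the known population standard deviation, s the sample standard deviation, and alpha is one minus the confidence level. The use of Student's t distribution with n - 1 degrees of freedom in the second case is the conventional choice and is assumed here.

```latex
% Population standard deviation \sigma known: normal (z) critical value
\bar{x} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}

% \sigma estimated by the sample standard deviation s: Student's t critical value
\bar{x} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}
```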

To compute an interval, four values are needed:

A point estimate of the mean (calculated from your sample)

The standard deviation (either known or calculated from the sample)

The sample size being used to estimate the population parameter

The confidence level desired
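The four values above can be combined as in the following minimal Python sketch, which is separate from the spreadsheet in the video. The function names `z_critical` and `confidence_interval` and the input numbers are hypothetical and chosen for illustration only. It handles the case where the population standard deviation is known, using the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def z_critical(confidence):
    """Two-sided critical value of the standard normal distribution."""
    alpha = 1.0 - confidence
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


def confidence_interval(point_estimate, std_dev, n, critical_value):
    """Return (lower, upper): point estimate +/- critical value * standard error."""
    standard_error = std_dev / sqrt(n)
    margin = critical_value * standard_error
    return point_estimate - margin, point_estimate + margin


# Made-up inputs: sample mean 50, known standard deviation 10, sample size 100.
lower, upper = confidence_interval(50.0, 10.0, 100, z_critical(0.95))
print(f"95% interval: {lower:.2f} to {upper:.2f}")  # 95% interval: 48.04 to 51.96
```

When the standard deviation is estimated from the sample, the same `confidence_interval` function could be called with a t critical value (n - 1 degrees of freedom, looked up in a t table for example) in place of `z_critical`.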

Of course we would like to be 100% sure that our interval contains the population mean. However, almost nothing is certain, and since we are using properties of the normal distribution to calculate our interval (a distribution whose tails extend indefinitely in both directions), we can’t be 100% certain. So we have to settle for “pretty certain”. In general, pretty certain is somewhere around 90% or 95%, and we will see that there is a trade-off for being more certain: the higher the confidence level, the wider the interval. The difference between our chosen confidence level and 100% is called alpha. It is also the probability that an interval calculated this way does not contain the population mean. Alpha is also referred to as the level of significance.
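The relationship between the confidence level, alpha, and the width of the interval can be illustrated with a short Python sketch. The values of `sigma` and `n` are made up, the helper `interval_width` is hypothetical, and the population standard deviation is assumed to be known so that a normal critical value applies.

```python
from math import sqrt
from statistics import NormalDist

# Made-up values for illustration only.
sigma = 10.0  # population standard deviation, assumed known
n = 100       # sample size


def interval_width(confidence):
    """Full width of the two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return 2.0 * z * sigma / sqrt(n)


# Compare the interval width at three common confidence levels.
widths = {c: interval_width(c) for c in (0.90, 0.95, 0.99)}
for c, w in widths.items():
    print(f"confidence {c:.0%}  alpha {1 - c:.2f}  width {w:.2f}")
```

With the sample held fixed, the width grows as the confidence level rises. A 100% level would need an infinitely wide interval, because the tails of the normal distribution never reach zero.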

The spreadsheet used in this video may be downloaded from our resources page.