
Confidence Interval of the Mean with Excel

A confidence interval estimates a range within which a population parameter is likely to lie. The topic is usually covered in statistics texts in a chapter titled "estimation". Confidence intervals can be calculated for almost any population parameter, but they are most often used to estimate population means and proportions. The video demonstrates a spreadsheet model for estimating a population mean from a random sample.

There are two methodologies for calculating confidence intervals:
  1. We know or have reason to think we know the population standard deviation
  2. We need to estimate the population standard deviation from the sample data we are using

In practice, the population standard deviation is usually not known. For completeness, both methodologies are demonstrated and the differences between them are discussed.
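
As a rough reference, the two cases are usually written in textbook form as shown below. The notation is an assumption, not taken from the video: \bar{x} is the sample mean, \sigma the known population standard deviation, s the sample standard deviation, n the sample size, and \alpha one minus the confidence level. The first case uses a critical value from the standard normal distribution; the second uses Student's t distribution with n - 1 degrees of freedom.

```latex
% Case 1: population standard deviation known
\bar{x} \pm z_{\alpha/2} \, \frac{\sigma}{\sqrt{n}}

% Case 2: standard deviation estimated from the sample
\bar{x} \pm t_{\alpha/2,\, n-1} \, \frac{s}{\sqrt{n}}
```

For the same confidence level, the t critical value is larger than the z critical value, so the second interval is somewhat wider; the gap shrinks as the sample size grows.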

How to Calculate a Confidence Interval

To compute an interval, four values are needed:
  1. A point estimate of the mean (calculated from your sample)
  2. The standard deviation (either known or calculated from the sample)
  3. The sample size being used to estimate the population parameter
  4. The confidence level desired
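
The four inputs above can be combined as in the minimal Python sketch below. It is not the spreadsheet from the video; it covers only the case where the population standard deviation is known, and the function name `z_confidence_interval` and the sample figures (mean 50, standard deviation 10, sample size 25) are made up for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_confidence_interval(sample_mean, std_dev, n, confidence=0.95):
    """Interval for the mean when the population standard deviation is known.

    Hypothetical helper for illustration only.
    """
    alpha = 1.0 - confidence
    # Two-sided critical value from the standard normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Margin of error: critical value times the standard error of the mean.
    margin = z * std_dev / sqrt(n)
    return sample_mean - margin, sample_mean + margin


# Illustrative inputs (assumed, not from the article's spreadsheet).
low, high = z_confidence_interval(sample_mean=50.0, std_dev=10.0, n=25, confidence=0.95)
print(f"95% interval: {low:.2f} to {high:.2f}")
```

With these inputs the interval runs from about 46.08 to 53.92. When the standard deviation is estimated from the sample instead, the normal critical value would be swapped for one from Student's t distribution, which the Python standard library does not provide.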

Understanding Confidence Intervals

Of course we would like to be 100% sure that our interval contains the population mean. However, almost nothing is certain, and because we calculate the interval using properties of the normal distribution, whose tails extend indefinitely in both directions, we can’t be 100% certain. So we have to settle for “pretty certain”. In general, pretty certain is somewhere around 90% or 95%, and we will see that there is a trade-off for being more certain: a higher confidence level produces a wider interval. The difference between 100% and our chosen confidence level is called alpha; for a 95% confidence level, alpha is 5%. Alpha is also referred to as the level of significance, and it is the same alpha used in hypothesis testing. Here it can be interpreted as the probability that an interval constructed this way does not contain the population mean.
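
To make the trade-off concrete, the short Python sketch below computes alpha and the margin of error at three confidence levels. It assumes a known standard deviation of 10 and a sample size of 25, both invented for illustration, and `margin_of_error` is a hypothetical helper rather than anything from the spreadsheet.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(confidence, std_dev, n):
    """Half-width of a z-based interval (hypothetical helper)."""
    alpha = 1.0 - confidence
    # Split alpha evenly between the two tails of the normal distribution.
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * std_dev / sqrt(n)


# Same assumed sample, three different confidence levels.
margins = {c: margin_of_error(c, std_dev=10.0, n=25) for c in (0.90, 0.95, 0.99)}
for c, m in margins.items():
    print(f"{c:.0%} confidence: alpha = {1 - c:.2f}, margin = +/- {m:.2f}")
```

The margins come out at roughly 3.29, 3.92, and 5.15: each step up in confidence is paid for with a wider, less precise interval.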

The spreadsheet used in this video may be downloaded here.