Interpreting Confidence Intervals


Confidence intervals are often misunderstood.

Let’s use an example. You have a population of oak trees in your city. There are about 10,000 of them. You’ve been asked to estimate the mean height of all the oak trees. Instead of measuring every single tree, you collect a random sample of 50 trees. The mean height of the sample is 50 ft with a standard deviation of 7.5 ft. Based on a 95% confidence level, you calculate a confidence interval for the mean height that stretches from about 48 ft to 52 ft.
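To make the arithmetic concrete, here is a minimal sketch of how such an interval could be computed. It assumes the common normal-approximation formula (sample mean plus or minus a critical value times the standard error); the article does not say which method was used, and the function name is only illustrative.

```python
import math
from statistics import NormalDist

def mean_confidence_interval(sample_mean, sample_sd, n, confidence=0.95):
    """Normal-approximation confidence interval for a population mean."""
    # Critical value: about 1.96 for a 95% confidence level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    # Standard error of the sample mean.
    standard_error = sample_sd / math.sqrt(n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

# Values from the oak tree example: mean 50 ft, SD 7.5 ft, 50 trees.
low, high = mean_confidence_interval(50.0, 7.5, 50)
print(f"95% CI: {low:.1f} ft to {high:.1f} ft")  # 95% CI: 47.9 ft to 52.1 ft
```

Rounded to whole feet, this gives the interval of 48 ft to 52 ft used in the example.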

The confidence level expresses the uncertainty of the estimation process. 95% confidence means that if you take repeated random samples from a population and construct a confidence interval for each sample using the same method, you can expect 95% of these intervals to capture the population parameter, which here is the population mean. You can also expect the remaining 5% to miss it. The 95% describes how often the method succeeds over many samples, not any single interval.
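The repeated-sampling idea can be checked with a small simulation. The sketch below is only an illustration: it assumes a hypothetical population of tree heights that is normally distributed with a mean of 50 ft and a standard deviation of 7.5 ft, and it uses a normal-approximation interval. None of these choices come from the example itself.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def coverage_rate(pop_mean=50.0, pop_sd=7.5, n=50, trials=1000,
                  confidence=0.95, seed=0):
    """Fraction of simulated intervals that capture the population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    hits = 0
    for _ in range(trials):
        # Draw one random sample from the assumed (hypothetical) population.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(n)]
        sample_mean = mean(sample)
        margin = z * stdev(sample) / math.sqrt(n)
        # Does this sample's interval contain the true mean?
        if sample_mean - margin <= pop_mean <= sample_mean + margin:
            hits += 1
    return hits / trials

rate = coverage_rate()
print(f"{rate:.1%} of intervals captured the population mean")
```

The printed share should land close to 95%, with some run-to-run variation if the seed is changed. Each individual interval either contains the mean or does not.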

In practice, we usually collect only one sample, because collecting many samples is expensive and time-consuming.

Misconceptions

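The first common misinterpretation is that a 95% confidence interval means that 95% of all the data values in your data set fall within the interval. It is not accurate to say that 95% of the tree heights in your data set fall between 48 and 52 feet. The interval estimates the population mean, not the spread of individual values; with a standard deviation of 7.5 ft, many individual trees will be shorter than 48 ft or taller than 52 ft.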
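A quick calculation shows how far off this reading can be. The sketch below assumes, purely for illustration, that individual tree heights are roughly normally distributed with the sample's mean (50 ft) and standard deviation (7.5 ft); the example does not state the shape of the distribution.

```python
from statistics import NormalDist

# Hypothetical model of individual tree heights (an assumption for illustration).
heights = NormalDist(mu=50.0, sigma=7.5)

# Share of individual trees expected to fall inside the 48 ft to 52 ft interval.
share_inside = heights.cdf(52.0) - heights.cdf(48.0)
print(f"{share_inside:.0%} of individual heights fall between 48 ft and 52 ft")
```

Under that assumption, only about a fifth of the trees have heights inside the interval, nowhere near 95%.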

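The second common misinterpretation is that a 95% confidence interval implies that 95% of all possible sample means fall within the range of the interval. This is not necessarily true: the interval is centered on the mean of the one sample you drew, so the share of other sample means that land inside it depends on how close that sample mean happens to be to the population mean.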
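As a rough sketch of why this fails, the code below computes the share of sample means that would fall inside the fixed interval of 48 ft to 52 ft for a few hypothetical population means. It assumes sample means are approximately normally distributed with a standard error of 7.5 / sqrt(50); the candidate population means are made up for illustration.

```python
import math
from statistics import NormalDist

def share_of_sample_means_inside(low, high, pop_mean, pop_sd=7.5, n=50):
    """Share of all possible sample means that fall between low and high."""
    # Sample means are approximately normal around the population mean,
    # with spread equal to the standard error.
    sampling_dist = NormalDist(pop_mean, pop_sd / math.sqrt(n))
    return sampling_dist.cdf(high) - sampling_dist.cdf(low)

for assumed_mean in (50.0, 51.0, 52.0):  # hypothetical population means
    share = share_of_sample_means_inside(48.0, 52.0, assumed_mean)
    print(f"population mean {assumed_mean} ft: {share:.0%} of sample means inside")
```

The share shrinks as the true mean moves away from the center of the interval, so one interval cannot promise to hold 95% of all sample means.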

The third common misinterpretation is to assume that a confidence interval accounts for every possible source of error in your results. While every confidence interval includes a margin of error, that margin reflects only random sampling variation; many other kinds of errors can enter into statistical analysis. For example, the questions in a survey may be poorly designed, or sampling bias may affect the sample data.
