Confidence Intervals Explained: Formula, Meaning, and How to Calculate

Suppose you want to know the average height of every adult in your country. Measuring all of them is impossible, so you measure a sample — say 1,000 people — and calculate their average. The number you get feels solid, but it hides a problem: it depends entirely on which people happened to land in your sample. A different 1,000 people would have produced a different average.

On its own, a sample mean says nothing about that uncertainty, so it looks far more exact than it really is. A confidence interval fixes this by reporting a range instead of a single number — and that range is one of the most useful, and most misunderstood, tools in statistics.

What Is a Confidence Interval?

A confidence interval (CI) is a range of values that is likely to contain the true population parameter — most often the population mean. Instead of a single figure, it gives you a band the true value probably falls within, and the width of that band tells you how precise your estimate is before you act on it.

So instead of saying:

“The average height is 175 cm.”

a confidence interval lets you say:

“The average height is between 173 cm and 177 cm, with 95% confidence.”

That second statement is more honest. It reports your best estimate and how much uncertainty that estimate carries — exactly what you need when a decision rides on the number.

What “95% Confidence” Actually Means

You’ll most often see 95% confidence intervals, and this is where almost everyone — including many trained professionals — gets confused.

It’s tempting to say: “There’s a 95% chance the true average lies inside this particular interval.” That interpretation is wrong.

The true average is a fixed number. Once the data is collected, it either falls inside your interval or it doesn’t — there’s no probability left to assign. The correct interpretation is about the method, not the single interval:

If you repeated your study many times, each time collecting a fresh sample and building a new interval, about 95% of those intervals would contain the true value.

This comes from frequentist statistics, a school of thought that says probability can only be defined for things that are genuinely random. Your sample is random, because it was drawn at random. But the true population mean is not random — it is a fixed number that already exists, even though you don’t know it.

For example, suppose the true average height of everyone in your population today is exactly 174.43 cm. That number is real and definite; you simply haven’t measured it, and not knowing it doesn’t make it random. This is why the frequentist view refuses to say “there’s a 95% chance that 174.43 cm falls in the interval.” Instead it says: “intervals produced by this method contain the true value 95% of the time in the long run.”
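
The "repeat the study many times" idea can be checked with a small simulation. The sketch below is only an illustration, not a proof: it assumes heights are normally distributed, borrows the 174.43 cm figure above as the true mean, and uses a hypothetical standard deviation of 16 cm and a hypothetical sample size of 100. The function name coverage and all of its parameters are made up for this example.

```python
import random
from statistics import NormalDist, fmean

def coverage(true_mean=174.43, sigma=16.0, n=100, trials=2000, level=0.95, seed=0):
    """Fraction of simulated intervals that contain the true mean."""
    rng = random.Random(seed)                   # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(0.5 + level / 2)   # about 1.96 for a 95% level
    margin = z * sigma / n ** 0.5               # sigma is treated as known here
    hits = 0
    for _ in range(trials):
        # Draw a fresh sample and build a new interval around its mean.
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        x_bar = fmean(sample)
        if x_bar - margin <= true_mean <= x_bar + margin:
            hits += 1
    return hits / trials

print(coverage())  # a value close to 0.95
```

Each individual interval either contains 174.43 or it does not. The 95% describes how often the procedure succeeds across all the repetitions.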

The Confidence Interval Formula

For a confidence interval around an average, the formula is:

interval = x̄ ± z × (σ / √n)

where:

  • x̄ is the sample mean (the point estimate)
  • σ is the population standard deviation (in practice, estimated by the standard deviation of your sample)
  • n is the sample size
  • σ / √n is the standard error
  • z is the critical value that sets the confidence level

The result always has the same shape: estimate ± margin of error.

Worked Example

Suppose you measure 1,000 adults, get a sample mean of 171 cm, and the standard deviation is 16 cm. For a 95% interval, z = 1.96.

  1. Standard error: 16 / √1000 = 16 / 31.6 ≈ 0.51 cm
  2. Margin of error: 1.96 × 0.51 ≈ 1.0 cm
  3. Interval: 171 ± 1.0 → 170 cm to 172 cm

So with 95% confidence, the true average adult height lies between 170 cm and 172 cm.
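
The same three steps can be written as a short function. This is a minimal sketch that assumes the z-based formula above applies (a large sample, with the standard deviation treated as known). The name mean_confidence_interval is hypothetical, and the critical value comes from Python's standard-library NormalDist rather than a lookup table.

```python
from math import sqrt
from statistics import NormalDist

def mean_confidence_interval(x_bar, sigma, n, level=0.95):
    """Return (low, high) for x_bar ± z × (sigma / √n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # critical value, about 1.96 for 95%
    standard_error = sigma / sqrt(n)
    margin = z * standard_error
    return x_bar - margin, x_bar + margin

# Numbers from the worked example: mean 171 cm, standard deviation 16 cm, n = 1,000.
low, high = mean_confidence_interval(171, 16, 1000)
print(f"{low:.1f} cm to {high:.1f} cm")  # 170.0 cm to 172.0 cm
```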

Critical z-Values

The z in the formula comes from the standard normal distribution. It answers one question: how many standard errors must you reach out on each side to capture the chosen percentage of the curve?

Confidence level    Area in each tail    z-value
90%                 5.0%                 1.645
95%                 2.5%                 1.960
99%                 0.5%                 2.576
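
These critical values do not have to be memorized. As a small sketch, they can be recovered from the inverse cumulative distribution function of the standard normal curve, here using NormalDist from Python's standard library. The helper name critical_z is an assumption made for this example.

```python
from statistics import NormalDist

def critical_z(level):
    """z-value that leaves (1 - level) / 2 of the curve in each tail."""
    tail = (1 - level) / 2
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}  z = {critical_z(level):.3f}")
# 90%  z = 1.645
# 95%  z = 1.960
# 99%  z = 2.576
```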

What Changes the Width of a Confidence Interval

Three things determine how wide the interval turns out:

  • Confidence level. A higher confidence level requires a larger z, so the interval widens. Moving from 95% to 99% raises z from 1.960 to 2.576, which makes the interval about 31% wider. That loss of precision is the price of the extra certainty, and it is why most work sticks with 95%.
  • Variability in the data. The more spread out your sample is, the wider the interval, because a noisier sample tells you less about the true mean.
  • Sample size. A larger sample narrows the interval. Because n sits under the square root in the formula, the standard error shrinks in proportion to 1/√n, so you need four times as much data to halve the width.
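
To see the three effects side by side, the sketch below changes one input at a time and prints the resulting margin of error. The baseline reuses the worked-example figures (standard deviation 16 cm, n = 1,000). The function name margin_of_error and the alternative values are hypothetical choices for illustration.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(sigma, n, level=0.95):
    """Half-width of the interval: z × (sigma / √n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / sqrt(n)

base = margin_of_error(sigma=16, n=1000)
print(f"baseline (95%, sigma=16, n=1000): {base:.2f} cm")
print(f"99% confidence:                   {margin_of_error(16, 1000, 0.99):.2f} cm")
print(f"sigma doubled to 32:              {margin_of_error(32, 1000):.2f} cm")
print(f"n quadrupled to 4000:             {margin_of_error(16, 4000):.2f} cm")
```

Doubling the spread doubles the margin, while quadrupling the sample only halves it, which is why precision gained through sample size gets expensive quickly.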

Why Confidence Intervals Matter in Quality and Process Improvement

In Lean Six Sigma and statistical process control, you almost never have the whole population — you work from samples pulled off a running process. Confidence intervals are what keep those sample-based conclusions honest:

  • When you estimate a process mean or a defect rate, a CI shows whether your sample is precise enough to act on, or whether you’re reading noise.
  • In process capability studies, a Cpk calculated from a small sample can swing widely; a confidence interval around it reveals how much you can actually trust that capability number.
  • In hypothesis testing and A/B-style comparisons, an interval tells you not just whether two processes differ, but by how much — and whether that difference is even meaningful.
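
As an illustration of the last point, here is a hedged sketch of an interval for the difference between two process means. It assumes two independent, reasonably large samples, so that the normal approximation is acceptable. The function name difference_interval and the cycle-time figures are invented for this example and do not come from any real process.

```python
from math import sqrt
from statistics import NormalDist

def difference_interval(mean_a, sd_a, n_a, mean_b, sd_b, n_b, level=0.95):
    """Large-sample z interval for (mean_a - mean_b)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    diff = mean_a - mean_b
    # Standard error of a difference between two independent sample means.
    se = sqrt(sd_a ** 2 / n_a + sd_b ** 2 / n_b)
    return diff - z * se, diff + z * se

# Hypothetical cycle times in seconds for two process variants, 400 units each.
low, high = difference_interval(52.0, 6.0, 400, 50.5, 6.5, 400)
print(f"difference: {low:.2f} s to {high:.2f} s")
```

With these made-up numbers the interval runs from roughly 0.6 s to 2.4 s. It excludes zero, so the two variants appear to differ, and its width shows how uncertain the size of that difference still is.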

A single point estimate invites overconfidence. A confidence interval forces you to ask the right question: how sure am I, really?
