Hypothesis Test of a Coin Toss



To begin learning about hypothesis testing, let's look at a very simple, simplified example: tossing a coin. Hypothesis tests are based on probability, not certainty, so we'll briefly discuss probability first.

We know that the probability of getting heads on a single toss of a fair coin is 50%. We know from the multiplication rule that the probability of getting two heads in a row is 25%. The probability of getting three heads in a row is 50% times 50% times 50%, which is 1/8, or 12.5%. Getting heads four times in a row is only 6.25%. That's unlikely, but certainly not impossible. At some point you need to set a threshold for yourself. Suppose you choose 5%. There's nothing special about 5%. You assume that the coin is fair, but if the probability of the observed outcome is less than 5%, you'll conclude that the coin is actually rigged.

Let's start the test. You flip the coin six times and it turns up heads all six times. Wow! That is our sample data. You compute that the probability of getting six heads in a row from the toss of a fair coin is 50% multiplied by itself six times. That's 1.5625%.
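Here is a minimal Python sketch of that multiplication rule; it simply multiplies 50% by itself n times for runs of one to six heads:

    # Probability of n heads in a row with a fair coin: 0.5 multiplied by itself n times.
    for n in range(1, 7):
        prob = 0.5 ** n
        print(f"{n} heads in a row: {prob:.4%}")

    # 2 heads in a row: 25.0000%
    # 4 heads in a row: 6.2500%
    # 6 heads in a row: 1.5625%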

Here are the four steps in conducting a hypothesis test:

  1. State the null hypothesis and the alternative hypothesis.
  2. Choose a significance level.
  3. Find the p-value.
  4. Reject or fail to reject the null hypothesis.

Step 1

The null hypothesis is a statement that is assumed to be true unless there’s convincing evidence to the contrary. The null hypothesis typically assumes that your observed data occurs by chance. The alternative hypothesis is a statement that contradicts the null hypothesis, and is accepted as true only if there’s convincing evidence for it. The alternative hypothesis typically assumes that your observed data does not occur by chance.

In our example, the null hypothesis states that the coin is fair. That's the status quo, the normal or typical state of things. It's like saying the coin is innocent until proven guilty.

The alternative hypothesis, in this case, states that the coin is rigged: the six heads in our experiment were the result of rigging, not chance. It will be accepted as true only if there is convincing evidence for it.

Step 2

Choose your significance level. This is the threshold for deciding whether a result is statistically significant or just due to chance. The significance level is also the probability of rejecting the null hypothesis when it is actually true. That's like wrongly convicting an innocent person. Quite often the significance level is 5%.

Step 3

Find the p-value. The p-value is the probability of getting a result at least as extreme as the one you observed, assuming the null hypothesis is true.

Step 4

Reject or fail to reject the null hypothesis. If your p-value is less than the significance level then you will reject the null hypothesis. If the p-value is greater than the significance level you will fail to reject the null hypothesis.
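In code, that decision rule is a single comparison. Here is a small sketch (the function name and example values are just for illustration):

    def decide(p_value, alpha=0.05):
        # Reject the null hypothesis when the p-value falls below the significance level.
        if p_value < alpha:
            return "Reject the null hypothesis"
        return "Fail to reject the null hypothesis"

    print(decide(p_value=0.0156))  # Reject the null hypothesis
    print(decide(p_value=0.30))    # Fail to reject the null hypothesis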

Coin Toss Experiment

You toss a coin six times and get heads all six times. Is that enough evidence to say that the coin is rigged, or were those six heads just chance? Six heads gives a p-value of 1.56%. How did we get that? We took the probability of one half (50%) and raised it to the power of six to get 1/64, which equals 0.0156, or 1.56%. Earlier we chose a significance level of 5%. Since our p-value of 1.56% is less than the significance level of 5%, we reject the null hypothesis.
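You can check this calculation in Python. The sketch below assumes SciPy is installed and uses scipy.stats.binomtest with a one-sided alternative, which for six heads in six tosses gives the same 1/64:

    from scipy.stats import binomtest

    heads, tosses = 6, 6   # observed data: six heads in six tosses

    # One-sided test: probability of a result at least this extreme
    # (six or more heads) if the coin is fair (p = 0.5).
    result = binomtest(k=heads, n=tosses, p=0.5, alternative="greater")
    print(f"p-value: {result.pvalue:.4%}")   # 1.5625%, i.e. (1/2) ** 6 = 1/64

    # 1.56% is below our 5% significance level, so we reject the null hypothesis.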

Ultimately, it’s your responsibility as a data professional to determine how much evidence you need to decide that a result is statistically significant.

Errors

Probability is the foundation of hypothesis testing. When working with probabilities, there is no certainty, and mistakes can be made.

In hypothesis testing, there are two types of errors you can make when drawing a conclusion: a Type I error and a Type II error. A Type I error, also known as a false positive, occurs when you reject a null hypothesis that is actually true. In other words, you conclude that your result is statistically significant (“positive”) when in fact it occurred by chance. To reduce your chance of making a Type I error, choose a lower significance level, perhaps 1% instead of 5%.

However, choosing a lower significance level means you're more likely to make a Type II error, or false negative. This occurs when you fail to reject a null hypothesis that is actually false. In other words, you conclude your result occurred by chance when it is in fact statistically significant.
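A quick simulation makes both error types concrete. The sketch below uses our six-toss rule (reject the fair-coin hypothesis only when all six tosses are heads, since that p-value of 1/64 is below a 5% significance level); the 80% heads probability for the rigged coin is just an assumed value for illustration:

    import random

    random.seed(1)
    trials = 100_000

    def rejects_null(heads_prob):
        # One six-toss experiment; reject the "fair coin" null only if all six are heads.
        return all(random.random() < heads_prob for _ in range(6))

    # Type I error: the coin really is fair, but we reject the null anyway.
    type_1 = sum(rejects_null(0.5) for _ in range(trials)) / trials

    # Type II error: the coin is rigged (assumed 80% heads), but we fail to reject.
    type_2 = sum(not rejects_null(0.8) for _ in range(trials)) / trials

    print(f"Type I error rate  (false positive): {type_1:.2%}")   # about 1/64, or 1.6%
    print(f"Type II error rate (false negative): {type_2:.2%}")   # about 1 - 0.8**6, or 74%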

Choosing a significance level is important. Suppose you are testing fabric strength for a parachute manufacturer. For safety, the fabric has to be proven strong enough, so the null hypothesis is that the fabric does NOT meet the "strong enough" standard, and the alternative hypothesis is that it does. The worst case here is a false positive (a Type I error): concluding the fabric is strong enough when it is not. Very bad. To minimize that risk, choose a lower significance level, about 1% instead of the standard 5%.
