Hypothesis Testing Introduction


This entry is part 1 of 3 in the series Statistics Hypothesis

Series of Posts

This post is the first part of a short series of posts called Hypothesis Testing. You can see the list of posts in this series at the top of this post. The previous series of posts was Confidence Intervals.

What is a hypothesis? A hypothesis is an idea that can be tested. For example, you could say the price of a dozen eggs in Los Angeles is more than four dollars. Or you could say the average salary for a data analyst in Canada is $70,000. Can we test whether that estimate is correct? How about a statement like "the mean salary for a data analyst is above $70,000"?

A hypothesis test uses sample data to evaluate an assumption about a population parameter.

To understand hypothesis testing, you’ll need to understand descriptive statistics and inferential statistics. We have also already talked about confidence intervals.

We have the null hypothesis, denoted H0. We also have the alternative hypothesis, denoted H1 or HA. The null hypothesis is the one to be tested, and the alternative is everything else. The null hypothesis is a statement about a population parameter, such as the mean, that is assumed to be true until it is shown to be false. The null hypothesis stands until it is rejected. The outcomes of a test refer to the population parameter rather than the sample statistic.
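To make the null and alternative hypotheses concrete, here is a minimal sketch of the salary example as a one-sample t-test. The salary figures are made up for illustration, and the choice of a one-sided t-test is an assumption, not something this post prescribes. It uses scipy.stats.ttest_1samp(); the alternative argument needs SciPy 1.6 or later.

```python
from scipy import stats

# Hypothetical sample of data analyst salaries in dollars (made-up numbers).
salaries = [72000, 68500, 75000, 71000, 69000, 74500, 73000, 70500, 76000, 72500]

# H0: the population mean salary is $70,000.
# HA: the population mean salary is above $70,000.
hypothesized_mean = 70000

# One-sided, one-sample t-test; alternative="greater" matches HA.
result = stats.ttest_1samp(salaries, popmean=hypothesized_mean, alternative="greater")

print(f"t statistic: {result.statistic:.3f}")
print(f"p-value: {result.pvalue:.4f}")
```

Note that the hypotheses are statements about the population mean, while the test itself only ever sees the sample.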

Learn from Other Sources

Here is an article on statistical significance. It’s called An Easy Introduction to Statistical Significance (With Examples).

The alternative hypothesis typically, although not always, assumes that the observed data did not occur by chance.

The p-value is the probability, assuming the null hypothesis is true, of observing results at least as extreme as the results you actually observed.
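As a rough illustration of that definition, the sketch below computes an exact two-sided p-value for a coin toss experiment. The numbers (60 heads in 100 tosses) are hypothetical, the function name is made up for this example, and "at least as extreme" is taken to mean "at least as far from the expected number of heads", which is one common convention for a fair coin.

```python
from math import comb

def two_sided_p_value(heads, tosses):
    """Exact p-value for H0: the coin is fair (probability of heads is 0.5)."""
    expected = tosses / 2
    observed_distance = abs(heads - expected)
    # Count the equally likely toss sequences whose number of heads is at
    # least as far from the expected count as the observed number of heads.
    extreme_sequences = sum(
        comb(tosses, k)
        for k in range(tosses + 1)
        if abs(k - expected) >= observed_distance
    )
    return extreme_sequences / 2 ** tosses

p_value = two_sided_p_value(heads=60, tosses=100)
print(f"p-value: {p_value:.4f}")
```

A small p-value says the observed result would be unusual if the coin were fair. It does not give the probability that the coin is fair.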

Significance level is the threshold at which a result is considered statistically significant. It is also the probability of rejecting a null hypothesis when it is true.

Alpha (α) is another name for the significance level: the probability of rejecting the null hypothesis when it is true. Typical values for alpha are 0.01, 0.05, and 0.1.
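One way to see what alpha means is to simulate many experiments in which the null hypothesis really is true and count how often it gets rejected anyway. The sketch below does this with a simple z-test built from the Python standard library. The population mean, standard deviation, sample size, and number of experiments are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean

random.seed(42)  # fixed seed so the run is repeatable

ALPHA = 0.05
TRUE_MEAN = 70000       # the null hypothesis is true in this simulation
KNOWN_SD = 8000         # assumed known population standard deviation
SAMPLE_SIZE = 30
NUM_EXPERIMENTS = 5000

def z_test_p_value(sample, null_mean, sd):
    """Two-sided p-value of a z-test with a known standard deviation."""
    z = (mean(sample) - null_mean) / (sd / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))

rejections = 0
for _ in range(NUM_EXPERIMENTS):
    # Draw a sample from a population where H0 holds exactly.
    sample = [random.gauss(TRUE_MEAN, KNOWN_SD) for _ in range(SAMPLE_SIZE)]
    if z_test_p_value(sample, TRUE_MEAN, KNOWN_SD) < ALPHA:
        rejections += 1

type_one_error_rate = rejections / NUM_EXPERIMENTS
print(f"Share of true null hypotheses rejected: {type_one_error_rate:.3f}")
```

The share of rejections should land close to alpha, which is why choosing a smaller alpha makes a false rejection less likely.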

Summary

  • The null hypothesis is assumed to be true unless there is convincing evidence to the contrary. It’s like innocent until proven guilty.
  • The probability of rejecting the null hypothesis when it’s true is the significance level. The significance level is the threshold at which you will consider results statistically significant.
  • If you fail to reject the null hypothesis in your test, it means that the p-value is greater than the significance level.
  • If your p-value is greater than your significance level, do not reject the null hypothesis.
  • A Type I error means that you have rejected the null hypothesis when it is actually true and you’ve concluded that the result is statistically significant when in fact it occurred by chance.
  • A two-sample t-test compares two population means. In Python, you can run one with scipy.stats.ttest_ind().
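The points in the summary can be sketched in a few lines of Python. The example below runs a two-sample t-test with scipy.stats.ttest_ind() and then applies the decision rule. Both salary samples are made up, and the use of Welch's version of the test (equal_var=False) is an assumption made for this sketch.

```python
from scipy import stats

# Hypothetical salary samples in dollars for two groups (made-up numbers).
group_a = [70000, 72500, 68000, 74000, 71500, 69500, 73000, 70500]
group_b = [66000, 69000, 64500, 70000, 67500, 65000, 68500, 66500]

alpha = 0.05  # significance level, chosen before looking at the data

# H0: the two population means are equal. HA: they differ.
# equal_var=False requests Welch's t-test, which does not assume equal variances.
result = stats.ttest_ind(group_a, group_b, equal_var=False)

# Decision rule: reject H0 only when the p-value is below the significance level.
if result.pvalue < alpha:
    decision = "reject the null hypothesis"
else:
    decision = "fail to reject the null hypothesis"

print(f"t statistic: {result.statistic:.3f}, p-value: {result.pvalue:.4f}")
print(f"Decision at alpha = {alpha}: {decision}")
```

With real data, a rejection here could still be a Type I error, so the significance level should reflect how costly a false alarm would be.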