Type I and Type II Errors | Smoke Detector and the Boy Who Cried Wolf

A/B testing is an essential component of large-scale online services today. It is so essential that every online business worth mentioning is already doing it. A/B testing is also used in email marketing by all major online retailers. The Obama for America data science team received a lot of press coverage for leveraging data science, especially A/B testing, during the presidential campaign.

Here is an interesting article on this topic: http://kylerush.net/blog/optimization-at-the-obama-campaign-ab-testing/

If you have been involved in anything related to A/B testing (online experimentation) on UI, relevance, or email marketing, chances are that you have heard of Type I and Type II errors. These terms are used very often, but a good understanding of them is not as common.

I have seen illustrations as simple as this. [Source: http://allpsych.com/]

I intend to share two great examples I recently read that will help you remember this very important concept in hypothesis testing.

First, the smoke detector. TYPE I ERROR: An alarm without a fire. TYPE II ERROR: A fire without an alarm.

Every cook knows how to avoid a Type I error: just remove the batteries. Unfortunately, this increases the incidence of Type II errors. 🙂

Reducing the chances of Type II error would mean making the alarm hypersensitive, which in turn would increase the chances of Type I error.
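
To make the trade-off concrete, here is a minimal sketch of a smoke detector that sounds whenever its reading goes above a threshold. The sensor model is purely an assumption for illustration: readings are normally distributed, centred at 0 when there is no fire and at 2 when there is one. The name `error_rates` and all of the numbers are made up.

```python
from statistics import NormalDist

# Hypothetical sensor model (made-up numbers): smoke readings are
# normally distributed, centred at 0 with no fire and at 2 with a fire.
NO_FIRE = NormalDist(mu=0.0, sigma=1.0)
FIRE = NormalDist(mu=2.0, sigma=1.0)

def error_rates(threshold):
    """Return (type_i, type_ii) for an alarm that sounds above `threshold`."""
    type_i = 1.0 - NO_FIRE.cdf(threshold)  # alarm without a fire
    type_ii = FIRE.cdf(threshold)          # fire without an alarm
    return type_i, type_ii

# A lower threshold means a more sensitive alarm.
for threshold in (0.0, 1.0, 2.0, 3.0):
    type_i, type_ii = error_rates(threshold)
    print(f"threshold={threshold:.1f}  Type I={type_i:.3f}  Type II={type_ii:.3f}")
```

Under these assumptions, raising the threshold pushes the Type I rate down and the Type II rate up, and lowering it does the opposite. Removing the batteries is the extreme case of a threshold that no reading can ever reach.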

Another way to remember this is by recalling the story of the Boy Who Cried Wolf.

Null Hypothesis: There is no wolf.
Alternate Hypothesis: There is a wolf.

Villagers believing the boy when there was no wolf (Rejecting a true null hypothesis): Type I Error
Villagers not believing the boy when there actually was a wolf (Failing to reject a false null hypothesis): Type II Error
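
If it helps, the same mapping can be written as a tiny lookup. This is only a sketch of the story above; the function name `classify_outcome` and its two boolean parameters are hypothetical.

```python
def classify_outcome(wolf_present: bool, villagers_believe: bool) -> str:
    """Label the villagers' decision, with 'there is no wolf' as the null hypothesis."""
    if villagers_believe and not wolf_present:
        # Null hypothesis was true, but the villagers rejected it.
        return "Type I error"
    if wolf_present and not villagers_believe:
        # Null hypothesis was false, but the villagers did not reject it.
        return "Type II error"
    # The villagers' belief matched reality.
    return "correct decision"

print(classify_outcome(wolf_present=False, villagers_believe=True))
print(classify_outcome(wolf_present=True, villagers_believe=False))
```

The two remaining combinations (believing the boy when there is a wolf, and ignoring him when there is not) are the cases where the villagers get it right.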

The purpose of the post is not to explain type I and type II errors. If this is the first time you are hearing about these terms, here is the Wikipedia entry: Type I and Type II Error.