A statistical hypothesis is a statement or claim about some unrealized true state of nature. It helps determine whether something is likely due to chance or to some other factor of interest.
Hypothesis testing is a powerful statistical technique that allows you to determine whether a certain factor or variable is statistically significant or not.
Hypothesis testing can help us make decisions. Some examples might be:
- Average gas mileage differs depending on whether the gasoline is purchased at Shell or BP.
- The variability of machined thickness is dependent on the type of tool used.
- Product quality is independent of the raw material supplier.
- Whether the solution to our Six Sigma project actually made a difference, or whether the change we observe is due only to chance.
The actual hypothesis tested consists of two statements about the true state of nature. H₀ is the null hypothesis and is the statement of a zero or null difference that is to be tested. The other statement is the alternative hypothesis, H₁, and is the statement that must be true if the null hypothesis is false.
Other standard terms used in hypothesis testing are:
- Type I error: the mistake of rejecting the null hypothesis when it is true.
- Type II error: the mistake of failing to reject the null hypothesis when it is false.
- p-value: the probability, calculated from the data itself, of obtaining a result at least as extreme as the one observed if the null hypothesis were true; it is compared against the significance level to decide whether to reject H₀.
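To make the p-value concrete, here is a minimal sketch (Python, standard library only) of a two-sided z-test of a sample mean against a hypothesized mean, assuming the population standard deviation is known. The gas-mileage data and the hypothesized values are illustrative, not from the text.

```python
import math

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for H0: population mean == mu0,
    assuming a known population standard deviation sigma."""
    n = len(sample)
    xbar = sum(sample) / n
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    # Standard normal CDF evaluated via the error function.
    phi = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    return 2 * (1 - phi)

# Illustrative data: 10 gas-mileage readings, H0: mean = 30 mpg, sigma = 2.
sample = [31.2, 29.8, 32.1, 30.5, 31.7, 30.9, 32.4, 31.1, 30.2, 31.6]
p = z_test_p_value(sample, mu0=30.0, sigma=2.0)
print(f"p-value = {p:.4f}")  # reject H0 at alpha = 0.05 only if p < 0.05
```

A small p-value says the observed data would be unlikely if H₀ were true; if p falls below the chosen α, we reject H₀ knowing our Type I error risk is at most α.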
A common analogy of hypothesis testing can be taken from our legal system where an accused on trial is presupposed to be innocent unless the prosecution presents overwhelming evidence to convict him. In this example, the hypotheses to be tested are stated as:
- H₀: Defendant is Innocent
- H₁: Defendant is guilty
Regardless of the jury’s conclusion, they are never really sure of the true state of nature. Concluding “H₀: Defendant is Innocent” does not mean that the defendant is in fact innocent. An H₀ conclusion simply means that the evidence was not overwhelming enough to justify a conviction. On the other hand, concluding H₁ does not prove guilt; rather, it implies that the evidence is so overwhelming that the jury can have a high level of confidence in a guilty verdict.
Since verdicts are concluded with less than 100% certainty, either conclusion has some probability of error. The probability of committing a Type I error is defined as alpha, α, and the probability of committing a Type II error is β.
In a courtroom, α, the probability of convicting an innocent person, is of critical concern. To minimize the risk of such an erroneous conclusion, our courts require overwhelming evidence to conclude H₁. Although minimizing α has its advantages, it should be obvious that requiring overwhelming evidence to conclude H₁ will in turn increase β, the probability of a Type II error. To resolve this dilemma, hypothesis tests are designed such that:
- The most critical decision error is a Type I error.
- α is set at a minimum level, usually .05 or .01.
- Based on the above, the claim to be demonstrated with at least (1 − α)·100% confidence is placed in H₁.
- The nature of most statistical hypothesis tests requires that the equality condition be placed in H₀.
- To minimize β while holding α constant requires increased sample sizes.
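The interplay between α, β, and sample size described above can be sketched with a small Monte Carlo simulation (Python, standard library only; the effect size, sample sizes, and trial count are illustrative assumptions). With the critical value fixed at the α = 0.05 level, the rejection rate under H₀ stays near 5%, while the power (1 − β) under H₁ grows as the sample size increases.

```python
import math
import random

def rejection_rate(true_mean, n, trials=4000, mu0=0.0, sigma=1.0, crit_z=1.96):
    """Fraction of two-sided z-tests rejecting H0: mean == mu0 at alpha = 0.05."""
    rejections = 0
    for _ in range(trials):
        sample = [random.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
        if abs(z) > crit_z:  # |z| beyond the 0.05-level critical value
            rejections += 1
    return rejections / trials

random.seed(42)
type1 = rejection_rate(0.0, 20)       # H0 true: rate of Type I errors
power_n20 = rejection_rate(0.5, 20)   # H1 true (0.5-sigma shift), n = 20
power_n50 = rejection_rate(0.5, 50)   # same shift, larger sample
print("Type I error rate (H0 true, n=20):", type1)
print("Power (0.5-sigma shift, n=20):   ", power_n20)
print("Power (0.5-sigma shift, n=50):   ", power_n50)
```

Note that α stays near 0.05 regardless of n, while β shrinks (power rises) as n grows, which is exactly the trade-off the list above describes.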
Let’s take another example and look at a situation where we want to reduce the variation in a process. Initial data are taken at the outset of the project. We analyze the process, standardize it, control some important factors, implement our solutions, and then take another sample.
Test

| | |
|---|---|
| Null hypothesis | H₀: σ₁ / σ₂ = 1 |
| Alternative hypothesis | H₁: σ₁ / σ₂ ≠ 1 |
| Significance level | α = 0.05 |

| Method | Test Statistic | DF1 | DF2 | P-Value |
|--------|----------------|-----|-----|---------|
| Bonett | 13.17 | 1 | | 0.000 |
| Levene | 12.06 | 1 | 38 | 0.001 |
From the output above you can see that both methods, Bonett and Levene, give a p-value less than our significance level of α = 0.05. Therefore we can conclude with 95% confidence that there is a statistically significant difference between the variances of the two samples.
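The Levene statistic reported above (in its common median-based, Brown-Forsythe form) can be sketched in pure Python as follows. The before/after samples are illustrative stand-ins, sized at 20 each so the degrees of freedom (1 and 38) match the table, and the critical value F(0.05; 1, 38) ≈ 4.10 is hard-coded rather than computed from an F distribution.

```python
import statistics

def levene_w(*groups):
    """Median-based Levene (Brown-Forsythe) statistic for equal variances."""
    k = len(groups)
    # Absolute deviations of each observation from its group median.
    z = [[abs(x - statistics.median(g)) for x in g] for g in groups]
    n = [len(g) for g in groups]
    N = sum(n)
    zbar_j = [sum(zj) / nj for zj, nj in zip(z, n)]       # group means of z
    zbar = sum(sum(zj) for zj in z) / N                   # grand mean of z
    between = sum(nj * (zj - zbar) ** 2 for nj, zj in zip(n, zbar_j))
    within = sum((x - zj) ** 2 for zs, zj in zip(z, zbar_j) for x in zs)
    return ((N - k) / (k - 1)) * between / within

# Illustrative samples: wide spread before the improvement, tight after.
before = [9.8, 10.5, 11.2, 8.7, 12.1, 9.3, 10.9, 11.8, 8.4, 10.1,
          12.6, 9.0, 11.5, 8.1, 10.7, 12.3, 9.6, 11.0, 8.9, 10.4]
after  = [10.1, 10.3, 9.9, 10.2, 10.0, 10.4, 9.8, 10.1, 10.2, 9.9,
          10.3, 10.0, 10.1, 9.7, 10.2, 10.0, 10.3, 9.9, 10.1, 10.2]
w = levene_w(before, after)
print(f"Levene W = {w:.2f}")
# Reject H0 (equal variances) if W exceeds F(0.05; 1, 38), roughly 4.10.
print("Reject H0:", w > 4.10)
```

In practice a statistics package would return the p-value directly; this sketch only shows how the statistic itself compares spread between the two samples through deviations from the group medians.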