The goal of hypothesis testing is to evaluate an effect observed in sample data and determine how likely it is that such an effect would appear by chance if it did not exist in the population.
How do we show that an effect seen in a sample is statistically significant in the population? The short answer is by rejecting the null hypothesis (Ho), i.e. the assumption that the effect appears in the sample only by chance. The steps followed in hypothesis testing are:
- The first step is to quantify the size of the apparent effect by choosing a test statistic. Some common examples of test statistics are the difference in means, the difference in standard deviations, and the correlation.
- The second step is to define a null hypothesis, which is a model of the system based on the assumption that the apparent effect is not real. In other words, the null hypothesis (Ho) assumes that the observed effect does not exist in the population.
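- The third step is to compute a p-value, which is the probability of seeing an effect at least as large as the apparent one if the null hypothesis is true (i.e. there is no effect). To do this, model the distribution of the test statistic under the null hypothesis and compute the probability of a value at least as extreme as the one observed. Note that the p-value is not the probability that the null hypothesis itself is true.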
- The last step is to interpret the result. If the p-value is low, the effect is said to be statistically significant, which means that it is unlikely to have occurred by chance (the null hypothesis is rejected). In that case we infer that the effect is more likely to appear in the larger population. By convention, 5% is the threshold of statistical significance: if the p-value is less than 5%, the effect is considered significant; otherwise it is not.
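
The four steps above can be sketched as a permutation test. This is only a minimal illustration under stated assumptions, not the only way to model the null hypothesis: the data values, the function names (`diff_in_means`, `permutation_p_value`), the number of iterations and the seed are all made up for the example. The null model assumed here is that both groups come from the same distribution, so shuffling the group labels simulates "no effect".

```python
import random


def diff_in_means(group_a, group_b):
    """Step 1, test statistic: absolute difference between the group means."""
    mean_a = sum(group_a) / len(group_a)
    mean_b = sum(group_b) / len(group_b)
    return abs(mean_a - mean_b)


def permutation_p_value(group_a, group_b, test_stat=diff_in_means,
                        iters=10000, seed=0):
    """Steps 2 and 3: simulate the null hypothesis and estimate the p-value.

    Assumed null model: the group labels are arbitrary, so the pooled
    values can be shuffled and re-split at random.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    observed = test_stat(group_a, group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)

    count = 0
    for _ in range(iters):
        rng.shuffle(pooled)
        # Count simulated effects at least as large as the observed one.
        if test_stat(pooled[:n_a], pooled[n_a:]) >= observed:
            count += 1
    return count / iters


# Hypothetical measurements for two groups (illustrative values only).
group_a = [5.1, 4.9, 5.6, 5.8, 6.0, 5.4, 5.7, 5.9]
group_b = [4.2, 4.8, 4.5, 4.9, 4.4, 4.7, 4.3, 4.6]

p = permutation_p_value(group_a, group_b)

# Step 4: interpret the result against the conventional 5% threshold.
if p < 0.05:
    print(f"p-value = {p:.4f}: statistically significant at the 5% level")
else:
    print(f"p-value = {p:.4f}: not statistically significant at the 5% level")
```

Because the p-value is estimated by simulation, it varies slightly with the seed and the number of iterations; more iterations give a more stable estimate.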
NOTE: The p-value depends on the choice of test statistic and the model of the null hypothesis, and sometimes these choices determine whether an effect is statistically significant or not.
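
To make the note concrete, the sketch below computes a p-value for the same data with two different test statistics: a one-sided difference in means and a two-sided (absolute) difference in means. The data, the function names and the permutation-based null model are assumptions made for illustration. With the same shuffles, the one-sided p-value can never be larger than the two-sided one, so the two choices can fall on opposite sides of the 5% threshold.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def signed_diff(a, b):
    """One-sided statistic: how much larger the mean of a is than the mean of b."""
    return mean(a) - mean(b)


def abs_diff(a, b):
    """Two-sided statistic: size of the difference in means, in either direction."""
    return abs(mean(a) - mean(b))


def p_value(a, b, test_stat, iters=5000, seed=1):
    """Permutation p-value: share of label shuffles whose statistic is at
    least as large as the observed one (assumed null model: labels are arbitrary)."""
    rng = random.Random(seed)
    observed = test_stat(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(iters):
        rng.shuffle(pooled)
        if test_stat(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return hits / iters


# Hypothetical samples (illustrative values only).
sample_a = [12.1, 11.4, 13.0, 12.7, 11.9, 12.5]
sample_b = [11.8, 11.2, 12.0, 11.5, 12.3, 11.1]

p_one_sided = p_value(sample_a, sample_b, signed_diff)
p_two_sided = p_value(sample_a, sample_b, abs_diff)

print(f"one-sided p-value: {p_one_sided:.4f}")
print(f"two-sided p-value: {p_two_sided:.4f}")
```

Which statistic is appropriate depends on the question being asked: the one-sided version only makes sense if the direction of the effect was specified before looking at the data.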