Hypothesis Testing
Differences of Groups:
1. Chi Square
· compares observed frequencies to expected frequencies
ex: Is the association between sex and voting behaviour due to chance, or do the sexes differ in voting behaviour?
2. t-Test
· looks at differences between two groups on some variable of interest
· the independent variable (IV) must have only two groups (male/female, undergrad/grad)
ex: Do males and females differ in the number of hours they spend shopping in a given month?
3. ANOVA
· Tests the significance of group differences between two or more groups
· The IV has two or more categories
· Only determines that there is a difference between groups, but doesn’t tell you which groups differ
ex: Do SAT scores differ for low-, middle-, and high-income students?
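As a rough illustration of the t-test in item 2, the sketch below computes Student's pooled two-sample t statistic using only the Python standard library. The function name and the shopping-hours figures are invented for illustration, and the sketch assumes the two groups have equal variances.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t_statistic(a, b):
    """Student's two-sample t statistic, assuming equal group variances."""
    n_a, n_b = len(a), len(b)
    # Pooled variance: the two sample variances weighted by their degrees of freedom.
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    t = (mean(a) - mean(b)) / std_err
    return t, n_a + n_b - 2  # statistic and its degrees of freedom

# Hypothetical monthly shopping hours (made-up data, not from any study).
group_a = [4, 6, 5, 7, 8]
group_b = [9, 11, 10, 12, 8]
t_stat, df = pooled_t_statistic(group_a, group_b)
print(t_stat, df)
```

The statistic is then compared with a t distribution on the returned degrees of freedom. For 8 degrees of freedom the two-sided 5% critical value is about 2.306, so a statistic larger than that in absolute value would be called significant at the 5% level.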
ANOVA
An ANOVA test is a way to find out if survey or experiment results are significant. In other words, it helps you figure out whether to reject the null hypothesis in favour of the alternative hypothesis. Basically, you’re testing groups to see if there’s a difference between their means.
Types of Tests.
There are two main types: one-way and two-way. Two-way tests can be with or without replication.
· One-way ANOVA between groups: used when you want to test two or more groups to see if there’s a difference between them. With exactly two groups it gives the same conclusion as an equal-variance t-test.
· Two-way ANOVA without replication: used when you have one group and you’re double-testing that same group. For example, you’re testing one set of individuals before and after they take a medication to see if it works or not.
· Two-way ANOVA with replication: two groups, and the members of those groups are doing more than one thing. For example, two groups of patients from different hospitals trying two different therapies.
When to use a one-way ANOVA
Situation 1: You have a group of individuals randomly split into smaller groups and completing different tasks. For example, you might be studying the effects of tea on weight loss and form three groups: green tea, black tea, and no tea.
Situation 2: Similar to situation 1, but in this case the individuals are split into groups based on an attribute they possess. For example, you might be studying leg strength of people according to weight. You could split participants into weight categories (obese, overweight and normal) and measure their leg strength on a weight machine.
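A minimal sketch of the calculation behind a one-way ANOVA, using the three tea groups from Situation 1. The weight-loss numbers and the function name are made up for illustration; the code only shows how the F statistic is built from between-group and within-group variation.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return the F statistic and its (between, within) degrees of freedom."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    # Between-group sum of squares: how far each group mean is from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between, df_within = k - 1, n - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within

# Hypothetical weight loss in kg (invented data).
green_tea = [3, 4, 5]
black_tea = [2, 3, 4]
no_tea = [0, 1, 2]
f_stat, df_b, df_w = one_way_anova_f([green_tea, black_tea, no_tea])
print(f_stat, df_b, df_w)
```

A large F means the group means vary more than the scatter within groups would explain. For 2 and 6 degrees of freedom the 5% critical value is about 5.14.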
Limitations of the One-Way ANOVA
A one-way ANOVA will tell you that at least two groups were different from each other. But it won’t tell you which groups were different. If your test returns a significant F-statistic, you may need to run a post hoc test (like the Least Significant Difference test) to tell you exactly which groups had a difference in means.
Two Way ANOVA
A two-way ANOVA is an extension of the one-way ANOVA. With a one-way ANOVA, you have one independent variable affecting a dependent variable. With a two-way ANOVA, there are two independent variables. Use a two-way ANOVA when you have one measurement variable (i.e. a quantitative variable) and two nominal variables. In other words, if your experiment has a quantitative outcome and you have two categorical explanatory variables, a two-way ANOVA is appropriate.
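One common way to write the model behind a two-way ANOVA with replication is shown below. The symbols are not from these notes and are assumed here for illustration: y is the quantitative outcome, mu the overall mean, alpha_i the effect of level i of the first nominal variable, beta_j the effect of level j of the second, (alpha beta)_ij their interaction, and epsilon the random error for observation k in that cell.

```latex
y_{ijk} = \mu + \alpha_i + \beta_j + (\alpha\beta)_{ij} + \varepsilon_{ijk}
```

Without replication there is only one observation per combination of levels, so the interaction term cannot be separated from the error term.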
Chi-Square
· A chi-square goodness of fit test determines if sample data matches a population.
· A chi-square test for independence compares two variables in a contingency table to see if they are related. In a more general sense, it tests to see whether distributions of categorical variables differ from one another.
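1. A very small chi square test statistic means that your observed data fits your expected data extremely well. In a test for independence the expected data are calculated assuming no relationship, so a small statistic gives no evidence that the variables are related.
2. A very large chi square test statistic means that the data does not fit the expected data very well. In a test for independence, this is evidence that the variables are related.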
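The formula for the chi-square statistic used in the chi square test is:
\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}
where O_i is the observed frequency and E_i the expected frequency in category (or table cell) i.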
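As a sketch of how the statistic is computed for a test of independence, the code below sums (observed - expected)^2 / expected over every cell of a contingency table, where each expected count is row total times column total divided by the grand total. The sex-by-voting counts and the function names are hypothetical, no continuity correction is applied, and the P value helper is valid only for 1 degree of freedom (a 2 x 2 table).

```python
from math import erfc, sqrt

def chi_square_independence(table):
    """Chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

def p_value_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return erfc(sqrt(stat / 2))

# Hypothetical counts: rows are the two sexes, columns are two voting choices.
votes = [[30, 20],
         [20, 30]]
stat, df = chi_square_independence(votes)
p = p_value_df1(stat)
print(stat, df, p)
```

Here every expected count is 25, so each cell contributes 1 to the statistic. The resulting P value falls just under 0.05, which would count as evidence of a relationship at the 5% level.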
P Values
The P value, or calculated probability, is the probability of finding the observed, or more extreme, results when the null hypothesis (H0) of a study question is true. What counts as ‘extreme’ depends on how the hypothesis is being tested (one-sided or two-sided). P is sometimes described as the probability of rejecting H0 when it is actually true, but it is not a direct probability of that event: it is calculated assuming H0 is true, so it says nothing directly about how likely H0 itself is.
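To make "the observed, or more extreme, results" concrete, here is a small sketch of an exact one-sided P value for a coin-flip experiment. The scenario (15 heads in 20 flips, H0: the coin is fair) and the function name are invented for illustration.

```python
from math import comb

def one_sided_binomial_p(successes, trials, p_null=0.5):
    """P(X >= successes) when X ~ Binomial(trials, p_null), i.e. under H0."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical experiment: 15 heads observed in 20 flips of a supposedly fair coin.
p = one_sided_binomial_p(15, 20)
print(round(p, 4))
```

The sum covers 15, 16, ..., 20 heads: the observed result plus everything more extreme in the direction being tested. A two-sided test would also count the equally extreme results in the other tail (5 or fewer heads).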
Null hypothesis
The null hypothesis is usually a hypothesis of "no difference", e.g. no difference between blood pressures in group A and group B. Define a null hypothesis for each study question clearly before the start of your study.
Alternative hypothesis
If your P value is less than the chosen significance level then you reject the null hypothesis i.e. accept that your sample gives reasonable evidence to support the alternative hypothesis. It does NOT imply a "meaningful" or "important" difference; that is for you to decide when considering the real-world relevance of your result.
Note: The choice of significance level at which you reject H0 is arbitrary. Conventionally the 5%, 1% and 0.1% levels (P < 0.05, 0.01 and 0.001) have been used. A 5% level means that, when H0 is true, there is a 1 in 20 chance of rejecting it wrongly; it does not mean there is a 1 in 20 chance that your conclusion is wrong. These numbers can give a false sense of security. Type I error is the false rejection of a true null hypothesis and Type II error is the failure to reject a false null hypothesis.
Notes about Type I error:
· is the incorrect rejection of the null hypothesis
· maximum probability is set in advance as alpha
· is not affected by sample size as it is set in advance
· increases with the number of tests or end points (i.e. run 20 tests in which H0 is true and about 1 is expected to be wrongly significant for alpha = 0.05)
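The last point can be checked with a little arithmetic. The sketch below assumes the tests are independent and that H0 is true in every one of them; the function names are made up for illustration.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one Type I error across independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests

def expected_false_positives(alpha, n_tests):
    """Average number of wrongly significant results when every H0 is true."""
    return alpha * n_tests

fwer = familywise_error_rate(0.05, 20)
expected = expected_false_positives(0.05, 20)
# Bonferroni-style adjustment: test each hypothesis at alpha / n_tests instead.
fwer_adjusted = familywise_error_rate(0.05 / 20, 20)
print(round(fwer, 3), expected, round(fwer_adjusted, 3))
```

With 20 tests at alpha = 0.05, about one false positive is expected and the chance of at least one is roughly 64%. Dividing alpha by the number of tests, as in the adjusted line, brings that chance back to just under 5%.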
Notes about Type II error:
· is the failure to reject a null hypothesis that is actually false (often called incorrect acceptance of the null hypothesis)
· probability is beta
· beta depends upon sample size and alpha
· can't be estimated except as a function of the true population effect
· beta gets smaller as the sample size gets larger
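· without a correction for multiple testing, the chance of missing every real effect gets smaller as the number of tests or end points increases (at the cost of more Type I errors)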
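The dependence of beta on sample size, alpha and the true effect can be sketched for the simplest case: a one-sided, one-sample z-test with known standard deviation. That choice of test, the function name and the numbers are assumptions made for illustration, not part of the notes above.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error(effect, sigma, n, alpha=0.05):
    """Beta for a one-sided one-sample z-test with known sigma.

    effect is the assumed true difference from the null value.
    """
    z = NormalDist()
    critical = z.inv_cdf(1 - alpha)  # cutoff for the z statistic under H0
    # Under the alternative the z statistic is centred at effect * sqrt(n) / sigma.
    return z.cdf(critical - effect * sqrt(n) / sigma)

# Hypothetical true effect of half a standard deviation.
betas = [type_ii_error(effect=0.5, sigma=1.0, n=n) for n in (10, 20, 40)]
beta_strict = type_ii_error(effect=0.5, sigma=1.0, n=20, alpha=0.01)
print([round(b, 3) for b in betas], round(beta_strict, 3))
```

Beta falls as n grows, and it rises when alpha is made stricter. The function needs the true effect as an input, which is why beta can only be stated as a function of the true population effect.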
Note: Most authors refer to statistically significant as P < 0.05 and statistically highly significant as P < 0.001 (results this extreme would occur less than once in a thousand times if H0 were true).