# Analysis of Variance (ANOVA) Model
## Usual framing
Say we have $K$ groups, and we'd like to check whether any of the group means is significantly different from the others. The null hypothesis is that all the group means, $\mu_1, ..., \mu_K$, are the same:
$
H_0: \mu_1 = \mu_2 = ... = \mu_K
$
While the alternative is:
$
H_1: \text{At least one of the means is different}
$
Notice that the alternative hypothesis doesn't state *which* group mean is different, just that at least one of them is: the complement of "all means are equal" is "at least one differs". Note that we also have to assume that the *variance of each group* is the same (homoskedasticity).
To identify *which* group mean is different, multiple pairwise hypothesis tests need to be done. This brings about a [[Multiple Testing Problem|multiplicity problem]].
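As a rough sketch of those follow-up comparisons (the groups, effect size, and significance level below are made up for illustration), pairwise t-tests with a simple Bonferroni correction might look like this:

```python
# A minimal sketch of pairwise follow-up tests and the multiplicity problem.
# Group data and alpha are illustrative assumptions, not from the source.
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
groups = {"A": rng.normal(0.0, 1.0, 20),
          "B": rng.normal(0.0, 1.0, 20),
          "C": rng.normal(0.8, 1.0, 20)}

alpha = 0.05
pairs = list(combinations(groups, 2))   # K groups -> K*(K-1)/2 pairwise tests
alpha_adj = alpha / len(pairs)          # Bonferroni: split alpha across the tests

for a, b in pairs:
    t, p = stats.ttest_ind(groups[a], groups[b])
    print(f"{a} vs {b}: p = {p:.4f}, reject at adjusted alpha? {p < alpha_adj}")
```

The Bonferroni adjustment is just one way to control the family-wise error rate across the pairwise tests; see the multiple-testing note linked above.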
The test statistic used to test this hypothesis is the F-statistic, which is the ratio of the *variance of the group means* (between-group variability) to the *variance within a group* (within-group variability).
$
F = \frac{\text{Var}(\mu_i)}{\sigma^2}
$
If the null hypothesis is true, there is no variance among the population group means (i.e. $\text{Var}(\mu_i) = 0$), so any spread in the observed group means is just noise and the F-statistic will be small (around 1). Conversely, if the group means really differ, the variance among them will be large relative to the within-group variance, giving a large F-statistic and supporting the alternative hypothesis.
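As a concrete sketch (the three simulated groups below are illustrative assumptions), the F-statistic can be computed by hand as the between-group mean square over the within-group mean square and checked against `scipy.stats.f_oneway`:

```python
# A minimal sketch of the one-way ANOVA F-statistic on simulated data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
groups = [rng.normal(loc=mu, scale=1.0, size=20) for mu in (0.0, 0.0, 0.5)]

k = len(groups)                        # number of groups K
n_total = sum(len(g) for g in groups)  # total sample size N
grand_mean = np.mean(np.concatenate(groups))

# Between-group variability: spread of the group means around the grand mean.
ss_between = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
ms_between = ss_between / (k - 1)

# Within-group variability: pooled spread of observations around their group mean.
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ms_within = ss_within / (n_total - k)

F = ms_between / ms_within
p = stats.f.sf(F, k - 1, n_total - k)  # upper tail of the F(K-1, N-K) distribution
print(F, p)

# Should agree with scipy's built-in one-way ANOVA.
print(stats.f_oneway(*groups))
```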
The F-test is an instance of the [[Likelihood ratio test|likelihood ratio test]]: with normal likelihoods (and a common variance), the likelihood ratio statistic is a monotone function of the F-statistic, whose null distribution is well known.
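As a sketch of that connection (a standard result, stated here without derivation): maximizing the normal likelihood under each hypothesis reduces to comparing residual sums of squares, and the likelihood ratio is a monotone function of
$
F = \frac{(\text{RSS}_0 - \text{RSS}_1)/(K-1)}{\text{RSS}_1/(N - K)},
$
where $\text{RSS}_0$ is the residual sum of squares under the common-mean model, $\text{RSS}_1$ is the residual sum of squares under the separate-means model, and $N$ is the total sample size. Under $H_0$ this statistic follows an $F_{K-1,\,N-K}$ distribution.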
## Linear regression framing
With [[Simple Linear Regression|simple linear regression]] and [[Multiple linear regression|multiple linear regression]], remember that the non-intercept coefficients are interpreted in terms of *changes to the mean*. Only the intercept is interpreted as a mean.
If we have a simple linear regression with one binary predictor indicating treatment vs. placebo, then the two group means are:
$
\begin{align}
&E(Y\mid X = 0 ) = \beta_0 \\
&E(Y\mid X = 1 ) = \beta_0 +\beta_1 \\
\end{align}
$
The two group means are equal if and only if $\beta_1 = 0$, so testing their equality is the same as testing $\beta_1 = 0$. We can also perform ANOVA by comparing two nested models (one model's coefficients are all contained in the other's): here, the intercept-only model versus the model that includes the treatment indicator.
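A minimal sketch of that nested-model comparison (the sample size, coefficient values, and variable names below are made up for illustration), fitting both models by least squares and forming the extra-sum-of-squares F-test:

```python
# A minimal sketch of ANOVA as a nested-model comparison with a binary predictor.
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n = 40
x = np.repeat([0, 1], n // 2)                 # placebo = 0, treatment = 1
y = 1.0 + 0.6 * x + rng.normal(0.0, 1.0, n)   # simulated with beta_0 = 1.0, beta_1 = 0.6

# Reduced model: intercept only (both group means equal).
rss_reduced = ((y - y.mean()) ** 2).sum()

# Full model: intercept + group indicator, fitted by least squares.
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
rss_full = ((y - X @ beta) ** 2).sum()

# Extra-sum-of-squares F-test: one extra parameter, n - 2 residual df.
F = (rss_reduced - rss_full) / 1 / (rss_full / (n - 2))
p = stats.f.sf(F, 1, n - 2)
print(F, p)

# Should match scipy's one-way ANOVA on the two groups.
print(stats.f_oneway(y[x == 0], y[x == 1]))
```

With a single binary predictor this nested-model F-statistic equals the square of the usual t-statistic for $\beta_1$, and it matches the one-way ANOVA F for the two groups.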
---
# References
[[Applied Linear Regression#6. Testing and Analysis of Variance]]