# Maximum likelihood estimators for regression
Given the model structure and assumptions of [[Simple Linear Regression|simple linear regression]] or [[Multiple linear regression|multiple linear regression]], we can derive a distribution for the outcomes:
The errors are assumed to be normally distributed,
$
\varepsilon_i \sim N(0, \sigma^2)
$
and we assume the linear regression model:
$
Y_i = \beta_0 + \beta_1 x_i + \varepsilon_i
$
Then, by the [[Statistical properties of the Normal distribution|properties of the normal distribution]], each outcome is also normally distributed, conditioned on the value of its predictor:
$
Y_i \mid x_i \sim N(\beta_0 + \beta_1 x_i, \sigma^2)
$
Based on this distribution, we can derive the likelihood for an observed dataset $(y_1, x_1), \dots, (y_n, x_n)$. Writing the model in matrix form, with $\mathbf{x}_i' = (1, x_i)$ and $\beta = (\beta_0, \beta_1)'$ (this also covers the [[Multiple linear regression|multiple regression]] case), the likelihood is:
$
L(\beta, \sigma^2) = \prod^n_{i=1} \frac{1}{\sigma\sqrt{2\pi}} \exp \left\{ - \frac{1}{2\sigma^2} (y_i - \mathbf{x}_i'\beta)^2 \right\}
$
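In practice it is easier to work with the log-likelihood, which is maximized at the same point:
$
\ell(\beta, \sigma^2) = -\frac{n}{2} \log(2\pi) - \frac{n}{2} \log \sigma^2 - \frac{1}{2\sigma^2} \sum^n_{i=1} (y_i - \mathbf{x}_i'\beta)^2
$
Only the last term involves $\beta$, so maximizing over $\beta$ is equivalent to minimizing the residual sum of squares $\sum_i (y_i - \mathbf{x}_i'\beta)^2$; setting the derivative with respect to $\sigma^2$ to zero then yields the variance estimator.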
Maximizing with respect to $\beta$ and $\sigma^2$ gives their maximum likelihood estimators, where $X$ is the $n \times p$ design matrix whose rows are the $\mathbf{x}_i'$ and $y$ is the vector of outcomes:
$
\hat{\beta} = (X'X)^{-1}X'y
$
$
\hat{\sigma}^2 = \frac{1}{n} (y - X\hat{\beta})'(y - X\hat{\beta})
$
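As a sanity check, here is a minimal numpy sketch that computes both estimators on simulated data (the parameter values and variable names are illustrative, not from the source):

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate from the assumed model: y_i = beta0 + beta1 * x_i + eps_i
n, beta0, beta1, sigma = 200, 1.0, 2.0, 0.5
x = rng.uniform(0, 10, size=n)
y = beta0 + beta1 * x + rng.normal(0, sigma, size=n)

# Design matrix with an intercept column; rows are x_i'
X = np.column_stack([np.ones(n), x])

# MLE (= OLS) estimator: beta_hat = (X'X)^{-1} X'y,
# computed by solving the normal equations rather than inverting X'X
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# MLE of the error variance: residual sum of squares divided by n
resid = y - X @ beta_hat
sigma2_mle = resid @ resid / n

print(beta_hat)    # approximately [1.0, 2.0]
print(sigma2_mle)  # approximately 0.25 (sigma^2)
```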
Note that the MLE for $\beta$ coincides with the [[Ordinary least square (OLS) estimators|OLS estimators]], but the MLE for $\sigma^2$ is biased in finite samples, since $E[\hat{\sigma}^2] = \frac{n-p}{n}\sigma^2$ where $p$ is the number of estimated coefficients; it is asymptotically unbiased.
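Dividing by $n - p$ instead of $n$ removes this bias, giving the usual unbiased estimator of the error variance:
$
s^2 = \frac{1}{n - p} (y - X\hat{\beta})'(y - X\hat{\beta})
$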
---
# References
- [[Applied Linear Regression#4. Interpretation of Main Effects]]