# Maximum likelihood estimators for regression

Given the model structure and assumptions of [[Simple Linear Regression|simple linear regression]] or [[Multiple linear regression|multiple linear regression]], we can derive a distribution for the outcomes. Since the errors are assumed to be normally distributed,

$ \varepsilon_i \sim N(0, \sigma^2) $

and we assume the linear regression model

$ Y_i = \beta_0 + \beta_1 X_{i1} + \varepsilon_i $

then by the [[Statistical properties of the Normal distribution|properties of the normal distribution]], the outcomes are also normally distributed, conditional on the value of the predictor:

$ Y_i \mid X_{i1} \sim N(\beta_0 + \beta_1 X_{i1}, \sigma^2) $

Based on this distribution, we can derive the likelihood for an observed dataset $(y_1, \mathbf{x}_1), \dots, (y_n, \mathbf{x}_n)$, writing $\mathbf{x}_i'\beta$ for the linear predictor (which reduces to $\beta_0 + \beta_1 x_{i1}$ in the simple case):

$ L(\beta, \sigma^2) = \prod^n_{i=1} \frac{1}{\sigma\sqrt{2\pi}} \exp \left\{ - \frac{1}{2\sigma^2} (y_i - \mathbf{x}_i'\beta)^2 \right\} $

Taking logs gives

$ \ell(\beta, \sigma^2) = -\frac{n}{2} \log(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum^n_{i=1} (y_i - \mathbf{x}_i'\beta)^2 $

so maximizing over $\beta$ is equivalent to minimizing the residual sum of squares. Maximizing the log-likelihood with respect to both $\beta$ and $\sigma^2$ yields the maximum likelihood estimators:

$ \hat{\beta} = (X'X)^{-1}X'y $

$ \hat{\sigma}^2 = \frac{1}{n} (y - X\hat{\beta})'(y - X\hat{\beta}) $

Note that the MLE for $\beta$ coincides with the [[Ordinary least square (OLS) estimators|OLS estimators]], but the MLE for $\sigma^2$ is biased by a factor of $(n-p)/n$ in finite samples, though it is asymptotically unbiased.
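
To make the estimators concrete, here is a minimal NumPy sketch that simulates data from the model and evaluates both closed-form MLEs; the sample size, true parameter values, and seed are illustrative assumptions, not values from the note:

```python
import numpy as np

# Assumed setup for illustration: simulate data from
# Y = beta0 + beta1 * x + eps, with eps ~ N(0, sigma^2).
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 10, size=n)
beta_true = np.array([2.0, 0.5])   # (beta0, beta1), chosen arbitrarily
sigma_true = 1.5                   # chosen arbitrarily

X = np.column_stack([np.ones(n), x])        # design matrix with intercept
y = X @ beta_true + rng.normal(0, sigma_true, size=n)

# MLE for beta: (X'X)^{-1} X'y  (identical to the OLS estimator)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# MLE for sigma^2: RSS / n  (note the divisor n, not n - p)
resid = y - X @ beta_hat
sigma2_mle = resid @ resid / n

# The unbiased variant divides by n - p, so
# sigma2_mle = ((n - p) / n) * sigma2_unbiased.
p = X.shape[1]
sigma2_unbiased = resid @ resid / (n - p)

print(beta_hat, sigma2_mle, sigma2_unbiased)
```

Dividing the residual sum of squares by $n - p$ instead of $n$ recovers the usual unbiased variance estimator, which makes the $(n-p)/n$ bias factor visible directly in the code.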