# Simple Linear Regression

A simple linear regression is a statistical model that approximates the relationship between two variables $X$ and $Y$ through a linear relationship:

$$
Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i
$$

The variables $X$, $Y$ and $\varepsilon$ go by several names:

- $Y$ can be referred to as the *outcome* or *dependent variable*
- $X$ can be called the *covariate*, *predictor* or *independent variable*
- $\varepsilon$ are the *errors*, which represent the gap between the observed outcome and the value implied by the model

The model structure itself is an assumption (a modeling assumption). It should be viewed as an approximation rather than the true data-generating process.

The model parameters $(\beta_0, \beta_1, \sigma^2)$ can be estimated in several ways:

- Most famously, there are the [[Ordinary least square (OLS) estimators|ordinary least squares estimators]].
- [[Maximum likelihood estimators for regression|Maximum likelihood estimators]] may be used, but the MLE for the variance is biased.

## Assumptions

The assumptions of linear regression lie in the errors, not in the covariate and outcome. Typically, the errors are assumed to have zero mean, constant variance, and to form a simple random sample. More on these assumptions and their consequences can be found in the [[Statistical properties of the OLS estimators|statistical properties of the OLS estimators]]. In short, they are the following:

$$
\begin{align}
E(\varepsilon \mid X = x) &= 0 \\
\text{Var}(\varepsilon \mid X = x) &= \sigma^2
\end{align}
$$

## Interpreting the parameters

To interpret the model coefficients, we need to isolate each one in the model.

### Intercept

For the intercept, take the expectation of the model and substitute 0 for the predictor:

$$
E(Y_i \mid X = 0) = \beta_0
$$

Therefore, we interpret the intercept as *the average value of the outcome when the covariate equals 0*. In more concrete terms, we often refer to $X = 0$ as the *baseline*, since the other parameter is interpreted with respect to it.

In the case of a binary predictor:

>[!example]
>If $X = 1$ means someone is on treatment A and $X = 0$ means someone is on placebo, then $\beta_0$ represents the average outcome of someone in the placebo group.

In the case of a continuous predictor:

>[!example]
>If $X$ represents the number of hours spent exercising, then $\beta_0$ represents the average outcome of someone who exercises zero hours.

### Non-intercept

For the non-intercept, we work with two equations: one where the predictor equals 0 and another where it equals 1.

$$
\begin{align}
E(Y \mid X = 0) &= \beta_0 \\
E(Y \mid X = 1) &= \beta_0 + \beta_1
\end{align}
$$

Taking the difference of these two equations isolates $\beta_1$:

$$
E(Y \mid X = 1) - E(Y \mid X = 0) = \beta_1
$$

Therefore, we interpret the non-intercept as *the average change in the outcome* for a unit increase (+1) in the covariate. Depending on the type of covariate, this "unit increase" can have different interpretations.

In the case of a binary predictor:

>[!example]
>If $X = 1$ means someone is on treatment A and $X = 0$ means someone is on placebo, then $\beta_1$ represents the average change in the outcome associated with being on treatment A, relative to the placebo group.

In the case of a continuous predictor:

>[!example]
>If $X$ represents the number of hours spent exercising, then $\beta_1$ represents the average change in the outcome associated with one extra hour of exercise.
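A minimal sketch of fitting the model and reading off these interpretations in R; the simulated data, coefficient values, and variable names (`hours`, `treat`) are illustrative assumptions, not from the source.

```r
# Simulate data from Y = beta_0 + beta_1 * X + error and fit with lm().
set.seed(1)
n     <- 200
hours <- runif(n, 0, 10)                      # continuous covariate
y     <- 2 + 0.5 * hours + rnorm(n, sd = 1)   # true beta_0 = 2, beta_1 = 0.5

fit <- lm(y ~ hours)
coef(fit)
# (Intercept): average outcome at hours = 0          (estimate of beta_0)
# hours:       average change in outcome per +1 hour (estimate of beta_1)

# Binary covariate: intercept = average outcome in the placebo group,
# slope = treatment-vs-placebo difference in average outcome.
treat <- rbinom(n, 1, 0.5)
y2    <- 2 + 1.5 * treat + rnorm(n, sd = 1)
coef(lm(y2 ~ treat))
```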
### Interpreting under log transforms

When the outcome is on the logarithmic scale, the non-intercept coefficients are interpreted as *multiplicative (percent) changes* in the outcome for a unit increase in the predictor:

$$
\frac{E(Y \mid X_j = x + 1, \mathbf{X})}{E(Y \mid X_j = x, \mathbf{X})} \approx \frac{C \cdot \exp(\beta_j (x + 1))}{C \cdot \exp(\beta_j x)} = \exp(\beta_j)
$$

For example, $\exp(\beta_j) = 1.05$ corresponds to roughly a 5% increase in the outcome per unit increase in $X_j$.

When *both* the outcome and the predictor are on the logarithmic scale, the non-intercept coefficients become *power changes* in the outcome for a $k$-fold increase in the predictor:

$$
\frac{E(Y \mid X_j = kx, \mathbf{X})}{E(Y \mid X_j = x, \mathbf{X})} \approx \frac{\exp(\beta_j \log(kx))}{\exp(\beta_j \log(x))} = k^{\beta_j}
$$

## Code implementation

- [[Linear regression in R]]
- A small R sketch of the log-scale interpretations above appears after the references below.

## Potential Problems

- [[High multicollinearity leads to high variance for the OLS estimates]]
- [[Unobserved covariates can drastically change estimated coefficients]]

---

# References

[[Applied Linear Regression#2. Simple Linear Regression]]
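As a quick numeric check of the log-scale interpretations above, a minimal sketch in R; the simulated data, coefficient values, and variable names are illustrative assumptions, not from the source.

```r
# Log outcome: exp(beta_j) is the multiplicative change in Y per +1 in X.
set.seed(2)
n <- 500
x <- runif(n, 1, 10)
y <- exp(0.5 + 0.1 * x + rnorm(n, sd = 0.2))     # log(Y) is linear in x, beta_j = 0.1

fit_log <- lm(log(y) ~ x)
exp(coef(fit_log)["x"])     # close to exp(0.1) ~ 1.105, i.e. ~10.5% increase per unit of x

# Log outcome and log predictor: a k-fold increase in x multiplies Y by about k^beta_j.
y2 <- exp(1 + 0.7 * log(x) + rnorm(n, sd = 0.2)) # log(Y) is linear in log(x), beta_j = 0.7
fit_loglog <- lm(log(y2) ~ log(x))
2^coef(fit_loglog)["log(x)"]                     # effect of doubling x (k = 2), close to 2^0.7 ~ 1.62
```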