# Confidence intervals for binomial parameter

There are several types of confidence intervals that we can derive for the parameter of a binomial distribution.

## Wald confidence interval

The Wald confidence interval is the usual one that most people are familiar with, where we add and subtract the estimated standard error multiplied by a quantile of the standard Normal distribution:

$ \left( \hat{\pi} - z_{\alpha/2} \cdot \sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}}, \ \hat{\pi} + z_{\alpha/2} \cdot \sqrt{\frac{\hat{\pi}(1-\hat{\pi})}{n}} \right) $

The Wald interval works best when the sample size is large, since the sampling distribution of $\hat{\pi}$ is then well approximated by a Normal distribution. Its coverage properties suffer when the sample size is small, in which case we might prefer a different interval with some sort of correction. For example:

## The Wilson (Score) confidence interval

The Wilson interval is based on the [[Hypothesis tests for binomial parameter#Score Test|score test for the binomial parameter]]. The score test statistic is given by:

$ T = \frac{S(\pi_0)}{\sqrt{\mathcal{I}(\pi_0)}} = \frac{\sqrt{n}(\hat{\pi} - \pi_0)}{\sqrt{\pi_0 (1 - \pi_0)}} $

so a confidence interval can be derived by inverting the test: set $|T| \le z_{\alpha/2}$ and solve for $\pi_0$. The resulting expression is too complicated to write out here, but the `BinomCI()` function from the `DescTools` package implements this interval (see the R sketch at the end of this note).

## Agresti-Coull interval

The Agresti-Coull interval adjusts the Wald interval by adding $z_{\alpha/2}^2 / 2$ pseudo-successes and pseudo-failures (roughly two of each for a 95% interval) to the observed counts before computing Wald-style limits. This adjustment improves the coverage probability of the interval in exchange for somewhat wider intervals. `BinomCI()` also implements this method.

## Notes

These intervals are for one-sample problems; two-sample hypothesis tests can be organized with [[Contingency tables|contingency tables]].

---

# References

- [[Categorical Data Analysis#Chapter 1 - Distributions and Inference for Categorical Data]]
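
---

# Appendix: computing these intervals in R

A minimal sketch of how the three intervals could be computed with `DescTools::BinomCI()`, assuming the package is installed; the counts `x` and `n` are made-up values for illustration, and `"wald"`, `"wilson"`, and `"agresti-coull"` are the method names `BinomCI()` is expected to accept.

```r
library(DescTools)

x <- 12   # hypothetical number of successes
n <- 40   # hypothetical number of trials

# Wald interval computed directly from the formula above
pi_hat <- x / n
z      <- qnorm(0.975)                     # z_{alpha/2} for a 95% interval
se     <- sqrt(pi_hat * (1 - pi_hat) / n)  # estimated standard error of pi_hat
c(lower = pi_hat - z * se, upper = pi_hat + z * se)

# The same interval, plus the Wilson and Agresti-Coull intervals, via BinomCI()
BinomCI(x, n, conf.level = 0.95, method = "wald")
BinomCI(x, n, conf.level = 0.95, method = "wilson")
BinomCI(x, n, conf.level = 0.95, method = "agresti-coull")
```

With small $n$, the Wilson and Agresti-Coull intervals are centred at $(x + z_{\alpha/2}^2/2)/(n + z_{\alpha/2}^2)$, which is shrunk toward $1/2$ relative to $\hat{\pi}$; comparing the three outputs is one way to see the correction at work.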