Regression

Confidence intervals

The confidence interval for a point estimate is the interval within which we have a particular degree of confidence that the true value resides. For example, the 95% confidence interval for the mean height in a population may be [1.78m, 1.85m].

A confidence interval for a sample mean can be calculated in this way:

  1. Let \alpha be the specified confidence level, e.g. \alpha = 0.95 for the 95% confidence level.
  2. Let F(x; n-1) be the CDF of Student's t distribution, parameterised by the number of degrees of freedom, which is the sample size (n) minus 1.
  3. Calculate t = F^{-1}((1 + \alpha)/2; n-1), i.e. the (1 + \alpha)/2 quantile of that distribution (the 0.975 quantile for \alpha = 0.95).
  4. Then the confidence interval for the point estimate is:

\bar{x} - t \frac{s}{\sqrt{n}} \leq \mu \leq \bar{x} + t \frac{s}{\sqrt{n}}

Where \bar{x} is the sample mean (the point estimate), \mu is the true population mean, s is the sample standard deviation and n is the sample size.
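
As an illustration (the sample data below is made up), a minimal NumPy/SciPy sketch of this calculation:

```python
import numpy as np
from scipy import stats

# Made-up sample of heights in metres (illustrative only).
sample = np.array([1.79, 1.83, 1.81, 1.85, 1.78, 1.82, 1.80, 1.84])

alpha = 0.95                  # confidence level
n = len(sample)
x_bar = sample.mean()         # sample mean
s = sample.std(ddof=1)        # sample standard deviation

# Critical value: the (1 + alpha) / 2 quantile of Student's t with n - 1 dof.
t = stats.t.ppf((1 + alpha) / 2, df=n - 1)

half_width = t * s / np.sqrt(n)
print(f"{alpha:.0%} CI for the mean: [{x_bar - half_width:.3f}, {x_bar + half_width:.3f}]")
```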

Isotonic regression

Fits a step-wise (piecewise-constant) monotonic function to the data. This is a useful way to avoid overfitting when there is a strong theoretical reason to believe that the function y = f(x) is monotonic, for example the relationship between the floor area of houses and their prices.
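
A minimal sketch using scikit-learn's IsotonicRegression on made-up floor-area and price data:

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

# Made-up data: floor area (m^2) vs price (thousands), noisy but increasing.
area = np.array([30, 45, 50, 60, 75, 90, 110, 130])
price = np.array([150, 160, 155, 200, 230, 225, 280, 310])

iso = IsotonicRegression(increasing=True)
iso.fit(area, price)

# Predictions respect the monotonicity constraint: they never decrease
# as the floor area increases.
print(iso.predict([40, 70, 120]))
```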

Linear regression

The simplest form of regression. Estimates a model with the equation:

\hat{y} = \beta_0 + \beta_1 x_1 + ... + \beta_n x_n

where the \beta_i are parameters to be estimated by the model and the x_i are the features.

The loss function is usually the squared error.
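
A brief sketch, assuming scikit-learn and made-up data, of fitting such a model with the squared-error objective:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data: two features, one target.
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([3.1, 2.9, 7.2, 6.8, 10.1])

model = LinearRegression().fit(X, y)
print(model.intercept_, model.coef_)   # beta_0 and [beta_1, beta_2]
print(model.predict([[6.0, 6.0]]))     # y-hat for a new point
```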

Normal equation

The equation that gives the optimal parameters for a linear regression.

Rewrite the regression equation in matrix form as \hat{y} = X \beta, where X is the design matrix with a leading column of ones so that \beta_0 acts as the intercept.

Then the formula for \beta which minimizes the squared error is:

\beta = (X^T X)^{-1} X^T y
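
A minimal NumPy sketch of the normal equation on made-up data, with a leading column of ones in X so that \beta_0 acts as the intercept:

```python
import numpy as np

# Made-up data: 5 observations, 2 features.
X_raw = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0]])
y = np.array([3.1, 2.9, 7.2, 6.8, 10.1])

# Prepend a column of ones so beta[0] acts as the intercept beta_0.
X = np.column_stack([np.ones(len(X_raw)), X_raw])

# beta = (X^T X)^{-1} X^T y.  Solving the linear system is preferred to
# explicitly inverting X^T X for numerical stability.
beta = np.linalg.solve(X.T @ X, X.T @ y)
print(beta)
```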

Logistic regression

Used for modelling probabilities. It uses the sigmoid function, \sigma(z) = \frac{1}{1 + e^{-z}}, to ensure the predicted values are between 0 and 1; values outside this range would not make sense when predicting a probability. The functional form is:

\hat{y} = \sigma(\beta_0 + \beta_1 x_1 + ... + \beta_n x_n)
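A minimal sketch, assuming scikit-learn and made-up data; predict_proba returns the modelled probabilities:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Made-up binary classification data: one feature, labels 0/1.
X = np.array([[0.5], [1.0], [1.5], [2.0], [2.5], [3.0], [3.5], [4.0]])
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

clf = LogisticRegression().fit(X, y)

# Predicted probabilities of class 1, squashed into (0, 1) by the sigmoid.
print(clf.predict_proba([[1.8], [3.2]])[:, 1])
```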

Multicollinearity

When one of the features is (exactly or approximately) a linear function of one or more of the others. This makes X^T X singular or ill-conditioned, so the coefficient estimates from the normal equation become undefined or unstable.
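
A small NumPy illustration with made-up data: when one column of the design matrix is an exact linear function of the others, X^T X is singular and the normal equation has no unique solution.

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=20)
x2 = rng.normal(size=20)
x3 = 2 * x1 + 3 * x2          # x3 is an exact linear function of x1 and x2

X = np.column_stack([np.ones(20), x1, x2, x3])

# Rank is 3 rather than 4, so X^T X is singular.
print(np.linalg.matrix_rank(X))
# A huge condition number is the typical symptom with near-collinearity.
print(np.linalg.cond(X.T @ X))
```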

P-values

Measure the statistical significance of the coefficients of a regression. The closer the p-value is to 0, the more statistically significant that result is.

The p-value is the probability of seeing an effect at least as extreme as the one observed if there is in fact no relationship, i.e. under the null hypothesis that the true coefficient is zero.

In a regression, the p-value of a coefficient \hat{\beta}_j is calculated from its t-statistic:

t_j = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}, \qquad p_j = 2 \left( 1 - F(|t_j|; n - k - 1) \right)

where SE(\hat{\beta}_j) = \sqrt{s^2 \left[ (X^T X)^{-1} \right]_{jj}} is the standard error of the coefficient, s^2 is the residual sum of squares divided by n - k - 1, F is the CDF of Student's t distribution, n is the number of observations and k is the number of features.
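
A NumPy/SciPy sketch of this calculation on made-up data, using the same design-matrix convention as the normal-equation example above:

```python
import numpy as np
from scipy import stats

# Made-up data: 20 observations, 2 features.
rng = np.random.default_rng(1)
X_raw = rng.normal(size=(20, 2))
y = 1.0 + 2.0 * X_raw[:, 0] + 0.1 * X_raw[:, 1] + rng.normal(scale=0.5, size=20)

n, k = X_raw.shape
X = np.column_stack([np.ones(n), X_raw])      # add intercept column

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ y

residuals = y - X @ beta
s2 = residuals @ residuals / (n - k - 1)      # residual variance
se = np.sqrt(s2 * np.diag(XtX_inv))           # standard errors of the coefficients

t_stats = beta / se
p_values = 2 * (1 - stats.t.cdf(np.abs(t_stats), df=n - k - 1))
print(p_values)
```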