Linear Regression

Linear regression is a method for modeling the relationship between a dependent variable (which may be a vector) and one or more explanatory variables by fitting linear equations to observed data. The case of one explanatory variable is called Simple Linear Regression. For several explanatory variables the method is called Multiple Linear Regression.

Details

Let (x₁,…,x_p) be a vector of input variables and y=(y₁,…,y_k) be the response. For each j=1,…,k, the linear regression model has the form [Hastie2009]:

y_j = β_0j + β_1j x₁ + ... + β_pj x_p

Here x_i, i=1,...,p, are referred to as independent variables, and y_j are referred to as dependent variables or responses.

Linear regression is called multiple when the number of input variables p > 1.

Training Stage

Let (x₁₁,...,x_1p,y₁),…,(x_n1,...,x_np,y_n) be a set of training data, n >> p. The n × p matrix X contains the observations x_ij, i=1,...,n, j=1,...,p, of the independent variables.

To estimate the coefficients (β_0j,...,β_pj), one of these methods can be used:

Normal Equation system
QR matrix decomposition
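
The two estimation methods listed above can be sketched with NumPy as follows. This is a minimal illustration, not the library's implementation: the function names fit_normal_equations and fit_qr, the sample data, and the choice to handle the intercept by prepending a column of ones to X are all assumptions made for the example. With A = [1 | X], the first function solves (AᵀA)B = AᵀY, and the second factors A = QR and solves RB = QᵀY.

```python
import numpy as np

def fit_normal_equations(X, Y):
    """Estimate coefficients by solving the normal equations (A^T A) B = A^T Y."""
    # Hypothetical convention: prepend a column of ones so row 0 of B is the intercept.
    A = np.column_stack([np.ones(X.shape[0]), X])
    return np.linalg.solve(A.T @ A, A.T @ Y)   # shape (p + 1, k)

def fit_qr(X, Y):
    """Estimate coefficients from the reduced QR factorization A = Q R."""
    A = np.column_stack([np.ones(X.shape[0]), X])
    Q, R = np.linalg.qr(A)                      # Q: n x (p + 1), R: (p + 1) x (p + 1)
    return np.linalg.solve(R, Q.T @ Y)          # solve R B = Q^T Y

# Illustrative data (made up): n = 6 observations, p = 2 inputs, k = 2 responses.
X = np.array([[0.0, 1.0], [1.0, 0.0], [1.0, 1.0],
              [2.0, 1.0], [3.0, 5.0], [4.0, 2.0]])
B_true = np.array([[1.0, -1.0],    # intercepts β_0j
                   [2.0,  0.5],    # coefficients of x₁
                   [3.0,  0.0]])   # coefficients of x₂
Y = np.column_stack([np.ones(len(X)), X]) @ B_true  # noise-free responses

B_ne = fit_normal_equations(X, Y)
B_qr = fit_qr(X, Y)
print(np.allclose(B_ne, B_true), np.allclose(B_qr, B_true))
```

Because the sample responses are generated without noise, both methods recover the same coefficients up to rounding. On ill-conditioned data the QR route is generally the more numerically stable of the two, since forming AᵀA squares the condition number of A.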

Prediction Stage

Linear-regression-based prediction is done for an input vector (x₁,…,x_p) using the equation y_j = β_0j + β_1j x₁ + ... + β_pj x_p for each j=1,…,k.
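
As a small sketch of the prediction step, the equation can be evaluated for all k responses at once. The function name predict, the (p + 1) × k coefficient layout with the intercepts in row 0, and the numeric values are assumptions chosen for illustration only.

```python
import numpy as np

def predict(X_new, B):
    """Evaluate y_j = β_0j + β_1j x₁ + ... + β_pj x_p for each row of X_new."""
    X_new = np.atleast_2d(X_new)      # accept a single input vector or a batch
    return B[0] + X_new @ B[1:]       # B[0]: intercepts, B[1:]: slopes

# Hypothetical coefficients for p = 2 inputs and k = 2 responses.
B = np.array([[1.0, -1.0],
              [2.0,  0.5],
              [3.0,  0.0]])

y = predict([2.0, 1.0], B)
# y₁ = 1 + 2*2 + 3*1 = 8 and y₂ = -1 + 0.5*2 + 0*1 = 0
print(y)
```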