Linear Models

"Linear model describes a quantitative response in terms of a linear combination of predictors. You can use a linear model to make predictions or explain the relationship between the response and the predictors. Linear models are very flexible and widely used in applications in physical science, engineering, social science and business." as stated by Julian J. Faraway in his book Linear Models with R.

Linear models are used to explain the relationship between a single variable Y (which represents a response or output) and one or more predictors or inputs X1, X2, ..., Xp (where p is the number of predictors).

A general form for the model is:

Y = f(X1, X2,..., Xp ) + ε

where f() represents an unknown function and ε the error of the model.

Even if we assume that f() is a smooth, continuous function, that still leaves a very wide range of possibilities. The data we can collect is usually not enough to estimate f() directly, so we have to use a more restrictive form: a linear one.

For example, if we are working with 4 predictors:

Y = f(X₁, X₂, X₃, X₄) + ε

we restrict the unknown function f() to a linear form, which gives the following linear model:

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + ε

Hence, with the linear model approach, the problem is reduced to estimating the parameters (β₀, β₁, β₂, β₃, β₄) instead of the infinite-dimensional f().

A model is linear if the parameters enter linearly in the equation, even if the predictors themselves are transformed. For instance, the following equations are linear:

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + ε

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃² + ε

Y = β₀ + β₁X₁ + β₂ log X₂ + ε

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₁X₂ + ε

While these equations are not:

Y = β₀ + β₁X₁ + β₂X₂^β₃ + ε

Y = β₀^(β₁X₁) + β₂X₂ + ε
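
To make the "linear in the parameters" idea concrete, here is a minimal Python sketch. It assumes a hypothetical model that combines two of the linear examples above, Y = β₀ + β₁X₁ + β₂ log X₂ + β₃X₃², and made-up parameter values. The predictors are transformed first, and the response is then just a weighted sum of the transformed values.

```python
import math

def design_row(x1, x2, x3):
    """Feature values for the hypothetical model
    Y = b0 + b1*X1 + b2*log(X2) + b3*X3**2 (the leading 1.0 multiplies b0)."""
    return [1.0, x1, math.log(x2), x3 ** 2]

def predict(betas, row):
    """The prediction is a plain weighted sum of the features."""
    return sum(b * z for b, z in zip(betas, row))

betas = [1.0, 2.0, 3.0, 0.5]       # made-up parameter values, for illustration only
row = design_row(2.0, 1.0, 4.0)    # log(1) = 0 and 4**2 = 16
y_hat = predict(betas, row)

# Linearity in the parameters: doubling every beta doubles the prediction.
y_hat_doubled = predict([2 * b for b in betas], row)
```

The non-linear equations above cannot be written this way, because a parameter such as β₃ in X₂^β₃ would have to be known before the feature could be computed.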

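Some nonlinear equations can be transformed to a linear form. For example, taking logarithms of Y = β₀X₁^β₁ε gives log Y = log β₀ + β₁ log X₁ + log ε, which is linear in log β₀ and β₁. This is one reason linear models are flexible enough to handle complex datasets.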
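
As an illustration of transforming to linearity, the sketch below assumes a noise-free power-law relationship Y = 3X² (the values 3 and 2 are invented for the example). On the original scale the slope between consecutive points keeps changing, while on the log-log scale it is constant, which is what a straight line looks like.

```python
import math

xs = [1.0, 2.0, 4.0, 8.0]
ys = [3.0 * x ** 2 for x in xs]    # assumed relationship: Y = 3 * X**2

def consecutive_slopes(us, vs):
    """Slope of the segment joining each pair of neighbouring points."""
    return [(vs[i + 1] - vs[i]) / (us[i + 1] - us[i]) for i in range(len(us) - 1)]

raw_slopes = consecutive_slopes(xs, ys)    # not constant: the curve bends

log_x = [math.log(x) for x in xs]
log_y = [math.log(y) for y in ys]
log_slopes = consecutive_slopes(log_x, log_y)    # constant: a straight line

# On the log scale the slope is the exponent and the intercept is log(3).
exponent = log_slopes[0]
multiplier = math.exp(log_y[0] - exponent * log_x[0])
```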

When there is only one predictor, the model is called a simple linear model:

Y = β₀ + β₁X₁ + ε
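
The text above does not say how β₀ and β₁ are estimated. A common choice is ordinary least squares, and the following minimal Python sketch assumes that method and uses a small invented dataset. The function name `fit_simple` is hypothetical.

```python
def fit_simple(xs, ys):
    """Least-squares estimates (b0, b1) for Y = b0 + b1*X + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b1 = sxy / sxx               # slope
    b0 = mean_y - b1 * mean_x    # intercept
    return b0, b1

# Invented data lying roughly on the line Y = 2X.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = fit_simple(xs, ys)

# Residuals are the observed counterpart of the error term.
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]
```

With an intercept in the model, the least-squares residuals sum to zero up to rounding.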

and when there is more than one predictor, it is called a multiple linear model:

Y = β₀ + β₁X₁ + β₂X₂ + β₃X₃ + β₄X₄ + ε
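
For several predictors, one standard way to estimate the parameters is to solve the least-squares normal equations (XᵀX)β = Xᵀy. The sketch below assumes that approach, uses only two predictors to stay short, and generates noise-free data from invented parameters (1, 2, -3) so the result is easy to check. The helper names `solve` and `fit` are hypothetical.

```python
def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]    # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit(rows, ys):
    """Least-squares estimates from the normal equations (X'X) beta = X'y."""
    design = [[1.0] + list(r) for r in rows]    # leading 1.0 for the intercept
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve(xtx, xty)

# Noise-free data from the assumed model Y = 1 + 2*X1 - 3*X2.
rows = [(0, 0), (1, 0), (0, 1), (1, 1), (2, 1), (1, 2)]
ys = [1 + 2 * x1 - 3 * x2 for x1, x2 in rows]

betas = fit(rows, ys)    # estimates of (b0, b1, b2)
```

In practice a statistics package would do this step, usually with a numerically more stable decomposition than the normal equations.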

The error term ε is necessary in a linear model because it accounts for:
1. The effect of variables not considered in the model
2. Unforeseen events (catastrophes, fashions, etc.)
3. Errors in observations or measurements

Linear models can be used for:
1. Verifying the existence of a linear relationship
2. Predicting the variable Y as a function of one or more predictors (X1, X2, ..., Xp)
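
Both uses can be sketched in a few lines of Python. The example assumes the Pearson correlation coefficient as one simple indicator of a linear relationship (it is not a formal test), an invented dataset, and coefficients taken as given: 0.05 and 1.99 are the least-squares estimates for this toy data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: close to +1 or -1 suggests a linear relationship."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Invented data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]

# 1. Checking for a linear relationship.
r = pearson_r(xs, ys)

# 2. Predicting Y at a new value of X with the fitted coefficients.
b0, b1 = 0.05, 1.99
y_new = b0 + b1 * 6.0
```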
