Matrix representation

To estimate the parameters of a linear model, we use matrix algebra.

The data we are working with can be represented in a tabular form. For example, if we are working with three predictors:

y₁    x₁₁   x₁₂   x₁₃
y₂    x₂₁   x₂₂   x₂₃
y₃    x₃₁   x₃₂   x₃₃
...   ...   ...   ...
yₙ    xₙ₁   xₙ₂   xₙ₃

Each row represents one of the n observations in our data: y is the response and the x values are the predictors.

We put this data into a matrix representation:

Y = Xβ + ε

As the equation above shows, this model splits the response into two components: Xβ (the systematic component) and ε (the random component). Note that X includes a column of 1s, which represents the intercept term.
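Written out for the three-predictor data shown earlier, the equation can be expanded as follows. This is a sketch under the common convention, assumed here rather than stated in the post, that the intercept is called β₀ and its column of 1s comes first in X.

```latex
\[
\underbrace{\begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}}_{Y}
=
\underbrace{\begin{pmatrix}
1 & x_{11} & x_{12} & x_{13} \\
1 & x_{21} & x_{22} & x_{23} \\
\vdots & \vdots & \vdots & \vdots \\
1 & x_{n1} & x_{n2} & x_{n3}
\end{pmatrix}}_{X}
\underbrace{\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \end{pmatrix}}_{\beta}
+
\underbrace{\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}}_{\varepsilon}
\]
```

Reading off row i gives y_i = β₀ + β₁x_i1 + β₂x_i2 + β₃x_i3 + ε_i, which is the linear model for a single observation.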

The design matrix (or model matrix), denoted by X, is the matrix built from the values of the explanatory variables. Each row represents one observation in the dataset, and each column holds the values of one variable across all observations.
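To make the row and column layout concrete, here is a minimal sketch in plain Python that builds a design matrix from a table of predictor values. The function name `design_matrix`, its `intercept` flag, and the data values are all hypothetical, chosen only for illustration. Placing the column of 1s first is a common convention, not something the text above prescribes.

```python
def design_matrix(rows, intercept=True):
    """Build a design matrix (list of lists) from predictor rows.

    `rows` holds one sequence of predictor values per observation.
    When `intercept` is true, a leading 1.0 is added to every row.
    """
    n_predictors = len(rows[0])
    X = []
    for row in rows:
        if len(row) != n_predictors:
            raise ValueError("every observation needs the same number of predictors")
        X.append(([1.0] if intercept else []) + [float(v) for v in row])
    return X


# Hypothetical data: 4 observations, 3 predictors (values are made up).
predictors = [
    (2.0, 0.5, 1.0),
    (3.0, 1.5, 0.0),
    (5.0, 2.5, 1.0),
    (7.0, 3.5, 0.0),
]
X = design_matrix(predictors)
n_rows, n_cols = len(X), len(X[0])
for row in X:
    print(row)
print("shape:", (n_rows, n_cols))
```

With 4 observations and 3 predictors, the result has 4 rows and 4 columns: one column for the intercept plus one per predictor.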

The design matrix holds the data on the independent (explanatory) variables, which the model uses to explain the observed values of the response (dependent) variable. The theory of such models relies heavily on matrix manipulations involving the design matrix.
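One matrix manipulation this kind of theory commonly relies on is the least-squares estimate obtained from the normal equations, (XᵀX)β = Xᵀy. The post does not say which estimator it has in mind, so treat the following as an illustrative sketch with made-up data. The helper names are hypothetical, and exact fractions are used only to keep the arithmetic easy to check by hand.

```python
from fractions import Fraction


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination (A square, non-singular)."""
    n = len(A)
    # Augmented matrix [A | b]
    M = [list(A[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        # Find a row at or below the diagonal with a non-zero pivot
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            raise ValueError("X'X is singular: the columns of X are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        pv = M[col][col]
        M[col] = [v / pv for v in M[col]]
        # Eliminate this column from every other row
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def least_squares(X, y):
    """Solve the normal equations (X'X) beta = X'y for beta."""
    Xt = transpose(X)
    XtX = matmul(Xt, X)
    Xty = [sum(a * b for a, b in zip(row, y)) for row in Xt]
    return solve(XtX, Xty)


# Made-up data: 5 observations, 1 predictor, generated exactly by y = 1 + 2x
x = [0, 1, 2, 3, 4]
y = [Fraction(1 + 2 * xi) for xi in x]
X = [[Fraction(1), Fraction(xi)] for xi in x]  # column of 1s, then the predictor
beta_hat = least_squares(X, y)
print(beta_hat)
```

Because the made-up responses lie exactly on the line y = 1 + 2x, the estimate recovers an intercept of 1 and a slope of 2. With real, noisy data a numerical library would be the usual choice over hand-written elimination.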

Example: simple linear model with 5 observations.
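In symbols, a simple linear model with 5 observations can be written as shown here. The names β₀ for the intercept, β₁ for the slope, and x for the single predictor are labels assumed for illustration.

```latex
\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}
=
\begin{pmatrix}
1 & x_1 \\
1 & x_2 \\
1 & x_3 \\
1 & x_4 \\
1 & x_5
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{pmatrix}
\]
```

Here X is 5 × 2: one column of 1s for the intercept and one column for the predictor.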

Example: multiple linear model with 5 observations and 6 predictors.
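The same layout extends to several predictors. As a symbolic sketch, with coefficient names β₀ to β₆ assumed for illustration, a model with 5 observations and 6 predictors has the form:

```latex
\[
\begin{pmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \end{pmatrix}
=
\begin{pmatrix}
1 & x_{11} & x_{12} & x_{13} & x_{14} & x_{15} & x_{16} \\
1 & x_{21} & x_{22} & x_{23} & x_{24} & x_{25} & x_{26} \\
1 & x_{31} & x_{32} & x_{33} & x_{34} & x_{35} & x_{36} \\
1 & x_{41} & x_{42} & x_{43} & x_{44} & x_{45} & x_{46} \\
1 & x_{51} & x_{52} & x_{53} & x_{54} & x_{55} & x_{56}
\end{pmatrix}
\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \\ \beta_3 \\ \beta_4 \\ \beta_5 \\ \beta_6 \end{pmatrix}
+
\begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \end{pmatrix}
\]
```

Here X is 5 × 7 and β has 7 entries: the intercept plus one coefficient per predictor.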
