What does multivariate linear regression do?


Multivariate linear regression using lm()

Simulate data: first the independent variables (predictors):

x1 = runif(10, 0, 100)    # 10 uniform draws on [0, 100]
x2 = runif(10, 10, 200)   # 10 uniform draws on [10, 200]
x3 = runif(10, 100, 400)  # 10 uniform draws on [100, 400]
cor.test(x1,x2)$p.value; cor.test(x1,x3)$p.value; cor.test(x2,x3)$p.value   # pairwise correlation tests
## [1] 0.2151276
## [1] 0.7329013
## [1] 0.3232681

The predictors \(x_i\) must not be (strongly) correlated: collinear predictors make the coefficient estimates unstable and inflate their standard errors.
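The three pairwise tests above can also be summarised in a single call via the correlation matrix; a minimal sketch, re-drawing the predictors so the chunk runs on its own:

```r
# Simulate predictors as above and inspect all pairwise correlations at once;
# off-diagonal values near 0 suggest the predictors can safely enter one model.
X = cbind(x1 = runif(10, 0, 100),
          x2 = runif(10, 10, 200),
          x3 = runif(10, 100, 400))
round(cor(X), 2)
```

cor() returns a symmetric matrix with 1s on the diagonal, so only the off-diagonal entries need checking.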

Let’s simulate some response:

y = 3 + 2*x1 + 3*x2 + 1*x3 + rnorm(10, 0, 2)   # simulated response: intercept 3, slopes 2, 3, 1, noise sd 2
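In model form, the simulated response is

\(y_i = 3 + 2\,x_{1i} + 3\,x_{2i} + 1\,x_{3i} + \varepsilon_i, \qquad \varepsilon_i \sim N(0, 2^2),\)

so the true coefficient vector is \(\beta = (3, 2, 3, 1)\) and the true residual standard deviation is 2.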

Let’s see if we can “rediscover” the true coefficients above:

mdata = data.frame(y = y, x1 = x1, x2 = x2, x3 = x3)
head(mdata)
##          y         x1        x2       x3
## 1 557.7894 75.9815207  51.58790 246.8139
## 2 720.2404 97.3230971  68.20277 319.2263
## 3 597.0158 66.6150526  55.27691 293.9997
## 4 600.0651  0.8985721 114.35953 251.0833
## 5 562.8531 26.4637249 113.62655 167.9717
## 6 331.0584 30.2783098  52.04472 109.3102
lm.res = lm(y ~ x1 + x2 + x3, data = mdata)    # Additive model, Wilkinson-Rogers notation
summary(lm.res)
## 
## Call:
## lm(formula = y ~ x1 + x2 + x3, data = mdata)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -2.5867 -0.9367  0.2019  0.7856  2.2606 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 5.356328   2.434827     2.2   0.0701 .  
## x1          1.983107   0.019371   102.4 5.85e-11 ***
## x2          2.989270   0.020393   146.6 6.80e-12 ***
## x3          0.999827   0.008325   120.1 2.25e-11 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 1.737 on 6 degrees of freedom
## Multiple R-squared:  0.9999, Adjusted R-squared:  0.9999 
## F-statistic: 2.272e+04 on 3 and 6 DF,  p-value: 1.491e-12
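The fitted object holds more than the printed summary; a minimal sketch of pulling out the usual pieces. The data are re-simulated here with a seed so the chunk is self-contained (set.seed(1) is an arbitrary assumption, not the seed used above, so the numbers will differ from the output shown):

```r
set.seed(1)                # arbitrary seed, only for reproducibility
x1 = runif(10, 0, 100)
x2 = runif(10, 10, 200)
x3 = runif(10, 100, 400)
y  = 3 + 2*x1 + 3*x2 + 1*x3 + rnorm(10, 0, 2)
lm.res = lm(y ~ x1 + x2 + x3)

coef(lm.res)       # point estimates of (Intercept), x1, x2, x3
confint(lm.res)    # 95% confidence intervals for the coefficients
sigma(lm.res)      # residual standard error (true value here: 2)
```

Because the noise is small relative to the signal, the estimated slopes should land close to the true 2, 3 and 1.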


Reading the output

More possibilities