This is the note for the video tutorial of generalized least square by Broad Institute’s MIA talks.

Ordinary Least Square (OLS)

Linear model

\[\textbf{y} = \textbf{X} \boldsymbol{\beta} + \boldsymbol{\epsilon}\]
  • $\textbf{y}$: response variables
  • $\textbf{X}$: predictor(s)
  • $\mathbf{\beta}$: effect size
  • $\mathbf{\epsilon}$: noise, error, residual; $E(\boldsymbol{\epsilon})=0, Var(\boldsymbol{\epsilon})=\sigma^2\textbf{I}$

Suppose n observations, p predictors, it can be also written,

\[\begin{bmatrix}y_1 \\ y_2 \\ ... \\ y_n \end{bmatrix} = \begin{bmatrix}x_{11} & ... & x_{1p} \\ x_{21} & ... & x_{2p} \\ ... & ... & ...\\ x_{n1} & ... & x_{np} \end{bmatrix} \begin{bmatrix} \beta_1 \\ \beta_2 \\... \\ \beta_3 \end{bmatrix} + \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ ... \\ \epsilon_n \end{bmatrix}\]

Minimize SSR/SSE

\[S(\boldsymbol{b}) := (\boldsymbol{y} - \boldsymbol{Xb})^T(\boldsymbol{y}-\boldsymbol{Xb})\] \[\hat{\boldsymbol{\beta}}_{OLS} := argmin_{b} S(\boldsymbol{b})\] \[\hat{\boldsymbol{\beta}}_{OLS} = (\boldsymbol{X}^T\boldsymbol{X})^{-1}\boldsymbol{X}^T\boldsymbol{y}\]

Strength

  • closed formula
  • Gauss-Markov: $\hat{\boldsymbol{\beta}}_{OLS}$ is the best linear unbiased estimator of $\boldsymbol{\beta}$, given $\epsilon_i’s$ are i.i.d.

Q: what if $\epsilon_i’s$ are not i.i.d.? what is the best?

Weighted Least Square (WLS) - drop identical distribution

\[\Sigma := Var(\boldsymbol{\epsilon}) = \begin{pmatrix} \sigma_1^2 & 0 & ... & 0 \\ 0 & \sigma_2^2 & ... & 0 \\ ... & ... & ... & ... \\ 0 & 0 & ... & \sigma_n^2 \end{pmatrix}\]

Let $D^2 = \Sigma^{-1}$ and $\delta = D\epsilon$

\[Dy = DX\beta + D\epsilon\] \[Var(\delta) = Var(D\epsilon) = DVar(\epsilon)D^{-1} = D\Sigma D^T = \textbf{I}\] \[\hat{\beta_{WLS}} = ((DX)^T(DX))^{-1}(DX)^TDy = (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}y\]

Generalized Least Square (GLS) - drop i.i.d.

\[\Sigma:= Var(\epsilon)\]

Let L be the Cholesky decomp. of $\Sigma^{-1}$ and $L^TL = \Sigma^{-1}$

\[Ly = LX\beta + L\epsilon\] \[Var(L\sigma) = LVar(\epsilon)L^T = \textbf{I}\] \[\hat{\beta_{GLS}} = (X^T\Sigma^{-1}X)^{-1}X^T\Sigma^{-1}y\]