lscov

Least-squares solution in presence of known covariance

Syntax

``x = lscov(A,b)``
``x = lscov(A,b,w)``
``x = lscov(A,b,C)``
``x = lscov(A,b,C,alg)``
``[x,stdx] = lscov(___)``
``[x,stdx,mse] = lscov(___)``
``[x,stdx,mse,S] = lscov(___)``

Description

`x = lscov(A,b)`, for linear system `A*x = b`, returns the least-squares solution that minimizes the sum of squared errors `r'*r`, where `r = b - A*x` for matrix `A` and column vector `b`. If `b` is a matrix, the corresponding columns of `x` and `b` satisfy the equation separately.

`x = lscov(A,b,w)` returns the weighted least-squares solution that minimizes `r'*diag(w)*r`, where `r = b - A*x`.

`x = lscov(A,b,C)` returns the generalized least-squares solution that minimizes `r'*inv(C)*r`, where `r = b - A*x` and the covariance matrix of `b` is proportional to `C`.

`x = lscov(A,b,C,alg)` specifies the algorithm for solving the linear system. By default, `lscov` uses the Cholesky decomposition of `C` to compute `x`. Specify `alg` as `"orth"` to use an orthogonal decomposition of `C`. If `C` is not invertible, `lscov` uses an orthogonal decomposition regardless of the value of `alg`.

`[x,stdx] = lscov(___)` also returns the estimated standard errors of `x`. The standard errors are an estimate of the standard deviation in `x`. You can use any of the input argument combinations in previous syntaxes.

`[x,stdx,mse] = lscov(___)` also returns the mean squared error. The mean squared error is proportional to the value that the function minimizes.

`[x,stdx,mse,S] = lscov(___)` also returns the estimated covariance matrix of `x`. You can use this syntax only if `b` is a column vector.

Examples

Create a matrix `A` and vector `b` for the linear system `A*x = b`. Compute the least-squares solution of the linear system using `lscov`. Specify three output arguments to return the solution, its estimated standard errors, and its mean squared error.

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
[x,stdx,mse] = lscov(A,b)
```

```
x = 3×1

    0.1018
    0.4844
   -0.2847

stdx = 3×1

    0.0058
    0.0206
    0.0135

mse = 1.2774e-05
```

Add noise to `b` to create a matrix of noisy samples, and compute the least-squares solution for each sample using the backslash operator (`\`). Calculate the standard deviation of each row of the result.

```
B = b + randn(6,1e5);
X = A\B;
s1 = std(X,0,2)
```

```
s1 = 3×1

    1.6311
    5.7609
    3.7838
```

Compare the standard errors obtained using backslash with the standard errors obtained using `lscov`. Divide the standard errors `stdx` by the square root of `mse` to account for the scaling that `lscov` performs. The standard deviations over `X` closely match the rescaled standard errors from `lscov`. They indicate that if `b` has uniform noise across its elements, the second element of `x` is affected much more than the first element.

```
s2 = stdx/sqrt(mse)
```

```
s2 = 3×1

    1.6349
    5.7661
    3.7845
```

Create a matrix `A` and vector `b` for the problem `A*x = b`. Create a vector of relative observational weights, and compute the weighted least-squares solution.

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.34];
w = [1 1 1 1 1 0.1]';
[x1,stdx1,mse1] = lscov(A,b,w)
```

```
x1 = 3×1

    0.1046
    0.4614
   -0.2621

stdx1 = 3×1

    0.0309
    0.1152
    0.0814

mse1 = 3.4741e-04
```

Compute the ordinary least-squares solution of the same problem and plot both solutions. For the first five points, the weighted least-squares solution is closer to `b` than the ordinary least-squares solution is. Because the sixth observation is down-weighted, the sixth point of the weighted solution is farther from `b`.

```
[x2,stdx2,mse2] = lscov(A,b);
x = 1:6;
plot(x,b,"o",x,A*x1,"o",x,A*x2,"o")
legend("b","Weighted","Ordinary")
```

Create a matrix `A` and vector `b` for the problem `A*x = b`. Create a matrix of covariance scales, and compute a generalized least-squares solution.

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
C = 0.2*ones(size(a1)) + 0.8*diag(ones(size(a1)));
[x,stdx,mse] = lscov(A,b,C)
```

```
x = 3×1

    0.1018
    0.4844
   -0.2847

stdx = 3×1

    0.0061
    0.0206
    0.0135

mse = 1.5967e-05
```

Create a matrix `A` and vector `b` for the problem `A*x = b`. Compute an estimate of the covariance matrix of `x`.

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
[x,stdx,mse,S] = lscov(A,b)
```

```
x = 3×1

    0.1018
    0.4844
   -0.2847

stdx = 3×1

    0.0058
    0.0206
    0.0135

mse = 1.2774e-05

S = 3×3
10^-3 ×

    0.0341   -0.1075    0.0617
   -0.1075    0.4247   -0.2712
    0.0617   -0.2712    0.1829
```

The coefficient standard errors are equal to the square roots of the values on the diagonal of this covariance matrix.

```
sqrt(diag(S))
```

```
ans = 3×1

    0.0058
    0.0206
    0.0135
```

Input Arguments

`A`, `b` — Operands, specified as vectors or matrices. `A` and `b` must have the same number of rows. If `b` is a matrix, then `lscov` returns one solution for each column of `b`.

Data Types: `single` | `double`
Complex Number Support: Yes

`w` — Relative weights, specified as a nonnegative real column vector with the same number of rows as `A`. The weights are typically either counts or inverse variances. When you specify this argument, `lscov` returns the weighted least-squares solution of the linear system `A*x = b` and minimizes `r'*diag(w)*r`, where `r = b - A*x`.

Data Types: `single` | `double`

`C` — Scaled covariance matrix, specified as a real symmetric (Hermitian if complex) matrix. `C` can be positive definite or semidefinite. If `C` is positive definite, `lscov` returns the least-squares solution of the linear system `A*x = b` that minimizes `r'*inv(C)*r`, where `r = b - A*x` and the covariance matrix of `b` is proportional to `C`.

If `C` is positive semidefinite, `lscov` returns the solution of `A*x + T*e = b`, where `T*T' = C` and `e` has the minimum norm among all vectors for which the system has a solution. The system has a solution only if `b` is in the column space of `[A T]`.

Data Types: `single` | `double`
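
For illustration, here is a minimal sketch of the semidefinite case. The factor `T` and the coefficient vectors are hypothetical values chosen so that `C = T*T'` is singular and `b` lies in the column space of `[A T]`; `lscov` then solves the system with the orthogonal algorithm.

```
% Hypothetical rank-2 factor, so C = T*T' is positive semidefinite but singular
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
T = [(1:6)' (1:6)'.^2];
C = T*T';                                 % rank(C) is 2
b = A*[0.1; 0.5; -0.3] + T*[0.01; -0.02]; % b is in the column space of [A T]
x = lscov(A,b,C)
```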

`alg` — Least-squares solution algorithm, specified as one of these values:

• `"chol"` — Use the Cholesky decomposition of `C` and invert that factor to transform the problem into ordinary least squares.

• `"orth"` — Use an orthogonal decomposition algorithm. This algorithm is computationally more expensive, but more appropriate when `C` is ill conditioned or singular.

By default, the `lscov` function uses the Cholesky decomposition. However, if `lscov` determines that `C` is semidefinite, then the function uses an orthogonal decomposition algorithm regardless of the value of `alg`.
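
As a minimal sketch, you can compare the two algorithms on the well-conditioned covariance scale used in the examples above; the solutions agree to roundoff.

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
C = 0.2*ones(6) + 0.8*eye(6);  % well conditioned, so both algorithms apply
xChol = lscov(A,b,C,"chol");
xOrth = lscov(A,b,C,"orth");
norm(xChol - xOrth)            % difference on the order of roundoff
```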

Output Arguments

`x` — Solution, returned as a vector or matrix. If `A` is an `m`-by-`n` matrix and `b` is an `m`-by-`p` matrix, then `x` is an `n`-by-`p` matrix, including the case when `p` is `1`. If `A` is underdetermined, then the solution is not unique, and `lscov` restricts some elements of `x` to zero to obtain a uniquely defined solution.

If `A` has full storage, `x` is also full. If you specify a scaled covariance matrix, `x` is sparse if `A`, `b`, and `C` are sparse. Otherwise, `x` is sparse if `A` and `b` are sparse.
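
A minimal sketch of the underdetermined case, using a small hypothetical system: `A` has fewer rows than columns, so `lscov` restricts one element of `x` to zero to pick a unique solution.

```
A = [1 2 3; 4 5 6];        % 2-by-3, so the system is underdetermined
b = [1; 2];
[x,stdx,mse] = lscov(A,b)  % one element of x is restricted to zero, and
                           % mse is 0 because m is less than n
```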

`stdx` — Estimated standard errors, returned as a vector. Given that the scaled covariance matrix describes the variation in `b`, the standard errors are an estimate of the standard deviation in `x`. If `A` is underdetermined, then `stdx` contains zeros in the elements corresponding to the restricted zero elements of `x`.

`lscov` scales `stdx` based on the mean squared error, but if `C` is known to be exactly the covariance matrix of `b`, then the scaling is unnecessary. To get the appropriate estimates in this case, scale `stdx` by `sqrt(1/mse)`.
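
For instance, a one-line sketch, assuming `A`, `b`, and an exactly known covariance `C` as in the examples above (`stdxKnown` is an illustrative name):

```
[x,stdx,mse] = lscov(A,b,C);
stdxKnown = stdx*sqrt(1/mse);  % standard errors when C is the exact covariance of b
```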

`mse` — Mean squared error, returned as a scalar. If `b` has covariance matrix `sigma^2*C` or `sigma^2*diag(1./w)`, then `mse` is an estimate of `sigma^2`. `lscov` assumes that the covariance matrix of `b` is known only up to a scale factor, and `mse` is an estimate of that unknown scale factor.

`S` — Estimated covariance matrix of `x`, returned as a matrix. The function returns `S` only if `b` is a column vector. If `A` is underdetermined, then `S` contains zeros in the rows and columns corresponding to the restricted zero elements of `x`.

`lscov` scales `S` based on the mean squared error, but if `C` is known to be exactly the covariance matrix of `b`, then the scaling is unnecessary. To get the appropriate estimates in this case, scale `S` by `1/mse`.
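
Similarly, a short sketch, again assuming `A`, `b`, and an exactly known `C` from the examples above (`Sknown` is an illustrative name):

```
[x,stdx,mse,S] = lscov(A,b,C);
Sknown = S*(1/mse);    % covariance of x when C is the exact covariance of b
sqrt(diag(Sknown))     % matches stdx*sqrt(1/mse)
```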

Algorithms

When `m`-by-`n` matrix `A` and `m`-by-`m` matrix `C` are full rank in a generalized least-squares problem, these standard formulas represent the outputs of `lscov` when `m` is greater than or equal to `n`.

```
x = inv(A'*inv(C)*A)*A'*inv(C)*b
mse = (b - A*x)'*inv(C)*(b - A*x)./(m-n)
S = inv(A'*inv(C)*A)*mse
stdx = sqrt(diag(S))
```

When `m` is less than `n`, the mean squared error is 0.

For weighted least squares, the standard formulas apply when substituting `diag(1./w)` for `C`. For ordinary least squares, substitute the identity matrix for `C`.
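
As a quick check of this substitution, here is a sketch using the data from the weighted least-squares example above:

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.34];
w = [1 1 1 1 1 0.1]';
x1 = lscov(A,b,w);           % weighted least squares
x2 = lscov(A,b,diag(1./w));  % generalized least squares with C = diag(1./w)
norm(x1 - x2)                % agrees to roundoff
```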

The `lscov` function uses methods that are faster and more stable than the standard formulas and that also apply to rank-deficient cases. For instance, `lscov` computes the Cholesky decomposition `C = R'*R` and then solves the least-squares problem `(R'\A)*x = (R'\b)` instead, using the same algorithm that `mldivide` uses to solve `A\b` as a least-squares problem.
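
A minimal sketch of that transformation, using the positive definite `C` from the examples above, shows the two computations agreeing to roundoff:

```
a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1];
a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4];
A = [ones(size(a1)) a1 a2];
b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
C = 0.2*ones(6) + 0.8*eye(6);
R = chol(C);                   % C = R'*R with R upper triangular
xManual = (R'\A)\(R'\b);       % ordinary least squares on the transformed system
norm(xManual - lscov(A,b,C))   % difference on the order of roundoff
```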

References

[1] Paige, Christopher C. "Computer Solution and Perturbation Analysis of Generalized Linear Least Squares Problems." Mathematics of Computation 33, no. 145 (1979): 171–83. https://doi.org/10.2307/2006034.

[2] Golub, Gene H., and Charles F. Van Loan. Matrix Computations. Baltimore, MD: Johns Hopkins University Press, 1996.

[3] Goodall, Colin R. "Computation using the QR decomposition." Handbook of Statistics 9 (1993): 467–508. https://doi.org/10.1016/S0169-7161(05)80137-3.

[4] Strang, Gilbert. Introduction to Applied Mathematics. Wellesley, MA: Wellesley-Cambridge Press, 1986.

Version History

Introduced before R2006a