lscov

Least-squares solution in presence of known covariance

    Description

    x = lscov(A,b), for linear system A*x = b, returns the least-squares solution that minimizes the sum of squared errors r'*r, where r = b - A*x for matrix A and column vector b. If b is a matrix, the corresponding columns of x and b satisfy the equation separately.

    x = lscov(A,b,w) returns the weighted least-squares solution that minimizes r'*diag(w)*r, where r = b - A*x.

    x = lscov(A,b,C) returns the generalized least-squares solution that minimizes r'*inv(C)*r, where r = b - A*x and the covariance matrix of b is proportional to C.

    x = lscov(A,b,C,alg) specifies the algorithm for solving the linear system. By default, lscov uses the Cholesky decomposition of C to compute x. Specify alg as "orth" to use an orthogonal decomposition of C. If C is not invertible, lscov uses an orthogonal decomposition regardless of the value of alg.
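
    As a minimal sketch, forcing the orthogonal decomposition on a small illustrative system (the matrices here are placeholders, not taken from the examples below) might look like this:

    A = [1 0; 1 1; 1 2];
    b = [1; 2; 4];
    C = eye(3);                  % identity covariance, for illustration only
    x = lscov(A,b,C,"orth");     % request the orthogonal decomposition algorithm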

    [x,stdx] = lscov(___) also returns the estimated standard errors of x. The standard errors are an estimate of the standard deviation in x. You can use any of the input argument combinations in previous syntaxes.

    [x,stdx,mse] = lscov(___) also returns the mean squared error. The mean squared error is proportional to the value that the function minimizes.

    [x,stdx,mse,S] = lscov(___) also returns the estimated covariance matrix of x. You can use this syntax only if b is a column vector.

    Examples

    Create a matrix A and vector b for the linear system A*x = b. Compute the least-squares solution of the linear system using lscov. Specify three output arguments to return the solution, its estimated standard errors, and its mean squared error.

    a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1]; 
    a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4]; 
    A = [ones(size(a1)) a1 a2]; 
    b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
    [x,stdx,mse] = lscov(A,b)
    x = 3×1
    
        0.1018
        0.4844
       -0.2847
    
    
    stdx = 3×1
    
        0.0058
        0.0206
        0.0135
    
    
    mse = 1.2774e-05
    

    Add noise to b to create a matrix of noisy samples, and compute the least-squares solution for each sample using the backslash operator (\). Calculate the standard error of each row of the result.

    B = b + randn(6,1e5);
    X = A\B;
    s1 = std(X,0,2)
    s1 = 3×1
    
        1.6311
        5.7609
        3.7838
    
    

    Compare the standard errors obtained using backslash with the standard errors obtained using lscov. Divide the standard errors stdx by the square root of mse to undo the scaling that lscov performs. The standard errors over X generally match the rescaled standard errors computed by lscov. The rescaled standard errors indicate that if b has uniform noise across its elements, the second element of x is affected much more than the first element.

    s2 = stdx/sqrt(mse)
    s2 = 3×1
    
        1.6349
        5.7661
        3.7845
    
    

    Create a matrix A and vector b for the problem A*x = b. Create a vector of relative observational weights, and compute the weighted least-squares solution.

    a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1]; 
    a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4]; 
    A = [ones(size(a1)) a1 a2]; 
    b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.34];
    w = [1 1 1 1 1 0.1]';
    [x1,stdx1,mse1] = lscov(A,b,w)
    x1 = 3×1
    
        0.1046
        0.4614
       -0.2621
    
    
    stdx1 = 3×1
    
        0.0309
        0.1152
        0.0814
    
    
    mse1 = 3.4741e-04
    

    Compute the ordinary least-squares solution of the same problem and plot both fits. For the first five points, the weighted least-squares fit is closer to b than the ordinary least-squares fit is. Because the sixth observation was weighted down, the weighted fit is farther from b at that point.

    [x2,stdx2,mse2] = lscov(A,b);
    x = 1:6;
    plot(x,b,"o",x,A*x1,"o",x,A*x2,"o")
    legend("b","Weighted","Ordinary")

    Create a matrix A and vector b for the problem A*x = b. Create a matrix of covariance scales, and compute a generalized least-squares solution.

    a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1]; 
    a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4]; 
    A = [ones(size(a1)) a1 a2]; 
    b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
    C = 0.2*ones(length(b)) + 0.8*eye(length(b));
    [x,stdx,mse] = lscov(A,b,C)
    x = 3×1
    
        0.1018
        0.4844
       -0.2847
    
    
    stdx = 3×1
    
        0.0061
        0.0206
        0.0135
    
    
    mse = 1.5967e-05
    

    Create a matrix A and vector b for the problem A*x = b. Compute an estimate of the covariance matrix of x.

    a1 = [0.2; 0.5; 0.6; 0.8; 1.0; 1.1]; 
    a2 = [0.1; 0.3; 0.4; 0.9; 1.1; 1.4]; 
    A = [ones(size(a1)) a1 a2]; 
    b = [0.17; 0.26; 0.28; 0.23; 0.27; 0.24];
    [x,stdx,mse,S] = lscov(A,b)
    x = 3×1
    
        0.1018
        0.4844
       -0.2847
    
    
    stdx = 3×1
    
        0.0058
        0.0206
        0.0135
    
    
    mse = 1.2774e-05
    
    S = 3×3
    10^-3 ×
    
        0.0341   -0.1075    0.0617
       -0.1075    0.4247   -0.2712
        0.0617   -0.2712    0.1829
    
    

    The coefficient standard errors are equal to the square roots of the values on the diagonal of this covariance matrix.

    sqrt(diag(S))
    ans = 3×1
    
        0.0058
        0.0206
        0.0135
    
    

    Input Arguments

    A, b — Operands, specified as vectors or matrices. A and b must have the same number of rows. If b is a matrix, then lscov returns one solution for each column of b.

    Data Types: single | double
    Complex Number Support: Yes

    w — Relative weights, specified as a nonnegative real column vector with the same number of rows as A. The weights are typically either counts or inverse variances. When you specify this argument, lscov returns the weighted least-squares solution of the linear system A*x = b and minimizes r'*diag(w)*r, where r = b - A*x.

    Data Types: single | double
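
    As a sketch of how the weights relate to a covariance scale, reusing A, b, and w from the weighted example above, passing w is equivalent to passing the diagonal matrix diag(1./w) as the covariance scale C:

    xw = lscov(A,b,w);            % weighted least squares
    xc = lscov(A,b,diag(1./w));   % equivalent diagonal covariance scale
    norm(xw - xc)                 % zero up to round-off error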

    C — Scaled covariance matrix, specified as a real symmetric (Hermitian if complex) matrix. C can be positive definite or positive semidefinite. If C is positive definite, lscov returns the least-squares solution of the linear system A*x = b that minimizes r'*inv(C)*r, where r = b - A*x, with the covariance matrix of b proportional to C.

    If C is positive semidefinite, lscov returns the solution of A*x + T*e = b, where T*T' = C and e is the vector with minimum norm among all vectors e for which this system has a solution. This system has a solution only if b is in the column space of [A T].

    Data Types: single | double
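
    As a minimal sketch of the positive semidefinite case, reusing A and b from the examples above, a covariance scale with a zero on its diagonal forces the fit to match that observation exactly (here the sixth one):

    C = diag([1 1 1 1 1 0]);     % positive semidefinite, rank 5
    x = lscov(A,b,C)             % lscov switches to the orthogonal algorithm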

    alg — Least-squares solution algorithm, specified as one of these values:

    • "chol" — Use the Cholesky decomposition of C and invert that factor to transform the problem into ordinary least squares.

    • "orth" — Use an orthogonal decomposition algorithm. This algorithm is computationally more expensive, but more appropriate when C is ill conditioned or singular.

    By default, the lscov function uses the Cholesky decomposition. However, if lscov determines that C is singular (only positive semidefinite), then the function uses an orthogonal decomposition algorithm regardless of the value of alg.

    Output Arguments

    x — Solution, returned as a vector or matrix. If A is an m-by-n matrix and b is an m-by-p matrix, then x is an n-by-p matrix (including the case when p is 1). If A is underdetermined, then there are multiple possible solutions. In this case, lscov restricts some elements of x to zero to obtain a uniquely defined solution.
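
    For instance, a minimal sketch of the underdetermined case (one equation, three unknowns):

    A = [1 2 3];
    b = 6;
    x = lscov(A,b)               % some elements of x are restricted to zero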

    If A has full storage, x is also full. If you specify a scaled covariance matrix, x is sparse if A, b, and C are sparse. Otherwise, x is sparse if A and b are sparse.

    stdx — Estimated standard errors, returned as a vector. Given that the scaled covariance matrix describes the variation in b, the standard errors are an estimate of the standard deviation in x. If A is underdetermined, then stdx contains zeros in the elements corresponding to the restricted zero elements of x.

    lscov scales stdx based on the mean squared error, but if C is known to be exactly the covariance matrix of b, then the scaling is unnecessary. To get the appropriate estimates in this case, scale stdx by sqrt(1/mse).
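
    For instance, if the covariance scale C in the generalized example above were the exact covariance matrix of b, a sketch of the rescaling is:

    [x,stdx,mse] = lscov(A,b,C);
    stdx_exact = stdx*sqrt(1/mse);   % standard errors when C is exact rather than proportional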

    mse — Mean squared error, returned as a scalar. If b has covariance matrix sigma^2*C or sigma^2*diag(1./w), then mse is an estimate of sigma^2. lscov assumes that the covariance matrix of b is known only up to a scale factor, and mse is an estimate of that unknown scale factor.
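
    As a minimal sketch with hypothetical simulated data, mse recovers the noise variance sigma^2 up to sampling error:

    A = [ones(100,1) (1:100)'];
    sigma = 0.5;
    b = A*[1; 2] + sigma*randn(100,1);   % noise with variance sigma^2 = 0.25
    [~,~,mse] = lscov(A,b)               % approximately 0.25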

    S — Estimated covariance matrix of x, returned as a matrix. The function returns S only if b is a column vector. If A is underdetermined, then S contains zeros in the rows and columns corresponding to the restricted zero elements of x.

    lscov scales S based on the mean squared error, but if C is known to be exactly the covariance matrix of b, then the scaling is unnecessary. To get the appropriate estimates in this case, scale S by 1/mse.

    Algorithms

    When the m-by-n matrix A and the m-by-m matrix C are full rank in a generalized least-squares problem, these standard formulas represent the outputs of lscov when m is greater than n.

    x = inv(A'*inv(C)*A)*A'*inv(C)*b
    mse = (b - A*x)'*inv(C)*(b - A*x)./(m-n)
    S = inv(A'*inv(C)*A)*mse
    stdx = sqrt(diag(S))

    When m is less than n, the mean squared error is 0.

    For weighted least squares, the standard formulas apply when substituting diag(1./w) for C. For ordinary least squares, substitute the identity matrix for C.
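
    As a quick numerical check of that substitution, reusing A, b, and w from the weighted example, the standard formulas with diag(1./w) in place of C reproduce the results of lscov(A,b,w), assuming A has full rank:

    C = diag(1./w);
    x = inv(A'*inv(C)*A)*A'*inv(C)*b;                          % matches lscov(A,b,w)
    mse = (b - A*x)'*inv(C)*(b - A*x)./(size(A,1)-size(A,2));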

    The lscov function uses methods that are faster and more stable than the standard formulas, and are applicable to rank-deficient cases. For instance, lscov computes the Cholesky decomposition C = R'*R and then solves the least-squares problem (R'\A)*x = (R'\b) instead, using the same algorithm that is used in mldivide for A\b to solve a least-squares problem.
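
    A sketch of that transformation for any positive definite C (for example, the covariance scale from the generalized example above):

    R = chol(C);                 % upper triangular factor, so C = R'*R
    x = (R'\A)\(R'\b);           % same solution that lscov(A,b,C) returns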

    References

    [1] Paige, Christopher C. "Computer Solution and Perturbation Analysis of Generalized Linear Least Squares Problems." Mathematics of Computation 33, no. 145 (1979): 171–83. https://doi.org/10.2307/2006034.

    [2] Golub, Gene H., and Charles F. Van Loan. Matrix Computations. Baltimore, MD: Johns Hopkins University Press, 1996.

    [3] Goodall, Colin R. "Computation using the QR decomposition." Handbook of Statistics 9 (1993): 467–508. https://doi.org/10.1016/S0169-7161(05)80137-3.

    [4] Strang, Gilbert. Introduction to Applied Mathematics. Wellesley, MA: Wellesley-Cambridge Press, 1986.

    Version History

    Introduced before R2006a