Should mldivide Return a Solution for Square, Rank-Deficient, but Consistent Set of Linear Equations?

load lindata
A and b define a linear system of equations Ax = b.
A is 12 x 12 with rank
rank(A)
ans = 6
But the linear equations are consistent with an infinite number of solutions. One such solution is
x1 = repmat([0;0;1],4,1);
Another solution can be found with the sym version of mldivide
x2 = A\b;
Warning: Solution is not unique because the system is rank-deficient.
Verify
A*[x1,x2] - b
ans = 
But the numeric version of mldivide fails
x3 = double(A)\double(b)
Warning: Matrix is singular to working precision.
x3 = 12×1
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
I guess this isn't terribly surprising. But the doc page mldivide states:
"
  • If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is a solution to the equation A*x = B, if it exists."
My interpretation of that last clause is that it would be properly read as "if a solution exists."
Here, many solutions exist but mldivide didn't find a solution. Is this an oversight on the doc page, or am I misreading it, or would it be reasonable to expect mldivide to return a solution for this sort of problem?

Accepted Answer

Hi Paul!
You can use
x = linsolve(A, b, struct('RECT', true));
to use the QR method of mldivide even if A is square.
To get the minimum-norm solution of x, instead of just any solution x like mldivide does, you can use
x = lsqminnorm(A, b);
To see what algorithm is chosen by mldivide, you can call
dA = decomposition(A);
and check the Type property of the returned object.

3 Comments

Hi Christine,
Thanks for responding.
load lindata
A = double(A); b = double(b);
Good to know about that feature of decomposition
dA = decomposition(A);
dA.Type
ans = 'lu'
Trying linsolve as recommended does yield a correct solution.
x = linsolve(A, b, struct('RECT', true));
Warning: Rank deficient, rank = 6, tol = 1.902859e-14.
norm(A*x -b)
ans = 3.6605e-15
What is the difference between the QR approach in linsolve and naively using QR as
[Q,R] = qr(A);
x = R\(Q\b);
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 4.182874e-34.
norm(A*x-b)
ans = 205.1666
One minor nit about the doc page at linsolve. That page says that RECT is to be used when A has a different number of rows than columns. Later on the page it states "If A does not have the properties that you specify in opts, then linsolve returns incorrect results." Here we have a square matrix, which is not RECT, but linsolve still returned a correct result.
Do you think the statement cited in my question from the doc page at mldivide, \ should be softened a bit to indicate that mldivide might not return a solution in some cases, even if a solution exists?
Finally, do you have any thoughts to share on the pros/cons of updating mldivide so it does return a correct result in a case such as we are discussing here (A square; A rank-deficient; A,b consistent)? Would it be too expensive to determine if the equations are consistent? (I assume that LU is better than QR for the square case in general.)
The QR approach in linsolve and mldivide (they are the same) is to do a permuted QR decomposition, estimate the rank r of A based on the diagonal of the R matrix, and then truncate the system to only use the first r rows and columns of R. See the Answer here for details.
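For illustration, here is a rough sketch of that truncated, column-pivoted QR approach (not the actual linsolve internals; the tolerance formula below is a made-up stand-in for whatever linsolve actually uses):

```matlab
% Sketch of a rank-truncated, column-pivoted QR solve (illustrative only).
load lindata
A = double(A); b = double(b);
[Q, R, p] = qr(A, 'vector');            % column-pivoted QR: A(:,p) = Q*R
tol = max(size(A)) * eps(abs(R(1,1)));  % hypothetical rank tolerance
r = nnz(abs(diag(R)) > tol);            % estimated rank (6 for this A)
y = Q' * b;
x = zeros(size(A,2), 1);
x(p(1:r)) = R(1:r, 1:r) \ y(1:r);       % solve the truncated triangular system
norm(A*x - b)                           % small residual, since the system is consistent
```

The truncation is the key difference from the naive `x = R\(Q\b)`, which back-substitutes through the tiny diagonal entries of R and blows up.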
For linsolve, setting RECT has A treated like a rectangular matrix (meaning a QR decomposition is used), which is safe for a square matrix. If opts marks A as upper triangular but A is not, its lower triangle is simply assumed to be zero. The same is true for lower triangular and symmetric, where we just read one triangle of the matrix and assume the other is its conjugate transpose.
So the output is always "valid" in the sense that the operation is valid for how the input matrix is interpreted, but it could be confusing. I will let the documentation team know your suggestions.
For the statement in mldivide, perhaps that second bullet point should say "If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is a solution to the equation A*x = B, if A is well-conditioned.".
On having mldivide detect this case: Performance is part of the problem, particularly because a majority of calls to mldivide are from applications that always use square matrices. But more importantly, the QR algorithm we use relies on a tolerance for detecting the rank of a matrix, which is fine for a least-squares problem, but may not give the expected result otherwise.
For example, if A was a diagonal matrix with very different scaling on the diagonal, the smallest elements on the diagonal would be cut off, with those elements of the output getting set to zero:
format shortg
linsolve(diag(10.^(0:-5:-30)), ones(7, 1), struct('RECT', true))
Warning: Rank deficient, rank = 3, tol = 1.554312e-15.
ans = 7×1
            1
        1e+05
        1e+10
            0
            0
            0
            0
linsolve(diag(10.^(0:-5:-30)), ones(7, 1))
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.000000e-30.
ans = 7×1
            1
        1e+05
        1e+10
        1e+15
        1e+20
        1e+25
        1e+30
This is because the QR-based algorithm assumes that any change of A that is less than eps * norm(A) (more-or-less) is irrelevant, so these small diagonal elements should have no impact and should be removed to avoid potential Inf/NaN/overflow. But if we're not looking for a least-squares solution, but just for a direct solution, these elements have massive impact, because the smaller the element is, the larger its inverse.


More Answers (1)

Hey Paul,
Good question. mldivide (\) is supposed to find a solution when one exists, but when A is square and singular, MATLAB treats it as numerically unstable and returns NaNs instead of picking one solution.
The reason A\b works in sym but not in double is that the symbolic version does exact algebra, so it can return a valid solution even when there are infinitely many. The numeric version, on the other hand, works with floating-point numbers. Since A is singular, MATLAB can’t get a stable least-squares solution, so it just throws out NaNs.
If you still want a solution in double, you can use the pseudoinverse instead:
x3 = pinv(A) * b;
That should give you one possible solution.
As for the doc page, technically mldivide is doing what it says—it guarantees a solution when A is invertible or full-rank, but when it’s singular, it just gives a warning and returns NaNs instead of trying to pick one from infinitely many.
Follow me so you can message me anytime with future MATLAB questions. If this helps, please accept the answer as well.

13 Comments

Hi Jack,
I think the difference between sym and double has nothing to do with exact algebra or floating point precision. Rather, two different algorithms are involved where the former returns a solution and the latter does not.
For the numeric case, Matlab could return a solution. In fact, you showed what one solution could be, namely pinv(A)*b, which is x1 in the question. So the question is whether or not the algorithm for mldivide, \ could determine that a solution exists and then take appropriate action to return one, and, if mldivide could, would it make sense to do so.
Actually, the doc page is not doing what it says. The quoted statement doesn't make any exceptions for A being singular, but perhaps it should.

mldivide (the backslash operator) is designed to return a solution only when it can compute a stable, unique solution. When A is square but rank‐deficient, the system has infinitely many solutions, and the numeric algorithm used in double precision cannot reliably pick one without risking numerical instability. In such cases, MATLAB deliberately returns NaNs rather than attempting to choose one solution arbitrarily. By contrast, the symbolic version performs exact algebra and can return one of the many possible solutions. If you need a solution in double precision for a consistent, but rank‐deficient system, you can use the pseudoinverse (pinv) as a workaround (for example, x = pinv(A)*b).

Hi Paul,
consider the following situation for Ax = b
% A (m x n), x (n x q), b (m x q)
m = 5; % m = 2,3, ...
n = 6;
q = 3;
A = rand(m,n);
A(end,:) = A(end-1,:) % make A rank deficient by 1
% A(:,end) = A(:,end-1); % another way to do it
b = rand(m,q);
x = A\b
Warning: Rank deficient, rank = 4, tol = 2.371612e-15.
x =
0.2542 0.6945 0.0126
-1.0880 -0.7025 -0.3020
0.0745 0.2533 0.4463
0 0 0
0.9049 0.4365 0.5911
0 0 0
This works for any number of rows, so that A can be wider than tall or taller than wide, with one exception. When m = 6 and A is square, you get the nan deal:
Warning: Matrix is singular to working precision.
x1 =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
I guess the considerations are different when A is square, but I think it's interesting that backslash declines to come up with an answer in the square case but is fine doing so when A is rectangular.
Hi David,
As I read mldivide, \, if A is tall or wide, the solver jumps immediately to a QR solver. If A is square, the algorithm has several decision points, but I think for my particular problem, and your example, it would end at LU. So I guess that's the difference (or at least a difference) that you observed in your example.
I tried to use qr to solve my square problem, but w/o success (I don't know much, if anything, about qr, lu, etc.)
In my case where b is a vector, maybe it would be feasible for mldivide to do a rank test to determine that the equations are consistent and then return pinv(A)*b (maybe there are better options)?
However, another issue to consider would be when B is a matrix in A*x = B. In this case, the equations could be consistent for some columns of B but not for others, which, I suppose, would complicate any additional logic that could be added to mldivide to address this issue (square, consistent system).
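If such logic were ever added, a per-column consistency test might look something like this sketch (rank-based, hence not cheap; the fallback behavior here is made up for illustration):

```matlab
% Hypothetical per-column handling for A*x = B (not what mldivide does):
% solve only the columns of B that are consistent with A, NaN-fill the rest.
A = magic(4);                       % rank 3, singular
B = [A(:,1), rand(4,1)];            % column 1 consistent; column 2 almost surely not
r = rank(A);
X = nan(size(A,2), size(B,2));
for j = 1:size(B,2)
    if rank([A, B(:,j)]) == r       % is B(:,j) in the column space of A?
        X(:,j) = pinv(A) * B(:,j);  % minimum-norm solution for that column
    end
end
```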
IMO, the doc page should be updated to say:
"If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is usually a solution to the equation A*x = B, if it exists. See blah, blah for exceptions."
Do you think it would be useful to be able to call the functional form of mldivide with a second output that would, at a minimum, indicate which algorithm was ultimately selected?
[x,alg] = mldivide(A,b)
might be useful for debugging.
mldivide follows different paths based on the shape of A, and when A is square, it defaults to LU, which breaks down for singular matrices. When A is rectangular, QR comes into play, which is why it sometimes works in rank-deficient cases.
Trying qr manually won’t help much since QR alone doesn’t solve the system—it just decomposes A. But your idea of mldivide checking for rank and falling back to pinv(A) * b when the system is consistent could work in some cases. The challenge is handling A*x = B when some columns of B are consistent and others aren’t. MATLAB would have to decide on a per-column basis, which might make the implementation more complex.
I agree that the doc could be clearer—saying "usually a solution" instead of an absolute statement would better reflect edge cases like this. And having an option like [x, alg] = mldivide(A, b) to return the solver used would be useful, especially when dealing with singular or ill-conditioned matrices.
I agree with most of what you said. But I disagree that MLDIVIDE should check for rank, and if that suggests a singular system, should default to the PINV solution. The first reason I see is PINV will sometimes be highly computationally intensive. And as you point out, if some columns of b live in the column space of A (and so are consistent) then the use of PINV could be confusing. Even worse is the case where a column of b lives almost in the column space of A.
A = magic(4) % rank = 3
A = 4×4
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
b = A(:,1)
b = 4×1
    16
     5
     9
     4
A\b % backslash actually gets it right, but pinv will certainly give some other result
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.306145e-17.
ans = 4×1
     1
     0
     0
     0
pinv(A)*b
ans = 4×1
    0.9500
   -0.1500
    0.1500
    0.0500
b2 = [A(:,1), A(:,1) + 1e-15*null(A)]
b2 = 4×2
   16.0000   16.0000
    5.0000    5.0000
    9.0000    9.0000
    4.0000    4.0000
The columns of B are almost identical, but the second column lies just slightly out of the column space of A. Should MLDIVIDE decide to use PINV on the second column, but not the first? UGH.
I very much agree the documentation could be made more clear.
I suspect you wanted to proceed on to a demonstration with b2 ?
A = magic(4) % rank = 3
A = 4×4
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
b2 = [A(:,1), A(:,1) + 1e-15*null(A)]
b2 = 4×2
   16.0000   16.0000
    5.0000    5.0000
    9.0000    9.0000
    4.0000    4.0000
A\b2
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.306145e-17.
ans = 4×2
    1.0000    0.7500
         0   -0.7500
         0    0.7500
         0    0.2500
pinv(A)*b2
ans = 4×2
    0.9500    0.9500
   -0.1500   -0.1500
    0.1500    0.1500
    0.0500    0.0500
Oh, yes. Forgot that, :)
Anyway, I honestly think the real answer should be that when backslash sees a singular matrix, it should issue a fluorescent warning, of some different color. It should say something to the effect of DON'T TRUST ANYTHING YOU SEE FROM THIS COMPUTATION!
As @David Goodmanson pointed out, at least for the case where b is a vector it might be a bit jarring from a user perspective that mldivide, \ returns a correct result for the cases where A is short or wide but not square, when all three have an exact solution.
load lindata
A = double(A); b = double(b);
x = [A(1:end-1,:)\b(1:end-1), A\b , [A;A(end,:)]\[b;b(end)]]
Warning: Rank deficient, rank = 6, tol = 1.902859e-14.
Warning: Matrix is singular to working precision.
Warning: Rank deficient, rank = 6, tol = 2.362767e-14.
x = 12×3
   -0.0000       NaN         0
         0       NaN         0
    2.0000       NaN   -0.0000
         0       NaN    0.0000
    0.0000       NaN   -0.0000
   -0.0000       NaN    2.0000
         0       NaN   -0.0000
   -0.0000       NaN         0
         0       NaN    2.0000
         0       NaN         0
format short e
A*x-b
ans = 12×3
  -7.0448e-16          NaN  -1.5958e-16
   1.7430e-16          NaN  -1.5619e-15
  -8.8818e-16          NaN            0
   4.4409e-16          NaN  -1.7764e-15
   6.6613e-16          NaN  -4.4409e-16
   4.9304e-32          NaN  -5.9165e-31
   8.8818e-16          NaN            0
   8.8818e-16          NaN   8.6807e-16
  -1.7764e-15          NaN   1.7764e-15
  -2.0142e-16          NaN   1.7764e-15
As was noted previously, if b is a matrix then using different algorithms for different columns of b would be problematic. But if b is vector, would it be problematic for the logic in mldivide to include something like
if (isvector(b) && rank(A) ~= height(A) && rank(A)==rank([A,b]))
x = pinv(A)*b;
end
x
x = 12×1
   3.1919e-16
  -5.2882e-17
   1.0000e+00
  -1.6653e-16
   6.5080e-17
   1.0000e+00
   1.3878e-17
  -2.6105e-16
   1.0000e+00
   1.1102e-16
A*x - b
ans = 12×1
   2.7756e-16
  -6.9725e-16
  -1.3323e-15
  -4.1078e-15
   4.4409e-16
  -4.1633e-17
   8.8818e-15
  -7.5495e-15
   3.5527e-15
   5.3291e-15
A = magic(4); b = A(:,1);
if (isvector(b) && rank(A) ~= height(A) && rank(A)==rank([A,b]))
x = pinv(A)*b;
end
x
x = 4×1
   9.5000e-01
  -1.5000e-01
   1.5000e-01
   5.0000e-02
A*x-b
ans = 4×1
   7.1054e-15
   3.5527e-15
   5.3291e-15
   3.5527e-15
Under what circumstances is pinv highly computationally intensive?
What are your thoughts on adding a second output to the functional form of mldivide, as described at the end of this comment, to get feedback on which algorithm mldivide actually used?
Pinv becomes computationally intensive when dealing with large or nearly singular matrices because it relies on the singular value decomposition (SVD), which has a significantly higher computational cost compared to the LU or QR decompositions used in mldivide for full-rank matrices. In cases where A is large or ill-conditioned, the SVD (and thus pinv) can be very slow.
Regarding the idea of adding a second output to mldivide that indicates which algorithm was used (e.g., LU, QR, or a fallback to pinv), it's an interesting suggestion for debugging and transparency. However, mldivide is designed to be a simple, one-line operator that hides its internal decision-making process. Changing its interface to return diagnostic information would complicate its use and could introduce backward compatibility issues. A better approach might be to offer a separate diagnostic function that reports on the condition of A and details the decomposition strategy chosen, without altering the behavior of mldivide itself.
Under what circumstances is pinv computationally intensive? Large arrays.
A = rand(2000);timeit(@() lu(A))
ans =
0.026602
A = rand(2000);timeit(@() svd(A,0))
ans =
0.84203
2Kx2K is not even what I would call large in this context these days.
Would users freak out, if mldivide started to take 30 or 40 times as much time to return a result? On a problem where the result is not even unique?
Instead, just return a warning, that says, IF they understand what the warning means, that this result is suspect.
Should MLDIVIDE return the specific algorithm used, IF asked? I think more information is never a bad thing. And only someone who understands what that information means will probably ever ask that "question", at least in this case. The only issue then is how they would indicate that information. A simple algorithm name {"QR", "LU", "CHOL"} might be insufficient for some, but they could always read the help docs to expand the meaning.

Pinv relies on computing the singular value decomposition (SVD), which has cubic complexity and can be significantly slower than LU or QR for large matrices. For a 2000×2000 matrix, LU takes roughly 0.03 seconds while SVD takes about 0.84 seconds, about 30 times slower. If mldivide were modified to return a solution via pinv for square, singular matrices, users might indeed notice a substantial performance penalty, especially on larger problems.

In essence, mldivide is designed to return a solution only when it can do so reliably using its standard LU (or QR, for non-square cases) methods. When the matrix is singular and the system has infinitely many solutions, returning a particular solution via pinv (which computes the minimum-norm solution) could be computationally intensive and might not be what a typical user expects, given that the solution isn’t unique.

Regarding returning diagnostic information about which algorithm was used, that could be useful for advanced users. However, mldivide’s simplicity as a one-line operator is one of its strengths, and changing its interface (for example, adding a second output to indicate the algorithm) could break backward compatibility or complicate usage for the majority of users. A more balanced approach might be to provide a separate diagnostic function that inspects the matrix condition and details the decomposition strategy used internally without altering mldivide’s behavior.
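As a sketch of what such a separate diagnostic helper could look like today, the existing decomposition object already exposes the algorithm choice (the function name here is made up):

```matlab
% Hypothetical wrapper built on decomposition (solveWithInfo is a made-up name).
function [x, alg] = solveWithInfo(A, b)
    dA = decomposition(A);  % chooses a solver family much like mldivide does
    alg = dA.Type;          % e.g. 'lu', 'qr', 'chol', 'triangular'
    x = dA \ b;             % solve using the stored factorization
end
```

For example, `[x, alg] = solveWithInfo(magic(3), ones(3,1))` should report `'lu'` for a general square matrix.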

The vast majority of mldivide use is through the \ operator, which is a syntax that does not permit returning multiple outputs. In order for it to be used in a syntax that permitted returning multiple outputs, it would pretty much have to be invoked as
[Result, ExtraInfo] = mldivide(A,b);
though I guess it would also work to invoke
[Result, ExtraInfo] = A\b;
Currently that syntax generates an error about "Too many output arguments."
So there cannot be any present use of returning multiple outputs from \ or mldivide() that does not generate an error message. Accordingly, the only possible interference with backwards compatibility from adding a second output argument would be against people deliberately writing
try; [Result, ExtraInfo] = A\b; report_that_forbidden_event_has_occurred(); end
the number of which is surely vanishingly small.


Version: R2024b