Should mldivide Return a Solution for Square, Rank-Deficient, but Consistent Set of Linear Equations?

load lindata
A and b define a linear system of equations Ax = b.
A is 12 x 12 with rank
rank(A)
ans = 6
But the linear equations are consistent with an infinite number of solutions. One such solution is
x1 = repmat([0;0;1],4,1);
Another solution can be found with the sym version of mldivide
x2 = A\b;
Warning: Solution is not unique because the system is rank-deficient.
Verify
A*[x1,x2] - b
ans = 
But the numeric version of mldivide fails
x3 = double(A)\double(b)
Warning: Matrix is singular to working precision.
x3 = 12×1
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
   NaN
I guess this isn't terribly surprising. But the doc page mldivide states:
"
  • If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is a solution to the equation A*x = B, if it exists."
My interpretation of that last clause is that it would be properly read as "if a solution exists."
Here, many solutions exist but mldivide didn't find a solution. Is this an oversight on the doc page, or am I misreading it, or would it be reasonable to expect mldivide to return a solution for this sort of problem?

Accepted Answer

Hi Paul!
You can use
x = linsolve(A, b, struct('RECT', true));
to use the QR method of mldivide even if A is square.
To get the minimum-norm solution of x, instead of just any solution x like mldivide does, you can use
x = lsqminnorm(A, b);
To see what algorithm is chosen by mldivide, you can call
dA = decomposition(A);
and check the Type property of the returned object.

3 Comments

Hi Christine,
Thanks for responding.
load lindata
A = double(A); b = double(b);
Good to know about that feature of decomposition
dA = decomposition(A);
dA.Type
ans = 'lu'
Trying linsolve as recommended does yield a correct solution.
x = linsolve(A, b, struct('RECT', true));
Warning: Rank deficient, rank = 6, tol = 1.902859e-14.
norm(A*x -b)
ans = 3.6605e-15
What is the difference between the QR approach in linsolve and naively using QR as
[Q,R] = qr(A);
x = R\(Q\b);
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 4.182874e-34.
norm(A*x-b)
ans = 205.1666
One minor nit about the doc page at linsolve. That page says that RECT is to be used when A has a different number of rows than columns. Later on the page it states "If A does not have the properties that you specify in opts, then linsolve returns incorrect results." Here we have a square matrix, which is not RECT, but linsolve still returned a correct result.
Do you think the statement cited in my question from the doc page at mldivide, \ should be softened a bit to indicate that mldivide might not return a solution in some cases, even if a solution exists?
Finally, do you have any thoughts to share on the pros/cons of updating mldivide so it does return a correct result in a case such as we are discussing here (A square; A rank-deficient; A,b consistent)? Would it be too expensive to determine if the equations are consistent? (I assume that LU is better than QR for the square case in general.)
The QR approach in linsolve and mldivide (they are the same) is to do a permuted QR decomposition, estimate the rank r of A based on the diagonal of the R matrix, and then truncate the system to only use the first r rows and columns of R. See the Answer here for details.
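For illustration, here is a rough sketch of that truncated, column-pivoted QR approach (not the actual linsolve internals; the tolerance formula below is a made-up stand-in for whatever linsolve actually uses):

```matlab
% Sketch of a rank-truncated, column-pivoted QR solve (illustrative only).
load lindata
A = double(A); b = double(b);
[Q, R, p] = qr(A, 'vector');            % column-pivoted QR: A(:,p) = Q*R
tol = max(size(A)) * eps(abs(R(1,1)));  % hypothetical rank tolerance
r = nnz(abs(diag(R)) > tol);            % estimated rank (6 for this A)
y = Q' * b;
x = zeros(size(A,2), 1);
x(p(1:r)) = R(1:r, 1:r) \ y(1:r);       % solve the truncated triangular system
norm(A*x - b)                           % small residual, since the system is consistent
```

The truncation is the key difference from the naive `x = R\(Q\b)`, which back-substitutes through the tiny diagonal entries of R and blows up.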
For linsolve, setting RECT has A treated like a rectangular matrix (meaning a QR decomposition is used), which is safe for a square matrix. If opts marks A as upper triangular but A is not, its lower triangle is simply assumed to be zero. The same is true for lower triangular and symmetric, where we just read one triangle of the matrix and assume the other is its conjugate transpose.
So the output is always "valid" in the sense that the operation is valid for how the input matrix is interpreted, but it could be confusing. I will let the documentation team know your suggestions.
For the statement in mldivide, perhaps that second bullet point should say "If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is a solution to the equation A*x = B, if A is well-conditioned.".
On having mldivide detect this case: Performance is part of the problem, particularly because a majority of calls to mldivide are from applications that always use square matrices. But more importantly, the QR algorithm we use relies on a tolerance for detecting the rank of a matrix, which is fine for a least-squares problem, but may not give the expected result otherwise.
For example, if A was a diagonal matrix with very different scaling on the diagonal, the smallest elements on the diagonal would be cut off, with those elements of the output getting set to zero:
format shortg
linsolve(diag(10.^(0:-5:-30)), ones(7, 1), struct('RECT', true))
Warning: Rank deficient, rank = 3, tol = 1.554312e-15.
ans = 7×1
            1
        1e+05
        1e+10
            0
            0
            0
            0
linsolve(diag(10.^(0:-5:-30)), ones(7, 1))
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.000000e-30.
ans = 7×1
            1
        1e+05
        1e+10
        1e+15
        1e+20
        1e+25
        1e+30
This is because the QR-based algorithm assumes that any change of A that is less than eps * norm(A) (more-or-less) is irrelevant, so these small diagonal elements should have no impact and should be removed to avoid potential Inf/NaN/overflow. But if we're not looking for a least-squares solution, but just for a direct solution, these elements have massive impact, because the smaller the element is, the larger its inverse.


More Answers (1)

Hey Paul,
Good question. mldivide (\) is supposed to find a solution when one exists, but when A is square and singular, MATLAB treats it as numerically unstable and returns NaNs instead of picking one solution.
The reason A\b works in sym but not in double is that the symbolic version does exact algebra, so it can return a valid solution even when there are infinitely many. The numeric version, on the other hand, works with floating-point numbers. Since A is singular, MATLAB can’t get a stable least-squares solution, so it just throws out NaNs.
If you still want a solution in double, you can use the pseudoinverse instead:
x3 = pinv(A) * b;
That should give you one possible solution.
As for the doc page, technically mldivide is doing what it says—it guarantees a solution when A is invertible or full-rank, but when it’s singular, it just gives a warning and returns NaNs instead of trying to pick one from infinitely many.
Follow me so you can message me anytime with future MATLAB questions. If this helps, please accept the answer as well.

13 Comments

Hi Jack,
I think the difference between sym and double has nothing to do with exact algebra or floating point precision. Rather, two different algorithms are involved where the former returns a solution and the latter does not.
For the numeric case, Matlab could return a solution. In fact, you showed what one solution could be, namely pinv(A)*b, which is x1 in the question. So the question is whether or not the algorithm for mldivide, \ could determine that a solution exists and then take appropriate action to return one, and, if mldivide could, would it make sense to do so.
Actually, the doc page is not doing what it says. The quoted statement doesn't make any exceptions for A being singular, but perhaps it should.

mldivide (the backslash operator) is designed to return a solution only when it can compute a stable, unique solution. When A is square but rank‐deficient, the system has infinitely many solutions, and the numeric algorithm used in double precision cannot reliably pick one without risking numerical instability. In such cases, MATLAB deliberately returns NaNs rather than attempting to choose one solution arbitrarily. By contrast, the symbolic version performs exact algebra and can return one of the many possible solutions. If you need a solution in double precision for a consistent, but rank‐deficient system, you can use the pseudoinverse (pinv) as a workaround (for example, x = pinv(A)*b).

Hi Paul,
consider the following situation for Ax = b
% A (m x n), x (n x q), b (m x q)
m = 5; % m = 2,3, ...
n = 6;
q = 3;
A = rand(m,n);
A(end,:) = A(end-1,:) % make A rank deficient by 1
% A(:,end) = A(:,end-1); % another way to do it
b = rand(m,q);
x = A\b
Warning: Rank deficient, rank = 4, tol = 2.371612e-15.
x =
0.2542 0.6945 0.0126
-1.0880 -0.7025 -0.3020
0.0745 0.2533 0.4463
0 0 0
0.9049 0.4365 0.5911
0 0 0
This works for any number of rows, so that A can be wider than tall or taller than wide, with one exception. When m = 6 and A is square, you get the nan deal:
Warning: Matrix is singular to working precision.
x1 =
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
NaN NaN NaN
I guess the considerations are different when A is square, but I think it's interesting that backslash declines to come up with an answer in the square case but is fine doing so when A is rectangular.
Hi David,
As I read mldivide, \, if A is tall or wide, the solver jumps immediately to a QR solver. If A is square, the algorithm has several decision points, but I think for my particular problem, and your example, it would end at LU. So I guess that's the difference (or at least a difference) that you observed in your example.
I tried to use qr to solve my square problem, but w/o success (I don't know much, if anything, about qr, lu, etc.)
In my case where b is a vector, maybe it would be feasible for mldivide to do a rank test to determine that the equations are consistent and then return pinv(A)*b (maybe there are better options)?
However, another issue to consider would be when B is a matrix in A*x = B. In this case, the equations could be consistent for some columns of B but not for others, which, I suppose, would complicate any additional logic that could be added to mldivide to address this issue (square, consistent system).
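If such logic were ever added, a per-column consistency test might look something like this sketch (rank-based, hence not cheap; the fallback behavior here is made up for illustration):

```matlab
% Hypothetical per-column handling for A*x = B (not what mldivide does):
% solve only the columns of B that are consistent with A, NaN-fill the rest.
A = magic(4);                       % rank 3, singular
B = [A(:,1), rand(4,1)];            % column 1 consistent; column 2 almost surely not
r = rank(A);
X = nan(size(A,2), size(B,2));
for j = 1:size(B,2)
    if rank([A, B(:,j)]) == r       % is B(:,j) in the column space of A?
        X(:,j) = pinv(A) * B(:,j);  % minimum-norm solution for that column
    end
end
```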
IMO, the doc page should be updated to say:
"If A is a square n-by-n matrix and B is a matrix with n rows, then x = A\B is usually a solution to the equation A*x = B, if it exists. See blah, blah for exceptions."
Do you think it would be useful to be able to call the functional form of mldivide with a second output that would, at a minimum, indicate which algorithm was ultimately selected?
[x,alg] = mldivide(A,b)
might be useful for debugging.
mldivide follows different paths based on the shape of A, and when A is square, it defaults to LU, which breaks down for singular matrices. When A is rectangular, QR comes into play, which is why it sometimes works in rank-deficient cases.
Trying qr manually won’t help much since QR alone doesn’t solve the system—it just decomposes A. But your idea of mldivide checking for rank and falling back to pinv(A) * b when the system is consistent could work in some cases. The challenge is handling A*x = B when some columns of B are consistent and others aren’t. MATLAB would have to decide on a per-column basis, which might make the implementation more complex.
I agree that the doc could be clearer—saying "usually a solution" instead of an absolute statement would better reflect edge cases like this. And having an option like [x, alg] = mldivide(A, b) to return the solver used would be useful, especially when dealing with singular or ill-conditioned matrices.
I agree with most of what you said. But I disagree that MLDIVIDE should check for rank, and if that suggests a singular system, should default to the PINV solution. The first reason I see is PINV will sometimes be highly computationally intensive. And as you point out, if some columns of b live in the column space of A (and so are consistent) then the use of PINV could be confusing. Even worse is the case where a column of b lives almost in the column space of A.
A = magic(4) % rank = 3
A = 4×4
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
b = A(:,1)
b = 4×1
    16
     5
     9
     4
A\b % backslash actually gets it right, but pinv will certainly give some other result
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.306145e-17.
ans = 4×1
     1
     0
     0
     0
pinv(A)*b
ans = 4×1
    0.9500
   -0.1500
    0.1500
    0.0500
b2 = [A(:,1), A(:,1) + 1e-15*null(A)]
b2 = 4×2
   16.0000   16.0000
    5.0000    5.0000
    9.0000    9.0000
    4.0000    4.0000
The columns of B are almost identical, but the second column lies just slightly out of the column space of A. Should MLDIVIDE decide to use PINV on the second column, but not the first? UGH.
I very much agree the documentation could be made more clear.
I suspect you wanted to proceed on to a demonstration with b2 ?
A = magic(4) % rank = 3
A = 4×4
    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1
b2 = [A(:,1), A(:,1) + 1e-15*null(A)]
b2 = 4×2
   16.0000   16.0000
    5.0000    5.0000
    9.0000    9.0000
    4.0000    4.0000
A\b2
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 1.306145e-17.
ans = 4×2
    1.0000    0.7500
         0   -0.7500
         0    0.7500
         0    0.2500
pinv(A)*b2
ans = 4×2
    0.9500    0.9500
   -0.1500   -0.1500
    0.1500    0.1500
    0.0500    0.0500
Oh, yes. Forgot that, :)
Anyway, I honestly think the real answer should be that when backslash sees a singular matrix, it should issue a fluorescent warning, of some different color. It should say something to the effect of DON'T TRUST ANYTHING YOU SEE FROM THIS COMPUTATION!
As @David Goodmanson pointed out, at least for the case where b is a vector it might be a bit jarring from a user perspective that mldivide, \ returns a correct result for the cases where A is short or wide but not square, when all three have an exact solution.
load lindata
A = double(A); b = double(b);
x = [A(1:end-1,:)\b(1:end-1), A\b , [A;A(end,:)]\[b;b(end)]]
Warning: Rank deficient, rank = 6, tol = 1.902859e-14.
Warning: Matrix is singular to working precision.
Warning: Rank deficient, rank = 6, tol = 2.362767e-14.
x = 12×3
   -0.0000       NaN         0
         0       NaN         0
    2.0000       NaN   -0.0000
         0       NaN    0.0000
    0.0000       NaN   -0.0000
   -0.0000       NaN    2.0000
         0       NaN   -0.0000
   -0.0000       NaN         0
         0       NaN    2.0000
         0       NaN         0
format short e
A*x-b
ans = 12×3
  -7.0448e-16          NaN  -1.5958e-16
   1.7430e-16          NaN  -1.5619e-15
  -8.8818e-16          NaN            0
   4.4409e-16          NaN  -1.7764e-15
   6.6613e-16          NaN  -4.4409e-16
   4.9304e-32          NaN  -5.9165e-31
   8.8818e-16          NaN            0
   8.8818e-16          NaN   8.6807e-16
  -1.7764e-15          NaN   1.7764e-15
  -2.0142e-16          NaN   1.7764e-15
As was noted previously, if b is a matrix then using different algorithms for different columns of b would be problematic. But if b is vector, would it be problematic for the logic in mldivide to include something like
if (isvector(b) && rank(A) ~= height(A) && rank(A)==rank([A,b]))
x = pinv(A)*b;
end
x
x = 12×1
   3.1919e-16
  -5.2882e-17
   1.0000e+00
  -1.6653e-16
   6.5080e-17
   1.0000e+00
   1.3878e-17
  -2.6105e-16
   1.0000e+00
   1.1102e-16
A*x - b
ans = 12×1
   2.7756e-16
  -6.9725e-16
  -1.3323e-15
  -4.1078e-15
   4.4409e-16
  -4.1633e-17
   8.8818e-15
  -7.5495e-15
   3.5527e-15
   5.3291e-15
A = magic(4); b = A(:,1);
if (isvector(b) && rank(A) ~= height(A) && rank(A)==rank([A,b]))
x = pinv(A)*b;
end
x
x = 4×1
   9.5000e-01
  -1.5000e-01
   1.5000e-01
   5.0000e-02
A*x-b
ans = 4×1
   7.1054e-15
   3.5527e-15
   5.3291e-15
   3.5527e-15
Under what circumstances is pinv highly computationally intensive?
What are your thoughts on adding a second output to the functional form of mldivide, as described at the end of this comment, to get feedback on which algorithm mldivide actually used?
Pinv becomes computationally intensive when dealing with large or nearly singular matrices because it relies on the singular value decomposition (SVD), which has a significantly higher computational cost compared to the LU or QR decompositions used in mldivide for full-rank matrices. In cases where A is large or ill-conditioned, the SVD (and thus pinv) can be very slow.
Regarding the idea of adding a second output to mldivide that indicates which algorithm was used (e.g., LU, QR, or a fallback to pinv), it's an interesting suggestion for debugging and transparency. However, mldivide is designed to be a simple, one-line operator that hides its internal decision-making process. Changing its interface to return diagnostic information would complicate its use and could introduce backward compatibility issues. A better approach might be to offer a separate diagnostic function that reports on the condition of A and details the decomposition strategy chosen, without altering the behavior of mldivide itself.
Under what circumstances is pinv computationally intensive? Large arrays.
A = rand(2000);timeit(@() lu(A))
ans =
0.026602
A = rand(2000);timeit(@() svd(A,0))
ans =
0.84203
2Kx2K is not even what I would call large in this context these days.
Would users freak out, if mldivide started to take 30 or 40 times as much time to return a result? On a problem where the result is not even unique?
Instead, just return a warning, that says, IF they understand what the warning means, that this result is suspect.
Should MLDIVIDE return the specific algorithm used, IF asked? I think more information is never a bad thing. And only someone who understands what that information means will probably ever ask that "question", at least in this case. The only issue then is how they would indicate that information. A simple algorithm name {"QR", "LU", "CHOL"} might be insufficient for some, but they could always read the help docs to expand the meaning.

Pinv relies on computing the singular value decomposition (SVD), which has cubic complexity and can be significantly slower than LU or QR for large matrices. For a 2000×2000 matrix, LU takes roughly 0.03 seconds while SVD takes about 0.84 seconds, about 30 times slower. If mldivide were modified to return a solution via pinv for square, singular matrices, users might indeed notice a substantial performance penalty, especially on larger problems.

In essence, mldivide is designed to return a solution only when it can do so reliably using its standard LU (or QR, for non-square cases) methods. When the matrix is singular and the system has infinitely many solutions, returning a particular solution via pinv (which computes the minimum-norm solution) could be computationally intensive and might not be what a typical user expects, given that the solution isn’t unique.

Regarding returning diagnostic information about which algorithm was used, that could be useful for advanced users. However, mldivide’s simplicity as a one-line operator is one of its strengths, and changing its interface (for example, adding a second output to indicate the algorithm) could break backward compatibility or complicate usage for the majority of users. A more balanced approach might be to provide a separate diagnostic function that inspects the matrix condition and details the decomposition strategy used internally without altering mldivide’s behavior.
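As a sketch of what such a separate diagnostic helper could look like today, the existing decomposition object already exposes the algorithm choice (the function name here is made up):

```matlab
% Hypothetical wrapper built on decomposition (solveWithInfo is a made-up name).
function [x, alg] = solveWithInfo(A, b)
    dA = decomposition(A);  % chooses a solver family much like mldivide does
    alg = dA.Type;          % e.g. 'lu', 'qr', 'chol', 'triangular'
    x = dA \ b;             % solve using the stored factorization
end
```

For example, `[x, alg] = solveWithInfo(magic(3), ones(3,1))` should report `'lu'` for a general square matrix.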

The vast majority of mldivide use is through the \ operator, which is a syntax that does not permit returning multiple outputs. In order for it to be used in a syntax that permitted returning multiple outputs, it would pretty much have to be invoked as
[Result, ExtraInfo] = mldivide(A,b);
though I guess it would also work to invoke
[Result, ExtraInfo] = A\b;
Currently that syntax generates an error about "Too many output arguments."
So there cannot be any present use of returning multiple outputs from \ or mldivide() that does not generate an error message. Accordingly, the only possible interference with backwards compatibility from adding a second output argument would be against people deliberately writing
try; [Result, ExtraInfo] = A\b; report_that_forbidden_event_has_occurred(); end
the number of which is surely vanishingly small.


Version: R2024b