This tool repairs your covariance matrix, turning it into one that really does have the property you need: positive definiteness. When you try to use a covariance matrix in a tool like mvnrnd, a matrix that is not positive definite makes no sense, so mvnrnd will fail in that case.
But sometimes users end up with matrices that are NOT symmetric and positive definite (commonly abbreviated as SPD) and still wish to use them to generate random numbers in a tool like mvnrnd. A solution is to find the NEAREST matrix (the one minimizing the Frobenius norm of the difference) that has the desired property of being SPD.
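For readers who want to poke at this failure mode outside MATLAB, here is a minimal sketch in NumPy; np.linalg.cholesky plays the role of chol, and the 3-by-3 "covariance" below is a made-up example, not data from this page:

```python
import numpy as np

# A symmetric "covariance" matrix that is NOT positive definite:
C = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])
print(np.linalg.eigvalsh(C).min())   # negative: C is indefinite
try:
    np.linalg.cholesky(C)            # the same test mvnrnd relies on
except np.linalg.LinAlgError:
    print("chol fails: matrix is not positive definite")
```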
I see the question come up every once in a while, so I looked on the File Exchange to see what is up there. All I found was nearest_posdef. While it usually almost works, it could be better: it actually failed completely on most of my test cases, and because it relies on an optimization it was not as fast as I would like. In the comments to nearest_posdef, a logical alternative was posed. That alternative has its failures too, so I wrote nearestSPD, based partly on what I found in the work of Nick Higham.
nearestSPD works on any matrix, and it is reasonably fast. As a test, consider a matrix generated by randn, which in general is neither symmetric nor anywhere close to positive definite.
U = randn(100);
nearestSPD will be able to convert U into something that is indeed SPD, and for a 100 by 100 matrix, do it quickly enough.
tic,Uj = nearestSPD(U);toc
Elapsed time is 0.008964 seconds.
The ultimate test of course, is to use chol. If chol returns a second argument that is zero, then MATLAB (and mvnrnd) will be happy!
[R,p] = chol(Uj);
p
p =
0
1.1.0.0: Documentation fix
Inspired by: Nearest positive semi-definite covariance matrix
Inspired: Control Functionals, iahncajigas/nSTAT
Umberto Picchini
Life saving tool!
May I also recommend another tool from the wonderful Nick Higham (Higham's 1988 research paper has inspired the present nearestSPD). See for example modchol_ldlt.m in https://github.com/higham/modified-cholesky which is based on the 1998 paper by Cheng and Higham https://epubs.siam.org/doi/10.1137/S0895479896302898
The file above can be used almost like nearestSPD, except that we need to add one further line of code. Also, while nearestSPD does not require the input to be a symmetric matrix, modchol_ldlt does have that requirement. However, in some of my applications nearestSPD was unable to produce a result (the computation was somehow trapped inside a "while" loop); that problem did not appear with modchol_ldlt.
Furthermore, it seems that modchol_ldlt finds a matrix that is closer to the original one, at least in my experiments.
Here is an example. We first start with nearestSPD.
>> U=[1.0000 0.7426 0.1601 -0.7000 0.5500;
0.7426 1.0000 -0.2133 -0.5818 0.5000;
0.1601 -0.2133 1.0000 -0.1121 0.1000;
-0.7000 -0.5818 -0.1121 1.0000 0.4500;
0.5500 0.5000 0.1000 0.4500 1.0000];
>> chol(U)
Error using chol
Matrix must be positive definite.
% Ok, so let's use nearestSPD:
>> Uj=nearestSPD(U);
>> [~,p]=chol(Uj);
>> p
p =
0 % Great! It works.
>> Uj
Uj =
1.0358 0.7665 0.1683 -0.6487 0.5032
0.7665 1.0159 -0.2078 -0.5476 0.4688
0.1683 -0.2078 1.0019 -0.1003 0.0893
-0.6487 -0.5476 -0.1003 1.0734 0.3831
0.5032 0.4688 0.0893 0.3831 1.0610
% let's try now with modchol_ldlt:
[L, DMC, P] = modchol_ldlt(U);
U_higham = P'*L*DMC*L'*P; % this is the matrix we want
>> [~,p]=chol(U_higham);
>> p
p =
0 % Great! This one also works.
%Now let's see how Uj and U_higham differ from the original U:
>> U-Uj
ans =
-0.0358 -0.0239 -0.0082 -0.0513 0.0468
-0.0239 -0.0159 -0.0055 -0.0342 0.0312
-0.0082 -0.0055 -0.0019 -0.0118 0.0107
-0.0513 -0.0342 -0.0118 -0.0734 0.0669
0.0468 0.0312 0.0107 0.0669 -0.0610
>> U-U_higham
ans =
0 0 0 0 0
0 0 0.0000 0 0
0 0.0000 0 0.0000 -0.0000
0 0 0.0000 -0.5865 0
0 0 -0.0000 0.0000 0.0000
So, compared to nearestSPD, here modchol_ldlt somehow perturbs only one of the original matrix entries (though not by a small amount), while still obtaining a desirable result.
The above is of course not a systematic comparison, and I am no expert in this specific research area, so I recommend reading the original papers to form your own idea. As I said, in my specific case modchol_ldlt helped me deal with an infinite loop.
Klenilmar Dias
Hi John D'Errico,
Excellent code!!!
But what is the explanation for the nearestSPD tool being so fast?
Could you explain?
Best regards, Klenilmar
GTA
Dear usdb usdb1, the function you want is ready; just download it from the link at the upper right of this page.
usdb usdb1
Please, which version of MATLAB contains the function nearestSPD?
John Smith
Per this https://math.stackexchange.com/a/332465/580706 answer on Stack Exchange: if the matrix's eigenvalues are still negative after applying Higham's algorithm, wouldn't it be better just to apply the method in that answer instead of the loop that tries to gradually increase the diagonal?
First, no loop is required in most cases.
Second, the nudge is informative/selective (scaled by the most negative eigenvalue) rather than blind.
John Smith
Sometimes nearestSPD works too well. I have a matrix to which I applied nearestSPD; it passes the chol() test, so all of its eigenvalues should be greater than zero. But when I take logm() of it, logm() fails. logm() uses the schur() decomposition to get the eigenvalues.
Can anything be done to resolve this?
Jesse
Hi John,
I had a read of this code and it's quite a nice little piece of work, well put together. Upon reading some of the comments, I came to see why it might be taking so long for some people, and the reason I found is your use of min(eig). In certain pathological cases, such as when an eigenvalue is exactly equal to zero (consider x = [1,2,0; 3,4,0; 0,0,0]), the algorithm fails: it will repeatedly try to add zero to fix the matrix, and so never terminates. I suggest using max(tol, min(eig)) instead of min(eig), where tol is some arbitrarily small number, like 10^-12 times the maximum singular value or some other property of the matrix. Cheers, Jesse.
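Jesse's point can be illustrated with a toy version of such a nudge loop, sketched here in NumPy. This is not the shipped nearestSPD code; the function name, the tolerance choice, and the k^2 scaling are all illustrative. Without the floor, a matrix whose smallest eigenvalue is exactly zero would receive a nudge of zero and the loop would spin forever.

```python
import numpy as np

def nudge_to_spd(A, max_iter=100):
    # Symmetrize first, then add multiples of the identity until chol succeeds.
    Ahat = (A + A.T) / 2
    # Jesse's floor: never nudge by zero, even if min(eig) is exactly 0.
    tol = 1e-12 * np.linalg.norm(Ahat, 2)
    n = A.shape[0]
    for k in range(1, max_iter + 1):
        try:
            np.linalg.cholesky(Ahat)        # succeeds only for (numerically) SPD
            return Ahat
        except np.linalg.LinAlgError:
            mineig = np.linalg.eigvalsh(Ahat).min()
            Ahat = Ahat + max(tol, -mineig) * k**2 * np.eye(n)
    raise RuntimeError("did not converge")

# The pathological case from the comment:
x = np.array([[1., 2., 0.], [3., 4., 0.], [0., 0., 0.]])
B = nudge_to_spd(x)
np.linalg.cholesky(B)     # now succeeds
```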
Mayssa
After running the code I still had negative eigenvalues; obviously, the chol test wasn't effective for my matrix. Instead, I tried this test: p = any(eig(Ahat) < 0); it worked much better.
Thanks for the code!
John D'Errico
Nil suggests that his algorithm is simpler. Yet, it will often fail.
A0 = rand(10);
B = (A0 + A0')/2;
[U,Sigma] = eig(B);
Ahat = U*max(Sigma,0)*U';
chol(Ahat)
Error using chol
Matrix must be positive definite.
Instead, the result from nearestSPD will survive the chol test with no error.
L = chol(nearestSPD(A0));
Nil Garcia
In my opinion the code seems unnecessarily lengthy. This should be enough:
B = (A + A')/2;
[U,Sigma] = eig(B);
Ahat = U*max(Sigma,0)*U';
820408
In my simulations, A is a covariance matrix that is not quite PSD because of floating-point precision. I used B = nearestPSD(A) to fix this problem, but eig(B) still gives me negative eigenvalues that are very close to zero. Can you please comment on why the EVD test is failing?
John D'Errico
Please send me an example case that has this problem.
Yi He
Thanks John. But when I run some 125-by-125 covariance matrices, the program stalls at 'mineig = min(eig(Ahat));' for a pretty long time (actually over 10 hours). What can I do about this issue?
Thanks again.
Meng
Thank you. Very useful!
John D'Errico
I actually nudge the matrix just a bit at the very end, if necessary. The code ensures that chol applied to the result will always yield a valid factorization, and that is essentially MATLAB's test for being truly SPD. So while the Higham algorithm only ensures a positive semi-definite result, if chol always works on that result, then it is indeed positive definite.
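The semi-definite versus definite distinction is easy to see in a small NumPy experiment (the 1e-10 nudge size below is purely illustrative, not the value nearestSPD computes):

```python
import numpy as np

# A matrix that is positive SEMIdefinite but singular, which is the kind
# of result the Higham projection alone can return:
C = np.array([[1.0, 1.0],
              [1.0, 1.0]])            # eigenvalues 2 and 0
try:
    np.linalg.cholesky(C)             # fails: semi-definite is not enough
except np.linalg.LinAlgError:
    print("chol rejects the PSD matrix")

# A tiny diagonal nudge restores strict positive definiteness:
C_pd = C + 1e-10 * np.eye(2)
L = np.linalg.cholesky(C_pd)          # succeeds
```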
Ioannis P
Could you please explain if this code is giving a positive definite or a semi-positive definite matrix? You have written the following:
"From Higham: "The nearest symmetric positive semidefinite matrix in the
Frobenius norm to an arbitrary real matrix A is shown to be (B + H)/2,
where H is the symmetric polar factor of B=(A + A')/2."
...
arguments: (input)
A - square matrix, which will be converted to the nearest Symmetric
Positive Definite Matrix."
Could you please clarify this? Thanks!
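For what it's worth, the formula quoted above can be sketched directly in NumPy. This is a sketch of the quoted Higham result, not the shipped code; the function name is made up, and the symmetric polar factor H is obtained here from the SVD of B:

```python
import numpy as np

def higham_psd(A):
    """Nearest symmetric positive SEMIdefinite matrix to A in the
    Frobenius norm, per the quoted formula: (B + H)/2, where
    B = (A + A')/2 and H is the symmetric polar factor of B.
    If B = U*S*V' is an SVD, then H = V*S*V'."""
    B = (A + A.T) / 2
    U, s, Vt = np.linalg.svd(B)
    H = Vt.T @ np.diag(s) @ Vt
    Ahat = (B + H) / 2
    return (Ahat + Ahat.T) / 2        # re-symmetrize against roundoff

# An indefinite symmetric example: the projection zeros its negative
# eigenvalue, so the result is only SEMIdefinite; a final nudge is what
# makes it strictly positive definite.
C = np.array([[1.0, 0.9, 0.2],
              [0.9, 1.0, 0.9],
              [0.2, 0.9, 1.0]])
print(np.linalg.eigvalsh(higham_psd(C)).min())   # ~0, up to roundoff
```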
Eric
Kudos to you, John, mostly for calling attention to Higham's paper. Trying to use the other files you mentioned was driving me crazy, because of their high probability of failure.
John D'Errico
Sorry about that. The Frobenius norm of the difference is what is minimized.
Petr Posik
Hi John. I miss in the description how the "nearness" of the two matrices, U and Uj, is measured. Could you comment on that? Thanks, Petr