Fastest way to compute J' * J, where J is sparse
I have a sparse rectangular matrix, J, for which I want to compute:
>> H = J' * J;
It's a bit slow (transpose is taking 5s and the matrix multiplication 9s), and given this is a special and very common case of a transpose and multiply, I was wondering if MATLAB had a faster way, e.g. one which avoids an explicit transpose.
15 Comments
John D'Errico
on 3 Nov 2014
I imagine there are tools to do this operation, but they are down at the BLAS level. I can see how it would be nice to have a tool that explicitly computes J'*J efficiently, taking advantage of the symmetry of the problem and avoiding the transpose.
and given this is a special and very common case of a transpose and multiply
It's common to encounter an expression like that, but less common to have to implement it directly. So, it makes me want to ask why you think you have to. For example, I often see it when people make the mistake of solving
J*x=y
using an explicit pseudo-inverse inv(J'*J)*J'*y as opposed to J\y.
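To make the comparison concrete, here is a minimal sketch on made-up data. The sizes, density, and variable names are assumptions for illustration only and are not taken from the original problem.

```matlab
% Hypothetical test problem (sizes and density chosen only for illustration).
rng(0);                                   % reproducible random data
m = 2000; n = 50;
J = sprandn(m, n, 0.05) + [speye(n); sparse(m-n, n)];  % sparse, tall, well conditioned
y = randn(m, 1);

x_backslash = J \ y;                      % least-squares solve, no J'*J formed
x_pinv      = inv(full(J'*J)) * (J'*y);   % explicit pseudo-inverse (the discouraged form)

% Both aim to minimize norm(J*x - y); compare solutions and residuals.
rel_diff = norm(x_backslash - x_pinv) / norm(x_backslash);
fprintf('relative difference: %g\n', rel_diff);
fprintf('residual norms: %g vs %g\n', norm(J*x_backslash - y), norm(J*x_pinv - y));
```

On a well-conditioned problem like this one the two answers should agree closely. The explicit form is expected to lose accuracy as J becomes ill conditioned.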
John D'Errico
on 3 Nov 2014
Matt is right that it is often a bad idea if you are using it for the wrong reason. However, one can think of cases where J'*J is useful, even necessary. Even then, however, there might be alternatives. For example, to compute a covariance matrix, you could generate inv(J'*J). But you could also form a QR factorization of J, a Q-less QR in fact. Then
J'*J = R'*R
And since R is triangular, that inverse can be computed far more efficiently.
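The identity follows in one line from the orthogonality of Q. The derivation below is a sketch that assumes J has full column rank, so that R is invertible:

```latex
\begin{aligned}
J &= QR, \qquad Q^{\top}Q = I, \\
J^{\top}J &= R^{\top}Q^{\top}Q\,R = R^{\top}R, \\
(J^{\top}J)^{-1} &= R^{-1}R^{-\top}.
\end{aligned}
```

So the covariance-type inverse only needs triangular solves with R, and Q is never required.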
Mohammad Abouali
on 3 Nov 2014
Between J\y and inv(J'*J)*J'*y, J\y is definitely better. But an alternative is (J'*J) \ (J'*y). This last form actually sometimes helps the results, particularly if J is MxN and M>>N and the condition number is not that good. Note that mldivide(A,B) (or A\B) uses the QR solver only if A is not a square matrix. However, if A is a square matrix there are many more options, and the solver is selected much more carefully. So sometimes it helps to use (J'*J) \ (J'*y) in order to deal with a square matrix. Refer to:
go down to the bottom of the page and check the flowchart under the algorithm.
Oliver Woodford
on 3 Nov 2014
Torsten
on 4 Nov 2014
Maybe J does not have full rank N and there are infinitely many solutions that minimize |J*x-y|_2 .
If this is not the case, I cannot find a reason why A=J\y should give a solution different from B=(J'*J)\(J'*y).
Best wishes
Torsten.
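The reasoning behind this can be written out as the normal equations. This is a sketch in exact arithmetic; it says nothing about rounding error:

```latex
\begin{aligned}
f(x) &= \tfrac{1}{2}\,\lVert Jx - y \rVert_2^2, \\
\nabla f(x) &= J^{\top}(Jx - y) = 0
\;\Longrightarrow\; J^{\top}J\,x = J^{\top}y .
\end{aligned}
```

If J has full column rank N, then J'*J is symmetric positive definite and this system has exactly one solution, which is the unique least-squares minimizer. In floating point the two computed answers can still differ, by an amount that grows with the conditioning of the matrix actually factorized.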
Oliver Woodford
on 4 Nov 2014
This last form actually sometimes helps the results, particularly if J is MxN and M>>N and the condition number is not that good.
@Mohammad, did you instead mean to say "if the condition number is not that bad"? The condition number of J'*J is always the square of that of J, which makes the conditioning worse.
>> J=rand(3000,100);
>> cond(J)
ans =
20.9806
>> cond(J'*J)
ans =
440.1848
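The squaring can be seen from the singular value decomposition. A short sketch for the 2-norm condition number, assuming J has full column rank:

```latex
\begin{aligned}
J &= U\Sigma V^{\top}, \qquad
J^{\top}J = V\,\Sigma^{\top}\Sigma\,V^{\top}, \\
\kappa_2(J^{\top}J) &= \frac{\sigma_{\max}^{2}}{\sigma_{\min}^{2}}
= \kappa_2(J)^{2}.
\end{aligned}
```

For the session above, 20.9806^2 is roughly 440.19, in line with the printed 440.1848 given that the displayed value of cond(J) is rounded.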
In this case you require the regularization that the weight matrix provides in order to find the correct solution for the Gauss-Newton step.
@Oliver, No, regularization still doesn't require explicit construction of J'*J. If you are trying to do the regularized mldivide operation,
(J'*J+beta*speye(N)) \ (J'*y)
then you could equivalently do the following, and it would be better conditioned,
[J;sqrt(beta)*speye(N)]\[y; zeros(N,1)]
If you are still willing to trade conditioning for speed, there may be further brainstorming to be done, but then please provide more info about J, e.g. the density nnz(J)/numel(J) so that we can perhaps simulate the problem data.
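Both regularized forms above minimize norm(J*x-y)^2 + beta*norm(x)^2, which is why they agree. A minimal sketch that checks this numerically follows; the test sizes, density, and beta value are assumptions, not taken from the thread.

```matlab
% Hypothetical data for illustration only.
rng(1);
m = 500; N = 40; beta = 0.1;
J = sprandn(m, N, 0.1);
y = randn(m, 1);

% Normal-equations form: explicitly builds J'*J.
x_normal = (J'*J + beta*speye(N)) \ (J'*y);

% Augmented least-squares form: never forms J'*J.
x_aug = [J; sqrt(beta)*speye(N)] \ [y; zeros(N, 1)];

rel_diff = norm(x_normal - x_aug) / norm(x_aug);
fprintf('relative difference between the two forms: %g\n', rel_diff);
```

The augmented matrix has N extra rows, so whether it is faster than the normal-equations form depends on the problem and would need to be timed.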
Oliver Woodford
on 4 Nov 2014
Mohammad Abouali
on 4 Nov 2014
@Matt: Good catch. Yes, I meant if the condition number is good, or at least not that bad; apparently I merged two sentences into one, completely changing the meaning.
Oliver Woodford
on 20 Nov 2015
Matt J
on 20 Nov 2015
I don't think you ever responded to John's suggestion about using a QR decomposition. For sparse matrices, you can do a Q-less decomposition
R=qr(J,0);
and then exploit the fact that J'*J=R'*R.
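A quick way to check the identity on test data is sketched below. The matrix size and density are assumptions, and the sketch only verifies correctness; it makes no claim about speed.

```matlab
% Hypothetical sparse test matrix (illustration only).
rng(2);
m = 1000; n = 30;
J = sprandn(m, n, 0.1);

R = qr(J, 0);              % Q-less, economy-size factor: only R is returned
H_qr     = R' * R;         % J'*J rebuilt from the triangular factor
H_direct = J' * J;         % reference result

rel_err = norm(H_qr - H_direct, 'fro') / norm(H_direct, 'fro');
fprintf('relative error of R''*R vs J''*J: %g\n', rel_err);
```

If the goal is a solve rather than H itself, the same R can be reused as x = R \ (R' \ (J'*y)), which avoids forming J'*J at all.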
Oliver Woodford
on 20 Nov 2015
Matt J
on 20 Nov 2015
I don't know the complexity. It will have to be tested. I imagine this could be advantageous only for certain kinds of sparsity structure.
Oliver Woodford
on 20 Nov 2015
Answers (2)
Azzi Abdelmalek
on 3 Nov 2014
Edited: Azzi Abdelmalek
on 3 Nov 2014
c=sparse(J);
H=full(c*c');
2 Comments
Oliver Woodford
on 3 Nov 2014
John D'Errico
on 3 Nov 2014
Edited: John D'Errico
on 3 Nov 2014
Um, J is already assumed to be in sparse form, and one would definitely not want to compute a full result when working with sparse matrices. Finally, you put the transpose on the wrong term, computing J*J', not J'*J.
I don't think there's anything available to accelerate an exact calculation of J'*J for general J. However, if you know in advance that J'*J happens to be banded to diagonals -k:k for small k (or if it can be approximated as such), then it might help to compute the 2*k+1 non-trivial diagonals individually. You can do so without transposition as below.
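[m,n]=size(J);
k=2;                 % half-bandwidth: keep diagonals -k:k of J'*J
kc=k+1;              % column of B that holds the main diagonal
tic;
B=zeros(n,2*k+1);    % one column per stored diagonal
B(:,kc)=sum(J.^2);   % main diagonal: squared column norms of J
for i=1:k
    tmp=sum(J(:,1:end-i).*J(:,i+1:end));  % dot products of columns j and j+i
    B(1:end-i,kc-i)=tmp;                  % sub-diagonal -i (spdiags reads the upper part of the column)
    B(i+1:end,kc+i)=tmp;                  % super-diagonal +i (spdiags reads the lower part of the column)
end
result=spdiags(B,-k:k,n,n);
toc;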
Whether this is actually faster will probably depend on the specifics of J. If nothing else, it spares you the large memory consumption of holding wide sparse matrices such as J' in RAM:
>> J=sparse(m,n); Jt=J'; whos J Jt
Name Size Bytes Class Attributes
J 3192027x3225 25824 double sparse
Jt 3225x3192027 25536240 double sparse
Replacing J'*J by a banded approximation is something I haven't tried myself with Gauss-Newton specifically, but the role of J'*J is already as an approximation there, so I think it could work. Other minimization algorithms tend to be robust to small errors in the derivatives.
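Before relying on a banded approximation, it may be worth measuring how much of J'*J it discards. The sketch below is one way to do that on made-up data (sizes, density, and variable names are assumptions): it rebuilds the band with the diagonal-by-diagonal approach, confirms it matches the exact product inside the band, and reports the relative weight of what lies outside.

```matlab
% Hypothetical test data (illustration only).
rng(3);
m = 400; n = 25; k = 2;
J = sprandn(m, n, 0.1);

% Band of J'*J computed diagonal by diagonal, without transposing J.
B = zeros(n, 2*k+1);
B(:, k+1) = sum(J.^2);
for i = 1:k
    tmp = sum(J(:,1:end-i) .* J(:,i+1:end));
    B(1:end-i, k+1-i) = tmp;
    B(i+1:end, k+1+i) = tmp;
end
H_band = spdiags(B, -k:k, n, n);

H     = J' * J;                       % exact product, for reference only
H_ref = triu(tril(H, k), -k);         % exact product restricted to diagonals -k:k

band_err = norm(H_band - H_ref, 'fro');              % should be near zero
dropped  = norm(H - H_ref, 'fro') / norm(H, 'fro');  % relative weight outside the band
fprintf('band error: %g, fraction dropped: %g\n', band_err, dropped);
```

A large value of dropped would suggest that the banded form is a poor stand-in for J'*J on that particular problem.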