Fastest way to index large arrays
I have two sets of arrays, A and B. The "A" arrays have about 1 million elements. The "B" arrays have about 65 thousand elements. For every element in A I need to find the corresponding element in B and pull a related value. Here's a crude minimal working example:
PhiA = round(359*rand(1,1e6));
ThetaA = round(179*rand(1,1e6));
PhiB = repmat(0:359,1,180);
ThetaB = reshape(repmat(0:179,360,1),1,[]);
VarB = 1:180*360;
out = nan(1e6,1);
tic
for loop = 1:length(PhiA)
    idx = PhiA(loop) == PhiB & ThetaA(loop) == ThetaB;
    out(loop) = VarB(idx);
end
toc
Given the size of the arrays this is not very fast, over 40 seconds on my machine. The profiler tells me that those two lines in the for loop are the slowest in my code, and surprisingly they split the burden almost exactly 50/50.
This is actually my already faster version: originally A and B were tables, and the profiler told me that the slow operations were reading from and storing into the tables. Switching to arrays has sped things up a little, but not as much as I hoped.
How could I make this faster?
Accepted Answer
dpb
on 4 Oct 2022
With the lookup arrays structured as they are, you don't need a lookup at all; you can just calculate the row directly --
fnRow=@(phi,theta)phi+360*theta+1;
so, with this,
PhiA = round(359*rand(1,1e6));
ThetaA = round(179*rand(1,1e6));
PhiB = repmat(0:359,1,180);
ThetaB = reshape(repmat(0:179,360,1),1,[]);
VarB = 1:180*360;
tic
out=VarB(fnRow(PhiA,ThetaA));
toc
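Why the formula works: PhiB runs through 0:359 once for each value of ThetaB, so the pair (phi, theta) sits at position phi + 360*theta + 1 in the lookup arrays. The sketch below checks that claim against the lookup arrays themselves and against the original search loop on a small sample. It assumes integer inputs in the ranges 0..359 and 0..179; the sample size nCheck is an arbitrary value chosen only for illustration.

```matlab
% Rebuild the lookup arrays from the example above.
PhiB   = repmat(0:359,1,180);
ThetaB = reshape(repmat(0:179,360,1),1,[]);
VarB   = 1:180*360;
fnRow  = @(phi,theta) phi + 360*theta + 1;

% Every (PhiB,ThetaB) pair should map back to its own position.
assert(isequal(fnRow(PhiB,ThetaB), 1:numel(PhiB)))

% Compare with the original search loop on a small random sample.
nCheck = 1000;                      % arbitrary sample size (assumption)
PhiA   = round(359*rand(1,nCheck));
ThetaA = round(179*rand(1,nCheck));
ref = nan(1,nCheck);
for k = 1:nCheck
    ref(k) = VarB(PhiA(k) == PhiB & ThetaA(k) == ThetaB);
end
out = VarB(fnRow(PhiA,ThetaA));
assert(isequal(out, ref))
```

Note that out here is a row vector, like PhiA; the loop in the question filled a column vector, so transpose the result if the orientation matters.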
4 Comments
dpb
on 5 Oct 2022
I probably would have simply used logical addressing in the calculation selection...
isOK=all(isfinite(A),2);
out=VarB(fnRow(PhiA(isOK),ThetaA(isOK)));
The above assumes A is the array of interest, with one row per point, and selects the rows that contain no missing values.
If out must have the same number of rows as A, then you need to preallocate it and assign into out(isOK); otherwise out will only extend to the position of the last non-missing element of A. It only matters if the last N elements are the missing ones, but you may have no way of knowing that won't be the case, so defensive coding would preallocate.
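As a minimal sketch of the preallocation point, the snippet below uses a tiny made-up sample (the values in PhiA and ThetaA are invented for illustration) whose last entry is missing. The mask is written here as two conditions joined with &, and the variable name short is hypothetical.

```matlab
% Lookup from the answer above.
VarB  = 1:180*360;
fnRow = @(phi,theta) phi + 360*theta + 1;

% Tiny made-up sample whose last entry is missing (NaN).
PhiA   = [10 NaN 359 0 NaN];
ThetaA = [ 0   5 179 1 NaN];
isOK = isfinite(PhiA) & isfinite(ThetaA);

% Without preallocation, the result stops at the last valid index
% and the gap at position 2 is filled with 0.
short = [];
short(isOK) = VarB(fnRow(PhiA(isOK),ThetaA(isOK)));   % 1x4, not 1x5

% With preallocation, out keeps the size of the input and
% missing entries stay NaN.
out = nan(size(PhiA));
out(isOK) = VarB(fnRow(PhiA(isOK),ThetaA(isOK)));
```

Preallocating with nan rather than zeros keeps missing entries distinguishable from real values, which matters here because a zero-filled gap could be mistaken for data.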
If the code is built from separate vectors, as in the example above, then
isOK=all(isfinite([PhiA.' ThetaA.']),2);
looks ominous but will be fast and is easier to write than the two conditions on each vector with &.