Select neighbours of a vector efficiently
Ältere Kommentare anzeigen
a = 1:10;
k = 4;
I have to loop through the elements of a and pick the k - 4 - neighbors equally distributed to each side (excluding the actual element I am on - n -) except when I get close to the boundaries:
a n neigh
1 * 2 3 4 5
2 * 1 3 4 5
3 * 1 2 4 5
4 * 2 3 5 6
5 * 3 4 6 7
6 * 4 5 7 8
7 * 5 6 8 9
8 * 6 7 9 10
9 * 6 7 8 10
10 * 6 7 8 9
My dummy algorithm:
nmA = numel(a);
for n = a
lo = n-k/2;
up = n+k/2;
if lo < 1
up = up + 1-lo;
lo = 1;
elseif up > nmA
lo = lo - up + nmA;
up = nmA;
end
out = setdiff(lo:up,n)
end
Any help for an effcient algorithm? (filters, maybe?)
I am retrieving the neighbours because I need to calculate the trimmed mean and std on the neighbours.
(NOTE: my real a can be as much as 3e6 and k = 60, always even)
Thanks in advance
---------------------------------------------------------------------------------------------------------------------------------------------------------
THE FINAL ALGORITHM Just for fun... and thanks to everybody.
I have a dataset which contains intraday financial data. The first part of the dataset has 2 612 670 observations spread among 1061 days. (I have 10 parts)
For each day I have to calculate the moving mean/std with particular conditions for the beginnig/end of the day.
Firstly, I tried to loop per day but the sole indexing of obs belonging to a day was 75% of total computational time (> 22 sec).
In the end I chose the vectorised solution proposed by Jan and applied the vectorization concept to eliminate the day by day loop.
Now, for k = 20 it takes < 3 sec!
% Load part of the dataset (first column serial dates, second column prices)
tmp = load([R.d 'dataset.mat'],fields{part});
k = 20;
k2 = k / 2;
% Last observation for each day and 0
last(1,1,:) = uint32([0
find(diff(fix(tmp.(fields{part})(:,1))))
size(tmp.t200301(:,1),1)]);
% Keep only the prices (free some memory)
tmp = tmp.t200301(:,2);
% Begin of the day
Ini = repmat(uint32(1:k+1).',1,k2);
Ini(1:k+2:k*k2) = [];
Ini = reshape(Ini, k, k2);
Ini = bsxfun(@plus,Ini,last(1:end-1));
% End of day
Fin = bsxfun(@minus, last(2:end), Ini(k:-1:1,k2:-1:1,1))+1;
% Number of observartions per day
numobs = diff(last);
% Create middle and concatenate with Ini and Fin - Loop or out of memory
last = cat(3,Last,size(tmp,1));
pos = uint32([1:k2, k2+2:k+1].');
mu = zeros(last(end),1);
sigma = mu;
for n = uint16(1:numel(numobs))
neigh = sort(tmp([Ini(:,:,n),...
bsxfun(@plus, pos, uint32(0:numobs(n)-k-1)),...
Fin(:,:,n)]));
mu(last(n)+1:last(n+1)) = mean(neigh(2:end-1,:));
sigma(last(n)+1:last(n+1)) = std(neigh(2:end-1,:));
end
2 Kommentare
Jan
am 22 Jul. 2011
Do you want to create the [out] vector dynamically, or do you need a complete matrix containing the index vectors as lines?
Oleg Komarov
am 22 Jul. 2011
Akzeptierte Antwort
Weitere Antworten (3)
Jan
am 22 Jul. 2011
Some ideas:
- "for n = 1:numel(a)" is usually faster than "for n = a"
- For indexing integer types are faster than DOUBLEs. I use DOUBLEs in my example for better readability.
- SETDIFF is expensive due to sorting.
- Instead of check for exceptions inside the code, you can hard-code the exceptions by creating 3 loops:
nA = numel(a);
k2 = k / 2;
v = 1:k+1;
for n = 1:k2 % Initial part
out = v(v ~= n);
end
v = [1:k2, k2+2:k+1];
for n = k2+1:nA-k2
out = v;
v = v + 1;
end
v = nA-k:nA;
for n = nA-k2+1:nA % Final part
out = v(v ~= n);
end
5 Kommentare
Titus Edelhofer
am 22 Jul. 2011
Hi Jan,
I already once see you writing that using integers as indices is faster. Somehow I was not able to reproduce this... Do you have some example that shows this?
Titus
Jan
am 22 Jul. 2011
@Titus: See http://www.mathworks.com/matlabcentral/answers/8461-how-to-make-a-double-summation-in-matlab-with-vectorized-loops . But there seems to be some interaction with the JIT acceleration: Very simple loops do not profit from the integers. But is is worth to compare the runtime for real world programs.
Jan
am 22 Jul. 2011
@Titus: Same speed:
x = rand(10000, 1);
tic; a = 1; b = 2000; for i = 1:1000; v = x(a:b); end, toc
tic; a = uint32(1); b = uint32(2000); for i = 1:1000; v = x(a:b); end, toc
Different speed:
tic; ind = 1:2000; tic;for i = 1:1000; v = x(ind); end, toc
tic; ind = uint32(1):uint32(2000); for i = 1:1000; v = x(ind); end, toc
Titus Edelhofer
am 22 Jul. 2011
Thanks! Now I understand. It's the indexing that is faster. I once tried this but replaced the loop counter by integers but it got slower this way. Thanks again! Titus
Oleg Komarov
am 22 Jul. 2011
Andrei Bobrov
am 22 Jul. 2011
Hi Oleg! My version.
a = 1:10;
k = 4;
k2 = k/2;
nA = numel(a);
idx = cumsum([[1:k2,k2+2:k+1];ones(nA-k-1,k)]);
% OR idx = bsxfun(@plus,[1:k/2,k/2+2:k+1],(0:nA-k-1)');
a1 =bsxfun(@plus,1:k+1,[0;0]);
a2 =bsxfun(@plus,nA-(0:k),[0;0]);
ii = (1:k2).^2;
a1(ii)=0;
a2(ii)=0;
aa = [a1;a2]';
aout = reshape(aa(aa~=0),k,[])';
out = a([aout(1:k2,:);idx;aout(end-(k2-1:-1:0),end:-1:1)]);
9 Kommentare
Jan
am 22 Jul. 2011
@Andrei: Oleg asked for a=1:3e6 and k=60. Then the result needs 1.44GB, but your intermediate arrays will use additional temporary memory.
Jan
am 22 Jul. 2011
@Andrei: Oleg asked for a=1:3e6 and k=60. Then the result needs 1.44GB, but your intermediate arrays will use additional temporary memory.
a1=bsxfun(@plus,1:k+1,[0;0]) ?! Why not [1:k+1; 1:k+1] or repmat(1:k+1, 2, 1)?
Oleg Komarov
am 22 Jul. 2011
Andrei Bobrov
am 22 Jul. 2011
In row : you - aout = reshape(aa(aa~=0),k+1,[]).';
I - aout = reshape(aa(aa~=0),k,[]).';
Oleg Komarov
am 22 Jul. 2011
Andrei Bobrov
am 22 Jul. 2011
Oleg. <http://forum.exponenta.ru/download.php?id=5931>
Oleg Komarov
am 22 Jul. 2011
Andrei Bobrov
am 22 Jul. 2011
Oleg. Sorry, n = numel(a). Сorrected.
Oleg Komarov
am 22 Jul. 2011
Sean de Wolski
am 22 Jul. 2011
I would probably try to do it with a convolution. A 'valid' one with a :
kernel = ([1;1;0;1;1]./4); %a is a column vector (example for mean)
std as a function of two convolutions is also doable. Then do the boundaries manually with a for-loop.
1 Kommentar
Oleg Komarov
am 22 Jul. 2011
Kategorien
Mehr zu Matrix Indexing finden Sie in Hilfe-Center und File Exchange
Community Treasure Hunt
Find the treasures in MATLAB Central and discover how the community can help you!
Start Hunting!