find indices of row subsets
I am trying to write a vectorized MATLAB function ind = item2ind(item,t) for the following problem: I have a list of row vectors
item = [2 3 1; 2 1 2; 3 1 1; 1 3 3]
and a vector t of all elements available to each row of item:
t = [1 1 2 2 2 3 3];
For each row of item, I need to find the indices into t of that row's elements, in this way:
ind = [3 6 1; 3 1 4; 6 1 2; 1 6 7]
But the row
item = [1 1 1]
is not compatible with the vector t, because it contains three "1" elements while t contains only two.
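To pin down the intended semantics (greedy matching: each row element claims the first still-unused position in t), here is a small Python/NumPy transcription; the function name and the code are mine, with 1-based indices kept to match the MATLAB example above:

```python
import numpy as np

def item2ind_ref(item, t):
    # Reference semantics: each element of a row claims the first
    # still-unused position in t holding that value; if none is left,
    # the row is incompatible with t.
    item = np.asarray(item)
    t = np.asarray(t)
    ind = np.zeros(item.shape, dtype=int)
    for i, row in enumerate(item):
        used = np.zeros(t.shape, dtype=bool)
        for j, v in enumerate(row):
            hits = np.flatnonzero((t == v) & ~used)
            if hits.size == 0:
                raise ValueError('Incompatible content of item and t.')
            used[hits[0]] = True
            ind[i, j] = hits[0] + 1   # 1-based, as in MATLAB
    return ind

item = [[2, 3, 1], [2, 1, 2], [3, 1, 1], [1, 3, 3]]
t = [1, 1, 2, 2, 2, 3, 3]
print(item2ind_ref(item, t))   # rows: [3 6 1], [3 1 4], [6 1 2], [1 6 7]
```

Running it on the example data reproduces the ind shown above, and item = [1 1 1] raises the incompatibility error.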
Note: the serial version below is inefficient for large item (10000 x 100) and t (1 x 200).
function ind = item2ind(item,t)
[nlp,N] = size(item);
ind = zeros(nlp,N);
for i = 1:nlp
    auxitem = item(i,:);
    auxt    = t;
    for j = 1:N
        I = find(auxitem(j) == auxt,1,'first');
        if ~isempty(I)
            auxt(I) = 0;      % mark this position of t as used
            ind(i,j) = I;
        else
            error('Incompatible content of item and t.');
        end
    end
end
end
Additional remarks:
Most of the time is spent on the line:
I = find(auxitem(j) == auxt,1,'first');
Is there any clever trick to speed up this line of code? I tried the following, for example, but without any speedup:
I = ipos(auxitem(j) == auxt); I = I(1);
where ipos is preallocated as:
ipos = 1:length(t);
Thanks in advance for any help ...
6 Comments
And what should the result be for item = [1 1 1]? ind = [1 2 NaN]?
And that should be on a per-row basis?
Is 't' always sorted?
Edit: I see that your erroring out of the subroutine answers my first two questions. So then we just need to know whether 't' is sorted.
Well, that's going to make a mess when you want to keep track of indices - but I think we can manage.
I've got something that should loop through the unique elements of t to do what you want. Now I've got to figure out the not-sorted issue.
Edit: No worries, 200 extra 'find' commands on 't' will only add ~2 milliseconds.
Michal
on 9 May 2018
Accepted Answer
More Answers (2)
function ind = item2ind(item, t)
maxRun = length(t) + 1;
[T, TI] = accumsort(t, maxRun);
ind = zeros(size(item));
for k = 1:size(item, 1)
    [aItem, aItemI] = accumsort(item(k, :), maxRun);
    % [m, index] = ismember(aItem, T);
    % Faster with undocumented function:
    [m, index] = builtin('_ismemberhelper', aItem, T);
    if all(m)
        ind(k, aItemI) = TI(index);
    else
        error('Incompatible item.');
    end
end
end

function [T, TI] = accumsort(t, maxRun)
[sortedT, TI] = sort(t);
T = sortedT * maxRun;
c = -1;
for k = 1:numel(T)
    if T(k) ~= c
        d = 0;
        c = T(k);
    else
        d = d + 1;
    end
    T(k) = T(k) + d;
end
end
For test data of size [10000 x 100] I get a runtime of 0.21 sec instead of 1.3 sec for the original version.
When calling the documented ismember instead, the runtime is 0.41 sec. Internally, ismember calls the helper function builtin('_ismemberhelper') for sorted data of type double. If it is already known that the input is sorted, calling the internal function directly avoids the overhead.
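The point about sorted inputs is general: membership against a sorted vector needs only one binary search per query, which is the work a sorted-aware ismember skips re-deriving. As a language-neutral illustration (my own sketch, not part of the answer's code), np.searchsorted plays that role in NumPy:

```python
import numpy as np

# Membership test against a sorted vector via binary search.
T = np.array([10, 20, 20, 30, 40])     # sorted, duplicates allowed
q = np.array([20, 25, 40])             # query values
pos = np.searchsorted(T, q)            # insertion points (one binary search each)
member = (pos < T.size) & (T[np.minimum(pos, T.size - 1)] == q)
print(pos.tolist(), member.tolist())   # [1, 3, 4] [True, False, True]
```

The position of each found value comes for free, which is exactly the [m, index] pair the MATLAB helper returns.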
If you have a C compiler, converting accumsort to a C-mex would be useful.
maxRun must be any number greater than the highest number of repetitions in t; length(t)+1 is guaranteed to be large enough.
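The encoding works because each sorted value v is mapped to v*maxRun plus its occurrence number, so equal values become distinct keys and the j-th copy of a value in item can only match the j-th copy in t. A Python/NumPy transcription of the accumsort idea (function name mine, integer example for illustration):

```python
import numpy as np

def accumsort_key(t, max_run):
    # Sort, then encode each element as value*max_run + occurrence index
    # within its run of equal values. Assumes max_run exceeds the highest
    # repetition count, so keys of different values cannot collide.
    order = np.argsort(t, kind='stable')
    s = np.asarray(t)[order].astype(np.int64) * max_run
    d = np.zeros(s.size, dtype=np.int64)
    for k in range(1, s.size):
        d[k] = d[k - 1] + 1 if s[k] == s[k - 1] else 0
    return s + d, order

t = [1, 1, 2, 2, 2, 3, 3]
keys, order = accumsort_key(t, len(t) + 1)   # max_run = 8
print(keys.tolist())   # [8, 9, 16, 17, 18, 24, 25] -- all distinct
```

With all keys distinct and sorted, the exact-match membership test above is enough to pair up repeated elements.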
9 Comments
Jan
on 9 May 2018
You already have a speedup of at least a factor of 6.2. I hope you are happy with this. I'll try to create a C-MEX version in addition. Wait a little bit...
Jan
on 9 May 2018
As far as I can see, my code can also be parallelized directly:
parfor k = 1:size(item, 1)
I've made a very interesting experiment with a C-Mex function:
// file: accumsortX.c
// mex -O accumsortX.c
#include "mex.h"

void mexFunction(int nlhs, mxArray *plhs[], int nrhs, const mxArray *prhs[]) {
    double *t, *T, maxT, c, d;
    size_t n;

    t    = mxGetPr(prhs[0]);
    maxT = mxGetScalar(prhs[1]);
    n    = mxGetNumberOfElements(prhs[0]);
    plhs[0] = mxCreateUninitNumericMatrix(1, n, mxDOUBLE_CLASS, mxREAL);
    T    = mxGetPr(plhs[0]);

    c = *t - 1;                      // guarantee a mismatch on the first element
    while (n-- != 0) {
        if (c != *t) {
            d = 0.0;
            c = *t;
        } else {
            d += 1.0;
        }
        *T++ = maxT * *t++ + d;
    }
    return;
}
It has exactly the same speed as the MATLAB version of accumsort, but I've moved the sorting to the main function:
function ind = item2ind_jan2(item,t)
maxT = max(t) + 1;
[T, TI] = sort(t);
T = accumsort(T, maxT);
ind = zeros(size(item));
for k = 1:size(item, 1)
    [aItem, aItemI] = sort(item(k, :));
    aItem = accumsort(aItem, maxT);
    [m, index] = builtin('_ismemberhelper', aItem, T);
    if all(m)
        ind(k, aItemI) = TI(index);
    else
        error('item not compatible to t.');
    end
end
end

function T = accumsort(t, maxT)
T = t * maxT;
c = -1;
for k = 1:numel(T)
    if T(k) ~= c
        d = 0;
        c = T(k);
    else
        d = d + 1;
    end
    T(k) = T(k) + d;
end
end
Replacing accumsort() by accumsortX() does not change the runtime, even if I do not create a new vector in the MEX but write to the original array.
But let me repeat: I still have no idea how you measure the timings, because for your given input data:
t = 1:10;
tt = repmat(t,1,3);
[~,p] = sort(rand(100000,30),2);
item = tt(p);
I get the error message for incompatible items. Please clarify this detail, because I'm eager to reproduce your timings in any way.
Wick
on 10 May 2018
Michal, what sort of computer are you running on? I've got a Threadripper with more cores and you're running circles around me on your timings. I had heard that the math libraries aren't optimized for AMD processors and clearly that appears to be the case.
Wick
on 10 May 2018
Interesting. Jan's code is always faster on my computer than the numbers you gave. My code is faster on my machine for the short runs but for the long runs your machine is faster. It appears your memory subsystem is a bit more efficient.
Here you go. At the sizes you suggested, this shouldn't take too long. It has a single 'for' loop that cycles through the unique values of 't'.
I'm using logical indexing to identify all the elements in 'item' that match the given unique 't' value and summing across the row. If the sum exceeds the number of times that value shows up in 't', you get your error. Otherwise I'm using 'cumsum' in a creative fashion (in my ever so humble opinion) to provide the indices back to the location of the unique value in the original vector 't'.
Good Luck!
function ind = item2ind(item,t)
unique_t = unique(t);
ind = zeros(size(item));
try
    % a single 'for' loop as long as the unique elements of t
    for jj = 1:length(unique_t)
        O = zeros(size(item));
        O(item == unique_t(jj)) = 1;
        positions_of_t = [0 find(t == unique_t(jj))];
        % adding zero so the sub_index call below always references a non-zero element
        sub_index = cumsum(O,2) .* O + 1;
        ind = ind + positions_of_t(sub_index);
        % this is why we needed the 0 in positions_of_t above
    end
catch
    error('Incompatible content of item and t.');
end
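To make the cumsum trick concrete, here is a Python/NumPy transcription of the loop logic above (my own sketch, not the answer's code, using the example data from the question; a 0 sentinel keeps the indices 1-based as in the MATLAB version):

```python
import numpy as np

item = np.array([[2, 3, 1], [2, 1, 2], [3, 1, 1], [1, 3, 3]])
t = np.array([1, 1, 2, 2, 2, 3, 3])
ind = np.zeros(item.shape, dtype=int)
for u in np.unique(t):
    O = (item == u).astype(int)        # logical mask of matches per row
    sub = np.cumsum(O, axis=1) * O     # occurrence number at each match, 0 elsewhere
    pos = np.concatenate(([0], np.flatnonzero(t == u) + 1))   # 1-based positions in t
    # a row using u more often than t contains it raises IndexError here,
    # analogous to the try/catch in the MATLAB version
    ind += pos[sub]
print(ind.tolist())   # [[3, 6, 1], [3, 1, 4], [6, 1, 2], [1, 6, 7]]
```

Each pass handles one unique value for all rows at once, which is why the only loop left runs over unique(t).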
12 Comments
Wick
on 9 May 2018
Test case for this code. I moved 't' to be out of order.
items =
2 3 1
2 1 2
3 1 1
1 3 3
t =
1 2 2 1 2 3 3
ind =
2 6 1
2 1 3
6 1 4
1 6 7
In terms of your concern that 'item' may get large, that shouldn't slow this down much. In the end, you've got to compare every element of 't' to every element of 'item'. We managed to keep that all vectorized, and the only 'for' loop is over the elements of unique(t).
Wick
on 9 May 2018
I changed my code so it's a bit faster: I got rid of the if/then statement in each loop iteration. If there are too many elements to be assigned to the values in 'positions_of_t', the code is going to error anyway, so we just let it. However, since you had a specific error message in mind, the try/catch/end structure is wrapped around the whole thing.
That should save several hundred calls to if/then.
I don't have the parallel toolbox and you have a fast computer. :) That took me 28 seconds.
It surprises me it's that slow. Let me play.
Edit: The trouble is the memory time spent assigning the values of positions_of_t. The following code is slower on my system, but the loop should now be parallelizable, so you can try it with a parfor loop from the Parallel Computing Toolbox and see if it speeds things up.
function ind = item2ind(item,t)
unique_t = unique(t);
ind = zeros([size(item) numel(unique_t)]);
try
    % a single 'for' loop as long as the unique elements of t
    for jj = 1:length(unique_t)
        O = zeros(size(item));
        O(item == unique_t(jj)) = 1;
        positions_of_t = [0 find(t == unique_t(jj))];
        % adding zero so the sub_index call below always references a non-zero element
        sub_index = cumsum(O,2) .* O + 1;
        ind(:,:,jj) = positions_of_t(sub_index);
        % this is why we needed the 0 in positions_of_t above
    end
catch
    error('Incompatible content of item and t.');
end
ind = sum(ind,3);
Wick
on 9 May 2018
Sorry, I don't think I'm of any further help. I'm pretty good at vectorizing code, but it appears in this case that leaving things as nested 'for' loops is the better choice.
I think the trouble is the non-sequential memory writes associated with the index assignments. MATLAB's just got too much overhead on those if the compiled 'for' loops are able to run faster.
Jan
on 9 May 2018
For your test data:
t = 1:50;
tbig = repmat(t,1,5);
[~,p] = sort(rand(100000,length(tbig)),2);
item = tbig(p);
Your code from the question stops with the error "Incompatible content of item and t." So how can you measure the timings?!
The same for:
t = 1:10;
tbig = repmat(t,1,25);
[~,p] = sort(rand(100000,length(tbig)),2);
item = tbig(p);
I tried it with:
t = repmat(1:50, 1, 5);
t = t(randperm(length(t)));
item = zeros(10000, 100);
for k = 1:10000; item(k, :) = t(randperm(length(t), 100)); end
This lets your (and my) item2ind() run successfully.
Am I missing a point?!
Wick
on 9 May 2018
Jan,
My code is faster for small length 't' and much, much slower for large 't'. You vectorized in a completely different way than I did (and used an undocumented function, but we won't hold that against you). My question is: is there some rule of thumb my snippet of code didn't follow, something I should change about how I code? I've always felt I was pretty good at vectorizing my MATLAB code, but I've been coming here to learn how to be better. Obviously, you know some tricks I don't.
Thanks.