Optimized code for loop, If-statement for large dataset

Jhon Gray on 25 May 2019
Edited: per isakson on 26 May 2019
I was hoping to delete certain rows based on a condition. My data is a double matrix (790127*24). Using Run and Time, I estimate that the whole script would need about 25 hours, which is huge. Is there any way of optimizing the script?
TIA...
n=0;
for i = 1 : length(d_A)
    if any(isnan(d_A(i-n, 6))) ...
            && any(isnan(d_A(i-n, 7))) ...
            && any(isnan(d_A(i-n, 8))) ...
            && any(isnan(d_A(i-n, 9))) ...
            && any(isnan(d_A(i-n,10))) ...
            && any(isnan(d_A(i-n,11))) ...
            && any(isnan(d_A(i-n,12))) ...
            && any(isnan(d_A(i-n,13))) ...
            && any(isnan(d_A(i-n,14))) ...
            && any(isnan(d_A(i-n,15))) ...
            && any(isnan(d_A(i-n,16))) ...
            && any(isnan(d_A(i-n,17))) ...
            && any(isnan(d_A(i-n,18))) ...
            && any(isnan(d_A(i-n,19)))
        d_A(i-n,:) = [];
        n=n+1;
    end
end

Accepted Answer

per isakson on 25 May 2019
Edited: per isakson on 26 May 2019
Try this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
n=0;
for i = 1 : length(d_A)
    if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
        d_A(i-n,:) = [];
        n=n+1;
    end
end
and this
%%
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : length(d_A)
    if all( isnan( d_A( i-n, ixcol ))) % i-n is a scalar
        % d_A(i-n,:) = [];
        is_to_be_deleted(i-n) = true;
        n=n+1;
    end
end
d_A( is_to_be_deleted, : ) = [];
Caveat: not tested
In response to comment:
Now it's possible to factor out the for-loop. Try this
%% Sample data
A = rand( [8,4] );
ixcol = [2,3];
A([3,5],ixcol) = nan;
A( randperm( numel(A), 9 ) ) = nan;
%%
[ A3, ix_deleted3 ] = cssm_3( A, ixcol );
[ A4, ix_deleted4 ] = cssm_4( A, ixcol );
ix_deleted3 == ix_deleted4 %#ok<NOPTS,EQEFF>
function [ A, ix_deleted ] = cssm_3( A, ixcol )
    % Loop version: flag every row whose ixcol entries are all NaN.
    is_to_be_deleted = false( size(A,1), 1 );
    for jj = 1 : size(A,1) % number of rows; length(A) returns the largest dimension
        if all( isnan( A( jj, ixcol )))
            is_to_be_deleted(jj) = true;
        end
    end
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end
function [ A, ix_deleted ] = cssm_4( A, ixcol )
    % Vectorized version: all(...,2) tests each row in one call.
    is_to_be_deleted = all( isnan( A( :, ixcol ) ), 2 );
    A( is_to_be_deleted, : ) = [];
    ix_deleted = find( is_to_be_deleted );
end
It outputs
>> cssm
ans =
2×1 logical array
1
1
The vectorized version, cssm_4, might not improve performance significantly, but in my opinion it makes cleaner code.
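Applied to the original problem, the vectorized form might look like the sketch below. The sample matrix is a made-up stand-in for the real 790127-by-24 data, and it assumes, as in the question, that a row is dropped only when columns 6 to 19 are all NaN.

```matlab
% Hypothetical stand-in for the real d_A (assumption: numeric matrix, 24 columns).
d_A = rand( 1000, 24 );             % rand never returns NaN
ixcol = 6:19;                       % columns checked in the question
d_A( [10, 500, 999], ixcol ) = nan; % rows that should be removed
d_A( 20, 6 ) = nan;                 % only partly NaN, so this row must stay

n_before = size( d_A, 1 );
is_to_be_deleted = all( isnan( d_A( :, ixcol ) ), 2 ); % one logical per row
d_A( is_to_be_deleted, : ) = [];    % single delete operation
n_removed = n_before - size( d_A, 1 );
```

The mask is computed once for the whole matrix, so the matrix is only rewritten a single time.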
2 Comments
Jhon Gray on 25 May 2019
Edited: Jhon Gray on 25 May 2019
Wow. The second one is super fast. But there's a small problem here. The code should look like this; there is no need for i-n in this case. Thanks for helping. Take love.
ixcol = [ 6, 7, 8, 9,10,11,12,13,14,15,16,17,18,19 ];
is_to_be_deleted = false( size(d_A,1), 1 );
n=0;
for i = 1 : 1000 %length(d_A)
    if all( isnan( d_A( i, ixcol ))) % i-n is a scalar
        % d_A(i-n,:) = [];
        is_to_be_deleted(i) = true;
        %n=n+1;
    end
end
d_A( is_to_be_deleted, : ) = [];
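To illustrate why the i-n offset has to go once rows are no longer deleted inside the loop, here is a small sketch on a made-up 5-by-2 matrix (the data and the names mask_ok and mask_off are only for illustration). Because nothing is removed during the loop, the offset makes the loop test the same row again and again.

```matlab
% Made-up sample: rows 2 and 4 are all NaN and should be flagged.
A = [ 1 1; nan nan; 2 2; nan nan; 3 3 ];
ixcol = [ 1, 2 ];

% Mask indexed by the original row number i.
mask_ok = false( size(A,1), 1 );
for i = 1 : size(A,1)
    if all( isnan( A( i, ixcol )))
        mask_ok(i) = true;
    end
end

% Mask indexed by i-n although no row is deleted inside the loop.
mask_off = false( size(A,1), 1 );
n = 0;
for i = 1 : size(A,1)
    if all( isnan( A( i-n, ixcol )))
        mask_off(i-n) = true;   % from i = 2 on, i-n stays at row 2
        n = n + 1;
    end
end
```

Here mask_ok flags rows 2 and 4, while mask_off flags row 2 only and never reaches row 4.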
per isakson on 26 May 2019
Edited: per isakson on 26 May 2019
I surmised that there was a problem and added the last line in bold.
It is as bad for performance to remove one row at a time as it is to add one row at a time. In both cases the matrix is rewritten in memory on each operation.
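A rough way to see the difference is to time both approaches on made-up data, as in the sketch below. The matrix size, the share of NaN rows and the variable names are assumptions for illustration; actual timings depend on the machine and MATLAB release, so none are quoted here.

```matlab
% Made-up test data: about 30 % of the rows are all NaN in ixcol.
A = rand( 5000, 24 );
ixcol = 6:19;
A( rand( size(A,1), 1 ) < 0.3, ixcol ) = nan;

% Variant 1: delete one row at a time (matrix is rewritten on every hit).
B = A;
tic
n = 0;
for i = 1 : size(A,1)
    if all( isnan( B( i-n, ixcol )))
        B(i-n,:) = [];
        n = n + 1;
    end
end
t_rowwise = toc;

% Variant 2: build a logical mask, then delete once.
C = A;
tic
is_to_be_deleted = all( isnan( C( :, ixcol ) ), 2 );
C( is_to_be_deleted, : ) = [];
t_mask = toc;

fprintf( 'row by row: %.4f s, mask: %.4f s\n', t_rowwise, t_mask );
same_result = isequal( B, C );  % the remaining rows contain no NaN
```

Both variants should leave the same rows, so same_result is expected to be true; only the elapsed times differ.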

Version

R2018b
