Creating a multiple-string filter on multiple columns
Hello, I have an n x m data array (rows by columns) that I was previously able to run some basic analysis on.
How can I create a multiple "string filter" for each column and remove the unwanted "strings"? After filtering, I then need to concatenate the columns with the unwanted strings removed.
data = randn(n,m);
results = cell(1,m);
for jj = 1:m
    results{jj} = perform_analysis(data(:,jj));
end
Example:
The first filter is AA, BB, CC, DD (independent of each other); then concatenate "some data" on column x.
Continue this type of filtering until the unwanted strings have been removed from every column, concatenating the data for all columns.
Thanks...
3 Comments
DGM
on 7 Feb 2022
Can you provide an example of an input array, the strings to be removed, and the intended resultant output?
Michael Angeles
on 7 Feb 2022
Jan
on 7 Feb 2022
I do not understand what you are asking for. What does concatenate "some data" on column x mean?
What is the shown table? A string array? Then setdiff should work.
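If the data really is a string array, the suggestion above could be sketched roughly as follows. The array S and its values are made up for illustration (the thread does not show the actual data), and the double-quoted literals assume R2017a or later.

```matlab
% Hypothetical 3x2 string array (illustrative values only)
S = ["AA" "AB"; "BA" "BB"; "CC" "AB"];
toremove = ["AA" "BB" "CC"];

% setdiff() keeps each remaining value once, in order of first appearance
C = setdiff(S, toremove, 'stable');

% ismember() builds a logical mask, so repeated values are kept
B = S(~ismember(S, toremove));
```

Here C holds only "BA" and "AB", while B keeps the second "AB" as well.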
Accepted Answer
More Answers (1)
Assuming you're dealing with a cell array of character vectors or a string array:
A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
toremove = {'AA','BB','CC'};
% you could do it with ismember()
B = A(~ismember(A,toremove))
% or you could use setdiff()
C = setdiff(A,toremove,'stable')
2 Comments
Michael Angeles
on 9 Feb 2022
It should work fine on 2D arrays, but you have to realize that the result will necessarily be a single column vector, not a 2D array.
A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
A = [A A(randperm(numel(A))) A(randperm(numel(A)))]; %replicate to 3 columns
toremove = {'AA','BB','CC'};
% you could do it with ismember()
B = A(~ismember(A,toremove))
% or you could use setdiff()
C = setdiff(A,toremove,'stable')
Note that setdiff() returns only the unique values, whereas indexing with ismember() returns every element that was not removed. Since A in this case is three randomly permuted copies of the same column, B is three times as long as C, because it contains three copies of each remaining element.
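The difference is easier to see on a small fixed array than on a random permutation. The following is only a sketch with made-up values, not data from the question:

```matlab
% Hypothetical 4x2 cell array (illustrative values only)
A = {'AA','AB'; 'AB','BB'; 'BA','AB'; 'CC','BA'};
toremove = {'AA','BB','CC'};

% Logical indexing keeps duplicates and walks A in column-major order
B = A(~ismember(A,toremove));

% setdiff() keeps one copy of each remaining value
C = setdiff(A,toremove,'stable');

nB = numel(B);   % number of elements that survived the mask
nC = numel(C);   % number of distinct surviving values
```

With these values, B contains 'AB' three times and 'BA' twice, while C contains each of them once.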
If you are getting errors, you'll have to describe exactly what you're doing and what error you're getting.
EDIT:
Regarding columnwise filtering and padding:
A = {'AA'; 'AB'; 'BA'; 'BB'; 'AC'; 'CA'; 'BC'; 'CB'; 'CC'};
A = repmat(A,[1 3]);
A(:) = A(randperm(numel(A))) % 9x3, but matches aren't evenly distributed across columns
toremove = {'AA','BB','CC'};
B = cell(size(A)); % preallocate; unused cells stay empty and act as padding
maxr = 0;
for c = 1:size(A,2)
    thisb = A(~ismember(A(:,c),toremove),c); % filter this column only
    B(1:numel(thisb),c) = thisb;
    maxr = max(maxr,numel(thisb)); % track the longest filtered column
end
B = B(1:maxr,:) % trim rows that are empty in every column
Alternatively, you could put each column in a nested cell array:
B = cell([1 size(A,2)]);
for c = 1:size(A,2)
    B{c} = A(~ismember(A(:,c),toremove),c); % each cell holds one filtered column
end
B
Again, something similar can be done with setdiff() if you only want the unique results.
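A minimal sketch of that setdiff() variant, using a small made-up array instead of the random one. The names U and allkept are hypothetical, and the final vertcat() is only one guess at what "concatenate" means in the question:

```matlab
% Hypothetical 4x2 cell array (illustrative values only)
A = {'AA','AB'; 'AB','BB'; 'BA','AB'; 'CC','BA'};
toremove = {'AA','BB','CC'};

U = cell(1,size(A,2));
for c = 1:size(A,2)
    % unique remaining values of this column, in order of first appearance
    thisu = setdiff(A(:,c),toremove,'stable');
    U{c} = reshape(thisu,[],1); % force column orientation
end

% Assumption: "concatenate" means stacking the filtered columns end to end
allkept = vertcat(U{:});
```

Each cell of U can have a different length, so no padding is needed before stacking.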
