strcmp and rows of dataset table

Bina on 27 Dec 2011
I have a text dataset with 4 columns and n rows: cl1 cl2 cl3 cl4. I want to know how I can use strcmp() to show which rows share the same cl2 and cl3 values (not rows where cl2 = cl3). For example, for the dataset below I want to show row 1 and row 4, because they have the same cl2 and cl3:
cl1 cl2 cl3 cl4
---------------------------
a b c d
d j h n
s b v y
q b c g
As I said, the dataset has n rows, so several groups of rows share the same cl2-cl3 pair. I want to build domains, for example Domain1 = {rows with one shared cl2-cl3 pair}, Domain2 = {rows with another shared cl2-cl3 pair}, ...
Please check the code below and give me an idea of what I should do. How do I use strcmp() in this case, and how do I show the target rows?
fid = fopen('Input2.txt','r')
data = textscan(fid,'%s %s %s %s')
fclose(fid)
indices = strcmp(data{2}{1},data{2})&&(data{1})
sum(indices)

Accepted Answer

Matt Tearle on 27 Dec 2011
Sounds like a job for categorical arrays! Huzzah! (Assuming you have Statistics Toolbox.) BTW, you said "dataset" but you're using cell arrays, so I assume you don't mean the dataset array in Stats TB. However, they may be a useful way to package your data. Anyway... why not make a new variable that is the combination of columns 2 and 3, and look for the unique values of that array:
twoandthree = nominal(strcat(data{2},'-',data{3}))
data = [data{:}];
domains = getlabels(twoandthree)
for k=1:length(domains)
foo = data(twoandthree==domains{k},:)
end
If you don't have Stats TB, you can achieve the same result with unique and strcmp:
twoandthree = strcat(data{2},'-',data{3})
data = [data{:}];
domains = unique(twoandthree)
for k=1:length(domains)
foo = data(strcmp(twoandthree,domains{k}),:)
end
Also, note I'm using [data{:}] to extract the four columns (each being a cell array) and concatenate them together into a single four-column table (i.e., a single n-by-4 cell array containing strings). If you're going to be accessing by rows, that's a nicer arrangement of data.
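To connect this back to the strcmp() call in the question, here is a minimal sketch using the four sample rows typed in by hand (so no Input2.txt is needed). The variable names tbl, sameAsFirst, and matchingRows are only illustrative. It shows the [data{:}] concatenation and then finds the rows whose cl2 and cl3 both match row 1. The key point is to combine the two strcmp results with the element-wise & operator, since && only accepts scalars.

```matlab
% Sample data in the shape textscan returns: a 1-by-4 cell array whose
% cells each hold an n-by-1 cell array of strings.
data = {{'a';'d';'s';'q'}, {'b';'j';'b';'b'}, ...
        {'c';'h';'v';'c'}, {'d';'n';'y';'g'}};

% Concatenate the four columns into one n-by-4 cell array of strings.
tbl = [data{:}];

% Rows whose cl2 AND cl3 both equal those of row 1.
% strcmp returns a logical vector here, so use &, not &&.
sameAsFirst = strcmp(tbl(:,2), tbl{1,2}) & strcmp(tbl(:,3), tbl{1,3});

matchingRows = find(sameAsFirst)   % row numbers
tbl(sameAsFirst, :)                % the matching rows themselves
```

For the sample data this picks out rows 1 and 4, the pair described in the question.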
But, as I mentioned, dataset arrays may also make life nice, depending on what you're going to do with the subsets.
data = dataset(data{:},'VarNames',strcat('cl',cellstr(num2str((1:4)'))))
twoandthree = nominal(strcat(data.cl2,'-',data.cl3))
domains = getlabels(twoandthree)
for k=1:length(domains)
foo = data(twoandthree==domains{k},:)
end
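Each loop above overwrites foo on every pass, so if the goal is to keep the domains around, one option is to let unique hand back the group index directly and collect the row numbers per group. The following is a sketch only, again on hand-typed sample rows and without any toolbox. The names key, idx, rowsPerDomain, and shared are made up for illustration, and it assumes the values never contain the '-' separator (otherwise two different pairs could produce the same key).

```matlab
% Sample rows as an n-by-4 cell array of strings.
data = {{'a';'d';'s';'q'}, {'b';'j';'b';'b'}, ...
        {'c';'h';'v';'c'}, {'d';'n';'y';'g'}};
tbl = [data{:}];

% One key per row built from cl2 and cl3.
key = strcat(tbl(:,2), '-', tbl(:,3));

% domains: the distinct keys (sorted); idx(i): which domain row i is in.
[domains, ~, idx] = unique(key);

% Collect the row numbers belonging to each domain.
rowsPerDomain = accumarray(idx(:), (1:numel(idx))', [], @(r) {sort(r)});

% Keep only domains that are shared by more than one row.
shared = cellfun(@numel, rowsPerDomain) > 1;

for k = find(shared)'
    fprintf('Domain %s: rows %s\n', domains{k}, mat2str(rowsPerDomain{k}'));
end
```

With the sample data only the b-c key is shared, by rows 1 and 4. tbl(rowsPerDomain{k}, :) then gives the rows of domain k.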
3 Comments
Matt Tearle on 27 Dec 2011
[Strikes heroic pose] Don't thank me. Thank logical indexing. [Rides off into sunset]
Walter Roberson on 27 Dec 2011
Indexing! Indexing! Get your red-hot Logical Indexing here!
Authorized! Signed! Get your red-hot Logical Indexing!
Vectorized! Multidimensional! Endorsed by "Shane" Tearle!
Get your red-hot Logical Indexing!
