Counting the unique values
I have the following values:
out =
c1 c2 ... c5
'gene1' 'd' 'u' 'd' 'u' 'd'
'gene2' 'u' 'u' 'u' 'u' 'd'
'gene3' 'u' 'u' 'd' 'u' 'u'
'gene4' 'd' 'u' 'u' 'd' 'd'
'gene5' 'u' 'u' 'u' 'u' 'd'
'gene6' 'd' 'u' 'u' 'u' 'u'
'gene7' 'd' 'u' 'd' 'u' 'u'
For each gene, take the value in the first column 'c1' (for gene1 it is 'd') and compare it with the values in all the other columns. If there are more than 3 identical values, the row should be displayed. Only the first column is compared with the other columns.
In the 1st row there are 3 d's, in the 2nd row 4 u's (since its first column is 'u'), in the 3rd row 4 u's, and so on. For the 6th and 7th genes there are only 1 and 2 d's respectively, so these rows should be deleted.
So I need the output to be:
'gene1' 'd' 'u' 'd' 'u' 'd'
'gene2' 'u' 'u' 'u' 'u' 'd'
'gene3' 'u' 'u' 'd' 'u' 'u'
'gene4' 'd' 'u' 'u' 'd' 'd'
'gene5' 'u' 'u' 'u' 'u' 'd'
Please help.
Accepted Answer
Freddy
on 18 Jul 2012
Hello Pat,
here is the first idea I came up with:
A = {'','c1','c2','c3','c4','c5';
'gene1' 'd' 'u' 'd' 'u' 'd';...
'gene2' 'u' 'u' 'u' 'u' 'd';...
'gene3' 'u' 'u' 'd' 'u' 'u';...
'gene4' 'd' 'u' 'u' 'd' 'd';...
'gene5' 'u' 'u' 'u' 'u' 'd';...
'gene6' 'd' 'u' 'u' 'u' 'u';...
'gene7' 'd' 'u' 'd' 'u' 'u'};
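limit = 3;                       % minimum number of values equal to c1 (c1 itself counts)
F = cell2mat(A(2:end,2:end));    % char matrix of the 'd'/'u' entries, without header row and gene names
% Compare every column with c1, count the matches per row and keep the rows
% that reach the limit; the leading 0 drops the header row.
A(logical([0;sum(bsxfun(@eq,F,F(:,1)),2)>=limit]),:)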
Hopefully it will help you.
Freddy
1 Comment
Jan
on 18 Jul 2012
You can omit the "logical", if you use:
A([true; sum(bsxfun(@eq, F, F(:,1)), 2) >= limit], :)
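The one-liner above can be unpacked into separate steps. The following is only a sketch, assuming MATLAB R2016b or later, where implicit expansion lets `F == F(:,1)` stand in for the `bsxfun` call; the variable names `same`, `nSame`, `keep` and `result` are made up for illustration and do not appear in the answer.

```matlab
% Data from the answer above (header row included).
A = {'',      'c1','c2','c3','c4','c5'; ...
     'gene1', 'd', 'u', 'd', 'u', 'd'; ...
     'gene2', 'u', 'u', 'u', 'u', 'd'; ...
     'gene3', 'u', 'u', 'd', 'u', 'u'; ...
     'gene4', 'd', 'u', 'u', 'd', 'd'; ...
     'gene5', 'u', 'u', 'u', 'u', 'd'; ...
     'gene6', 'd', 'u', 'u', 'u', 'u'; ...
     'gene7', 'd', 'u', 'd', 'u', 'u'};
limit = 3;

F = cell2mat(A(2:end, 2:end));   % 7x5 char matrix of 'd'/'u'
same = (F == F(:, 1));           % implicit expansion (assumes R2016b or later)
nSame = sum(same, 2);            % matches per gene; column c1 itself counts as one
keep = [true; nSame >= limit];   % the leading true keeps the header row
result = A(keep, :);
disp(result)
```

Because the count includes column c1 itself, a threshold of `>= 3` here means "c1 plus at least two other columns".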
More Answers (2)
Walter Roberson
on 18 Jul 2012
c1_column = 2; %looks like column 2 to me, since column 1 has gene name
match_count = arrayfun(@(K) sum(out{K,cl_column} == [out{K,c1_column+1:end}]), 1:size(out,1));
out(match_count > 3, :)
Your problem description is inconsistent about what to do if the number of matches is exactly 3. You wrote that it has to be more than 3, but your sample output includes the case where it is exactly 3.
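To make the ambiguity concrete, here is a sketch that computes the match count both ways for the data in the question: once over the other columns only, and once with column c1 counted as well. The names `first`, `others`, `n_excl`, `n_incl`, `rows_strict` and `rows_sample` are illustrative assumptions, not part of the answer.

```matlab
% Data from the question (no header row).
out = {'gene1' 'd' 'u' 'd' 'u' 'd'; ...
       'gene2' 'u' 'u' 'u' 'u' 'd'; ...
       'gene3' 'u' 'u' 'd' 'u' 'u'; ...
       'gene4' 'd' 'u' 'u' 'd' 'd'; ...
       'gene5' 'u' 'u' 'u' 'u' 'd'; ...
       'gene6' 'd' 'u' 'u' 'u' 'u'; ...
       'gene7' 'd' 'u' 'd' 'u' 'u'};
c1_column = 2;                                  % column 1 holds the gene names

first  = cell2mat(out(:, c1_column));           % 7x1 char, the c1 values
others = cell2mat(out(:, c1_column+1:end));     % 7x4 char, columns c2..c5

n_excl = sum(bsxfun(@eq, others, first), 2);    % matches among c2..c5 only
n_incl = n_excl + 1;                            % same count with c1 itself included

rows_strict = out(n_excl > 3, :);    % "more than 3" other columns: no row qualifies here
rows_sample = out(n_incl >= 3, :);   % "at least 3 including c1": gene1 to gene5
disp(rows_sample)
```

For this data, only the second reading reproduces the sample output given in the question.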
2 Comments
Jan
on 18 Jul 2012
@Pat: Come on, I'm sure you are able to fix this typo on your own. The "1" (one) looks very similar to the "l" (lowercase L). It is your job to participate as far as possible in the solution of your problems.
Bjorn Gustavsson
on 18 Jul 2012
Pat,
I strongly suggest you change the encoding of your data! Make your variable an integer array with, for example, 0 for 'd' and 1 for 'u'. Then you could do something like this:
genes = [0 1 0 1 0
         1 1 1 1 0
         1 1 0 1 1
         0 1 1 0 0];
lim4disp = 3;
genes(sum(repmat(genes(:,1),[1,size(genes,2)]) == genes,2)>=lim4disp,:)
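If the data already exists as the cell array from the question, one possible way to get to such a numeric encoding is sketched below. It assumes every entry is a single character, either 'u' or 'd'; the name `keep` is an illustrative assumption.

```matlab
% Data from the question (no header row).
out = {'gene1' 'd' 'u' 'd' 'u' 'd'; ...
       'gene2' 'u' 'u' 'u' 'u' 'd'; ...
       'gene3' 'u' 'u' 'd' 'u' 'u'; ...
       'gene4' 'd' 'u' 'u' 'd' 'd'; ...
       'gene5' 'u' 'u' 'u' 'u' 'd'; ...
       'gene6' 'd' 'u' 'u' 'u' 'u'; ...
       'gene7' 'd' 'u' 'd' 'u' 'u'};

% Encode 'u' as 1 and 'd' as 0 (assumes only these two letters occur).
genes = double(cell2mat(out(:, 2:end)) == 'u');

lim4disp = 3;
% Count, per row, the entries equal to the first column (the first column counts too).
keep = sum(bsxfun(@eq, genes, genes(:, 1)), 2) >= lim4disp;

disp(out(keep, :))   % gene names together with the original letters
```

Indexing the original cell array with `keep` retains the gene names, which the purely numeric matrix no longer carries.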