Getting strings to combine multiple times
I have a school project in which I need to group a DNA sequence of 550437 letters into codons. At the moment it is stored as a character array, one letter per element, 550437 elements in all. I have to show how many times AAA, ATC, and CGG appear in that sequence without overlap, and I also have to show the locations of the first 10 occurrences of each.
I tried reshaping from 550437x1 to 183479x3, but the letters do not line up in groups of three from left to right: column 1 holds the first 183479 letters, column 2 the next 183479, and column 3 the final 183479. I would like either to group every 3 elements into one, or to get numeric indices telling me where each selected sequence occurs.
Here is what I have so far to count how many times each sequence shows up. What I can't figure out is how to find where the first 10 instances of each occur.
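x = 1;
i = 1;          %%% Loop variable for AAA
h = 1;          %%% Loop variable for ATC
t = 1;          %%% Loop variable for CGG
AAAmatch = 0;   %%% Letters matched at the current position (3 = exact match)
ATCmatch = 0;   %%% Letters matched at the current position (3 = exact match)
CGGmatch = 0;   %%% Letters matched at the current position (3 = exact match)
AAAcount = 0;   %%% Counter for AAA matches
ATCcount = 0;   %%% Counter for ATC matches
CGGcount = 0;   %%% Counter for CGG matches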
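%%% Counts AAA matches in the entire sequence
%%% (the loop advances one letter at a time, so overlapping matches are also counted)
for i = 1:length(DNA)-2
    if strcmp(DNA(i), 'A')
        AAAmatch = AAAmatch + 1;
    end
    if strcmp(DNA(i+1), 'A')
        AAAmatch = AAAmatch + 1;
    end
    if strcmp(DNA(i+2), 'A')
        AAAmatch = AAAmatch + 1;
    end
    if AAAmatch == 3
        AAAcount = 1 + AAAcount;
    end
    AAAmatch = 0;   %%% Reset before checking the next position
end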
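%%% Counts ATC matches in the entire sequence
%%% (the loop advances one letter at a time, so every start position is checked)
for h = 1:length(DNA)-2
    if strcmp(DNA(h), 'A')
        ATCmatch = ATCmatch + 1;
    end
    if strcmp(DNA(h+1), 'T')
        ATCmatch = ATCmatch + 1;
    end
    if strcmp(DNA(h+2), 'C')
        ATCmatch = ATCmatch + 1;
    end
    if ATCmatch == 3
        ATCcount = 1 + ATCcount;
    end
    ATCmatch = 0;   %%% Reset before checking the next position
end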
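%%% Counts CGG matches in the entire sequence
%%% (the loop advances one letter at a time, so every start position is checked)
for t = 1:length(DNA)-2
    if strcmp(DNA(t), 'C')
        CGGmatch = CGGmatch + 1;
    end
    if strcmp(DNA(t+1), 'G')
        CGGmatch = CGGmatch + 1;
    end
    if strcmp(DNA(t+2), 'G')
        CGGmatch = CGGmatch + 1;
    end
    if CGGmatch == 3
        CGGcount = 1 + CGGcount;
    end
    CGGmatch = 0;   %%% Reset before checking the next position
end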
Thoughts?
1 Comment
Azzi Abdelmalek
on 28 Apr 2016
You can make your question clearer and shorter by posting a small example together with the result you expect. You can also add a brief explanation.
Answers (1)
Walter Roberson
on 28 Apr 2016
Consider using strfind(). You will still need some logic to detect a potential overlap between the final character of one match and the first character of the next. Also, if the sequence contains something like 'AAAA', strfind() for 'AAA' will return both 1 and 2; that is, strfind() does not care about overlaps. Even so, strfind() will give you candidate positions that you can then winnow down.
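As a rough sketch of that idea, the snippet below collects candidate positions with strfind() and then keeps a hit only if it starts after the previously kept hit has ended. The toy sequence and the names `pattern`, `candidates`, `nextFree`, `starts`, and `firstTen` are made up for illustration, and this greedy left-to-right rule is only one possible reading of "without overlap". It handles overlap between copies of the same pattern, not between different patterns.

```matlab
% Toy sequence standing in for the real DNA variable (assumption for illustration).
DNA = 'AAAAAATCAAACGG';
DNA = DNA(:).';                       % force a row vector, in case DNA is stored as a column

pattern = 'AAA';
candidates = strfind(DNA, pattern);   % every start index, overlapping hits included

% Greedy winnowing: keep a hit only if it starts at or after the first free index.
keep = false(size(candidates));
nextFree = 1;                         % first index not used by a kept match
for k = 1:numel(candidates)
    if candidates(k) >= nextFree
        keep(k) = true;
        nextFree = candidates(k) + numel(pattern);
    end
end
starts = candidates(keep);            % start indices of non-overlapping matches

count = numel(starts);                % number of non-overlapping matches
firstTen = starts(1:min(10, count));  % locations of the first 10 (or fewer)
```

Repeating the same steps with `pattern` set to 'ATC' and 'CGG' would give the other two counts and location lists.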
What would you want the result to be if 'AAATCGG' appeared in the sequence? Is that one AAA and one CGG, or is it one ATC?
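To make the two readings concrete, here is a small sketch on that same 'AAATCGG' example. The first part reports every start position; the second assumes a fixed reading frame starting at position 1, which is an assumption and not something the question states. Because reshape() fills column by column, the sketch reshapes to 3 rows and then transposes so that each row holds one codon; that column-wise filling is also why a direct reshape to 183479x3 scrambles the order. The variable names are illustrative only.

```matlab
DNA = 'AAATCGG';                      % the example sequence from the answer above

% Reading 1: any start position counts (what strfind reports).
anyAAA = strfind(DNA, 'AAA');
anyATC = strfind(DNA, 'ATC');
anyCGG = strfind(DNA, 'CGG');

% Reading 2: fixed reading frame starting at position 1 (assumption).
nCodons = floor(numel(DNA) / 3);                % leftover letters at the end are ignored
codons = reshape(DNA(1:3*nCodons), 3, []).';    % one codon per row
codonList = cellstr(codons);                    % cell array of 3-letter strings

frameAAA = find(strcmp(codonList, 'AAA'));      % codon numbers that are AAA
frameATC = find(strcmp(codonList, 'ATC'));
frameCGG = find(strcmp(codonList, 'CGG'));

% Convert a codon number to the position of its first letter in DNA.
frameAAApos = 3 * (frameAAA - 1) + 1;
```

Under the first reading each of the three patterns is found once, although the hits share letters; under the second, the frame splits the sequence into 'AAA' and 'TCG', so only AAA is counted. For the first 10 locations under the fixed-frame reading, something like `frameAAApos(1:min(10, numel(frameAAApos)))` would do.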