Loop through the DNA array and record all of the locations of the triplets (codons): ‘AAA’, ‘ATC’ and ‘CGG’.
Austin Shipley
on 14 Nov 2020
Edited: Sai Veeramachaneni
on 17 Nov 2020
My code so far runs, but I don't think it's correct. I am supposed to loop through the cell array and record the location of each codon, while skipping any triplet that contains a character from the preceding codon. For example, if part of the sequence contains [A,T,C,C,G,G], then the section with CCG should be skipped. I'm just not sure what the best way to do that would be.
Here is what I have so far:
fid = fopen('sequence_long.txt','r')
A = textscan(fid,'%3s');
DNA = A{1};
fclose(fid);
i = 1;
%loops through array and counts codon occurrences
%finds the index location of individual codons
while i < length(DNA)
    i = i + 1;
    if strcmp(DNA(i),'AAA')
        num_AAA = nnz(strcmp(DNA,'AAA'));
        loc_AAA = find(strcmp(DNA,'AAA'));
    elseif strcmp(DNA(i),'ATC')
        num_ATC = nnz(strcmp(DNA,'ATC'));
        loc_ATC = find(strcmp(DNA,'ATC'));
    elseif strcmp(DNA(i),'CGG')
        num_CGG = nnz(strcmp(DNA,'CGG'));
        loc_CGG = find(strcmp(DNA,'CGG'));
    end
end
fprintf('The number of AAA values is: %.f',num_AAA)
fprintf('The index location of AAA values: %.f\n',loc_AAA(1:10))
fprintf('The number of ATC values is: %.f',num_ATC)
fprintf('The index location of ATC values: %.f\n',loc_ATC(1:10))
fprintf('The number of CGG values is: %.f',num_CGG)
fprintf('The index location of CGG values: %.f\n',loc_CGG(1:10))
Accepted Answer
Sai Veeramachaneni
on 17 Nov 2020
Edited: Sai Veeramachaneni
on 17 Nov 2020
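One approach is to iterate over the sequence one character at a time and, whenever a codon is found, jump ahead by three positions so that the next comparison starts right after the matched codon. That way no triplet that overlaps a recorded codon is ever tested.
The code below illustrates this on a short example sequence.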
DNA = 'AAATCATCGGCGGATC';%Example sequence
i = 1;
loc_AAA = [];
loc_ATC = [];
loc_CGG = [];
num_AAA = 0;
num_ATC = 0;
num_CGG = 0;
while i <= length(DNA)-2
    if DNA(i)=='A' && DNA(i+1)=='A' && DNA(i+2)=='A'
        loc_AAA = [loc_AAA i];
        num_AAA = num_AAA + 1;
        i = i + 3; %Skip the next two characters
    elseif DNA(i)=='A' && DNA(i+1)=='T' && DNA(i+2)=='C'
        loc_ATC = [loc_ATC i];
        num_ATC = num_ATC + 1;
        i = i + 3;
    elseif DNA(i)=='C' && DNA(i+1)=='G' && DNA(i+2)=='G'
        loc_CGG = [loc_CGG i];
        num_CGG = num_CGG + 1;
        i = i + 3;
    else
        i = i + 1;
    end
end
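If the list of codons might grow, the same loop can be written once for any set of triplets. The following is only a sketch of that idea, not a replacement for the code above: the names seq, codons, locs, starts and hits are made up for illustration, and the commented-out line that joins the cell array from textscan into one character vector assumes the file holds nothing but the bases. The regexp call at the end is an optional cross-check, since regexp also returns leftmost, non-overlapping matches.

```matlab
% Sketch: non-overlapping codon scan for an arbitrary codon list.
seq = 'AAATCATCGGCGGATC';          % same example sequence as above
% seq = [DNA{:}];                  % assumption: join the textscan cell array
codons = {'AAA', 'ATC', 'CGG'};    % codons to look for
locs = cell(size(codons));         % locs{k} collects start indices of codons{k}

i = 1;
while i <= length(seq) - 2
    % Index of the codon that starts at position i (empty if none does)
    k = find(strcmp(seq(i:i+2), codons), 1);
    if isempty(k)
        i = i + 1;                 % no codon here: move on one character
    else
        locs{k}(end+1) = i;        % record where the codon starts
        i = i + 3;                 % continue right after the matched codon
    end
end

% Report the count and the start indices for each codon
for k = 1:numel(codons)
    fprintf('%s: %d found, start indices: %s\n', ...
        codons{k}, numel(locs{k}), mat2str(locs{k}));
end

% Optional cross-check without an explicit loop
[starts, hits] = regexp(seq, 'AAA|ATC|CGG', 'start', 'match');
```

Here locs{1}, locs{2} and locs{3} play the role of loc_AAA, loc_ATC and loc_CGG, and numel(locs{k}) gives the corresponding count. To split the regexp result by codon, something like starts(strcmp(hits, 'ATC')) should give the ATC positions.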