- preallocate and assign into the array instead
- size x() based on the length of the string, not hardcode the loop count
I want to convert a character series into a numerical series using a for loop
I have a character sequence stored in the variable DNA_SEQS = 'AGGTAT.....'. The sequence consists of four types of character, 'A', 'C', 'T' and 'G', so I used a switch/case block to generate the numerical sequence. The code I have written is:
seqs = fastaread('AF0071891.fasta');
DNA_SEQS = seqs.Sequence;
len = length(DNA_SEQS);
for j = 1:5
x = [];
a = DNA_SEQS(j);
switch a
case 'A'
v = 0;
case 'C'
v = 1;
case 'G'
v = 2;
case 'T'
v = 3;
end
x(j+1) = [x(j) v];
end
With this code I expected to get a numerical array like [0,2,2,3,0], but instead I got the error: Index exceeds matrix dimensions.
Please help
Accepted Answer
dpb
on 8 Jun 2022
Edited: dpb
on 8 Jun 2022
for j = 1:5
x = [];
a = DNA_SEQS(j);
...
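You wipe out everything you put in x every time you start through the loop again...don't do that!!! :) It is also why the error appears: x is empty when x(j) is read on the right-hand side. Initialize x once, before the loop: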
x = [];
for j = 1:5
a = DNA_SEQS(j);
...
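instead, although you should really preallocate x and take the loop count from the length of the sequence rather than hardcoding it: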
N=strlength(DNA_SEQS);
x=zeros(1,N);
for j = 1:N
a = DNA_SEQS(j);
...
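Putting those pieces together, a minimal sketch of the whole loop might look like the following. The sample sequence is hardcoded here in place of the fastaread() call, and the otherwise branch (returning NaN for anything that is not A, C, G or T) is an assumption added for illustration, not part of the original code.

```matlab
DNA_SEQS = 'AGGTAT';          % sample data; the question reads this from a FASTA file
N = strlength(DNA_SEQS);      % number of characters in the char vector
x = zeros(1, N);              % preallocate the result once, outside the loop
for j = 1:N
    switch DNA_SEQS(j)
        case 'A'
            v = 0;
        case 'C'
            v = 1;
        case 'G'
            v = 2;
        case 'T'
            v = 3;
        otherwise
            v = NaN;          % assumption: flag any unexpected character
    end
    x(j) = v;                 % assign into position j instead of concatenating
end
disp(x)
```

The key change from the question's code is x(j) = v: each value goes into its own preallocated slot, so nothing has to be read back out of x while it is being built.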
However, in MATLAB you don't need a loop; use a lookup table instead. One way (not necessarily the fastest, but pretty easy to code) would be
DNA_VALS=interp1(double('ACGT'),0:3,double(DNA_SEQS));
This would return for your sample above...
>> DNA_SEQS = 'AGGTAT';
DNA_VALS=interp1(double('ACGT'),0:3,double(DNA_SEQS))
DNA_VALS =
0 2 2 3 0 3
>>
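One thing to keep in mind with interp1() is that it interpolates: a character code that falls between two table entries (say 'B') comes back as a fractional value rather than being rejected. If that matters, a direct-index lookup table is an alternative. The sketch below is a swapped-in technique, not the method above; the name lut and the table size of 128 (7-bit ASCII) are assumptions.

```matlab
DNA_SEQS = 'AGGTAT';                 % sample data
lut = nan(1, 128);                   % one slot per 7-bit ASCII code; NaN marks unmapped characters
lut(double('ACGT')) = 0:3;           % A->0, C->1, G->2, T->3
DNA_VALS = lut(double(DNA_SEQS));    % index the table with the character codes
disp(DNA_VALS)
```

Any character other than A, C, G or T then shows up as NaN in DNA_VALS, which is easy to test for with isnan().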
More Answers (1)
DGM
on 8 Jun 2022
You can use ismember():
thisstr = 'AGGATATC';
charmap = 'ACGT';
[~,idx] = ismember(thisstr,charmap);
idx = idx-1
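ismember() returns 0 as the location of any character it cannot find in charmap, so after subtracting 1 those positions end up as -1. A small sketch of checking for that, using a hypothetical input with an 'N' in it (the variable names found and vals are mine):

```matlab
thisstr = 'AGNTAT';                          % hypothetical input containing a non-ACGT character
charmap = 'ACGT';
[found, idx] = ismember(thisstr, charmap);   % found is a logical mask, idx is 0 where not found
vals = idx - 1;                              % 0-based codes; unmatched characters become -1
if ~all(found)
    fprintf('Unmapped characters at positions: %s\n', mat2str(find(~found)));
end
disp(vals)
```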
4 Comments
dpb
on 8 Jun 2022
For exactly the reason I outlined above as a possibility -- it isn't a char() array --
>> DNA_SEQS='AGGTAT'; % assign as char() string (and array of char())
>> N=strlength(DNA_SEQS) % strlength() is same as size(x,2) here...
N =
6
>> for i=1:N,disp(DNA_SEQS(i));end % works fine for a char() array with () addressing
A
G
G
T
A
T
>> DNA_SEQS = cellstr('AGGTAT'); % redefine as a cellstr() instead...
>> N=strlength(DNA_SEQS) % strlength knows about what is in the cell
N =
6
>> for i=1:N,disp(DNA_SEQS(i));end % but it fails as you see...
{'AGGTAT'}
Index exceeds the number of array elements (1).
>>
WHY!!!???
>> size(DNA_SEQS) % because now the cellstr is a 1x1 CELL array, NOT 1x6 char() array...
ans =
1 1
>>
How to make work???
"Use the curlies, Luke!!!"
>> for i=1:N,disp(DNA_SEQS{1}(i));end
A
G
G
T
A
T
>>
NB: note the use of {1} above to "dereference" the cell array back to the char() array inside it -- the subsequent "smooth" parentheses (i) then pick the ith element from that vector, just as they did directly when it was "only" a char() array and not a char() array in a cell.
Strings behave similarly to cellstr(); you have to use {} (the "curlies") to reach inside a string array element to the individual characters that make it up.
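If you don't know ahead of time which of the three forms you will be handed, one option is to normalize to a plain char() vector before indexing. This is only a sketch under that assumption; the names inputs and seq are made up for the example.

```matlab
inputs = {'AGGTAT', {'AGGTAT'}, "AGGTAT"};   % char vector, cellstr, string scalar
for k = 1:numel(inputs)
    seq = inputs{k};
    if iscell(seq)
        seq = seq{1};        % pull the char vector out of the 1x1 cell
    end
    seq = char(seq);         % string scalar -> char vector; a char vector is unchanged
    fprintf('%s input: %d characters, first is %s\n', ...
            class(inputs{k}), numel(seq), seq(1));
end
```

After that step, plain () indexing such as seq(i) works the same way regardless of how the sequence was stored.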