MATLAB Answers

How can I compute the amino acid composition of my protein sequence data?

Nedz on 23 Apr 2020
Answered: Tim DeFreitas on 23 Apr 2020
How can I compute the amino acid composition of my protein sequences, in order to use it as input features for training my SVM classifier?
For example, one of my sequence samples is:
'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE'


Accepted Answer

Tommy on 23 Apr 2020
Edited: Tommy on 23 Apr 2020
allAA = sort('ARNDCQEGHILKMFPSTWYV');   % the 20 standard amino acids, in alphabetical order
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = histc(seq, allAA);              % occurrences of each residue (histc is not recommended in newer releases)
freq = counts/numel(seq);                % fraction of the sequence made up by each residue
for aa = allAA
    % freq is a fraction, so scale by 100 before printing it as a percentage
    fprintf('%c: %d/%d (%.4f%%)\n', aa, counts(allAA==aa), numel(seq), 100*freq(allAA==aa));
end
%{
prints:
A: 1/49 (2.0408%)
C: 0/49 (0.0000%)
D: 10/49 (20.4082%)
E: 12/49 (24.4898%)
F: 2/49 (4.0816%)
G: 1/49 (2.0408%)
H: 0/49 (0.0000%)
I: 5/49 (10.2041%)
K: 3/49 (6.1224%)
L: 4/49 (8.1633%)
M: 0/49 (0.0000%)
N: 3/49 (6.1224%)
P: 1/49 (2.0408%)
Q: 2/49 (4.0816%)
R: 0/49 (0.0000%)
S: 1/49 (2.0408%)
T: 0/49 (0.0000%)
V: 1/49 (2.0408%)
W: 0/49 (0.0000%)
Y: 3/49 (6.1224%)
%}
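
Since the question is about building SVM features from several sequences, the same counting idea could be extended to a feature matrix with one row per sequence. The following is only a minimal sketch: the variable names seqs and X are hypothetical, and the second sequence is made up for illustration.

```matlab
% Hypothetical example data: the first sequence is from the question,
% the second is invented purely for illustration.
seqs = {'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE', 'MKTAYIAKQR'};

allAA = sort('ARNDCQEGHILKMFPSTWYV');    % 20 standard residues, alphabetical
X = zeros(numel(seqs), numel(allAA));    % one row per sequence, one column per residue

for i = 1:numel(seqs)
    s = upper(seqs{i});                  % treat lower-case input the same as upper-case
    for j = 1:numel(allAA)
        % fraction of sequence i that is residue allAA(j)
        X(i, j) = sum(s == allAA(j)) / numel(s);
    end
end
```

Each row of X sums to 1 as long as the sequence contains only the 20 standard residues. Together with a vector of class labels, X could then be passed to a trainer such as fitcsvm (Statistics and Machine Learning Toolbox), assuming that toolbox is available.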


More Answers (1)

Tim DeFreitas on 23 Apr 2020
If you have the Bioinformatics Toolbox, there is also the aacount function: https://www.mathworks.com/help/bioinfo/ref/aacount.html
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = aacount(seq)
% Optional: aacount can also plot the counts, here as a bar chart
aacount(seq, 'chart', 'bar')
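
To use the aacount result as a numeric feature vector, the returned structure can be unpacked field by field. This is a rough sketch that assumes the structure has one field per standard residue, named by its one-letter code (as the linked documentation describes); aaOrder, v and freq are hypothetical names.

```matlab
seq = 'AEYDDSLIDEEEDDEDLDEFKPIVQYDNFQDEENIGIYKELEDLIEKNE';
counts = aacount(seq);                   % requires the Bioinformatics Toolbox

aaOrder = 'ARNDCQEGHILKMFPSTWYV';        % assumed field names: one-letter residue codes
v = zeros(1, numel(aaOrder));
for k = 1:numel(aaOrder)
    v(k) = counts.(aaOrder(k));          % dynamic field access, e.g. counts.A
end

freq = v / sum(v);                       % counts converted to fractions
```

Fixing the column order explicitly in aaOrder keeps the feature vectors comparable across sequences, whatever order the fields appear in.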
