Problem with regexp and split, and picking some cells

I have the following input:
>> data(1).Header
ans =
AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
I need to save the ranges and binding sites to a cell array as {'392-397', 'CAGCTG'; '413-418', 'CAGGTG';}
So I used regexp with the following code:
struKm(1).trueBinding = regexp(data(1).Header,'\s\||\:|\|','split');
This returns:
>> struKm(1).trueBinding
ans =
'AF051909' '392-397' 'CAGCTG' '' '413-418' 'CAGGTG' ''
As you can see, there are two empty cells. I tried to find out why they are there but failed.
I also tried to ignore that and continue by picking out the cells I need for the rest of my code, which are 'CAGCTG' and 'CAGGTG'. I have this code to pick them out:
[r1,r2] = ismember(struKm(1).trueBinding,set)
It returns zeros.
Can someone please help with these two issues?
Regards, A.

Accepted Answer

Azzi Abdelmalek on 2 Oct 2012
Edited: Azzi Abdelmalek on 2 Oct 2012
You can keep your code and add one line to remove the empty elements:
data = 'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';
s = regexp(data, '\s\||\:|\|', 'split');
s(cellfun(@isempty, s)) = []   % delete the empty strings from the split result
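
A note on why the empty cells appear: 'split' returns the text between consecutive delimiter matches. In this header the '|' that closes the first site is immediately followed by the ' |' that opens the second, and the string ends with a '|', so the split yields an empty string at each of those two places. As an alternative sketch (not the method used in this answer), regexp with the 'tokens' option can capture the range and the site directly, which gives the N-by-2 layout asked for in the question. The thread does not show what the variable set contained, so the cause of the zeros from ismember cannot be determined here; note that set is also the name of a built-in MATLAB function. The names pairs and knownSites below are hypothetical.

```matlab
header = 'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|';

% Each match of |range:SITE| yields one 1-by-2 cell of tokens {range, site}.
tok = regexp(header, '\|(\d+-\d+):([A-Z]+)\|', 'tokens');

% Stack the 1-by-2 token cells into an N-by-2 cell array.
pairs = vertcat(tok{:});

% Hypothetical list of motifs to look up (assumed for illustration only).
knownSites = {'CAGCTG', 'CAGGTG', 'CAACTG'};

% Compare only the site column against the list.
[isKnown, idx] = ismember(pairs(:,2), knownSites);
```

Here pairs(:,1) holds the ranges and pairs(:,2) the binding sites, so no empty cells need to be removed afterwards.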

More Answers (1)

Abdulaziz on 3 Oct 2012
Thank you for your reply.
I solved those issues, but another one has appeared.
Now I have:
struKm(i).seqNam = cellstr(regexp(data(i).Header, '\s\||\:|\|','split')); % determine the sequence name headers
struKm(i).seqNam(cellfun(@(x) isempty(x),struKm(i).seqNam))=[];
This code is in a for loop.
The result of this code is:
ans =
'AF051909' '392-397' 'CAGCTG' '413-418' 'CAGGTG'
Some seqNam entries contain only one binding site (e.g. CAGCTG). For example:
ans =
'M13483' '445-450' 'CAACTG'
Now I want to pick out only the binding sites (CAGCTG, CAGGTG, CAACTG, etc.).
I have another for loop that does this. The code:
struSize = length(struKm);
tempcell = cell(1,1);
for m=1:struSize
    if (length(struKm(m).seqNam) == 3)
        resultsk.BS{m} = struKm(m).seqNam(3);
        disp(m);
    end
    if (length(struKm(m).seqNam) == 5)
        resultsk.BS{m} = cellstr(struKm(m).seqNam([3,5]));
        %tempcell = struKm(m).seqNam([3,5]); resultsk.BS{m} = cellstr(tempcell);
        disp(m);
    end
end
And the result of this code:
>> resultsk.BS{:}
ans =
'CAGCTG' 'CAGGTG'
ans =
'CAACTG'
ans =
'CAACTG'
The problem is with the entries that have two binding sites: both sites end up side by side inside a single cell.
I need all binding sites in one row. I am still struggling with this. Can you please help?
Thank you, A

2 Comments

Post a sample of your data.
>AF051909 |392-397:CAGCTG| |413-418:CAGGTG|
tgccgctcagaaaaaaacgatctttggtgaacagtaggagccatctgagcggtgcgacgcattgtgctcccattccacacgctgcggcggccctCAGCTGtcatgcctggaaCAGGTGgtgtaaggcaatccctgggcagccgtgctccccgcccccccccgggccgaccttaaaggcgctgcgtgtgccctggctcctc
>M13483 |445-450:CAACTG|
ccttacatggtctgggggctccctggctgatcctctcccctgcccttggctccatgaatggcctcggcagtcctagcgggtgcgaaggggaccaaataaggcaaggtggcagaccgggccccccacccctgcccccggctgctcCAACTGaccctgtccatcagcgttctataaagcggccctcctggagccagccaccc
>M26773 |446-451:CAACTG|
cttacatggtctgggagccccctggctgatcctctaccctgcccttggctccaagaatggcctcagcggtcctagatggtgctaaggcgaccaaataaggcaaggtggcagatcaggggccccccacccctgcccccggctgctcCAACTGaccccgtccatcagagagctataaagctgcgctccaggcgactgacacc
>M86232 |447-452:CACTTG|
ctgtgctattctggtttggatgtgactcagaacacagttgaacattatttgaactcacagagcttgccattctggaagcacagccttatatgtagtgtccatgggcagtcctattatgggaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggcctctacagaa
>M86233 |447-452:CACTTG|
ctgtgctattctagtttggatgtgactcaggacagagttgaacattatttgaattcacagagcttgccatgctggaagcacagccttatatgtagtgtccatgggcagtcctattatggcaaagcaacttgagagaaaaggcgggtCACTTGcttgtgcgcaggtcctggaatttgaaatatccagaggccctacagaat
>X00371 |326-331:CACCTG|
gagctgtcctgcctcgccacaatggCACCTGccctaaaatagcttcccatgtgagggctagagaaaggaaaagattagaccctccctggatgagagagagaaagtgaaggagggcaggggagggggacagcgagccattgagcgatctttgtcaagcatcccagaaggtataaaaacgcccttgggaccaggcagcctca
>X53154 |440-445:CAGCTG|
cgaaggattggtaggcttgccgtcacaggacccccgctggctgactcaggggcgcaggctcttgcgggggagctggcctcccgcccccacggccacgggccctttcctggcaggacagcgggatcttgCAGCTGtcaggggaggggatgacgggggactgatgtcaggaggggatacaaatagtgccgacggctaggggg
>X59034 |442-447:CAGCTG| |461-466:CAGGTG|
accaaacacaatgacaagcctctgactcatgatctatgtagactctcagacactttacatctagtaagagtatagcgatcatgttaagcaaggcacgtctgtggccacagaaggccccaagctttgaggctgtgggcagctCAGCTGtcatgcgggcacaCAGGTGatgtaagacaatagctgtggagtcagctggcttc
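
One possible way to get all binding sites into a single row, sketched with three of the headers from the sample above. It assumes every header has the form name |range:SITE| ... so that, once the empty strings are removed, the sites sit at positions 3, 5, 7, and so on. The names headers, parts, sites and allBS are hypothetical and not from the original code.

```matlab
% Three headers taken from the sample data (without the leading '>').
headers = {'AF051909 |392-397:CAGCTG| |413-418:CAGGTG|', ...
           'M13483 |445-450:CAACTG|', ...
           'M26773 |446-451:CAACTG|'};

allBS = {};   % will become a 1-by-N row of binding-site strings
for m = 1:numel(headers)
    % Same split pattern as in the thread, then drop the empty strings.
    parts = regexp(headers{m}, '\s\||\:|\|', 'split');
    parts(cellfun(@isempty, parts)) = [];

    % Every second element from the third on is a binding site,
    % whether the header has one site or several.
    sites = parts(3:2:end);

    % Append the sites to the row instead of nesting them in a cell.
    allBS = [allBS, sites];
end
disp(allBS)
```

Indexing with 3:2:end avoids the separate length checks for 3 and 5 elements, and concatenating with square brackets keeps the result flat rather than storing a 1-by-2 cell inside one entry.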
