Compare two strings with some restrictions

Question

0 votes

Hey, how are you?

I have to compare two strings, of n and m lines respectively, to see whether they contain the same messages. The messages look like this:

!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053

!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053

!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053

!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053

!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054

!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054

!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055

As you can see, the last four digits run from 0000 to 5959: the first two are minutes and the last two are seconds. I already have code that compares all the messages in one string against the other, but now I need to compare only the messages whose ending falls within a range that we specify. Example:

!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059

This message ends with 0059, so I should compare it with all the messages that end with a number from 0000 to 0159. That is, each message is compared with the messages from one minute before to one minute after it.

4 Comments

Stephen23 on 15 Sep 2021

Edited: Stephen23 on 15 Sep 2021

"the output is another string that contains the messages that are the same in both strings"

It is not clear what "both strings" you are referring to.

Please show the exact expected output for the provided data.

flashpode on 15 Sep 2021

Edited: flashpode on 18 Sep 2021

Okay, one string is this one:

!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053

!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053

!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053

!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053

!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054

and the other string is:

"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"

"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"

"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"

"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"

so the output is another string that has the messages that are in both strings


Answer 1

Walter Roberson on 15 Sep 2021


0 votes

In https://www.mathworks.com/matlabcentral/answers/1452949-get-the-last-for-digits-as-the-time-this-message-was-sent#answer_787044 I showed you how to extract the last 4 digits of each line, as text.

The result would have been a cell array of character vectors. You can use str2double() to convert them to a set of decimal numbers.
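As a rough sketch of that extraction step (assuming the lines are held in a string array, here called `lines`, which is a name chosen only for this illustration; the two sample messages are taken from the data in this thread):

```matlab
% Two sample lines from the thread; the trailing four digits are MMSS.
lines = [
    "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
    "!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
    ];

% Take the final four digits of each line, as text.
tail = regexp(lines, '\d{4}$', 'match', 'once');

% Convert the text to numbers: "0053" becomes 53 and "0100" becomes 100.
DN = str2double(tail);
```

With a string array as input, `regexp` returns a string array rather than a cell array of character vectors; `str2double` accepts either form.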

Once you have the set of decimal numbers, referred to below as DN, then

dur = minutes(floor(DN/100)) + seconds(mod(DN,100));

If you do that for both sets of data, getting dur1 and dur2, then

[~, M1, S1] = hms(dur1);
[~, M2, S2] = hms(dur2);
[has_match0, idx0] = ismember(M1, M2);
[has_match1, idx1] = ismember(M1+1, M2);
M1_has_match = has_match0 | has_match1;
M1_match(has_match1) = idx1(has_match1);
M1_match(has_match0) = idx0(has_match0);
M1_matches = find(M1_has_match);
M2_matches = M1_match(M1_has_match);

If I got everything right, then M1_matches will be the index into the first set of durations in which there are matches, and M2_matches will be the corresponding indexes into the second set of durations that match the first set.

Any one entry in the first set of durations is only looked for once in the second set of durations, but because of the matching process, any given entry in the second set of durations could match more than one entry in the first set of durations. You did not ask for the closest match that occurs within a particular time interval: you asked for matches that occur if the second set has any entry that has the same minute as one in the first set, or is the next minute after one in the first set.
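To make that behaviour concrete, here is a small sketch with made-up minute values (the numbers and the variable names `same_minute`, `next_minute`, `idx_same`, `idx_next` are purely illustrative and not taken from the real data):

```matlab
M1 = [0; 1; 5];   % minutes in the first set (made-up values)
M2 = [1; 1; 2];   % minutes in the second set (made-up values)

% Same minute, or the minute after, present anywhere in the second set?
[same_minute, idx_same] = ismember(M1, M2);
[next_minute, idx_next] = ismember(M1 + 1, M2);
has_match = same_minute | next_minute;

% ismember reports only the first location in M2, so entry 1 of M2 is
% returned for M1(1) (via the next-minute rule) and for M1(2) (same minute),
% while entry 2 of M2 is never reported even though it has the same value.
```

So one entry of the second set can be paired with several entries of the first set, and the third entry of `M1` (minute 5) has no match at all.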

31 Comments

Walter Roberson on 16 Sep 2021


S1s = [

"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"

"!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"

"!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"

"!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053"

"!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054"

"!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054"

"!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055"

"!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*430055"

"!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*560056"

"!AIVDM,1,1,,A,D028j;0flffp,0*430056"

"!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*690056"

"!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A0056"

"!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D0056"

"!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*320057"

"!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*480057"

"!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A0057"

"!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C0058"

"!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*430059"

"!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B0059"

"!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059"

"!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*610059"

"!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*080059"

"!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"

"!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"

"!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"

"!AIVDM,1,1,,A,D028ioj"

"!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"

"!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"

"!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"

"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"

];

S2s = [

"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"

"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"

"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"

"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"

"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"

"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"

"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"

"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"

"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"

"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"

"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"

"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"

"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"

"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"

"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"

"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"

"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"

"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"

"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"

"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"

"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"

"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"

"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"

"!AIVDM,1,1,,B,D028j;0flffp,0*400006"

"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"

"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"

"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"

"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"

"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"

"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"

"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"

"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"

"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"

"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"

"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"

"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"

"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"

"!AIVDM,2,2,2,B,00000000000,2*250009"

"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"

"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"

"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"

"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"

"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"

"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"

"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"

"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"

"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"

"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"

"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"

"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"

"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"

"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"

"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"

"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"

"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"

"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"

"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"

"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"

];

msg1 = regexp(S1s, '.*(?=\d{4}$)', 'match', 'once');

msg2 = regexp(S2s, '.*(?=\d{4}$)', 'match', 'once');

t1 = regexp(S1s, '\d{4}$', 'match', 'once');

t2 = regexp(S2s, '\d{4}$', 'match', 'once');

mask1 = ismissing(msg1) | ismissing(t1);

mask2 = ismissing(msg2) | ismissing(t2);

origidx1 = (1:length(msg1));

origidx2 = (1:length(msg2));

msg1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];

msg2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];

DN1 = str2double(t1);

DN2 = str2double(t2);

dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));

dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));

[~, Min1, S1] = hms(dur1);

[~, Min2, S2] = hms(dur2);

num_msg1 = length(msg1);

msg_match = cell(num_msg1, 1);

for K = 1 : num_msg1
    % indices in msg2 whose text (without the timestamp) equals msg1(K)
    all_match_idx = find(msg1(K) == msg2);
    if isempty(all_match_idx)
        fprintf('No text match for line #%d -> "%s"\n', origidx1(K), msg1(K));
        continue;
    end
    fprintf('potential match for line #%d -> "%s", checking times\n', origidx1(K), msg1(K));
    disp(K), disp(all_match_idx)
    % keep the text matches whose minute equals Min1(K) or Min1(K)+1
    complete_match_idx = all_match_idx(Min1(K) == Min2(all_match_idx) | Min1(K) == Min2(all_match_idx) - 1);
    msg_match{K} = complete_match_idx;
    if isempty(complete_match_idx)
        fprintf('line #%d -> "%s" matched text but not time\n', origidx1(K), msg1(K));
    else
        fprintf('line #%d -> "%s" matches on time too! Matches are:\n', origidx1(K), msg1(K));
        msg2(complete_match_idx)
    end
end

No text match for line #1 -> "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C"
No text match for line #2 -> "!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B"
No text match for line #3 -> "!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*39"
No text match for line #4 -> "!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E"
No text match for line #5 -> "!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C"
No text match for line #6 -> "!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D"
No text match for line #7 -> "!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B"
No text match for line #8 -> "!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*43"
No text match for line #9 -> "!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*56"
No text match for line #10 -> "!AIVDM,1,1,,A,D028j;0flffp,0*43"
No text match for line #11 -> "!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*69"
No text match for line #12 -> "!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A"
No text match for line #13 -> "!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D"
No text match for line #14 -> "!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*32"
No text match for line #15 -> "!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*48"
No text match for line #16 -> "!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A"
No text match for line #17 -> "!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C"
No text match for line #18 -> "!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*43"
No text match for line #19 -> "!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B"
No text match for line #20 -> "!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*02"
No text match for line #21 -> "!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*61"
No text match for line #22 -> "!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*08"
No text match for line #23 -> "!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F"
No text match for line #24 -> "!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*49"
No text match for line #25 -> "!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*01"
No text match for line #27 -> "!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*10"
No text match for line #28 -> "!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E"
No text match for line #29 -> "!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*03"
No text match for line #30 -> "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D"

found_something_at = find(~cellfun(@isempty, msg_match))

found_something_at = 0×1 empty double column vector

Walter Roberson on 17 Sep 2021


"I want to compare with all the messages sent one minute before and after of the set 1"

What does it mean to "compare" ??

If you are going strictly by time, then notice that your first string input

"!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"

is minute 00, and so matching on time would be asking to match strings with minute 00 or 01. Which strings are those?

  "!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"
"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"
"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"
"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"
"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"
"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"
"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"
"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"
"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"
"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"
"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"
"!AIVDM,1,1,,B,D028j;0flffp,0*400006"
"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"
"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"
"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"
"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"
"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"
"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"
"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"
"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"
"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"
"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"
"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"
"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"
"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"
"!AIVDM,2,2,2,B,00000000000,2*250009"
"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"
"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"
"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"
"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"
"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"
"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"
"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"
"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"
"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"
"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"
"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"
"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"
"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"
"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"
"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"
"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"
"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"
"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"
"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"

EVERY one of those is minute 00, so EVERY one of them would match on time.

What does not match on time? Well, the last 8 of the S1 entries

  "!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
  "!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"
  "!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"
  "!AIVDM,1,1,,A,D028ioj"
  "!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"
  "!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"
  "!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"
  "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"

either have no valid time (the 4th one) or are in minute 01. You previously asked only to look at the same and the next minute, so minute 01 in the first strings should match minute 01 or 02 in the second strings, and none of those exist, so the last 8 would not match under the old rules.

"Look: S1 message 1 has t0(time) so I want to compare it with the messages from S2 with t0 +- 1 minute."

And you just modified the rules to also look backwards by 1 minute. So the 0102 in the input would look back up to minute 00 in t2... which would match everything in t2.

Under the rules you just defined, everything in t1 matches everything in t2, with the exception of

"!AIVDM,1,1,,A,D028ioj"

But... saying +/- 1 minute might mean that you want the difference to be no more than 60 seconds, which is different than what you had asked for before, which involved only looking at the minute number. Should a t1 entry of 0000 match a t2 entry of 0105 because the minute 00 is +/- 1 to the minute 01 in t2? Or would you want the match to fail because the time difference would be more than 60 seconds?
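The 0000 versus 0105 case can be checked directly. This is only a sketch of the two candidate rules; the names `match_by_minute` and `match_by_seconds` are made up for the illustration:

```matlab
DN1 = 0;      % t1 entry 0000
DN2 = 105;    % t2 entry 0105

dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));

% Rule A: compare only the minute numbers (00 versus 01).
[~, m1] = hms(dur1);
[~, m2] = hms(dur2);
match_by_minute = abs(m1 - m2) <= 1;

% Rule B: compare the elapsed time (65 seconds apart).
match_by_seconds = abs(dur1 - dur2) <= seconds(60);
```

Rule A accepts the pair and Rule B rejects it, so the two readings of "+/- 1 minute" really do give different answers.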

With the data you have, every entry in t1 is within +/- 60 seconds of every entry in t2, with the exception of the

"!AIVDM,1,1,,A,D028ioj"

entry which has no time.

So... matching only on time is not going to be useful.

flashpode on 17 Sep 2021

Here is the code you gave me, with some differences. The lines I have commented out with % are the ones where I do not understand why you included them, because they do not change anything.

msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');

msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');

t1 = regexp(AIS1, '\d{4}$', 'match', 'once');

t2 = regexp(AIS2, '\d{4}$', 'match', 'once');

mask1 = ismissing(msg_AIS1) | ismissing(t1);

mask2 = ismissing(msg_AIS2) | ismissing(t2);

origidx1 = (1:length(msg_AIS1));

origidx2 = (1:length(msg_AIS2));

msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];

msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];

DN1 = str2double(t1);

DN2 = str2double(t2);

dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));

dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));

[~, M1, S1] = hms(dur1); % Split the time into 2 variables; use M1 to build the ranges

[~, M2, S2] = hms(dur2);

num_msg_AIS1 = length(msg_AIS1);

msg_match = cell(num_msg_AIS1, 1);

for K = 1 : num_msg_AIS1

all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find equal messages

if isempty(all_match_AIS);

fprintf('No hay coincidencias para la linia #%d -> "%s"\n', origidx1(K), msg_AIS1(K));

continue;

end

fprintf('potencial coincidencia #%d -> "%s", checking times\n', origidx1(K), msg_AIS1(K));

disp(K), disp(all_match_AIS)

complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 |M1(K) == M2(all_match_AIS) + 1);% Range of +-1 minute around each message

msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched

% if isempty(complete_match_AIS)

% fprintf('line %#d -> "%s" coincide texto pero no tiempo\n', origidx1(K), msg_AIS1(K));

% else

% fprintf('line %#d -> "%s" matches on time too! Matches are:\n', origidx1(K), msg_AIS1(K));

% msg_AIS2(complete_match_AIS)

% end

end

%# find the empty cells (creates the variable)

emptyCells = cellfun(@isempty,msg_match);

%# remove the empty cells

msg_match(emptyCells) = [];

Then I removed the empty cells, but some cells contain a 2x1 or 3x1 string array of messages. Why are those messages in a single string array? If they are repeated, I want each one on its own line. I am going to do that now.

And to answer your question: it would be the second option, as you have already done. I am really grateful.
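On the 2x1 and 3x1 cells: `find` returns every matching index, so a cell holds several strings whenever a message matches more than one line of the other set. One possible way to put each message on its own line is to concatenate the cell contents vertically. A minimal sketch, using placeholder strings rather than real messages:

```matlab
% Placeholder contents: one cell with a single match, one with two matches.
msg_match = {
    "A0053"
    ["B0054"; "B0055"]
    };

% Stack the contents of all cells into one N-by-1 string array.
flat = vertcat(msg_match{:});
```

This assumes every cell holds a column (N-by-1) string array; if the cells hold rows, `horzcat` followed by a transpose would be the equivalent.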

Walter Roberson on 18 Sep 2021


S1s = [
  "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
  "!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"
  "!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"
  "!AIVDM,1,1,,B,13EcsW7P00P:07JGc9Wh0?wb2<0m,0*0E0053"
  "!AIVDM,1,1,,A,13Efqs800109q6fGb0tHhq?d2L1<,0*3C0054"
  "!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054"
  "!AIVDM,1,1,,B,33F6AD00@0P9ud6GbtWAQmob22rA,0*2B0055"
  "!AIVDM,1,1,,B,13F9RTPP00P9rL0GasQf4?wf2D1E,0*430055"
  "!AIVDM,1,1,,A,4028j;1vDfG0o09cG0Gdh4i000S:,0*560056"
  "!AIVDM,1,1,,A,D028j;0flffp,0*430056"
  "!AIVDM,1,1,,B,13E`977P00P:06:Gc8DkW?wh2@Q3,0*690056"
  "!AIVDM,1,1,,A,13EpM3PP0009nVdGb9Itfwwh0HQ6,0*0A0056"
  "!AIVDM,1,1,,B,13GQ:Fw01@P9qH6GaS:WiVEh2<1E,0*6D0056"
  "!AIVDM,1,1,,B,13cq;9000IP9saTGb3d0b0ad8<1s,0*320057"
  "!AIVDM,1,1,,A,39NSCRU000P9uM`GbVRU=@qD0000,0*480057"
  "!AIVDM,1,1,,A,13Esmv000009qWBGb=BLHAUl0L1U,0*5A0057"
  "!AIVDM,1,1,,A,13F:b60P0J09sKpGb1UhGOwl0<1l,0*3C0058"
  "!AIVDM,1,1,,A,13EaMT?000P9wK2Gblptiooh0D1;,0*430059"
  "!AIVDM,1,1,,A,13GNje0P00P9nebGb5nv4?wl00SB,0*4B0059"
  "!AIVDM,1,1,,A,13GQ>C@P00P9rHrGasGf4?wn20SK,0*020059"
  "!AIVDM,1,1,,B,H3dPfW4UC=D7@?>q82knjo2P9430,0*610059"
  "!AIVDM,1,1,,B,13GPhM0P00P9rGPGast>4?wn2<1C,0*080059"
  "!AIVDM,1,1,,A,13ErMfPP01P9rG0Gasc>4?wn26p4,0*0F0100"
  "!AIVDM,1,1,,A,39NSVP500009tArGbL07q6Mn0F;r,0*490100"
  "!AIVDM,1,1,,A,4028ioivDfG0s09kDvGag6G006p0,0*010100"
  "!AIVDM,1,1,,A,D028ioj"
  "!AIVDM,1,1,,A,4028jJ1vDfG1009cHtGdh1g026p4,0*100100"
  "!AIVDM,1,1,,B,137FrD0v2u0:=9TGRwMVu5N00H0j,0*5E0101"
  "!AIVDM,1,1,,B,19NS@=@01qP9u@fGQs3PbP`40H0l,0*030101"
  "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102"
  ];
  
S2s = [
"!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
"!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
"!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
"!AIVDM,1,1,,A,D028ioj<Tffp,0*2C0000"
"!AIVDM,1,1,,B,19NS@=@01qP9tp4GQkJ0bh`200SP,0*780000"
"!AIVDM,1,1,,B,137FrD0v2u0:=4pGS;s6u5On00SJ,0*000000"
"!AIVDM,1,1,,A,4028jJ1vDfG0009cIVGdh2?0280S,0*400000"
"!AIVDM,1,1,,B,H3GQ9khl4LLTF0l5T0000000000,2*070001"
"!AIVDM,1,1,,A,H33mw2Q>uV0luHTpN3800000000,2*080001"
"!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
"!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
"!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"
"!AIVDM,1,1,,A,13EuB00P0009n`TGb82:ugv600RQ,0*110003"
"!AIVDM,1,1,,A,13EsReP00009vQ`Gbj65gPh400SJ,0*350003"
"!AIVDM,1,1,,B,33EtT>5000P9nI2Gb8H5FP<60Dm:,0*580004"
"!AIVDM,1,1,,B,13Efqs800109q6DGb0wHhq@826p0,0*0A0004"
"!AIVDM,1,1,,A,13F9RTPP00P9rKpGasQf4?v820Sf,0*6D0004"
"!AIVDM,1,1,,B,13EmCs70010:0;bGc<Lbh3280<14,0*340004"
"!AIVDM,1,1,,A,13EcsW7P00P:07PGc9Ws@gv82D0l,0*060004"
"!AIVDM,1,1,,B,13EpM3PP0009nVPGb9EJG?v:0<1N,0*590005"
"!AIVDM,1,1,,A,15AMJH00000:1i6Ga0oP0Af:0<16,0*470005"
"!AIVDM,1,1,,A,13cq;9000GP9sUlGb2IPRPb680T5,0*510005"
"!AIVDM,1,1,,B,4028j;1vDfG0509cFhGdh5Q0083a,0*5C0006"
"!AIVDM,1,1,,B,D028j;0flffp,0*400006"
"!AIVDM,1,1,,A,137FrD032u0:=5<GS;6Vu5N:0<1=,0*620006"
"!AIVDM,1,1,,B,13Esmv000009qW:Gb=BLHAT@0H4>,0*660007"
"!AIVDM,1,1,,A,13E`977P00P:06DGc8D00?v>26p0,0*2B0007"
"!AIVDM,1,1,,A,33cm<M1th209wjpGb066k1v<0P00,0*470007"
"!AIVDM,1,1,,A,13GQ:Fw01?P9qW4GaW=7d6:>26p0,0*150007"
"!AIVDM,1,1,,B,13F:b60P0I09sHjGb0D0Bwv@0<1i,0*780008"
"!AIVDM,1,1,,B,B3E`?Q00002Q8LUrpW7Q3wT5kP06,0*760008"
"!AIVDM,1,1,,B,H3`fKwPiDp40000000000000000,2*5F0008"
"!AIVDM,1,1,,B,H3`fKwTTDBE5847@4lpnl0200320,0*700008"
"!AIVDM,1,1,,B,33EaMT?000P9wK4Gblq<iop<00jk,0*390009"
"!AIVDM,1,1,,B,H4hJ<S0l58T4R118Tp<E=>0TTV0,2*450009"
"!AIVDM,1,1,,A,13GPhM0P00P9rGHGast>4?vB20Sr,0*610009"
"!AIVDM,2,1,2,B,53cq;982CRFhT8P<000`thiV1H4p4@Tt00000017D1@CC4en0K1DhPkP0000,0*2C0009"
"!AIVDM,2,2,2,B,00000000000,2*250009"
"!AIVDM,1,1,,B,13GQ>C@P00P9rHjGasGf4?vB2@5d,0*0D0009"
"!AIVDM,1,1,,A,7000003dTINH,0*6D0010"
"!AIVDM,1,1,,A,19NS@=@01qP9tt6GQlahb@`F0H65,0*2D0010"
"!AIVDM,1,1,,A,33FMMd0P0009o1fGapC<Uwv@000k,0*330010"
"!AIVDM,1,1,,B,4028ioivDfG0909kDvGag6G00@6;,0*730010"
"!AIVDM,1,1,,B,D028ioj<Tffp,0*2F0010"
"!AIVDM,2,1,3,A,53FMMd82;AMHD4i4001=049DpdE:3C400000001I081234He0=hUDTR@CPK0,0*030010"
"!AIVDM,2,2,3,A,hDm1C33kP00,2*1F0010"
"!AIVDM,1,1,,B,13ErMfPP00P9rFrGasc>4?vD20RI,0*3C0010"
"!AIVDM,1,1,,B,4028jJ1vDfG0:09cIRGdh1g0286J,0*090010"
"!AIVDM,1,1,,B,13GNje0P00P9nePGb5mN4?vD00S=,0*170011"
"!AIVDM,1,1,,B,13EoPo7P00P:0IjGc:d00?vF2@6T,0*490011"
"!AIVDM,1,1,,A,13FtuD?P00P9tuDGbFw4JgvH00T<,0*3F0011"
"!AIVDM,1,1,,B,H3GQ9klTC=D4s;H51npno0106400,0*040011"
"!AIVDM,1,1,,B,137FrD0uBu0:=68GS9P6u5NF0@7B,0*2D0012"
"!AIVDM,1,1,,A,H33mw2TTCBD6ubp00000001@4440,0*170012"
"!AIVDM,1,1,,B,H4hJ<S4T1=30000J7;FoPP1P<560,0*4A0013"
"!AIVDM,1,1,,A,B3EpfoP0002OoB5rlUwQ3wT5kP06,0*1B0013"
"!AIVDM,1,1,,B,15AMJH00000:1i6Ga0pP0AhL0D16,0*5B0014"
"!AIVDM,1,1,,B,13F9RTPP00P9rKrGasQf4?vL288J,0*570014"
      ];
AIS1 = S1s;
AIS2 = S2s;
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
   time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
   msg_match{K} = AIS2(time_mask);
end
%# find the empty cells (creates the variable)
emptyCells = cellfun(@isempty,msg_match);
%# remove the empty cells
AIS1_with_matches = AIS1;
AIS1_with_matches(emptyCells) = [];
msg_match(emptyCells) = [];
%cross-checks to see that everything worked okay
AIS1_with_matches(1:3)
ans = 3×1 string array
    "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"
    "!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"
    "!AIVDM,1,1,,B,15AMJH00000:1i8Ga0v@0Akb00Sw,0*390053"
msg_match(1:3)
ans = 3×1 cell array
    {58×1 string}
    {58×1 string}
    {58×1 string}
msg_match{1}(1:3)
ans = 3×1 string array
    "!AIVDM,1,1,,A,13ErMfPP00P9rFpGasc>4?wn2802,0*070000"
    "!AIVDM,1,1,,B,13FMMd0P0009o1jGapD=5gwl06p0,0*780000"
    "!AIVDM,1,1,,A,4028ioivDfFss09kDvGag6G0080D,0*790000"
AIS1_with_matches{end}
ans = '!AIVDM,1,1,,B,13FtuD?P00P9tuDGbG1TJgv40H1;,0*7D0102'
msg_match{end}(1:3)
ans = 3×1 string array
    "!AIVDM,1,1,,B,13FtuD?P00P9tuDGbFw4Jgv40L1f,0*030002"
    "!AIVDM,1,1,,B,D028jJ03`N?b<`O6Dl<O6D0,2*350002"
    "!AIVDM,1,1,,B,137JlD51h0P9tddGbCQSm0j2081e,0*0E0002"

Yes, it worked. The entries with time 0102 are more than 1 minute from the entries with 0000 and 0001 so the 0000 and 0001 did not make it into the match list.

The outputs here are AIS1_with_matches and msg_match. AIS1_with_matches is the list of messages in AIS1 that match something inside AIS2. Then for each of those entries, msg_match is a cell array of all of the messages within +/- 1 minute in AIS2.

Notice that most messages are repeated a lot, since most messages are within 1 minute of most entries.
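If the goal is to cut down those repeats, one option is to require both conditions at once: the same text (timestamp removed) and a time difference of at most one minute. The following is only a sketch with made-up payloads and times, not a drop-in replacement for the code above:

```matlab
% Made-up payloads (timestamp already stripped) and made-up times.
msg1 = ["AAA"; "BBB"];
msg2 = ["AAA"; "AAA"; "BBB"];
dur1 = seconds([10; 20]);
dur2 = seconds([40; 200; 300]);

msg_match = cell(numel(msg1), 1);
for K = 1:numel(msg1)
    same_text = msg1(K) == msg2;                          % text equality
    close_in_time = abs(dur2 - dur1(K)) <= minutes(1);    % within 60 seconds
    msg_match{K} = find(same_text & close_in_time);       % indices into msg2
end
```

Here "AAA" matches only the first entry of `msg2` (the second copy is 190 seconds away), and "BBB" matches nothing because its only text match is 280 seconds away.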

flashpode on 18 Sep 2021


And this is the code:

linia_dolenta1=[];
linia_dolenta2=[];
N=size(AIS1,1)
P=size(AIS2,1)
for i=1:1:N 
    seq1=AIS1(i);
    linia=convertStringsToChars(seq1);
   
    if length(linia)<15
        linia_dolenta1 = [linia_dolenta1,i];
    end  
end
for j=1:1:P
    
     seq2=AIS2(j);
     linia=convertStringsToChars(seq2);
     
      if length(linia)<15
          linia_dolenta2 = [linia_dolenta2,j];
          
      end
end 
     
size(AIS1)
size(AIS2)
AIS1([linia_dolenta1],:) = [];
AIS2([linia_dolenta2],:) = [];
size(AIS1)
size(AIS2)
N=size(AIS1,1)
P=size(AIS2,1)
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
 mask1 = ismissing(msg_AIS1) | ismissing(t1);
 mask2 = ismissing(msg_AIS2) | ismissing(t2);
origi_AIS1 = (1:length(msg_AIS1));
origi_AIS2 = (1:length(msg_AIS2));
 msg_AIS1(mask1) = []; t1(mask1) = []; origi_AIS1(mask1) = [];
 msg_AIS2(mask2) = []; t2(mask2) = []; origi_AIS2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
[~, M1, S1] = hms(dur1); % Split the time into 2 variables; use M1 to build the ranges
[~, M2, S2] = hms(dur2);
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
   all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find equal messages
   if isempty(all_match_AIS);
       fprintf('No hay coincidencias para la linia #%d -> "%s"', origi_AIS1(K), msg_AIS1(K)); % '%s' for a string
       continue;
   end
   fprintf('potencial coincidencia #%d -> "%s", checking times', origi_AIS1(K), msg_AIS1(K));
   disp(K), disp(all_match_AIS)
   complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 |M1(K) == M2(all_match_AIS) + 1);% Range of +-1 minute around each message
   msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched
 if isempty(complete_match_AIS)
        fprintf('line %#d -> "%s" coincide texto pero no tiempo\n', origi_AIS1(K), msg_AIS1(K));
     else
        fprintf('line %#d -> "%s" Tambien coincide el tiempo Son:\n', origi_AIS1(K), msg_AIS1(K));
       msg_AIS2(complete_match_AIS)
 end
end
%# find the empty cells (creates the variable)
emptyCells = cellfun(@isempty,msg_match);
%# remove the empty cells
msg_match(emptyCells) = [];
% find(strcmp(msg_match, string))
[nRows, ~] = cellfun(@size,msg_match); 
isMultiRow = nRows>1; 
msg_match(isMultiRow) = cellfun(@(a) {a'}, msg_match(isMultiRow));
msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem

Walter Roberson on 18 Sep 2021


Revised code:

AIS1_file = '2021030100AIS1.txt';
AIS2_file = '2021030100AIS2.txt';
AIS1 = string(readlines(AIS1_file));
AIS2 = string(readlines(AIS2_file));
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
for K = 1 : num_msg_AIS1
   time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
   msg_match{K} = reshape(AIS2(time_mask), 1, []);  %user wants rows
end
%# find the empty cells (creates the variable)
emptyCells = cellfun(@isempty,msg_match);
%# remove the empty cells
AIS1_with_matches = AIS1;
AIS1_with_matches(emptyCells) = [];
msg_match(emptyCells) = [];
nRows = cellfun(@length, msg_match); 
isMultiRow = nRows>1; 
%msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem

Walter Roberson on 23 Sep 2021

AIS1_file = '2021030100AIS1.txt';
AIS2_file = '2021030100AIS2.txt';
AIS1 = string(readlines(AIS1_file));
AIS2 = string(readlines(AIS2_file));
AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once');
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once');
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
mask1 = ismissing(msg_AIS1) | ismissing(t1);
mask2 = ismissing(msg_AIS2) | ismissing(t2);
origidx1 = (1:length(msg_AIS1));
origidx2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; t1(mask1) = []; origidx1(mask1) = [];
msg_AIS2(mask2) = []; t2(mask2) = []; origidx2(mask2) = [];
DN1 = str2double(t1);
DN2 = str2double(t2);
dur1 = minutes(floor(DN1/100)) + seconds(mod(DN1,100));
dur2 = minutes(floor(DN2/100)) + seconds(mod(DN2,100));
num_msg_AIS1 = length(msg_AIS1);
msg_match = cell(num_msg_AIS1, 1);
matches_anything_in_AIS1 = false(size(dur2));  % one flag per (filtered) AIS2 message
for K = 1 : num_msg_AIS1
   time_mask = isbetween(dur2, dur1(K)-minutes(1), dur1(K)+minutes(1));
   matches_anything_in_AIS1 = matches_anything_in_AIS1 | time_mask;
   % origidx2 maps the filtered time_mask back to positions in the unfiltered AIS2
   msg_match{K} = reshape(AIS2(origidx2(time_mask)), 1, []);  %user wants columns
end
% find the empty cells (creates the variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
AIS1_NO_MATCHING = AIS1(origidx1(emptyCells));    % origidx1 maps back to the unfiltered AIS1
AIS1_with_matches = AIS1(origidx1(~emptyCells));
msg_match(emptyCells) = [];
AIS2_NO_MATCHING = AIS2(origidx2(~matches_anything_in_AIS1));
nRows = cellfun(@length, msg_match); 
isMultiRow = nRows>1; 
%msg_match(isMultiRow) = cellfun(@(a){strsplit(a,'!AIV')},msg_match(isMultiRow)); % here I got the problem
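
The bookkeeping for the unmatched AIS2 entries can be seen in isolation with a small sketch. The durations below are hypothetical, and the variable names matched and unmatched2 are introduced here only for illustration. The idea is to OR each per-message window mask into one running mask, and whatever is still false at the end was never inside any window.

```matlab
% Hypothetical times, in seconds since the start of the hour
dur1 = seconds([10; 500]);           % stands in for the AIS1 times
dur2 = seconds([0; 60; 300; 1000]);  % stands in for the AIS2 times

% One flag per dur2 entry, initially "matched by nothing"
matched = false(size(dur2));
for K = 1:numel(dur1)
    in_window = isbetween(dur2, dur1(K) - minutes(1), dur1(K) + minutes(1));
    matched = matched | in_window;   % accumulate across all dur1 entries
end

% Entries of dur2 that fell inside no window at all
unmatched2 = dur2(~matched);
```

Here the first window covers 0 s and 60 s, the second covers nothing, so 300 s and 1000 s end up in unmatched2.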

flashpode on 23 Sep 2021

Well, I meant with this code:

AIS1(strlength(AIS1) < 15) = [];
AIS2(strlength(AIS2) < 15) = [];
N=size(AIS1,1); % Important: must come after the lines above, otherwise the code raised an error
msg_AIS1 = regexp(AIS1, '.*(?=\d{4}$)', 'match', 'once'); % the whole message except the last 4 digits
msg_AIS2 = regexp(AIS2, '.*(?=\d{4}$)', 'match', 'once');
t1 = regexp(AIS1, '\d{4}$', 'match', 'once'); % extract the last 4 digits
t2 = regexp(AIS2, '\d{4}$', 'match', 'once');
Time_AIS1 = duration(strcat('00:',extractBefore(t1,3),':',extractAfter(t1,2))); % convert to hh:mm:ss format
Time_AIS1 = Time_AIS1+hours(cumsum([0;diff(Time_AIS1)<0])); % add one hour each time mm:ss wraps around
Time_AIS2 = duration(strcat('00:',extractBefore(t2,3),':',extractAfter(t2,2)));
Time_AIS2 = Time_AIS2+hours(cumsum([0;diff(Time_AIS2)<0]));
mask1 = ismissing(msg_AIS1) | ismissing(Time_AIS1);
mask2 = ismissing(msg_AIS2) | ismissing(Time_AIS2);
origi_AIS1 = (1:length(msg_AIS1));
origi_AIS2 = (1:length(msg_AIS2));
msg_AIS1(mask1) = []; Time_AIS1(mask1) = []; origi_AIS1(mask1) = [];
msg_AIS2(mask2) = []; Time_AIS2(mask2) = []; origi_AIS2(mask2) = [];
[H1, M1, S1] = hms(Time_AIS1); % split the time into separate variables; M1 is used to build the ranges
[H2, M2, S2] = hms(Time_AIS2);  
msg_match = cell(N, 1);
for K = 1:1:N
   all_match_AIS = find(msg_AIS1(K) == msg_AIS2); % find identical messages
   if isempty(all_match_AIS) % fprintf to write data to a text file
%        fprintf('No matches for line #%d -> "%s"\n', origi_AIS1(K), msg_AIS1(K)); % '%s' for a string
       continue;
   end
%    fprintf('potential match #%d -> "%s", checking times\n', origi_AIS1(K), msg_AIS1(K));
%    disp(K), disp(all_match_AIS)
   if H1(K)== H2(all_match_AIS)
   % build the range of matching minutes
      complete_match_AIS = all_match_AIS(M1(K) == M2(all_match_AIS) | M1(K) == M2(all_match_AIS) - 1 | M1(K) == M2(all_match_AIS) + 1);% range of +-1 minute around each message
      msg_match{K} = msg_AIS1(complete_match_AIS); % take the messages from the lines that matched. IMPORTANT
   end 
 if isempty(complete_match_AIS)
%         fprintf('line %#d -> "%s" text matches but time does not\n', origi_AIS1(K), msg_AIS1(K));
     else
%        fprintf ('line %#d -> "%s" time matches too. The results are:\n', origi_AIS1(K), msg_AIS1(K));
        msg_AIS2(complete_match_AIS) % IMPORTANT
 end
 % find the empty cells (creates the variable)
emptyCells = cellfun(@isempty,msg_match);
% remove the empty cells
msg_match(emptyCells) = [];
% pull the strings out of the cell (cat) to concatenate them
Matching_msg = cellstr(cat(1, msg_match{:}));
end
Matching_msg = string(Matching_msg);

flashpode on 23 Sep 2021

Because it does not work for me; I mean, it does not do the comparison, and I do not know why.

flashpode on 23 Sep 2021

And the other one is much clearer to me.

Answer 2

chrisw23 on 22 Sep 2021

1 vote

strEx = "!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053 !AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053";

% check/modify the expression at https://regex101.com/

% named expr rather than exp so that it does not shadow the built-in exp function
expr = "(?<prefix>!\w*),(?<ident1>\d),(?<ident2>\d),,(?<ident3>\w),(?<strLoad>[\w\d:?<>@`]*),(?<time>[*\d\w]*)";

tbl = struct2table(regexp(strEx,expr,'names'))

This is just an example of how to parse text with a simple regular expression that uses named groups. I use the website mentioned above to write and test expressions. The table gives easy access for further processing (e.g. datetime conversion), as shown previously. Look at string-based comparison functions such as 'contains' or 'matches', e.g. tbl.strLoad.contains("137JlD52h0P9td"), which returns a logical index for accessing the matches.
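
As a rough illustration of the logical-index idea, the sketch below uses two sample messages from the question. Instead of the regular expression it swaps in split to pull out the payload, assuming the payload is always the sixth comma-separated field; the variable names msgs, payload and idx are made up for this example.

```matlab
% Two sample messages from the question, as a string column
msgs = ["!AIVDM,1,1,,A,137JlD52h0P9tdRGbCQSm0kV0<1p,0*4C0053"; ...
        "!AIVDM,1,1,,B,13EsReP00009vQ`Gbj65gPiV00Sd,0*7B0053"];

% Split every message at the commas (each message has 7 fields)
parts = split(msgs, ",");

% Assumption: the payload is the sixth field
payload = parts(:, 6);

% Logical index of the messages whose payload contains the search text
idx = contains(payload, "137JlD52h0P9td");

% Use the logical index to pick the matching messages
found = msgs(idx);
```

The same idx can be used to index any other array that has one entry per message, for example a vector of parsed times.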

Hope it helps

Christian

2 Comments

Walter Roberson on 23 Sep 2021

[\w\d:?<>@`]

I think that could more easily be [^,] which is "anything other than a comma"
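
A minimal sketch of that substitution, applied to one of the sample messages from the question whose payload contains an "=" character (which the explicit character class does not list). The expression is the one from the answer with only the strLoad class changed; the variable names msg, expr and payload are chosen here for illustration.

```matlab
% Sample message from the question; note the "=" inside the payload
msg = "!AIVDM,1,1,,A,137FrD02Bu0:=9@GS16Vu5O`00S<,0*5D0054";

% Same named groups as in the answer, but the payload is "anything other than a comma"
expr = "(?<prefix>!\w*),(?<ident1>\d),(?<ident2>\d),,(?<ident3>\w),(?<strLoad>[^,]*),(?<time>[*\d\w]*)";

parts = regexp(msg, expr, 'names');

% string() makes the comparison independent of whether the field is char or string
payload = string(parts.strLoad);
suffix  = string(parts.time);
```

With the negated class there is no need to enumerate every character that can appear in a payload, at the cost of accepting fields that are not valid payloads.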

chrisw23 on 23 Sep 2021

You're right. "This is just an example..." and not a 'best code' competition.
