Find continuous file name jump

Tsuwei Tan on 9 May 2018
Commented: Tsuwei Tan on 9 May 2018
I have tens of thousands of files with names like the following:
SHARK_225054651_41_0547_r001
SHARK_225054651_41_0548_r005
SHARK_225054651_41_0548_r009
...
SHARK_225054651_41_0619_r121
SHARK_225054651_41_0620_r125
...
SHARK_225062101_41_0621_r001
SHARK_225062101_41_0621_r005
SHARK_225062101_41_0622_r009
...
SHARK_225062101_41_0653_r121
SHARK_225062101_41_0654_r125
Each file name ends in _r%%%, where the three digits %%% run through 001, 005, ..., 121, 125: thirty-two values in total, in increments of four, all with the same SHARK_%%%%%%%%%_%%_%%%% part before _r%%%. Then another set of files starts over with the same r%%% sequence.
However, the actual files have "jumps": for instance, r005 may be missing between r001 and r009.
Is there a way to read the file names in some kind of loop and pick out the missing ones?

Accepted Answer

Stephen23 on 9 May 2018
Edited: Stephen23 on 9 May 2018
C = { % fake data:
'SHARK_225054651_41_0548_r001'
'SHARK_225054651_41_0548_r005'
'SHARK_225054651_41_0548_r009'
'SHARK_225054651_41_0548_r021'
'SHARK_225054651_41_0548_r025'
'SHARK_225062101_41_0621_r005'
'SHARK_225062101_41_0621_r009'
'SHARK_225062101_41_0621_r013'
'SHARK_225062101_41_0621_r025'
};
T = regexp(C,'^(\w+)r(\d{3})$','tokens','once'); % split
A = cellfun(@(c)c(1),T); % 1st token
B = cellfun(@(c)c(2),T); % 2nd token
[U,~,X] = unique(A(:)); % 1st token unique only
V = str2double(B(:)); % 2nd token -> numeric values
G = 1:4:25; % the required values (change this to 1:4:125 for the real files).
C = accumarray(X,V,[],@(v){setdiff(G,v)});
The outputs of interest to you are:
  • U: the unique groups, e.g. 'SHARK_225054651_41_0548_' and 'SHARK_225062101_41_0621_' in my example fake data.
  • C: the missing rXXX values for each group, in the same order as U.
These outputs are shown here:
>> U{1}
ans = SHARK_225054651_41_0548_
>> C{1}
ans =
13 17
>> U{2}
ans = SHARK_225062101_41_0621_
>> C{2}
ans =
1 17 21
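If the full names of the missing files are wanted rather than just the numbers, each prefix in U can be recombined with its missing values. This is only a minimal sketch: it copies the example values of U and C shown above so that it runs on its own, and the variable name N is purely illustrative.

```matlab
% Example values copied from the outputs shown above:
U = {'SHARK_225054651_41_0548_'; 'SHARK_225062101_41_0621_'};
C = {[13 17]; [1 17 21]};
% Rebuild each missing name as <prefix>r<three-digit number>:
N = cell(numel(U),1); % N is an illustrative name, not part of the code above
for k = 1:numel(U)
    N{k} = arrayfun(@(n) sprintf('%sr%03d',U{k},n), C{k}, 'UniformOutput',false);
end
```

Each N{k} is then a cell array of the names missing from group k, and is empty for a group with no gaps.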
You could easily loop over these, or display them in the Command Window or your GUI, etc.:
>> Z = [U,cellfun(@num2str,C,'uni',0)]';
>> for k = 1:numel(U), fprintf('%s: %s\n',U{k},sprintf('%3d, ',C{k})); end
SHARK_225054651_41_0548_: 13, 17,
SHARK_225062101_41_0621_: 1, 17, 21,
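To run this on real files instead of the fake data, the names first need to be collected into a cell array. One possible sketch uses DIR and FILEPARTS. The folder P and the 'SHARK_*' pattern are assumptions to adapt to your own setup, and any file extension is stripped because the regular expression above expects each name to end in rXXX.

```matlab
P = '.'; % assumed folder containing the files: change this to suit
S = dir(fullfile(P,'SHARK_*')); % assumed name pattern
S = S(~[S.isdir]); % ignore any folders that happen to match
[~,C] = cellfun(@fileparts,{S.name},'UniformOutput',false); % drop any file extension
C = C(:); % column cell array of names, ready for the REGEXP call above
```

Any name that does not match the regular expression gives an empty token, which would make the CELLFUN calls throw an error, so it may be worth filtering those names out first.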
  1 Comment
Tsuwei Tan on 9 May 2018
This is great!!! Thank you so much!!!
