Sort according to specific string contained in file name

TL on 24 Jun 2022
Answered: Stephen23 on 24 Jun 2022
I have two lists of file names (including the whole path). At some point within each file name there is a subject ID, and both lists contain exactly the same 25 IDs because there are two sets of files from each study participant. I need to sort the two lists so that the IDs correspond at each row, i.e. I want something like
List A             List B
010822_AB030391    240922_AB030391
130922_FS120387    050322_FS120387
but right now what I have is
List A             List B
010822_AB030391    050322_FS120387
130922_FS120387    240922_AB030391
because the lists are just sorted by their first characters, so the IDs don't correspond.
I had several ideas, but they all seem too complicated or don't work well. For example, I tried splitting the file names at the underscore, sorting alphabetically, and then merging the parts again. I also tried isolating the IDs from one list, looping through that isolated list, and finding the entry in the second list that contains each ID. But I think there should be a more elegant way to do this, and I'd be happy to hear any tips! Right now both lists are character arrays, but maybe they should be a struct or something more easily manipulated.
2 Comments
Dyuman Joshi on 24 Jun 2022
Are the IDs after the underscore ( _ )?
TL on 24 Jun 2022
Yes exactly, however there are some more underscores in the path before the file name, e.g.
'D:\DATA\XY\Project_XY\\label_sPR12345_AB67890-0011.nii '
The number of characters before the ID is always the same, as is the number of delimiters before it.


Accepted Answer

Karim on 24 Jun 2022
Edited: Karim on 24 Jun 2022
You can try splitting the lists, sorting by the ID part, and then using the resulting indices to reorder the original lists:
ListA = ["010822_AB030391";
         "130922_FS120387"];
ListB = ["050322_FS120387";
         "240922_AB030391"];
% temporarily split the strings to get at the ID
tmpListA = split(ListA,"_");
tmpListB = split(ListB,"_");
% sort list A by ID
[ListA_sort, orderA] = sort(tmpListA(:,2));
% find the corresponding order for list B
[~,orderB] = ismember(ListA_sort,tmpListB(:,2));
% reorder the original lists
ListA = ListA(orderA)
ListA = 2×1 string array
    "010822_AB030391"
    "130922_FS120387"
ListB = ListB(orderB)
ListB = 2×1 string array
    "240922_AB030391"
    "050322_FS120387"
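One caveat with the ismember step: if an ID from list A is missing in list B, ismember returns 0 for it, and ListB(orderB) then fails with an indexing error. A minimal sketch of a guard, assuming the same two-part date_ID names as above (the variable names idA, idB and found are illustrative and not part of the original answer):

```matlab
ListA = ["010822_AB030391"; "130922_FS120387"];
ListB = ["050322_FS120387"; "240922_AB030391"];

% Take the part after the underscore as the ID
partsA = split(ListA, "_");
partsB = split(ListB, "_");
idA = partsA(:,2);
idB = partsB(:,2);

% Sort list A by ID, then look up each sorted ID in list B
[idA_sorted, orderA] = sort(idA);
[found, orderB] = ismember(idA_sorted, idB);

% ismember gives 0 for IDs that are missing from list B,
% so stop with a clear message before indexing
assert(all(found), 'Some IDs in list A have no match in list B.');

ListA = ListA(orderA);
ListB = ListB(orderB);
```

With 25 IDs per list this check costs nothing and catches a misnamed file early.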
3 Comments
Karim on 24 Jun 2022
You can use the same idea, just applied in a couple of steps. See below, with some random data.
Once you have the final list of IDs, you can use the same procedure as in the answer.
MyFile = ["D:\DATA\XY\Project_XY\\label_sPR12345_AB67890-0011.nii";
          "D:\DATA\XY\Project_XY\\label_sPR40922_AB03091-0011.nii";
          "D:\DATA\XY\Project_XY\\label_sPR30922_FS12038-0011.nii"];
% split the full path into its components
tmpListA = split(MyFile,"\")
tmpListA = 3×6 string array
    "D:"    "DATA"    "XY"    "Project_XY"    ""    "label_sPR12345_AB67890-0011.nii"
    "D:"    "DATA"    "XY"    "Project_XY"    ""    "label_sPR40922_AB03091-0011.nii"
    "D:"    "DATA"    "XY"    "Project_XY"    ""    "label_sPR30922_FS12038-0011.nii"
% pick the last column (the file name) and split it at the underscores
tmpListA = split(tmpListA(:,end) ,"_")
tmpListA = 3×3 string array
    "label"    "sPR12345"    "AB67890-0011.nii"
    "label"    "sPR40922"    "AB03091-0011.nii"
    "label"    "sPR30922"    "FS12038-0011.nii"
% again pick the last column and split it at the hyphen
tmpListA = split(tmpListA(:,end) ,"-")
tmpListA = 3×2 string array
    "AB67890"    "0011.nii"
    "AB03091"    "0011.nii"
    "FS12038"    "0011.nii"
% finally, keep only the first column (the ID)
tmpListA = tmpListA(:,1)
tmpListA = 3×1 string array
    "AB67890"
    "AB03091"
    "FS12038"
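Putting the two pieces together for two lists of full paths could look roughly like this. It is only a sketch: FilesA and FilesB are made-up example data, and it assumes every path has the same number of "\", "_" and "-" delimiters (which split requires for string arrays) and that the ID sits between the last underscore and the hyphen.

```matlab
% Made-up example data: two lists of full paths sharing the same IDs
FilesA = ["D:\DATA\XY\Project_XY\label_sPR12345_FS12038-0011.nii";
          "D:\DATA\XY\Project_XY\label_sPR40922_AB03091-0011.nii"];
FilesB = ["D:\DATA\XY\Project_XY\label_sPR77777_AB03091-0022.nii";
          "D:\DATA\XY\Project_XY\label_sPR88888_FS12038-0022.nii"];

% Extract the IDs of list A: path -> file name -> last "_" piece -> before "-"
p = split(FilesA, "\");
p = split(p(:,end), "_");
p = split(p(:,end), "-");
idsA = p(:,1);

% Same steps for list B
p = split(FilesB, "\");
p = split(p(:,end), "_");
p = split(p(:,end), "-");
idsB = p(:,1);

% Sort list A by ID and find the matching rows of list B
[idsA_sorted, orderA] = sort(idsA);
[~, orderB] = ismember(idsA_sorted, idsB);

% Reorder the original path lists so the IDs line up row by row
FilesA = FilesA(orderA);
FilesB = FilesB(orderB);
```

After this, row k of FilesA and row k of FilesB belong to the same subject.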
TL on 24 Jun 2022
Works perfectly, thank you so much! This will be very useful long term, as this issue keeps coming up in my project.


More Answers (1)

Stephen23 on 24 Jun 2022
S = ["D:\DATA\XY\Project_XY\label_sPR12345_AB67890-0011.nii";
     "D:\DATA\XY\Project_XY\label_sPR40922_AB03091-0011.nii";
     "D:\DATA\XY\Project_XY\label_sPR30922_FS12038-0011.nii"];
[~,X] = sort(regexp(S,'[A-Z]+\d+\-','match','once'));
S = S(X)
S = 3×1 string array
    "D:\DATA\XY\Project_XY\label_sPR40922_AB03091-0011.nii"
    "D:\DATA\XY\Project_XY\label_sPR12345_AB67890-0011.nii"
    "D:\DATA\XY\Project_XY\label_sPR30922_FS12038-0011.nii"
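The same regular expression can be applied to both lists from the question. The following is a sketch under two assumptions: each ID occurs exactly once in each list, and the first run of uppercase letters and digits followed by a hyphen is always the ID. The lists A and B are made-up example data, and keyA, keyB, ia and ib are illustrative names.

```matlab
% Made-up example data: two lists with the same IDs in different order
A = ["D:\DATA\XY\Project_XY\label_sPR12345_FS12038-0011.nii";
     "D:\DATA\XY\Project_XY\label_sPR40922_AB03091-0011.nii"];
B = ["D:\DATA\XY\Project_XY\label_sPR77777_AB03091-0022.nii";
     "D:\DATA\XY\Project_XY\label_sPR88888_FS12038-0022.nii"];

% Same pattern as above: uppercase letters, digits, then a hyphen
pat = '[A-Z]+\d+\-';
keyA = regexp(A, pat, 'match', 'once');
keyB = regexp(B, pat, 'match', 'once');

% Sort each list by its own key; identical ID sets end up in the same order
[~, ia] = sort(keyA);
[~, ib] = sort(keyB);
A = A(ia);
B = B(ib);

% Sanity check: the keys must now agree row by row
assert(isequal(keyA(ia), keyB(ib)), 'The two lists do not contain the same IDs.');
```

Sorting each list independently only lines the rows up when both lists contain the same set of IDs, which is what the final check confirms.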


Version

R2021a
