Create multiple subtables from multiple .tsv tables

julian gaviria on 29 Jan 2025
Moved: Voss on 30 Jan 2025
I have 120 .tsv files (see the example in "sub-m0001_file.tsv"). The path is the same for all files except for the 9th folder. See the paths of the first two .tsv files below:
/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11/file.tsv
/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0002/f10/f11/file.tsv
How can I get subtables (i.e., 1 table per file) including only the following six columns: 'trans_x', 'trans_y', 'trans_z', 'rot_x', 'rot_y', 'rot_z'?
The following code does it only for the first .tsv file. Any hint to go recursively over the 120 .tsv files?
mat = dir('/f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/file.tsv');
for files_i = 1:length(mat)
    data = fullfile(mat(files_i).name);
    x = readtable(data,"FileType","text",'Delimiter', '\t');
    vars = {'trans_x' 'trans_y' 'trans_z' 'rot_x' 'rot_y' 'rot_z'};
    new_x = x(:,vars);
end
Then I need to store each file in a folder whose name corresponds to sub-m*, for instance (see the example sub-m0001_subfile.txt):
/new/path/sub-m0001/sub-m0001_subfile.txt
/new/path/sub-m0002/sub-m0002_subfile.txt
Many thanks in advance

Accepted Answer

Stephen23 on 29 Jan 2025
Edited: Stephen23 on 30 Jan 2025
"The following code does it only for the first .tsv file. Any hint to go recursively over the 120 .tsv files? "
There is nothing in your code that in any way stores or saves the data from each iteration, so your code iterates through each file, imports the file data, and then discards/overwrites the file data on the next loop iteration. So in the end it might look as if it only imported data from one file. But looking at the value of files_i would tell you how many files it has iterated over.
Solution: either use indexing to allocate the imported data into one array (e.g. a cell array or structure array) or export the data into files on each iteration.
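As a minimal sketch of the first option (indexing into a cell array), assuming two small demo files written to a temporary folder: the folder, the file names a_file.tsv and b_file.tsv, and the two-column tables are made up for illustration and are not the real data.

```matlab
% Sketch: keep every imported table by indexing into a preallocated cell array.
% The demo files below are hypothetical stand-ins for the real .tsv files.
root = tempname;
mkdir(root);
writetable(table([1;2],[3;4],'VariableNames',{'trans_x','rot_x'}), ...
    fullfile(root,'a_file.tsv'),'FileType','text','Delimiter','\t');
writetable(table([5;6],[7;8],'VariableNames',{'trans_x','rot_x'}), ...
    fullfile(root,'b_file.tsv'),'FileType','text','Delimiter','\t');

S = dir(fullfile(root,'*_file.tsv'));
C = cell(numel(S),1);                  % one cell per file
for k = 1:numel(S)
    F = fullfile(S(k).folder,S(k).name);
    C{k} = readtable(F,'FileType','text','Delimiter','\t');
end
```

After the loop, C{k} holds the table imported from the k-th file, so nothing is overwritten between iterations.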
"I need to store each file in a folder whose name corresponds to sub-m*, for instance (see the example sub-m0001_subfile.txt)"
Then you need to export the table data. For example:
V = {'trans_x','trans_y','trans_z','rot_x','rot_y','rot_z'};
S = dir('/f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/*file.tsv');
for k = 1:numel(S)
    % import file data:
    F = fullfile(S(k).folder,S(k).name);
    T = readtable(F,"FileType","text",'Delimiter', '\t');
    % optional: store imported file data:
    S(k).data = T;
    % export table data (the part of the filename before '_' is the subject ID):
    G = extractBefore(S(k).name,'_');
    D = fullfile('/new/path',G);
    if ~isfolder(D), mkdir(D); end % the destination folder must exist before writing
    H = fullfile(D,[G,'_subfile.txt']);
    U = T(:,V);
    writetable(U,H)
end
The data is all stored in the structure S. You can access this using indexing, e.g. the 2nd file:
S(2).folder % location
S(2).name % filename
S(2).data % imported file data
6 Comments
Stephen23 on 30 Jan 2025
Edited: Stephen23 on 30 Jan 2025
"Issue 2: "E" does not work properly"
Please show the exact path and code that you used. It works as expected here:
P='./f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11'; mkdir(P); dlmwrite(fullfile(P,'file.tsv'),1)
P='./f1/f2/f3/f4/f5/f6/f7/f8/sub-m0002/f10/f11'; mkdir(P); dlmwrite(fullfile(P,'file.tsv'),2)
S = dir('./f1/f2/f3/f4/f5/f6/f7/f8/sub-m*/f10/f11/file.tsv');
for k = 1:numel(S)
    E = regexprep(S(k).folder,{'^.*/f8/','/f10/.*$'},'')
end
E = 'sub-m0001'
E = 'sub-m0002'
Which means that you are doing something different to what you explained or showed, e.g. your folder names are not really f1, f2, etc. Guessing important information like this is much less reliable than it being written down.
In any case, here are alternative approaches that might work for your (duplicated?) folder names:
for k = 1:numel(S)
    E = regexprep(S(k).folder,{'^.*/f8/','/.*$'},'')
end
E = 'sub-m0001'
E = 'sub-m0002'
for k = 1:numel(S)
    E = regexp(S(k).folder,'sub-m\d+','match','once')
end
E = 'sub-m0001'
E = 'sub-m0002'
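To see how the three expressions differ without touching the file system, here is a small sketch on literal paths. The third path, whose tenth folder is named f101a rather than f10, is a hypothetical case added to show what happens when a folder name does not match exactly.

```matlab
% Sketch: compare the three expressions on literal paths (nothing is read from disk).
% The third path is a hypothetical variant with a differently named tenth folder.
paths = {'/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0001/f10/f11', ...
         '/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0002/f10/f11', ...
         '/f1/f2/f3/f4/f5/f6/f7/f8/sub-m0003/f101a/f11'};

% 1) remove everything up to '/f8/' and everything from '/f10/' onwards
E1 = regexprep(paths,{'^.*/f8/','/f10/.*$'},'');
% 2) remove everything up to '/f8/' and everything from the next '/' onwards
E2 = regexprep(paths,{'^.*/f8/','/.*$'},'');
% 3) match the subject ID directly
E3 = regexp(paths,'sub-m\d+','match','once');
```

Here E2 and E3 give the bare subject ID for all three paths, while E1 leaves the trailing folders on the third path because '/f10/' does not occur in it. The first expression therefore relies on the exact name of the tenth folder, and the other two do not.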
julian gaviria on 30 Jan 2025
Moved: Voss on 30 Jan 2025
Issue 1: "I get the following error because, indeed, the output (destination) file does not exist, it must be created."
You were right: the problem was the incomplete folder name in the expression that is anchored to the end of the input text. For example:
Incorrect:
E = regexprep(S(k).folder,{'^.*/f8/','/f10/.*$'},'')
Correct:
E = regexprep(S(k).folder,{'^.*/f8/','/f101a/.*$'},'')
Issue 2: "E" does not work properly
Thanks a lot for the input. "mkdir" was the solution:
H = fullfile('/new/path',E);
mkdir(H)
N = fullfile(H,[G,'.txt']);
U = T(:,V);
writetable(U,N, 'WriteVariableNames',0)
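Putting the pieces of this thread together, the whole workflow might look roughly like the sketch below. It is only an illustration under stated assumptions: the input tree is generated in a temporary folder with random data and an extra dummy column named 'extra', the output root 'out' stands in for /new/path, and the subject ID is taken from the folder name and reused for the output filename.

```matlab
% Sketch: import each .tsv file, keep six columns, export one file per subject.
% The demo input tree below is hypothetical and only makes the example runnable.
root = tempname;
V = {'trans_x','trans_y','trans_z','rot_x','rot_y','rot_z'};
ids = {'sub-m0001','sub-m0002'};
for n = 1:numel(ids)
    P = fullfile(root,'in',ids{n},'f10','f11');
    mkdir(P);                                   % also creates the parent folders
    T = array2table(rand(3,7),'VariableNames',[V,{'extra'}]);
    writetable(T,fullfile(P,[ids{n},'_file.tsv']),'FileType','text','Delimiter','\t');
end

out = fullfile(root,'out');                     % stands in for '/new/path'
S = dir(fullfile(root,'in','sub-m*','f10','f11','*_file.tsv'));
for k = 1:numel(S)
    % import file data:
    T = readtable(fullfile(S(k).folder,S(k).name),'FileType','text','Delimiter','\t');
    % subject ID taken from the folder path:
    E = regexp(S(k).folder,'sub-m\d+','match','once');
    % create the destination folder only if it is missing:
    H = fullfile(out,E);
    if ~isfolder(H), mkdir(H); end
    % export the six selected columns without a header row:
    N = fullfile(H,[E,'_subfile.txt']);
    writetable(T(:,V),N,'WriteVariableNames',false)
end
```

The isfolder check avoids the warning that mkdir gives when the folder already exists, which matters if the loop is run more than once.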

Release: R2022b
