MATLAB Answers

How to write a Loop to run a specific code to every table in a cell

10 views (last 30 days)
Behzad Navidi
Behzad Navidi on 20 Dec 2019
Edited: dpb on 23 Dec 2019
I have many excel files (540 files) which I used this code to import them to Matlab:
D = 'C:\Users\Behzad\Desktop\New folder (2)';
filePattern = fullfile(D, '*.xlsx');
file = dir(filePattern);
x={};
for k = 1 : numel(file)
baseFileName = file(k).name;
fullFileName = fullfile(D, baseFileName)
x{k} = readtable(fullFileName);
fprintf('read file %s\n', fullFileName);
end
After doing that x is a 1*540 cell which contains 540 tables. I want to apply this code below for all these tables:
T = readtable(filename);
sort = sortrows(T, 8);
selected_table = sort (:, 8:9);
dt1 = datetime([1982 01 01]);
dt2 = datetime([2018 12 31]);
allDates = (dt1 : calmonths(1) : dt2).';
allDates.Format = 'MM/dd/yyyy';
tempTable = table(allDates(~ismember(allDates,selected_table.data)), NaN(sum(~ismember(allDates,selected_table.data)),size(selected_table,2)-1),'VariableNames',selected_table.Properties.VariableNames);
T2 = outerjoin(sort,tempTable,'MergeKeys', 1);
T2 = fillmissing(T2, 'next', 'DataVariables', {'lat', 'lon', 'station_elevation'})
I don't know how to do that. I guess I should use the loop again but haven't any idea how to do it.
Best Regards

  2 Comments

Behzad Navidi
Behzad Navidi on 20 Dec 2019
it's hard for me I don't know how to write the code. I mean I can do this on a paper but as I'm new to Matlab I don't know this language well. the problem is I don't know how to define a new loop when x{k} is generated from the previous loop. I searched a lot and write this code below:
% allDates should be out of the loop because it's not necessary to be in the loop
dt1 = datetime([1982 01 01]);
dt2 = datetime([2018 12 31]);
allDates = (dt1 : calmonths(1) : dt2).';
allDates.Format = 'MM/dd/yyyy';
% creating loop
for idx = 1:numel(x)
T = readtable(idx);
sort = sortrows(T, 8);
selected_table = sort (:, 8:9);
tempTable = table(allDates(~ismember(allDates,selected_table.data)), NaN(sum(~ismember(allDates,selected_table.data)),size(selected_table,2)-1),'VariableNames',selected_table.Properties.VariableNames);
T2 = outerjoin(sort,tempTable,'MergeKeys', 1);
T2 = fillmissing(T2, 'next', 'DataVariables', {'lat', 'lon', 'station_elevation'})
end
if you can help with this issue too, I would be grateful.

Sign in to comment.

Accepted Answer

Adam Danz
Adam Danz on 20 Dec 2019
Edited: Adam Danz on 20 Dec 2019
I can't run your code but here are some feedback at first glance.
First, always use correct indentation so that it's much easier to read the code. "Smart indentation" can be done by selecting all of the code and pressing ctrl+i.
See my 3 notes within your code.
% allDates should be out of the loop because it's not necessary to be in the loop
dt1 = datetime([1982 01 01]);
dt2 = datetime([2018 12 31]);
allDates = (dt1 : calmonths(1) : dt2).';
allDates.Format = 'MM/dd/yyyy';
% 1) pre-allocate a cell array that will store
% your tables (see note #3)
T2 = cell(size(x); % this should work, I don't know what x is
% creating loop
for idx = 1:numel(x)
T = readtable(idx);
% 2) This line should probably be T = readtable(x(idx));
sort = sortrows(T, 8);
selected_table = sort (:, 8:9);
tempTable = table(allDates(~ismember(allDates,selected_table.data)), NaN(sum(~ismember(allDates,selected_table.data)),size(selected_table,2)-1),'VariableNames',selected_table.Properties.VariableNames);
T2 = outerjoin(sort,tempTable,'MergeKeys', 1);
T2 = fillmissing(T2, 'next', 'DataVariables', {'lat', 'lon', 'station_elevation'})
% 3) You're overwriting the variabe T2 on each iteration of the i-loop.
% to save each table, do this
% T2{idx} = fillissing(.....)
end
Now, to access table #n, T2{n}
As I said, I can't run your code so if you come across problems, show me the updated code and describe what went wrong.
Also, it would be a very wise investment of your time to spend 1/2 an hour teaching yourself how to debug code using debug mode.

  15 Comments

Behzad Navidi
Behzad Navidi on 22 Dec 2019
Thanks all of you. I learned so many things in this question from you and now my problem solved. Thanks for your time and kindness.
Adam Danz
Adam Danz on 22 Dec 2019
Glad I could help. dpb's timetable/retime approach is a nice alternative to this approach that you may want to keep in mind.
dpb
dpb on 22 Dec 2019
dpb would strongly recommend same, in fact! :)
It makes use of the methods TMW has built into the product for the express purpose; is clear and concise and solves the problem at hand (once BN fixes whatever typo there is in his copy anyway).

Sign in to comment.

More Answers (2)

dpb
dpb on 21 Dec 2019
Edited: dpb on 23 Dec 2019
D='C:\Users\Behzad\Desktop\New folder (2)';
d=dir(fullfile(D, '*.xlsx'));
for i=1:numel(d)
tt=readtable(fullfile(d(i).folder,d(i).name)); % read file as table
tt=table2timetable(tt,'RowTimes',tt.data); % convert to timetable ('data' is typo for 'date'???)
tt.data=[]; % get rid of the now unneeded date column
[~,isrt]=sort(tt.Time); % sorted row index with which to...
tt=tt(isrt,:); % sort by increasing time
tt.region_id=categorical(tt.region_id); % I'd do this with all these type variables, just example how
tt=retime(tt,'monthly'); % put on monthly basis, create any missing
tt=fillmissing(tt,'next','datavariables',{'lat','lon','station_elevation'}); % fill in missing
isn=isnan(tt.tm_m); % find missing mean temperatures
tt.tm_m(isn)=mean([tt.tmax_m(isn) tt.tmin_m(isn)],2); % fill in with average min/max (presumes they're not nan, too)
end
Above does all you asked for each file; do what is needed like writetable or save or whatever at the end of the loop before going on and overwriting the previous.
No second loop required for any operations; read the Matlab examples carefully and see how vectorized functions work...it's the key to effective use of Matlab.

  2 Comments

Behzad Navidi
Behzad Navidi on 22 Dec 2019
Dear dpb,
First of all, I want to thank you for your kindly answer. Unfortunately after run this code I get following error:
Unable to perform assignment because the left and right sides
have a different number of elements.
dpb
dpb on 22 Dec 2019
Have to see the code exactly as you ran it and the specific error in context...it ran on sample file here.
If the optional second argument to the mean() function were missing that would explain the error, though; it would average the columns to two values instead of the rows to the number of missing elements.
NB: Your fillmissing will NOT fill in anything for the temperatures for the missing dates so there will be NaN average temperatures at those locations. That may well be desired; just noting.

Sign in to comment.


Image Analyst
Image Analyst on 21 Dec 2019
I'd build your second snippet of code into a function and then call that inside the loop of your first snippet.

  0 Comments

Sign in to comment.

Sign in to answer this question.


Translated by