Creating a loop for a specific computation

tCOD=readtable(full_file_name,'FileType','text', ...
'headerlines',end_of_header_line,'readvariablenames',0);
all_time=tCOD{:,3:8};
all_time_second= all_time(:,4)*3600+all_time(:,5)*60+all_time(:,6); % seconds
unique_seconds=unique(all_time_second);
The attached data file can be read with the code above, and the "unique_seconds" variable is created correctly. When multiple data files are involved, I use the following code:
for j=1:num_of_files
tCOD{j,:}=readtable(full_file_name(j,:),'FileType','text', ...
'headerlines',end_of_header_line(j),'readvariablenames',0);
end
all_time_0=vertcat(tCOD{:});
all_time=all_time_0{:,3:8};
all_time_second= all_time(:,4)*3600+all_time(:,5)*60+all_time(:,6); % seconds
For multiple files, all_time_second is an n x 1 array, where n is the total number of rows across all tables in tCOD.
for k=1:num_of_files
[a(k),b(k)]=size(tCOD{k,:});
end
where a holds the number of rows of each tCOD table (378063 and 377840 for two different data files). I need to apply unique() to each file's own segment of all_time_second, as follows.
For example, if two files are read:
unique_seconds_1=unique(all_time_second(1:a(1)));
unique_seconds_2=unique(all_time_second(a(1)+1:a(1)+a(2)));
For example, if three files are read:
unique_seconds_1=unique(all_time_second(1:a(1)));
unique_seconds_2=unique(all_time_second(a(1)+1:a(1)+a(2)));
unique_seconds_3=unique(all_time_second(a(1)+a(2)+1:a(1)+a(2)+a(3)));
How can I create a loop (or something else) that builds the unique_seconds arrays for each file without explicitly writing unique_seconds_i?
I have attached one representative data file.

8 Comments

dpb on 31 Jul 2021
No idea what you're trying to do here...explain to us what the end result is intended to be -- it would probably be simplest to create a small data set that shows input and expected output and, if not obvious, how the output is derived from the inputs.
It is not at all clear what the point of unique() and size() is -- the unique values in data() will be some set of numbers; those included in different subsections of the vector would be determinable by simply one set of indices.
Explain what it is that you are trying to do...
sermet OGUTCU on 31 Jul 2021
Edited: sermet OGUTCU on 31 Jul 2021
What I'm trying to do is just to create the aa arrays automatically, without explicitly writing the last two or three lines (w.r.t. the column size of A). The column size of A is variable and depends on the data input, so I need a single loop (or something else) that creates aa w.r.t. the column size of A. I intentionally didn't give the code that creates A, to avoid redundancy. The purpose of size() and unique() is also not important here; they relate to other parts of my code.
Image Analyst on 31 Jul 2021
Did you see where he said "create a small data set that shows input and expected output"?
Again, please attach a specific example using smaller numbers if you still need help.
sermet OGUTCU on 31 Jul 2021
I'll modify my question.
dpb on 31 Jul 2021
Just edit in place or add to it...even a verbal description of what the end object is might be sufficient; I/we could make some assumptions, but a clear definition would be bester... :)
sermet OGUTCU on 31 Jul 2021
I modified my question. Thanks in advance.
dpb on 31 Jul 2021
OK. But... :) (always a rhetorical "but" isn't there... <VBG>)
One could do something like that, but instead let me ask what the purpose in building those vectors of unique times is? Are you wanting to merge/select data from the various files on the basis of matching them? If so, there are much simpler ways to do so that don't rely on building such arrays.
Also, you don't need the size() arrays and the linear indexing even to do what you are doing -- set membership functions such as intersect() are vectorized and would work over the individual file content without combining the data.
But, before we go down that rabbit hole with Alice, let's also make sure we can't just go to the tea party instead... :)
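For reference, a minimal sketch of the two routes touched on here, using made-up numbers rather than the attached file. The first turns the per-file row counts into slice boundaries with cumsum and stores the results in a cell array instead of numbered variables; the second counts per file without concatenating at all. The names secs, edges, and nUnique are illustrative assumptions, not from the original code.

```matlab
% Synthetic stand-ins for the per-file "seconds of day" vectors (not real data)
secs = {[0;0;30;30;60], [0;30;30], [0;0;0;90]};
nFiles = numel(secs);

% Route 1: slice the concatenated vector using cumulative row counts
all_time_second = vertcat(secs{:});
a = cellfun(@numel, secs);          % rows contributed by each file
edges = [0, cumsum(a)];             % file k occupies edges(k)+1 : edges(k+1)
unique_seconds = cell(nFiles,1);    % one cell per file, no unique_seconds_i names
for k = 1:nFiles
    unique_seconds{k} = unique(all_time_second(edges(k)+1:edges(k+1)));
end

% Route 2: count per file directly, without the concatenation
nUnique = cellfun(@(v) numel(unique(v)), secs);
disp(nUnique)
```

Either way the result is indexed by file number, so adding a fourth or fifth file needs no new lines of code.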
sermet OGUTCU on 1 Aug 2021
Edited: sermet OGUTCU on 1 Aug 2021
The purpose of creating the unique times (the unique_seconds array) is just to count how many unique (different) times there are in the data file (2880 for the single attached file). This way, I know the number of independent time sets in the data file (there are a bunch of repeated time sets in it). The above code is purely intended to do that. When multiple files are read, I need the number of independent time sets in each file (for example, 2880, 2880, 2500, etc.). If there is a more convenient way to do this, I don't necessarily need to use the above code.


Accepted Answer

dpb on 1 Aug 2021
Edited: dpb on 2 Aug 2021

1 vote

OK, just making sure... Although it may be interesting, I'm still not positive it's of all that much use in the end: to collect data at similar times of day across days where there may not be the same number of observations, you will still need to use join or similar. I'd think turning it into a timetable and using retime could be worth consideration as well...
But, anyway, just to answer the Q? asked and quit trying to figure out your purpose... :)
tCOD=[]; % an empty placeholder
datdir='YourDataStoragePath';
d=dir(fullfile(datdir,'COD_MatchWildCardString*.CLK')); % return list of wanted files
nFiles=numel(d); % how many files found
nUT=zeros(nFiles,1); % preallocate unique times number array
for j=1:nFiles
  ffn=fullfile(d(j).folder,d(j).name); % get fully qualified name of the j-th file
  nHdr=nHeaderLines(ffn); % find number header lines to skip
  tC=readtable(ffn,'FileType','text', ... % and read the file to temporary table
               'NumHeaderLines',nHdr, ...
               'ReadVariableNames',0);
  % create useful variable names -- you can fill in rest I'm not sure of best
  tC.Properties.VariableNames(3:8)={'Year','Month','Day','Hour','Min','Sec'};
  tC.DateTime=datetime(tC{:,3:8}); % create the date time variable
  nUT(j)=numel(unique(tC.DateTime)); % and count number timestamps this file
  tCOD=[tCOD;tC]; % catenate the new file to end(*)
  % whatever else want to do on individual file here...
end
% whatever else want to do on combined file here...
...
The helper function to get the number of header lines could be a local function in the main m-file, or its own m-file if it might be useful elsewhere. If the number is fixed and known a priori, you can dispense with it; it adds generality if the count varies between files or groups of files.
If it is fixed and known to be so for a given group of files, it can be called just once instead of inside the loop.
function nHdr=nHeaderLines(ffn)
% return number of lines in header of .CLK file -- looks for specific
% string "END OF HEADER" as last line of the header in the file...
% will fail if string not found...
fid=fopen(ffn); % open file to read
nHdr=1; % initialize counter
while ~contains(fgetl(fid),'END OF HEADER') % look for the end of the header string
  nHdr=nHdr+1;
end
fclose(fid); % close the file again
end
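If the "will fail if string not found" case matters, a more defensive variant might look like the sketch below. It performs the same 'END OF HEADER' search and returns the same count, but raises a readable error at end of file and always closes the file. The function name nHeaderLinesSafe and the error identifiers are made up for illustration.

```matlab
function nHdr = nHeaderLinesSafe(ffn)
% Hypothetical variant of nHeaderLines: same 'END OF HEADER' search, but
% errors cleanly at end of file instead of failing inside contains().
fid = fopen(ffn,'r');
if fid < 0
    error('nHeaderLinesSafe:open','Could not open %s',ffn);
end
closer = onCleanup(@() fclose(fid));   % close the file even if an error occurs
nHdr = 0;
while true
    tline = fgetl(fid);
    if ~ischar(tline)                  % fgetl returns -1 at end of file
        error('nHeaderLinesSafe:noMarker','No END OF HEADER line in %s',ffn);
    end
    nHdr = nHdr + 1;                   % count the line just read
    if contains(tline,'END OF HEADER')
        return                         % marker line is included in the count
    end
end
end
```

It would be called exactly like the original, e.g. nHdr=nHeaderLinesSafe(ffn); inside the loop.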
(*) NB: Catenating the files this way may slow things down as they're pretty big; if this turns out to be a serious problem, post back... This isn't the most efficient way, but it is the easiest to code, and if only doing it once it may be good enough.
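If the repeated catenation does become a bottleneck, one common alternative is to keep each file's table in a preallocated cell array and catenate once after the loop. A minimal sketch with synthetic tables standing in for the readtable() results (the names parts, Row, and FileIdx are illustrative only):

```matlab
% Synthetic per-file tables standing in for the readtable() results
nFiles = 3;
parts = cell(nFiles,1);              % one slot per file, allocated up front
for j = 1:nFiles
    n = 2 + j;                       % pretend each file has a different height
    parts{j} = table((1:n).', j*ones(n,1), 'VariableNames', {'Row','FileIdx'});
end
tCOD = vertcat(parts{:});            % single catenation after the loop
disp(height(tCOD))
```

This assumes every file yields the same variable names and types, as vertcat on tables requires.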
When you get a large file built, be sure to save it in a .mat file; then you won't have to rebuild it, and loading will be much quicker.
Again, you might want to consider a timetable here, depending upon just what the next step(s) are...
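As a rough idea of what the timetable route could look like, here is a small sketch on made-up timestamps with repeats. It builds a timetable, counts the distinct row times, and averages the repeated rows per timestamp with findgroups/splitapply. The variable clk and its values are hypothetical; which aggregation (if any) makes sense depends on the real data.

```matlab
% Made-up timestamps with repeats, standing in for tC.DateTime
t = datetime(2021,7,31,0,0,[0;0;30;30;30;60]);
clk = [1.0;1.2;2.0;2.2;2.4;3.0];          % hypothetical clock values
TT = timetable(clk,'RowTimes',t);         % row times taken from t
rt = TT.Properties.RowTimes;
nUT = numel(unique(rt));                  % number of distinct timestamps
[g, ut] = findgroups(rt);                 % group index per row, plus the group times
clkMean = splitapply(@mean, TT.clk, g);   % one averaged value per timestamp
```

From there, retime or synchronize would be the natural tools for putting several files on a common time grid.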

3 Comments

dpb on 1 Aug 2021
ADDENDUM/ERRATUM:
NB: Missed the argument of the qualified file name inside the helper function in the original -- had initially passed the dir() struct for the specific file, but that entails creating the ffn twice, so moved the fullfile() call but forgot to go back and add the variable...
sermet OGUTCU on 2 Aug 2021
Thank you for the additional clarification.
dpb on 2 Aug 2021
No problem...I also just noticed (and corrected a little bit ago) that I had left out the unique call in the argument of numel for nUT...so it would have been the height of the table, not the count intended.


Asked: 31 Jul 2021

Commented: dpb on 2 Aug 2021
