Problem keeping track of index

huda nawaf on 28 Jul 2012
Hi,
I have 17770 files. To make my code faster, I merged each set of files to create one file.
Eventually I got 2221 files of different lengths, but I do not remember how many old files are inside each new file. What I have is the length of each old file.
The problem I face is that I have to increment ind each time I finish reading an old file:
array(k)=ind. For example, suppose the lengths of the old files are 12, 45, 10, 23, 4, 10, 11, 13, etc.,
and the lengths of the new files are 57, 37, 21, where the first new file includes two old files (12+45), the second includes three (10+23+4), and so on.
What I need is this: while reading a new file, when the counter reaches 12, ind=ind+1; when the counter reaches 12+45, ind=ind+1 again; and so on.
I can not tune the index of counter each time read new file. Note: this is just an example; my real files are much longer.
Thanks in advance.
  3 Comments
huda nawaf on 28 Jul 2012
Thanks, the answer to all three questions is yes. My problem with the 17770 files is a long story; I posted it in this forum, but no one could solve it. I ran my laptop (core 5) 8 hours a day to collect 400 users out of 400000 users, because my code looks for the user IDs across all 17771 files, so it is very slow. But I found that when I reduce the number of files to 2221, the code is faster. Unfortunately, I then ran into the problem above. In addition, I have no idea what you are suggesting regarding binary files.
per isakson on 28 Jul 2012
Edited: per isakson on 28 Jul 2012
Which version of MATLAB do you run? Is it 64-bit?
Binary file alternatives:
  • MAT-file version 7.3, accessed with the function matfile, which creates an object. A v7.3 MAT-file is an HDF5 file under the hood.
  • netCDF or HDF
  • plain binary, i.e. fwrite/fread
I don't know what kind of data you are working with. Based on this question, I guess it might be a job for SQL. However, SQL comes with a learning curve. See e.g. SQLite or, better, a system that someone near you uses.
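To give an idea of what the plain binary option looks like, here is a minimal sketch of an fwrite/fread round trip. The data (a few user IDs stored as 32-bit integers) and the scratch file name are assumptions made up for illustration; they are not taken from the question.

```matlab
% Hypothetical example data: user IDs stored as 32-bit integers.
ids = int32([101 205 999 42]);
fname = [tempname '.bin'];      % scratch file; the name comes from tempname

% Write the raw 4-byte integers; no text parsing is needed later.
fid = fopen(fname, 'w');
fwrite(fid, ids, 'int32');
fclose(fid);

% Read everything back, keeping the int32 class.
fid = fopen(fname, 'r');
back = fread(fid, Inf, 'int32=>int32');   % fread returns a column vector
fclose(fid);
delete(fname);

% Check that the round trip preserved the data.
ok = isequal(back(:).', ids);
```

Reading binary data this way skips the text-to-number conversion, which is usually where text files lose time; whether it helps in this case depends on how the files are actually laid out.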


Answers (1)

Image Analyst on 28 Jul 2012
I doubt that combining thousands of files into a single file will be much faster than just processing the thousands of files individually, once you include the time to combine them and the time to break them apart again (if needed). What kind of times are you getting for the two approaches? Anyway, if you combine them, then it's your responsibility to keep track of the sizes somehow, say in a second file that lists the original file sizes as text, if you need that information later.
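The bookkeeping file mentioned above could look something like this. It is only a sketch: the lengths 12 and 45 are the example values from the question, and the file name is a hypothetical scratch name.

```matlab
% Lengths of the original files that went into one merged file
% (example values from the question).
oldLengths = [12 45];
sizesFile = [tempname '.txt'];   % hypothetical name for the bookkeeping file

% At merge time: record one length per line as text.
fid = fopen(sizesFile, 'w');
fprintf(fid, '%d\n', oldLengths);
fclose(fid);

% Later: recover the lengths before processing the merged file.
fid = fopen(sizesFile, 'r');
recovered = fscanf(fid, '%d').';   % fscanf returns a column; make it a row
fclose(fid);
delete(sizesFile);
```

With one such file per merged file, the number of original files inside it is simply numel(recovered).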
You say "I can not tune the index of counter each time read new file". Well, for one run, you have just one new file (which is composed of the thousands of smaller files). The file pointer (what you called index of counter) does not need to be "tuned". It starts at the beginning of the file, and the help for fread() says this:
A = fread(fileID, sizeA) reads sizeA elements into A and positions the file pointer after the last element read. sizeA can be an integer, or can have the form [m,n].
so the file pointer is left at the end of the last read location - no tuning necessary.
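As a rough sketch of how that plays out with the numbers from the question: the code below builds a hypothetical merged file of 57 doubles (12+45), then reads it back one original file at a time and increments ind after each chunk. The variable names k, nRead and firstVals, the use of doubles, and the scratch file are all assumptions for illustration.

```matlab
% Example lengths from the question (elements per original file).
oldLengths = [12 45 10 23 4];

% Build one hypothetical merged file holding the first two old files (12+45).
fname = [tempname '.bin'];
fid = fopen(fname, 'w');
fwrite(fid, 1:57, 'double');
fclose(fid);

info = dir(fname);
nTotal = info.bytes / 8;        % number of 8-byte doubles in the merged file

ind = 0;                        % how many old files have been read so far
k = 0;                          % position in oldLengths
nRead = 0;                      % elements consumed from this merged file
firstVals = [];                 % first element of each chunk, for checking

fid = fopen(fname, 'r');
while nRead < nTotal
    k = k + 1;
    % fread leaves the file pointer right after the chunk it just read.
    chunk = fread(fid, oldLengths(k), 'double');
    ind = ind + 1;              % one old file finished
    nRead = nRead + numel(chunk);
    firstVals(end+1) = chunk(1);
end
fclose(fid);
delete(fname);
```

To carry on through several merged files, k and ind would be kept outside the per-file loop so that the next merged file continues with oldLengths(k+1); only nRead and nTotal are reset for each file.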
