
textscan and CPU usage

Ari on 7 Oct 2017
Commented: per isakson on 7 Oct 2017
I used textscan to read a 0.5 GB CSV file. Sometimes it completes in less than a minute, but at other times it takes almost 10 minutes! Comparing the CPU usage in the two cases, I noticed that it is high for the fast runs (25% on a quad-core machine, so one full core) and low for the slow runs (less than 5%). Has anybody else experienced this?
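One way to put numbers on the observation above is to compare wall-clock time with CPU time for the same read. This is only a minimal sketch: the temporary file, its contents, and the format string are made up for illustration, and `cputime` has a coarse resolution, so it is only meaningful for reads that take a while. If the wall-clock time is much larger than the CPU time, MATLAB was mostly waiting (for example on disk I/O) rather than parsing.

```matlab
% Create a small two-column CSV file so the sketch runs on its own.
% (Hypothetical data; replace fname with the real file to test.)
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%d,%d\n', randi(100, 2, 1000));
fclose(fid);

% Start both clocks, then read the file with textscan.
tWall = tic;
tCpu = cputime;
fid = fopen(fname, 'r');
data = textscan(fid, '%f%f', 'Delimiter', ',');
fclose(fid);
wallSeconds = toc(tWall);          % elapsed wall-clock time
cpuSeconds = cputime - tCpu;       % CPU time used by MATLAB meanwhile

fprintf('wall: %.3f s, cpu: %.3f s\n', wallSeconds, cpuSeconds);
delete(fname);
```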
  3 Comments
Ari on 7 Oct 2017
I have 8 GB of RAM, and no, I didn't run anything else simultaneously.
When it reads fast, the memory usage (in Task Manager, under Processes, for MATLAB) increases rapidly, and so does the CPU usage. When it reads slowly, the memory usage stays constant and the CPU usage stays low. What is strange is that if I first read the file with fileread, just as a dummy, and then run textscan (on the file, not on the string), it reads fast every time. I came across this 'trick' by looking at what importdata does. importdata also uses textscan, but instead of scanning the file directly, it first reads the file into a string (using fileread) and then runs textscan on that string.
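The two ways of calling textscan described here can be sketched as follows. This is a minimal illustration, not the actual importdata code: the temporary file, its contents, and the `'%f%f'` format are assumptions chosen so the example runs on its own.

```matlab
% Create a small CSV file so the sketch is self-contained (hypothetical data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '%d,%f\n', [1:5; (1:5)*0.5]);
fclose(fid);

% Variant 1: the 'trick'. Read the file once as a dummy with fileread,
% discard the result, then run textscan on the file identifier.
dummy = fileread(fname); %#ok<NASGU>
clear dummy
fid = fopen(fname, 'r');
c1 = textscan(fid, '%f%f', 'Delimiter', ',');
fclose(fid);

% Variant 2: read the whole file into a char vector with fileread,
% then run textscan on that string instead of on the file.
str = fileread(fname);
c2 = textscan(str, '%f%f', 'Delimiter', ',');

% Both variants should parse to the same cell array of columns.
same = isequal(c1, c2);
delete(fname);
```

Wrapping each variant in tic/toc on the real 0.5 GB file would show whether the dummy read is what makes the difference.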
per isakson on 7 Oct 2017
  • This large difference in speed makes me think of swapping, but that isn't likely with 8 GB of RAM. Did you try using the Resource Monitor to see what's going on?
  • I once tested I/O speed with some large files and had trouble reproducing the results. End of story: I concluded that my test "messed up" the system cache, which in turn increased the execution times, but certainly not by an order of magnitude.
  • I often use fileread in combination with textscan when the string needs some fixing before parsing. I was initially surprised that it is nearly as fast as reading with textscan directly.
  • Which versions of MATLAB and Windows do you use?
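As an illustration of the fileread-then-fix-then-textscan pattern mentioned in the list above, here is a minimal sketch. The file contents (semicolon-delimited values with decimal commas) and the strrep fix are hypothetical; the fix needed in practice depends on the file.

```matlab
% Write a small file with decimal commas and ';' as delimiter
% (hypothetical example data).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '1,5;2,25\n3,0;4,75\n');
fclose(fid);

% Read the whole file into a char vector and fix it before parsing:
% here, decimal commas are replaced by decimal points.
str = fileread(fname);
str = strrep(str, ',', '.');

% Parse the fixed string with textscan.
c = textscan(str, '%f%f', 'Delimiter', ';');
delete(fname);
```

Since the fix operates on the string in memory, the file on disk is left untouched.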


Answers (1)

Kian Azami on 7 Oct 2017
Edited: per isakson on 7 Oct 2017
I just heard that computation on a CPU is a highly nonlinear process, which is why you see different behavior each time. There are some publications that study the behavior of CPU computations.
I'm including a YouTube link in which a prominent scientist talks about this issue. Worth a listen! https://www.youtube.com/watch?time_continue=17&v=iW2QJRDEBMw
