specifying a stride length in ncread

Chad Greene on 4 Oct 2016
Answered: Aylin on 14 Oct 2016
I have a big 1.5 GB .nc file. Data loading is the slowest part of my processing, but I'm lucky that it will be sufficient to load only every Nth data point. Loading the whole file takes about 0.14 seconds:
tic
z = ncread('myfile.nc','z');
toc
Elapsed time is 0.142669 seconds.
which is about the same amount of time it takes when I specify which indices to load:
tic
z = ncread('myfile.nc','z',[1 1],[Inf Inf],[1 1]);
toc
Elapsed time is 0.156108 seconds.
So it should be faster if I specify a "stride" greater than 1. But it actually takes much longer to load every 2nd data point:
tic
z = ncread('myfile.nc','z',[1 1],[Inf Inf],[2 2]);
toc
Elapsed time is 4.992349 seconds.
Increasing the stride length beyond 2 seems to bring data loading time back down, but I have to use a stride length of 8 or more to get any benefit at all. What gives? Any ideas for fixes?
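To see where the break-even stride falls for a given file, the timings above can be collected in a loop. This is only a sketch: the file name stride_demo.nc, its size, and the list of strides are made up for illustration, and a small synthetic file will not necessarily show the same slowdown as the 1.5 GB file in question.

```matlab
% Build a small throwaway NetCDF file (hypothetical name and size).
fname = fullfile(tempdir, 'stride_demo.nc');
if exist(fname, 'file'), delete(fname); end
nccreate(fname, 'z', 'Dimensions', {'x', 400, 'y', 300});
ncwrite(fname, 'z', rand(400, 300));

% Time ncread for several stride lengths (the values are arbitrary).
strides = [1 2 4 8 16];
t = zeros(size(strides));
for k = 1:numel(strides)
    s = strides(k);
    f = @() ncread(fname, 'z', [1 1], [Inf Inf], [s s]);
    t(k) = timeit(f);   % timeit runs f several times and returns a typical time
end

disp(table(strides(:), t(:), 'VariableNames', {'Stride', 'Seconds'}));
delete(fname);
```

Replacing fname and 'z' with the real file and variable should show the stride at which the strided read starts to pay off.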
  4 Comments
KSSV on 5 Oct 2016
Have you tried the same with netcdf.getVar?
Chad Greene on 5 Oct 2016
Oh, interesting idea. The issue persists!
tic
ncid = netcdf.open('myfile.nc');
z = netcdf.getVar(ncid,2,[1 1],[12444 12444],[1 1]);
toc
Elapsed time is 0.231038 seconds.
tic
ncid = netcdf.open('myfile.nc');
z = netcdf.getVar(ncid,2,[1 1],[12444/2 12444/2],[2 2]);
toc
Elapsed time is 4.881778 seconds.
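One thing to watch when comparing against ncread: as far as I know, netcdf.getVar takes zero-based start indices, whereas ncread is one-based, so a start of [1 1] in netcdf.getVar skips the first row and column. The sketch below uses a made-up file (getvar_demo.nc) and looks up the variable ID by name with netcdf.inqVarID instead of hard-coding it. It only shows the indexing convention and does nothing about the speed problem.

```matlab
% Build a tiny throwaway file with known contents (hypothetical name).
fname = fullfile(tempdir, 'getvar_demo.nc');
if exist(fname, 'file'), delete(fname); end
A = reshape(1:24, 6, 4);
nccreate(fname, 'z', 'Dimensions', {'x', 6, 'y', 4});
ncwrite(fname, 'z', A);

ncid  = netcdf.open(fname, 'NOWRITE');
varid = netcdf.inqVarID(ncid, 'z');   % look up the ID by name

% start is zero-based; count is the number of elements returned per dimension.
zs = netcdf.getVar(ncid, varid, [0 0], [3 2], [2 2]);
netcdf.close(ncid);                   % release the file handle

% Same selection with ordinary one-based MATLAB indexing.
expected = A(1:2:end, 1:2:end);
disp(isequal(zs, expected));
delete(fname);
```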


Accepted Answer

Aylin on 14 Oct 2016
It looks like this issue is actually occurring in the underlying NetCDF C library that MATLAB uses. Here is a discussion on the NetCDF mailing list about this issue from 2013:
As an example, I downloaded the ‘test_echam_spectral.nc’ NetCDF file from
Then, I entered the following commands into the MATLAB command prompt:
>> tic; z = ncread('test_echam_spectral.nc', 'xl'); toc; % Elapsed time is 0.035419 seconds
>> tic; z = ncread('test_echam_spectral.nc', 'xl', [1 1 1 1], [Inf Inf Inf Inf], [2 1 1 1]); toc; % Elapsed time is 0.424505 seconds
Clearly, the strided read is about an order of magnitude slower than the contiguous read. I was able to work around the issue by reading the whole array and then subsampling it with MATLAB’s built-in array indexing:
>> tic; z = ncread('test_echam_spectral.nc', 'xl'); z = z(1:2:192, :, :, :); toc; % Elapsed time is 0.041134 seconds
Note that this still takes a little more time than reading the whole array contiguously. However, it is much faster than using a strided read in the ‘ncread’ function.
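The read-then-subsample workaround can be written for any number of dimensions. The following is a minimal sketch with a made-up file (subsample_demo.nc), variable name, and stride vector. It checks that in-memory subsampling returns the same values as a strided ncread.

```matlab
% Build a small throwaway 3-D file with known contents (hypothetical name).
fname = fullfile(tempdir, 'subsample_demo.nc');
if exist(fname, 'file'), delete(fname); end
A = reshape(1:120, 6, 5, 4);
nccreate(fname, 'v', 'Dimensions', {'a', 6, 'b', 5, 'c', 4});
ncwrite(fname, 'v', A);

stride = [2 1 3];                     % one stride per dimension (arbitrary)

% Contiguous read, then subsample in memory.
full = ncread(fname, 'v');
idx  = arrayfun(@(n, s) 1:s:n, size(full), stride, 'UniformOutput', false);
sub  = full(idx{:});

% Reference: the strided read that the workaround replaces.
ref = ncread(fname, 'v', [1 1 1], [Inf Inf Inf], stride);
disp(isequal(sub, ref));
delete(fname);
```

This assumes the whole variable fits in memory and that stride has one entry per dimension reported by size(full). A trailing singleton dimension would need special handling.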

More Answers (0)
