"diff" function doesn't work properly with small numbers
For some reason, when the difference between points n and n+1 is too small, the diff function seems to return 0.
There are roughly 290 data points on the plot and the precision is 10^(-10). As far as I know, MATLAB works with 16 or 32 digits, so this shouldn't be a problem.
Technically there should be no constant sections on the plot, just increases and decreases in value.
Pomiary=cisnienie300920151701average300
Czas = Pomiary{:, 4};
Temperatura = Pomiary{:, 5};
CzasDMY= Czas / 86400 + datenum(1970, 1, 1);
y = Temperatura;
x = CzasDMY;
ydiff=diff(y,1);
wieksze = (ydiff > 0);
mniejsze = (ydiff < 0);
gora = y;
dol = y;
gora(~wieksze) = NaN;
dol(~mniejsze) = NaN;
plot(x,y,'b',x, gora, 'r', x, dol, 'g');
grid on;
xlim tight;
xlim("auto");
ylim("auto");
legend("Constant", "Increasing", "Decreasing");
legend("Position", [0.15754,0.1468,0.20438,0.12165]);

8 Comments
dpb
about 1 hour ago
Edited: dpb
21 minutes ago
whos -file x
whos -file y
d=dir('cisni*.mat');
whos('-file',d.name)
load x
load y
X=[x y];
fprintf('%.12f %.12f\n',X(1:10,:).')
dy=diff(y);
iy=find(dy==0);
nnz(iy)
This shows there are 5 separate repeated instances in the y vector.
iy
shows that no value is repeated more than twice in a row, in this data set at least, so the averaging technique in the earlier Answer would work to produce something with no zero differences, if that is the ultimate goal.
Why this is significant, rather than just accepting the result as is, is so far unclear. But, as noted, the problem is not in diff() or machine precision; the data have been rounded such that there really are identical values.
fprintf('%.14f\n',y(iy(1)+[-1:2]))
plot(x(iy(1)+[-1:2]),y(iy(1)+[-1:2]),'*-')
This reproduces exactly the problem illustrated before: the data are identical to machine precision because the values have been rounded to seven (7) decimal digits, and when read into memory from the input file containing those values, they were interpreted and stored identically. Ergo, the diff() between those consecutive positions is, as it returns, identically zero.
As my Answer over the same subset of the data shows, your only choices if you find this result unacceptable are to provide the data with full precision as input, in the hope that there will be a difference in later digits in the original before the rounding, or, as illustrated there, to interpolate over the range beyond the duplicated values to produce a different result for the second/repeated value so that a subsequent diff() would be nonzero. The caveats noted there are still in play, of course.
The basic answer is that your data are, indeed, not changing at every point in either a positive or negative direction but are unchanging over at least two consecutive positions and diff() is just doing its job.
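The check described above, that no value repeats more than twice in a row, can be written out explicitly. This is a minimal sketch on made-up vectors (`yA` and `yB` are illustrative and are not the data from the attached files): consecutive entries in the index list returned by `find(diff(y)==0)` mean that at least three identical values follow each other.

```matlab
% Illustrative data only; not the values from the attached files.
yA = [1 2 2 3 3 3 4].';   % contains a run of three equal values
yB = [1 2 2 3 3 4].';     % contains only pairs of equal values

izA = find(diff(yA) == 0);   % positions whose successor is identical
izB = find(diff(yB) == 0);

% Adjacent indices mean two zero differences in a row,
% i.e. at least three identical values in a row.
longRunA = any(diff(izA) == 1);
longRunB = any(diff(izB) == 1);

fprintf('yA: %d zero differences, run longer than two: %d\n', numel(izA), longRunA);
fprintf('yB: %d zero differences, run longer than two: %d\n', numel(izB), longRunB);
```

If the flag is false, replacing each repeated value by the mean of its two neighbours is enough; if it is true, the longer runs need separate handling.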
Fangjun Jiang
13 minutes ago
@dpb, @Sylwester, there is no problem with diff(). There is no problem with data accuracy or precision. It is a visual misconception.
First, as @dpb pointed out, in the whole set of 288 data points there are only 5 places where the data value is unchanged and thus regarded as a "Constant" trend.
@Sylwester had this thought: plot all the data in BLUE, plot all the "Increasing" trend data in RED, and plot all the "Decreasing" trend data in GREEN. Since the RED and GREEN are going to overwrite the BLUE, the resulting plot should show almost no BLUE section, since only 5 out of 288 data points are "Constant" trend.
But there is no problem with the diff() function. It is just a visual misconception. Or rather, it is due to how the plot() function connects the data points with a line style when there are NaN data points.
I changed only this line:
plot(x,y,'.',x, gora, 'r+', x, dol, 'g*');
and the resulting plot gives the correct visual impression (that there is almost no BLUE "Constant" data).

Answers (3)
Fangjun Jiang
on 22 Dec 2025 at 15:45
The data value and results make sense. There is no problem using diff() to process your data based on your example data.
%%
format long
y=[36 1023.08766260000
37 1023.03861350000
38 1023.01522350000
39 1023.01522350000
40 1022.96080630000]
ydiff=diff(y,1)
wieksze = (ydiff > 0)
mniejsze = (ydiff < 0)
By default, MATLAB uses 64-bit floating-point (double) values to represent numeric data.
Near the value 1023, the spacing between adjacent doubles is about 1.1e-13, which is sufficient to represent your data precision of 1e-10.
The problem you observed comes from your raw data. Note that y(3,2) and y(4,2) are exactly the same by visual observation.
eps(1023)
Check the documentation for eps(). You will understand the issue better.
doc eps
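To make the `eps` point concrete, here is a small sketch. The value `a` is taken from the sample data above, and the offset `1e-10` is the precision mentioned in the question; the variable names are introduced here for illustration. It suggests that a step of 1e-10 survives `diff` near 1023, while two textually identical inputs do not.

```matlab
a = 1023.0152235;          % value from the sample data
b = a + 1e-10;             % same value shifted by the stated precision

spacing = eps(1023);       % distance from 1023 to the next larger double
dSmall  = diff([a b]);     % difference for a genuine 1e-10 step
dSame   = diff([1023.0152235 1023.0152235]);   % identical inputs

fprintf('spacing near 1023:         %.3e\n', spacing);
fprintf('diff for a 1e-10 step:     %.3e\n', dSmall);
fprintf('diff for identical inputs: %g\n', dSame);
```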
3 Comments
Fangjun Jiang
about 2 hours ago
The length of diff() output is 1 smaller than its input length. Your code didn't seem to consider this.
diff(1:3)
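A short sketch of one way to account for the length difference, assuming each difference should be attributed to the later of the two points (that convention is an assumption, not something stated in the question). The sample values are made up; the variable names follow the question's code.

```matlab
y  = [5 7 7 6 8].';        % illustrative values only
dy = diff(y);              % has numel(y)-1 elements

% Pad with false so the masks are as long as y; the first point has no
% predecessor and is therefore marked as neither increasing nor decreasing.
wieksze  = [false; dy > 0];
mniejsze = [false; dy < 0];

gora = y;  gora(~wieksze)  = NaN;   % keep only points reached by an increase
dol  = y;  dol(~mniejsze)  = NaN;   % keep only points reached by a decrease

disp([y gora dol]);
```

Prepending `false` marks the end point of each step; appending it instead would mark the start point.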
Fangjun Jiang
2 minutes ago
The length difference of 1 between the input and output of the diff() function is not an issue in this case either.
There is no issue with the diff() function or with data accuracy/precision. The OP has a visual misconception due to the way that plot(x,y,'b') connects data points with color and line style when there are NaN data points in the "y" data set.
dpb
about 19 hours ago
Edited: dpb
about 3 hours ago
X=[
36 1023.08766260000
37 1023.03861350000
38 1023.01522350000
39 1023.01522350000
40 1022.96080630000];
dx=diff(X)
As hypothesized above, some of the temperature/pressure values are identical owing to the apparent rounding to seven (7) decimal digits.
You would have to have at least one more decimal place in the above between the 3rd and 4th data values in order for the difference to not be identically zero.
If you're transferring data from one place to another, to avoid this don't use text files but save the whole internal precision by using .mat files or binary formatted transfer if from some external source. Besides being able to retain full precision (note that precision does not necessarily imply accuracy), it's much more efficient in speed and memory/disk space.
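The effect of a fixed-width text export can be reproduced in a few lines. This sketch uses two made-up values that differ in the ninth decimal place; the format strings `%.7f` and `%.17g` are chosen for illustration and are not known to be what the original export used.

```matlab
v = [1023.015223512; 1023.015223498];   % made-up values, differ by 1.4e-8

% Round trip through text with 7 decimals: both print as 1023.0152235
t7 = str2double({sprintf('%.7f', v(1)); sprintf('%.7f', v(2))});

% Round trip with 17 significant digits, which is enough to tell
% any two distinct doubles apart
t17 = str2double({sprintf('%.17g', v(1)); sprintf('%.17g', v(2))});

fprintf('diff of originals:        %.3e\n', diff(v));
fprintf('diff after %%.7f export:   %.3e\n', diff(t7));
fprintf('diff after %%.17g export:  %.3e\n', diff(t17));
```

Saving with `save` to a .mat file avoids the question of digits altogether, since the binary representation is stored.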
As for your comment above that "They are meant to be the same, The issue is that for some reason function for marking if value increased/decreased has holes in it and skips points unless difference is high enough", that makes no sense at all: the two values are identical, so how can there be any sense of the change that "increased/decreased" implies?
If you're trying to measure an overall change, then diff is entirely the wrong function, as it works on a pointwise basis and so will indeed notice any points for which the difference is actually zero.
Looking at your small subsample of data
plot(X(:,1),X(:,2),'*-')
indeed, there is an overall negative trend, but it isn't uniformly decreasing at every point, just overall. If you want indications of trends excluding such points, you'd have to do something like find the inflection points and then (say) the two points on either side and then use the adjusted temperature to compute the change.
Note that you would also have to find any runs of more than two successive identical points and then do something over those ranges. Also, in doing something like this you'll run into the issue that @Fangjun Jiang raised about the differenced vector being shorter than the original, so the points are offset by one in the addressing.
For the simple example here
ix=find(dx(:,2)==0); % locate the zero difference
fprintf('%d %15.10f\n',X(ix+[0:1],:).') % display the two rows with identical values
X(:,3)=X(:,2); % augment the X array with a copy of column 2
X(ix+1,3)=mean(X(ix+[0 2],3)); % replace the repeated value with the mean of its neighbours (linear interpolation)
hold on
plot(X(:,1),X(:,3),'rx-')
legend('Original','Interpolated','location','northeast')
diff(X)
Now you don't have any zeros in the 3rd column diff().
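For runs longer than two, one possible generalization of the averaging step is to treat every repeated point as missing and interpolate it from the neighbouring non-repeated points. This is a sketch on made-up data, not the code used above; it assumes `x` is strictly increasing and that the series does not end on a repeated value (`interp1` would return `NaN` there). The same caveats apply: the filled-in values are invented, not measured.

```matlab
x = (1:6).';
y = [10 9 9 9 8 7].';            % illustrative values with a run of three

rep = [false; diff(y) == 0];     % true where a point repeats its predecessor

yi = y;
% Linear interpolation (the interp1 default) through the non-repeated points
yi(rep) = interp1(x(~rep), y(~rep), x(rep));

disp([x y yi]);
disp(diff(yi).');                % no zero differences remain for this data
```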
0 Comments
Paul
21 minutes ago
The data in gora and dol are on the plot, as can be seen below when using markers. However, if the y-data pattern is
increasing->decreasing->increasing ...
then gora and dol will have data->NaN->data ...
and so the data points in gora and dol won't be connected on the plot (and won't be visible at all if markers are not used).
load x
load y
ydiff=diff(y,1);
wieksze = (ydiff > 0);
mniejsze = (ydiff < 0);
gora = y;
dol = y;
gora(~wieksze) = NaN;
dol(~mniejsze) = NaN;
figure
plot(x,y,'b',x, gora, 'r-o', x, dol, 'g-x');
xlim([7.3623688,7.3623691]*1e5)
xl = xlim;
counts = (1:numel(x)).';
index = x>xl(1) & x < xl(2);
format long
[counts(index),x(index),y(index),gora(index),dol(index),wieksze(index),mniejsze(index)]
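One way to get continuous coloured lines is to colour the segment between two points rather than the points themselves. The sketch below, on made-up data, builds NaN-separated segment lists for the increasing and decreasing intervals; the names `iu`, `id`, `xu`, `yu`, `xd` and `yd` are introduced here for illustration and do not appear in the question.

```matlab
x = (1:6).';
y = [1 3 2 2 4 3].';             % illustrative values only
d = diff(y);

iu = find(d > 0).';              % start index of each increasing interval
id = find(d < 0).';              % start index of each decreasing interval

% Each column is [start; end; NaN]. The NaN lifts the pen, so only the
% two points of one interval are ever joined by a line.
xu = [reshape(x(iu), 1, []); reshape(x(iu+1), 1, []); NaN(1, numel(iu))];
yu = [reshape(y(iu), 1, []); reshape(y(iu+1), 1, []); NaN(1, numel(iu))];
xd = [reshape(x(id), 1, []); reshape(x(id+1), 1, []); NaN(1, numel(id))];
yd = [reshape(y(id), 1, []); reshape(y(id+1), 1, []); NaN(1, numel(id))];

xu = xu(:);  yu = yu(:);  xd = xd(:);  yd = yd(:);

fprintf('%d rising, %d falling, %d flat intervals\n', ...
        numel(iu), numel(id), nnz(d == 0));
```

Calling `plot(x, y, 'b', xu, yu, 'r', xd, yd, 'g')` with these vectors should then draw every rising segment in red and every falling one in green, leaving blue visible only on the flat intervals.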
0 Comments