When I add two floating-point numbers, the result is not correct above 262144
Mark Ekblad
on 30 Apr 2018
Commented: John D'Errico
on 30 Apr 2018
When I execute the following, I always get back 262144. It is as if the value were being treated as an integer.
a=single(262144.000);
for i=1:1234,
a= a+single(0.01);
end;
display(a-floor(a));
sprintf('%10.6f',a)
single
0
ans =
'262144.000000'
Or, if I just evaluate a = single(262144.000)+single(0.01), the result is 262144.
Accepted Answer
James Tursa
on 30 Apr 2018
Edited: James Tursa
on 30 Apr 2018
The amount you are adding, 0.01, is smaller than the eps of the number you are adding it to (less than half of it, in fact). E.g.,
>> a = single(262144.000)
a =
262144
>> eps(a)
ans =
0.0313
>> a + 0.01
ans =
262144
So the addition does not change the value of "a" because there is not enough precision in "a". I.e., the closest number to 262144.01 in IEEE single precision is in fact 262144. The next higher representable number is 262144.03125.
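To make that concrete, here is a small sketch that prints the neighbouring single-precision value and then repeats the original loop with a double accumulator. Accumulating in double is only one possible way around the problem, not something prescribed above, and the variable names (step, next_up, acc) are made up for illustration.

```matlab
a    = single(262144);         % 2^18
step = single(0.01);

next_up = a + eps(a);          % nearest single above a: 262144.03125

fprintf('eps(a)     = %.6f\n', eps(a));
fprintf('a + eps(a) = %.6f\n', next_up);
fprintf('a + 0.01   = %.6f\n', a + step);   % rounds back to 262144

% Assumed workaround: accumulate in double, convert to single once at the end.
acc = double(a);
for k = 1:1234
    acc = acc + 0.01;          % eps(acc) is about 5.8e-11 here, so 0.01 registers
end
fprintf('double sum = %.6f\n', acc);
fprintf('as single  = %.6f\n', single(acc));
```

The final conversion still rounds to the nearest multiple of 0.03125, but that rounding happens once instead of at each of the 1234 additions.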
More Answers (2)
the cyclist
on 30 Apr 2018
It is not possible to represent most decimal numbers exactly with a finite number of bits (in this case, 32 bits, because you are specifying single precision).
So, you will get that
single(262144) + single(0.1)
is most closely representable by
2.6214409e5
but that
single(262144) + single(0.01)
is most closely represented as
262144
The numbers are not "being treated as integers"; each result is stored as the closest value that this 32-bit format can represent.
You would similarly see that
262144 + 1.e-10
is represented as
2.621440000000001e+05
but
262144 + 1.e-11
is represented (and displayed) as
262144
in double precision.
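The same check can be done for the double-precision case. This is just a sketch with illustrative variable names (x, ulp, changed_10, changed_11); it compares the two increments against the spacing eps(x).

```matlab
x   = 262144;                    % double by default
ulp = eps(x);                    % spacing of doubles near x: 2^-34, about 5.82e-11

changed_10 = (x + 1e-10) ~= x;   % 1e-10 is more than half the spacing
changed_11 = (x + 1e-11) ~= x;   % 1e-11 is less than half the spacing

fprintf('eps(x) = %.3e\n', ulp);
fprintf('x + 1e-10 differs from x: %d\n', changed_10);
fprintf('x + 1e-11 differs from x: %d\n', changed_11);
```

Only the first increment survives, which matches the two displayed results above.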
John D'Errico
on 30 Apr 2018
Time to learn what eps means. After all, YOU were the one who chose to use single precision here. So that means you should also learn what the consequences of that decision will be.
a=single(262144.000)
a =
single
262144
eps(a)
ans =
single
0.03125
eps(a) is the distance from a to the next larger number of the same precision. Adding eps(a) gives a different number, while adding only half of it rounds back to a:
a == a+eps(a)
ans =
logical
0
a == a+eps(a)/2
ans =
logical
1
Think of eps(a) as the size of the least significant bit in the number a.
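If you want to see where that least significant bit comes from, the following sketch rebuilds eps(a) from the binary exponent of a. It assumes the usual IEEE single-precision layout with 24 significand bits (23 stored plus one implicit); the names p, f, e and lsb are only for illustration.

```matlab
a = single(262144);
p = 24;                        % significand bits in single precision (assumed IEEE layout)

[f, e] = log2(a);              % a = f * 2^e with 0.5 <= f < 1; here f = 0.5, e = 19
lsb = pow2(double(e) - p);     % weight of the last significand bit: 2^(19-24) = 2^-5

fprintf('lsb = %g, eps(a) = %g\n', lsb, eps(a));
```

Both values come out as 0.03125 for this a.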
2 Comments
the cyclist
on 30 Apr 2018
As an addendum to help your understanding, note that
a = single(262144);
e = eps(a);
b = log2(e)
results in
b = -5
So, eps() returns the distance to the next larger floating-point number (of the same precision), and in this case that distance is 2^(-5).
This illustrates the binary aspect of the representation.
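This also suggests why the trouble starts at 262144 in particular: it is 2^18, and the spacing doubles at every power of two. A quick sketch comparing a value just below 2^18 with 2^18 itself (the names below, above and step are only for illustration):

```matlab
step  = single(0.01);
below = single(262143);        % just under 2^18
above = single(262144);        % exactly 2^18

fprintf('eps(below) = %g, eps(above) = %g\n', eps(below), eps(above));
fprintf('below + step changes value: %d\n', (below + step) ~= below);
fprintf('above + step changes value: %d\n', (above + step) ~= above);
```

Just below 2^18 the spacing is 2^(-6) = 0.015625, so 0.01 is more than half of it and the sum rounds up to the next number. From 2^18 onward the spacing is 2^(-5) = 0.03125, 0.01 is less than half of it, and the sum rounds back down.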