float_params2
Version 1.0.1 (2.66 KB) by
Marco Cococcioni
MATLAB Code for Parameters of Floating-Point Arithmetics
`float_params2` is a MATLAB function for obtaining the parameters of several
floating-point arithmetics. The parameters are built into the code and are
not computed at run time.
The parameters are
- the unit roundoff,
- the smallest positive (subnormal) floating-point number,
- the smallest positive normalized floating-point number,
- the largest floating-point number,
- the number of binary digits in the significand (including the
implicit leading bit)
and the arithmetics supported are
- bfloat8,
- bfloat16,
- IEEE half precision (fp16),
- IEEE single precision (fp32),
- IEEE double precision (fp64),
- IEEE quadruple precision (fp128).
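The parameters listed above are related by simple formulas for IEEE-style binary formats with subnormal numbers. The sketch below illustrates those relations; it is not the code of `float_params2`, which stores the values directly. The helper name `derive` and its field names are hypothetical, and the unit roundoff is taken as 2^(-t), which assumes round-to-nearest.

```matlab
% Hypothetical helper (not part of float_params2): derive the parameters
% of an IEEE-style binary format from
%   t    - number of significand bits, including the implicit leading bit
%   emax - largest exponent (the smallest normal exponent is 1 - emax)
derive = @(t, emax) struct( ...
    'u',     2^(-t), ...                   % unit roundoff (round to nearest)
    'xmins', 2^((1 - emax) - t + 1), ...   % smallest positive subnormal number
    'xmin',  2^(1 - emax), ...             % smallest positive normalized number
    'xmax',  2^emax * (2 - 2^(1 - t)), ... % largest finite number
    't',     t);                           % significand bits

half = derive(11, 15);     % IEEE half precision (fp16)
dbl  = derive(53, 1023);   % IEEE double precision (fp64)

fprintf('fp16: u = %g, xmins = %g, xmin = %g, xmax = %g\n', ...
        half.u, half.xmins, half.xmin, half.xmax);
fprintf('fp64: u = %g, xmins = %g, xmin = %g, xmax = %g\n', ...
        dbl.u, dbl.xmins, dbl.xmin, dbl.xmax);
```

For double precision the derived values can be compared with MATLAB's built-ins: `dbl.u` equals `eps/2`, `dbl.xmin` equals `realmin`, and `dbl.xmax` equals `realmax`.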
The code was developed in MATLAB R2020a and works with versions at least
back to R2016b.
This is a small extension of Nick Higham's float_params, to which I added
support for the 8-bit Brain Float (bfloat8), as proposed at Intel by Naveen K. Mellempudi.
More details can be found here: https://arxiv.org/abs/1905.12334
I also renamed NVIDIA's tf32 to tf19, to reflect that the format occupies only 19 bits.
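The 19-bit count can be checked by adding up the fields of each format: one sign bit, the exponent bits, and the stored significand bits (the significand length t minus the implicit leading bit). The sketch below does this for the formats whose layouts are standard; the variable names are illustrative, and the table of exponent widths is supplied here rather than taken from `float_params2`. bfloat8 and fp128 are omitted.

```matlab
% Storage width = 1 sign bit + exponent bits + (t - 1) stored significand bits.
% t includes the implicit leading bit, as in the parameter list above.
names = {'bfloat16', 'fp16', 'tf19', 'fp32', 'fp64'};
t     = [ 8, 11, 11, 24, 53];   % significand bits (with implicit bit)
ebits = [ 8,  5,  8,  8, 11];   % exponent field width in bits

total = 1 + ebits + (t - 1);    % total storage bits per format

for k = 1:numel(names)
    fprintf('%-9s %2d bits\n', names{k}, total(k));
end
```

The tf19 entry combines the 8-bit exponent of fp32 with the 11-bit significand of fp16, which gives 1 + 8 + 10 = 19 bits.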
Cite As
Marco Cococcioni (2024). float_params2 (https://www.mathworks.com/matlabcentral/fileexchange/93835-float_params2), MATLAB Central File Exchange. Retrieved .
MATLAB Release Compatibility
Created with
R2021a
Compatible with any release
Platform Compatibility
Windows macOS Linux
Acknowledgements
Inspired by: float_params
Version | Published | Release Notes
---|---|---
1.0.1 | | very small update
1.0.0 | |