float_params2

MATLAB Code for Parameters of Floating-Point Arithmetics

Marco Cococcioni

Version 1.0.1 (2.66 KB)

20 Downloads


10 Jun 2021


`float_params2` is a MATLAB function for obtaining the parameters of several floating-point arithmetics. The parameters are built into the code and are not computed at run time.

The parameters are

- the unit roundoff,

- the smallest positive (subnormal) floating-point number,

- the smallest positive normalized floating-point number,

- the largest floating-point number,

- the number of binary digits in the significand (including the implicit leading bit)

and the arithmetics supported are

- bfloat8,

- bfloat16,

- IEEE half precision (fp16),

- IEEE single precision (fp32),

- IEEE double precision (fp64),

- IEEE quadruple precision (fp128).
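The five parameters listed above are all determined by two integers per format: the precision p (significand bits including the implicit leading bit) and the maximum exponent emax. The Python sketch below illustrates that relationship; it is not taken from `float_params2`, which stores the values directly. The names `float_parameters` and `FORMATS` are hypothetical, the (p, emax) pairs are the commonly quoted values for these formats, and the sketch assumes round-to-nearest and support for subnormal numbers.

```python
from fractions import Fraction

def float_parameters(p, emax):
    """Derive the parameters of a binary format from its precision p
    (bits, including the implicit leading bit) and maximum exponent emax.
    Exact rational arithmetic avoids rounding in the results."""
    emin = 1 - emax                    # minimum exponent of normalized numbers
    two = Fraction(2)
    return {
        "u": two ** (-p),                            # unit roundoff (round to nearest)
        "xmins": two ** (emin - p + 1),              # smallest positive subnormal number
        "xmin": two ** emin,                         # smallest positive normalized number
        "xmax": two ** emax * (2 - two ** (1 - p)),  # largest finite number
        "p": p,
    }

# Assumed (p, emax) pairs; hypothetical table, not read from float_params2.
FORMATS = {
    "bfloat16": (8, 127),
    "fp16": (11, 15),
    "fp32": (24, 127),
    "fp64": (53, 1023),
}

for name, (p, emax) in FORMATS.items():
    params = float_parameters(p, emax)
    print(f"{name:9s} p={p:3d}  u={float(params['u']):.3e}  "
          f"xmins={float(params['xmins']):.3e}  "
          f"xmin={float(params['xmin']):.3e}  "
          f"xmax={float(params['xmax']):.3e}")
```

For fp64 the derived values can be checked against the host's own double precision through `sys.float_info`, which is one way to sanity-check a table of built-in constants.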

The code was developed in MATLAB R2020a and works with versions at least back to R2016b.

This is a small extension of Nick Higham's float_params, to which I added support for the 8-bit Brain Float format proposed at Intel by Naveen K. Mellempudi.

More details can be found here: https://arxiv.org/abs/1905.12334

I also renamed NVIDIA's tf32 to tf19, to reflect that the format occupies only 19 bits (1 sign bit, 8 exponent bits, and 10 stored significand bits).
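The renaming comes down to counting bits: a format's storage width is the sum of its sign, exponent, and stored-fraction fields, while its precision is the fraction width plus the implicit leading bit. The Python sketch below does that count for a few formats. The `LAYOUTS` table and the helper names are hypothetical, and the field widths are the commonly published ones rather than values read from `float_params2`.

```python
# Field widths in bits: (sign, exponent, stored fraction).
# Assumed, commonly published layouts; not read from float_params2.
LAYOUTS = {
    "fp16": (1, 5, 10),
    "bfloat16": (1, 8, 7),
    "tf32": (1, 8, 10),   # the format that float_params2 calls tf19
    "fp32": (1, 8, 23),
}

def storage_bits(layout):
    """Total number of bits in the encoding."""
    return sum(layout)

def precision_bits(layout):
    """Significand precision: stored fraction bits plus the implicit bit."""
    return layout[2] + 1

for name, layout in LAYOUTS.items():
    print(f"{name:9s} storage={storage_bits(layout):2d} bits  "
          f"precision={precision_bits(layout):2d} bits")
```

Under these assumptions tf32 has the 8-bit exponent of fp32 and the 11-bit precision of fp16, for a total of 19 bits.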

Cite as

Marco Cococcioni (2026). float_params2 (https://de.mathworks.com/matlabcentral/fileexchange/93835-float_params2), MATLAB Central File Exchange. Retrieved August 2, 2026.

Acknowledgements

Inspired by: float_params

MATLAB Release Compatibility

Compatible with any release

Platform Compatibility

Windows
macOS
Linux


Version	Published	Release Notes
1.0.1	10 Jun 2021	very small update
1.0.0	10 Jun 2021	