16-bit Scalar API

group scalar_s16_api

Functions

int32_t s16_to_s32(exponent_t *a_exp, const int16_t b, const exponent_t b_exp, const unsigned remove_hr)

Convert a 16-bit floating-point scalar to a 32-bit floating-point scalar.

Converts a 16-bit floating-point scalar, represented by the 16-bit mantissa b and exponent b_exp, into a 32-bit floating-point scalar, represented by the 32-bit returned mantissa and output exponent a_exp.

If remove_hr is nonzero, the output mantissa is normalized to have no headroom, and a_exp is adjusted so that the represented value is unchanged. Otherwise, the output mantissa has the same value as the input mantissa.

Parameters:

a_exp – [out] Output exponent
b – [in] 16-bit input mantissa
b_exp – [in] Input exponent
remove_hr – [in] Whether to remove headroom in output

Returns:

32-bit output mantissa
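The conversion described above can be modelled in portable C. The following is a minimal sketch, not the library's implementation: the names sketch_exponent_t, sketch_headroom_s32 and sketch_s16_to_s32 are hypothetical, exponent_t is assumed to behave like a signed int, and leaving a zero mantissa untouched is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed int). */
typedef int sketch_exponent_t;

/* Headroom of a 32-bit value: the number of redundant leading sign bits,
 * i.e. how many times it can be doubled without overflowing int32_t. */
static unsigned sketch_headroom_s32(int32_t x)
{
    unsigned hr = 0;
    /* Doubling is safe while x lies in [-2^30, 2^30). */
    while (hr < 31 && x >= -(INT32_C(1) << 30) && x < (INT32_C(1) << 30)) {
        x *= 2;
        hr++;
    }
    return hr;
}

/* Model of the conversion: returns the 32-bit mantissa, writes the exponent. */
static int32_t sketch_s16_to_s32(sketch_exponent_t *a_exp, int16_t b,
                                 sketch_exponent_t b_exp, unsigned remove_hr)
{
    assert(a_exp != NULL);

    int32_t a = b;      /* same value, widened to 32 bits */
    *a_exp = b_exp;

    /* Assumption of this sketch: a zero mantissa is left unchanged. */
    if (remove_hr && a != 0) {
        unsigned hr = sketch_headroom_s32(a);
        /* Multiply by 2^hr (avoids left-shifting a negative value). */
        a = (int32_t)((int64_t)a * ((int64_t)1 << hr));
        /* Compensate so that a * 2^a_exp still equals b * 2^b_exp. */
        *a_exp = b_exp - (int)hr;
    }
    return a;
}
```

With remove_hr set, an input of 3 with exponent 0 becomes the mantissa 3 * 2^29 with exponent -29 in this model, which represents the same value using the full 32-bit range.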

int16_t s16_inverse(exponent_t *a_exp, const int16_t b)

Compute the multiplicative inverse (reciprocal) of a 16-bit integer.

b represents the integer \(b\). a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).

Operation Performed:

\[\begin{flalign*} a \cdot 2^{a\_exp} \leftarrow \frac{1}{b} && \end{flalign*}\]

Parameters:

a_exp – [out] Output exponent \(a\_exp\)
b – [in] Input integer \(b\)

Returns:

Output mantissa \(a\)
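One way to picture the operation \(a \cdot 2^{a\_exp} \leftarrow \frac{1}{b}\) is as an integer division of a power of two by b. The following is a minimal sketch under stated assumptions, not the library's implementation: sketch_exponent_t and sketch_s16_inverse are hypothetical names, a zero input is rejected by an assertion because the behaviour for b = 0 is not stated here, and the library may round differently or return a different but equivalent mantissa/exponent pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed int). */
typedef int sketch_exponent_t;

/* Model of 1/b as a 16-bit mantissa with an output exponent. */
static int16_t sketch_s16_inverse(sketch_exponent_t *a_exp, int16_t b)
{
    assert(a_exp != NULL);
    assert(b != 0);   /* assumption: zero input is not handled in this sketch */

    /* Start from the largest power of two that fits in int32_t and reduce
     * it until the quotient 2^scale / b fits in 16 bits. */
    int scale = 30;
    int32_t q = (INT32_C(1) << scale) / b;   /* truncates toward zero */
    while (q > INT16_MAX || q < INT16_MIN) {
        scale--;
        q = (INT32_C(1) << scale) / b;
    }

    /* q is approximately 2^scale / b, so q * 2^(-scale) approximates 1/b. */
    *a_exp = -scale;
    return (int16_t)q;
}
```

For b = 4 this model returns the mantissa 16384 with exponent -16, and 16384 * 2^-16 is exactly 0.25. For b = 3 it returns 21845 with exponent -16, which is 1/3 truncated to the available precision.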

int16_t s16_mul(exponent_t *a_exp, const int16_t b, const int16_t c, const exponent_t b_exp, const exponent_t c_exp)

Compute the product of two 16-bit floating-point scalars.

a and a_exp together represent the result \(a \cdot 2^{a\_exp}\).

b and b_exp together represent the input \(b \cdot 2^{b\_exp}\).

c and c_exp together represent the input \(c \cdot 2^{c\_exp}\).

Operation Performed:

\[\begin{flalign*} a \cdot 2^{a\_exp} \leftarrow \left( b\cdot 2^{b\_exp} \right) \cdot \left( c\cdot 2^{c\_exp} \right) && \end{flalign*}\]

Parameters:

a_exp – [out] Output exponent \(a\_exp\)
b – [in] First input mantissa \(b\)
c – [in] Second input mantissa \(c\)
b_exp – [in] First input exponent \(b\_exp\)
c_exp – [in] Second input exponent \(c\_exp\)

Returns:

Output mantissa \(a\)
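The product follows from multiplying the mantissas and adding the exponents, then scaling the 32-bit product back into 16 bits. The following is a minimal sketch of that idea, not the library's implementation: sketch_exponent_t and sketch_s16_mul are hypothetical names, truncation toward zero is an assumption, and the library may pick a different but equivalent mantissa/exponent pair.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for exponent_t (assumed to be a signed int). */
typedef int sketch_exponent_t;

/* Model of (b * 2^b_exp) * (c * 2^c_exp) with a 16-bit output mantissa. */
static int16_t sketch_s16_mul(sketch_exponent_t *a_exp, int16_t b, int16_t c,
                              sketch_exponent_t b_exp, sketch_exponent_t c_exp)
{
    assert(a_exp != NULL);

    /* The full product of two 16-bit mantissas always fits in 32 bits. */
    int32_t p = (int32_t)b * (int32_t)c;

    /* Halve the product until it fits in 16 bits, counting the halvings. */
    int shift = 0;
    while (p > INT16_MAX || p < INT16_MIN) {
        p /= 2;       /* truncates toward zero */
        shift++;
    }

    /* Exponents add; each halving of the mantissa raises the exponent by 1. */
    *a_exp = b_exp + c_exp + shift;
    return (int16_t)p;
}
```

As a check on the arithmetic, multiplying 16384 * 2^-14 by itself (1.0 times 1.0) gives a 32-bit product of 2^28, which this model reduces by 14 halvings to the mantissa 16384 with exponent -14, again representing 1.0.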