XS3 Mixed-Depth Vector Functions

void xs3_vect_s32_to_s16(int16_t a[], const int32_t b[], const unsigned length, const right_shift_t b_shr)

Convert a 32-bit vector to a 16-bit vector.

This function converts a 32-bit mantissa vector \(\bar b\) into a 16-bit mantissa vector \(\bar a\). Conceptually, the output BFP vector \(\bar{a}\cdot 2^{a\_exp}\) represents the same values as the input BFP vector \(\bar{b}\cdot 2^{b\_exp}\), only with a reduced bit-depth.

In most cases \(b\_shr\) should be \(16 - b\_hr\), where \(b\_hr\) is the headroom of the 32-bit input mantissa vector \(\bar b\).

The output exponent \(a\_exp\) will be given by

\( a\_exp = b\_exp + b\_shr \)

Parameter Details

a[] represents the 16-bit output mantissa vector \(\bar a\).

b[] represents the 32-bit input mantissa vector \(\bar b\).

a[] and b[] must each begin at a word-aligned address.

length is the number of elements in each of the vectors.

b_shr is the signed arithmetic right-shift applied to elements of \(\bar b\).

Operation Performed:

\[\begin{split}\begin{align*} & a_k \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]

Block Floating-Point

If \(\bar b\) are the 32-bit mantissas of a BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the 16-bit mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + b\_shr\).
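The choice of \(b\_shr\) and the operation above can be sketched with a plain-C reference model. This is an illustration only, not the library's VPU implementation: the helper names (ref_headroom_s32, ref_floor_shr, ref_vect_s32_to_s16) are hypothetical, and the sketch assumes saturation clamps to the full int16_t range, which this section does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Headroom of a 32-bit vector: the number of redundant leading sign bits
 * shared by every element (hypothetical helper, not a library call). */
static unsigned ref_headroom_s32(const int32_t b[], unsigned length)
{
    unsigned hr = 31;
    for (unsigned k = 0; k < length; k++) {
        /* Map negatives onto non-negatives so only leading zeros are counted. */
        uint32_t v = (b[k] < 0) ? ~(uint32_t)b[k] : (uint32_t)b[k];
        unsigned n = 0;
        while (n < 31 && ((v >> (30 - n)) & 1u) == 0)
            n++;
        if (n < hr)
            hr = n;
    }
    return hr;
}

/* floor(x * 2^-shr), written portably; a negative shr is a left shift. */
static int64_t ref_floor_shr(int64_t x, int shr)
{
    assert(shr > -32 && shr < 32);
    if (shr < 0)
        return x * ((int64_t)1 << -shr);
    if (x >= 0)
        return x >> shr;
    return -((-x + (((int64_t)1 << shr) - 1)) >> shr);
}

/* Reference model of a_k <- sat16(floor(b_k * 2^-b_shr)).
 * Assumption: saturation bounds are INT16_MIN..INT16_MAX. */
static void ref_vect_s32_to_s16(int16_t a[], const int32_t b[],
                                unsigned length, int b_shr)
{
    for (unsigned k = 0; k < length; k++) {
        int64_t v = ref_floor_shr(b[k], b_shr);
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        a[k] = (int16_t)v;
    }
}
```

With this model, an input vector with 14 bits of headroom gives b_shr = 16 - 14 = 2, and the caller would record a_exp = b_exp + 2 alongside the 16-bit mantissas.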

Parameters
  • a[out] Output vector \(\bar a\)

  • b[in] Input vector \(\bar b\)

  • length[in] Number of elements in vectors \(\bar a\) and \(\bar b\)

  • b_shr[in] Right-shift applied to \(\bar b\)

Throws

ET_LOAD_STORE – Raised if a or b is not word-aligned (See Note: Vector Alignment)

void xs3_vect_s16_to_s32(int32_t a[], const int16_t b[], const unsigned length)

Convert a 16-bit vector to a 32-bit vector.

a[] represents the 32-bit output vector \(\bar a\).

b[] represents the 16-bit input vector \(\bar b\).

Each vector must begin at a word-aligned address.

length is the number of elements in each of the vectors.

Operation Performed:

\[\begin{split}\begin{align*} & a_k \leftarrow b_k \cdot 2^{8} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]

Block Floating-Point

If \(\bar b\) are the mantissas of BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the 32-bit mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\). If \(a\_exp = b\_exp - 8\), then this operation has effectively not changed the values represented.

Notes

  • The multiplication by \(2^8\) is an artifact of the VPU’s behavior. It turns out to be significantly more efficient to include the factor of \(2^8\). If this is unwanted, xs3_vect_s32_shr() can be used with a b_shr value of 8 to remove the scaling afterwards.

  • The headroom of output vector \(\bar a\) is not returned by this function. The headroom of the output is always 8 bits greater than the headroom of the input.
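The scaling and exponent bookkeeping described above can be sketched in portable C. The function names here are hypothetical stand-ins for illustration; they model the documented arithmetic only and say nothing about how the VPU performs it.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of a_k <- b_k * 2^8. The product always fits in 32 bits. */
static void ref_vect_s16_to_s32(int32_t a[], const int16_t b[], unsigned length)
{
    assert(length == 0 || (a != 0 && b != 0));
    for (unsigned k = 0; k < length; k++)
        a[k] = (int32_t)b[k] * 256;
}

/* Exponent that leaves the represented values unchanged: a_exp = b_exp - 8. */
static int ref_s16_to_s32_exp(int b_exp)
{
    return b_exp - 8;
}

/* Undo the 2^8 scaling. Exact here because every a_k is a multiple of 256. */
static void ref_remove_scale(int32_t a[], unsigned length)
{
    for (unsigned k = 0; k < length; k++)
        a[k] /= 256;
}
```

For example, a mantissa of 16384 with b_exp = -15 represents 0.5; after conversion the mantissa is 4194304 with a_exp = -23, which is still 0.5.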

Parameters
  • a[out] 32-bit output vector \(\bar a\)

  • b[in] 16-bit input vector \(\bar b\)

  • length[in] Number of elements in vectors \(\bar a\) and \(\bar b\)

Throws

ET_LOAD_STORE – Raised if a or b is not word-aligned (See Note: Vector Alignment)

void xs3_vect_complex_s32_to_complex_s16(int16_t a_real[], int16_t a_imag[], const complex_s32_t b[], const unsigned length, const right_shift_t b_shr)

Convert a complex 32-bit vector into a complex 16-bit vector.

This function converts a complex 32-bit mantissa vector \(\bar b\) into a complex 16-bit mantissa vector \(\bar a\). Conceptually, the output BFP vector \(\bar{a}\cdot 2^{a\_exp}\) represents the same value as the input BFP vector \(\bar{b}\cdot 2^{b\_exp}\), only with a reduced bit-depth.

In most cases \(b\_shr\) should be \(16 - b\_hr\), where \(b\_hr\) is the headroom of the 32-bit input mantissa vector \(\bar b\). The output exponent \(a\_exp\) will then be given by

\( a\_exp = b\_exp + b\_shr \)

Parameter Details

a_real[] and a_imag[] together represent the complex 16-bit output mantissa vector \(\bar a\), with the real part of each \(a_k\) going in a_real[] and the imaginary part going in a_imag[].

b[] represents the complex 32-bit mantissa vector \(\bar b\).

a_real[], a_imag[] and b[] must each begin at a word-aligned address.

length is the number of elements in each of the vectors.

b_shr is the signed arithmetic right-shift applied to elements of \(\bar b\).

Operation Performed:

\[\begin{split}\begin{align*} & b_k' \leftarrow sat_{16}(\lfloor b_k \cdot 2^{-b\_shr} \rfloor) \\ & Re\{a_k\} \leftarrow Re\{b_k'\} \\ & Im\{a_k\} \leftarrow Im\{b_k'\} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]

Block Floating-Point

If \(\bar b\) are the complex 32-bit mantissas of a BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the complex 16-bit mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp + b\_shr\).
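A plain-C reference model of the operation above may help show how the interleaved complex input is split into separate real and imaginary outputs. The type ref_complex_s32_t and its field names are assumptions made for this sketch (the layout of complex_s32_t is not given in this section), and saturation is assumed to clamp to the full int16_t range.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: real part followed by imaginary part. */
typedef struct { int32_t re; int32_t im; } ref_complex_s32_t;

/* sat16(floor(x * 2^-b_shr)), written portably; negative b_shr shifts left. */
static int16_t ref_shr_sat16(int32_t x, int b_shr)
{
    assert(b_shr > -32 && b_shr < 32);
    int64_t v = x;
    if (b_shr < 0)
        v *= ((int64_t)1 << -b_shr);
    else if (v >= 0)
        v >>= b_shr;
    else
        v = -((-v + (((int64_t)1 << b_shr) - 1)) >> b_shr);
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    return (int16_t)v;
}

/* Shift each part independently, then store real and imaginary parts
 * in their separate output arrays. */
static void ref_vect_complex_s32_to_complex_s16(int16_t a_real[], int16_t a_imag[],
                                                const ref_complex_s32_t b[],
                                                unsigned length, int b_shr)
{
    for (unsigned k = 0; k < length; k++) {
        a_real[k] = ref_shr_sat16(b[k].re, b_shr);
        a_imag[k] = ref_shr_sat16(b[k].im, b_shr);
    }
}
```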

Parameters
  • a_real[out] Real part of complex output vector \(\bar a\).

  • a_imag[out] Imaginary part of complex output vector \(\bar a\).

  • b[in] Complex input vector \(\bar b\).

  • length[in] Number of elements in vectors \(\bar a\) and \(\bar b\)

  • b_shr[in] Right-shift applied to \(\bar b\).

Throws

ET_LOAD_STORE – Raised if a_real, a_imag or b is not word-aligned (See Note: Vector Alignment)

void xs3_vect_complex_s16_to_complex_s32(complex_s32_t a[], const int16_t b_real[], const int16_t b_imag[], const unsigned length)

Convert a complex 16-bit vector into a complex 32-bit vector.

a[] represents the complex 32-bit output vector \(\bar a\). It must begin at a double word (8-byte) aligned address.

b_real[] and b_imag[] together represent the complex 16-bit input mantissa vector \(\bar b\). Each \(Re\{b_k\}\) is b_real[k], and each \(Im\{b_k\}\) is b_imag[k].

length is the number of elements in each of the vectors.

Operation Performed:

\[\begin{split}\begin{align*} & Re\{a_k\} \leftarrow Re\{b_k\} \\ & Im\{a_k\} \leftarrow Im\{b_k\} \\ & \qquad\text{ for }k\in 0\ ...\ (length-1) \end{align*}\end{split}\]

Block Floating-Point

If \(\bar b\) are the complex 16-bit mantissas of a BFP vector \(\bar{b} \cdot 2^{b\_exp}\), then the resulting vector \(\bar a\) are the complex 32-bit mantissas of BFP vector \(\bar{a} \cdot 2^{a\_exp}\), where \(a\_exp = b\_exp\).

Notes

  • The headroom of output vector \(\bar a\) is not returned by this function. The headroom of the output is always 16 bits greater than the headroom of the input.
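As a sketch of the operation above, the following plain-C model copies each 16-bit part into the wider element unchanged, which is why the exponent stays the same and the headroom grows by 16 bits. The type ref_complex_s32_t and the function name are hypothetical; the assertion mirrors the documented 8-byte alignment requirement on a[].

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout for illustration: real part followed by imaginary part. */
typedef struct { int32_t re; int32_t im; } ref_complex_s32_t;

/* Reference model: sign-extend each part, with no scaling applied. */
static void ref_vect_complex_s16_to_complex_s32(ref_complex_s32_t a[],
                                                const int16_t b_real[],
                                                const int16_t b_imag[],
                                                unsigned length)
{
    /* The output vector is documented as requiring 8-byte alignment. */
    assert(((uintptr_t)a % 8) == 0);
    for (unsigned k = 0; k < length; k++) {
        a[k].re = b_real[k];
        a[k].im = b_imag[k];
    }
}
```

A caller could declare the output as `_Alignas(8) ref_complex_s32_t a[N];` (C11) to satisfy the alignment check.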

Parameters
  • a[out] Complex output vector \(\bar a\).

  • b_real[in] Real part of complex input vector \(\bar b\).

  • b_imag[in] Imaginary part of complex input vector \(\bar b\).

  • length[in] Number of elements in vectors \(\bar a\) and \(\bar b\)

Throws

ET_LOAD_STORE – Raised if a is not double-word-aligned (See Note: Vector Alignment)

void xs3_vect_s16_extract_high_byte(int8_t a[], const int16_t b[], const unsigned len)

Extract an 8-bit vector containing the most significant byte of a 16-bit vector.

This is a utility function used, for example, in optimizing mixed-width products. The most significant byte of each element is extracted (without rounding or saturation) and inserted into the output vector.

Parameters
  • a[out] 8-bit output vector \(\bar a\)

  • b[in] 16-bit input vector \(\bar b\)

  • len[in] The number of elements in \(\bar a\) and \(\bar b\)

Throws

ET_LOAD_STORE – Raised if a or b is not word-aligned (See Note: Vector Alignment)

void xs3_vect_s16_extract_low_byte(int8_t a[], const int16_t b[], const unsigned len)

Extract an 8-bit vector containing the least significant byte of a 16-bit vector.

This is a utility function used, for example, in optimizing mixed-width products. The least significant byte of each element is extracted (without rounding or saturation) and inserted into the output vector.
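The two byte-extraction operations can be modelled in portable C as below. The function names are hypothetical, and the recombination helper is only an illustration of why such a split can be useful for mixed-width products (each 16-bit element equals its signed high byte times 256 plus its low byte read as unsigned); it is not a description of how the library uses these functions internally.

```c
#include <assert.h>
#include <stdint.h>

/* Reinterpret a raw byte as a signed 8-bit value without relying on
 * implementation-defined narrowing conversions. */
static int8_t ref_byte_to_s8(uint8_t u)
{
    return (int8_t)((u < 128) ? (int)u : (int)u - 256);
}

/* Most significant byte of each element, no rounding or saturation. */
static void ref_extract_high_byte(int8_t a[], const int16_t b[], unsigned len)
{
    for (unsigned k = 0; k < len; k++)
        a[k] = ref_byte_to_s8((uint8_t)((uint16_t)b[k] >> 8));
}

/* Least significant byte of each element, no rounding or saturation. */
static void ref_extract_low_byte(int8_t a[], const int16_t b[], unsigned len)
{
    for (unsigned k = 0; k < len; k++)
        a[k] = ref_byte_to_s8((uint8_t)((uint16_t)b[k] & 0xFFu));
}

/* Rebuild the 16-bit value: signed high byte * 256 + unsigned low byte. */
static int16_t ref_recombine(int8_t hi, int8_t lo)
{
    return (int16_t)((int)hi * 256 + (int)(uint8_t)lo);
}
```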

Parameters
  • a[out] 8-bit output vector \(\bar a\)

  • b[in] 16-bit input vector \(\bar b\)

  • len[in] The number of elements in \(\bar a\) and \(\bar b\)

Throws

ET_LOAD_STORE – Raised if a or b is not word-aligned (See Note: Vector Alignment)

void xs3_mat_mul_s8_x_s8_yield_s32(xs3_split_acc_s32_t accumulators[], const int8_t matrix[], const int8_t input_vect[], const unsigned M_rows, const unsigned N_cols)

Multiply-accumulate an 8-bit matrix by an 8-bit vector into 32-bit accumulators.

This function multiplies an 8-bit \(M \times N\) matrix \(\bar W\) by an 8-bit \(N\)-element column vector \(\bar v\) and adds the product to the 32-bit accumulator vector \(\bar a\).

accumulators is the output vector \(\bar a\) to which the product \(\bar W\times\bar v\) is accumulated. Note that the accumulators are encoded in a format native to the XS3 VPU. To initialize the accumulator vector to zeros, just zero the memory.

matrix is the matrix \(\bar W\).

input_vect is the vector \(\bar v\).

matrix and input_vect must both begin at word-aligned addresses.

M_rows and N_cols are the dimensions \(M\) and \(N\) of matrix \(\bar W\). \(M\) must be a multiple of 16, and \(N\) must be a multiple of 32.

The result of this multiplication is exact, so long as saturation does not occur.
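The arithmetic described above can be sketched as a plain-C multiply-accumulate. Several things here are assumptions made for illustration: the function name is hypothetical, the accumulators are ordinary int32_t values rather than the VPU-native xs3_split_acc_s32_t encoding, the matrix is taken to be stored row-major, and saturation is applied once per row to the full int32_t range, which may differ from the hardware's behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Reference model of acc <- acc + W * v with 32-bit saturation.
 * Assumptions: row-major W, plain int32_t accumulators. */
static void ref_mat_mul_s8_x_s8(int32_t acc[], const int8_t matrix[],
                                const int8_t input_vect[],
                                unsigned M_rows, unsigned N_cols)
{
    /* Documented shape constraints: M a multiple of 16, N a multiple of 32. */
    assert(M_rows % 16 == 0 && N_cols % 32 == 0);
    for (unsigned m = 0; m < M_rows; m++) {
        int64_t sum = acc[m];  /* start from the existing accumulator value */
        for (unsigned n = 0; n < N_cols; n++)
            sum += (int32_t)matrix[m * N_cols + n] * input_vect[n];
        if (sum > INT32_MAX) sum = INT32_MAX;
        if (sum < INT32_MIN) sum = INT32_MIN;
        acc[m] = (int32_t)sum;
    }
}
```

Because the function accumulates rather than overwrites, calling it twice with the same inputs on zeroed accumulators doubles each result.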

Parameters
  • accumulators[inout] The accumulator vector \(\bar a\)

  • matrix[in] The weight matrix \(\bar W\)

  • input_vect[in] The input vector \(\bar v\)

  • M_rows[in] The number of rows \(M\) in matrix \(\bar W\)

  • N_cols[in] The number of columns \(N\) in matrix \(\bar W\)

Throws

ET_LOAD_STORE – Raised if matrix or input_vect is not word-aligned (See Note: Vector Alignment)

void xs3_mat_mul_s8_x_s16_yield_s32(int32_t output[], const int8_t matrix[], const int16_t input_vect[], const unsigned M_rows, const unsigned N_cols, int8_t scratch[])

Multiply an 8-bit matrix by a 16-bit vector for a 32-bit result vector.

This function multiplies an 8-bit \(M \times N\) matrix \(\bar W\) by a 16-bit \(N\)-element column vector \(\bar v\) and returns the result as a 32-bit \(M\)-element vector \(\bar a\).

output is the output vector \(\bar a\).

matrix is the matrix \(\bar W\).

input_vect is the vector \(\bar v\).

matrix and input_vect must both begin at word-aligned addresses.

M_rows and N_cols are the dimensions \(M\) and \(N\) of matrix \(\bar W\). \(M\) must be a multiple of 16, and \(N\) must be a multiple of 32.

scratch is a pointer to a word-aligned buffer that this function may use to store intermediate results. This buffer must be at least \(N\) bytes long.

The result of this multiplication is exact, so long as saturation does not occur.

Parameters
  • output[inout] The output vector \(\bar a\)

  • matrix[in] The weight matrix \(\bar W\)

  • input_vect[in] The input vector \(\bar v\)

  • M_rows[in] The number of rows \(M\) in matrix \(\bar W\)

  • N_cols[in] The number of columns \(N\) in matrix \(\bar W\)

  • scratch[in] Scratch buffer required by this function.

Throws

ET_LOAD_STORE – Raised if matrix or input_vect is not word-aligned (See Note: Vector Alignment)

unsigned xs3_vect_sXX_add_scalar(int32_t a[], const int32_t b[], const unsigned length_bytes, const int32_t c, const int32_t d, const right_shift_t b_shr, const unsigned mode_bits)

Add a scalar to a vector.

This works for 8-, 16- or 32-bit vectors, real or complex.

length_bytes is the total number of bytes to be output. So, for 16-bit vectors, length_bytes is twice the number of elements, whereas for complex 32-bit vectors, length_bytes is 8 times the number of elements.

c and d populate an internal buffer whose contents are added to the input vector. Internally, an 8-word (32-byte) buffer is allocated on the stack; even-indexed words are populated with c and odd-indexed words with d. For real vectors, c and d should be the same value; d exists so that this same function can work for complex 32-bit vectors. This also means that for 16-bit vectors, the value to be added must be duplicated in both the upper 2 bytes and the lower 2 bytes of the word.

mode_bits should be 0x0000 for 32-bit mode, 0x0100 for 16-bit mode or 0x0200 for 8-bit mode.
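Preparing the arguments described above can be sketched as follows. The struct and helper names are hypothetical and exist only for this illustration; they compute length_bytes, c, d and mode_bits for two of the documented cases. Mapping c to the real part and d to the imaginary part for complex vectors is an inference from the even/odd word description, assuming the real part of each complex element comes first in memory.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical bundle of the arguments that depend on the vector type. */
typedef struct {
    unsigned length_bytes;
    int32_t  c;
    int32_t  d;
    unsigned mode_bits;
} ref_add_scalar_args_t;

/* Real 16-bit vector: 2 bytes per element, and the scalar is duplicated in
 * the upper and lower halves of the word. c and d are equal. */
static ref_add_scalar_args_t ref_args_s16(unsigned length, int16_t scalar)
{
    ref_add_scalar_args_t args;
    uint32_t half = (uint16_t)scalar;
    uint32_t word = (half << 16) | half;
    int32_t packed;
    memcpy(&packed, &word, sizeof packed);  /* reinterpret the bit pattern */
    args.length_bytes = 2u * length;
    args.c = packed;
    args.d = packed;
    args.mode_bits = 0x0100;                /* 16-bit mode */
    return args;
}

/* Complex 32-bit vector: 8 bytes per element. Assumption: c is the real
 * part (even-indexed words) and d the imaginary part (odd-indexed words). */
static ref_add_scalar_args_t ref_args_complex_s32(unsigned length,
                                                  int32_t re, int32_t im)
{
    ref_add_scalar_args_t args;
    assert(length <= UINT32_MAX / 8u);      /* keep length_bytes in range */
    args.length_bytes = 8u * length;
    args.c = re;
    args.d = im;
    args.mode_bits = 0x0000;                /* 32-bit mode */
    return args;
}
```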