Overview

The XVF3610 voice processor is designed for seamless integration into consumer electronic products that require a voice interface for Automatic Speech Recognition (ASR), communication or conferencing. In addition to its class-leading voice processing, the XVF3610 implements specific features and interfaces required for closely integrated applications, such as incorporation into a TV or set-top box.

Two modes of operation are supported by the XVF3610:

XVF3610-UA: Audio and control via a USB 2.0 interface
XVF3610-INT: Audio via I2S and control over I2C interfaces

The functional block diagrams of the two XVF3610 configurations are shown in the figures below.

Fig. 7 Functional block diagram of XVF3610 in UA configuration

Fig. 8 Functional block diagram of XVF3610 in INT configuration

Audio processing

The VocalFusion XVF3610 voice processor converts and enhances audio captured using a pair of low-cost digital microphones. Processed audio streams are suitable for use in Automatic Speech Recognition (ASR) or voice communications applications and benefit from a range of configurable audio processing techniques to allow customisation to the use case. The embedded audio processing provides the following features:

2-microphone far-field operation.
Full 360-degree operation in "coffee table" applications, or 180-degree operation in edge-of-room products such as smart TVs.
16kHz voice processing, with optional 16kHz and 48kHz interface sample rates.
Full-duplex, stereo Acoustic Echo Cancellation (AEC) with a maximum tail length of 225ms, accommodating highly reverberant environments. (Reference audio for cancellation is provided via the I2S slave interface.)
Automatic bulk delay insertion of up to 150ms to account for positive or negative reference audio delays, ensuring optimal echo cancellation with all audio output paths.
Cancellation of point noise sources via a 256-frequency-band Interference Canceller.
Switchable stationary noise suppressor.
Adjustable gain over a 60dB range with automatic gain control.
Audio output filtering and range limiter.
Independent audio processing paths and control of parameters for communications and ASR audio.
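
The timing and gain figures in the list above can be expressed as sample counts and linear ratios with simple arithmetic. The following is a minimal sketch in Python, assuming the 16kHz voice processing rate stated above; the helper names are hypothetical and are not part of any XVF3610 API.

```python
PROCESSING_RATE_HZ = 16_000  # voice processing rate stated in the feature list


def ms_to_samples(duration_ms: float, rate_hz: int = PROCESSING_RATE_HZ) -> int:
    """Convert a duration in milliseconds to a whole number of samples."""
    return round(duration_ms * rate_hz / 1000)


def db_to_amplitude_ratio(gain_db: float) -> float:
    """Convert a gain in dB to a linear amplitude ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20)


aec_tail_samples = ms_to_samples(225)    # maximum AEC tail length
bulk_delay_samples = ms_to_samples(150)  # maximum automatic bulk delay
gain_span = db_to_amplitude_ratio(60)    # adjustable gain range

print(aec_tail_samples, bulk_delay_samples, gain_span)
```

At 16kHz, the 225ms tail corresponds to 3600 samples and the 150ms bulk delay to 2400 samples, while a 60dB range is a factor of 1000 in amplitude.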

System Interfaces

The VocalFusion XVF3610 voice processor provides the following additional interfaces to increase usability and reduce total system cost:

4 General Purpose Output pins. These can be configured as simple digital I/O pins, Pulse Width Modulated (PWM) outputs or rate-adjustable LED flashers.
4 General Purpose Input pins. These can be used as simple logic inputs or for event capture (edge detection).
SPI master interface to control and interrogate SPI slave devices, such as ADCs, DACs or external keyword detection devices.
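
To illustrate what event capture (edge detection) on a logic input means, the sketch below finds rising and falling edges in a sequence of sampled pin levels. It is a conceptual, host-side Python example only; the function name and data layout are hypothetical and do not represent the XVF3610 control interface.

```python
from typing import List, Tuple


def detect_edges(levels: List[int]) -> List[Tuple[int, str]]:
    """Return (sample_index, kind) for each transition in a sampled logic signal."""
    edges = []
    for i in range(1, len(levels)):
        prev, curr = levels[i - 1], levels[i]
        if prev == 0 and curr == 1:
            edges.append((i, "rising"))
        elif prev == 1 and curr == 0:
            edges.append((i, "falling"))
    return edges


# Hypothetical sampled levels of one input pin
events = detect_edges([0, 0, 1, 1, 0, 1])
print(events)  # [(2, 'rising'), (4, 'falling'), (5, 'rising')]
```

Reporting only transitions, rather than the level at every sample, is what lets a host react to events such as button presses without polling the pin state continuously.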

Booting and Initial configuration

The VocalFusion XVF3610 voice processor can be booted over SPI by a local host processor or from a separate, user-supplied, QSPI Flash memory. When operating with flash, the memory can be used for the following functions:

A default firmware image for power-on operation.
An upgrade image. Upgrades are delivered via I2C or USB, giving a host-controlled upgrade process for over-the-air device management.
A persistent user information space to allow user-configured data such as board identifiers and serial numbers to be maintained across multiple firmware upgrade cycles.
An upgradable user command space. Commands stored in this space are executed at boot time allowing the definition of start-up behaviour, VocalFusion XVF3610 configuration and setup of SPI peripheral devices connected to it.

With the exception of the persistent user information, the contents of the flash, and therefore the configuration of the system, can be upgraded and configured using the Device Firmware Upgrade (DFU) mechanism from the host processor.
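
The flash functions and their upgrade behaviour described above can be summarised as a small data model. This is a sketch only: the `FlashRegion` type and the region names are hypothetical labels for the four functions listed, not identifiers from the XVF3610 firmware or tools, and the `dfu_upgradable` flags simply restate the paragraph above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FlashRegion:
    name: str             # hypothetical label, not a firmware identifier
    purpose: str
    dfu_upgradable: bool  # whether the text above says DFU can change it


FLASH_REGIONS = [
    FlashRegion("default_image", "firmware image for power-on operation", True),
    FlashRegion("upgrade_image", "firmware delivered via I2C or USB", True),
    FlashRegion("persistent_user_info", "board identifiers, serial numbers", False),
    FlashRegion("user_commands", "commands executed at boot time", True),
]


def regions_preserved_across_dfu(regions: List[FlashRegion]) -> List[str]:
    """Names of regions that are kept unchanged through firmware upgrades."""
    return [r.name for r in regions if not r.dfu_upgradable]


print(regions_preserved_across_dfu(FLASH_REGIONS))  # ['persistent_user_info']
```

Keeping the persistent user information outside the upgradable set is what allows data such as serial numbers to survive multiple firmware upgrade cycles.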

Note

The two XVF3610 configurations, one providing an I2S/I2C interface (XVF3610-INT) and one providing a USB interface (XVF3610-UA), are delivered as separate sets of firmware.