Using LPDDR#

This section shows how to use LPDDR1 devices in target systems which employ the xcore.ai (XU316) device.

An LPDDR1 memory device may be connected to certain xcore.ai devices. The memory device may be 256Mbit, 512Mbit or 1024Mbit in size. See the relevant device datasheets for full connectivity details and electrical specifications.

The LPDDR memory contents may be accessed by application software executing on either tile 0 or on tile 1 of an xcore.ai device. It is not possible for both tiles to access LPDDR from a single application.

The application developer provides annotated C-language source to indicate which elements of the application reside in LPDDR. The bootloader performs the hardware setup required for the application to execute from, to read from and to write to this memory.

Note: In a system with two xcore.ai devices, the LPDDR device and the flash memory device (used to boot the system) must be connected to the same xcore.ai device.

Accessing LPDDR in an application#

Executable code or data entities that must reside in LPDDR must be annotated with:

__attribute__ ((section(".ExtMem<qualifier>")))

<qualifier> must be the string .bss for data which the bootstrap must initialise to 0 (traditionally known as BSS). Such data does not occupy space in the flash image.

<qualifier> may be any other string for data which is initialised or for code.

The following examples illustrate how the __attribute__ annotation should be applied:

// Place in BSS - zero-initialise in bootstrap - do not occupy space in the flash image
__attribute__ ((section(".ExtMem.bss")))
char working_area[10 * 1024 * 1024];

// Declaration of the same BSS object (for use in other source files) - repeat the annotation and use extern
__attribute__ ((section(".ExtMem.bss")))
extern char working_area[10 * 1024 * 1024];

// Annotation for executable code - occupies space in the flash image
__attribute__ ((section(".ExtMem.code")))
void procedure(int * p) {
  ...
}

// Annotation for initialised data - occupies space in the flash image
__attribute__ ((section(".ExtMem.data")))
unsigned ddr_data_word = 0xdeadbeef;

// Annotation for initialised data - occupies space in the flash image
__attribute__ ((section(".ExtMem_data")))
unsigned ddr_stuff[32768] = { 0x12345678, 0x234567ab, 0x4567abcd };

The annotation must be placed immediately before the definition and before every declaration of the entity. Use extern in declarations to avoid multiple definitions of the same object.

Compiling for external memory (LPDDR) and software-defined memory#

As described under memory models, accessing a data object in external or software-defined memory directly, by name, requires the large or hybrid memory model. Accessing it via a pointer, however, can be done in any memory model. XMOS recommends choosing one of the following schemes:

Scheme 1: the general case: all objects may be accessed directly#

Compile the whole application for the same model, either large or hybrid.

The large model is required only when the contiguous data area of any one tile (usually in internal RAM) exceeds 256KB, or when branches in code exceed the maximum range of the small model.
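
As an illustration of Scheme 1, the following is a minimal sketch of a source file that accesses an LPDDR object directly by name. The names samples, NUM_SAMPLES, store_sample, sum_samples and the macro IN_EXTMEM_BSS are hypothetical. The guard on __xcore__ assumes that this macro is predefined when compiling for an xcore target; it is used here only so that the same file also builds with a host compiler.

```c
// Hypothetical build line, in the style used elsewhere in this section:
// xcc -mcmodel=hybrid app.c ...

#include <stddef.h>

// Assumption: __xcore__ is predefined by the xcore compiler.
// On any other compiler the annotation is dropped so the file still builds.
#if defined(__xcore__)
#define IN_EXTMEM_BSS __attribute__((section(".ExtMem.bss")))
#else
#define IN_EXTMEM_BSS
#endif

#define NUM_SAMPLES 1024  // hypothetical size

// Zero-initialised by the bootstrap; does not occupy space in the flash image.
IN_EXTMEM_BSS
int samples[NUM_SAMPLES];

// Direct access by name: this is the case that needs the large or hybrid model.
void store_sample(size_t i, int value) {
    samples[i] = value;
}

int sum_samples(void) {
    int total = 0;
    for (size_t i = 0; i < NUM_SAMPLES; i++) {
        total += samples[i];
    }
    return total;
}
```

Because every file in the application is compiled under the same model in this scheme, no accessor functions are needed and any file may refer to samples by name through an annotated extern declaration.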

Scheme 2: access objects in LPDDR or software-defined memory via pointers#

This scheme can be used for external/software memory access when all of the following are true:

  1. No code (functions) will be placed in external memory, only data.

  2. External/software memory data is in a few, large objects which can be defined in dedicated source files. The scheme is not suitable if external/software memory objects are spread throughout an application.

  3. Internal RAM data for a tile fits within 256KB.

Advantage: the code generated to access internal RAM data is smaller and faster than when the large model is used.

Define external/software memory objects in specific source files, each with an accessor function that returns a pointer to the data. The source files defining external/software memory data and accessor functions must be compiled under the hybrid model. (The large model would also work, but is unnecessary if the conditions above are met.) All other source files can be compiled under the default (small) model.

This scheme is intended for the specific case of storing a small number of very large objects in a memory other than the internal RAM. Writing address-getter functions for many smaller objects is not recommended.

Example:

external.c#

// xcc -mcmodel=hybrid external.c ...

__attribute__((section(".ExtMem.data"))) int readings[MAX];
int * get_readings(void) { return readings; }

external.h#

int * get_readings(void);

main.c#

// xcc -mcmodel=small main.c ...

#include "external.h"
...
int * r = get_readings();
r[3] = k;
int a = r[7];

Hardware setup#

The xcore.ai device has an LPDDR controller which must be configured by the bootloader. The parameters required for configuration are provided in the target XN file. The following information is required:

  1. The frequency at which the LPDDR interface is clocked

  2. The size of the LPDDR device (256Mbit, 512Mbit or 1024Mbit)

  3. xcore.ai output pad drive strengths (inputs to the LPDDR device)

  4. LPDDR output drive strengths (inputs to the xcore.ai device)

LPDDR clock frequency specification#

The LPDDR clock may be provided either by the system PLL or by the secondary PLL.

Using the primary (system) PLL#

The LPDDR clock may be specified as a frequency which is derived from the system PLL. The system PLL operates at a multiple of 100MHz. The LPDDR clock is driven from the system PLL via a fixed divide-by-two followed by a programmable divider, so the overall division is always even. The LPDDR clock frequency is:

f_lpddr = f_syspll / div

where div is an even integer in the inclusive range 0x2 to 0x20000. When using the system PLL, the value of div is computed based on the LPDDR frequency specified in the XN file. The following target XN file excerpt shows the parameters required to provide a 100MHz LPDDR clock:

<Extmem SizeMbit="1024" Frequency="100MHz">

This is shown in context along with the drive strength specification:

<Packages>
  <Package id="0" Type="XS3-UnA-1024-FB265">
    <Nodes>
      <Node Id="0" InPackageId="0" Type="XS3-L16A-1024" Oscillator="24MHz" SystemFrequency="600MHz" ReferenceFrequency="100MHz">

        <Boot>
          <Source Location="bootFlash"/>
        </Boot>

        <Extmem SizeMbit="1024" Frequency="100MHz">

          <!-- Attributes for Padctrl and Lpddr XML elements are as per equivalently named 'Node Configuration' registers in datasheet -->
          <!--
            Padctrl attributes are applied to each named signal in the set below:
            [6] = Schmitt enable, [5] = Slew, [4:3] = drive strength, [2:1] = pull option, [0] = read enable

            Therefore:
            0x30: 8mA-drive, fast-slew output
            0x31: 8mA-drive, fast-slew bidir
          -->
          <Padctrl clk="0x30" cke="0x30" cs_n="0x30" we_n="0x30" cas_n="0x30" ras_n="0x30" addr="0x30" ba="0x30" dq="0x31" dqs="0x31" dm="0x30"/>

          <!--
            LPDDR emr_opcode attribute:
            emr_opcode[7:5] = LPDDR drive strength to xcore.ai

            0x20: Half drive strength
          -->
          <Lpddr emr_opcode="0x20"/>
        </Extmem>
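
The Padctrl bit layout given in the XN comment above can be sketched in C as follows. This is only an illustration for checking values such as 0x30 and 0x31 by hand: the type padctrl_fields_t and the functions padctrl_decode and padctrl_encode are hypothetical names, not part of the tools, and the meaning of each drive strength or pull option code should be taken from the datasheet.

```c
#include <stdint.h>

// Field layout as given in the XN comment:
// [6] Schmitt enable, [5] slew, [4:3] drive strength,
// [2:1] pull option, [0] read enable
typedef struct {
    unsigned schmitt_enable;  // 1 bit
    unsigned slew;            // 1 bit
    unsigned drive_strength;  // 2 bits
    unsigned pull_option;     // 2 bits
    unsigned read_enable;     // 1 bit
} padctrl_fields_t;           // hypothetical type, for illustration only

// Split a Padctrl attribute value into its fields.
padctrl_fields_t padctrl_decode(uint32_t value) {
    padctrl_fields_t f;
    f.schmitt_enable = (value >> 6) & 0x1u;
    f.slew           = (value >> 5) & 0x1u;
    f.drive_strength = (value >> 3) & 0x3u;
    f.pull_option    = (value >> 1) & 0x3u;
    f.read_enable    = value & 0x1u;
    return f;
}

// Pack fields back into a Padctrl attribute value.
uint32_t padctrl_encode(padctrl_fields_t f) {
    return ((f.schmitt_enable & 0x1u) << 6) |
           ((f.slew           & 0x1u) << 5) |
           ((f.drive_strength & 0x3u) << 3) |
           ((f.pull_option    & 0x3u) << 1) |
            (f.read_enable    & 0x1u);
}
```

Decoding 0x30 gives slew set, drive strength field 2 and read enable clear, which matches the "8mA-drive, fast-slew output" description. The value 0x31 differs only in having read enable set, as expected for the bidirectional dq and dqs signals.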

The drive strength specifications should be provided based on board design parameters. The above values are for the XMOS XK-EVK-XU316 board which should be used as a reference design when creating a new board.
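
The divider relationship for the system PLL described earlier in this subsection can be checked with a small helper such as the one below. It is a sketch only: the function name lpddr_div_from_syspll is hypothetical, and it simply rejects any requested frequency that does not correspond to an exact, even divider in the range 0x2 to 0x20000. How the tools themselves treat a frequency that does not divide exactly is not described here.

```c
#include <stdint.h>

// Returns div such that f_lpddr = f_syspll / div, where div is an even
// integer in the inclusive range 0x2 to 0x20000.
// Returns 0 if no such exact divider exists (hypothetical convention).
uint32_t lpddr_div_from_syspll(uint32_t f_syspll_hz, uint32_t f_lpddr_hz) {
    if (f_lpddr_hz == 0 || f_syspll_hz % f_lpddr_hz != 0) {
        return 0;  // not an exact integer division
    }
    uint32_t div = f_syspll_hz / f_lpddr_hz;
    if (div % 2 != 0 || div < 0x2u || div > 0x20000u) {
        return 0;  // odd or out of range
    }
    return div;
}
```

For example, assuming purely for illustration a system PLL output of 600MHz, a requested LPDDR clock of 100MHz gives div = 6, whereas 200MHz would need div = 3, which is odd and therefore rejected by this sketch.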

Using the secondary PLL#

The secondary PLL may be used as a source for the LPDDR clock. This overcomes limits in the available LPDDR frequencies which exist when using the primary (system) PLL. There are two sets of parameters required:

  1. The PLL configuration values required to obtain a specified PLL output frequency

  2. A division value (which must be an even integer in the range 0x2 to 0x20000) used to divide the PLL output frequency to obtain the LPDDR clock frequency

The PLL output frequency is given by:

f_out = (Oscillator x SecondaryPllFeedbackDiv/2) / (SecondaryPllInputDiv x SecondaryPllOutputDiv)

The following XN excerpt illustrates the specification of these parameters to provide a PLL output frequency of 332MHz and a 166MHz LPDDR clock:

<Node Id="0" InPackageId="0" Type="XS3-L16A-1024" Oscillator="24MHz"
 SystemFrequency="600MHz" ReferenceFrequency="100MHz"
 SecondaryPllInputDiv="1" SecondaryPllOutputDiv="3" SecondaryPllFeedbackDiv="83">

  <Extmem SizeMbit="1024" SourcePll="SecondaryPll" Divider="2">

The following shows this excerpt in context (the detailed comments shown in the setup using the primary PLL have been omitted for brevity):

<Node Id="0" InPackageId="0" Type="XS3-L16A-1024" Oscillator="24MHz"
 SystemFrequency="600MHz" ReferenceFrequency="100MHz"
 SecondaryPllInputDiv="1" SecondaryPllOutputDiv="3" SecondaryPllFeedbackDiv="83">

  <Extmem SizeMbit="1024" SourcePll="SecondaryPll" Divider="2">

    <!-- Attributes for Padctrl and Lpddr XML elements are as per equivalently named 'Node Configuration' registers in datasheet -->
    <Padctrl clk="0x30" cke="0x30" cs_n="0x30" we_n="0x30" cas_n="0x30" ras_n="0x30" addr="0x30" ba="0x30" dq="0x31" dqs="0x31" dm="0x30"/>

    <Lpddr emr_opcode="0x20"/>
  </Extmem>
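
The secondary PLL arithmetic can be worked through with the following sketch, which evaluates the f_out formula above and then applies the Extmem Divider. The function names secondary_pll_out_hz and lpddr_clock_hz are hypothetical; the parameters mirror the XN attributes SecondaryPllFeedbackDiv, SecondaryPllInputDiv, SecondaryPllOutputDiv and Divider.

```c
#include <stdint.h>

// f_out = (Oscillator x SecondaryPllFeedbackDiv / 2)
//         / (SecondaryPllInputDiv x SecondaryPllOutputDiv)
// Integer arithmetic in Hz; any remainder is truncated.
uint64_t secondary_pll_out_hz(uint64_t oscillator_hz,
                              uint32_t feedback_div,
                              uint32_t input_div,
                              uint32_t output_div) {
    uint64_t numerator = oscillator_hz * feedback_div / 2;
    uint64_t denominator = (uint64_t)input_div * output_div;
    return numerator / denominator;
}

// LPDDR clock obtained by dividing the PLL output by the Extmem Divider
// (an even integer in the range 0x2 to 0x20000).
uint64_t lpddr_clock_hz(uint64_t pll_out_hz, uint32_t divider) {
    return pll_out_hz / divider;
}
```

With the values from the excerpt (Oscillator 24MHz, SecondaryPllFeedbackDiv 83, SecondaryPllInputDiv 1, SecondaryPllOutputDiv 3), the formula gives 24MHz x 83 / 2 = 996MHz, divided by 3 to give 332MHz. A Divider of 2 then yields a 166MHz LPDDR clock.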

Level 1 cache#

A level 1 cache is situated between the xCORE tile and the LPDDR memory. This is a unified I and D cache, fully-associative, with write-back. It has 8 lines and the line size is 32 bytes. The replacement policy is pseudo-LRU (Least Recently Used).

xCORE instructions are provided to prefetch, invalidate and flush this cache.

It is not advisable to have more than two logical cores accessing the LPDDR because ‘cache-thrashing’ will occur (where data required by a logical core is repeatedly evicted by another logical core and must be re-loaded).
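
To give a feel for how small this cache is, the sketch below computes its total capacity (8 lines of 32 bytes, so 256 bytes) and counts how many lines a given address range touches. The names lines_touched and fits_in_cache are hypothetical, and the sketch assumes that cache lines are aligned to 32-byte boundaries, which is not stated above.

```c
#include <stddef.h>
#include <stdint.h>

// Cache geometry as described above: 8 lines of 32 bytes each.
enum {
    CACHE_LINES      = 8,
    CACHE_LINE_BYTES = 32,
    CACHE_BYTES      = CACHE_LINES * CACHE_LINE_BYTES  // 256 bytes in total
};

// Number of distinct cache lines covered by the range [addr, addr + len).
// Assumption: lines are aligned to CACHE_LINE_BYTES boundaries.
size_t lines_touched(uintptr_t addr, size_t len) {
    if (len == 0) {
        return 0;
    }
    uintptr_t first = addr / CACHE_LINE_BYTES;
    uintptr_t last  = (addr + len - 1) / CACHE_LINE_BYTES;
    return (size_t)(last - first + 1);
}

// Non-zero if the whole range could be resident in the cache at once.
int fits_in_cache(uintptr_t addr, size_t len) {
    return lines_touched(addr, len) <= CACHE_LINES;
}
```

Under that alignment assumption, a 256-byte buffer starting on a line boundary occupies all 8 lines, while the same buffer starting 16 bytes later touches 9 lines and can no longer be fully resident. This illustrates why the working sets of several logical cores quickly compete for the same lines.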