Data handling and memory safety

Extra data handling features

References

xC, like C++, provides references as a way of referring to data indirectly. For example, the following declarations create a reference x to the integer i:

int i = 5;
int &x = i;

Reading and writing a reference is the same as reading and writing to the original variable:

printf("The value of x is %d\n", x);
x = 7;
printf("x has been updated to %d\n", x);
printf("i has also been updated to %d\n", i);

References can also refer to array or structure elements:

int a[5] = {1,2,3,4,5};
int &y = a[0];

printf("y has value %d\n", y);

Function parameters can also be references. For example, the following function takes a reference and updates the value it refers to:

void f(int &x) {
  x = x + 1;
}

This function can be called with the value to refer to as an argument:

void pass_by_reference_example() {
  int i = 5;
  printf("Value of i is %d\n", i);
  f(i);
  printf("Value of i is %d\n", i);
}
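
For readers coming from plain C, a reference parameter behaves much like a pointer parameter that is dereferenced automatically. The following is a plain C sketch of that analogy, not xC; the names increment and pass_by_pointer_demo are made up for illustration.

```c
#include <assert.h>

/* Plain C analogue of the xC function f(int &x): the caller passes an
   address explicitly and the callee dereferences it explicitly. */
static void increment(int *x) {
  *x = *x + 1;
}

/* Mirrors pass_by_reference_example from the text and returns the
   updated value of the local variable. */
static int pass_by_pointer_demo(void) {
  int i = 5;
  increment(&i);   /* in xC this would simply be f(i) */
  assert(i == 6);
  return i;
}
```

The difference in xC is that the call site and the function body need no & or * operators.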

References can be passed between tasks as interface function arguments. For example, the function update in the following interface can alter the variable provided as an argument:

interface if1 {
  void update(int &x);
};

The function can be called as follows:

void task(client interface if1 i) {

   ...
   i.update(y);  // this may change the value of y

}

Just as with passing arrays over interface calls, updating a reference works even when the communicating tasks are on different tiles.

Nullable types

Resources in xC, such as interfaces, chanends, ports and clocks, must always have a valid value. The nullable qualifier allows a variable of these types to hold the special value null, indicating no value. This is similar to optional types in some programming languages.

The nullable qualifier is a ? symbol. So the following declaration is a nullable port:

port ?p;

Given a variable of nullable type, the program can check whether it is null using the isnull built-in function, for example:

if (!isnull(p))  {
   // We know p is not null so can use it here
   ...
}

This facility is particularly useful for optional function parameters, for example:

// function that takes a port and optionally a second port
void f(port p, port ?q);

References can also be declared as nullable. Since the nullable qualifier applies to the reference it needs to appear to the right of the reference symbol, for example:

// Function that takes an optional integer 'y' to update
void f(int x, int &?y);

Finally, arrays can also be declared nullable. In this case the declaration needs to be explicit that the parameter is a reference to an array, for example:

// Function that takes an optional integer array 'a'
void f(int (&?a)[5]);
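
As a rough point of comparison, an optional reference parameter can be modelled in plain C as a pointer that is allowed to be NULL. The sketch below is C, not xC, and the function names and the choice to add x to the optional value are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Plain C analogue of the xC declaration void f(int x, int &?y):
   the optional argument is modelled as a pointer that may be NULL. */
static void add_if_present(int x, int *y) {
  if (y != NULL) {   /* plays the role of !isnull(y) in xC */
    *y += x;
  }
}

static void optional_argument_demo(void) {
  int total = 10;
  add_if_present(5, &total);  /* optional argument supplied */
  assert(total == 15);
  add_if_present(5, NULL);    /* optional argument omitted */
  assert(total == 15);
}
```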

Variable length arrays

In xC, arrays must be declared with a constant size. The exception is a local array, whose size can depend on a function parameter provided that the parameter is marked both static and const:

void f(static const int n)
{
  printf("Array length = %d\n", n);

  int arr[n];
  for (int i = 0; i < n; i++) {
    arr[i] = i;
    for (int j = 0; j < i; j++) {
      arr[i] += arr[j];
    }
  }

  printf("-------\n");
  for (int i = 0; i < n; i++) {
    printf("Element %d of arr is %d\n", i, arr[i]);
  }
  printf("-------\n\n");
}

When calling functions with static parameters, the argument has to be either:

  • a constant expression
  • a static const parameter to the caller function

For example:

void g(static const int n)
{
  // static parameter can be called with a constant expression argument
  f(2);
  // or passing on a static const parameter
  f(n);
}

These restrictions mean that the compiler can still statically track stack usage despite the local array having variable size.

Multiple returns

In xC, functions can return multiple values. For example, the following function returns two values:

{int, int} swap(int a, int b) {
   return {b, a};
}

When calling the function, multiple values can be assigned at once:

int x = 5, y = 7;
{x, y} = swap(x, y);
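
For comparison, plain C has no multiple-return syntax, so the same effect is usually obtained by returning a struct. The following C sketch shows that equivalent; struct int_pair and swap_pair are hypothetical names introduced here.

```c
#include <assert.h>

/* A struct stands in for the xC return type {int, int}. */
struct int_pair {
  int first;
  int second;
};

static struct int_pair swap_pair(int a, int b) {
  struct int_pair result = { b, a };
  return result;
}

static void swap_demo(void) {
  int x = 5, y = 7;
  struct int_pair r = swap_pair(x, y);
  x = r.first;    /* xC: {x, y} = swap(x, y); */
  y = r.second;
  assert(x == 7 && y == 5);
}
```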

Memory safety

In C and xC you have several ways of accessing memory (see the figure below).

[Figure: Different ways to access memory]

In C, memory might be allocated via variable declarations or malloc. Only access to these allocated regions is allowed; if the program tries to access memory outside these regions, the behavior is undefined.

In xC, there is also the notion of parallel tasks and the language has an additional restriction: no task is allowed to access memory that another task owns. The task that owns memory has permission to write to it. Memory ownership can transfer between tasks at well defined points in the program.

If a task accesses memory it is not supposed to, it is a program error and can cause destructive, hard to trace bugs. xC helps by adding checks to catch invalid memory access early (e.g. at compile time or early on during program execution) to eliminate these bugs. For example, all the bugs in the figure below will be detected.

[Figure: Common invalid memory operations]

To be able to do all these checks and still use C-style pointers, xC requires extra annotations on pointer types. These annotations help ensure memory safety - see Pointers.

Runtime exceptions

The xC compiler will try to spot memory errors at compile time and report them during compilation. For example, given the following code:

void f()
{
  int a[5];
  a[7] = 10;
}

The compiler will fail to compile the code with the following error:

bad.xc: In function `f':
bad.xc:4: error: index of array exceeds its upper bound

However, sometimes it is unknown at compile time whether a memory error occurs, for example:

void f(int a[]) {
  a[7] = 10;
}

In this case the compiler will insert a runtime check that the memory operation is safe. If the operation is not safe then an exception is raised. The exception will cause the program to halt at the point of the error (rather than trashing memory and failing in some hard to trace way later on). If the debugger is connected then it will report that an exception has occurred and where it has occurred (if the program is compiled with debug information on).
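
Conceptually, an inserted check compares the index against the known bound before the access takes place. The plain C sketch below models that idea under stated assumptions: checked_store is a hypothetical helper, the bound is passed explicitly, and a false return stands in for the point where xC would raise an exception. It is not the code the xC compiler actually generates.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: store value at a[index] only if index lies
   inside an array of `length` elements. Returns false where xC would
   raise an exception instead of writing out of bounds. */
static bool checked_store(int *a, size_t length, size_t index, int value) {
  if (index >= length) {
    return false;
  }
  a[index] = value;
  return true;
}

static void checked_store_demo(void) {
  int a[5] = {0};
  assert(!checked_store(a, 5, 7, 10));  /* a[7] is rejected */
  assert(checked_store(a, 5, 4, 10));   /* a[4] is in bounds */
  assert(a[4] == 10);
}
```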

For production release of an application, it is possible to install a general exception handler in the program that, for example, reboots the device on failure. Of course, by this time, no errors should actually occur in the program.

Bounds checking

The compiler keeps track of the bounds of all arrays and pointers (with the exception of unsafe pointers - see Pointers). Accessing outside of these bounds will cause a compiler error or runtime exception.

If the compiler cannot determine that usage is safe at compile time, it will insert runtime checks. However, these checks take time to execute. For arrays, the checks can often be eliminated by indicating to the compiler that the bound of an array is related to another variable using the following syntax:

// Function takes an array of size n
void f(int a[n], unsigned n) {
  for (int i = 0; i < n; i++)
    a[i] = i;
}

In this case the code needs no bounds checks since the compiler can infer that all accesses are within the bounds of the array (given by the variable n).

The compiler will still need to check that the bound is correct when the function is called. For example, the following would cause an error:

void f(int a[n], unsigned n);

void g() {
  int a[5];
  f(a, 8);  // error - bound does not match
}

Parallel usage checks

At compile time, the tools check for parallel usage violations, i.e. that no task accesses memory that another task owns. The tools determine ownership by detecting which tasks write to a variable. For example, if you tried to compile the following program:

#include <stdio.h>

int g = 7;

void task1() { g = 7; }

void task2() { printf("%d",g);}

int main() {
  par {
    task1();
    task2();
  }
  return 0;
}

Then the compiler would return the following error:

par.xc:10: error: use of `g' violates parallel usage rules
par.xc:7: error: previously used here
par.xc:5: error: previously used here

If data is only read, two tasks can access it. So the following would be valid:

#include <stdio.h>

const int g = 7;

void task1() { printf("%d", g+2); }

void task2() { printf("%d", g);}

int main() {
  par {
    task1();
    task2();
  }
  return 0;
}

Pointers

Pointers are very powerful programming devices but it is also very easy to end up with a pointer performing an invalid memory access. The bounds checking stops pointers performing an invalid access via pointer arithmetic. However, pointers could still point to de-allocated memory and invalid parallel usage access could happen indirectly via pointers. xC detects these invalid uses and causes either a compile-time or run-time error.

To do this, every pointer needs to be assigned a kind. There are four kinds of pointer: restricted, aliasing, movable and unsafe. Any pointer declaration can describe the pointer kind with the following syntax:

pointee-type * pointer-kind pointer-variable

For example, the following declaration is a movable pointer to int named p:

int * movable p;

If no pointer kind is described in the declaration, a default is assumed. The default depends on whether the declaration is a global variable, parameter or local variable. The following table shows the defaults.

Declaration location    Default
Global variable         Restricted
Parameter               Restricted
Local variable          Aliasing
Function returns        No default - must be explicitly declared

Aliasing

To keep track of pointers that point to de-allocated memory or that may cause an invalid parallel access of memory, the compiler must be able to track pointer aliasing. Aliasing happens when two program elements refer to the same region of memory. The figure below shows an example of aliasing.

[Figure: Three program elements referring to the same memory region]

Restricted Pointers

In both C and xC there is the concept of a restricted pointer. This is a pointer that cannot alias i.e. the only way to access that memory location is via that pointer. In C, the compiler can assume that access via a restricted pointer is non-aliased but does not perform any checks to make sure this is the case. In xC, extra checks ensure that the non-aliasing restriction is true.

The first check in xC is that, given a restricted pointer to an object, the program cannot access the memory via the original variable:

int i = 5;
int * restrict p = &i;

printf("%d", i);  // this is an error

Function parameters default to restricted so the following would also be invalid:

int i = 5;

// The function argument defaults to a restricted pointer
void f(int *p) {
  i = 7; // this is an error due to the call below
}

void g() {
  f(&i);
}

The second check on restricted pointers that xC makes is that the pointer cannot be re-assigned or copied:

int i = 5, j = 7;
int * restrict p = &i;
int * restrict q;

p = &j;   // invalid - cannot reassign to a restricted pointer
q = p;    // invalid - cannot copy a restricted pointer

These checks ensure that a restricted pointer always points to a tracked non-aliased location. It also cannot point to de-allocated memory.

Since pointer function parameters are restricted by default, the compiler also checks that no aliases are created at the point of a function call:

// Function that takes two restricted pointers
void f(int *p, int *q);

void g() {
  int i;
  f(&i, &i);  // this is invalid since the arguments alias
}

Since restricted pointers cannot be copied, function return types cannot be of restricted type.

Pointers that are allowed to alias

Restricted pointers are quite limited in their use. It is often very convenient to have pointers that alias. The alias pointer kind allows this but with different usage rules to restricted pointers.

Local pointers default to aliasing so the following code is valid:

int f() {
   int i = 5, j = 7;
   int *p = &i;  // this is an aliasing pointer
   int *q = &j;  // so is this

   p = q;  // aliasing pointers can be reassigned and copied

   return i + *p + *q;
}

To keep track of the aliases made by aliasing pointers the following restrictions apply:

  • You cannot pass alias pointers to different tasks in a par
  • You cannot have indirect access to an alias pointer (e.g. a pointer to an alias pointer)

If a function takes pointer parameters or returns a pointer that may alias, it needs to be explicitly written into the type of the function. For example the following function’s return value may alias its argument:

char * alias strchr(const char * alias haystack, int needle);

Global pointers can be accessed anywhere in the program so aliasing cannot be easily tracked. Accordingly, in xC, global pointers cannot be aliasing. They default to restricted but may be marked as unsafe or movable.

Function parameters

When passing pointers to functions, there are some special rules to allow conversion between pointer kinds. Firstly, a restricted pointer can be passed to a function taking an alias pointer argument:

void f(int * alias x);

void g(int *y) {
   f(y);   // y is restricted but can be passed as an alias
}

In fact, within a function, a restricted pointer is treated like an aliasing pointer:

void g(int *y)  // 'y' is a restricted pointer
{
  // within the function 'y' acts like an aliasing pointer
  int *p = y;
  y = p;
  ...
}

An aliasing pointer can be passed to a function taking a restricted pointer argument if it does not alias any of the other arguments:

void f(int *x, int *y) { *x;}

void g() {
  int i = 5, j = 8;
  int *p = &i;
  int *q = &j;
  f(p, q);     // valid since p does not alias q
}

This only works if the source of the aliasing pointer is local to the function. If the aliasing pointer is assigned to a value that is from an incoming argument or global variable, an additional restriction is made on the function being called - it cannot access any global variables (since these may alias the pointer being passed). For example, the following is invalid:

int i = 5;

int f(int *q) {
  return *q + i; // compiler assumes 'q' does not alias 'i'
}

void g() {
  int *p = &i;
  f(p);        // invalid since f accesses a global and 'p' has
               // non-local scope
}

This code will fail at compile time with the error:

p.xc:10: error: passing non-local alias to function `f' which accesses a global variable

Transferring ownership (movable pointers)

It is useful to transfer the ownership of a pointer between parts of the program. For example:

  • Transfer ownership to a global variable to be used at a later time
  • Transfer ownership between tasks running in parallel

In these cases, the pointer must still have no aliases, to avoid race conditions and dangling pointers, but a restricted pointer cannot be used because it cannot be reassigned or copied.

Movable pointers provide a solution. These pointers can be transferred but only in a way that means they retain non-aliasing properties and can never refer to de-allocated memory.

A movable pointer is declared using the movable type qualifier:

int i = 5;
int * movable p = &i;

Just like with restricted pointers, the program cannot use i in this case since that would break the non-aliasing property of the pointer.

Movable pointer values can be transferred using the move operator:

int * movable q;

q = move(p);

The move operator sets the source pointer to null. This ensures that only one variable has ownership of the memory location at a time.
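
The effect of the move operator on the source pointer can be modelled in plain C as a helper that hands the pointer value out and clears the source in one step. This is a C sketch of the described behavior only; move_int_ptr is a hypothetical name, and unlike xC, nothing here stops a C program from copying the pointer some other way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper modelling xC's move(p) for int pointers:
   returns the current value of *src and leaves *src set to NULL. */
static int *move_int_ptr(int **src) {
  int *moved = *src;
  *src = NULL;
  return moved;
}

static void move_demo(void) {
  int i = 5;
  int *p = &i;
  int *q = move_int_ptr(&p);  /* xC: q = move(p); */
  assert(p == NULL);          /* the source no longer owns the memory */
  assert(q == &i && *q == 5); /* the destination is now the only owner */
}
```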

The move operator also has to be used when passing a movable pointer into a function or returning a movable pointer:

int * movable global_p;

void f(int * movable p) { global_p = move(p); }

int * movable g(void) {
  return move(global_p); // need to use the move operator here
}

void h(void) {
  int i = 5;
  int * movable p = &i;
  f(move(p));  // need to use the move operator here
  p = g();
}

Movable pointers cannot refer to de-allocated memory. To ensure this the following restriction applies:

A movable pointer must point to the same region it was initialized with when it goes out of scope.

A runtime check is inserted to ensure this (so an exception can happen when the pointer goes out of scope). For example, the following is invalid:

int* movable global_p;

void f() {
   int i = 5;
   int * movable p = &i;
   global_p = move(p);

 }  // <-- at this point an exception occurs since 'p'
    //     does not point to the region it was
    //     initialized to

This avoids global_p pointing to de-allocated memory.
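
One way to picture the scope-exit rule is to imagine each movable pointer remembering the address it was initialized with, and comparing against it when the pointer goes out of scope. The plain C model below illustrates that idea; struct movable_int and its helper functions are assumptions made for this sketch and say nothing about how the xC tools implement the check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a movable pointer: the current value plus the
   address it was initialized with. */
struct movable_int {
  int *current;
  int *initial;
};

static struct movable_int movable_init(int *target) {
  struct movable_int m = { target, target };
  return m;
}

/* Models move(): hands the pointer out and nulls the source. */
static int *movable_take(struct movable_int *m) {
  int *taken = m->current;
  m->current = NULL;
  return taken;
}

/* Models the check made when the pointer goes out of scope: true when
   it points at its initial region again. xC would raise an exception
   where this returns false. */
static bool movable_scope_exit_ok(const struct movable_int *m) {
  return m->current == m->initial;
}

static void scope_exit_demo(void) {
  int i = 5;
  struct movable_int p = movable_init(&i);
  int *global_p = movable_take(&p);    /* xC: global_p = move(p); */
  assert(!movable_scope_exit_ok(&p));  /* leaving scope now would fail */
  p.current = global_p;                /* ownership handed back */
  assert(movable_scope_exit_ok(&p));
}
```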

Transferring pointers between parallel tasks

Pointers can be passed as interface function arguments, for example:

interface if1 {
  void f(int * alias p);
};

The tasks share the pointer for the duration of the transaction, for example:

void f(server interface if1 i) {
  select {
    case i.f(int * alias p):
      printf("%d", *(p+2));
      break;
  }
}

When an interface is declared containing functions with pointer arguments it cannot be used across tiles (since tiles have separate memory spaces).

Restricted and alias pointers can only be used for the duration of the transaction. For example, the following is invalid:

void f(server interface if1 i) {
  int * alias q;
  select {
    case i.f(int * alias p):
      q = p; // invalid, cannot move the alias to a larger scope
      break;
  }
  printf("%d", *q);
}

To transfer a pointer beyond the scope of the transaction, movable pointers should be used, e.g.:

interface if2 {
  void f(int * movable p);
};

void f(server interface if2 i) {
  int * movable q;
  select {
    case i.f(int * movable p):
      q = move(p); // ok, ownership is transferred
      break;
  }
  printf("%d", *q);
}

This way, one task must relinquish ownership of the memory region at a well defined point for the other task to use it (so no accidental race conditions can occur).

Unsafe pointers

An unsafe pointer type is provided for compatibility with C and to implement dynamic, aliasing data structures (for example linked lists). This is not the default pointer type and the onus is on the programmer to ensure memory safety for these types.

An unsafe pointer is opaque unless accessed in an unsafe region. A function can be marked as unsafe to show that its body is an unsafe region:

unsafe void f(int * unsafe x) {
  // We can dereference x in here,
  // but be careful - it may point to garbage
  printintln(*x);
}

Unsafe functions can only be called from unsafe regions. You can make a local unsafe region by marking a compound statement as unsafe:

void g(int * unsafe p) {
  int i = 99;
  unsafe {
    p = &i;
    f(p);
  }
  // Cannot dereference p or call f from here
}

These regions allow the programmer to manage the parts of their program that are safe by construction and the parts that require the programmer to ensure safety.

Within unsafe regions, unsafe pointers can be explicitly cast to safe pointers - providing a contract from the programmer that the pointer can be regarded as safe from then on.

It is undefined behavior for an unsafe pointer to be written from one task and read from another.
