Unsafe Code

Summary

Introduction

In safe C# code it is simply not possible to have an uninitialized variable, a dangling pointer, or an expression that indexes an array beyond its bounds. Whole categories of bugs that routinely plague C and C++ programs are thereby eliminated. The C# language differs notably from C and C++ in its omission of pointers as a data type. C# instead uses references and objects whose lifetime is managed by the garbage collector.

While practically every pointer type construct in C/C++ has a reference type counterpart in C#, the use of pointers cannot be fully eliminated. For example, interfacing with the underlying operating system, using memory-mapped files, or implementing a time-critical algorithm may not be possible or practical without pointers.

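Writing unsafe code is much like writing C/C++ within a C# program. In unsafe code it is possible to declare and operate on pointers, to convert between pointers and integral types, and to take the address of variables.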

Unsafe code must be clearly marked with the unsafe modifier so that developers cannot use unsafe features accidentally.

Unsafe Contexts

Unsafe features of C# are available only in unsafe contexts. An unsafe context is created by including the unsafe modifier in the declaration of a type or member, or by using an unsafe statement.

For example, the following two declarations are equivalent:

public unsafe struct Node
{
    public int     Value;
    public Node*   Left;        // unsafe
    public Node*   Right;       // unsafe
}

public struct Node
{
    public int     Value;
    public unsafe Node*   Left;
    public unsafe Node*   Right;
}

And the following two declarations are also equivalent:

public class A
{
    public unsafe void foo( char* pString ) { }   // only this member is an unsafe context
}

public unsafe class A
{
    public void foo( char* pString ) { }          // the whole class is an unsafe context
}
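
The third way to create an unsafe context, the unsafe statement, limits the unsafe region to a single block inside an otherwise safe method. The following is a minimal sketch, assuming the project is compiled with unsafe code enabled (the /unsafe compiler option). The class and method names are hypothetical and chosen only for illustration:

```csharp
public static class UnsafeStatementDemo
{
    // The method itself is safe; only the block below is an unsafe context.
    public static int DoubleViaPointer(int value)
    {
        unsafe
        {
            int* pValue = &value;     // a value parameter is a fixed variable
            *pValue = *pValue * 2;    // write through the pointer
        }
        return value;                 // pointers are not usable out here
    }
}
```

Calling UnsafeStatementDemo.DoubleViaPointer(21) returns 42. Keeping the unsafe block this small makes it easy to see exactly which lines depend on pointer semantics.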

Pointer Types

In an unsafe context, there are three kinds of types:

  1. value-type
  2. reference-type
  3. pointer-type

Where a pointer-type is either an unmanaged type followed by * (for example int* or char*), or void*.

For example, the following declares two pointers to integers:

int* pNum1, pNum2;    // pNum1 and pNum2 are both int*
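
Unlike in C and C++, the * belongs to the type rather than to each declarator, which is why both variables above are pointers. The sketch below shows a few more pointer types in use, assuming compilation with unsafe code enabled; the class and method names are hypothetical:

```csharp
public static class PointerTypeDemo
{
    public static unsafe int ReadThroughPointers()
    {
        int number = 7;
        int* pNum1 = &number, pNum2 = &number;  // both are int*
        int** ppNum = &pNum1;                   // pointer to a pointer to int
        void* pAny = pNum2;                     // any pointer converts implicitly to void*
        int* pBack = (int*)pAny;                // converting back needs an explicit cast

        // void* cannot be dereferenced, so the value is read through pBack instead.
        return *pNum1 + **ppNum + *pBack;       // 7 + 7 + 7
    }
}
```

Calling PointerTypeDemo.ReadThroughPointers() returns 21, because all three pointers ultimately refer to the same local variable.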

Note the following issues with pointers in an unsafe context:

Fixed and Moveable Variables

Fixed variables are those that exist in a location unaffected by the operation of the garbage collector. Examples of fixed variables include local variables, value parameters, and variables created by dereferencing pointers.

Moveable variables, on the other hand, reside in locations that can be relocated or disposed of by the garbage collector. Examples of moveable variables include fields in objects and elements of arrays.

For a fixed variable, the & (address-of) operator can be applied directly to obtain the address of the variable. For a moveable variable, the address can be obtained only with a fixed statement, and it remains valid only for the duration of that statement.
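
The difference between the two kinds of variable can be sketched as follows, assuming compilation with unsafe code enabled. The Counter class and the method name are hypothetical, introduced only for this illustration:

```csharp
public static class AddressOfDemo
{
    private sealed class Counter
    {
        public int Count;
    }

    public static unsafe int Increment()
    {
        int local = 1;
        int* pLocal = &local;           // local variable: fixed, so & applies directly
        (*pLocal)++;                    // local is now 2

        Counter counter = new Counter();
        // int* pBad = &counter.Count;  // compile-time error: a field of an object is moveable

        fixed (int* pCount = &counter.Count)   // pinned for the duration of the statement
        {
            *pCount = *pLocal + 10;     // 2 + 10
        }
        return counter.Count;
    }
}
```

Calling AddressOfDemo.Increment() returns 12. The commented-out line is the form the compiler rejects.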

Pointers in Expressions

Pointer Indirection

The unary * operator denotes pointer indirection and is used to obtain the underlying variable pointed to by the pointer:

public void foo()
{
    unsafe
    {
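        long l = 10;
        long *pL = &l;      // pL holds the address of l
        long copy = *pL;    // *pL reads the variable pointed to: copy is 10
        *pL = 20;           // *pL can also be assigned to: l is now 20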
    }
}

It is a compile-time error to apply the unary * operator to an expression of type void* or to an expression that is not of a pointer type.

Pointer member access

Members of a pointer type are accessed using the P->I notation where P is an expression of a pointer type and I must denote an accessible member of the type pointed to by P. Note that the P->I notation is exactly equivalent to (*P).I. For example:

public struct Point
{
    public int x;
    public int y;
}

public class Test
{
    public unsafe void TestPoint()
    {
        Point     Pt;
        Point*    pPt = &Pt;    // pPt must point at a variable before it is used
        pPt->x = 10;        // same as (*pPt).x
        pPt->y = 20;        // same as (*pPt).y
    }
}

Pointer element access

Elements of a pointer type are accessed using the P[E] notation where P is an expression of a pointer type other than void*, and E must be an expression of a type that can be implicitly converted to int, uint, long, or ulong. A pointer element access of the form P[E] is exactly equivalent to *(P+E).

public void foo2()
{
    unsafe
    {
        char* pChar = stackalloc char[256];
        for (int i = 0; i < 256; i++)      // visit all 256 allocated elements
            pChar[i] = (char)i;            // same as: *(pChar + i) = (char)i;
    }
}

Address-Of operator

Given an expression E that is of type T and is classified as a fixed variable, the construct &E computes the address of the variable given by E. The result has type T* and is classified as a value. Applying &E to a moveable variable outside a fixed statement produces a compile-time error.

Pointer Arithmetic

Given an expression E of a pointer type T*, E++ advances the address held in E by sizeof(T) bytes.

Given an expression E of a pointer type T*, E+1 yields the address in E plus sizeof(T) bytes.

Given an expression E of a pointer type T*, E+N yields the address in E plus N * sizeof(T) bytes.

public void foo()
{
    unsafe
    {
        int l   = 10;         // sizeof(int) is 4
        int *pL = &l;         // 0x0012ed08
        pL++;                 // 0x0012ed08 + 4 = 0x0012ed0c
        pL++;                 // 0x0012ed0c + 4 = 0x0012ed10
        pL = pL + 1;          // 0x0012ed10 + 4 = 0x0012ed14
        pL = pL + 2;          // 0x0012ed14 + 8 = 0x0012ed1c
    }
}

Given an expression E1 of a pointer type T1* and an expression E2 of a pointer type T2*, the comparison operators (==, !=, <, >, <=, >=) compare the addresses given by E1 and E2 as if the addresses were unsigned integers.
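
Pointer arithmetic and pointer comparison are often combined to walk a block of memory. The following sketch sums an array by advancing a pointer until it reaches the address one past the last element. It assumes compilation with unsafe code enabled and a non-null array argument; the class and method names are hypothetical:

```csharp
public static class PointerCompareDemo
{
    public static unsafe int Sum(int[] values)
    {
        int total = 0;
        fixed (int* pStart = values)                // pin the array while pointers are in use
        {
            int* pEnd = pStart + values.Length;     // address one past the last element
            for (int* p = pStart; p < pEnd; p++)    // p++ moves forward by sizeof(int) bytes
            {
                total += *p;
            }
        }
        return total;
    }
}
```

Calling PointerCompareDemo.Sum(new int[] { 1, 2, 3 }) returns 6. The loop condition p < pEnd is a plain address comparison, so no index variable is needed.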

sizeof operator

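Recall that the sizeof operator returns the number of bytes occupied by a variable of a given type. Note the following sizes for some of the unmanaged types: sbyte, byte and bool occupy 1 byte; short, ushort and char occupy 2 bytes; int, uint and float occupy 4 bytes; long, ulong and double occupy 8 bytes; decimal occupies 16 bytes.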
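
The sizes can also be checked directly in code. In the sketch below, sizeof applied to a built-in type is a compile-time constant, while sizeof applied to a user-defined struct requires an unsafe context and may include padding chosen by the runtime. The Pair struct and the class and method names are hypothetical, and compilation with unsafe code enabled is assumed:

```csharp
public static class SizeOfDemo
{
    public struct Pair
    {
        public int First;
        public long Second;
    }

    // Sizes of built-in types are fixed by the language.
    public static string Describe()
    {
        return sizeof(byte) + "," + sizeof(char) + "," + sizeof(int) + ","
             + sizeof(long) + "," + sizeof(double);
    }

    // The size of a user-defined struct depends on its layout, so no exact value is assumed here.
    public static unsafe int PairSize()
    {
        return sizeof(Pair);
    }
}
```

Calling SizeOfDemo.Describe() returns "1,2,4,8,8". The result of PairSize() is at least 12, the sum of its field sizes, and is typically larger because of alignment padding.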

The fixed statement

The fixed statement, which is available only in unsafe contexts, prevents the garbage collector from relocating a moveable variable. It takes the form of:

fixed( T* ptr = expression ) statement

Where T refers to an unmanaged type or void, expression is one that can be implicitly converted to T*, and statement is an executable statement or block.

Without fixed, a pointer to a moveable variable would be of little use because the garbage collector could relocate the variable unpredictably. Therefore, it is a compile-time error to obtain the address of a moveable variable without using the fixed statement. It is also an error to declare a pointer to a managed type at all. For example:

public class SomePoint
{
    public int x;
    public int y;
}

public class Test
{
    public void FixVariables()
    {
        Primer.SomePoint pt = new Primer.SomePoint(); // managed type

        // Now do some pointer operations on the managed pt object. Note that you can use
        // fixed only within an unsafe context

        unsafe
        {
            // Error if uncommented. Cannot take the address or size of a variable of a managed type ('Primer.SomePoint')
            // fixed( SomePoint* pPt = &pt )
            //     pPt->x = 10;

            fixed( int* pX = &(pt.x), pY = &(pt.y))
            {
                *pX = 100;
                *pY = 10;
            }
        }
    }
}

public void FixVar2()
{
    string strName = "Fixing Variables";
    unsafe
    {
        fixed( char* pStr = strName )
        {
            for ( int i = 0; pStr[i] != '\0'; i++)
                System.Diagnostics.Trace.Write( pStr[i] );
        }
    }
}

In the example above, pX and pY are considered local variables within the fixed block. Such local variables declared by the fixed statement are considered read-only and cannot be assigned to or passed as a ref or out parameter.

It is the programmer's responsibility to ensure that pointers created by the fixed statement do not survive beyond the execution of that statement. For example, if a pointer created by the fixed statement is passed to external APIs, then it is the programmer's responsibility to ensure that the APIs do not retain the pointer after the call returns.

Fixed objects may cause fragmentation of the heap because they cannot be moved by the garbage collector. Therefore, use fixed objects only when absolutely necessary.
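
The rules above can be sketched with a pinned array. The pointer declared by fixed is read-only, so it is copied into an ordinary local pointer before being advanced, and neither pointer is visible once the fixed block ends. The class and method names are hypothetical, and compilation with unsafe code enabled is assumed:

```csharp
public static class PinnedBufferDemo
{
    public static unsafe byte[] FillPattern(int length, byte seed)
    {
        byte[] buffer = new byte[length];
        fixed (byte* pBuffer = buffer)      // buffer is pinned for this block only
        {
            // pBuffer++ would not compile: pointers declared by fixed are read-only.
            byte* p = pBuffer;              // copy into a local pointer that can move
            for (int i = 0; i < length; i++)
            {
                *p++ = (byte)(seed + i);    // write, then step to the next byte
            }
        }
        // Both pointers are out of scope here, so neither can outlive the pin.
        return buffer;
    }
}
```

Calling PinnedBufferDemo.FillPattern(3, 10) returns an array containing 10, 11 and 12. Keeping the fixed block short limits how long the array stays pinned.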

stackalloc

stackalloc is used to allocate a block of memory on the stack. stackalloc is only valid for local variables in unsafe contexts. stackalloc takes the following form:

T* p = stackalloc T[integral-expression]

Where T is an unmanaged type and integral-expression gives the number of items to allocate. For example:

public unsafe void AllocateMemoryOnTheStack()
{
    int* pNumbers = stackalloc int[3];    // type of items is int, and there will be 3 items. Content is uninitialized
    pNumbers[0] = 1;
    pNumbers[1] = 2;
    pNumbers[2] = 3;
}

stackalloc T[E] allocates E * sizeof(T) bytes from the call stack and returns a pointer of type T* to the newly allocated block. If there is not enough memory available, a StackOverflowException is thrown.
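
A typical use of stackalloc is a small scratch buffer of known size that is needed only for the duration of one call. The sketch below builds the 32-character binary representation of a number in such a buffer. The class and method names are hypothetical, and compilation with unsafe code enabled is assumed:

```csharp
public static class StackAllocDemo
{
    public static unsafe string ToBinary(uint value)
    {
        const int Bits = 32;
        char* pDigits = stackalloc char[Bits];   // 32 * sizeof(char) = 64 bytes on the stack

        for (int i = 0; i < Bits; i++)
        {
            // Bit i is written from the right, so the most significant bit comes first.
            pDigits[Bits - 1 - i] = (value & (1u << i)) != 0 ? '1' : '0';
        }

        // Copy the characters into a managed string before the stack block goes away.
        return new string(pDigits, 0, Bits);
    }
}
```

Calling StackAllocDemo.ToBinary(5) returns a 32-character string that ends in "101". The block is released automatically when the method returns, so there is nothing to free, but for the same reason the pointer must not be stored anywhere that outlives the call.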