Construct an array object using a sequence of bytes

The article “How are arrays in .NET laid out in memory?” introduced the memory layout of array objects in .NET. Now that we know this layout, we can naturally create a byte sequence that represents an array object according to its rules, just as “Draw an object in memory in pure binary form” constructed an ordinary object and “Do you know how .NET strings are stored in memory?” constructed a string object.

1. Array type layout
2. Construct an array using a byte array
3. Construct an array using unmanaged local memory
4. Performance test

1. Array type layout

Let’s briefly review the memory layout of array objects. As shown in the figure below, on 32-bit (x86) systems the Object Header and the TypeHandle each occupy 4 bytes. On 64-bit (x64) systems the TypeHandle, which stores the method table pointer, naturally grows to 8 bytes, while the Object Header is still 4 bytes. To keep the TypeHandle aligned on an 8-byte boundary, 4 bytes of padding are placed in front of the Object Header.

[Figure: memory layout of an array object on x86 and x64]

The payload uses the following layout: the first 4 bytes store the length of the array as a UInt32, followed by the contents of the array elements in sequence. On 64-bit (x64) systems, 4 bytes of padding sit between the length and the first element to keep the elements aligned.
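The layout described above can be checked against a real array. The following is a minimal sketch, not code from the original article: the names ArrayLayout, ByteCount and ReadHeader are hypothetical, and it assumes a runtime where typeof(int[]).TypeHandle.Value is the method table address, as the rest of this article does. It pins an int[] and reads the length and TypeHandle slots that sit directly in front of the first element. It must be compiled with unsafe code enabled.

```csharp
using System;

static class ArrayLayout
{
    // Total bytes under the layout above: the header (+ padding), the TypeHandle
    // and the length (+ padding) each take one pointer-sized slot.
    public static int ByteCount(int elementSize, int length, int pointerSize)
        => pointerSize * 3 + elementSize * length;

    // Reads the TypeHandle and length slots of a non-empty int[] by stepping
    // backwards from its first element.
    public static unsafe (nint TypeHandle, int Length) ReadHeader(int[] array)
    {
        fixed (int* first = array) // pin so the GC cannot move the array while we read
        {
            byte* p = (byte*)first;
            nint typeHandle = *(nint*)(p - 2 * IntPtr.Size); // slot before the length
            int length = *(int*)(p - IntPtr.Size);           // slot before the elements
            return (typeHandle, length);
        }
    }
}

static class Program
{
    static void Main()
    {
        var (typeHandle, length) = ArrayLayout.ReadHeader(new int[5]);
        Console.WriteLine(length);
        // Expected to be True under the assumption stated above.
        Console.WriteLine(typeHandle == typeof(int[]).TypeHandle.Value);
        // An int[100] on x64: 3 * 8 + 4 * 100 bytes.
        Console.WriteLine(ArrayLayout.ByteCount(sizeof(int), 100, 8));
    }
}
```

Stepping back by IntPtr.Size works on both x86 and x64 because the length slot, including its padding on x64, is exactly one pointer wide.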

2. Construct an array using a byte array

The BuildArray method shown below builds an array of a specified length, with the element type determined by the generic parameter. As the code shows, we compute the number of bytes the target array occupies according to the layout rules above and create a byte array of that size to hold the constructed array. We write the TypeHandle (the method table address) of the array type T[] at offset IntPtr.Size (the field itself is also IntPtr.Size bytes long) and write the array length into the 4 bytes that follow. At this point an empty array with the specified element type and length has been constructed, and we make the returned array variable point to the byte at offset IntPtr.Size (4 or 8) of the byte array, where the TypeHandle is stored.

using System;
using System.Runtime.CompilerServices;

unsafe static T[] BuildArray<T>(int length)
{
    var byteCount =
        IntPtr.Size // Object header + Padding
         + IntPtr.Size // TypeHandle
         + IntPtr.Size // Length + Padding
         + Unsafe.SizeOf<T>() * length // Elements
        ;

    var bytes = new byte[byteCount];
    // Write the method table address of T[] into the TypeHandle slot.
    Unsafe.Write(Unsafe.AsPointer(ref bytes[IntPtr.Size])
        , typeof(T[]).TypeHandle.Value);
    // Write the array length into the slot right after the TypeHandle.
    Unsafe.Write(Unsafe.AsPointer(ref bytes[IntPtr.Size * 2]), length);

    // Overwrite the array variable so that it refers to the TypeHandle slot.
    T[] array = null!;
    Unsafe.Write(Unsafe.AsPointer(ref array)
        , new IntPtr(Unsafe.AsPointer(ref bytes[IntPtr.Size])));
    return array;
}

Next, let’s verify that the array built by BuildArray can be used normally. As shown in the code snippet below, we call the method to construct an integer array of length 100, then use debug assertions to check that its length is correct and that every element is zero. We then assign a value to each element and use debug assertions to verify that the assignments took effect.

// Requires: using System.Diagnostics; using System.Linq;
var array = BuildArray<int>(100);
Debug.Assert(array.Length == 100);
Debug.Assert(array.All(it => it == 0));
for (int index = 0; index < array.Length; index++)
{
    array[index] = index;
}
for (int index = 0; index < array.Length; index++)
{
    Debug.Assert(array[index] == index);
}

The above demonstrates constructing an array of a value type (Int32). The following constructs an array of a reference type (String) in the same way.

// Requires: using System.Diagnostics; using System.Linq;
var array = BuildArray<string>(100);
Debug.Assert(array.Length == 100);
Debug.Assert(array.All(it => it is null));
for (int index = 0; index < array.Length; index++)
{
    array[index] = index.ToString();
}
for (int index = 0; index < array.Length; index++)
{
    Debug.Assert(array[index] == index.ToString());
}

3. Construct an array using unmanaged local memory

Since we can use a contiguous block of managed memory (a byte array) to construct an array of a specified element type and length, we can naturally use unmanaged memory for the same purpose. The biggest benefit of building an array in unmanaged local memory is that it puts no pressure on the GC, provided that we release the allocated memory ourselves. To that end, we transform the BuildArray method defined above into the following form: after computing the number of bytes, we call NativeMemory’s AllocZeroed method to allocate a block of that size with its content cleared (set to zero). Next, we write the TypeHandle and the length to the corresponding locations according to the layout rules. Finally, we make the returned variable point to the address of the TypeHandle.

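// Requires: using System.Runtime.CompilerServices; using System.Runtime.InteropServices;
unsafe static T[] BuildArray<T>(int length)
{
    var byteCount =
        IntPtr.Size // Object header + Padding
         + IntPtr.Size // TypeHandle
         + IntPtr.Size // Length + Padding
         + Unsafe.SizeOf<T>() * length // Elements
        ;

    // Allocate zero-initialized native memory for the whole array object.
    var pointer = NativeMemory.AllocZeroed((nuint)byteCount);
    // Slot 1 holds the TypeHandle, slot 2 holds the length.
    Unsafe.Write(Unsafe.Add<nint>(pointer, 1)
        , typeof(T[]).TypeHandle.Value);
    Unsafe.Write(Unsafe.Add<nint>(pointer, 2), length);

    // Overwrite the array variable so that it refers to the TypeHandle slot.
    T[] array = null!;
    Unsafe.Write(Unsafe.AsPointer(ref array)
        , new IntPtr(Unsafe.Add<nint>(pointer, 1)));
    return array;
}

unsafe static void Free<T>(T[] array)
{
    // Read the object address stored in the array variable.
    var address = *(nint*)Unsafe.AsPointer(ref array);
    // Step back one slot to reach the start of the allocated block.
    NativeMemory.Free(Unsafe.Add<nint>(
        address.ToPointer(), -1));
}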

The code above also implements the Free method used to release the local memory. We obtain the address of the array object by dereferencing the address of the array variable, but this address is not the start of the allocated memory, so we move back one position (IntPtr.Size bytes) to get the pointer to the start of the block and pass it to NativeMemory’s Free method, which releases the memory allocated in BuildArray.

var random = new Random();
while (true)
{
    var length = random.Next(10, 100);
    var array = BuildArray<int>(length);
    Debug.Assert(array.Length == length);
    Debug.Assert(array.All(it => it == 0));

    for (int index = 0; index < length; index++)
        array[index] = index;
    for (int index = 0; index < length; index++)
        Debug.Assert(array[index] == index);

    Free(array);
}

In the demo program above, we call BuildArray in an infinite loop to build integer arrays of random length. We use debug assertions to verify each array’s length and the initial value of its elements, then assign and verify each element. Because Free is called in every iteration to release the created array, memory usage stays stable. This can be verified with the memory diagnostic tool provided by VS.
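Because the memory stays stable only if Free is actually called, it can help to tie the release to a using statement. The following is a minimal sketch under stated assumptions, not part of the original article: NativeArrayOwner and its Value property are hypothetical names, and the unmanaged constraint is an added restriction, chosen because the GC does not scan native memory for object references. BuildArray and Free are repeated inside the class only to keep the example self-contained, and the code must be compiled with unsafe code enabled.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Runtime.InteropServices;

// Hypothetical wrapper: Dispose releases the native block, so a using statement
// frees the array even when an exception is thrown.
unsafe sealed class NativeArrayOwner<T> : IDisposable where T : unmanaged
{
    public T[] Value { get; private set; }

    public NativeArrayOwner(int length) => Value = BuildArray(length);

    public void Dispose()
    {
        if (Value is null) return; // already released
        Free(Value);
        Value = null!;
    }

    // Same technique as the native-memory BuildArray above.
    static T[] BuildArray(int length)
    {
        var byteCount = IntPtr.Size * 3 + Unsafe.SizeOf<T>() * length;
        var pointer = NativeMemory.AllocZeroed((nuint)byteCount);
        Unsafe.Write(Unsafe.Add<nint>(pointer, 1), typeof(T[]).TypeHandle.Value);
        Unsafe.Write(Unsafe.Add<nint>(pointer, 2), length);

        T[] array = null!;
        Unsafe.Write(Unsafe.AsPointer(ref array),
            new IntPtr(Unsafe.Add<nint>(pointer, 1)));
        return array;
    }

    // Same technique as the Free method above.
    static void Free(T[] array)
    {
        var address = *(nint*)Unsafe.AsPointer(ref array);
        NativeMemory.Free(Unsafe.Add<nint>(address.ToPointer(), -1));
    }
}

static class Program
{
    static void Main()
    {
        using var owner = new NativeArrayOwner<int>(16);
        var array = owner.Value;
        for (int index = 0; index < array.Length; index++)
            array[index] = index * index;
        Console.WriteLine(array[15]);
    } // owner.Dispose() runs here and releases the native memory
}
```

The array reference must not be used after the owner is disposed, since it would then point at freed memory.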

4. Performance test

Finally, let’s run a simple performance test to compare the two approaches: BuildArray + Free versus a direct new T[]. As shown in the code snippet below, we define two benchmark methods. The ManagedArray method directly returns an integer array of length 1024 created with the new keyword; the NativeArray method calls BuildArray to build an integer array of the same length and then calls Free to release it.

using BenchmarkDotNet.Attributes;

[MemoryDiagnoser]
public class Benchmark
{
    [Benchmark]
    public int[] ManagedArray()
        => new int[1024];

    [Benchmark]
    public void NativeArray()
        => Free(BuildArray<int>(1024));

    // Bodies omitted: same as the native-memory BuildArray and Free defined in section 3.
    unsafe static T[] BuildArray<T>(int length);
    unsafe static void Free<T>(T[] array);
}

Shown below are the results of the performance test. NativeArray not only performs no GC-managed allocation, but also takes less than half the time of ManagedArray.