Slice expansion strategy after Golang version 1.18

Directory

  • Problem export
  • sizeclasses.go file
  • Expansion strategy caused by single element insertion
    • Capacity less than 256
    • Capacity greater than 256
  • Inserting elements in batches leads to capacity expansion
  • Summarize

Question export

I have been learning Golang recently and am very interested in the expansion strategy of slice.
First look at the following piece of code

var arr1, arr2, arr3, arr4 []int64
var arr5, arr6, arr7, arr8 []int32

for i := 0; i < 25; i + + {<!-- -->
arr1 = append(arr1, int64(i))
arr5 = append(arr5, int32(i))
}

arr3 = append(arr3, arr1...)
arr7 = append(arr7, arr5...)

for i := 0; i < 513; i + + {<!-- -->
arr2 = append(arr2, int64(i))
arr6 = append(arr6, int32(i))
}

arr4 = append(arr4, arr2...)
arr8 = append(arr8, arr6...)

fmt.Printf("arr1 cap=%d\t", cap(arr1))
fmt.Printf("arr2 cap=%d\t", cap(arr2))
fmt.Printf("arr3 cap=%d\t", cap(arr3))
fmt.Printf("arr4 cap=%d\\
", cap(arr4))
fmt.Printf("arr5 cap=%d\t", cap(arr5))
fmt.Printf("arr6 cap=%d\t", cap(arr6))
fmt.Printf("arr7 cap=%d\t", cap(arr7))
fmt.Printf("arr8 cap=%d\\
", cap(arr8))

The result of running the above program is as follows (Golang version is 1.21.1)

arr1 cap=32 arr2 cap=848 arr3 cap=26 arr4 cap=608
arr5 cap=32 arr6 cap=864 arr7 cap=28 arr8 cap=576

The results of arr1 and arr5 should be easy to understand. When the 17th element is inserted, expansion occurs and the capacity doubles. That seems to be it. However, I am confused about the results of arr2 and arr6, and I have no way to analyze the results of arr3, arr4, arr7 and arr8. Why is this happening? By comparing the capacities of arr2 and arr6, it seems that the capacity after expansion is still related to the memory occupied by a single element. Is this true? Let’s explore it together next.

sizeclasses.go file

In Golang, there is a sizeclasses.go, which exists under the runtime package. Part of the code of this file is as follows:

// class bytes/obj bytes/span objects tail waste max waste min align
// 1 8 8192 1024 0 87.50% 8
// 2 16 8192 512 0 43.75% 16
// 3 24 8192 341 8 29.24% 8
// 4 32 8192 256 0 21.88% 32
// 5 48 8192 170 32 31.52% 16
// 6 64 8192 128 0 23.44% 64
// 7 80 8192 102 32 19.07% 16
// 8 96 8192 85 32 15.95% 32
// 9 112 8192 73 16 13.56% 16
// 10 128 8192 64 0 11.72% 128
// 11 144 8192 56 128 11.82% 16
// 12 160 8192 51 32 9.73% 32
// 13 176 8192 46 96 9.59% 16
// 14 192 8192 42 128 9.25% 64
// 15 208 8192 39 80 8.12% 16
// 16 224 8192 36 128 8.15% 32
// 17 240 8192 34 32 6.62% 16
// 18 256 8192 32 0 5.86% 256
// 19 288 8192 28 128 12.16% 32
// 20 320 8192 25 192 11.80% 64
// 21 352 8192 23 96 9.88% 32
// 22 384 8192 21 128 9.51% 128
// 23 416 8192 19 288 10.71% 32
// 24 448 8192 18 128 8.37% 64
// 25 480 8192 17 32 6.82% 32
// 26 512 8192 16 0 6.05% 512
// 27 576 8192 14 128 12.33% 64
// 28 640 8192 12 512 15.48% 128
// 29 704 8192 11 448 13.93% 64
// 30 768 8192 10 512 13.94% 256
// 31 896 8192 9 128 15.52% 128
// 32 1024 8192 8 0 12.40% 1024
// 33 1152 8192 7 128 12.41% 128
// 34 1280 8192 6 512 15.55% 256
// 35 1408 16384 11 896 14.00% 128
// 36 1536 8192 5 512 14.00% 512
// 37 1792 16384 9 256 15.57% 256
// 38 2048 8192 4 0 12.45% 2048
// 39 2304 16384 7 256 12.46% 256
// 40 2688 8192 3 128 15.59% 128
// 41 3072 24576 8 0 12.47% 1024
// 42 3200 16384 5 384 6.22% 128
// 43 3456 24576 7 384 8.83% 128
// 44 4096 8192 2 0 15.60% 4096
// 45 4864 24576 5 256 16.65% 256
// 46 5376 16384 3 256 10.92% 256
// 47 6144 24576 4 0 12.48% 2048
// 48 6528 32768 5 128 6.23% 128
// 49 6784 40960 6 256 4.36% 128
// 50 6912 49152 7 768 3.37% 256
// 51 8192 8192 1 0 15.61% 8192
// 52 9472 57344 6 512 14.28% 256
// 53 9728 49152 5 512 3.64% 512
// 54 10240 40960 4 0 4.99% 2048
// 55 10880 32768 3 128 6.24% 128
// 56 12288 24576 2 0 11.45% 4096
// 57 13568 40960 3 256 9.99% 256
// 58 14336 57344 4 0 5.35% 2048
// 59 16384 16384 1 0 12.49% 8192
// 60 18432 73728 4 0 11.11% 2048
// 61 19072 57344 3 128 3.57% 128
// 62 20480 40960 2 0 6.87% 4096
// 63 21760 65536 3 256 6.25% 256
// 64 24576 24576 1 0 11.45% 8192
// 65 27264 81920 3 128 10.00% 128
// 66 28672 57344 2 0 4.91% 4096
// 67 32768 32768 1 0 12.50% 8192

Why do I need to mention this file? It is because it is a comparison file required for the slice expansion strategy later. Need to pay attention to the bytes/obj column

Expansion strategy caused by single element insertion

Capacity less than 256

Corresponding to arr1 and arr5 in the above example, the formula for calculating the capacity of slice is as follows:

newSlice = oldSlice * 2 * Memory occupied by a single element -> Check the corresponding table in sizeclasses to get the nearest number greater than this number.
newSlice = the number after getting / the memory occupied by a single element

The above method obtains the capacity of the final slice after expansion. It should be noted that the table we are checking is the bytes/obj column.
In order to verify whether the above method is correct, we manually calculate arr5 in the example:
The capacity of arr1 at the beginning was 0. When the first element was inserted, the capacity expanded. According to the calculation formula
newSlice = 0 * 2 * 4 gets 0, by looking at the table, the nearest is 8, then newSlice = 8 / 4 gets 2
This is why the capacity of int32 slice is 2 when inserting the first element. There is another interesting thing at this time.

var a []int32
a2 := []int32{<!-- -->1}
a = append(a, int32(3))
fmt.Println(cap(a), cap(a2))

The result obtained by the above code is

2 1

In fact, this is very easy to understand. One is expansion and the other is initialization. I won’t go into details.
Back to the expansion analysis of arr5. After inserting the first element, the current capacity becomes 2. When the third element is inserted, the capacity expands again. By calculating 2 * 2 * 4 = 16, the table looks up exactly, then The final result is 16 / 4 = 4, so when the third element is inserted, the capacity is expanded to 4. In the same way, when the fifth element is inserted, the expansion occurs, 4 * 2 * 4 = 32. If you check the table, you will get 32 / 4 = 8. Through the same method, we can get why the final capacity of arr5 is 32.
The analysis of arr1 is the same as arr5, except that the memory of a single element is 8 bytes.

Capacity greater than 256

The capacity is greater than 256, corresponding to arr2 and arr6 in the example. The method for calculating the expanded capacity is as follows:

newSlice = ((old + (oldSize + 3*256)/4) * Number of bytes occupied by a single element) -> Check the table to get the latest value greater than it
The obtained value / the number of bytes occupied by a single element is the final expansion size

The above method is to show the calculation method after the capacity is greater than 256
Similarly, we take arr2 as an example to analyze its expansion results.
First, when arr2 inserts the 257th element, expansion occurs: (256 + (256 + 3 * 256) / 4) * 8 = 4096. By looking at the table, we get the nearest element that is larger than it. The value ofis exactly 4096, then4096 / 8 = 512. Then, when the 513th element is inserted, expansion occurs again, and the formula is introduced: (512 + (512 + 3 * 256) / 4) * 8 = 6656, which can be obtained by looking up the table, The required value is 6784, 6784 / 8 = 848. This is the expansion calculation process of arr2 after it is larger than 256. The final result of arr6 can also be calculated in the same way, except that the single memory of int32 is 4 bytes.

The above is the capacity calculation method when all single elements are inserted into the slice and expansion occurs.

Batch insertion of elements leads to capacity expansion

Inserting elements in batches results in capacity expansion, corresponding to arr3, arr4, arr7 and arr8 in the example.
At this time, we introduce the key code of slice expansion in the slice.go file.

newcap := oldCap
doublecap := newcap + newcap
if newLen > doublecap {<!-- -->
newcap = newLen
} else {<!-- -->
const threshold = 256
if oldCap <threshold {<!-- -->
newcap = doublecap
} else {<!-- -->
// Check 0 < newcap to detect overflow
// and prevent an infinite loop.
for 0 < newcap & amp; & amp; newcap < newLen {<!-- -->
// Transition from growing 2x for small slices
// to growing 1.25x for large slices. This formula
// gives a smooth-ish transition between the two.
newcap + = (newcap + 3*threshold) / 4
}
// Set newcap to the requested cap when
// the newcap calculation overflowed.
if newcap <= 0 {<!-- -->
newcap = newLen
}
}
}

The above code solves the problem of single insertion expansion and batch insertion expansion. It should be noted that the newCap value obtained is not the real cap result. You need to change newCap * the number of bytes of a single element, and then get the final result by looking up the table.

In this section, we will not analyze the calculation process in the initial example, but look at the following piece of code:

var arr9 []int32
for i := 0; i < 32; i + + {<!-- -->
arr9 = append(arr9, int32(i))
}
fmt.Println("arr9 cap=", cap(arr9))

arr9 = append(arr9, arr6...)
fmt.Println("arr9 cap=", cap(arr9), " len=", len(arr9))
arr9 = append(arr9, arr6...)
fmt.Println("arr9 cap=", cap(arr9), " len=", len(arr9))

arr6 in the above code is arr6 in the initial example. The running results are as follows

arr9 cap= 32
arr9 cap= 576 len= 545
arr9 cap= 1344 len= 1058

In this example, a large amount of data is inserted while the original data is available, resulting in capacity expansion. Analyze the output results.
First, the first output is very simple, the same as the original example. For the second output, 513 elements were inserted in batches, and expansion occurred at this time. Returning to the expansion part of the code in the source file of slice.go, it is obvious that newLen > doublecap, so newCap is < strong>32 + 513 = 545, then545 * 4 = 2180, by looking up the table, we get2304, then2304 / 4 = 576< /strong>. For the third output, newLen = 1058, newLen < doublecap, and then into that loop, 576 + (576 + 3 * 256) / 4 = 912. Because 912 < 1058, loop again 912 + (912 + 3 * 256) / 4 = 1332, at this time 1322 > 1058, then newCap That’s 1322. Finally 1322 * 4 = 5288, through table lookup, we get 5376, 5376 / 4 = 1344.

Summary

To understand slice expansion, the slice.go file and sizeclasses.go file are the most important. By combining the slice.go and sizeclasses.go files, you can master the essence of slice expansion.

syntaxbug.com © 2021 All Rights Reserved.