1. Preface
card_table is one of the core mechanisms of the CLR. Its bit marks let the GC scan the old generation heap (oldest_gen) and find the references from old-generation objects to young-generation objects. Isolating it from the rest of the CLR and GC is a fairly complex task. Using .NET 8 as the reference, this article simplifies the picture and continues the study.
2. Overview
The old generation generally means generation 2, that is, max_generation. The GC loops over the heap segments of this generation and, guided by the card bits, walks the address ranges in each segment that hold referencing objects (old-generation addresses), marking the objects they reference as live. A picture gives the general idea first.
For each generation 2 heap segment, the range from card_word to card_word_end is traversed to find the young-generation objects referenced from that segment.
3. Argument
This process can be checked with lldb. First, find the addresses of the old-generation object (n1) and the young-generation object (n2). The former references the latter.
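    Name n1 = new Name("1111");
    GC.Collect();               // two full collections move n1 into the old generation
    GC.Collect();
    Name n2 = new Name("3333"); // n2 is freshly allocated in the young generation
    n1.selfName = n2;           // old-to-young reference, goes through the write barrier
    GC.Collect(0);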
In the JIT-compiled code of Main, the assignment calls JIT_WriteBarrier:
    0x7fff78d85511: lea    rdi, [rdi + 0x10]
    0x7fff78d85515: mov    rsi, qword ptr [rbp - 0x18]
    0x7fff78d85519: call   0x7ffff730f6e0            ; JIT_WriteBarrier
The addresses of the two objects are as follows:
    (lldb) c
    (lldb) register read rdi rsi
         rdi = 0x00007fbf6a808b08
         rsi = 0x00007fbf6cc00028
Take a look at what JIT_WriteBarrier does:
    libcoreclr.so`JIT_WriteBarrier:
    ->  0x7ffff730f6e0 <+0>:   mov    qword ptr [rdi], rsi
        0x7ffff730f6e3 <+3>:   mov    r8, rdi
        0x7ffff730f6e6 <+6>:   movabs rax, 0x7fbee54ff2a0
        0x7ffff730f6f0 <+16>:  shr    rdi, 0x16
        0x7ffff730f6f4 <+20>:  cmp    byte ptr [rdi + rax], 0x0
        0x7ffff730f6f8 <+24>:  jne    0x7ffff730f6fe            ; <+30>
        0x7ffff730f6fa <+26>:  rep ret
        0x7ffff730f6fc <+28>:  nop
        0x7ffff730f6fe <+30>:  movabs r9, 0x7fbf68000000
        0x7ffff730f708 <+40>:  cmp    rsi, r9
        0x7ffff730f70b <+43>:  jae    0x7ffff730f70e            ; <+46>
        0x7ffff730f70d <+45>:  ret
        0x7ffff730f70e <+46>:  movabs r9, 0x7fff68000000
        0x7ffff730f718 <+56>:  cmp    rsi, r9
        0x7ffff730f71b <+59>:  jb     0x7ffff730f71f            ; <+63>
        0x7ffff730f71d <+61>:  rep ret
        0x7ffff730f71f <+63>:  shr    rsi, 0x16
        0x7ffff730f723 <+67>:  mov    dl, byte ptr [rsi + rax]
        0x7ffff730f726 <+70>:  cmp    dl, byte ptr [rdi + rax]
        0x7ffff730f729 <+73>:  jb     0x7ffff730f72e            ; <+78>
        0x7ffff730f72b <+75>:  rep ret
        0x7ffff730f72d <+77>:  nop
        0x7ffff730f72e <+78>:  movabs rax, 0x7faedb5ff040
        0x7ffff730f738 <+88>:  mov    ecx, r8d
        0x7ffff730f73b <+91>:  shr    r8, 0xb
        0x7ffff730f73f <+95>:  shr    ecx, 0x8
        0x7ffff730f742 <+98>:  and    ecx, 0x7
        0x7ffff730f745 <+101>: mov    dl, 0x1
        0x7ffff730f747 <+103>: shl    dl, cl
        0x7ffff730f749 <+105>: test   byte ptr [r8 + rax], dl
        0x7ffff730f74d <+109>: je     0x7ffff730f751            ; <+113>
        0x7ffff730f74f <+111>: rep ret
        0x7ffff730f751 <+113>: lock
        0x7ffff730f752 <+114>: or     byte ptr [r8 + rax], dl
        0x7ffff730f756 <+118>: movabs rax, 0x7fbedf4ef500
        0x7ffff730f760 <+128>: shr    r8, 0xa
        0x7ffff730f764 <+132>: cmp    byte ptr [r8 + rax], -0x1
        0x7ffff730f769 <+137>: jne    0x7ffff730f76d            ; <+141>
        0x7ffff730f76b <+139>: rep ret
        0x7ffff730f76d <+141>: mov    byte ptr [r8 + rax], -0x1
        0x7ffff730f772 <+146>: ret
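The listing is a bit long, so here it is simulated in C-style pseudocode:
    n1.selfName = n2;                       // mov qword ptr [rdi], rsi
    dst = &n1.selfName;                     // r8 = rdi = 0x00007fbf6a808b08
    rax = 0x7fbee54ff2a0;                   // byte table indexed by (address >> 0x16); narrows the scope before card_table is touched
    if (rax[dst >> 0x16] == 0)
        return;
    if (n2 < 0x7fbf68000000 || n2 >= 0x7fff68000000)   // ephemeral heap start and end addresses
        return;
    if (rax[n2 >> 0x16] >= rax[dst >> 0x16])
        return;
    rax = 0x7faedb5ff040;                   // card_table
    mask = 1 << ((dst >> 0x8) & 0x7);       // 0x8 for this dst
    if ((rax[dst >> 0xB] & mask) != 0)
        return;
    rax[dst >> 0xB] |= mask;                // on Windows the byte is set to 0xFF instead
    rax = 0x7fbedf4ef500;                   // card bundle table, indexed by (address >> 0x15)
    if (rax[dst >> 0x15] != 0xFF)
        rax[dst >> 0x15] = 0xFF;
Roughly, the address of the generation 2 field n1.selfName is shifted right by 0x0B and added to the base address of card_table, and the bit 0x8 is set in the byte at that location.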
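The index arithmetic of the barrier can be checked by hand. The following is a minimal sketch, not CLR code: the helper names card_byte_index, card_bit_mask and mark_card, and the small simulated table with its base parameter, are made up for this illustration. Only the shift amounts (0xB, 0x8) and the mask 0x7 are taken from the disassembly above.

```c
#include <assert.h>
#include <stdint.h>

/* Offset of the card_table byte covering addr.
   Mirrors `shr r8, 0xb`: one byte covers 2 KB of heap. */
uint64_t card_byte_index(uint64_t addr)
{
    return addr >> 0xB;
}

/* Bit inside that byte.
   Mirrors `shr ecx, 0x8; and ecx, 0x7; shl dl, cl`: one bit covers 256 bytes. */
uint8_t card_bit_mask(uint64_t addr)
{
    return (uint8_t)(1u << ((addr >> 0x8) & 0x7));
}

/* Set the card for addr in a small simulated table whose byte 0 covers
   base (base is assumed to be 2 KB aligned). Returns 1 if the bit was
   newly set, 0 if it was already set (the early-return path). */
int mark_card(uint8_t *table, uint64_t base, uint64_t addr)
{
    assert(addr >= base);
    uint64_t i = card_byte_index(addr) - card_byte_index(base);
    uint8_t mask = card_bit_mask(addr);
    if (table[i] & mask)
        return 0;
    table[i] |= mask;
    return 1;
}
```

For the address 0x00007fbf6a808b08 seen in rdi, card_bit_mask yields 0x8, which is the value noted above, and a second call to mark_card for the same address takes the early-return path.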
Continue:
    br del
    b gc.cpp:38448
    c
       38445        limit = min (end, card_address (end_card));
       38446    #endif // FEATURE_CARD_MARKING_STEALING
       38447    }
    -> 38448    if (!foundp || (last_object >= end) || (card_address (card) >= end))
       38449    {
       38450        if (foundp && (cg_pointers_found == 0))
       38451        {
As the previous article said, limit is the end of a range inside a generation 2 heap segment, and it is paired with a start of the range. Together these two variables delimit the range that is searched for old-generation objects referencing young-generation objects. Take a look at the end address held in limit:
    (lldb) p/x limit
    (uint8_t *) $81 = 0x00007fbf6a808b18 ""
The address of n1.selfName above is: 0x00007fbf6a808b08
The end range address of limit is: 0x00007fbf6a808b18
Since the address of n1.selfName lies below limit, it can be assumed that this loop range covers the generation 2 object, so the reference can be found.
4. find_card
Before limit is computed, the CLR calls find_card to locate the card range in which generation 2 objects reference young-generation objects.
Take a look inside the find_card function:
    b gc.cpp:37953
    c
       37952    last_card_word = &card_table [card_word (card)];
    -> 37953    bit_position = card_bit (card);
       37954    #ifdef CARD_BUNDLE
       37955    // if we have card bundles, consult them before fetching a new card word
       37956    if (bit_position == 0)
card_table is the base address to which the right-shifted address of n1.selfName was added in section 3 above.
    (lldb) p/x card_table
    (uint32_t *) $88 = 0x00007faedb5ff040
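The GC code views card_table as an array of uint32_t, while the write barrier addressed it byte by byte. The sketch below relates the two views. It is a simplified illustration under stated assumptions: a card is taken to cover 256 bytes (as the barrier shifts imply) and an entry to hold 32 card bits (as the uint32_t type implies). The function names follow the ones visible in the gc.cpp listings, but the bodies and the base_word parameter are written for this example only.

```c
#include <assert.h>
#include <stdint.h>

#define CARD_SIZE        256u  /* bytes covered by one card bit (assumed) */
#define CARD_WORD_WIDTH  32u   /* card bits per uint32_t entry */

/* Card number covering an address. */
uint64_t card_of(uint64_t addr)       { return addr / CARD_SIZE; }
/* Index of the uint32_t entry holding a card. */
uint64_t card_word(uint64_t card)     { return card / CARD_WORD_WIDTH; }
/* Bit position of a card inside its entry. */
unsigned card_bit(uint64_t card)      { return (unsigned)(card % CARD_WORD_WIDTH); }
/* First address covered by a card. */
uint64_t card_address(uint64_t card)  { return card * CARD_SIZE; }

/* Test a card in a simulated table whose entry 0 corresponds to the
   card word index base_word (a hypothetical parameter for this sketch). */
int card_set_p(const uint32_t *table, uint64_t base_word, uint64_t card)
{
    assert(card_word(card) >= base_word);
    return (int)((table[card_word(card) - base_word] >> card_bit(card)) & 1u);
}
```

For 0x00007fbf6a808b08 the card bit position is 11. On a little-endian machine, bit 11 of a 32-bit entry is bit 3 of its second byte, which is the 0x8 that the write barrier stored with byte addressing.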
gc_heap::find_card_dword then searches the card words from the current index up to card_word_end, the index that corresponds to the end of the generation 2 heap segment, and updates the card word index:
       37970    size_t lcw = card_word(card) + (bit_position != 0);
    -> 37971    if (gc_heap::find_card_dword (lcw, card_word_end) == FALSE)
       37972    {
       37973        return FALSE;
       37974    }
BitScanForward returns the index of the lowest set bit of the card word, which equals the number of 0 bits counted from the right:
                DWORD bit_index;
       38006    uint8_t res = BitScanForward (&bit_index, card_word_value);
    -> 38007    assert (res != 0);
       38008    card_word_value >>= bit_index;
       38009    bit_position += bit_index;
This locates the bit that was set in card_table for the shifted address of n1.selfName in section 3 above, and the card index is then computed from it:
    card = (last_card_word - &card_table[0]) * card_word_width + bit_position;
In this way, the whole range from card to card_end is determined. The GC then loops through this range to find where generation 2 objects reference young-generation objects, and marks the referenced objects.
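The word scan and the card computation above can be put together in a small model. This is a sketch under stated assumptions rather than the gc.cpp implementation: lowest_set_bit is a portable stand-in for BitScanForward, find_first_card is a made-up name, and the real find_card additionally honors a starting bit position and reports the end of the run of set cards.

```c
#include <assert.h>
#include <stdint.h>

#define CARD_WORD_WIDTH 32u

/* Portable stand-in for BitScanForward: index of the lowest set bit.
   The caller must pass a nonzero value. */
unsigned lowest_set_bit(uint32_t value)
{
    assert(value != 0);
    unsigned index = 0;
    while ((value & 1u) == 0) {
        value >>= 1;
        index++;
    }
    return index;
}

/* Scan entries [first_word, end_word) of a simulated card table for the
   first set bit. On success store the card number in *card and return 1;
   return 0 if no card in the range is set. */
int find_first_card(const uint32_t *table, uint64_t first_word,
                    uint64_t end_word, uint64_t *card)
{
    for (uint64_t w = first_word; w < end_word; w++) {
        if (table[w] != 0) {
            /* same shape as: (last_card_word - &card_table[0]) * card_word_width + bit_position */
            *card = w * CARD_WORD_WIDTH + lowest_set_bit(table[w]);
            return 1;
        }
    }
    return 0;
}
```

Skipping whole zero entries before looking at individual bits is what keeps the scan cheap when most of the old generation holds no references to young objects.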
One more thing to note: besides the card_table loop, there is also a heap segment loop. The former is the small inner loop, and the latter is the large outer loop.
    if (seg)
    {
    #ifdef BACKGROUND_GC
        should_check_bgc_mark (seg, &consider_bgc_mark_p, &check_current_sweep_p, &check_saved_sweep_p);
    #endif //BACKGROUND_GC
        beg = heap_segment_mem (seg);
    #ifdef USE_REGIONS
        end = heap_segment_allocated (seg);
    #else
        end = compute_next_end (seg, low);
    #endif //USE_REGIONS
    #ifdef FEATURE_CARD_MARKING_STEALING
        card_word_end = 0;
    #else // FEATURE_CARD_MARKING_STEALING
        card_word_end = card_of (align_on_card_word (end)) / card_word_width;
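The relationship between the two loops can be modeled in a few lines. The sketch below is a simplified illustration, not the CLR code: struct segment, count_set_cards and the table_base parameter are hypothetical, a card is assumed to cover 256 bytes, and the model only counts set cards where the real GC would walk the objects on them.

```c
#include <assert.h>
#include <stdint.h>

#define CARD_SIZE 256u  /* bytes per card bit (assumed) */

/* Hypothetical, simplified stand-in for a heap segment: [mem, allocated). */
struct segment {
    uint64_t mem;
    uint64_t allocated;
};

/* Outer loop over segments, inner loop over the cards of each segment.
   bits is a simulated card table whose bit 0 covers table_base.
   Returns the number of set cards found inside the segments. */
unsigned count_set_cards(const struct segment *segs, unsigned nsegs,
                         const uint8_t *bits, uint64_t table_base)
{
    unsigned found = 0;
    for (unsigned s = 0; s < nsegs; s++) {              /* large outer loop */
        assert(segs[s].mem >= table_base);
        assert(segs[s].allocated >= segs[s].mem);
        uint64_t card = (segs[s].mem - table_base) / CARD_SIZE;
        uint64_t card_end =
            (segs[s].allocated - table_base + CARD_SIZE - 1) / CARD_SIZE;
        for (; card < card_end; card++) {               /* small inner loop */
            if ((bits[card / 8] >> (card % 8)) & 1u)
                found++;
        }
    }
    return found;
}
```

A set card that lies outside every segment is never visited, because the inner loop is bounded by the range of the segment currently being processed.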
5. Mark
How is the object marked once it has been found? Via mark_object_simple:
    b mark_object_simple
    c
    n.....
       24875
    -> 24876    o = mark_queue.queue_mark (o);
       24877    if (o != nullptr)
       24878    {
       24879        m_boundary (o);
queue_mark searches the mark queue. If the object has already been marked, there is no need to mark it again.
Check whether n2 is marked:
    (lldb) x/8gx 0x00007fbf6cc00028
    0x7fbf6cc00028: 0x00007fff7911c471 0x00007fffe6bff8e0
    0x7fbf6cc00038: 0x0000000000000000 0x0000000000000000
    0x7fbf6cc00048: 0x0000000000000000 0x0000000000000000
    0x7fbf6cc00058: 0x0000000000000000 0x0000000000000000
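One way to read this dump: the first word of the object, 0x00007fff7911c471, ends in 1, although a pointer stored there would be 8-byte aligned. The sketch below assumes that the GC records the mark in this lowest bit of the object's first word. The names MARK_BIT, is_marked, strip_mark and mark_once are chosen for this illustration and are not taken from the CLR sources.

```c
#include <assert.h>
#include <stdint.h>

#define MARK_BIT ((uint64_t)0x1)  /* assumed: lowest bit of the first word */

/* Nonzero if the object's first word carries the mark bit. */
int is_marked(uint64_t first_word)
{
    return (first_word & MARK_BIT) != 0;
}

/* First word with the mark bit cleared, i.e. the aligned pointer value. */
uint64_t strip_mark(uint64_t first_word)
{
    return first_word & ~MARK_BIT;
}

/* Mark the object if needed. Returns 1 if this call set the bit,
   0 if the object was already marked and nothing had to be done. */
int mark_once(uint64_t *first_word)
{
    assert(first_word != 0);
    if (is_marked(*first_word))
        return 0;
    *first_word |= MARK_BIT;
    return 1;
}
```

Applied to the first word of n2 from the dump, is_marked reports that the bit is set, so mark_once has nothing left to do.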
Because the object has already been marked, there is no need to mark it again. This is the general flow of handling a cross-generation reference; the details still deserve further exploration.
References are as follows:
CLR cross-generation tagged memory model
CLR card_table displacement and array
The operation mode of lldb watch card_table (CLR)