.NET 8 CLR cross-generation references (card_table), continued

1. Preface

card_table is one of the core technologies of the CLR. The GC loops over its bit marks for the old generation heap (oldest_gen) to find references from old-generation objects to new-generation objects. Isolating this mechanism from the rest of the CLR and GC is a fairly complex undertaking. Taking .NET 8 as the reference, this article reduces it to the essentials and continues the study from the previous articles.

2. Overview
The old generation generally means generation 2, that is, max_generation. The GC loops over the heap segments of this generation, uses the bit marks to walk the address range (old generation addresses) of the objects in each heap segment that hold cross-generation references, and marks the referenced objects as alive. The general idea is as follows.

The range from card_word to card_word_end covers each generation 2 heap segment, and traversing it finds the new-generation objects referenced from that segment.
3. Argument
Check this process with lldb. First, find the addresses of the old-generation object (n1) and the new-generation object (n2). The former references the latter.

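Name n1 = new Name("1111");
GC.Collect(); GC.Collect();   // two full collections, so n1 ends up in the old generation
Name n2 = new Name("3333");
n1.selfName = n2;             // old-generation object now references a new-generation object
GC.Collect(0);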

At the call to JIT_WriteBarrier in the JIT-compiled code of Main:

0x7fff78d85511: lea rdi, [rdi + 0x10]
0x7fff78d85515: mov rsi, qword ptr [rbp - 0x18]
0x7fff78d85519: call 0x7ffff730f6e0; JIT_WriteBarrier

The addresses of the two objects are as follows:

(lldb) c
(lldb) register read rdi rsi
     rdi = 0x00007fbf6a808b08
     rsi = 0x00007fbf6cc00028

Take a look at what JIT_WriteBarrier does:

libcoreclr.so`JIT_WriteBarrier:
->  0x7ffff730f6e0 <+0>: mov qword ptr [rdi], rsi
    0x7ffff730f6e3 <+3>: mov r8, rdi
    0x7ffff730f6e6 <+6>: movabs rax, 0x7fbee54ff2a0
    0x7ffff730f6f0 <+16>: shr rdi, 0x16
    0x7ffff730f6f4 <+20>: cmp byte ptr [rdi + rax], 0x0
    0x7ffff730f6f8 <+24>: jne 0x7ffff730f6fe ; <+30>
    0x7ffff730f6fa <+26>: rep ret
    0x7ffff730f6fc <+28>: nop
    0x7ffff730f6fe <+30>: movabs r9, 0x7fbf68000000
    0x7ffff730f708 <+40>: cmp rsi, r9
    0x7ffff730f70b <+43>: jae 0x7ffff730f70e ; <+46>
    0x7ffff730f70d <+45>: ret
    0x7ffff730f70e <+46>: movabs r9, 0x7fff68000000
    0x7ffff730f718 <+56>: cmp rsi, r9
    0x7ffff730f71b <+59>: jb 0x7ffff730f71f ; <+63>
    0x7ffff730f71d <+61>: rep ret
    0x7ffff730f71f <+63>: shr rsi, 0x16
    0x7ffff730f723 <+67>: mov dl, byte ptr [rsi + rax]
    0x7ffff730f726 <+70>: cmp dl, byte ptr [rdi + rax]
    0x7ffff730f729 <+73>: jb 0x7ffff730f72e ; <+78>
    0x7ffff730f72b <+75>: rep ret
    0x7ffff730f72d <+77>: nop
    0x7ffff730f72e <+78>: movabs rax, 0x7faedb5ff040
    0x7ffff730f738 <+88>: mov ecx, r8d
    0x7ffff730f73b <+91>: shr r8, 0xb
    0x7ffff730f73f <+95>: shr ecx, 0x8
    0x7ffff730f742 <+98>: and ecx, 0x7
    0x7ffff730f745 <+101>: mov dl, 0x1
    0x7ffff730f747 <+103>: shl dl, cl
    0x7ffff730f749 <+105>: test byte ptr [r8 + rax], dl
    0x7ffff730f74d <+109>: je 0x7ffff730f751 ; <+113>
    0x7ffff730f74f <+111>: rep ret
    0x7ffff730f751 <+113>: lock
    0x7ffff730f752 <+114>: or byte ptr [r8 + rax], dl
    0x7ffff730f756 <+118>: movabs rax, 0x7fbedf4ef500
    0x7ffff730f760 <+128>: shr r8, 0xa
    0x7ffff730f764 <+132>: cmp byte ptr [r8 + rax], -0x1
    0x7ffff730f769 <+137>: jne 0x7ffff730f76d ; <+141>
    0x7ffff730f76b <+139>: rep ret
    0x7ffff730f76d <+141>: mov byte ptr [r8 + rax], -0x1
    0x7ffff730f772 <+146>: ret

The code is a bit long, so I’ll simulate it in C:

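#include <stdint.h>

/* Names are descriptive guesses for the constants patched into the code above. */
static uint8_t *const region_generation = (uint8_t *)0x7fbee54ff2a0; /* 1st rax */
static uint8_t *const card_table        = (uint8_t *)0x7faedb5ff040; /* 2nd rax */
static uint8_t *const card_bundle_table = (uint8_t *)0x7fbedf4ef500; /* 3rd rax */

void write_barrier(uintptr_t *dst, uintptr_t ref) /* rdi = &n1.selfName, rsi = n2 */
{
    uintptr_t d = (uintptr_t)dst;
    *dst = ref;                                     /* n1.selfName = n2 */
    if (region_generation[d >> 0x16] == 0)
        return;                                     /* n1 is itself in generation 0 */
    if (ref < 0x7fbf68000000 || ref >= 0x7fff68000000)
        return;                                     /* n2 outside the ephemeral (transient) range */
    if (region_generation[ref >> 0x16] >= region_generation[d >> 0x16])
        return;                                     /* n2 is not younger than n1 */
    uint8_t bit = (uint8_t)(1u << ((d >> 8) & 7));  /* 0x8 for this address */
    if (card_table[d >> 0xB] & bit)
        return;                                     /* card bit already set */
    card_table[d >> 0xB] |= bit;                    /* lock or; on Windows the byte written is 0xFF */
    if (card_bundle_table[d >> 0x15] != 0xFF)
        card_bundle_table[d >> 0x15] = 0xFF;
}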

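Roughly speaking: when the generation 2 field n1.selfName receives a reference to a younger object, the address of the field is shifted right by 0x0B and added to the base address of card_table, and the bit 0x8 is set in the byte at that location.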
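The arithmetic behind that byte and bit can be checked by hand. The following is a minimal sketch in C, not CLR code: the shift amounts are taken from the disassembly above, while card_slot, card_slot_for and set_card are hypothetical names introduced here, and the table byte is simulated rather than read from the real card_table.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: where a store address lands in the card table. */
typedef struct {
    uint64_t byte_offset; /* offset from the card_table base, in bytes */
    uint8_t  bit_mask;    /* bit inside that byte */
} card_slot;

static card_slot card_slot_for(uint64_t dst)
{
    card_slot s;
    s.byte_offset = dst >> 0xB;                      /* shr r8, 0xb */
    s.bit_mask = (uint8_t)(1u << ((dst >> 8) & 7));  /* shr ecx, 0x8; and ecx, 0x7; shl dl, cl */
    return s;
}

/* Mimics "test" followed by "lock or" on one byte of a simulated table.
   Returns 1 if the bit was newly set, 0 if it was already set. */
static int set_card(uint8_t *table_byte, uint8_t bit_mask)
{
    assert(table_byte != 0 && bit_mask != 0);
    if (*table_byte & bit_mask)
        return 0;
    *table_byte |= bit_mask; /* the real barrier performs this atomically */
    return 1;
}
```

For the address 0x00007fbf6a808b08 from this session, card_slot_for gives the mask 0x8, which matches the value observed in the card table.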

Next, continue to the code that scans the cards:

br del
b gc.cpp:38448
c
38445 limit = min (end, card_address (end_card));
   38446 #endif // FEATURE_CARD_MARKING_STEALING
   38447 }
-> 38448 if (!foundp || (last_object >= end) || (card_address (card) >= end))
   38449 {
   38450 if (foundp && (cg_pointers_found == 0))
   38451 {

As the previous article explained, limit is the end of the scanned range inside a generation 2 heap segment, and there is a matching start of the range. Together these two variables bound the search for old-generation objects that reference new-generation objects. Take a look at the end address stored in limit:

(lldb) p/x limit
(uint8_t *) $81 = 0x00007fbf6a808b18 ""

The address of n1.selfName above is: 0x00007fbf6a808b08

The end address in limit is: 0x00007fbf6a808b18
Since 0x00007fbf6a808b08 is below limit, the range scanned by this loop can be assumed to contain the generation 2 field, so the reference to n2 can be found.
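The range check can be made concrete with a minimal sketch in C. It assumes that one card covers 256 bytes (which is what the shifts in the write barrier disassembly imply) and that scanning starts at the first byte of the card covering n1.selfName. CARD_SIZE and in_scanned_range are illustrative names; card_of and card_address are simplified stand-ins for the gc.cpp helpers of the same name.

```c
#include <assert.h>
#include <stdint.h>

#define CARD_SIZE 256u /* assumption: bytes covered by one card bit */

static uint64_t card_of(uint64_t addr)      { return addr / CARD_SIZE; }
static uint64_t card_address(uint64_t card) { return card * CARD_SIZE; }

/* Returns 1 when addr lies in [card_address(card), limit). */
static int in_scanned_range(uint64_t addr, uint64_t card, uint64_t limit)
{
    assert(card_address(card) <= limit);
    return addr >= card_address(card) && addr < limit;
}
```

With the values from this session, the card covering 0x00007fbf6a808b08 starts at 0x00007fbf6a808b00, and the field lies between that address and limit.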
4. find_card
Before limit is computed, the CLR calls find_card to find the range of cards in which generation 2 objects reference new-generation objects.
Take a look inside the find_card function:

b gc.cpp:37953
c
   37952 last_card_word = &card_table [card_word (card)];
-> 37953 bit_position = card_bit (card);
   37954 #ifdef CARD_BUNDLE
   37955 // if we have card bundles, consult them before fetching a new card word
   37956 if (bit_position == 0)

card_table here is the base address that was added to the right-shifted address of n1.selfName in section 3 above (0x7faedb5ff040 in the disassembly).

(lldb) p/x card_table
(uint32_t *) $88 = 0x00007faedb5ff040
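The write barrier addresses this table byte by byte, while find_card reads it as an array of uint32_t through card_word and card_bit. A minimal sketch in C of why both views agree is given below. It assumes 256-byte cards, 32-bit card words and a little-endian machine such as x64; barrier_to_word is a hypothetical helper, and card_of, card_word and card_bit are simplified stand-ins for the gc.cpp helpers.

```c
#include <assert.h>
#include <stdint.h>

#define CARD_SIZE       256u /* assumption: one card bit covers 256 bytes */
#define CARD_WORD_WIDTH 32u  /* card_table is a uint32_t array */

static uint64_t card_of(uint64_t addr)   { return addr / CARD_SIZE; }
static uint64_t card_word(uint64_t card) { return card / CARD_WORD_WIDTH; }
static unsigned card_bit(uint64_t card)  { return (unsigned)(card % CARD_WORD_WIDTH); }

/* Translate the barrier's (byte offset, bit in byte) into
   (word index, bit in word) for a little-endian uint32_t array. */
static void barrier_to_word(uint64_t dst, uint64_t *word, unsigned *bit)
{
    uint64_t byte_offset = dst >> 11;
    unsigned bit_in_byte = (unsigned)((dst >> 8) & 7);
    assert(word != 0 && bit != 0);
    *word = byte_offset / 4;
    *bit  = (unsigned)(byte_offset % 4) * 8 + bit_in_byte;
}
```

For 0x00007fbf6a808b08 both routes end at bit 11 of the same card word, so the bit set by the barrier is the one find_card later reads.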

gc_heap::find_card_dword then searches from the current card word up to card_word_end (the card word index for the end of the generation 2 heap segment) for the next card word that is not zero:

37970 size_t lcw = card_word(card) + (bit_position != 0);
-> 37971 if (gc_heap::find_card_dword (lcw, card_word_end) == FALSE)
   37972 {
   37973 return FALSE;
   37974 }
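The idea of that search can be sketched in C as follows. This is a simplified stand-in, not the CLR implementation: find_nonzero_card_word is a hypothetical name, the table is a plain array, and the card bundle shortcut mentioned in the gc.cpp comment is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans table[*cardw .. cardw_end) for the first word with any bit set.
   On success stores its index in *cardw and returns 1, otherwise returns 0. */
static int find_nonzero_card_word(const uint32_t *table, size_t *cardw, size_t cardw_end)
{
    assert(table != NULL && cardw != NULL);
    for (size_t i = *cardw; i < cardw_end; i++) {
        if (table[i] != 0) {
            *cardw = i; /* a card in this word was set by the write barrier */
            return 1;
        }
    }
    return 0; /* no set card before cardw_end */
}
```

A return value of 0 corresponds to the `return FALSE` branch in the listing: no card is set in the rest of the segment.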

BitScanForward then returns the index of the lowest set bit in the card word, that is, the number of zero bits counted from right to left:

DWORD bit_index;
   38006 uint8_t res = BitScanForward (&bit_index, card_word_value);
-> 38007 assert (res != 0);
   38008 card_word_value >>= bit_index;
   38009 bit_position += bit_index;
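What the scan returns can be illustrated with a portable stand-in written in C. bit_scan_forward below is a hypothetical function that only imitates the behavior used in the listing (index of the lowest set bit, zero result for a zero input); it is not the platform intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Writes the index of the lowest set bit of value to *index and returns 1.
   Returns 0 and leaves *index untouched when value is zero. */
static uint8_t bit_scan_forward(uint32_t *index, uint32_t value)
{
    assert(index != 0);
    if (value == 0)
        return 0;
    uint32_t i = 0;
    while ((value & 1u) == 0) { /* count zero bits from the right */
        value >>= 1;
        i++;
    }
    *index = i;
    return 1;
}
```

For a card word of 0x800 the index is 11, and shifting the word right by that index leaves the set bit in position 0, as line 38008 does.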

In this way, the set bit that the write barrier left in card_table for n1.selfName in section 3 is located, and its card number is then calculated:

card = (last_card_word - &card_table[0]) * card_word_width + bit_position;
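A minimal sketch in C of this formula on a small simulated table is shown below. It assumes card_word_width is 32, since card_table is a uint32_t array; card_number is a hypothetical name used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CARD_WORD_WIDTH 32u /* assumption: bits per card word */

/* Card number = index of the word inside the table * 32 + bit inside the word. */
static size_t card_number(const uint32_t *card_table,
                          const uint32_t *last_card_word,
                          unsigned bit_position)
{
    assert(last_card_word >= card_table);
    assert(bit_position < CARD_WORD_WIDTH);
    return (size_t)(last_card_word - card_table) * CARD_WORD_WIDTH + bit_position;
}
```

The pointer difference counts uint32_t elements, not bytes, which is why it can be multiplied by the word width directly.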

This fixes the range from card to card_end. The CLR then loops through this range to find the generation 2 objects that reference new-generation objects and marks the referenced objects.

Note also that, besides the card_table loop, there is a heap segment loop. The former is the small inner loop and the latter is the large outer loop:

if (seg)
            {
#ifdef BACKGROUND_GC
                should_check_bgc_mark (seg, &consider_bgc_mark_p, &check_current_sweep_p, &check_saved_sweep_p);
#endif //BACKGROUND_GC
                beg = heap_segment_mem (seg);
#ifdef USE_REGIONS
                end = heap_segment_allocated (seg);
#else
                end = compute_next_end (seg, low);
#endif //USE_REGIONS
#ifdef FEATURE_CARD_MARKING_STEALING
                card_word_end = 0;
#else // FEATURE_CARD_MARKING_STEALING
                card_word_end = card_of (align_on_card_word (end)) / card_word_width;
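The relationship between the two loops can be sketched in C as below. This is a simplified model, not the gc.cpp code: segment stands in for a heap segment with only beg and end, base is the address covered by card 0 of a simulated table, and the 256-byte card size and 32-bit card word are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CARD_SIZE       256u
#define CARD_WORD_WIDTH 32u

typedef struct { uint64_t beg, end; } segment; /* hypothetical, not heap_segment */

/* Counts the set cards that fall inside the given segments. */
static size_t count_set_cards(const segment *segs, size_t nsegs,
                              const uint32_t *table, uint64_t base)
{
    size_t found = 0;
    for (size_t s = 0; s < nsegs; s++) {            /* outer loop: heap segments */
        assert(segs[s].beg >= base && segs[s].beg <= segs[s].end);
        uint64_t card     = (segs[s].beg - base) / CARD_SIZE;
        uint64_t end_card = (segs[s].end - base + CARD_SIZE - 1) / CARD_SIZE;
        for (; card < end_card; card++) {           /* inner loop: cards */
            uint32_t word = table[card / CARD_WORD_WIDTH];
            if (word & (1u << (card % CARD_WORD_WIDTH)))
                found++;
        }
    }
    return found;
}
```

The real loop skips empty card words through find_card instead of testing every card, but the nesting is the same: one pass over the cards for each segment.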

5. Mark

How is an object marked once it has been found? Through mark_object_simple:

b mark_object_simple
c
n.....
   24875
-> 24876 o = mark_queue.queue_mark (o);
   24877 if (o != nullptr)
   24878 {
   24879 m_boundary (o);

queue_mark works through the mark queue. If the object has already been marked, there is no need to mark it again.

Check whether n2 is marked:

(lldb) x/8gx 0x00007fbf6cc00028
0x7fbf6cc00028: 0x00007fff7911c471 0x00007fffe6bff8e0
0x7fbf6cc00038: 0x0000000000000000 0x0000000000000000
0x7fbf6cc00048: 0x0000000000000000 0x0000000000000000
0x7fbf6cc00058: 0x0000000000000000 0x0000000000000000

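The lowest bit of the first word (the method table pointer, 0x00007fff7911c471) is set. That bit is the mark bit, so the object has already been marked and does not need to be marked again. This is the overall flow of cross-generation reference handling; the details are worth exploring further.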

References are as follows:

CLR cross-generation tagged memory model

CLR card_table displacement and array

The operation mode of lldb watch card_table (CLR)