Implement a database (DataBase) Go language implementation version from scratch 7. Free list: reuse pages

English source address
Since our B-tree is immutable, each update to the kv store will create new nodes on the path instead of updating the current nodes, making some nodes inaccessible from the latest version. We need to reuse these from the old version Inaccessible nodes, otherwise, the database file will grow infinitely.

Design Free List

In order to reuse these pages, we will add a free list of persistent storage to track unused pages. The update operation reuses pages in the list before adding new pages, and adds unused pages in the current version to the list.
The list is used as a stack (first in, last out), and each update operation can remove or add from the top of the list.

func (fl *FreeList) Total() int {<!-- -->
\t
}

func (fl *FreeList) Get(topn int) uint64 {<!-- -->
\t
}

func (fl *FreeList) Update(popn int, freed []uint64) {<!-- -->
\t
}

Like b-trees, free lists are also immutable. Each node contains:

  1. Multiple pointers to unused pages
  2. link to next node
  3. The total number of items in the list. This only applies to the head node.

    format of node
const BNODE_FREE_LIST = 3
const FREE_LIST_HEADER = 4 + 8 + 8
const FREE_LIST_CAP = (BTREE_PAGE_SIZE - FREE_LIST_HEADER) / 8

Functions to access list nodes

func flnSize(node BNode) int {<!-- -->
\t
}

func flnNext(node BNode) uint64 {<!-- -->
\t
}

func flnPtr(node BNode, idx int) {<!-- -->
\t
}

func flnSetPtr(node BNode, size uint16, next uint64){<!-- -->
\t
}

func flnSetHeader(node BNode, size uint16, next uint64) {<!-- -->
\t
}

func flnSetTotal(node BNode, total uint64) {<!-- -->
\t
}

Data type of free list

The FreeList type contains a pointer to the head node and a callback function for managing disk pages.

type FreeList struct {<!-- -->
head uint64

get func(uint64) BNode
new func(BNode) uint64
use func(uint64, BNode)
}

These callbacks are different from b-trees because the pages used by the list are managed by the list itself.

  1. The new callback function is only used to append new pages, since the free list must reuse pages from itself
  2. There is no del callback function, because the free list adds unused pages to itself
  3. The use callback function registers the pending update operation to the reused page.
type BTree struct {<!-- -->
    // pointer (a nonzero page number)
    root uint64
    // callbacks for managing on-disk pages
    get func(uint64) BNode // dereference a pointer
    new func(BNode) uint64 // allocate a new page
    del func(uint64) // deallocate a page
}

Implementation of free list

Getting the nth item from a list is just a simple list traversal.

func (fl *FreeList) Get(topn int) uint64 {<!-- -->
// assert(0 <= topn & amp; & amp; topn < f1.Total())
node := fl. get(fl. head)
for flnSize(node) <= topn {<!-- -->
topn -= flnSize(node)
next := flnNext(node)
// assert(next != 0)
node = fl. get(next)
}
return flnPtr(node, flnSize(node)-topn-1)
}

Updating the list is a tricky operation, it first removes the popn item from the list, and then adds the freed item to the list, which can be divided into 3 stages:
1. If the head node is larger than popn, remove it. The node itself will be added to the list later. Repeat this step until it no longer satisfies the condition.
2. We may need to remove some items from the list, and may add some items to the list, updating the list header requires a new page, and the new page should be reused from the items in the list itself. Pop from the list one by one Some items, until enough pages are available for reuse in the next stage.
3. Modify the list by adding new nodes.

func (fl *FreeList) Update(popn int, freed []uint64) {<!-- -->
   // assert(popn <= fl. Total())
   if popn == 0 & amp; & amp; len(freed) == 0 {<!-- -->
   return
   }
   total := fl. Total()
   reuse := []uint64{<!-- -->}

   for fl.head != 0 & amp; & amp; len(reuse)*FREE_LIST_CAP < len(freed) {<!-- -->
   node := fl. get(fl. head)
   freed = append(freed, fl. head)
   if popn >= flnSize(node) {<!-- -->
   popn -= flnSize(node)
   } else {<!-- -->
   remain := flnSize(node) - popn
   popn = 0
   for remain > 0 & amp; & amp; len(reuse)*FREE_LIST_CAP < len(freed) + remain {<!-- -->
   remain--
   reuse = append(reuse, flnPtr(node, remain))
   }
   for i := 0; i < remain; i + + {<!-- -->
   freed = append(freed, flnPtr(node, i))
   }
   }
   total -= flnSize(node)
   fl.head = flnNext(node)
   }
   // assert(len(reuse)*FREE_LIST_CAP >= len(freed) || fl.head == 0)
   flPush(fl, freed, reuse)
   flnSetTotal(fl. get(fl. head), uint64(total + len(freed)))
}

func flPush(fl *FreeList, freed []uint64, reuse []uint64) {<!-- -->
   for len(freed) > 0 {<!-- -->
   new := BNode{<!-- -->make([]byte, BTREE_PAGE_SIZE)}
   size := len(freed)
   if size > FREE_LIST_CAP {<!-- -->
   size = FREE_LIST_CAP
   }
   flnSetHeader(new, uint16(size), fl.head)
   for i, ptr := range freed[:size] {<!-- -->
   flnSetPtr(new, i, ptr)
   }
   freed = freed[size:]

   if len(reuse) > 0 {<!-- -->
   fl. head, reuse = reuse[0], reuse[1:]
   fl. use(fl. head, new)
   } else {<!-- -->
   fl.head = fl.new(new)
   }
   }
   // assert(len(reuse) == 0)
}

Manage disk page

Finished modifying the data structure. Temporary pages are kept in a dictionary, mapped by their assigned page numbers. Deleted page numbers are also there.

type KV struct {<!-- -->
Path string
fp *os.File
tree BTree
mmap struct {<!-- -->
file int
total int
chunks [][]byte
}
page struct {<!-- -->
flusheduint64
temp[][]byte
nfree int
nappend int
updates map[uint64][]byte
}
}

Page management of b tree

The pageGet function was also modified to return a temporary page, since the freelist code relies on this behavior.

func (db *KV) pageGet(ptr uint64) BNode {<!-- -->
if page, ok := db.page.updates[ptr]; ok {<!-- -->
// assert(page != nil)
return BNode{<!-- -->page}
}
return pageGetMapped(db, ptr)
}

func pageGetMapped(db *KV, ptr uint64) BNode {<!-- -->
start := uint64(0)
for _, chunk := range db.mmap.chunks {<!-- -->
end := start + uint64(len(chunk))/BTREE_PAGE_SIZE
if ptr < end {<!-- -->
offset := BTREE_PAGE_SIZE * (ptr - start)
return BNode{<!-- -->chunk[offset : offset + BTREE_PAGE_SIZE]}
}
start = end
}
panic("bad ptr")
}

Change the function that allocates b-tree pages to reuse free-list pages first

func (db *KV) pageNew(node BNode) uint64 {<!-- -->
// assert(len(node.data) <= BTREE_PAGE_SIZE)

ptr := uint64(0)
if db.page.nfree < db.free.Total() {<!-- -->
ptr = db.free.Get(db.page.nfree)
\t\t
db.page.nappend++
} else {<!-- -->
ptr = db.page.flushed + uint64(db.page.nappend)
db.page.nappend++
}
db.page.updates[ptr] = node.data
return ptr
}

Deleted pages are marked and later updated to the free list

func (db *KV) pageDel(ptr uint64) {<!-- -->
db.page.updates[ptr] = nil
}

Page management for free lists

func (db *KV) pageAppend(node BNode) uint64 {<!-- -->
// assert(len(node.data) <= BTREE_PAGE_SIZE)
ptr := db.page.flushed + uint64(db.page.nappend)
db.page.nappend++
db.page.updates[ptr] = node.data
return ptr
}

func (db *KV) pageUse(ptr uint64, node BNode) {<!-- -->
db.page.updates[ptr] = node.data
}

Update free list

Before extending the file and writing pages to disk, we must first update the free list because it creates pending write operations

func writePages(db *KV) error {<!-- -->
freed := []uint64{<!-- -->}
for ptr, page := range db.page.updates {<!-- -->
if page == nil {<!-- -->
freed = append(freed, ptr)
}
}
db.free.Update(db.page.nfree, freed)

for ptr, page := range db.page.updates {<!-- -->
if page != nil {<!-- -->
copy(pageGetMapped(db, ptr). data, page)
}
}
return nil
}

A pointer to the head of the list is added to the master page.

Done

The kv store is partially done. It is persistent and crash-resistant, although it can only be accessed sequentially.
There will be more content in the second part of the book (but also for a fee).

  • Relational database on KV storage
  • Support for concurrent access to databases and transactions