Building a Blockchain in Go. Part 3: Persistence and Command-Line Interface

Introduction

So far, we have built a blockchain with a proof-of-work system, which makes mining possible. Our implementation is getting closer to a fully functional blockchain, but it still lacks some important features. Today we will start storing the blockchain in a database, and after that we will build a simple command-line interface to perform operations on the blockchain. In essence, a blockchain is a distributed database. For now, we will ignore the ‘distributed’ part and focus on the ‘database’ part.

Database selection

Currently, there is no database in our implementation; instead, we create the blocks every time we run the program and keep them in memory. We cannot reuse the blockchain and we cannot share it with others, so we need to store it on disk.
Which database do we need? Actually, any of them will do. The original Bitcoin paper says nothing about using a particular database, so the choice is up to the developer. Bitcoin Core, which was initially published by Satoshi Nakamoto and is currently the reference implementation of Bitcoin, uses LevelDB (although it was introduced to the client only in 2012). And we will be using…

BoltDB

Because:

It is simple and minimal.
It is implemented in Go.
It doesn’t require running a server.
It allows us to build the data structure we want.

From BoltDB’s README on GitHub:

Bolt is a pure Go key/value store inspired by Howard Chu’s LMDB project. The goal of the project is to provide a simple, fast, and reliable database for projects that don’t require a full database server such as Postgres or MySQL.
Since Bolt is meant to be used as such a low-level piece of functionality, simplicity is key. The API will be small and only focus on getting values and setting values. That’s it.

Sounds like a perfect fit for our needs! Let’s take a moment to review it.
BoltDB is a key-value store, which means there are no tables as in SQL RDBMSs (MySQL, PostgreSQL, etc.), no rows, and no columns. Instead, data is stored as key-value pairs (as in a Go map). Key-value pairs are stored in buckets, which group similar pairs (much like tables in an RDBMS). So, in order to get a value, you need to know a bucket and a key.
An important property of BoltDB is that there are no data types: keys and values are byte arrays. Since we will store Go structs (specifically, Block) in it, we need to serialize them, that is, implement a way to convert a Go struct into a byte array and to restore it from a byte array. We will use encoding/gob for this, although JSON, XML, Protocol Buffers, etc. would work as well. We choose encoding/gob because it is simple and is part of the Go standard library.
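
To make the bucket-and-key model concrete, here is a minimal in-memory sketch. It is only an analogy: the store type and its put and get helpers are hypothetical names invented for this illustration and are not part of BoltDB’s API. They mimic the idea that a value is addressed by a bucket name plus a key, and that keys and values are plain bytes.

```go
package main

import "fmt"

// store is a hypothetical stand-in for a key-value database:
// bucket name -> key -> value. Keys are converted to string only
// because Go maps cannot be keyed by []byte.
type store map[string]map[string][]byte

// put saves value under key inside the named bucket,
// creating the bucket on first use.
func (s store) put(bucket string, key, value []byte) {
	if s[bucket] == nil {
		s[bucket] = make(map[string][]byte)
	}
	s[bucket][string(key)] = value
}

// get returns the value stored under key in the named bucket
// and reports whether it was found.
func (s store) get(bucket string, key []byte) ([]byte, bool) {
	v, ok := s[bucket][string(key)]
	return v, ok
}

func main() {
	s := store{}
	s.put("blocks", []byte("l"), []byte{0xde, 0xad})

	v, ok := s.get("blocks", []byte("l"))
	fmt.Printf("%x %v\n", v, ok)

	// Knowing the key is not enough: the bucket has to match too.
	_, ok = s.get("other", []byte("l"))
	fmt.Println(ok)
}
```

The second lookup fails even though the key exists, because it names a different bucket.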

Database structure

Before starting to implement the persistence logic, we first need to decide how to store data in the DB. For this, we will refer to the implementation of Bitcoin Core.
In short, Bitcoin Core uses two ‘buckets’ to store data:

blocks, which stores metadata describing all the blocks in the chain
chainstate, which stores the state of the chain, that is, all currently unspent transaction outputs and some metadata

Also, blocks are stored as separate files on disk. This is done for performance reasons: reading a single block does not require loading all (or some) of them into memory. We will not implement this.
In blocks, the key -> value pairs are:

‘b’ + 32-byte block hash -> block index record
‘f’ + 4-byte file number -> file information record
‘l’ -> 4-byte file number: the last block file number used
‘R’ -> 1-byte boolean: whether we are in the process of reindexing
‘F’ + 1-byte flag name length + flag name string -> 1-byte boolean: various flags that can be on or off
‘t’ + 32-byte transaction hash -> transaction index record

In chainstate, key->value pairs are:

‘c’ + 32-byte transaction hash -> unspent transaction output record for that transaction
‘B’ -> 32-byte block hash: the block hash up to which the database represents the unspent transaction outputs
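
As a rough illustration of the ‘prefix + hash’ key shape listed above, the following sketch builds such a key from a one-byte record type and a 32-byte hash. The prefixedKey helper is a hypothetical name made up for this example; it is not taken from Bitcoin Core, and the hash is a placeholder.

```go
package main

import "fmt"

// prefixedKey builds a key that consists of a one-byte record type
// followed by a 32-byte hash, like the 'b' + hash and 'c' + hash
// layouts listed above. This is an illustrative helper only.
func prefixedKey(prefix byte, hash [32]byte) []byte {
	key := make([]byte, 0, 1+len(hash))
	key = append(key, prefix)
	key = append(key, hash[:]...)
	return key
}

func main() {
	var hash [32]byte // placeholder hash, not a real block hash
	hash[0] = 0xab

	key := prefixedKey('b', hash)
	fmt.Println(len(key)) // 1 prefix byte + 32 hash bytes
	fmt.Printf("%c %x\n", key[0], key[1])
}
```

Because the record type is the first byte, records of different kinds can share one key space without colliding.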

A detailed description can be found here

Since we don’t have transactions yet, we will only have the blocks bucket. Also, as mentioned above, we will store the entire DB as a single file, without storing blocks in separate files, so we don’t need anything related to file numbers. These are the key -> value pairs we will use:

32-byte block hash -> Block structure (serialized)
‘l’ -> the hash of the last block in the chain

That’s all you need to know to start implementing a persistence mechanism.
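
The two kinds of keys can be pictured with a small in-memory sketch. The bucket type and the putBlock helper are hypothetical stand-ins for illustration only, and the one-byte hashes are placeholders (real hashes are 32 bytes, so they can never be confused with the one-byte l key).

```go
package main

import "fmt"

// tipKey is the special key holding the hash of the last block.
const tipKey = "l"

// bucket is a hypothetical in-memory stand-in for the blocks bucket.
type bucket map[string][]byte

// putBlock stores the serialized block under its hash
// and then moves the tip to that hash.
func putBlock(b bucket, hash, serialized []byte) {
	b[string(hash)] = serialized
	b[tipKey] = hash
}

func main() {
	b := bucket{}
	putBlock(b, []byte{0x01}, []byte("genesis"))
	putBlock(b, []byte{0x02}, []byte("second"))

	// Reading the chain always starts from the l key.
	tip := b[tipKey]
	fmt.Printf("tip=%x block=%s\n", tip, b[string(tip)])
}
```

Every write touches two keys: one that adds the block and one that overwrites l, so l always names the newest block.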

Serialization

As mentioned earlier, in BoltDB, values can only be of type []byte, and we want to store Block structures in DB. We will use encoding/gob to serialize these structures.
Let’s implement the Serialize method of Block (error handling is kept minimal for brevity):

// Serialize encodes the block with encoding/gob.
// It needs the "bytes" and "encoding/gob" packages.
func (b *Block) Serialize() []byte {
	var result bytes.Buffer
	encoder := gob.NewEncoder(&result)

	err := encoder.Encode(b)
	if err != nil {
		return nil
	}

	return result.Bytes()
}

This part is simple: first, we declare a buffer that will hold the serialized data; then we initialize a gob encoder and encode the block; the result is returned as a byte array.
Next, we need a deserialization function that takes a byte array as input and returns a Block. This will not be a method but a standalone function:

// DeserializeBlock decodes a gob-encoded block.
// The decoding error is ignored here for brevity.
func DeserializeBlock(d []byte) *Block {
	var block Block

	decoder := gob.NewDecoder(bytes.NewReader(d))
	_ = decoder.Decode(&block)
	return &block
}

That’s all there is to serialization!
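
A quick way to convince yourself that the pair works is a round trip. The sketch below is self-contained, so it declares its own Block type; the field list is an assumption based on the earlier parts of this series, and the serialize and deserialize helpers are variants that return errors instead of discarding them.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
	"log"
)

// Block is an assumed, simplified layout. gob only encodes
// exported fields, so every field name starts with a capital letter.
type Block struct {
	Timestamp     int64
	Data          []byte
	PrevBlockHash []byte
	Hash          []byte
	Nonce         int
}

// serialize gob-encodes a block into a byte slice.
func serialize(b *Block) ([]byte, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(b); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

// deserialize restores a block from gob-encoded bytes.
func deserialize(d []byte) (*Block, error) {
	var b Block
	if err := gob.NewDecoder(bytes.NewReader(d)).Decode(&b); err != nil {
		return nil, err
	}
	return &b, nil
}

func main() {
	original := &Block{Timestamp: 1, Data: []byte("Genesis Block"), Hash: []byte{0xaa}, Nonce: 42}

	encoded, err := serialize(original)
	if err != nil {
		log.Fatal(err)
	}
	decoded, err := deserialize(encoded)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("%s %d %x\n", decoded.Data, decoded.Nonce, decoded.Hash)
}
```

Returning the error rather than dropping it makes corrupted or empty input visible to the caller, which is worth considering once the data comes from disk.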

Persistence

Let’s start with the NewBlockchain function. Currently, it creates a new Blockchain instance and adds the genesis block to it. What we want it to do is:

Open a DB file.
Check whether a blockchain is stored in it.
If there is a blockchain:
1. Create a new Blockchain instance.
2. Set the tip of the Blockchain instance to the hash of the last block stored in the DB.
If there is no existing blockchain:
1. Create the genesis block.
2. Store it in the DB.
3. Save the hash of the genesis block as the last block hash.
4. Create a new Blockchain instance with its tip pointing at the genesis block.

In code, it looks like this:

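// NewBlockchain opens the DB and creates the genesis block if needed.
// It needs the "log" package in addition to bolt.
func NewBlockchain() *Blockchain {
	var tip []byte
	db, err := bolt.Open(dbFile, 0600, nil)
	if err != nil {
		log.Panic(err)
	}

	err = db.Update(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte(blocksBucket))

		if b == nil {
			// No blockchain yet: create the bucket and store the genesis block.
			genesis := NewGenesisBlock()
			b, err := tx.CreateBucket([]byte(blocksBucket))
			if err != nil {
				return err
			}
			if err := b.Put(genesis.Hash, genesis.Serialize()); err != nil {
				return err
			}
			if err := b.Put([]byte("l"), genesis.Hash); err != nil {
				return err
			}
			tip = genesis.Hash
		} else {
			// A blockchain exists: read the hash of its last block.
			tip = b.Get([]byte("l"))
		}
		return nil
	})
	if err != nil {
		log.Panic(err)
	}

	bc := Blockchain{tip, db}
	return &bc
}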

Let’s look at it piece by piece.

db, err := bolt.Open(dbFile, 0600, nil)

This is the standard way of opening a BoltDB file. Note that it will not return an error if there is no such file.

err = db.Update(func(tx *bolt.Tx) error {
	...
})

In BoltDB, operations on a database run within a transaction. There are two types of transactions: read-only and read-write. Here, we open a read-write transaction (db.Update(...)), because we expect to put the genesis block into the DB.
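
The difference between the two transaction types can be pictured with a toy model. The sketch below is not BoltDB and does not reflect how BoltDB is implemented; toyDB, its View and Update methods, and errFailed are hypothetical names made up for this illustration. It only imitates two behaviours: read-only access through View, and a read-write Update whose changes are kept when the function returns nil and discarded when it returns an error.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errFailed is a sample error used to trigger a rollback.
var errFailed = errors.New("something failed")

// toyDB is a hypothetical in-memory stand-in, not BoltDB itself.
type toyDB struct {
	mu   sync.RWMutex
	data map[string]string
}

// View runs fn with shared read access to the data.
// Unlike a real database, this toy does not stop fn from writing.
func (db *toyDB) View(fn func(data map[string]string) error) error {
	db.mu.RLock()
	defer db.mu.RUnlock()
	return fn(db.data)
}

// Update runs fn on a private copy and commits the copy
// only if fn returns nil.
func (db *toyDB) Update(fn func(data map[string]string) error) error {
	db.mu.Lock()
	defer db.mu.Unlock()

	working := make(map[string]string, len(db.data))
	for k, v := range db.data {
		working[k] = v
	}
	if err := fn(working); err != nil {
		return err // roll back: the copy is simply discarded
	}
	db.data = working
	return nil
}

func main() {
	db := &toyDB{data: map[string]string{}}

	_ = db.Update(func(d map[string]string) error {
		d["l"] = "genesis-hash"
		return nil
	})
	err := db.Update(func(d map[string]string) error {
		d["l"] = "broken"
		return errFailed
	})
	fmt.Println(err)

	_ = db.View(func(d map[string]string) error {
		fmt.Println(d["l"])
		return nil
	})
}
```

The failed update leaves the earlier value in place, which is the property that makes it safe to write the block and the l key in one transaction.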

b := tx.Bucket([]byte(blocksBucket))

if b == nil {
	genesis := NewGenesisBlock()
	b, err := tx.CreateBucket([]byte(blocksBucket))
	err = b.Put(genesis.Hash, genesis.Serialize())
	err = b.Put([]byte("l"), genesis.Hash)
	tip = genesis.Hash
} else {
	tip = b.Get([]byte("l"))
}

This is the core of the function. Here, we obtain the bucket that stores our blocks: if it exists, we read the l key from it; if it doesn’t exist, we generate the genesis block, create the bucket, save the block into it, and update the l key, which stores the hash of the last block in the chain.
Also, notice the new way of creating a Blockchain:

bc := Blockchain{tip, db}

We no longer store all the blocks in it; only the tip of the chain is stored. We also store a DB connection, because we want to open it once and keep it open while the program is running. Thus, the Blockchain structure now looks like this:

type Blockchain struct {
	tip []byte
	db  *bolt.DB
}

The next method we want to update is AddBlock: Adding a block to a chain is now not as simple as adding an element to an array. From now on, we will store blocks in the database:

func (bc *Blockchain) AddBlock(data string) {
	var lastHash []byte

	// Read the hash of the last block (read-only transaction).
	_ = bc.db.View(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte(blocksBucket))
		lastHash = b.Get([]byte("l"))

		return nil
	})

	newBlock := NewBlock(data, lastHash)

	// Store the new block and move the tip (read-write transaction).
	_ = bc.db.Update(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte(blocksBucket))
		_ = b.Put(newBlock.Hash, newBlock.Serialize())
		_ = b.Put([]byte("l"), newBlock.Hash)
		bc.tip = newBlock.Hash

		return nil
	})
}

Let’s review it piece by piece:

err := bc.db.View(func(tx *bolt.Tx) error {
	b := tx.Bucket([]byte(blocksBucket))
	lastHash = b.Get([]byte("l"))

	return nil
})

This is another (read-only) type of BoltDB transaction. Here, we get the last block hash from the database and use it to mine the new block hash.

newBlock := NewBlock(data, lastHash)
b := tx.Bucket([]byte(blocksBucket))
err := b.Put(newBlock.Hash, newBlock.Serialize())
err = b.Put([]byte("l"), newBlock.Hash)
bc.tip = newBlock.Hash

After mining a new block, we save its serialized representation into the DB and update the l key, which now stores the hash of the new block.
Done! It’s not that hard, right?

Check the blockchain

All new blocks are now saved in the database, so we can reopen a blockchain and add new blocks to it. But after implementing this, we lost a nice feature: we can no longer print the blocks of the blockchain, because we don’t store them in an array anymore. Let’s fix this flaw!
BoltDB allows iterating over all the keys in a bucket, but the keys are stored in byte-sorted order, and we want the blocks to be printed in the order they appear in the blockchain. Also, because we don’t want to load all the blocks into memory (our blockchain DB could be huge, or let’s just pretend it could), we will read them one by one. For this purpose, we need a blockchain iterator:

type BlockchainIterator struct {
	currentHash []byte
	db          *bolt.DB
}

An iterator is created each time we want to iterate over the blocks in a blockchain, and it stores the hash of the current block and a connection to the DB. Because of the latter, an iterator is logically attached to a blockchain (it is the Blockchain instance that holds the DB connection) and is therefore created in a Blockchain method:

func (bc *Blockchain) Iterator() *BlockchainIterator {
	bci := &BlockchainIterator{bc.tip, bc.db}

	return bci
}

Note that the iterator initially points at the tip of the blockchain, so blocks are obtained from top to bottom, from the newest to the oldest. In fact, choosing a tip means ‘voting’ for a blockchain. A blockchain can have multiple branches, and the longest of them is considered the main one. After getting a tip (it can be any block in the blockchain), we can reconstruct the whole chain and find its length and the work required to build it. This also means that a tip is a kind of identifier of a blockchain.
BlockchainIterator does only one thing: it returns the next block from the blockchain.

func (i *BlockchainIterator) Next() *Block {
	var block *Block

	_ = i.db.View(func(tx *bolt.Tx) error {
		b := tx.Bucket([]byte(blocksBucket))
		encodedBlock := b.Get(i.currentHash)
		block = DeserializeBlock(encodedBlock)

		return nil
	})

	// Step back to the parent block for the next call.
	i.currentHash = block.PrevBlockHash
	return block
}
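
The same walk can be tried without a database. The sketch below replaces the bucket with a plain Go map and uses short strings in place of real hashes; the block, iterator, and walk names are hypothetical and exist only for this illustration. It shows how following the previous-hash links from the tip reconstructs the chain from the newest block to the oldest.

```go
package main

import "fmt"

// block is a pared-down, hypothetical block: its data, its hash,
// and a link to its parent.
type block struct {
	Data     string
	Hash     string
	PrevHash string // empty for the genesis block
}

// iterator walks a hash-indexed store from the tip
// back to the genesis block.
type iterator struct {
	currentHash string
	blocks      map[string]block
}

// next returns the current block and steps to its parent.
func (it *iterator) next() block {
	b := it.blocks[it.currentHash]
	it.currentHash = b.PrevHash
	return b
}

// walk collects block data starting at tip and stops
// after the block that has no parent.
func walk(blocks map[string]block, tip string) []string {
	var out []string
	it := &iterator{currentHash: tip, blocks: blocks}
	for {
		b := it.next()
		out = append(out, b.Data)
		if b.PrevHash == "" {
			break
		}
	}
	return out
}

func main() {
	blocks := map[string]block{
		"h1": {Data: "Genesis Block", Hash: "h1"},
		"h2": {Data: "Send 1 BTC to Ivan", Hash: "h2", PrevHash: "h1"},
		"h3": {Data: "Pay 0.31337 BTC for a coffee", Hash: "h3", PrevHash: "h2"},
	}
	for _, data := range walk(blocks, "h3") {
		fmt.Println(data)
	}
}
```

Counting the elements returned by walk gives the length of the chain identified by that tip.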

That’s the DB part!

CLI

So far, our implementation has not provided any interface for interacting with the program: we have simply executed NewBlockchain and bc.AddBlock in the main function. Time to improve this! We want to have these commands:

blockchain_go addblock -data "Pay 0.031337 for a coffee"
blockchain_go printchain

All command-line related operations will be handled by the CLI struct:

type CLI struct {
	bc *Blockchain
}

Its entry point is the Run function:

func (cli *CLI) Run() {
	cli.validateArgs()

	addBlockCmd := flag.NewFlagSet("addblock", flag.ExitOnError)
	printChainCmd := flag.NewFlagSet("printchain", flag.ExitOnError)

	addBlockData := addBlockCmd.String("data", "", "Block data")

	switch os.Args[1] {
	case "addblock":
		_ = addBlockCmd.Parse(os.Args[2:])
	case "printchain":
		_ = printChainCmd.Parse(os.Args[2:])
	default:
		cli.printUsage()
		os.Exit(1)
	}

	if addBlockCmd.Parsed() {
		if *addBlockData == "" {
			addBlockCmd.Usage()
			os.Exit(1)
		}
		cli.addBlock(*addBlockData)
	}

	if printChainCmd.Parsed() {
		cli.printChain()
	}
}

We use the flag package in the standard library to parse command line arguments.

addBlockCmd := flag.NewFlagSet("addblock", flag.ExitOnError)
printChainCmd := flag.NewFlagSet("printchain", flag.ExitOnError)
addBlockData := addBlockCmd.String("data", "", "Block data")

First, we create two subcommands, addblock and printchain, and add the -data flag to the former. printchain doesn’t have any flags.

switch os.Args[1] {
case "addblock":
	err := addBlockCmd.Parse(os.Args[2:])
case "printchain":
	err := printChainCmd.Parse(os.Args[2:])
default:
	cli.printUsage()
	os.Exit(1)
}

Next, we validate the user-supplied command and parse the associated flag subcommand.

if addBlockCmd.Parsed() {
	if *addBlockData == "" {
		addBlockCmd.Usage()
		os.Exit(1)
	}
	cli.addBlock(*addBlockData)
}

if printChainCmd.Parsed() {
	cli.printChain()
}

Next, we check which subcommands are parsed and run the associated functions.
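
To experiment with the subcommand pattern on its own, the parsing can be pulled into a function that takes the argument list as a parameter. This is a sketch under a few assumptions: parseCommand is a hypothetical helper that does not exist in the article’s code, it uses flag.ContinueOnError instead of flag.ExitOnError so that failures come back as errors, and it silences the flag package’s own usage output.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
	"io"
)

// parseCommand is a hypothetical helper: it takes the arguments that
// follow the program name and returns the subcommand together with
// the value of the -data flag.
func parseCommand(args []string) (cmd, data string, err error) {
	if len(args) < 1 {
		return "", "", errors.New("missing subcommand")
	}

	addBlockCmd := flag.NewFlagSet("addblock", flag.ContinueOnError)
	addBlockCmd.SetOutput(io.Discard) // keep the sketch quiet on bad input
	addBlockData := addBlockCmd.String("data", "", "Block data")

	printChainCmd := flag.NewFlagSet("printchain", flag.ContinueOnError)
	printChainCmd.SetOutput(io.Discard)

	switch args[0] {
	case "addblock":
		if err := addBlockCmd.Parse(args[1:]); err != nil {
			return "", "", err
		}
		if *addBlockData == "" {
			return "", "", errors.New("addblock requires -data")
		}
		return "addblock", *addBlockData, nil
	case "printchain":
		if err := printChainCmd.Parse(args[1:]); err != nil {
			return "", "", err
		}
		return "printchain", "", nil
	default:
		return "", "", fmt.Errorf("unknown subcommand %q", args[0])
	}
}

func main() {
	// A fixed argument list stands in for os.Args[1:].
	cmd, data, err := parseCommand([]string{"addblock", "-data", "Send 1 BTC to Ivan"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(cmd, data)
}
```

In a real program the call would receive os.Args[1:]. Each subcommand owns its flag set, so -data is only recognised after addblock.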

func (cli *CLI) addBlock(data string) {
	cli.bc.AddBlock(data)
	fmt.Println("Success!")
}

func (cli *CLI) printChain() {
	bci := cli.bc.Iterator()

	for {
		block := bci.Next()

		fmt.Printf("Prev. hash: %x\n", block.PrevBlockHash)
		fmt.Printf("Data: %s\n", block.Data)
		fmt.Printf("Hash: %x\n", block.Hash)
		pow := NewProofOfWork(block)
		fmt.Printf("PoW: %s\n", strconv.FormatBool(pow.Validate()))
		fmt.Println()

		// The genesis block has no previous hash: stop there.
		if len(block.PrevBlockHash) == 0 {
			break
		}
	}
}

This part is very similar to the part we wrote earlier. The only difference is that we now use a BlockchainIterator to iterate over the blocks in the blockchain.
Let’s also not forget to modify the main function accordingly:

func main() {
	bc := NewBlockchain()

	// Close the DB when the program finishes.
	defer bc.db.Close()

	cli := CLI{bc}
	cli.Run()
}

Note that a Blockchain instance is created (and the DB file is opened) no matter which command-line arguments are provided.
That’s it! Let’s check that everything works:

$ ./main printchain
Prev. hash:
Data: Genesis Block
Hash: 000000af3d84a4798bfe67ced5cd779e63bad34351cf7d5624db731d9a88d55c
PoW: true

$ ./main addblock -data "Send 1 BTC to Ivan"
Mining the block containing "Send 1 BTC to Ivan"
000000a14750159cf16bd4e80ea05a2f75818bdfefd17de2c4f7a222fbd5ba1f

Success!

$ ./main addblock -data "Pay 0.31337 BTC for a coffee"
Mining the block containing "Pay 0.31337 BTC for a coffee"
000000b2ddc5a69a640004db0370476a78012b0ad5d6c8e57d53cef4c6cba8c8

Success!

$ ./main printchain
Prev. hash: 000000a14750159cf16bd4e80ea05a2f75818bdfefd17de2c4f7a222fbd5ba1f
Data: Pay 0.31337 BTC for a coffee
Hash: 000000b2ddc5a69a640004db0370476a78012b0ad5d6c8e57d53cef4c6cba8c8
PoW: true

Prev. hash: 000000af3d84a4798bfe67ced5cd779e63bad34351cf7d5624db731d9a88d55c
Data: Send 1 BTC to Ivan
Hash: 000000a14750159cf16bd4e80ea05a2f75818bdfefd17de2c4f7a222fbd5ba1f
PoW: true

Prev. hash:
Data: Genesis Block
Hash: 000000af3d84a4798bfe67ced5cd779e63bad34351cf7d5624db731d9a88d55c
PoW: true

(sound of beer cans being opened)

Summary

Next time we will implement: addresses, wallets and (possibly) transactions. Stay tuned!

Links
Full source codes
Bitcoin Core data storage
boltdb
encoding/gob
flag