Novice to Coin Part 1 - Blockchain Data Structure

This is the first post in a series building a cryptocurrency from scratch, the series introduction can be found here.

Every day new developers are learning just how simple blockchain technologies are. One of the first things you may notice is that the collection of algorithms are profoundly clever, but the mechanics of creating a verifiably immutable chain of data is borderline trivial. If you need proof, take a look at how many introductory blog posts are building a toy blockchain. These all include working implementations of a blockchain. One good reason to implement your own is to prove to yourself just how trivial the data structure is. Certainly, my first reaction to reading up on bitcoins block header was: is this all?

Of course, the answer is no. Creating a blockchain is table-stakes, and from there a series of clever algorithms allow transactions to occur without a central authority. But this structure is the heart of all that follows, so building a blockchain is a great place to start.

Like you may have guessed, a blockchain contains a series of blocks. The smallest useful block contains an arbitrary amount of data and a special hash used to verify the block that comes before it. In Diagram 1 you can see that our data structure resembles a linked list, except that the blocks do not point to each other. Rather, they verify each other. In addition, you'll notice that in order to create a new block, the existing blocks do not need to change.

The hash function used here has several important properties. There are two critical properties of a good hash function. First, for any given input to the hash function, the result must always be the same. Meaning if you and I were to hash the string "hello world", we would both get the same hash. The second critical property is that it is extremely difficult to find two inputs to the hashing function which produce the same output. Bitcoin and many other cryptocurrencies have elected to use the SHA256 hashing function which has these properties as well as several other's which we will not cover.

Therefore it would be extremely difficult to modify Block-0 in Diagram-1 in such a way that Hash(Block-0) in Block-1 would stay the same. In addition, if someone wanted to remove Block-1, the chain would no longer be valid because Block-2 contains Hash(Block-1). Because of this, like links in an actual chain, you cannot remove or change one of the blocks. Block-0 is often referred to as the Genesis Block.

Introducing: LynxCoin

It's a chain of stars, too much?

Our toy implementation will be called LynxCoin, named after the Lynx constellation. Let's begin with the block object. As we learned it should contain a previous hash and some data. In Kotlin, this can be achieved with a data class containing two variables and a hash function. Notice that the hash joins all of the data together before passing it into the hashing function, this ensures that all data is taken into account before computing the hash.


private val blockChain = mutableListOf<ToyBlock>()

/**
 * Add and validate a new block.
 */
fun add(block: ToyBlock) {
  if (blockChain.isNotEmpty()) {
    if (blockChain.last().getHash() != block.lastHash) {
      val message = "Invalid block: wrong lastHash."
      throw IllegalArgumentException(message)
    }
  }

  blockChain.add(block)
}

/**
 * Create and add a new block containing some data.
 */
fun add(data: Any) {
  add(ToyBlock(blockChain.last().getHash(), data))
}

Now tie it all together with our main function, and a helper to view the chain:

fun main(args: Array<String>) {
  add(ToyBlock("0", "Genesis block"))
  add("More data")
  add(LocalDateTime.now())
  printChain()
}

/**
 * Print the blocks.
 */
fun printChain() {
  print("Size: ${blockChain.size}")

  var i = 0
  blockChain.forEach {
    println("\n")
    println("Block ${i++}")
    println("hash: ${it.getHash()}")
    println("last hash: ${it.lastHash}")
    println("data: ${it.data}")
  }
}

Running our program produces the following output:
Size: 3

Block 0
hash: FE81582DB2E60D3933F31981CB920EA74189E9574695C0915401E6B7332FB758
last hash: 0
data: Genesis block

Block 1
hash: 7012C5E83A583F02ADCADAF2A7319CAFE45FBFA8649FE53FAFE9AD444D57AA28
last hash: FE81582DB2E60D3933F31981CB920EA74189E9574695C0915401E6B7332FB758
data: More data

Block 2
hash: BDAE7AA36CCFB6431C8788D0F354935A6E5D77987243FB075AF60EC837B14FE1
last hash: 7012C5E83A583F02ADCADAF2A7319CAFE45FBFA8649FE53FAFE9AD444D57AA28
data: 2018-07-08T15:12:13.268

Notice how the hash of each block matches the last hash in the following block, the hash for Block-0 starts with FE815, and Block-1 has the same value for last hash. As we continue we'll add additional data and constraints to these hashes which give us some very interesting results.

The code for this project can be found on GitHub.

Next time...

This concludes our introduction to the blockchain data structure. Next time we'll dive deeper and introduce the concepts of currency and transactions. Once transactions are part of the chain we can also investigate transaction validation. With this, we begin to see the underpinnings that will later allow our currency to be decentralized.

When you're ready, the next article is here.



Comments

Popular posts from this blog

Decommissioning the cloud: A self-hosted future

Using a JFileChooser to browse AWS S3

Taking Stuff Apart: Realistic Modulette-8