Novice to Coin Part 1 - Blockchain Data Structure

Every day new developers are learning just how simple blockchain technologies are. One of the first things you may notice is that the collection of algorithms are profoundly clever, but the mechanics of creating a verifiably immutable chain of data is borderline trivial. If you need proof, take a look at how many introductory blog posts are building a toy blockchain. These all include working implementations of a blockchain. One good reason to implement your own is to prove to yourself just how trivial the data structure is. Certainly, my first reaction to reading up on bitcoins block header was: is this all?

Of course, the answer is no. Creating a blockchain is table-stakes, and from there a series of clever algorithms allow transactions to occur without a central authority. But this structure is the heart of all that follows, so building a blockchain is a great place to start.

Like you may have guessed, a blockchain contains a series of blocks. The smallest useful block contains an arbitrary amount of data and a special hash used to verify the block that comes before it. In Diagram 1 you can see that our data structure resembles a linked list, except that the blocks do not point to each other. Rather, they verify each other. In addition, you'll notice that in order to create a new block, the existing blocks do not need to change.

The hash function used here has several important properties. There are two critical properties of a good hash function. First, for any given input to the hash function, the result must always be the same. Meaning if you and I were to hash the string "hello world", we would both get the same hash. The second critical property is that it is extremely difficult to find two inputs to the hashing function which produce the same output. Bitcoin and many other cryptocurrencies have elected to use the SHA256 hashing function which has these properties as well as several other's which we will not cover.

Therefore it would be extremely difficult to modify Block-0 in Diagram-1 in such a way that Hash(Block-0) in Block-1 would stay the same. In addition, if someone wanted to remove Block-1, the chain would no longer be valid because Block-2 contains Hash(Block-1). Because of this, like links in an actual chain, you cannot remove or change one of the blocks. Block-0 is often referred to as the Genesis Block.

Introducing: LynxCoin

It's a chain of stars, too much?

Our toy implementation will be called LynxCoin, named after the Lynx constellation. Let's begin with the block object. As we learned it should contain a previous hash and some data. In Kotlin, this can be achieved with a data class containing two variables and a hash function. Notice that the hash joins all of the data together before passing it into the hashing function, this ensures that all data is taken into account before computing the hash.


private val blockChain = mutableListOf<ToyBlock>()

/**
 * Add and validate a new block.
 */
fun add(block: ToyBlock) {
  if (blockChain.isNotEmpty()) {
    if (blockChain.last().getHash() != block.lastHash) {
      val message = "Invalid block: wrong lastHash."
      throw IllegalArgumentException(message)
    }
  }

  blockChain.add(block)
}

/**
 * Create and add a new block containing some data.
 */
fun add(data: Any) {
  add(ToyBlock(blockChain.last().getHash(), data))
}

Now tie it all together with our main function, and a helper to view the chain:

fun main(args: Array<String>) {
  add(ToyBlock("0", "Genesis block"))
  add("More data")
  add(LocalDateTime.now())
  printChain()
}

/**
 * Print the blocks.
 */
fun printChain() {
  print("Size: ${blockChain.size}")

  var i = 0
  blockChain.forEach {
    println("\n")
    println("Block ${i++}")
    println("hash: ${it.getHash()}")
    println("last hash: ${it.lastHash}")
    println("data: ${it.data}")
  }
}

Running our program produces the following output:
Size: 3

Block 0
hash: FE81582DB2E60D3933F31981CB920EA74189E9574695C0915401E6B7332FB758
last hash: 0
data: Genesis block

Block 1
hash: 7012C5E83A583F02ADCADAF2A7319CAFE45FBFA8649FE53FAFE9AD444D57AA28
last hash: FE81582DB2E60D3933F31981CB920EA74189E9574695C0915401E6B7332FB758
data: More data

Block 2
hash: BDAE7AA36CCFB6431C8788D0F354935A6E5D77987243FB075AF60EC837B14FE1
last hash: 7012C5E83A583F02ADCADAF2A7319CAFE45FBFA8649FE53FAFE9AD444D57AA28
data: 2018-07-08T15:12:13.268

Notice how the hash of each block matches the last hash in the following block, the hash for Block-0 starts with FE815, and Block-1 has a last hash pointer of the same value. As we continue we'll add additional data and constraints to these hashes which give us some very interesting results.

The code for this project can be found on GitHub.

Next time...

This concludes our introduction to the blockchain data structure. Next time we'll dive deeper and introduce the concepts of currency and transactions. Once transactions are part of the chain we can also investigate transaction validation. With this, we begin to see the underpinnings that will later allow our currency to be decentralized.

Novice to Coin Introduction

In this series of blog posts, the concepts needed to understand and implement the fundamentals of blockchain and cryptocurrency technologies will be introduced. During each post a new cryptocurrency concept will be introduced, then the technical details will be implemented as a practical demonstration of how they work. The currency will be implemented with Kotlin, which I've found to be expressive while maintaining readability. Future iterations of this project may include alternate implementations in other languages.

This series will be broken into several parts:
  1. Data Structure
  2. Currency and Transactions
  3. Decentralization and Mining
  4. Optimization: Merkle Trees
Data Structure
An introduction to the mechanisms that allow us to verify that data in the chain has not been tampered with. We'll begin with a high-level explanation of how it works and move onto a succinct implementation which will serve as a stepping stone for the following posts.

Currency and Transactions
Once familiar with the basic construction of a blockchain, we'll dive deeper and introduce the concepts of currency and transactions. Once transactions are part of the chain we can also investigate transaction validation. With this, we begin to see the underpinnings that will later allow our currency to be decentralized.

Decentralization and Mining
We'll discuss some of the concepts that allow us to create a trustless peer to peer network by introducing the Byzantine Generals Problem and how it is solved by Proof of Work and Proof of Stake. Our toy implementation goes distributed as we create a small web service to submit blocks.

Optimization: Merkle Trees
Now that we have a decentralized cryptocurrency we can begin looking into various optimizations to make them more usable. One of the most important ideas involves the usage of Merkle Tree's, their use allows us to validate transactions in a block without downloading the entire blockchain.

Building a REST Client with Kotlin and Retrofit2

If you regularly work with REST endpoints you're probably used to a lot of tedium. Between carefully constructing the parameters, encoding data in just the right format, efficiently sending the requests, parsing the results and handling errors at every step along the way; there is a lot of room to make mistakes.

For simple APIs, it often makes sense to manually format the request and send it out. Checking a status code, and maybe mapping results to an object aren't that big a deal for a simple API. But what if there are 20 different endpoints? Or you need to manage authorization headers? Eventually, it makes sense to start looking for a wrapper library, or maybe making your own wrapper layer.

That's what this post is about.

For JVM developers, pairing Kotlin, Retrofit2 and Gson make it easy to bang out a type-safe client wrapper.

Let us jump right into the code:

// Create data classes for serializing inputs and outputs.
data class Results<out T>(
    val next: String?,
    val previous: String?,
    val results: List<T>
)

data class CryptoQuote (
    val ask_price: Double,
    val bid_price: Double,
    val mark_price: Double,
    val high_price: Double,
    val low_price: Double,
    val open_price: Double,
    val symbol: String,
    val id: String,
    val volume: Double
)

// Bounds, Intervals, and Spans are enum types.
data class Historical (
    val data_points: List<DataPoint>,
    val bounds: Bounds,
    val interval: Intervals,
    val span: Spans,
    val symbol: String,
    val id: String,
    val open_price: Double?,
    val open_time: Date?,
    val previous_close_price: Double?,
    val previous_close_time: Date?
) {
  // Nested data object used by Historical
  data class DataPoint(
      val begins_at: Date,
      val open_price: Double,
      val close_price: Double,
      val high_price: Double,
      val low_price: Double,
      val volume: Double,
      val session: String,
      val interpolated: Boolean
  )
}

// Define interface using Retrofit annotations and data objects.
interface RobinhoodCryptoQuotesApi {
  /**
   * @param ids comma separated list of ids
   */
  @GET(marketdata/forex/quotes/)
  fun getQuoteIds(
    @Query(ids) ids: String
  ): Call<Results<CryptoQuote>>

  /**
   * @param symbols comma separated list of symbols
   */
  @GET(marketdata/forex/quotes/)
  fun getQuoteSymbols(
    @Query(symbols) symbols: String
  ): Call<Results<CryptoQuote>>

  /**
   * @param symbol id or symbol for a pairing.
   */
  @GET(marketdata/forex/quotes/{symbol}/)
  fun getQuote(
    @Path(symbol) symbol: String
  ): Call<CryptoQuote>

  
  @GET("marketdata/forex/historicals/{id-pair}/")
  fun getHistoricals(
      @Path("id-pair") idPair: String,
      @Query("bounds") bounds: Bounds,
      @Query("spans") spans: Spans,
      @Query("interval") interval: Intervals
  ): Call<Historical>
}

This is a complete listing of the Robinhood market data endpoints. If you aren't familiar with Kotlin, this code is essentially declaring an interface and some POJOs. By sprinkling in some special annotations used by Retrofit2, it can be converted into a concrete object for accessing the endpoints. Errors are even handled by the Call objects which wrap your results.

To go from endpoint definition to concrete object we need to invoke the Retrofit2 builder. While the HttpClient doesn't really support anything besides OkHttp, there are hooks to adapt the serialization and call adapter layers. Here you can see I configure OkHttp to insert a token header, and add Gson and Enum adapters for serialization.


  // Inject the authentication header.
  private class TokenInterceptor(private val token: String) : Interceptor {
    override fun intercept(chain: Interceptor.Chain): Response {
      val newRequest = chain.request().newBuilder()
          .addHeader("Authorization", "Token $token")
          .build()
      return chain.proceed(newRequest)
    }
  }

  // Build a token aware HttpClient
  val tokenHttpClient = OkHttpClient.Builder()
      .addInterceptor(TokenInterceptor("my-token"))
      .build()

  // Build the api
  val api = Retrofit.Builder()
      .addConverterFactory(GsonConverterFactory.create())
      .addConverterFactory(EnumRetrofitConverterFactory())
      .baseUrl("https://api.robinhood.com/")
      .client(tokenHttpClient)
      .build()
      .create(RobinhoodCryptoQuotesApi::class.java)

  // Make a request 
  val response = api.getHistoricals(
      idPair = "3d961844-d360-45fc-989b-f6fca761d511",
      spans = Spans.DAY,
      bounds = Bounds.TWENTY_FOUR_SEVEN,
      interval = Intervals.DAY
  ).execute()

  // Print out some data  
  for(data in response!!.body()!!.data_points) {
    println(data.high_price)
  }

And that's all! In the final code, there was some additional work to make an OAuthTokenInterceptor, but all of that complexity is constrained to the constructor. Defining the endpoints is a matter of typing out the data classes for serialization, naming your methods, and applying the Retrofit2 annotations.

All in all, Retrofit is a very concise way to define a client library in a type-safe way. The code described in this article can be found on GitHub.