Private Blockchain: is it for you?
“Blockchain” is a word that has been on everybody’s lips for the last couple of years. Industries such as FinTech, healthcare, and the public sector, and even individual small companies, are considering its adoption.
Often we equate “Blockchain” with a large public network, like Bitcoin or Ethereum. The first things that come to mind are cryptocurrencies, ICOs, asset tokenization, and gas fees for operating on a network. All of this is true, but it shows the perspective of Public Blockchains only. Besides them, there are Private Blockchains, which give us a different point of view.
Private Blockchains are designed for companies that want more control over their data and more privacy, with fine-grained permission control. Because they are designed differently, they also don’t have the concept of gas fees. The companies interested in such solutions, which we can call network participants, are often willing to host the Blockchain in their own private infrastructure.
Private Blockchains are younger and aimed at companies rather than at the broad spectrum of individual users, so they are often neglected. Since they lack the large, public network aspect, there is a lot of controversy around them. They spark many questions:
- Are Blockchains useful in small networks and when do they actually make sense?
- Do we need Private Blockchain in our case?
- Are Private Blockchains only a useless marketing hype without any actual meaning?
- Can’t we just use classic, old-fashioned but well-tested and adopted solutions to avoid architectural complexity?
Although the topic is really large, complex, and often depends on individual use cases, I’ll try to address these questions and start the initial discussion.
Back to the basics
Before I try to answer any questions, let's deconstruct Blockchain and think which concepts made it so innovative and attractive to business. Such demystification is needed as we tend to treat Blockchain as an unknown magic.
Note that many of these core concepts are shared with public chains. Reviewing them will help us later to assess the usefulness of Private Blockchains and compare them with classic architectural solutions. Let’s start!
First of all: Immutability. In Blockchain, all new entries are packed and stored in blocks, where each subsequent block of the ledger contains a reference to the previous one.
Blockchain data organization is different from that of a classic database. It is a pure transaction log without any mutable add-ons. You can’t simply modify records that were already added, because the Blockchain forms an unmodifiable audit log. Every block is also cryptographically secured.
Thanks to immutability, we can’t accidentally edit history, so a faulty database query run by a new administrator can’t erase our precious production data anymore. In case of a malicious external or internal attack, it’s much harder to corrupt entries, especially older ones. Every change in a single block produces a new block hash, which in turn forces the hash references in all subsequent blocks to change. Unfortunately, immutability makes it harder to comply with the GDPR, but that’s a story for a different article.
Immutability is a powerful concept, but it wouldn’t be complete without data replication. Blockchain at its heart is a distributed system, and every distributed system should be resilient to data loss. This is achieved by replicating data across several machines. If one fails, we can recreate the state from another.
Blockchain replication also has another important aspect. It complements immutability and can protect us from data modification: an attacker must now break into several machines to permanently modify the chain.
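To make the hash-linking concrete, here is a minimal Python sketch. It is an illustration only: the block layout, the helper names (`block_hash`, `build_chain`, `verify_chain`), and the choice of SHA-256 over a JSON encoding are assumptions made for this example, not the format of any particular Blockchain.

```python
import hashlib
import json

GENESIS_PREV_HASH = "0" * 64  # placeholder reference used by the first block


def block_hash(index, prev_hash, entries):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "prev_hash": prev_hash, "entries": entries},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def build_chain(batches):
    """Pack each batch of entries into a block that references its predecessor."""
    chain = []
    prev_hash = GENESIS_PREV_HASH
    for index, entries in enumerate(batches):
        block = {
            "index": index,
            "prev_hash": prev_hash,
            "entries": entries,
            "hash": block_hash(index, prev_hash, entries),
        }
        chain.append(block)
        prev_hash = block["hash"]
    return chain


def verify_chain(chain):
    """Return the index of the first inconsistent block, or None if all blocks match."""
    prev_hash = GENESIS_PREV_HASH
    for block in chain:
        expected = block_hash(block["index"], block["prev_hash"], block["entries"])
        if block["prev_hash"] != prev_hash or block["hash"] != expected:
            return block["index"]
        prev_hash = block["hash"]
    return None


chain = build_chain([["alice pays bob 10"], ["bob pays carol 4"], ["carol pays dave 1"]])
intact = verify_chain(chain)

# Editing an old entry without recomputing the hashes is detected.
chain[0]["entries"][0] = "alice pays bob 1000"
first_broken = verify_chain(chain)
```

In this sketch `intact` is `None` and `first_broken` is `0`. To hide the edit, an attacker would have to recompute the hash of the edited block and of every block after it.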
Moreover, Blockchain is designed for limited trust cases, so its internal communication is Byzantine Fault Tolerant (BFT). A Blockchain can work properly not only when some machines fail, but also when some of them are actively lying and returning incorrect data. Thanks to that, we can detect impostors.
Note that BFT algorithms are not magic and still depend on one basic principle: to detect impostors, we need a clear separation between nodes. Nodes should be deployed in different organizations and clusters. Preferably, each organization should run its own.
All right, so we have an immutable log of operations shared among many participants, which by itself is an interesting combination. Smart Contracts are an even more meaningful concept built on top of that combination. They were introduced in Blockchains such as Ethereum and Hyperledger Fabric. Essentially, a smart contract is a small program containing some logic that must be run by all involved parties. You can think of it as an agreement that must be validated by both sides before the actual transaction, or as a part of business logic that is shared among organizations.
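The detection idea can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a real BFT consensus protocol: it only compares the latest block hash that each node reports and flags the nodes that disagree with a strict majority. The node names and hash values are hypothetical.

```python
from collections import Counter


def find_outliers(reported_heads):
    """Given {node_name: latest_block_hash}, return (majority_hash, suspicious_nodes).

    Returns (None, all node names) when no hash is reported by a strict majority.
    """
    counts = Counter(reported_heads.values())
    top_hash, votes = counts.most_common(1)[0]
    if votes * 2 <= len(reported_heads):
        return None, sorted(reported_heads)
    suspicious = sorted(
        node for node, head in reported_heads.items() if head != top_hash
    )
    return top_hash, suspicious


# Hypothetical nodes, one per organization, each reporting its chain head.
heads = {
    "org-a-node": "9f2c",
    "org-b-node": "9f2c",
    "org-c-node": "9f2c",
    "org-d-node": "0bad",  # a node returning tampered data
}
agreed_hash, suspects = find_outliers(heads)
```

Here `agreed_hash` is `"9f2c"` and `suspects` contains only `"org-d-node"`. The sketch also shows why separation matters: if one party controlled three of the four nodes, it could make any hash the majority answer.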
Agreements have been around for a long time and we know this concept pretty well. If two sides don’t trust each other but must cooperate for common good, they write a basic set of rules and then they validate it on each transaction.
The same goes for Smart Contracts, but since the validation is done automatically, it’s more reliable than human verification and orders of magnitude faster.
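As a rough illustration of that idea, the Python sketch below models a shared rule that every organization evaluates against its own copy of the state, with a transaction committed only when all of them agree. It is not chaincode for Hyperledger Fabric or a Solidity contract; the `Organization` class, the `transfer_rule` function, and the account names are assumptions made for this example.

```python
def transfer_rule(balances, tx):
    """Shared business rule: the amount is positive and covered by the sender's balance."""
    return tx["amount"] > 0 and balances.get(tx["sender"], 0) >= tx["amount"]


class Organization:
    """A participant that keeps its own copy of the state and checks every transaction."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)

    def endorse(self, tx):
        return transfer_rule(self.balances, tx)

    def apply(self, tx):
        self.balances[tx["sender"]] -= tx["amount"]
        self.balances[tx["receiver"]] = (
            self.balances.get(tx["receiver"], 0) + tx["amount"]
        )


def submit(tx, organizations):
    """Commit the transaction only when every organization endorses it."""
    if not all(org.endorse(tx) for org in organizations):
        return False
    for org in organizations:
        org.apply(tx)
    return True


initial = {"alice": 50, "bob": 0}
orgs = [Organization("bank-a", initial), Organization("bank-b", initial)]

accepted = submit({"sender": "alice", "receiver": "bob", "amount": 30}, orgs)
rejected = submit({"sender": "alice", "receiver": "bob", "amount": 30}, orgs)
```

The first transfer is accepted by both organizations. The second is rejected because Alice has only 20 left, and neither side has to rely on the other having checked it.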
Do we really need it all at once?
All right then, we’ve reviewed the foundations that made Blockchain happen. Without them, Blockchains wouldn’t be large, attack-resilient networks working in limited trust cases.
But the question is… do we really need all of this at once? Especially in Private Blockchain cases?
Well… we tend to look at Blockchain as a whole, but often we need only some subset of its concepts.
Immutable data is a rising trend and one of the more tempting and desired Blockchain attributes. Immutability has a lot of great features:
- It naturally forms a transaction/audit log of past transactions.
- It enables additional safety as we are sure that crucial data won’t get accidentally modified.
- Data in the form of a chain is also easier to analyze, as we can revisit past transactions, audit them, and check how they influence the current system state. We can also search for anomalies like large transaction series from suspicious accounts.
- It can also be an additional benefit for clients — they feel safer with their data being always stable and unchangeable.
All of this sounds powerful and useful, but… If Immutability is the only thing that interests us, we don’t need Blockchain at all. If your business is standalone and you won’t ever collaborate with other organizations on sensitive, unmodifiable data or need to report it in real time to some external authority, then you can consider more lightweight solutions.
In my opinion, Immutability gained a lot of traction and public notice thanks to Blockchain, which is a good thing as it adds a lot of value to industry standards, but in many cases, it’s better to cut off some complexity and use something easier than Blockchain.
Apache Kafka — Unchangeable log is the core concept of Kafka
This approach can be especially useful when we are interested in a transaction/operations log from the last week, month, or year, and the older data can be archived (e.g. recorded as a single snapshot in a relational DB, or compressed and stored on Amazon S3, Google Cloud Storage, or similar).
Kafka is generally much faster than Blockchain, as it is built for high throughput. It’s also easier to adopt and to maintain later, and it’s not that hard to find engineers who know it.
Kafka is also highly tunable via dozens of configuration parameters, so we can:
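The "recent log plus snapshot" idea can be sketched in plain Python, independent of Kafka itself. The event shape (`account`, `delta`) and the helper names are assumptions made for this example: older events are folded into a snapshot, and the current state is the snapshot with the remaining events replayed on top.

```python
def apply_event(state, event):
    """Apply one account event to the state in place."""
    account, delta = event["account"], event["delta"]
    state[account] = state.get(account, 0) + delta


def compact(snapshot, events, keep_last):
    """Fold all but the last `keep_last` events into a new snapshot.

    Returns (new_snapshot, remaining_events); the inputs are not modified.
    """
    cut = max(len(events) - keep_last, 0)
    new_snapshot = dict(snapshot)
    for event in events[:cut]:
        apply_event(new_snapshot, event)
    return new_snapshot, events[cut:]


def current_state(snapshot, events):
    """Rebuild the current state by replaying events on top of a snapshot."""
    state = dict(snapshot)
    for event in events:
        apply_event(state, event)
    return state


log = [
    {"account": "alice", "delta": 100},
    {"account": "bob", "delta": 40},
    {"account": "alice", "delta": -30},
    {"account": "bob", "delta": 5},
]
before = current_state({}, log)
snapshot, recent = compact({}, log, keep_last=2)
after = current_state(snapshot, recent)
```

`before` and `after` are equal, so the state is preserved while only two events stay in the log. The trade-off is that the detailed history of the folded events is only available wherever the snapshot or the archive keeps it.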
- decide how long we want to keep our data thanks to the retention policy assigned to individual topics and therefore reduce the storage space required,
- shift between low-latency operation and more batch-oriented operation.
When it comes to data encryption, Kafka’s built-in support covers only TLS for communication, meaning data in transit. This might be a major drawback when storing sensitive data or data covered by the GDPR. In such cases, we can encrypt our data at the disk/operating system level or manually at the Producer/Consumer level, but that requires more work.
If Kafka is not the answer and we want to focus purely on an immutable, never-erased data log that is cryptographically secure, we can use immutable databases.
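The two tuning directions in the list above map onto ordinary Kafka settings. The sketch below only builds the configuration as Python dictionaries; `retention.ms` and `cleanup.policy` are topic-level settings, and `linger.ms`, `batch.size`, and `compression.type` are producer settings. The concrete values are illustrative assumptions, and how the dictionaries are passed to a client or admin tool depends on the library you use.

```python
def days_to_ms(days):
    """Kafka expresses time-based retention in milliseconds."""
    return int(days * 24 * 60 * 60 * 1000)


# Topic-level settings: keep one week of operations, then delete old segments.
topic_config = {
    "cleanup.policy": "delete",
    "retention.ms": str(days_to_ms(7)),
}

# Producer settings leaning towards low latency: do not wait to fill a batch.
low_latency_producer = {
    "linger.ms": "0",
    "batch.size": "16384",
}

# Producer settings leaning towards throughput: wait briefly and send larger,
# compressed batches.
batch_oriented_producer = {
    "linger.ms": "50",
    "batch.size": "262144",
    "compression.type": "lz4",
}
```

A shorter retention period reduces the storage needed per topic. A higher `linger.ms` trades a little latency for fewer, larger requests.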
Amazon Quantum Ledger Database and Immutable storage for Azure Blob are good options if you can tightly couple your system with a service provider and are open to using PaaS solutions. They pretty much work out of the box.
ImmuDB is a good choice if we want to host a database in our own infrastructure. It’s lightweight, fast, has built-in cryptographic proof and verification, and it’s focused solely on immutability. ImmuDB is a good example of an immutable database, but surely not the only one.
We don’t want to lose our data, so we need some kind of replication, right? Well, here the answer is short. If we want only this feature and we’ve heard that Blockchain provides it, then the good news is that every modern distributed system has it. Moreover, if we’re interested only in Replication, not in Immutability, then Blockchain can complicate a lot of things.
Distributed databases such as Cassandra, MongoDB, and CouchDB, and also Apache Kafka, can be used in that case. Since they provide replication mechanisms, we can achieve high availability of data. They are also very tunable, so you can decide how many replicas of your data are needed. In Blockchain, it’s more a case of data everywhere or not on the Blockchain at all. (Well, that’s true only to some extent, as Hyperledger Fabric can actually be tuned a little for that scenario, but that’s a story for a different post.) If you need a relational database, Postgres supports replication too.
If you’re interested in immutable data that is also replicated, don’t worry: immutable databases provide replication as well.
As mentioned earlier, Smart Contracts work like agreements that must be verified by each side before the operation is actually submitted. But there’s no point in using Smart Contracts if both organizations trust each other and don’t need to constantly validate each other’s work.
The same applies if only one organization hosts the whole solution. We won’t have a stable, long-term guarantee that an agreement was fulfilled during some operation, because no external, independent authority validated it. The guarantee can be broken with some effort, as the organization that controls every node can override historical data.
When do we actually need Blockchain?
OK, so we’ve looked critically at many use cases and proposed simpler solutions. What are the use cases for Private Blockchains then?
Private Blockchain makes sense, but one condition must be met: limited trust.
We can fully use all Blockchain concepts if we want to create communication channels between a few parties that don't trust each other and should validate each other's transactions. That way we can benefit from using Smart Contracts and write rules crucial to our business in them.
Otherwise, we risk over-engineering our system.
Results of over-engineering
- Performance decrease
Blockchains are slow. As they operate in limited trust cases, their consensus mechanisms are Byzantine Fault Tolerant, and these are much heavier than the well-known and widely used Crash Fault Tolerant ones.
- High infrastructure costs
Private Blockchains, especially Hyperledger Fabric, consist of a lot of nodes. Besides regular nodes/peers, we have Orderer nodes and Membership Provider services for each organization. Having more services to deploy means more attention in production and probably the use of cluster orchestrators like Kubernetes, central logging and monitoring services, etc.
Not to mention the storage space required for the Ledger itself.
- Development time
Since Blockchain is focused on immutability from day one, it’s much harder to evolve. Systems change to better suit business needs. If you don’t plan for that well, after some time you might have a lot of unneeded data stored on many machines. Before putting anything on a Blockchain, it’s best to think twice and try to predict the future shape of the platform, which slows down feature delivery.
- Development costs
It’s much harder to find an experienced Private Blockchain team. Although writing smart contracts isn’t that hard, deploying them to production requires good knowledge of distributed systems combined with cryptography and network management skills. Because of this complexity, development time will also increase.
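Part of the performance point from the list above can be put in numbers. To tolerate f faulty nodes, classic Crash Fault Tolerant protocols such as Raft need 2f + 1 nodes with a majority quorum of f + 1, while classic Byzantine Fault Tolerant protocols such as PBFT need 3f + 1 nodes with a quorum of 2f + 1. The sketch below only does this arithmetic; it is not tied to any particular Blockchain, and the function names are assumptions made for this example.

```python
def cft_cluster(f):
    """Crash fault tolerance (e.g. Raft): 2f + 1 nodes, majority quorum of f + 1."""
    return {"nodes": 2 * f + 1, "quorum": f + 1}


def bft_cluster(f):
    """Byzantine fault tolerance (e.g. PBFT): 3f + 1 nodes, quorum of 2f + 1."""
    return {"nodes": 3 * f + 1, "quorum": 2 * f + 1}


# Minimum cluster sizes for tolerating 1, 2, and 3 faulty nodes.
sizes = {f: (cft_cluster(f), bft_cluster(f)) for f in (1, 2, 3)}
```

For f = 1 this gives 3 nodes with a quorum of 2 in the crash-tolerant case, and 4 nodes with a quorum of 3 in the Byzantine case. More nodes have to take part in every decision, which is one reason such consensus tends to be slower.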
Last thoughts: immutability as a new standard?
Immutability does not always mean that data is unerasable and unmodifiable. If we are one organization that controls all machines and disks, and always will be, then we can modify the Ledger on all nodes or just delete everything whenever we want. It requires more effort than in a regular database, but it is doable and therefore not 100% bulletproof. In a single organization, Immutability mainly guards us against accidental data modification, and that protection can be achieved without Blockchain.
Nevertheless, Immutability by itself might be a valuable feature and can add value in many industries. It’s not always about 100% unbreakability, but more about adding more guarantees to the system. Let’s say that we have a company specialized in small loans. Modifying the whole ledger just to change a single user’s record might be too costly to be worth it. From a user’s perspective, this might add more trust in the loan provider. It’s not 100% unbreakable, but it is definitely better than the many current systems that have no reliability standards at all.
Sadly, the immutability trend (and hopefully a rising standard) is commonly associated with Blockchain, and as you probably know from the previous part, Blockchain might be overkill. We can achieve immutability more easily with more classic solutions, without adopting Blockchain. We’ll see what the future holds in that matter.
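A small Python sketch shows why a hash-linked ledger controlled by a single party is tamper-evident rather than tamper-proof. The ledger layout and helper names are assumptions made for this example. Whoever controls every copy can rebuild the whole chain from edited records, and the result is internally consistent; only a head hash held by someone independent reveals the rewrite.

```python
import hashlib


def link_hash(prev_hash, record):
    """Hash a record together with the hash of the previous entry."""
    return hashlib.sha256((prev_hash + "|" + record).encode("utf-8")).hexdigest()


def build_ledger(records):
    ledger, prev_hash = [], ""
    for record in records:
        prev_hash = link_hash(prev_hash, record)
        ledger.append({"record": record, "hash": prev_hash})
    return ledger


def is_consistent(ledger):
    """Check that every entry's hash matches its record and its predecessor."""
    prev_hash = ""
    for entry in ledger:
        if entry["hash"] != link_hash(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


original = build_ledger(["loan 1: 500", "loan 2: 300", "loan 3: 900"])

# A copy of the latest hash kept by an independent party.
external_witness = original[-1]["hash"]

# An operator who controls every node rebuilds the chain from edited records.
rewritten = build_ledger(["loan 1: 500", "loan 2: 3", "loan 3: 900"])

passes_internal_check = is_consistent(rewritten)
detected_by_witness = rewritten[-1]["hash"] != external_witness
```

Both `passes_internal_check` and `detected_by_witness` are `True`: the rewritten ledger looks valid on its own, and the change only surfaces when compared with a hash stored outside the organization.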
We’ve covered quite a lot of topics in this article. We talked about concepts behind Blockchain like immutability, replication, and smart contracts, demystified them a bit, and showed that often we need only some subset of Blockchain features.
What are the key takeaways from this article?
- Private Blockchain is extremely useful when many untrusted parties must cooperate on some problems.
- Before actually adopting Blockchain, it’s good to look at our project’s needs and check whether we need only some subset of Blockchain features, which is a common scenario.
- Blockchain is not the answer to all problems, but a specialized tool for limited trust cases.
- Classic architectural solutions often may better address our project needs and guarantee immutability and replication.
- Over-engineering is especially painful in the Blockchain case.