What's Interesting About TigerBeetle?

TigerBeetle: a fixed-schema, performance-oriented, replicated, highly available financial database. This more or less sums up what the project is about. Still, TigerBeetle generates a seemingly disproportionate amount of attention. What's the fuss? What is interesting about TigerBeetle?
Let's take a closer look!
TigerBeetle from an SQL viewpoint
We'll start with a rather traditional perspective. Most of you probably have a solid background in SQL, and relational databases determine the way we assess other database systems (even if something is not a relational database, we say it's "NoSQL").
Unlike a relational database, TigerBeetle has a fixed schema, with only three "entity types": ledgers, accounts, and transfers. Each entity type has a fixed number of columns of fixed types. The schema can be visualised as an ER diagram:

This schema supports a well-known accounting concept: double-entry bookkeeping. It has been around since the Middle Ages, and operates on a simple principle: every transfer always concerns two accounts, with one debit and one credit. That is, money (or whatever resource is tracked) never disappears, and never appears out of thin air.
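The core invariant of double-entry bookkeeping can be sketched in a few lines of Python. This is a toy model, not TigerBeetle's actual data structures; the account fields mirror the idea that every transfer debits one account and credits another by the same amount:

```python
# Toy double-entry ledger: every transfer touches exactly two accounts.
from dataclasses import dataclass

@dataclass
class Account:
    debits: int = 0
    credits: int = 0

def apply_transfer(accounts, debit_id, credit_id, amount):
    """One debit, one credit, same amount: value is moved, never created."""
    accounts[debit_id].debits += amount
    accounts[credit_id].credits += amount

accounts = {"cash": Account(), "revenue": Account()}
apply_transfer(accounts, "cash", "revenue", 100)

# The invariant: total debits always equal total credits across the ledger,
# so money never disappears and never appears out of thin air.
total_debits = sum(a.debits for a in accounts.values())
total_credits = sum(a.credits for a in accounts.values())
assert total_debits == total_credits == 100
```

However many transfers you apply, the two totals stay equal, which is exactly the "never disappears, never appears out of thin air" property.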
If you haven't worked in the accounting domain, the terminology might be unfamiliar. That's why TigerBeetle prepared a primer to get you up to speed with accounting 101. It's interesting and educational in itself, even if you're not that interested in TigerBeetle.
What's OLTP?
TigerBeetle positions itself as an OLTP database (online transaction processing), which traditionally is the space occupied by SQL databases. OLTP is contrasted with OLAP (online analytical processing, e.g., ClickHouse).
However, TigerBeetle introduces further differentiation. First, there's OLGP (online general purpose), which is the category of PostgreSQL, MySQL etc.; while "true" OLTP, handling real-time transactions, is where TigerBeetle falls.
Whether a database is OLGP or OLTP is also reflected in typical usage patterns. With OLGP databases, reads most often dominate writes. In OLTP, you'll most often create new transfers. Write-heavy workloads are what TigerBeetle is optimized for.
Consistency
When working with a SQL database, we have a range of isolation levels to choose from (repeatable read, snapshot isolation, etc.), which also means that we implicitly choose which "phenomena" we might encounter (such as phantom read, lost updates, etc.).
With TigerBeetle, the story is much simpler: you get strict serializability, and there are no other options. That is, transactions are always executed serially, one after another, where each transaction fully observes all effects of the previous ones.
Data operations: inserting & querying
When interacting with TigerBeetle, you don't use SQL or any dedicated query language, but an API. There are client libraries for the most popular programming ecosystems. You might also just use the bundled CLI client.
All data in TigerBeetle is immutable. Hence, the only operations available are inserts (and queries). There are no UPDATEs, DELETEs, or ALTER TABLEs, as the schema is fixed.
Inserting includes three features that are crucial for many of TigerBeetle's use cases.
First, you can require that the account balance never falls below 0. If a transfer would violate that, it's rejected. Since all transfers are applied to the current state serially, when a transfer is accepted, you can be certain the account balance allows it.
Secondly, transfers can be linked in a chain. TigerBeetle guarantees that either all transfers in a chain will be accepted, or none will. The transfers in the chain might involve the same or completely distinct accounts from various ledgers.
Thirdly, you can create "pending", two-phase transfers, which simply reserve some amount; such a transfer can have an attached timer, given in seconds from creating the reservation. A pending transfer can then be posted (finalized) or voided. Credit card pre-authorisations or timed seat reservations are good intuitions on how this feature works.
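The first two of these guarantees can be sketched in plain Python. This is an illustrative model, not the real TigerBeetle API (names like `create_transfers` and the balance bookkeeping are simplified assumptions): transfers are applied serially, overdrafts are rejected, and a linked chain is accepted or rejected as a whole:

```python
# Toy model of two TigerBeetle guarantees: overdraft rejection under serial
# execution, and all-or-nothing linked chains. Not the real API.

class Ledger:
    def __init__(self):
        self.balances = {}  # account id -> balance

    def create_transfers(self, transfers, linked=False):
        """Apply (debit, credit, amount) transfers serially.
        If `linked`, either all are accepted or none are."""
        if linked:
            # Dry-run the whole chain against a scratch copy first.
            scratch = dict(self.balances)
            for debit, credit, amount in transfers:
                if scratch.get(debit, 0) - amount < 0:
                    return [False] * len(transfers)  # whole chain rejected
                scratch[debit] = scratch.get(debit, 0) - amount
                scratch[credit] = scratch.get(credit, 0) + amount
            self.balances = scratch
            return [True] * len(transfers)
        results = []
        for debit, credit, amount in transfers:
            # Serial execution: each check sees all previous effects.
            ok = self.balances.get(debit, 0) - amount >= 0
            if ok:
                self.balances[debit] = self.balances.get(debit, 0) - amount
                self.balances[credit] = self.balances.get(credit, 0) + amount
            results.append(ok)
        return results

ledger = Ledger()
ledger.balances = {"alice": 50, "bob": 0}
# Independent transfers: only the overdrawing one is rejected.
assert ledger.create_transfers([("alice", "bob", 30), ("alice", "bob", 30)]) == [True, False]
# Linked chain: one failing transfer rejects the entire chain.
assert ledger.create_transfers([("alice", "bob", 10), ("alice", "bob", 100)], linked=True) == [False, False]
```

Note that the real schema tracks debits and credits separately per account rather than a single signed balance; the single balance here just keeps the sketch short.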
An important trait of creating transfers is that each has a unique ID, which should be assigned by the end-client. Here, the end client refers to the browser or mobile app, not an intermediate API layer. Each such ID should be persisted (e.g., using the client's local storage) and then attached to any retry attempts. It serves as an idempotency key, ensuring that you pay only once, even if the network is flaky.
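The idempotency mechanism is worth a tiny sketch. Again this is a conceptual model, not the client library (the `TransferStore` class and return values are invented for illustration); TigerBeetle's actual IDs are 128-bit integers, for which a random UUID stands in here:

```python
# Client-assigned IDs as idempotency keys: retrying with the same ID is a
# no-op, so a flaky network cannot cause a double charge.
import uuid

class TransferStore:
    def __init__(self):
        self.transfers = {}  # transfer id -> transfer

    def create_transfer(self, transfer_id, debit, credit, amount):
        if transfer_id in self.transfers:
            return "exists"  # already applied; safe to ignore
        self.transfers[transfer_id] = (debit, credit, amount)
        return "ok"

store = TransferStore()
# The end client generates the ID once and persists it BEFORE the first
# attempt (e.g., in the browser's local storage).
transfer_id = uuid.uuid4()
assert store.create_transfer(transfer_id, "alice", "bob", 100) == "ok"
# The response was lost in transit; the client retries with the SAME ID.
assert store.create_transfer(transfer_id, "alice", "bob", 100) == "exists"
assert len(store.transfers) == 1  # still only one transfer recorded
```

The crucial detail is that the ID must survive on the end client across retries; an ID minted fresh by an intermediate API layer on each attempt would defeat the purpose.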
TigerBeetle as a part of a larger system
TigerBeetle targets exceptional write performance and, as part of that strategy, has limited "supporting" features. You get a database that emphasizes data safety (more on that later) and writes as fast as possible, but doesn't offer anything for security or authentication; it's assumed the database runs in a trusted environment with trusted users.
That's why TigerBeetle should always be positioned behind some kind of a gateway. Such a gateway serves two purposes: security and batching. First, it should make sure that whatever requests are forwarded are properly authenticated. Secondly, requests should be batched for best performance. Each client session is limited to one in-flight request. Hence, while TigerBeetle processes a batch, the gateway should accumulate transfers for the next one.
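The batching side of the gateway can be sketched as follows. This is a minimal, synchronous model (class and method names are assumptions, and a real gateway would be asynchronous): while one batch is in flight, new transfers accumulate for the next one, since each client session allows only one in-flight request:

```python
# Toy gateway batching loop: one in-flight batch per session; transfers
# arriving meanwhile accumulate for the next batch.
from collections import deque

class BatchingGateway:
    def __init__(self, max_batch=8190):
        self.pending = deque()
        self.max_batch = max_batch
        self.in_flight = False
        self.submitted_batches = []  # stands in for requests to TigerBeetle

    def submit(self, transfer):
        self.pending.append(transfer)
        self._maybe_send()

    def _maybe_send(self):
        # Only one request may be in flight per client session.
        if self.in_flight or not self.pending:
            return
        count = min(self.max_batch, len(self.pending))
        batch = [self.pending.popleft() for _ in range(count)]
        self.in_flight = True
        self.submitted_batches.append(batch)

    def on_reply(self):
        # The previous batch completed; flush whatever accumulated meanwhile.
        self.in_flight = False
        self._maybe_send()

gw = BatchingGateway()
gw.submit("t1")                    # sent immediately as a batch of one
gw.submit("t2"); gw.submit("t3")   # accumulate while t1's batch is in flight
gw.on_reply()                      # reply arrives: t2 and t3 go out together
assert gw.submitted_batches == [["t1"], ["t2", "t3"]]
```

Under load, this scheme naturally produces larger batches: the busier the system, the more transfers accumulate per round-trip.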
While the gateway should handle authentication, you need to ensure it doesn't become a bottleneck in the system. That is, performing a SQL request for each incoming transfer to look up the user is a no-go. Instead, you might leverage lightweight methods, such as verifying a JWT, or, if absolutely necessary, looking up the user in a Redis cache.
As for the read side, you might query TigerBeetle for a specific transfer or a range of transfers, or fetch the account's current balance. But for anything above that, you'll need to create a projection. This might be tricky to keep in sync manually. Luckily, TigerBeetle offers change data capture (CDC), which publishes all transfers to an AMQP queue (e.g., RabbitMQ) in an at-least-once manner.
Hence, a template architecture for a system with TigerBeetle handling transactions could be the following:

Ideally, when it comes to the "hot write-path", the gateway should be stateless. For other operations, it can and should use an OLGP database. Hence, when creating a user account, you would create an account in a SQL database with all the usual details, along with an account in TigerBeetle.
Quite possibly, every TigerBeetle deployment should be accompanied by a SQL database: the former handling high-contention writes; the latter handling supporting data, CRUD operations, projections, and all kinds of reads and queries.
Use-cases
Given the fixed schema, can TigerBeetle be used only for financial applications? Not necessarily: there's a lot you can model with the concepts of debits & credits, not only money!
For example: ticket sales, which, for popular events, always put a heavy strain on the ticketing system (often causing it to crash). Or rate-limiting access to an API.
TigerBeetle docs contain a number of recipes, but if you think about it, keeping track of credits in a serverless system, energy metering, and token usage in an LLM are all instances of the same problem—keeping track of a limited resource, and making sure that the resource never disappears, or is never fabricated. Hence, the use cases of TigerBeetle far exceed the financial domain.
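To make the rate-limiting example concrete, here's a toy limiter built on the same concepts. This is not TigerBeetle's actual rate-limiting recipe, just an illustration of the mapping: each user gets an allowance account, each request spends one credit, and the familiar "balance must not go negative" constraint does the enforcement:

```python
# Toy rate limiter expressed in debits-and-credits terms: each request
# transfers one credit out of the user's allowance; an exhausted allowance
# rejects the request, just like an overdraft.

class RateLimiter:
    def __init__(self, limit):
        self.limit = limit
        self.spent = {}  # user -> credits spent in the current window

    def allow(self, user):
        used = self.spent.get(user, 0)
        if used >= self.limit:
            return False  # allowance would go below zero: reject
        self.spent[user] = used + 1
        return True

limiter = RateLimiter(limit=2)
assert limiter.allow("alice") is True
assert limiter.allow("alice") is True
assert limiter.allow("alice") is False  # allowance exhausted
```

A periodic refill is then just another transfer back into the allowance account, which is how windowing would work in the full version.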
Design of data replication and safety
TigerBeetle is a replicated database that guarantees that once a transfer is accepted, it will be durably written: it will never disappear (unless there's catastrophic data loss across all replicas, of course). On the one hand, that might not sound particularly novel, as many databases claim to ensure data safety through replication.
On the other hand, once you start digging into the design of TigerBeetle, you appreciate the number of failure scenarios the team has taken into account, against which the database is continuously tested. These failures range from random disk bit flips and misdirected writes (apparently, this can happen when data is written at the wrong offset) to grey failures of the disk & network (where things work, just really slowly), and all the way to sudden and not-so-sudden death of randomly chosen nodes.
And if you look more closely at other databases, this level of resilience against various catastrophes is exceptional. We've come a long way since fsyncgate, but I don't think many other databases can claim such resilience.
TigerBeetle ensures local data safety through storing checksums of all data blocks (separately from the data), arranging data in a tree, and hash-chaining everything. Hash-chaining allows fast verification of data consistency; you might know the concept from git or blockchain.
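The hash-chaining idea fits in a few lines. This sketch uses SHA-256 as a stand-in for TigerBeetle's actual checksum algorithm, and plain byte strings for data blocks; the point is only that each checksum covers the previous one, so corruption anywhere breaks every later link:

```python
# Toy hash chain: each block's checksum covers the previous checksum, so a
# single corrupted block invalidates the whole suffix of the chain.
import hashlib

def chain(blocks):
    checksums, prev = [], b""
    for block in blocks:
        prev = hashlib.sha256(prev + block).digest()
        checksums.append(prev)
    return checksums

def verify(blocks, checksums):
    return chain(blocks) == checksums

blocks = [b"transfer-1", b"transfer-2", b"transfer-3"]
checksums = chain(blocks)
assert verify(blocks, checksums)

# Corrupt the first block (a "bit flip"): verification fails from that
# point on, even though the later blocks are untouched.
tampered = [b"transfer-X"] + blocks[1:]
assert not verify(tampered, checksums)
```

This is the same construction that makes git history and blockchains tamper-evident: to forge one link, you'd have to forge all the links after it too.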
As for distributed data safety, TigerBeetle is a consensus-based replicated state machine. There's a single leader node that handles all traffic. There's no sharding or load-balancing. As such, any scaling happens by adding faster memory or faster CPUs ("scale up"), not by scaling out.
This is by design: the use cases for which TigerBeetle is designed often involve data that shards poorly, that is, situations where there's a small number of "hot" accounts involved in most transactions. Moreover, having everything local makes it much easier to ensure consistency and strict serializability. Not only that: the whole TigerBeetle process is single-threaded. It uses a single CPU core! And it's still blazingly fast! (Or maybe—because of that?)

TigerBeetle recommends deployment on 6 nodes. This is unusual, as you are normally encouraged to deploy on an odd number of nodes so that quorums can be easily formed. Here, the reasoning is that you should use 3 cloud providers, each within milliseconds of the others, with each hosting 2 nodes. Then, a total outage of one cloud provider doesn't bring your system to a halt. Moreover, a write quorum is formed by a leader + 2 other nodes, so as long as the leader node is alive, you only need half the nodes. A majority is needed to elect a new leader, though.
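The quorum arithmetic described above can be checked with a few lines of Python. This follows the article's description only (a sketch, not TigerBeetle's exact configuration logic): a new leader needs a majority, and a write quorum merely needs to intersect every possible view-change quorum:

```python
# Quorum sizes for a replicated cluster, per the description above: leader
# election needs a majority; the smallest safe write quorum is whatever
# still intersects every majority.

def quorums(replica_count):
    # View-change quorum: a majority of replicas elects a new leader.
    view_change = replica_count // 2 + 1
    # Write quorum: must overlap any view-change quorum, so that a new
    # leader always sees every acknowledged write.
    write = replica_count - view_change + 1
    return write, view_change

write, view_change = quorums(6)
assert write == 3        # leader + 2 others: half the cluster suffices
assert view_change == 4  # but a majority is needed to elect a new leader
```

With 6 replicas split across 3 cloud providers, losing one provider leaves 4 nodes, which still satisfies both quorums.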
Replication is synchronous; that is, once you receive a response to your request, the data is guaranteed to be safe: written to disk and replicated across the cluster. TigerBeetle encourages aggressive batching (up to 8k transfers in a single request), both for requests and later in cluster replication.
Replication algorithm
If you read about a consensus-based replicated state machine, algorithms such as Multi-Paxos or Raft might come to mind. However, TigerBeetle uses something different: Viewstamped Replication (VR). This algorithm is lesser-known; however, it was developed at the same time as Paxos, and independently of it.
You might say that the core algorithm for all three approaches (Paxos, Raft, VR) is the same. The devil lies in the details, of course. Raft is simpler to understand than the others, but it also imposes the most constraints, especially during unclean shutdowns and recovery. Paxos was originally only a consensus algorithm for a single value, later extended to multiple values in Multi-Paxos, while VR directly deals with a replicated log.
VR might have been chosen over Paxos as it's simpler to implement, while providing exactly the primitives that TigerBeetle's team needed. It also performs better (with fewer round-trips).
Testing
TigerBeetle is thoroughly tested through a couple of novel techniques.
First, there's the VOPR (Viewstamped Operation Replicator), which implements Deterministic Simulation Testing (DST) for a TigerBeetle cluster. VOPR is developed alongside the database and runs around the clock, trying to find bugs in the codebase. Once a bug is found, you get a random seed along with a precise trace on how to reproduce the problem locally.

VOPR works by providing alternative implementations for all sources of nondeterminism: the disk, the network, and the system clock. It is capable of simulating an entire TigerBeetle cluster on a single core, and can fast-forward time, compressing what would be one month of running TigerBeetle into an hour. It can also inject all kinds of failures, such as dropped messages or simulated bad sectors.
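The core DST trick is that the entire run is driven by one seeded PRNG, so any failure reproduces exactly from its seed. A toy version (a stand-in "system" with injected message drops, not an actual cluster simulation):

```python
# Minimal deterministic-simulation sketch: one seeded PRNG is the ONLY
# source of nondeterminism, so a run is fully reproducible from its seed.
import random

def simulate(seed, steps=1000):
    rng = random.Random(seed)
    delivered, dropped = 0, 0
    for _ in range(steps):
        if rng.random() < 0.1:   # injected fault: drop this message
            dropped += 1
        else:
            delivered += 1
    return delivered, dropped

# The same seed replays the exact same run, faults and all -- which is why
# a VOPR bug report is just a seed plus a trace.
assert simulate(seed=42) == simulate(seed=42)
```

In the real thing, disk, network, and clock are all virtualised behind such a PRNG-driven layer; the sketch only shows why a single seed suffices to reproduce a failing run.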
Moreover, TigerBeetle has been tested with Antithesis, another approach to DST that operates at the Docker container level. Antithesis might also introduce faults related to disk & network, however, without the tight integration that VOPR provides.
Finally, Jepsen tests of TigerBeetle revealed only two data-safety issues: missing query results and problems with a debugging API (both fixed). As noted in the analysis, TigerBeetle exhibits exceptional resilience to disk failures. Also, some issues related to client or database crashes during upgrades have been identified. But overall, if you compare Jepsen's results across various databases, TigerBeetle's results showcase a really solid design.
Notable engineering choices
There are so many interesting engineering choices made by the team! The TigerBeetle approach is definitely not for everybody, as the use case is very specific, but there are always some lessons to be learned. An incomplete list:
- TigerBeetle is written in Zig, a systems programming language. Rust and C were both considered, but Zig might have been chosen for its precise control over memory, its ergonomics, its protections against undefined behavior (in contrast to C), and the exact feature set needed for a low-level database project.
- All memory is allocated upfront, there are no dynamic allocations.
- Everything runs on a single core and is deterministic.
- All I/O is direct, bypassing the kernel's page caches and buffering.
- TigerBeetle maintains a "cluster clock", ensuring that timestamps are always increasing. If it detects excessive time drift on any node, that node crashes.
- The code is written using Tiger style, inspired by NASA's power-of-ten rules, and aiming at "getting it right the first time". On the code level, recursion is not allowed; there are a ton of assertions, only simple abstractions can be used, functions can be at most 70 lines long, and more! On the design level, there's a lot of proactive thinking and research before writing the code.
TigerBeetle: a new generation of databases?
Joran Greef, the CEO and founder of TigerBeetle, argues that two events of 2018 mark the start of a new era in database development. These are: fsyncgate (how databases handle disk write failures and I/O buffering) and protocol-aware recovery in case of failures. Both of these shaped the design of TigerBeetle.
Joran believes we might witness a new breed of databases: ones developed over years, not decades. That is thanks to advancements such as DST, Direct I/O, io_uring, new systems programming languages (Zig, Rust), and explicit fault models, which accelerate database development.
TigerBeetle might just be the lead goose in a new generation of domain-specific, blazingly fast, and durable databases.
