
Good practices for schema evolution with Protobuf using ScalaPB and fs2grpc

Krzysztof Ciesielski

07 Apr 2023 · 9 minutes read


The schema-driven paradigm has become very popular across different communication models. Defining APIs using Interface Description Languages allows evolving a schema more robustly and safely than with JSON-based messages. Depending on various factors, adding or removing message fields can be perfectly safe, or it can break binary compatibility after a deployment. You might need different policies for schema evolution when using gRPC services, event sourcing, or Akka Cluster actor communication, and the decision may also differ for a specific wire protocol. Let’s look at various communication modes and consider whether to use backward, forward, or full compatibility modes.

Technology

This article focuses on Google Protocol Buffers, AKA Protobuf, the underlying default protocol in gRPC services and a solid choice for messaging-based communication. gRPC has its own specific approach to schema changes, required fields, and default values. When developing backends in Scala, you can leverage ScalaPB to generate case classes from the .proto contract during project compilation. You can also make these generated types even more robust with extra annotations, which we’ll explore below.
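
For reference, a minimal sbt setup for ScalaPB code generation could look like this (a sketch; the plugin and library versions are examples, check the ScalaPB documentation for current ones):

    // project/plugins.sbt
    addSbtPlugin("com.thesamet" % "sbt-protoc" % "1.0.6")
    libraryDependencies += "com.thesamet.scalapb" %% "compilerplugin" % "0.11.13"

    // build.sbt - compile .proto files from src/main/protobuf into case classes
    Compile / PB.targets := Seq(
      scalapb.gen() -> (Compile / sourceManaged).value / "scalapb"
    )
    libraryDependencies += "com.thesamet.scalapb" %% "scalapb-runtime" % scalapb.compiler.Version.scalapbVersion % "protobuf"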

Communication style

gRPC

This style of synchronous communication between services is becoming increasingly popular. Typical REST contracts are cumbersome to evolve, and gRPC additionally offers better performance and streaming support. Another issue with REST is modeling service use cases as HTTP verbs like POST/PUT/DELETE plus resource URLs, which often don’t map well to actual use cases. gRPC addresses these difficulties. With gRPC, we often talk about clients and servers. Usually, the server service maintains its contract as protobuf files, which can be used to generate case classes using ScalaPB. Then fs2grpc can be used to create a thin client layer based on Cats Effect, and the CI can put all this code in a versioned .jar artifact published by the server. Client services can then use these generated artifacts as their dependencies.
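
To sketch the client side: assuming a generated GreeterFs2Grpc service with a unary sayHello method (the service, message, and method names are made up for this example), wiring a Cats Effect based client could look roughly like this:

    import cats.effect.{IO, Resource}
    import fs2.grpc.syntax.all._
    import io.grpc.{ManagedChannel, Metadata}
    import io.grpc.netty.shaded.io.grpc.netty.NettyChannelBuilder

    // Build the channel as a Resource, so it's shut down when no longer used
    val channel: Resource[IO, ManagedChannel] =
      NettyChannelBuilder.forAddress("localhost", 9999).usePlaintext().resource[IO]

    // Create a stub of the (hypothetical) generated GreeterFs2Grpc service
    val client: Resource[IO, GreeterFs2Grpc[IO, Metadata]] =
      channel.flatMap(ch => GreeterFs2Grpc.stubResource[IO](ch))

    // Call the service; HelloRequest/HelloReply are ScalaPB-generated case classes
    val program: IO[HelloReply] =
      client.use(_.sayHello(HelloRequest(name = "world"), new Metadata()))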

Messaging

Asynchronous message-based communication brings many advantages, like decoupling, delivery guarantees, event sourcing, command sourcing, etc. It can be implemented with Kafka, Pulsar, or other infrastructure solutions. When realized with Kafka, this pattern is often enriched with an external Schema Registry, which facilitates schema compatibility management between producers and consumers. Let’s focus on the following integration patterns:

  • 1 Producer - multiple consumers
    A service publishes its events, and consumers can come and go to read and process them. Usually, it’s the producer who drives schema changes.
  • 1 Consumer - multiple producers
    A service reads commands from a topic, where potential client services can publish. Here it’s the consumer who updates the schema, which makes this case similar to a gRPC server.
  • Communication between actors of the same type
    This case is unique to models where actors of the same type send messages to each other, for example, within Akka Cluster. We won’t focus on it in depth. Let me just mention that this situation needs full compatibility to ensure that actors can exchange messages in both new and old versions during a rolling update.

Compatibility modes cheat sheet

Let’s now break down the Protobuf usages mentioned above into lists of allowed operations, like adding/removing fields, changing optionality, etc. First, a quick recap of compatibility modes:

  • backward - the most well-known mode: consumers using a newer schema should be able to read data written with older schema versions.
  • forward - used when a consumer with an older schema version needs to understand data serialized with a newer schema.
  • full - the most restrictive constraint, requiring both backward and forward compatibility.

gRPC

It’s recommended to keep backward compatibility for requests. The server updates its schema first, without worrying about requests from clients who use older schemas. Then clients update their schemas gradually. For responses, the desired mode is forward, meaning clients can still process new messages sent from the server. Please note that the no_box wrapper is described in detail later, in the Optionality section.

mode | communication | scenario | method
BACKWARD | gRPC requests | add optional field to request | add to proto
BACKWARD | gRPC requests | change required request field to optional | remove no_box (ScalaPB specific)
BACKWARD | gRPC requests | change optional request field to required | breaking
BACKWARD | gRPC requests | remove optional field from request | remove from proto
BACKWARD | gRPC requests | remove required field from request | remove from proto
BACKWARD | gRPC requests | add required field to request | breaking
BACKWARD | gRPC requests | add a field to a oneof in a request | add to proto
BACKWARD | gRPC requests | move a field into a new oneof in a request | move in proto
BACKWARD | gRPC requests | remove a field from a oneof in a request | breaking**
FORWARD | gRPC responses | add optional field to response | add to proto
FORWARD | gRPC responses | add required field to response | add to proto
FORWARD | gRPC responses | remove optional field from response | remove from proto
FORWARD | gRPC responses | change required response field to optional | breaking
FORWARD | gRPC responses | remove required field from response | breaking*
FORWARD | gRPC responses | add a field to a oneof in a response | breaking**
FORWARD | gRPC responses | move a field into a new oneof in a response | optional field only, move in proto
FORWARD | gRPC responses | remove a field from a oneof in a response | remove from proto

Keep in mind that removing fields in Protobuf is OK as long as you don’t reuse the index number. You can use the reserved keyword to emphasize that an index shouldn’t be reused.
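
For example, after deleting a field, you can reserve its index (and, optionally, its name) so that it won’t be reused by accident:

    message CreateUserRequest {
      reserved 2;
      reserved "middle_name";
      string name = 1;
    }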

* possible with some restrictions, see the Optionality section below.
** A special note on oneofs: removing a oneof variant in requests makes the server decode that value into UNKNOWN, which is theoretically indistinguishable from an unset value. That’s why I’m marking it as backward incompatible. Likewise, adding a field to a oneof is considered a forward incompatible change. If you’re sure all your clients are using ScalaPB, and you’re handling UNKNOWN values correctly, such changes can be performed, but with extreme caution. See this blog post for a more thorough explanation and thoughts on compatibility for more non-typical changes to oneof fields.
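
On the ScalaPB side, a oneof is generated as a sealed family with an Empty case, which is also what you’ll observe when a peer sends a variant unknown to your schema version (the raw bytes land in unknownFields). A sketch of defensive handling, with made-up message and variant names:

    // For: message Event { oneof payload { Created created = 1; Deleted deleted = 2; } }
    event.payload match {
      case Event.Payload.Created(c) => handleCreated(c)
      case Event.Payload.Deleted(d) => handleDeleted(d)
      case Event.Payload.Empty =>
        // Unset, or sent by a peer with a different schema version - handle explicitly
        handleUnknown(event)
    }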

Messaging: events

In this communication style, we have a publisher who manages the contract and consumers who read the published events. This style is often implemented with a .jar artifact containing the .proto file together with the generated ScalaPB classes, maintained and published by the producer. The publisher is then typically the first to update its code to adjust to the new contract, while consumers follow. However, let’s consider all the sub-cases:

  1. Forward compatibility is enough when the producer wants to update events without breaking consumers, who would still be able to use the old schema before they update.
  2. Alternatively, you might want to update consumers first. If that’s your preferred ordering, it’s backward compatibility you’ll need to keep. Such an approach is mentioned in the Avro and Schema Registry documentation; check it out for a deeper dive.
  3. Full compatibility is the most restrictive variant. It is recommended when we want to make sure that even after adjusting consumers to the latest schema, they will still be able to parse older events and have special handling for fields with present or missing values. Choose this approach when you expect historical events to be replayed, for example, in event sourcing, so that consumers can correctly handle all older versions.

mode | communication | scenario | method
FORWARD/FULL | messaging: events | add optional field | add to proto
FORWARD | messaging: events | add required field | add to proto
FULL | messaging: events | add required field | breaking
FORWARD/FULL | messaging: events | remove optional field | remove from proto
FORWARD | messaging: events | change optional field to required | add no_box
FULL | messaging: events | change optional field to required | breaking
FORWARD/FULL | messaging: events | change required field to optional | breaking*
FORWARD/FULL | messaging: events | remove required field | breaking
FORWARD/FULL | messaging: events | add a field to a oneof | breaking
FORWARD/FULL | messaging: events | move a field into a new oneof | move in proto (optional only)
FORWARD | messaging: events | remove a field from a oneof | remove from proto
FULL | messaging: events | remove a field from a oneof | breaking

* possible in steps, with some restrictions - see the Optionality section below.

Messaging: commands

This mode is very similar to handling gRPC requests. A command handler maintains the contract, so it needs to be backward compatible.

mode | communication | scenario | method
BACKWARD | messaging: commands | add optional field | add to proto
BACKWARD | messaging: commands | add required field | breaking
BACKWARD | messaging: commands | remove required field | remove from proto
BACKWARD | messaging: commands | remove optional field | remove from proto
BACKWARD | messaging: commands | change required field to optional | remove no_box
BACKWARD | messaging: commands | change optional field to required | breaking
BACKWARD | messaging: commands | add a field to a oneof | add to proto
BACKWARD | messaging: commands | remove a field from a oneof | breaking
BACKWARD | messaging: commands | move a field into a new oneof | move in proto

Optionality and no_box

Since version 3, the Protobuf protocol doesn’t allow required fields, making optional the default. The rationale is that adding/removing required fields caused too many problems with unexpectedly broken wire compatibility (see the original GitHub comment).

However, as the tables show, required fields make sense for some scenarios. For example, for BACKWARD compatibility, it’s perfectly fine to have required fields in the initial schema and delete them later. This applies to gRPC requests or asynchronous commands. To make ScalaPB skip wrapping non-primitive types with Option, add a no_box annotation, like:

google.protobuf.Timestamp createdAt = 1 [(scalapb.field).no_box = true];
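
The effect on the generated code, roughly (a simplified sketch; the message and field names are assumed):

    import com.google.protobuf.timestamp.Timestamp

    // Without no_box, ScalaPB boxes message-typed fields:
    final case class OrderDefault(createdAt: Option[Timestamp] = None)
    // With no_box = true, the field becomes a plain, effectively required value:
    final case class OrderNoBox(createdAt: Timestamp = Timestamp.defaultInstance)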

Removing required fields breaks FORWARD compatibility, but in special circumstances, we can perform this operation in steps:

  • Change the required field to optional by removing the no_box annotation
  • Update all consumers to handle the None case (see the sketch after this list)
  • Start producing the None case
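
The second step could be as simple as replacing direct field access with explicit handling of the missing value (names assumed):

    // After removing no_box, createdAt is an Option[Timestamp] again
    order.createdAt match {
      case Some(ts) => process(ts)
      case None     => processWithoutTimestamp(order) // must be in place before producers stop setting the field
    }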

Caution! Such an operation is risky. It doesn’t fully follow compatibility rules from our cheat sheet, so I’m marking it as “breaking” anyway.

Default values

Primitive protobuf types like string, double, and others won’t be wrapped with Option by ScalaPB. Instead, default values will be set, like the empty string “”. This can be very dangerous because, in most cases, these default values are actually illegal, so if you omit a field when constructing a case class, you are in trouble (an illustration follows the list below). To strengthen type safety, consider two approaches:

  • Add no_default_values_in_constructor to your .proto file, which removes default values from the generated constructors, forcing you to set every field explicitly. Disadvantage: this makes the field strictly required without an easy way to evolve it into an optional field later.
    option (scalapb.options) = {
      no_default_values_in_constructor: true
    };
  • Use wrapper types
    Such types may increase code noise by requiring additional .value calls to get to the actual value, but combined with no_box, this approach gives you more control over the evolution of optionality.
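
To illustrate the pitfall, assume ScalaPB-generated code for a message with two primitive fields (the message is made up for this example):

    // message Transfer { string account_id = 1; double amount = 2; }
    val transfer = Transfer() // compiles without complaints
    transfer.accountId        // "" - almost certainly illegal in your domain
    transfer.amount           // 0.0 - indistinguishable from a genuine zero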

Bonus: buf

Maintaining protobuf files, distributing the contract, enforcing compatibility rules, and managing a consistent set of API design rules requires serious investment. The buf project is an interesting toolset that aims to automate these processes. It ships with a newly developed high-performance Protobuf compiler, a linter that enforces good API design choices and structure, a breaking change detector that enforces compatibility at the source code or wire level, and a generator that invokes your protoc plugins based on a configurable template. It also offers the Buf Schema Registry (BSR) - a hosted SaaS platform that serves as your organization’s source of truth for your Protobuf APIs. Consider adding elements of buf to your project pipeline to squeeze more from the Protobuf experience.
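
As a taste of what this looks like in practice, a minimal buf.yaml enabling linting and wire-level breaking change detection could look like this (a sketch using buf’s v1 config format; check the buf documentation for your version):

    version: v1
    lint:
      use:
        - DEFAULT
    breaking:
      use:
        - WIRE_JSON

    # then, e.g. in CI, compare the working tree against the main branch:
    #   buf breaking --against '.git#branch=main'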

Conclusions

Deciding on compatibility requirements for your schemas depends on various factors. Leveraging ScalaPB extensions like required fields or disabled default values can increase type safety on the application side without compromising Protobuf wire compatibility, as long as evolution rules are well adjusted to the use case. I hope that this guide will help you fine-tune these rules for your specific application, or even entirely automate their enforcement.

Check: Data serialization tools comparison: Avro vs Protobuf

Reviewed by: Michał Ostruszka, Michał Matłoka, Adam Rybicki, Andrzej Bil, Adrian Wydra, Bartek Henkiel
