A novel technique for creating ergonomic and tree-shakable TypeScript libraries

Mieszko Sabo

28 Sep 2023 · 10 minutes read

In this article, I will present an interesting technique for creating tree-shakable TypeScript libraries without compromising user experience.

Here's the high-level overview:

  • Users consume our library's API through a single entry point (think Zod).
  • This allows them to use techniques such as method chaining, builder pattern, etc.
  • However, to use this entry point, they need to initialize it by explicitly importing the desired functionalities. Importing more functionalities will extend its API and add to the final bundle size.
  • This allows the library author to ship more features without worrying that the public API will become bloated because it's the user who decides which APIs they want enabled and included in the bundle.

To better understand this concept and see how it can be implemented, let's take a look at a concrete example.

Consider Zod, a popular TypeScript-first schema validation library. It offers a great developer experience and has a rich ecosystem.

Here's the basic usage:

import { z } from "zod"; // 12.8 kB

const LoginSchema = z.object({
  email: z.string().email(),
  password: z.string().min(8),
});

// Throws error
LoginSchema.parse({ email: "", password: "" });

// Returns data as { email: string; password: string }
LoginSchema.parse({ email: "jane@example.com", password: "12345678" });

One of the greatest strengths of Zod is its ergonomics. It offers an intuitive and concise interface for creating schemas by chaining non-mutating methods together.

From a developer ergonomics perspective, an API that allows for method chaining has one great advantage: API discovery. Typing "." makes your IDE immediately display all possible things you might want to do along with the precise types of these operations.

However, there's one problem: static import size. Working with Zod starts with importing a single z object and going from there. This means that every time we use Zod for anything, we're forced to import all the available methods. Whether or not we use .brand, .superRefine, or .passthrough, they will end up in our app's bundle. That's because Zod is not tree-shakable. But what does that mean?

What's a Tree-Shakable Library?

In short, tree shaking means removing unused code. JavaScript bundlers do this by analyzing import and export statements and deciding which exports are unused. Tree shaking applies not only to the modules in your own source code but also to all your dependencies.

Not all libraries are tree-shakable. In order for a library to be tree-shakable, it needs to be structured in a specific way.

Example of a Tree-Shakable Library Structure

Consider the following example:

// `tree-shakable-lib`

export const fun1 = () => "fun1";

export const fun2 = () => "fun2";

Then, if your app looks like this:

import { fun1 } from "tree-shakable-lib";

console.log(fun1());

The bundled code will look like this:

const fun1 = () => "fun1";

console.log(fun1());

Notice that the definition of fun2 didn't make it to the bundle, thus reducing the size.

Example of a Non-Tree-Shakable Library Structure

Now, consider the following example:

// `non-tree-shakable-lib`

export const lib = {
    fun1: () => "val1",
    fun2: () => "val2"
}

Then, if your app looks like this:

import { lib } from "./non-tree-shakable-lib.js";

console.log(lib.fun1());

The bundled code will look like this:

const lib = {
    fun1: () => "val1",
    fun2: () => "val2"
};

console.log(lib.fun1());

Notice that even though we use only the fun1 method, the whole lib object needs to be bundled because the bundler can't safely extract a specific subpart of an object or a class.
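
The same applies to classes, which is the shape a chainable API usually takes. The sketch below uses a hypothetical StringSchema class (the names are made up for illustration; this is not Zod's source) whose methods each return a new instance. Importing the class pulls in every method, because a bundler cannot prove that a given method is never reached through some instance:

```typescript
// Hypothetical chainable schema; illustrative only, not Zod's implementation.
type Check = (value: string) => boolean;

class StringSchema {
  readonly checks: readonly Check[];

  constructor(checks: readonly Check[] = []) {
    this.checks = checks;
  }

  // Each method returns a new instance instead of mutating `this`.
  email(): StringSchema {
    return new StringSchema([...this.checks, (v) => v.includes("@")]);
  }

  min(length: number): StringSchema {
    return new StringSchema([...this.checks, (v) => v.length >= length]);
  }

  // Never called below, yet it still ships with the class.
  uppercaseOnly(): StringSchema {
    return new StringSchema([...this.checks, (v) => v === v.toUpperCase()]);
  }

  parse(input: unknown): string {
    if (typeof input !== "string") throw new Error("Expected a string");
    const value: string = input;
    if (!this.checks.every((check) => check(value))) {
      throw new Error("Invalid string");
    }
    return value;
  }
}

const base = new StringSchema();
const password = base.min(8);

console.log(password.parse("12345678")); // "12345678"
console.log(base.checks.length); // 0, chaining did not mutate `base`
```

Only min and parse are used here, but email and uppercaseOnly remain part of the class definition and therefore of the bundle.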

Tip: You can easily experiment with tree shaking online using the Rollup REPL.

What I wrote above is just a brief introduction to the larger topic of tree shaking, which should suffice for this article.

Valibot = Zod with Tree Shaking?

Now, knowing how to create a tree-shakable library, how would we go about making Zod tree-shakable? Well, a recently introduced library called Valibot took a stab at it.

Here's how our previous Zod example looks when rewritten with Valibot:

import { email, minLength, object, parse, string } from 'valibot'; // 0.85 kB

const LoginSchema = object({
  email: string([email()]),
  password: string([minLength(8)]),
});

// Throws error
parse(LoginSchema, { email: '', password: '' });

// Returns data as { email: string; password: string }
parse(LoginSchema, { email: 'jane@example.com', password: '12345678' });

Although the API looks familiar, it has a more functional flavor:

  • Instead of calling a parse method on the LoginSchema, we call an independent parse function with the schema as an argument.
  • Instead of dot-chaining additional constraints to the string validator, we pass them as arguments. Each of the constraints is a separately imported function.

This fits the simplified rule of tree-shakable libraries: if a library exports a bunch of pure functions, it's probably tree-shakable. Indeed, that's the case for Valibot. Notice the import size in the comment on the first line: at 0.85 kB, it's about 15 times smaller than the 12.8 kB Zod import, yet the schema does exactly the same thing. Of course, as you add more functions to your app, the import size will grow, but realistically, it's uncommon to use 100% of any library's functionality, especially with versatile libraries like Zod and Valibot.
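
To make the rule concrete, here is a minimal sketch of a function-based validation API. It is written under assumptions: the names str, minLen, isEmail and parseWith are hypothetical and this is not how Valibot is implemented. The point is only that each constraint is a standalone export, so a bundler can drop the ones that are never imported:

```typescript
// Hypothetical function-based validators; not Valibot's actual code.
type Check<T> = (value: T) => string | null; // error message, or null when valid
type Schema<T> = { parse: (input: unknown) => T };

// In a real library, each of these would be a separate named export.
const minLen = (min: number): Check<string> => (value) =>
  value.length >= min ? null : `Expected at least ${min} characters`;

const isEmail = (): Check<string> => (value) =>
  value.includes("@") ? null : "Expected an email address";

const str = (checks: Check<string>[] = []): Schema<string> => ({
  parse(input) {
    if (typeof input !== "string") throw new Error("Expected a string");
    for (const check of checks) {
      const message = check(input);
      if (message !== null) throw new Error(message);
    }
    return input;
  },
});

// Standalone parse function that takes the schema as an argument.
const parseWith = <T>(schema: Schema<T>, input: unknown): T => schema.parse(input);

const PasswordSchema = str([minLen(8)]);
console.log(parseWith(PasswordSchema, "12345678")); // "12345678"
```

An app that never imports isEmail would not pay for it, since nothing else references that function.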

However, there's one problem with Valibot. While the functional API allows the library to be tree-shakable, I'd argue that, in this case, it also makes the API less ergonomic. Ergonomics is harder to reason about than import size because it's more nuanced and partly a matter of personal taste. Still, I find Zod's API more developer-friendly. Here are some drawbacks of Valibot's API:

  • One can't easily explore the API without leaving the IDE.
  • Creating each schema requires either jumping from the schema definition to the import statement to add new functions or listing them all in advance.

Correttore: The Best of Both Worlds

To sum up what we've learned so far:

  • Zod has great developer experience (DX) but isn't tree-shakable.
  • Valibot is tree-shakable, but its API could be seen as less ergonomic.

A simple question arises: Can we create a library that has the same API as Zod but somehow is also tree-shakable? The answer is yes; that's exactly what the technique I'm going to present is all about. As a proof of concept, I created Correttore:

import { email, minLength, object, initCorrettore, string } from "correttore"; // 0.54 kB

export const c = initCorrettore({
  string,
  email,
  object,
  minLength,
});

const LoginSchema = c.object({
  email: c.string().email(),
  password: c.string().minLength(8),
});

// Throws error
LoginSchema.parse({ email: "", password: "" });

// Returns data as { email: string; password: string }
LoginSchema.parse({ email: "jane@example.com", password: "12345678" });

To elaborate:

  1. In the first line, we import initCorrettore, as well as all "pieces" of the API we would be using in our app.
  2. We create and export a c variable by calling initCorrettore. This c corresponds to the well-known z from Zod. Everywhere else in our app, we would just import c from this file.
  3. We create a LoginSchema in what seems to be the same way as we do in Zod, just using our c object. The difference is that when we're chaining the methods, we see only the ones we previously imported and passed to the initCorrettore function. For example, if we removed email from the initCorrettore call, the email method would disappear from string(), and we'd get a compile-time TypeScript error in the schema object. On the other hand, if we imported and added, for example, a number validator, it would appear on the c object as a valid method.

The rest is the same as in Zod, yet our import size is just over 500 bytes (0.54 kB versus 12.8 kB), roughly a 24x size reduction!

What's the Secret Sauce?

To write such a library, you'll need just two ingredients:

  1. Proxy objects
  2. Some type-level programming.

Proxy

Proxies allow you to intercept and redefine fundamental operations for any object. In the example below, we create a proxy for someObject so that each time one of its properties is accessed, we log which one.

const someObject = {
    a: 100,
    b: "hello"
}

const proxyObject = new Proxy(someObject, {
    get: (target, prop) => {
        console.log(`property ${String(prop)} was called!`);
        return target[prop as keyof typeof target];
    }
});

proxyObject.a; // logs: "property a was called!"
proxyObject.b; // logs: "property b was called!"

Inside the proxy handler (the second argument to the Proxy constructor), we can do virtually anything. You should check out MDN Web Docs on Proxy to see more examples and a full API overview.

So the basic idea is that when the user initializes Correttore with a record of parsers (such as number, string, boolean, etc.), we create a proxy out of an empty object and simply run the parsers dynamically as they're called:

// NOTE: the code below works, but for now, the `c` object is typed as `any`

// mocked parsers for brevity
const string = (x: any) => `${x} is a string`;
const number = (x: any) => `${x} is a number`;
const boolean = (x: any) => `${x} is a boolean`;

const simplifiedInitCorrettore = (parsers: Record<string, (arg: any) => string>) => {
    return new Proxy({} as any, {
        get(target, key) {
            if (key in parsers) {
                return (args: any) => parsers[key as any](args);
            } else {
                throw new Error("Unknown parser");
            }
        }
    })
}

const c = simplifiedInitCorrettore({
    string,
    number,
    // we purposely omit boolean parser here
});

console.log(c.string("hello")); // -> "hello is a string"
console.log(c.number(42)); // -> "42 is a number"
console.log(c.boolean(true)); // -> Error: Unknown parser

The implementation above shows the general idea but doesn't support method chaining or proper typings yet.
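
One caveat of a get trap that throws on every unknown key is that the JavaScript runtime itself probes objects for certain properties. For example, awaiting a value reads its then property, and many built-in operations look up well-known symbols such as Symbol.iterator or Symbol.toPrimitive. A possible hardening, sketched here as an assumption rather than as part of Correttore (guardedParsers is a hypothetical helper), is to let such keys fall through to the target:

```typescript
// Hypothetical helper; not part of the article's library.
const guardedParsers = <T extends Record<string, unknown>>(parsers: T): T =>
  new Proxy(parsers, {
    get(target, key, receiver) {
      // Symbols and "then" are read by the runtime, not by the library user,
      // so they are forwarded to the target instead of raising an error.
      if (typeof key === "symbol" || key === "then") {
        return Reflect.get(target, key, receiver);
      }
      if (!(key in target)) {
        throw new Error(`Unknown parser ${key}`);
      }
      return Reflect.get(target, key, receiver);
    },
  });

const safe = guardedParsers({
  string: (x: unknown) => `${String(x)} is a string`,
});

console.log(safe.string("hello")); // "hello is a string"
```

With this guard, returning the object from an async function no longer fails on the then lookup, while a misspelled or non-imported parser name still throws.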

To support method chaining (e.g., c.string().email()), we’ll add a parsersChain variable, which keeps all the parsers in the method chain (*). For example, c.string().email() corresponds to [stringParser, emailParser]. Only when .parse is called do we iterate through all the parsers and validate the argument with each one (**).

const createParserProxy = (
  p: Parser<any>,
  parsers: Record<string, (...args: any) => Parser<any>>,
  parsersChain: Parser<any>[]
) =>
  new Proxy(p, {
    get(target, key) {
      if (key === "parse") { // (**)
        return (arg: any) => {
          parsersChain.forEach((p) => p.parse(arg));
          return target.parse(arg);
        };
      }
      if (key in parsers) {
        return (args: any) => {
          // Build the next parser once and reuse it as both the proxy target and the chain entry.
          const next = parsers[key as any](args);
          return createParserProxy(next, parsers, [
            ...parsersChain, // (*)
            next,
          ]);
        };
      } else {
        throw new Error(`Unknown parser ${key as string}`);
      }
    },
  });

export const initCorrettore = <
  Parsers extends Record<string, (...args: any) => Parser<any>>
>(
  parsers: Parsers
): calculateCorrettoreType<keyof Parsers> => {
  return new Proxy({} as calculateCorrettoreType<keyof Parsers>, {
    get(target, key) {
      if (key in parsers) {
        return (args: any) =>
          createParserProxy(parsers[key as any](args), parsers, [
            parsers[key as any](args),
          ]);
      } else {
        throw new Error(`Unknown parser ${key as string}`);
      }
    },
  });
};

The implementation itself is full of any, but the resulting c object is properly typed thanks to the as calculateCorrettoreType<keyof Parsers> cast.
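
For experimenting, here is a condensed, self-contained variant of the same idea that can be run as-is. It is a sketch under assumptions, not Correttore's source: the Parser shape and the toy string, email and minLength parsers are invented for illustration, the resulting object is typed as any, and parse runs every parser in the chain exactly once.

```typescript
// Assumed minimal parser shape; the real library's type may differ.
type Parser<T> = { parse: (input: unknown) => T };
type ParserFactory = (...args: any[]) => Parser<any>;

// Toy parsers for illustration only.
const string = (): Parser<string> => ({
  parse: (input) => {
    if (typeof input !== "string") throw new Error("Expected a string");
    return input;
  },
});

const email = (): Parser<string> => ({
  parse: (input) => {
    if (typeof input !== "string" || !input.includes("@")) {
      throw new Error("Expected an email");
    }
    return input;
  },
});

const minLength = (min: number): Parser<string> => ({
  parse: (input) => {
    if (typeof input !== "string" || input.length < min) {
      throw new Error(`Expected at least ${min} characters`);
    }
    return input;
  },
});

const chainProxy = (
  parsers: Record<string, ParserFactory>,
  chain: Parser<any>[]
): any =>
  new Proxy({}, {
    get(_target, key) {
      if (key === "parse" && chain.length > 0) {
        // Run every parser collected so far; the first failure throws.
        return (input: unknown) => {
          chain.forEach((p) => p.parse(input));
          return input;
        };
      }
      if (typeof key === "string" && Object.prototype.hasOwnProperty.call(parsers, key)) {
        const factory = parsers[key];
        // Each call extends a copy of the chain, so earlier schemas stay reusable.
        return (...args: any[]) => chainProxy(parsers, [...chain, factory(...args)]);
      }
      throw new Error(`Unknown parser ${String(key)}`);
    },
  });

const initSketch = (parsers: Record<string, ParserFactory>): any =>
  chainProxy(parsers, []);

// `email` is deliberately not passed in, so it is unavailable at runtime.
const c = initSketch({ string, minLength });

console.log(c.string().minLength(8).parse("12345678")); // "12345678"
```

Because email was not handed to initSketch, accessing c.string().email throws "Unknown parser email" at runtime; the typings discussed next are what turn that into a compile-time error.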

Type-Level Programming

That's where type-level programming comes in. We need to write a type that reflects the given record of imported parsers. That's mostly straightforward; the challenging part is making sure that, for example, the email parser is only available after the string() parser and not after number().

// Full correttore type with all parsers and extensions
type FullCorrettore = {
  // StringParser is an object that includes the `parse` method as well
  // as all the other parsers that can occur after `string()`, for example
  // `email` or `minLength`. 
  string: () => StringParser;

  // Similarly for this one.
  number: () => NumberParser;

  // see ObjectParser implementation here:
  // https://github.com/mieszkosabo/correttore/blob/main/src/parsers/object.ts
  object: <S extends Record<string, Parser<any>>>(schema: S) => ObjectParser<S>;
};

// given the list of imported parsers and validators, calculate a type for the correttore object
type calculateCorrettoreType<
  features,
  parser extends Record<string, (...args: any[]) => any> = FullCorrettore
> = {
  [k in Extract<keyof parser, features | "parse">]: k extends "parse"
    ? parser[k]
    : (
        ...args: Parameters<parser[k]>
      ) => calculateCorrettoreType<features, ReturnType<parser[k]>>;
};
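
To see the filtering in action, here is a reduced, self-contained version of the same idea. It is a sketch under assumptions: the parser shapes are simplified stand-ins for Correttore's real types, and Narrow and Has are hypothetical names. Narrow plays the role of calculateCorrettoreType but uses infer instead of Parameters and ReturnType, and Has is a small type-level probe that resolves to the literal type true or false.

```typescript
// Assumed parser shapes for illustration; Correttore's real types are richer.
type Parser<T> = { parse: (input: unknown) => T };
type StringParser = Parser<string> & {
  email: () => StringParser;
  minLength: (min: number) => StringParser;
};
type NumberParser = Parser<number> & { min: (value: number) => NumberParser };

type Full = {
  string: () => StringParser;
  number: () => NumberParser;
};

// Keep only the enabled keys (plus `parse`) and recurse into each return type.
type Narrow<Features, P> = {
  [K in Extract<keyof P, Features | "parse">]: K extends "parse"
    ? P[K]
    : P[K] extends (...args: infer A) => infer R
      ? (...args: A) => Narrow<Features, R>
      : never;
};

// The user enabled only `string` and `email`.
type Enabled = Narrow<"string" | "email", Full>;
type AfterString = ReturnType<Enabled["string"]>;

// Type-level probe: does T have the key K?
type Has<T, K extends PropertyKey> = K extends keyof T ? true : false;

// Each annotation only compiles if the probe resolves to the assigned literal.
const stringOnRoot: Has<Enabled, "string"> = true;
const numberOnRoot: Has<Enabled, "number"> = false; // not enabled
const emailAfterString: Has<AfterString, "email"> = true;
const minLengthAfterString: Has<AfterString, "minLength"> = false; // not enabled
const parseAfterString: Has<AfterString, "parse"> = true; // always kept

console.log(stringOnRoot, numberOnRoot, emailAfterString, minLengthAfterString, parseAfterString);
```

Flipping any of the assigned literals makes the file fail to compile, which is a cheap way to test such types without a runtime.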

Summary

The technique I demonstrated in this article has some obvious advantages in terms of user experience and bundle size optimization, but does it have any drawbacks?

From the perspective of the library consumer, not really. One could argue that library initialization is an extra step, and in cases where the user doesn't care about import size (such as server-side applications), it adds complexity. However, this could be mitigated by adding an additional export from the library with a pre-initialized object that contains the full API.

The cost falls on library authors instead. First, combining all the pieces with a proxy, while usually straightforward, requires writing non-type-safe code with lots of any. Second, providing correct typings for the library consumer may require some serious type gymnastics, which can be daunting for inexperienced type-level programmers.


I'm really interested in hearing what you think about this, so let me know in a comment below or reach me via email at mieszko.sabo@softwaremill.pl.

You might also want to check out the source code for the correttore library I created as a proof of concept: https://github.com/mieszkosabo/correttore.

If you're interested in helping bring this project to a production-ready state, feel free to open a PR with changes that bring it closer to a 1:1 Zod-compatible API.

Tech review by: Robert Lubarski
