Rust Static vs. Dynamic Dispatch
Today, I will talk about the nitty-gritty details of static and dynamic function calls in Rust. Many of you associate this topic directly with polymorphism, but static vs. dynamic dispatch is not only about polymorphism, traits, and trait objects; it's a more general topic, which I elaborate on below. In addition to the dispatch mechanisms themselves, we will also explore related aspects such as inlining, the Sized trait and method bounds, and dynamic dispatch in the async context.
What is a dispatch
You can read the word dispatch as "function call": static and dynamic dispatch are two ways of resolving which function a call actually invokes. As mentioned at the beginning, the dispatch mechanism can be associated with polymorphism and figuring out which implementation of a given trait to call, but it can also be considered more broadly, applying to function and method calls in general. I’ll try to show you both dispatch mechanisms in both scenarios, with and without polymorphism, so that you can better grasp the idea.
Static Dispatch
The simplest case of static dispatch is a plain function call in Rust:
fn compute(x: i32) -> i32 {
    x + 1
}
fn main() {
    let result = compute(5);
    println!("Result: {}", result);
}
The call to compute(5) is resolved at compile time. Static dispatch is the standard case and applies to most regular function calls, where the function being called is known at compile time. The same definition extends to generic functions, as long as the concrete types are known at compile time:
fn max_value<'a, T: PartialOrd>(a: &'a T, b: &'a T) -> &'a T {
    if a > b {
        a
    } else {
        b
    }
}
fn main() {
    // Using max_value with integers
    let num1 = 10;
    let num2 = 20;
    let max_num = max_value(&num1, &num2);
    println!("The maximum number is {}", max_num);
    // Using max_value with floating-point numbers
    let float1 = 15.5;
    let float2 = 8.75;
    let max_float = max_value(&float1, &float2);
    println!("The maximum float is {}", max_float);
    // Using max_value with chars
    let char1 = 'a';
    let char2 = 'z';
    let max_char = max_value(&char1, &char2);
    println!("The maximum char is {}", max_char);
}
Because the compiler can resolve each type parameter to a concrete type, and therefore to a specific implementation, at compile time, this is another example of static dispatch, just like the regular function before.
The previous example used the PartialOrd trait with its standard library implementations for integers, floats, and chars, but we can define our own traits and implementations to make the matter a bit clearer:
trait Speak {
    fn speak(&self) -> String;
}
struct Dog;
struct Cat;
impl Speak for Dog {
    fn speak(&self) -> String {
        "Woof".to_string()
    }
}
impl Speak for Cat {
    fn speak(&self) -> String {
        "Meow".to_string()
    }
}
fn make_noise<T: Speak>(animal: &T) {
    println!("{}", animal.speak());
}
fn main() {
    let dog = Dog;
    let cat = Cat;
    make_noise(&dog); // Here, the compiler knows to call Speak for Dog at compile time.
    make_noise(&cat); // Here, the compiler knows to call Speak for Cat at compile time.
}
The function make_noise accepts any type that implements the Speak trait. For such cases, Rust uses static dispatch to execute the speak method for a specific type. Here’s what happens:
- The Rust compiler determines the concrete type of the argument (Dog or Cat) at compile time.
- For each type, Rust generates a tailor-made version of the make_noise function, resulting in one specific version for Dog and another for Cat.
Since these function calls are resolved at compile time, no runtime type checking or dynamic dispatch is needed. This results in faster and more predictable execution, as the function addresses are already known and embedded in the code. This transformation from a generic function into multiple type-specific versions is called monomorphization, and it makes calls to speak as efficient as if they had been written without any traits or generics. However, monomorphization also has its drawbacks: it can increase compilation times and binary size, and the multiple copies of nearly identical code can make the CPU instruction cache less effective.
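To make monomorphization more tangible, here is a minimal sketch that places the generic function next to hand-written equivalents of its two instantiations. The names noise, noise_for_dog, and noise_for_cat are made up for this illustration; the functions the compiler really generates have mangled names and are not written by anyone.

```rust
trait Speak {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Speak for Dog {
    fn speak(&self) -> String {
        "Woof".to_string()
    }
}

impl Speak for Cat {
    fn speak(&self) -> String {
        "Meow".to_string()
    }
}

// The generic function as it appears in the source code.
fn noise<T: Speak>(animal: &T) -> String {
    animal.speak()
}

// Hypothetical hand-written counterparts of the two instantiations.
// Each one calls a single, statically known implementation.
fn noise_for_dog(animal: &Dog) -> String {
    Dog::speak(animal)
}

fn noise_for_cat(animal: &Cat) -> String {
    Cat::speak(animal)
}

fn main() {
    // The turbofish spells out the instantiation that is normally inferred.
    println!("{}", noise::<Dog>(&Dog));
    println!("{}", noise::<Cat>(&Cat));
    // The hand-written versions behave the same way.
    println!("{}", noise_for_dog(&Dog));
    println!("{}", noise_for_cat(&Cat));
}
```

Every new type passed to noise adds another copy like these, which is where the extra compile time and code size come from.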
Note on #[inline] annotations
It’s often stated that in the case of generics and static dispatch, the compiler will inline functions for specific types.
This statement is only partially true: monomorphization makes inlining possible, but the standard inlining heuristics still decide whether it happens. The Rust compiler is quite adept at deciding when to inline a function; the decision depends on multiple factors, including the size and complexity of the function, how often it is called, and the optimization level (inlining mostly happens in optimized builds, such as those made with cargo build --release).
In many cases, you might still find it beneficial to explicitly use the #[inline] or #[inline(always)] annotations. Both are hints to the compiler rather than guarantees. Whenever you encounter performance-sensitive sections of your code or have a small function that is called frequently, consider marking the function with #[inline] or #[inline(always)] and measure whether performance actually improves. The annotations are most useful for small, non-generic utility functions in a library crate that you believe would benefit from inlining into crates that depend on it; generic functions are already instantiated in the crate that uses them.
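As a small illustration of where these attributes go, consider the sketch below. The functions clamp_to_byte and sum_clamped are hypothetical helpers invented for this example, and #[inline(never)] is shown only for contrast; whether any of these hints helps has to be measured.

```rust
/// Small, frequently called helper: a typical candidate for `#[inline]`.
/// The hint also makes the body available for inlining in dependent crates.
#[inline]
pub fn clamp_to_byte(value: i32) -> u8 {
    value.clamp(0, 255) as u8
}

/// `#[inline(never)]` asks the compiler to keep a real call, which can be
/// handy when the function should show up separately in a profiler.
#[inline(never)]
pub fn sum_clamped(values: &[i32]) -> u32 {
    values.iter().map(|&v| u32::from(clamp_to_byte(v))).sum()
}

fn main() {
    // -5 clamps to 0, 100 stays, 300 clamps to 255.
    println!("{}", sum_clamped(&[-5, 100, 300]));
}
```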
Dynamic Dispatch
As you can imagine by now, dynamic dispatch differs significantly from static dispatch. With dynamic dispatch, you can call a method on a trait without knowing the specific type or implementation of the object at compile time; this resolution occurs at runtime.
Using our trivial Dog and Cat example, we can create a dynamic version like the one below:
trait Speak {
    fn speak(&self) -> String;
}
struct Dog;
struct Cat;
impl Speak for Dog {
    fn speak(&self) -> String {
        "Woof".to_string()
    }
}
impl Speak for Cat {
    fn speak(&self) -> String {
        "Meow".to_string()
    }
}
fn make_noise(animal: &dyn Speak) {
    println!("{}", animal.speak());
}
fn main() {
    let dog = Dog;
    let cat = Cat;
    make_noise(&dog); // The specific method is looked up in the vtable at runtime.
    make_noise(&cat); // The specific method is looked up in the vtable at runtime.
}
The interesting part is, of course, the make_noise function, which takes an argument of type &dyn Speak. This time, the exact method to execute is resolved at runtime using a virtual method table (vtable). A &dyn Speak reference carries two pieces of information: the address of the value and the address of the vtable for the value's concrete type. At the call site, the compiled code loads the address of speak from that vtable and calls it with the data pointer as self.
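The lookup can be imitated by hand. The following is only a sketch of the idea, not what rustc emits: SpeakVtable, DynSpeak, and the helper functions are hypothetical names, and the vtables that current rustc generates also record the size, alignment, and drop function of the concrete type.

```rust
use std::marker::PhantomData;

struct Dog {
    name: String,
}

struct Cat {
    name: String,
}

// One function pointer per trait method.
struct SpeakVtable {
    speak: fn(*const ()) -> String,
}

fn dog_speak(data: *const ()) -> String {
    // SAFETY: DOG_VTABLE is only ever paired with a pointer to a live Dog.
    let dog = unsafe { &*(data as *const Dog) };
    format!("{} says Woof", dog.name)
}

fn cat_speak(data: *const ()) -> String {
    // SAFETY: CAT_VTABLE is only ever paired with a pointer to a live Cat.
    let cat = unsafe { &*(data as *const Cat) };
    format!("{} says Meow", cat.name)
}

static DOG_VTABLE: SpeakVtable = SpeakVtable { speak: dog_speak };
static CAT_VTABLE: SpeakVtable = SpeakVtable { speak: cat_speak };

// Hand-rolled stand-in for `&'a dyn Speak`: data pointer plus vtable pointer.
#[derive(Clone, Copy)]
struct DynSpeak<'a> {
    data: *const (),
    vtable: &'static SpeakVtable,
    _borrow: PhantomData<&'a ()>,
}

impl<'a> DynSpeak<'a> {
    fn from_dog(dog: &'a Dog) -> Self {
        DynSpeak { data: dog as *const Dog as *const (), vtable: &DOG_VTABLE, _borrow: PhantomData }
    }

    fn from_cat(cat: &'a Cat) -> Self {
        DynSpeak { data: cat as *const Cat as *const (), vtable: &CAT_VTABLE, _borrow: PhantomData }
    }
}

// The callee never learns the concrete type; it only follows the vtable.
fn make_noise(animal: DynSpeak<'_>) -> String {
    (animal.vtable.speak)(animal.data)
}

fn main() {
    let dog = Dog { name: "Rex".to_string() };
    let cat = Cat { name: "Tom".to_string() };
    println!("{}", make_noise(DynSpeak::from_dog(&dog)));
    println!("{}", make_noise(DynSpeak::from_cat(&cat)));
}
```

With a real trait object, the compiler builds the vtables and performs the pointer casts for you, so none of the unsafe code is needed.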
Wide Pointers and Trait Objects
The reason we pass animal to our function by reference is that at compile time we don’t know the size of the value behind dyn Speak, but we do know the size of a pointer to it, so the function can be compiled. Because we pass not only a pointer to the value but also the address of a vtable where the method can be resolved, the reference is a so-called wide pointer.
Not all wide pointers point to trait objects (slice references such as &[T] and &str are wide pointers that carry a length), but a pointer to a trait object is always wide: a pointer with additional information. In this case, it consists of a pointer to the data of the concrete type and a pointer to a vtable that contains the implementations of the trait methods for that specific type.
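The difference between thin and wide pointers can be observed with std::mem::size_of. This is a minimal sketch; the printed numbers depend on the target, and on a typical 64-bit platform a thin pointer takes 8 bytes and a wide pointer 16.

```rust
use std::mem::size_of;

trait Speak {
    fn speak(&self) -> String;
}

struct Dog;

impl Speak for Dog {
    fn speak(&self) -> String {
        "Woof".to_string()
    }
}

fn main() {
    let animal: &dyn Speak = &Dog;
    println!("{}", animal.speak());

    // Thin pointer: just the address of the value.
    println!("usize:          {} bytes", size_of::<usize>());
    println!("&Dog:           {} bytes", size_of::<&Dog>());
    // Wide pointers to trait objects: data address plus vtable address.
    println!("&dyn Speak:     {} bytes", size_of::<&dyn Speak>());
    println!("Box<dyn Speak>: {} bytes", size_of::<Box<dyn Speak>>());
    // Wide pointer that is not a trait object: data address plus length.
    println!("&[u8]:          {} bytes", size_of::<&[u8]>());
}
```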
It turns out that you can use any pointer type able to hold a wide pointer for dynamic dispatch, such as Box and Arc. See the example below:
use std::sync::Arc;
trait Animal {
    fn speak(&self);
}
struct Dog;
struct Cat;
impl Animal for Dog {
    fn speak(&self) {
        println!("Woof!");
    }
}
impl Animal for Cat {
    fn speak(&self) {
        println!("Meow!");
    }
}
fn main() {
    // Using & reference for dynamic dispatch
    let dog = Dog;
    let cat = Cat;
    let animals: Vec<&dyn Animal> = vec![&dog, &cat];
    for animal in animals {
        animal.speak();
    }
    // Using Box for dynamic dispatch
    let boxed_dog: Box<dyn Animal> = Box::new(Dog);
    let boxed_cat: Box<dyn Animal> = Box::new(Cat);
    boxed_dog.speak();
    boxed_cat.speak();
    // Using Arc for dynamic dispatch
    let arc_dog: Arc<dyn Animal> = Arc::new(Dog);
    let arc_cat: Arc<dyn Animal> = Arc::new(Cat);
    arc_dog.speak();
    arc_cat.speak();
}
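Dynamic dispatch earns its keep when the concrete type really is unknown until the program runs. The sketch below assumes a hypothetical factory function, animal_from_name, that picks the implementation from a runtime string; in this variant speak returns a String so the result is easy to inspect.

```rust
trait Animal {
    fn speak(&self) -> String;
}

struct Dog;
struct Cat;

impl Animal for Dog {
    fn speak(&self) -> String {
        "Woof!".to_string()
    }
}

impl Animal for Cat {
    fn speak(&self) -> String {
        "Meow!".to_string()
    }
}

// Hypothetical factory: the concrete type depends on a runtime value,
// so the caller can only hold the result as a trait object.
fn animal_from_name(name: &str) -> Option<Box<dyn Animal>> {
    match name {
        "dog" => Some(Box::new(Dog)),
        "cat" => Some(Box::new(Cat)),
        _ => None,
    }
}

fn main() {
    // In a real program the names could come from user input or a config file.
    let names = ["dog", "cat", "cow"];
    let animals: Vec<Box<dyn Animal>> = names
        .iter()
        .filter_map(|name| animal_from_name(name))
        .collect();
    for animal in &animals {
        println!("{}", animal.speak());
    }
}
```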
Self: Sized Trait Bound
The Sized trait bound, which plays a special role in the context of dynamic dispatch, can be used to control whether traits and methods can be used with dynamic dispatch. This can have consequences for the users of your traits or methods. By using the Self: Sized bound, you can prevent trait object creation and enforce static dispatch.
The Sized trait is automatically implemented for most types in Rust; it means that the size of the type is known at compile time. The Self: Sized bound, when added to a trait or to a method within a trait, specifies that the trait or method can only be used with types that are Sized, in other words, types whose size is known at compile time.
As mentioned in the earlier section, the reason we put a trait object behind a pointer is precisely that we don’t know the size of the value implementing the trait. Trait object types such as dyn Speak are !Sized, which means their size is not known at compile time; the pointer to a trait object can point to an instance of any implementing type. In other words, if we declare the trait with a Sized supertrait bound (trait MyTrait: Sized), we are saying that the trait can only be implemented for types whose size is known at compile time, so we cannot use this trait to create trait objects (hence we cannot use it with dynamic dispatch).
trait MyTrait: Sized {
    fn do_something(&self);
}
struct MyStruct;
impl MyTrait for MyStruct {
    fn do_something(&self) {
        println!("Doing something!");
    }
}
fn use_trait_object(trait_obj: &dyn MyTrait) {
    // This would not compile because MyTrait is not object-safe due to Self: Sized bound.
    trait_obj.do_something();
}
A Self: Sized bound on a specific method within a trait makes only that method unusable through a trait object; the trait object itself can still be constructed and used.
trait MyTrait {
    fn do_something(&self) where Self: Sized;
    fn do_anything(&self);
}
struct MyStruct;
impl MyTrait for MyStruct {
    fn do_something(&self) {
        println!("Doing something specific!");
    }
    fn do_anything(&self) {
        println!("Doing anything!");
    }
}
fn use_trait_object(trait_obj: &dyn MyTrait) {
    trait_obj.do_something(); // This will not compile because do_something requires Self: Sized.
    trait_obj.do_anything(); // This works fine.
}
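A common reason to put Self: Sized on a single method is a method that returns Self by value, which cannot work through a trait object. The sketch below uses a hypothetical Shape trait invented for this illustration: scaled is only callable on concrete types, while area stays available through dyn Shape.

```rust
trait Shape {
    // Usable through `dyn Shape`.
    fn area(&self) -> f64;

    // Returns `Self` by value, so it is opted out of dynamic dispatch.
    fn scaled(&self, factor: f64) -> Self
    where
        Self: Sized;
}

struct Square {
    side: f64,
}

impl Shape for Square {
    fn area(&self) -> f64 {
        self.side * self.side
    }

    fn scaled(&self, factor: f64) -> Self {
        Square { side: self.side * factor }
    }
}

// Dynamic dispatch: only the methods without the bound are reachable here.
fn total_area(shapes: &[Box<dyn Shape>]) -> f64 {
    shapes.iter().map(|shape| shape.area()).sum()
}

fn main() {
    let square = Square { side: 2.0 };
    // Static dispatch: the concrete type is known, so `scaled` is callable.
    let bigger = square.scaled(1.5);
    let shapes: Vec<Box<dyn Shape>> = vec![Box::new(square), Box::new(bigger)];
    println!("{}", total_area(&shapes));
}
```

Without the where clause, the Self return type alone would prevent dyn Shape from being formed at all.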
Dynamic Dispatch in the async world
Async methods in traits bring some additional challenges and nuances. Since Rust 1.75, a trait can declare an async fn directly, but such a trait cannot be used as a trait object. An async method returns a Future, and each implementation returns its own anonymous future type, whose size and layout depend on the state captured by the async block. Each of these types is Sized on its own, but there is no single return type that a caller going through dyn Trait could rely on.
For the same reason, a trait method called through dyn Trait cannot return an unboxed future: the caller would have to reserve space for a value whose concrete type and size it does not know. This is analogous to the limitation posed by Self: Sized, but from the opposite direction: instead of the trait requiring a known size, the return type has no single known size across implementations.
To overcome this problem, we can return something that is Sized and contains our future so that we can work with it later. For that, we can use Pin<Box<dyn Future<Output = T> + Send>> as the return type of trait methods. This approach encapsulates the future in a trait object, making it possible to handle different future types under a unified interface.
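The effect of boxing can be made visible with std::mem::size_of_val. In this sketch, small and large are made-up async functions: each produces its own future type, and those types differ in size, whereas the boxed versions are all the same pointer-sized value. Nothing is awaited here, so no runtime is needed.

```rust
use std::future::Future;
use std::mem::size_of_val;
use std::pin::Pin;

async fn small() -> u8 {
    1
}

async fn large(seed: u8) -> u8 {
    let buffer = [seed; 1024];
    // `buffer` is used after the await point, so it must live in the future.
    std::future::ready(()).await;
    buffer[usize::from(seed)]
}

// The unified type that a trait method could return for both.
type BoxedFuture = Pin<Box<dyn Future<Output = u8> + Send>>;

fn main() {
    // Two different anonymous future types with different sizes.
    println!("small future: {} bytes", size_of_val(&small()));
    println!("large future: {} bytes", size_of_val(&large(7)));

    // Behind a box, both are represented by one wide pointer.
    let boxed_small: BoxedFuture = Box::pin(small());
    let boxed_large: BoxedFuture = Box::pin(large(7));
    println!("boxed small:  {} bytes", size_of_val(&boxed_small));
    println!("boxed large:  {} bytes", size_of_val(&boxed_large));
}
```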
Continuing with our Dog/Cat example, we can write the same thing in an async way using the boxing approach:
use std::future::Future;
use std::pin::Pin;
trait Animal {
    fn speak(&self) -> Pin<Box<dyn Future<Output = String> + Send>>;
}
struct Cat;
struct Dog;
impl Animal for Cat {
    fn speak(&self) -> Pin<Box<dyn Future<Output = String> + Send>> {
        Box::pin(async move {
            let response = "Meow".to_string();
            println!("{}", response);
            response
        })
    }
}
impl Animal for Dog {
    fn speak(&self) -> Pin<Box<dyn Future<Output = String> + Send>> {
        Box::pin(async move {
            let response = "Woof".to_string();
            println!("{}", response);
            response
        })
    }
}
#[tokio::main]
async fn main() {
    let cat = Cat;
    let dog = Dog;
    let cat_says = cat.speak().await;
    let dog_says = dog.speak().await;
    println!("The cat says: {}", cat_says);
    println!("The dog says: {}", dog_says);
}
The speak method now returns Pin<Box<dyn Future<Output = String> + Send>>. The return type is a boxed future, which is Sized. Pin is used here to guarantee that the future is not moved in memory once it has been pinned. This is a safety mechanism, because the state of a future can contain self-references. In both Dog and Cat, the async blocks are manually boxed with Box::pin, so the future is placed on the heap and a pinned pointer to it is returned. The move keyword moves all captured variables into the future’s state.
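Because speak returns a boxed future, the trait can be used through dyn Animal, for example in a heterogeneous collection. The sketch below shows that combination. To stay free of dependencies it replaces the tokio runtime with a tiny hand-written block_on that busy-polls, which is an assumption of this example and only acceptable for futures that complete promptly; BoxedFuture and speak_all are likewise names invented here.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type BoxedFuture = Pin<Box<dyn Future<Output = String> + Send>>;

trait Animal {
    fn speak(&self) -> BoxedFuture;
}

struct Cat;
struct Dog;

impl Animal for Cat {
    fn speak(&self) -> BoxedFuture {
        Box::pin(async { "Meow".to_string() })
    }
}

impl Animal for Dog {
    fn speak(&self) -> BoxedFuture {
        Box::pin(async { "Woof".to_string() })
    }
}

// Waker that does nothing; enough for a loop that simply polls again.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal stand-in for an async runtime (not a replacement for tokio).
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut context) {
            return output;
        }
    }
}

async fn speak_all(animals: &[Box<dyn Animal>]) -> Vec<String> {
    let mut responses = Vec::new();
    for animal in animals {
        // Two indirections: a vtable call to `speak`, then a boxed future.
        responses.push(animal.speak().await);
    }
    responses
}

fn main() {
    let animals: Vec<Box<dyn Animal>> = vec![Box::new(Cat), Box::new(Dog)];
    for response in block_on(speak_all(&animals)) {
        println!("{}", response);
    }
}
```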
Async_trait library
The previous example used Pin<Box<dyn Future<Output = T> + Send>> as the return type for async methods in traits, but we can do better. There is a library that handles this for us automatically, and it's called async_trait.
use async_trait::async_trait;
#[async_trait]
trait Animal {
    async fn speak(&self) -> String;
}
struct Cat;
struct Dog;
#[async_trait]
impl Animal for Cat {
    async fn speak(&self) -> String {
        let response = "Meow".to_string();
        println!("{}", response);
        response
    }
}
#[async_trait]
impl Animal for Dog {
    async fn speak(&self) -> String {
        let response = "Woof".to_string();
        println!("{}", response);
        response
    }
}
#[tokio::main]
async fn main() {
    let cat = Cat;
    let dog = Dog;
    let cat_says = cat.speak().await;
    let dog_says = dog.speak().await;
    println!("The cat says: {}", cat_says);
    println!("The dog says: {}", dog_says);
}
With async_trait, the boxing is automatic: the attribute transforms async trait methods at compile time so that they return Pin<Box<dyn Future<Output = T> + Send>> (or Pin<Box<dyn Future<Output = T>>> if Send is not needed, which is requested with #[async_trait(?Send)]), and we don’t need to write all the boilerplate code associated with pinning and boxing; it's all done for us.
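To get a feeling for what the attribute writes on our behalf, here is a rough, hand-written approximation of the rewritten method. It is a simplified sketch, not the macro's exact output: the real expansion uses its own lifetime names and additional where clauses. The lifetime 'a ties the boxed future to the borrow of self, which lets the async block read fields of the struct. The name field and the busy-polling block_on helper are assumptions added so that the sketch runs without an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

trait Animal {
    // Roughly what `async fn speak(&self) -> String` turns into.
    fn speak<'a>(&'a self) -> Pin<Box<dyn Future<Output = String> + Send + 'a>>;
}

struct Dog {
    name: String,
}

impl Animal for Dog {
    fn speak<'a>(&'a self) -> Pin<Box<dyn Future<Output = String> + Send + 'a>> {
        // The future borrows `self`; the `'a` bound on the return type allows it.
        Box::pin(async move { format!("{} says Woof", self.name) })
    }
}

// Waker that does nothing; enough for a loop that simply polls again.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal stand-in for an async runtime; it busy-polls until completion.
fn block_on<F: Future>(future: F) -> F::Output {
    let mut future = Box::pin(future);
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut context = Context::from_waker(&waker);
    loop {
        if let Poll::Ready(output) = future.as_mut().poll(&mut context) {
            return output;
        }
    }
}

fn main() {
    let animal: Box<dyn Animal> = Box::new(Dog { name: "Rex".to_string() });
    println!("{}", block_on(animal.speak()));
}
```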
Summary
In summary, dynamic dispatch, when compared to static dispatch, can reduce compile times and binary size, as the compiler doesn’t have to generate a copy of the same function for every type. Similarly, there is less pressure on the CPU instruction cache. However, like everything in this world, it has its drawbacks too: the compiler has fewer optimization opportunities and usually cannot inline calls made through a vtable. On top of that, we pay a small runtime cost for looking up the function in the vtable and calling it indirectly.
Understanding these mechanisms and their trade-offs is essential when deciding how to structure your structs, traits, and methods and when judging what performance implications those choices may have. Both mechanisms have advantages and disadvantages and introduce additional aspects we need to consider when maintaining our code.