A Quick but Relevant Digression on Virtual Dispatch

Consider the following example reviewing inherintance and dynamic polymorphism.

struct Animal {
    virtual ~Animal() = default;

    virtual string make_noise() = 0;
};

struct Dog : public Animal {
    ~Dog() override {}

    string make_noise() override { return "bork"; }
};


int main (){
    Animal* dog1 = new Dog();
    // prints "bork"
    cout << dog1->make_noise() << endl;
    delete dog1;

    return 0;
}

In particular, we are able to access Dog’s make_noise method even through a pointer of type Animal*.

dynamic_polymorphism

Let’s take a look at the memory layout of our example of dynamic polymorphism in the pictured above. The dog1 object contains a pointer vptr to the Dog classes’s vtable. This vtable will be shared across all instances of Dog and will hold pointers to all virtual methods from the base Animal class. All instances of Dog will have vptrs directed at the same vtable stored in read-only memory. In this example, there is only one virtual method in the vtable: make_noise.

This is a fine technical implementation of dynamic polymorphism and is perfectly usable in numerous applications, but brings about the following problems for those seeking the most performant code:

  1. Each Dog object will contain its respective vptr, increasing its memory footprint by the size of a pointer.
  2. Each invocation of make_noise() will involve pointer indirection via the vptr and will never be optimized away by the compiler due to the runtime nature of runtime polymorphism.

A common way to avoid these deficiencies is through ✨templates✨.

The Problem at Hand

Before describing CRTP, let’s dive into a specific non-templated implementation.

// Template pattern
struct Animal {
    virtual ~Animal() = default;

    virtual string make_noise() = 0;
    string make_more_noise() {
        return make_noise() + ' ' + make_noise() + ' ' + make_noise();
    }
};

struct Dog : public Animal {
    ~Dog() override {}

    string make_noise() override { return "bork"; }
};

struct Cat : public Animal {
    ~Cat() override {}

    string make_noise() override { return "meow"; }
};

int main (){
    Animal* dog = new Dog();
    Animal* cat = new Cat();
    // prints "bork bork bork"
    cout << dog->make_more_noise() << endl;
    // prints "meow meow meow"
    cout << cat->make_more_noise() << endl;
    delete dog;
    delete cat;

    return 0;
}

Here, we are effectively leveraging the template pattern in the abstract class Animal. Subclasses of Animal must implement the make_noise method, and in doing so will also define the implementation of make_more_noise. This is beneficial as we share the implementation of make_more_noise across all subclasses.

As discussed earlier, however, this implemetation carries the same unperformant characteristics previously discussed: (1) An slightly enlarged memory footprint for each object to accomodate the vptr and (2) additonal pointer indirection during invocations of virtual methods.

CRTP is a templated solution to address this.

The Curious and Recurring Template Solution

Let’s take a moment to analyze the following code. The magic is in the code.

template <typename T>
struct Animal {
    string make_more_noise(){
        auto* t = static_cast<T*>(this);
        return t->make_noise() + ' ' + t->make_noise() + ' ' + t->make_noise();
    }
};

struct Dog : public Animal<Dog>{
    string make_noise(){
        return "bork";
    }
};

struct Cat : public Animal<Cat>{
    string make_noise(){
        return "meow";
    }
};

int main (){
    Dog dog;
    Cat cat;
    // prints "bork bork bork"
    cout << dog.make_more_noise() << endl;
    // prints "meow meow meow"
    cout << cat.make_more_noise() << endl;
}

To those unfamiliar with CRTP, it may not be obvious that this compiles. In defining a subclass of Animal, the definition of Dog passes itself as the template argument to Animal. This all feels very self-referential as the definiton of Dog depends on Animal<Dog> which in turn depends on the definition of Dog again…

For reasons I will not (can not) explain, this compiles and works just as the definition of make_more_noise makes out to be. Instead of using virtual dispatch within the base class to invoke functions of the subclass, that information is passed as a template parameter and parsed at compile time. And since everything is managed at compile time, there is no need for subclasses to internally contain a vptr or lean into pointer indirection during function invocation!

References

  1. Wikipedia - Curiously recurring template pattern