User-Defined Deduction Guides for Class Templates in C++

I remember when I first got serious about modern C++. It was in 2017, my company was starting a new project from scratch, and we chose the latest standard: C++17, which had just been released. Diving into all the modern features of C++, I remember one feature that left me particularly puzzled: user-defined deduction guides. They are used to influence CTAD (Class Template Argument Deduction), which was a major feature of C++17. I was puzzled because I couldn’t understand why anyone would ever need to write such a guide by hand…

Last year, after 7 years of C++17 and above, I finally needed to write my very first deduction guide! 🤯

I know, it took me 15 months to write an article on this topic, but here it is!

Context

I was working on a plugin for QEMU’s ARM system emulator to record (“trace”) the execution of an ARM system running Android. I wrote a generic trace generator, and then 2 specializations for the 32-bit and 64-bit ARM variants, all in the same plugin. The code avoided virtual functions as much as possible because the application was highly performance-sensitive and I didn’t want to pay the cost of vtables and virtual calls.

Genericity was mainly achieved by using templates. A lot of templates. Most of these templates worked well with the automatic template deduction rules. Most of them, but not all of them. This is the context where I needed my own deduction guide: writing templates to trace the execution of a CPU, with all the modifications to its registers and to the system memory.

In the following, I will of course not show the real code, but minimal dummy code to illustrate the situation. It really is dummy code: don’t expect well-designed classes, just plausible code where a “user-defined deduction guide” is necessary. However, I think that somewhat realistic code samples will be more instructive than Foo-Bar examples.

An Example Where CTAD Fails

Base class template

Let’s start with the base class template for a generic trace generator:

#include <iostream>
#include <tuple>
#include <utility>

template <typename Registers, typename... MemoryReaders>
class TraceGenerator {
public:
	explicit TraceGenerator(std::tuple<MemoryReaders...>&& readers) :
		registers_m{},
		// readers is a named rvalue reference: std::move avoids copying the tuple
		readers_m{std::move(readers)} {}

	virtual ~TraceGenerator() = default;

	void generate() {
		std::cout << __PRETTY_FUNCTION__ << '\n';
		some_architecture_specific_function();
		// ...
		// Let's omit the code that uses registers_m and readers_m, for the sake of simplicity
	}

private:
	Registers registers_m;
	std::tuple<MemoryReaders...> readers_m;

	virtual void some_architecture_specific_function() = 0;
};

The template has 2 parameters:

- a type Registers, with the registers of the target architecture / CPU.
- a list of types MemoryReaders, to read the memories of the target machine.

There is one pure virtual function, so this class template must be derived from to implement that function. Yes, I said that I avoided virtual functions as much as possible, not that I avoided them entirely 😉

Specialization for ARM

Android runs on ARM processors. We need something for Android.

struct ArmRegisters {};

template <typename... MemoryReaders>
class ArmTraceGenerator : public TraceGenerator<ArmRegisters, MemoryReaders...> {
public:
	using TraceGenerator<ArmRegisters, MemoryReaders...>::TraceGenerator;

private:
	void some_architecture_specific_function() override {
		std::cout << __PRETTY_FUNCTION__ << '\n';
	}
};

We implement the pure virtual function and we set the Registers template parameter, but we still have a class template: the MemoryReaders pack still has to be set to create an actual class. Because constructors are not automatically inherited in C++, we must add this using-declaration to expose the base class constructors (see section “Inheriting constructors” from “Using-declaration” on cppreference). And yes: because the content of Registers is not actually used here, an empty structure is enough for our example.

Finally, we can write a simple main() that uses this class template to create an actual generator for our target. Let’s say this machine has 2 memories, one RAM, one FLASH. We create two reader classes, a tuple holding one instance of each of these classes, and we can finally create the trace generator:

struct RamReader {};

struct FlashReader {};

int main() {
	auto generator = ArmTraceGenerator(std::tuple{RamReader{}, FlashReader{}});
	generator.generate();
}

We create our object with auto generator = ArmTraceGenerator(...) without explicitly specifying the types in the parameter pack. This means we are asking the compiler to deduce the MemoryReaders parameter pack and instantiate the appropriate ArmTraceGenerator class.

However, as beautiful and valid as this code seems, it doesn’t compile (*):

main.cpp: In function ‘int main()’:
main.cpp:46:79: error: class template argument deduction failed:
   46 |         auto generator = ArmTraceGenerator(std::tuple{RamReader{}, FlashReader{}});
      |                                                                               ^

(... followed by many other template errors, as always with templates)

Why? Because CTAD is not applicable in this situation, as explained in P2582R1:

In C++17, deduction guides (implicit and explicit) are not inherited when constructors are inherited

P1021R6 already suggested changing this behavior for C++20, but it “was not finalized in time”. P2582R1 proposed a similar change, which was accepted for C++23. However, compiler support for P2582R1 “Class template argument deduction from inherited constructors” is still very poor: GCC has implemented it since version 14, but Clang and MSVC haven’t.

(*) = “It doesn’t compile” on my machine because I am compiling this example with GCC 12.2 and -std=c++20.

For this code to compile, we have 4 solutions:

1. Use GCC 14+ and -std=c++23.
2. Don’t rely on CTAD: specify the types explicitly with auto generator = ArmTraceGenerator<RamReader, FlashReader>(...). This is fine here, as there are only 2 types in the parameter pack. Nevertheless, listing all the types by hand can become a nightmare. In my real code, I had a parameter pack with more than 1000 elements, so CTAD was essential.
3. Remove the using TraceGenerator<ArmRegisters, MemoryReaders...>::TraceGenerator declaration to stop inheriting constructors, and provide a compatible constructor directly in the derived class:
```
explicit ArmTraceGenerator(std::tuple<MemoryReaders...>&& readers)
    : TraceGenerator<ArmRegisters, MemoryReaders...>{std::move(readers)} {}
```
4. Write a user-defined deduction guide.
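The second and third options can be checked side by side. The sketch below is a trimmed-down version of the classes above (no virtual function, no output), and the alias names ExplicitGenerator and DeducedGenerator are hypothetical, only there for illustration. It assumes a C++17 compiler:

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

struct ArmRegisters {};
struct RamReader {};
struct FlashReader {};

template <typename Registers, typename... MemoryReaders>
class TraceGenerator {
public:
	explicit TraceGenerator(std::tuple<MemoryReaders...>&& readers) :
		registers_m{},
		readers_m{std::move(readers)} {}

	virtual ~TraceGenerator() = default;

private:
	Registers registers_m;
	std::tuple<MemoryReaders...> readers_m;
};

template <typename... MemoryReaders>
class ArmTraceGenerator : public TraceGenerator<ArmRegisters, MemoryReaders...> {
public:
	// Own constructor instead of the using-declaration: the compiler
	// derives an implicit deduction guide from it, so CTAD works again.
	explicit ArmTraceGenerator(std::tuple<MemoryReaders...>&& readers) :
		TraceGenerator<ArmRegisters, MemoryReaders...>{std::move(readers)} {}
};

// Option 2: spell out the template arguments (hypothetical alias name).
using ExplicitGenerator = ArmTraceGenerator<RamReader, FlashReader>;

// Option 3: let CTAD deduce them through the derived-class constructor.
using DeducedGenerator =
	decltype(ArmTraceGenerator(std::tuple{RamReader{}, FlashReader{}}));

// Both spellings name the same class.
static_assert(std::is_same_v<DeducedGenerator, ExplicitGenerator>);
```

The price of option 3 is that every base class constructor you want to expose has to be repeated in the derived class.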

Let’s explore the last option in that list, the user-defined deduction guide (as it’s the topic of this blog post 😂).

User-Defined Deduction Guide

Since the compiler cannot deduce the template arguments, we can write a deduction guide to help it. This is surprisingly simple:

template <typename... MemoryReaders>
ArmTraceGenerator(std::tuple<MemoryReaders...>) -> ArmTraceGenerator<MemoryReaders...>;

Basically, this guide tells the compiler that constructing an ArmTraceGenerator from a std::tuple<MemoryReaders...> in fact creates an ArmTraceGenerator<MemoryReaders...>.

As with any declaration in C++, a user-defined deduction guide must be visible at the point where it is used, so this guide must be declared before the main() function (and in the same scope as the class template it refers to).

The code now compiles and prints:

void TraceGenerator<Registers, MemoryReaders>::generate() [with Registers = ArmRegisters; MemoryReaders = {RamReader, FlashReader}]
void ArmTraceGenerator<MemoryReaders>::some_architecture_specific_function() [with MemoryReaders = {RamReader, FlashReader}]
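Instead of reading the deduced types from the program output, the result of the guide can also be checked at compile time. Here is a minimal sketch, assuming C++17 and a trimmed-down copy of the classes above; the helper make_generator() is hypothetical and only exists for this check:

```cpp
#include <tuple>
#include <type_traits>
#include <utility>

struct ArmRegisters {};
struct RamReader {};
struct FlashReader {};

template <typename Registers, typename... MemoryReaders>
class TraceGenerator {
public:
	explicit TraceGenerator(std::tuple<MemoryReaders...>&& readers) :
		registers_m{},
		readers_m{std::move(readers)} {}

	virtual ~TraceGenerator() = default;

private:
	Registers registers_m;
	std::tuple<MemoryReaders...> readers_m;
};

template <typename... MemoryReaders>
class ArmTraceGenerator : public TraceGenerator<ArmRegisters, MemoryReaders...> {
public:
	// Inherited constructors: no implicit guide comes with them before C++23.
	using TraceGenerator<ArmRegisters, MemoryReaders...>::TraceGenerator;
};

// The user-defined deduction guide from the article.
template <typename... MemoryReaders>
ArmTraceGenerator(std::tuple<MemoryReaders...>) -> ArmTraceGenerator<MemoryReaders...>;

// Hypothetical helper: builds a generator without naming the reader types.
inline auto make_generator() {
	return ArmTraceGenerator(std::tuple{RamReader{}, FlashReader{}});
}

// The guide deduced the pack {RamReader, FlashReader}.
static_assert(std::is_same_v<decltype(make_generator()),
                             ArmTraceGenerator<RamReader, FlashReader>>);
```

Note that the guide takes the tuple by value while the inherited constructor takes an rvalue reference. The guide only drives deduction; the constructor is then selected by ordinary overload resolution on the deduced class.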

Other Examples

You might be thinking:

OK, fine. You’ve shown an example where CTAD fails, but C++23 solved the issue and a user-defined guide was not even required, as there were workarounds even in C++17. Do you have any other (more solid) examples?

Yes 🥳

Example Where CTAD Fails Without a Workaround (*)

(*) = At least, I haven’t found any.

Let’s consider another class template to handle chunks of memory. It encapsulates a std::span over caller-owned data:

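#include <array>        // std::array, used in main()
#include <cstddef>      // std::size_t
#include <iostream>
#include <iterator>     // std::contiguous_iterator and std::distance, used further down
#include <span>
#include <type_traits>  // std::is_integral_v, std::remove_reference_t

template <typename T>
	requires std::is_integral_v<T>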
class MemoryChunk {
public:
	explicit MemoryChunk(T* data, std::size_t size) :
		chunk_m(data, size) {
		std::cout << __PRETTY_FUNCTION__ << '\n';
	}

	void print_info() const {
		std::cout << __PRETTY_FUNCTION__ << '\n';
		const auto bytes = chunk_m.size() * sizeof(T);
		std::cout << "Memory size = " << bytes << " bytes" << '\n';
	}

private:
	std::span<T> chunk_m;
};

We can use it:

int main() {
    auto data = std::array<int, 16>{};
    auto chunk = MemoryChunk{data.data(), data.size()};
    chunk.print_info();
}

This works perfectly fine: CTAD succeeds and T is deduced as int, as shown in the program output:

MemoryChunk<T>::MemoryChunk(T*, std::size_t) [with T = int; std::size_t = long unsigned int]
void MemoryChunk<T>::print_info() const [with T = int]
Memory size = 64 bytes

Now, let’s say we want to construct MemoryChunks from various types of iterators. We can add a templated constructor:

template <std::contiguous_iterator Iterator>
MemoryChunk(Iterator begin, Iterator end) :
	chunk_m(begin, std::distance(begin, end)) {
	std::cout << __PRETTY_FUNCTION__ << '\n';
}

However, we can clearly see that CTAD won’t work here: T doesn’t appear anywhere in the constructor’s parameters, so the compiler cannot deduce it and needs a user-defined deduction guide.

We must find a way to deduce T from Iterator. Our first attempt could be:

template <typename Iterator>
MemoryChunk(Iterator begin, Iterator end) -> MemoryChunk<decltype(*begin)>;

We can try this new constructor in our main():

int main() {
	auto data = std::array<int, 16>{};
	auto chunk = MemoryChunk{data.begin(), data.end()};
	chunk.print_info();
}

Sadly, this code doesn’t compile… For once, the last template errors are the most informative (in general, we just read the first one and ignore everything after it 😅):

main.cpp:34:1: error: template constraint failure for ‘template<class T>  requires  is_integral_v<T> class MemoryChunk’
main.cpp:34:1: note: constraints not satisfied
main.cpp: In substitution of ‘template<class T>  requires  is_integral_v<T> class MemoryChunk [with T = int&]’:
main.cpp:34:1:   required by substitution of ‘template<class Iterator> MemoryChunk(Iterator, Iterator)-> MemoryChunk<decltype (* begin)> [with Iterator = int*]’

- The last line shows that the compiler is trying to use our guide.
- The first line tells us that the requires clause on T is not satisfied.
- The third line shows that T is int&.

T is deduced as int&, not just int. Indeed, *data.begin() returns a reference, not a copy of the first element. An assertion like static_assert(std::is_same_v<decltype(*data.begin()), int&>); doesn’t fail.
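The same observation can be written down as a handful of compile-time checks. This is a small sketch, assuming C++17; the alias names (Iterator, ConstIterator, Dereferenced, ConstDereferenced) are hypothetical and only serve the illustration:

```cpp
#include <array>
#include <type_traits>
#include <utility>

using Iterator = std::array<int, 16>::iterator;
using ConstIterator = std::array<int, 16>::const_iterator;

// Dereferencing an iterator yields a reference to the element...
using Dereferenced = decltype(*std::declval<Iterator>());
static_assert(std::is_same_v<Dereferenced, int&>);

// ...and a reference type is never an integral type,
// which is why the requires clause on T rejects it.
static_assert(!std::is_integral_v<Dereferenced>);

// Stripping the reference gives back the element type.
static_assert(std::is_same_v<std::remove_reference_t<Dereferenced>, int>);
static_assert(std::is_integral_v<std::remove_reference_t<Dereferenced>>);

// With a const iterator, the const qualifier survives,
// and a const-qualified int still counts as integral.
using ConstDereferenced = decltype(*std::declval<ConstIterator>());
static_assert(std::is_same_v<std::remove_reference_t<ConstDereferenced>, const int>);
static_assert(std::is_integral_v<std::remove_reference_t<ConstDereferenced>>);
```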

We just have to modify the guide to remove the reference and the code works fine:

template <typename Iterator>
MemoryChunk(Iterator begin, Iterator end) -> MemoryChunk<std::remove_reference_t<decltype(*begin)>>;

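Note that guides are SFINAE-friendly: when substituting the deduced types into a guide fails, as with MemoryChunk<int&> above, that guide is silently discarded as long as another candidate remains viable. With GCC, we can keep both guides and the code works. Be aware, though, that the standard ([temp.deduct.guide]) does not allow two deduction guides for the same class template with equivalent parameter lists in the same translation unit, so treat this as an experiment rather than a pattern to rely on.

You could even add an obviously incorrect guide like the one below, and the code would still work: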

template <typename Iterator>
MemoryChunk(Iterator begin, Iterator end) -> MemoryChunk<typename Iterator::dummy_static_field_that_doesnt_exist>;

Example to Override the Deduced Type

For this last example, let’s imagine a super basic Metadata class template:

template <typename T>
struct Metadata{
	T name;
};

We can use it with a string literal:

int main() {
	auto metadata = Metadata{.name = "name of something"};
	static_assert(std::is_same_v<decltype(metadata), Metadata<const char*>>);
}

The assertion shows that CTAD has deduced const char*, the pointer type that a string literal decays to.

But let’s imagine that we want to modify the name in the metadata afterward. metadata.name[0] = 'N'; doesn’t compile: error: assignment of read-only location ‘* metadata.Metadata<const char*>::name’.

We can influence CTAD by writing a deduction guide that forces T to be std::string instead of const char*. Unlike our previous guides, it’s not templated:

Metadata(const char*) -> Metadata<std::string>;

The previous assertion fails and must be replaced with static_assert(std::is_same_v<decltype(metadata), Metadata<std::string>>);. And as desired, metadata.name[0] = 'N'; now compiles.
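For reference, the whole Metadata example fits in a few lines. This sketch uses positional initialization instead of the designated initializer so that it also builds as C++17, and the check_metadata() function is a hypothetical wrapper around the code that would otherwise live in main():

```cpp
#include <cassert>
#include <string>
#include <type_traits>

template <typename T>
struct Metadata {
	T name;
};

// Non-template guide: a const char* initializer selects Metadata<std::string>.
Metadata(const char*) -> Metadata<std::string>;

// Hypothetical usage check, callable from main().
inline void check_metadata() {
	auto metadata = Metadata{"name of something"};
	static_assert(std::is_same_v<decltype(metadata), Metadata<std::string>>);

	// The name is now an owning, mutable std::string.
	metadata.name[0] = 'N';
	assert(metadata.name == "Name of something");
}
```

The guide only applies when T is deduced: writing Metadata<const char*> explicitly still gives a Metadata holding a raw pointer.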

Conclusion

Class Template Argument Deduction (CTAD) was one of the most powerful features of C++17. It greatly simplifies the use of class templates by eliminating the need to explicitly specify template arguments. However, there are situations in which CTAD fails.

User-defined deduction guides are the solution in these cases. They are fairly simple to write and allow you to restore deduction behavior, or even influence it when the default rules do not meet your requirements. You may never need them, but now you know how to write them and in which situations they can be useful.