Automatic Downcasting: How Does It Work?

Posted by khuey on 14 December 2021

Many programming languages include mechanisms for dynamic polymorphism. These pose challenges for debuggers, because viewing only fields from the declared type of a variable may not be particularly useful. Automatically deducing the most-"derived" type and downcasting to it presents the entire object to developers and makes debugging code that uses dynamic polymorphism much more pleasant. Our Pernosco Omniscient Debugger automatically downcasts types that use dynamic polymorphism in supported languages (C++, Rust, and Ada). You might also be familiar with this technique in gdb via the set print object on command. But how is it actually implemented?

What is dynamic polymorphism?

Polymorphism broadly refers to ways in which the same source code in a high level language can run different functions depending on the types of inputs to that function. It is split into static and dynamic polymorphism depending on whether the selection of the function happens at compile time (static) or run time (dynamic). In C++, operator overloading is an example of static polymorphism. The addition operator can be overloaded to do different things to an integer and a matrix. What a + b does in C++ thus depends on the types of a and b. That decision is made at compile time. Virtual functions, on the other hand, are an example of dynamic polymorphism. What base_class->method(arg1, arg2) does in C++ if BaseClass::method is virtual depends on which subtype of BaseClass is present at run time.

In statically typed languages without garbage collection, dynamic polymorphism is typically implemented with a technique called virtual method tables that I'll discuss a a bit later.

What is downcasting?

In languages with classes and inheritance, "downcasting" refers to casting a reference to an instance of a base class to one of its derived classes. I will use it here to refer to casting from a base class to the most-derived class, because that is the most useful class to present to a user of a debugger. The Rust programming language does not have inheritance per se, but it makes sense to think of trait objects as the base class and concrete types as the most-derived class.

Who needs to downcast?

Downcasting is essential to any implementation of dynamic polymorphism. When a function call is made on a polymorphic type (such as our base_class->method(arg1, arg2) example from earlier) the program needs to find the subtype of this instance of base_class in order to find the correct implementation of method to invoke. In C++ and similar languages, the compiler does this by creating a virtual method table (often called a virtual table or even just a vtable) and adding a hidden member variable to the beginning of BaseClass that points to the vtable. There is a vtable for each concrete polymorphic type and they are located in the static data section of the binary. For classes that inherit from BaseClass, their vtables contain a function pointer for each virtual method on BaseClass, and the compiler generates code for this call that looks up method in the vtable and jumps to it. The machine code sequence for calling a virtual function looks something like:

0: At compile time, the compiler recognizes that method is the Nth virtual method on BaseClass.

1: At run time, the program has a BaseClass* that it wants to call method on. It first dereferences the BaseClass* to get its first member: the vtable pointer.

2: The program then loads the Nth function pointer from the vtable (which is located at vtable_pointer + N * sizeof(void*)).

3: The program places all of the function's arguments in their positions according to the calling convention.

4: Finally the program calls the function pointer it just loaded and method has been invoked.

This method of dynamic dispatch has two limitations: it can only dynamically dispatch based on a single type and it requires the number of methods available for dynamic dispatch on a type be fixed at compile time. For statically typed languages, these are typically acceptable tradeoffs for the speed that virtual calls provide compared to other more featured forms of dynamic dispatch.

dynamic_cast<void*>

Astute readers may have noticed that the virtual call sequence above does not mention casting the BaseClass argument to the correct subtype. With virtual calls that is the responsibility of the callee, not the caller. The methods in the vtable, therefore, must be prepared to accept a BaseClass* value and convert it accordingly. The compiler hides this from the programmer by automatically inserting "thunks" that will static_cast<DerivedClass*>(base_class) before calling the methods the programmer wrote. If the derived class does not use multiple inheritance then this static_cast will convert to nothing at the machine code level. If the derived class does use multiple inheritance and BaseClass is not the first base class a pointer adjustment is required to convert to a DerivedClass*, and that is done in the compiler-generated "thunk". Because an adjustment is often unnecessary, this cast is deferred to the callee so that it can be skipped entirely in the common case.

Because the cast from the base type to the derived type is a no-op at the machine code level except for cases of multiple inheritance, the handling of this is specific to C++, which is the only language with multiple inheritance that Pernosco supports. The information necessary to cast from the base type to the derived type is actually present elsewhere in order to implement another feature: C++'s dynamic_cast<void*>. dynamic_cast normally requires run-time type information (RTTI) to be enabled, but any polymorphic object can be dynamically cast to void* even without RTTI, and dynamic_cast<void*> can never fail. dynamic_cast<void*> returns a void* pointing to the beginning of the most derived type and is useful for establishing object identity. This is accomplished by storing another value in the vtable: the offset to the top of the object.

Earlier I said that the function pointer corresponding to the Nth virtual method is located at vtable_pointer + N * sizeof(void*). This additional offset to the top of the object is actually stored before the first virtual method. The full layout of the vtable, with offsets relative to the value of the vtable pointer actually stored in the object, is:

Offset	Value
-0x10	Offset to the top of the object
-0x8	Pointer to RTTI information (null if `-fno-rtti`)
0	First virtual method pointer
0x8	Second virtual method pointer
0x10	Third virtual method pointer
...	Nth virtual method pointer

So a dynamic_cast<void*> loads the vtable pointer from the object, loads the offset located two words before the location the vtable pointer points to, and adjusts the object pointer with the loaded offset to produce a pointer to the most derived type.

In a C++ program this can only be used to convert to a void*, and not to access methods or data members on the type, a debugger can use debug information produced by the compiler to determine the most derived type from the vtable itself.

How does the debugger put it all together?

The final piece of this puzzle is understanding how debuggers match the vtable pointer up to a type. I said earlier that vtable pointers point into the static data section of the binary. Those locations in the static data have symbols with special prefixes indicating that they are vtables of a certain type. In C++ the TV prefix is used to indicate that the symbol is a vtable for a specified type. Thus the debugger can use the vtable pointer to find the appropriate vtable symbol, and use that to find the type the vtable is for. With knowledge of the correct type and the adjusted pointer thanks to the information present to enable dynamic_cast<void*>, a debugger can then display the best view of the object.

What about Rust?

Rust's vtables are a bit different. Instead of embedding the vtable pointer in the object, dynamically polymorphic types in Rust (trait objects) are "fat pointers" that contain separate pointers to data and to a vtable. Storing the vtable pointer outside of the object itself allows extending existing objects with new traits and adding traits to values that aren't objects at all (e.g. primitive values). It also means that the data pointer inside a trait object always points to the most derived type and no work is needed to adjust the data pointer. Like with C++, the object's type can be inferred by the debugger from where the vtable pointer points to in the static data section. Rust uses special global variables to represent its vtables rather than symbol names with special prefixes, but the process is more or less the same.