C enables bugs

The C programming language is not memory safe and it is not type safe. In practice, this means that a simple error that confuses the type of an object¹ (or, as we’ll see, forgets to check the type of an object) can lead to code accessing arbitrary memory. When an adversary is able to arrange for an object of one type to be confused for an object of another type, we call this a type confusion attack, and it frequently results in complete software compromise.

In many cases, a modern C compiler with warnings turned on will catch simple examples of this, but there’s one programming pattern where the compiler cannot help you because the code itself relies on type confusion enabled by guarantees in the language standard. Most other modern (and even not that modern) languages rule this behavior out. That pattern is hand-rolled inheritance.

Let’s take a look at an example. Let’s say we want to have a linked list containing multiple types of objects. Here’s one approach.

enum {
    FOO_TYPE,
    BAR_TYPE,
};

struct Foo {
    // Common members.
    int type;
    char *name;
    struct Foo *next;
    // Unique members.
    char *data;
};

struct Bar {
    // Common members.
    int type;
    char *name;
    struct Foo *next;
    // Unique members.
    long x;
};

Notice that both struct Foo and struct Bar have the same three initial members in common: a type field, a name, and a next pointer.

Structures in C are laid out in memory sequentially, possibly with padding for alignment reasons (C23 §6.7.2.1). The upshot is that we can convert a pointer to a struct Foo into a pointer to a struct Bar and then access the three members they have in common. Here’s an example that walks such a linked list and prints out the name field of each element in the list.

void print_list(struct Foo *head_of_list) {
    for (struct Foo *p = head_of_list; p != NULL; p = p->next) {
        puts(p->name);
    }
}

Notice that even if some of the members of the list are struct Bar, this code works correctly because the only members that are accessed are name and next and those are in the same location in struct Bar.
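One way to make that layout assumption explicit is to ask the compiler to check it. The sketch below restates the two structures and uses C11’s static_assert with offsetof to confirm that the common members sit at the same offsets. The helper common_prefix_matches is a hypothetical name, added only so the same check can be exercised at run time.

```c
#include <assert.h>  // static_assert (C11)
#include <stddef.h>  // offsetof

struct Foo {
    // Common members.
    int type;
    char *name;
    struct Foo *next;
    // Unique members.
    char *data;
};

struct Bar {
    // Common members.
    int type;
    char *name;
    struct Foo *next;
    // Unique members.
    long x;
};

// Compilation fails if the common members ever drift apart.
static_assert(offsetof(struct Foo, type) == offsetof(struct Bar, type),
              "type is at a different offset");
static_assert(offsetof(struct Foo, name) == offsetof(struct Bar, name),
              "name is at a different offset");
static_assert(offsetof(struct Foo, next) == offsetof(struct Bar, next),
              "next is at a different offset");

// Hypothetical helper: the same comparison as a run-time value (1 or 0).
static int common_prefix_matches(void) {
    return offsetof(struct Foo, type) == offsetof(struct Bar, type)
        && offsetof(struct Foo, name) == offsetof(struct Bar, name)
        && offsetof(struct Foo, next) == offsetof(struct Bar, next);
}
```

These assertions only guard the offsets. They say nothing about whether a given pointer really refers to a struct Foo or a struct Bar.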

The problem comes in when you decide to access one of the unique members. Let’s update our print_list function.

void print_list(struct Foo *head_of_list) {
    for (struct Foo *p = head_of_list; p != NULL; p = p->next) {
        printf("%s: %s\n", p->name, p->data);
    }
}

The problem, of course, is that we’ve tried to access the data member of a struct Bar. This is type confusion, and it’s undefined behavior. The compiler is free to do anything it wants with this code, but because it cannot tell that the code has a bug, it’s likely to treat the value it reads from p->data as a pointer to a string and then try to print it. Since long and char * usually have the same alignment and size, this means we’ll be using the x member of the struct Bar as if it were a pointer. This example is likely to crash.

The fix is simple: Don’t access a member unless you know the type of the structure is what you expect. Since C cannot help us here, we have to use the type member to disambiguate. This leads to the following correct code.²

void print_list(struct Foo *head_of_list) {
    for (struct Foo *p = head_of_list; p != NULL; p = p->next) {
        if (p->type == FOO_TYPE) {
            printf("%s: %s\n", p->name, p->data);
        } else {
            struct Bar *b = (struct Bar *)p;
            printf("%s: 0x%lX\n", b->name, (unsigned long)b->x);
        }
    }
}
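The if/else above treats every element that is not a FOO_TYPE as a struct Bar. A slightly more defensive variant, sketched below, switches on the type tag and touches only the common members when the tag is unknown. The function name format_element and its return convention are assumptions made for this illustration, not part of the original code.

```c
#include <stddef.h>
#include <stdio.h>

enum {
    FOO_TYPE,
    BAR_TYPE,
};

struct Foo { int type; char *name; struct Foo *next; char *data; };
struct Bar { int type; char *name; struct Foo *next; long x; };

// Hypothetical helper: formats one element into buf.
// Returns 0 on success, or -1 if the type tag is not recognized.
static int format_element(const struct Foo *p, char *buf, size_t size) {
    switch (p->type) {
    case FOO_TYPE:
        snprintf(buf, size, "%s: %s", p->name, p->data);
        return 0;
    case BAR_TYPE: {
        const struct Bar *b = (const struct Bar *)p;
        snprintf(buf, size, "%s: 0x%lX", b->name, (unsigned long)b->x);
        return 0;
    }
    default:
        // Unknown tag: use only the members every type shares.
        snprintf(buf, size, "%s: <unknown type %d>", p->name, p->type);
        return -1;
    }
}

void print_list(const struct Foo *head_of_list) {
    char line[256];
    for (const struct Foo *p = head_of_list; p != NULL; p = p->next) {
        format_element(p, line, sizeof line);
        puts(line);
    }
}
```

Many compilers can warn about an unhandled enumerator in a switch, but only when the controlling expression has an enumerated type. Here type is a plain int, so that help is not available and the default case is the only safety net.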

Here’s a link to Compiler Explorer with this code if you want to play with it.

C is the problem

The root of the issue here is that C enables this style of code but provides no tools to help you use it safely. Most modern languages simply disallow unchecked conversions like this and give users safe tools for expressing the same pattern.

In Java, for example, we would use a base class that both Foo and Bar inherit from. And if we tried to cast an instance of Foo to Bar, we would get a runtime error, specifically a ClassCastException.

In Rust, this error would be prevented at compile time. It’s simply not possible to treat a Foo as a Bar without using an unsafe block.

Even C++ provides the tools to handle this safely. It won’t stop you from casting a Foo * to a Bar * exactly as in C;³ however, you can use C++’s object-oriented nature to create a base class, as we do with Java, and then use dynamic_cast to convert from a base pointer to a Foo or Bar pointer. We don’t get an exception if this downcast fails;⁴ instead, the result is nullptr.
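C has no dynamic_cast, but its null-on-failure behavior can be imitated by hand. Here is a minimal sketch, assuming the Foo and Bar layout from earlier. The helper names as_foo and as_bar are hypothetical, and nothing in the language forces callers to go through them, which is exactly the weakness described above.

```c
#include <stddef.h>

enum {
    FOO_TYPE,
    BAR_TYPE,
};

struct Foo { int type; char *name; struct Foo *next; char *data; };
struct Bar { int type; char *name; struct Foo *next; long x; };

// Returns p as a struct Bar * if its tag says it is one, NULL otherwise.
static struct Bar *as_bar(struct Foo *p) {
    return (p != NULL && p->type == BAR_TYPE) ? (struct Bar *)p : NULL;
}

// Returns p unchanged if its tag says it is a struct Foo, NULL otherwise.
static struct Foo *as_foo(struct Foo *p) {
    return (p != NULL && p->type == FOO_TYPE) ? p : NULL;
}
```

A caller would write something like `struct Bar *b = as_bar(p); if (b != NULL) { /* use b->x */ }`. The check is only as trustworthy as the type tag itself, which any code with a pointer to the object can overwrite.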

This is not a theoretical problem, but a real one that impacts real code bases, large and small. I ran across an instance of this issue in some code using libxml2. Libxml2 takes XML or (old) HTML and produces a tree of nodes representing the document. There are several different types of nodes, including element nodes, attribute nodes, and text nodes. Element nodes (and text nodes, for that matter) are represented by an xmlNode structure, whereas attributes are represented by an xmlAttr structure. These are structured similarly to struct Foo and struct Bar above in that they have some members in common, including a type, a name, and the pointers children, last, parent, next, and prev that point to other nodes in the tree. Other nodes like the xmlDtd (which comes from a <!DOCTYPE ...>) behave similarly.

In the case I saw, an xmlAttr * was being treated as an xmlNode *, which was fine for traversing the list of attributes on an element. However, the code also accessed the properties field of an xmlNode, and xmlAttr doesn’t have a properties field. This type confusion was leading to crashes. The fix was simple: check the type field, exactly as in my example above.

Note that using an xmlNode * to point to an xmlAttr is an expected use of the API. See, for example, xmlSetNs which sets a namespace on a node of type either XML_ELEMENT_NODE or XML_ATTRIBUTE_NODE. Its first argument is a pointer to an xmlNode so if you wish to set the namespace of an attribute node, you need to cast the address to an xmlNode *.⁵

Not using C is the solution

We need to stop using C when we have alternatives. I see no other way.

To quote Fish in a Barrel, “Stop writing C/C++.”

I don’t only mean an object in the sense of object-oriented programming, but rather the more general notion of a “region of data storage in the execution environment, the contents of which can represent values” (C23 §3.15). ↩
It’s correct, but brittle! If we add a third type of object, then we need to identify all of the places we made a decision based on type and update them. ↩
As cppreference.com points out, “A downcast can also be performed with static_cast, which avoids the cost of the runtime check, but it’s only safe if the program can guarantee (through some other logic) that the object pointed to by [the argument to the dynamic_cast] is definitely [an object of the correct type].” One example of a code base using “some other logic” is LLVM, which uses custom logic to keep track of the types of objects. See the Type class for examples.
Notice how brittle this approach is. Here’s a comment from Type.h.
```
    /// Definitions of all of the base types for the Type system.  Based on this
    /// value, you can cast to a class defined in DerivedTypes.h.
    /// Note: If you add an element to this, you need to add an element to the
    /// Type::getPrimitiveType function, or else things will break!
    /// Also update LLVMTypeKind and LLVMGetTypeKind () in the C binding.
```
↩
dynamic_cast on a reference can raise a std::bad_cast exception since it cannot return nullptr. ↩
In this particular case, the code for xmlSetNs doesn’t switch on the type member. Instead, it checks if type is either XML_ELEMENT_NODE or XML_ATTRIBUTE_NODE and if so, sets the ns member of the xmlNode even though ns isn’t one of the common members that appears in all of these structs. Instead, ns is the first unique member of both the xmlNode and xmlAttr structures. The location of the ns members relative to the start of their respective structures is an undocumented invariant. I keep writing that this style of code is brittle and this is no different. ↩