Arbitrary-sized Integers in C23

2022-12-19 · 4 min read


  • c
  • c23
  • programming

Since 2023 is getting close, and with it (hopefully) an implementation of the C23 spec, in this post I will showcase one of the exciting features coming to C programmers: _BitInt, which allows the programmer to define integers of exactly N bits (in contrast to something like int_least32_t), where N is given at compile time.

In comparison to C++, this resembles std::bitset, but it is actually quite different: the programmer can interact with the integer using arithmetic operations (e.g., + or *), whereas std::bitset only allows bit manipulation.

Rationale

The main reason behind this proposal is to let developers express themselves better when programming algorithms, or more generally applications, that require variables of a specific size, such as cryptography, graphics (RGB values), IP addresses, or embedded applications in general.

For example, an RGB value with K bits per channel (where K is known at compile time) can be declared as:

#include <stddef.h>

constexpr size_t K = /* compile time value */;
typedef _BitInt(K) color;

struct RGB {
    color R;
    color G;
    color B;
}; /* each channel holds exactly K bits; sizeof(struct RGB) is in bytes and may include padding */

The second reason for adding this is developers targeting FPGAs, a type of reconfigurable hardware often used in embedded systems. In these systems, there are two main differences in comparison to traditional desktop CPUs:

  1. Memory is usually expensive: so replacing int (which is often 32 or 64 bits) with _BitInt(X) is much more space-efficient.
  2. Word size is flexible: since FPGAs are used in embedded applications, manufacturers can implement integers of whatever size the application needs. Thus, using integers of a specific size may not hurt performance, and may actually improve it.

Finally, using _BitInt removes the headache of trying to write truly portable code, as the exact size of types like int is not fixed by the specification, which only guarantees a minimum (e.g., int is at least 16 bits).

State of Compilers

Currently (December 2022) the only compiler to implement this feature is Clang, and only because of an earlier LLVM extension called _ExtInt, which did exactly this. So, to test this feature, use Clang version 15 (or earlier versions, with the directive #define _BitInt(X) _ExtInt(X)).

It should be noted that the name _ExtInt is now deprecated, in favour of the new _BitInt type.

Clang’s Implementation

As Clang (LLVM) is currently the only compiler that implements this feature, we have some room to discuss how it is implemented.

In LLVM-IR (the LLVM “assembly language”) there are several integer types. In fact, there are 16,777,215 of them (2^24 - 1). For example, the type i64 denotes a 64-bit integer. So, compiling a _BitInt(N) to the corresponding LLVM-IR type iN is quite straightforward. The main difficulty in implementing this feature is (efficiently) implementing arithmetic, but with enough time this can also be solved.

It may be important to note that for regular CPUs, it is safe to assume this implementation is not the most efficient, but that is fine, as this feature is mostly directed at FPGAs. Note that, for example, on my machine a _BitInt(3) takes one byte in memory.

Some Fun

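Since Clang already supports this feature, we can have some fun! Here, for example, we declare an integer just over one mebibit wide (2^20 + 1 bits, the minimum an unsigned type needs to hold 2^(2^20)) and check whether 2^(2^20) is even or not (just to be sure):

#include <stdbool.h>
#include <stdio.h>

/* Only needed on Clang versions that predate the _BitInt spelling. */
#define _BitInt(X) _ExtInt(X)

int main(void) {
    /* 2^20 + 1 bits: shifting by the full width of the type would be undefined. */
    unsigned _BitInt(1048577) x = ((unsigned _BitInt(1048577)) 1) << (1 << 20);

    bool is_even = x % 2 == 0;
    puts( (is_even) ? "2^(2^20) is even!" : "2^(2^20) is odd!" );
}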

The result:

2^(2^20) is even!

Safe to say my CPU is alright. It might be good to note that the resulting binary is 1.5M in size, which is to be expected (by the way, the assembly (ARM) was just short of 400,000 lines of code). So perhaps creating huge integers is not a great idea.

See also: N2763, for the official specification.

Update (November 2023): Clang has “officially” implemented _BitInt, deprecating _ExtInt. Moreover, Clang now limits the size of _BitInt-declared integers to 128 bits.