Arbitrary-sized Integers in C23
2022-12-19 · 4 min read
- c
- c23
- programming
Since 2023 is getting close, and with it (hopefully) an implementation of the C23 spec, in this post I will showcase one of the exciting features coming to C programmers: _BitInt(N), which allows the programmer to define integers of exactly N bits (in contrast to something like int_least32_t, which only guarantees a minimum width), where N is given at compile time.
C++ programmers may be reminded of std::bitset, but the two are quite different: a _BitInt is a real integer that supports arithmetic operations (e.g., + or *), whereas std::bitset only allows bit manipulation.
Rationale
The main reason behind this proposal is to let developers express themselves better when an algorithm or application requires variables of a specific size, as in cryptography, graphics (RGB values), IP addresses, or embedded applications in general.
For example, the type of an RGB value whose channels are K bits each (where K is known at compile time) can be written as:
constexpr size_t K = /* compile time value */;
typedef _BitInt(K) color;
struct RGB {
color R;
color G;
color B;
}; /* 3 * K value bits; sizeof(struct RGB) counts bytes and may include padding */
The second reason for adding this is to help developers targeting FPGAs, reconfigurable chips that are common in embedded systems. These systems differ from traditional desktop CPUs in two main ways:
- Memory is usually expensive: so, replacing int (which is typically 32 bits) with _BitInt(X) is much more efficient, space wise.
- Word size is flexible: since FPGAs are used in embedded applications, designers implement integers of whatever width the application needs. Thus, using integers of a specific size need not hurt performance, and may actually benefit it.
Finally, using _BitInt removes the headache of trying to write truly portable code, as the size of types like int is not fixed by the specification, but only required to be at least some minimum (e.g., int is at least 16 bits).
State of Compilers
Currently (December 2022) the only compiler to implement this feature is Clang, and only because it already had an extension, previously called _ExtInt, which did exactly this. So, to test this feature, use Clang version 15 (or an earlier version, with the directive #define _BitInt(X) _ExtInt(X)).
It should be noted that the name _ExtInt is now deprecated in favour of the new _BitInt type.
Clang’s Implementation
As Clang (LLVM) is currently the only compiler that implements this feature, we have some room to discuss how it is implemented.
In LLVM-IR (the LLVM “assembly language”) there are many integer types: 16,777,215 of them (2^24 - 1), in fact. For example, the type i64 denotes a 64-bit integer. So, compiling a _BitInt(N) to the corresponding LLVM-IR type iN is quite straightforward. The main difficulty in implementing this feature is implementing arithmetic efficiently, but with enough time this can also be solved.
Note that on regular CPUs it is safe to assume this implementation is not the most efficient, but that is fine, as this feature is mostly directed at FPGAs. For example, on my machine a _BitInt(3) takes one byte in memory.
Some Fun
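Since Clang already supports this feature, we can have some fun! Here, for example, we allocate an integer of just over one mebibit (2^20 bits, plus two more so that 2^(2^20) and the sign bit fit), and check whether 2^(2^20) is even or not (just to be sure):
#include <stdbool.h>
#include <stdio.h>
/* Only needed on older Clang versions that know just _ExtInt. */
#define _BitInt(X) _ExtInt(X)
/* Two bits wider than 2^20: shifting by the full width would be undefined. */
#define WIDTH ((1 << 20) + 2)
int main(void) {
    _BitInt(WIDTH) x = ((_BitInt(WIDTH)) 1) << (1 << 20);
    bool is_even = x % 2 == 0;
    puts(is_even ? "2^(2^20) is even!" : "2^(2^20) is odd!");
}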
The result:
2^(2^20) is even!
Safe to say my CPU is alright. It might be good to note that the resulting binary is 1.5M in size, which is to be expected (by the way, the assembly (ARM) was just short of 400,000 lines of code). So perhaps creating huge integers is not a great idea.
See also: N2763, for the official specification.
Update (November 2023): Clang has “officially” implemented _BitInt, deprecating _ExtInt. Moreover, Clang now limits the size of _BitInt-declared integers to 128 bits.