Choosing an Appropriate Integer Type in C and C++

Introduction

When Dennis Ritchie created C, he made int (a signed integer type) be the default type. The size (number of bits) of an int was deliberately not specified. Even when C was standardized, all that was guaranteed was a minimum size. The rationale was that the size of int should be the “natural” word size for an integer on a given CPU.

If you needed only smaller signed integers and wanted to save a bit of space, Ritchie gave us short; or, if you needed bigger integers, he gave us long. (C99 gave us even bigger integers with long long.) If you only needed unsigned integers, you could include unsigned in a declaration. C99 also gave us specific-sized signed integer type aliases (e.g., int32_t) and unsigned type aliases (e.g., uint32_t).

However, in programming, negative integers (and thus a signed integer type) aren’t needed most of the time. The length of strings, count of objects, size of objects, size of files, etc., are never negative, so unsigned integers suffice. Specific-sized type aliases are needed even less often than signed integers.

Yet I’ve seen a lot of code that uses integer types inappropriately. Such code can convey either underspecified or misleading information to readers (including yourself in several months’ time). It’s best to choose the right integer type for the right purpose.

Guidelines

Here are my guidelines for choosing an integer type:

When representing a count of bytes in memory, use the size_t standard type alias.

This is the type used by both the C and C++ standard libraries, e.g., by memcpy(), strlen(), std::string::size(), etc., so there’s plenty of precedent.
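As a minimal sketch of this guideline, the hypothetical helper below (copy_string() is an illustrative name, not a standard function) keeps the byte count in a size_t from the strlen() call through to malloc() and memcpy(), so no conversion between integer types is needed along the way:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap-allocated copy of s, or NULL if
   allocation fails. The caller is responsible for calling free(). */
char *copy_string(const char *s)
{
    size_t len = strlen(s);        /* bytes, excluding the terminating '\0' */
    char *copy = malloc(len + 1);  /* +1 for the terminator */
    if (copy != NULL)
        memcpy(copy, s, len + 1);  /* copies the terminator too */
    return copy;
}
```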

When representing either the size of or a position within a file on disk, use the off_t POSIX type alias.

If you’re dealing with very large files, on some platforms, you may need to compile with -D_FILE_OFFSET_BITS=64 to get a 64-bit version of off_t.
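For illustration, here is a small sketch that assumes a POSIX system. The helper name file_size() is hypothetical. It returns the size as an off_t because that is the type POSIX gives to the st_size member filled in by stat():

```c
#include <sys/types.h>  /* off_t */
#include <sys/stat.h>   /* stat(), struct stat */

/* Hypothetical helper: returns the size in bytes of the file at path,
   or (off_t)-1 if stat() fails (e.g., the file does not exist). */
off_t file_size(const char *path)
{
    struct stat st;
    if (stat(path, &st) != 0)
        return (off_t)-1;
    return st.st_size;  /* st_size is declared as off_t */
}
```

Storing the result in an int or even a size_t could truncate it on platforms where off_t is wider than those types.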

When representing a count of objects in memory, use size_t also.

This is also the type used by both the C and C++ standard libraries, e.g., by fread() and fwrite().
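The same idea applies to your own functions. The sketch below uses a hypothetical count_negative() function that takes the number of array elements as a size_t and returns its tally as a size_t as well, since neither can ever be negative:

```c
#include <stddef.h>

/* Hypothetical helper: counts how many of the n elements are negative.
   Both the element count and the result are counts of objects, so size_t. */
size_t count_negative(const int values[], size_t n)
{
    size_t found = 0;
    for (size_t i = 0; i < n; ++i) {
        if (values[i] < 0)
            ++found;
    }
    return found;
}
```

A caller would typically pass sizeof array / sizeof array[0] for n, an expression that already has type size_t.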

Only if you need to represent a value contained within a specific number of bits or you need to conform to a specific API, use one of the int8_t, int16_t, int32_t or int64_t type aliases for signed types; or one of the uint8_t, uint16_t, uint32_t, or uint64_t type aliases for unsigned types.

The only time you typically need a fixed-size integer is when you “externalize” a value, e.g., write it to disk or send it over a socket.

Using a fixed-size integer when you don’t actually need a specific number of bits conveys wrong information to the reader.
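To make the “externalize” case concrete, here is a sketch of storing a 32-bit value as exactly four bytes in big-endian order, as might be done before writing to disk or a socket. The function names put_u32_be() and get_u32_be() are hypothetical, and big-endian is only an assumed wire format:

```c
#include <stdint.h>

/* Hypothetical helper: stores value as 4 bytes, most significant first. */
void put_u32_be(uint8_t out[4], uint32_t value)
{
    out[0] = (uint8_t)(value >> 24);
    out[1] = (uint8_t)(value >> 16);
    out[2] = (uint8_t)(value >> 8);
    out[3] = (uint8_t)value;
}

/* Hypothetical helper: reads back a value stored by put_u32_be(). */
uint32_t get_u32_be(const uint8_t in[4])
{
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Here the fixed width is genuinely part of the contract: the on-disk or on-wire layout is four bytes regardless of what int happens to be on the machine doing the reading or writing.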

Furthermore:

When representing an integer value that must be wide enough to hold a pointer (so that a pointer converted to it can be converted back unchanged), use either the standard intptr_t or uintptr_t type alias.
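One common use is doing arithmetic on an address. The sketch below uses a hypothetical is_aligned() helper and assumes a mainstream platform where converting a pointer to uintptr_t yields its numeric address. That mapping is implementation-defined, and uintptr_t itself is optional in the standard, although it is available on all common platforms.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether p is a multiple of alignment.
   alignment must be nonzero. Assumes the conversion to uintptr_t yields
   the numeric address, as it does on mainstream platforms. */
bool is_aligned(const void *p, size_t alignment)
{
    return (uintptr_t)p % alignment == 0;
}
```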

Only if you need negative values, use one of short, int, long, or long long, with int being preferred unless you need either smaller or larger values.

Lastly:

Otherwise, use one of unsigned short, unsigned, unsigned long, or unsigned long long, with unsigned similarly being preferred unless you need either smaller or larger values.

That is, unless you’re dealing with one of the listed cases above, default to using unsigned types.

Conclusion

Choosing the right integer type conveys correct information to the reader and can eliminate run-time checks.
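As a small illustration of the run-time check point, compare two hypothetical validity checks for an index into a table of SLOT_COUNT entries (all names here are made up for the example). The signed version has to test both bounds, whereas the unsigned version needs only the upper bound because a negative index can’t be represented:

```c
#include <stdbool.h>

#define SLOT_COUNT 8u  /* hypothetical table size */

/* Signed index: both the lower and upper bounds must be checked. */
bool is_valid_slot_signed(int i)
{
    return i >= 0 && i < (int)SLOT_COUNT;
}

/* Unsigned index: no negative values exist, so one comparison suffices. */
bool is_valid_slot(unsigned i)
{
    return i < SLOT_COUNT;
}
```

Note that if a caller mistakenly passes -1 to is_valid_slot(), it is converted to a very large unsigned value, which the single upper-bound check still rejects.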

Epilogue

Originally, and up until C99, int was the implicit type; that is, if you didn’t specify any type at all, it was understood to be int. For example:

power( x, n ) /* x and n are int; returns int */
{
    int p;
    for ( p = 1; n > 0; --n )
        p *= x;
    return p;
}

defines a function that has int parameters and returns int, yet int isn’t used in the declaration.

Function prototypes were back-ported from C++ to C89, yet the original “K&R style” function definitions were still allowed all the way up until C23. The ANSI C committee is a conservative bunch.
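For comparison, here is a sketch of how the same power() function looks when written in prototype style with every type spelled out. This is the form that compiles under all standards from C89 through C23:

```c
/* Prototype-style equivalent of the implicit-int power() example:
   the parameter types and return type are all written explicitly. */
int power(int x, int n)
{
    int p;
    for (p = 1; n > 0; --n)
        p *= x;
    return p;
}
```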

Even weirder, pre-C99 also allowed int to be implicit in declarations such as:

i; // int i
*p; // int *p
*a[4]; // int *a[4]
*f(); // int *f()

Fortunately, such declarations have long since been illegal.
