Bit-fields, expressions and types (WG14 N2958)

Joseph Myers

The question of whether the width of a bit-field is part of its type, which is largely left unspecified by the standard after DR#315, was previously discussed in N1260 in 2007. It has since become clear that (a) the question itself is poorly specified, because the type of a bit-field can be relevant in several different contexts in the standard, and different contexts sometimes require different concepts of what the type is; and (b) there are additional issues, not addressed in that paper, concerning which expressions have the special properties associated with a bit-field, and even how to determine the type with which a bit-field is declared.

This paper provides an updated view of the different contexts in which types of bit-fields are relevant, some of which are new with features of C11 or C23, and describes the other related issues. It is intended to provide a description of these issues in a single place, not to propose any changes to the C standard, and does not itself need to be discussed at a WG14 meeting. However, any changes in this regard proposed in future should engage with the existence of the different contexts described, and the kinds of expressions that might or might not be considered bit-fields, to avoid the problems that arise when it is supposed that a single notion of the type of a bit-field is sufficient. N1260 may be referred to for more details of the textual history and old Defect Reports relating to bit-fields.

Some issues described here may only apply when an implementation accepts implementation-defined types (outside the list of types required to be accepted) as the declared types of bit-fields.

What expressions are bit-fields?

Whether an expression is a bit-field is unambiguous in the case of an lvalue: an lvalue for a bit-field results from the use of the . member access operator, naming a bit-field member, on an lvalue for a structure or union, or the -> operator, naming a bit-field member, on a pointer to a structure or union; an lvalue for a bit-field, when enclosed in parentheses or selected by _Generic, remains an lvalue for a bit-field.

Whether a non-lvalue expression is a bit-field is less clear. A bit-field member may be selected with . from a non-lvalue structure or union (the results of an assignment, conditional expression, comma expression or function call with structure or union type), and those cases are also clearly bit-fields. The ambiguous cases are comma expressions whose second operand is a bit-field expression, and assignment expressions for assignment to an lvalue for a bit-field (and expressions constructed from those using comma expressions, parentheses and _Generic, recursively).
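The forms described in the two paragraphs above can be written out as a minimal sketch. The structure, the helper make_s and the function names are hypothetical and chosen only for illustration; the left operand of the comma operator is written as (void)0 rather than 0 solely to avoid unused-value warnings.

```c
#include <assert.h>

struct s { unsigned int u : 3; };

/* Returns a structure by value, so a call to it is a non-lvalue. */
static struct s make_s(unsigned int value)
{
    struct s r;
    assert(value < 8u);     /* keep the value representable in 3 bits */
    r.u = value;
    return r;
}

/* Lvalue forms: ".", "->" and a parenthesized member access all
   designate the bit-field and can be assigned to. */
unsigned int lvalue_forms(void)
{
    struct s v = { 0 };
    struct s *p = &v;
    v.u = 1;
    p->u = 2;
    (v.u) = 3;
    return v.u;
}

/* Non-lvalue, but clearly a bit-field: "." applied to a non-lvalue structure. */
unsigned int member_of_call(void)
{
    return make_s(5).u;
}

/* Ambiguous case: comma expression whose second operand is a bit-field. */
unsigned int comma_form(void)
{
    struct s v = { 6 };
    return ((void)0, v.u);
}

/* Ambiguous case: assignment to an lvalue for a bit-field. */
unsigned int assignment_form(void)
{
    struct s v = { 0 };
    return (v.u = 7);
}
```

All of these forms yield the stored value; the ambiguity concerns only whether the last two expressions count as bit-fields for the purposes discussed next.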

This is significant in at least two places in the standard. The rules for integer promotions refer to “A bit-field of type _Bool, int, signed int, or unsigned int”, where “type” must be understood from the context to mean declared type; but it is unclear what expressions count as bit-fields, and it is left unspecified how bit-fields with another declared type are handled (even if the type restricted by the width is narrower than int). The specification for sizeof disallows as an operand “an expression that designates a bit-field member”; so does that for typeof in N2927 (accepted for C23). The proposal N2945 would introduce a third case in which it is significant whether an expression is a bit-field: integer promotions for _BitInt bit-fields.

As a concrete example to illustrate these issues, consider the code:

// Example 1
struct s { unsigned int u : 1; } v;
int i = _Generic((0, v.u) << 0, int : 0);
int j = sizeof(0, v.u);

Empirically, implementations (at least GCC and Clang) agree that this code is valid. That is, the expression (0, v.u) is a valid operand to sizeof: it is not “an expression that designates a bit-field member”. But it is also promoted to type int by the integer promotions, as shown by the use of _Generic being accepted. So one of two things must hold. Either the expression is considered “A bit-field of type _Bool, int, signed int, or unsigned int”, so that the special rule for integer promotions of bit-fields applies; or it is considered to have “an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int” for the purposes of the rules about integer promotions (that is, it is considered to have a type like unsigned int : 1 for the purposes of that rule).

A small variation on this example is:

// Example 2
struct s { unsigned long int u : 1; } v;
int i = _Generic((0, v.u) << 0, int : 0);
int j = sizeof(0, v.u);

Again, implementations accept this code, but the special rule about bit-fields for integer promotions no longer applies (unless bit-fields of implementation-defined types are considered also to be covered by it), indicating that “type” is being interpreted to include width for the purposes of integer promotion.

However, the result of sizeof differs between implementations for these examples.
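The observations about Examples 1 and 2 can be turned into probes that compile whichever interpretation an implementation takes, by giving _Generic a default association. This is only a sketch: the names narrow_uint, narrow_ulong, IS_INT and the probe functions are hypothetical, the unsigned long int bit-field assumes an implementation that accepts that declared type, and (void)0 replaces 0 as the left comma operand only to avoid warnings.

```c
#include <assert.h>   /* static_assert, as spelled in C11 */
#include <stddef.h>

struct narrow_uint  { unsigned int u : 1; } nu;
struct narrow_ulong { unsigned long int u : 1; } nl;  /* implementation-defined declared type */

/* 1 if the expression has type int, 0 otherwise; the operand is not evaluated. */
#define IS_INT(e) _Generic((e), int : 1, default : 0)

/* A plain member access is unambiguously a bit-field, and int can represent
   both values of a 1-bit unsigned field, so promotion to int is required. */
static_assert(IS_INT(nu.u << 0), "unsigned int : 1 promotes to int");

/* Comma expressions: whether these count as bit-fields is the open question. */
int comma_uint_promotes_to_int(void)  { return IS_INT(((void)0, nu.u) << 0); }
int comma_ulong_promotes_to_int(void) { return IS_INT(((void)0, nl.u) << 0); }

/* sizeof accepts these operands because they are comma expressions; the
   result depends on which notion of type the implementation applies. */
size_t comma_uint_size(void)  { return sizeof((void)0, nu.u); }
size_t comma_ulong_size(void) { return sizeof((void)0, nl.u); }
```

On the implementations described above, both promotion probes are expected to report int. The two size functions are deliberately given no expected value, since that is where implementations differ.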

Contexts for bit-field types

The following are some of the contexts in which the term “type” may be applied to a bit-field, or to an expression that, as described above, might or might not be considered a bit-field for certain purposes. Anything addressing issues with types of bit-fields might need to introduce different terminology for the different notions of “type” and ensure that appropriate terminology is used in each place.

The type given by the type specifiers

This type is relevant for text talking about the declaration of a bit-field. The context for some other uses of “type” implies that this type is being referred to there as well; for example, the phrase “A bit-field of type _Bool, int, signed int, or unsigned int” quoted above.

However, types that are the same type when considered as type specifiers in a non-bit-field declaration may not be handled the same way for a bit-field. In particular, the rule that plain int, or a typedef for plain int, might be treated as unsigned int leads to significant ambiguity (especially in the presence of typeof), because wording elsewhere in the standard does not attempt to distinguish whether the type of an expression (say, typeof(1+1)) or a composite type, for example, is int or signed int. (The case of a typedef for plain int is only actually mentioned in a footnote; it’s not clear it follows from normative text. This implementation-definedness also extends to types such as long int, if allowed for bit-fields and not explicitly signed or unsigned, as specified in DR#315.)
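How a given implementation treats plain int bit-fields can be observed with a small probe, sketched below. The typedef plain_int, the structure names and the function names are hypothetical. Each function stores its argument in a 1-bit field and reports whether a negative value is read back; the value is passed as a parameter rather than written as a constant only to avoid constant-conversion warnings.

```c
typedef int plain_int;   /* a typedef for plain int, as mentioned in the footnote */

struct plain_field    { int b : 1; };           /* signedness implementation-defined */
struct typedef_field  { plain_int b : 1; };     /* likewise, per the footnote */
struct signed_field   { signed int b : 1; };    /* always signed */
struct unsigned_field { unsigned int b : 1; };  /* always unsigned */

/* With value == -1: a signed 1-bit field can hold -1, so the comparison
   is true; an unsigned 1-bit field stores 1 instead, so it is false. */
int plain_reads_negative(int value)
{
    struct plain_field f = { 0 };
    f.b = value;
    return f.b < 0;
}

int typedef_reads_negative(int value)
{
    struct typedef_field f = { 0 };
    f.b = value;
    return f.b < 0;
}

int signed_reads_negative(int value)
{
    struct signed_field f = { 0 };
    f.b = value;
    return f.b < 0;
}

int unsigned_reads_negative(int value)
{
    struct unsigned_field f = { 0 };
    f.b = value;
    return f.b < 0;
}
```

Calling each function with -1 distinguishes the cases: the signed int and unsigned int variants give fixed answers, while the answers for the plain int and typedef variants show the implementation's choice.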

Similarly, as noted in DR#013, it is not specified which of two possible composite types is chosen for the composite type of an enumeration and the compatible integer type. If that composite type is then taken with typeof and used to declare a bit-field, the ambiguity extends to the type given by the type specifiers for that bit-field.

The type of a bit-field lvalue for assignment

When a value that is not exactly representable is stored in a bit-field, then, as noted in DR#120, no normative text gives semantics for that assignment other than the general semantics for conversion of out-of-range values from an arithmetic type to an integer type. Thus, in this context, the type of the bit-field must be understood to be the declared type restricted by the width, so that those semantics for conversions apply.
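As a minimal sketch of this reading, consider an unsigned bit-field of width 3, where the conversion rules for unsigned types reduce the stored value modulo 8. The structure and function names are hypothetical.

```c
struct s { unsigned int u : 3; };

/* Stores value in the 3-bit field and returns what is read back.  Under
   the reading above, the conversion is to a 3-bit unsigned type, so the
   stored value is value modulo 8. */
unsigned int store_in_bit_field(unsigned int value)
{
    struct s v = { 0 };
    v.u = value;
    return v.u;
}

/* The value of an assignment expression is that of the left operand
   after the assignment, so it reflects the same conversion. */
unsigned int value_of_assignment(unsigned int value)
{
    struct s v = { 0 };
    return (v.u = value);
}
```

For instance, storing 13 yields 5 in both functions, whereas a conversion to plain unsigned int would have left 13 unchanged.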

The type of a bit-field for integer promotions

As discussed above, bit-fields narrower than int are in practice promoted to int by the integer promotions, even when the type specifiers give some other type such as unsigned long int. This might be achieved by a notion of “type” that involves bit-field width, or by special-casing bit-fields (including comma expressions whose second operand is a bit-field and assignment expressions assigning to a bit-field) in integer promotions.

A notion of “type” that involves bit-field width leaves unspecified whether long int : INT_WIDTH promotes to int, because it’s not clear how the rank of long int : INT_WIDTH compares to that of int.

The type of a bit-field in arithmetic

Consider a bit-field that is definitely not promoted by the integer promotions; for example, long long int : INT_WIDTH + 1, on an implementation where that is valid. Arithmetic on such bit-fields is a case of implementation divergence. If the width is considered part of the type, arithmetic may occur on numbers of exactly that width (implementations now need to support arithmetic on integers of unusual width anyway, as part of supporting _BitInt). If it is not, such a bit-field, not having been changed by the integer promotions, is considered to be of type long long int.
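The divergence can be observed with an unsigned variant of such a bit-field, for which wraparound is well defined. The sketch below uses a fixed width of 40 rather than INT_WIDTH + 1 and assumes that int is narrower than 40 bits and that unsigned long long int is accepted as a declared type; the names are hypothetical.

```c
#include <assert.h>   /* static_assert, as spelled in C11 */
#include <limits.h>

static_assert(sizeof(int) * CHAR_BIT < 40,
              "sketch assumes int is narrower than 40 bits");

struct wide { unsigned long long int w : 40; };

/* Adds 1 to the largest value the field can hold.  If arithmetic is done
   at the width of the bit-field, the sum wraps to 0; if it is done in
   unsigned long long int, the sum is 2 to the power 40. */
unsigned long long int wide_max_plus_one(void)
{
    struct wide v;
    v.w = (1ULL << 40) - 1;
    return v.w + 1;
}
```

Either result is consistent with one of the two interpretations described above, so the sketch states no single expected value.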

The type of a bit-field in sizeof or typeof

Consider one of the comma or assignment expressions above that is not “an expression that designates a bit-field member”, so is valid in sizeof or typeof, but that nevertheless refers to a bit-field as an rvalue (and, as discussed above, has associated integer promotion properties). Implementations differ about whether the type in this context is that given by the type specifiers or a type restricted by the width.

Note also that there is an interaction with the special-case rule that a bit-field declared as int might be considered unsigned. If it is considered unsigned, then in general the resulting type extracted by typeof must also be unsigned to be able to represent all values of the bit-field; even if the width is not considered part of the type by typeof, the result of applying typeof to an int : INT_WIDTH bit-field must be unsigned if that bit-field is unsigned.

// Example 3 - compile for plain int bit-fields being unsigned
#include <limits.h>
struct s { int u : INT_WIDTH; } v;
_Static_assert ((typeof(0, v.u))(-1) > 0);

The type of a bit-field in the _Generic controlling expression

This is similar to the typeof case, including the implementation divergence, but applies also to expressions that are “an expression that designates a bit-field member”, since there is no restriction on such expressions in the controlling expression of _Generic. Since, as discussed above, typeof on an implicitly unsigned bit-field should produce an unsigned type, one might expect _Generic matching to act accordingly.

// Example 4 - compile for plain int bit-fields being unsigned
#include <limits.h>
struct s { int u : INT_WIDTH; } v;
_Static_assert (_Generic(v.u, int : 1, default : 0) == 0);
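A related probe, sketched here for a bit-field narrower than its declared type, reports whether the controlling expression of _Generic is matched against the type given by the type specifiers. The names are hypothetical, and the first function is given no expected result because this is where implementations may differ.

```c
struct s { unsigned int u : 1; } v;

/* 1 if _Generic treats v.u as having the declared type unsigned int,
   0 if it is treated as some other type (for example, one restricted
   by the width). */
int controlling_type_is_declared_type(void)
{
    return _Generic(v.u, unsigned int : 1, default : 0);
}

/* For contrast: unary + applies the integer promotions, and a 1-bit
   unsigned bit-field promotes to int. */
int promoted_type_is_int(void)
{
    return _Generic(+v.u, int : 1, default : 0);
}
```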

The type of a bit-field in the initializer for an auto variable

If auto is accepted for C23, this context is similar to the typeof and _Generic cases. The same issue about plain int bit-fields being considered unsigned applies as for typeof.