WG14 Document: N1058

Date: 2004-03-03

Rationale for having separate decimal floating-point data types.

After the Kona meeting there was a discussion about the advantages and disadvantages of having a separate set of data types for decimal floating point. The points below present the case for the model used by N1016.

1/ The fact that there are two sets of floating point types does not in itself mean the language would become more complex. The complexity question should be answered from the perspective of the user's program - that is, do the new data types add complexity to the user's code? The answer is probably no, except for the issues surrounding implicit conversions. For a program that uses only binary floating point types, or only decimal fp types, the programmer is still working with three fp types. We are not making the program more difficult to write, understand, or maintain.

2/ Implicit conversions can be handled by simply disallowing them (except perhaps for cases that involve literals). If we do this, then for compilation units that use both binary and decimal fp types, the code is still clean and easy to understand.

3/ If we only have one set of data types, and we provide std pragmas to allow programs to use both representations, then in a large source file where std pragmas flip the meaning of the types back and forth, the code becomes a field of land mines for the maintenance programmer, who might not be immediately aware of the context of a given piece of code.
Since a pragma takes effect over a lexical region within the program, additional debugger information is needed to keep track of the changing meaning of data types.

4/ Giving two meanings to one data type hurts type safety. A program may bind by mistake to the wrong library, causing runtime errors that are difficult to trace. It is always preferable to detect errors at compile time. Overloading the meaning of a data type makes the language more complicated, not simpler.

5/ A related advantage of using separate types is that it facilitates the use of source checking/scanning utilities (or scripts). They can easily detect which fp types are used in a piece of code with just local processing. If a std pragma can change the representation of a type, using grep, for example, as an aid to understanding and searching program text would become very difficult.

6/ Suppose the standard only defines a library for basic arithmetic operations. A C programmer would have to code an expression by breaking it down into individual function calls. This coding style is error prone, and the resulting code is difficult to understand and maintain. A C++ programmer would almost certainly provide his/her own overloaded operators. Rather than having everyone come up with their own, we should define them in the standard. If C++ defines these types as classes, C should provide a set of types matching the behavior.
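
To illustrate the coding-style concern in point 6, here is a minimal sketch that is not taken from N1016. The type dec_t and the functions dec_mul, dec_add and order_total are hypothetical stand-ins for a library-only interface, implemented as a toy coefficient/exponent pair with no rounding, overflow checking, NaNs or infinities. Even a simple computation such as price * qty + tax has to be written as nested calls:

```c
#include <stdint.h>

/* Hypothetical toy decimal: value = coeff * 10^exp.
   Illustration only: no rounding, no overflow checks, no NaN/infinity. */
typedef struct {
    int64_t coeff;
    int     exp;
} dec_t;

/* Multiply: coefficients multiply, exponents add. */
dec_t dec_mul(dec_t a, dec_t b)
{
    dec_t r = { a.coeff * b.coeff, a.exp + b.exp };
    return r;
}

/* Add: rescale the operand with the larger exponent to the smaller one,
   then add the coefficients. */
dec_t dec_add(dec_t a, dec_t b)
{
    while (a.exp > b.exp) { a.coeff *= 10; a.exp--; }
    while (b.exp > a.exp) { b.coeff *= 10; b.exp--; }
    dec_t r = { a.coeff + b.coeff, a.exp };
    return r;
}

/* total = price * qty + tax, written as function calls.
   With distinct decimal types and built-in operators the same
   computation would read:  total = price * qty + tax;  */
dec_t order_total(dec_t price, dec_t qty, dec_t tax)
{
    return dec_add(dec_mul(price, qty), tax);
}
```

With price = 19.99 (coefficient 1999, exponent -2), qty = 3 and tax = 4.5 (coefficient 45, exponent -1), order_total yields coefficient 6447 with exponent -2, that is 64.47. The nesting grows quickly with longer expressions, which is the readability problem the paragraph describes.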

Relatively speaking, this is not a technical issue for the implementation, as it might initially seem on the surface (it might seem easier to just attach a new meaning to existing types using a compiler option), but an issue of usability for the programmer. The meaning of a piece of code can become obscure if we reuse the float/double/long double types. Also, we have a chance here to bind the C behavior directly to IEEE, reducing the number of variations among implementations. This would help programmers write portable code, with one source tree building on multiple platforms. Using a new set of data types is the cleanest way to achieve this.

Comments received on N1016

The comments received on N1016 are captured below. They may serve as the starting point for technical discussion.

The comments are grouped under the section numbers of N1016. To facilitate referencing, they are tagged with "KONA-nn", even though not all of them were collected at the Kona meeting. The first part lists the "outstanding comments"; the second part lists those that have been applied to the current draft. Since the committee has not yet gone through any of them in a discussion, we do not mean that the comments in the second part are already addressed. We will go through them all at the Sydney meeting, but the first part is probably where we will spend most of the time.

5.1 Conversions between decimal floating and integer

KONA-01 "F.4 fully defines floating to integer conversions: the value converts or raises FE_INVALID. Raising FE_INVALID does not interrupt the program, nor is it a performance hit."

KONA-02 "It would be better if [1] were changed to a Recommended Practice.  Also, it would be more consistent if conversions to unsigned did the modulo wrap.  As if, it were first converted to a 128-bit signed integer type and then converted to the unsigned type."

KONA-03 "[2]  Assuming +infinity and -infinity are representable in the floating type, then all values are in the range of representable values.  So, change 'quiet NaN' to 'infinity with the appropriate sign'."
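
The "as if" rule suggested in KONA-02 can be sketched with types that exist today. This is only an illustration under stated assumptions: int64_t stands in for the 128-bit signed integer type mentioned in the comment, double stands in for a decimal floating type, and the helper name to_u32_wrapped is hypothetical. The truncated input must be representable in int64_t, since otherwise the first conversion is itself not defined by C99.

```c
#include <stdint.h>

/* Two-step conversion: floating -> wide signed integer -> unsigned.
   Assumption: int64_t plays the role of the 128-bit signed type. */
uint32_t to_u32_wrapped(double x)
{
    int64_t wide = (int64_t)x;   /* truncates toward zero; must be in range */
    return (uint32_t)wide;       /* reduced modulo 2^32, always well defined */
}
```

Under this rule -1.0 converts to 4294967295 and 4294967301.0 converts to 5. A direct conversion of either value to uint32_t is not defined by C99 6.3.1.4, because the integral part cannot be represented in the destination type.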

5.2 Conversions among decimal floating types, and between decimal floating types and generic floating types

KONA-04 "[4] Need to add words about greater precision and/or range."

KONA-05 "I would rephrase the rules as finding the first type that has an adequate range and precision to meet the model numbers in float.h (i.e. ignoring any 'exceptional' numbers such as subnormals).  And would specify a constraint error that a type including NaNs or infinities couldn't be converted to one without them without an explicit cast."

7 Floating-point environment <fenv.h>

KONA-06 "7.6 What is the difference between floor and down?  What is the difference between ceiling and up?"

9.1 Decimal mathematics <math.h>

KONA-07 "7.12 [5], why do you only provide a _Decimal32 macro for QNaN? Seems like _Decimal64 and _Decimal128 versions would also be useful?"

Annex A.

KONA-08 "Your support for Signaling NaNs differs from WG14 paper N1011."

KONA-09 "Why not remove note1 and make the normative text do the correct thing for IEEE:  apply sign, then round."

Annex C.

KONA-10 "Need to add casts to list of allowed places.  Also, need to add function return."

The following comments have been incorporated into the current draft:

2.2 References

KONA-11 "Need to add TC1 of C99. It had a major impact on <fenv.h>."

4 Characteristics of decimal floating types <decfloat.h>

KONA-12 "[2] DEC_EVAL_METHOD: Add: Except for assignment and casts (both remove any extra range and precision)"

KONA-13 "[3] Add: suitable for use in #if preprocessing directives."

KONA-14 "The *_EPSILON values seem wrong.  Should be something like 1e-6DF, 1e-15DF, and 1e-33DF."

KONA-15 "You should also add *_DEN macro symbols for the smallest denormalized number. This is something we forgot to do to <float.h>."
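
The values suggested in KONA-14 are consistent with the usual <float.h> definition of epsilon as the difference between 1 and the next larger representable value, which for a decimal format with p digits of precision is 10^(1-p). The sketch below assumes precisions of 7, 16 and 34 digits for the three decimal formats (taken from the IEEE 754R decimal formats, not stated in this document); the helper name dec_epsilon_exponent is hypothetical.

```c
/* Returns e such that epsilon = 1E(e) for a decimal format with
   `digits` digits of precision, i.e. epsilon = 10^(1 - digits). */
int dec_epsilon_exponent(int digits)
{
    return 1 - digits;
}
```

For 7, 16 and 34 digits this gives exponents -6, -15 and -33, matching the 1e-6, 1e-15 and 1e-33 magnitudes quoted in the comment.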

5.2 Conversions among decimal floating types, and between decimal floating types and generic floating types

KONA-16 "[4] Need to add to the part on HUGE_VAL*, something about appropriate sign. What about the rounding mode being different than round to nearest?  In that case, the result is sometimes the largest finite number instead of infinity."

5.4 Usual arithmetic conversions

KONA-17 "Need to add to the part on HUGE_VAL*, something about appropriate sign. What about the rounding mode being different than round to nearest?  In that case, the result is sometimes the largest finite number instead of infinity."

5.5 Default argument promotion

KONA-18 "Prototypes are now(?) required (implicit function declaration was removed in C99).  You might mean varargs. Imaginary float is NOT promoted to imaginary double, so Dec32 should not promote to Dec64.  So, remove all of the 5.5 stuff."
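
For reference, the existing binary rule that KONA-18 alludes to with "varargs" can be sketched as follows: a float passed through the variable part of an argument list is promoted to double, so the callee must read it with va_arg(ap, double). The function name sum_args is hypothetical, and the sketch says nothing about how the decimal types should behave.

```c
#include <stdarg.h>

/* Sums `count` floating arguments passed through "...". */
double sum_args(int count, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, count);
    for (int i = 0; i < count; i++) {
        /* float arguments arrive as double after default promotion */
        total += va_arg(ap, double);
    }
    va_end(ap);
    return total;
}
```

A call such as sum_args(2, 1.5f, 2.5f) returns 4.0 even though both arguments were written as float, because each is promoted before the call.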

9.3 Formatted input/output specifiers

KONA-19 "scanf needs to be able to read into _Dec32."