TECHNICAL REPORT

WG14 Nxxx

Interpretation of the Sequence Point Specification

Working draft

1. General
1.1 Scope
1.2 References

2. Expression Evaluation

3. The Sequence Point Concept
3.1 Consistency of the definition
3.2 Tools for doing the analysis

4. Interpretation of the Standard

5. Function Call

6. Floating-Point Status Flags

7. Conclusion

`Introduction`

The concept of sequence point has been the subject of discussions since C90 was published. The central idea in the specification was to impose a sequential relationship upon certain operations during the evaluation of an expression. The purpose was to guarantee well defined behaviors in cases where such behavior is needed. The sequential relationship gives an impression that these operations can be layout in a linear sequence. Even though this is true, but such sequence is not always unique. The Standard leaves the exact order of certain operations unspecified. This has caused difficulty in determining the status of complicated expressions.

A number of attempts has been made to address this problem. These include an effort during the drafting of C99, a formal model described in N925, and an analysis of the problem in N926 and N927. The effort is ongoing.

This technical report provides a conceptual framework to study the problem, and uses to framework to interpret the Standard. This analysis can be used to further understand the model in N925, and algorithms in N926 and N927.

`1 General`

`1.1 Scope`

This Technical Report provides an analysis of sequence point semantics as defined in Standard C. It develops some basic tools basing on the mathematical concept of partial ordering and then apply them to the interpretation of the specification. Materials in this report are not intended to become a normative part of the Standard, but part of it could be useful in the rationale. Unless there are major deficiency discovered by this analysis, any weakness and inconsistency in the specification are to be addressed by defect reports separately.

`1.2 References`

ISO/IEC 9899:1999, Information technology - Programming languages, their environments and system software interfaces - Programming Language C.
WG14 document N925 - A formal model of sequence points and related issues, working draft.
WG14 document N926 - Sequence points anaylsis.
WG14 document N927 - Another formalism for sequence points.
WG14 defect report DR087.

`2 Expression Evaluation`

There are three aspects to an expression : type, value, and side effects. We will focus on the latter two in this report. Both of these are the result of evaluating an expression. The question we are asking here is one of timing - that is, when do side effects occur and when does the value of evaluating an expression become available.

An expression is consisted of operands and optionally operators. In order to evaluate it, the expression has to be broken down into individual operations - one operation at a time. This process is defined by the Standard recursively following the syntactic construct of an expression. This specification implicitly imposes a partial ordering on the sequence of operations. We use the following example to illustrate.

a++ + b;

The above is a full expression with two operands and two operators. The sequence of operations can be broken down as follows:

1) evaluate a
2) evaluate a + 1
3) evaluate a + b (using the value of a in 1)
x) update a (using the value of a in 2)

The order (or relative timing) of steps 1-3 is completely specified. But the timing of step x is not, not completely. It can take place anytime after step 2 and before the end of the evaluation of the expression. This gives implementations the freedom to exploit instruction level parallelism. But this also means that the evaluation could run into trouble if 'a' is also updated by some other means. Consider a variation of the above:

a = a++ + b;

There is an additional step due to the assignment:

4) update a (using the value of a in 3)

Step 4 can happen anytime after step 3 and before the end of the whole expression. This period of time overlaps the corresponding one for step x. The final value of 'a' becomes undefined.

In the same token, the evaluation could also run into trouble if the value of 'a' is read within the same questionable time period. We will examine this further later.

Note that we are interested in the relative timing of events, not the absolute timing. We are interested in whether one operation has to happen before another one, or whether multiple operations could happen concurrently. This relationship can best be represented in a graph as follows:

              o
              |
              1
              |
              2
             / \
            /   3
           |     \
           x      4
            \    /
             \ /
              $

The '$' marks the end of the whole expression and the 'o' marks the beginning. The graph depicts a partial ordering. Nodes connected by an edge are comparable, with the lower one greater. The key to our analysis is to identify the sets of nodes (operations) that could potentially occur concurrently.

`3 The Sequence Point Concept`

Partial ordering among operations within an expression is a cause of undefined behavior. In order to limit the occurrence of such behaviors, the Standard introduces the concept of sequence point. In a nutshell, it imposes a restriction on the relative timing of side effects. In the example above, the updating of 'a' is a side effect. If we modify the graph to the following:

              1
              |
              2
             / \
            x   3
             \ /
              |
              |
              4
              |
              $

the evaluation would become well defined. Our question here is therefore twofold: 1) Whether the Standard's definition of sequence point is internally consistent. 2) Whether the definition produces evaluation results that match our expectation as programmers.

The Standard defines sequence point in two steps. The first is the specification in 5.1.2.3 paragraph 2:

"... At certain specified points in the execution sequence called sequence points, all side effects of previous evaluations shall be complete and no side effects of subsequent evaluations shall have taken place."

And also in 6.5 paragraph 2:

"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression. Furthermore, the prior value shall be read only to determine the value to be stored."

If we treat events during the evaluation of an expression as happening within specific time intervals, the above imposes a restriction on the end points of these intervals. That is, it defines the relationship between side effect and sequence point, but does not define the relationship between sequence point and the rest of the expression. This latter is described in the second step.

Within the specification of the semantics of expressions, the Standard inserts sequence points at certain locations. One such example is in the Logical-And operator (6.5.13). These specifications fix the positions of sequence points within the expression. The relative positions of side effects are therefore indirectly established.

`3.1 Consistency of the definition`

Consider the following expression:

int x;
(++x && x) + (++x && x);

The syntax parse tree can be drawn as follows:

             +
            / \
           /   \
          /     \
         /       \
        &&       &&
       / \       /\
      S   \     S \
     /    x    /    x
    ++        ++
     \         \
      x         x

An 'S' is used to indicate a sequence point. It is treated like an operator in the tree. In the abstract machine, when a sequence point node is "evaluated", all outstanding side effects must be complete. The questions is whether the sequence points in the &&-operators are sufficient to nail down the relevant side effects. Or, to put it in another way, what are the outstanding side effects with respect to a sequence point.

Listed below are the operations. They can be derived by going through all the nodes in the syntax tree:

1) evaluate (x+1)   (in left subexpression of +)
2) update x         (using value in 1)
3) sequence point
4) evaluate (x)     (in left subexpression of +)
5) evaluate (x+1)   (in right subexpression of +)
6) update x         (using value in 5)
7) sequence point
8) evaluate (x)     (in right subexpression of +)
9) evaluate +

The partial ordering of these steps can be determined as follows:

All operands of an operator must be evaluated before evaluating the operator.
The order of evaluation of operands are unspecified except under certain situations. (6.5p3, 6.5.2.2p10, and 6.5.16p4.) The exceptions are function call (), &&, ||, ?:, and comma operators. (6.5.2.2p10, 6.5.13p4, 6.5.14p4, 6.5.15p4, and 6.5.17p2.)

The partial ordering can be represented by a graph. The evaluation steps and events are represented by nodes in the graph. (For conciseness, we will use the terms evaluation step, operation, event and node interchangeably when the meaning is clear.)

                   o
                 /   \
                1     5
                |     |
                2     6
                |     |
                3     7
                |     |
                4     8
                 \   /
                   9
                   |
                   $

There is a sequence point before the evaluation starts, and one when the full expression ends. We add dummy nodes 'o' and '$" to start and end the graph. Let us examine the application of 5.1.2.3p2.

5.1.2.3p2 requires all side effects of evaluations prior to a sequence point be complete at the sequence point. To determine if an evaluation satisfies this criteria, we need to determine all predecessors of a particular sequence point. Since the set of predecessors of a node in a partial ordering is defined, 5.1.2.3p2 is decidable for any well formed expression. For example, the predecessors of step 3 are 1 and 2; and of step 7 are 5 and 6. Therefore these two updates must be complete by the time we reach 3 and 7 respectively. But we cannot tell the relative timing of the two updates themselves.

It can be argued that 5.1.2.3p2 should be applied to steps 2 and 7 as well. Since step 2 could potentially start before 7 while step 3 end after, step 2 is a "potential predecessor" of 7, but its side effect might not be complete until after 7. This is a violation of 5.1.2.3p2.

This last point is the source of debate and confusion about sequence point. When a sequence point is too far deep inside a subexpression, it is sometimes not easy to determine if it provides sufficient coverage to protect the side effects. This difficulty is rooted at the partial ordering.

Other than this weakness, and another one to be discussed in section 4, there seems to be no inconsistency in the sequence point definition. This is because the Standard does not explicitly provide a procedure for determining the status of an expression (i.e. whether it is well defined). An expression is well defined if it has no constraint or semantics violation. This specifies what are undefined expressions. By its negative nature, this cannot lead to an assertion where an expression is both well defined and not well defined. The above weakness is therefore not an internal inconsistency. Depending on how we interpret the "predecessor of a sequence point", we may end up with a larger or smaller set of well defined expressions. Whether this set matches the programmers' expectation is the question of external consistency.

`3.2 Tools for doing the Analysis`

The key to understanding sequence point is in understanding the partial ordering. Tools from mathematics are readily available for this purpose. Also, because expressions are naturally depicted by syntax parse trees, we can make use of them as well. The above examples are already using these devices.

One important step in the analysis is in determining the set of predecessors of a give node. The usual definition is the set of all nodes that are compared smaller than a given one. But this excludes nodes that cannot be compared. As noted in the above discussion, the concept of "potential predecessors" is important. This is independent of how we interpret 5.1.2.3p2; in fact, the potential predecessors will shred light on how we should read this paragraph.

The set of potential predecessors for a given node is the union of the set of predecessors and the set of nodes that have no ordering with the given node. This gives a slightly larger set than we need; we are not interested in updates that are already committed before the previous sequence point(s). So we form a subset from the potential predecessors as follows:

Identify all sequence points that are predecessors of the given node.
Remove all predecessors of each of these sequence points.
Remove all these sequence points.

This remaining subset is the one we need. We will call this the non committed predecessors of a given node.

For completeness, we also define successors of a given node (to be those that compare greater), and potential successors (to be the union of successors and the set of nodes that have no ordering with the given node). Successors and potential predecessors are compliments, as are predecessors and potential successors.

`4 Interpretation of the Standard`

Let us read the Standard again using the tools we have developed.

In 5.1.2.3p2, previous evaluations of a sequence point are the potential predecessors of a sequence point. And subsequent evaluations are the successors of a sequence point. 5.1.2.3p2 therefore requires that when the abstract machine reaches a sequence point, there is no outstanding side effect.

In 6.5p2, the evaluations falling "between the previous and next sequence point" are those in the set of non committed predecessors of the next sequence point. So if an object is updated more than once within this set, the expression is undefined. We can design an algorithm to test the status of an expression basing on this property (together with other properties discussed below).

The resolution of DR087 implied there are operations that do not overlap. Function calls are such operations. An object can be updated more than once without causing trouble if all such updates are done by non-overlapping operations. The resoulion of this DR did not reinforce the notion of atomicity which is a stronger property. Consider the following piece of code:

int i = 1;
void foo() { i++; }
...
foo() + i++; /* line A, inside a function body */

Line A would not be undefined (but unspecified) if the function call is atomic.

Let us now go to the second sentence of 6.5p2:

"... Furthermore, the prior value shall be read only to determine the value to be stored."

Not all intermediate values from subexpressions are used in side effects (i.e. stores). For example, a value can be used to compute the resulting value of the full expression. As noted in section 2, the evaluation could run into trouble if a read overlaps a write to the same object. The interpretation we have so far only talks about stores; a piece is therefore still missing. This sentence is trying to fill that gap. Consider the following expression:

++i + i;

Repeating the procedures above, we can come up with the following sequence of operations and the partial ordering.

1) evaluate i          (first i)
2) evaluate (i+1)
3 update   i          (using the value in 2)
4) evaluate i          (second i)
5) evaluate (2) + (4)

                    o
                   / \
                  1   \
                  |   |
                  2   |
                 / \ |
                3   \ 4
                 \   \|
                  \   5
                   \ |
                    \ |
                     $

Since 3 is a non committed predecessor of 4, the second read of i and the store could potentially overlap. The expression should be undefined according to our intuition. But it doesn't violate 5.1.2.3p2 nor the first sentence of 6.5p2. This is because so far we only restrict writes but not reads.

A read can potentially overlap a non committed write if it follows the write or if it has no ordering relationship with the write. It is safe to have a read lying strictly before the write; that is, the read is a predecessor (not a potential predecessor). So it seems that the wordings of the second sentence of 6.5p2 should be changed.

If we examine the above expression more carefully, we will find that even though the value of the full expression is undetermined, it does not really affect the computation of the overall program. This is because the value is not used. The computation can still be defined provided that the overlap access doesn't cause any hardware execeptions. On the other hand, the following is a problem:

a = ++i + i;

as the value is used to modify 'a'. In the second sentence of 6.5p2, if the prior value is read but not used in any store, the implementation can throw away the read (or as-if it does so) to produce an evaluation that is well behaved.

This still leave us with the problem that the read and store may apply to different objects. An attempt to revise the second sentence of 6.5p2 is as follows:

6.5p2
"... Furthermore, the prior value shall be read before the object is modified."

`5. Function Call`

The resolution of DR087 requires function calls not to overlap. Using the tools developed above, the analysis of a function call is not difficult; we just need to note that there are multiple steps involved:

The evaluation of the function designator and the arguments; order of evaluation is unspecified;
A sequence point;
Execution of the function body;
A sequence point.

DR087 did not mention whether the non-overlap requirement applies to the whole calling operation or just the execution of the function body. Also this requirement might cause problem to inlining. It is not clear how an instruction scheduler could effectively handle this. It appears that the intend of the DR was to protect the function body only.

Regardless of which interpretation we choose, we cannot derive it from existing text in the Standard. One possible way to fix this is:

6.5p2:
"Between the previous and next sequence point an object shall have its stored value modified at most once by the evaluation of an expression unless the modification is done by non-overlapping operations."

6.5.2.2p10, add a sentence at the end:
"The actual call is a non-overlapping operation."

`6. Floating-Point Status Flags`

The floating-point status flags are part of the floating point environment. They are set as side effects in execptional floating-point arithementic to provide auxiliary information. They are never cleared in these operations. They are sticky in the sense that multiple set operations can occur concurrently and the result would still be defined. We can further modify 6.5p2 to reflect this as follows:

`7. Conclusion`

An analysis was given on the sequence point specification. An interpretation of the Standard was also provided. The analysis points out certain weakness in the specification, but taken as a whole the Standard is consistent. The weaknesses can be addressed by defect report after further discussions.

The analysis is not difficult to understand. The central idea is that sequence points are not related to each other in a linear manner. The underlying order is a partial order. The phrase "between the previous and next sequence point" has to be interpreted with this background in mind. After this is handled, the rest of the analysis falls into place fairly easily.

There has been effort to provide algorithms and formal model to determine the status of an expression. These are useful; and the analysis presented here can used to help. But it is not a good idea to incorporate such algorithms as a normative part of the Standard. If the Standard is already consistent, any addition can either add nothing new or make the specification inconsistent. These models and algorithms are best presented in the Rationale.

TECHNICAL REPORT

WG14 Nxxx

Interpretation of the Sequence Point Specification

Working draft

Contents

Introduction

1 General

1.1 Scope

1.2 References

2 Expression Evaluation

3 The Sequence Point Concept

3.1 Consistency of the definition

3.2 Tools for doing the Analysis

4 Interpretation of the Standard