From DAJ@prime-a.tees-poly.ac.uk Fri Jan 24 10:37:59 1992
Received: from eros.uknet.ac.uk by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA21144; Fri, 24 Jan 92 10:37:59 +0100
Message-Id: <9201240937.AA21144@dkuug.dk>
Received: from tees-poly.ac.uk by eros.uknet.ac.uk via JANET with NIFTP (PP) 
          id <2591-0@eros.uknet.ac.uk>; Fri, 24 Jan 1992 09:38:39 +0000
Date: Fri, 24 Jan 92 09:40:00 BST
From: DAJ@prime-a.tees-poly.ac.uk
To: i18n@dkuug.dk
Subject: IN DEFENSE OF SYMBOLIC NAMES FOR CHARACTERS
X-Charset: ASCII
X-Char-Esc: 29

To:  ISO/IEC JTC1/SC22/WG20      From:  David Joslin      January 23, 1992
     (copy to i18n)

There is a lot to be said for using symbolic names for characters in
programs, from the point of view of good programming practice.

The position is clear in the case of numeric (rather than character) data.
Consider the (Extended Pascal) program fragment:

(a)  var P: array [1..7] of integer;

     const BEL = chr(7);

     var Q: array [2..8] of real;

     ............................

     for i:= 1 to 7 do Q[i+1]:= P[i];

Suppose that the number of elements in array P were to be changed from 7 to
some other value, e.g. 10.  Every "7" in the program must be changed to "10"
- or must it?  We do not want to change the "7" in the definition of BEL!
And we DO want to change the "8" in the declaration of array Q to "11".  So
the substitution cannot be done mechanically, and there is a great risk of
error in the process.  How much better to write:

(b)  const N = 7;

     var P: array [1..N] of integer;

     const BEL = chr(7);                  {we really do want 7 here!}

     var Q: array [2..N+1] of real;

     ............................

     for i:= 1 to N do Q[i+1]:= P[i];

where the same change requires replacing "7" by "10" in one place only, the
definition of N.

That is elementary, of course:  I'm sure I don't need to teach WG20 experts
how to program, I just wanted to set the scene for considering characters.
Now suppose that we have a program which recognises $ as a "herald"
character that requires some special action.  If I write:

(c)  if ch = '$' then ....

I get the required effect;  but if I later want to change the herald
character to, say, @, and do so by changing every "$" in the program to "@",
I may get some dollar amounts printed out as, say, "@9.99"!  Again I should
have used a symbolic name for my herald character, e.g.:

(d)  const herald = '$';

     ...................

     if ch = herald then ....

This also makes the program clearer to read.

(Of course I could also have said "const herald = chr(36)" {ISO 646}, or
"const herald = chr(??????)" {ISO 10646}, or whatever.)
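
As a minimal sketch of the point made by (c) and (d), the complete program
below keeps the herald test and a printed dollar amount apart.  The names
HeraldDemo and IsHerald are hypothetical, chosen only for illustration;
changing the single definition of herald alters what IsHerald recognises
but leaves the literal "$" in the printed amount untouched.

```pascal
program HeraldDemo(output);

const
  herald = '$';     { the only line to edit when the herald changes }

{ Hypothetical helper: true when ch is the current herald character. }
function IsHerald(ch: char): boolean;
begin
  IsHerald := ch = herald
end;

begin
  { The test refers to the symbolic name, not to a literal character. }
  if IsHerald('$') then
    writeln('herald seen');

  { This "$" really is a dollar sign, so it stays a literal. }
  writeln('Total: $9.99')
end.
```

With herald redefined as '@', the test on '$' fails and the first line is
no longer printed, while the total is still printed with its dollar sign.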

The only time I might not want to use symbolic names is when I *really
know* that I will be dealing with a specific character, e.g.:

(e)  if ch = '$' then .... {process dollar amount following}

Even then, I might be able to write something like

(f)  const e_acute = chr(233);  {ISO 8859-1}

     .........................

     if ch = e_acute then ....

and I would be able to compile my program even on a Pascal system/machine
that only had ISO 646, but the target program would run on a machine with
ISO 8859-1.  (Of course if I try to run it on a machine with only ISO 646
I am in trouble, but then I am attempting the impossible.)

Does any of this help our debate?

daj
