From keld@dkuug.dk Thu Jan 16 05:51:11 1992
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA13482; Thu, 16 Jan 92 05:51:11 +0100
Date: Thu, 16 Jan 92 05:51:11 +0100
From: Keld J|rn Simonsen <keld@dkuug.dk>
Message-Id: <9201160451.AA13482@dkuug.dk>
To: Teruhiko.Kurosaka@eng.sun.com, taylor%limbo.intuitive.com%xopusw@Sun.COM
Subject: Re:  (SC22WG14.174) Re: (XoJIG 432) (i18n.146) Re: Support for symbolic character names
Cc: i18n@dkuug.dk, wg14@dkuug.dk
X-Charset: ASCII
X-Char-Esc: 29

> Dave Taylor writes:
>    |the collating / transliteration tables nice and obvious too.  Indeed,
>    |one wonders why we don't just have everything defined that way anyway,
>    |so that regular C could contain tests like:
>    |
>    |	if (ch == COLON || ch == EXCLAMATION_MARK || ch == ASTERISK)
>    |
>    |rather than the much less portable, and more cryptic tests like:
>    |
>    |	if (ch == ':' || ch == '!' || ch == '*')
>    |
> Interesting.  To me, the former is more cryptic than the latter.  The
> latter is much more straight.  And I don't understand why the former
> is more portable than the latter.  Only advantage I can see in the
> former way is that you can edit a program on the machine that lacks
> the character you'd like to display on the target system.

Well, that is also an important issue. That allows people to
write and maintain C programs on such machinery. And such
machinery may be very widespread in some communities.

The main problem with the above scheme is that you cannot use it
in strings; otherwise it is conceptually very much the same as the one
I have proposed. There is not much difference between writing

      if (ch == COLON) ....

and
      
      if (ch == L'\<COLON>') ....
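
To make the string limitation concrete, here is a minimal sketch in
ordinary C. The macro names follow Dave Taylor's example; the function
names is_special and literal_with_name are hypothetical and only serve
the illustration.

```c
/* Hypothetical symbolic names, in the style of the #define scheme. */
#define COLON            ':'
#define EXCLAMATION_MARK '!'
#define ASTERISK         '*'

/* In an expression the macros expand to ordinary character constants. */
int is_special(int ch)
{
    return ch == COLON || ch == EXCLAMATION_MARK || ch == ASTERISK;
}

/* Inside a string literal no macro expansion takes place: this string
   holds the five letters C, O, L, O, N and no ':' at all. */
const char *literal_with_name(void)
{
    return "COLON";
}

/* String literal concatenation is no rescue either, because COLON
   expands to a character constant, not a string:
       "a" COLON "b"     becomes     "a" ':' "b"     (a syntax error) */
```

A name that is resolved inside the literal itself, as in the
\<COLON> notation, does not have this restriction.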

The other thing is that, according to my proposal, you build on the
information you already have in the charmap, which is the standard
place to define your character set. So why invent another mechanism
for something which already exists?

One problem with the #define solution is also that it restricts
itself to identifiers, which means that you will name characters in
some natural language, e.g. English or Danish. This makes your
program culturally dependent and thus less portable.
Would you like to maintain a program with German or Italian
character names? The POSIX charmap scheme can be made less
culturally dependent, as \<c,> is equally evident to a German, Danish,
English, or even Japanese programmer.
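
As a rough sketch of the idea, a translator could resolve a symbolic
name by looking it up in a table derived from the charmap. Everything
below is an assumption for illustration: the struct, the function
charmap_lookup, and the two entries, whose code values are taken from
ISO 8859-1 with <c,> read as c with cedilla. A real charmap would be
read from the locale's character set description, not compiled in.

```c
#include <stddef.h>
#include <string.h>

/* One charmap line: symbolic name (without the <>) and its code value. */
struct charmap_entry {
    const char *name;
    int code;
};

/* Illustrative entries only; values assume ISO 8859-1. */
static const struct charmap_entry charmap[] = {
    { "COLON", 0x3A },
    { "c,",    0xE7 },  /* c with cedilla */
};

/* Return the code value for a symbolic name, or -1 if it is unknown. */
int charmap_lookup(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof charmap / sizeof charmap[0]; i++)
        if (strcmp(charmap[i].name, name) == 0)
            return charmap[i].code;
    return -1;
}
```

With a different charmap the same source text \<c,> would map to
whatever value the target character set assigns to that character.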

> BTW, is anybody trying to name all the 10,000+ Chinese characters?

Yes, this has been done in SC2/WG2. They have just numbered them
and then they use the number as the name.

Keld