From taylor@limbo Tue Jan 21 09:32:07 1992
Received: from uucp-gw-1.pa.dec.com by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA19925; Tue, 21 Jan 92 09:32:07 +0100
Received: by uucp-gw-1.pa.dec.com; id AA22574; Tue, 21 Jan 92 00:09:50 -0800
Received: by limbo; Tue, 21 Jan 92 00:07:00 pst
Message-Id: <9201210807.AA18740@limbo.intuitive.com>
Subject: Re: support for symbolic names
To: i18n@dkuug.dk, xojig@xopen.co.uk
Date: Tue, 21 Jan 92 0:06:57 PST
From: Dave Taylor <taylor@limbo.intuitive.com>
Reply-To: Dave Taylor <taylor@limbo.intuitive.com>
Organization: Intuitive Systems, Mountain View, California  +1 (415) 966-1151
X-Mailer: Elm [version 2.02]
X-Charset: ASCII
X-Char-Esc: 29

Teruhiko Kurosaka comments:

> Walt Daniels writes:
>> So 90% of the "symbolic" names are just codepoint numbers. Why not just
>> use the numbers for the rest of them and eliminate the European bias!
> I agree with Walt.

And so we end up at a point where we're going to see code like:

	if (ch == '<540404>') {
	   ...

which I posit means we've completely failed at the most basic
goal of internationalization: to offer a system that people
can *use* to help create global programs.

If we are indeed destined to be trapped in the world of
completely non-mnemonic systems where everything is identified
by some obscure numeric sequencing, then we might as well give
up now; few, if any, programmers in their right mind are going to
be willing to internationalize their code or work with
internationalized code.

It's clearly difficult to come up with meaningful mnemonics that
are consistent and useful across languages, but perhaps a subset
solution is better than nothing, or better than insisting on a
solution that must work across all alphabets?  Why not a combination
of solutions where a certain core of Latin-based languages have
one type of notation available for use, then glyph-based languages
have another, and so on?

It would be complex, but when working with specific domains, like
the EC / Americas, it would be vastly better than a pure numeric
scheme, wouldn't it?

Further, I suggest that the natural evolution of things would
result in programmers forced to internationalize using:

	#define LOWER_O_SLASH '<540404>'

at the top of each file so they can read and work with the code.
And so they will end up solving a problem we seemingly cannot, and
the end result will be less portable real-life code.

Remember, the approach to mnemonics I'm proposing isn't EXCLUSIVE
of other approaches.  In fact, you don't even have to use it.  It
doesn't need to cover all characters in all alphabets.  It might
well be constrained to the common ISO 8859-1 based languages, in fact.
But it WOULD be valuable in that domain, and if I'm programming a
complex application, being able to check for "ENYE" would be terrific,
and vastly clearer than some mysterious numeric coding value.
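
To make that concrete, here is a minimal sketch of what such a
mnemonic layer might look like if it were constrained to ISO 8859-1.
The numeric values are the 8859-1 code points for those characters.
Everything else is an assumption for illustration only: the names
UPPER_ENYE, LOWER_ENYE and UPPER_O_SLASH, and the helpers is_enye()
and count_enye(), are hypothetical and not part of any proposal on
the table.

```c
/* Hypothetical mnemonic names for a few ISO 8859-1 characters.
 * The values are the 8859-1 code points; the names are illustrative only. */
#define UPPER_ENYE     0xD1   /* capital N with tilde */
#define LOWER_ENYE     0xF1   /* small n with tilde   */
#define UPPER_O_SLASH  0xD8   /* capital O with slash */
#define LOWER_O_SLASH  0xF8   /* small o with slash   */

/* Return 1 if ch is either case of enye, 0 otherwise. */
static int is_enye(unsigned char ch)
{
    return ch == UPPER_ENYE || ch == LOWER_ENYE;
}

/* Count the enyes in a NUL-terminated, 8859-1 encoded string. */
static int count_enye(const char *s)
{
    int n = 0;

    for (; *s != '\0'; s++) {
        /* Cast first: plain char may be signed, and 0xF1 would go negative. */
        if (is_enye((unsigned char)*s))
            n++;
    }
    return n;
}
```

A test such as "if (is_enye(ch))" or "if (ch == LOWER_O_SLASH)" reads
the way the programmer thinks about the character, and the numeric
value lives in exactly one place.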

I think we need to step back a bit and think about the grand goals
of this overall effort: Are we trying to Solve The Problem in some
ideal fashion, or are we a *working group* trying to offer useful,
meaningful and valuable *working* and *workable* solutions to some
of the problems and challenges of internationalization?

						-- Dave Taylor

Intuitive Systems				        SunWorld Magazine
Mountain View, CA			        	San Francisco, CA

taylor@intuitive.com         taylor@netcom.com        taylor@sunworld.com