From keld@dkuug.dk Sun Feb 24 16:20:35 1991
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA10355; Sun, 24 Feb 91 16:20:35 +0100
Date: Sun, 24 Feb 91 16:20:35 +0100
From: Keld J|rn Simonsen <keld@dkuug.dk>
Message-Id: <9102241520.AA10355@dkuug.dk>
To: i18n@dkuug.dk, iso10646@jhuvm.bitnet
Subject: AT&T Bell Labs wishes for shorthand character names
Cc: npn@sirius.att.com, wg14@dkuug.dk
X-Charset: ASCII
X-Char-Esc: 29

Hello character & i18n people!

I have received this request from Nils-Peter Nelson, AT&T Bell Laboratories,
via Thom Plum, who is also in ISO SC22/WG14 (Programming language C).
He asks for comments and information on this issue of shorthand
for characters.

Keld

Date: Thu, 21 Feb 91 15:45 EST
Original-From: sirius!npn (Nils-Peter Nelson +1 201 582 6078)
To: plum@plumhall.com
Subject: digraphs, trigraphs, etc.

Some kind person sent me a paper of yours from an unidentified
publication which discusses the various ISO character sets.
I've been working with Brian Kernighan on an ISO 8859-1 version
of troff.  Brian has already modified the code to accept the
8 bit input, and my group is currently working on the ditroff-
to-PostScript conversion for the additional characters. (This
is not trivial, since Adobe has neglected to provide font
positions for many of the new characters, although the
outlines and names are there.)
As a favor to ASCII people we want to preserve the troff convention
of providing ASCII digraphs for the new characters; however, we
now see that the troff conventions differ from the VT200 terminal
digraphs commonly used world-wide. As an example,
the British Pound sign is \(ps in troff, but is typed as
<COMPOSE> L - on most ISO 8859-1 terminals. My way of handling
this is to change troff to use the VT200 digraphs in the future.

I've spoken to Dennis Ritchie several times because he faces
similar problems with C-- even if variable names are ASCII he
wants to be able to handle ISO 8859 strings.

What would avoid all the sturm-and-drang is if the ISO committees,
as they agree on things like 8859-1,2,3,... and 10646, would
provide *as part of the standard* the "shorthand" notation in
the next "lower" character set. As an example, in 8859-1,
character 10/03 might be defined by:
10/03	L-	POUND SIGN
(The digraph in this case must be two ISO 646 characters.)
The standard would also recommend that, where 8 bit keyboards
were not available, the sequence <COMPOSE> L - would be
equivalent to 10/03. <COMPOSE> could be undefined-- on keyboards
it's a key, in troff it would be \(, in C it might be something
else.
ISO 10646 would require a tetragraph, but again, it should be one
recommended by the standards committee.
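
To make the proposal above concrete, here is a minimal sketch in Python of
how a tool might consume such a table. It is only an illustration under
stated assumptions: the names DIGRAPHS, position and expand are
hypothetical, and only the "L-" entry comes from the example above; the
other table entries are invented to fill out the table and are not taken
from any standard. The introducer is passed in as a parameter because
<COMPOSE> is left undefined (a key on a keyboard, \( in troff).

```python
# Hypothetical digraph table: two ISO 646 characters -> ISO 8859-1 code.
# Only "L-" is taken from the message; the other entries are assumptions.
DIGRAPHS = {
    "L-": 0xA3,  # 10/03 POUND SIGN (from the message)
    "Y-": 0xA5,  # 10/05 YEN SIGN (assumed digraph)
    "ss": 0xDF,  # 13/15 LATIN SMALL LETTER SHARP S (assumed digraph)
}


def position(code):
    """Format a code in column/row notation, e.g. 0xA3 -> '10/03'."""
    return "%02d/%02d" % (code >> 4, code & 0x0F)


def expand(text, introducer):
    """Replace each introducer + known digraph with its 8859-1 character.

    Sequences with an unknown digraph are copied through unchanged.
    """
    out = []
    i = 0
    n = len(introducer)
    while i < len(text):
        digraph = text[i + n:i + n + 2]
        if text.startswith(introducer, i) and digraph in DIGRAPHS:
            out.append(chr(DIGRAPHS[digraph]))
            i += n + 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)
```

With the troff-style introducer, expand("\\(L-5", "\\(") yields the pound
sign followed by "5", and encoding that result as ISO 8859-1 gives the
single byte at position 10/03 followed by the digit. Keeping the table
separate from the introducer is the point of the sketch: the same table
could then serve troff, C and a keyboard driver.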

Since you announced yourself to be the "contact person for
the general public" I'm asking you to bring this to the
attention of the various committees. If the standardization
is not offered by ISO, we run the risk of different conventions
in troff, TeX, C, MS-DOS, etc.
	Nils-Peter Nelson
