From keld@dkuug.dk Wed Feb 27 19:17:29 1991
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA24196; Wed, 27 Feb 91 19:17:29 +0100
Date: Wed, 27 Feb 91 19:17:29 +0100
From: Keld J|rn Simonsen <keld@dkuug.dk>
Message-Id: <9102271817.AA24196@dkuug.dk>
To: i18n@dkuug.dk, wg14@dkuug.dk
Subject: Johan van Wingen on shorthands
Cc: npn@sirius.att.com
X-Charset: ASCII
X-Char-Esc: 29

Forwarded to wg14 and i18n and npn. - Keld

Dear Colleagues
A few comments on Keld's reply.

> > VT200 terminal digraphs commonly used world-wide. As an example,
> > the British Pound sign is \(ps in troff, but is typed as
> > <COMPOSE> L - on most ISO 8859-1 terminals. My way of handling
> > this is to change troff to use the VT200 digraphs in the future.

????????? What is an ISO 8859-1 terminal? A DEC VT 340?

> > What would avoid all the sturm-and-drang is if the ISO committees,
> > as they agree on things like 8859-1,2,3,... and 10646 would
> > provide *as part of the standard* the "shorthand" notation in
> > the next "lower" character set. As an example, in 8859-1,
> > character 10/03 might be defined by:
> > 10/03	L-	POUND SIGN
> > (The digraph in this case must be two ISO 646 characters.)
> > The standard would also recommend that, where 8 bit keyboards
> > were not available, the sequence <COMPOSE> L - would be
> > equivalent to 10/03. <COMPOSE> could be undefined-- on keyboards
> > it's a key, in troff it would be \(, in C it might be something
> > else.
> > ISO 10646 would require a tetragraph, but again, it should be one
> > recommended by the standards committee.
>
> Yes, this has been proposed within ISO.

?????????????????
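
The digraph proposal quoted above can be illustrated with a minimal
sketch. Only the entry "L-" for 10/03 POUND SIGN and the troff prefix \(
come from the message; the table layout and the names DIGRAPHS,
position_to_byte and expand are hypothetical, chosen for illustration,
and unknown digraphs are simply left untouched.

```python
# Hypothetical digraph table: two ISO 646 characters -> ISO 8859-1 position
# in column/row notation. Only "L-" (10/03) is taken from the quoted proposal.
DIGRAPHS = {
    "L-": (10, 3),  # POUND SIGN
}


def position_to_byte(column: int, row: int) -> int:
    """Convert column/row notation (e.g. 10/03) to a byte value (0xA3)."""
    return column * 16 + row


def expand(text: str, compose: str, table=DIGRAPHS) -> bytes:
    """Replace each <compose><digraph> in text with its ISO 8859-1 byte.

    `compose` stands for whatever the host notation uses as <COMPOSE>,
    for example the two characters \\( in troff. Input is assumed to be
    plain ISO 646 text.
    """
    out = bytearray()
    i = 0
    while i < len(text):
        if text.startswith(compose, i):
            start = i + len(compose)
            digraph = text[start:start + 2]
            if digraph in table:
                out.append(position_to_byte(*table[digraph]))
                i = start + 2
                continue
        # Not a known shorthand: copy the character through unchanged.
        out += text[i].encode("latin-1")
        i += 1
    return out


if __name__ == "__main__":
    # troff-style input, with \( playing the role of <COMPOSE>
    print(expand("cost \\(L-5", "\\("))
```

The same table could serve a keyboard driver or a C preprocessor by
passing a different value for compose, which is the point of leaving
<COMPOSE> undefined in the proposal.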

> Actually it was SC22 who proposed this to SC2 - to provide unique
> naming and short identifiers to all characters provided by SC2.
> The SC22 requirements were stated in the paper SC22 N622R.
> SC2 responded by assigning unique (long descriptive) names for
> characters in the new universal character code ISO (DIS) 10646.
> But they did not want to provide short identifiers.

This discussion has not been finished in SC2.

> As SC22 needed this for various purposes, like the C and POSIX
> standards, a NWI has been proposed and accepted by SC22.
> This NWI covers internationalization (i18n) and includes character set
> work on identifying and conversion within character sets.
> The text is not fully clear on the shorthand requirement,
> but I think it is sufficiently clear that this work is included.

This is putting things upside down. It has been agreed that the topic
of short identifiers MIGHT be handled by WG20, taking it over from the
SC22 Ad Hoc group on Character Issues.

> The NWI is assigned to the new SC22 WG20 on internationalization.
> They have not met yet. The convenor is Dick Weaver of IBM, he
> is on (at least) the i18n@dkuug.dk list and thus gets these messages.

The first meeting is planned for June 1991 in the Netherlands.

> 2. ISO 6937-2 & ISO 10367 have a short naming of the Latin
>    characters and also some special characters. These shorthands

Not ISO 10367, and if it is up to Mr. Hekimi they will go from 6937 too.
(I do not agree with him.)

> 5. Danish Standards (the Danish ISO member body) has produced an
>    elaborate "Example Danish National Locale" for POSIX, included
>    in the POSIX.2 draft 10 (published a bit later than the rest of
>    draft 10) and also in the next draft. I have been very active
>    in producing this specification. There are shorthands for a
>    considerable part of ISO 10646, covering many alphabetic and
>    ideographic characters, some 25000 characters in all (1300 non-
>    ideographic). IMHO it is the most elaborate work available today
>    on shorthands. Mostly the shorthands are two-character from
>    the invariant ISO 646 set (ASCII minus 12 characters), but

But they have never been discussed in SC2, and will be only if a New Work
Item Proposal is approved for them.

> 7. Johan van Wingen from Nederlands Normaliserings Institut (the Dutch
>    ISO member body) has a convention for character naming, which is
>    two-character and drawn from ASCII (I think). It is used in his
>    survey of which languages require which characters, and also
>    how these characters are collated in each of these languages.
>    The papers are available electronically - one source is the
>    iso10646 archive at jhuvm.bitnet.

Not yet, not yet, but it will. These are two separate reports.
But the notation was made for convenience, without further claims to
act as short identifiers. Use is intentionally restricted to 6937.

> 12. Alain LaBonte' of the Canadian Standards Association is working
>     on a shorthand, especially for Chinese characters, as far as I
>     understand. I have not seen this work, though.

I have not seen a worked-out example yet. It is described in SC2/WG3
N 125.

As for SGML, TR 9573 is being revised. It will contain all the entity
declarations. Only Part 13, Mathematical symbols, has appeared as yet.

Best regards, Johan van Wingen