From keld@dkuug.dk Fri Sep  4 23:17:39 1992
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA23480; Fri, 4 Sep 92 23:17:39 +0200
Date: Fri, 4 Sep 92 23:17:39 +0200
From: Keld J|rn Simonsen <keld@dkuug.dk>
Message-Id: <9209042117.AA23480@dkuug.dk>
To: i18n@dkuug.dk
Subject: Re:  request for feedback on character set identification proposal
Cc: iso10646@jhuvm.bitnet
X-Charset: ASCII
X-Char-Esc: 29

Some comments on character set naming.

I believe the paper from Ju:rgen Bettels is a good starting
point for a very necessary task.

I have also done a fair amount of work on this, viz. the
papers I have submitted to JTC1/SC2/WG2 and JTC1/SC22/WG20,
and RFC 1345, which I wrote.

The scope is about right as far as I can tell; I have tried
to cover almost the same ground in the above-mentioned papers.

Some additional remarks:

I believe SC22 will need character set names for programming
language support, such as "ASCII", "ISO8859-1", "IBM850" and
"UCS2" - that is: tokens that could be used in programming languages
as a way to characterize strings etc. Several names (aliases)
would be convenient for this, e.g. "ISO8859-1", "ISO-IR-100" and
"IBM819", which could all mean the same encoding.
Is that intended to be covered by the WG3 project?
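
To illustrate the alias idea, here is a minimal sketch in Python of
how a program might resolve several names to one encoding. The
table, the function names and the case-insensitive matching are
assumptions made for this example only; they are not taken from
any existing registry or from the WG3 proposal.

```python
# Hypothetical alias table: every accepted spelling maps to one
# canonical name. Only names mentioned in this message are listed.
ALIASES = {
    "ASCII": "ASCII",
    "ISO8859-1": "ISO8859-1",
    "ISO-IR-100": "ISO8859-1",
    "IBM819": "ISO8859-1",
    "IBM850": "IBM850",
    "UCS2": "UCS2",
}


def canonical_name(name):
    """Return the canonical charset name for a registered alias."""
    # Assumption: names are compared without regard to case.
    key = name.strip().upper()
    try:
        return ALIASES[key]
    except KeyError:
        raise LookupError("unknown character set name: " + name)


def same_encoding(a, b):
    """True if both names are aliases of the same encoding."""
    return canonical_name(a) == canonical_name(b)
```

With such a table, same_encoding("ISO-IR-100", "IBM819") is true,
while an unregistered name raises LookupError instead of being
guessed at.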

Also I believe that names covering more than one registration
in the ISO 2375 registry will be needed. E.g. "ISO8859-1"
covers registrations 6 and 100 plus a C0 and a C1 control set,
in total four sets registered under ISO 2375. For convenience it
should be possible to refer to all of this under one name. The same
goes for e.g. the different Japanese encodings, where explicit
shifts between character sets (SO/SI) may occur.
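
As a sketch of what one name for several registrations could look
like to a program, the Python fragment below records the four
components of "ISO8859-1". The data structure and function names
are hypothetical. The G0/G1 role labels are my annotation, and the
registration numbers of the two control sets are left as None
because they are not given above.

```python
from collections import namedtuple

# One component of a composite character set name.
Component = namedtuple("Component", "role registration")

# Hypothetical composite table; only the example from this message.
COMPOSITES = {
    "ISO8859-1": (
        Component("C0 control set", None),   # number not stated here
        Component("G0 graphic set", 6),
        Component("C1 control set", None),   # number not stated here
        Component("G1 graphic set", 100),
    ),
}


def components(name):
    """Return all registered sets covered by one composite name."""
    return COMPOSITES[name]


def known_registrations(name):
    """Return the registration numbers that are filled in."""
    return [c.registration for c in components(name)
            if c.registration is not None]
```

Here components("ISO8859-1") yields four entries, and
known_registrations("ISO8859-1") yields [6, 100].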

In SC22, POSIX is specifying character sets in "charmaps",
and a registry of charmaps is on the agenda of SC22/WG15 and
SC22/WG20. The charmap format is specified in ISO/IEC DIS 9945-2:1992.
This should be consistent with the new work from WG3.
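
For readers who have not seen a charmap, the sketch below reads a
small subset of the format in Python. It is an illustration under
stated assumptions, not a conforming parser: it handles only the
header declarations, single-byte hexadecimal, decimal and octal
constants, and comment lines, and it ignores symbolic-name ranges
and multi-byte encodings. The sample text and the function names
are made up for this example.

```python
def decode_constant(token, escape):
    """Decode one single-byte constant such as /x41, /d097 or /101."""
    if not token.startswith(escape):
        raise ValueError("bad encoding constant: " + token)
    body = token[len(escape):]
    if body[:1] == "x":
        return int(body[1:], 16)   # hexadecimal
    if body[:1] == "d":
        return int(body[1:], 10)   # decimal
    return int(body, 8)            # octal


def parse_charmap(text):
    """Return (header, table) for a simplified charmap text."""
    escape, comment = "\\", "#"    # defaults until redeclared
    header, table, in_body = {}, {}, False
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(comment):
            continue
        if line == "CHARMAP":
            in_body = True
            continue
        if line == "END CHARMAP":
            break
        fields = line.split()
        if len(fields) < 2:
            continue
        if not in_body:
            header[fields[0]] = fields[1]
            if fields[0] == "<escape_char>":
                escape = fields[1]
            elif fields[0] == "<comment_char>":
                comment = fields[1]
            continue
        table[fields[0]] = decode_constant(fields[1], escape)
    return header, table


# Made-up sample, not an excerpt from a registered charmap.
SAMPLE = """\
<code_set_name> ISO8859-1
<comment_char> %
<escape_char> /
% a comment line
CHARMAP
<NUL> /000
<A> /x41
<a> /d097
END CHARMAP
"""
```

Calling parse_charmap(SAMPLE) gives a header dictionary holding the
declared <code_set_name> and a table that maps each symbolic name
to its byte value.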

It would be very convenient if the new registry could be provided
on-line, e.g. as a Directory (X.500) service, and via FTAM and FTP.

Keld Simonsen
