From greger@friherr.demon.co.uk Sat Jun 17 17:27:38 1995
Received: from disperse.demon.co.uk by dkuug.dk with SMTP id AA12183
  (5.65c8/IDA-1.4.4j for <i18n@dkuug.dk>); Sat, 17 Jun 1995 18:28:37 +0200
Received: from post.demon.co.uk by disperse.demon.co.uk id aa16251;
          17 Jun 95 17:28 +0100
Received: from friherr.demon.co.uk by post.demon.co.uk id aa19723;
          17 Jun 95 17:28 +0100
Comments: Authenticated sender is <greger@friherr.demon.co.uk>
From: Greger Leijonhufvud <greger@friherr.demon.co.uk>
Organization:  Friherr Software Ltd
To: Kees Pronk <C.Pronk@twi.tudelft.nl>
Date:          Sat, 17 Jun 1995 17:27:38 +0000
Mime-Version:  1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Subject:       Re: dutch locale (retry)
Reply-To: greger@friherr.com
Cc: wg15-uk@xopen.co.uk, i18n@dkuug.dk
Priority: normal
X-Mailer: Pegasus Mail for Windows (v2.0-WB4)
Message-Id:  <9506171728.aa19723@post.demon.co.uk>
X-Charset: ASCII
X-Char-Esc: 29


> Dear Greger,      (* retry because earlier mail bounced *)

> We discussed your answers and the results of the Enschede meeting in our
> subgroup.
> As a result we have some further questions and remarks.
> The text below follows my earlier questions:
> > 
> > 
> > The UK profile is intended to allow for both 'traditional' use conformance,
> > i.e. in an 8-bit environment as now, and also for a more 'advanced' 
> > future environment; using 10646. While you could restrict the profile to only 
> > use 8859-5, I would suggest doing as we did.
> > 
> We discussed this and decided to keep to your model of having more than
> one environment; however, see below....
> 
-------
> We would like to investigate whether the following set up
> of levels of conformance would be possible:
> level-1 8859-9 ( = latin-5) is prescribed for governmental use now,
> level-2 8859-1 ( as a fall-back position ) for industry,
> level-3 10646  ( subset as in the UK profile),
> level-4 6937   ( Latin Telematic set) Has been used for a large scale
> 	         governmental project. I know this has been recently
> 		 withdrawn.
> Would this be feasible?

Certainly. Note that we are discussing changing the name from
'level' to 'type' or 'set', as they might not be proper levels
(the term 'level' does imply proper subsetting). Personally, I am beginning
to question the use of the large 10646 set  (i.e. including Arabic,
Hebrew, etc.) in a national profile. It seems somewhat pretentious to
specify how e.g. Arabic characters are sorted in Holland (or
England...). In the case of the UK, it may be more rewarding to
include only characters used in languages 'recognized' by e.g.
Social Services or any other official institution, and which are
commonly spoken in the country, which for the UK would possibly be
English, Welsh, Gaelic, Cornish, Greek, Devanagari, Bengali,
Gurmukhi, and Gujarati.

From that perspective, there are two sets of symbolic name tables:
one which covers the national language(s) and standard European
languages, and one which also contains other important scripts. In
your case, with Dutch as the national language, the mnemonic table
of symbolic names should contain all characters in 8859-1, 8859-9,
and 6937. The profiles are identical for these code sets. The bigger
one should contain a relevant subset of 10646; I suppose whatever is
used in Indonesia is also important. If you have any ideas on this,
I would appreciate them....
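
To illustrate why 'type' or 'set' may be a better word than 'level',
here is a minimal sketch in Python (chosen for illustration only; the
helper name repertoire is hypothetical and not part of any profile).
It compares the upper halves of 8859-1 and 8859-9 as Python's codecs
define them, and builds the union that a common symbolic name table
would have to cover. 6937 is left out because the Python standard
library has no codec for it.

```python
# Sketch only: compare the repertoires of two 8-bit code sets.

def repertoire(codec):
    """Return the set of characters encoded at 0xA0..0xFF in `codec`."""
    return {bytes([b]).decode(codec) for b in range(0xA0, 0x100)}

latin1 = repertoire("iso8859-1")
latin5 = repertoire("iso8859-9")

only_latin1 = latin1 - latin5   # characters present in 8859-1 only
only_latin5 = latin5 - latin1   # characters present in 8859-9 only
union = latin1 | latin5         # what a shared name table must cover

print("only in 8859-1:", "".join(sorted(only_latin1)))
print("only in 8859-9:", "".join(sorted(only_latin5)))
print("size of union: ", len(union))
```

Each code set has characters the other lacks, so neither is a proper
subset of the other, while a single symbolic name table over the
union can serve both.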

> > > 3- The POSIX document emphasizes a particular short-hand style for
> > >    symbolic names in the charmap.
> > >    The UK document is not using that style, but instead uses longer names.
> > >    As there is some resistance here against the short hand style we would
> > >    like to use the longer names, but are unsure of the consequences.
> > 
> > The POSIX document uses a particular, 2-character short-hand style as an
> > example; you can use whichever one you wish. It should be noted that 
> > the only reasonable unambiguous character names are those in 10646; the
> > longer ones I used are based on those. 
> > There are no particular consequences; The charmap must be defined anyway
> > as part of the profile...
> > 
> We decided to stay with the longer names.

> > > 4- As a practical consequence of the previous question: would it be
> > >    possible for us to obtain the source material for the UK-profile?
> > 
> > Absolutely!
> > 
> Could we arrange something to pick up your material from a FTP-site?
> Could you inform us about the word processing format used?
> Do you have available tools to extract the information from
> the word processing format such that it will fit the localedef
> program?

I will follow up on that in a subsequent message, including where the 
sources can be picked up. I currently use WordPerfect 6.0a, but also 
troff. The localedef 'sources' must, of course, be in 8859-1 (or,
more properly, in the POSIX Portable Character Set (ASCII...)).


> > > 5- In the Dutch language a particular digraph exists: the <ij>.
> > >    It seems the rules for POSIX do allow to devise correct collating rules
> > >    for this digraph. However, it seems impossible to have correct
> > >    versions of toupper and tolower.
> > >    We would like to write (<ij>,<IJ>).
> > >    Herman Weegenaar will produce a proposal for discussion at the 
> > >    Enschede meeting.
> > > 
> > You are correct. This is a problem, and as it depended on C language
> > conformance (upper/lower conversion can only be done for a character,
> > not for a 'string') we could not change it. Of course, if <ij> is a single
> > character, it works... 
> > Defining that multi-character symbols must be case converted is fine,
> > but as no software currently supports it, where are we?
> >
> We postponed this one for the moment as there is a discussion going on in
> the Netherlands whether we would need a 27th letter!
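
One place where the limitation shows is capitalising a word. The
sketch below, in Python and purely for illustration, contrasts a
conversion that looks at one character at a time (all that the C
toupper() interface permits) with a string-level rule that knows
about the digraph. Both function names are hypothetical, and the
digraph rule is an assumption about the desired result, not something
POSIX defines.

```python
# Sketch only: per-character versus string-level capitalisation.

def capitalize_per_char(word):
    """Convert only the first character, as a toupper()-style call would."""
    return word[:1].upper() + word[1:]

def capitalize_dutch(word):
    """Treat a leading 'ij' as one unit and convert both letters."""
    if word[:2].lower() == "ij":
        return "IJ" + word[2:]
    return word[:1].upper() + word[1:]

print(capitalize_per_char("ijsselmeer"))
print(capitalize_dutch("ijsselmeer"))

# If <ij> is a single coded character (U+0133 in 10646), the
# per-character conversion is sufficient.
print(capitalize_per_char("\u0133sselmeer") == "\u0132sselmeer")
```

The first call converts only the 'i'; the second converts the whole
digraph. With the single-character form the problem does not arise,
which is the "if <ij> is a single character, it works" case above.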

> Thanks for your cooperation,

> Kees Pronk.

Greger Leijonhufvud





------------------------------------------------------
Greger Leijonhufvud        tel:   +44 (0)181 747 3313
Friherr Software Ltd       fax:   +44 (0)181 742 1851
6 Belgrave Court           email: greger@friherr.com
London W4 4LG, UK
