From greger@cerberus Fri Feb  7 01:02:51 1992
Received: from eros.uknet.ac.uk by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA20872; Fri, 7 Feb 92 01:02:51 +0100
Received: from xopen.co.uk by eros.uknet.ac.uk with UUCP 
          id <20894-0@eros.uknet.ac.uk>; Thu, 6 Feb 1992 18:01:30 +0000
Received: by xopen (15.11/15.6) id AA13419; Thu, 6 Feb 92 16:57:39 gmt
Message-Id: <7872.9202061649@cerberus.West.Sun.Com>
Received: from friherr.West.Sun.COM by cerberus.West.Sun.Com (5.65/) id AA07872;
          Thu, 6 Feb 92 16:49:22 GMT
Received: by friherr (5.65/Stpd4) id AA11207; Thu, 6 Feb 92 16:51:33 GMT
Date: Thu, 6 Feb 92 16:51:33 GMT
To: i18n <i18n%dkuug.dk@xopen.co.uk>
Cc: unicode@sun.com
From: greger%cerberus@xopen.co.uk ("greger@West.Sun.Com: Greger Leijonhufvud, SunSoft, High Wycombe, U.K.")
Subject: Re: (XoJIG 407) (i18n.131) Locale specific data manipulation
X-Charset: ASCII
X-Char-Esc: 29


In reply to your message of Fri Dec  6 04:32:59 1991
-------
Chang Hyeoungkyu writes

>2. Requirement

>2.1 Collating Arabic characters

>       Source of information:

>               Digital Guide to Developing International Software,
>               Digital Press, 1991.

>       Words are sorted in code order with the Arabic vowels
>       characters excluded. Groups of words having the same
>       consonants are then sorted in code order including the vowel
>       characters.

>       I don't think that this kind of information can be expressed
>       in the current "localedef", personally.

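The current POSIX.2 specification does, in fact, support exactly this
type of collation: the collation order in localedef can give a character
more than one weight, so the vowels can be ignored at the first level and
taken into account only at a later level.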
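
To make the quoted rule concrete, here is a minimal sketch in C of the
two-pass comparison (consonants first, vowels only to break ties). It is
not how localedef expresses the rule, which is done with tables and not
with code. The names is_vowel, cmp_skip_vowels and two_level_cmp are
hypothetical, and the ASCII vowels "aeiou" are only a stand-in for the
Arabic vowel characters.

```c
#include <string.h>

/* Assumption for illustration: ASCII vowels stand in for the Arabic
   vowel characters. strchr() also matches the terminator, so '\0' is
   excluded explicitly. */
int is_vowel(unsigned char c)
{
    return c != '\0' && strchr("aeiou", c) != NULL;
}

/* Pass 1: compare in code order with the vowels excluded. */
int cmp_skip_vowels(const char *a, const char *b)
{
    const unsigned char *p = (const unsigned char *)a;
    const unsigned char *q = (const unsigned char *)b;

    for (;;) {
        while (is_vowel(*p)) p++;
        while (is_vowel(*q)) q++;
        if (*p != *q)
            return (*p > *q) - (*p < *q);
        if (*p == '\0')
            return 0;           /* same consonant skeleton */
        p++;
        q++;
    }
}

/* Pass 2 is used only when pass 1 ties: code order including vowels.
   Returns -1, 0 or 1. */
int two_level_cmp(const char *a, const char *b)
{
    int r = cmp_skip_vowels(a, b);

    if (r != 0)
        return r;
    r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

With these stand-ins, "kutub" sorts before "ktc" (the consonants "ktb"
precede "ktc"), although plain strcmp would put it after; "kitab" and
"kutub" tie on consonants and are then ordered by their vowels.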



>       My opinion:

>       It is not possible to express Ideographic-to-Phonetic
>       conversion information in the current "localedef".

>       We have two choices. The first is to have locale specific
>       function. That is to say, if we call strcoll(), then it should
>       call strcoll_ja_JP@phonetic_sort().

>       The second choice is that we extend the "localedef" to include
>       this kind of information.

The problems of defining "functions" rather than "tables" (which, in a way,
is what localedef does) have been discussed before. The problem is really
one of portability, because the whole idea of POSIX is to provide
portability at the source level. Localedef can be said to describe tables,
but the intent is to define a _behavior_, i.e. with this locale, this is
how the system behaves. So, intentionally, the specification of locales in
localedef is currently based on the "table-oriented" structures (i.e.,
non-algorithmic). It was, however, also recognized that procedures may
be needed to support the structures. For an example of a method of
including "algorithmic" processing (however crude), you can look at the
'date' processing in POSIX.2, where it is envisioned that there is locale-
specific processing (not just data) in alternate date formats.
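
As an illustration of what portability at the source level means here, a
minimal sketch in C: the application only ever calls strcoll(), and the
ordering comes from whatever LC_COLLATE definition is in effect, with no
locale-specific function name in the source. The names cmp_locale and
sort_words are hypothetical, not part of POSIX.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: defers entirely to the current LC_COLLATE category. */
int cmp_locale(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;

    return strcoll(*sa, *sb);
}

/* Hypothetical helper: sorts words in the collating order of the current
   locale. The caller is assumed to have selected a locale beforehand,
   e.g. with setlocale(LC_COLLATE, ""). */
void sort_words(const char **words, size_t n)
{
    qsort(words, n, sizeof *words, cmp_locale);
}
```

The same source then sorts differently under different locales; in the
"C" locale strcoll() behaves like strcmp().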

Of course you can easily extend the structure of localedef by adding,
for example, a function definition:
LC_COLLATE
	procedure my.own.strcoll
END LC_COLLATE

but that does not solve the problem with portability (it could ensure that
when, in that locale, I do a strcoll I would get my own routine), as
1. there is no guarantee that another vendor has implemented my.own.strcoll
   in the same manner (or at all!)
2. it is rather difficult to document what I am doing. Of course, I could
   place C language code (or even LIS code) in the source and hope for
   the best, but we do not really have a good way to describe processes
   unambiguously.


It is not difficult to define methods for extending (or throwing away)
localedef and locale processing. The problem is how to do this and retain
the portability aspect.

Greger Leijonhufvud
SunSoft, Inc.
greger@West.Sun.COM
