From glenn@ila.com Sun Nov 18 20:36:15 1990
Received: from [140.186.1.4] by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA01521; Sun, 18 Nov 90 20:36:15 +0100
Received: by meillet.ila.com (4.1/ILA-4.10) id AA01206; Sun, 18 Nov 90 14:15:57 EST
Date: Sun, 18 Nov 90 14:15:57 EST
From: glenn@ila.com (Glenn Adams)
Message-Id: <9011181915.AA01206@meillet.ila.com>
To: erik@sra.co.jp
Cc: Becker.OSBU_North@xerox.com, unicode@sun.com, i18n@dkuug.dk,
        arnet@hpda.cup.hp.com
In-Reply-To: Erik M. van der Poel's message of Sun, 18 Nov 90 12:56:17 +0900 <9011180356.AA24323@sran8.sra.co.jp>
Subject: Han Character Code Ordering
X-Charset: ASCII
X-Char-Esc: 29


   From: Erik M. van der Poel <erik@sra.co.jp>
   Date: Sun, 18 Nov 90 12:56:17 +0900

   String-based sorting is desirable because of the change in
   pronunciation of a character when it is combined with other
   characters. Example:

	   KAZE		(1 character)	means "wind"
	   TAI FUU	(2 characters)	means "typhoon"

   Here, KAZE and FUU are the same character. The implications of
   this are staggering. Not only do we need a large dictionary with all
   the different pronunciations, but we may in some cases also need to
   parse sentences. But this should probably be left to sophisticated
   applications.

Alternatively, one could retain the yomi (reading) at input conversion
time and annotate each jiritsugo (independent word) with it.  The
annotation would need to be kept only where the reading could not
otherwise be recovered unambiguously.  Unfortunately, this is not
possible with most conversion interfaces, which discard this structure
once conversion is done.  I believe this is a good reason for demanding
a richer conversion interface.
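
As a rough sketch of what I mean, assuming the conversion interface
hands back the reading together with the converted text: the names
below (AnnotatedWord, sort_key) are invented for illustration and do
not correspond to any existing interface, and the readings are
romanized only to keep this message in ASCII.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AnnotatedWord:
    """A converted word that keeps the reading it was typed with (hypothetical type)."""
    surface: str  # the Han characters as stored in the text
    yomi: str     # the reading supplied at input conversion time


def sort_key(word: AnnotatedWord) -> tuple:
    # Order by reading first; fall back to the surface form for equal readings.
    return (word.yomi, word.surface)


words = [
    AnnotatedWord("\u53f0\u98a8", "taifuu"),  # TAI FUU, "typhoon"
    AnnotatedWord("\u98a8", "kaze"),          # KAZE, "wind"
]

# Sorting by the retained reading needs no dictionary and no parsing.
by_yomi = sorted(words, key=sort_key)

# Sorting by character code alone ignores pronunciation entirely.
by_code = sorted(words, key=lambda w: w.surface)

print([w.yomi for w in by_yomi])
print([w.yomi for w in by_code])
```

The same character (U+98A8) sorts under "kaze" in one word and under
"taifuu" in the other, which code order alone cannot express.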

Glenn

