From ache@lsd.relcom.eu.net  Thu Mar 12 16:25:03 1998
Received: from lsd.relcom.eu.net (ache@lsd.relcom.eu.net [193.125.27.73]) by dkuug.dk (8.6.12/8.6.12) with ESMTP id QAA27074 for <i18n@dkuug.dk>; Thu, 12 Mar 1998 16:25:00 +0100
Received: (from ache@localhost)
	by lsd.relcom.eu.net (8.8.8/8.8.8) id SAA07604
	for i18n@dkuug.dk; Thu, 12 Mar 1998 18:24:05 +0300 (MSK)
	(envelope-from ache)
Resent-Message-Id: <199803121524.SAA07604@lsd.relcom.eu.net>
Message-ID: <19980312182235.29418@nagual.pp.ru>
Date: Thu, 12 Mar 1998 18:22:36 +0300
From: =?koi8-r?B?4c7E0sXKIP7F0s7P1w==?= <ache@nagual.pp.ru>
To: =?koi8-r?Q?Alain_LaBont=E9=A0?= <alb@sct.gouv.qc.ca>, unicode@unicode.org
Subject: Re: (i18n.422) Regular expressions in Unicode (Was: Ethiopic text)
References: <9803121037.AA21262@unicode.org> <199803121429.PAA25815@dkuug.dk>
Mime-Version: 1.0
Content-Type: text/plain; charset=koi8-r
Content-Transfer-Encoding: 8bit
X-Mailer: Mutt 0.89.1i
In-Reply-To: <199803121429.PAA25815@dkuug.dk>; from alb@sct.gouv.qc.ca on Thu, Mar 12, 1998 at 09:25:47AM -0500
Organization: Biomechanoid
Resent-From: ache@nagual.pp.ru
Resent-Date: Thu, 12 Mar 1998 18:24:05 +0300
Resent-To: i18n@dkuug.dk

On Thu, Mar 12, 1998 at 09:25:47AM -0500, Alain LaBonté wrote:
> At 02:37 98-03-12 -0800, Hallvard B Furuseth wrote:
> >I wrote:
> >
> >>> In particular, I wonder about
> >>> character ranges: If the user says "[À-Å]" in his 8-bit charset (not
> >>> latin-1),

FYI: in practice I have patched the whole regex family in the FreeBSD tree
to use the locale's collation sequence data for [a-z]-type national ranges.
Using Unicode, for example, does not help here, because its letters are not
sorted alphabetically (e.g. the Russian letter YO is out of order), and
[a-z]-type ranges assume alphabetical order in most cases.
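
To make the YO case concrete, here is a minimal sketch in Python. It is not
the FreeBSD patch itself (that is C code working from the locale's collation
tables); the hard-coded alphabet string and the two helper names are
assumptions made up for this illustration. It compares a range test done by
code point with one done by position in the alphabet:

```python
import re

# Russian lowercase alphabet in dictionary order; YO ("ё") follows "е".
# Hard-coded for illustration only: a real implementation would take this
# order from the locale's collation data instead.
RUSSIAN_LOWER = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"


def in_codepoint_range(ch, lo, hi):
    """Range test by raw code point, as a naive [lo-hi] would do it."""
    return ord(lo) <= ord(ch) <= ord(hi)


def in_collation_range(ch, lo, hi, order=RUSSIAN_LOWER):
    """Range test by position in the given alphabet order.

    Raises ValueError if lo or hi is not part of the order.
    """
    pos = order.find(ch)
    return pos != -1 and order.index(lo) <= pos <= order.index(hi)


if __name__ == "__main__":
    # Code points of the range ends and of YO.
    print(hex(ord("а")), hex(ord("я")), hex(ord("ё")))
    # Python's re module compares range ends by code point.
    print(re.fullmatch("[а-я]", "ё") is not None)
    print(in_codepoint_range("ё", "а", "я"))
    print(in_collation_range("ё", "а", "я"))
```

In Unicode the small letters "а".."я" occupy U+0430..U+044F while "ё" is
U+0451, so the code point test rejects it and the alphabet-order test
accepts it.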

-- 
Andrey A. Chernov
http://www.nagual.pp.ru/~ache/
MTH/SH/HE S-- W-- N+ PEC>+ D A a++ C G>+ QH+(++) 666+>++ Y
