From keld@dkuug.dk Fri Feb 22 21:03:44 1991
Received: by dkuug.dk (5.64+/8+bit/IDA-1.2.8)
	id AA14796; Fri, 22 Feb 91 21:03:44 +0100
Date: Fri, 22 Feb 91 21:03:44 +0100
From: Keld J|rn Simonsen <keld@dkuug.dk>
Message-Id: <9102222003.AA14796@dkuug.dk>
To: i18n@dkuug.dk
Subject: general strings
X-Charset: ASCII
X-Char-Esc: 29

Milos Lalovic wrote:

(I am quoting this in full, as it looks like it was
only sent to the uniforum-intl list - Keld).

> I am not sure what is Keld's objective when he proposes the use of
> mnemonics such as <o/> in "S<o/>ndag", but I see the following pros
> and cons with mnemonics:
> 
> Pros:
> 
> 1. Since mnemonics are built from characters in a portable character
>    set
>    - they can be displayed on any display and printed on any printer
>    - they can be universally recognized, regardless of the code set
>    - every keyboard can generate them
> 
> 2. Source code which is developed using the mnemonics is code set
>    independent, and universally portable.
> 
> 3. All characters in the world can be represented in any environment,
> 
> Cons:
> 
> a. Mnemonics can not be processed by string functions in programming
>    languages.

Well, they can - they are just characters like any others.
The problem is getting each mnemonic recognised as *one* abstract
character, so that logical operations such as isalpha(), toupper()
etc. work on it. This is actually part of what I am proposing: that all
programming languages recognise these abstract characters and handle
them properly, as defined in the locale and charmap. This could be done
by defining such extended locale/charmap support in POSIX and then
having each programming language define (thin!) bindings to it.
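
To make the point concrete, here is a rough sketch (in Python, only
for illustration) of what "recognising a mnemonic as one abstract
character" could mean. The table MNEMONICS and the function names are
my own assumptions for this example, not anything defined by POSIX; a
real binding would take the data from the locale and charmap.

```python
import re

# Hypothetical table: mnemonic name -> (is alphabetic, uppercase mnemonic).
# In a real system this would come from the locale and charmap.
MNEMONICS = {
    "o/": (True, "O/"),
    "O/": (True, "O/"),
    "ae": (True, "AE"),
    "AE": (True, "AE"),
}

# Either a bracketed candidate such as <o/>, or any single character.
TOKEN = re.compile(r"<([^<>]+)>|(.)", re.DOTALL)


def abstract_chars(text):
    """Split text into abstract characters; a known <name> counts as one."""
    chars = []
    for m in TOKEN.finditer(text):
        name = m.group(1)
        if name is not None and name in MNEMONICS:
            chars.append("<" + name + ">")
        elif name is not None:
            # Unknown name: treat the bracketed text as ordinary characters.
            chars.extend(m.group(0))
        else:
            chars.append(m.group(2))
    return chars


def _name(ch):
    """Return the mnemonic name if ch is a known mnemonic token, else None."""
    if len(ch) > 2 and ch[0] == "<" and ch[-1] == ">" and ch[1:-1] in MNEMONICS:
        return ch[1:-1]
    return None


def is_alpha(ch):
    """isalpha() for one abstract character."""
    name = _name(ch)
    return MNEMONICS[name][0] if name is not None else ch.isalpha()


def to_upper(ch):
    """toupper() for one abstract character."""
    name = _name(ch)
    if name is not None:
        return "<" + MNEMONICS[name][1] + ">"
    return ch.upper()


word = abstract_chars("S<o/>ndag")
print(len(word))                             # six abstract characters
print("".join(to_upper(c) for c in word))    # S<O/>NDAG
```

With this, "S<o/>ndag" has six characters rather than nine, and the
classification and case functions see <o/> as a single letter.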

>    The best that we can hope for is that compilers would convert them
>    into code points. However, this may not be possible if the code set
>    in which the compiler is running does not have a code point for every
>    mnemonic it encounters. Code sets that are based on ISO 2022 or
>    similar schemes can not be used for the same reasons as mnemonics.

That is another of my points. If the character can be compiled into
a code point, then by all means do it. If not, then leave it as
it is, with some notation to tell that this is an abstract character
- for example enclosing it in <mn> or with some \mn escapes.
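
As a sketch of that rule (again Python, only as an illustration): the
function compile_string and the table CHARMAPS below are assumptions
made up for this example. The ISO 8859-1 values are the real code
points for those letters; everything else is hypothetical.

```python
import re

# Hypothetical charmaps: mnemonic name -> code point in that code set.
# The ISO 8859-1 entries are the actual code points; ASCII has none.
CHARMAPS = {
    "ISO8859-1": {"o/": 0xF8, "O/": 0xD8, "ae": 0xE6},
    "ASCII": {},
}

MNEMONIC = re.compile(r"<([^<>]+)>")


def compile_string(text, codeset):
    """Turn each mnemonic into a code point if the code set has one.

    Mnemonics without a code point in the target code set are left
    as they are, still marked by the <...> notation.
    """
    charmap = CHARMAPS[codeset]
    out = bytearray()
    pos = 0
    for m in MNEMONIC.finditer(text):
        # Text between mnemonics is assumed to be portable-set characters.
        out += text[pos:m.start()].encode("ascii")
        code = charmap.get(m.group(1))
        if code is not None:
            out.append(code)                    # compiled to a code point
        else:
            out += m.group(0).encode("ascii")   # left as an abstract character
        pos = m.end()
    out += text[pos:].encode("ascii")
    return bytes(out)


print(compile_string("S<o/>ndag", "ISO8859-1"))   # b'S\xf8ndag'
print(compile_string("S<o/>ndag", "ASCII"))       # b'S<o/>ndag'
```

The same source string compiles to a single code point for <o/> under
ISO 8859-1 and stays in mnemonic form under ASCII.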
(I don't like the \ - some may know that already :-)

>    To ensure that all mnemonics can be converted into a code point, wide
>    character strings must be used (here I completely agree with Matt
>    Caprile, although it should be clear that wchar_t does not guarantee
>    portable international applications unless it is exclusively based on
>    ISO 10646 or Unicode).

True. I would think that 10646 would be the choice for an official
ISO standard like POSIX.

> b. If there is a key on the keyboard that generates the character <o/>,
>    no user would ever agree to four keystrokes instead of one.

True. Thus it should always be possible to use the full
repertoire of the current character set, and the mnemonic
representation should only be needed for characters outside
this character set. For portability it should always be possible
to write the mnemonic representation. This is just what POSIX.2
localedef allows right now (draft 10).

> c. The text containing mnemonics is unreadable (we need computers to
>    help us, not to make our job more difficult).

Well, it is not the optimal presentation, but given the
circumstances - that you are not able to present the real character -
it could be a good approximation, and quite understandable.

> A possible good use for mnemonics would be in the following scenario:
> Everything is based on a universal character set (i.e. ISO 10646 or
> Unicode). The existing devices can not handle the universal character
> set encoding, so conversion to a device code set at run time must be
> performed. Since not all devices will have fonts for all characters in
> the universal character set, we could use mnemonics to represent those
> characters that are not in the code set of a device. We can also use
> mnemonics to input those characters that a keyboard can not generate.
> 
> Milos Lalovic  -------- National Language Technical Centre
>                -------- IBM Canada, Toronto Laboratory

Keld