From ALB@immedia.ca Fri Dec 20 11:18:00 1993
Received: from Clouso.CRIM.CA by dkuug.dk with SMTP id AA24301
  (5.65c8/IDA-1.4.4j for <i18n@dkuug.dk>); Mon, 20 Dec 1993 17:12:21 +0100
Received: from immedia.ca ([192.139.197.1]) by clouso.crim.ca (4.1/SMI-4.1)
	id AA05938; Mon, 20 Dec 93 11:13:12 EST
Return-Path: <ALB@immedia.ca>
Received: by immedia.ca (3.2/2.D)
        id AA21817; 20 Dec 93 16:19:40 -0500
Date: 20 Dec 93 16:18:00 -0500
From: ALB@immedia.ca
Message-Id: <199312201619.AA21817@immedia.ca>
To: cpwg-mail@revcan.rct.ca, i18n@dkuug.dk, sc22wg20@dkuug.dk
Cc: iso10646@jhuvm.hcf.jhu.edu
Subject: Input methods to enter characters from the ISO 10646 repertoire
X-Charset: ASCII
X-Char-Esc: 29

----------
I don't know if I already posted this on this list, but I'm sure I did not post
it complete with the 3 documents SC18/WG9 N 1290R, 1291 and 1292, which form
the basis for an NP to be distributed to JTC1/SC18 member bodies for letter
ballot.

Merry Christmas to those of you who are Christians and to others who
share that event with this community, and a happy new civil year 1994.

Alain LaBont<e'>
Minist<e!>re des Communications du Qu<e'>bec
*******************************************************************************
To       : /Personnel/Umamaheswaran
CC       : /Personnel/Fred
Subject  : Input method to enter characters of the ISO/IEC 10646 repertoire
Delivery : Reguliere
From     : ALB
Type     : EXPRESS
Form     : Mail

Uma, following your phone call I found not 2 but 3 documents to make this
dossier complete.  As I am not sure I posted all 3 on the ISO10646 list, I
will do it again and also send a copy to the SC22WG20 list.

Merry Christmas and a good and happy 1994.

I hope we can see each other on March 2 and 3 in Toronto.

Best regards,
Alain LaBont<e'>

P.-S.: If you do not have TRANI850 to decode it, please do not hesitate to ask
me for this decoder. The text should be relatively readable as is, but it has
examples with accented characters written with network-safe characters (ex.:
"<A" stands for LATIN CAPITAL LETTER A WITH GRAVE); the decoder would have
restored it and other characters for legibility on a PC installed with code
page 850.

cc: Fred (Fred, since you were not at Hursley, this may be useful to you)
-------------------------------------------------------------------------------
TRANISCI <- 1992-01-22 ->TRANI850
                                      ISO+/IEC JTC1+/SC18+/WG9 N1290R

    Title>> A simplistic but workable method to enter characters
           of the full repertoire of ISO+/IEC 10646

    Source>> Alain LaBont/e

    Date>>   1993<-11<-08

    Distribution>> SC18+/WG9

    Action required>> For opening reference in a new work item to
                     be presented at the SC18+/WG9 Hursley meeting
    _____________________________________________________________

    1. Introduction

    Today, there is a well<-known method in existence for inputting
    characters foreign to a given keyboard on PC compatibles.
    However, this method is code<-dependent and is limited to 8<-bit
    character sets. There is a need to standardize such a method
    independently of coding even for these limited sets of
    characters.

    There is also an international standard, ISO+/IEC 9995<-3, for
    inputting on a standard 48<-key keyboard the repertoire of
    characters belonging to those European languages using the
    Latin script. But this standard is limited to the Latin
    script, even if it opens the door to defining
    supplementary groups for other scripts. In the meantime,
    until other groups are well defined and documented, there
    should be an easy standard way to enter non<-Latin characters
    in a code<-independent fashion. This would avoid the
    multiplication of such methods, a situation that is never
    desirable for end<-users.

    Furthermore, ISO+/IEC JTC1 recently published a standard,
    ISO+/IEC 10646, titled "Universal multiple<-octet coded
    character set (UCS)", which is a superset of the repertoires
    of all standard character sets published so far by ISO+/IEC
    JTC1. For this one large character set (UCS), there is no
    standard input method in existence today. But there will be
    an increasing need for one, and such a method would also
    solve the problem of code independence.

    2. Simplistic but workable proposal

    As an initial proposal, simplistic but workable, the
    following method, analogous to what exists now on PC
    compatibles, is proposed in 2.2.

    2.1 Repertoire and "catalog number" to identify characters

    Immediately exploitable is the recently published Universal
    multiple<-octet coded character set, whose repertoire can be
    seen as the complete catalog of standard graphic characters
    previously published by ISO+/IEC JTC1 plus new subsets never
    published before.

    While ISO 10646 uses a multi<-octet coding scheme, it includes
    characters which were previously accessible using many
    different codings. Whether some of these characters are the
    same can be determined, independently of their coding, from
    their names>> if the names of 2 characters included in 2
    different codes are the same, then the characters are the
    same, whatever the coding.

    Hence, as a short<-cut, it is possible to use the code value
    of each character of ISO+/IEC 10646 as a "catalog number" that
    represents any character that an end<-user wants to use, even
    if it is not accessible on a given keyboard. This catalog
    number then represents an easy reference equivalent to the
    concept represented by the name of a character.

    Furthermore, this "catalog number" may easily represent the
    exact same character that exists in numerous non<-standard
    character sets, registered or not in the ISO+/IEC character
    registry (held by ECMA in Gen<eve on behalf of ISO+/IEC
    JTC1+/SC2).

    For what follows, this catalog number will be referred to as
    the hexadecimal value attributed to a given character in
    ISO+/IEC 10646, consisting of 1 to 8 hexadecimal digits, with
    or without the leading zeros of the hexadecimal number,
    followed by the character SPACE (ex.>> 0000020+<SPACE/>,
    000020+<SPACE/>, 0020+<SPACE/>, 20+<SPACE/> are all equivalent
    catalog numbers corresponding to the character SPACE as
    documented in ISO+/IEC 10646). Again, 20+<SPACE/> is not to be
    considered as a code reference>> it could represent the code
    point X"40" of EBCDIC in systems which use the latter code to
    represent the character SPACE, and 20+<SPACE/> would still be
    the reference to that character, whence code<-independence.
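
    As an illustration only, the catalog number syntax described
    above can be sketched in Python. The function name
    parse_catalog_number is hypothetical, and accepting lowercase
    letters a to f is an assumption that the text does not state.

```python
import re

# Assumption: 1 to 8 hexadecimal digits followed by exactly one SPACE.
# Lowercase a-f is accepted here for convenience; the text only names ABCDEF.
_CATALOG_RE = re.compile(r"([0-9A-Fa-f]{1,8}) ")


def parse_catalog_number(keys: str) -> int:
    """Return the ISO/IEC 10646 value named by a typed catalog number."""
    match = _CATALOG_RE.fullmatch(keys)
    if match is None:
        raise ValueError(f"not a catalog number: {keys!r}")
    # Leading zeros do not change the value, so all spellings are equivalent.
    return int(match.group(1), 16)


# The four spellings given in the text all name the character SPACE.
for spelling in ("0000020 ", "000020 ", "0020 ", "20 "):
    assert parse_catalog_number(spelling) == 0x20
```

    The value returned is only a reference to a character of the
    repertoire. It says nothing about the code point used by the
    target machine.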

    The character SPACE has been chosen as the ending delimiter
    because it is assumed the space bar exists on all keyboards
    of the world. While it is not ideal, it is also assumed that
    the first 6 letters ABCDEF of the Latin alphabet exist on all
    keyboards, plus the 10 digits 0123456789. It would have been
    better to use fully decimal references (and avoid, again,
    using the Latin script, which is a cultural bias) but
    ISO 10646 references are more readily found in hexadecimal
    than in full decimal values.

    2.2 Simplistic method of entry per se

    Given such a catalog number, it is proposed to allow the
    entry of any character included in the repertoire of ISO+/IEC
    10646, whatever the resulting code assigned by the target
    machine environment in the following way>>

    While the Control key (as described in ISO+/IEC 9995<-7) and
    the Alternate key (as described in the same standard) are
    simultaneously depressed [preferably by activating the
    Alternate key first, then the Control key], typing the
    hexadecimal value targeting the desired character and ending
    the "catalog number" by depressing the space bar shall
    generate a coded graphic character equivalent to the one
    corresponding to this "catalog number" in ISO+/IEC 10646.

    For example, if the sequence 00C0+<SPACE/> is typed while +<ALT/>
    and +<CTRL/> are depressed and held depressed throughout the
    typing, character "<A" will be generated. In an
    ISO+/IEC 8859<-1 environment, the code value generated by such
    an operation would be hexadecimal C0; in an IBM 850
    environment, this would generate the code value equal to
    hexadecimal B7 (or decimal 183); in an IBM 863 environment,
    the code value generated would be hexadecimal 8E (decimal
    142). In an IBM 437 environment (the original PC code page),
    this entry is expected at the minimum to give a
    warning that this character does not exist, and as an
    optional fallback to generate the character A (unaccented). It
    is to be noted, however, that for most ISO 10646 characters a
    fallback will be very difficult with 7+/8<-bit character sets,
    and a very unlikely character (such as PC code 127 for EMPTY
    HOUSE) is then suggested as an error indication.
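
    The entry procedure of 2.2 and the example above can be
    sketched in Python as follows. This is only a rough model
    under stated assumptions. The names CatalogEntry and
    encode_for_target are hypothetical, the key events are
    simulated, Python codec names stand in for the target
    environments, and the unaccented fallback is approximated by
    Unicode decomposition, which the proposal does not prescribe.

```python
import unicodedata
from typing import Optional, Tuple

HEX_DIGITS = set("0123456789ABCDEF")
ERROR_BYTE = bytes([127])  # PC code 127, the error indication suggested above


class CatalogEntry:
    """Collects the digits typed while Control and Alternate are both held."""

    def __init__(self) -> None:
        self.digits = ""

    def key(self, char: str, ctrl: bool, alt: bool) -> Optional[int]:
        """Feed one key press; return the catalog value once SPACE ends it."""
        if ctrl and alt:
            if char.upper() in HEX_DIGITS and len(self.digits) < 8:
                self.digits += char.upper()
                return None
            if char == " " and self.digits:
                value = int(self.digits, 16)
                self.digits = ""
                return value
        self.digits = ""  # modifiers released or unexpected key: start over
        return None


def encode_for_target(value: int, codec: str) -> Tuple[bytes, bool]:
    """Return (code value, warning flag) for the target character set."""
    if value > 0x10FFFF:  # Python's chr() cannot represent larger values
        return ERROR_BYTE, True
    char = chr(value)
    try:
        return char.encode(codec), False
    except UnicodeEncodeError:
        pass
    # Optional fallback: the unaccented base letter, flagged with a warning.
    base = unicodedata.normalize("NFD", char)[0]
    try:
        return base.encode(codec), True
    except UnicodeEncodeError:
        return ERROR_BYTE, True


# Simulate typing 00C0 and SPACE with both modifier keys held down.
entry = CatalogEntry()
value = None
for typed in "00C0 ":
    value = entry.key(typed, ctrl=True, alt=True)

for codec in ("latin-1", "cp850", "cp863", "cp437"):
    print(codec, encode_for_target(value, codec))
```

    With these codecs the sketch yields hexadecimal C0, B7 and 8E
    for the first three environments, and the unaccented letter A
    together with a warning flag for code page 437, in line with
    the values given in the example.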

    3. Conclusion

    PC users will have noted that entering a given character no
    longer requires knowing which code set is in use in the
    machine but requires looking at the ISO 10646 "catalog" or a
    hand<-made customized subset reference list.

    This represents a big advantage over the present simplistic
    (but well<-known and often nevertheless useful) PC method
    using the Alt key in conjunction with the numeric keypad.

    Furthermore, the arrival of notebook computers has seen many
    of these machines without an easily accessible numeric keypad
    (a logical fallback is then often provided, difficult to use
    and often disturbing the normal use of the alphanumeric
    section), and some even lack a fallback for the
    missing numeric keypad.

    It is expected that other methods will be defined but the
    one explained in the present proposal appears to be
    immediately suitable for a quickly adopted standard, the "quick
    and dirty" solution that a lot of people will use when they
    are desperate.
+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+!
+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+!
+*
                                      ISO+/IEC JTC1+/SC18+/WG9 N1291

    New work item Proposal>> Input methods to enter graphic characters from
    the repertoire of ISO+/IEC 10646 with a keyboard or other input devices

    Source>> SC18+/WG9 SWGK

    Date>>   1993<-11<-04

    Distribution>> SC18+/WG9 and SC18

    Action required>> For SC18 letter ballot and forwarding to
                     JTC1 for approval in the programme of work of
                     SC18+/WG9
    _____________________________________________________________


    NP (New work item Proposal) in FORM 3 format of PROCEDURES

     1 Title of Project.

     Input Method(s) to enter graphic characters from the
     repertoire of ISO+/IEC 10646 with a keyboard or other input
     devices

     2 Scope (and field of application)

     This project will define methods to allow entering
     characters belonging to the repertoire of ISO+/IEC 10646 in
     a code independent manner using a keyboard or other input
     devices. It is also expected that for implementations of
     different character set coding schemes, this method will be
     useable, provided that the target character sets have
     repertoires that are subsets of the one for ISO+/IEC 10646.

     3 Purpose and Justification

     Today, there is a well<-known method in existence for
     inputting characters foreign to a given keyboard on PC
     compatibles. However, this method is code<-dependent and is
     limited to 8<-bit character sets. There is a need to
     standardize such a method independently of coding even for
     these limited sets of characters.

     There is also an international standard, ISO+/IEC 9995<-3, for
     inputting on a standard 48<-key keyboard the repertoire of
     characters belonging to those European languages using the
     Latin script. But this standard is limited to the Latin
     script, even if it opens the door to defining
     supplementary groups for other scripts.

     Furthermore, ISO+/IEC JTC1 recently published a standard,
     ISO+/IEC 10646, titled "Universal multiple<-octet coded
     character set (UCS)", which is a superset of the repertoires
     of all standard character sets published so far by ISO+/IEC
     JTC1. For this one large character set (UCS), there is no
     standard input method in existence today. But there will be
     an increasing need for one, and such a method would also
     solve the problem of code independence.


     4 Programme of work

     A single<-part international standard is expected to be developed for
     this project.

     5 Relevant documents to be considered

     Initially, the following document will be referenced>>

     ISO+/IEC JTC1+/SC18+/WG9 N1290>> A simplistic but workable
     method to enter characters of the full repertoire of ISO+/IEC
     10646

     6 Cooperation and liaison

     The potential cooperation and liaisons identified would be with>>

      ISO+/IEC JTC1 SC2 <- Character Sets and Information Coding
      ISO+/IEC JTC1 SC22+/WG20 <- Internationalisation

     Note>> a liaison with SC2 is already established. A formal
           liaison with SC22+/WG20 remains to be established.

     7 Preparatory work offered and target dates

     Alain LaBont/e (Minist<ere des Communications du Qu/ebec, Canada)
     is volunteering to edit that draft and is being formally proposed as an
     editor by SC18+/WG9.

     Target dates proposed>>

            WD to be sent for comments in April 1994
            CD to be sent for registration and ballot in October
            1994

     8 References to External Authorities

     For this particular project, no registration authority is foreseen.

     Developments (addenda, corrigenda and extensions) on ISO+/IEC
     10646 will have to be followed.

     The proposed standard does not concern known patented items.
+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+!
+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+*+!
+*
                                      ISO+/IEC JTC1+/SC18+/WG9 N1292

    Title>> Draft User Requirements for a Standard on a method to
           enter characters of the repertoire of ISO+/IEC 10646

    Source>> Alain LaBont/e

    Date>>   1993<-11<-04

    Distribution>> SC18+/WG9 and SC18+/WG1

    Action required>> For consideration by SC18+/WG1 alongside
                     documents WG9 N1290 and WG9 N1291 (NP)
    _____________________________________________________________

    The primary users of such a standard will be all keyboard
    users who need to enter characters that are part of the
    character set in use in their machine but that are not
    allocated on the keyboard they use.

    1 Today, there is a well<-known method in existence for
      inputting characters foreign to a given keyboard on PC
      compatibles. However, this method is code<-dependent and is
      limited to 8<-bit character sets. There is a need to
      standardize such a method independently of coding even for
      these limited sets of characters.

    2 There is also an international standard, ISO+/IEC 9995<-3,
      for inputting on a standard 48<-key keyboard the repertoire
      of characters belonging to those European languages using
      the Latin script. But this standard is limited to the Latin
      script, even if it opens the door to defining
      supplementary groups for other scripts. In the meantime,
      until other groups are well defined and documented, there
      should be an easy standard way to enter non<-Latin
      characters in a code<-independent fashion. This would avoid
      the multiplication of such methods, a situation that is
      never desirable for end<-users.

    3 Furthermore, ISO+/IEC JTC1 recently published a standard,
      ISO+/IEC 10646, titled "Universal multiple<-octet coded
      character set (UCS)", which is a superset of the
      repertoires of all standard character sets published so far
      by ISO+/IEC JTC1. For this one large character set (UCS),
      there is no standard input method in existence today. But
      there will be an increasing need for one, and such a method
      would also solve the problem of code independence.

    4 Finally, the arrival of notebook computers has seen many of
      these machines without an easily accessible numeric keypad
      (a logical fallback is then often provided, difficult to
      use and often disturbing the normal use of the alphanumeric
      section), and some even lack a fallback for the
      missing numeric keypad. The method proposed should make
      use of a standard keyboard, preferably any keyboard used in
      the world which has digits and Latin letters.