From erik@sran8.sra.co.jp Fri Jun 14 16:11:24 1991
Received: from mcsun.EU.net by dkuug.dk via EUnet with SMTP (5.64+/8+bit/IDA-1.2.8)
	id AA04675; Fri, 14 Jun 91 16:11:24 +0200
Received: from srawgw.sra.co.jp by mcsun.EU.net with SMTP;
	id AA05803 (5.65a/CWI-2.93); Fri, 14 Jun 91 16:11:00 +0200
Received: from srava.sra.co.jp by srawgw.sra.co.jp (5.64WH/1.4)
	id AA10491; Fri, 14 Jun 91 23:10:33 +0900
Received: from sran8.sra.co.jp by srava.sra.co.jp (5.64b/6.4J.6-BJW)
	id AA12019; Fri, 14 Jun 91 23:10:12 +0900
Received: from localhost by sran8.sra.co.jp (4.0/6.4J.6-SJ)
	id AA12638; Fri, 14 Jun 91 23:08:59 JST
Return-Path: <erik@sran8.sra.co.jp>
Message-Id: <9106141409.AA12638@sran8.sra.co.jp>
Reply-To: erik@sra.co.jp
From: erik@sra.co.jp (Erik M. van der Poel)
To: iso10646@jhuvm.bitnet, ietf-822@dimacs.rutgers.edu, unicode@sun.com,
        i18n@dkuug.dk
Subject: Re: data announcement
Date: Fri, 14 Jun 91 23:08:58 +0900
Sender: erik@sran8.sra.co.jp
X-Charset: ASCII
X-Char-Esc: 29

> And pray tell, what header lines do you put at the beginning
> of executable files?,

You just put some header lines that indicate that this file is an
executable file, and some information to distinguish it from
executable files for other hardware vendors' machines or even
different versions of the same vendor's systems.

For example, if SunOS 4.1.1 executables are not executable on SunOS
4.0.X systems, this information might come in handy when a 4.0.X
system remote-mounts a 4.1.1 filesystem.

(By the way, you would have to upgrade the kernel, so that it knows to
skip the header when it loads a program, before you prepend headers to
executables.)
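
To make the SunOS example concrete, here is a minimal sketch of the
check a system might make before running a remote-mounted executable.
The field names "System" and "Release", and the rule that a file needs
a host release at least as new as its own, are assumptions made up for
illustration. They are not any vendor's actual format or policy.

```python
def release_tuple(text):
    """Turn a dotted release string such as '4.1.1' into (4, 1, 1)."""
    return tuple(int(part) for part in text.split("."))


def can_execute(fields, host_system, host_release):
    """Decide whether a file with these header fields may run on the host.

    Assumed rule: the system name must match exactly, and the release
    named in the header must not be newer than the host's release.
    """
    if fields.get("System") != host_system:
        return False
    return release_tuple(fields["Release"]) <= release_tuple(host_release)


# Hypothetical header of an executable built on SunOS 4.1.1.
header = {"Type": "executable", "System": "SunOS", "Release": "4.1.1"}
on_older_host = can_execute(header, "SunOS", "4.0.3")
on_same_host = can_execute(header, "SunOS", "4.1.1")
```

With this rule, the 4.0.3 host can refuse the file cleanly instead of
trying to run it.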


> database files?,

You just name the database system. You might put "Oracle", together
with the version number. So an Oracle program, having been invoked by
the GUI, would be able to tell whether or not it could process this
file. Of course, Oracle may already include such information in its
files. The point is, however, that not all applications have
human-readable headers in their files.  The human will find these
headers very useful if and when the GUI is not able to access the
corresponding program.
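
A rough sketch of that behaviour: the GUI looks up a program for the
application and version named in the header, and if it has none, it
falls back to showing the header text to the human. The field names
"Application" and "Version", the version values, and the handler table
are all hypothetical.

```python
def choose_action(fields, handlers):
    """Return ('run', program) if a handler is known, else ('show', text)."""
    key = (fields.get("Application"), fields.get("Version"))
    if key in handlers:
        return ("run", handlers[key])
    # No program available: give the human the readable header instead.
    text = ", ".join(f"{name}: {value}" for name, value in fields.items())
    return ("show", text)


# Hypothetical table of programs this GUI can reach.
handlers = {("Oracle", "6"): "oracle-v6"}

known = choose_action({"Application": "Oracle", "Version": "6"}, handlers)
unknown = choose_action({"Application": "Oracle", "Version": "5"}, handlers)
```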


> and other files containing inherently binary data?

There is nothing wrong with putting an ASCII header in front of
"inherently binary data". The computer certainly doesn't care. It's
all binary as far as the computer is concerned. However, the ASCII
header is very meaningful to humans, since we can establish some very
simple rules to map the first handful of bits to descriptive text on
the screen, or printer, or whatever.
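
As an illustration, here is a minimal sketch of one such rule. It
assumes a mail-style layout that I am making up for the example: ASCII
"Name: value" lines, then one empty line, then the untouched binary
data. The function names and field names are hypothetical.

```python
def add_header(fields, payload):
    """Prefix binary payload with ASCII 'Name: value' lines and a blank line."""
    lines = [f"{name}: {value}\n" for name, value in fields.items()]
    return "".join(lines).encode("ascii") + b"\n" + payload


def split_header(blob):
    """Return (fields, payload); the header ends at the first empty line."""
    head, sep, payload = blob.partition(b"\n\n")
    if not sep:
        raise ValueError("no header terminator found")
    fields = {}
    for line in head.decode("ascii").split("\n"):
        name, _, value = line.partition(": ")
        fields[name] = value
    return fields, payload


# The payload may contain any bytes at all, including empty lines.
data = b"\x00\x01\n\n\xff"
blob = add_header({"Type": "executable", "System": "SunOS 4.1.1"}, data)
fields, payload = split_header(blob)
```

Only the bytes before the first empty line are ever decoded as text, so
the binary data after it passes through unchanged.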


> Meta data (and codeset announcement is meta data) belongs in the inode
> on Unix systems.

I cannot bring myself to disagree with you strongly here. Separating
the data itself from the stuff that describes it has a certain
appeal. If a vendor can upgrade their Unix filesystem in such a way
that the inodes become freely extensible, they can easily provide the
kind of human-readable text that I've been talking about.

I simply thought that it would be easier to put an extensible header
in the file itself. If a vendor determines that this is not the case,
then by all means, do it the inode way!


> The user just has to know what files
> are binary and what files are text (line oriented) files.

This is exactly the situation we should be trying to avoid. We want
the computers to do nearly everything automatically. Of course, the
human user will have to learn a few basics, such as the fact that it
is not such a good idea to stuff a floppy in the CD-ROM drive.


> Just try
> the split command on an arbitrary file.  If you are lucky it will find
> NLs sufficiently often to not abend due to overflowing a static
> buffer.

Well, I would agree that it is tragic that some programmers did not
know or did not care about overflowing buffers. This is also one of my
pet peeves. However, I do not see what this has to do with data
announcement.


> Just because this is a hard migration step for Unix does not mean that
> we should not spec it out and start pushing for its inclusion in the
> next POSIX release.

I agree completely. We should spec a new system call, called new_open
or whatever. Whether this system call derives the ASCII header from
the "file" itself, or from the inode, is immaterial. (It's an
implementation detail.)
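
To show why the caller need not care, here is a user-level sketch, not
a real system call. The dictionary `inode_table` merely stands in for
extensible inodes, and the in-file layout (ASCII "Name: value" lines
ended by an empty line) is an assumption. The sketch also assumes that
any file without an inode entry does carry such a header.

```python
def new_open(path, inode_table=None):
    """Return (fields, stream); stream is positioned at the first data byte."""
    stream = open(path, "rb")
    # Inode way: the description lives outside the file's data.
    if inode_table is not None and path in inode_table:
        return dict(inode_table[path]), stream
    # Header way: read ASCII lines up to the first empty line.
    fields = {}
    while True:
        line = stream.readline()
        if line in (b"\n", b""):
            break
        name, _, value = line.decode("ascii").rstrip("\n").partition(": ")
        fields[name] = value
    return fields, stream
```

Either way the caller gets the same two things back: the descriptive
fields, and a stream that starts at the data proper.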


> These days it even has a bar code strip that I can't read but
> I am sure the machine at the post office can.

What if you take this bar code to some other country's post offices?
Will they also be able to make sense of it? In some cases not, right?
Now do you see the value of human-readable information?


> Yeah, I know a fancy mail
> reader can suppress the junk headers, but simple programs should be
> simple.

I couldn't agree more strongly! Simple is best. Even the densest
programmers out there will then get it right.


Cheers,
EvdP

