Re: multipart/alternative question

lee Sat, 18 Jul 2009 13:42:49 -0700

On Fri, Jul 17, 2009 at 06:28:35PM -0500, Kyle Wheeler wrote:
> On Friday, July 17 at 03:58 PM, quoth lee:
> > Hm, somehow I've never had that problem. When reading the message, I 
> > find out if something is attached.
> 
> You're lucky!


Yay! ;)

> But every now and then, I still manage to miss an attached file
> (either because I didn't look, or because it was surprisingly
> small).

That's because mutt doesn't count the attachments right and doesn't
make it very obvious that they are there. Not making it obvious has
advantages in that you don't need to worry about them when viewing a
message. But in your case, it's not so good because it leads you to
miss attachments ... If I got such mails, I might miss them as well.

> > Well, you could have a message with a container. The container could 
> > contain a number of files,
> 
> Hmm, well, I guess I see your point, but not even mutt supports 
> batch-decoding like that. Do you perhaps have a perl script of some 
> kind that you use to bulk-decode like that?

Unfortunately not; but I haven't needed one yet. Saving whole
containers is merely a possibility that comes to mind when considering
attachments as containers that contain something. That's something I
didn't do before. Once you do, it's evident that a MUA could have a
way to save a whole container like that.
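
A rough sketch of what "save the whole container" could look like on the
MUA side, using Python's standard email package. The function names
save_all_attachments and build_demo_message are made up for illustration,
and none of this is taken from mutt or any other MUA.

```python
import os
from email.message import EmailMessage


def save_all_attachments(msg, target_dir):
    """Write every attachment of msg into target_dir and return the paths."""
    os.makedirs(target_dir, exist_ok=True)
    saved = []
    for index, part in enumerate(msg.iter_attachments()):
        # The file name comes from the sender, so drop any directory part.
        name = os.path.basename(part.get_filename() or "") or f"part-{index}"
        path = os.path.join(target_dir, name)
        with open(path, "wb") as handle:
            # decode=True undoes base64 or quoted-printable.
            handle.write(part.get_payload(decode=True) or b"")
        saved.append(path)
    return saved


def build_demo_message():
    """Hypothetical message with a text body and two attached files."""
    msg = EmailMessage()
    msg["Subject"] = "two files"
    msg.set_content("See the attached files.")
    msg.add_attachment(b"first", maintype="application",
                       subtype="octet-stream", filename="a.bin")
    msg.add_attachment(b"second", maintype="application",
                       subtype="octet-stream", filename="../b.bin")
    return msg
```

Calling save_all_attachments(build_demo_message(), "some-directory")
writes a.bin and b.bin into that directory. The basename call is the
important bit: without it, a file name like "../b.bin" would land
outside the directory you picked.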

Perhaps we should make a feature request? I'm not sure how useful that
would be, but I can imagine that there are ways of explicitly using
it. In case you want to send someone several files, you (or the MUA)
could make a container to contain them. The recipient could save the
whole container instead of having to save all the files separately. If
you want to go a step further, you could invent a way of specifying
something that should be done after saving the files, similar to
Debian packages specifying what to do to configure the software that
is in the package ... Like someone could attach a folder with files in
it, the MUA would attach them as/in a container, and the sender would
specify that "make" should be run after saving the files so that the
program gets compiled, because he's sending you the new version of
the program. That's a dangerous feature, so the MUA would have to ask
the user before doing something ...

Speaking of which, I sometimes wished that I could just attach a whole
folder instead of having to attach all the files separately. Mostly,
that was when I wanted to send several pictures to someone. I would
resize them and convert them to JPEG and collect them in a folder, and
when I had all the pictures gathered, I "naturally" wanted to tell
mutt to send the folder --- but that doesn't work. Since pictures can
be large, the user should be able to specify a size limit after
attaching a folder (or large files) to a message, and the MUA should
automatically split the thing up as needed.
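
The "attach a folder, give a size limit, let the MUA split it up" idea
could be sketched like this in Python. It only groups whole files into
several messages' worth of batches; the names file_sizes and
batch_by_size are hypothetical, and splitting a single oversized file
is left out.

```python
import os


def file_sizes(folder):
    """Return (name, size in bytes) for each regular file, sorted by name."""
    entries = sorted(os.scandir(folder), key=lambda entry: entry.name)
    return [(e.name, e.stat().st_size) for e in entries if e.is_file()]


def batch_by_size(sizes, limit):
    """Group (name, size) pairs, in order, into batches of at most limit bytes.

    A file that is larger than limit on its own still gets a batch to
    itself; this sketch does not split individual files.
    """
    batches, current, used = [], [], 0
    for name, size in sizes:
        # Start a new batch when this file would push the current one over.
        if current and used + size > limit:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += size
    if current:
        batches.append(current)
    return batches


pictures = [("a.jpg", 3), ("b.jpg", 4), ("c.jpg", 2), ("d.jpg", 6)]
print(batch_by_size(pictures, limit=8))
# prints [['a.jpg', 'b.jpg'], ['c.jpg', 'd.jpg']]
```

Each batch would become one message. A real limit should be set lower
than the recipient's actual limit, because base64 encoding makes a file
roughly a third larger than it is on disk.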

These mime guys did only part of the job --- or maybe it's the MUA
developers. I used to split up files when the message size limit was
64k and had scripts for sending them and splitting them up
automatically. The limits aren't that tight anymore, but the files are
also larger. Try sending someone 5 or 10 TIFFs as they come out of
your camera --- you may find that you can't even send one because he's
got a mailbox limit of 5MB or a size limit of 20MB and one TIFF is
about 18MB ...

So where's the full MUA support of the mime stuff?

> > At some time, all this mime crap (that's how I still think of it) 
> > was invented.
> 
> HEH - way to make me feel old.

Well, how should I feel? ;)

> The first MIME RFC was written in 1993, back when I was barely a
> teenager and was just about to discover the wonders of
> HyperCard. (My First Internet (tm) was AOL.)

You started early then. In 1993, only a few people had even heard
about "internet" --- it was something very expensive that only a few
institutions like universities could/would afford. At that time, I had
news and mail, but no internet ...

> But I see what you mean about your definition of an "attachment"... 
> though by that logic, a "message" is essentially defined by its 
> headers (From/To/Subject/etc.) rather than its content.

*The headers are content.* The headers are the substantial part of the
message.

What content the headers or other parts of a message have doesn't
define what a message is. The definition is formal and doesn't concern
the contents.

You can send something via SMTP that doesn't have headers and a body
because the MTA doesn't care about the content. That's why the SMTP
protocol knows such things as "envelope sender" and "envelope
recipient". What you put into the headers as From: and To: can be
different from what's in the envelope. For example:


l...@cat:~/Mail$ telnet localhost smtp
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 cat.rubenette.is-a-geek.com ESMTP Exim 4.69 Sat, 18 Jul 2009 12:37:07 -0600
helo cat.rubenette.is-a-geek.com
250 cat.rubenette.is-a-geek.com Hello localhost [127.0.0.1]
MAIL FROM: l...@yun.yagibdah.de
250 OK
RCPT TO: l...@localhost
250 Accepted
DATA
354 Enter message, ending with "." on a line by itself
blahlaber
asldfklsdaf
laskflskf
lasfkksf
laskflsfdk
.
250 OK id=1MSEnY-0002pN-TY
quit
221 cat.rubenette.is-a-geek.com closing connection
Connection closed by foreign host.


Do you know any MUA that could handle the above data? The MTA --- and other
software if you have some --- add headers and make it a message because
they behave nicely:


Return-path: l...@yun.yagibdah.de
Envelope-to: l...@localhost
Delivery-date: Sat, 18 Jul 2009 12:38:29 -0600
Received: from localhost ([127.0.0.1] helo=cat.rubenette.is-a-geek.com)
        by cat.rubenette.is-a-geek.com with smtp (Exim 4.69)
        (envelope-from <l...@yun.yagibdah.de>)
        id 1MSEnY-0002pN-TY
        for l...@localhost; Sat, 18 Jul 2009 12:38:29 -0600
Message-Id: <e1mseny-0002pn...@cat.rubenette.is-a-geek.com>
From: l...@yun.yagibdah.de
Date: Sat, 18 Jul 2009 12:38:23 -0600
X-Spam_score: 1.4
X-Spam_score_int: 14
X-Spam_bar: +

blahlaber
asldfklsdaf
laskflskf
lasfkksf
laskflsfdk


Now with headers:


l...@cat:~/Mail$ telnet localhost smtp
Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.
220 cat.rubenette.is-a-geek.com ESMTP Exim 4.69 Sat, 18 Jul 2009 12:50:43 -0600
helo cat.rubenette.is-a-geek.com
250 cat.rubenette.is-a-geek.com Hello localhost [127.0.0.1]
MAIL FROM: l...@yun.yagibdah.de
250 OK
RCPT TO: l...@localhost
250 Accepted
DATA
354 Enter message, ending with "." on a line by itself
From: fakeu...@fakedomain.xxx
To: nob...@anywhere.com

.
250 OK id=1MSF04-0002qJ-03
quit
221 cat.rubenette.is-a-geek.com closing connection
Connection closed by foreign host.


That's how it looks in mutt:


Return-path: l...@yun.yagibdah.de
Envelope-to: l...@localhost
Delivery-date: Sat, 18 Jul 2009 12:51:51 -0600
Received: from localhost ([127.0.0.1] helo=cat.rubenette.is-a-geek.com)
        by cat.rubenette.is-a-geek.com with smtp (Exim 4.69)
        (envelope-from <l...@yun.yagibdah.de>)
        id 1MSF04-0002qJ-03
        for l...@localhost; Sat, 18 Jul 2009 12:51:51 -0600
From: fakeu...@fakedomain.xxx
To: nob...@anywhere.com
Message-Id: <e1msf04-0002qj...@cat.rubenette.is-a-geek.com>
Date: Sat, 18 Jul 2009 12:51:49 -0600
X-Spam_score: 3.4
X-Spam_score_int: 34
X-Spam_bar: +++



Notice Return-path: and Envelope-to: --- the MTA is nice enough to create
those headers. However, I could have given it other information in the
SMTP session --- that's how you send SPAM (if you can figure out how
to fake your IP address or if you don't need to).
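
The same split between envelope and headers shows up when you let a
program do the talking instead of telnet. A minimal sketch with Python's
smtplib, assuming an MTA listening on localhost port 25; the example.org
addresses and the function names are placeholders, not anything from the
sessions above.

```python
import smtplib
from email.message import EmailMessage


def build_message(header_from, header_to, text):
    """Build a message whose From: and To: are whatever the caller says."""
    msg = EmailMessage()
    msg["From"] = header_from
    msg["To"] = header_to
    msg.set_content(text)
    return msg


def send_with_envelope(msg, envelope_sender, envelope_recipients,
                       host="localhost", port=25):
    """Hand msg to an MTA; delivery uses only the envelope arguments."""
    with smtplib.SMTP(host, port) as smtp:
        # MAIL FROM and RCPT TO come from these arguments, not from the
        # From: and To: header fields inside msg.
        smtp.sendmail(envelope_sender, envelope_recipients, msg.as_bytes())


# Headers say one thing ...
demo = build_message("someone@example.org", "nobody@example.org", "hello")
# ... and the envelope could say another (not run here, needs an MTA):
# send_with_envelope(demo, "real-sender@example.org", ["me@localhost"])
```

The MTA never has to look at demo["From"] or demo["To"] to deliver the
mail, which is the whole point of the envelope.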

The content doesn't matter. No message without headers, but headers
are content. That hasn't changed in any way with mime extensions. They
might try to redefine the understanding of "message", but the
implementation is a matter of fact they cannot simply change. They
didn't want to, they wanted to come up with something compatible.

Now you might say that the body of the message is an attachment (to
the headers). It sort of is, but that used to always be displayed
(because it was plain text) and thus nobody would consider it as an
attachment. When you get a letter on paper, it comes in an envelope
(SMTP), and it has the address of the sender and yours on top and a
subject and a date (headers), then there's blank space and then the
text (body). You wouldn't ever consider the text part (body) of the
letter as attached or as attachment, would you?

Email is *exactly* the same. But you run into problems when you send
someone a letter that has a picture printed in the middle of the text:
That's relatively easy with paper but impossible with email. People
found ways to do it nonetheless, using uuencoders or base64 to put the
picture into the *body of the mail*. Looking at the mail when sending
or receiving it, you would see the uucode or the base64 and know that
you got a file. It was in the body.

Then mime was invented to solve the problem, and you didn't put the
picture into the body of the mail anymore. You would instead attach
it and see nothing of that. There's no uucode or base64 in the
body. It's out of the body, it's an attachment.
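
To make the difference concrete, here is a small Python sketch of both
ways of sending the same picture. The function names, the file name and
the image/jpeg type are just for illustration, and 16 dummy bytes stand
in for a real picture.

```python
import base64
from email.message import EmailMessage


def old_style_message(data):
    """Pre-MIME habit: paste the encoded file into the plain-text body."""
    msg = EmailMessage()
    encoded = base64.encodebytes(data).decode("ascii")
    msg.set_content("Here is the picture:\n\n" + encoded)
    return msg


def mime_style_message(data):
    """MIME: the text and the file travel as separate parts."""
    msg = EmailMessage()
    msg.set_content("Here is the picture.")
    # The file name and the image/jpeg type are illustrative only.
    msg.add_attachment(data, maintype="image", subtype="jpeg",
                       filename="picture.jpg")
    return msg


def part_types(msg):
    """List the content type of every non-multipart part of msg."""
    return [part.get_content_type() for part in msg.walk()
            if not part.is_multipart()]


picture = bytes(range(16))  # stand-in for real image data
print(part_types(old_style_message(picture)))   # ['text/plain']
print(part_types(mime_style_message(picture)))  # ['text/plain', 'image/jpeg']
```

On the wire the base64 is of course still there in the second case. It
has only moved out of the text part into a part of its own, which is
why you no longer see it when reading the text.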

Later you might have a GUI MUA that could display the picture in the
middle of the text. I'm still not using one, and I'm not going to see
the picture unless I do something to display it --- and I don't want
it any other way.

So for all practical purposes, whatever mime component is in an email,
it is an attachment. People who are 20 or 25 years younger than I am
and have grown up "without attachments" (sort of) and with GUI MUAs may
have a different understanding. But how to display what is left up to
the user.

> > And isn't it outright amazing that the mime guys, in a way, 
> > massively failed in clearly defining what an attachment is and what 
> > not?
> 
> Yes and no. I think a lot of those sorts of oversights tend to be the 
> result of assumptions, influenced by popular software abstractions. 
> For example, it's easy to assume that an "attachment" is anything the 
> user explicitly "attached" (by clicking the "attach" button) and that 
> any behind-the-scenes encoding nonsense doesn't count IF (and only if) 
> you typically operate at a level where that stuff is handled 
> transparently such that you never see it. If, on the other hand, you 
> usually read mail with `more` (or `mail` or something else that 
> usually shows raw email content), and you tend to *see* the MIME 
> encoding, then it's easier to think of that as "attached" to the "plain 
> text message".

Well, yes, but that doesn't explain why they so massively failed in
clarity. I can always maintain that the understanding you describe is
wrong because the matter of fact of the implementation is totally
different. It's only that you don't become aware of the implementation
because the implementation makes things transparent to you (but only
as far as it works that way). The people making the RFC, creating the
transparency, should (must) have been aware of the implementation and
could have much more clearly described what they are doing and defined
what they created much more clearly.

> >> As I understand it, this means that a "Message" is generally a 
> >> series of text lines similar to that defined in RFC 822 but that 
> >> may also be divided into one or more sub-parts that are encoded 
> >> according to the MIME standard (RFC 2045). As such, a "message" can 
> >> contain another "message", as long as the "contained" message is 
> >> encapsulated within a MIME entity/component of the other. Thus, 
> >> since a MIME entity can encapsulate another message, the entity's 
> >> body may be a full-blown "message" in and of itself.
> >
> > Why don't they just say that? But what is an entity?
> 
> Ehrm... it's defined in section 2.4:
> 
>      The term "entity", refers specifically to the MIME-defined header
>      fields and contents of either a message or one of the parts in the
>      body of a multipart entity. The specification of such entities is
>      the essence of MIME. Since the contents of an entity are often
>      called the "body", it makes sense to speak about the body of an
>      entity. Any sort of field may be present in the header of an
>      entity, but only those fields whose names begin with "content-"
>      actually have any MIME-related meaning. Note that this does NOT
>      imply that they have no meaning at all -- an entity that is also a
>      message has non-MIME header fields whose meanings are defined by
>      RFC 822.
> 
> So... it sounds like, because it's English, words got re-used and 
> redefined into confusion.

Hm. I was about to say that it doesn't have to be that way, but, since
I'm German, I find that English, at least American English, is mostly
very sloppy and indistinctive with things. There are distinctions I
"naturally" make in German that nobody makes in English. That means
they are not aware that there is the possibility of making a
distinction. English doesn't allow them to think of one, it sets
limits. I can't tell if that's really true because I might not feel
that way if English was my native language --- and it's impossible to
tell because if English was my native language, I wouldn't be aware of
the possibility (unless I learned another language, maybe).

Anyway, they are using another recursive definition. When you are in
school and write an essay or something in such a manner, they will ---
at least they should --- tell you that you must not use recursive
definitions and that your essay sucks because it is incomprehensible.

The above quote does not define what an entity is because it is
recursive. I don't understand it: "The term "entity", refers
specifically to the MIME-defined header fields and contents of either
a message or one of the parts in the body of a multipart entity."

What is a "multipart entity"? They need to explain first what an
"entity" is and then what a "multipart entity" is.

> As I understand it, an "entity" is essentially anything between a
> pair of MIME delimiters.

That can't be right: "The term "entity", refers specifically to the
MIME-defined header fields".

I guess they mean "header fields that are defined by the mime
RFC". Using that in the RFC that defines mime is somewhat recursive
again.

So an "entity" can simply be a header field that is defined in the
RFC.

Now: "The term "entity", refers specifically to the [...] contents of
[...] a message". And:

"
   The term "message", when not further qualified, means either a
   (complete or "top-level") RFC 822 message being transferred on a
   network, or a message encapsulated in a body of type "message/rfc822"
   or "message/partial".
"


That is:

1.) There seem to be messages (undefined because used recursively)
that are "encapsulated in a body of type "message/rfc822" or
"message/partial"" and not being transferred on a network.

2.) There seem to be messages (undefined as of yet) that are "RFC 822
message[s] being transferred on a network".


That means:


"The term "entity", refers specifically to the [...] contents of
[something that is a] RFC 822 message or encapsulated in a body of
type "message/rfc822" or "message/partial" [and not being transferred
on a network]."


So they have a distinction between the headers they are defining in
the mime RFC and the contents of RFC 822 messages in either plain or
encapsulated form. They are saying that an "entity" can be a header or
the contents of an RFC 822 message (in either form), giving up the
distinction the same moment they make it.

That is again recursive because what they are trying to define is (a
way of using) "additional content types" in RFC 822 messages. It is
unwise to give up the distinction they made because not making the
distinction contributes a lot to the confusion.
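
For comparison, here is how one implementation reads the definition.
Python's standard email package represents every entity, the top-level
message and each part alike, as the same kind of object with its own
header fields and its own body. That is only one reading of the RFC,
not an authoritative one, and describe_entities and build_nested_demo
are names made up for this sketch.

```python
from email.message import EmailMessage


def describe_entities(msg, depth=0):
    """Return one line per entity; the indentation shows the nesting."""
    lines = ["  " * depth + msg.get_content_type()]
    # multipart/* and message/rfc822 entities carry further entities.
    if msg.is_multipart():
        for part in msg.get_payload():
            lines.extend(describe_entities(part, depth + 1))
    return lines


def build_nested_demo():
    """Hypothetical message that carries another complete message."""
    inner = EmailMessage()
    inner["Subject"] = "inner"
    inner.set_content("I am a complete message of my own.")

    outer = EmailMessage()
    outer["Subject"] = "outer"
    outer.set_content("Forwarding a message.")
    outer.add_attachment(inner)  # becomes a message/rfc822 part
    return outer


print("\n".join(describe_entities(build_nested_demo())))
```

This prints a multipart/mixed entity containing a text/plain entity and
a message/rfc822 entity, and the latter contains the text/plain body of
the inner message.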

I give up here. Have them make a comprehensible and non-recursive RFC.

> But that's a common terminology problem with any digital generic 
> container (and by generic, I mean that it can contain itself). Kind of 
> like running a PC emulator within a PC emulator - clear descriptions 
> of what you're doing start to become strained.

But that is very easy to understand and to explain. You might even do
that every day when emptying the dishwasher by putting cups into cups,
pots into pots or pans into pans to make them fit into the
cupboard. Nobody gets confused by the fact that a pot can fit into another
pot and that pots can fit into cupboards. If you don't have much room,
you might put cups into pots and pots into pans as well, and there's
still nothing confusing about it.

> This is a bad example, but I think you can view things similarly when 
> you consider the Bible. The Bible is a "book". But it contains 
> "books". So you can make reference to the Book's books.

Yeah, and nobody gets confused about that. The bible, the emulator,
the dishwasher, the pots, pans and cups are not recursive.

Think of it: When making the RFC, if they hadn't tried to think in
terms of putting one thing into another recursively but in terms of
attaching something to a message, their job would have been a lot
easier and people could understand the RFC without difficulty. But
they confused themselves. That's not a problem with terminology.
