On Fri, Mar 31, 2017 at 05:52:00PM -0700, Linus Torvalds wrote:
> Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo:
> handle in-body header continuations").
>
> The continuation logic is oddly complex, and I can't follow the logic.
> But it is completely broken in how it thinks empty lines are somehow
> "continuations".
I think the continuation logic is OK. The problem is that the newline is
never fed to check_inbody_header() at all. So the next line its state
machine sees is the indented line, which looks like a continuation.
It seems to me that ignoring that newline is a bug, and causes other
problems. For instance, if you have a blank line and then another
header-looking thing (not a continuation) after, it is respected. For
instance, running mailinfo on:
From ...mbox header...
From: Email Author <[email protected]>
Subject: email subject
Date: whatever
From: in-body author <[email protected]>
Subject: in-body subject
And the rest of the message.
will use "in-body subject" as the subject, though I'd have thought we
should stop parsing in-body headers as soon as we see the first in-body
blank line (the one after the in-body "From").
The blank line gets eaten by the check at the beginning of
handle_commit_msg(), right _before_ we pass it off to the
check_inbody_header() function.
It seems like that should be more like:
diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..6d89781eb 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -757,8 +757,10 @@ static int handle_commit_msg(struct mailinfo *mi, struct
strbuf *line)
assert(!mi->filter_stage);
if (mi->header_stage) {
- if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
+ if (!line->len || (line->len == 1 && line->buf[0] == '\n')) {
+ mi->header_stage = 0;
return 0;
+ }
}
if (mi->use_inbody_headers && mi->header_stage) {
But that breaks a bunch of tests. It looks like the loop in handle_body
always starts by feeding the initial header-separator (the real headers,
not the in-body ones) to the various parsers. So that first newline
makes us trigger "no more in-body headers" before we even start parsing
any lines. Oops.
So doing this:
@@ -960,7 +962,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf
*line)
goto handle_body_out;
}
- do {
+ while (!strbuf_getwholeline(line, mi->input, '\n')) {
/* process any boundary lines */
if (*(mi->content_top) && is_multipart_boundary(mi, line)) {
/* flush any leftover */
@@ -1013,7 +1015,7 @@ static void handle_body(struct mailinfo *mi, struct
strbuf *line)
if (mi->input_error)
break;
- } while (!strbuf_getwholeline(line, mi->input, '\n'));
+ }
flush_inbody_header_accum(mi);
on top fixes that. But that still breaks a test that has blank lines at
the beginning of the message, before the in-body header. So probably the
state-machine needs an extra state: sucking up pre-inbody-header
newlines.
-Peff