On Fri, Mar 31, 2017 at 05:52:00PM -0700, Linus Torvalds wrote:

> Ok, did a bisect, and this bisects to commit 6b4b013f1884 ("mailinfo:
> handle in-body header continuations").
> 
> The continuation logic is oddly complex, and I can't follow the logic.
> But it is completely broken in how it thinks empty lines are somehow
> "continuations".

I think the continuation logic is OK. The problem is that the newline is
never fed to check_inbody_header() at all. So the next line its state
machine sees is the indented line, which looks like a continuation.

It seems to me that ignoring that newline is a bug, and causes other
problems. For instance, if you have a blank line and then another
header-looking thing (not a continuation) after, it is respected. For
instance, running mailinfo on:

  From ...mbox header...
  From: Email Author <aut...@example.com>
  Subject: email subject
  Date: whatever

  From: in-body author <aut...@example.com>

  Subject: in-body subject

  And the rest of the message.

will use "in-body subject" as the subject, though I'd have thought we
should stop parsing in-body headers as soon as we see the first in-body
blank line (the one after the in-body "From").

The blank line gets eaten by the check at the beginning of
handle_commit_msg(), right _before_ we pass it off to the
check_inbody_header() function.

It seems like that should be more like:

diff --git a/mailinfo.c b/mailinfo.c
index a489d9d0f..6d89781eb 100644
--- a/mailinfo.c
+++ b/mailinfo.c
@@ -757,8 +757,10 @@ static int handle_commit_msg(struct mailinfo *mi, struct 
strbuf *line)
        assert(!mi->filter_stage);
 
        if (mi->header_stage) {
-               if (!line->len || (line->len == 1 && line->buf[0] == '\n'))
+               if (!line->len || (line->len == 1 && line->buf[0] == '\n')) {
+                       mi->header_stage = 0;
                        return 0;
+               }
        }
 
        if (mi->use_inbody_headers && mi->header_stage) {


But that breaks a bunch of tests. It looks like the loop in handle_body
always starts by feeding the initial header-separator (the real headers,
not the in-body ones) to the various parsers. So that first newline
makes us trigger "no more in-body headers" before we even start parsing
any lines. Oops.

So doing this:

@@ -960,7 +962,7 @@ static void handle_body(struct mailinfo *mi, struct strbuf 
*line)
                        goto handle_body_out;
        }
 
-       do {
+       while (!strbuf_getwholeline(line, mi->input, '\n')) {
                /* process any boundary lines */
                if (*(mi->content_top) && is_multipart_boundary(mi, line)) {
                        /* flush any leftover */
@@ -1013,7 +1015,7 @@ static void handle_body(struct mailinfo *mi, struct 
strbuf *line)
 
                if (mi->input_error)
                        break;
-       } while (!strbuf_getwholeline(line, mi->input, '\n'));
+       }
 
        flush_inbody_header_accum(mi);
 

on top fixes that. But that still breaks a test that has blank lines at
the beginning of the message, before the in-body header. So probably the
state-machine needs an extra state: sucking up pre-inbody-header
newlines.

-Peff

Reply via email to