On Sun, Aug 25, 2019 at 04:10:55AM -0400, Jeff King wrote:
> Fast-import's read_next_command() has somewhat odd memory ownership
> semantics for the command_buf strbuf. After reading a command, we copy
> the strbuf's pointer (without duplicating the string) into our cmd_hist
> array of recent commands. And then when we're about to read a new
> command, we clear the strbuf by calling strbuf_detach(), dropping
> ownership from the strbuf (leaving the cmd_hist reference as the
> remaining owner).
>
> This has a few surprising implications:
>
>   - if the strbuf hasn't been copied into cmd_hist (e.g., because we
>     haven't ready any commands yet), then the strbuf_detach() will leak
>     the resulting string
>
>   - any modification to command_buf risks invalidating the pointer held
>     by cmd_hist. There doesn't seem to be any way to trigger this
>     currently (since we tend to modify it only by detaching and reading
>     in a new value), but it's subtly dangerous.
>
>   - any pointers into an input string will remain valid as long as
>     cmd_hist points to them. So in general, you can point into
>     command_buf.buf and call read_next_command() up to 100 times before
>     your string is cycled out and freed, leaving you with a dangling
>     pointer. This makes it easy to miss bugs during testing, as they
>     might trigger only for a sufficiently large commit (e.g., the bug
>     fixed in the previous commit).
>
> Instead, let's make a new string to copy the command into the history
> array, rather than having dual ownership with the old. Then we can drop
> the strbuf_detach() calls entirely, and just reuse the same buffer
> within command_buf over and over. We'd normally have to strbuf_reset()
> it before using it again, but in both cases here we're using
> strbuf_getline(), which does it automatically for us.
>
> This fixes the leak, and it means that even a single call to
> read_next_command() will invalidate any held pointers, making it easier
> to find bugs. In fact, we can drop the extra input lines added to the
> test case by the previous commit, as the unfixed bug would now trigger
> just from reading the commit message, even without any modified files in
> the commit.
>
> Reported-by: Mike Hommey <m...@glandium.org>
> Signed-off-by: Jeff King <p...@peff.net>
> ---
>  fast-import.c          | 4 +---
>  t/t9300-fast-import.sh | 5 -----
>  2 files changed, 1 insertion(+), 8 deletions(-)
>
> diff --git a/fast-import.c b/fast-import.c
> index ee7258037a..1f9160b645 100644
> --- a/fast-import.c
> +++ b/fast-import.c
> @@ -1763,7 +1763,6 @@ static int read_next_command(void)
>  		} else {
>  			struct recent_command *rc;
>  
> -			strbuf_detach(&command_buf, NULL);
>  			stdin_eof = strbuf_getline_lf(&command_buf, stdin);
>  			if (stdin_eof)
>  				return EOF;
> @@ -1784,7 +1783,7 @@ static int read_next_command(void)
>  				free(rc->buf);
>  			}
>  
> -			rc->buf = command_buf.buf;
> +			rc->buf = xstrdup(command_buf.buf);

You could xstrndup(command_buf.buf, command_buf.len), which would avoid
a hidden strlen.

Mike
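
To make the suggestion concrete, here is a rough standalone sketch of the
two ways to duplicate the line for the history array. It does not use
git's strbuf or its x*() wrappers; struct line_buf, dup_scan() and
dup_known_len() are hypothetical names invented for this illustration,
and plain malloc() stands in for git's allocation helpers.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a strbuf: a NUL-terminated buffer that also
 * records its own length. Not git's actual struct. */
struct line_buf {
	char *buf;
	size_t len;
};

/* Duplicate by scanning: strlen() walks the string again to find its end,
 * even though the caller already knows the length. */
static char *dup_scan(const struct line_buf *lb)
{
	size_t n;
	char *copy;

	assert(lb->buf != NULL);
	n = strlen(lb->buf);
	copy = malloc(n + 1);
	if (!copy)
		abort();
	memcpy(copy, lb->buf, n + 1); /* includes the trailing NUL */
	return copy;
}

/* Duplicate using the recorded length: no scan for the terminator.
 * Note: the result matches dup_scan() only when the buffer has no
 * embedded NUL bytes before lb->len. */
static char *dup_known_len(const struct line_buf *lb)
{
	char *copy;

	assert(lb->buf != NULL);
	copy = malloc(lb->len + 1);
	if (!copy)
		abort();
	memcpy(copy, lb->buf, lb->len);
	copy[lb->len] = '\0';
	return copy;
}
```

Either way the history entry owns its own copy, so reusing or overwriting
the input buffer afterwards cannot change what the history holds; the
only difference is whether the length is recomputed.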