retitle 244962 mawk: buffer overflow in collect_RE from overlong regexp
thanks

Hi Ian and others,

Ian Jackson wrote:

> --- mawk-1.3.3.orig/scan.c
> +++ mawk-1.3.3/scan.c
> @@ -1033,6 +1033,15 @@
>     STRING *sval ;
>  
>     while (1)
> +   {
> +      if (p == string_buff + SPRINTF_SZ - 2)
> +      {
> +          compile_error(
> +                       "regular expression /%.10s ..."
> +                       " exceeds implementation size limit",
> +                       string_buff) ;
> +       mawk_exit(2) ;
> +      }
>        switch (scan_code[*p++ = next()])
>        {
>        case SC_DIV:           /* done */
> @@ -1070,6 +1079,7 @@
>           }
>           break ;
>        }
> +   }
>  
>  out:
>     /* now we've got the RE, so compile it */

Thanks for the report and patch!  Thomas Dickey applied this and
several other patches in mawk 1.3.3-20090711.  Modulo whitespace and
addressing some pedantic compiler warnings, his patch was the same.

The patch uses == to check for the end of the buffer, but
unfortunately that check can be bypassed by using an escape sequence
that straddles the boundary.  You can check this with

# 640 backslashes
backslashes='\\\\\\\\\\'
backslashes="$backslashes$backslashes$backslashes$backslashes"
backslashes="$backslashes$backslashes$backslashes$backslashes"
backslashes="$backslashes$backslashes$backslashes$backslashes"
mawk "/a$backslashes/"

Fixed by
http://git.debian.org/?p=collab-maint/mawk.git;a=commitdiff;h=e2e6d7ad
(collect_RE(): handle overflow when escape straddles boundary,
2010-01-24, applied upstream in mawk 1.3.4-20100131.

On the use of SPRINTF_SZ: SPRINTF_SZ is sizeof(tempbuff), while
MIN_SPRINTF is the size of its _string_buff member.  Deciding the size
of the sprintf buffer is what SPRINTF_SZ is for, so I think you are
right to use it.

The C standard at least strongly implies that
(void *)_string_buff == (void *) &tempbuff in this case. [1]

 union {
     STRING *_split_buff[MAX_SPLIT];
     char _string_buff[MIN_SPRINTF];
 } tempbuff;

Unfortunately, it is not nearly so clear that reading and writing
arbitrary bytes past the end of _string_buff is permitted in this
case [2].  So MIN_SPRINTF produces more strictly correct C code, at
the expense of being wasteful in some situations and stricter here.
Anyway, whatever the right choice is, it should be used to set
SPRINTF_SZ globally in my opinion.

Thomas made the switch from SPRINTF_SZ to MIN_SPRINTF in
mawk 1.3.3-20090712.

Hope that helps,
Jonathan

[1] See, for instance, WG14 N1425 § 6.7.2.1 paragraph 16.
[2] See § 6.5.7 paragraph 8.



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org

Reply via email to