On Thu, 19 Dec 2013, Stefan Esser wrote:

Log:
 Fix overflow for timeout values of more than 68 years, which is the maximum
 covered by sbintime (LONG_MAX seconds).

Not LONG_MAX seconds, but INT32_MAX seconds.  LONG_MAX seconds is about 2**32
times larger on 64-bit arches.

sbintimes and their complexity give many more possibilities for overflow.
I thought that the new overflow bugs and some old ones were already  fixed.

 Some programs use timeout values in excess of 1000 years. The conversion
 to sbintime caused wrap-around on overflow, which resulted in short or
 negative timeout values. This caused long delays on sockets opened by
 affected programs (e.g. OpenSSH).

 Kernels compiled without -fno-strict-overflow were not affected, apparently
 because the compiler tested the sign of the timeout value before performing
 the multiplication that lead to overflow.

I think it just gave a garbage value that was accidentally less harmful.

 When the -fno-strict-overflow option was added to CFLAGS, this optimization
 was disabled and the test was performed on the result of the multiplication.
 Negative products were caught and resulted in EINVAL being returned, but
 wrap-around to positive values just shortened the timeout value to the
 residue of the result that could be represented by sbintime.

This shows one reason why -fno-strict-overflow shouldn't be used.  It just
wastes time to give different undefined behaviour with undocumented details.

 The fix is to cap the timeout values at the maximum that can be represented
 by sbintime, which is 2^31 - 1 seconds or more than 68 years.

2^31 - 1 is a correct spelling of INT32_MAX, unlike LONG_MAX.

 After this change, the kernel can be compiled with -fno-strict-overflow
 with no ill effects.

Modified: head/sys/kern/kern_event.c
==============================================================================
--- head/sys/kern/kern_event.c  Thu Dec 19 07:33:07 2013        (r259608)
+++ head/sys/kern/kern_event.c  Thu Dec 19 09:01:46 2013        (r259609)
@@ -526,7 +526,8 @@ knote_fork(struct knlist *list, int pid)
static __inline sbintime_t
timer2sbintime(intptr_t data)
{
-
+       if (data > LLONG_MAX / SBT_1MS)
+               return LLONG_MAX;
        return (SBT_1MS * data);
}

This has the following style bugs:
- it removes the empty line after the (null) declarations
- it is missing parentheses around the return value
- it uses the long long abomination

This has the following type errors:
- using the long long abomination is not just a style bug.  sbintime_t has
  type int64_t, not long long.  The overflow check will break if long long
  becomes longer than int64_t
- returning LLONG_MAX is usually just a style bug.  When long long is longer
  than int64_t, the return value will overflow, but the implementation-defined
  behaviour for this is usually to convert it to the correct value.

INT64_MAX is a dangerous default maximum timeout.  You can't even add the
minimum sbintime_t of 2**-32 seconds to this without overflowing.

Old timeout code used the following method to reduce the problem of
overflowing timeouts:
  For alarm() and setitimer(), limit the timeout to 100 million seconds
  (3.17 years) and return EINVAL if it is too large.  This works with
  32-bit time_t until about 2035.  POSIX doesn't allow this.  Clamping
  the time to 100 million seconds isn't allowed either, since the
  residual time can be seen from applications.  Note that 64-bit time_t
  has little effect on this problem.  Uses wishing to overflow the kernel
  simply ask for a timeout in a timeval of INT64_MAX seconds plus 999999
  seconds so that adding just 1 microseconds to it overflow.  With
  seconds in sbintime_t instead of in time_t, overflow occurs much
  sooner.

This is quite broken now:
- the limit of 100000000 used to be enforced in itimerfix(), but this was
  removed in connection with implementing POSIX realtime timers without
  even breaking the documentaion to match or touching PR(s) about the
  POSIX non-conformance of the limit.
- setitimer() mostly uses sbintimes now, so the old limit is irrelevant
  and enforcing it in itimerfix() would break the new sbintime code and
  presumably the not so new realtime timers code.  The sbintime code
  uses a modified limit that is also not allowed by POSIX.  This limit is
  of course undocumented.  It is INT32_MAX / 2, so that there is plenty
  of room for expansion.  This is what the above should use, modulo the
  conformance bug.  EINVAL is returned when this limit is exceeded.
- select() and poll() were more careful and seem to be correct now.
  The timeout in poll() is limited to INT_MAX milliseconds, so it fits
  in sbintime_t.  Except this assumes that int is 32 bits.  select()
  limits the timeout to INT32_MAX seconds so that it is representable
  as an sbintime_t and then does a further overflow check involving
  INT64_MAX.  When overflow would occur, it sets asbt to -1, which I
  think means an infinite timeout.  I don't like this.  -1 is a tiny
  amount less than 0, not near plus infinity.
- nanosleep() used to be less careful than select() and poll(), but seems
  to have been fixed.  For some reason, it uses a mixture of the above
  methods.  It limits sleep times to INT32_MAX / 2 seconds but repeats
  the sleep as necessary so as not to return an error.  The logic for
  this is subtle and I haven't checked it.

davide@ has patches related to fixing this, but none seem to have been
committed yet, except some old ones that gave some of the above overflow
checking.

Bruce
_______________________________________________
svn-src-head@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"

Reply via email to