I wrote:
> Thomas Munro <thomas.mu...@enterprisedb.com> writes:
>> +1, but I wonder if just separating them is enough.  Is our seeding
>> algorithm good enough for this new purpose?  The initial seed is 100%
>> predictable to a logged in user (it's made from the backend PID and
>> backend start time, which we tell you), and not even that hard to
>> guess from the outside, so I think Coverity's warning is an
>> understatement in this case.  Even if we separate the PRNG state used
>> for internal stuff so that users can't clobber its seed from SQL,
>> wouldn't it be possible to predict which statements will survive the
>> log sampling filter given easily available information and a good
>> guess at how many times random() (or whatever similar thing) has been
>> called so far?

> Yeah, that's a good point.  Maybe we should upgrade the per-process
> seed initialization to make it less predictable.  I could see expending
> a call of the strong RNG to contribute some more noise to the seeds
> selected in InitProcessGlobals().

Here's a simple patch to do so.

Looking at this, I seem to remember that we considered doing exactly this
awhile ago, but refrained because there was concern about depleting the
system's reserve of entropy if we have a high backend spawn rate, and it
didn't seem like there was a security reason to insist on unpredictable
random() results.  However, the log-sampling patch destroys the latter
argument.  As for the former argument, I'm not sure how big a deal that
really is.  Presumably, the act of spawning a backend would itself
contribute some more entropy to the pool (particularly if a network
connection is involved), so the depletion problem might be fictitious
in the first place.  Also, a few references I consulted, such as the
Linux urandom(4) man page, suggest that even in a depleted-entropy
state the results of reading /dev/urandom should be random enough
for all but the very strictest security requirements.

Thoughts?

                        regards, tom lane

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index eedc617..2d5a0ac 100644
*** a/src/backend/postmaster/postmaster.c
--- b/src/backend/postmaster/postmaster.c
*************** ClosePostmasterPorts(bool am_syslogger)
*** 2520,2530 ****
  /*
   * InitProcessGlobals -- set MyProcPid, MyStartTime[stamp], random seeds
   *
!  * Called early in every backend.
   */
  void
  InitProcessGlobals(void)
  {
  	MyProcPid = getpid();
  	MyStartTimestamp = GetCurrentTimestamp();
  	MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
--- 2520,2532 ----
  /*
   * InitProcessGlobals -- set MyProcPid, MyStartTime[stamp], random seeds
   *
!  * Called early in the postmaster and every backend.
   */
  void
  InitProcessGlobals(void)
  {
+ 	unsigned int rseed;
+ 
  	MyProcPid = getpid();
  	MyStartTimestamp = GetCurrentTimestamp();
  	MyStartTime = timestamptz_to_time_t(MyStartTimestamp);
*************** InitProcessGlobals(void)
*** 2539,2553 ****
  #endif
  
  	/*
! 	 * Set a different seed for random() in every backend.  Since PIDs and
! 	 * timestamps tend to change more frequently in their least significant
! 	 * bits, shift the timestamp left to allow a larger total number of seeds
! 	 * in a given time period.  Since that would leave only 20 bits of the
! 	 * timestamp that cycle every ~1 second, also mix in some higher bits.
  	 */
! 	srandom(((uint64) MyProcPid) ^
  			((uint64) MyStartTimestamp << 12) ^
! 			((uint64) MyStartTimestamp >> 20));
  }
  
  
--- 2541,2570 ----
  #endif
  
  	/*
! 	 * Set a different seed for random() in every process.  We want something
! 	 * unpredictable, so if possible, use high-quality random bits for the
! 	 * seed.  Otherwise, fall back to a seed based on timestamp and PID.
! 	 *
! 	 * Note we can't use pg_backend_random here, since this is used in the
! 	 * postmaster, and even in a backend we might not be attached to shared
! 	 * memory yet.
  	 */
! #ifdef HAVE_STRONG_RANDOM
! 	if (!pg_strong_random(&rseed, sizeof(rseed)))
! #endif
! 	{
! 		/*
! 		 * Since PIDs and timestamps tend to change more frequently in their
! 		 * least significant bits, shift the timestamp left to allow a larger
! 		 * total number of seeds in a given time period.  Since that would
! 		 * leave only 20 bits of the timestamp that cycle every ~1 second,
! 		 * also mix in some higher bits.
! 		 */
! 		rseed = ((uint64) MyProcPid) ^
  			((uint64) MyStartTimestamp << 12) ^
! 			((uint64) MyStartTimestamp >> 20);
! 	}
! 	srandom(rseed);
  }
  
  

Reply via email to