Re: [HACKERS] gistchoose vs. bloat

Tom Lane Thu, 24 Jan 2013 13:08:12 -0800

Heikki Linnakangas <hlinnakan...@vmware.com> writes:
> BTW, one thing that I wondered about this: How expensive is random()? 
> I'm assuming not very, but I don't really know. Alexander's patch called 
> random() for every tuple on the page, while I call it only once for each 
> equal-penalty tuple. If it's really cheap, I think my/Tom's patch could 
> be slightly simplified by always initializing keep_current_best with 
> random() at top of the function, and eliminating the -1 "haven't decided 
> yet" state.


I thought about that too, and concluded that random() is probably
expensive enough that we don't want to call it unnecessarily.

> OTOH, if random() is expensive, I note that we only need one 
> bit of randomness, so we could avoid calling random() so often if we 
> utilized all the bits from each random() call.

Meh.  That would hard-wire the decision that the probability of keeping
a best tuple is exactly 0.5.  I'd rather keep the flexibility to tune it
later.  The way your patch is set up, it seems unlikely that it will
call random() very many times per page, so I'm not that concerned about
minimizing the number of calls further.  (Also, in the probably-common
case that there are no exactly equally good alternatives, this would
actually be a net loss since it would add one unnecessary call.)

So basically, Alexander's method requires more random() calls and fewer
penalty() calls than yours, at least in typical cases.  It's hard to say
which is faster without benchmarking, and it would matter which opclass
you were testing on.

                        regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gistchoose vs. bloat

Reply via email to