Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-03-02 Thread Bruce Momjian
gt; > Subject: Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create > Index > > behaviour) > > > > Ron <[EMAIL PROTECTED]> writes: > > > How are we choosing our pivots? > > > > See qsort.c: it looks like median of nine equally spaced input

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Dann Corbit
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Markus Schaber > Sent: Thursday, February 16, 2006 5:45 AM > To: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS] qsort again

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Jonah H. Harris
Last night I implemented a non-recursive introsort in C... let me test it a bit more and then I'll post it here for everyone else to try out.On 2/16/06, Markus Schaber <[EMAIL PROTECTED]> wrote:Hi, Ron, Ron wrote:> ...and of course if you know enough about the data to be sorted so as to> constrain

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-17 Thread Dann Corbit
> -Original Message- > From: [EMAIL PROTECTED] [mailto:pgsql-hackers- > [EMAIL PROTECTED] On Behalf Of Tom Lane > Sent: Wednesday, February 15, 2006 5:22 PM > To: Ron > Cc: pgsql-performance@postgresql.org; pgsql-hackers@postgresql.org > Subject: Re: [HACKERS

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Tom Lane
Mark Lewis <[EMAIL PROTECTED]> writes: > I think we're actually on the same page here; you're right that the > constraint above ( f(a)==f(b) iff a==b ) can't be extended to data types > with more than 32 bits of value space. But the constraint I listed was > actually: > if a==b then f(a)==f(b) I

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Mark Lewis
On Thu, 2006-02-16 at 21:33 -0800, David Lang wrote: > > In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit > > sortKey as elsewhere suggested). The sorting key doesn't need to be a > > one-to-one mapping. > > that would violate your second contraint ( f(a)==f(b) iff (a==b) )

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Martijn van Oosterhout
On Fri, Feb 17, 2006 at 08:18:41AM -0800, Scott Lamb wrote: > On Feb 16, 2006, at 2:17 PM, Mark Lewis wrote: > >Data types which could probably provide a useful function for f > >would be > >int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). > > ...and with some work, floats (

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Scott Lamb
On Feb 16, 2006, at 2:17 PM, Mark Lewis wrote:Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). ...and with some work, floats (I think just the exponent would work, if nothing else). bytea. Probably just ab

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-17 Thread Markus Schaber
Hi, David, David Lang schrieb: >> In SQL_ASCII, just take the first 4 characters (or 8, if using a 64-bit >> sortKey as elsewhere suggested). The sorting key doesn't need to be a >> one-to-one mapping. > that would violate your second contraint ( f(a)==f(b) iff (a==b) ) no, it doesn't. When b

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread David Lang
On Thu, 16 Feb 2006, Mark Lewis wrote: On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote: Data types which could probably provide a useful function for f would be int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). How exactly do you imagine doing this for text? I could s

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Steinar H. Gunderson
On Fri, Feb 17, 2006 at 12:05:23AM +0100, PFC wrote: > I would have said a 64 bit int, but it's the same idea. However it > won't work for floats, which is a pity, because floats fit in 64 bits. Actually, you can compare IEEE floats directly as ints, as long as they're positive. (If

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Markus Schaber
Hi, PFC, PFC schrieb: > By the way, I'd like to declare my zipcode columns as SQL_ASCII > while the rest of my database is in UNICODE, so they are faster to > index and sort. Come on, MySQL does it... Another use case for parametric column definitions - charset definitions - and the first

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 17:51 -0500, Greg Stark wrote: > > > Data types which could probably provide a useful function for f would be > > > int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII). > > How exactly do you imagine doing this for text? > > I could see doing it for char(n)/

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread PFC
It seems that instead of maintaining a different sorting code path for each data type, you could get away with one generic path and one (hopefully faster) path if you allowed data types to optionally support a 'sortKey' interface by providing a function f which maps inputs to 32- bit int output

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Greg Stark
Markus Schaber <[EMAIL PROTECTED]> writes: > Hmm, to remove redundancy, I'd change the <= to a < and define: > > if a==b then f(a)==f(b) > if a > > Data types which could probably provide a useful function for f would be > > int2, int4, oid, and possibly int8 and text (at least for SQL_ASCII).

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 02:17:36PM -0800, Mark Lewis wrote: > It seems that instead of maintaining a different sorting code path for > each data type, you could get away with one generic path and one > (hopefully faster) path if you allowed data types to optionally support > a 'sortKey' interface b

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Markus Schaber
Hi, Mark, Mark Lewis schrieb: > It seems that instead of maintaining a different sorting code path for > each data type, you could get away with one generic path and one > (hopefully faster) path if you allowed data types to optionally support > a 'sortKey' interface by providing a function f whi

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Mark Lewis
On Thu, 2006-02-16 at 12:15 -0500, Tom Lane wrote: > Once or twice we've kicked around the idea of having some > datatype-specific sorting code paths alongside the general-purpose one, > but I can't honestly see this as being workable from a code maintenance > standpoint. > >

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-16 Thread Tom Lane
"Gary Doades" <[EMAIL PROTECTED]> writes: > I think the reason I wasn't seeing performance issues with normal sort > operations is because they use work_mem not maintenance_work_mem which was > only set to 2048 anyway. Does that sound right? Very probable. Do you want to test the theory by jackin

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Tom Lane
"Craig A. James" <[EMAIL PROTECTED]> writes: > You can also use this trick when the optimizer is asked for "fastest first > result." Say you have a cursor on a column of numbers with good > distribution. If you do a bucket sort on the first two or three digits only, > you know the first "page"

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Gary Doades
> "Gary Doades" <[EMAIL PROTECTED]> writes: >> I think the reason I wasn't seeing performance issues with normal sort >> operations is because they use work_mem not maintenance_work_mem which >> was >> only set to 2048 anyway. Does that sound right? > > Very probable. Do you want to test the theor

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Martijn van Oosterhout
On Thu, Feb 16, 2006 at 08:22:55AM -0500, Ron wrote: > 3= Especially in modern systems where the gap between internal CPU > bandwidth and memory bandwidth is so great, the overhead of memory > accesses for comparisons and moves is the majority of the overhead > for both the pivot choosing and th

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-16 Thread Gary Doades
Tom Lane wrote: > I increased the size of the test case by 10x (basically s/10/100/) > which is enough to push it into the external-sort regime. I get > amazingly stable runtimes now --- I didn't have the patience to run 100 > trials, but in 30 trials I have slowest 11538 msec and fastest

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Neil Conway
On Wed, 2006-02-15 at 18:28 -0500, Tom Lane wrote: > It seems clear that our qsort.c is doing a pretty awful job of picking > qsort pivots, while glibc is mostly managing not to make that mistake. > I haven't looked at the glibc code yet to see what they are doing > differently. glibc qsort is act

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index

2006-02-15 Thread Simon Riggs
On Wed, 2006-02-15 at 19:59 -0500, Tom Lane wrote: > I get > amazingly stable runtimes now --- I didn't have the patience to run 100 > trials, but in 30 trials I have slowest 11538 msec and fastest 11144 msec. > So this code path is definitely not very sensitive to this data > distribution. "The

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
I wrote: > Gary Doades <[EMAIL PROTECTED]> writes: >> Ouch! That confirms my problem. I generated the random test case because >> it was easier than including the dump of my tables, but you can >> appreciate that tables 20 times the size are basically crippled when it >> comes to creating an ind

Re: [HACKERS] qsort again (was Re: [PERFORM] Strange Create Index behaviour)

2006-02-15 Thread Tom Lane
Gary Doades <[EMAIL PROTECTED]> writes: > Ouch! That confirms my problem. I generated the random test case because > it was easier than including the dump of my tables, but you can > appreciate that tables 20 times the size are basically crippled when it > comes to creating an index on them. Ac