Added to TODO:
> * Consider being smarter about memory and external files used during
> sorts
>
> http://archives.postgresql.org/pgsql-hackers/2007-11/msg01101.php
> http://archives.postgresql.org/pgsql-hackers/2007-12/msg00045.php
Ron Mayer wrote:
Or do you mean being able to perform parts of the query plan fully in
parallel? If this, then one would need a lot more than ParallelSort...
I wouldn't recommend that - it seems like a Hard Problem.
Isn't it the case that the implicit unions from processing partitioned tables…
"Brian Hurt" <[EMAIL PROTECTED]> writes:
> 3) It's possible to perform the sort lazily. You have the initial O(N) pass
> over the list, but then each block is only O(log N) cost. If it's likely that
> only the first part of the result is needed, then much of the work can be
> avoided.
Now that's…
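To make point 3 concrete, here is a minimal sketch in plain C (not PostgreSQL code; the int keys and function names are illustrative) of lazily emitting just the first k results from a binary min-heap: the O(N) heapify happens once up front, each emitted value costs O(log N), so a LIMIT-style consumer pays O(N + k log N) rather than the full O(N log N). It is a related property to what 8.3's top-N heapsort for LIMIT queries exploits.

    #include <stdio.h>

    /* Restore the min-heap property below node i. */
    static void sift_down(int *h, size_t n, size_t i)
    {
        for (;;)
        {
            size_t l = 2 * i + 1, r = l + 1, m = i;
            if (l < n && h[l] < h[m]) m = l;
            if (r < n && h[r] < h[m]) m = r;
            if (m == i) return;
            int tmp = h[i]; h[i] = h[m]; h[m] = tmp;
            i = m;
        }
    }

    /* Print the k smallest of a[0..n-1] in ascending order. */
    static void lazy_sort_prefix(int *a, size_t n, size_t k)
    {
        for (size_t i = n / 2; i-- > 0;)   /* O(N) heapify pass */
            sift_down(a, n, i);
        while (k-- > 0 && n > 0)           /* O(log N) per result */
        {
            printf("%d\n", a[0]);          /* current minimum */
            a[0] = a[--n];                 /* move last leaf to root */
            sift_down(a, n, 0);
        }
    }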
Brian Hurt wrote:
While we're blue skying things, I've had an idea for a sorting
algorithm kicking around for a couple of years that might be
interesting. It's a variation on heapsort to make it significantly
more block-friendly. I have no idea if the idea would work, or how
well it'd work, but it might be worth…
Jeff Davis wrote:
Where parallel processing like this becomes attractive is when you're running
a 2 hour query on a machine sequentially running scheduled batch jobs which
can be sped up to 30 minutes. But in that case you're almost certainly being
limited by your disk bandwidth, not your cpu speed…
On Thu, 2007-12-20 at 01:26 +0000, Gregory Stark wrote:
> I suspect most of that is spent just copying the data around. Which would not
> be helped by having multiple threads doing the copying -- and in fact might be
> exacerbated if it required an extra copy to consolidate all the data in the
> end…
Jeff Davis wrote:
On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote:
That sounds possible, but I still feel myself suspecting that disk
reads will be much slower than localized text comparison. Perhaps I am
overestimating the performance of the comparison function?
I think this simple test will change your mind…
Gregory Stark wrote:
> Note that speeding up a query from 20s to 5s isn't terribly useful.
I disagree totally with that.
That is the difference between no chance of someone waiting for a web
page to load and a good chance they'd wait. And 2s vs 0.5s is the
difference between a web site that…
On Thu, 20 Dec 2007, Martijn van Oosterhout wrote:
The way around this is a NUMA architecture, but that's a whole
other ball of wax.
Quick note for those reading Ulrich's paper: he refers in a couple of
places to Intel's upcoming CSI approach to NUMA. This has now been
renamed QuickPath…
While we're blue skying things, I've had an idea for a sorting algorithm
kicking around for a couple of years that might be interesting. It's a
variation on heapsort to make it significantly more block-friendly. I
have no idea if the idea would work, or how well it'd work, but it might
be worth…
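The preview cuts off before the details of Brian's variation, so the following is only a guess at the flavor, not his design: a d-ary heap is one standard way to make heapsort more block-friendly, because each node's d children sit contiguously and one sift-down level touches a single block rather than d scattered ones. Plain C sketch; the value of D and the int keys are assumptions.

    #include <stddef.h>

    #define D 8   /* children per node; tune to the block/cache-line size */

    static void dary_sift_down(int *h, size_t n, size_t i)
    {
        for (;;)
        {
            size_t first = D * i + 1;              /* first child of i */
            if (first >= n)
                return;                            /* i is a leaf */
            size_t last = first + D < n ? first + D : n;
            size_t m = first;
            for (size_t c = first + 1; c < last; c++)
                if (h[c] < h[m]) m = c;            /* smallest child */
            if (h[m] >= h[i])
                return;
            int tmp = h[i]; h[i] = h[m]; h[m] = tmp;
            i = m;
        }
    }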
On Wed, Dec 19, 2007 at 07:17:21PM -0800, Ron Mayer wrote:
> > but it doesn't follow that you will get
> > 10X improvement with 10 threads, or even 4X with 4.
>
> Yeah - unless those 10 cores have additional I/O to the
> memories compared to a 1 core system (which I'd hope
> would be the case or…
On Thu, 20 Dec 2007, Gregory Stark wrote:
Surely such machines have kickass memory backplanes too though? How could it
ever be reasonable to have an i/o controller with more bandwidth than your
memory?
Dann had the right general numbers here--max of 6.4GB/s between processors
and you might…
Tom Lane wrote:
> ...I can believe that suitable test cases would show
> 2X improvement for 2 threads,
One other thing I found interesting is that their test case
showed a near 2X improvement for hyperthreading; where I haven't
heard of many other ways to get hyperthreading to show improvements
for…
"Greg Smith" <[EMAIL PROTECTED]> writes:
> On Wed, 19 Dec 2007, Dann Corbit wrote:
>
>> Benchmarking a single system will really only explain that system.
>> Someone may have a disk farm with 2GB/Sec throughput
>> But such a configuration is very unlikely.
>
> If you believe comments like those at…
"Dann Corbit" <[EMAIL PROTECTED]> writes:
>> Note that speeding up a query from 20s to 5s isn't terribly useful. If it's
>> OLTP you can't be using all your cores for each user anyways. And if it's
>> DSS 20s isn't a problem.
>
> Unless (of course) there are 20,000 users doing the queries that would…
On Wed, 19 Dec 2007, Dann Corbit wrote:
Benchmarking a single system will really only explain that system.
Someone may have a disk farm with 2GB/Sec throughput
But such a configuration is very unlikely.
If you believe comments like those at
http://www.c0t0d0s0.org/archives/1792-Do-it-yourself
"Jeff Davis" <[EMAIL PROTECTED]> writes:
> test=> explain analyze select * from sorter order by t;
> test=> explain analyze select * from sorter order by b;
> test=> explain analyze select * from sorter order by f;
>
> On my machine this table fits easily in memory (so there aren't any disk
> reads…
Mark Mielke <[EMAIL PROTECTED]> writes:
> Jeff Davis wrote:
>> Also, there is probably a lot of memory copying going on, and that
>> probably destroys a lot of the effectiveness of L2 caching. When L2
>> caching is ineffective, the CPU spends a lot of time just waiting on
> memory. In that case…
On Wed, 2007-12-19 at 15:19 -0800, Dann Corbit wrote:
> The algorithm that I am suggesting will take exactly one pass to merge
> all of the files.
>
From tuplesort.c:
"In the current code we determine the number of tapes M on the basis of
workMem: we want workMem/M to be large enough that we read a fair amount
of data each time we preread from a tape…"
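To put illustrative numbers on that trade-off (these figures are not from the thread): with workMem = 16MB and a target of at least 256kB of sequential preread per tape, M caps out around 64. A sort that produced 500 runs would then need intermediate merge passes; letting M grow to 500 would merge everything in one pass, but with only about 32kB of buffer per tape the merge degenerates into seek-bound I/O, which is exactly the trade-off the comment describes.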
P.S.
A beautiful paper on replacement selection is found here:
http://students.fim.uni-passau.de/~fickensc/Proseminar/Proseminar.pdf
On Wed, 2007-12-19 at 14:41 -0800, Dann Corbit wrote:
> As long as sorting improvements are being considered, may I suggest an
> experiment that uses a very simple model?
>
> Assuming that you have K subfiles created by the initial sorting pass,
> insert the top record of each file into a priority queue…
On Wed, 2007-12-19 at 15:51 -0500, Mark Mielke wrote:
> That sounds possible, but I still feel myself suspecting that disk
> reads will be much slower than localized text comparison. Perhaps I am
> overestimating the performance of the comparison function?
I think this simple test will change your mind…
As long as sorting improvements are being considered, may I suggest an
experiment that uses a very simple model?
Assuming that you have K subfiles created by the initial sorting pass,
insert the top record of each file into a priority queue.
Then, emit records from the queue until the priority queue…
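A minimal sketch of that experiment in plain C (fixed-size int records and FILE* subfiles are simplifications; a real test would use PostgreSQL's tuples and tapes): prime a min-heap with the head record of each of the K subfiles, then repeatedly emit the minimum and refill from whichever subfile it came from, which merges all K subfiles in exactly one pass.

    #include <stdio.h>
    #include <stdlib.h>

    typedef struct
    {
        int   key;   /* current head record of one subfile */
        FILE *src;   /* the subfile it was read from */
    } HeapEntry;

    static void entry_sift_down(HeapEntry *h, size_t n, size_t i)
    {
        for (;;)
        {
            size_t l = 2 * i + 1, r = l + 1, m = i;
            if (l < n && h[l].key < h[m].key) m = l;
            if (r < n && h[r].key < h[m].key) m = r;
            if (m == i) return;
            HeapEntry tmp = h[i]; h[i] = h[m]; h[m] = tmp;
            i = m;
        }
    }

    /* One-pass K-way merge of k sorted subfiles into out. */
    static void merge_k(FILE **subfile, size_t k, FILE *out)
    {
        HeapEntry *heap = malloc(k * sizeof(HeapEntry));
        size_t n = 0;

        for (size_t i = 0; i < k; i++)       /* prime with each head */
            if (fread(&heap[n].key, sizeof(int), 1, subfile[i]) == 1)
            {
                heap[n].src = subfile[i];
                n++;
            }
        for (size_t i = n / 2; i-- > 0;)
            entry_sift_down(heap, n, i);

        while (n > 0)
        {
            fwrite(&heap[0].key, sizeof(int), 1, out);  /* emit minimum */
            if (fread(&heap[0].key, sizeof(int), 1, heap[0].src) != 1)
                heap[0] = heap[--n];         /* that subfile is drained */
            entry_sift_down(heap, n, 0);
        }
        free(heap);
    }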
Jeff Davis wrote:
On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote:
Stupid question #2: Is it well recognized that the CPU is the
bottleneck in the PostgreSQL sorting mechanism? Or might it be memory
bandwidth and I/O?
I think it depends a lot on several factors. It's probably a different bottleneck…
Ron Mayer wrote:
The link in the beginning of the thread points to articles
that seem to describe one such algorithm; along with benchmarks.
(http://tinyurl.com/3bvu4u, http://tinyurl.com/32wg2m)
The improvements were pretty consistent across set sizes ranging
from very small sets (hundreds) to quite…
Mark Mielke wrote:
> I am curious - what algorithms exist to efficiently do a parallel sort?
> Do you mean if sorting 1 million items, it is possible to separate this
> into 2 sets of 500 thousand each, execute them in separate threads
> (with task administration and synchronization overhead), combine…
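As a strawman answer to Mark's question, the simplest form is sketched below in plain C (pthreads and qsort() stand in for whatever the backend would really use; this is not a proposal for PostgreSQL code): sort the two halves in parallel, then do one sequential merge. Note the final merge is still single-threaded, pure memory copying, which is where the bandwidth objections elsewhere in the thread bite.

    #include <pthread.h>
    #include <stdlib.h>
    #include <string.h>

    typedef struct { int *base; size_t n; } Span;

    static int cmp_int(const void *a, const void *b)
    {
        int x = *(const int *) a, y = *(const int *) b;
        return (x > y) - (x < y);
    }

    static void *sort_span(void *arg)
    {
        Span *s = arg;
        qsort(s->base, s->n, sizeof(int), cmp_int);
        return NULL;
    }

    /* Sort a[0..n-1] using two threads; scratch must hold n ints. */
    static void parallel_sort2(int *a, size_t n, int *scratch)
    {
        Span left = { a, n / 2 }, right = { a + n / 2, n - n / 2 };
        pthread_t t;

        pthread_create(&t, NULL, sort_span, &left);  /* one half in a thread */
        sort_span(&right);                           /* other half here */
        pthread_join(t, NULL);

        /* single sequential merge of the two sorted halves */
        size_t i = 0, j = n / 2, k = 0;
        while (i < n / 2 && j < n)
            scratch[k++] = a[i] <= a[j] ? a[i++] : a[j++];
        while (i < n / 2) scratch[k++] = a[i++];
        while (j < n)     scratch[k++] = a[j++];
        memcpy(a, scratch, n * sizeof(int));
    }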
On Wed, 2007-12-19 at 12:08 -0500, Mark Mielke wrote:
> Stupid question #2: Is it well recognized that the CPU is the
> bottleneck in the PostgreSQL sorting mechanism? Or might it be memory
> bandwidth and I/O?
>
I think it depends a lot on several factors. It's probably a different
bottleneck for…
Michał Zaborowski wrote:
OK - we want to sort a table with quicksort, on N threads. Each thread
gets the begin and end indices of its slice of the table. The first step
starts at 0 and ends at count - 1. A single step: find a medium value and
move lower values before it and bigger ones after. In normal…
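The single step described above is an in-place partition. A Hoare-style version in plain C might look like the sketch below (the middle element stands in for the "medium value"; in the scheme described, the two returned halves would then go to different threads):

    #include <stddef.h>

    /* Partition a[lo..hi] so lower values end up left of the split
     * and bigger values right of it; returns the split point. */
    static size_t partition_step(int *a, size_t lo, size_t hi)
    {
        int pivot = a[lo + (hi - lo) / 2];   /* "medium" element */
        size_t i = lo, j = hi;

        for (;;)
        {
            while (a[i] < pivot) i++;        /* lower values stay left */
            while (a[j] > pivot) j--;        /* bigger values stay right */
            if (i >= j)
                return j;                    /* hand [lo..j] and [j+1..hi]
                                              * to separate threads */
            int tmp = a[i]; a[i] = a[j]; a[j] = tmp;
            i++; j--;
        }
    }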
Andreas Joseph Krogh <[EMAIL PROTECTED]> writes:
> And remember: users don't care about portability issues, they care about
> performance.
Nonsense. If Postgres stops working on their machine, they'll care.
regards, tom lane
On Tuesday 18 December 2007 10:03:25 Dimitri Fontaine wrote:
> Hi,
>
> On Tuesday 18 December 2007, Ron Mayer wrote:
> > Has anyone looked into sorting algorithms that could use
> > more than one CPU or core at a time?
>
> [...]
>
> > PS: Yeah, I know multi-threading is a hot-button on these
> > lists…
2007/12/19, Mark Mielke <[EMAIL PROTECTED]>:
> Jeff Davis wrote:
> > My first thought would be that we would need a new executor node (e.g.
> > "ParallelSort") that would only be chosen when the cost of the sort is
> > large enough to outweigh other factors (such as process creation time,
> > dividing available work_mem, and any necessary IPC).
Jeff Davis wrote:
My first thought would be that we would need a new executor node (e.g.
"ParallelSort") that would only be chosen when the cost of the sort is
large enough to outweigh other factors (such as process creation time,
dividing available work_mem, and any necessary IPC).
It seems to…
On Tue, 2007-12-18 at 09:31 +0000, Simon Riggs wrote:
> On Mon, 2007-12-17 at 16:34 -0800, Ron Mayer wrote:
>
> > PS: Yeah, I know multi-threading is a hot-button on these
> > lists; but sorting seems a relatively isolated part of the code
> > and I'd wonder if it'd be isolate-able enough that multiple
> > CPUs could be used there.
On Mon, 2007-12-17 at 16:34 -0800, Ron Mayer wrote:
> PS: Yeah, I know multi-threading is a hot-button on these
> lists; but sorting seems a relatively isolated part of the code
> and I'd wonder if it'd be isolate-able enough that multiple
> CPUs could be used there.
I'm not sure multi-threading is the…
Hi,
On Tuesday 18 December 2007, Ron Mayer wrote:
> Has anyone looked into sorting algorithms that could use
> more than one CPU or core at a time?
[...]
> PS: Yeah, I know multi-threading is a hot-button on these
> lists; but sorting seems a relatively isolated part of the code
> and I'd wonder if it'd be isolate-able enough that multiple
> CPUs could be used there.
Has anyone looked into sorting algorithms that could use
more than one CPU or core at a time?
Benchmarks I see[1][2] suggest that sorting is an area that
improves greatly with multiple processors and even with
multi-threading on a single core processor.
"For 1-processor and 2-threads (1p2t), t
On Mon, 2007-12-03 at 20:40 +0000, Gregory Stark wrote:
> I'm not sure where the idea of keeping the current bounds of the input tapes
> comes into it. We only preread when we run out of tuples anyways and then we
> don't really have a choice about which tape we want to preread from. And it's
> a good…
On Mon, 2007-12-03 at 20:40 +0000, Gregory Stark wrote:
> So the question is just how many seeks are we doing during sorting. If we're
> doing 0.1% seeks and 99.9% sequential i/o then eliminating the 1% entirely
> (which we can't do) isn't going to speed up seeking all that much. If we're
> doing 2…
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> The buffer size at max tapes is an optimum - a trade off between
> avoiding intermediate merging and merging efficiently. Freeing more
> memory is definitely going to help in the case of low work_mem and lots
> of runs.
I can't follow these abstract arguments…
On Mon, 2007-12-03 at 20:40 +0000, Gregory Stark wrote:
> I think the way to do what you're proposing is...
Don't understand that. Algorithm F covers that already, doesn't it?
> So the question is just how many seeks are we doing during sorting. If we're
> doing 0.1% seeks and 99.9% sequential i/o then eliminating the 1% entirely…
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> On Mon, 2007-12-03 at 10:32 -0800, Jeff Davis wrote:
>
>> If I understand correctly, we just keep track of the maximum value of
>> the last block read from each run, and then always read from the run in
> which the last block read has the lowest maximum.
On Mon, 2007-12-03 at 10:32 -0800, Jeff Davis wrote:
> If I understand correctly, we just keep track of the maximum value of
> the last block read from each run, and then always read from the run in
> which the last block read has the lowest maximum.
Yep, sounds like Algorithm F
> It seems as if…
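For reference, the forecasting rule being described might be sketched like this in plain C (the struct and field names are made up, not tuplesort.c's): remember the largest key in the last block read from each run, and preread next from the run whose remembered maximum is smallest, because that run's buffered tuples will be exhausted first.

    #include <stdbool.h>

    typedef struct
    {
        int  last_block_max;   /* largest key in the block read last */
        bool exhausted;        /* no more blocks in this run */
    } RunState;

    /* Pick the run to preread from next: lowest maximum wins. */
    static int choose_run_to_preread(const RunState *run, int nruns)
    {
        int best = -1;

        for (int i = 0; i < nruns; i++)
        {
            if (run[i].exhausted)
                continue;
            if (best < 0 || run[i].last_block_max < run[best].last_block_max)
                best = i;
        }
        return best;   /* -1 once every run is exhausted */
    }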
On Mon, 2007-12-03 at 11:51 +0000, Simon Riggs wrote:
> So Algorithm F is somewhat orthogonal to what I've proposed, though
> maybe still interesting. What we do now is fairly close, but you should
> look at the code in tuplesort.c and logtape.c to see how well it
> matches. That might lead to an…
On Fri, 2007-11-30 at 12:07 -0800, Jeff Davis wrote:
> On Tue, 2007-11-27 at 18:03 +0000, Simon Riggs wrote:
> > 5. DYNAMIC RUN HANDLING (in Final Merge)
> >
> > Another way of addressing a) is to simply make better use of memory
> > itself. Let's look at that in more detail:
> >
> > Number of runs that can be merged at once is currently fixed…
On Tue, 2007-11-27 at 18:03 +0000, Simon Riggs wrote:
> 5. DYNAMIC RUN HANDLING (in Final Merge)
>
> Another way of addressing a) is to simply make better use of memory
> itself. Let's look at that in more detail:
>
> Number of runs that can be merged at once is currently fixed, based upon
> available memory…
On Tue, Nov 27, 2007 at 06:03:46PM +0000, Simon Riggs wrote:
> Just wanted to review a few thoughts and ideas around improving external
> sorts, as recently encouraged to do by Jim Nasby.
Is there any way of PG knowing that having an index on a subset of the
> sorted columns is sometimes a win? For…
Just wanted to review a few thoughts and ideas around improving external
sorts, as recently encouraged to do by Jim Nasby.
Current issues/opportunities are these:
ISSUES
a) Memory is always in short supply, so using what we have more
effectively is going to be welcome.
b) Heap sort has a reasonable…