On 09/12/15 00:02, Jeff Janes wrote:
> The second one consumes that giant tape run along with 232 small tape
> runs.
In terms of number of comparisons, binary merge works best when the
inputs are of similar length. I'd assume the same goes for n-ary
merge, but I don't know if comparison count is
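To put a number on that point: a two-way merge of sorted runs of lengths m and n costs at most m + n - 1 comparisons, so repeatedly folding small runs into one ever-growing run pays for the big run again on every pass, while merging similar-length runs pairwise costs about n*log2(runs) in total. A stand-alone toy (plain C, nothing to do with tuplesort's actual code; run count and length are arbitrary) that counts comparisons for the two strategies:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NRUNS  16
#define RUNLEN 4096

static long ncmp;                       /* running comparison count */

/* Merge two sorted runs of ints into out[], counting key comparisons. */
static void
merge(const int *a, int m, const int *b, int n, int *out)
{
    int i = 0, j = 0, k = 0;

    while (i < m && j < n)
    {
        ncmp++;
        out[k++] = (a[i] <= b[j]) ? a[i++] : b[j++];
    }
    while (i < m)
        out[k++] = a[i++];
    while (j < n)
        out[k++] = b[j++];
}

static int
cmp_int(const void *x, const void *y)
{
    int a = *(const int *) x, b = *(const int *) y;

    return (a > b) - (a < b);
}

int
main(void)
{
    static int runs[NRUNS][RUNLEN];
    int    *acc = malloc(NRUNS * RUNLEN * sizeof(int));
    int    *tmp = malloc(NRUNS * RUNLEN * sizeof(int));
    int    *bufs[NRUNS];
    int     lens[NRUNS];
    int     acclen, nruns, r, i;

    for (r = 0; r < NRUNS; r++)         /* make NRUNS sorted random runs */
    {
        for (i = 0; i < RUNLEN; i++)
            runs[r][i] = rand();
        qsort(runs[r], RUNLEN, sizeof(int), cmp_int);
    }

    /* Strategy 1: fold every small run into one ever-growing run. */
    memcpy(acc, runs[0], RUNLEN * sizeof(int));
    acclen = RUNLEN;
    ncmp = 0;
    for (r = 1; r < NRUNS; r++)
    {
        merge(acc, acclen, runs[r], RUNLEN, tmp);
        acclen += RUNLEN;
        memcpy(acc, tmp, acclen * sizeof(int));
    }
    printf("giant-vs-small: %ld comparisons\n", ncmp);

    /* Strategy 2: merge similar-length runs pairwise, pass by pass. */
    for (r = 0; r < NRUNS; r++)
    {
        bufs[r] = malloc(NRUNS * RUNLEN * sizeof(int));
        memcpy(bufs[r], runs[r], RUNLEN * sizeof(int));
        lens[r] = RUNLEN;
    }
    ncmp = 0;
    for (nruns = NRUNS; nruns > 1; nruns /= 2)
        for (r = 0; r + 1 < nruns; r += 2)
        {
            merge(bufs[r], lens[r], bufs[r + 1], lens[r + 1], tmp);
            lens[r / 2] = lens[r] + lens[r + 1];
            memcpy(bufs[r / 2], tmp, lens[r / 2] * sizeof(int));
        }
    printf("balanced:       %ld comparisons\n", ncmp);
    return 0;
}

With these made-up parameters the giant-vs-small strategy typically reports roughly twice the comparisons of the balanced one on random input, and the gap widens as the number of runs grows.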
On 07/12/15 16:44, Simon Riggs wrote:
> There are many optimizations we might adopt, yet planning time is a factor.
> It seems simple enough to ignore more complex optimizations if we have
> already achieved a threshold cost (say 10). Such a test would add nearly
> zero time for the common case. We
On 08/10/15 17:09, Robert Haas wrote:
> - You consider pushing down ORDER BY if any prefix of the query
> pathkeys satisfy is_foreign_expr(), but that doesn't seem right to me.
> If the user says SELECT * FROM remotetab ORDER BY a, unsafe(a),
> ordering the result set by a doesn't help us much. We
On 27/09/15 21:58, Gavin Wahl wrote:
> Somewhat harder but still possible would be using BRIN indexes to
> accelerate ORDER BY. This would require a sorting algorithm that can take
> advantage of mostly-sorted inputs. You would sort the page ranges by their
> minimum or maximum value, then feed the
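The algorithm being sketched there might look roughly like this: sort the summarised page ranges by their minimum value, then pull ranges into a pool in that order and emit the smallest pooled value whenever every still-unread range has a higher minimum. A toy stand-in with ints for tuples and a linear scan in place of a real binary heap (the names and structure are mine, not from any patch):

#include <stdio.h>
#include <stdlib.h>

/* One BRIN-style page range: its summarised minimum plus its (unsorted) values. */
typedef struct
{
    int     minval;
    int    *vals;
    int     nvals;
} Range;

static int
range_cmp(const void *a, const void *b)
{
    return ((const Range *) a)->minval - ((const Range *) b)->minval;
}

/* Index of the smallest pooled value; stands in for a real binary heap. */
static int
smallest(const int *pool, int n)
{
    int best = 0, i;

    for (i = 1; i < n; i++)
        if (pool[i] < pool[best])
            best = i;
    return best;
}

/* Emit every value in ascending order, reading ranges in order of their minimum. */
static void
ordered_scan(Range *ranges, int nranges)
{
    int    *pool = malloc(64 * sizeof(int));
    int     poolsz = 64, npool = 0, r = 0;

    while (r < nranges || npool > 0)
    {
        int     b, i;

        /*
         * Pull in every unread range whose minimum is <= the smallest pooled
         * value.  Whatever remains unread then starts above that value, so
         * the smallest pooled value is safe to emit.
         */
        while (r < nranges &&
               (npool == 0 || ranges[r].minval <= pool[smallest(pool, npool)]))
        {
            for (i = 0; i < ranges[r].nvals; i++)
            {
                if (npool == poolsz)
                    pool = realloc(pool, (poolsz *= 2) * sizeof(int));
                pool[npool++] = ranges[r].vals[i];
            }
            r++;
        }
        if (npool == 0)
            break;              /* only possible if the remaining ranges are empty */
        b = smallest(pool, npool);
        printf("%d\n", pool[b]);
        pool[b] = pool[--npool];
    }
    free(pool);
}

int
main(void)
{
    int     v0[] = {15, 3, 9}, v1[] = {8, 20, 12}, v2[] = {30, 22, 25};
    Range   ranges[] = {{3, v0, 3}, {8, v1, 3}, {22, v2, 3}};

    qsort(ranges, 3, sizeof(Range), range_cmp);     /* sort ranges by their minimum */
    ordered_scan(ranges, 3);
    return 0;
}

The win over a full sort comes from the pool only ever holding the currently-overlapping ranges rather than the whole input.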
On 31/07/15 18:31, Robert Haas wrote:
> On Fri, Jul 31, 2015 at 7:21 AM, Jeremy Harris wrote:
>> Heapification is O(n) already, whether siftup (existing) or down.
>
> That's not my impression, or what Wikipedia says. Source?
Measurements done last year:
http://www.postg
On 30/07/15 02:05, Peter Geoghegan wrote:
> Since heapification is now a big fraction of the total cost of a sort
> sometimes, even where the heap invariant need not be maintained for
> any length of time afterwards, it might be worth revisiting the patch
> to make that an O(n) rather than a O(log
On 22/05/15 22:30, Josh Berkus wrote:
> At CoreOS Fest, Intel presented about a technology which they used to
> improve write times for the nonrelational data store Etcd. It's called
> Asynchronous DRAM Self-Refresh, or ADR. This is supposedly a feature of
> all of their chips since E5 which allo
On 03/03/15 03:08, Arthur Silva wrote:
> Does it always perform mergesort instead of quicksort when enabled?
Yes; there seemed no advantage to any additional complexity.
The merge consistently performs fewer comparisons than the
quicksort, on random input - and many fewer if there are
any sorted (
On 25/02/15 00:32, Jeremy Harris wrote:
> On 23/02/15 16:40, Tomas Vondra wrote:
>> On 22.2.2015 22:30, Peter Geoghegan wrote:
>>> You should try it with the data fully sorted like this, but with one
>>> tiny difference: The very last tuple is out of order. How does t
On 25/02/15 00:42, Peter Geoghegan wrote:
> On Tue, Feb 24, 2015 at 4:32 PM, Jeremy Harris wrote:
>> On 23/02/15 16:40, Tomas Vondra wrote:
>>> On 22.2.2015 22:30, Peter Geoghegan wrote:
>>>> You should try it with the data fully sorted like this, but with one
>&
On 23/02/15 16:40, Tomas Vondra wrote:
> On 22.2.2015 22:30, Peter Geoghegan wrote:
>> You should try it with the data fully sorted like this, but with one
>> tiny difference: The very last tuple is out of order. How does that
>> look?
If this case is actually important, a merge-sort can take
si
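The property a merge sort exploits here is run detection: the input is first divided into maximal pre-sorted runs, and a fully sorted array with only the last element out of place contains just two runs, leaving a single cheap merge to do. A tiny illustration of the run-counting step (mine, not the patch's code):

#include <stdio.h>

/* Count maximal non-descending runs; a natural merge sort starts from these. */
static int
count_runs(const int *a, int n)
{
    int runs = (n > 0);
    int i;

    for (i = 1; i < n; i++)
        if (a[i] < a[i - 1])        /* run boundary */
            runs++;
    return runs;
}

int
main(void)
{
    int almost_sorted[] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 0};   /* last item out of order */

    /* Two runs, so a single merge finishes the sort. */
    printf("%d runs\n", count_runs(almost_sorted, 10));
    return 0;
}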
On 14/11/14 14:54, Tom Lane wrote:
> Jeremy Harris writes:
>> On 14/11/14 00:46, Simon Riggs wrote:
>>> Limit (cost= rows=20 width=175) (actual time= rows=20 loops=1)
>>> -> Sort (cost= rows=568733 width=175) (actual time=
>>> rows=20 l
On 14/11/14 00:46, Simon Riggs wrote:
> Limit (cost= rows=20 width=175) (actual time= rows=20 loops=1)
>-> Sort (cost= rows=568733 width=175) (actual time=
> rows=20 loops=1)
> Sort Method: top-N heapsort
Going off on a tangent, when I was playing with a merge-sort
On 29/10/14 16:11, Andres Freund wrote:
> I do think checking the link count to
> be 1 is safe though.
You will fail against certain styles of online-backup.
--
Cheers,
Jeremy
On 26/02/14 03:08, Robert Haas wrote:
On Tue, Feb 25, 2014 at 2:55 PM, Jeremy Harris wrote:
On 24/02/14 17:38, Robert Haas wrote:
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris wrote:
Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced
On 24/02/14 17:38, Robert Haas wrote:
On Thu, Feb 20, 2014 at 7:27 PM, Jeremy Harris wrote:
Run under cachegrind, it takes about N/10 last-level cache misses,
all for the new item being introduced to the heap. The existing
code takes none at all.
Can you explain this further? This seems
On 09/02/14 17:11, Jeremy Harris wrote:
On 06/02/14 18:21, Jeff Janes wrote:
Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets? We will also need to test it on
strings. I usually use md5(random()::text) to generate strings for such
purposes, at least for a
On 06/02/14 22:12, Jeremy Harris wrote:
Did you try sorting already-sorted, reverse
sorted, or pipe-organ shaped data sets?
Summary (low numbers better):
Random ints: 83% compares, level on time.
Sorted ints: level compares, 70% time.
Reverse-sorted ints: 10% compares, 15
On 06/02/14 14:22, Robert Haas wrote:
On Thu, Feb 6, 2014 at 4:22 AM, Jeremy Harris wrote:
On 05/02/14 21:56, Robert Haas wrote:
On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris wrote:
The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random
On 06/02/14 18:21, Jeff Janes wrote:
How big of
sets were you sorting in each case?
Big enough to go external. The timings and compare-counts given are
purely for the heapify stage not the total for the sort, so are
constrained by the work_mem not by the sort size per se.
I'm limited to wor
On 05/02/14 21:56, Robert Haas wrote:
On Tue, Feb 4, 2014 at 5:22 PM, Jeremy Harris wrote:
The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the
The attached patch replaces the existing siftup method for heapify with
a siftdown method. Tested with random integers it does 18% fewer
compares and takes 10% less time for the heapify, over the work_mem
range 1024 to 1048576.
Both algorithms appear to be O(n) (contradicting Wikipedia's claim
th
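For reference, the siftdown form is Floyd's bottom-up construction: heapify from the last internal node back to the root, so most nodes sift only a short distance and the total is O(n); building by repeated siftup insertion is O(n log n) only in the worst case, which may be why both look linear on random input. A minimal sketch on an int array (not the patch itself):

#include <stdio.h>

/* Sift the element at 'node' down until the min-heap property holds below it. */
static void
sift_down(int *a, int n, int node)
{
    for (;;)
    {
        int left = 2 * node + 1;
        int right = left + 1;
        int smallest = node;
        int tmp;

        if (left < n && a[left] < a[smallest])
            smallest = left;
        if (right < n && a[right] < a[smallest])
            smallest = right;
        if (smallest == node)
            break;
        tmp = a[node];
        a[node] = a[smallest];
        a[smallest] = tmp;
        node = smallest;
    }
}

/*
 * Floyd's heapify: handle the internal nodes from the last one back to the
 * root.  Most nodes sit near the leaves and sift only a short distance, so
 * the total work is O(n).
 */
static void
heapify(int *a, int n)
{
    int i;

    for (i = n / 2 - 1; i >= 0; i--)
        sift_down(a, n, i);
}

int
main(void)
{
    int a[] = {3, 9, 1, 14, 7, 2, 8, 5};
    int i;

    heapify(a, 8);
    for (i = 0; i < 8; i++)
        printf("%d ", a[i]);        /* a[0] is now the minimum */
    printf("\n");
    return 0;
}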
On 27/01/14 18:00, Simon Riggs wrote:
On 27 January 2014 17:44, Pavel Stehule wrote:
This topic is interesting - we found very bad performance with hashing large
tables with high work_mem. MergeJoin with quicksort was significantly
faster.
I've seen this also.
I didn't do deeper research - th
In ordered_set_startup() sorts are initialised in non-randomAccess mode
(tuplesort_begin_heap() and ~datum(), last argument).
The use of tuplesort_skip_tuples() feels very like a random access to
me. I think it doesn't fail because the only use (and implementation)
is to skip forwards; if backwa
On 22/01/14 03:53, Tom Lane wrote:
Jon Nelson writes:
A rough summary of the patch follows:
- a GUC variable enables or disables this capability
- in nodeAgg.c, eliding duplicate tuples is enabled if the number of
distinct columns is equal to the number of sort columns (and both are
gr
On 22/01/14 03:16, Jon Nelson wrote:
Greetings -hackers:
I have worked up a patch to PostgreSQL which elides tuples during an
external sort. The primary use case is when sorted input is being used
to feed a DISTINCT operation. The idea is to throw out tuples that
compare as identical whenever it
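The elision itself is easy to picture outside tuplesort: once tuples stream out in sorted order, a tuple equal to the one just emitted can simply be dropped, and that is safe for DISTINCT exactly when the distinct columns are also the sort columns, since equal tuples are then adjacent. A toy version on ints (my naming, not the patch's):

#include <stdio.h>
#include <stdlib.h>

static int
cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Sort, then drop any value equal to the one just kept; returns the new length. */
static int
sort_distinct(int *a, int n)
{
    int out = 0, i;

    qsort(a, n, sizeof(int), cmp_int);
    for (i = 0; i < n; i++)
        if (out == 0 || a[i] != a[out - 1])     /* new value: keep it */
            a[out++] = a[i];
    return out;
}

int
main(void)
{
    int a[] = {5, 3, 5, 1, 3, 5, 1};
    int n = sort_distinct(a, 7);
    int i;

    for (i = 0; i < n; i++)
        printf("%d ", a[i]);        /* prints: 1 3 5 */
    printf("\n");
    return 0;
}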
On 31/12/13 01:41, Andreas Karlsson wrote:
On 12/29/2013 08:24 AM, David Rowley wrote:
If it was possible to devise some way to reuse any
previous tuplesortstate perhaps just inventing a reset method which
clears out tuples, then we could see performance exceed the standard
seqscan -> sort. The
On 13/01/14 18:01, Alexander Korotkov wrote:
Thanks. It's included in the attached version of the patch, as well as estimation
improvements, more comments and regression test fixes.
Would it be possible to totally separate the two sets of sort-keys,
only giving the non-index set to the tuplesort? At
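Separating the key sets amounts to group-wise sorting: the index already delivers tuples grouped and ordered by the leading key(s), so only each group needs sorting, by the trailing key(s). A toy sketch on pairs of ints, assuming the .a column arrives pre-ordered (names and structure are mine, not the patch's):

#include <stdio.h>
#include <stdlib.h>

typedef struct
{
    int     a;      /* leading key: already ordered by the index scan */
    int     b;      /* trailing key: still needs sorting */
} Tuple;

static int
cmp_b(const void *x, const void *y)
{
    int bx = ((const Tuple *) x)->b;
    int by = ((const Tuple *) y)->b;

    return (bx > by) - (bx < by);
}

/*
 * Input arrives ordered by 'a'.  Sort each run of equal 'a' values by 'b'
 * and emit it, so only the small per-group sorts are ever paid for.
 */
static void
partial_sort(Tuple *in, int n)
{
    int start = 0, i, j;

    for (i = 1; i <= n; i++)
        if (i == n || in[i].a != in[start].a)   /* group boundary */
        {
            qsort(in + start, i - start, sizeof(Tuple), cmp_b);
            for (j = start; j < i; j++)
                printf("(%d,%d) ", in[j].a, in[j].b);
            start = i;
        }
    printf("\n");
}

int
main(void)
{
    /* Ordered by .a, as an index would deliver it; unordered within groups. */
    Tuple in[] = {{1, 9}, {1, 2}, {1, 5}, {2, 7}, {2, 3}, {3, 1}};

    partial_sort(in, 6);    /* (1,2) (1,5) (1,9) (2,3) (2,7) (3,1) */
    return 0;
}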
On 14/01/14 22:23, Dave Chinner wrote:
On Tue, Jan 14, 2014 at 11:40:38AM -0800, Kevin Grittner wrote:
To quantify that, in a production setting we were seeing pauses of
up to two minutes with shared_buffers set to 8GB and default dirty
^^
On 22/12/13 20:26, Alexander Korotkov wrote:
On Sat, Dec 14, 2013 at 6:30 PM, Jeremy Harris wrote:
On 14/12/13 12:54, Andres Freund wrote:
Is that actually all that beneficial when sorting with a bog standard
qsort() since that doesn't generally benefit from data being pre-sorted?
I thi
On 14/12/13 12:54, Andres Freund wrote:
On 2013-12-14 13:59:02 +0400, Alexander Korotkov wrote:
Currently when we need to get ordered result from table we have to choose
one of two approaches: get results from index in exact order we need or do
sort of tuples. However, it could be useful to mix
On 20/11/13 19:58, Robert Haas wrote:
Parallel sort, and then parallel other stuff. Eventually general
parallel query.
I have recently updated https://wiki.postgresql.org/wiki/Parallel_Sort
and you may find that interesting/helpful as a statement of intent.
I've been playing with an internal
On 13/11/13 09:07, Leonardo Francalanci wrote:
Problem? having 4 btree indexes on random values (imsi+imei * 2,
since we have calling and caller) kills the performance in insertion
after a while.
Surely there's good correlation between IMSI & IMEI, so have a separate
table to translate one to (
On 06/09/13 15:44, Robert Haas wrote:
On Fri, Sep 6, 2013 at 3:34 AM, Jeremy Harris wrote:
I don't see the regression tests running any index-hash operations.
What am I missing?
What's an index-hash operation?
Ones that hit tuplesort_begin_index_hash()
--
Cheers,
Jeremy
Hi,
I don't see the regression tests running any index-hash operations.
What am I missing?
--
Cheers,
Jeremy
On 06/08/13 13:59, Robert Haas wrote:
My thought is to create a
queue abstraction that sits on top of the dynamic shared memory
infrastructure, so that you can set aside a portion of your dynamic
shared memory segment to use as a ring buffer and send messages back
and forth with using some kind
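The shape of such a queue is a single-writer/single-reader ring buffer indexed by running byte counts. The sketch below shows only that arithmetic; a real shared-memory version also needs memory barriers or a lock around the head/tail updates, plus a way for the peer to sleep and be woken (the names here are mine, not the eventual API):

#include <stdio.h>
#include <stdint.h>

#define RING_SIZE 1024          /* capacity of the toy queue, in bytes */

/* Toy single-writer / single-reader byte queue. */
typedef struct
{
    uint64_t    head;           /* total bytes ever written */
    uint64_t    tail;           /* total bytes ever read */
    char        data[RING_SIZE];
} MsgRing;

static int
ring_put(MsgRing *q, const char *buf, int len)
{
    uint64_t    used = q->head - q->tail;
    int         i;

    if (used + len > RING_SIZE)
        return 0;               /* not enough free space; caller must wait */
    for (i = 0; i < len; i++)
        q->data[(q->head + i) % RING_SIZE] = buf[i];
    q->head += len;             /* publish only after the copy */
    return 1;
}

static int
ring_get(MsgRing *q, char *buf, int len)
{
    uint64_t    avail = q->head - q->tail;
    int         i;

    if (avail < (uint64_t) len)
        return 0;               /* not enough data yet; caller must wait */
    for (i = 0; i < len; i++)
        buf[i] = q->data[(q->tail + i) % RING_SIZE];
    q->tail += len;             /* free the space only after the copy */
    return 1;
}

int
main(void)
{
    MsgRing q = {0, 0, {0}};
    char    out[32];

    ring_put(&q, "hello, worker", 14);  /* 13 chars plus the trailing NUL */
    if (ring_get(&q, out, 14))
        printf("%s\n", out);
    return 0;
}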
On 10/05/2012 09:03 PM, Tom Lane wrote:
Note that allowing subsequent requests to jump the queue would not be a
good fix for this; if you do that, it's likely the ex-lock will never be
granted, at least not till the next system idle time.
Offering that option to the admin sounds like a good thi
On 2012-03-18 15:25, Tom Lane wrote:
Jeff Janes writes:
On Wed, Mar 7, 2012 at 11:55 AM, Robert Haas wrote:
The problem there is that none of the files can be deleted until it
was entirely read, so you end up with all the data on disk twice. I
don't know how often people run their databases s
On 2012-01-05 10:04, Benedikt Grundmann wrote:
I have a question of how to benchmark hardware to determine
the appropriate ratio of seq_page_cost vs random_page_cost.
It'd be really nice if the DBMS measured actual experienced values..
--
Jeremy
On 2012-01-03 04:44, Robert Haas wrote:
On read-only workloads, you get spinlock contention, because everyone
who wants a snapshot has to take the LWLock mutex to increment the
shared lock count and again (just a moment later) to decrement it.
Does the LWLock protect anything but the shared loc