Re: [PERFORM] Hash join seq scan slow

2016-04-19 Thread Jeff Janes
On Tue, Apr 19, 2016 at 1:07 AM, Aldo Sarmiento wrote: > Hello, > > I'm assuming this topic has probably been bludgeoned to a pulp, but my > google-fu can't seem to find a solution. > > I have two relatively largish tables that I'm trying to join that result in > a slow query. > > Hardware: > > 20

[PERFORM] Hash join seq scan slow

2016-04-19 Thread Aldo Sarmiento
Hello, I'm assuming this topic has probably been bludgeoned to a pulp, but my google-fu can't seem to find a solution. I have two relatively largish tables that I'm trying to join that result in a slow query. Hardware: 2014 iMac w/ SSD & i5 processor Tables: contacts: 1.14 million rows permiss

Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Albe Laurenz
Tomas Vondra wrote: > Yes, that's clearly the culprit here. In both cases we estimate here are > only ~4000 tuples in the hash, and 9.3 sizes the hash table to have at > most ~10 tuples per bucket (in a linked list). > > However we actually get ~3M rows, so there will be ~3000 tuples per > bucket,

Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Tomas Vondra
On 02/01/2016 10:38 AM, Albe Laurenz wrote: Tomas Vondra wrote: ... I didn't post the whole plan since it is awfully long, I'll include hyperlinks for the whole plan. work_mem = '100MB' (http://explain.depesz.com/s/7b6a): -> Hash Join (cost=46738.74..285400.61 rows=292 width=8) (actual tim

Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-02-01 Thread Albe Laurenz
Tomas Vondra wrote: > On 01/29/2016 04:17 PM, Albe Laurenz wrote: >> I have a query that runs *slower* if I increase work_mem. >> >> The execution plans are identical in both cases, except that a temp file >> is used when work_mem is smaller. >> What could be an explanation for this? >> Is this kn

Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-01-30 Thread Tomas Vondra
Hi, On 01/29/2016 04:17 PM, Albe Laurenz wrote: I have a query that runs *slower* if I increase work_mem. The execution plans are identical in both cases, except that a temp file is used when work_mem is smaller. The relevant lines of EXPLAIN ANALYZE output are: With work_mem='100MB': -> Has

Re: [PERFORM] Hash join gets slower as work_mem increases?

2016-01-29 Thread Pavel Stehule
Hi > I ran operf on both backends, and they look quite similar, except that the > number of samples is different (this is "opreport -c" output): > > CPU: Intel Sandy Bridge microarchitecture, speed 2899.8 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a uni

[PERFORM] Hash join gets slower as work_mem increases?

2016-01-29 Thread Albe Laurenz
I have a query that runs *slower* if I increase work_mem. The execution plans are identical in both cases, except that a temp file is used when work_mem is smaller. The relevant lines of EXPLAIN ANALYZE output are: With work_mem='100MB': -> Hash Join (cost=46738.74..285400.61 rows=292 width=8)

Re: [PERFORM] HASH

2015-11-05 Thread Jeff Janes
On Thu, Nov 5, 2015 at 1:11 AM, Artem Tomyuk wrote: > Hi all. > > Is the speed of hash operations stands on the performance of CPU? Yes, but the variation is probably not as much as the raw timing in your example indicates. > Below you can see part from output of explain analyze command > > Inte

[PERFORM] HASH

2015-11-05 Thread Artem Tomyuk
Hi all. Is the speed of hash operations stands on the performance of CPU? Below you can see part from output of explain analyze command *Intel(R) Xeon(R) CPU E7520 @ 1.87GHz* " -> Hash (cost=337389.43..337389.43 rows=3224443 width=34) (actual time=15046.382..15046.382 ro

Re: [PERFORM] Hash Join node sometimes slow

2014-07-03 Thread Dave Roberge
Tom Lane writes: > I'd bet on the extra time being in I/O for the per-batch temp files, since > it's hard > to see what else would be different if the data were identical in each run. > Maybe the kernel is under memory pressure and is dropping the file data from > in-memory disk cache. Or maybe i

Re: [PERFORM] Hash Join node sometimes slow

2014-07-02 Thread Tom Lane
Dave Roberge writes: > For example, running explain (analyze, buffers) with the query, 4/5 times we > will see the following: > -> Hash Join (cost=16385.76..103974.09 rows=523954 width=64) (actual > time=532.634..4018.678 rows=258648 loops=1) > Hash Cond: (p.a = c.c) > Buffers: sh

[PERFORM] Hash Join node sometimes slow

2014-07-02 Thread Dave Roberge
Hi, I'm in the process of attempting to tune some slow queries. I came across a scenario where I'm not entirely sure how I might figure out why a node is taking awhile to process. I'm not concerned with the query itself, we are working to figure out how we can make it faster. But I was hoping s

[PERFORM] Hash join

2013-12-10 Thread mspasic
Hello everyone, I'm looking for a way to specify join order in SQL query. Actually, the optimizer has chosen a plan with hash join of 2 tables, but I would like to know if there is a way to force it to use hash join, but with replaced tables on build phase and probe phase? Thank you, Mirko Spasic

Re: [PERFORM] hash join vs nested loop join

2012-12-20 Thread Huan Ruan
On 21 December 2012 01:06, Kevin Grittner wrote: > Huan Ruan wrote: > > Kevin Grittner wrote: > > >> Frankly, at 12 microseconds per matched pair of rows, I think > >> you're doing OK. > > > > This plan is the good one, I want the indexscan nested loop join and this > > is only achieved after mak

Re: [PERFORM] hash join vs nested loop join

2012-12-20 Thread Kevin Grittner
Huan Ruan wrote: > Kevin Grittner wrote: >> Frankly, at 12 microseconds per matched pair of rows, I think >> you're doing OK. > > This plan is the good one, I want the indexscan nested loop join and this > is only achieved after making all these costing factors change. Before > that, it was hash j

Re: [PERFORM] hash join vs nested loop join

2012-12-19 Thread Huan Ruan
Frankly, at 12 microseconds per matched pair of rows, I think > you're doing OK. > > This plan is the good one, I want the indexscan nested loop join and this is only achieved after making all these costing factors change. Before that, it was hash join and was very slow. However, I'm worried about

Re: [PERFORM] hash join vs nested loop join

2012-12-19 Thread Kevin Grittner
"Huan Ruan" wrote: > explain (analyze, buffers) > SELECT >  * > FROM IM_Match_Table smalltable >  inner join invtran bigtable on bigtable.invtranref = smalltable.invtranref Well, one table or the other will need to be read in full, and you would normally want that one to be the small table. When

Re: [PERFORM] hash join vs nested loop join

2012-12-18 Thread Huan Ruan
> Quite possibly, but it could be any of a number of other things, > like a type mismatch. It might be best to rule out other causes. If > you post the new query and EXPLAIN ANALYZE output, along with the > settings you have now adopted, someone may be able to spot > something. It wouldn't hurt to

Re: [PERFORM] hash join vs nested loop join

2012-12-14 Thread Kevin Grittner
Huan Ruan wrote: > Kevin Grittner wrote: >> With a low cache hit rate, that would generally be when the number >> of lookups into the table exceeds about 10% of the table's rows. > > So far, my main performance issue comes down to this pattern where > Postgres chooses hash join that's slower than

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
> > With a low cache hit rate, that would generally be when the number > of lookups into the table exceeds about 10% of the table's rows. > > > So far, my main performance issue comes down to this pattern where Postgres chooses hash join that's slower than a nest loop indexed join. By changing thos

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Kevin Grittner
Huan Ruan wrote: > Interesting to see how you derived 100% cache hits. I assume by 'cache' you > mean the pg shared buffer plus the OS cache? Because the table is 23GB but > the shared buffer is only 6GB. Even then, I'm not completely convinced > because the total RAM is just 24GB, part of which w

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
Hi Kevin Again, many thanks for your time and help. On 14 December 2012 02:26, Kevin Grittner wrote: > Huan Ruan wrote: > > > Hash 1st run > > > "Hash Join (cost=1681.87..6414169.04 rows=48261 width=171) > > (actual time=2182.450..88158.645 rows=48257 loops=1)" > > > " -> Seq Scan on invtran bi

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
Hi Kevin On 13 December 2012 10:47, Kevin Grittner wrote: > Huan Ruan wrote: > > > is a lot slower than a nested loop join. > > Giving actual numbers is more useful than terms like "a lot". Even > better is to provide the output of EXPLAIN ANALYZZE rather than > just EXPLAIN. This shows estimate

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
On 13 December 2012 03:28, Jeff Janes wrote: > > This looks like the same large-index over-penalty as discussed in the > recent thread "[PERFORM] Slow query: bitmap scan troubles". > > Back-patching the log(npages) change is starting to look like a good idea. > > Cheers, > > Jeff Thanks for the

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Kevin Grittner
Huan Ruan wrote: > Hash 1st run > "Hash Join (cost=1681.87..6414169.04 rows=48261 width=171) > (actual time=2182.450..88158.645 rows=48257 loops=1)" > " -> Seq Scan on invtran bigtable (cost=0.00..4730787.28 > rows=168121728 width=108) (actual time=0.051..32581.052 > rows=168121657 loops=1)" 19

Re: [PERFORM] hash join vs nested loop join

2012-12-12 Thread Kevin Grittner
Huan Ruan wrote: > is a lot slower than a nested loop join. Giving actual numbers is more useful than terms like "a lot". Even better is to provide the output of EXPLAIN ANALYZZE rather than just EXPLAIN. This shows estimates against actual numbers, and give timings. For more suggestions see this

Re: [PERFORM] hash join vs nested loop join

2012-12-12 Thread Jeff Janes
On Tue, Dec 11, 2012 at 8:25 PM, Huan Ruan wrote: > Hello All > > While investigating switching to Postgres, we come across a query plan that > uses hash join and is a lot slower than a nested loop join. > > I don't understand why the optimiser chooses the hash join in favor of the > nested loop.

Re: [PERFORM] hash join vs nested loop join

2012-12-11 Thread Evgeny Shishkin
On Dec 12, 2012, at 8:57 AM, Evgeny Shishkin wrote: > > On Dec 12, 2012, at 8:44 AM, Huan Ruan wrote: > >> >> On 12 December 2012 15:33, Evgeny Shishkin wrote: >> Optimiser thinks that nested loop is more expensive, because of point PK >> lookups, which a random io. >> Can you set random_p

Re: [PERFORM] hash join vs nested loop join

2012-12-11 Thread Evgeny Shishkin
On Dec 12, 2012, at 8:25 AM, Huan Ruan wrote: > Hello All > > While investigating switching to Postgres, we come across a query plan that > uses hash join and is a lot slower than a nested loop join. > > I don't understand why the optimiser chooses the hash join in favor of the > nested loop

[PERFORM] hash join vs nested loop join

2012-12-11 Thread Huan Ruan
Hello All While investigating switching to Postgres, we come across a query plan that uses hash join and is a lot slower than a nested loop join. I don't understand why the optimiser chooses the hash join in favor of the nested loop. What can I do to get the optimiser to make a better decision (n

Re: [PERFORM] hash aggregation

2012-10-12 Thread Tomas Vondra
On 11.10.2012 17:15, Korisk wrote: > "IOS scan" ? > Index Only Scan > > What's your seq_page_cost and random_page_cost? > > hashes=# SELECT name, setting, reset_val FROM pg_settings WHERE setting <> > reset_val; > name |setting | reset_val > -

Re: [PERFORM] hash aggregation

2012-10-12 Thread Tomas Vondra
On 12.10.2012 09:10, Sergey Konoplev wrote: > What I can not understand is why the seq scan's estimated cost is > better the index scan's one. It depends on the number of pages in > index/relation. May be the index is heavily bloated? The IOS cost depends on other things too. The index can't be re

Re: [PERFORM] hash aggregation

2012-10-12 Thread Korisk
> What I can not understand is why the seq scan's estimated cost is > better the index scan's one. It depends on the number of pages in > index/relation. May be the index is heavily bloated? Mm i don't know how to see bloating level. But the index was created by create index on hashcheck using btr

Re: [PERFORM] hash aggregation

2012-10-12 Thread Sergey Konoplev
On Thu, Oct 11, 2012 at 9:14 PM, Korisk wrote: > Strange situation. > After indexscan enabling the cost is seriously decreased. AFAIK when the planner has to choose between index scans and seq scans and both of this options are off it uses one of this strategies anyway but puts 100.00 as

Re: [PERFORM] hash aggregation

2012-10-11 Thread Ondrej Ivanič
Hi, On 12 October 2012 15:14, Korisk wrote: > Strange situation. > After indexscan enabling the cost is seriously decreased. You can not really disable any scan method. enable_xxx = off just sets very high cost (=100) for that operation. -- Ondrej Ivanic (ondrej.iva...@gmail.com) (http

Re: [PERFORM] hash aggregation

2012-10-11 Thread Korisk
Strange situation. After indexscan enabling the cost is seriously decreased. hashes=# set enable_bitmapscan=on; SET hashes=# explain analyse verbose select name, count(name) as cnt from hashcheck group by name order by name desc;

Re: [PERFORM] hash aggregation

2012-10-11 Thread Sergey Konoplev
On Thu, Oct 11, 2012 at 8:55 PM, Korisk wrote: > hashes=# explain analyse verbose select name, count(name) as cnt from > hashcheck group by name order by name desc; Now set enable_bitmapscan and enable_indexscan to on an try it again. Then set enable_seqscan to on and run it one more time. >

Re: [PERFORM] hash aggregation

2012-10-11 Thread Korisk
Again the same cost. hashes=# SELECT name, setting, reset_val FROM pg_settings WHERE setting <> reset_val; name |setting | reset_val -++--- archive_command | (disabled) | enable_bitmapscan | off

Re: [PERFORM] hash aggregation

2012-10-11 Thread Sergey Konoplev
On Thu, Oct 11, 2012 at 8:15 AM, Korisk wrote: > What's your seq_page_cost and random_page_cost? > hashes=# SELECT name, setting, reset_val FROM pg_settings WHERE setting <> > reset_val; > name |setting | reset_val > -++-

Re: [PERFORM] hash aggregation

2012-10-11 Thread Korisk
"IOS scan" ? Index Only Scan What's your seq_page_cost and random_page_cost? hashes=# SELECT name, setting, reset_val FROM pg_settings WHERE setting <> reset_val; name |setting | reset_val -++--- archive_command

Re: [PERFORM] hash aggregation

2012-10-10 Thread Sergey Konoplev
On Wed, Oct 10, 2012 at 9:13 PM, Korisk wrote: >-> Index Only Scan Backward using hashcheck_name_idx on public.hashcheck > (cost=100.00..1398674.92 rows=25986792 width=32) It seems odd. Is it possible to look at the non default configuration? SELECT name, setting, reset_val F

Re: [PERFORM] hash aggregation

2012-10-10 Thread Craig Ringer
On 10/11/2012 12:13 PM, Korisk wrote: Thanx for the advice, but increment table is not acceptable because it should be a plenty of them. Nevertheless in the investigations was achieved some progress (7.4 sec vs 19.6 sec). But using IOS scan "IOS scan" ? Do you mean some kind of I/O monitorin

Re: [PERFORM] hash aggregation

2012-10-10 Thread Korisk
Thanx for the advice, but increment table is not acceptable because it should be a plenty of them. Nevertheless in the investigations was achieved some progress (7.4 sec vs 19.6 sec). But using IOS scan you can see that there is an abnormal cost calculations it make me suspicious of little bug

Re: [PERFORM] hash aggregation

2012-10-10 Thread Sergey Konoplev
On Wed, Oct 10, 2012 at 9:09 AM, Korisk wrote: > Hello! Is it possible to speed up the plan? > Sort (cost=573977.88..573978.38 rows=200 width=32) (actual > time=10351.280..10351.551 rows=4000 loops=1) >Output: name, (count(name)) >Sort Key: hashcheck.name >Sort Method: quicksort Me

[PERFORM] hash aggregation

2012-10-10 Thread Korisk
Hello! Is it possible to speed up the plan? hashes=# \d hashcheck Table "public.hashcheck" Column | Type| Modifiers +---+ id

[PERFORM] hash aggregation speedup

2012-10-07 Thread Korisk
I have table: create table hashcheck(id serial, name varchar, value varchar); and query: hashaggr=# explain analyse verbose select name, count(name) as cnt from hashcheck group by name order by name desc; QUERY PLAN

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-20 Thread Tom Lane
Merlin Moncure writes: > just selects. update test is also very interesting -- the only test I > did for for updates is 'update foo set x=x+1' which was a win for > btree (20-30% faster typically). perhaps this isn't algorithm induced > though -- lack of wal logging could actually hurt time to

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-20 Thread Merlin Moncure
On Mon, Sep 19, 2011 at 1:53 PM, Claudio Freire wrote: > On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure wrote: >> To make the test into i/o bound, I change the setrandom from 10 to >> 1000; this produced some unexpected results. The hash index is >> pulling about double the tps (~80 vs ~

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 3:43 PM, Merlin Moncure wrote: > To make the test into i/o bound, I change the setrandom from 10 to > 1000; this produced some unexpected results. The hash index is > pulling about double the tps (~80 vs ~ 40) over the hybrid version. > Well, unless my methodology i

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Mon, Sep 19, 2011 at 10:19 AM, Robert Klemme wrote: > On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: >> On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: >>> Merlin and Jeff, >>> >>> General remark again:It's hard for me to imagine that btree is >>> superior for all the issues men

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Jeff Janes
On Mon, Sep 19, 2011 at 8:19 AM, Robert Klemme wrote: > On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: > >> The other way to go of course is to try and fix up the existing hash >> index code -- add wal logging, etc. In theory, a customized hash >> structure should be able to beat btree al

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Claudio Freire
On Mon, Sep 19, 2011 at 12:54 PM, Vitalii Tymchyshyn wrote: > 19.09.11 18:19, Robert Klemme написав(ла): >> >> I still haven't seen a solution to locking when a hash table needs >> resizing.  All hashing algorithms I can think of at the moment would >> require a lock on the whole beast during the

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn
19.09.11 18:19, Robert Klemme написав(ла): I still haven't seen a solution to locking when a hash table needs resizing. All hashing algorithms I can think of at the moment would require a lock on the whole beast during the resize which makes this type of index impractical for certain loads (hea

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Tom Lane
Robert Klemme writes: > I still haven't seen a solution to locking when a hash table needs > resizing. All hashing algorithms I can think of at the moment would > require a lock on the whole beast during the resize which makes this > type of index impractical for certain loads (heavy updating).

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Vitalii Tymchyshyn
19.09.11 18:19, Robert Klemme написав(ла): On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: Postgres's hash index implementation used to be pretty horrible -- it stored the pre-hashed datum in the index which, while making it easier to do certain things, made it horribly slow, and, for

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Mon, Sep 19, 2011 at 4:04 PM, Merlin Moncure wrote: > On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: >> Merlin and Jeff, >> >> General remark again:It's hard for me to imagine that btree is >> superior for all the issues mentioned before. I still believe in hash >> index for primary key

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Merlin Moncure
On Sun, Sep 18, 2011 at 9:59 AM, Stefan Keller wrote: > Merlin and Jeff, > > General remark again:It's hard for me to imagine that btree is > superior for all the issues mentioned before. I still believe in hash > index for primary keys and certain unique constraints where you need > equality sear

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-19 Thread Robert Klemme
On Sun, Sep 18, 2011 at 9:31 PM, Stefan Keller wrote: > I'm simply referring to literature (like the intro Ramakrishnan & Gehrke). > I just know that Oracle an Mysql actually do have them too and use it > without those current implementation specific restrictions in > Postgres. Where exactly do y

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-18 Thread Jeff Janes
On Sun, Sep 18, 2011 at 7:59 AM, Stefan Keller wrote: > Merlin and Jeff, > > General remark again:It's hard for me to imagine that btree is > superior for all the issues mentioned before. I still believe in hash > index for primary keys and certain unique constraints where you need > equality sear

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-18 Thread Stefan Keller
I'm simply referring to literature (like the intro Ramakrishnan & Gehrke). I just know that Oracle an Mysql actually do have them too and use it without those current implementation specific restrictions in Postgres. IMHO by design Hash Index (e.g. linear hashing) work best when: 1. only equal (=)

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-18 Thread Kevin Grittner
Stefan Keller wrote: > It's hard for me to imagine that btree is superior for all the > issues mentioned before. It would be great if you could show a benchmark technique which shows otherwise. -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make cha

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-18 Thread Stefan Keller
Merlin and Jeff, General remark again:It's hard for me to imagine that btree is superior for all the issues mentioned before. I still believe in hash index for primary keys and certain unique constraints where you need equality search and don't need ordering or range search. 2011/9/17 Jeff Janes

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-17 Thread Jeff Janes
On Thu, Sep 15, 2011 at 9:20 PM, Merlin Moncure wrote: > > odd: I was pondering Claudio's point about maintenance of hash indexes > vs btree and decided to do some more tests.  Something very strange is > happening:  I decided to compare 'update v set x=x+1', historically > one of postgres's weake

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-17 Thread Merlin Moncure
On Sat, Sep 17, 2011 at 4:48 PM, Jeff Janes wrote: > On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan > wrote: >> On 14 September 2011 00:04, Stefan Keller wrote: >>> Has this been verified on a recent release? I can't believe that hash >>> performs so bad over all these points. Theory tells me

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-17 Thread Jeff Janes
On Wed, Sep 14, 2011 at 4:03 PM, Stefan Keller wrote: > 2011/9/14 Tom Lane writes: >> (...) I think that >> the current state of affairs is still what depesz said, namely that >> there might be cases where they'd be a win to use, except the lack of >> WAL support is a killer.  I imagine somebody

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-17 Thread Jeff Janes
On Tue, Sep 13, 2011 at 5:04 PM, Peter Geoghegan wrote: > On 14 September 2011 00:04, Stefan Keller wrote: >> Has this been verified on a recent release? I can't believe that hash >> performs so bad over all these points. Theory tells me otherwise and >> http://en.wikipedia.org/wiki/Hash_table se

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-16 Thread Tomas Vondra
Dne 15.9.2011 01:40, Tom Lane napsal(a): > Stefan Keller writes: >> 2011/9/14 Tom Lane writes: >>> (...) I think that >>> the current state of affairs is still what depesz said, namely that >>> there might be cases where they'd be a win to use, except the lack of >>> WAL support is a killer. I i

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-16 Thread Stefan Keller
2011/9/16 Tom Lane : > I'm not entirely following this eagerness to junk that AM, anyway. > We've put a lot of sweat into it over the years, in the hopes that > it would eventually be good for something.  It's on the edge of > being good for something now, and there's doubtless room for more > impr

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Merlin Moncure
On Thu, Sep 15, 2011 at 8:00 PM, Merlin Moncure wrote: > On Thu, Sep 15, 2011 at 5:38 PM, Tom Lane wrote: >> Merlin Moncure writes: >>> HM, what if you junked the current hash indexam, and just implemented >>> a wrapper over btree so that the 'hash index' was just short hand for >>> hashing the

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Claudio Freire
On Fri, Sep 16, 2011 at 3:00 AM, Merlin Moncure wrote: > > c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16525 > 09/15/2011  07:46 PM       224,641,024 16525 > > c:\Program Files\PostgreSQL\9.0\data>dir/s | grep 16526 > 09/15/2011  07:49 PM       268,451,840 16526 That's not surprising at all

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Merlin Moncure
On Thu, Sep 15, 2011 at 5:38 PM, Tom Lane wrote: > Merlin Moncure writes: >> HM, what if you junked the current hash indexam, and just implemented >> a wrapper over btree so that the 'hash index' was just short hand for >> hashing the value into a standard index? > > Surely creating such a wrappe

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Claudio Freire
On Fri, Sep 16, 2011 at 12:38 AM, Tom Lane wrote: > I'm not entirely following this eagerness to junk that AM, anyway. > We've put a lot of sweat into it over the years, in the hopes that > it would eventually be good for something.  It's on the edge of > being good for something now, and there's

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Tom Lane
Merlin Moncure writes: > HM, what if you junked the current hash indexam, and just implemented > a wrapper over btree so that the 'hash index' was just short hand for > hashing the value into a standard index? Surely creating such a wrapper would be *more* work than adding WAL support to the hash

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Merlin Moncure
On Thu, Sep 15, 2011 at 3:28 PM, Claudio Freire wrote: > On Thu, Sep 15, 2011 at 5:00 PM, Merlin Moncure wrote: >> >> HM, what if you junked the current hash indexam, and just implemented >> a wrapper over btree so that the 'hash index' was just short hand for >> hashing the value into a standard

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Claudio Freire
On Thu, Sep 15, 2011 at 5:00 PM, Merlin Moncure wrote: > > HM, what if you junked the current hash indexam, and just implemented > a wrapper over btree so that the 'hash index' was just short hand for > hashing the value into a standard index? I'm doing this (only by hand, indexing on hash(blah))

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-15 Thread Merlin Moncure
On Wed, Sep 14, 2011 at 4:03 AM, Heikki Linnakangas wrote: > On 14.09.2011 03:24, Tom Lane wrote: >> >> The big picture though is that we're not going to remove hash indexes, >> even if they're nearly useless in themselves, because hash index >> opclasses provide the foundation for the system's kn

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-14 Thread Tom Lane
Stefan Keller writes: > 2011/9/14 Tom Lane writes: >> (...) I think that >> the current state of affairs is still what depesz said, namely that >> there might be cases where they'd be a win to use, except the lack of >> WAL support is a killer. I imagine somebody will step up and do that >> even

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-14 Thread Stefan Keller
2011/9/14 Tom Lane writes: > (...) I think that > the current state of affairs is still what depesz said, namely that > there might be cases where they'd be a win to use, except the lack of > WAL support is a killer. I imagine somebody will step up and do that > eventually. How much of work (in

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-14 Thread Leonardo Francalanci
>> Hash indexes have been improved since 2005 - their performance was >> improved quite a bit in 9.0. Here's a more recent analysis: > >> http://www.depesz.com/index.php/2010/06/28/should-you-use-hash-index/ > > The big picture though is that we're not going to remove hash indexes, > even if

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-14 Thread Heikki Linnakangas
On 14.09.2011 03:24, Tom Lane wrote: The big picture though is that we're not going to remove hash indexes, even if they're nearly useless in themselves, because hash index opclasses provide the foundation for the system's knowledge of how to do the datatype-specific hashing needed for hash joins

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-14 Thread Heikki Linnakangas
On 14.09.2011 09:39, Stefan Keller wrote: Should I open a ticket? What ticket? With whom? -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscription: http://www.postgre

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Stefan Keller
2011/9/14 Tom Lane : > (...) I think that > the current state of affairs is still what depesz said, namely that > there might be cases where they'd be a win to use, except the lack of > WAL support is a killer. I imagine somebody will step up and do that > eventually. Should I open a ticket? Ste

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Tom Lane
Peter Geoghegan writes: > On 14 September 2011 00:04, Stefan Keller wrote: >> Has this been verified on a recent release? I can't believe that hash >> performs so bad over all these points. Theory tells me otherwise and >> http://en.wikipedia.org/wiki/Hash_table seems to be a success. > Hash ind

Re: [PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Peter Geoghegan
On 14 September 2011 00:04, Stefan Keller wrote: > Has this been verified on a recent release? I can't believe that hash > performs so bad over all these points. Theory tells me otherwise and > http://en.wikipedia.org/wiki/Hash_table seems to be a success. Hash indexes have been improved since 20

[PERFORM] Hash index use presently(?) discouraged since 2005: revive or bury it?

2011-09-13 Thread Stefan Keller
The doc at http://www.postgresql.org/docs/current/interactive/indexes-types.html says: "Caution: Hash index operations are not presently WAL-logged, so hash indexes might need to be rebuilt with REINDEX after a database crash. They are also not replicated over streaming or file-based replication. F

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Cédric Villemain
2011/5/31 Robert Haas : > On Thu, May 26, 2011 at 8:33 AM, panam wrote: >> Any third party confirmation? > > Yeah, it definitely looks like there is some kind of bug here.  Or if > not a bug, then a very surprising feature.  EXPLAIN ANALYZE outputs > from your proposed test attached.  Here's a uni

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-31 Thread Robert Haas
On Thu, May 26, 2011 at 8:33 AM, panam wrote: > Any third party confirmation? Yeah, it definitely looks like there is some kind of bug here. Or if not a bug, then a very surprising feature. EXPLAIN ANALYZE outputs from your proposed test attached. Here's a unified diff of the two outputs:

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Craig Ringer
On 05/27/2011 02:13 AM, Cédric Villemain wrote: I am not an hibernate expert, but I'll surprised if you can not drive hibernate to do what you want. If nothing else, you can do a native query in hand-written SQL through Hibernate. ORMs are useful tools for some jobs, but it's good to be able

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam : > Hi all, > > > Cédric Villemain-3 wrote: >> >> without explaining further why the antijoin has bad performance >> without cluster, I wonder why you don't use this query : >> >> SELECT  b.id, >>                   max(m.id) >> FROM box b, message m >> WHERE m.box_id = b.id >> GROUP

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Sorry, SELECT MAX(e.id) FROM event_message e WHERE e.box_id = id as posted previously should actually read SELECT max(m1.id) FROM message m1 WHERE m1.box_id = b.id) so I tried this already. Regards, panam -- View this message in context: http://postgresql.1045698.n5.nabble.com/Hash-Anti-J

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
panam wrote: > I cannot use it because of the way that query is generated > (by hibernate). > > The (simplyfied) base query is just > > SELECT b.id from box > > the subquery > > (SELECT m1.id FROM message m1 >LEFT JOIN message m2 > ON (m1.box_id = m2.box_id AND m1.id < m2.id )

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Hi all, Cédric Villemain-3 wrote: > > without explaining further why the antijoin has bad performance > without cluster, I wonder why you don't use this query : > > SELECT b.id, > max(m.id) > FROM box b, message m > WHERE m.box_id = b.id > GROUP BY b.id; > > looks similar an

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Kevin Grittner
Cédric Villemain wrote: > 2011/5/26 panam : >> "max_connections";"100" >> "work_mem";"1GB" Each connection can allocate work_mem, potentially several times. On a machines without hundreds of GB of RAM, that pair of settings could cause severe swapping. >> "Patholgical" query: >> >> select

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread Cédric Villemain
2011/5/26 panam : > Hi there, > > > Kevin Grittner wrote: >> >>> Is there a way to determine the values actually used? >> The pg_settings view.  Try the query shown here: >> http://wiki.postgresql.org/wiki/Server_Configuration >> > Thanks Kevin, very usful. Here is the output: > > "version";"Postgr

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-26 Thread panam
Hi there, Kevin Grittner wrote: > >> Is there a way to determine the values actually used? > The pg_settings view. Try the query shown here: > http://wiki.postgresql.org/wiki/Server_Configuration > Thanks Kevin, very usful. Here is the output: "version";"PostgreSQL 9.0.4, compiled by Visual C

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread Craig Ringer
On 05/26/2011 12:42 AM, panam wrote: So, would you like to further investigate my previous issue (I think it is still strange that performance suddenly dropped that dramatically)? It's a bit beyond me, but I suspect that it'd be best if you could hang onto the dump file in case someone has th

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread Kevin Grittner
panam wrote: > Is there a way to determine the values actually used? The pg_settings view. Try the query shown here: http://wiki.postgresql.org/wiki/Server_Configuration -Kevin -- Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org) To make changes to your subscrip

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-25 Thread panam
Hi all, @Tom, > BTW, this query doesn't actually match the EXPLAIN outputs... You're right, it is actually just the "heavy" subquery of a larger query which can be found here: http://pastebin.com/fuGrt0tB > One other thing I'm not following is how come it's using hash temp files > at all, when y

Re: [PERFORM] Hash Anti Join performance degradation

2011-05-24 Thread Craig Ringer
On 24/05/11 22:34, panam wrote: >> The usual cause is that the statistics for estimated row counts cross a >> threshold that makes the query planner think that a different kind of >> plan will be faster. > > Hm, as far as i understand the plans, they are equivalent, aren't they? Yes, they are,

  1   2   >