Re: [PERFORM] Do I have a hardware or a software problem?

2012-12-13 Thread Jeff Janes
On Wed, Dec 12, 2012 at 8:46 AM, Niels Kristian Schjødt wrote: > > On 11/12/2012 at 18:25, Jeff Janes wrote: > >> On Tue, Dec 11, 2012 at 2:04 AM, Niels Kristian Schjødt >> wrote: >> >>> Maybe I should mention that I never see more than 5GB out of my total >>> 32GB being in use on the ser

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Kevin Grittner
Huan Ruan wrote: > Hash 1st run > "Hash Join (cost=1681.87..6414169.04 rows=48261 width=171) > (actual time=2182.450..88158.645 rows=48257 loops=1)" > " -> Seq Scan on invtran bigtable (cost=0.00..4730787.28 > rows=168121728 width=108) (actual time=0.051..32581.052 > rows=168121657 loops=1)" 19
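
For comparison's sake, a minimal sketch of one way to obtain the nested loop variant of the same query in a session. This is an assumption about method, since the post does not show how the second plan was produced, and the small table and join column names below are placeholders:

    -- Disable hash joins for this session only, then re-run the query.
    SET enable_hashjoin = off;
    EXPLAIN ANALYZE
    SELECT *
    FROM invtran bigtable
    JOIN small_table st ON st.tranref = bigtable.tranref;  -- placeholder join
    RESET enable_hashjoin;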

[PERFORM] problem with large inserts

2012-12-13 Thread Lutz Fischer
Hi, I currently have some trouble with inserts into a table: INSERT INTO LPP (PPID, LID) SELECT DISTINCT PPid, LID FROM (SELECT * FROM PP WHERE s_id = sid) pp INNER JOIN has_protein hp1 ON pp.p1id = hp1.pid INNER JOIN has_protein hp2

Re: [PERFORM] Limit & offset effect on query plans

2012-12-13 Thread Kevin Grittner
Pavan Deolasee wrote: > Amitabh Kant wrote: >> Our scripts automatically add "LIMIT ALL" & "OFFSET 0" to every >> select query if no values are passed on for these parameters. I >> remember reading through the mailing list that it's better not >> to pass them if they are not needed as they add a
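
The two query shapes under discussion, written out against a hypothetical table t (LIMIT ALL and OFFSET 0 are real no-op clauses; the question is only what they cost):

    -- What the scripts generate when no paging values are supplied:
    SELECT * FROM t LIMIT ALL OFFSET 0;

    -- The same query with the clauses simply omitted:
    SELECT * FROM t;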

Re: [PERFORM] problem with large inserts

2012-12-13 Thread Willem Leenen
I would strongly discourage you from dropping the referential integrity. You risk data corruption, which will cost you a good deal of time to sort out properly, and corruption prevents you from applying the R.I. again. Also, it has hardly any performance impact. Are the plans different? (I gue

Re: [PERFORM] problem with large inserts

2012-12-13 Thread Filip Rembiałkowski
Just an idea - how long does it take to run _only_ CREATE TEMP TABLE foo AS On Thu, Dec 13, 2012 at 4:37 PM, Lutz Fischer wrote: > Hi > > I have currently some trouble with inserts into a table > > INSERT INTO LPP (PPID, LID) > SELECT DISTINCT PPid, LID FROM > (SELECT * FROM PP WHERE

Re: [PERFORM] problem with large inserts

2012-12-13 Thread Tom Lane
Lutz Fischer writes: > I currently have some trouble with inserts into a table > If I run only [ the select part ] > it returns 200620 rows in 170649 ms (that's just under 3 minutes). I > stopped the actual insert after about 8h. It should not take 8h to insert 200k rows on any machine made thi

Re: [PERFORM] Limit & offset effect on query plans

2012-12-13 Thread Tom Lane
"Kevin Grittner" writes: > Pavan Deolasee wrote: >> I would tend to think that is the latter. While undoubtedly >> limit/offset clause will add another node during query planning >> and execution, AFAICS the OFFSET 0 and LIMIT ALL cases are >> optimized to a good extent. So the overhead of having

Re: [PERFORM] problem with large inserts

2012-12-13 Thread Lutz Fischer
Thanks a lot, you saved my day. CREATE TEMP TABLE foo AS SELECT DISTINCT ... took a mere 77464.744 ms, and an additional INSERT INTO LPP SELECT * FROM foo; just 576.909 ms. I don't really understand why it works via a temp table but not directly (or in any reasonable amount of time) - but at
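
Written out, the two-step pattern that worked, reassembled as a sketch from the original post at the top of this thread (the hp2 join condition was truncated in the archive, so p2id is a guess, and sid is a parameter from the post):

    -- Step 1: materialize the expensive DISTINCT into a temp table
    -- (reported above at roughly 77 seconds).
    CREATE TEMP TABLE foo AS
    SELECT DISTINCT PPid, LID
    FROM (SELECT * FROM PP WHERE s_id = sid) pp
    INNER JOIN has_protein hp1 ON pp.p1id = hp1.pid
    INNER JOIN has_protein hp2 ON pp.p2id = hp2.pid;  -- guessed condition

    -- Step 2: the plain insert from the temp table (reported at ~0.6 s).
    INSERT INTO LPP (PPID, LID)
    SELECT * FROM foo;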

Re: [PERFORM] problem with large inserts

2012-12-13 Thread Filip Rembiałkowski
Hmm, so it is some kind of file / table locking issue, not a general IO system malfunction. It would be interesting and useful to run this use case on another postgres instance (or several instances), including non-Windows ones. OTOH Pg on Windows housekeeping was always "fun" - I advise all my clien

Re: [PERFORM] Limit & offset effect on query plans

2012-12-13 Thread Kevin Grittner
Tom Lane wrote: > "Kevin Grittner" writes: >> I ran some quick tests on my i7 under Linux. Plan time >> increased by about 40 microseconds (based on EXPLAIN runtime), >> and a limit node was added to the plan. Execution time on a SELECT * >> FROM tenk1 in the regression database went up by 1.35 ms
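
A sketch of the measurement being described, against the regression database's tenk1 table (exactly how the timings were taken is not shown; EXPLAIN ANALYZE's "Total runtime" is one of the two numbers reported in the next message):

    -- With the no-op clauses the plan gains a Limit node on top:
    EXPLAIN ANALYZE SELECT * FROM tenk1 LIMIT ALL OFFSET 0;

    -- The baseline, for comparison:
    EXPLAIN ANALYZE SELECT * FROM tenk1;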

Re: [PERFORM] Limit & offset effect on query plans

2012-12-13 Thread Tom Lane
"Kevin Grittner" writes: > Tom Lane wrote: >> 1.35ms out of what? > Without the limit node the runtimes (after "priming" the cache) > were: > 1.805, 2.533 > 1.805, 2.495 > 1.800, 2.446 > 1.818, 2.470 > 1.804, 2.502 > The first time for each run is "Total runtime" reported by EXPLAIN, > the seco

Re: [PERFORM] Limit & offset effect on query plans

2012-12-13 Thread Kevin Grittner
Tom Lane wrote: > Huh, so on a percentage basis the Limit-node overhead is actually > pretty significant, at least for a trivial seqscan plan like this > case. (This is probably about the worst-case scenario, really, > since it's tough to beat a simple seqscan for cost-per-emitted-row. > Also I g

Re: [PERFORM] Why is PostgreSQL 9.2 slower than 9.1 in my tests?

2012-12-13 Thread Patryk Sidzina
On Mon, Dec 10, 2012 at 4:53 AM, Jeff Janes wrote: > On Wed, Dec 5, 2012 at 4:09 AM, Patryk Sidzina > wrote: > > > > CREATE TEMP TABLE test_table_md_speed(id serial primary key, n integer); > > > > CREATE OR REPLACE FUNCTION TEST_DB_SPEED(cnt integer) RETURNS text AS $$ > > DECLARE > > time_sta
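
The benchmark function is truncated in the archive; a minimal sketch of what such a timing function plausibly looks like, assuming it simply inserts cnt rows and reports the elapsed time (everything past "time_sta" below is a guess):

    CREATE TEMP TABLE test_table_md_speed(id serial primary key, n integer);

    CREATE OR REPLACE FUNCTION TEST_DB_SPEED(cnt integer) RETURNS text AS $$
    DECLARE
        time_start timestamp;  -- the archive preview cuts off at "time_sta"
        time_stop  timestamp;
    BEGIN
        time_start := clock_timestamp();
        FOR i IN 1..cnt LOOP
            INSERT INTO test_table_md_speed(n) VALUES (i);
        END LOOP;
        time_stop := clock_timestamp();
        RETURN 'inserted ' || cnt || ' rows in ' || (time_stop - time_start);
    END;
    $$ LANGUAGE plpgsql;

    SELECT TEST_DB_SPEED(100000);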

Re: [PERFORM] encouraging index-only scans

2012-12-13 Thread Tom Lane
Andrew Dunstan writes: > A client is testing a migration from 9.1 to 9.2, and has found that a > large number of queries run much faster if they use index-only scans. > However, the only way he has found to get such a plan is by increasing > the seq_page_cost to insanely high levels (3.5). Is t
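
For reference, a sketch of the gentler knobs usually tried before pushing seq_page_cost that high (table and column names are placeholders; whether the planner actually switches also depends on how much of the table the visibility map covers):

    -- Index-only scans need a well-populated visibility map:
    VACUUM (ANALYZE) some_table;

    -- Lowering random_page_cost, rather than raising seq_page_cost,
    -- is the more conventional way to bias toward index access:
    SET random_page_cost = 1.5;

    EXPLAIN SELECT indexed_col FROM some_table WHERE indexed_col < 100;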

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
On 13 December 2012 03:28, Jeff Janes wrote: > > This looks like the same large-index over-penalty as discussed in the > recent thread "[PERFORM] Slow query: bitmap scan troubles". > > Back-patching the log(npages) change is starting to look like a good idea. > > Cheers, > > Jeff Thanks for the

[PERFORM] How do I track stats on foreign table access through foreign data wrapper?

2012-12-13 Thread David Crawford
Hi, I'm using a foreign data wrapper to access mongodb and I'm looking for a way to monitor query stats against foreign tables. It looks like the common methods have limited support for foreign tables at this time. pg_stat_statements collects the query, total time, and rows returned, which is us
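
A minimal sketch of pulling those numbers out of pg_stat_statements for queries that touch a foreign table (assumes the extension is installed and preloaded; mongo_ft is a placeholder name, and the filter is a crude text match):

    SELECT query, calls, total_time, rows
    FROM pg_stat_statements
    WHERE query ILIKE '%mongo_ft%'  -- placeholder foreign table name
    ORDER BY total_time DESC
    LIMIT 20;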

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
Hi Kevin, On 13 December 2012 10:47, Kevin Grittner wrote: > Huan Ruan wrote: > > > is a lot slower than a nested loop join. > > Giving actual numbers is more useful than terms like "a lot". Even > better is to provide the output of EXPLAIN ANALYZE rather than > just EXPLAIN. This shows estimate

[PERFORM] Slow queries after vacuum analyze

2012-12-13 Thread Ghislain ROUVIGNAC
Hello, I have a customer that experiences a strange behaviour related to statistics. There is a vacuum analyze planned during the night. In the morning, 1 day out of 2, there are some extremely slow queries. Those queries last more than 5 minutes (I never waited longer and cancelled them) whereas when
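
One way to see what the nightly ANALYZE changes, sketched against a placeholder table name: snapshot the planner statistics on a good day and a bad day and compare.

    -- pg_stats shows the per-column statistics the planner works from.
    SELECT attname, n_distinct, null_frac, correlation
    FROM pg_stats
    WHERE tablename = 'big_table'  -- placeholder
    ORDER BY attname;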

Re: [PERFORM] Slow queries after vacuum analyze

2012-12-13 Thread Kevin Grittner
Ghislain ROUVIGNAC wrote: > There is a vacuum analyze planned during the night. > In the morning, 1 day out of 2, there are some extremely slow > queries. Those queries last more than 5 minutes (I never waited > longer and cancelled them) whereas when everything is OK they last > less than 300ms. > >

Re: [PERFORM] Occasional timeouts on TRUNCATE and simple INSERTs

2012-12-13 Thread Osborn, Jeff
You all were right. The time-outs for TRUNCATE were due to a rogue pg_dump. And the issue with the inserts was due to an unrelated code change. Thanks for your help! --Jeff O On Dec 11, 2012, at 5:34 PM, Osborn, Jeff wrote: > Yeah I've been running a cron pulling relevant info from pg_stat
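
For reference, a sketch of the kind of query that exposes such a blocker while the TRUNCATE is waiting (9.2 column names; on 9.1 and earlier, pg_stat_activity uses procpid and current_query instead):

    SELECT waiter.pid AS waiting_pid,
           holder.pid AS blocking_pid,
           act.query  AS blocking_query
    FROM pg_locks waiter
    JOIN pg_locks holder
      ON holder.relation = waiter.relation
     AND holder.granted
     AND NOT waiter.granted
     AND holder.pid <> waiter.pid
    JOIN pg_stat_activity act ON act.pid = holder.pid;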

[PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread AI Rumman
Why is the number of rows different in actual and estimated? The default_statistics_target is set to 100. explain analyze select * FROM ( SELECT entity.id AS "con_s_id", entity.setype AS "con_s_setype" , con_details.salutation AS "con_s_salutationtype", con_details.firstname AS "con_s_first
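
When estimates drift that far from actuals, the usual first step is a higher statistics target on the skewed columns followed by a fresh ANALYZE; a sketch using names from the query above (which column is actually skewed is not shown, and the target value is illustrative):

    ALTER TABLE con_details ALTER COLUMN firstname SET STATISTICS 500;
    ANALYZE con_details;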

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Andrew Dunstan
On 12/13/2012 05:12 PM, AI Rumman wrote: Why is the number of rows different in actual and estimated? Isn't that in the nature of estimates? An estimate is a heuristic guess at the number of rows it will find for the given query or part of a query. It's not uncommon for estimates to

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Evgeny Shishkin
On Dec 14, 2012, at 2:36 AM, Andrew Dunstan wrote: > > On 12/13/2012 05:12 PM, AI Rumman wrote: >> Why is the number of rows different in actual and estimated? >> > > > Isn't that in the nature of estimates? An estimate is a heuristic guess at > the number of rows it will find for the

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Claudio Freire
On Thu, Dec 13, 2012 at 7:36 PM, Andrew Dunstan wrote: > On 12/13/2012 05:12 PM, AI Rumman wrote: >> >> Why is the number of rows different in actual and estimated? >> > > > Isn't that in the nature of estimates? An estimate is a heuristic guess at > the number of rows it will find for the g

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Andrew Dunstan
On 12/13/2012 05:42 PM, Claudio Freire wrote: And it looks like it all may be starting to go south here: -> Hash Join (cost=9337.97..18115.71 rows=34489 width=244) (actual time=418.054..1156.453 rows=205420 loops=1)

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Evgeny Shishkin
On Dec 14, 2012, at 3:09 AM, Andrew Dunstan wrote: > > On 12/13/2012 05:42 PM, Claudio Freire wrote: >> And it looks like it all may be starting to go south here: >>>-> Hash Join >>> (cost=9337.97..18115.71 rows=34489 width=244) (actual >>> time=418.0

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Tom Lane
Evgeny Shishkin writes: > On Dec 14, 2012, at 3:09 AM, Andrew Dunstan wrote: >> Well, it looks like it's choosing a join order that's quite a bit different >> from the way the query is expressed, so the OP might need to play around >> with forcing the join order some. > OP joins 8 tables, and

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Evgeny Shishkin
On Dec 14, 2012, at 3:36 AM, Tom Lane wrote: > Evgeny Shishkin writes: >> On Dec 14, 2012, at 3:09 AM, Andrew Dunstan wrote: >>> Well, it looks like it's choosing a join order that's quite a bit different >>> from the way the query is expressed, so the OP might need to play around >>> with f

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
Hi Kevin Again, many thanks for your time and help. On 14 December 2012 02:26, Kevin Grittner wrote: > Huan Ruan wrote: > > > Hash 1st run > > > "Hash Join (cost=1681.87..6414169.04 rows=48261 width=171) > > (actual time=2182.450..88158.645 rows=48257 loops=1)" > > > " -> Seq Scan on invtran bi

Re: [PERFORM] Why is the number of rows different in actual and estimated?

2012-12-13 Thread Claudio Freire
On Thu, Dec 13, 2012 at 8:50 PM, Evgeny Shishkin wrote: >>> OP joins 8 tables, and I suppose join_collapse_limit is set to the default 8. I >>> thought postgresql's optimiser is not mysql's. >> >> It's not obvious to me that there's anything very wrong with the plan. >> An 8-way join that produces 15
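
The collapse limits mentioned here are plain GUCs, so the usual experiment is cheap (values illustrative; the default for both is 8, which is exactly the number of tables joined):

    SET join_collapse_limit = 10;
    SET from_collapse_limit = 10;
    -- then re-run EXPLAIN ANALYZE on the original query to see
    -- whether the planner picks a different join order.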

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Kevin Grittner
Huan Ruan wrote: > Interesting to see how you derived 100% cache hits. I assume by 'cache' you > mean the pg shared buffer plus the OS cache? Because the table is 23GB but > the shared buffer is only 6GB. Even then, I'm not completely convinced > because the total RAM is just 24GB, part of which w
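
The PostgreSQL-side half of that question can be checked directly; a sketch against the thread's big table (with the caveat that reads served by the OS cache still count as heap_blks_read here, so this only bounds the answer):

    SELECT heap_blks_hit,
           heap_blks_read,
           round(heap_blks_hit::numeric
                 / nullif(heap_blks_hit + heap_blks_read, 0), 4) AS hit_ratio
    FROM pg_statio_user_tables
    WHERE relname = 'invtran';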

Re: [PERFORM] hash join vs nested loop join

2012-12-13 Thread Huan Ruan
> > With a low cache hit rate, that would generally be when the number > of lookups into the table exceeds about 10% of the table's rows. > > > So far, my main performance issue comes down to this pattern where Postgres chooses a hash join that's slower than a nested loop indexed join. By changing thos
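
The settings typically changed in that experiment, sketched with illustrative values (the message is truncated, so which parameters were actually adjusted is not shown):

    -- Make index probes look cheaper relative to sequential I/O,
    -- and tell the planner how much of the machine's RAM is likely cache:
    SET random_page_cost = 2.0;
    SET effective_cache_size = '16GB';  -- illustrative for a 24GB box
    SET cpu_tuple_cost = 0.05;          -- raises the per-row cost, penalizing
                                        -- plans that process many rows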