Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses
Greg Stark wrote: > On Thu, Jul 16, 2009 at 9:07 PM, Roman Kononov wrote: >> test=# create table junk(i int); >> CREATE TABLE >> test=# select * from junk left outer join (select coalesce(i,1) as x, >> coalesce(i,2) as y from junk) t on coalesce(i,3)=x and coalesce(i,4)=y and >> coalesce(i,5)=x; >> ERROR: too few pathkeys for mergeclauses > > Thanks for the bug report. That's definitely not supposed to be > happening. It's always nice when it's easy to reproduce the problem > like this. Yep. This can be further reduced into this: CREATE TABLE a (i integer); CREATE TABLE b (x integer, y integer); select * from a left outer join b on i=x and i=y and i=x; The planner is choosing a merge join, where the outer side (table a) is sorted by (i), and the inner side is sorted by (x, y). But that doesn't work with the merge condition (i=x AND i=y AND i=x). Version 8.3 has the same bug, apparently introduced along with the equivalence classes. In 8.2, the merge condition is reduced into (i=x AND i=y), IOW the planner eliminates the duplicate condition. I believe 8.2 would otherwise have the same problem as well. I can see two different things that you could say is at fault here: 1. We no longer eliminate the duplicate condition, but the find_mergeclauses_for_pathkeys() + make_inner_pathkeys_for_merge() combination relies on there being no duplicates. We should try harder to eliminate duplicates in left join clauses. 2. make_inner_pathkeys_for_merge() should have created sort order (x, y, x) for the inner side. The first solution is what we probably want, to avoid unnecessary work at execution time. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses
Heikki Linnakangas wrote: > 2. make_inner_pathkeys_for_merge() should have created sort order (x, y, > x) for the inner side. On further thought, that would make no sense. Sort order (x, y) is always equivalent to (x, y, x). -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] Huge speed penalty using <>TRUE instead of =FALSE
One of our customers discovered that by replacing <>TRUE with =FALSE in a query of a table containing 750.000 records reduced the query time from about 12 seconds to about 60 milliseconds! The problematic query looks like this: SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND Alarm_status='X' ORDER BY ID DESC If it is changed to this it works as expected: SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND Alarm_status='X' ORDER BY ID DESC After investigation (on a smaller dataset on my own database) I found that the query was resulting in a sequential scan: "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND Alarm_status='X' ORDER BY ID DESC" "Sort (cost=49936.96..49936.96 rows=1 width=405) (actual time=837.793..837.793 rows=0 loops=1)" " Sort Key: id" " Sort Method: quicksort Memory: 17kB" " -> Seq Scan on alarmlogg (cost=0.00..49936.95 rows=1 width=405) (actual time=837.782..837.782 rows=0 loops=1)" "Filter: ((logg_avsluttet <> true) AND ((alarm_status)::text = 'X'::text))" "Total runtime: 837.896 ms" The modified query gave this result: "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND Alarm_status='X' ORDER BY ID DESC" "Sort (cost=8.36..8.37 rows=1 width=405) (actual time=0.032..0.032 rows=0 loops=1)" " Sort Key: id" " Sort Method: quicksort Memory: 17kB" " -> Index Scan using i_alarmlogg_logg_avsluttet on alarmlogg (cost=0.00..8.35 rows=1 width=405) (actual time=0.024..0.024 rows=0 loops=1)" "Index Cond: (logg_avsluttet = false)" "Filter: ((NOT logg_avsluttet) AND ((alarm_status)::text = 'X'::text))" "Total runtime: 0.123 ms" This is a dramatical difference, but I cannot understand why. In my head "<>TRUE" should behave exactly the same as "=FALSE". This looks like a bug to me, or am I overlooking something? This was verified on PostgreSQL 8.3.7, both on Windows Xp and Ubuntu 8.10. Some relevant details from the table definition: CREATE TABLE alarmlogg ( id serial NOT NULL, alarm_status character varying(1) DEFAULT ''::character varying, logg_avsluttet boolean DEFAULT false, ... CONSTRAINT alarmlogg_pkey PRIMARY KEY (id) ) CREATE INDEX i_alarmlogg_alarm_status ON alarmlogg USING btree (alarm_status); CREATE INDEX i_alarmlogg_logg_avsluttet ON alarmlogg USING btree (logg_avsluttet); Regards, Jan-Ivar Mellingen Securinet AS -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] bug or simply not enough stack space?
Ok, So this took a while, but here's your test case. Turns out to be quite small actually ;) create table t1 (id int); CREATE FUNCTION tr_t1_after_iud() RETURNS trigger LANGUAGE 'plpgsql' VOLATILE STRICT SECURITY INVOKER AS 'DECLARE BEGIN RAISE NOTICE ''%'', ROW(NEW.*); SELECT 1/0; RETURN NEW; END;'; CREATE TRIGGER t1_after AFTER INSERT OR UPDATE OR DELETE ON t1 FOR EACH ROW EXECUTE PROCEDURE tr_t1_after_iud(); begin; savepoint s1; INSERT INTO t1 values (1); => this will result in the following: db=# INSERT INTO t1 values (1); NOTICE: (1) WARNING: AbortSubTransaction while in ABORT state WARNING: did not find subXID 77063 in MyProc ERROR: division by zero CONTEXT: SQL statement "SELECT 1/0" PL/pgSQL function "tr_t1_after_iud" line 4 at SQL statement ERROR: tupdesc reference 0x7ffe74f24ad0 is not owned by resource owner SubTransaction => mind the fact that the savepoint is 'needed', without it there will be no problem => in my reallife example, this resulted in a "PANIC: ERRORDATA_STACK_SIZE exceeded", I cannot reproduce that, but as you stated earlier, this might just be collateral damage, which I'll be able to easily confirm once the problem above has been fixed. => cleanup: rollback; drop table t1; drop function tr_t1_after_iud(); => to avoid possible confusion db=# select version(); version --- PostgreSQL 8.4.0 on x86_64-unknown-linux-gnu, compiled by GCC gcc (GCC) 4.2.4, 64-bit Looking forward to your reply. -- Best, Frank. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
On Friday 17 July 2009 11:12:41 Jan-Ivar Mellingen wrote: > One of our customers discovered that by replacing <>TRUE with =FALSE in > a query of a table containing 750.000 records reduced the query time > from about 12 seconds to about 60 milliseconds! > This is a dramatical difference, but I cannot understand why. In my head > "<>TRUE" should behave exactly the same as "=FALSE". This looks like a > bug to me, or am I overlooking something? The planner just isn't that smart. The boolean type is a special case where <> some_value implies = some_other_value, but this doesn't generalize well to other data types. And the planner doesn't have a whole lot of data type specific knowledge. I think a better index definition might actually be on alarm_status, with a partial index predicate on logg_avsluttet = false. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
[BUGS] BUG #4927: psql does "spoil" the query before sending it to server
The following bug has been logged online: Bug reference: 4927 Logged by: handling numeric literals with dots in psql \copy command Email address: filip.rembialkow...@gmail.com PostgreSQL version: 8.4.0 Operating system: Linux Description:psql does "spoil" the query before sending it to server Details: 8.4.0: fi...@filip=# \copy ( select 1.23::numeric as num ) to 'out.csv' with csv header ERROR: syntax error at or near "." LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER ^ \copy: ERROR: syntax error at or near "." LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER ^ query inside parentheses is OK. same bug reproduced in 8.3.7 -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
It might be that your column may be NULL as well as TRUE or FALSE. I am no expert in this matter though. /M On Fri, Jul 17, 2009 at 10:12 AM, Jan-Ivar Mellingen wrote: > One of our customers discovered that by replacing <>TRUE with =FALSE in > a query of a table containing 750.000 records reduced the query time > from about 12 seconds to about 60 milliseconds! > > The problematic query looks like this: > SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND > Alarm_status='X' ORDER BY ID DESC > > If it is changed to this it works as expected: > SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND > Alarm_status='X' ORDER BY ID DESC > > After investigation (on a smaller dataset on my own database) I found > that the query was resulting in a sequential scan: > > "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE > AND Alarm_status='X' ORDER BY ID DESC" > "Sort (cost=49936.96..49936.96 rows=1 width=405) (actual > time=837.793..837.793 rows=0 loops=1)" > " Sort Key: id" > " Sort Method: quicksort Memory: 17kB" > " -> Seq Scan on alarmlogg (cost=0.00..49936.95 rows=1 width=405) > (actual time=837.782..837.782 rows=0 loops=1)" > " Filter: ((logg_avsluttet <> true) AND ((alarm_status)::text = > 'X'::text))" > "Total runtime: 837.896 ms" > > The modified query gave this result: > "explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE > AND Alarm_status='X' ORDER BY ID DESC" > "Sort (cost=8.36..8.37 rows=1 width=405) (actual time=0.032..0.032 > rows=0 loops=1)" > " Sort Key: id" > " Sort Method: quicksort Memory: 17kB" > " -> Index Scan using i_alarmlogg_logg_avsluttet on alarmlogg > (cost=0.00..8.35 rows=1 width=405) (actual time=0.024..0.024 rows=0 > loops=1)" > " Index Cond: (logg_avsluttet = false)" > " Filter: ((NOT logg_avsluttet) AND ((alarm_status)::text = > 'X'::text))" > "Total runtime: 0.123 ms" > > This is a dramatical difference, but I cannot understand why. In my head > "<>TRUE" should behave exactly the same as "=FALSE". This looks like a > bug to me, or am I overlooking something? > > This was verified on PostgreSQL 8.3.7, both on Windows Xp and Ubuntu 8.10. > > Some relevant details from the table definition: > CREATE TABLE alarmlogg > ( > id serial NOT NULL, > alarm_status character varying(1) DEFAULT ''::character varying, > logg_avsluttet boolean DEFAULT false, > ... > CONSTRAINT alarmlogg_pkey PRIMARY KEY (id) > ) > > CREATE INDEX i_alarmlogg_alarm_status > ON alarmlogg > USING btree > (alarm_status); > > CREATE INDEX i_alarmlogg_logg_avsluttet > ON alarmlogg > USING btree > (logg_avsluttet); > > Regards, > Jan-Ivar Mellingen > Securinet AS > > > -- > Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-bugs > -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
On Friday 17 July 2009 12:45:47 Mikael Krantz wrote: > It might be that your column may be NULL as well as TRUE or FALSE. I > am no expert in this matter though. Nulls also need to be considered when attempting to substitute purportedly equivalent clauses. But in this case it wouldn't actually matter, because WHERE foo <> TRUE and WHERE foo = false would both omit the row if foo is null. Both expressions only return true if foo has the value "false". But again, this is data type specific knowledge. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
Jan-Ivar Mellingen writes: > One of our customers discovered that by replacing <>TRUE with =FALSE in > a query of a table containing 750.000 records reduced the query time > from about 12 seconds to about 60 milliseconds! This is not a bug. The set of operators that are indexable is well documented, and <> is not one of them. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
Peter Eisentraut writes: > ... But again, this is data type specific knowledge. Actually, now that I think about it, the planner already has datatype-specific knowledge about boolean equality (see simplify_boolean_equality). It would take just a few more lines of code there to recognize "x <> true" and "x <> false" as additional variant spellings of the generic "x" or "NOT x" constructs. Not sure if it's worth the trouble though; how many people really write such things? If you really wanted to take it to extremes, you could also reduce cases like "x > false", but that's starting to get a bit silly. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] Huge speed penalty using <>TRUE instead of =FALSE
I don't think it even has to be so specific. We should just always rewrite bool <> bool into bool = NOT bool. Hmm. That only has a 50/50 chance of creating an indexable clause. Perhaps we could even rewrite it as "a = NOT b AND NOT a = b". -- Greg On 2009-07-17, at 3:21 PM, Tom Lane wrote: Peter Eisentraut writes: ... But again, this is data type specific knowledge. Actually, now that I think about it, the planner already has datatype-specific knowledge about boolean equality (see simplify_boolean_equality). It would take just a few more lines of code there to recognize "x <> true" and "x <> false" as additional variant spellings of the generic "x" or "NOT x" constructs. Not sure if it's worth the trouble though; how many people really write such things? If you really wanted to take it to extremes, you could also reduce cases like "x > false", but that's starting to get a bit silly. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] bug or simply not enough stack space?
Frank van Vugt writes: > So this took a while, but here's your test case. > Turns out to be quite small actually ;) Hmm ... the relevant code change seems to have been http://archives.postgresql.org/pgsql-committers/2009-04/msg00127.php I think I might have been overenthusiastic in trying to free resources during a subtransaction abort. Try this patch and see if you notice any bad side-effects. regards, tom lane Index: src/pl/plpgsql/src/pl_exec.c === RCS file: /cvsroot/pgsql/src/pl/plpgsql/src/pl_exec.c,v retrieving revision 1.244 diff -c -r1.244 pl_exec.c *** src/pl/plpgsql/src/pl_exec.c17 Jun 2009 13:46:12 - 1.244 --- src/pl/plpgsql/src/pl_exec.c17 Jul 2009 16:12:22 - *** *** 5292,5298 { SimpleEcontextStackEntry *next; ! FreeExprContext(simple_econtext_stack->stack_econtext); next = simple_econtext_stack->next; pfree(simple_econtext_stack); simple_econtext_stack = next; --- 5292,5299 { SimpleEcontextStackEntry *next; ! if (event == SUBXACT_EVENT_COMMIT_SUB) ! FreeExprContext(simple_econtext_stack->stack_econtext); next = simple_econtext_stack->next; pfree(simple_econtext_stack); simple_econtext_stack = next; -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses
Heikki Linnakangas writes: > Version 8.3 has the same bug, apparently introduced along with the > equivalence classes. In 8.2, the merge condition is reduced into (i=x > AND i=y), IOW the planner eliminates the duplicate condition. I believe > 8.2 would otherwise have the same problem as well. The fact that 8.2 eliminates the redundant condition is more or less accidental, I think. It's using equal() to detect duplicate RestrictInfos coming up from the two input relations for the join, while later versions rely on pointer equality for that. You can fool 8.2 by commuting the duplicate condition, but it still doesn't fail: regression=# explain select * from a left outer join b on i=x and i=y and x=i; QUERY PLAN - Merge Left Join (cost=285.12..325.93 rows=2140 width=12) Merge Cond: ((a.i = b.x) AND (a.i = b.y) AND (a.i = b.x)) -> Sort (cost=149.78..155.13 rows=2140 width=4) Sort Key: a.i -> Seq Scan on a (cost=0.00..31.40 rows=2140 width=4) -> Sort (cost=135.34..140.19 rows=1940 width=8) Sort Key: b.x, b.y -> Seq Scan on b (cost=0.00..29.40 rows=1940 width=8) (8 rows) I think what this case may show is simply that the consistency checking I added to create_mergejoin_plan in 8.3 is too strict. Not quite convinced yet though. Another possible solution for this particular case is to allow the equivclass code to deduce x=y as an equivalence class, that is the plan should enforce that check at the scan of b and then just have one sort key for the merge. Not sure how complicated that is, however, and in any case it may not fix every possible failure case for create_mergejoin_plan. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] bug or simply not enough stack space?
Hi Tom, > Hmm ... the relevant code change seems to have been > http://archives.postgresql.org/pgsql-committers/2009-04/msg00127.php Well, though not during beta, the field testing did pay off ;) > I think I might have been overenthusiastic in trying to free resources > during a subtransaction abort. Try this patch and see if you notice > any bad side-effects. All examples I had that crashed and burned, now work correctly and/or bail out correctly where needed. No side-effects noticed. -- Best, Frank. -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses
I wrote: > I think what this case may show is simply that the consistency > checking I added to create_mergejoin_plan in 8.3 is too strict. > Not quite convinced yet though. After further review I think that is the correct approach to take. The proximate cause of the problem is that find_mergeclauses_for_pathkeys is selecting an order for the merge clauses that corresponds to a noncanonical pathkey list for the inner relation (to wit, x, y, x). While it would be possible in this particular example to put the clauses in x, x, y order instead, I don't think that is necessarily possible in every case. The clause ordering is constrained by the outer pathkeys and what we have here is a demonstration that the inner pathkeys needn't match the outer ones one-to-one. So you could have a clause that references an inner pathkey that is also referenced by some earlier clause that matches a different outer pathkey, and there won't be any way to make them adjacent. By the time the plan gets to create_mergejoin_plan, the inner pathkey list has been reduced to canonical form (x, y), but *this does not represent any actual change in sort order*. (Which is why there's no actual bug in 8.2 and before, which blithely generate plans that involve such "incorrect" mergeclause orderings.) So I think we should just weaken the checks in create_mergejoin_plan to allow such cases, ie, each mergeclause should be allowed to match any already-used inner pathkey. The other approach we could possibly take is to have find_mergeclauses_for_pathkeys reject candidate mergeclauses that produce out-of-order inner pathkeys, but that would break at least this Assert at joinpath.c:272: /* Should have used them all... */ Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list)); and it'd be rather expensive to test for anyway. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server
handling numeric literals with dots in psql copy command escreveu: > fi...@filip=# \copy ( select 1.23::numeric as num ) to 'out.csv' with csv > header > ERROR: syntax error at or near "." > LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER > ^ > \copy: ERROR: syntax error at or near "." > LINE 1: COPY ( select 1 . 23::numeric as num ) TO STDOUT CSV HEADER > ^ > > query inside parentheses is OK. > > same bug reproduced in 8.3.7 > Thanks for your report. Why are we analysing the query there? One possible fix is to remove the '.' as delimiter in strtokx(). The trivial patch is attached. -- Euler Taveira de Oliveira http://www.timbira.com/ Index: copy.c === RCS file: /a/pgsql/dev/anoncvs/pgsql/src/bin/psql/copy.c,v retrieving revision 1.80 diff -c -r1.80 copy.c *** copy.c 26 Apr 2009 15:31:50 - 1.80 --- copy.c 17 Jul 2009 19:41:58 - *** *** 146,152 while (parens > 0) { ! token = strtokx(NULL, whitespace, ".,()", "\"'", nonstd_backslash, true, false, pset.encoding); if (!token) goto error; --- 146,152 while (parens > 0) { ! token = strtokx(NULL, whitespace, ",()", "\"'", nonstd_backslash, true, false, pset.encoding); if (!token) goto error; -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server
Euler Taveira de Oliveira writes: > Why are we analysing the query there? One possible fix is to remove the '.' as > delimiter in strtokx(). The trivial patch is attached. Surely that would break a lot of other cases. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4926: too few pathkeys for mergeclauses
"Roman Kononov" writes: > Description:too few pathkeys for mergeclauses I've applied a patch for this. Thanks for the report. regards, tom lane -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs
Re: [BUGS] BUG #4927: psql does "spoil" the query before sending it to server
Tom Lane escreveu: > Euler Taveira de Oliveira writes: >> Why are we analysing the query there? One possible fix is to remove the '.' >> as >> delimiter in strtokx(). The trivial patch is attached. > > Surely that would break a lot of other cases. > Why? Even if it can't catch all cases when we remove the '.', the query will be parsed (again) by PostgreSQL. -- Euler Taveira de Oliveira http://www.timbira.com/ -- Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-bugs