[BUGS] BUG #7623: Inconsistency on transaction isolation documentation

2012-10-29 Thread louis-claude . canon
The following bug has been logged on the website:

Bug reference:  7623
Logged by:  Louis-Claude Canon
Email address:  louis-claude.ca...@univ-fcomte.fr
PostgreSQL version: 9.2.1
Operating system:   All
Description:

In the first paragraph of Section 13.2.1, it is first stated that "[with]
this isolation level, a SELECT query [...] never sees [...] changes
committed during query execution by concurrent transactions." It is my
understanding that this is untrue (as the rest of the paragraph seems to
imply).

Then, in the same subsection, an example is given in which a DELETE
statement is affected by an uncommitted UPDATE. It is unclear to me if this
still corresponds to the standard "Committed Read" isolation level. It could
be clearer with more precision: when is committed the DELETE statement? What
is the snapshot that it sees (when does the transaction start)? Or maybe
this standard isolation level only specified that SELECT statements must not
be affected by uncommitted changes?

As I am not an expert, I cannot guarantee that there is indeed any issue.
However, I believe it could be useful that an expert proof-reads or reviews
this subsection. For example, I am unsure whether any query will see a
consistent snapshot with only committed transactions (as it is stated at
first) or if uncommitted transactions may have an impact (as the last
example suggests).

I would also suggest to following change for clarity (or the contrary
change):
"The point at issue above is whether or not a single command sees an
absolutely consistent view of the database." -> "The point at issue above is
that any single command sees an absolutely consistent view of the database."



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #6704: ALTER EXTENSION postgis SET SCHEMA leaves dangling relations

2012-10-29 Thread Alvaro Herrera
Here's a cleaned up version of this patch, for HEAD.  (The patches for
9.1 and 9.2 required minor conflict fixes, but nothing substantial).

The main difference from Dimitri's patch is that I added enough support
code so that AlterRelationNamespaceInternal() is always getting a valid
ObjectAddresses list, and get rid of the checks for a NULL one.  It
seems cleaner and simpler to me this way.  This means I had to add
support for object types that may never be visited more than once
(indexes), but I don't think this is a problem.

Remaining object types are those that use the generic code path in HEAD
(as patched by KaiGai).  We know (because the generic code path checks)
that in these cases, relations are never involved; the only other funny
cases I saw were collations and functions, but those are funny only
because the lookups are unlike the normal cases; they don't have any
subsidiary objects that would cause trouble.


While I am looking at this code: I am uneasy about
AlterObjectNamespace_oid() ignoring object classes that it doesn't
explicitely know about.  This means we will fail to cover new cases we
might come up with.  For example, can we have event triggers in
extensions, and do event triggers belong to a schema?  If so, we already
have a bug here in HEAD.  I would like us to get rid of the "default:
break;" case, and instead explicitely list the object classes we ignore.
That way, the compiler will warn us as soon as we add a new object
class and neglect to add it here.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/catalog/pg_constraint.c
--- b/src/backend/catalog/pg_constraint.c
***
*** 679,685  RenameConstraintById(Oid conId, const char *newname)
   */
  void
  AlterConstraintNamespaces(Oid ownerId, Oid oldNspId,
! 		  Oid newNspId, bool isType)
  {
  	Relation	conRel;
  	ScanKeyData key[1];
--- 679,685 
   */
  void
  AlterConstraintNamespaces(Oid ownerId, Oid oldNspId,
! 		  Oid newNspId, bool isType, ObjectAddresses *objsMoved)
  {
  	Relation	conRel;
  	ScanKeyData key[1];
***
*** 712,717  AlterConstraintNamespaces(Oid ownerId, Oid oldNspId,
--- 712,725 
  	while (HeapTupleIsValid((tup = systable_getnext(scan
  	{
  		Form_pg_constraint conform = (Form_pg_constraint) GETSTRUCT(tup);
+ 		ObjectAddress	thisobj;
+ 
+ 		thisobj.classId = ConstraintRelationId;
+ 		thisobj.objectId = HeapTupleGetOid(tup);
+ 		thisobj.objectSubId = 0;
+ 
+ 		if (object_address_present(&thisobj, objsMoved))
+ 			continue;
  
  		if (conform->connamespace == oldNspId)
  		{
***
*** 729,734  AlterConstraintNamespaces(Oid ownerId, Oid oldNspId,
--- 737,744 
  			 * changeDependencyFor().
  			 */
  		}
+ 
+ 		add_exact_object_address(&thisobj, objsMoved);
  	}
  
  	systable_endscan(scan);
*** a/src/backend/commands/alter.c
--- b/src/backend/commands/alter.c
***
*** 252,258  ExecAlterObjectSchemaStmt(AlterObjectSchemaStmt *stmt)
   * object doesn't have a schema.
   */
  Oid
! AlterObjectNamespace_oid(Oid classId, Oid objid, Oid nspOid)
  {
  	Oid			oldNspOid = InvalidOid;
  	ObjectAddress dep;
--- 252,259 
   * object doesn't have a schema.
   */
  Oid
! AlterObjectNamespace_oid(Oid classId, Oid objid, Oid nspOid,
! 		 ObjectAddresses *objsMoved)
  {
  	Oid			oldNspOid = InvalidOid;
  	ObjectAddress dep;
***
*** 266,285  AlterObjectNamespace_oid(Oid classId, Oid objid, Oid nspOid)
  		case OCLASS_CLASS:
  			{
  Relation	rel;
- Relation	classRel;
  
  rel = relation_open(objid, AccessExclusiveLock);
  oldNspOid = RelationGetNamespace(rel);
  
! classRel = heap_open(RelationRelationId, RowExclusiveLock);
! 
! AlterRelationNamespaceInternal(classRel,
! 			   objid,
! 			   oldNspOid,
! 			   nspOid,
! 			   true);
! 
! heap_close(classRel, RowExclusiveLock);
  
  relation_close(rel, NoLock);
  break;
--- 267,277 
  		case OCLASS_CLASS:
  			{
  Relation	rel;
  
  rel = relation_open(objid, AccessExclusiveLock);
  oldNspOid = RelationGetNamespace(rel);
  
! AlterTableNamespaceInternal(rel, oldNspOid, nspOid, objsMoved);
  
  relation_close(rel, NoLock);
  break;
***
*** 290,296  AlterObjectNamespace_oid(Oid classId, Oid objid, Oid nspOid)
  			break;
  
  		case OCLASS_TYPE:
! 			oldNspOid = AlterTypeNamespace_oid(objid, nspOid);
  			break;
  
  		case OCLASS_COLLATION:
--- 282,288 
  			break;
  
  		case OCLASS_TYPE:
! 			oldNspOid = AlterTypeNamespace_oid(objid, nspOid, objsMoved);
  			break;
  
  		case OCLASS_COLLATION:
*** a/src/backend/commands/extension.c
--- b/src/backend/commands/extension.c
***
*** 2204,2209  AlterExtensionNamespace(List *names, const char *newschema)
--- 2204,2210 
  	Relation	depRel;
  	SysScanDesc depScan;
  	HeapTuple	depTup;
+ 	ObjectAddresses *objsMo

[BUGS] BUG #7626: Query planner never returns, uses 100% CPU

2012-10-29 Thread brian
The following bug has been logged on the website:

Bug reference:  7626
Logged by:  Brian Dunavant
Email address:  br...@omniti.com
PostgreSQL version: 9.2.1
Operating system:   MacOS + Others
Description:

Running this causes the thread to use 100% CPU and never returns (or at
least not for longer than my patience runs out).

This returns just fine on 8.4.1 and on 9.2beta.

CREATE TABLE foo (
id integer PRIMARY KEY
);

explain SELECT 1
FROM foo AS a
JOIN (SELECT 2 as aid) AS ag1 ON (ag1.aid=a.id)
JOIN (SELECT 3 as aid) AS ag2 ON (ag2.aid=a.id)
JOIN (SELECT 4 as aid) AS ag3 ON (ag3.aid=a.id)
JOIN (SELECT 5 as aid) AS ag4 ON (ag4.aid=a.id)
JOIN (SELECT 6 as aid) AS ag5 ON (ag5.aid=a.id)
JOIN (SELECT 7 as aid) AS ag6 ON (ag6.aid=a.id)
JOIN (SELECT 8 as aid) AS ag7 ON (ag7.aid=a.id)
JOIN (SELECT 9 as aid) AS ag8 ON (ag8.aid=a.id)
JOIN (SELECT 10 as aid) AS ag9 ON (ag9.aid=a.id)
JOIN (SELECT 11 as aid) AS ag10 ON (ag10.aid=a.id)
JOIN (SELECT 12 as aid) AS ag11 ON (ag11.aid=a.id)
JOIN (SELECT 13 as aid) AS ag12 ON (ag12.aid=a.id)
JOIN (SELECT 14 as aid) AS ag13 ON (ag13.aid=a.id)
JOIN (SELECT 15 as aid) AS ag14 ON (ag14.aid=a.id)
JOIN (SELECT 16 as aid) AS ag15 ON (ag15.aid=a.id)
JOIN (SELECT 17 as aid) AS ag16 ON (ag16.aid=a.id)
JOIN (SELECT 18 as aid) AS ag17 ON (ag17.aid=a.id)
JOIN (SELECT 19 as aid) AS ag18 ON (ag18.aid=a.id)
JOIN (SELECT 20 as aid) AS ag19 ON (ag19.aid=a.id)
JOIN (SELECT 21 as aid) AS ag20 ON (ag20.aid=a.id)
JOIN (SELECT 22 as aid) AS ag21 ON (ag21.aid=a.id)
JOIN (SELECT 23 as aid) AS ag22 ON (ag22.aid=a.id)
JOIN (SELECT 24 as aid) AS ag23 ON (ag23.aid=a.id)
JOIN (SELECT 25 as aid) AS ag24 ON (ag24.aid=a.id)
JOIN (SELECT 26 as aid) AS ag25 ON (ag25.aid=a.id)
JOIN (SELECT 27 as aid) AS ag26 ON (ag26.aid=a.id)
JOIN (SELECT 28 as aid) AS ag27 ON (ag27.aid=a.id)
JOIN (SELECT 29 as aid) AS ag28 ON (ag28.aid=a.id)
JOIN (SELECT 30 as aid) AS ag29 ON (ag29.aid=a.id)
JOIN (SELECT 1 as aid) AS ag30 ON (ag30.aid=a.id)
;





-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7626: Query planner never returns, uses 100% CPU

2012-10-29 Thread Bill MacArthur

On 10/29/2012 12:29 PM, br...@omniti.com wrote:

The following bug has been logged on the website:

Bug reference:  7626
Logged by:  Brian Dunavant
Email address:  br...@omniti.com
PostgreSQL version: 9.2.1
Operating system:   MacOS + Others
Description:

Running this causes the thread to use 100% CPU and never returns (or at
least not for longer than my patience runs out).

This returns just fine on 8.4.1 and on 9.2beta.


This response is not an effort to assist you, Brian, but just to make an observation for 
the developers that my bug post earlier this month has an uncanny resemblence to this 
post. My post involved real data but was basically a mutltiple rejoin of a couple of 
tables on differing values. The total number of joins was almost exactly what you have 
here. In my case it worked fine on 9.0 but would never return in 9.2.1. Due to the 
complexity of building a self-contained test case, I ended up reworking the query into 
something simpler using CTE's. My bug reference was... scratch that, it was only a 
posting to pgsql-admin on Oct. 12, entitled "query from 9.0 fails to return on 
9.2".



CREATE TABLE foo (
 id integer PRIMARY KEY
);

explain SELECT 1
FROM foo AS a
JOIN (SELECT 2 as aid) AS ag1 ON (ag1.aid=a.id)
JOIN (SELECT 3 as aid) AS ag2 ON (ag2.aid=a.id)
JOIN (SELECT 4 as aid) AS ag3 ON (ag3.aid=a.id)
JOIN (SELECT 5 as aid) AS ag4 ON (ag4.aid=a.id)
JOIN (SELECT 6 as aid) AS ag5 ON (ag5.aid=a.id)
JOIN (SELECT 7 as aid) AS ag6 ON (ag6.aid=a.id)
JOIN (SELECT 8 as aid) AS ag7 ON (ag7.aid=a.id)
JOIN (SELECT 9 as aid) AS ag8 ON (ag8.aid=a.id)
JOIN (SELECT 10 as aid) AS ag9 ON (ag9.aid=a.id)
JOIN (SELECT 11 as aid) AS ag10 ON (ag10.aid=a.id)
JOIN (SELECT 12 as aid) AS ag11 ON (ag11.aid=a.id)
JOIN (SELECT 13 as aid) AS ag12 ON (ag12.aid=a.id)
JOIN (SELECT 14 as aid) AS ag13 ON (ag13.aid=a.id)
JOIN (SELECT 15 as aid) AS ag14 ON (ag14.aid=a.id)
JOIN (SELECT 16 as aid) AS ag15 ON (ag15.aid=a.id)
JOIN (SELECT 17 as aid) AS ag16 ON (ag16.aid=a.id)
JOIN (SELECT 18 as aid) AS ag17 ON (ag17.aid=a.id)
JOIN (SELECT 19 as aid) AS ag18 ON (ag18.aid=a.id)
JOIN (SELECT 20 as aid) AS ag19 ON (ag19.aid=a.id)
JOIN (SELECT 21 as aid) AS ag20 ON (ag20.aid=a.id)
JOIN (SELECT 22 as aid) AS ag21 ON (ag21.aid=a.id)
JOIN (SELECT 23 as aid) AS ag22 ON (ag22.aid=a.id)
JOIN (SELECT 24 as aid) AS ag23 ON (ag23.aid=a.id)
JOIN (SELECT 25 as aid) AS ag24 ON (ag24.aid=a.id)
JOIN (SELECT 26 as aid) AS ag25 ON (ag25.aid=a.id)
JOIN (SELECT 27 as aid) AS ag26 ON (ag26.aid=a.id)
JOIN (SELECT 28 as aid) AS ag27 ON (ag27.aid=a.id)
JOIN (SELECT 29 as aid) AS ag28 ON (ag28.aid=a.id)
JOIN (SELECT 30 as aid) AS ag29 ON (ag29.aid=a.id)
JOIN (SELECT 1 as aid) AS ag30 ON (ag30.aid=a.id)
;








--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7626: Query planner never returns, uses 100% CPU

2012-10-29 Thread Tom Lane
br...@omniti.com writes:
> Running this causes the thread to use 100% CPU and never returns (or at
> least not for longer than my patience runs out).
> This returns just fine on 8.4.1 and on 9.2beta.

Hmm ... seems to be a consequence of commit
3b8968f25232ad09001bf35ab4cc59f5a501193e.  I wrote in that patch

+* This looks expensive, but in practical cases there won't be very many
+* distinct sets of outer rels to consider.

but evidently that was too optimistic :-(.  Will need to think about
suitable heuristics for limiting the sets of outer relations we
consider building parameterized paths with.

regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7623: Inconsistency on transaction isolation documentation

2012-10-29 Thread Kevin Grittner
louis-claude.ca...@univ-fcomte.fr wrote:

> In the first paragraph of Section 13.2.1, it is first stated that
> "[with] this isolation level, a SELECT query [...] never sees [...]
> changes committed during query execution by concurrent
> transactions." It is my understanding that this is untrue (as the
> rest of the paragraph seems to imply).

What are you seeing which you feels contradicts that? (I don't see
anything which appears to contradict it, but maybe there's more than
one way for someone to read it, and I'm missing something.)

> Then, in the same subsection, an example is given in which a DELETE
> statement is affected by an uncommitted UPDATE.

No, the DELETE statement is blocked by the UPDATE to see whether it
commits or rolls back. If the transaction with the UPDATE rolls back,
the results of the DELETE will not be affected by the UPDATE. If the
UPDATE commits, the committed transaction will affect the DELETE.

> It is unclear to me if this still corresponds to the standard
> "Committed Read" isolation level.

Yes, it's still in section 13.2.1, which is all about READ COMMITTED
transactions. And I believe that it still complies with the
requirements of the standard for that level.

> It could be clearer with more precision: when is committed the
> DELETE statement?

I guess it could be more clear that the DELETE will block pending the
completion of the transaction containing the previously run UPDATE.
The commit of the DELETE will be some time after the commit of the
transaction containing the UPDATE.

> What is the snapshot that it sees (when does the transaction
> start)? Or maybe this standard isolation level only specified that
> SELECT statements must not be affected by uncommitted changes?

No transaction is affected (other than blocking or possibly rolling
back with a serialization failure), by an *uncommitted* transaction.
In this example the DELETE is affected by a *committed* transaction.
The snapshot for the DELETE is taken after the query is parsed and
before it starts scanning for rows to delete. When there is a write
conflict in READ COMMITTED it follows the update chain from the
visible row to find the latest committed version of the row, blocking
for a change in progress; so technically an UPDATE or DELETE at the
READ COMMITTED transaction isolation level sometimes strays from a
single snapshot for the statement. The alternative would be to
restart the statement after a write conflict where the other
transaction commits, which could perform much worse when a statement
affects a large number of rows.

> As I am not an expert, I cannot guarantee that there is indeed any
> issue.

Well, based on what you said there appears to be at lest some room to
clarify or elaborate, so people better understand it.

> However, I believe it could be useful that an expert proof-reads or
> reviews this subsection. For example, I am unsure whether any query
> will see a consistent snapshot with only committed transactions (as
> it is stated at first) or if uncommitted transactions may have an
> impact (as the last example suggests).

READ COMMITTED can only see work of committed transactions, but does
not necessarily see only data visible according to its snapshot.

> I would also suggest to following change for clarity (or the
> contrary change):
> "The point at issue above is whether or not a single command sees
> an absolutely consistent view of the database." -> "The point at
> issue above is that any single command sees an absolutely
> consistent view of the database."

Based on my clarification here, do you agree that the current
language is more accurate than your proposed revision?

-Kevin


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7624: Misleading Log Message & Inconsistent Configuration Design

2012-10-29 Thread Craig Ringer
On 10/28/2012 03:35 PM, t...@cs.ucsd.edu wrote:
> I really suggest to make the configuration file consistent in this case. But
> I understand it might not be easy. But at least I think we should print a
> better log message which pinpoints to the absolute path like
> 
> FATAL:  58P01: could not create lock file
> "/home/tianyin/postgresql-9.2.1/local/pgsql/data/local/pgsql/data/.s.PGSQL.5432.lock":
> No such file or directory

Essentially, you want PostgreSQL to print absolute paths in error
messages, instead of paths relative to the server datadir?

I'd prefer to do it slightly differently, but I like the general idea.
I'd instead want to write:

FATAL:  58P01: could not create lock file
"local/pgsql/data/.s.PGSQL.5432.lock"
(cwd="/home/tianyin/postgresql-9.2.1/local/pgsql/data/"): No such file
or directory

By spelling out the CWD explicitly there's no impression given that the
user ever actually specified the whole path in the configuration
anywhere, so it's clearer how the path came to be. It also shows which
part of the path is known-good (since it's the CWD) and which could be
at issue.

--
Craig Ringer


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #7627: Bad query plan when using LIMIT

2012-10-29 Thread edward
The following bug has been logged on the website:

Bug reference:  7627
Logged by:  Edward Faulkner
Email address:  edw...@clericare.com
PostgreSQL version: 9.2.1
Operating system:   OSX 10.8.2
Description:

The following two queries differ only by adding "LIMIT 1", but the one with
the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
ANALYZE, and REINDEX DATABASE and there are no modifications since.

EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
DESC;
 
QUERY PLAN  

---
 Sort  (cost=4665.10..4666.59 rows=595 width=173) (actual
time=27.936..28.034 rows=543 loops=1)
   Sort Key: public.commits.tree_high
   Sort Method: quicksort  Memory: 169kB
   ->  Nested Loop  (cost=60.93..4637.68 rows=595 width=173) (actual
time=2.000..26.658 rows=543 loops=1)
 ->  HashAggregate  (cost=60.93..66.98 rows=605 width=28) (actual
time=1.880..2.214 rows=605 loops=1)
   ->  Subquery Scan on "ANY_subquery"  (cost=0.00..59.42
rows=605 width=28) (actual time=0.034..1.231 rows=605 loops=1)
 ->  Limit  (cost=0.00..53.37 rows=605 width=32) (actual
time=0.032..0.941 rows=605 loops=1)
   ->  Index Scan using commit_tree on commits 
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.031..0.799
rows=605 loops=1)
 ->  Index Scan using commits_pkey on commits  (cost=0.00..7.54
rows=1 width=173) (actual time=0.038..0.039 rows=1 loops=605)
   Index Cond: ((id)::text = ("ANY_subquery".id)::text)
   Filter: (tree_other IS NULL)
   Rows Removed by Filter: 0
 Total runtime: 28.210 ms
(13 rows)

EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
DESC LIMIT 1;
 
QUERY PLAN  

---
 Limit  (cost=0.00..2314.68 rows=1 width=173) (actual
time=46626.438..46626.439 rows=1 loops=1)
   ->  Nested Loop Semi Join  (cost=0.00..1377233.62 rows=595 width=173)
(actual time=46626.437..46626.437 rows=1 loops=1)
 Join Filter: ((public.commits.id)::text =
("ANY_subquery".id)::text)
 Rows Removed by Join Filter: 90573339
 ->  Index Scan Backward using commit_tree on commits 
(cost=0.00..13481.52 rows=150269 width=173) (actual time=0.025..406.336
rows=149708 loops=1)
   Filter: (tree_other IS NULL)
   Rows Removed by Filter: 2525
 ->  Materialize  (cost=0.00..62.44 rows=605 width=28) (actual
time=0.000..0.084 rows=605 loops=149708)
   ->  Subquery Scan on "ANY_subquery"  (cost=0.00..59.42
rows=605 width=28) (actual time=0.027..1.166 rows=605 loops=1)
 ->  Limit  (cost=0.00..53.37 rows=605 width=32) (actual
time=0.026..0.965 rows=605 loops=1)
   ->  Index Scan using commit_tree on commits 
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.026..0.828
rows=605 loops=1)
 Total runtime: 46626.562 ms
(12 rows)

It's possible to work around the problem like this:

EXPLAIN ANALYZE WITH candidates AS (SELECT * FROM commits WHERE id IN
(SELECT id FROM commits ORDER BY tree_high LIMIT 605 ) AND tree_other IS
NULL) SELECT * FROM candidates ORDER BY tree_high DESC LIMIT 1;
  
QUERY PLAN  
 
-
 Limit  (cost=4652.56..4652.56 rows=1 width=436) (actual time=29.369..29.370
rows=1 loops=1)
   CTE candidates
 ->  Nested Loop  (cost=60.93..4637.68 rows=595 width=173) (actual
time=2.008..27.454 rows=543 loops=1)
   ->  HashAggregate  (cost=60.93..66.98 rows=605 width=28) (actual
time=1.891..2.271 rows=605 loops=1)
 ->  Subquery Scan on "ANY_subquery"  (cost=0.00..59.42
rows=605 width=28) (actual time=0.032..1.237 rows=605 loops=1)
   ->  Limit  (cost=0.00..53.37 rows=605 width=32)
(actual time=0.031..0.909 rows=605 loops=1)
 ->  Index Scan using commit_tree on commits 
(cost=0.00..13481.52 rows=152837 width=32) (actual time=0.030..0.799
rows=605 loops=1)
   

Re: [BUGS] BUG #7627: Bad query plan when using LIMIT

2012-10-29 Thread Tom Lane
edw...@clericare.com writes:
> The following two queries differ only by adding "LIMIT 1", but the one with
> the limit gets radically worse performance. I've done VACUUM FULL, VACUUM
> ANALYZE, and REINDEX DATABASE and there are no modifications since.

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC;

> EXPLAIN ANALYZE SELECT * FROM commits WHERE id IN (SELECT id FROM commits
> ORDER BY tree_high LIMIT 605 ) AND tree_other IS NULL ORDER BY tree_high
> DESC LIMIT 1;

I think what's happening there is that the planner supposes that an
indexscan in "tree_high DESC" order will find rows matching the IN
condition uniformly distributed in the scan order --- but, because of
the construction of the IN clause, they're actually going to be
pessimally located at the very end of that scan order.  So it ends up
forming all of the nestloop result, when it had expected to have to
compute only 1/595 of it.  We've discussed dialing down the planner's
optimism about limit plans to not assume perfect independence of filter
conditions, but I don't think anyone would advocate for having it assume
the worst possible case, which is what you've got here unfortunately.

I can't help thinking that there's a way to express this problem without
the peculiar self-join, but I'm too tired to think of a good one right
now.  The best I can do is a window function:

select last_value(id) over (order by tree_high), last_value(...) ...
from (select * from commits order by tree_high limit 605) ss
where tree_other is null;

but it'd be pretty tedious to write out the last_value construct for
each column you want, and anyway this seems less than elegant even
aside from that objection.

regards, tom lane


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #7624: Misleading Log Message & Inconsistent Configuration Design

2012-10-29 Thread Tianyin Xu
Hi, Craig,

Thanks a lot for your response. Sorry to make the bug report so long... I
was a little pissed off by myself for my stupidness after spending around 2
hours on this issue.

Yes, your message is definitely better and explicit by pointing out CWD.

The only concern to me is whether or not CWD is a common abbreviation,
especially for non-native speakers. Searching "CWD" on google, I didn't
find it refers to Current Working Directory in the first 3 returned pages.

To me, printing an explicit log message is the easiest way to solve the
problem. But the configuration file design is still not consistent (maybe
I'm too picky). In the following settings, we have to tell users that the
first 3 works and the last 1 does not work because of chdir blah blah blah.

data_directory = 'local/pgsql/data/'
hba_file = 'local/pgsql/data/pg_hba.conf'
ident_file = 'local/pgsql/data/pg_ident.conf'
unix_socket_directory = 'local/pgsql/data/'

Thanks,
Tianyin



On Mon, Oct 29, 2012 at 7:46 PM, Craig Ringer  wrote:

> On 10/28/2012 03:35 PM, t...@cs.ucsd.edu wrote:
> > I really suggest to make the configuration file consistent in this case.
> But
> > I understand it might not be easy. But at least I think we should print a
> > better log message which pinpoints to the absolute path like
> >
> > FATAL:  58P01: could not create lock file
> >
> "/home/tianyin/postgresql-9.2.1/local/pgsql/data/local/pgsql/data/.s.PGSQL.5432.lock":
> > No such file or directory
>
> Essentially, you want PostgreSQL to print absolute paths in error
> messages, instead of paths relative to the server datadir?
>
> I'd prefer to do it slightly differently, but I like the general idea.
> I'd instead want to write:
>
> FATAL:  58P01: could not create lock file
> "local/pgsql/data/.s.PGSQL.5432.lock"
> (cwd="/home/tianyin/postgresql-9.2.1/local/pgsql/data/"): No such file
> or directory
>
> By spelling out the CWD explicitly there's no impression given that the
> user ever actually specified the whole path in the configuration
> anywhere, so it's clearer how the path came to be. It also shows which
> part of the path is known-good (since it's the CWD) and which could be
> at issue.
>
> --
> Craig Ringer
>



-- 
Tianyin XU,
http://cseweb.ucsd.edu/~tixu/