[BUGS] Re: 8.3.5: Types with typnamespace pointing at non-existent pg_namespace oid

2011-02-22 Thread Daniel Farina
On Mon, Feb 21, 2011 at 10:46 PM, Daniel Farina  wrote:
> It may also be useful information to know that no recent shenanigans
> have happened on this server: it's been up continuously for about 500
> days. That doesn't mean something interesting did not occur a very
> long time ago, and I'm currently asking around for any notes about
> interesting things that have occurred on this machine.

>From what I can tell, people only see this problem with pg_dump, which
is interesting. This symptom has a very long history:

http://archives.postgresql.org/pgsql-general/2004-02/msg00970.php
http://archives.postgresql.org/pgsql-admin/2006-10/msg00192.php
http://archives.postgresql.org/pgsql-bugs/2005-11/msg00305.php
http://archives.postgresql.org/pgsql-bugs/2010-01/msg00087.php
http://archives.postgresql.org/pgsql-general/2011-02/msg00334.php

Something I'm not sure any of these mention that's very interesting in
my case that may be crucial information:

In my case, there are two "missing" pg_namespace entries, and both
have the same missing relations. Both of them have "credible" looking
OIDs (in the hundreds of thousands, and one after the other) as well
as "credible" looking ancillary information:

* all owners are correct

* there are exactly four relfrozenxid values. They look like this:

SELECT distinct c.relnamespace, relfrozenxid::text
   FROM pg_class c
   LEFT JOIN pg_namespace n ON n.oid = c.relnamespace
   LEFT JOIN pg_tablespace t ON t.oid = c.reltablespace
   WHERE nspname IS NULL;
 relnamespace | relfrozenxid
--+--
   320204 | 0
   320204 | 6573962
   320527 | 0
   320527 | 6574527

Note that relfrozenxic increases along with the oid, which is
generally what you'd expect. Some relations have no frozen xid.

* This is affecting the following features the user has used:
sequences, relations, indexes (in this case, they are all _pkey
indexes)

* There's also a valid version of all these relations/objects that
*are* connected to the schema that's alive and expected. As such, \dt,
\dn seem to work as one would expect.  The modern namespace OID is
378382, which is in line with a smooth monotonic increase over time.

* Each relkind has its own relfilenode, and they all do appear to
exist in the cluster directory. I didn't spot any big ones from a
random sampling (I can write a comprehensive one on request), but some
were 8KB and some were 16KB, which might suggest that some data is in
some of them.

More forensics tomorrow.

Sadly, for whatever reason, pg_dump --schema=public didn't seem to
help me out. We do need a workaround if we wish to keep doing
forensics.

-- 
fdr

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Hung Vacuum in 8.3

2011-02-22 Thread Greg Stark
On Tue, Feb 22, 2011 at 7:14 AM, Mark Kirkwood
 wrote:
> On 22/02/11 19:47, Heikki Linnakangas wrote:
>>
>> A long query on the same table can block vacuum. Vacuum needs to take a
>> so-called "cleanup lock" on each page, which means that it has to wait until
>> no other backend holds a pin on the page. A long-running query can keep a
>> page pinned for a long time.
>>
>
> Ah, - this is LockBufferForCleanup? So we are waiting for an exclusive
> LockBuffer operation i.e a LWLock not a (heavyweight) lock, urg... no
> wonder.

Actually it's not waiting for the LockBuffer LWLock. it's waiting
until your query unpins the buffer it wants. Vacuum tries to get an
exclusive lock on the buffer, if it gets it then it checks if anyone
is using that buffer. If someone is then it unlocks the buffer and
waits until nobody has it pinned.

Only certain plan types will keep a buffer pinned for a long time.
Things like the outer table of a nested loop join where the inner side
is another large table or slow subquery for example. This isn't
terribly common, usually it's caused by a large mistaken cartesian
join or something. but occasionally Postgres will generate a plan that
could do it. For instance joining a single row against a large table
will sometimes do a nested loop from the single row to the large
table. It's also possible Vacuum has been making progress but the
query keeps getting in its way and stopping it on new blocks.

It's also possible there's a bug of course. If someone was using that
buffer and somehow failed to notify the vacuum that they were done it
would wait for a very long time (forever?). However if vacuum
eventually continued when the query was canceled then it seems likely
it was working as intended.


-- 
greg

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] Re: 8.3.5: Types with typnamespace pointing at non-existent pg_namespace oid

2011-02-22 Thread Greg Stark
On Tue, Feb 22, 2011 at 8:48 AM, Daniel Farina  wrote:
>  The modern namespace OID is
> 378382, which is in line with a smooth monotonic increase over time.

Wait, what? namespace OID is the OID of the schema. The OID of an
object doesn't change over the lifetime of the object, it's a unique
identifier. So the only reason this would be increasing like this
would be if you're creating schemas continually over time. What
actually is going on in this database?

-- 
greg

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #5897: INSTALACAO

2011-02-22 Thread Nilson

The following bug has been logged online:

Bug reference:  5897
Logged by:  Nilson
Email address:  laut...@uol.com.br
PostgreSQL version: 9.0
Operating system:   Windows XP
Description:INSTALACAO
Details: 

Srs.

Não estou conseguinda instalar o postgres na minha maquina.
Chego até na tela que solicita a senha, ai recebo a mensagem que a senha é
invalida.

Nota. Tinha instalado a versao 8.4, desinstalei e agora nao consigo instalar
a outra versao.

Windows XP

Att.

Nilson

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5679: pgAdminIII 1.12.0 hangs

2011-02-22 Thread Robert Haas
On Thu, Feb 17, 2011 at 5:05 PM, gunziptarball  wrote:
>
> I too am experiencing this problem, which has only appeared today.  I have
> been running queries against million-record sized tables...so it was using
> probably a bunch of memory.  My entire computer froze mid-work as I was
> typing in a query (not running it), and I had to hard-reboot...now i can't
> run the query tool.

You probably need to try the pgadmin-support list:

http://www.pgadmin.org/support/list.php

This list is for bugs in core PostgreSQL.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Hung Vacuum in 8.3

2011-02-22 Thread Robert Haas
On Tue, Feb 22, 2011 at 6:26 AM, Greg Stark  wrote:
> Actually it's not waiting for the LockBuffer LWLock. it's waiting
> until your query unpins the buffer it wants. Vacuum tries to get an
> exclusive lock on the buffer, if it gets it then it checks if anyone
> is using that buffer. If someone is then it unlocks the buffer and
> waits until nobody has it pinned.

How bad it would be if we made LockBufferForCleanup() not wait?  If we
can't obtain the buffer cleanup lock immediately, we just skip that
page and continue on.  That would prevent us from updating
relfrozenxid, I guess, but we already can't do that if there are any
bits set in the visibility map.   It could also leave some bloat in
the table, but probably not much (he says hopefully).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5896: When server cannot be started, first it says that it couldn't be started and then Server Started

2011-02-22 Thread Kevin Grittner
"Nacho Mezzadra"  wrote:
 
> could not start server
> PostgreSQL 8.3 started successfully
 
That looks like a packaging problem.  The lines starting with
uppercase letters appear to have come from the service script, not
PostgreSQL itself.
 
-Kevin

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Re: 8.3.5: Types with typnamespace pointing at non-existent pg_namespace oid

2011-02-22 Thread Tom Lane
Daniel Farina  writes:
> From what I can tell, people only see this problem with pg_dump, which
> is interesting. This symptom has a very long history:

Yeah.  There seems to be some well-hidden bug whereby dropping an object
sometimes fails to drop (some of?) its dependencies.  I'm still looking
for a reproducible case, or even a hint as to what the trigger condition
might be.

> In my case, there are two "missing" pg_namespace entries, and both
> have the same missing relations.

Uh, what do you mean by "same missing relations"?

> * There's also a valid version of all these relations/objects that
> *are* connected to the schema that's alive and expected.

And this isn't making any sense to this onlooker, either.  Could you
provide a more detailed explanation of the usage pattern in this
database?  I speculate that what you mean is the user periodically
drops and recreates a schema + its contents, but please be explicit.

> Sadly, for whatever reason, pg_dump --schema=public didn't seem to
> help me out. We do need a workaround if we wish to keep doing
> forensics.

Yeah, pg_dump is written to glom onto everything listed in the catalogs
and sort it out later.  So it tends to notice inconsistencies that you
might not notice in regular usage of the database.  It's sort of hard to
avoid, since for example a --schema switch depends on seeing which
objects belong to which schema ...

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


[BUGS] BUG #5898: Nested "in" clauses hide bad column names

2011-02-22 Thread Scott Dunbar

The following bug has been logged online:

Bug reference:  5898
Logged by:  Scott Dunbar
Email address:  sc...@xigole.com
PostgreSQL version: 9.0.3
Operating system:   Ubuntu 10.10
Description:Nested "in" clauses hide bad column names
Details: 

I have a nested in clause like:

select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));

However, in this example, there is no column named respondent_id in the
chat_session table.  But the query runs and, indeed, returns all of the rows
in respondent.  Since this was then part of another nested in it deleted far
more that it was supposed to.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5898: Nested "in" clauses hide bad column names

2011-02-22 Thread Tom Lane
"Scott Dunbar"  writes:
> I have a nested in clause like:

> select respondent_id from respondent where respondent_id in (select
> respondent_id from chat_session where project_id in (select project_id from
> project where company_id = 4));

> However, in this example, there is no column named respondent_id in the
> chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5898: Nested "in" clauses hide bad column names

2011-02-22 Thread Scott Dunbar
Yes, you're correct.  I guess this makes sense but it does seem strange 
that I can enter garbage in a query but it still runs.  And in my case 
the output from this (the entire table) was then used in a delete 
statement that toasted the entire table.  Allowing bogus SQL just seems 
"wrong" but I do understand what's going on.


Thanks for your help.


On 02/22/2011 10:45 AM, Tom Lane wrote:

"Scott Dunbar"  writes:

I have a nested in clause like:
select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));
However, in this example, there is no column named respondent_id in the
chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane



--
Scott Dunbar
Xigole Systems, Inc.
Enterprise software consulting, development, and hosting
303·667·6343


Re: [BUGS] Re: 8.3.5: Types with typnamespace pointing at non-existent pg_namespace oid

2011-02-22 Thread Daniel Farina
On Tue, Feb 22, 2011 at 8:54 AM, Tom Lane  wrote:
> Daniel Farina  writes:
>> From what I can tell, people only see this problem with pg_dump, which
>> is interesting. This symptom has a very long history:
>
> Yeah.  There seems to be some well-hidden bug whereby dropping an object
> sometimes fails to drop (some of?) its dependencies.  I'm still looking
> for a reproducible case, or even a hint as to what the trigger condition
> might be.
>
>> In my case, there are two "missing" pg_namespace entries, and both
>> have the same missing relations.
>
> Uh, what do you mean by "same missing relations"?

There are an identical set of relations (including quasi-relations
like indexes and sequences) with relnames and most other properties
that are identical between the versions that are tied with each of the
two missing namespaces. There's also a superset of those (but that may
be partially or totally explained by the current set being more recent
as the application as grown) that are seen with a normal looking
pg_namespace record. All three copies of these formations seem to have
very sensible pg_class/pg_type/pg_sequence formations in their
respective relnamespaces.

>> * There's also a valid version of all these relations/objects that
>> *are* connected to the schema that's alive and expected.
>
> And this isn't making any sense to this onlooker, either.  Could you
> provide a more detailed explanation of the usage pattern in this
> database?  I speculate that what you mean is the user periodically
> drops and recreates a schema + its contents, but please be explicit.

We run quite a large number of databases, and I unfortunately think
that this particular fault has occurred in what could be called
ancient history, as far as log retention is concerned. Sadly our
investigation will have to be limited to what we can find at this
time, although we can probably slowly work our way to being able to
catch this one in the act. We might also be able to run a catalog
query across other databases to get a sense as to the frequency of the
problem.

It may be worth noting in this case that the user does not own the
schema that is thought to be dropped (or, in fact, any schemas at
all), so DROP SCHEMA as issued by them is not likely a culprit. I will
ask around as to what administrative programs we possess that might
fool with the schema. Still, such a program is probably run many times
across many databases. This is why I'm scratching my head about the
fact that two sets of such bogus relnamespace references were
produced.

Although I have no idea how such a thing could happen, is it possible
that both copies come from one occurrence of the bug?

> Yeah, pg_dump is written to glom onto everything listed in the catalogs
> and sort it out later.  So it tends to notice inconsistencies that you
> might not notice in regular usage of the database.  It's sort of hard to
> avoid, since for example a --schema switch depends on seeing which
> objects belong to which schema ...

I figured as much, although if it were written slightly differently
(starting from oid where nspname = 'public') then perhaps it would not
run into problems. I was meaning to poke at pg_depend to see if
anything interesting can be seen in there.

I'll probably hack up pg_dump to try to step around the yucky
relations so we can ensure that this database gets a clean-looking
restore elsewhere before we put the strange-looking database on ice --
permanently, if you think there is no value in having it around.

-- 
fdr

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5898: Nested "in" clauses hide bad column names

2011-02-22 Thread Kevin Grittner
Scott Dunbar  wrote:
 
> I guess this makes sense but it does seem strange that I can enter
> garbage in a query but it still runs.
 
It wasn't garbage.
 
> And in my case the output from this (the entire table) was then
> used in a delete statement that toasted the entire table.
 
I can suggest a few ways to protect yourself in such situations.
 
(1)  I generally run any DELETE statement against data I care about
as a SELECT first.  It's generally pretty easy to write it as a
SELECT tbl.* FROM which can be converted to DELETE FROM tbl after
reviewing what matches.
 
(2)  When in doubt, use BEGIN; before running the statement.  You
can review the count, run SELECTs to look at the results, etc.,
before running COMMIT; to make it "stick".
 
(3)  Less convenient, but sometimes useful, is to EXPLAIN your query
before actually running it.  If you do that with your delete, you'll
see the criterion applied to a table other than what you were
expecting, which might alert you to the problem.  The estimated row
count at the top level of the plan might be another red flag.
 
> Allowing bogus SQL just seems "wrong" but I do understand what's
> going on.
 
If you did you wouldn't call a well formed, unambiguous, standard-
conforming statement bogus.  It did exactly what you said; just not
what you meant.  It pays to be a bit paranoid when running ad hoc
DML in case you accidentally don't say what you mean.
 
-Kevin

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Hung Vacuum in 8.3

2011-02-22 Thread Mark Kirkwood

On 23/02/11 00:26, Greg Stark wrote:


It's also possible there's a bug of course. If someone was using that
buffer and somehow failed to notify the vacuum that they were done it
would wait for a very long time (forever?). However if vacuum
eventually continued when the query was canceled then it seems likely
it was working as intended.



Greg, thanks for clarifying this.

Unfortunately this time around I canceled the vacuum and then the query. 
However *next* time I'll get rid of the query 1st and see what happens.


Cheers

Mark


--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] Hung Vacuum in 8.3

2011-02-22 Thread Mark Kirkwood

On 23/02/11 03:27, Robert Haas wrote:

On Tue, Feb 22, 2011 at 6:26 AM, Greg Stark  wrote:

Actually it's not waiting for the LockBuffer LWLock. it's waiting
until your query unpins the buffer it wants. Vacuum tries to get an
exclusive lock on the buffer, if it gets it then it checks if anyone
is using that buffer. If someone is then it unlocks the buffer and
waits until nobody has it pinned.

How bad it would be if we made LockBufferForCleanup() not wait?  If we
can't obtain the buffer cleanup lock immediately, we just skip that
page and continue on.  That would prevent us from updating
relfrozenxid, I guess, but we already can't do that if there are any
bits set in the visibility map.   It could also leave some bloat in
the table, but probably not much (he says hopefully).



Seems like a good suggestion, and may leave less bloat than having the 
vacuum hung for potentially quite some time.


Mark

--
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5898: Nested "in" clauses hide bad column names

2011-02-22 Thread Eric Schwarzenbach
Using table aliases prevents such problems by disambiguating the column 
names. For example, if you had table aliases in just one part of this 
query as in the below, you would have gotten an error instead of 
deleting all those rows:


select respondent_id from respondent where respondent_id in (select
cs.respondent_id from chat_session cs where cs.project_id in (select project_id 
from
project where company_id = 4));


Cheers,

Eric



On 2/22/2011 1:07 PM, Scott Dunbar wrote:
Yes, you're correct.  I guess this makes sense but it does seem 
strange that I can enter garbage in a query but it still runs.  And in 
my case the output from this (the entire table) was then used in a 
delete statement that toasted the entire table.  Allowing bogus SQL 
just seems "wrong" but I do understand what's going on.


Thanks for your help.


On 02/22/2011 10:45 AM, Tom Lane wrote:

"Scott Dunbar"  writes:

I have a nested in clause like:
select respondent_id from respondent where respondent_id in (select
respondent_id from chat_session where project_id in (select project_id from
project where company_id = 4));
However, in this example, there is no column named respondent_id in the
chat_session table.

Probably there is one in respondent, though?  This behavior is not a bug
--- what you have there is an outer reference, and it is working exactly
as specified by the SQL standard.  Sub-selects would be a whole lot less
useful if they couldn't refer to variables of the outer query.

regards, tom lane



--
Scott Dunbar
Xigole Systems, Inc.
Enterprise software consulting, development, and hosting
303·667·6343