Re: [BUGS] BUG #5507: missing chunk number 0 for toast value XXXXX in pg_toast_XXXXX

2010-06-15 Thread 中嶋 信二
Thank you for your replies, everybody.


> On Mon, Jun 14, 2010 at 11:28 AM, Shinji Nakajima wrote:
> > PostgreSQL version: 8.3.8
> > Description:        missing chunk number 0 for toast value X in
> > pg_toast_X
> >
> > Deleting the record restored the system, but the root cause is unknown.
> > Could this be a bug in the database?
> 
> Probably. Or possibly bad hardware. Assuming you didn't manually go in
> and delete that record from the toast table, which would be a strange
> thing to do.
>
That table has been restored.
However, when I checked the other tables, I found more tables with problems.
A primary key value was duplicated within the same table, and
a similar error message was displayed when I selected all rows.


> The problem is it could have happened a long time ago and you just
> discovered it now. Have you had any other significant events on this
> machine? Any system crashes or power failures? Any drive crashes or
> signs of bad memory?
>
postgres runs in a redundant configuration.
Red Hat Cluster Suite monitors the processes of each service.
PGDATA is on shared storage.

There was an occasion when the standby server started.
The cluster began a failover between the servers
because it judged postgres to have stopped.

I do not understand why both systems, referring to the same PGDATA, were able to start.
I was definitely able to run SQL with the psql command on both systems.
Logging was not enabled at that time.


> In the postgres logs are there any instances of unusual error messages
> or warnings?
> --
> greg

Errors continue to occur:
"could not read block 17 of relation 1663/16872/2840: read only 0 of 8192 bytes"

A data file seems to be corrupted...

If this was caused by a double start, why were two postgres
instances sharing the same PGDATA able to start?
Is there any precedent for this?
Could it be what damaged the data file?

Regards,
Nakajima



-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #5503: error in trigger function with dropped columns

2010-06-15 Thread Maxim Boguk
Hi all.

It looks like no one thinks this behavior is a bug.
In that case the documentation probably needs to change, because
http://www.postgresql.org/docs/8.4/interactive/plpgsql-trigger.html
states:
"To alter the row to be stored, it is possible to replace single
values directly in NEW and return the modified NEW, or to build a
complete new record/row to return."

But in reality, returning a record or row does not work in an insert
trigger at all if the target table contains dropped columns.

Another interesting test:

CREATE TABLE test1 as select * from test;

Now the test1 table has the same structure as test.
Let's try to construct a row instead of a record:

CREATE OR REPLACE FUNCTION test_function() RETURNS trigger AS $$
 DECLARE
   _row   test1%ROWTYPE;
 BEGIN
   RAISE NOTICE 'NEW record = %', NEW;
   SELECT * INTO _row FROM test1 limit 1;
   RAISE NOTICE '_row record = %', _row;
   RETURN _row;
 END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER test_trigger before insert on test for each row EXECUTE
PROCEDURE test_function();

INSERT INTO test values (1);
NOTICE:  NEW record = (1)
NOTICE:  _row record = (1)
ERROR:  returned row structure does not match the structure of the
triggering table
DETAIL:  Number of returned columns (1) does not match expected column
count (3).
CONTEXT:  PL/pgSQL function "test_function" during function exit

So I can't return a record, and I can't return a row from a table of
the same structure either. All of that is because the trigger function
somehow thinks it needs to return all columns of the table, including
dropped columns.

If this behavior is not a bug, then the documentation should be changed
(because "or to build a complete new record/row to return" will never
work if the table contains dropped columns).

On Mon, Jun 14, 2010 at 11:20 AM, Maxim Boguk wrote:
> I see... but in any case this bug makes it impossible to return a
> record value from a trigger on a table that contains at least one
> dropped column. Even worse, the trigger will work on a freshly loaded
> copy of the production database and pass all possible tests, but stop
> working on the production database itself. Moreover, a fully
> functional system can break if a single column is dropped from a table
> that has such a trigger. I.e. the behavior of such a trigger depends
> on the table's history of dropped columns, which is wrong (IMHO).
>
> I tried another test trigger on a table with a dropped column and got
> even funnier results (the trigger expects the returned record to
> contain all columns of the table, including dropped ones, so I tried
> to construct such a record):
>
> CREATE OR REPLACE FUNCTION test_function() RETURNS trigger AS $$
>  DECLARE
>   _row   record;
>  BEGIN
>   RAISE NOTICE 'NEW record = %', NEW;
>   SELECT *,2,3 INTO _row FROM test limit 1;
>   RAISE NOTICE '_row record = %', _row;
>   RETURN _row;
>  END;
> $$ LANGUAGE plpgsql;
>
> postgres=# insert into test values (1);
> NOTICE:  NEW record = (1)
> NOTICE:  _row record = (1,2,3)
> ERROR:  returned row structure does not match the structure of the
> triggering table
> DETAIL:  Returned type integer does not match expected type N/A
> (dropped column) in column "pg.dropped.2".
> CONTEXT:  PL/pgSQL function "test_function" during function exit
>
> I think the changes in 9.0 now mask the actual bug instead of fixing
> it. If I am wrong, it would still be useful to know how to return a
> record from a trigger function in that case, because I can't make a
> working version at all.
>
> On Mon, Jun 14, 2010 at 4:09 AM, Tom Lane wrote:
>> "Maksym Boguk" writes:
>>> This bug is hard to describe. But in general, if a table contains
>>> dropped columns, you cannot return a record variable from a trigger
>>> function.
>>
>> This is fixed for 9.0 ... or at least the specific test case you provide
>> doesn't fail.  We have not risked back-porting the change though,
>> because there are other aspects of what the new code does that might
>> cause people problems, eg
>> http://archives.postgresql.org/pgsql-hackers/2010-03/msg00444.php
>> http://archives.postgresql.org/message-id/6645.1267926...@sss.pgh.pa.us
>>
>>                        regards, tom lane



-- 
Maxim Boguk
Senior Postgresql DBA.

Skype: maxim.boguk
Jabber: maxim.bo...@gmail.com

LinkedIn profile: http://nz.linkedin.com/in/maximboguk
МойКруг: http://mboguk.moikrug.ru/

Strength breaks straw, but not everything in our life is straw, and strength is far from everything.



Re: [BUGS] BUG #5507: missing chunk number 0 for toast value XXXXX in pg_toast_XXXXX

2010-06-15 Thread Kevin Grittner
中嶋 信二 wrote:
 
> postgres runs in a redundant configuration.
> Red Hat Cluster Suite monitors the processes of each service.
> PGDATA is on shared storage.
> 
> There was an occasion when the standby server started.
> The cluster began a failover between the servers
> because it judged postgres to have stopped.
> 
> I do not understand why both systems, referring to the same PGDATA,
> were able to start.
> I was definitely able to run SQL with the psql command on both
> systems.
> Logging was not enabled at that time.
 
> If this was caused by a double start, why were two postgres
> instances sharing the same PGDATA able to start?
> Is there any precedent for this?
> Could it be what damaged the data file?
 
I'm not sure I totally understand, but it sounds like you had two
postmasters running against a single data directory.  If so, that
could cause all kinds of corruption.  It's hard to see how that
could happen unless you deleted a PostgreSQL data directory, or at
least the postmaster.pid file, while an instance was running.
 
I would start by capturing "ps auxf" output, to be able to
understand what postgres processes were running and when they
started.  Then I would probably make sure they all got stopped. 
Then I would be seriously looking at restoring from backup, unless
this was a development database which could just be recreated from
scratch.
 
-Kevin
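
The first checks described above could look roughly like the shell sketch below. It assumes the first line of postmaster.pid in the data directory holds the postmaster's PID; the function names postmaster_alive and capture_ps are made up for illustration and are not PostgreSQL tools. Keep in mind that "kill -0" only sees processes on the host where it runs, so with PGDATA on shared storage the check would have to be repeated on every node.

```shell
#!/bin/sh
# Illustrative helpers, not part of PostgreSQL.

# Succeeds if the PID recorded in <datadir>/postmaster.pid belongs to a
# live process on THIS host. Says nothing about other cluster nodes.
postmaster_alive() {
    pidfile="$1/postmaster.pid"
    [ -f "$pidfile" ] || return 1      # no pid file at all
    pid=$(head -n 1 "$pidfile")        # assumed: first line is the PID
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;       # not a plain number
    esac
    kill -0 "$pid" 2>/dev/null         # process exists and may be signalled
}

# Save the process tree to a timestamped file in directory $1 for later
# analysis and print the file name. The naming scheme is arbitrary.
capture_ps() {
    out="$1/ps-$(date +%Y%m%d-%H%M%S).txt"
    ps auxf > "$out" && echo "$out"
}
```

For example, running postmaster_alive "$PGDATA" && echo "postmaster running here" on each node in turn would show whether more than one host believes it owns the data directory.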



Re: [BUGS] BUG #5452: Server core dumps coming out of recovery mode

2010-06-15 Thread Chris Copeland
Heikki,

Thanks for your help on this issue.

I modified my restore script to return 1 only once and that solved the
problem.

Cheers,
Chris

On Fri, May 7, 2010 at 3:35 AM, Heikki Linnakangas <
heikki.linnakan...@enterprisedb.com> wrote:

> Chris Copeland wrote:
> > I have two servers with the same hardware, OS, and pg binaries.  Log files
> > are copied from the master to the standby and the standby is run in recovery
> > mode.
> >
> > When the standby is triggered to come out of recovery mode, it fails and
> > generates a core dump.  Upon trying to start it after failure, it starts
> > looking for WAL files that it has already recovered.
> >...
> > 2010-05-06 10:57:30 CDT :LOG:  restored log file "000100AF0059"
> > from archive
> > >>>> Now I trigger the restore command to return 1 to stop the recovery
> > 2010-05-06 10:59:30 CDT :LOG:  could not open file
> > "pg_xlog/000100AF005A" (log file 175, segment 90): No such file
> > or directory
> > 2010-05-06 10:59:30 CDT :LOG:  redo done at AF/5968
> > 2010-05-06 10:59:30 CDT :PANIC:  could not open file
> > "pg_xlog/000100AF0059" (log file 175, segment 89): No such file
> > or directory
>
> At startup, the server needs to re-fetch the last checkpoint record.
> That means calling restore_command again for a file that was already
> restored. It looks like restore_command is failing at the re-fetch,
> which causes the PANIC.
>
> To trigger failover, restore_command needs to return 1, once, but it
> must return 0 again on any subsequent calls. I suspect your
> restore_command keeps returning 1 on the subsequent calls.
>
> --
>  Heikki Linnakangas
>  EnterpriseDB   http://www.enterprisedb.com
>
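
One possible way to get a restore script that keeps succeeding on the re-fetch is sketched below. It is not the script used in this thread, and it is not a literal "return 1 once" flag: instead it returns nonzero only when the requested file is absent from the archive and failover has been requested, and returns 0 for any file that is present. The function name restore_once, the trigger-file convention, and all paths are assumptions for illustration, and the sketch assumes the archive still holds the segments that were already restored.

```shell
#!/bin/sh
# Hypothetical restore_command helper; all names and paths are illustrative.
# Assumed recovery.conf entry (wrapper script ending with: restore_once "$@"):
#   restore_command = '/path/to/restore.sh /mnt/archive /tmp/failover.trigger %f %p'

restore_once() {
    archive_dir=$1     # where archived WAL segments are kept
    trigger_file=$2    # created by the operator to request failover
    wal_name=$3        # %f: name of the file the server asks for
    dest_path=$4       # %p: where the server wants it copied

    # Wait for the requested file. Give up with a nonzero status only if
    # the file is absent and failover has been requested.
    while [ ! -f "$archive_dir/$wal_name" ]; do
        if [ -f "$trigger_file" ]; then
            return 1
        fi
        sleep 1
    done

    # The file exists: restore it even after the trigger has fired, so the
    # re-fetch of an already restored segment returns 0.
    cp "$archive_dir/$wal_name" "$dest_path"
}
```

With this shape, the request for the not yet archived next segment fails once the trigger file exists, which ends recovery, while the later request for the segment holding the last checkpoint record is served from the archive as before.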