[BUGS] BUG #4568: sporadic error doing COUNT(*) on empty table

2008-12-08 Thread Sameer Mahajan

The following bug has been logged online:

Bug reference:  4568
Logged by:  Sameer Mahajan
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.1.4-1.3
Operating system:   Suse Linux
Description:sporadic error doing COUNT(*) on empty table
Details: 

While performing COUNT(*) on an empty table via libpq APIs I am spradically
getting an error.

the error is returning 7 for PQresultStatus(res)

and 0 for PQntuples(res) where res is the PGresult returned by this count(*)
query on the empty table.

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4565: nextval not updated during wal replication, leading to pk violations

2008-12-08 Thread Marc Schablewski
FYI: We found another anomaly in that database (duplicated record with a
unique index on a column (non PK)) and so we did some more research. As
before, the record got replicated, but this time, the index entry was
not. Later, a job tried to check if that record was already there by
querying the column (thus, using the index) and couldn't find it so it
was inserted again.

We first thought it could be a WAL that wasn't restored. If I understand
pg_backup_start() and _stop() right, the first WAL that should have been
requested after the online backup had finished should have been the one
mentioned in the "START WAL LOCATION" line of the backup history file,
but the first file that actually gets requested is that in the "STOP WAL
LOCATION" line.

0002004B0027.00396450.backup:

START WAL LOCATION: 4B/27396450 (file 0002004B0027)
STOP WAL LOCATION: 4B/28DDF968 (file 0002004B0028)
CHECKPOINT LOCATION: 4B/27396450
START TIME: 2008-11-28 16:29:17 CET
LABEL: backup-20081128-162916
STOP TIME: 2008-11-28 17:49:09 CET

Both WAL files were copied to the backup system. The log switch occurred
before 17:06, because at that time the file was copied to the backup
server by the archive command. But sadly, the records were all part of a
single transaction which ran between 17:32 and 17:33, so obviously this
is not a part of the problem. Though it seems a bit strange to me that
the ...27 WAL didn't get recovered, but maybe I just don't fully
understand that part of the docs (Chapter 24.3.2., fifth paragraph after
the ordered list).

Right now, I run a fresh online backup to check  if we can reproduce the
problem. I made some minor corrections on my scripts which prepare the
backup database for replication, which might have caused the problem. I
now omit the pg_xlog directory when taking the initial backup, after
rereading chapter 24.3.2. saying there is a "risk of mistakes" if one
copies those files, whatever that means. Maybe someone could clarify
what the risks and mistakes could be?

Speaking of chapter 24.3.2.: The last sentence in second paragraph after
the ordered list ("You must ensure that these steps are carried out in
sequence without any possible overlap, or you will invalidate the
backup.") seems a bit misplaced. As far as I understand, it makes a
reference to the ordered list ("steps"), but I had to read it several
times to get it. I think, a better place would be right after the list.
...But then, maybe it's just me, not sleeping well last night. ;)

Marc


Marc Schablewski wrote:
> The following bug has been logged online:
>
> Bug reference:  4565
> Logged by:  Marc Schablewski
> Email address:  [EMAIL PROTECTED]
> PostgreSQL version: 8.3.5
> Operating system:   Debian, 2.6.24-19 kernel
> Description:nextval not updated during wal replication, leading to
> pk violations
> Details: 
>
> Hi!
>
> We are using "Continuous Archiving" of WAL to keep a warm standby database
> server. We recently switch over to that backup database for testing purpose.
> We first took an online backup of the master database, let the WAL shipping
> run for some days and finally started the backup for normal use. 
>
> A job tried to insert some records into a few tables and complained about a
> PK violation. PK are of type bigint. The column is populated by a sequence
> and a "default nextval()" on the PK column. We found that the sequence's
> currval was lower than the maximum value in the table and that this value
> was already present. Further investigation showed us that the last records
> in the table were inserted on the former master server while taking the
> initial online backup for the replication. It seems that the records got
> replicated but not the currval/nextval of the sequence. When running "select
> nextval()" on the backup database it returned the PK value of the first
> record inserted during that last run on the former master server.
>
> I couldn't reproduce the problem with two different servers and a simple
> database containing one table. Right now, I don't have any idea how to
> narrow down the problem or what to check next. So any hint would be
> helpful.
>
> Marc
>
>   


-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4565: nextval not updated during wal replication, leading to pk violations

2008-12-08 Thread Tom Lane
"Marc Schablewski" <[EMAIL PROTECTED]> writes:
> A job tried to insert some records into a few tables and complained about a
> PK violation. PK are of type bigint. The column is populated by a sequence
> and a "default nextval()" on the PK column. We found that the sequence's
> currval was lower than the maximum value in the table and that this value
> was already present. Further investigation showed us that the last records
> in the table were inserted on the former master server while taking the
> initial online backup for the replication. It seems that the records got
> replicated but not the currval/nextval of the sequence. When running "select
> nextval()" on the backup database it returned the PK value of the first
> record inserted during that last run on the former master server.

How many sequence values "overlapped" here, exactly?  If more than one,
are you by any chance running that sequence with a CACHE setting larger
than one?  (See the sequence's cache_value column if you're unsure.)

I can see what might be a race condition between nextval()'s WAL logging
and checkpoint start, but it doesn't seem like it could account for more
than the CACHE setting worth of overlap.

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs


Re: [BUGS] BUG #4562: ts_headline() adds space when parsing url

2008-12-08 Thread Tom Lane
"Denis Monsieur" <[EMAIL PROTECTED]> writes:
> The problem is a space being added to text in the form of
> http://some.url/path
> Compare the output:

> shs=# SELECT ts_headline('http://some.url', to_tsquery('sometext'));
>ts_headline
> -
>  http://some.url
> (1 row)

> shs=# SELECT ts_headline('http://some.url/path', to_tsquery('sometext'));
>   ts_headline
> ---
>  http:// some.url/path
> (1 row)

I looked into this, and it seems that the problem is that
generateHeadline() emits a space for any token marked as replace = 1.
I think it probably shouldn't emit anything at all.  AFAICS the cases
where replace will get set are token types URL, TAG, NUMHWORD,
ASCIIHWORD, HWORD.  For URL and the HWORD variants the space is
certainly undesirable, because these token types are just respecifying
text that is also covered by their component tokens.  The only case
where you could make an argument that the space is useful is TAG,
as in

regression=# SELECT ts_headline('httpblah', to_tsquery('sometext'));
 ts_headline 
-
 http blah
(1 row)

But it seems to me to be at least as plausible that you should get
nothing as that you should get a space for a removed tag.

Comments?

regards, tom lane

-- 
Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-bugs