[BUGS] pg_dump or hardware?

2006-04-19 Thread andrea suisani

Hi all,

short summary:

platform: i386 SMP (dual PIII)
os: linux 2.6.8.1
vendor: debian (3.1, stable)
pgsql ver: 7.4.7 (deb)
disk: tech. SCSI vendor. IBM model. DDYS-T36950N rev. S96H
controller: adaptec aic-7892a

description:

we're experiencing a weird problem
trying to get a dump of our db for backup purposes,
the executed command is:

/usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2

the output:

pg_dump: ERROR:  could not open relation with OID 201327173
pg_dump: SQL command to dump the contents of table "nominativi" failed: 
PQendcopy() failed.
pg_dump: Error message from server: ERROR:  could not open relation with OID 
201327173
pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov, 
prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato, 
cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout;



so it seems that we've got some problems with the "nominativi" table
(a 20 million-row table), in fact the following command also fails:

pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > 
nominativi.bz2

with the same err msg as before. Before the erros occurs we're are able to
get a partial backup, see:

#> ls -l nominativi.bz2
-rw-r--r--1 sickpig  users2.5M apr 19 12:35 nominativi.bz2

#> wc -l nominativi
 145904 nominativi

We're trying to understand whether this is due to data corruption or
hardware failure. We run long self-tests on our SCSI disk through
smartmontools on a regular basis. see attached file for "smartctl -a /dev/sda"
output. All suggestions are welcome.


Regards,


Andrea






smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

Device: IBM  DDYS-T36950N Version: S96H
Serial number: 5FFL3272
Device type: disk
Transport protocol: Fibre channel (FCP-2)
Local Time is: Wed Apr 19 13:14:01 2006 CEST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK

Current Drive Temperature: 41 C
Drive Trip Temperature:85 C
Manufactured in week 06 of year 2001
Current start stop count:  147 times
Recommended maximum start stop count:  1 times

Error counter log:
  Errors CorrectedTotal  Total   Correction Gigabytes
Total
  delay:   [rereads/errors   algorithm  processed
uncorrected
minor | major  rewrites]  corrected  invocations   [10^9 bytes]  
errors
read:  00 0 5  5   6628.657 
  0
write: 00 0 0  0   4231.306 
  0

Non-medium error count:0

SMART Self-test log
Num  Test  Status segment  LifeTime  LBA_first_err 
[SK ASC ASQ]
 Description  number   (hours)
# 1  Background long   Completed   - 22800   - 
[-   --]
# 2  Background long   Completed   - 22631   - 
[-   --]
# 3  Background long   Completed   - 22463   - 
[-   --]
# 4  Background long   Completed   - 22294   - 
[-   --]
# 5  Background long   Completed   - 22126   - 
[-   --]
# 6  Background long   Completed   - 21958   - 
[-   --]
# 7  Background long   Completed   - 21789   - 
[-   --]
# 8  Background long   Completed   - 21621   - 
[-   --]
# 9  Background long   Completed   - 21452   - 
[-   --]
#10  Background long   Completed   - 21284   - 
[-   --]
#11  Background long   Completed   - 21115   - 
[-   --]
#12  Background long   Completed   - 20947   - 
[-   --]
#13  Background long   Completed   - 20801   - 
[-   --]
#14  Background long   Completed   - 20633   - 
[-   --]
#15  Background long   Completed   - 20464   - 
[-   --]
#16  Background long   Completed   - 20296   - 
[-   --]
#17  Background long   Completed   - 20127   - 
[-   --]
#18  Background long   Completed   - 19959   - 
[-   --]
#19  Background long   Completed   - 19790   - 
[-   --]
#20  Background long   Completed   - 19622   - 
[-   --]

Long (extended) Self Test duration: 1340 seconds [22.3 minutes]

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an in

Re: [BUGS] bug in windows xp

2006-04-19 Thread Tom Lane
Martijn van Oosterhout  writes:
> Not that I know of. However, the first step is to add this regression
> test for SIGFPE [-patches CCed].

This seems completely pointless.  The question is not about whether the
SIGFPE catcher works when fired, it's about what conditions trigger it.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [BUGS] bug in windows xp

2006-04-19 Thread Tom Lane
Martijn van Oosterhout  writes:
> Well, depends how you look at it. The original bug report was about a
> backend crash, which is what happens if you don't catch the SIGFPE. Can
> we guarentee that we know every situation that might generate a SIGFPE?

The point here is that under Windows int4div seems to be generating
something other than a SIGFPE --- if it were actually generating that
particular signal then the existing SIGFPE catcher would catch it.

It's barely possible that int4div *is* generating a SIGFPE and there's
some other breakage preventing FloatExceptionHandler from catching it,
but that's a question that deserves a one-shot test, not permanent
memorialization in a regression test.  Besides, if that's the situation
then testing that the handler catches kill(SIGFPE) proves exactly zero
about what the int4div problem is.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: Permission denied on fsync / Win32 (was [BUGS] right

2006-04-19 Thread Peter Brant
Here's the evidence from this morning.  I have to admit I'm not really
sure what to make of it though.

Pete

The fsync / Permission denied errors occurred on 2 of 3 active servers
for the 7 am CLUSTER cycle.

Server #1 (with fsync errors):
  - Both old and new relfilenodes are still visible with a 'dir':
04/19/2006  07:00 AM16,384 1532868
04/19/2006  07:06 AM 8,192 1536650

  - postgres.exe processes have handles to both old and new
relfilenodes

#1:
  F64: File  G:\pgsql\data\base\16385\1532868
  F84: Event \BaseNamedObjects\pgident: postgres: bigbird
bigbird 127.0.0.1(1745) idle

#2:
  F80: File  G:\pgsql\data\base\16385\1536650
  AB4: Event \BaseNamedObjects\pgident: postgres: writer
process

#3:
  F0C: File  G:\pgsql\data\base\16385\1536650
  F48: Event \BaseNamedObjects\pgident: postgres: bigbird
bigbird 127.0.0.1(1732) idle

(plus a few more like this)

Server #2 (with fsync errors):
  - Same pattern as Server #1.  bgwriter has a handle to the new
relfilenode.  Other backends have a handle to either old or new.  

Server #3 (w/o fsync errors):
  - Only the new relfilenode is visible with a 'dir':
04/19/2006  07:34 AM16,384 1550915

  - postgres.exe processes have handles to both old and new
relfilenodes

#1:
  F60: File  G:\pgsql\data\base\16385\1547888
  F84: Event \BaseNamedObjects\pgident: postgres: bigbird
bigbird 127.0.0.1(4060) idle

(plus two more like this)

#2:
  F78: File  G:\pgsql\data\base\16385\1550915
  F84: Event \BaseNamedObjects\pgident: postgres: bigbird
bigbird 127.0.0.1(2925) idle

(plus two more like this)


>>> "Magnus Hagander" <[EMAIL PROTECTED]> 04/18/06 9:00 pm >>>
> It happens often enough and the episodes last long enough 
> that grabbing a handle dump while this is going on should be 
> easily done.
> 
> Regarding the Win32 error code, backend/storage/file/fd.c 
> calls _commit(). 
> http://msdn2.microsoft.com/en-us/library/17618685(VS.80).aspx 
>   It looks
> like it is already using errno to report errors.  Will 
> GetLastError() return something useful there?

Good point.
Ran a quick test. If I open the file read-only and then fsync, I get
errno=9 (EBADF) and GetLastError()=5.  Which explains the fact that we
got the wrong error-code. The *underlying API call* to _commit()
returns
access denied...

Looking at the source to _commit(), if the call to FlushFileBuffers()
returns an error, it will set _doserrno to that value,and then return
with errno=EBADF.

So, this basicalliyu means that FlushFileBuffers() returns ACCESS
DENIED.

//Magnus

---(end of
broadcast)---
TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: Permission denied on fsync / Win32 (was [BUGS] right

2006-04-19 Thread Tom Lane
"Peter Brant" <[EMAIL PROTECTED]> writes:
> Here's the evidence from this morning.  I have to admit I'm not really
> sure what to make of it though.
> ...
>   - Same pattern as Server #1.  bgwriter has a handle to the new
> relfilenode.  Other backends have a handle to either old or new.  

It seems pretty clear to me that the problem occurs when we try to
fsync the old relfilenode, which is in a pending-delete state but
hasn't been deleted yet because of the presence of open handles in
some backends.  (Peter, did you check that the error messages in
the postmaster log all refer to old relfilenodes not new ones?)

We should be able to ignore this error, but I'm certainly unwilling
to just treat EACCES in general as an ignorable error.  So the problem
is to distinguish this case from genuine permission failures.  Perhaps
ERROR_SHARING_VIOLATION should be mapped to something other than EACCES?
It seems like that's a rather poor fit.  Or we could leave the mapping
as-is and add an #ifdef'd test on GetLastError to mdsync() (ugly...)

One worry is whether there are any other possible causes of
ERROR_SHARING_VIOLATION during fsync, and if so are they all ignorable.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [BUGS] pg_dump or hardware?

2006-04-19 Thread Tom Lane
andrea suisani <[EMAIL PROTECTED]> writes:
> pg_dump: ERROR:  could not open relation with OID 201327173
> pg_dump: SQL command to dump the contents of table "nominativi" failed: 
> PQendcopy() failed.
> pg_dump: Error message from server: ERROR:  could not open relation with OID 
> 201327173

Hmm ... what do you get from

select oid from pg_class where relname = 'nominativi';
select relname from pg_class where oid = 201327173;

If the first returns 201327173 while the second doesn't return anything,
then I'd wonder about corruption of pg_class's OID index.  REINDEXing it
might help.  I don't remember whether 7.4 makes you use a standalone
backend to reindex system catalogs --- see its REINDEX man page for
details.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: Permission denied on fsync / Win32 (was [BUGS] right

2006-04-19 Thread Peter Brant
I'm not sure that's the whole story.  "Server #3" had backends with
handles to the old relfilenode.  It didn't have any fsync errors and the
old relfilenode was apparently successfully deleted (or at least it
wasn't visible in the file system anymore).  That's the part of the
morning's investigations that left me kind of nonplussed.

Of half dozen or so cases I've looked at more closely, the error has
always referred to the old relfilenode.  I'll spot check a few more
though.

Pete

>>> Tom Lane <[EMAIL PROTECTED]> 04/19/06 7:21 pm >>>
It seems pretty clear to me that the problem occurs when we try to
fsync the old relfilenode, which is in a pending-delete state but
hasn't been deleted yet because of the presence of open handles in
some backends. 

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[BUGS] hello

2006-04-19 Thread Rasim Mustafa

Hello!I want to ask you a favour.Could you give me a free shell acount for
bot.If you render me a favour I will be much obliged to you.Thank you for
your kindness !

_
Express yourself instantly with MSN Messenger! Download today it's FREE! 
http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [BUGS] bug in windows xp

2006-04-19 Thread Martijn van Oosterhout
On Wed, Apr 19, 2006 at 10:15:54AM -0400, Tom Lane wrote:
> Martijn van Oosterhout  writes:
> > Not that I know of. However, the first step is to add this regression
> > test for SIGFPE [-patches CCed].
> 
> This seems completely pointless.  The question is not about whether the
> SIGFPE catcher works when fired, it's about what conditions trigger it.

Well, depends how you look at it. The original bug report was about a
backend crash, which is what happens if you don't catch the SIGFPE. Can
we guarentee that we know every situation that might generate a SIGFPE?

Besides, isn't this what you were referring to here:

http://archives.postgresql.org/pgsql-bugs/2006-04/msg00091.php

Otherwise we should just fix int4div.
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.


signature.asc
Description: Digital signature


[BUGS] BUG #2400: '�' considered invalid UTF-8 character

2006-04-19 Thread Yusuf Siddiqui

The following bug has been logged online:

Bug reference:  2400
Logged by:  Yusuf Siddiqui
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.1
Operating system:   Linux
Description:'Æ' considered invalid UTF-8 character
Details: 

The character 'Æ' is rejected as an invalid UTF-8 character.

Here are the steps used to recreate it:

create table test (text_field text);
insert into test (text_field) values ('Æ');

Returned error: 
invalid UTF-8 byte sequence detected near byte 0x92

This statement also does not work:
insert into test (text_field) values ('\Æ');

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[BUGS] BUG #2401: spinlocks not available on amd64

2006-04-19 Thread Theo Schlossnagle

The following bug has been logged online:

Bug reference:  2401
Logged by:  Theo Schlossnagle
Email address:  [EMAIL PROTECTED]
PostgreSQL version: 8.1.3
Operating system:   Solaris 10
Description:spinlocks not available on amd64
Details: 

Compiling 8.1.3 on solaris 10 x86 with Sun Studio 11 for amd64 target
architecture leads us to an error resulting from no available "tas"
assembly.

The tas.s file doesn't look like valid assembly for the shipped Sun
assembler.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[BUGS] Dreamweaver 8 bug

2006-04-19 Thread Roger Braatz



Please,
 
    in  Dreamweaver 8 (windows) no show column in table of the basedata 
postgresql 8.X. How resolve???
 



  
  

  


  

  
  

  


  Roger Braatz 
  

  [EMAIL PROTECTED] 
  

  
  
tel: mobile: 
  Skype ID:
51 3228 631151 8447 
  1150rogerbraatz 

  
  

  


  Add me to your address book...
  Want a signature like 
  this?
 


Re: [BUGS] bug in windows xp

2006-04-19 Thread Martijn van Oosterhout
[Re: Uncaught exception when dividing integers]

On Tue, Apr 18, 2006 at 10:50:24PM -0400, Bruce Momjian wrote:
> 
> Is anyone working on this?

Not that I know of. However, the first step is to add this regression
test for SIGFPE [-patches CCed]. Note that this will probably redline
windows on the buildfarm. Once this has been added and all
architechures are in compliance, we can deal with the integer overflow
problem.

Triggering a SIGFPE is a bit tricky. On my i386 system the integer
divide will do it, but the rest just return +inf. Given there are
systems that don't SIGFPE the integer divide, I hope one of the others
will trigger... For UNIX systems I've made it try kill() first, that
seems the most reliable.

Have a nice day,
-- 
Martijn van Oosterhout  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to 
> litigate.
? src/test/regress/.deps
? src/test/regress/libregress.so.0.0
? src/test/regress/log
? src/test/regress/pg_regress
? src/test/regress/results
? src/test/regress/testtablespace
? src/test/regress/tmp_check
? src/test/regress/expected/constraints.out
? src/test/regress/expected/copy.out
? src/test/regress/expected/create_function_1.out
? src/test/regress/expected/create_function_2.out
? src/test/regress/expected/misc.out
? src/test/regress/expected/tablespace.out
? src/test/regress/sql/constraints.sql
? src/test/regress/sql/copy.sql
? src/test/regress/sql/create_function_1.sql
? src/test/regress/sql/create_function_2.sql
? src/test/regress/sql/misc.sql
? src/test/regress/sql/tablespace.sql
Index: src/test/regress/regress.c
===
RCS file: /projects/cvsroot/pgsql/src/test/regress/regress.c,v
retrieving revision 1.65
diff -u -r1.65 regress.c
--- src/test/regress/regress.c  11 Jan 2006 20:12:43 -  1.65
+++ src/test/regress/regress.c  19 Apr 2006 10:16:59 -
@@ -10,6 +10,12 @@
 #include "executor/executor.h" /* For GetAttributeByName */
 #include "commands/sequence.h" /* for nextval() */
 
+/* For the SIGFPE test */
+#if defined(HAVE_POSIX_SIGNALS) && defined (HAVE_UNISTD_H)
+#include 
+#include 
+#endif
+
 #define P_MAXDIG 12
 #define LDELIM '('
 #define RDELIM ')'
@@ -26,7 +32,7 @@
 extern int oldstyle_length(int n, text *t);
 extern Datum int44in(PG_FUNCTION_ARGS);
 extern Datum int44out(PG_FUNCTION_ARGS);
-
+extern Datum sigfpe(PG_FUNCTION_ARGS);
 
 /*
  * Distance from a point to a path
@@ -734,3 +740,33 @@
*--walk = '\0';
PG_RETURN_CSTRING(result);
 }
+
+/*
+ * sigfpe  - deliberatly generates a floating 
point exception
+ */
+PG_FUNCTION_INFO_V1(sigfpe);
+
+Datum
+sigfpe(PG_FUNCTION_ARGS)
+{
+#if defined(HAVE_POSIX_SIGNALS) && defined (HAVE_UNISTD_H)
+   /* Most guarenteed way */
+   kill( getpid(), SIGFPE );
+#endif
+   /* If no signals, try to trigger manually */
+   volatile int int_val = 0; /* Avoid compiler constant elimination */
+   volatile float float_val = 0;
+
+   char str[12];
+   sprintf( str, "%d", 5/int_val );
+   sprintf( str, "%f", 5.0/float_val );
+   
+   float_val = -10;
+   sprintf( str, "%f", log(float_val) );
+
+   float_val = 2e+304;
+   sprintf( str, "%1.f", exp10(float_val) );
+   sprintf( str, "%1.f", float_val * float_val );
+
+   PG_RETURN_VOID();
+}
Index: src/test/regress/expected/errors.out
===
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/errors.out,v
retrieving revision 1.52
diff -u -r1.52 errors.out
--- src/test/regress/expected/errors.out15 Apr 2006 17:45:46 -  
1.52
+++ src/test/regress/expected/errors.out19 Apr 2006 10:16:59 -
@@ -449,6 +449,10 @@
 ERROR:  syntax error at or near "NUL"
 LINE 16: ...L, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 I...
   ^
+-- Check that floating point exceptions are properly caught
+select sigfpe();
+ERROR:  floating-point exception
+DETAIL:  An invalid floating-point operation was signaled. This probably means 
an out-of-range result or an invalid operation, such as division by zero.
 -- Check that stack depth detection mechanism works and
 -- max_stack_depth is not set too high
 create function infinite_recurse() returns int as
Index: src/test/regress/input/create_function_1.source
===
RCS file: 
/projects/cvsroot/pgsql/src/test/regress/input/create_function_1.source,v
retrieving revision 1.17
diff -u -r1.17 create_function_1.source
--- src/test/regress/input/create_function_1.source 27 Feb 2006 16:09:50 
-  1.17
+++ src/test/regress/input/create_function_1.source 19 Apr 2006 10:16:59 
-
@@ -52,6 +52,11 @@
 AS '@abs_builddir@/[EMAIL PROTECTED]@'
 LANGUAGE C STRICT;
 
+CREATE FUNCT

Re: [BUGS] BUG #2401: spinlocks not available on amd64

2006-04-19 Thread Bruce Momjian
Theo Schlossnagle wrote:
> 
> The following bug has been logged online:
> 
> Bug reference:  2401
> Logged by:  Theo Schlossnagle
> Email address:  [EMAIL PROTECTED]
> PostgreSQL version: 8.1.3
> Operating system:   Solaris 10
> Description:spinlocks not available on amd64
> Details: 
> 
> Compiling 8.1.3 on solaris 10 x86 with Sun Studio 11 for amd64 target
> architecture leads us to an error resulting from no available "tas"
> assembly.
> 
> The tas.s file doesn't look like valid assembly for the shipped Sun
> assembler.

Yes.  We will have a fix for it in 8.2, but it was too risky for 8.1.X.
You can try a snapshot tarball from our FTP server and see if that works.

-- 
  Bruce Momjian   http://candle.pha.pa.us
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [BUGS] BUG #2400: 'Ã

2006-04-19 Thread Tomas Zerolo
On Tue, Apr 18, 2006 at 11:34:53PM +, Yusuf Siddiqui wrote:
> 
> The following bug has been logged online:
> 
> Bug reference:  2400
> Logged by:  Yusuf Siddiqui
> Email address:  [EMAIL PROTECTED]
> PostgreSQL version: 8.1
> Operating system:   Linux
> Description:'Æ' considered invalid UTF-8 character
> Details: 
> 
> The character 'Æ' is rejected as an invalid UTF-8 character.

Well, maybe it is :-)

> Here are the steps used to recreate it:
> 
> create table test (text_field text);
> insert into test (text_field) values ('Æ');
> 
> Returned error: 
> invalid UTF-8 byte sequence detected near byte 0x92

[...]

I'd need to know more. I gather from your mail that you are entering the
character into psql from a console. Several factors are relevant here:

  - which character encoding does your console have?
(if it is, e.g. iso-8859-x then this will be probably the culprit)
  - which client encoding is set? (in psql type SHOW CLIENT_ENCODING;)
  - which encoding is the server using (I'd guess utf-8; it doesn't need
to be the same as the client's, since it will try to convert).

HTH
-- tomás


signature.asc
Description: Digital signature