[BUGS] pg_dump or hardware?
Hi all, short summary: platform: i386 SMP (dual PIII) os: linux 2.6.8.1 vendor: debian (3.1, stable) pgsql ver: 7.4.7 (deb) disk: tech. SCSI vendor. IBM model. DDYS-T36950N rev. S96H controller: adaptec aic-7892a description: we're experiencing a weird problem trying to get a dump of our db for backup purposes, the executed command is: /usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2 the output: pg_dump: ERROR: could not open relation with OID 201327173 pg_dump: SQL command to dump the contents of table "nominativi" failed: PQendcopy() failed. pg_dump: Error message from server: ERROR: could not open relation with OID 201327173 pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov, prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato, cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout; so it seems that we've got some problems with the "nominativi" table (a 20 million-row table), in fact the following command also fails: pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > nominativi.bz2 with the same err msg as before. Before the erros occurs we're are able to get a partial backup, see: #> ls -l nominativi.bz2 -rw-r--r--1 sickpig users2.5M apr 19 12:35 nominativi.bz2 #> wc -l nominativi 145904 nominativi We're trying to understand whether this is due to data corruption or hardware failure. We run long self-tests on our SCSI disk through smartmontools on a regular basis. see attached file for "smartctl -a /dev/sda" output. All suggestions are welcome. Regards, Andrea smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen Home page is http://smartmontools.sourceforge.net/ Device: IBM DDYS-T36950N Version: S96H Serial number: 5FFL3272 Device type: disk Transport protocol: Fibre channel (FCP-2) Local Time is: Wed Apr 19 13:14:01 2006 CEST Device supports SMART and is Enabled Temperature Warning Disabled or Not Supported SMART Health Status: OK Current Drive Temperature: 41 C Drive Trip Temperature:85 C Manufactured in week 06 of year 2001 Current start stop count: 147 times Recommended maximum start stop count: 1 times Error counter log: Errors CorrectedTotal Total Correction Gigabytes Total delay: [rereads/errors algorithm processed uncorrected minor | major rewrites] corrected invocations [10^9 bytes] errors read: 00 0 5 5 6628.657 0 write: 00 0 0 0 4231.306 0 Non-medium error count:0 SMART Self-test log Num Test Status segment LifeTime LBA_first_err [SK ASC ASQ] Description number (hours) # 1 Background long Completed - 22800 - [- --] # 2 Background long Completed - 22631 - [- --] # 3 Background long Completed - 22463 - [- --] # 4 Background long Completed - 22294 - [- --] # 5 Background long Completed - 22126 - [- --] # 6 Background long Completed - 21958 - [- --] # 7 Background long Completed - 21789 - [- --] # 8 Background long Completed - 21621 - [- --] # 9 Background long Completed - 21452 - [- --] #10 Background long Completed - 21284 - [- --] #11 Background long Completed - 21115 - [- --] #12 Background long Completed - 20947 - [- --] #13 Background long Completed - 20801 - [- --] #14 Background long Completed - 20633 - [- --] #15 Background long Completed - 20464 - [- --] #16 Background long Completed - 20296 - [- --] #17 Background long Completed - 20127 - [- --] #18 Background long Completed - 19959 - [- --] #19 Background long Completed - 19790 - [- --] #20 Background long Completed - 19622 - [- --] Long (extended) Self Test duration: 1340 seconds [22.3 minutes] ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an in
Re: [BUGS] bug in windows xp
Martijn van Oosterhout writes: > Not that I know of. However, the first step is to add this regression > test for SIGFPE [-patches CCed]. This seems completely pointless. The question is not about whether the SIGFPE catcher works when fired, it's about what conditions trigger it. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [BUGS] bug in windows xp
Martijn van Oosterhout writes: > Well, depends how you look at it. The original bug report was about a > backend crash, which is what happens if you don't catch the SIGFPE. Can > we guarentee that we know every situation that might generate a SIGFPE? The point here is that under Windows int4div seems to be generating something other than a SIGFPE --- if it were actually generating that particular signal then the existing SIGFPE catcher would catch it. It's barely possible that int4div *is* generating a SIGFPE and there's some other breakage preventing FloatExceptionHandler from catching it, but that's a question that deserves a one-shot test, not permanent memorialization in a regression test. Besides, if that's the situation then testing that the handler catches kill(SIGFPE) proves exactly zero about what the int4div problem is. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: Permission denied on fsync / Win32 (was [BUGS] right
Here's the evidence from this morning. I have to admit I'm not really sure what to make of it though. Pete The fsync / Permission denied errors occurred on 2 of 3 active servers for the 7 am CLUSTER cycle. Server #1 (with fsync errors): - Both old and new relfilenodes are still visible with a 'dir': 04/19/2006 07:00 AM16,384 1532868 04/19/2006 07:06 AM 8,192 1536650 - postgres.exe processes have handles to both old and new relfilenodes #1: F64: File G:\pgsql\data\base\16385\1532868 F84: Event \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(1745) idle #2: F80: File G:\pgsql\data\base\16385\1536650 AB4: Event \BaseNamedObjects\pgident: postgres: writer process #3: F0C: File G:\pgsql\data\base\16385\1536650 F48: Event \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(1732) idle (plus a few more like this) Server #2 (with fsync errors): - Same pattern as Server #1. bgwriter has a handle to the new relfilenode. Other backends have a handle to either old or new. Server #3 (w/o fsync errors): - Only the new relfilenode is visible with a 'dir': 04/19/2006 07:34 AM16,384 1550915 - postgres.exe processes have handles to both old and new relfilenodes #1: F60: File G:\pgsql\data\base\16385\1547888 F84: Event \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(4060) idle (plus two more like this) #2: F78: File G:\pgsql\data\base\16385\1550915 F84: Event \BaseNamedObjects\pgident: postgres: bigbird bigbird 127.0.0.1(2925) idle (plus two more like this) >>> "Magnus Hagander" <[EMAIL PROTECTED]> 04/18/06 9:00 pm >>> > It happens often enough and the episodes last long enough > that grabbing a handle dump while this is going on should be > easily done. > > Regarding the Win32 error code, backend/storage/file/fd.c > calls _commit(). > http://msdn2.microsoft.com/en-us/library/17618685(VS.80).aspx > It looks > like it is already using errno to report errors. Will > GetLastError() return something useful there? Good point. Ran a quick test. If I open the file read-only and then fsync, I get errno=9 (EBADF) and GetLastError()=5. Which explains the fact that we got the wrong error-code. The *underlying API call* to _commit() returns access denied... Looking at the source to _commit(), if the call to FlushFileBuffers() returns an error, it will set _doserrno to that value,and then return with errno=EBADF. So, this basicalliyu means that FlushFileBuffers() returns ACCESS DENIED. //Magnus ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: Permission denied on fsync / Win32 (was [BUGS] right
"Peter Brant" <[EMAIL PROTECTED]> writes: > Here's the evidence from this morning. I have to admit I'm not really > sure what to make of it though. > ... > - Same pattern as Server #1. bgwriter has a handle to the new > relfilenode. Other backends have a handle to either old or new. It seems pretty clear to me that the problem occurs when we try to fsync the old relfilenode, which is in a pending-delete state but hasn't been deleted yet because of the presence of open handles in some backends. (Peter, did you check that the error messages in the postmaster log all refer to old relfilenodes not new ones?) We should be able to ignore this error, but I'm certainly unwilling to just treat EACCES in general as an ignorable error. So the problem is to distinguish this case from genuine permission failures. Perhaps ERROR_SHARING_VIOLATION should be mapped to something other than EACCES? It seems like that's a rather poor fit. Or we could leave the mapping as-is and add an #ifdef'd test on GetLastError to mdsync() (ugly...) One worry is whether there are any other possible causes of ERROR_SHARING_VIOLATION during fsync, and if so are they all ignorable. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [BUGS] pg_dump or hardware?
andrea suisani <[EMAIL PROTECTED]> writes: > pg_dump: ERROR: could not open relation with OID 201327173 > pg_dump: SQL command to dump the contents of table "nominativi" failed: > PQendcopy() failed. > pg_dump: Error message from server: ERROR: could not open relation with OID > 201327173 Hmm ... what do you get from select oid from pg_class where relname = 'nominativi'; select relname from pg_class where oid = 201327173; If the first returns 201327173 while the second doesn't return anything, then I'd wonder about corruption of pg_class's OID index. REINDEXing it might help. I don't remember whether 7.4 makes you use a standalone backend to reindex system catalogs --- see its REINDEX man page for details. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: Permission denied on fsync / Win32 (was [BUGS] right
I'm not sure that's the whole story. "Server #3" had backends with handles to the old relfilenode. It didn't have any fsync errors and the old relfilenode was apparently successfully deleted (or at least it wasn't visible in the file system anymore). That's the part of the morning's investigations that left me kind of nonplussed. Of half dozen or so cases I've looked at more closely, the error has always referred to the old relfilenode. I'll spot check a few more though. Pete >>> Tom Lane <[EMAIL PROTECTED]> 04/19/06 7:21 pm >>> It seems pretty clear to me that the problem occurs when we try to fsync the old relfilenode, which is in a pending-delete state but hasn't been deleted yet because of the presence of open handles in some backends. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[BUGS] hello
Hello!I want to ask you a favour.Could you give me a free shell acount for bot.If you render me a favour I will be much obliged to you.Thank you for your kindness ! _ Express yourself instantly with MSN Messenger! Download today it's FREE! http://messenger.msn.click-url.com/go/onm00200471ave/direct/01/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [BUGS] bug in windows xp
On Wed, Apr 19, 2006 at 10:15:54AM -0400, Tom Lane wrote: > Martijn van Oosterhout writes: > > Not that I know of. However, the first step is to add this regression > > test for SIGFPE [-patches CCed]. > > This seems completely pointless. The question is not about whether the > SIGFPE catcher works when fired, it's about what conditions trigger it. Well, depends how you look at it. The original bug report was about a backend crash, which is what happens if you don't catch the SIGFPE. Can we guarentee that we know every situation that might generate a SIGFPE? Besides, isn't this what you were referring to here: http://archives.postgresql.org/pgsql-bugs/2006-04/msg00091.php Otherwise we should just fix int4div. -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. signature.asc Description: Digital signature
[BUGS] BUG #2400: '�' considered invalid UTF-8 character
The following bug has been logged online: Bug reference: 2400 Logged by: Yusuf Siddiqui Email address: [EMAIL PROTECTED] PostgreSQL version: 8.1 Operating system: Linux Description:'ÃÂ' considered invalid UTF-8 character Details: The character 'ÃÂ' is rejected as an invalid UTF-8 character. Here are the steps used to recreate it: create table test (text_field text); insert into test (text_field) values ('ÃÂ'); Returned error: invalid UTF-8 byte sequence detected near byte 0x92 This statement also does not work: insert into test (text_field) values ('\ÃÂ'); ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
[BUGS] BUG #2401: spinlocks not available on amd64
The following bug has been logged online: Bug reference: 2401 Logged by: Theo Schlossnagle Email address: [EMAIL PROTECTED] PostgreSQL version: 8.1.3 Operating system: Solaris 10 Description:spinlocks not available on amd64 Details: Compiling 8.1.3 on solaris 10 x86 with Sun Studio 11 for amd64 target architecture leads us to an error resulting from no available "tas" assembly. The tas.s file doesn't look like valid assembly for the shipped Sun assembler. ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
[BUGS] Dreamweaver 8 bug
Please, in Dreamweaver 8 (windows) no show column in table of the basedata postgresql 8.X. How resolve??? Roger Braatz [EMAIL PROTECTED] tel: mobile: Skype ID: 51 3228 631151 8447 1150rogerbraatz Add me to your address book... Want a signature like this?
Re: [BUGS] bug in windows xp
[Re: Uncaught exception when dividing integers] On Tue, Apr 18, 2006 at 10:50:24PM -0400, Bruce Momjian wrote: > > Is anyone working on this? Not that I know of. However, the first step is to add this regression test for SIGFPE [-patches CCed]. Note that this will probably redline windows on the buildfarm. Once this has been added and all architechures are in compliance, we can deal with the integer overflow problem. Triggering a SIGFPE is a bit tricky. On my i386 system the integer divide will do it, but the rest just return +inf. Given there are systems that don't SIGFPE the integer divide, I hope one of the others will trigger... For UNIX systems I've made it try kill() first, that seems the most reliable. Have a nice day, -- Martijn van Oosterhout http://svana.org/kleptog/ > From each according to his ability. To each according to his ability to > litigate. ? src/test/regress/.deps ? src/test/regress/libregress.so.0.0 ? src/test/regress/log ? src/test/regress/pg_regress ? src/test/regress/results ? src/test/regress/testtablespace ? src/test/regress/tmp_check ? src/test/regress/expected/constraints.out ? src/test/regress/expected/copy.out ? src/test/regress/expected/create_function_1.out ? src/test/regress/expected/create_function_2.out ? src/test/regress/expected/misc.out ? src/test/regress/expected/tablespace.out ? src/test/regress/sql/constraints.sql ? src/test/regress/sql/copy.sql ? src/test/regress/sql/create_function_1.sql ? src/test/regress/sql/create_function_2.sql ? src/test/regress/sql/misc.sql ? src/test/regress/sql/tablespace.sql Index: src/test/regress/regress.c === RCS file: /projects/cvsroot/pgsql/src/test/regress/regress.c,v retrieving revision 1.65 diff -u -r1.65 regress.c --- src/test/regress/regress.c 11 Jan 2006 20:12:43 - 1.65 +++ src/test/regress/regress.c 19 Apr 2006 10:16:59 - @@ -10,6 +10,12 @@ #include "executor/executor.h" /* For GetAttributeByName */ #include "commands/sequence.h" /* for nextval() */ +/* For the SIGFPE test */ +#if defined(HAVE_POSIX_SIGNALS) && defined (HAVE_UNISTD_H) +#include +#include +#endif + #define P_MAXDIG 12 #define LDELIM '(' #define RDELIM ')' @@ -26,7 +32,7 @@ extern int oldstyle_length(int n, text *t); extern Datum int44in(PG_FUNCTION_ARGS); extern Datum int44out(PG_FUNCTION_ARGS); - +extern Datum sigfpe(PG_FUNCTION_ARGS); /* * Distance from a point to a path @@ -734,3 +740,33 @@ *--walk = '\0'; PG_RETURN_CSTRING(result); } + +/* + * sigfpe - deliberatly generates a floating point exception + */ +PG_FUNCTION_INFO_V1(sigfpe); + +Datum +sigfpe(PG_FUNCTION_ARGS) +{ +#if defined(HAVE_POSIX_SIGNALS) && defined (HAVE_UNISTD_H) + /* Most guarenteed way */ + kill( getpid(), SIGFPE ); +#endif + /* If no signals, try to trigger manually */ + volatile int int_val = 0; /* Avoid compiler constant elimination */ + volatile float float_val = 0; + + char str[12]; + sprintf( str, "%d", 5/int_val ); + sprintf( str, "%f", 5.0/float_val ); + + float_val = -10; + sprintf( str, "%f", log(float_val) ); + + float_val = 2e+304; + sprintf( str, "%1.f", exp10(float_val) ); + sprintf( str, "%1.f", float_val * float_val ); + + PG_RETURN_VOID(); +} Index: src/test/regress/expected/errors.out === RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/errors.out,v retrieving revision 1.52 diff -u -r1.52 errors.out --- src/test/regress/expected/errors.out15 Apr 2006 17:45:46 - 1.52 +++ src/test/regress/expected/errors.out19 Apr 2006 10:16:59 - @@ -449,6 +449,10 @@ ERROR: syntax error at or near "NUL" LINE 16: ...L, id2 TEXT NOT NULL PRIMARY KEY, id3 INTEGER NOT NUL, id4 I... ^ +-- Check that floating point exceptions are properly caught +select sigfpe(); +ERROR: floating-point exception +DETAIL: An invalid floating-point operation was signaled. This probably means an out-of-range result or an invalid operation, such as division by zero. -- Check that stack depth detection mechanism works and -- max_stack_depth is not set too high create function infinite_recurse() returns int as Index: src/test/regress/input/create_function_1.source === RCS file: /projects/cvsroot/pgsql/src/test/regress/input/create_function_1.source,v retrieving revision 1.17 diff -u -r1.17 create_function_1.source --- src/test/regress/input/create_function_1.source 27 Feb 2006 16:09:50 - 1.17 +++ src/test/regress/input/create_function_1.source 19 Apr 2006 10:16:59 - @@ -52,6 +52,11 @@ AS '@abs_builddir@/[EMAIL PROTECTED]@' LANGUAGE C STRICT; +CREATE FUNCT
Re: [BUGS] BUG #2401: spinlocks not available on amd64
Theo Schlossnagle wrote: > > The following bug has been logged online: > > Bug reference: 2401 > Logged by: Theo Schlossnagle > Email address: [EMAIL PROTECTED] > PostgreSQL version: 8.1.3 > Operating system: Solaris 10 > Description:spinlocks not available on amd64 > Details: > > Compiling 8.1.3 on solaris 10 x86 with Sun Studio 11 for amd64 target > architecture leads us to an error resulting from no available "tas" > assembly. > > The tas.s file doesn't look like valid assembly for the shipped Sun > assembler. Yes. We will have a fix for it in 8.2, but it was too risky for 8.1.X. You can try a snapshot tarball from our FTP server and see if that works. -- Bruce Momjian http://candle.pha.pa.us EnterpriseDBhttp://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [BUGS] BUG #2400: 'Ã
On Tue, Apr 18, 2006 at 11:34:53PM +, Yusuf Siddiqui wrote: > > The following bug has been logged online: > > Bug reference: 2400 > Logged by: Yusuf Siddiqui > Email address: [EMAIL PROTECTED] > PostgreSQL version: 8.1 > Operating system: Linux > Description:'Ã' considered invalid UTF-8 character > Details: > > The character 'Ã' is rejected as an invalid UTF-8 character. Well, maybe it is :-) > Here are the steps used to recreate it: > > create table test (text_field text); > insert into test (text_field) values ('Ã'); > > Returned error: > invalid UTF-8 byte sequence detected near byte 0x92 [...] I'd need to know more. I gather from your mail that you are entering the character into psql from a console. Several factors are relevant here: - which character encoding does your console have? (if it is, e.g. iso-8859-x then this will be probably the culprit) - which client encoding is set? (in psql type SHOW CLIENT_ENCODING;) - which encoding is the server using (I'd guess utf-8; it doesn't need to be the same as the client's, since it will try to convert). HTH -- tomás signature.asc Description: Digital signature