Hi all,
short summary:
platform: i386 SMP (dual PIII)
os: linux 2.6.8.1
vendor: debian (3.1, stable)
pgsql ver: 7.4.7 (deb)
disk: SCSI, vendor IBM, model DDYS-T36950N, rev. S96H
controller: adaptec aic-7892a
description:
We're experiencing a weird problem while trying to dump our db
for backup purposes. The command executed is:
/usr/bin/pg_dump -U postgres -h 6pali elenco | /usr/bin/bzip2 > elenco_test.bz2
the output:
pg_dump: ERROR: could not open relation with OID 201327173
pg_dump: SQL command to dump the contents of table "nominativi" failed:
PQendcopy() failed.
pg_dump: Error message from server: ERROR: could not open relation with OID 201327173
pg_dump: The command was: COPY public.nominativi (nome_cogno, indirizzo, cap, citta, prov,
prefisso, telefono1, telefono2, note, idpersona, estrazione, num_estra, occupato,
cod_prov, cod_com, cod_reg, capoluo, rand) TO stdout;
So it seems the problem lies with the "nominativi" table
(a 20 million-row table). In fact, the following command also fails:
pg_dump -t nominativi -U postgres -h 6pali elenco | /usr/bin/bzip2 > nominativi.bz2
with the same error message as before. Before the error occurs we are able to
get a partial backup, see:
#> ls -l nominativi.bz2
-rw-r--r-- 1 sickpig users 2.5M apr 19 12:35 nominativi.bz2
#> wc -l nominativi
145904 nominativi
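For reference, the same count can be taken straight from the compressed file, without unpacking it to disk first. This is only a minimal sketch: `count_lines_bz2` is a name made up for this mail, and the figure includes whatever non-data lines pg_dump wrote before the COPY data, so it only approximates the number of rows dumped.

```python
import bz2


def count_lines_bz2(path):
    """Count the lines in a bzip2-compressed text file, streaming it."""
    count = 0
    # bz2.open decompresses on the fly; iterating yields one line at a time,
    # so the whole dump never has to fit in memory.
    with bz2.open(path, "rb") as fh:
        for _ in fh:
            count += 1
    return count


# Hypothetical usage against the partial dump:
#   print(count_lines_bz2("nominativi.bz2"))
```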
We're trying to understand whether this is due to data corruption or
hardware failure. We run long self-tests on our SCSI disk through
smartmontools on a regular basis. See the attached file for the
"smartctl -a /dev/sda" output. All suggestions are welcome.
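One check we can think of is asking the system catalog whether it still knows the OID from the error message. The sketch below only pulls the OID out of the error text and builds a lookup query against pg_class; the helper names are made up for this mail, and the idea that the OID might belong to the table's TOAST relation (hence the `reltoastrelid` test) is our assumption, not something the error tells us.

```python
import re

# Error line copied from the pg_dump output above.
ERROR_LINE = "pg_dump: ERROR: could not open relation with OID 201327173"


def extract_oid(message):
    """Return the OID named in a 'could not open relation' error, or None."""
    match = re.search(r"could not open relation with OID\s+(\d+)", message)
    return int(match.group(1)) if match else None


def lookup_query(oid):
    """Build a pg_class lookup for the OID (hypothetical helper)."""
    oid = int(oid)  # only ever interpolate an integer into the SQL text
    return (
        "SELECT oid, relname, relkind, relfilenode, reltoastrelid "
        "FROM pg_class "
        f"WHERE oid = {oid} OR reltoastrelid = {oid};"
    )


print(lookup_query(extract_oid(ERROR_LINE)))
```

The printed statement can be pasted into psql while connected to the "elenco" database. An empty result would suggest that the catalog no longer has an entry for that relation, but we would welcome a second opinion on how to read it.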
Regards,
Andrea
smartctl version 5.32 Copyright (C) 2002-4 Bruce Allen
Home page is http://smartmontools.sourceforge.net/
Device: IBM DDYS-T36950N Version: S96H
Serial number: 5FFL3272
Device type: disk
Transport protocol: Fibre channel (FCP-2)
Local Time is: Wed Apr 19 13:14:01 2006 CEST
Device supports SMART and is Enabled
Temperature Warning Disabled or Not Supported
SMART Health Status: OK
Current Drive Temperature: 41 C
Drive Trip Temperature: 85 C
Manufactured in week 06 of year 2001
Current start stop count: 147 times
Recommended maximum start stop count: 10000 times
Error counter log:
          Errors Corrected    Total      Total   Correction     Gigabytes    Total
              delay:       [rereads/    errors   algorithm      processed    uncorrected
            minor | major  rewrites]  corrected  invocations   [10^9 bytes]  errors
read:          0        0         0         5          5       6628.657           0
write:         0        0         0         0          0       4231.306           0
Non-medium error count: 0
SMART Self-test log
Num  Test              Status                 segment  LifeTime  LBA_first_err [SK ASC ASQ]
     Description                              number   (hours)
# 1  Background long   Completed                   -   22800                 - [-   -    -]
# 2  Background long   Completed                   -   22631                 - [-   -    -]
# 3  Background long   Completed                   -   22463                 - [-   -    -]
# 4  Background long   Completed                   -   22294                 - [-   -    -]
# 5  Background long   Completed                   -   22126                 - [-   -    -]
# 6  Background long   Completed                   -   21958                 - [-   -    -]
# 7  Background long   Completed                   -   21789                 - [-   -    -]
# 8  Background long   Completed                   -   21621                 - [-   -    -]
# 9  Background long   Completed                   -   21452                 - [-   -    -]
#10  Background long   Completed                   -   21284                 - [-   -    -]
#11  Background long   Completed                   -   21115                 - [-   -    -]
#12  Background long   Completed                   -   20947                 - [-   -    -]
#13  Background long   Completed                   -   20801                 - [-   -    -]
#14  Background long   Completed                   -   20633                 - [-   -    -]
#15  Background long   Completed                   -   20464                 - [-   -    -]
#16  Background long   Completed                   -   20296                 - [-   -    -]
#17  Background long   Completed                   -   20127                 - [-   -    -]
#18  Background long   Completed                   -   19959                 - [-   -    -]
#19  Background long   Completed                   -   19790                 - [-   -    -]
#20  Background long   Completed                   -   19622                 - [-   -    -]
Long (extended) Self Test duration: 1340 seconds [22.3 minutes]
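To back up the "regular basis" remark, the LifeTime column of the self-test log above can be differenced to get the spacing between tests. This is just a quick sketch using the hours as logged; a week is 168 hours, and all but one of the gaps come out at 168 or 169 hours, with a single shorter gap of 146 hours between tests #12 and #13.

```python
# LifeTime (power-on hours) of the 20 logged self-tests, newest first,
# copied from the smartctl self-test log.
lifetime_hours = [
    22800, 22631, 22463, 22294, 22126, 21958, 21789, 21621, 21452, 21284,
    21115, 20947, 20801, 20633, 20464, 20296, 20127, 19959, 19790, 19622,
]

# Hours elapsed between each test and the one logged just before it.
intervals = [
    newer - older for newer, older in zip(lifetime_hours, lifetime_hours[1:])
]

for number, gap in enumerate(intervals, start=1):
    print(f"test #{number:2d} ran {gap} h after test #{number + 1}")
```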