Hi Chiru,

I am trying to pg_dump the database to have a snapshot of the current
state. I've turned on 'zero_damaged_pages' but pg_dump is still failing
with an "invalid page header" error - this time from what looks like a
sequence object that is auto-setting IDs on a table. Any advice on how to
remove this error?

Here is the full query that's failing:

SELECT sequence_name, start_value, last_value, increment_by, CASE WHEN
increment_by > 0 AND max_value = 9223372036854775807 THEN NULL      WHEN
increment_by < 0 AND max_value = -1 THEN NULL      ELSE max_value END AS
max_value, CASE WHEN increment_by > 0 AND min_value = 1 THEN NULL      WHEN
increment_by < 0 AND min_value = -9223372036854775807 THEN NULL      ELSE
min_value END AS min_value, cache_value, is_cycled, is_called from
unfinishedsubmissionstub_id_seq

On Fri, Dec 26, 2014 at 2:35 PM, chiru r <chir...@gmail.com> wrote:

> Hi Cory,
>
> After recovering table turn off *zero_damaged_pages  *parameter.
>
>
> On Fri, Dec 26, 2014 at 9:13 PM, Cory Zue <c...@dimagi.com> wrote:
>
>> Hi all,
>>
>> Thanks for the responses. Chiru, I'm looking into your suggestion.
>>
>> Sameer, here is the kernel version info:
>>
>> Linux dimagi 2.6.32-431.20.5.el6.x86_64 #1 SMP Wed Jul 16 05:26:53 EDT
>> 2014 x86_64 x86_64 x86_64 GNU/Linux
>>
>> Does that seem like it could be a problematic version?
>>
>> More generally - I'm still wondering whether I should chalk this failure
>> up to a transient/random issue, or whether I should be more worried about
>> the hardware on the machine. According to our diagnostic tools,  disk and
>> memory are fine, but it's still not clear to me how it got into this state.
>> Any general bits of information regarding the potential causes of these
>> types of issues would be much appreciated.
>>
>> thanks,
>> Cory
>>
>>
>> On Fri, Dec 26, 2014 at 6:55 AM, Sameer Kumar <sameer.ku...@ashnik.com>
>> wrote:
>>
>>> On 23 Dec 2014 12:05, "Cory Zue" <c...@dimagi.com> wrote:
>>> >
>>> > Hi all,
>>> >
>>> > Our postgres instance on one of our production machines has recently
>>> been returning errors of the form "DatabaseError: invalid page header in
>>> block 1 of relation base/16384/76623" from normal queries. I've been
>>> reading that these are often linked to hardware errors, but I would like to
>>> better understand what else it could be or how to determine that for sure.
>>> I've filled out the standard issue reporting template below. Any feedback
>>> or troubleshooting instructions would be much appreciated.
>>> >
>>> > ---
>>> > A description of what you are trying to achieve and what results you
>>> expect.:
>>> >
>>> > Intermittent queries are failing with the error "DatabaseError:
>>> invalid page header in block 1 of relation base/16384/76623"
>>> >
>>> > PostgreSQL version number you are running:
>>> >
>>> > PostgreSQL 8.4.13 on x86_64-redhat-linux-gnu, compiled by GCC gcc
>>> (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), 64-bit
>>> >
>>> > How you installed PostgreSQL:
>>> >
>>> > from standard package installer
>>> >
>>> > Changes made to the settings in the postgresql.conf file:
>>> >
>>> >
>>> >              name             |       current_setting       |
>>>  source
>>> >
>>> ------------------------------+-----------------------------+----------------------
>>> >  checkpoint_completion_target | 0.9                         |
>>> configuration file
>>> >  checkpoint_segments          | 32                          |
>>> configuration file
>>> >  checkpoint_timeout           | 15min                       |
>>> configuration file
>>> >  DateStyle                    | ISO, MDY                    |
>>> configuration file
>>> >  default_text_search_config   | pg_catalog.english          |
>>> configuration file
>>> >  effective_cache_size         | 1GB                         |
>>> configuration file
>>> >  lc_messages                  | en_US.UTF-8                 |
>>> configuration file
>>> >  lc_monetary                  | en_US.UTF-8                 |
>>> configuration file
>>> >  lc_numeric                   | en_US.UTF-8                 |
>>> configuration file
>>> >  lc_time                      | en_US.UTF-8                 |
>>> configuration file
>>> >  log_checkpoints              | on                          |
>>> configuration file
>>> >  log_connections              | off                         |
>>> configuration file
>>> >  log_destination              | csvlog                      |
>>> configuration file
>>> >  log_directory                | /opt/data/pgsql/data/pg_log |
>>> configuration file
>>> >  log_disconnections           | off                         |
>>> configuration file
>>> >  log_duration                 | on                          |
>>> configuration file
>>> >  log_filename                 | postgres-%Y-%m-%d_%H%M%S    |
>>> configuration file
>>> >  log_lock_waits               | on                          |
>>> configuration file
>>> >  log_min_duration_statement   | 250ms                       |
>>> configuration file
>>> >  log_rotation_age             | 1d                          |
>>> configuration file
>>> >  log_rotation_size            | 1GB                         |
>>> configuration file
>>> >  log_temp_files               | 0                           |
>>> configuration file
>>> >  log_timezone                 | Asia/Kolkata                | command
>>> line
>>> >  log_truncate_on_rotation     | on                          |
>>> configuration file
>>> >  logging_collector            | on                          |
>>> configuration file
>>> >  maintenance_work_mem         | 768MB                       |
>>> configuration file
>>> >  max_connections              | 500                         |
>>> configuration file
>>> >  max_stack_depth              | 2MB                         |
>>> environment variable
>>> >  port                         | 5432                        | command
>>> line
>>> >  shared_buffers               | 4GB                         |
>>> configuration file
>>> >  ssl                          | on                          |
>>> configuration file
>>> >  TimeZone                     | Asia/Kolkata                | command
>>> line
>>> >  timezone_abbreviations       | Default                     | command
>>> line
>>> >  wal_buffers                  | 16MB                        |
>>> configuration file
>>> >  work_mem                     | 48MB                        |
>>> configuration file
>>> >
>>> > It's also probably worth noting that postgres is installed on an
>>> encrypted volume which is mounted using ecryptfs.
>>> >
>>> > Operating system and version:
>>> >
>>> > RedHatEnterpriseServer, version 6.6
>>> >
>>> > What program you're using to connect to PostgreSQL:
>>> >
>>> > Python (django)
>>> >
>>> > Is there anything relevant or unusual in the PostgreSQL server logs?:
>>> >
>>> > I see lots of instances of this error (and similar). I'm not sure what
>>> else I should be looking for.
>>> >
>>> > What you were doing when the error happened / how to cause the error:
>>> >
>>> > I haven't explicitly tried to reproduce it, but it seems to
>>> consistently happen with certain queries. However, the system was rebooted
>>> shortly before the errors started occuring. The system was rebooted because
>>> another database (elasticsearch) was having problems on the same machine
>>> and the reboot was to attempt to resolve things.
>>> >
>>> > The EXACT TEXT of the error message you're getting, if there is one:
>>> >
>>> > DatabaseError: invalid page header in block 1 of relation
>>> base/16384/76623
>>> >
>>> > (block and relation numbers change)
>>> >
>>> > Unfortunately, I'm not completely familiar with the CPU and disk/RAID
>>> configurations used on the server. However it is storing to a (software)
>>> encrypted volume as mentioned above.
>>> >
>>> > Have you ever set fsync=off in the postgresql config file?
>>> > No
>>> > Have you had any unexpected power loss lately? Replaced a failed RAID
>>> disk? Had an operating system crash?
>>> > Not recently, though the system did reboot normally as described above.
>>> > Have you run a file system check? (chkdsk / fsck)
>>> > No.
>>> > Are there any error messages in the system logs?
>>> (unix/linux: dmesg, /var/log/syslog ;
>>> > I haven't seen anything obvious but I wasn't sure what to look for.
>>> >
>>>
>>> I guess you missed to provide the details and kernel version (rhel
>>> version and kernel level).
>>> This will give you kernel patch level-
>>>
>>> uname -a
>>>
>>> I had once faced this issue and I was on a buggy patch of Linux kernel.
>>> I just had to update to latest patch. That worked for me.
>>>
>>
>>
>

Reply via email to