[GENERAL] Strange Postgresql crash
each
random_page_cost = 4            # units are one sequential page fetch cost
cpu_tuple_cost = 0.01           # (same)
cpu_index_tuple_cost = 0.001    # (same)
cpu_operator_cost = 0.0025      # (same)

log_connections = false
log_pid = true
log_statement = false
log_duration = false
log_timestamp = true
log_min_error_statement = notice   # Values in order of increasing severity:
                                   #   debug5, debug4, debug3, debug2, debug1,
                                   #   info, notice, warning, error, panic(off)

syslog = 0                      # range 0-2
syslog_facility = 'LOCAL0'
syslog_ident = 'postgres'

LC_MESSAGES = 'en_US'
LC_MONETARY = 'en_US'
LC_NUMERIC = 'en_US'
LC_TIME = 'en_US'

I tested my memory with memtest, and it's perfect. I also ran some stress tests under Linux, using stress and bonnie++ while doing a dump, to see whether it would crash with ACPI or not... So far it's okay.

The machine:

Linux aquilonII 2.6.17-1.2142_FC4 #1 Tue Jul 11 22:41:14 EDT 2006 i686 i686 i386 GNU/Linux

Does anyone have a suggestion?

--
Eric Rousse
514-655-1001
Telmatik inc.
204 Montarville, suite 250
Boucherville, QC, Canada J4B 6S2
www.telmatik.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?
http://archives.postgresql.org/
Re: [GENERAL] Strange Postgresql crash
Duh, right! I didn't think of that one! The strange thing, though, is that it didn't use to happen frequently; only recently has it started to crash regularly.

Here's the content of the crontab:

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
00 3 * * * root /export/dbsystem/pg_backup.sh va > /dev/null 2>&1
00 4 * * * root /export/dbsystem/pg_backup.sh b > /dev/null 2>&1
00 5 * * * root rsync --password-file=/etc/.rs_sec -Cauzbqr /export/dbsystem/backup/ rsync://[EMAIL PROTECTED]/rsync/

I'll move cron.daily to 4:30.

brian wrote:
> Eric Rousse wrote:
>> Hello all,
>>
>> I've been experiencing a strange crash. I never really looked into it, since it was happening only every 1-2 months or so. But lately I've seen it a lot in the past week, and I have no clue about it, other than the backups. So, here's some info about it and about my machine:
>>
>> When: it crashes at night, at around 4AM, during the backup:
>>
>> 00 3 * * * root /export/dbsystem/pg_backup.sh va > /dev/null 2>&1
>> 00 4 * * * root /export/dbsystem/pg_backup.sh b > /dev/null 2>&1
>>
>> I moved the vacuum to another time, just to make sure they are not in conflict, who knows!
>
> Is there anything else running at that time? What does /etc/crontab have? I ask because my Fedora box has cron.daily scripts run at 4:02am by default.
>
> brian

--
Eric Rousse
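For what it's worth, brian's question (what else starts around the time of the backup?) can be checked mechanically. The following is only a rough sketch in Python: the function names `fixed_time_jobs` and `close_pairs` are made up for illustration, the crontab text is a subset of the entries quoted above, and the day-of-month, month, and day-of-week fields are deliberately ignored, so weekly and monthly jobs are treated as if they ran every day.

```python
from itertools import combinations

# Subset of the /etc/crontab entries quoted in the message above.
CRONTAB = """\
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly
00 3 * * * root /export/dbsystem/pg_backup.sh va > /dev/null 2>&1
00 4 * * * root /export/dbsystem/pg_backup.sh b > /dev/null 2>&1
"""


def fixed_time_jobs(text):
    """Return (minute_of_day, command) for entries with a numeric minute and hour.

    Assumption: system crontab layout (minute hour dom month dow user command).
    Entries with '*' or ranges in the minute/hour fields are skipped.
    """
    jobs = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and environment settings such as SHELL=/bin/bash.
        if not line or line.startswith("#") or "=" in line.split()[0]:
            continue
        fields = line.split(None, 6)
        if len(fields) < 7:
            continue
        minute, hour = fields[0], fields[1]
        if minute.isdigit() and hour.isdigit():
            jobs.append((int(hour) * 60 + int(minute), fields[6]))
    return jobs


def close_pairs(jobs, window_minutes):
    """Return pairs of jobs whose start times are at most window_minutes apart."""
    return [
        (a, b)
        for a, b in combinations(sorted(jobs), 2)
        if abs(a[0] - b[0]) <= window_minutes
    ]


if __name__ == "__main__":
    for (t1, cmd1), (t2, cmd2) in close_pairs(fixed_time_jobs(CRONTAB), 5):
        print(f"{t1 // 60:02d}:{t1 % 60:02d} {cmd1}")
        print(f"{t2 // 60:02d}:{t2 % 60:02d} {cmd2}")
```

With a 5-minute window, the only pair this reports is the 04:00 `pg_backup.sh b` run and the 04:02 `cron.daily` run, which is the overlap brian was pointing at. It says nothing about how long each job runs, so a wider window may be more realistic.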
Re: [GENERAL] Strange Postgresql crash
Hi Tom,

Yeah, that's what I suspect too: it seems more like a hardware/OS issue, since I really have no evidence against PostgreSQL other than my daily dump that crashes *sometimes*.

I didn't know about badblocks; I'll try that one. Last time, I did a full fsck check on all the volumes, and everything was clean. But I have a SATA RAID on this server, and I've never known whether a RAID would actually replicate bad blocks to the other disk.

Thanks for your advice!

Tom Lane wrote:
> Eric Rousse <[EMAIL PROTECTED]> writes:
>> ...
>> 2006-11-16 04:00:39 [8763] LOG: connection received: host=10.1.1.54 port=4894
>> 2006-11-16 04:00:40 [8763] LOG: pq_recvbuf: unexpected EOF on client connection
>> 2006-11-16 04:00:40 [8763] LOG: incomplete startup packet
>> 2006-11-16 04:02:26 [2534] LOG: database system was interrupted at 2006-11-16 03:57:36 EST
>> 2006-11-16 04:02:26 [2534] LOG: checkpoint record is at C/6733EB68
>> ...
>
> I think what you're seeing here is probably a kernel-level crash and system reboot. It's not any normal sort of Postgres problem, because if it were you'd see the postmaster bleating about crash of one of its child processes. Here it appears that the postmaster and all its children died at once leaving no messages behind --- and that just doesn't happen without either manual intervention or a system crash.
>
> If it seems to be triggered by running a PG backup, it could be that you've got a disk hardware problem that only manifests when you try to read a particular data block :-(. Have you tried running "badblocks"?
>
> regards, tom lane

--
Eric Rousse
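Tom's reasoning above (a recovery message with no preceding complaint from the postmaster about a crashed child) can be turned into a quick log scan. This is only a sketch in Python under stated assumptions: the line format is inferred from the excerpt quoted above, the function name `silent_restarts` is made up, and the two substrings in `CHILD_CRASH_HINTS` are my assumption about how the postmaster words a child-crash report, so they should be checked against the server version actually in use.

```python
import re

# Log excerpt quoted in the message above.
LOG = """\
2006-11-16 04:00:39 [8763] LOG: connection received: host=10.1.1.54 port=4894
2006-11-16 04:00:40 [8763] LOG: pq_recvbuf: unexpected EOF on client connection
2006-11-16 04:00:40 [8763] LOG: incomplete startup packet
2006-11-16 04:02:26 [2534] LOG: database system was interrupted at 2006-11-16 03:57:36 EST
2006-11-16 04:02:26 [2534] LOG: checkpoint record is at C/6733EB68
"""

# Assumed line layout: "<date> <time> [<pid>] <LEVEL>: <message>".
LINE_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) \[(\d+)\] (\w+): (.*)$"
)

# Assumption: wording the postmaster uses when one of its children crashes.
CHILD_CRASH_HINTS = (
    "was terminated by signal",
    "terminating any other active server processes",
)


def silent_restarts(log_text):
    """Return (timestamp, pid) of recovery starts not preceded by a child-crash report."""
    results = []
    saw_child_crash = False
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue
        timestamp, pid, _level, message = match.groups()
        if any(hint in message for hint in CHILD_CRASH_HINTS):
            saw_child_crash = True
        elif message.startswith("database system was interrupted"):
            if not saw_child_crash:
                results.append((timestamp, int(pid)))
            # Reset so the next recovery start is judged on its own.
            saw_child_crash = False
    return results


if __name__ == "__main__":
    for timestamp, pid in silent_restarts(LOG):
        print(f"{timestamp}: recovery started by PID {pid} with no prior crash report")
```

On the excerpt above, this flags the 04:02:26 recovery start, which fits the picture of the whole machine going down rather than a single backend crashing. It cannot tell a kernel crash from a manual kill or a power loss; it only shows that the postmaster left no message behind.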