On Tue, 8 Dec 2009, cronfy wrote:
Please forgive me for what is probably a very stupid question. But why is
FreeBSD so sensitive to filesystem errors that it ends up with panics like
'freeing free block' or 'ffs_valloc: dup alloc'? I just can't get it.
Failed to allocate vnode? Go allocate another one! Freeing free block?
Leave it free then! I understand these situations should never happen, but
why the hell is it required to panic and kill everything that would keep
working happily even if something very disastrous happened to the /backup
partition, for example?
Probably because UFS is not designed to be a backup file system but a
working one :)
All those errors indicate file system corruption. To protect other data
from getting corrupted (e.g. by invalid pointers or calculations), the
kernel panics.
To protect us against terrorists our government does strange things too ;-)
After a panic, data *is* getting corrupted anyway - MySQL tables that were
open are broken, soft-updates are unsync'ed, etc.
The server has to reboot and fsck, and time is wasted while this occurs. Why
should all this happen because of a single vnode failure? Why not just write
a message to /var/log/messages, return "oh, I failed to save a file" to the
process that initiated the operation, and just go on? Are the consequences of
an attempt to "free an already free block" *so* dangerous that it is
necessary to give up on EVERYTHING? Let's say it was not the /backup
partition, ok, it was /var/tmp/some-php-session or even a
/var/cron/tabs/someuser file that failed.
So what? Even /boot/kernel/kernel corruption is not critical if you are not
going to reboot right now (or if you have /boot/kernel.old :)
Is there a way to say "Dear kernel, don't panic, I'm holding your hand, keep
working please-please-please?" If so, can it really lead to complete
filesystem corruption, or is it not so serious?
AFAIK you can't do this. And you shouldn't, even if it were possible. The
file system errors you mention above should not happen under any normal
circumstances. They may happen after a crash caused by other reasons but
should get repaired by fsck. The kernel cannot continue with such errors
because the whole file system metadata cannot be trusted anymore until
repaired.
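
To illustrate why such an error cannot simply be ignored, here is a minimal
sketch of a toy free-block map. It is not UFS/FFS code; the names ToyBlockMap,
release and simulate are hypothetical, and the "corruption" is simulated by
flipping one bit by hand. It only shows that by the time a "freeing free
block" check fires, the damage (two files sharing one block) has already
happened earlier.

```python
class ToyBlockMap:
    """Toy stand-in for a free-block bitmap (assumption: not real FFS code)."""

    def __init__(self, nblocks):
        self.free = [True] * nblocks

    def alloc(self):
        # Hand out the lowest-numbered block the map believes is free.
        blkno = self.free.index(True)
        self.free[blkno] = False
        return blkno

    def release(self, blkno, strict=True):
        if self.free[blkno]:
            # The map already says "free": the metadata is inconsistent.
            if strict:
                raise RuntimeError("freeing free block %d" % blkno)
            return False  # lenient mode: shrug and carry on
        self.free[blkno] = True
        return True


def simulate(strict):
    bmap = ToyBlockMap(4)
    files = {"A": bmap.alloc()}           # file A owns block 0
    bmap.free[files["A"]] = True          # simulated corruption of the map
    files["B"] = bmap.alloc()             # file B is handed block 0 as well
    bmap.release(files.pop("A"), strict)  # A deleted; B still uses block 0
    files["C"] = bmap.alloc()             # file C is handed block 0 too
    shared = files["B"] == files["C"]     # B and C overwrite each other
    bmap.release(files.pop("C"), strict)  # C deleted, block 0 marked free
    bmap.release(files.pop("B"), strict)  # B deleted: block is already free
    return shared
```

With strict=True the last release raises "freeing free block 0"; with
strict=False the run completes, but files B and C were silently sharing the
same block the whole time. The check is only the first place where the
inconsistency becomes visible, not where it started.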
I have used FreeBSD with UFS for more than 15 years now, partly on heavily
loaded and I/O-bound systems. I never had any serious filesystem problems
as long as the disks or the storage area network (SAN) didn't fail.
In the worst case, after a SAN crash, I had to run fsck three times (one
run immediately after the other) in single-user mode on large partitions
until all errors were repaired.
Best regards
Konrad Heuer
GWDG, Am Fassberg, 37077 Goettingen, Germany, [email protected]