Kimi wrote:
>
> Hi,
>
> This is a continuation of the mails I sent last week about Postgres
> crashing.
> We are running pg 6.5.1 on Red Hat 5.1 with DBI 0.92 and DBD 1.13, on a
> SCSI machine with 512 MB RAM.
>
> Our application sends up to 150 requests per second to this database,
> with an expected uptime of 24x7.
> Earlier we were getting spinlock messages, which we hoped to sort out
> by raising the number of open files per process from 256 to 1024.
>
> Postgres crashes with the error message: FATAL 1: Release LRU file:
> No opened files / no one can be closed.
>
> Can anybody help with how to solve this?
>
> Please help
>
> Bye,
>
> Murali
> Differentiated Software Solutions
We have been running a production server under a somewhat
lighter load, and encountered this once. The following
conversation took place on the mailing list about a month
ago:
http://www.PostgreSQL.ORG/mhonarc/pgsql-hackers/1999-11/msg00454.html
------------------------------------------------------------
Mike Mascari <[EMAIL PROTECTED]> writes:
> FATAL 1: ReleaseLruFile: No opened files - no one can be closed
> This is the first time this has ever happened.
I've never seen that either. Offhand I do not recall any post-6.5
changes that would affect it, so the problem (whatever it is) is
probably still there.
After eyeballing the code, it seems there are only two ways this
could happen:
1. the number of "allocated" (non-virtual) file descriptors grew to
   exceed the number of files Postgres thinks it can have open;
2. something else was temporarily exhausting your kernel's file table
   space, so that ENFILE was returned for many successive attempts to
   open a file. (After each one, fd.c will close another file and try
   again.)
#2 seems improbable on an unloaded system, and isn't real probable even
on a loaded one, since you'd have to assume that some other process
managed to suck up each filetable slot that fd.c released before fd.c
could re-acquire it. Once, yes, but several dozen times in a row?
So I'm guessing a leak of allocated file descriptors.
After grovelling through the calls to AllocateFile, I only see one
prospect for a leak: it looks to me like verify_password() neglects
to close the password file if an invalid user name is given. Do you
use a plain (non-encrypted) password file? If so, I'll bet you can
reproduce the crash by trying repeatedly to connect with a username
that's not in the password file. If that pans out, it's a simple fix:
add "FreeFile(pw_file);" near the bottom of verify_password() in
src/backend/libpq/password.c. Let me know if this guess is right...
regards, tom lane
------------------------------------------------------------
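To make the suspected failure mode concrete, here is a small
self-contained C model of it. This is only a sketch, not the real
fd.c or password.c code: MAX_OPEN, release_lru_file, allocate_file,
free_file, check_user and attempts_until_failure are all hypothetical
stand-ins, and the descriptor budget of 8 is an arbitrary assumption.
It shows how an "allocated" file that is never freed on the
bad-username path eventually leaves nothing for the LRU logic to close.

```c
#include <assert.h>

/* Hypothetical model only; none of this is the actual PostgreSQL code. */
#define MAX_OPEN 8              /* assumed per-process descriptor budget */

static int n_virtual_open = 0;  /* files the LRU logic is allowed to close */
static int n_allocated = 0;     /* files held through an AllocateFile-like call */

/* Close one least-recently-used virtual file; 0 means nothing left to close. */
static int release_lru_file(void)
{
    if (n_virtual_open == 0)
        return 0;
    n_virtual_open--;
    return 1;
}

/* Take one "allocated" descriptor, closing virtual files to make room. */
static int allocate_file(void)
{
    while (n_virtual_open + n_allocated >= MAX_OPEN)
    {
        if (!release_lru_file())
            return -1;          /* stands in for the FATAL condition */
    }
    n_allocated++;
    return 0;
}

/* Give back one "allocated" descriptor (the FreeFile-like step). */
static void free_file(void)
{
    assert(n_allocated > 0);
    n_allocated--;
}

/*
 * Simulated password check. With leak_on_bad_user set, the bad-username
 * path returns without freeing the file, as suspected in the message above.
 * Returns 1 for a known user, 0 for an unknown user, -1 on exhaustion.
 */
static int check_user(int user_found, int leak_on_bad_user)
{
    if (allocate_file() != 0)
        return -1;
    if (!user_found)
    {
        if (!leak_on_bad_user)
            free_file();        /* the proposed one-line fix */
        return 0;
    }
    free_file();
    return 1;
}

/* Count how many bad logins succeed in opening the file before exhaustion. */
int attempts_until_failure(int leak, int attempts)
{
    int i;

    n_virtual_open = 4;         /* assume a few data files are already open */
    n_allocated = 0;
    for (i = 0; i < attempts; i++)
    {
        if (check_user(0, leak) < 0)
            return i;
    }
    return attempts;
}
```

In this model, attempts_until_failure(1, 100) gives up after 8 bad
logins, while attempts_until_failure(0, 100) gets through all 100,
which matches the idea that repeated connections with an unknown
username would be needed to trigger the crash.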
Hope that helps,
Mike Mascari