Greg Stark wrote:
I would vote for the kernel, if the server didn't respond within 5
seconds, to simply return EIO. At least we know how to handle that...
How do you handle it? By having Postgres shut down? And then the NFS server
comes back and then what?
Log the error if you can.
Refuse
Martijn van Oosterhout writes:
> The kernel is trying to be helpful by returning EINTR to say "ok, it
> didn't complete. There's no error yet but it may yet work".
Well it only returns EINTR if a signal was received.
> With local hard drives if they don't respond, you assume they're broken.
Martijn van Oosterhout writes:
> I would vote for the kernel, if the server didn't respond within 5
> seconds, to simply return EIO. At least we know how to handle that...
You can do this now by mounting 'soft' and setting the timeout
appropriately. Whether it's really the best idea, well...
-
On Mon, Jan 02, 2006 at 08:55:47AM -0700, Doug Royer wrote:
>
>
> Doug McNaught wrote:
>
> >c) treat EINTR as an I/O error (I don't know how easy this would be)
>
> So then at this point - it is detected, so problem solved?
>
> If a LOCAL hard drive fails to reply, you hang. Same with hard,int
Doug McNaught wrote:
c) treat EINTR as an I/O error (I don't know how easy this would be)
So then at this point - it is detected, so problem solved?
If a LOCAL hard drive fails to reply, you hang. Same with hard,intr
NFS file system.
bytesRead = read(fd, buffer, requestedBytes);
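For illustration only, here is a minimal sketch of what checking for EINTR around the read() call above could look like. The helper name read_retry_eintr is hypothetical and does not come from the PostgreSQL source; the sketch assumes a POSIX system and shows only the plain retry approach discussed in this thread. As noted elsewhere in the thread, a blind retry can leave a process that only kill -9 will stop while the NFS server is unreachable, so treat this as a starting point rather than a recommended fix.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of PostgreSQL): call read() and retry
 * for as long as it fails with EINTR. Any other error, a short read,
 * or end-of-file is returned to the caller unchanged.
 */
static ssize_t
read_retry_eintr(int fd, void *buffer, size_t requestedBytes)
{
    ssize_t bytesRead;

    /* A NULL buffer only makes sense when nothing is requested. */
    assert(buffer != NULL || requestedBytes == 0);

    do {
        bytesRead = read(fd, buffer, requestedBytes);
    } while (bytesRead < 0 && errno == EINTR);

    /* -1 with errno set (never EINTR), 0 at EOF, or the byte count. */
    return bytesRead;
}
```

The caller still has to handle a short read and every other errno value; the loop only removes EINTR from the set of outcomes it needs to consider.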
Doug Royer <[EMAIL PROTECTED]> writes:
> Yes - if you assume that EINTR only happens on NFS mounts.
> My point is that independent of NFS, the error checking
> that I have found in the code is not complete even for
> non-NFS file systems.
>
>
> The read() and write() LINUX man pages do NOT specify
Doug Royer <[EMAIL PROTECTED]> writes:
> The MOUNT options are opposite.
>
> Linux NFS mount - defaults to no-intr
> Solaris NFS mount - defaults to intr
Oh, right--I didn't realize that was what you were talking about.
-Doug
Yes - if you assume that EINTR only happens on NFS mounts.
My point is that independent of NFS, the error checking
that I have found in the code is not complete even for
non-NFS file systems.
The read() and write() LINUX man pages do NOT specify that EINTR
is an NFS-only error.
EINTR The
The MOUNT options are opposite.
Linux NFS mount - defaults to no-intr
Solaris NFS mount - defaults to intr
Doug McNaught wrote:
Doug Royer <[EMAIL PROTECTED]> writes:
From the Linux 'nfs' man page:
intr If an NFS file operation has a major timeout and it is
h
Let me give you a sky-high view of this. Database reliability requires
that the disk drive be 100% reliable. If any part of the disk storage
fails (I/O write failure, NFS failure) we have to assume that the disk
storage is corrupt and the database needs to be restored from backup.
The NFS failu
Doug Royer <[EMAIL PROTECTED]> writes:
> From the Linux 'nfs' man page:
>
> intr If an NFS file operation has a major timeout and it is
> hard mounted, then allow signals to interrupt the file
> operation and cause it to return EINTR to the cal
From the Linux 'nfs' man page:
intr If an NFS file operation has a major timeout and it is
hard mounted, then allow signals to interrupt the file
operation and cause it to return EINTR to the calling
program. The default is to no
On Sun, 1 Jan 2006, Tom Lane wrote:
> Qingqing Zhou <[EMAIL PROTECTED]> writes:
> > I understand put a CHECK_FOR_INTERRUPTS() in the retry-loop may make more
> > graceful stop, but it won't work in some cases -- notice that the io
> > routines we will patch can be used before the signal mechanis
Qingqing Zhou <[EMAIL PROTECTED]> writes:
> I understand put a CHECK_FOR_INTERRUPTS() in the retry-loop may make more
> graceful stop, but it won't work in some cases -- notice that the io
> routines we will patch can be used before the signal mechanism is setup.
I don't think it will help much at
On Sun, 1 Jan 2006, Greg Stark wrote:
>
> "Qingqing Zhou" <[EMAIL PROTECTED]> writes:
>
> > The problem of above is if a signal sneaks in, these syscalls will fail.
> > With a retry, we can fix it.
>
> It's a bit stickier than that but only a bit. If you just retry then you're
> saying users have
"Qingqing Zhou" <[EMAIL PROTECTED]> writes:
> The problem of above is if a signal sneaks in, these syscalls will fail.
> With a retry, we can fix it.
It's a bit stickier than that but only a bit. If you just retry then you're
saying users have to use kill -9 to get away from the situation. For
"Greg Stark" <[EMAIL PROTECTED]> wrote
>
> Well NFS is only going to affect filesystem calls. If there are other
> syscalls
> that can signal EINTR on some obscure platform where Postgres isn't
> handling
> it then that's just a run-of-the-mill porting issue.
>
Ok, NFS just affects filesystem c
Doug Royer <[EMAIL PROTECTED]> writes:
> The 'intr' option to NFS is not the same as EINTR. It
> means 'if the server does not respond for a while,
> then return an EINTR', just like any other disk read()
> or write() does when it fails to reply.
No, you're thinking of 'soft'. 'intr' (which i
EINTR on read() or write() is not unique to NFS.
It can happen on many file systems - it is just seen
less frequently on most of them.
The code should be able to handle ANY valid read()
and write() errno. And EINTR is documented on Linux, BSD,
Solaris (1 and 2), and POSIX.
Even the Linux man pa
Rod Taylor <[EMAIL PROTECTED]> writes:
> Are there issues with having an archive_command which does things with
> NFS based filesystems?
Well, whatever command you use for archive_command -- probably just "cp" if
you're using NFS -- would hang if the NFS server went away. What would happen
then mig
On Sat, 31 Dec 2005, Greg Stark wrote:
>
> Qingqing Zhou <[EMAIL PROTECTED]> writes:
>
> >
> > Is it the case that EINTR is turned off by default in NFS? If so, I don't see that
> > will be a problem. Sorry for my limited knowledge, is there any
> > requirements/benefits that people turn on EINTR?
>
> Th
Qingqing Zhou <[EMAIL PROTECTED]> writes:
> On Sat, 31 Dec 2005, Greg Stark wrote:
>
> >
> > I don't think that's reasonable. The NFS intr option breaks the traditional
> > unix filesystem semantics which breaks a lot of older or naive programs. But
> > that's no reason to decide that Postgres c
On Sat, Dec 31, 2005 at 04:46:02PM -0500, Qingqing Zhou wrote:
> Is it the case that EINTR is turned off by default in NFS? If so, I don't see that
> will be a problem. Sorry for my limited knowledge, is there any
> requirements/benefits that people turn on EINTR?
I won't speak for anyone else, but the rea
On Sat, 31 Dec 2005, Greg Stark wrote:
>
> I don't think that's reasonable. The NFS intr option breaks the traditional
> unix filesystem semantics which breaks a lot of older or naive programs. But
> that's no reason to decide that Postgres can't handle the new semantics.
>
Is that by default t
Qingqing Zhou <[EMAIL PROTECTED]> writes:
> On Sat, 31 Dec 2005, Tom Lane wrote:
> >
> > What I'd rather do is document prominently that running a DB over NFS
> > isn't recommended, and running it over NFS with interrupts allowed is
> > just not going to work.
>
> Agreed. IO syscalls are not the
On Sat, 2005-12-31 at 14:40 -0500, Tom Lane wrote:
> Greg Stark <[EMAIL PROTECTED]> writes:
> > Qingqing Zhou <[EMAIL PROTECTED]> writes:
> >> I have patched IO routines in backend/storage that POSIX says EINTR is
> >> possible except unlink(). Though POSIX says EINTR is not possible, during
> >> m
On Sat, 31 Dec 2005, Tom Lane wrote:
>
> What I'd rather do is document prominently that running a DB over NFS
> isn't recommended, and running it over NFS with interrupts allowed is
> just not going to work.
>
Agreed. IO syscalls are not the only problem for NFS -- if we can't fix
them in a run,
Greg Stark <[EMAIL PROTECTED]> writes:
> Qingqing Zhou <[EMAIL PROTECTED]> writes:
>> I have patched IO routines in backend/storage that POSIX says EINTR is
>> possible except unlink(). Though POSIX says EINTR is not possible, during
>> many regressions, I found it sometimes sets this errno on NFS
Qingqing Zhou <[EMAIL PROTECTED]> writes:
> On Fri, 30 Dec 2005, Tom Lane wrote:
> >
> > I've heard of this in connection with NFS ... is your DB on an NFS
> > filesystem by any chance?
>
> I have patched IO routines in backend/storage that POSIX says EINTR is
> possible except unlink(). Though
On Fri, 30 Dec 2005, Tom Lane wrote:
>
> I've heard of this in connection with NFS ... is your DB on an NFS
> filesystem by any chance?
>
I have patched IO routines in backend/storage that POSIX says EINTR is
possible except unlink(). Though POSIX says EINTR is not possible, during
many regressi
"Tom Lane" <[EMAIL PROTECTED]> wrote
> Qingqing Zhou <[EMAIL PROTECTED]> writes:
>> + ERROR: could not open relation 1663/16384/37713: Interrupted system
>> call
>
>> The reason I guess is the open() call is interrupted by a signal (what
>> signal BTW?).
>
> I've heard of this in connection with
Qingqing Zhou <[EMAIL PROTECTED]> writes:
> + ERROR: could not open relation 1663/16384/37713: Interrupted system call
> The reason I guess is the open() call is interrupted by a signal (what
> signal BTW?).
I've heard of this in connection with NFS ... is your DB on an NFS
filesystem by any cha
I encountered an error today (can't repeat) on SunOS 5.8:
--test that we read consecutive LFs properly
CREATE TEMP TABLE testnl (a int, b text, c int);
+ ERROR: could not open relation 1663/16384/37713: Interrupted system call
The reason I guess is the open() call is interrupted by a signal