At Tue, 23 May 2023 19:28:45 -0700, Andres Freund <and...@anarazel.de> wrote in
> Hi,
>
> On 2023-05-24 10:56:28 +0900, Kyotaro Horiguchi wrote:
> > At Wed, 26 Apr 2023 11:37:55 +1200, Thomas Munro <thomas.mu...@gmail.com> wrote in
> > > On Tue, Apr 25, 2023 at 12:16 PM Andres Freund <and...@anarazel.de> wrote:
> > > > On 2023-04-24 15:32:25 -0700, Andres Freund wrote:
> > > > > We obviously can add a retry loop to FileFallocate(), similar to what's
> > > > > already present e.g. in FileRead(). But I wonder if we shouldn't go a bit
> > > > > further, and do it for all the fd.c routines where it's remotely plausible
> > > > > EINTR could be returned? It's a bit silly to add EINTR retries one-by-one
> > > > > to the functions.
> > > > >
> > > > > The following are documented to potentially return EINTR, without fd.c
> > > > > having code to retry:
> > > > >
> > > > > - FileWriteback() / pg_flush_data()
> > > > > - FileSync() / pg_fsync()
> > > > > - FileFallocate()
> > > > > - FileTruncate()
> > > > >
> > > > > With the first two there's the added complication that it's not entirely
> > > > > obvious whether it'd be better to handle this in File* or pg_*. I'd argue
> > > > > the latter is a bit more sensible?
> > > >
> > > > A prototype of that approach is attached. I pushed the retry handling into
> > > > the pg_* routines where applicable. I guess we could add pg_* routines for
> > > > FileFallocate(), FilePrewarm() etc as well, but I didn't do that here.
> > >
> > > One problem we ran into with the shm_open() case (which is nearly
> > > identical under the covers, since shm_open() just opens a file in a
> > > tmpfs on Linux) was that a simple retry loop like this could never
> > > terminate if the process was receiving a lot of signals from the
> > > recovery process, which is why we went with the idea of masking
> > > signals instead. Eventually we should probably grow the file in
> > > smaller chunks with a CFI in between, so that we both guarantee that we
> > > make progress (by masking for smaller size increases) and service
> > > interrupts in a timely fashion (by unmasking between loops). I don't
> > > think that applies here, because we're not trying to fallocate
> > > humongous size increases in one go, but I just want to note that we're
> > > making a different choice. I think this looks reasonable for the use
> > > cases we actually have here.
>
> > FWIW I share the same feeling about looping on EINTR without signals
> > being blocked. If we just retry the same operation without processing
> > signals after getting EINTR, I think blocking signals is better. We
> > could block signals more gracefully, but I'm not sure it's worth the
> > complexity.
>
> I seriously doubt it's a good path to go down in this case. As Thomas
> mentioned, this case isn't really comparable to the shm_open() one, due to the
> bounded vs unbounded amount of memory we're dealing with.
>
> What would be the benefit?
I'm not certain what you mean by "it" here.

Regarding signal blocking, the benefit would be a lower chance of
getting constantly interrupted by a string of frequent signals, which
cannot be prevented just by looping. From what I gathered, Thomas
meant that we don't need chunking to avoid long periods of ignoring
interrupts, because we're only extending a file by a few blocks.
However, I might have misunderstood.

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center