Hi,

I ran something that triggered the error in $subject. It turns out, though,
that
a) epoll_create1() was never called
b) we didn't actually hit EMFILE, or even max_safe_fds

The reason for the failure is that we have:
        if (!AcquireExternalFD())
        {
                /* treat this as though epoll_create1 itself returned EMFILE */
                elog(ERROR, "epoll_create1 failed: %m");
        }

and

bool
AcquireExternalFD(void)
{
        /*
         * We don't want more than max_safe_fds / 3 FDs to be consumed for
         * "external" FDs.
         */
        if (numExternalFDs < max_safe_fds / 3)
        {
                ReserveExternalFD();
                return true;
        }
        errno = EMFILE;
        return false;
}

I think it's rather confusing to claim that epoll_create1() failed when we
didn't even call it.

Why are we misattributing the failure to a system call that we didn't make?
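
To make the problem concrete, here is a minimal standalone sketch of the
check quoted above. It is not the fd.c code: every sketch_* name and the
value 30 for the max_safe_fds stand-in are made up for illustration. It
only shows that the refusal, and the EMFILE in errno, come purely from
bookkeeping, without any system call being made.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for fd.c's state; the values are made up. */
static int sketch_max_safe_fds = 30;
static int sketch_num_external_fds = 0;

/* Same shape as the AcquireExternalFD() quoted above. */
static bool
sketch_acquire_external_fd(void)
{
    if (sketch_num_external_fds < sketch_max_safe_fds / 3)
    {
        sketch_num_external_fds++;
        return true;
    }
    errno = EMFILE;             /* synthesized here, no syscall involved */
    return false;
}

/* Give one reservation back. */
static void
sketch_release_external_fd(void)
{
    assert(sketch_num_external_fds > 0);
    sketch_num_external_fds--;
}

/* Count how many reservations succeed before the first refusal. */
static int
sketch_count_until_refusal(void)
{
    int granted = 0;

    errno = 0;
    while (sketch_acquire_external_fd())
        granted++;
    return granted;
}
```

With these made-up numbers the refusal arrives once a third of the
stand-in limit is reserved, and errno says EMFILE even though the process
has not opened a single descriptor.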

The current behaviour was introduced in

commit 3d475515a15f70a4a3f36fbbba93db6877ff8346
Author: Tom Lane <t...@sss.pgh.pa.us>
Date:   2020-02-24 17:28:33 -0500

    Account explicitly for long-lived FDs that are allocated outside fd.c.



I also wish we wouldn't report EMFILE when we didn't actually reach any hard
limit - that makes the system behaviour unnecessarily confusing. But that's
not quite so easy to fix.


How about making the error message something like
                elog(ERROR, "AcquireExternalFD, for epoll_create1, failed: %m");
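
As a rough sketch of what that would look like to a user, the hypothetical
helper below builds both message shapes in plain C. The helper's name and
parameters are assumptions for illustration only; it uses snprintf() and
strerror() in place of elog() and %m, so it is not how the backend would
actually report the error.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format the failure so that the message names the
 * step that actually failed.  saved_errno plays the role of %m.
 */
static int
sketch_format_fd_failure(char *buf, size_t buflen,
                         const char *syscall_name,
                         bool syscall_was_attempted,
                         int saved_errno)
{
    if (syscall_was_attempted)
        return snprintf(buf, buflen, "%s failed: %s",
                        syscall_name, strerror(saved_errno));

    /* The reservation was refused before the system call was made. */
    return snprintf(buf, buflen, "AcquireExternalFD, for %s, failed: %s",
                    syscall_name, strerror(saved_errno));
}
```

The errno text is the same in both cases; only the prefix tells the reader
whether epoll_create1() was ever reached.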

Greetings,

Andres Freund

