Tomas Vondra <to...@vondra.me> writes:
> I did run into bottlenecks due to "too few file descriptors" during
> recent experiments with partitioning, which made it pretty trivial to
> get into a situation where we start thrashing the VfdCache. I have a
> half-written draft of a blog post about that somewhere.

> But my conclusion was that it's damn difficult to even realize that's
> happening, especially if you don't have access to the OS / perf, etc.

Yeah. fd.c does its level best to keep going even with only a few FDs
available, and it's hard to tell that you have a performance problem
arising from that. (I recall old war stories about Postgres continuing
to chug along just fine after it'd run the kernel out of FDs, even as
every other service on the system was crashing left and right, making
it difficult e.g. even to log in. That scenario is why I'm resistant
to pushing our allowed number of FDs to the moon...)

> So my takeaway was we should improve that first, so that people have
> a chance to realize they have this issue, and can do the tuning. The
> improvements I thought about were:

> - track hits/misses for the VfdCache (and add a system view for that)

I think what we actually would like to know is how often we have to
close an open FD in order to make room to open a different file.
Maybe that's the same thing you mean by "cache miss", but it doesn't
seem like quite the right terminology. Anyway, +1 for adding some way
to discover how often that's happening.
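Something along these lines, perhaps (just a sketch against fd.c as it
stands; the counter name is invented, and the plumbing to actually
expose it through the stats machinery or a view is omitted):

    /* Hypothetical counter of forced closes; name invented here. */
    static uint64 lru_evictions = 0;

    static bool
    ReleaseLruFile(void)
    {
        DO_DB(elog(LOG, "ReleaseLruFile. Opened %d", nfile));

        if (nfile > 0)
        {
            /* Close the least recently used VFD to free up a kernel FD. */
            Assert(VfdCache[0].lruMoreRecently != 0);
            LruDelete(VfdCache[0].lruMoreRecently);
            lru_evictions++;    /* each bump is one forced close */
            return true;        /* freed a file */
        }
        return false;           /* no files available to free */
    }

Every bump of that counter means we paid for a close() plus a later
re-open() that a bigger FD budget would have avoided.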
> - maybe have wait event for opening/closing file descriptors

Not clear that that helps, at least for this specific issue.

> - show max_safe_fds value somewhere, not just max_files_per_process
> (which we may silently override and use a lower value)

Maybe we should just assign max_safe_fds back to max_files_per_process
after running set_max_safe_fds? The existence of two variables is a
bit confusing anyhow. I vaguely recall that we had a reason for
keeping them separate, but I can't think of the reasoning now.
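If we went that way, it could be as little as something like this at
the end of set_max_safe_fds() (untested sketch):

    /*
     * Reflect the effective limit back into the GUC, so that SHOW
     * max_files_per_process reports what we actually use.
     */
    {
        char        buf[32];

        snprintf(buf, sizeof(buf), "%d", max_safe_fds);
        SetConfigOption("max_files_per_process", buf,
                        PGC_POSTMASTER, PGC_S_OVERRIDE);
    }

			regards, tom lane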