On 11/13/12 4:41 PM, Markus Gebert wrote:
On 13.11.2012, at 19:30, Markus Gebert <markus.geb...@hostpoint.ch> wrote:

To me it looks like the unix socket GC is triggered way too often and/or 
runs too long, which uses CPU and, worse, causes a lot of contention around 
the unp_list_lock, which in turn causes delays for all processes relying on 
unix sockets for IPC.

I don't know why the unp_gc() is called so often and what's triggering this.

I have a guess now. Dovecot and relayd both use unix sockets heavily. According 
to dtrace uipc_detach() gets called quite often by dovecot closing unix 
sockets. Each time uipc_detach() is called unp_gc_task is taskqueue_enqueue()d 
if fds are inflight.

in uipc_detach():
        if (local_unp_rights)
                taskqueue_enqueue(taskqueue_thread, &unp_gc_task);

We use relayd in a way that keeps the source address of the client when 
connecting to the backend server (transparent load balancing). This requires 
IP_BINDANY on the socket which cannot be set by unprivileged processes, so 
relayd sends the socket fd to the parent process just to set the socket option 
and send it back. This means an fd gets transferred twice for every new backend 
connection.
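
For readers unfamiliar with the mechanism: passing an fd between processes over a unix socket is done with an SCM_RIGHTS control message, and while the message sits in the receiver's socket buffer the fd counts as "in flight". The following is only a minimal sketch of that handoff, assuming an already connected unix socket. send_fd() and recv_fd() are hypothetical helper names for illustration, not relayd's actual code, and the privileged setsockopt() step in the parent is left out.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized for exactly one descriptor, aligned for cmsghdr. */
union fd_ctl {
    struct cmsghdr hdr;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Hypothetical helper: send one descriptor over a connected unix socket. */
int
send_fd(int sock, int fd)
{
    char dummy = 'F';   /* one byte of ordinary data carries the control message */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd >= 0);
    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    /* From here until the peer calls recvmsg(), the fd is in flight. */
    return (sendmsg(sock, &msg, 0) == 1 ? 0 : -1);
}

/* Hypothetical helper: receive one descriptor; returns the new fd or -1. */
int
recv_fd(int sock)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&ctl, 0, sizeof(ctl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return (-1);
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return (-1);
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return (fd);
}
```

In the setup described above, each new backend connection would go through this pair twice: once from the unprivileged process to the parent, and once back after the socket option has been set.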

So we have dovecot calling uipc_detach() often and relayd making it likely that 
fds are inflight (unp_rights > 0). With a certain amount of load this could 
cause unp_gc_task to be added to the thread taskq too often, slowing down 
everything related to unix sockets by holding global locks in unp_gc().

I don't know if the slowdown can even cause a self-reinforcing feedback loop at some 
point by increasing the chance of fds being inflight. This would explain why 
sometimes the condition goes away by itself and sometimes requires intervention 
(taking load away for a moment).

I'll look into a way to (dis)prove all this tomorrow. Ideas still welcome :-).


A couple of ideas:

1) Convert the taskqueue to a callout, but only allow one to be queued at a time. Set the callout's granularity to bound how often the GC can run.

2) I think you only need to actually run garbage collection on the off-chance that you pass unix file descriptors; otherwise you can get away with refcounting. It's hard for me to express the exact logic needed for this, though. I think you would need some way to simply do refcounting until there was a unix socket descriptor in flight, then switch to GC. Either that, or make a sysctl that allows you to administratively deny passing of unix descriptors and just use refcounting.

Or just use Adrian's hack. :)

-Alfred


_______________________________________________
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"
