On 26/11/2007, Micah Dowty <[EMAIL PROTECTED]> wrote:
>
> The application doesn't really depend on the load-balancer's decisions
> per se, it just happens that this behaviour I'm seeing on NUMA systems
> is extremely bad for performance.
>
> In this context, the application is a virtual machine runtime which is
> executing either an SMP VM or it's executing a guest which has a
> virtual GPU. In either case, there are at least three threads:
>
>   - Two virtual CPU/GPU threads, which are nice(0) and often CPU-bound
>   - A low-latency event handling thread, at nice(-10)
>
> The event handling thread performs periodic tasks like delivering
> timer interrupts and completing I/O operations.

Are the I/O operations initiated by these "virtual CPU/GPU threads"?

If so, would it make sense to have per-CPU event handling threads
(instead of one global thread)? Each would handle the I/O operations
initiated from its own CPU to (hopefully) achieve better data locality
(especially if most of the data involved is per-CPU).

Then let the load balancer distribute the "virtual CPU/GPU threads"
evenly, or even (at least as an experiment) pin them to different CPUs
as well?

Sure, the scenario is highly dependent on the nature of those
'events'... and I can only speculate here :-) (but I can imagine
situations where such a scheme would scale better).

>
> Thank you again,
> --Micah
>

-- 
Best regards,
Dmitry Adamushko
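
A minimal sketch of the per-CPU event handling idea, assuming Linux and
Python's os.sched_setaffinity (which, with pid 0, is expected to pin the
calling thread). It is not the VM runtime's actual code: the names
PerCpuEventHandlers, available_cpus and submit are hypothetical, and the
"event handling" is a stand-in that only records each event.

```python
import os
import queue
import threading


def available_cpus():
    """CPUs this process may run on; falls back to [0] without affinity support."""
    if hasattr(os, "sched_getaffinity"):
        return sorted(os.sched_getaffinity(0))
    return [0]


class PerCpuEventHandlers:
    """One event-handling thread per CPU, each with its own queue (hypothetical)."""

    def __init__(self, cpus):
        self.cpus = list(cpus)
        self.queues = {cpu: queue.Queue() for cpu in self.cpus}
        self.handled = {cpu: [] for cpu in self.cpus}
        self.threads = [
            threading.Thread(target=self._run, args=(cpu,)) for cpu in self.cpus
        ]
        for t in self.threads:
            t.start()

    def _run(self, cpu):
        # Best-effort pinning of this handler thread to its CPU.
        # Assumption: on Linux, pid 0 refers to the calling thread.
        if hasattr(os, "sched_setaffinity"):
            try:
                os.sched_setaffinity(0, {cpu})
            except OSError:
                pass
        q = self.queues[cpu]
        while True:
            event = q.get()
            if event is None:  # shutdown sentinel
                break
            # Stand-in for delivering a timer interrupt or completing an I/O.
            self.handled[cpu].append(event)

    def submit(self, cpu, event):
        """Route an event to the handler that owns the initiating CPU."""
        self.queues[cpu].put(event)

    def shutdown(self):
        for q in self.queues.values():
            q.put(None)
        for t in self.threads:
            t.join()
```

A virtual CPU thread running on CPU n would call submit(n, event), so the
completion work stays on the CPU whose caches already hold the related
data. Whether that helps depends on how much of the event data really is
per-CPU.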