> On Nov 4, 2016, at 2:49 AM, Stefan Hajnoczi <stefa...@gmail.com> wrote:
>
>> On Thu, Oct 20, 2016 at 01:31:15AM +0000, Ketan Nilangekar wrote:
>> 2. The idea of having a multi-threaded epoll-based network client was to
>> drive more throughput by using a multiplexed epoll implementation and
>> (fairly) distributing IOs from several vdisks (a typical VM is assumed
>> to have at least 2) across 8 connections.
>> Each connection is serviced by a single epoll and does not share its
>> context with other connections/epolls. All memory pools/queues are in
>> the context of a connection/epoll.
>> The qemu thread enqueues IO requests in one of the 8 epoll queues using
>> round-robin. Responses are also handled in the context of an epoll loop
>> and do not share context with other epolls. Any synchronization code
>> that you see today in the driver callback handles the split IOs, which
>> we plan to address by a) implementing readv in libqnio and b) removing
>> the 4MB limit on write IO size.
>> The number of client epoll threads (8) is a #define in qnio and can
>> easily be changed. However, our tests indicate that we are able to
>> drive a good number of IOs using 8 threads/epolls.
>> I am sure there are ways to simplify the library implementation, but
>> for now the performance of the epoll threads is more than satisfactory.
>
> Have you benchmarked against just 1 epoll thread with 8 connections?
>
The first implementation of qnio was actually single-threaded with 8
connections. The single-VM throughput at the time was, IIRC, less than
half of what we are getting now, especially with a workload doing IOs on
multiple vdisks. I assume we would need some sort of CPU/core affinity to
get the most out of a single-epoll-threaded design.

Ketan

> Stefan
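
[Editor's note: for readers skimming the thread, below is a minimal,
self-contained sketch of the dispatch scheme described in the quoted text
above: one epoll loop per worker "connection", each with its own queue and
wakeup eventfd, fed round-robin from a single submitting thread. This is
purely illustrative; the names, queue depth, and eventfd-based wakeup are
assumptions, not libqnio's actual code or API.]

/*
 * Hypothetical sketch (not libqnio code): one epoll loop per worker
 * "connection", each with a private queue and eventfd, fed round-robin
 * from a single submitting thread. All names and sizes are made up.
 */
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define NUM_WORKERS 8           /* analogous to the 8 connections/epolls */
#define QUEUE_DEPTH 256

struct request { int id; };

struct worker {
    int epfd;                   /* private epoll instance */
    int evfd;                   /* wakes this worker only */
    struct request *queue[QUEUE_DEPTH];
    int head, tail;
    pthread_mutex_t lock;       /* protects this worker's queue only */
    pthread_t thread;
};

static struct worker workers[NUM_WORKERS];

static void *worker_loop(void *arg)
{
    struct worker *w = arg;
    struct epoll_event ev;
    uint64_t cnt;

    for (;;) {
        if (epoll_wait(w->epfd, &ev, 1, -1) < 1) {
            continue;
        }
        if (read(w->evfd, &cnt, sizeof(cnt)) < 0) {   /* drain wakeups */
            continue;
        }
        pthread_mutex_lock(&w->lock);
        while (w->head != w->tail) {
            struct request *req = w->queue[w->head];
            w->head = (w->head + 1) % QUEUE_DEPTH;
            pthread_mutex_unlock(&w->lock);
            /* In the real client this is where the request would be
             * written to this worker's own socket. */
            printf("worker %d handled request %d\n",
                   (int)(w - workers), req->id);
            free(req);
            pthread_mutex_lock(&w->lock);
        }
        pthread_mutex_unlock(&w->lock);
    }
    return NULL;
}

/* Called from the submitting (qemu) thread: pick a worker round-robin. */
static void submit(struct request *req)
{
    static atomic_uint next;
    struct worker *w = &workers[atomic_fetch_add(&next, 1) % NUM_WORKERS];
    uint64_t one = 1;

    pthread_mutex_lock(&w->lock);
    w->queue[w->tail] = req;
    w->tail = (w->tail + 1) % QUEUE_DEPTH;
    pthread_mutex_unlock(&w->lock);
    (void)write(w->evfd, &one, sizeof(one));   /* wake that worker only */
}

int main(void)
{
    for (int i = 0; i < NUM_WORKERS; i++) {
        struct worker *w = &workers[i];
        struct epoll_event ev = { .events = EPOLLIN };

        w->epfd = epoll_create1(0);
        w->evfd = eventfd(0, EFD_NONBLOCK);
        pthread_mutex_init(&w->lock, NULL);
        ev.data.fd = w->evfd;
        epoll_ctl(w->epfd, EPOLL_CTL_ADD, w->evfd, &ev);
        pthread_create(&w->thread, NULL, worker_loop, w);
    }
    for (int i = 0; i < 32; i++) {
        struct request *req = malloc(sizeof(*req));
        req->id = i;
        submit(req);
    }
    sleep(1);   /* crude: give the demo workers a moment before exiting */
    return 0;
}

The point of the per-worker queue and eventfd is that nothing is shared
between epoll loops except the short enqueue critical section, which
mirrors the "no shared context between connections/epolls" description in
the quoted text.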
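
[Editor's note: on the CPU/core affinity point, on Linux this would
presumably be done with pthread_setaffinity_np(). A rough sketch, assuming
the worker layout from the previous sketch (not libqnio's), could be:]

/*
 * Hypothetical sketch: pin each epoll worker thread to its own core with
 * pthread_setaffinity_np() (Linux-specific, non-portable).
 */
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

static int pin_to_cpu(pthread_t thread, int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* Returns 0 on success, an errno value on failure. */
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/*
 * e.g. right after pthread_create() in the previous sketch:
 *
 *     long ncpus = sysconf(_SC_NPROCESSORS_ONLN);
 *     pin_to_cpu(w->thread, i % (int)ncpus);
 */

Whether pinning actually helps would have to be measured; with 8 epoll
threads competing with guest vCPU threads for cores, it could just as
easily hurt.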