Mike Christie, on 09/02/2011 12:15 PM wrote:
> On 09/01/2011 10:04 PM, Vladislav Bolkhovitin wrote:
>> Hi,
>>
>> I've done some tests and looks like open-iscsi doesn't support full duplex
>> speed on bidirectional data transfers from a single drive.
>>
>> My test is simple: 2 dd's doing big transfers in parallel over 1 GbE link
>> from a ramdisk or nullio iSCSI device. One dd is reading and another one is
>> writing. I'm watching throughput using vmstat. When any of the dd's working
>> alone, I have full single direction link utilization (~120 MB/s) in both
>> directions, but when both transfers working in parallel, throughput on any
>> of them immediately drops in 2 times to 55-60 MB/s (sum is the same
>> 120 MB/s).
>>
>> For sure, I tested bidirectional possibility of a single TCP connection and
>> it does provide near 2 times throughput increase (~200 MB/s).
>>
>> Interesting, that doing another direction transfer from the same device
>> imported from another iSCSI target provides expected full duplex 2x
>> aggregate throughput increase.
>>
>> I tried several iSCSI targets + I'm pretty confident that iSCSI-SCST is
>> capable to provide full duplex transfers, but from some look on the
>> open-iscsi code I can't see the serialization point in it. Looks like
>> open-iscsi receives and sends data in different threads (the requester
>> process and per connection iscsi_q_X workqueue correspondingly), so should
>> be capable to have full duplex.
>
> Yeah, we send from the iscsi_q workqueue and receive from the network
> softirq if the net driver supports NAPI.
>
>>
>> Does anyone have idea what could be the serialization point preventing full
>> duplex speed?
>>
>
> Did you do any lock profiliing and is the session->lock look the
> problem? It is taken in both the receive and xmit paths and also the
> queuecommand path.
Just did it. /proc/lock_stat shows no significant contention for
session->lock.

On the other hand, session->lock is a spinlock, so if it were the
serialization point, we would see high CPU consumption on the initiator
from spinning. But we have plenty of idle CPU time there.

So there must be another serialization point.

Thanks,
Vlad
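
For anyone who wants to repeat the lock_stat check, here is a rough Python sketch of how the numbers can be pulled out. It is only an illustration, not the exact procedure used above. It assumes the usual lock_stat layout: a header row starting with "class name", then one "<name>: <numbers...>" row per lock class. The helper names parse_lock_stat and contention_ratio are made up for this example, and matching on the substring "session->lock" is an assumption about how the lock class is named on a given kernel.

```python
def parse_lock_stat(text):
    """Parse lock_stat-style text into {class_name: {column: value}}."""
    columns = []
    stats = {}
    for line in text.splitlines():
        tokens = line.split()
        # The header row names the columns; it starts with "class name".
        if tokens[:2] == ["class", "name"]:
            columns = tokens[2:]
            continue
        # Per-class rows look like "<name>: <number> <number> ...".
        name, sep, rest = line.rpartition(":")
        if not sep or not columns:
            continue
        try:
            values = [float(v) for v in rest.split()]
        except ValueError:
            continue  # not a plain statistics row
        if len(values) == len(columns):
            stats[name.strip()] = dict(zip(columns, values))
    return stats


def contention_ratio(stats, needle):
    """Return {class_name: contentions / acquisitions} for matching classes."""
    result = {}
    for name, row in stats.items():
        acquisitions = row.get("acquisitions", 0.0)
        if needle in name and acquisitions:
            result[name] = row.get("contentions", 0.0) / acquisitions
    return result
```

On a kernel built with CONFIG_LOCK_STAT, the text would come from reading /proc/lock_stat (typically as root) while both dd's are running, e.g. contention_ratio(parse_lock_stat(open("/proc/lock_stat").read()), "session->lock"). A ratio close to zero would be consistent with the lock not being the bottleneck.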
