> 5) DMA straight from user buffer to disk avoiding a copy. This is what the "direct" in "direct i/o" has historically meant. :-)
> line has been that 5) won't help latency much, and latency is
> where I think the game is currently played. Now the disconnect
> might be because people might feel that the game is not latency
> but CPU efficiency: "how many CPU cycles do I burn to get data
> from disk to user buffer".

Actually, in many cases it's less a matter of CPU cycles than of memory cycles. For many databases, most of the I/O is writes (reads wind up cached in memory). So what is the cost of a write?

With direct I/O: the CPU writes to memory (spread out over many transactions), and the disk DMAs from memory. We write LPS (log page size) bytes of data from CPU to memory, and we read LPS bytes from memory for the DMA. On processors without a cache-line-zero instruction, we probably also read the LPS data from memory as part of the write. Total cost = W:LPS, R:2*LPS.

Without direct I/O: the cost of getting the data into the user buffer remains the same (W:LPS, R:LPS). We then copy the data from the user buffer to the system buffer (W:LPS, R:LPS), and push it out to disk (R:LPS). Total cost = W:2*LPS, R:3*LPS. We've nearly doubled the cost, not counting any TLB effects.

On a memory-bandwidth-starved system (which should be nearly all modern designs, especially with multi-threaded chips like Niagara), replacing buffered I/O with direct I/O should therefore give close to a 2x improvement in log write bandwidth. That's without considering cache effects (which shouldn't be too significant, really, since LPS should be << the size of L2).

How significant is this? We'd have to measure, and it will likely vary quite a lot depending on which database is used for testing.

Note, though, that for ZFS the win with direct I/O will be somewhat less, because you still need to read the page to compute its checksum. So for direct I/O with ZFS (with checksums enabled), the cost is W:LPS, R:3*LPS. Is saving one page of writes enough to make a difference? Possibly not.
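The accounting above can be sketched in a few lines of Python. The LPS value is an illustrative assumption (only the ratios matter), and the ZFS case assumes the checksum read adds one full read of the page on top of the generic direct-I/O cost:

```python
# Back-of-the-envelope model of memory traffic per log write.
# All byte counts are in units of LPS; the absolute LPS value is
# an assumption for illustration only.

LPS = 8 * 1024  # hypothetical log page size in bytes

# (writes, reads) per log page, in units of LPS:
direct_io   = (1, 2)  # CPU write (incl. read-for-ownership) + disk DMA read
buffered_io = (2, 3)  # same, plus a user->system buffer copy (W:1, R:1)
zfs_direct  = (1, 3)  # direct I/O plus one read of the page to checksum it

def total_bytes(scheme):
    """Total memory traffic (reads + writes) in bytes for one log page."""
    w, r = scheme
    return (w + r) * LPS

print("direct:    %d bytes" % total_bytes(direct_io))
print("buffered:  %d bytes" % total_bytes(buffered_io))
print("buffered/direct:     %.2fx" % (total_bytes(buffered_io) / total_bytes(direct_io)))
print("buffered/zfs-direct: %.2fx" % (total_bytes(buffered_io) / total_bytes(zfs_direct)))
```

This gives about a 1.67x traffic reduction for generic direct I/O, but only 1.25x once the checksum read is charged to the ZFS direct-I/O case.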
Anton

This message posted from opensolaris.org
_______________________________________________
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss