On Jul 28, 2011, at 9:52 PM, Jonathan Ellis wrote:

> This is not advisable in general, since non-mmap'd I/O is substantially 
> slower.

I see this claim again and again here, but it is actually close to 10 years 
since I last saw mmap'd I/O deliver any substantial performance benefit in any 
real-life use case I have dealt with.

We have also done a lot of testing of this with Cassandra, and I see nothing 
conclusive. We have run just as many tests where normal I/O was faster than 
mmap, and the differences may very well be within statistical variance, given 
the complexity and the number of factors involved in something like a 
distributed Cassandra cluster working at quorum.
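
For anyone who wants to reproduce this: the knob we have been flipping between 
runs is disk_access_mode in cassandra.yaml (value names from the 0.7/0.8-era 
configs we run; check your own version's file):

    # cassandra.yaml -- how SSTables are read:
    #   auto             -- mmap where the (64-bit) address space allows it
    #   mmap             -- mmap both data and index files
    #   mmap_index_only  -- mmap the index files only
    #   standard         -- plain buffered reads, no mmap
    disk_access_mode: standard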

mmap made a difference around 2000, when memory throughput was still measured 
in hundreds of megabytes/sec and CPU caches were a few kilobytes. Today you 
have megabytes of CPU cache with 100GB/sec of bandwidth, and even main memory 
bandwidth is in the tens of GB/sec.

However, I/O buffers are generally quite small, and copying an I/O buffer from 
kernel to user space through a cache with 100GB/sec of bandwidth is really a 
non-issue given the I/O throughput Cassandra generates.
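
To make the two paths concrete, here is a minimal (and admittedly naive) Java 
sketch: positional FileChannel reads, which copy each buffer out of the page 
cache, versus a MappedByteBuffer over the same file. The class name and buffer 
size are made up for illustration; run it on a file that is already in the 
page cache (and under 2GB, since a single map is limited to that) if you want 
to isolate the copy cost from the disk itself:

    import java.io.RandomAccessFile;
    import java.nio.ByteBuffer;
    import java.nio.MappedByteBuffer;
    import java.nio.channels.FileChannel;

    public class ReadVsMmap
    {
        public static void main(String[] args) throws Exception
        {
            RandomAccessFile raf = new RandomAccessFile(args[0], "r");
            FileChannel ch = raf.getChannel();
            int size = (int) ch.size();  // sketch assumes a file < 2GB
            int bufSize = 4096;          // small buffers, like typical row reads

            // Path 1: regular I/O. Every read() copies a buffer from the
            // kernel page cache into user space.
            ByteBuffer buf = ByteBuffer.allocate(bufSize);
            long sum = 0;
            long t0 = System.nanoTime();
            for (long pos = 0; pos < size; pos += bufSize)
            {
                buf.clear();
                ch.read(buf, pos);
                buf.flip();
                while (buf.hasRemaining())
                    sum += buf.get();
            }
            System.out.println("read(): " + (System.nanoTime() - t0) / 1000000
                               + " ms, checksum " + sum);

            // Path 2: mmap. The page cache is mapped straight into the
            // process, so there is no copy -- but cold pages cost page
            // faults and MMU work instead.
            MappedByteBuffer map = ch.map(FileChannel.MapMode.READ_ONLY, 0, size);
            sum = 0;
            t0 = System.nanoTime();
            for (int pos = 0; pos < size; pos++)
                sum += map.get(pos);
            System.out.println("mmap:   " + (System.nanoTime() - t0) / 1000000
                               + " ms, checksum " + sum);

            ch.close();
            raf.close();
        }
    }

The checksums are only there to keep the JIT from optimizing the loops away.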

By 2005 or so, CPUs had already reached the point where I saw mmap perform 
worse than regular I/O in a large number of use cases.

It is hard to say exactly why, but back then I saw one FreeBSD core developer 
speculate that the extra MMU work mmap incurs under some I/O loads may 
actually be slower than an in-cache memcpy of tiny I/O buffers (they are 
pretty small, after all).

I don't have a theory of my own here. I just know that, especially for large 
numbers of small I/O operations, regular I/O was typically faster than mmap, 
which would back up that theory.

So I wonder how people came to this conclusion, as I have not been able, under 
any real-life Cassandra use case, to reproduce anything resembling a 
significant difference, and we have been benchmarking on nodes with SSD setups 
that can churn out 1GB/sec+ read speeds.

That is far more I/O throughput than most people have at hand, and still I 
cannot get mmap to give me better performance.

I do feel, although subjectively, that things just seem to work better with 
regular I/O for us. We currently have very nice and stable heap sizes 
regardless of I/O load, and the system is easier to operate because we can 
actually monitor how much memory the darned thing uses (with mmap, the mapped 
SSTables get counted into the resident size, so it is hard to tell what the 
process itself consumes).

My recommendation? Stay away from mmap.

However, I would love to understand how people reached this conclusion, and to 
find out why we seem to see such different results!

> The OP is correct that it is best to disable swap entirely, and
> second-best to enable JNA for mlockall.

Be a bit careful about removing swap completely. Linux is not always happy 
when it runs short on memory, and with no swap at all the OOM killer is what 
you get instead.
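
On the JNA point: what it buys you, as far as I understand, is essentially an 
mlockall() call at JVM startup so the heap can never be pushed out to swap, 
which gets you most of the benefit without removing swap entirely. A minimal 
sketch of what that call looks like through JNA -- class and constant names 
here are illustrative, not Cassandra's actual code:

    import com.sun.jna.LastErrorException;
    import com.sun.jna.Native;

    public class MlockallSketch
    {
        // flag values from <sys/mman.h> on Linux
        private static final int MCL_CURRENT = 1;
        private static final int MCL_FUTURE  = 2;

        static
        {
            Native.register("c");  // bind the native method below to libc
        }

        private static native int mlockall(int flags) throws LastErrorException;

        public static void main(String[] args)
        {
            try
            {
                // lock all current and future pages into RAM so the
                // process can never be swapped out
                mlockall(MCL_CURRENT | MCL_FUTURE);
                System.out.println("mlockall ok");
            }
            catch (LastErrorException e)
            {
                // usually EPERM: needs CAP_IPC_LOCK or a raised RLIMIT_MEMLOCK
                System.out.println("mlockall failed, errno " + e.getErrorCode());
            }
        }
    }

If you do want to keep some swap around as a safety valve, turning 
vm.swappiness down is a gentler middle ground than removing it entirely.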

Terje
