On Fri, Jul 29, 2011 at 6:29 AM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > I would love to understand how people got to this conclusion however and
> > try to find out why we seem to see differences!
>
> I won't make any claims with Cassandra because I have never bothered
> benchmarking the difference in CPU usage since all my use-cases have
> been more focused on I/O efficiency, but I will say, without having
> benchmarked that either, that *generally*, if you're doing small reads
> of data that is in page cache using mmap() - something would have to
> be seriously wrong for that not to be significantly faster than
> regular I/O.

Sorry, with small reads, I was thinking small random reads, basically things that are not very cacheable and probably cause demand paging.

For quite large reads, like 10s of MB from disk, the demand paging will not be good for mmap performance. This is probably not a type of storage use which is a stronghold of Cassandra either.

But you sort of nicely list a lot of things I did not take time to write, and that just adds support for my original question: what is the origin of the "mmap is substantially faster" claim?

You also need to throw in the fun question of how the JVM will interact with all of this.

Given the number of people asking questions here related to confusion about mmap, memory maps and JNA, and the work of maintaining the mmap code, I am somewhat curious whether this is worth it.

Different usages can generate vastly different loads on systems, so even though our current usage scenarios do not seem to benefit from mmap, other cases obviously can, and I am curious what these cases look like.

Terje