Probably relevant: we only use mmap'd I/O for single-row reads. When we page through entire files, as we do for compaction or AES, we use buffered I/O to avoid the complexity of managing multiple mmap segments (Java limits us to 2GB per segment).
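
To give a rough idea of the bookkeeping that multiple segments involve, here is a minimal sketch. It is not the actual Cassandra code: the class MmapSegments and its methods are made up for illustration, and the only real API it relies on is FileChannel.map, which rejects sizes above Integer.MAX_VALUE.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; names are assumptions, not Cassandra's classes.
class MmapSegments {
    // FileChannel.map refuses sizes above Integer.MAX_VALUE, hence the ~2GB cap.
    static final long MAX_SEGMENT_SIZE = Integer.MAX_VALUE;

    /** Start offsets of the segments needed to cover a file of the given length. */
    static long[] segmentOffsets(long fileLength, long maxSegmentSize) {
        if (fileLength < 0 || maxSegmentSize <= 0) {
            throw new IllegalArgumentException("bad length or segment size");
        }
        // Ceiling division: a partial trailing segment still needs its own mapping.
        int count = (int) ((fileLength + maxSegmentSize - 1) / maxSegmentSize);
        long[] offsets = new long[count];
        for (int i = 0; i < count; i++) {
            offsets[i] = i * maxSegmentSize;
        }
        return offsets;
    }

    /** Maps the whole file read-only as a list of segments. */
    static List<MappedByteBuffer> mapFile(Path path, long maxSegmentSize) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ)) {
            long length = channel.size();
            List<MappedByteBuffer> segments = new ArrayList<>();
            for (long start : segmentOffsets(length, maxSegmentSize)) {
                long size = Math.min(maxSegmentSize, length - start);
                segments.add(channel.map(FileChannel.MapMode.READ_ONLY, start, size));
            }
            // Mappings stay valid after the channel is closed.
            return segments;
        }
    }

    /** Reads one byte at an absolute file position by picking the right segment. */
    static byte readByte(List<MappedByteBuffer> segments, long maxSegmentSize, long position) {
        int index = (int) (position / maxSegmentSize);
        int offset = (int) (position % maxSegmentSize);
        return segments.get(index).get(offset);
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("segments", ".bin");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, new byte[] {0, 1, 2, 3, 4, 5, 6, 7, 8, 9});
        // A tiny segment size stands in for the real 2GB limit.
        List<MappedByteBuffer> segments = mapFile(tmp, 4);
        System.out.println(segments.size() + " segments, byte at 9 = " + readByte(segments, 4, 9));
    }
}
```

A reader that streams through the file has to translate every position into a (segment, offset) pair and handle reads that straddle a segment boundary, which is the complexity buffered I/O sidesteps.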

On Sat, Mar 12, 2011 at 7:06 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:
>>> Nothing happens, because it _doesn't have to be resident_.
>>>
>>
>> Hm, but why in my case top show RSS 10g, when max HEAP_SIZE is 6G?
>
> The point is that it is a result of how the kernel manages memory and
> how it is reported in top. It is not reflective of actual memory
> "use", the way users normally use the term.
>
> If you take a file that is 1 gig in file and "cat" it, it will end up
> in page cache. But that does not get accounted to any particular
> process. If on the other hand you mmap() the file in a process and
> stream through it, it will be accounted as part of the resident set of
> the process. But it is not indicative that the process is "using" that
> memory in the usual sense of the word.
>
> Since this keeps coming up I decided to put up the little test I have
> that can be used to demonstrate the effect:
>
> https://github.com/scode/alloctest
>
> You can run that and observe the effects in top.
>
> That said, there's *something* fishy going on in that whether pages
> get counted towards the process may be depending on something else.
> E.g., I have a node with ~ 1 TB virtual/sstable sizes that is actively
> doing AES (so should be doing mmap():ed i/o), yet I have exactly 10
> gig (max heap size) RSS instead of > 10 gb. I haven't investigated
> properly. Maybe just depending on mmap flags.
>
> --
> / Peter Schuller
>

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com