We had a visitor from Intel a month ago. One question from him was "What could you do if we gave you a server 2 years from now that had 16TB of memory?"
I went.... Eh... using Java....? 2 years is maybe unrealistic, but you can already get quite acceptable prices even on servers in the 100GB memory range now if you buy in larger quantities (30-50 servers and more in one go). I don't think it is unrealistic that we will start seeing high-end consumer (x64) servers with TBs of memory in a few years, and I really wonder where that puts Java-based software.

Terje

On Fri, Jul 1, 2011 at 2:25 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> On Thu, Jun 30, 2011 at 12:44 PM, Daniel Doubleday <daniel.double...@gmx.net> wrote:
>
>> Hi all - or rather devs,
>>
>> We have been working on an alternative implementation to the existing row cache(s).
>>
>> We have 2 main goals:
>>
>> - Decrease memory -> get more rows in the cache without suffering a huge performance penalty
>> - Reduce GC pressure
>>
>> This sounds a lot like we should be using the new serializing cache in 0.8. Unfortunately our workload consists of loads of updates, which would invalidate the cache all the time.
>>
>> The second unfortunate thing is that the idea we came up with doesn't fit the new cache provider API...
>>
>> It looks like this:
>>
>> Like the serializing cache, we basically only cache the serialized byte buffer. We don't serialize the bloom filter, and we try to do some other minor compression tricks (varints etc., not done yet). The main difference is that we don't deserialize but use the normal sstable iterators and filters as in the regular uncached case.
>>
>> So the read path looks like this:
>>
>> return filter.collectCollatedColumns(memtable iter, cached row iter)
>>
>> The write path is not affected. It does not update the cache.
>>
>> During flush we merge all memtable updates with the cached rows.
>>
>> These are early test results:
>>
>> - Depending on row width and value size, the serialized cache takes between 30% and 50% of the memory compared with a cached CF. This might be optimized further.
>> - Read times increase by 5 - 10%.
>>
>> We haven't tested the effects on GC but hope that we will see improvements there, because we only cache a fraction of objects (in terms of numbers) in the old gen heap, which should make GC cheaper. Of course there's also the option to use native memory like the serializing cache does.
>>
>> We believe that this approach is quite promising, but as I said it is not compatible with the current cache API.
>>
>> So my question is: does that sound interesting enough to open a Jira, or has that idea already been considered and rejected for some reason?
>>
>> Cheers,
>> Daniel
>
> The problem I see with the row cache implementation is more of a JVM problem. This problem is not Cassandra-localized (IMHO), as I hear HBase people have similar large cache / Xmx issues. Personally, I feel this is a sign of Java showing its age. "Let us worry about the pointers" was a great solution when systems had 32MB of memory, because the cost of walking the object graph was small and could be done in small time windows. But JVMs already cannot handle 13+ GB of RAM, and it is quite common to see systems with 32-64GB of physical memory. I am very curious to see how Java is going to evolve on systems with 128GB or even higher memory.
>
> The G1 collector will help somewhat; however, I do not see that really pushing Xmx higher than it is now.
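To make the read path Daniel describes above a bit more concrete, here is a rough sketch of how I understand the idea. All the names (SerializedRowCache, ColumnCodec, cachedIterator, mergeOnFlush) are invented for illustration and this is not the actual patch; the point is only that the cache holds nothing but the serialized row bytes, reads collate the memtable iterator with a lazily deserializing iterator over those bytes, and only flush reconciles the cache with memtable updates.

import java.nio.ByteBuffer;
import java.util.Iterator;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch only -- the names are invented, not Cassandra's API.
final class SerializedRowCache<K, C> {

    // How columns are read out of / merged into the serialized row bytes.
    interface ColumnCodec<T> {
        // Walk columns straight out of the buffer; no up-front deserialization.
        Iterator<T> iterator(ByteBuffer serializedRow);
        // Fold memtable updates into the serialized form (used at flush time).
        ByteBuffer merge(ByteBuffer serializedRow, Iterator<T> memtableColumns);
    }

    private final ConcurrentHashMap<K, ByteBuffer> rows = new ConcurrentHashMap<K, ByteBuffer>();
    private final ColumnCodec<C> codec;

    SerializedRowCache(ColumnCodec<C> codec) {
        this.codec = codec;
    }

    // Read path: the caller collates this with the memtable iterator, i.e.
    // filter.collectCollatedColumns(memtableIter, cachedIterator(key)).
    Iterator<C> cachedIterator(K key) {
        ByteBuffer serialized = rows.get(key);
        return serialized == null ? null : codec.iterator(serialized.duplicate());
    }

    // The write path stays untouched; only flush merges memtable updates into
    // the cached bytes, so ordinary updates never invalidate the cache.
    void mergeOnFlush(K key, Iterator<C> memtableColumns) {
        ByteBuffer old = rows.get(key);
        if (old != null) {
            rows.replace(key, old, codec.merge(old.duplicate(), memtableColumns));
        }
    }
}

The nice property of the flush hook is exactly what Daniel points out: because ordinary writes only touch the memtable and the cache is merged at flush time, an update-heavy workload does not keep invalidating cached rows the way it would with the serializing cache.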
> HBase has even gone the route of using an off-heap cache, https://issues.apache.org/jira/browse/HBASE-4018, and some Jira tickets mention Cassandra exploring this alternative as well.
>
> Doing whatever is possible to shrink the current size of items in the cache is awesome. Anything that delivers more bang for the buck is +1. However, I feel that the VFS cache is the only way to effectively cache large datasets. I was quite disappointed when I upped a machine from 16GB to 48GB of physical memory. I said to myself, "Awesome! Now I can shave off a couple of GB for larger row caches." I changed Xmx from 9GB to 13GB, upped the caches, and restarted. I found the system spending a lot of time managing the heap, and also found that my compaction processes that used to do 200GB in 4 hours were now taking 6 or 8 hours.
>
> I had heard that JVMs "top out around 20GB", but I found they "top out" much lower. VFS cache +1
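On the off-heap angle (the HBASE-4018 approach linked above, and the "native mem" option Daniel mentions), the essential trick is that only a small wrapper object stays on the GC-managed heap while the payload lives in native memory. A minimal illustration of that principle, using plain direct ByteBuffers and invented names rather than whatever either project actually does:

import java.nio.ByteBuffer;

// Minimal sketch of the off-heap idea: the serialized row lives in a direct
// ByteBuffer (native memory, outside -Xmx), so the collector only ever sees
// this small wrapper object instead of the whole payload.
final class OffHeapValue {
    private final ByteBuffer direct;

    OffHeapValue(byte[] serializedRow) {
        direct = ByteBuffer.allocateDirect(serializedRow.length);
        direct.put(serializedRow);
        direct.flip();
    }

    byte[] copyOut() {
        // Readers get a copy back on heap and deserialize from that.
        byte[] copy = new byte[direct.remaining()];
        direct.duplicate().get(copy);
        return copy;
    }
}

The trade-off is the same one Daniel's early numbers hint at: every read pays a copy/deserialization cost, in exchange for a heap small enough that the collector can keep up.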