I've given it some thought in the past. In the end, I usually talk myself out of it because I think it increases the surface area for failure. That is, managing N processes is more difficult than managing one process. But if the additional failure modes are addressed, there are some interesting possibilities.
For example, having gossip in its own process would decrease the odds that a node is marked dead because stop-the-world GC is happening in the storage JVM. On the flip side, you'd need checks so that the gossip process can recognize when the storage process has actually died vs. just running a long GC.

I don't know that I'd go so far as to have separate processes per keyspace, etc. There is probably some interesting work that could be done to support the orgs that run multiple Cassandra instances on the same node (multiple gossipers in that case is at least a little wasteful).

I've also played around with using domain sockets for IPC inside of Cassandra. I never ran a proper benchmark, but there were some throughput advantages to this approach.

Cheers,
Gary.

On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller <carl.muel...@smartthings.com> wrote:
> GC pauses may have been improved in newer releases, since we are in 2.1.x,
> but I was wondering why cassandra uses one jvm for all tables and
> keyspaces, intermingling the heap for on-JVM objects.
>
> ... so why doesn't cassandra spin off a jvm per table so each jvm can be
> tuned per table and gc tuned and gc impacts not impact other tables? It
> would probably increase the number of endpoints if we avoid having an
> overarching query router.
>
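To make the "dead vs. just in a long GC" check from the reply above concrete, here is a minimal, hypothetical sketch, not anything Cassandra actually ships: a gossip-side watchdog that listens for heartbeats from the storage JVM over a Unix domain socket (which also illustrates the domain-socket IPC idea) and, when heartbeats stop, consults the OS before declaring the storage process dead. All class names, paths, and timeouts are made up for illustration, and it assumes JDK 16+ for built-in Unix domain socket support; at the time of this thread a library such as junixsocket or Netty's native transport would have been needed instead.

import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;

public class StorageWatchdog {

    enum StorageState { ALIVE, GC_PAUSE_SUSPECTED, DEAD }

    private final long storagePid;            // pid of the storage JVM, read at startup
    private final Duration heartbeatTimeout;  // how long silence is tolerated
    private volatile Instant lastHeartbeat = Instant.now();

    StorageWatchdog(long storagePid, Duration heartbeatTimeout) {
        this.storagePid = storagePid;
        this.heartbeatTimeout = heartbeatTimeout;
    }

    /**
     * Accept heartbeats from the storage process over a Unix domain socket.
     * The storage JVM writes one byte per second; during a stop-the-world
     * pause those writes simply stop. A real version would re-accept after
     * a disconnect; this sketch handles a single connection.
     */
    void listenForHeartbeats(Path socketPath) throws Exception {
        Files.deleteIfExists(socketPath);
        try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
            server.bind(UnixDomainSocketAddress.of(socketPath));
            SocketChannel peer = server.accept();
            ByteBuffer buf = ByteBuffer.allocate(1);
            while (peer.read(buf) != -1) {     // blocks until the next heartbeat arrives
                lastHeartbeat = Instant.now();
                buf.clear();
            }
        }
    }

    /**
     * Decide what silence means: a missing heartbeat plus a missing OS process
     * is a real death; a missing heartbeat from a process the OS still sees is
     * more likely a long GC pause and should not be gossiped as DOWN right away.
     */
    StorageState check() {
        if (Duration.between(lastHeartbeat, Instant.now()).compareTo(heartbeatTimeout) < 0) {
            return StorageState.ALIVE;
        }
        boolean processAlive = ProcessHandle.of(storagePid)
                                            .map(ProcessHandle::isAlive)
                                            .orElse(false);
        return processAlive ? StorageState.GC_PAUSE_SUSPECTED : StorageState.DEAD;
    }
}

Note that an OS-level liveness check alone cannot tell a long GC pause from a genuinely hung process, so real logic would want additional signals (GC logs, safepoint statistics, or a retry budget) before gossiping the node as down.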