I've given it some thought in the past. In the end, I usually talk myself
out of it because I think it increases the surface area for failure. That
is, managing N processes is more difficult that managing one process. But
if the additional failure modes are addressed, there are some interesting
possibilities.

For example, having gossip in its own process would decrease the odds that
a node is marked dead because STW GC is happening in the storage JVM. On
the flip side, you'd need checks to make sure that the gossip process can
recognize when the storage process has actually died vs. when it's just
sitting in a long GC pause.
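To make that concrete, here's a rough sketch of the kind of probe I mean
(the health port, the one-byte ping protocol, and the class name are all
made up for illustration; nothing like this exists in Cassandra today).
Note that a bare TCP connect isn't enough to call the storage process
healthy, since the kernel will still accept connections into the listen
backlog while the JVM is stopped, so the probe has to get an
application-level reply:

    import java.io.IOException;
    import java.net.InetSocketAddress;
    import java.net.Socket;

    // Hypothetical probe the gossip process could run against the storage JVM.
    // HEALTHY = got an application-level reply
    // PAUSED  = no reply, but the PID is still alive (likely a long STW GC)
    // DEAD    = no reply and the PID is gone
    public class StorageLivenessProbe {
        enum State { HEALTHY, PAUSED, DEAD }

        private final long storagePid;   // PID of the storage JVM
        private final int healthPort;    // local port its health responder listens on

        StorageLivenessProbe(long storagePid, int healthPort) {
            this.storagePid = storagePid;
            this.healthPort = healthPort;
        }

        State check() {
            try (Socket s = new Socket()) {
                s.connect(new InetSocketAddress("127.0.0.1", healthPort), 200);
                s.setSoTimeout(200);
                s.getOutputStream().write('p');          // one-byte ping
                if (s.getInputStream().read() != -1) {   // a reply means app threads are running
                    return State.HEALTHY;
                }
            } catch (IOException timedOutOrRefused) {
                // fall through to the PID check
            }
            // No application-level reply: is the process paused or gone?
            boolean alive = ProcessHandle.of(storagePid)
                                         .map(ProcessHandle::isAlive)
                                         .orElse(false);
            return alive ? State.PAUSED : State.DEAD;
        }
    }

You'd obviously want the timeout tunable and a few consecutive failures
before declaring anything, but that's the shape of it.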

I don't know that I'd go so far as to have separate processes for
keyspaces, etc.

There is probably some interesting work that could be done to support the
orgs who run multiple cassandra instances on the same node (running
multiple gossipers in that case is at least a little wasteful).

I've also played around with using domain sockets for IPC inside of
cassandra. I never ran a proper benchmark, but there were some throughput
advantages to this approach.
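
A toy sketch of the idea, using the JDK's native Unix domain socket
support in SocketChannel (JDK 16+); the socket path and message framing
are made up for the example and this isn't anything that's actually wired
into Cassandra:

    import java.io.IOException;
    import java.net.StandardProtocolFamily;
    import java.net.UnixDomainSocketAddress;
    import java.nio.ByteBuffer;
    import java.nio.channels.ServerSocketChannel;
    import java.nio.channels.SocketChannel;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;

    // Two "processes" faked with a thread: one side listens on a Unix domain
    // socket and echoes, the other connects and sends a message.
    public class DomainSocketIpcDemo {
        public static void main(String[] args) throws IOException, InterruptedException {
            Path socketPath = Path.of("/tmp/cassandra-ipc-demo.sock");
            Files.deleteIfExists(socketPath);
            UnixDomainSocketAddress addr = UnixDomainSocketAddress.of(socketPath);

            // Bind before spawning the "server" so the client can't race the accept.
            ServerSocketChannel listener = ServerSocketChannel.open(StandardProtocolFamily.UNIX);
            listener.bind(addr);

            Thread server = new Thread(() -> {
                try (SocketChannel peer = listener.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(256);
                    peer.read(buf);
                    buf.flip();
                    peer.write(buf);                     // echo the request back
                } catch (IOException e) {
                    throw new RuntimeException(e);
                }
            });
            server.start();

            // "Client" side: connect, send, read the echo.
            try (SocketChannel client = SocketChannel.open(addr)) {
                client.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));
                ByteBuffer reply = ByteBuffer.allocate(256);
                client.read(reply);
                reply.flip();
                System.out.println(StandardCharsets.UTF_8.decode(reply));   // prints "ping"
            }

            server.join();
            listener.close();
            Files.deleteIfExists(socketPath);
        }
    }

(The usual argument for domain sockets here is skipping the loopback
TCP/IP overhead, but as I said, I never benchmarked it properly.)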

Cheers,

Gary.


On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller <carl.muel...@smartthings.com>
wrote:

> GC pauses may have been improved in newer releases, since we are in 2.1.x,
> but I was wondering why cassandra uses one jvm for all tables and
> keyspaces, intermingling the heap for on-JVM objects.
>
> ... so why doesn't cassandra spin off a jvm per table so each jvm can be
> tuned per table and gc tuned and gc impacts not impact other tables? It
> would probably increase the number of endpoints if we avoid having an
> overarching query router.
>
