... Compaction in its own JVM was also something I was thinking about, but then I realized even more JVM sharding could be done at the table level.
On Thu, Feb 22, 2018 at 4:09 PM, Jon Haddad <j...@jonhaddad.com> wrote:
> Yeah, I’m in the compaction-in-its-own-JVM camp, in an ideal world where
> we’re isolating crazy GC-churning parts of the DB. It would mean
> reworking how tasks are created and removing all shared state in favor of
> messaging + a smarter manager, which imo would be a good idea regardless.
>
> It might be a better use of time (especially for 4.0) to do some GC
> performance profiling and cut down on the allocations, since that doesn’t
> involve a massive effort.
>
> I’ve been meaning to do a little benchmarking and profiling for a while
> now, and it seems like a few others have the same inclination as well;
> maybe now is a good time to coordinate that. A nice perf bump for 4.0
> would be very rewarding.
>
> Jon
>
> > On Feb 22, 2018, at 2:00 PM, Nate McCall <zznat...@gmail.com> wrote:
> >
> > I’ve heard a couple of folks pontificate on compaction in its own
> > process as well, given it has such a high impact on GC. Not sure about
> > the value of individual tables. Interesting idea, though.
> >
> > On Fri, Feb 23, 2018 at 10:45 AM, Gary Dusbabek <gdusba...@gmail.com> wrote:
> >> I’ve given it some thought in the past. In the end, I usually talk myself
> >> out of it because I think it increases the surface area for failure. That
> >> is, managing N processes is more difficult than managing one process. But
> >> if the additional failure modes are addressed, there are some interesting
> >> possibilities.
> >>
> >> For example, having gossip in its own process would decrease the odds that
> >> a node is marked dead because STW GC is happening in the storage JVM. On
> >> the flip side, you’d need checks to make sure that the gossip process can
> >> recognize when the storage process has died vs. just running a long GC.
> >>
> >> I don’t know that I’d go so far as to have separate processes for
> >> keyspaces, etc.
> >>
> >> There is probably some interesting work that could be done to support the
> >> orgs who run multiple Cassandra instances on the same node (multiple
> >> gossipers in that case is at least a little wasteful).
> >>
> >> I’ve also played around with using domain sockets for IPC inside of
> >> Cassandra. I never ran a proper benchmark, but there were some throughput
> >> advantages to this approach.
> >>
> >> Cheers,
> >>
> >> Gary.
> >>
> >> On Thu, Feb 22, 2018 at 8:39 PM, Carl Mueller <carl.muel...@smartthings.com>
> >> wrote:
> >>
> >>> GC pauses may have been improved in newer releases, since we are on 2.1.x,
> >>> but I was wondering why Cassandra uses one JVM for all tables and
> >>> keyspaces, intermingling the heap for on-JVM objects.
> >>>
> >>> ... so why doesn’t Cassandra spin off a JVM per table, so each JVM can be
> >>> tuned per table, GC-tuned, and its GC impacts kept from affecting other
> >>> tables? It would probably increase the number of endpoints if we avoid
> >>> having an overarching query router.
> >>>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
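[Editor's note] Gary’s point that a separate gossip process would need to tell a storage process that has died apart from one stuck in a long stop-the-world GC can be sketched roughly as follows. This is a hypothetical illustration, not code from Cassandra; the names (`StorageLiveness`, `check`, `HEARTBEAT_TIMEOUT`) are invented for the example. The idea: combine an OS-level liveness signal with heartbeat recency — a missing process is dead, an existing process with stale heartbeats is probably just paused in GC and should not be reported down to peers.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch of how a watchdog/gossip process might classify
// the storage JVM. Not from the Cassandra codebase.
public class StorageLiveness {
    public enum State { ALIVE, PAUSED, DEAD }

    // Illustrative threshold: heartbeats older than this are "stale".
    private static final Duration HEARTBEAT_TIMEOUT = Duration.ofSeconds(5);

    public static State check(boolean osProcessAlive, Instant lastHeartbeat, Instant now) {
        // If the OS says the process is gone, it is dead regardless of heartbeats.
        if (!osProcessAlive)
            return State.DEAD;
        // Process exists but heartbeats are stale: assume a long STW GC
        // rather than declaring the node down.
        if (Duration.between(lastHeartbeat, now).compareTo(HEARTBEAT_TIMEOUT) > 0)
            return State.PAUSED;
        return State.ALIVE;
    }

    public static void main(String[] args) {
        Instant now = Instant.now();
        System.out.println(check(true, now, now));                  // ALIVE
        System.out.println(check(true, now.minusSeconds(30), now)); // PAUSED
        System.out.println(check(false, now, now));                 // DEAD
    }
}
```

In a real deployment the `osProcessAlive` signal could come from `ProcessHandle.isAlive()` or a pidfile check, and the heartbeat from shared memory or a local socket; the subtlety Gary raises is exactly that neither signal alone is sufficient.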
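[Editor's note] On the domain-socket IPC idea: at the time of this thread, Java had no built-in AF_UNIX support (it required JNI or a native library), but since JDK 16 (JEP 380) NIO channels can use Unix domain sockets directly. A minimal round-trip sketch of the kind of intra-node IPC Gary describes, with all names hypothetical and no claim that Cassandra works this way:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.StandardProtocolFamily;
import java.net.UnixDomainSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Requires JDK 16+ (JEP 380: Unix-Domain Socket Channels).
public class DomainSocketDemo {
    // Sends msg over a Unix domain socket to an in-process echo "server"
    // and returns the reply, standing in for gossip <-> storage IPC.
    public static String roundTrip(String msg) throws Exception {
        Path sock = Path.of(System.getProperty("java.io.tmpdir"),
                            "ipc-demo-" + ProcessHandle.current().pid() + ".sock");
        Files.deleteIfExists(sock);
        UnixDomainSocketAddress addr = UnixDomainSocketAddress.of(sock);
        try (ServerSocketChannel server = ServerSocketChannel.open(StandardProtocolFamily.UNIX)) {
            server.bind(addr);
            // "Storage" side: accept one connection and echo what it reads.
            Thread echo = new Thread(() -> {
                try (SocketChannel peer = server.accept()) {
                    ByteBuffer buf = ByteBuffer.allocate(256);
                    peer.read(buf);
                    buf.flip();
                    peer.write(buf);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            echo.start();
            // "Gossip" side: connect, send the message, read the echo back.
            try (SocketChannel client = SocketChannel.open(addr)) {
                client.write(ByteBuffer.wrap(msg.getBytes(StandardCharsets.UTF_8)));
                ByteBuffer reply = ByteBuffer.allocate(256);
                client.read(reply);
                reply.flip();
                echo.join();
                return StandardCharsets.UTF_8.decode(reply).toString();
            }
        } finally {
            Files.deleteIfExists(sock);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("heartbeat"));
    }
}
```

Domain sockets skip the TCP stack for same-host traffic, which is presumably where the throughput advantage Gary observed came from.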