Also, I was wondering whether the key cache maintains a count of how many local accesses each key receives. Such information could be very useful for compaction: sstable data could be split by frequency of use, so that data in a given usage band can be preferentially compacted.
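To make the idea concrete, here is a minimal sketch of splitting keys by access frequency. It is plain Python rather than Cassandra code: `AccessCounter`, `split_by_frequency`, and `hot_threshold` are hypothetical names invented for illustration, and nothing here implies the key cache actually exposes such counts.

```python
from collections import Counter


class AccessCounter:
    """Hypothetical per-key access counter (an assumption, not a Cassandra API)."""

    def __init__(self):
        self._counts = Counter()

    def record(self, key):
        # Called once for every local read of `key`.
        self._counts[key] += 1

    def count(self, key):
        # Counter returns 0 for keys that were never recorded.
        return self._counts[key]


def split_by_frequency(keys, counter, hot_threshold):
    """Partition keys into (hot, cold) lists, preserving input order."""
    hot, cold = [], []
    for key in keys:
        target = hot if counter.count(key) >= hot_threshold else cold
        target.append(key)
    return hot, cold


counter = AccessCounter()
for key in ["a", "b", "a", "c", "a", "b"]:
    counter.record(key)

# Keys read at least twice are treated as hot; "d" was never read.
hot, cold = split_by_frequency(["a", "b", "c", "d"], counter, hot_threshold=2)
print(hot, cold)
```

A compaction strategy could, under these assumptions, write the two groups to separate sstables and compact them on different schedules.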
On Wed, Feb 21, 2018 at 5:08 PM, Carl Mueller <carl.muel...@smartthings.com> wrote:

> Looking through the 2.1.X code I see this:
>
> org.apache.cassandra.io.sstable.Component.java
>
> In the enum for component types there is a CUSTOM enum value which seems
> to indicate a catchall for providing metadata for sstables.
>
> Has this been exploited... ever? I noticed in some of the patches for the
> archival options on TWCS there are complaints about being able to identify
> sstables that are archived and those that aren't.
>
> I would be interested in using it to mark sstables with metadata
> indicating the date range an sstable is targeted at for compactions.
>
> discoverComponentsFor in SSTable.java seems to explicitly exclude loading
> any files/sstable components that are CUSTOM.
>
> On Wed, Feb 21, 2018 at 10:05 AM, Carl Mueller <
> carl.muel...@smartthings.com> wrote:
>
>> jon: I am planning on writing a custom compaction strategy. That's why
>> the question is here; I figured the specifics of memtable -> sstable and
>> cassandra internals are not a user question. If that still isn't deep
>> enough for the dev thread, I will move all those questions to user.
>>
>> On Wed, Feb 21, 2018 at 9:59 AM, Carl Mueller <
>> carl.muel...@smartthings.com> wrote:
>>
>>> Thank you all!
>>>
>>> On Tue, Feb 20, 2018 at 7:35 PM, kurt greaves <k...@instaclustr.com>
>>> wrote:
>>>
>>>> Probably a lot of work but it would be incredibly useful for vnodes if
>>>> flushing was range aware (to be used with RangeAwareCompactionStrategy).
>>>> The writers are already range aware for JBOD, but that's not terribly
>>>> valuable ATM.
>>>>
>>>> On 20 February 2018 at 21:57, Jeff Jirsa <jji...@gmail.com> wrote:
>>>>
>>>>> There are some arguments to be made that the flush should consider
>>>>> compaction strategy - would allow a big flush to respect LCS filesizes
>>>>> or break into smaller pieces to try to minimize range overlaps going
>>>>> from L0 into L1, for example.
>>>>>
>>>>> I have no idea how much work would be involved, but may be worthwhile.
>>>>>
>>>>> --
>>>>> Jeff Jirsa
>>>>>
>>>>> On Feb 20, 2018, at 1:26 PM, Jon Haddad <j...@jonhaddad.com> wrote:
>>>>>
>>>>> The file format is independent from compaction. A compaction strategy
>>>>> only selects sstables to be compacted, that's its only job. It could
>>>>> have side effects, like generating other files, but any decent
>>>>> compaction strategy will account for the fact that those other files
>>>>> don't exist.
>>>>>
>>>>> I wrote a blog post a few months ago going over some of the nuance of
>>>>> compaction you might find informative:
>>>>> http://thelastpickle.com/blog/2017/03/16/compaction-nuance.html
>>>>>
>>>>> This is also the wrong mailing list, please direct future user
>>>>> questions to the user list. The dev list is for development of
>>>>> Cassandra itself.
>>>>>
>>>>> Jon
>>>>>
>>>>> On Feb 20, 2018, at 1:10 PM, Carl Mueller <
>>>>> carl.muel...@smartthings.com> wrote:
>>>>>
>>>>> When memtables/CommitLogs are flushed to disk/sstable, does the
>>>>> sstable go through sstable organization specific to each compaction
>>>>> strategy, or is the sstable creation the same for all compaction
>>>>> strategies and it is up to the compaction strategy to recompact the
>>>>> sstable if desired?
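
As a rough illustration of the point made in the thread that a compaction strategy only selects sstables, and of the date-range metadata idea, here is a small Python sketch. It is not Cassandra or TWCS code: `SSTableInfo`, `select_candidates`, `window_seconds`, and `min_threshold` are hypothetical names, and the timestamps are made up. The function only picks candidates from per-sstable time-range metadata; it never reads or rewrites data.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SSTableInfo:
    """Hypothetical per-sstable metadata (an assumption for illustration)."""
    name: str
    min_ts: int  # oldest write timestamp in the sstable, in seconds
    max_ts: int  # newest write timestamp in the sstable, in seconds


def select_candidates(sstables, window_seconds, min_threshold=2):
    """Return the sstables in the oldest time window that holds at least
    `min_threshold` sstables, or an empty list if no window qualifies."""
    buckets = defaultdict(list)
    for sstable in sstables:
        # Bucket each sstable by the window containing its newest write.
        buckets[sstable.max_ts // window_seconds].append(sstable)
    for window in sorted(buckets):
        if len(buckets[window]) >= min_threshold:
            return buckets[window]
    return []


tables = [
    SSTableInfo("t-1", 0, 50),
    SSTableInfo("t-2", 60, 90),
    SSTableInfo("t-3", 100, 150),
    SSTableInfo("t-4", 160, 190),
    SSTableInfo("t-5", 200, 250),
]

picked = select_candidates(tables, window_seconds=100)
print([s.name for s in picked])
```

The flush path is untouched in this sketch: every sstable is written the same way, and the strategy's only decision is which existing sstables to hand to compaction next.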