+1

I would guess that a lot of C* clusters/tables leave this option at its default value, and that few of them actually need to read such big chunks of data. I believe this change will greatly limit disk overreads for a fair share (a large majority?) of new users. Changing this default seems fair enough, and I also think 4.0 is a good place to do it.
Thanks for taking care of this, Ariel, and for making sure there is consensus here as well.

C*heers,
-----------------------
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Sat, Oct 13, 2018 at 08:52, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> This would only impact new tables, existing tables would get their
> chunk_length_in_kb from the existing schema. It's something we record in a
> system table.
>
> I have an implementation of a compact integer sequence that only requires
> 37% of the memory required today. So we could do this with only slightly
> more than doubling the memory used. I'll post that to the JIRA soon.
>
> Ariel
>
> On Fri, Oct 12, 2018, at 1:56 AM, Jeff Jirsa wrote:
> >
> > I think 16k is a better default, but it should only affect new tables.
> > Whoever changes it, please make sure you think about the upgrade path.
> >
> > > On Oct 12, 2018, at 2:31 AM, Ben Bromhead <b...@instaclustr.com> wrote:
> > >
> > > This is something that's bugged me for ages, tbh the performance gain for
> > > most use cases far outweighs the increase in memory usage and I would even
> > > be in favor of changing the default now, optimizing the storage cost later
> > > (if it's found to be worth it).
> > >
> > > For some anecdotal evidence:
> > > 4kb is usually what we end setting it to, 16kb feels more reasonable given
> > > the memory impact, but what would be the point if practically, most folks
> > > set it to 4kb anyway?
> > >
> > > Note that chunk_length will largely be dependent on your read sizes, but 4k
> > > is the floor for most physical devices in terms of ones block size.
> > >
> > > +1 for making this change in 4.0 given the small size and the large
> > > improvement to new users experience (as long as we are explicit in the
> > > documentation about memory consumption).
> > >
> > >> On Thu, Oct 11, 2018 at 7:11 PM Ariel Weisberg <ar...@weisberg.ws> wrote:
> > >>
> > >> Hi,
> > >>
> > >> This is regarding https://issues.apache.org/jira/browse/CASSANDRA-13241
> > >>
> > >> This ticket has languished for a while. IMO it's too late in 4.0 to
> > >> implement a more memory efficient representation for compressed chunk
> > >> offsets. However I don't think we should put out another release with the
> > >> current 64k default as it's pretty unreasonable.
> > >>
> > >> I propose that we lower the value to 16kb. 4k might never be the correct
> > >> default anyways as there is a cost to compression and 16k will still be a
> > >> large improvement.
> > >>
> > >> Benedict and Jon Haddad are both +1 on making this change for 4.0. In the
> > >> past there has been some consensus about reducing this value although maybe
> > >> with more memory efficiency.
> > >>
> > >> The napkin math for what this costs is:
> > >> "If you have 1TB of uncompressed data, with 64k chunks that's 16M chunks
> > >> at 8 bytes each (128MB).
> > >> With 16k chunks, that's 512MB.
> > >> With 4k chunks, it's 2G.
> > >> Per terabyte of data (pre-compression)."
> > >>
> > >> https://issues.apache.org/jira/browse/CASSANDRA-13241?focusedCommentId=15886621&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15886621
> > >>
> > >> By way of comparison memory mapping the files has a similar cost per 4k
> > >> page of 8 bytes. Multiple mappings makes this more expensive. With a
> > >> default of 16kb this would be 4x less expensive than memory mapping a file.
> > >> I only mention this to give a sense of the costs we are already paying. I
> > >> am not saying they are directly related.
> > >>
> > >> I'll wait a week for discussion and if there is consensus make the change.
> > >>
> > >> Regards,
> > >> Ariel
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > >> For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >>
> > > --
> > > Ben Bromhead
> > > CTO | Instaclustr <https://www.instaclustr.com/>
> > > +1 650 284 9692
> > > Reliability at Scale
> > > Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer
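
For reference, the napkin math quoted in this thread can be reproduced with a few lines of Python. This is only a sketch under stated assumptions: binary units (1 TB taken as 2^40 bytes, 1k as 1024 bytes) and one 8-byte offset held in memory per compression chunk, as in the quoted figures. The function name `offset_memory_bytes` is made up for illustration and is not Cassandra code.

```python
# Sketch of the napkin math from the thread; not Cassandra code.
# Assumptions: binary units and one 8-byte offset kept in memory per chunk.

TIB = 2**40
KIB = 2**10


def offset_memory_bytes(uncompressed_bytes, chunk_length_kb, bytes_per_offset=8):
    """Memory needed to hold one offset per compression chunk."""
    chunk_bytes = chunk_length_kb * KIB
    # Round up: a partial trailing chunk still needs its own offset.
    chunks = -(-uncompressed_bytes // chunk_bytes)
    return chunks * bytes_per_offset


if __name__ == "__main__":
    for chunk_kb in (64, 16, 4):
        mib = offset_memory_bytes(TIB, chunk_kb) / 2**20
        print(f"{chunk_kb:>2}k chunks: {mib:.0f} MiB per TiB of uncompressed data")
```

Under those assumptions the three cases come out to 128 MiB, 512 MiB and 2048 MiB per terabyte of uncompressed data, matching the quoted numbers. Passing a smaller `bytes_per_offset` is one rough way to see what a more compact offset representation would save.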