Hi,

Yes, I had the same issue, and setting cold_reads_to_omit to 0.0 was the solution. The number of SSTables dropped from many thousands to fewer than a hundred, and most of the SSTables are now much bigger, at several gigabytes each.
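For reference, here is a rough sketch of how the change could be applied and checked. It is not taken from any official tooling: the function names cold_reads_cql and sstable_stats are made up for illustration, the keyspace and table names are placeholders, and it assumes the table uses SizeTieredCompactionStrategy on a version where cold_reads_to_omit is still a valid sub-option (2.0.x/2.1.x).

```shell
#!/bin/sh
# Hypothetical helper: print a CQL statement that sets cold_reads_to_omit
# to 0.0 for one table. Keyspace and table are supplied by the caller.
# Assumption: the table uses SizeTieredCompactionStrategy. ALTER TABLE
# replaces the whole compaction map, so any other compaction sub-options
# already set on the table would need to be repeated in the statement.
cold_reads_cql() {
    keyspace=$1
    table=$2
    printf "ALTER TABLE %s.%s WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': 0.0};\n" "$keyspace" "$table"
}

# Hypothetical helper: count the *-Data.db files under a directory and sum
# their sizes, to compare the SSTable count before and after the change.
# ls -ln prints the size in bytes in the fifth column.
sstable_stats() {
    dir=$1
    find "$dir" -type f -name '*-Data.db' -exec ls -ln {} + |
        awk '{ n++; bytes += $5 } END { printf "%d files, %d bytes total\n", n, bytes }'
}
```

The printed statement can be reviewed first and then passed to cqlsh, for example `cqlsh -e "$(cold_reads_cql my_keyspace my_table)"`, and `sstable_stats` can be pointed at the table's data directory (the path depends on your data_file_directories setting).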
Cheers,

Roni Balthazar

On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> After some diagnostics (we haven't set cold_reads_to_omit yet): compactions
> are running, but VERY slowly, with "idle" IO.
>
> We have a lot of "Data files" in Cassandra. DC_A has about ~120000 (only
> xxx-Data.db), while DC_B has only ~4000.
>
> I don't know if this changes anything, but:
> 1) in DC_A the avg size of a Data.db file is ~13 mb. I have a few really big
> ones, but most are really small (almost 10000 files are less than 100mb).
> 2) in DC_B the avg size of Data.db is much bigger, ~260mb.
>
> Do you think that the above flag will help us?
>
>
> On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>
>> I set setcompactionthroughput 999 permanently and it doesn't change
>> anything. IO is still the same. CPU is idle.
>>
>> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar <ronibaltha...@gmail.com>
>> wrote:
>>>
>>> Hi,
>>>
>>> You can run "nodetool compactionstats" to view statistics on compactions.
>>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
>>> SSTables when you use Size-Tiered compaction.
>>> You can also create a cron job to increase the value of
>>> setcompactionthroughput during the night or when your IO is not busy.
>>>
>>> From http://wiki.apache.org/cassandra/NodeTool:
>>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>>>
>>> Cheers,
>>>
>>> Roni Balthazar
>>>
>>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>>> > One thing I do not understand: in my case compaction is running
>>> > permanently.
>>> > Is there a way to check which compactions are pending? The only
>>> > information is about the total count.
>>> >
>>> >
>>> > On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com> wrote:
>>> >>
>>> >> Of course, I made a mistake. I am using 2.1.2. Anyway, a nightly build
>>> >> is available from
>>> >> http://cassci.datastax.com/job/cassandra-2.1/
>>> >>
>>> >> I read about cold_reads_to_omit. It looks promising. Should I also set
>>> >> the compaction throughput?
>>> >>
>>> >> P.S. I am really sad that I didn't read this before:
>>> >>
>>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>>> >>
>>> >>
>>> >> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com> wrote:
>>> >>>
>>> >>> Hi, 100% in agreement with Roland.
>>> >>>
>>> >>> The 2.1.x series is a pain! I would never recommend the current 2.1.x
>>> >>> series for production.
>>> >>>
>>> >>> Clocks are a pain, and check your connectivity! Also check tpstats to
>>> >>> see if your threadpools are being overrun.
>>> >>>
>>> >>> Regards,
>>> >>>
>>> >>> Carlos Juzarte Rolo
>>> >>> Cassandra Consultant
>>> >>>
>>> >>> Pythian - Love your data
>>> >>>
>>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
>>> >>> linkedin.com/in/carlosjuzarterolo
>>> >>> Tel: 1649
>>> >>> www.pythian.com
>>> >>>
>>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
>>> >>> <r.etzenham...@t-online.de> wrote:
>>> >>>>
>>> >>>> Hi,
>>> >>>>
>>> >>>> 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by
>>> >>>> Al Tobey from DataStax)
>>> >>>> 7) minimal reads (usually none, sometimes few)
>>> >>>>
>>> >>>> Those two points keep me repeating an answer I got. First, where did
>>> >>>> you get 2.1.3 from? Maybe I missed it, I will have a look. But if it
>>> >>>> is 2.1.2, which is the latest released version, that version has many
>>> >>>> bugs, and most of them bit me while testing 2.1.2. I had many
>>> >>>> problems with compactions not being triggered on column families that
>>> >>>> are not being read, and with compactions and repairs not completing.
>>> >>>> See
>>> >>>>
>>> >>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
>>> >>>>
>>> >>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
>>> >>>>
>>> >>>> Apart from that, how are the two datacenters connected? Maybe there
>>> >>>> is a bottleneck.
>>> >>>>
>>> >>>> Also, do you have ntp up and running on all nodes to keep all clocks
>>> >>>> in tight sync?
>>> >>>>
>>> >>>> Note: I'm no expert (yet) - just sharing my 2 cents.
>>> >>>>
>>> >>>> Cheers,
>>> >>>> Roland