ad 3) I did this already yesterday (setcompactionthroughput as well), but the SSTables are still increasing.
ad 1) What do you think I should use: -pr, or should I try incremental repairs?

On Wed, Feb 18, 2015 at 4:54 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
> You are right... Repair makes the data consistent between nodes.
>
> I understand that you have two issues going on: you need to run repair periodically without errors, and you need to decrease the number of pending compactions.
>
> So I suggest:
>
> 1) Run repair -pr on all nodes. If you upgrade to the new 2.1.3, you can use incremental repairs. There were some bugs in 2.1.2.
> 2) Run cleanup on all nodes.
> 3) Since you have too many cold SSTables, set cold_reads_to_omit to 0.0, increase setcompactionthroughput for some time, and see if the number of SSTables goes down.
>
> Let us know what errors you are getting when running repairs.
>
> Regards,
>
> Roni Balthazar
>
> On Wed, Feb 18, 2015 at 1:31 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>> Can you explain to me what the correlation is between growing SSTables and repair? I was sure, until your mail, that repair is only there to make data consistent between nodes.
>>
>> Regards
>>
>> On Wed, Feb 18, 2015 at 4:20 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>> Which error are you getting when running repairs?
>>> You need to run repair on your nodes within gc_grace_seconds (e.g. weekly). They have data that is not read frequently. You can run "repair -pr" on all nodes. Since you do not have deletes, you will not have trouble with that. If you do have deletes, it is better to increase gc_grace_seconds before the repair.
>>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
>>> After the repair, try to run a "nodetool cleanup".
>>>
>>> Check if the number of SSTables goes down after that... Pending compactions should decrease as well...
>>>
>>> Cheers,
>>>
>>> Roni Balthazar
>>>
>>> On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>> 1) We tried to run repairs, but they usually do not succeed. We had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled compaction without SSDs. After this change we did not run any repair. Anyway, I don't think it will change anything in the SSTable count - if I am wrong, please let me know.
>>>>
>>>> 2) I did this. My tables are 99% write-only. It is an audit system.
>>>>
>>>> 3) Yes, I am using the default values.
>>>>
>>>> 4) In both operations I am using LOCAL_QUORUM.
>>>>
>>>> I am almost sure that the read timeouts happen because of too many SSTables. Anyway, first I would like to fix the issue of too many pending compactions. I still don't know how to speed them up.
>>>>
>>>> On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>> Are you running repairs within gc_grace_seconds? (The default is 10 days.)
>>>>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
>>>>>
>>>>> Double-check that you set cold_reads_to_omit to 0.0 on tables with STCS that you do not read often.
>>>>>
>>>>> Are you using the default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)?
>>>>>
>>>>> Which consistency level are you using for read operations? Check that you are not reading from DC_B due to your replication factor and CL.
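For reference, one quick way to confirm which DC actually serves the reads is to trace a sample query from cqlsh; the node, keyspace and table names below are only placeholders for this cluster:

    $ cqlsh some_node_in_DC_A
    cqlsh> CONSISTENCY LOCAL_QUORUM
    cqlsh> TRACING ON
    cqlsh> SELECT * FROM my_keyspace.my_table LIMIT 1;

The trace output lists every replica contacted, so it shows whether nodes in DC_B take part in the read path at LOCAL_QUORUM.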
>>>>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Roni Balthazar
>>>>>
>>>>> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>> I don't have problems with DC_B (the replica); only in DC_A (my system writes only to it) do I have read timeouts.
>>>>>>
>>>>>> I checked the SSTable count in OpsCenter and I have:
>>>>>> 1) In DC_A roughly the same (+/-10%) for the last week, with a small increase over the last 24h (it is more than 15000-20000 SSTables, depending on the node).
>>>>>> 2) In DC_B the last 24h show up to a 50% decrease, which is a nice prognosis. Now I have fewer than 1000 SSTables.
>>>>>>
>>>>>> What did you measure during system optimizations? Or do you have an idea what more I should check?
>>>>>> 1) I look at CPU idle (one node is 50% idle, the rest 70% idle).
>>>>>> 2) Disk queue -> mostly it is near zero: avg 0.09. Sometimes there are spikes.
>>>>>> 3) System RAM usage is almost full.
>>>>>> 4) In Total Bytes Compacted most lines are below 3 MB/s. For DC_A in total it is less than 10 MB/s; in DC_B it looks much better (the avg is around 17 MB/s).
>>>>>>
>>>>>> Anything else?
>>>>>>
>>>>>> On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> You can check if the number of SSTables is decreasing. Look for the "SSTable count" information for your tables using "nodetool cfstats". The compaction history can be viewed using "nodetool compactionhistory".
>>>>>>>
>>>>>>> About the timeouts, check this out:
>>>>>>> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
>>>>>>> Also try running "nodetool tpstats" to see the thread pool statistics. It can help you find out whether you have performance problems. If you have too many pending tasks or dropped messages, you may need to tune your system (e.g. driver timeouts, concurrent reads and so on).
>>>>>>>
>>>>>>> Regards,
>>>>>>>
>>>>>>> Roni Balthazar
>>>>>>>
>>>>>>> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>> Hi,
>>>>>>>> Thanks for your tip; it looks like something changed - I still don't know if it is OK.
>>>>>>>>
>>>>>>>> My nodes started to do more compactions, but it looks like some compactions are really slow. IO is idle and CPU is quite OK (30%-40%). We set compactionthroughput to 999, but I do not see a difference.
>>>>>>>>
>>>>>>>> Can we check something more? Or do you have any method to monitor the progress with small files?
>>>>>>>>
>>>>>>>> Regards
>>>>>>>>
>>>>>>>> On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Yes... I had the same issue, and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, most of them several gigabytes.
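For reference, a simple way to watch whether the same thing happens here is to track the SSTable count and the compaction backlog over time; the keyspace and table names are placeholders:

    nodetool cfstats my_keyspace.my_table | grep "SSTable count"
    nodetool compactionstats
    nodetool compactionhistory | tail -n 20

Running the first two every few minutes (e.g. via a cron job like the one suggested further down the thread) shows whether the compactions are actually catching up with the backlog.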
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Roni Balthazar
>>>>>>>>>
>>>>>>>>> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>> After some diagnostics (we have not yet set cold_reads_to_omit): compactions are running, but VERY slowly and with "idle" IO.
>>>>>>>>>>
>>>>>>>>>> We have a lot of data files in Cassandra. In DC_A it is about ~120000 (counting only xxx-Data.db); DC_B has only ~4000.
>>>>>>>>>>
>>>>>>>>>> I don't know if this changes anything, but:
>>>>>>>>>> 1) In DC_A the avg size of a Data.db file is ~13 MB. I have a few really big ones, but most are really small (almost 10000 files are less than 100 MB).
>>>>>>>>>> 2) In DC_B the avg size of a Data.db file is much bigger, ~260 MB.
>>>>>>>>>>
>>>>>>>>>> Do you think the above flag will help us?
>>>>>>>>>>
>>>>>>>>>> On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>>> I set setcompactionthroughput to 999 permanently and it doesn't change anything. IO is still the same; CPU is idle.
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>>>>>>>>> Hi,
>>>>>>>>>>>>
>>>>>>>>>>>> You can run "nodetool compactionstats" to view statistics on compactions. Setting cold_reads_to_omit to 0.0 can help reduce the number of SSTables when you use size-tiered compaction. You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy.
>>>>>>>>>>>>
>>>>>>>>>>>> From http://wiki.apache.org/cassandra/NodeTool:
>>>>>>>>>>>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>>>>>>>>>>>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> Roni Balthazar
>>>>>>>>>>>>
>>>>>>>>>>>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>>>>> One thing I do not understand: in my case compaction is running permanently. Is there a way to check which compactions are pending? The only information available is the total count.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>>>>>> Of course I made a mistake: I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I read about cold_reads_to_omit; it looks promising. Should I also set the compaction throughput?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> P.S. I am really sad that I didn't read this before:
>>>>>>>>>>>>>> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com> wrote:
>>>>>>>>>>>>>>> Hi, 100% in agreement with Roland.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The 2.1.x series is a pain! I would never recommend the current 2.1.x series for production.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Clocks are a pain, and check your connectivity! Also check tpstats to see if your thread pools are being overrun.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Carlos Juzarte Rolo
>>>>>>>>>>>>>>> Cassandra Consultant
>>>>>>>>>>>>>>> Pythian - Love your data
>>>>>>>>>>>>>>> rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>>>>>>>>>>>>>>> Tel: 1649
>>>>>>>>>>>>>>> www.pythian.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer <r.etzenham...@t-online.de> wrote:
>>>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1) Currently Cassandra 2.1.3, upgraded from 2.1.0 (suggested by Al Tobey from DataStax)
>>>>>>>>>>>>>>>> 7) minimal reads (usually none, sometimes a few)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Those two points keep me repeating an answer I got. First, where did you get 2.1.3 from? Maybe I missed it; I will have a look. But if it is 2.1.2, which is the latest released version, that version has many bugs - most of which bit me while testing 2.1.2. I had many problems with compactions not being triggered on column families that were not being read, and with compactions and repairs not completing. See:
>>>>>>>>>>>>>>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
>>>>>>>>>>>>>>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Apart from that, how are the two datacenters connected? Maybe there is a bottleneck.
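For reference, a rough way to compare clocks and spot overloaded thread pools across the cluster in one pass; the hostnames are placeholders and this assumes SSH access to every node:

    for h in node1 node2 node3; do
      echo "== $h =="
      ssh "$h" 'date -u; nodetool tpstats | grep -A 20 "Message type"'
    done

The dates should agree to within a second or so if ntp is healthy, and the "Message type / Dropped" section of tpstats shows whether any node is dropping mutations or reads.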
>>>>>>>>>>>>>>>> Also, do you have ntp up and running on all nodes to keep all clocks in tight sync?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Note: I'm no expert (yet) - just sharing my 2 cents.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>> Roland
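For anyone finding this thread later: the advice converged on above boils down to roughly the following sequence. The keyspace and table names are placeholders, and the repair/cleanup steps should be run one node at a time:

    # per mostly-cold table: keep STCS but stop omitting cold SSTables from compaction
    cqlsh -e "ALTER TABLE my_keyspace.my_table WITH compaction = {'class': 'SizeTieredCompactionStrategy', 'cold_reads_to_omit': '0.0'};"

    # temporarily raise the compaction throughput cap and watch the backlog drain
    nodetool setcompactionthroughput 999
    nodetool compactionstats

    # once pending compactions are under control, repair and clean up, node by node
    nodetool repair -pr my_keyspace
    nodetool cleanup my_keyspace

Raising the throughput cap only helps if the disks are not already saturated, which matches the "idle IO" observation earlier in the thread.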