Which error are you getting when running repairs? You need to run repair on your nodes within gc_grace_seconds (e.g. weekly). They have data that is not read frequently. You can run "repair -pr" on all nodes. Since you do not have deletes, you will not have trouble with that. If you do have deletes, it is better to increase gc_grace_seconds before the repair.

http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html

After the repair, try to run a "nodetool cleanup".
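
To make that concrete, a minimal sketch of the weekly pass could look like the lines below (the hostnames and the keyspace name are placeholders; run it one node at a time so you do not overload the cluster):

    # primary-range repair on each node, then clean up data the node no longer owns
    for host in node1 node2 node3; do
        ssh "$host" nodetool repair -pr my_keyspace
        ssh "$host" nodetool cleanup my_keyspace
    done
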
Check if the number of SSTables goes down after that... Pending compactions must decrease as well...

Cheers,

Roni Balthazar

On Wed, Feb 18, 2015 at 12:39 PM, Ja Sam <ptrstp...@gmail.com> wrote:
> 1) We tried to run repairs, but they usually do not succeed. We had Leveled compaction before. Last week we ALTERed the tables to STCS, because the guys from DataStax suggested that we should not use Leveled compaction without SSDs and should switch the tables to STCS. After this change we did not run any repair. Anyway, I don't think it will change anything in the SSTable count - if I am wrong, please let me know.
>
> 2) I did this. My tables are 99% write-only. It is an audit system.
>
> 3) Yes, I am using the default values.
>
> 4) In both operations I am using LOCAL_QUORUM.
>
> I am almost sure that the READ timeouts happen because of too many SSTables. Anyway, first I would like to fix the number of pending compactions. I still don't know how to speed them up.
>
> On Wed, Feb 18, 2015 at 2:49 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>> Are you running repairs within gc_grace_seconds? (default is 10 days)
>>
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/operations/ops_repair_nodes_c.html
>>
>> Double-check that you set cold_reads_to_omit to 0.0 on the tables with STCS that you do not read often.
>>
>> Are you using the default values for the properties min_compaction_threshold (4) and max_compaction_threshold (32)?
>>
>> Which consistency level are you using for read operations? Check that you are not reading from DC_B due to your replication factor and CL.
>>
>> http://www.datastax.com/documentation/cassandra/2.0/cassandra/dml/dml_config_consistency_c.html
>>
>> Cheers,
>>
>> Roni Balthazar
>>
>> On Wed, Feb 18, 2015 at 11:07 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>> I don't have problems with DC_B (the replica), only in DC_A (my system writes only to it) do I have read timeouts.
>>>
>>> I checked the SSTable count in OpsCenter and I have:
>>> 1) in DC_A roughly the same (+/-10%) for the last week, with a small increase over the last 24h (it is more than 15000-20000 SSTables, depending on the node)
>>> 2) in DC_B the last 24h shows up to a 50% decrease, which is a good prognosis. Now I have fewer than 1000 SSTables.
>>>
>>> What did you measure during system optimizations? Or do you have an idea what more I should check?
>>> 1) I looked at CPU idle (one node is 50% idle, the rest 70% idle)
>>> 2) Disk queue -> mostly near zero: avg 0.09. Sometimes there are spikes.
>>> 3) System RAM usage is almost full.
>>> 4) In Total Bytes Compacted most lines are below 3 MB/s. For DC_A as a whole it is less than 10 MB/s; DC_B looks much better (avg is about 17 MB/s).
>>>
>>> Something else?
>>>
>>> On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>> Hi,
>>>>
>>>> You can check whether the number of SSTables is decreasing. Look for the "SSTable count" information for your tables using "nodetool cfstats". The compaction history can be viewed using "nodetool compactionhistory".
>>>>
>>>> About the timeouts, check this out:
>>>> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
>>>>
>>>> Also try running "nodetool tpstats" to see the thread pool statistics. It can help you find out whether you are having performance problems.
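
For example, the checks quoted above boil down to a few one-liners (the keyspace and table names are placeholders for one of your audit tables):

    nodetool cfstats my_keyspace.my_table | grep "SSTable count"
    nodetool compactionhistory | tail -n 20
    nodetool tpstats    # watch the Pending/Blocked columns and the dropped message counts at the bottom
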
>>>> If you are having too many pending tasks or dropped messages, maybe you will need to tune your system (e.g. the driver's timeouts, concurrent reads and so on).
>>>>
>>>> Regards,
>>>>
>>>> Roni Balthazar
>>>>
>>>> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>> Hi,
>>>>>
>>>>> Thanks for your "tip" - it looks like something changed, although I still don't know if it is OK.
>>>>>
>>>>> My nodes started doing more compaction, but it looks like some compactions are really slow. IO is idle, CPU is quite OK (30%-40%). We set compactionthroughput to 999, but I do not see a difference.
>>>>>
>>>>> Can we check something more? Or do you have any method to monitor progress with the small files?
>>>>>
>>>>> Regards
>>>>>
>>>>> On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>>> Hi,
>>>>>>
>>>>>> Yes... I had the same issue, and setting cold_reads_to_omit to 0.0 was the solution... The number of SSTables decreased from many thousands to a number below a hundred, and the SSTables are now much bigger, most of them several gigabytes.
>>>>>>
>>>>>> Cheers,
>>>>>>
>>>>>> Roni Balthazar
>>>>>>
>>>>>> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>> Some diagnostics (we have not set cold_reads_to_omit yet): compactions are running, but VERY slowly, with "idle" IO.
>>>>>>>
>>>>>>> We have a lot of "Data files" in Cassandra. In DC_A it is about ~120000 (counting only xxx-Data.db); DC_B has only ~4000.
>>>>>>>
>>>>>>> I don't know if this changes anything, but:
>>>>>>> 1) in DC_A the avg size of a Data.db file is ~13 MB. I have a few really big ones, but most are really small (almost 10000 files are less than 100 MB).
>>>>>>> 2) in DC_B the avg size of a Data.db file is much bigger, ~260 MB.
>>>>>>>
>>>>>>> Do you think that the above flag will help us?
>>>>>>>
>>>>>>> On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>> I set setcompactionthroughput to 999 permanently and it doesn't change anything. IO is still the same. CPU is idle.
>>>>>>>>
>>>>>>>> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> You can run "nodetool compactionstats" to view statistics on compactions.
>>>>>>>>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of SSTables when you use Size-Tiered compaction.
>>>>>>>>> You can also create a cron job to increase the value of setcompactionthroughput during the night or when your IO is not busy.
>>>>>>>>>
>>>>>>>>> From http://wiki.apache.org/cassandra/NodeTool:
>>>>>>>>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
>>>>>>>>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
>>>>>>>>>
>>>>>>>>> Cheers,
>>>>>>>>>
>>>>>>>>> Roni Balthazar
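
If it helps, on 2.1 cold_reads_to_omit is a subproperty of the table's compaction settings, so the change would look roughly like this (the keyspace and table names are placeholders; as far as I remember, the ALTER replaces the whole compaction map, so carry over any other subproperties you have customised):

    cqlsh -e "ALTER TABLE my_keyspace.my_table
              WITH compaction = {'class': 'SizeTieredCompactionStrategy',
                                 'cold_reads_to_omit': '0.0'};"
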
>>>>>>>>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>> One thing I do not understand: in my case compaction is running permanently. Is there a way to check which compactions are pending? The only information available is the total count.
>>>>>>>>>>
>>>>>>>>>> On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com> wrote:
>>>>>>>>>>> Of course I made a mistake. I am using 2.1.2. Anyway, a nightly build is available from http://cassci.datastax.com/job/cassandra-2.1/
>>>>>>>>>>>
>>>>>>>>>>> I read about cold_reads_to_omit. It looks promising. Should I also set the compaction throughput?
>>>>>>>>>>>
>>>>>>>>>>> p.s. I am really sad that I didn't read this before:
>>>>>>>>>>> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
>>>>>>>>>>>
>>>>>>>>>>> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com> wrote:
>>>>>>>>>>>> Hi, 100% in agreement with Roland.
>>>>>>>>>>>>
>>>>>>>>>>>> The 2.1.x series is a pain! I would never recommend the current 2.1.x series for production.
>>>>>>>>>>>>
>>>>>>>>>>>> Clocks are a pain, and check your connectivity! Also check tpstats to see if your thread pools are being overrun.
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>>
>>>>>>>>>>>> Carlos Juzarte Rolo
>>>>>>>>>>>> Cassandra Consultant
>>>>>>>>>>>> Pythian - Love your data
>>>>>>>>>>>> rolo@pythian | Twitter: cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>>>>>>>>>>>> Tel: 1649
>>>>>>>>>>>> www.pythian.com
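
On the connectivity point, a quick sanity check of the DC_A to DC_B link could be something like the lines below (the address is a placeholder for a node in the other datacenter; iperf3 is only an option if it happens to be installed on both ends):

    # from a node in DC_A towards a node in DC_B
    ping -c 50 10.20.0.11        # latency and packet loss across the link
    # if iperf3 is available on both ends, measure raw throughput:
    #   DC_B node:  iperf3 -s
    #   DC_A node:  iperf3 -c 10.20.0.11
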
>>>>>>>>>>>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer <r.etzenham...@t-online.de> wrote:
>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1) Currently Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested by Al Tobey from DataStax)
>>>>>>>>>>>>> 7) minimal reads (usually none, sometimes a few)
>>>>>>>>>>>>>
>>>>>>>>>>>>> Those two points keep me repeating an answer I got. First, where did you get 2.1.3 from? Maybe I missed it, I will have a look. But if it is 2.1.2, which is the latest released version, that version has many bugs - most of them I got hit by while testing 2.1.2. I got many problems with compactions not being triggered on column families that are not being read, and with compactions and repairs not completing. See:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
>>>>>>>>>>>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> Apart from that, how are those two datacenters connected? Maybe there is a bottleneck.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Also, do you have ntp up and running on all nodes to keep all clocks in tight sync?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Note: I'm no expert (yet) - just sharing my 2 cents.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>> Roland
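
On the ntp question, a quick way to eyeball clock drift across the nodes, assuming ntpd/ntpq is in use (the hostnames are placeholders):

    # print the offset (in ms) of each node against its selected ntp peer
    for host in node1 node2 node3; do
        echo -n "$host: "
        ssh "$host" ntpq -pn | awk '/^\*/ {print "offset", $9, "ms"}'
    done

Anything beyond a few tens of milliseconds of drift between nodes is worth fixing before chasing the other issues, since writes resolve conflicts by timestamp.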