I don't have problems with DC_B (the replica). Only in DC_A (my system writes only to it) do I have read timeouts.
I checked the SSTable count in OpsCenter and I have:
1) In DC_A roughly the same (+-10%) for the last week, with a small increase over the last 24h (it is more than 15000-20000 SSTables, depending on the node).
2) In DC_B the last 24h shows up to a 50% decrease, which is a promising prognosis. Now I have fewer than 1000 SSTables.

What did you measure during system optimizations? Or do you have an idea what more I should check?
1) I looked at CPU idle (one node is 50% idle, the rest 70% idle).
2) Disk queue -> mostly near zero: avg 0.09. Sometimes there are spikes.
3) System RAM usage is almost full.
4) In Total Bytes Compacted most lines are below 3MB/s. For DC_A in total it is less than 10MB/s; DC_B looks much better (avg around 17MB/s).

Anything else?

On Wed, Feb 18, 2015 at 1:32 PM, Roni Balthazar <ronibaltha...@gmail.com> wrote:
> Hi,
>
> You can check if the number of SSTables is decreasing. Look for the
> "SSTable count" information of your tables using "nodetool cfstats".
> The compaction history can be viewed using "nodetool compactionhistory".
>
> About the timeouts, check this out:
> http://www.datastax.com/dev/blog/how-cassandra-deals-with-replica-failure
> Also try running "nodetool tpstats" to see the thread pool statistics. It
> can help you find out whether you are having performance problems. If you
> are having too many pending tasks or dropped messages, maybe you will
> need to tune your system (e.g. the driver's timeout, concurrent reads and
> so on).
>
> Regards,
>
> Roni Balthazar
>
> On Wed, Feb 18, 2015 at 9:51 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> > Hi,
> > Thanks for your "tip"; it looks like something changed - I still don't
> > know if it is OK.
> >
> > My nodes started doing more compaction, but it looks like some
> > compactions are really slow.
> > IO is idle, and CPU is quite OK (30%-40%). We set compactionthroughput
> > to 999, but I do not see a difference.
> >
> > Can we check something more? Or do you have any method to monitor
> > progress with small files?
> >
> > Regards
> >
> > On Tue, Feb 17, 2015 at 2:43 PM, Roni Balthazar <ronibaltha...@gmail.com>
> > wrote:
> >> Hi,
> >>
> >> Yes... I had the same issue, and setting cold_reads_to_omit to 0.0 was
> >> the solution...
> >> The number of SSTables decreased from many thousands to a number below
> >> a hundred, and the SSTables are now much bigger, at several gigabytes
> >> (most of them).
> >>
> >> Cheers,
> >>
> >> Roni Balthazar
> >>
> >> On Tue, Feb 17, 2015 at 11:32 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> >> > After some diagnostics (we haven't set cold_reads_to_omit yet):
> >> > compactions are running, but VERY slowly, with "idle" IO.
> >> >
> >> > We have a lot of data files in Cassandra. DC_A has about ~120000
> >> > (counting only xxx-Data.db); DC_B has only ~4000.
> >> >
> >> > I don't know if this changes anything, but:
> >> > 1) In DC_A the avg size of a Data.db file is ~13MB. I have a few
> >> > really big ones, but most are really small (almost 10000 files are
> >> > less than 100MB).
> >> > 2) In DC_B the avg size of a Data.db file is much bigger, ~260MB.
> >> >
> >> > Do you think the above flag will help us?
> >> >
> >> > On Tue, Feb 17, 2015 at 9:04 AM, Ja Sam <ptrstp...@gmail.com> wrote:
> >> >> I set setcompactionthroughput 999 permanently and it doesn't change
> >> >> anything. IO is still the same. CPU is idle.
> >> >>
> >> >> On Tue, Feb 17, 2015 at 1:15 AM, Roni Balthazar
> >> >> <ronibaltha...@gmail.com> wrote:
> >> >>> Hi,
> >> >>>
> >> >>> You can run "nodetool compactionstats" to view statistics on
> >> >>> compactions.
> >> >>> Setting cold_reads_to_omit to 0.0 can help to reduce the number of
> >> >>> SSTables when you use Size-Tiered compaction.
> >> >>> You can also create a cron job to increase the value of
> >> >>> setcompactionthroughput during the night or when your IO is not busy.
> >> >>>
> >> >>> From http://wiki.apache.org/cassandra/NodeTool:
> >> >>> 0 0 * * * root nodetool -h `hostname` setcompactionthroughput 999
> >> >>> 0 6 * * * root nodetool -h `hostname` setcompactionthroughput 16
> >> >>>
> >> >>> Cheers,
> >> >>>
> >> >>> Roni Balthazar
> >> >>>
> >> >>> On Mon, Feb 16, 2015 at 7:47 PM, Ja Sam <ptrstp...@gmail.com> wrote:
> >> >>> > One thing I do not understand: in my case compaction is running
> >> >>> > permanently.
> >> >>> > Is there a way to check which compactions are pending? The only
> >> >>> > information available is the total count.
> >> >>> >
> >> >>> > On Monday, February 16, 2015, Ja Sam <ptrstp...@gmail.com> wrote:
> >> >>> >> Of course I made a mistake. I am using 2.1.2. Anyway, a nightly
> >> >>> >> build is available from
> >> >>> >> http://cassci.datastax.com/job/cassandra-2.1/
> >> >>> >>
> >> >>> >> I read about cold_reads_to_omit. It looks promising. Should I
> >> >>> >> also set compaction throughput?
> >> >>> >>
> >> >>> >> p.s. I am really sad that I didn't read this before:
> >> >>> >> https://engineering.eventbrite.com/what-version-of-cassandra-should-i-run/
> >> >>> >>
> >> >>> >> On Monday, February 16, 2015, Carlos Rolo <r...@pythian.com> wrote:
> >> >>> >>> Hi, 100% in agreement with Roland.
> >> >>> >>>
> >> >>> >>> The 2.1.x series is a pain! I would never recommend the current
> >> >>> >>> 2.1.x series for production.
> >> >>> >>>
> >> >>> >>> Clocks are a pain, and check your connectivity! Also check
> >> >>> >>> tpstats to see if your threadpools are being overrun.
> >> >>> >>>
> >> >>> >>> Regards,
> >> >>> >>>
> >> >>> >>> Carlos Juzarte Rolo
> >> >>> >>> Cassandra Consultant
> >> >>> >>>
> >> >>> >>> Pythian - Love your data
> >> >>> >>>
> >> >>> >>> rolo@pythian | Twitter: cjrolo | Linkedin:
> >> >>> >>> linkedin.com/in/carlosjuzarterolo
> >> >>> >>> Tel: 1649
> >> >>> >>> www.pythian.com
> >> >>> >>>
> >> >>> >>> On Mon, Feb 16, 2015 at 8:12 PM, Roland Etzenhammer
> >> >>> >>> <r.etzenham...@t-online.de> wrote:
> >> >>> >>>> Hi,
> >> >>> >>>>
> >> >>> >>>> 1) Actual Cassandra 2.1.3, it was upgraded from 2.1.0 (suggested
> >> >>> >>>> by Al Tobey from DataStax)
> >> >>> >>>> 7) minimal reads (usually none, sometimes a few)
> >> >>> >>>>
> >> >>> >>>> Those two points keep me repeating an answer I got. First, where
> >> >>> >>>> did you get 2.1.3 from? Maybe I missed it; I will have a look.
> >> >>> >>>> But if it is 2.1.2, which is the latest released version, that
> >> >>> >>>> version has many bugs - most of which I got kicked by while
> >> >>> >>>> testing 2.1.2. I got many problems with compactions not being
> >> >>> >>>> triggered on column families not being read, and with compactions
> >> >>> >>>> and repairs not being completed. See
> >> >>> >>>>
> >> >>> >>>> https://www.mail-archive.com/search?l=user@cassandra.apache.org&q=subject:%22Re%3A+Compaction+failing+to+trigger%22&o=newest&f=1
> >> >>> >>>> https://www.mail-archive.com/user%40cassandra.apache.org/msg40768.html
> >> >>> >>>>
> >> >>> >>>> Apart from that, how are those two datacenters connected? Maybe
> >> >>> >>>> there is a bottleneck.
> >> >>> >>>>
> >> >>> >>>> Also, do you have NTP up and running on all nodes to keep all
> >> >>> >>>> clocks in tight sync?
> >> >>> >>>>
> >> >>> >>>> Note: I'm no expert (yet) - just sharing my 2 cents.
> >> >>> >>>>
> >> >>> >>>> Cheers,
> >> >>> >>>> Roland
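For the "monitor progress with small files" question upthread, the on-disk -Data.db files can be summarized directly, mirroring the DC_A vs DC_B diagnostics (file count, how many are below 100MB, average size). A minimal sketch, not anything from the thread itself: the function name and the flat directory argument are my assumptions, and real Cassandra 2.1 data directories nest as data/<keyspace>/<table>/, so you would point it at a table directory or extend the glob.

```shell
# Hypothetical helper: count SSTable data files in a directory, how many
# fall below a size threshold (default 100MB, matching the thread's DC_A
# observation), and their average size in bytes.
sstable_summary() {
  dir="$1"
  threshold="${2:-104857600}"   # 100MB default; second arg overrides
  total=0; small=0; bytes=0
  for f in "$dir"/*-Data.db; do
    [ -e "$f" ] || continue                  # glob matched nothing
    sz=$(wc -c < "$f" | tr -d ' ')           # portable file size in bytes
    total=$((total + 1))
    bytes=$((bytes + sz))
    if [ "$sz" -lt "$threshold" ]; then
      small=$((small + 1))
    fi
  done
  avg=0
  [ "$total" -gt 0 ] && avg=$((bytes / total))
  echo "$total $small $avg"                  # e.g. "120000 100000 13631488"
}
```

Running it before and after a compaction pass (or from the same cron that adjusts setcompactionthroughput) gives a crude but direct view of whether the many small files are being merged into fewer large ones.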