> I checked the log, and found some ERROR about network problems,
> and some ERROR about "Keys must not be empty".

Do you have the full error stack ?

Cheers

-----------------
Aaron Morton
Freelance Cassandra Developer
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 25/11/2012, at 4:14 AM, Chuan-Heng Hsiao <hsiao.chuanh...@gmail.com> wrote:

> Hi Cassandra Devs,
>
> After trying to setup the same settings (and importing same data)
> to the 3 VMs on the same machine instead of 3 physical machines,
> so far I couldn't replicate the exploded-commitlog situation.
>
> On my 4-physical-machine setting, everything seems to be
> back to normal (commitlog size is less than the expected max setting)
> after restarting the nodes.
>
> This time the size of the commitlog of one node is set as 4G, and the
> others are set as 8G.
>
> Few days ago the node with 4G got exploded as 5+G. (the 8G nodes remain at
> 8G).
> I checked the log, and found some ERROR about network problems,
> and some ERROR about "Keys must not be empty".
>
> I suspect that besides the network problems,
> the "Keys must not be empty" ERROR may be the main reason why
> the commitlog continues growing.
> (I've already ensured that the Keys must not be empty in my code,
> so the problem may be raised when syncing internally in cassandra.)
>
> I restarted the 4G node as 8G node. Because there is no huge traffic since
> then, I am not sure whether increasing the commitlog size will
> solve/reduce this problem or not yet.
> I'll keep you posted once the commitlog get exploded again.
>
> Sincerely,
> Hsiao
>
>
> On Mon, Nov 19, 2012 at 11:21 AM, Chuan-Heng Hsiao
> <hsiao.chuanh...@gmail.com> wrote:
>> I have RF = 3. Read/Write consistency has already been set as TWO.
>>
>> It did seem that the data were not consistent yet.
>> (There are some CFs that I expected empty after the operations, but still
>> got some data, and the number of data were decreasing after retrying
>> to get all data
>> from that CF)
>>
>> Sincerely,
>> Hsiao
>>
>>
>> On Mon, Nov 19, 2012 at 11:14 AM, Tupshin Harper <tups...@tupshin.com> wrote:
>>> What consistency level are you writing with? If you were writing with ANY,
>>> try writing with a higher consistency level.
>>>
>>> -Tupshin
>>>
>>> On Nov 18, 2012 9:05 PM, "Chuan-Heng Hsiao" <hsiao.chuanh...@gmail.com>
>>> wrote:
>>>>
>>>> Hi Aaron,
>>>>
>>>> Thank you very much for the replying.
>>>>
>>>> The 700 CFs were created in the beginning (before any insertion.)
>>>>
>>>> I did not do anything with commitlog_archiving.properties, so I guess
>>>> I was not using commit log archiving.
>>>>
>>>> What I did was doing a lot of insertions (and some deletions)
>>>> using another 4 machines with 32 processes in total.
>>>> (There are 4 nodes in my setting, so there are 8 machines in total)
>>>>
>>>> I did see huge logs in /var/log/cassandra after such huge amount of
>>>> insertions.
>>>> Right now I can't distinguish whether single insertion also cause huge
>>>> logs.
>>>>
>>>> nodetool flush hanged (maybe because of 200G+ commitlog)
>>>>
>>>> Because these machines are not in production (guaranteed no more
>>>> insertion/deletion)
>>>> I ended up restarting cassandra one node each time, the commitlog
>>>> shrinked back to
>>>> 4G. I am doing repair on each node now.
>>>>
>>>> I'll try to re-import and keep logs when the commitlog increases insanely
>>>> again.
>>>>
>>>> Sincerely,
>>>> Hsiao
>>>>
>>>>
>>>> On Mon, Nov 19, 2012 at 3:19 AM, aaron morton <aa...@thelastpickle.com>
>>>> wrote:
>>>>> I am wondering whether the huge commitlog size is the expected behavior
>>>>> or
>>>>> not?
>>>>>
>>>>> Nope.
>>>>>
>>>>> Did you notice the large log size during or after the inserts ?
>>>>> If after did the size settle ?
>>>>> Are you using commit log archiving ?
>>>>> (in commitlog_archiving.properties)
>>>>>
>>>>> and around 700 mini column family (around 10M in data_file_directories)
>>>>>
>>>>> Can you describe how you created the 700 CF's ?
>>>>>
>>>>> and how can we reduce the size of commitlog?
>>>>>
>>>>> As a work around nodetool flush should checkpoint the log.
>>>>>
>>>>> Cheers
>>>>>
>>>>> -----------------
>>>>> Aaron Morton
>>>>> Freelance Cassandra Developer
>>>>> New Zealand
>>>>>
>>>>> @aaronmorton
>>>>> http://www.thelastpickle.com
>>>>>
>>>>> On 17/11/2012, at 2:30 PM, Chuan-Heng Hsiao <hsiao.chuanh...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> hi Cassandra Developers,
>>>>>
>>>>> I am experiencing huge commitlog size (200+G) after inserting huge
>>>>> amount of data.
>>>>> It is a 4-node cluster with RF= 3, and currently each has 200+G commit
>>>>> log (so there are around 1T commit log in total)
>>>>>
>>>>> The setting of commitlog_total_space_in_mb is default.
>>>>>
>>>>> I am using 1.1.6.
>>>>>
>>>>> I did not do nodetool cleanup and nodetool flush yet, but
>>>>> I did nodetool repair -pr for each column family.
>>>>>
>>>>> There is 1 huge column family (around 68G in data_file_directories),
>>>>> and 18 mid-huge column family (around 1G in data_file_directories)
>>>>> and around 700 mini column family (around 10M in data_file_directories)
>>>>>
>>>>> I am wondering whether the huge commitlog size is the expected behavior
>>>>> or
>>>>> not?
>>>>> and how can we reduce the size of commitlog?
>>>>>
>>>>> Sincerely,
>>>>> Hsiao
>>>>>
>>>>>
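
For anyone following the thread who wants to catch this kind of growth early, here is a rough sketch of a check that compares the on-disk size of the commit log directory with the configured cap. It is not part of Cassandra or nodetool. The function names are made up for illustration, and both the directory path and the 4096 MB cap are assumptions that should be replaced with the commitlog_directory and commitlog_total_space_in_mb values from the node's own cassandra.yaml.

```python
import os


def directory_size_mb(path):
    """Return the total size of regular files under path, in MiB."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                total += os.path.getsize(full)
            except OSError:
                # A segment may be recycled or deleted while we scan; skip it.
                continue
    return total / (1024 * 1024)


def commitlog_over_cap(path, cap_mb):
    """Return (size_mb, exceeded) for the commit log directory at path."""
    size_mb = directory_size_mb(path)
    return size_mb, size_mb > cap_mb


if __name__ == "__main__":
    # Assumed values: adjust both to match the node's cassandra.yaml.
    size_mb, exceeded = commitlog_over_cap("/var/lib/cassandra/commitlog", 4096)
    print("commitlog: %.1f MiB, over cap: %s" % (size_mb, exceeded))
```

Run periodically on each node (from cron, for example), this would show whether the directory is growing past its cap during the inserts or only afterwards, which is the question asked earlier in the thread.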