I added a third node to the cluster. Sure enough, when I came in this morning only one node was up; on the other two the Cassandra process was not running.
There is nothing in the Cassandra log, but in /var/log/syslog I see the following.

On one node:
Mar 15 07:50:51 Cassandra3 kernel: [58566.666906] Out of memory: Kill process 2840 (java) score 383 or sacrifice child
Mar 15 07:50:51 Cassandra3 kernel: [58566.667066] Killed process 2840 (java) total-vm:956792kB, anon-rss:689752kB, file-rss:21680kB

And on the other:
Mar 14 18:36:02 Cassandra2 kernel: [16262.267300] Out of memory: Kill process 2611 (java) score 409 or sacrifice child
Mar 14 18:36:02 Cassandra2 kernel: [16262.267325] Killed process 2611 (java) total-vm:968040kB, anon-rss:748644kB, file-rss:18436kB

Two questions:
1. How can I prevent this? I guess my setup is limited and this may happen, but is there a way to improve things?
2. Assuming that I will run out of memory from time to time, how do I set up a monit or god task to restart Cassandra when it does?

Thanks,

*Tamar Fraenkel*
Senior Software Engineer, TOK Media

ta...@tok-media.com
Tel: +972 2 6409736
Mob: +972 54 8356490
Fax: +972 2 5612956

On Tue, Mar 13, 2012 at 11:12 AM, aaron morton <aa...@thelastpickle.com> wrote:

> If you are on Ubuntu it may be this
> http://wiki.apache.org/cassandra/FAQ#ubuntu_hangs
>
> otherwise I would look for GC problems.
>
> Cheers
>
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 13/03/2012, at 7:53 PM, Tamar Fraenkel wrote:
>
> Done it. Now it generally runs ok, till one of the nodes gets stuck with
> 100% cpu and I need to reboot it.
>
> Last lines in the system.log just before are:
> INFO [OptionalTasks:1] 2012-03-13 07:36:43,850 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='tok', ColumnFamily='tk_vertical_tag_story_indx') (estimated 35417890 bytes)
> INFO [OptionalTasks:1] 2012-03-13 07:36:43,869 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-tk_vertical_tag_story_indx@2002820169(1620316/35417890 serialized/live bytes, 30572 ops)
> INFO [FlushWriter:76] 2012-03-13 07:36:43,869 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@2002820169(1620316/35417890 serialized/live bytes, 30572 ops)
> INFO [FlushWriter:76] 2012-03-13 07:36:44,015 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-191-Data.db (2134123 bytes)
> INFO [OptionalTasks:1] 2012-03-13 07:37:37,886 MeteredFlusher.java (line 62) flushing high-traffic column family CFS(Keyspace='tok', ColumnFamily='tk_vertical_tag_story_indx') (estimated 34389135 bytes)
> INFO [OptionalTasks:1] 2012-03-13 07:37:37,887 ColumnFamilyStore.java (line 704) Enqueuing flush of Memtable-tk_vertical_tag_story_indx@1869953681(1573252/34389135 serialized/live bytes, 29684 ops)
> INFO [FlushWriter:76] 2012-03-13 07:37:37,887 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1869953681(1573252/34389135 serialized/live bytes, 29684 ops)
> INFO [FlushWrit
>
> Any idea?
> I am considering adding a third node, so that a replication factor of 2
> won't stall my system when one node goes down. Does it make sense?
>
> Thanks
>
> On Tue, Mar 6, 2012 at 7:51 PM, aaron morton <aa...@thelastpickle.com> wrote:
>
>> Reduce these settings for the CF
>>   row_cache (disable it)
>>   key_cache (disable it)
>>
>> Increase these settings for the CF
>>   bloom_filter_fp_chance
>>
>> Reduce these settings in cassandra.yaml
>>   flush_largest_memtables_at
>>   memtable_flush_queue_size
>>   sliced_buffer_size_in_kb
>>   in_memory_compaction_limit_in_mb
>>   concurrent_compactors
>>
>> Increase these settings
>>   index_interval
>>
>> While it obviously depends on load, I would not be surprised if you had a
>> lot of trouble running cassandra with that setup.
>>
>> Cheers
>> A
>>
>> On 6/03/2012, at 11:02 PM, Tamar Fraenkel wrote:
>>
>> Aaron, thanks for your response. I was afraid this was the issue.
>> Can you give me some direction on fine-tuning my VMs? I
>> would like to explore that option some more.
>> Thanks!
>>
>> On Tue, Mar 6, 2012 at 11:58 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>
>>> You do not have enough memory allocated to the JVM and are suffering
>>> from excessive GC as a result.
>>>
>>> There are some tuning things you can try, but 480MB is not enough. 1GB
>>> would be a better start, 2 better than that.
>>>
>>> Consider using https://github.com/pcmanus/ccm for testing multiple
>>> instances on a single server rather than a VM.
>>>
>>> Cheers
>>>
>>> On 6/03/2012, at 10:21 PM, Tamar Fraenkel wrote:
>>>
>>> I have some more info. After a couple of hours of running, the problematic
>>> node again went to 100% CPU and I had to reboot it. The last lines from the
>>> log show it did GC:
>>>
>>> INFO [ScheduledTasks:1] 2012-03-06 10:28:00,880 GCInspector.java (line 122) GC for Copy: 203 ms for 1 collections, 185983456 used; max is 513802240
>>> INFO [ScheduledTasks:1] 2012-03-06 10:28:50,595 GCInspector.java (line 122) GC for Copy: 3927 ms for 1 collections, 156572576 used; max is 513802240
>>> INFO [ScheduledTasks:1] 2012-03-06 10:28:55,434 StatusLogger.java (line 50) Pool Name Active Pending Blocked
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,298 StatusLogger.java (line 65) ReadStage 2 2 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,499 StatusLogger.java (line 65) RequestResponseStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReadRepairStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) MutationStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReplicateOnWriteStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) GossipStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) AntiEntropyStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MigrationStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) StreamStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MemtablePostFlusher 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) FlushWriter 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) MiscStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) InternalResponseStage 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) HintedHandoff 0 0 0
>>> INFO [ScheduledTasks:1] 2012-03-06 10:29:03,553 StatusLogger.java (line 69) CompactionManager n/a 0
>>>
>>> Thanks,
>>>
>>> On Tue, Mar 6, 2012 at 9:12 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>>>
>>>> Works.
>>>>
>>>> But during the night my setup encountered a problem.
>>>> I have two VMs in my cluster (running on VMware ESXi).
>>>> Each VM has 1 GB of memory and two virtual disks of 16 GB.
>>>> They are running on a small server with 4 CPUs (2.66 GHz) and 4 GB of
>>>> memory (together with two other VMs).
>>>> I put the Cassandra data on the second disk of each machine.
>>>> The VMs are running Ubuntu 11.10 and Cassandra 1.0.7.
>>>>
>>>> I left them running overnight, and this morning when I came in:
>>>> On one node Cassandra was down, and the last thing in the system.log is:
>>>>
>>>> INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java (line 113) Compacting [SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
>>>> INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java (line 221) Compacted to [/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,]. 32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at 8.144165MB/s. Time: 3,097ms.
>>>>
>>>> The other node was using all its CPU and I had to restart it.
>>>> After that, I can see that the last lines in its system.log say that
>>>> the other node is down:
>>>>
>>>> INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556 serialized/live bytes, 21173 ops)
>>>> INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930 bytes)
>>>> INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818) InetAddress /10.0.0.31 is now dead.
>>>>
>>>> How can I trace why that happened?
>>>> Also, I brought Cassandra up on both nodes. They both spent a long time
>>>> reading commit logs, but now they seem to run.
>>>> Any idea how to debug or improve my setup?
>>>> Thanks,
>>>> Tamar
>>>>
>>>> On Mon, Mar 5, 2012 at 7:30 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>
>>>>> Create nodes that do not share seeds, and give the clusters different
>>>>> names as a safety measure.
>>>>>
>>>>> Cheers
>>>>>
>>>>> On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:
>>>>>
>>>>> I want two separate clusters.
>>>>>
>>>>> On Mon, Mar 5, 2012 at 12:48 PM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>>
>>>>>> Do you want to create two separate clusters or a single cluster with
>>>>>> two data centres?
>>>>>>
>>>>>> If it's the latter, token selection is discussed here
>>>>>> http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
>>>>>>
>>>>>> Moreover all tokens must be unique (even across datacenters),
>>>>>> although - from pure curiosity - I wonder what is the rationale behind
>>>>>> this.
>>>>>>
>>>>>> Otherwise data is not evenly distributed.
>>>>>>
>>>>>> By the way, can someone enlighten me about the first line in the
>>>>>> output of the nodetool. Obviously it contains a token, but nothing else.
>>>>>> It seems like a formatting glitch, but maybe it has a role.
>>>>>>
>>>>>> It's the exclusive lower bound token for the first node in the ring.
>>>>>> This also happens to be the token for the last node in the ring.
>>>>>>
>>>>>> In your setup
>>>>>> 10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
>>>>>> 10.0.0.28 "owns" (0 + 1) to 85070591730234615865843651857942052864
>>>>>>
>>>>>> (does not imply primary replica, just used to map keys to nodes.)
>>>>>>
>>>>>> On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
>>>>>>
>>>>>> You have to use PropertyFileSnitch and NetworkTopologyStrategy to
>>>>>> create a multi-datacenter setup with two circles. You can start reading
>>>>>> from this page:
>>>>>>
>>>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
>>>>>>
>>>>>> Moreover all tokens must be unique (even across datacenters),
>>>>>> although - from pure curiosity - I wonder what is the rationale behind
>>>>>> this.
>>>>>>
>>>>>> By the way, can someone enlighten me about the first line in the
>>>>>> output of the nodetool. Obviously it contains a token, but nothing else.
>>>>>> It seems like a formatting glitch, but maybe it has a role.
>>>>>>
>>>>>> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>>>>>>
>>>>>> Hi!
>>>>>> I have a Cassandra cluster with two nodes
>>>>>>
>>>>>> nodetool ring -h localhost
>>>>>> Address    DC           Rack   Status  State   Load       Owns    Token
>>>>>>                                                                   85070591730234615865843651857942052864
>>>>>> 10.0.0.19  datacenter1  rack1  Up      Normal  488.74 KB  50.00%  0
>>>>>> 10.0.0.28  datacenter1  rack1  Up      Normal  504.63 KB  50.00%  85070591730234615865843651857942052864
>>>>>>
>>>>>> I want to create a second ring with the same name but two different
>>>>>> nodes.
>>>>>> Using tokengentool I get the same tokens, since they depend on
>>>>>> the number of nodes in a ring.
>>>>>>
>>>>>> My question is this:
>>>>>> Let's say I create two new VMs, with IPs 10.0.0.31 and 10.0.0.11.
>>>>>>
>>>>>> *In 10.0.0.31 cassandra.yaml I will set*
>>>>>> initial_token: 0
>>>>>> seeds: "10.0.0.31"
>>>>>> listen_address: 10.0.0.31
>>>>>> rpc_address: 0.0.0.0
>>>>>>
>>>>>> *In 10.0.0.11 cassandra.yaml I will set*
>>>>>> initial_token: 85070591730234615865843651857942052864
>>>>>> seeds: "10.0.0.31"
>>>>>> listen_address: 10.0.0.11
>>>>>> rpc_address: 0.0.0.0
>>>>>>
>>>>>> *Would the rings be separate?*
>>>>>>
>>>>>> Thanks,
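
For reference, here is a minimal sketch of where the two tokens in the quoted ring come from. It assumes the token space runs from 0 to 2**127 (which matches the tokens shown for a two-node ring) and that tokens are spaced evenly; the name `initial_tokens` is made up for illustration and is not a Cassandra tool.

```python
# Assumed size of the token space; half of it is the second token in the quoted ring.
RING_SIZE = 2 ** 127


def initial_tokens(node_count):
    """Return evenly spaced initial_token values for a ring of node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]


# Print candidate tokens for a two-node and a three-node ring.
for count in (2, 3):
    print(count, "nodes:")
    for token in initial_tokens(count):
        print("  initial_token:", token)
```

Because the result depends only on the node count, a second two-node cluster ends up with the same two tokens as the first one. The separation between the clusters comes from the seeds and the cluster name, as suggested in the quoted reply, not from the tokens.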
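
To see how much memory the Java process held when the kernel killed it, the syslog lines at the top of this message can be parsed with a few lines of Python. This is only a rough sketch and not a monit or god recipe: the regular expression is written against the two "Killed process" lines quoted above and is an assumption about the kernel's message format on other versions, and `parse_oom_kills` and `SAMPLE` are illustrative names.

```python
import re

# Pattern written against the "Killed process" lines quoted in this message.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kills(lines):
    """Extract pid, process name and memory figures (in kB) from OOM-killer lines."""
    kills = []
    for line in lines:
        match = KILLED_RE.search(line)
        if match is None:
            continue  # not a "Killed process" line
        anon_rss = int(match.group("anon_rss"))
        file_rss = int(match.group("file_rss"))
        kills.append({
            "pid": int(match.group("pid")),
            "name": match.group("name"),
            "total_vm_kb": int(match.group("total_vm")),
            "rss_kb": anon_rss + file_rss,  # resident memory at kill time
        })
    return kills


# The four syslog lines from this message.
SAMPLE = [
    "Mar 15 07:50:51 Cassandra3 kernel: [58566.666906] Out of memory: Kill process 2840 (java) score 383 or sacrifice child",
    "Mar 15 07:50:51 Cassandra3 kernel: [58566.667066] Killed process 2840 (java) total-vm:956792kB, anon-rss:689752kB, file-rss:21680kB",
    "Mar 14 18:36:02 Cassandra2 kernel: [16262.267300] Out of memory: Kill process 2611 (java) score 409 or sacrifice child",
    "Mar 14 18:36:02 Cassandra2 kernel: [16262.267325] Killed process 2611 (java) total-vm:968040kB, anon-rss:748644kB, file-rss:18436kB",
]

for kill in parse_oom_kills(SAMPLE):
    print(kill["name"], kill["pid"], "resident kB:", kill["rss_kb"])
```

On both nodes the Java process held roughly 700 MB or more of resident memory on a VM that has 1 GB in total, which fits the kernel running out of memory rather than Cassandra failing on its own.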