Reduce these settings for the CF:
row_cache (disable it)
key_cache (disable it)

Increase these settings for the CF:
bloom_filter_fp_chance

Reduce these settings in cassandra.yaml:
flush_largest_memtables_at
memtable_flush_queue_size
sliced_buffer_size_in_kb
in_memory_compaction_limit_in_mb
concurrent_compactors

Increase these settings in cassandra.yaml:
index_interval

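A rough sketch of both sets of changes (Cassandra 1.0.x; the values are
illustrative starting points, not tuned recommendations). The CF side from
cassandra-cli, using one of your CFs as an example:

    update column family tk_vertical_tag_story_indx
      with rows_cached = 0
      and keys_cached = 0
      and bloom_filter_fp_chance = 0.1;

And the cassandra.yaml side:

    flush_largest_memtables_at: 0.50       # default 0.75; flush sooner under heap pressure
    memtable_flush_queue_size: 2           # default 4
    sliced_buffer_size_in_kb: 32           # default 64
    in_memory_compaction_limit_in_mb: 16   # default 64
    concurrent_compactors: 1               # default is one per core
    index_interval: 512                    # default 128; larger interval = smaller index sample in memory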

While it obviously depends on load, I would not be surprised if you had a lot 
of trouble running Cassandra with that setup. 

Cheers
A


-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 6/03/2012, at 11:02 PM, Tamar Fraenkel wrote:

> Aaron, thanks for your response. I was afraid this was the issue.
> Can you give me some direction regarding the fine-tuning of my VMs? I would 
> like to explore that option some more.
> Thanks!
> 
> Tamar Fraenkel 
> Senior Software Engineer, TOK Media 
> 
> 
> ta...@tok-media.com
> Tel:   +972 2 6409736 
> Mob:  +972 54 8356490 
> Fax:   +972 2 5612956 
> 
> 
> 
> 
> 
> On Tue, Mar 6, 2012 at 11:58 AM, aaron morton <aa...@thelastpickle.com> wrote:
> You do not have enough memory allocated to the JVM and are suffering from 
> excessive GC as a result.
> 
> There are some tuning things you can try, but 480MB is not enough. 1GB would 
> be a better start, and 2GB better than that. 
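> For example, in the stock conf/cassandra-env.sh you can pin the heap
> explicitly instead of letting it be auto-sized from system memory (the
> sizes here are just a starting point):
> 
>     MAX_HEAP_SIZE="1G"
>     HEAP_NEWSIZE="100M"    # rough rule of thumb: ~100MB per physical core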
> 
> Consider using https://github.com/pcmanus/ccm for testing multiple instances 
> on a single server rather than a VM.
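> For example (exact flags may vary with your ccm version):
> 
>     ccm create test -v 1.0.7 -n 2 -s    # create and start a 2 node 1.0.7 cluster
>     ccm node1 ring                      # run nodetool ring against node1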
> 
> Cheers
> 
> -----------------
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 6/03/2012, at 10:21 PM, Tamar Fraenkel wrote:
> 
>> I have some more info: after a couple of hours running, the problematic node 
>> again hit 100% CPU and I had to reboot it. The last lines in the log show it 
>> did GC:
>> 
>>  INFO [ScheduledTasks:1] 2012-03-06 10:28:00,880 GCInspector.java (line 122) GC for Copy: 203 ms for 1 collections, 185983456 used; max is 513802240
>>  INFO [ScheduledTasks:1] 2012-03-06 10:28:50,595 GCInspector.java (line 122) GC for Copy: 3927 ms for 1 collections, 156572576 used; max is 513802240
>>  INFO [ScheduledTasks:1] 2012-03-06 10:28:55,434 StatusLogger.java (line 50) Pool Name                    Active   Pending   Blocked
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,298 StatusLogger.java (line 65) ReadStage                         2         2         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,499 StatusLogger.java (line 65) RequestResponseStage              0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReadRepairStage                   0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) MutationStage                     0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) ReplicateOnWriteStage             0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,500 StatusLogger.java (line 65) GossipStage                       0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) AntiEntropyStage                  0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MigrationStage                    0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) StreamStage                       0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,501 StatusLogger.java (line 65) MemtablePostFlusher               0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) FlushWriter                       0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) MiscStage                         0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) InternalResponseStage             0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,502 StatusLogger.java (line 65) HintedHandoff                     0         0         0
>>  INFO [ScheduledTasks:1] 2012-03-06 10:29:03,553 StatusLogger.java (line 69) CompactionManager               n/a         0
>> 
>> Thanks,
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> 
>> ta...@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
>> 
>> 
>> On Tue, Mar 6, 2012 at 9:12 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:
>> Works.
>> 
>> But during the night my setup encountered a problem.
>> I have two VMs in my cluster (running on VMware ESXi).
>> Each VM has 1GB memory and two 16GB virtual disks.
>> They run on a small server with 4 CPUs (2.66 GHz) and 4GB memory, together 
>> with two other VMs.
>> I put the Cassandra data on the second disk of each machine.
>> The VMs are running Ubuntu 11.10 and Cassandra 1.0.7.
>> 
>> I left them running overnight, and this morning when I came in:
>> On one node Cassandra was down, and the last thing in system.log is:
>> 
>>  INFO [CompactionExecutor:150] 2012-03-06 00:55:04,821 CompactionTask.java (line 113) Compacting [SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1243-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1245-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1242-Data.db'), SSTableReader(path='/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db')]
>>  INFO [CompactionExecutor:150] 2012-03-06 00:55:07,919 CompactionTask.java (line 221) Compacted to [/opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1246-Data.db,].  32,424,771 to 26,447,685 (~81% of original) bytes for 58,938 keys at 8.144165MB/s.  Time: 3,097ms.
>> 
>> 
>> The other node was using all its CPU and I had to restart it.
>> After that, I can see that the last lines in its system.log say that the 
>> other node is down:
>> 
>>  INFO [FlushWriter:142] 2012-03-06 00:55:02,418 Memtable.java (line 246) Writing Memtable-tk_vertical_tag_story_indx@1365852701(1122169/25154556 serialized/live bytes, 21173 ops)
>>  INFO [FlushWriter:142] 2012-03-06 00:55:02,742 Memtable.java (line 283) Completed flushing /opt/cassandra/data/tok/tk_vertical_tag_story_indx-hc-1244-Data.db (2075930 bytes)
>>  INFO [GossipTasks:1] 2012-03-06 08:02:18,584 Gossiper.java (line 818) InetAddress /10.0.0.31 is now dead.
>> 
>> How can I trace why that happened?
>> Also, I brought Cassandra up on both nodes. They both spent a long time 
>> replaying commit logs, but now they seem to be running.
>> Any ideas on how to debug or improve my setup?
>> Thanks,
>> Tamar
>> 
>> 
>> 
>> Tamar Fraenkel 
>> Senior Software Engineer, TOK Media 
>> 
>> 
>> 
>> ta...@tok-media.com
>> Tel:   +972 2 6409736 
>> Mob:  +972 54 8356490 
>> Fax:   +972 2 5612956 
>> 
>> 
>> 
>> 
>> 
>> On Mon, Mar 5, 2012 at 7:30 PM, aaron morton <aa...@thelastpickle.com> wrote:
>> Create nodes that do not share seeds, and give the clusters different names 
>> as a safety measure. 
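>> A minimal sketch of the yaml on both nodes of the second cluster (the 
>> name here is only illustrative):
>> 
>>     cluster_name: 'TokTest2'       # must differ from the first cluster's name
>>     seeds: "10.0.0.31"             # seed list with no node from the first ring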
>> 
>> Cheers
>> 
>> -----------------
>> Aaron Morton
>> Freelance Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote:
>> 
>>> I want two separate clusters.
>>> Tamar Fraenkel 
>>> Senior Software Engineer, TOK Media 
>>> 
>>> 
>>> 
>>> ta...@tok-media.com
>>> Tel:   +972 2 6409736 
>>> Mob:  +972 54 8356490 
>>> Fax:   +972 2 5612956 
>>> 
>>> 
>>> 
>>> 
>>> 
>>> On Mon, Mar 5, 2012 at 12:48 PM, aaron morton <aa...@thelastpickle.com> 
>>> wrote:
>>> Do you want to create two separate clusters or a single cluster with two 
>>> data centres ? 
>>> 
>>> If it's the latter, token selection is discussed here: 
>>> http://www.datastax.com/docs/1.0/install/cluster_init#token-gen-cassandra
>>>  
>>>> Moreover, all tokens must be unique (even across datacenters), although, 
>>>> out of pure curiosity, I wonder what the rationale behind this is.
>>> Otherwise data is not evenly distributed.
>>> 
>>>> By the way, can someone enlighten me about the first line in the output of 
>>>> nodetool? Obviously it contains a token but nothing else. It seems 
>>>> like a formatting glitch, but maybe it has a role. 
>>> It's the exclusive lower bound token for the first node in the ring. This 
>>> also happens to be the token for the last node in the ring. 
>>> 
>>> In your setup 
>>> 10.0.0.19 "owns" (85070591730234615865843651857942052864+1) to 0
>>> 10.0.0.28 "owns"  (0 + 1) to 85070591730234615865843651857942052864
>>> 
>>> ("Owns" does not imply the primary replica; the token is just used to map 
>>> keys to nodes.)
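>>> For reference, evenly spaced RandomPartitioner tokens are simply 
>>> i * (2**127 / N) for node i of N, which is why any two 2-node rings get 
>>> the same pair of tokens:
>>> 
>>>     $ python -c "print [i * (2**127 / 2) for i in range(2)]"
>>>     [0L, 85070591730234615865843651857942052864L]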
>>>  
>>> 
>>> 
>>> -----------------
>>> Aaron Morton
>>> Freelance Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>> 
>>> On 5/03/2012, at 11:38 PM, Hontvári József Levente wrote:
>>> 
>>>> You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a 
>>>> multi-datacenter setup with two rings. You can start reading from this 
>>>> page:
>>>> http://www.datastax.com/docs/1.0/cluster_architecture/replication#about-replica-placement-strategy
>>>> 
>>>> Moreover, all tokens must be unique (even across datacenters), although, 
>>>> out of pure curiosity, I wonder what the rationale behind this is.
>>>> 
>>>> By the way, can someone enlighten me about the first line in the output of 
>>>> nodetool? Obviously it contains a token but nothing else. It seems like a 
>>>> formatting glitch, but maybe it has a role. 
>>>> 
>>>> On 2012.03.05. 11:06, Tamar Fraenkel wrote:
>>>>> Hi!
>>>>> I have a Cassandra cluster with two nodes:
>>>>> 
>>>>> nodetool ring -h localhost
>>>>> Address         DC          Rack        Status State   Load       Owns    Token
>>>>>                                                                           85070591730234615865843651857942052864
>>>>> 10.0.0.19       datacenter1 rack1       Up     Normal  488.74 KB  50.00%  0
>>>>> 10.0.0.28       datacenter1 rack1       Up     Normal  504.63 KB  50.00%  85070591730234615865843651857942052864
>>>>> 
>>>>> I want to create a second ring with the same name but two different nodes.
>>>>> Using tokengentool I get the same tokens, since they are determined by the 
>>>>> number of nodes in a ring.
>>>>> 
>>>>> My question is this:
>>>>> Let's say I create two new VMs, with IPs 10.0.0.31 and 10.0.0.11.
>>>>> In 10.0.0.31 cassandra.yaml I will set
>>>>> initial_token: 0
>>>>> seeds: "10.0.0.31"
>>>>> listen_address: 10.0.0.31
>>>>> rpc_address: 0.0.0.0
>>>>> 
>>>>> In 10.0.0.11 cassandra.yaml I will set
>>>>> initial_token: 85070591730234615865843651857942052864
>>>>> seeds: "10.0.0.31"
>>>>> listen_address: 10.0.0.11
>>>>> rpc_address: 0.0.0.0 
>>>>> 
>>>>> Would the rings be separate?
>>>>> 
>>>>> Thanks,
>>>>> 
>>>>> Tamar Fraenkel 
>>>>> Senior Software Engineer, TOK Media 
>>>>> 
>>>>> 
>>>>> 
>>>>> ta...@tok-media.com
>>>>> Tel:   +972 2 6409736 
>>>>> Mob:  +972 54 8356490 
>>>>> Fax:   +972 2 5612956 
>>>>> 
>>>>> 
>>>>> 
>>>> 
>>> 
>>> 
>> 
>> 
>> 
> 
> 
