Re[2]: Question about adding nodes to a cluster

2015-02-09 Thread Plotnik, Alexey
Sorry, no - you are not doing it wrong ^) Yes, the Cassandra partitioner is based on a hash ring. Doubling the number of nodes is the best cluster-extending policy I've ever seen, because it's zero-overhead. Hash ring - you take the MD5 maximum (2^128-1), divide it by the number of nodes (partitions), getting N point

Re: Question about adding nodes to a cluster

2015-02-09 Thread Plotnik, Alexey
Yes, the Cassandra partitioner is based on a hash ring. Doubling the number of nodes is the best cluster-extending policy I've ever seen, because it's zero-overhead. Hash ring - you take the MD5 maximum (2^128-1), divide it by the number of nodes (partitions), getting N points, and then evenly distribute them across you
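[Editor's illustration, not from the original thread: with evenly spread tokens, token i on an N-node ring is i * RING_MAX / N, so going from N to 2N nodes only bisects each existing range and every new node streams from exactly one old node. A minimal sketch, assuming the RandomPartitioner-style 2^128-1 ring described above:]

import java.math.BigInteger;

public class EvenTokens {
    // 2^128 - 1, the MD5 maximum mentioned above
    private static final BigInteger RING_MAX =
            BigInteger.valueOf(2).pow(128).subtract(BigInteger.ONE);

    // token i for an evenly balanced ring of nodeCount nodes
    static BigInteger[] tokens(int nodeCount) {
        BigInteger[] result = new BigInteger[nodeCount];
        for (int i = 0; i < nodeCount; i++) {
            result[i] = RING_MAX.multiply(BigInteger.valueOf(i))
                                .divide(BigInteger.valueOf(nodeCount));
        }
        return result;
    }

    public static void main(String[] args) {
        // every token of the 4-node ring reappears in the 8-node ring,
        // so doubling only splits ranges and never reshuffles existing data
        for (BigInteger t : tokens(4)) System.out.println("old: " + t);
        for (BigInteger t : tokens(8)) System.out.println("new: " + t);
    }
}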

Re: Cassandra on Ceph

2015-01-31 Thread Plotnik, Alexey
1. What do you mean by "on top of Ceph"? 2. What's the goal? -- Original Message -- From: "Colin Taylor" <colin.tay...@gmail.com> To: "user@cassandra.apache.org" <user@cassandra.apache.org> Sent: 01.02.2015 12:26:42 Subject: Cassandra on Ceph I may be forced to run Cassandra

Re[2]: Test

2014-12-03 Thread Plotnik, Alexey
Alah Akbar -- Original Message -- From: "Servando Muñoz G." <smg...@gmail.com> To: "user@cassandra.apache.org" <user@cassandra.apache.org> Sent: 04.12.2014 16:12:32 Subject: RE: Test Greetings… Who are you From: Castelain, Alain [mailto:alain.castel...@xerox.com

Re[2]: how wide can wide rows get?

2014-11-13 Thread Plotnik, Alexey
We have 380k of them in some of our rows and it's ok. -- Original Message -- From: "Hannu Kröger" <hkro...@gmail.com> To: "user@cassandra.apache.org" <user@cassandra.apache.org> Sent: 14.11.2014 16:13:49 Subject: Re: how wide can wide rows get? The theoretical limit is mayb

Re: Better option to load data to cassandra

2014-11-11 Thread Plotnik, Alexey
What have you tried? -- Original Message -- From: "srinivas rao" <pinnakasrin...@gmail.com> To: "Cassandra Users" <user@cassandra.apache.org> Sent: 11.11.2014 22:51:54 Subject: Better option to load data to cassandra Hi Team, Please suggest me the better options to load da

Re[2]: Redundancy inside a cassandra node

2014-11-08 Thread Plotnik, Alexey
o think that the servers must be super robust. Personally I'm not sure if that should be the case. The node Thanks Jabbar Azam On 8 November 2014 02:56, Plotnik, Alexey <aplot...@rhonda.ru> wrote: Cassandra is a cluster itself, it's not necessary to have redundant e

Re: Redundancy inside a cassandra node

2014-11-07 Thread Plotnik, Alexey
Cassandra is a cluster itself; it's not necessary to make each node redundant. Cassandra has replication for that. Also, Cassandra is designed to run in multiple data centers - so I don't think a per-node redundancy policy is applicable for you. The only thing from what you're saying that you could deploy is RAID10, other don

RE: Tombstones

2014-05-18 Thread Plotnik, Alexey
It's located beside the SSTables and has a name in the format <column family>.json. For the `accounts` column family in the `company` keyspace it looks like /company/accounts/accounts.json -Original Message- From: Dimetrio [mailto:dimet...@flysoft.ru] Sent: 18 May 2014 2:06 To: cassandra-u...@incubator.apache.org Su

750Gb compaction task

2014-03-12 Thread Plotnik, Alexey
After rebalance and cleanup I have a leveled CF (SSTable size = 100MB) and a compaction task that is going to process ~750GB: > root@da1-node1:~# nodetool compactionstats pending tasks: 10556 compaction type keyspace column family completed total unit progr

RE: Multi-threaded sub-range repair

2014-02-23 Thread Plotnik, Alexey
UPD: Here we can see that the repair process can be executed in parallel: http://www.datastax.com/dev/blog/advanced-repair-techniques From: Plotnik, Alexey Sent: 24 February 2014 12:42 To: user@cassandra.apache.org Subject: Multi-threaded sub-range repair 1. How many parallel threads is

Multi-threaded sub-range repair

2014-02-23 Thread Plotnik, Alexey
1. How many parallel threads is it safe to have for a sub-range repair process running on a single node? 2. Is the repair process affected by the `concurrent_compactors` parameter? Should `concurrent_compactors` be sized to meet the needs of the multi-threaded repair process?
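[Editor's note, not from the thread: a rough sketch of how a node's primary range could be split into equal sub-ranges for parallel repair, in the spirit of the advanced-repair-techniques post linked above. The start/end tokens and slice count are hypothetical placeholders; each printed command would then be run as `nodetool repair -st <start> -et <end> <keyspace>`.]

import java.math.BigInteger;

public class SubRangeSplitter {
    public static void main(String[] args) {
        // previous node's token (exclusive) and this node's token -- placeholders
        BigInteger start = new BigInteger("0");
        BigInteger end   = new BigInteger("42535295865117307932921825928971026432");
        int slices = 8;  // how many sub-range repairs to run concurrently

        // cut [start, end] into equal-width slices; the last slice absorbs any remainder
        BigInteger width = end.subtract(start).divide(BigInteger.valueOf(slices));
        for (int i = 0; i < slices; i++) {
            BigInteger st = start.add(width.multiply(BigInteger.valueOf(i)));
            BigInteger et = (i == slices - 1) ? end : st.add(width);
            System.out.printf("nodetool repair -st %s -et %s my_keyspace%n", st, et);
        }
    }
}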

RE: Turn off compression (1.2.11)

2014-02-23 Thread Plotnik, Alexey
, Feb 18, 2014 at 2:58 PM, Plotnik, Alexey <aplot...@rhonda.ru> wrote: Compression buffers are located in the heap, I saw them in the heap dump. That is: == public class CompressedRandomAccessReader extends RandomAccessReader { ….. private ByteBuffer compressed; // <-

RE: Turn off compression (1.2.11)

2014-02-18 Thread Plotnik, Alexey
ent: 19 February 2014 6:24 To: user@cassandra.apache.org Subject: Re: Turn off compression (1.2.11) On Mon, Feb 17, 2014 at 4:35 PM, Plotnik, Alexey <aplot...@rhonda.ru> wrote: As an aside, 1.2.0 beta moved a bunch of data related to compression off the heap. If you were to try t

RE: Turn off compression (1.2.11)

2014-02-18 Thread Plotnik, Alexey
My SSTable size is 100MB. Last time I removed the leveled manifest, compaction was running for 3 months. From: Robert Coli [mailto:rc...@eventbrite.com] Sent: 19 February 2014 6:24 To: user@cassandra.apache.org Subject: Re: Turn off compression (1.2.11) On Mon, Feb 17, 2014 at 4:35 PM, Plotnik

Turn off compression (1.2.11)

2014-02-17 Thread Plotnik, Alexey
Each compressed SSTable uses an additional transfer buffer in its CompressedRandomAccessReader instance. After analyzing the heap I saw this buffer is about 70KB per SSTable, and I have more than 30K SSTables per node (roughly 2GB of heap in these buffers alone). I want to turn off compression for this column family to save some heap. How can I

Hadoop kills Cassandra

2014-01-27 Thread Plotnik, Alexey
For some reason, when my map-reduce job is almost complete, all mappers (~40) begin to connect to a single Cassandra node. This node then dies due to a Java heap space error. It looks like Hadoop is misconfigured. Valid behavior for me: each mapper should iterate only over its local node. How can I configu
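[Editor's note, not the poster's actual job: a minimal sketch of the 1.x-era ColumnFamilyInputFormat wiring whose ConfigHelper settings determine which Cassandra nodes the mappers connect to. The seed address, keyspace, and column family names are hypothetical, and mapper/output setup is omitted for brevity.]

import java.nio.ByteBuffer;
import org.apache.cassandra.hadoop.ColumnFamilyInputFormat;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class CassandraScanJobSketch {
    public static void main(String[] args) throws Exception {
        Job job = Job.getInstance(new Configuration(), "cassandra-scan");
        job.setInputFormatClass(ColumnFamilyInputFormat.class);
        Configuration conf = job.getConfiguration();

        // the initial address is only used to discover the ring; splits are then
        // assigned to replica endpoints, which is what keeps mappers node-local
        ConfigHelper.setInputInitialAddress(conf, "10.0.0.1");
        ConfigHelper.setInputRpcPort(conf, "9160");
        ConfigHelper.setInputPartitioner(conf, "Murmur3Partitioner");
        ConfigHelper.setInputColumnFamily(conf, "my_keyspace", "my_cf");
        ConfigHelper.setInputSlicePredicate(conf, new SlicePredicate().setSlice_range(
                new SliceRange(ByteBuffer.allocate(0), ByteBuffer.allocate(0), false, 100)));

        // mapper/reducer classes and the output format would be configured here
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}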

RE: one big cluster vs multiple smaller clusters

2013-10-13 Thread Plotnik, Alexey
If you are talking about scaling: Cassandra scaling is absolutely horizontal, without namenodes or other Mongo-bullshit-like intermediate daemons. And that’s why one big cluster has the same throughput as many smaller clusters. What will you do when your small clusters exceed their capacity? Ca