Re: Random Distribution, yet Order Preserving Partitioner

2013-08-27 Thread Manoj Mainali
Hi Takenori, I can't tell for sure without knowing what kind of data you have and how much of it you have. You can use the random partitioner together with a metadata row that stores the row keys, for example: {metadata_row}: key1 | key2 | key3 key1:column1 | column2 When you do the
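
The metadata-row idea in the snippet above can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not the approach from the (truncated) message: the class name `MetadataRowIndex` and its methods are hypothetical, a plain dict stands in for rows placed by the random partitioner, and a sorted list stands in for the metadata row whose entries are the row keys.

```python
import bisect


class MetadataRowIndex:
    """Toy model (hypothetical names): unordered row storage plus one
    metadata row that keeps the row keys in sorted order."""

    def __init__(self):
        # Stand-in for rows spread across the ring by the random partitioner:
        # lookups by key work, but there is no useful ordering across rows.
        self.rows = {}
        # Stand-in for {metadata_row}: the row keys, kept sorted.
        self.metadata_row = []

    def put(self, key, columns):
        # Record the key in the metadata row the first time the row is written.
        if key not in self.rows:
            bisect.insort(self.metadata_row, key)
        self.rows.setdefault(key, {}).update(columns)

    def range_query(self, start, end):
        # Read the ordered slice of keys from the metadata row,
        # then fetch each data row by key (inclusive on both ends).
        lo = bisect.bisect_left(self.metadata_row, start)
        hi = bisect.bisect_right(self.metadata_row, end)
        return [(k, self.rows[k]) for k in self.metadata_row[lo:hi]]


index = MetadataRowIndex()
index.put("key3", {"column1": "c"})
index.put("key1", {"column1": "a", "column2": "b"})
index.put("key2", {"column1": "x"})
ordered = index.range_query("key1", "key2")
```

In this sketch an ordered scan costs one read of the metadata row plus one keyed read per data row, which is the trade-off such a layout would make against using an order-preserving partitioner.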

Re: Large number of files for Leveled Compaction

2013-06-16 Thread Manoj Mainali
-apache-cassandra on details of how LeveledCompaction works. Cheers Manoj On Mon, Jun 17, 2013 at 1:54 PM, Franc Carter wrote: > On Mon, Jun 17, 2013 at 2:47 PM, Manoj Mainali wrote: > >> With LeveledCompaction, each sstable size is fixed and is defined by >> sstable_size_in_mb

Re: Large number of files for Leveled Compaction

2013-06-16 Thread Manoj Mainali
With LeveledCompaction, the size of each sstable is fixed and is defined by sstable_size_in_mb in the compaction configuration of the CF definition; the default value is 5MB. In your case, you may not have defined your own value, which is why each of your sstables is 5MB. And if your dataset is huge, you will see a l
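
As a rough illustration of the arithmetic behind the snippet above: with a fixed sstable size, the number of sstables grows linearly with the dataset. The helper name `estimate_sstable_count` is hypothetical, and the estimate ignores compression, overlap between levels, and the several component files each sstable has on disk, so treat it as a lower-bound sketch rather than an exact file count.

```python
import math


def estimate_sstable_count(dataset_mb, sstable_size_in_mb=5):
    """Rough number of sstables for a dataset of dataset_mb megabytes,
    assuming every sstable is exactly sstable_size_in_mb (5 MB default,
    as stated in the message above)."""
    if sstable_size_in_mb <= 0:
        raise ValueError("sstable_size_in_mb must be positive")
    return math.ceil(dataset_mb / sstable_size_in_mb)


# A 100 GB dataset with the 5 MB default versus a larger, hand-picked size.
with_default = estimate_sstable_count(100 * 1024)        # 20480 sstables
with_larger = estimate_sstable_count(100 * 1024, 160)    # 640 sstables
```

The 160 MB value is just an example input chosen to show how raising sstable_size_in_mb shrinks the count; it is not a recommendation from the thread.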

PerRowSecondaryIndex uses

2013-06-11 Thread Manoj Mainali
I am looking into the C* secondary index feature so that I can query rows based on a column value. In my use case, I want to create an index on several columns, or maybe all columns, of a row (a single row does not have many columns, maybe around 50 - 100), and was looking into PerRow

Re: SSDs w/ C* only for commit log?

2013-06-10 Thread Manoj Mainali
You can refer to this conversation: http://comments.gmane.org/gmane.comp.db.cassandra.user/27366 Manoj On Tue, Jun 11, 2013 at 10:01 AM, Tanya Malik wrote: > If I understand the C* architecture correctly, in order to increase write > speed, I only need to put the commit log on SSDs. > > Whe

Re: Flushing column families individually in cassandra

2013-06-10 Thread Manoj Mainali
In older versions it was possible, but in C* 1.2 it is a global configuration, so you won't be able to configure it on a per-CF basis. Manoj On Tue, Jun 11, 2013 at 10:32 AM, Tanya Malik wrote: > Is it possible in C* 1.2 to configure column families to be flushed > individually? > > So, if I hav

Re: Cassandra Evaluation/ Benchmarking: Throughput not scaling as expected neither latency showing good numbers

2012-07-18 Thread Manoj Mainali
clients. Of course, it doesn't mean that throughput will always increase. My observation was that it increases and then, after a certain number of clients, throughput decreases again. Regards, Manoj Mainali On Wednesday, July 18, 2012, Code Box wrote: > The cassandra stress tool gives me values aro

Re: Cassandra Evaluation/ Benchmarking: Throughput not scaling as expected neither latency showing good numbers

2012-07-17 Thread Manoj Mainali
Is "Threads" in your data the number of clients? How much heap space does each node have? YCSB has a paper on their benchmark tests. You can try comparing your results with theirs to see whether they are similar. Best regards, Manoj On Tuesday, July 17, 2012, Code Box wrote: > I am doing Cas

Re: Getting stats of keyspaces

2012-07-16 Thread Manoj Mainali
You can get the statistics using JMX. See here: http://www.datastax.com/docs/1.1/operations/monitoring Best regards, Manoj On Monday, July 16, 2012, Thierry Templier wrote: > Hello, > > I wonder if it's possible to get statistics for a keyspace like its size, > size of each column family it c

Re: Cassandra keeps on logging "Finished hinted handoff of 0 rows to endpoint"

2012-02-24 Thread Manoj Mainali
Thanks. On Saturday, February 25, 2012, Brandon Williams wrote: > It's a special case of a single sstable existing for hints: > https://issues.apache.org/jira/browse/CASSANDRA-3955 > > On Fri, Feb 24, 2012 at 5:43 AM, Manoj Mainali wrote: >> Hi, >> >> I have

Cassandra keeps on logging "Finished hinted handoff of 0 rows to endpoint"

2012-02-24 Thread Manoj Mainali
Hi, I have been running Cassandra 1.0.7 and in the log file I see the message "Finished hinted handoff of 0 rows to endpoint /{ipaddress}". The issue can be reproduced with the following steps: 1. Start a cluster with 2 nodes, say node1 and node2 2. Create a keyspace with rf=2, create