Re: Hadoop jobs and data locality

2013-05-07 Thread Shamim
I have created an issue in jira 
https://issues.apache.org/jira/browse/CASSANDRA-5544 

-- 
Best regards
  Shamim A.

06.05.2013, 22:26, "Shamim" :
> I think it will be better to open an issue in jira
> Best regards
>   Shamim A.
>
>>  Unfortunately I've just tried with a new cluster with RandomPartitioner and
>>  it doesn't work better. It may come from hadoop/pig modifications:
>>
>>  18:02:53|elia:hadoop cyril$ git diff --stat cassandra-1.1.5..cassandra-1.2.1 .
>>   .../apache/cassandra/hadoop/BulkOutputFormat.java  |  27 +--
>>   .../apache/cassandra/hadoop/BulkRecordWriter.java  |  55 +++---
>>   .../cassandra/hadoop/ColumnFamilyInputFormat.java  | 102 ++
>>   .../cassandra/hadoop/ColumnFamilyOutputFormat.java |  31 ++--
>>   .../cassandra/hadoop/ColumnFamilyRecordReader.java |  76 
>>   .../cassandra/hadoop/ColumnFamilyRecordWriter.java |  24 +--
>>   .../apache/cassandra/hadoop/ColumnFamilySplit.java |  32 ++--
>>   .../org/apache/cassandra/hadoop/ConfigHelper.java  |  73 ++--
>>   .../cassandra/hadoop/pig/CassandraStorage.java     | 214 +---
>>   9 files changed, 380 insertions(+), 254 deletions(-)
>>
>>  Can anyone help on getting more mappers running ? Maybe we should open a
>>  bug report ?
>>
>>  --
>>  Cyril SCETBON
> We have also came across this issue in our dev environment, when we upgrade 
> Cassandra from 1.1.5 to 1.2.1 version. I have mentioned this issue in few 
> times in this forum but haven't got any answer yet. For quick work around you 
> can use pig.splitCombination false in your pig script to avoid this issue, 
> but it will make one of your task with a very big amount of data. I can't 
> figure out why this happening in newer version of Cassandra, strongly guess 
> some thing goes wrong in Cassandra implementation of LoadFunc or in 
> Murmur3Partition (it's my guess). >> Here is my earliar post >> 
> http://www.mail-archive.com/user@cassandra.apache.org/msg28016.html >> 
> http://www.mail-archive.com/user@cassandra.apache.org/msg29425.html >> >> Any 
> comment from authors will be highly appreciated >> P.S. please keep me in 
> touch with any solution or hints. >> >> -- >> Best regards >> Shamim A. >> >> 
> 03.05.2013, 19:25, "cscetbon@orange.com" : >> >>> Hi, >>> I'm using Pig 
> to calculate the sum of a columns from a columnfamily (scan of all rows) and 
> I've read that input data locality is supported at 
> http://wiki.apache.org/cassandra/HadoopSupport >>> However when I execute my 
> Pig script Hadoop assigns only one mapper to the task and not one mapper on 
> each node (replication factor = 1). FYI, I've 8 mappers available (2 per 
> node). >>> Is there anything that can disable the data locality feature ? >>> 
> >>> Thanks >>> -- >>> Cyril SCETBON >>> >>> 
> _
>  Ce message et ses pieces jointes peuvent contenir des informations 
> confidentielles ou privilegiees et ne doivent donc pas etre diffuses, 
> exploites ou copies sans autorisation. Si vous avez recu ce message par 
> erreur, veuillez le signaler a l'expediteur et le detruire ainsi que les 
> pieces jointes. Les messages electroniques etant susceptibles d'alteration, 
> France Telecom - Orange decline toute responsabilite si ce message a ete 
> altere, deforme ou falsifie. Merci. This message and its attachments may 
> contain confidential or privileged information that may be protected by law; 
> they should not be distributed, used or copied without authorisation. If you 
> have received this email in error, please notify the sender and delete this 
> message and its attachments. As emails may be altered, France Telecom - 
> Orange is not liable for messages that have been modified, changed or 
> falsified. Thank you. > > 
> _
>  > > Ce message et ses pieces jointes peuvent contenir des informations 
> confidentielles ou privilegiees et ne doivent donc > pas etre diffuses, 
> exploites ou copies sans autorisation. Si vous avez recu ce message par 
> erreur, veuillez le signaler > a l'expediteur et le detruire ainsi que les 
> pieces jointes. Les messages electroniques etant susceptibles d'alteration, > 
> France Telecom - Orange decline toute responsabilite si ce message a ete 
> altere, deforme ou falsifie. Merci. > > This message and its attachments may 
> contain confidential or privileged information that may be protected by law; 
> > they should not be distributed, used or copied without authorisation. > If 
> you have received this email in error, please notify the sender and delete 
> this message and its attachments. > As emails may be altered, France Telecom 
> - Orange is not liable for messages that have been modified, changed or 
> falsified. > Thank you. -- Best regards

Re: Hadoop jobs and data locality

2013-05-07 Thread cscetbon.ext
I was going to open one. Great !
--
Cyril SCETBON

On May 7, 2013, at 9:03 AM, Shamim <sre...@yandex.ru> wrote:

I have created an issue in jira
https://issues.apache.org/jira/browse/CASSANDRA-5544


_

Ce message et ses pieces jointes peuvent contenir des informations 
confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce 
message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages 
electroniques etant susceptibles d'alteration,
France Telecom - Orange decline toute responsabilite si ce message a ete 
altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged 
information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete 
this message and its attachments.
As emails may be altered, France Telecom - Orange is not liable for messages 
that have been modified, changed or falsified.
Thank you.



Re: cost estimate about some Cassandra patchs

2013-05-07 Thread aaron morton
> Use case = rows with rowkey like (folder id, file id)
> And operations read/write multiple rows with the same folder id => so, it
> could make sense to have a partitioner that puts rows with the same
> "folder id" on the same replicas.
The entire row key is the thing we use to make the token, which is used both to
locate the replicas and to place the row on the node. I don't see that changing. 

Have you done any performance testing to see if this is a problem?

Cheers
 
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 5:27 AM, DE VITO Dominique  
wrote:

> > From: aaron morton [mailto:aa...@thelastpickle.com] 
> > Sent: Sunday, 28 April 2013 22:54
> > To: user@cassandra.apache.org
> > Subject: Re: cost estimate about some Cassandra patchs
> > 
> > > Does anyone know enough of the inner working of Cassandra to tell me how 
> > > much work is needed to patch Cassandra to enable such communication 
> > > vectorization/batch ?
> > 
>  
> > Assuming you mean "have the coordinator send multiple row read/write 
> > requests in a single message to replicas"
> > 
> > Pretty sure this has been raised as a ticket before but I cannot find one 
> > now. 
> > 
> > It would be a significant change and I'm not sure how big the benefit is. 
> > To send the messages the coordinator places them in a queue, there is 
> > little delay sending. Then it waits on them async. So there may be some 
> > saving on networking but from the coordinators point of view I think the 
> > impact is minimal. 
> > 
> > What is your use case?
>  
> Use case = rows with rowkey like (folder id, file id)
> And operations read/write multiple rows with the same folder id => so, it
> could make sense to have a partitioner that puts rows with the same
> "folder id" on the same replicas.
>  
> But so far, Cassandra is not able to exploit this locality as batch effect 
> ends at the coordinator node.
>  
> So, my question about the cost estimate for patching Cassandra.
>  
> The closest (or exactly corresponding to my need ?) JIRA entries I have found 
> so far are:
>  
> CASSANDRA-166: Support batch inserts for more than one key at once
> https://issues.apache.org/jira/browse/CASSANDRA-166
> => "WON'T FIX" status
>  
> CASSANDRA-5034: Refactor to introduce Mutation Container in write path
> https://issues.apache.org/jira/browse/CASSANDRA-5034
> => I am not very sure if it's related to my topic
>  
> Thanks.
>  
> Dominique
>  
>  
>  
> > 
> > Cheers
> > 
> > 
> > -
> > Aaron Morton
> > Freelance Cassandra Consultant
> > New Zealand
> > 
> > @aaronmorton
> > http://www.thelastpickle.com
>  
> On 27/04/2013, at 4:04 AM, DE VITO Dominique 
>  wrote:
> 
> 
> Hi,
>  
> We have created a new partitioner that groups some rows with **different**
> row keys on the same replicas.
>  
> But neither batch_mutate nor multiget_slice is able to take advantage of this
> partitioner-defined placement to vectorize/batch communications between the
> coordinator and the replicas.
>  
> Does anyone know enough of the inner working of Cassandra to tell me how much 
> work is needed to patch Cassandra to enable such communication 
> vectorization/batch ?
>  
> Thanks.
>  
> Regards,
> Dominique
>  
>  



Re: Cassandra won't restart : 7365....6c73 is not defined as a collection

2013-05-07 Thread aaron morton
>   I have also been changing types, e.g. lock_tokens__ from MAP 
> to MAP.
The error looks like the schema was changed and a log replayed from before the 
change. Which obviously is not something we would expect to happen. 
Do you change the map type using ALTER TABLE (not sure if that is possible) or 
by dropping / re-creating?
  
> BTW, would a drain before running '/etc/init.d/cassandra stop' have helped?
Probably not. 
Drain is designed to stop processing writes, flush everything from memory to 
disk and mark the commit log segments as no longer needed.  

If you notice it again can you take note of the order of operations around 
dropping, creating or modifying the schema and restarting ? 

Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 5:03 AM, Blair Zajac  wrote:

> Hi Aaron,
> 
> The keyspace consistent of 3 column families for user management, see below.
> 
> I have dropped these tables multiple times since I'm testing a script to 
> automatically create the column families if they do not exist.  I have also 
> been changing types, e.g. lock_tokens__ from MAP to MAP BIGINT>.
> 
> I have tar copies of /var/lib/cassandra from all three nodes if somebody 
> wants to look.  Since making the tarballs, I blew the cluster away and 
> re-initialized it from scratch.
> 
> BTW, would a drain before running '/etc/init.d/cassandra stop' have helped?
> 
> Regards,
> Blair
> 
> 
> CREATE TABLE account (
>  pk_account UUID PRIMARY KEY,
>  last_login_using TEXT,
>  first_name TEXT,
>  last_name TEXT,
>  full_name TEXT,
>  created_micros BIGINT,
>  modified_micros BIGINT,
>  lock_tokens__ MAP
> );
> 
> 
> CREATE TABLE external_account (
>  pk_external_username TEXT PRIMARY KEY,
>  pk_account UUID,
>  primary_email_address TEXT,
>  secondary_email_addresses SET,
>  first_name TEXT,
>  last_name TEXT,
>  full_name TEXT,
>  last_login_micros BIGINT,
>  created_micros BIGINT,
>  modified_micros BIGINT,
>  lock_tokens__ MAP
> );
> 
> CREATE TABLE email_address (
>  pk_email_address TEXT PRIMARY KEY,
>  pk_account UUID,
>  pk_external_username SET,
>  lock_tokens__ MAP
> );
> 
> 
> On 05/06/2013 01:14 AM, aaron morton wrote:
>> Do you have the table definitions ?
>> Any example data?
>> Something is confused about a set / map / list type.
>> 
>> It's failing when replaying the log; if you want to work around it, move the
>> commit log file out of the directory. There is a chance of data loss if
>> this row mutation is being replayed on all nodes.
>> 
>> Cheers
>> 
>> -
>> Aaron Morton
>> Freelance Cassandra Consultant
>> New Zealand
>> 
>> @aaronmorton
>> http://www.thelastpickle.com
>> 
>> On 3/05/2013, at 2:36 PM, Blair Zajac wrote:
>> 
>>> Hello,
>>> 
>>> I'm running a 3-node development cluster on OpenStack VMs and recently
>>> updated to DataStax's 1.2.4 debs on Ubuntu Raring after which the
>>> cluster was fine.  I shut it down for a few days and after getting
>>> back to Cassandra today and booting the VMs, Cassandra is unable to
>>> start. Below is the output from output.log from one of the nodes.
>>> None of the Cassandra nodes can start.
>>> 
>>> The deployment is pretty simple, two test keyspaces with a few column
>>> families in each keyspace.  I am doing a lot of keyspace and column
>>> family deletions as I'm testing some db style migration code to
>>> auto-setup a schema.
>>> 
>>> Any suggestions?
>>> 
>>> Blair
>>> 
>>> INFO 19:24:09,780 Logging initialized
>>> INFO 19:24:09,790 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
>>> VM/1.7.0_21
>>> INFO 19:24:09,791 Heap size: 880803840/880803840
>>> INFO 19:24:09,791 Classpath:
>>> /usr/share/cassandra/lib/antlr-3.2.jar:/usr/share/cassandra/lib/avro-1.4.0-fixes.jar:/usr/share/cassandra/lib/avro-1.4.0-sources-fixes.jar:/usr/share/cassandra/lib/commons-cli-1.1.jar:/usr/share/cassandra/lib/commons-codec-1.2.jar:/usr/share/cassandra/lib/commons-lang-2.6.jar:/usr/share/cassandra/lib/compress-lzf-0.8.4.jar:/usr/share/cassandra/lib/concurrentlinkedhashmap-lru-1.3.jar:/usr/share/cassandra/lib/guava-13.0.1.jar:/usr/share/cassandra/lib/high-scale-lib-1.1.2.jar:/usr/share/cassandra/lib/jackson-core-asl-1.9.2.jar:/usr/share/cassandra/lib/jackson-mapper-asl-1.9.2.jar:/usr/share/cassandra/lib/jamm-0.2.5.jar:/usr/share/cassandra/lib/jbcrypt-0.3m.jar:/usr/share/cassandra/lib/jline-1.0.jar:/usr/share/cassandra/lib/json-simple-1.1.jar:/usr/share/cassandra/lib/libthrift-0.7.0.jar:/usr/share/cassandra/lib/log4j-1.2.16.jar:/usr/share/cassandra/lib/lz4-1.1.0.jar:/usr/share/cassandra/lib/metrics-core-2.0.3.jar:/usr/share/cassandra/lib/netty-3.5.9.Final.jar:/usr/share/cassandra/lib/servlet-api-2.5-20081211.jar:/usr/share/cassandra/lib/slf4j-api-1.7.2.jar:/usr/share/cassandra/lib/slf4j-log4j12-1.7.2.jar:/usr/share/cassandra/lib/snakeyaml-1.6.jar:/usr/share/cassandra/lib/snappy-java-1.0.4.1.jar:/usr/share/cassandra/lib/snaptree-0.1

Re: hector or astyanax

2013-05-07 Thread aaron morton
> i want to know which cassandra client is better?
Go with Astyanax or Native Binary, they are both under active development and 
supported by a vendor / large implementor. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 7:03 AM, Derek Williams  wrote:

> Also have to keep in mind that it should be rare to only use a single socket 
> since you are usually making at least 1 connection per node in the cluster 
> (or local datacenter). There is also nothing enforcing that a single client 
> cannot open more than 1 connection to a node. In the end it should come down 
> to which protocol implementation is faster.
> 
> 
> On Mon, May 6, 2013 at 11:58 AM, Aaron Turner  wrote:
> From my experience, your NIC buffers generally aren't the problem (or at 
> least it's easy to tune them to fix).  It's TCP.  Simply put, your raw NIC 
> throughput > single TCP socket throughput on most modern hardware/OS 
> combinations.  This is especially true as latency increases between the two 
> hosts.  This is why Bittorrent or "download accelerators" are often faster 
> than just downloading a large file via your browser or ftp client - they're 
> running multiple TCP connections in parallel compared to only one.
> 
> TCP is great for reliable, bi-directional, stream based communication.  Not 
> the best solution for high throughput though.  UDP is much better for that, 
> but then you lose all the features that TCP gives you and so then people end 
> up re-inventing the wheel (poorly I might add).
> 
> So yeah, I think the answer to the question of "which is faster" the answer 
> is "it depends on your queries".
> 
> 
> 
> On Mon, May 6, 2013 at 10:24 AM, Hiller, Dean  wrote:
> You have me thinking more.  I wonder in practice if 3 sockets is any faster 
> than 1 socket when doing nio.  If your buffer sizes were small, maybe that 
> would be the case.  Usually the nic buffers are big, so when the selector 
> fires it is reading from 3 buffers for 3 sockets or 1 buffer for one socket.  
> In both cases, all 3 requests are there in the buffers.  At any rate, my 
> belief is it probably is still basically parallel performance on one socket, 
> though I have not tested my theory. My theory being that the real bottleneck 
> on performance is the work cassandra has to do on the reads and such.
> 
> What about 20 sockets then (like someone has a pool)?  Will it be any 
> faster? Not really sure, as in the end you are still held up by the real 
> bottleneck of reading from disk on the cassandra side.  We went to 20 threads 
> in one case using 20 sockets with astyanax and received no performance 
> improvement (synchronous but more sockets did not improve our performance).  
> I.e. it may be the case 90% of the time that one socket is just as fast as 
> 10/20. I would love to know the truth/answer to that though.
> 
> Later,
> Dean
> 
> 
> From: Aaron Turner <synfina...@gmail.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, May 6, 2013 10:57 AM
> To: cassandra users <user@cassandra.apache.org>
> Subject: Re: hector or astyanax
> 
> Just because you can batch queries or have the server process them out of 
> order doesn't make it fully "parallel".  You're still using a single TCP 
> connection which is by definition a serial data stream.  Basically, if you 
> send a bunch of queries which each return a large amount of data, you've 
> effectively limited your query throughput to a single TCP connection.  Using 
> Thrift, each query result is returned in its own TCP stream in *parallel*.
> 
> Not saying the new API isn't great, doesn't have its place or may have 
> better performance in certain situations, but generally speaking I would 
> refrain from making general claims without actual benchmarks to back them up. 
>   I do completely agree that async interfaces have their place and have 
> certain advantages over multi-threading models, but it's just another tool to 
> be used when appropriate.
> 
> Just my .02. :)
> 
> 
> 
> On Mon, May 6, 2013 at 5:08 AM, Hiller, Dean <dean.hil...@nrel.gov> wrote:
> I was under the impression that it is multiple requests using a single 
> connection IN PARALLEL, not serial, as they have request ids and the 
> responses do as well, so you can send a request while a previous request 
> has no response just yet.
> 
> I think you do get a big speed advantage from the asynchronous nature, as 
> you do not need to hold up so many threads in your webserver while you have 
> outstanding requests being processed.  The thrift async was not exactly 
> async like I suspect the new java driver is, but I have not verified 
> (I hope it is).
> 
> Dean
> 
> From: Aaron Turner <synfina...@gmail.com>
> Reply-To: "user@cassandra.apache.org

Re: Cassandra running High Load with no one using the cluster

2013-05-07 Thread aaron morton
> Why did you increase the stack-size to 5.5 times greater than recommended?  
> Since each threads now uses 1000KB minimum just for the stack, a large number 
> of threads will use a large amount of memory.
I'd say that is the reason you are running out of memory.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 7/05/2013, at 9:12 AM, Bryan Talbot  wrote:

> On Sat, May 4, 2013 at 9:22 PM, Aiman Parvaiz  wrote:
> 
> When starting this cluster we set 
> > JVM_OPTS="$JVM_OPTS -Xss1000k"
> 
> 
> 
> 
> Why did you increase the stack-size to 5.5 times greater than recommended?  
> Since each threads now uses 1000KB minimum just for the stack, a large number 
> of threads will use a large amount of memory.
> 
> -Bryan
>  
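For reference, the per-thread stack size is set in conf/cassandra-env.sh. The value below is a sketch of the era's shipped default, not a verified recommendation; check the cassandra-env.sh bundled with your version, as the value varies by release and JVM:

```shell
# cassandra-env.sh: per-thread JVM stack size.
# The packaged default in the 1.x era was far smaller than 1000k, e.g.:
JVM_OPTS="$JVM_OPTS -Xss180k"
```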



Re: Hadoop jobs and data locality

2013-05-07 Thread cscetbon.ext
I tried to use your quick workaround, but the task now lasts much longer than 
before, even though it uses 2 mappers in parallel. The fact is that there are 
1000 tasks. Are you using vnodes ? I didn't try to disable them.

Kind     % Complete   Num Tasks   Pending   Running   Complete   Killed   Failed/Killed Task Attempts
map           7.64%        1025       945         2         78        0   0 / 0
reduce        2.53%           1         0         1          0        0   0 / 0




--
Cyril SCETBON

On May 5, 2013, at 8:45 AM, Shamim <sre...@yandex.ru> wrote:

Hello,
  We also came across this issue in our dev environment when we upgraded 
Cassandra from 1.1.5 to 1.2.1. I have mentioned this issue a few times in this 
forum but haven't got any answer yet. As a quick workaround you can set 
pig.splitCombination to false in your pig script to avoid this issue, but it 
will leave one of your tasks with a very large amount of data. I can't figure 
out why this is happening in the newer version of Cassandra; I strongly suspect 
something goes wrong in Cassandra's implementation of LoadFunc or in 
Murmur3Partitioner (it's my guess).
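For reference, the pig.splitCombination workaround is a one-line property at the top of the Pig script; the sketch below shows where it goes (the load URI, keyspace and column family names are illustrative):

```pig
-- disable input split combination so each Cassandra split keeps its own mapper
set pig.splitCombination false;

rows = LOAD 'cassandra://mykeyspace/mycf'
       USING org.apache.cassandra.hadoop.pig.CassandraStorage();
```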
Here is my earlier post
http://www.mail-archive.com/user@cassandra.apache.org/msg28016.html
http://www.mail-archive.com/user@cassandra.apache.org/msg29425.html

Any comment from authors will be highly appreciated
P.S. please keep me in touch with any solution or hints.

--
Best regards
  Shamim A.



03.05.2013, 19:25, "cscetbon@orange.com" :
Hi,
I'm using Pig to calculate the sum of a column from a column family (scan of 
all rows) and I've read that input data locality is supported at 
http://wiki.apache.org/cassandra/HadoopSupport
However when I execute my Pig script, Hadoop assigns only one mapper to the 
task and not one mapper on each node (replication factor = 1).  FYI, I have 8 
mappers available (2 per node).
Is there anything that can disable the data locality feature ?

Thanks
--
Cyril SCETBON






RE: cost estimate about some Cassandra patchs

2013-05-07 Thread DE VITO Dominique

> -Original Message-
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Tuesday, 7 May 2013 10:22
> To: user@cassandra.apache.org
> Subject: Re: cost estimate about some Cassandra patchs
>
> > Use case = rows with rowkey like (folder id, file id)
> > And operations read/write multiple rows with same folder id => so, it could 
> > make sense to have a partitioner putting rows with same "folder id" on the 
> > same > replicas.
> The entire row key is the thing we use to make the token, which is used both 
> to locate the replicas and to place the row on the node. I don't see that 
> changing. 

Well, we can't do that, because of secondary indexes on rows.
Only C* v2 will allow the row design you mention together with secondary indexes.
So the row design you mention is a no-go for us with C* 1.1 or 1.2.
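For reference, the row design Aaron describes (folder id as the partition key, so all files of a folder hash to the same replicas) would look roughly like this in CQL 3; the table name and column types here are illustrative, not our actual schema:

```cql
-- folder_id alone is the partition key: the token is computed from it,
-- so every file of a folder lands on the same replicas.
-- file_id is a clustering column within the partition.
CREATE TABLE folder_files (
    folder_id uuid,
    file_id   uuid,
    content   blob,
    PRIMARY KEY (folder_id, file_id)
);
```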

> Have you done any performance testing to see if this is a problem?

Unfortunately, we have only some pieces in place today for performance testing; 
we are just beginning. Still, I am investigating whether alternative designs 
are (at least) possible, because if no alternative design is easy to develop, 
then there's no need to compare performance.

The lesson I learnt here is that, if I were to restart our project from the 
beginning, I would run a more extensive performance testing effort alongside 
the business project development. It's a kind of must-have for a NoSQL 
database.

So the only tests we have done so far with our FolderPartitioner are with a 
one-machine cluster.
As expected, due to the extra work this FolderPartitioner does, CPU usage is a 
bit higher (~10%); memory and network consumption are the same as with the RP. 
But I have strange results for I/O (average hard drive load), for example in a 
write-only test: I don't know why I/O consumption could be much higher with 
our FolderPartitioner than with the RP. So I am questioning my measurement 
methods and my C* understanding.
Well, the use of such a FolderPartitioner is quite a long way to go...

Regards.
Dominique

> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com

On 7/05/2013, at 5:27 AM, DE VITO Dominique  
wrote:

> > From: aaron morton [mailto:aa...@thelastpickle.com] 
> > Sent: Sunday, 28 April 2013 22:54
> > To: user@cassandra.apache.org
> > Subject: Re: cost estimate about some Cassandra patchs
> > 
> > > Does anyone know enough of the inner working of Cassandra to tell me how 
> > > much work is needed to patch Cassandra to enable such communication 
> > > vectorization/batch ?
> > 
>  
> > Assuming you mean "have the coordinator send multiple row read/write 
> > requests in a single message to replicas"
> > 
> > Pretty sure this has been raised as a ticket before but I cannot find one 
> > now. 
> > 
> > It would be a significant change and I'm not sure how big the benefit is. 
> > To send the messages the coordinator places them in a queue, there is 
> > little delay sending. Then it waits on them async. So there may be some 
> > saving on networking but from the coordinators point of view I think the 
> > impact is minimal. 
> > 
> > What is your use case?
>  
> Use case = rows with rowkey like (folder id, file id)
> And operations read/write multiple rows with the same folder id => so, it
> could make sense to have a partitioner that puts rows with the same
> "folder id" on the same replicas.
>  
> But so far, Cassandra is not able to exploit this locality as batch effect 
> ends at the coordinator node.
>  
> So, my question about the cost estimate for patching Cassandra.
>  
> The closest (or exactly corresponding to my need ?) JIRA entries I have found 
> so far are:
>  
> CASSANDRA-166: Support batch inserts for more than one key at once
> https://issues.apache.org/jira/browse/CASSANDRA-166
> => "WON'T FIX" status
>  
> CASSANDRA-5034: Refactor to introduce Mutation Container in write path
> https://issues.apache.org/jira/browse/CASSANDRA-5034
> => I am not very sure if it's related to my topic
>  
> Thanks.
>  
> Dominique
>  
>  
>  
> > 
> > Cheers
> > 
> > 
> > -
> > Aaron Morton
> > Freelance Cassandra Consultant
> > New Zealand
> > 
> > @aaronmorton
> > http://www.thelastpickle.com
>  
> On 27/04/2013, at 4:04 AM, DE VITO Dominique 
>  wrote:
> 
> 
> Hi,
>  
> We have created a new partitioner that groups some rows with **different**
> row keys on the same replicas.
>  
> But neither batch_mutate nor multiget_slice is able to take advantage of this
> partitioner-defined placement to vectorize/batch communications between the
> coordinator and the replicas.
>  
> Does anyone know enough of the inner working of Cassandra to tell me how much 
> work is needed to patch Cassandra to enable such communication 
> vectorization/batch ?
>  
> Thanks.
>  
> Regards,
> Dominique
>  
>  



Re: SSTables not opened on new cluste

2013-05-07 Thread Philippe
Definitely knew that for major releases, didn't expect it for a minor
release at all.
On 6 May 2013 at 19:22, "Robert Coli" wrote:

> On Sat, May 4, 2013 at 5:41 AM, Philippe  wrote:
> > After trying every possible combination of parameters, config and the
> rest,
> > I ended up downgrading the new node from 1.1.11 to 1.1.2 to match the
> > existing 3 nodes. And that solved the issue immediately : the schema was
> > propagated and the node started handling reads & writes.
>
> As you have discovered...
>
> Trying to upgrade Cassandra by :
>
> 1) Adding new node at new version
> 2) Upgrading old nodes
>
> Is far less likely to work than :
>
> 1) Add new node at old version
> 2) Upgrade all nodes
>
> =Rob
>


how to get column family details dynamically in cassandra bulk load program

2013-05-07 Thread chandana.tummala
Dear All,

I am using the Cassandra bulk load program from 
www.datastax.com/dev/blog/bulk-loading
In this program, for each CSV entry we give the column name and validation 
class. Is there any way to get the column names and validation classes 
directly from the database by giving just the keyspace and column family name, 
the way JDBC metadata lets us get details of a table dynamically?
Please let me know if there is a way to do this.
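One possible approach in Cassandra 1.2 is to query the schema tables in the system keyspace directly (a sketch; the exact columns exposed vary by Cassandra version, and 'mykeyspace'/'mycf' are placeholders):

```cql
-- list the column names and their validator (validation class)
-- for one column family
SELECT column_name, validator
  FROM system.schema_columns
 WHERE keyspace_name = 'mykeyspace'
   AND columnfamily_name = 'mycf';
```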

Thanks & Regards,
Chandana Tummala.


Please do not print this email unless it is absolutely necessary.

The information contained in this electronic message and any attachments to 
this message are intended for the exclusive use of the addressee(s) and may 
contain proprietary, confidential or privileged information. If you are not the 
intended recipient, you should not disseminate, distribute or copy this e-mail. 
Please notify the sender immediately and destroy all copies of this message and 
any attachments.

WARNING: Computer viruses can be transmitted via email. The recipient should 
check this email and any attachments for the presence of viruses. The company 
accepts no liability for any damage caused by any virus transmitted by this 
email.

www.wipro.com


Re: hector or astyanax

2013-05-07 Thread Blair Zajac

On 05/07/2013 01:37 AM, aaron morton wrote:

i want to know which cassandra client is better?

Go with Astyanax or Native Binary, they are both under active development
and supported by a vendor / large implementor.


Native Binary being which one specifically?  Do you mean the new 
DataStax java-driver? [1]


Regards,
Blair

[1] https://github.com/datastax/java-driver


Re: Cassandra won't restart : 7365....6c73 is not defined as a collection

2013-05-07 Thread Blair Zajac

On 05/07/2013 01:28 AM, aaron morton wrote:

  I have also been changing types, e.g. lock_tokens__ from MAP to MAP.

The error looks like the schema was changed and a log replayed from
before the change. Which obviously is not something we would expect to
happen.
Do you change the map type using ALTER TABLE (not sure if that is
possible) or dropping / re-creating?


I think I tried ALTER TABLE but it doesn't work.  In any case, I would 
always drop the table and recreate it.
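For reference, a minimal sketch of that drop-and-recreate sequence, using the email_address table from earlier in the thread (the collection type parameters shown are illustrative, since they were stripped from the archived message):

```cql
-- ALTER TABLE cannot change a collection's value type here,
-- so drop and recreate the table (any data in it is lost):
DROP TABLE email_address;

CREATE TABLE email_address (
    pk_email_address     text PRIMARY KEY,
    pk_account           uuid,
    pk_external_username set<text>,
    lock_tokens__        map<text, bigint>
);
```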



BTW, would a drain before running '/etc/init.d/cassandra stop' have
helped?

Probably not.
Drain is designed to stop processing writes, flush everything from
memory to disk and mark the commit log segments as no longer needed.

If you notice it again can you take note of the order of operations
around dropping, creating or modifying the schema and restarting ?


Sure, but I can tell you that it was many drops and creates, probably 
over 10, followed by a restart.  All the drop, create and shutdown 
operations were done in serial.


Blair




mutation stalls and FileNotFoundException

2013-05-07 Thread Keith Wright
I am running 1.2.4 with Vnodes and have been writing at low volume.  I have 
doubled the volume and suddenly 3 of my 6 nodes are showing much higher load 
than the others (30 vs 3) and tpstats show the mutation stage as completely 
full (see below).  I did find a FileNotFoundException that I pasted below, which 
appears to be caused by creating, dropping, and re-creating a keyspace (something 
I did, but 4 or 5 days ago).  Anyone have any idea what's going on here?

Thanks

Keiths-MacBook-Pro:bin keith$ ./nodetool tpstats -h lxpcas005.nanigans.com
Pool Name               Active     Pending   Completed   Blocked   All time blocked
ReadStage                    0           0      130990         0                  0
RequestResponseStage         0           0      344216         0                  0
MutationStage              128       4523464036              0                  0
ReadRepairStage              0           0       14131         0                  0
ReplicateOnWriteStage        0           0       32872         0                  0
GossipStage                  1         611        6351         0                  0
AntiEntropyStage             0           0           0         0                  0
MigrationStage               0           0           9         0                  0
MemtablePostFlusher          0           0          91         0                  0
FlushWriter                  0           0          60         0                 27
MiscStage                    0           0           0         0                  0
commitlog_archiver           0           0           0         0                  0
InternalResponseStage        0           0           3         0                  0
HintedHandoff                1           1          13         0                  0

Message type   Dropped
RANGE_SLICE  0
READ_REPAIR 54
BINARY   0
READ 0
MUTATION  8539
_TRACE   0
REQUEST_RESPONSE 0



ERROR [ReplicateOnWriteStage:95404] 2013-05-06 14:55:06,555 CassandraDaemon.java (line 174) Exception in thread Thread[ReplicateOnWriteStage:95404,5,main]
java.lang.RuntimeException: java.lang.RuntimeException: java.io.FileNotFoundException: /data/1/cassandra/data/users/global_user_stats/users-global_user_stats-ib-30716-Data.db (No such file or directory)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1582)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:722)
Caused by: java.lang.RuntimeException: java.io.FileNotFoundException: /data/1/cassandra/data/users/global_user_stats/users-global_user_stats-ib-30716-Data.db (No such file or directory)
    at org.apache.cassandra.io.compress.CompressedRandomAccessReader.open(CompressedRandomAccessReader.java:46)
    at org.apache.cassandra.io.util.CompressedSegmentedFile.createReader(CompressedSegmentedFile.java:57)
    at org.apache.cassandra.io.util.PoolingSegmentedFile.getSegment(PoolingSegmentedFile.java:41)
    at org.apache.cassandra.io.sstable.SSTableReader.getFileDataInput(SSTableReader.java:976)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.createFileDataInput(SSTableNamesIterator.java:98)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.read(SSTableNamesIterator.java:117)
    at org.apache.cassandra.db.columniterator.SSTableNamesIterator.<init>(SSTableNamesIterator.java:64)
    at org.apache.cassandra.db.filter.NamesQueryFilter.getSSTableColumnIterator(NamesQueryFilter.java:81)
    at org.apache.cassandra.db.filter.QueryFilter.getSSTableColumnIterator(QueryFilter.java:68)
    at org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:274)
    at org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
    at org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
    at org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
    at org.apache.cassandra.db.Table.getRow(Table.java:347)
    at org.apache.cassandra.db.SliceByNamesReadCommand.getRow(SliceByNamesReadCommand.java:64)
    at org.apache.cassandra.db.CounterMutation.makeReplicationMutation(CounterMutation.java:90)
    at org.apache.cassandra.service.StorageProxy$7$1.runMayThrow(StorageProxy.java:796)
    at org.apache.cassandra.service.StorageProxy$DroppableRunnable.run(StorageProxy.java:1578)
    ... 3 more
Caused 

how to monitor nodetool cleanup?

2013-05-07 Thread Brian Tarbox
I'm recovering from a significant failure and so am doing lots of nodetool
move, removetoken, repair and cleanup.

For most of these I can do "nodetool netstats" to monitor progress but it
doesn't show anything for cleanup...how can I monitor the progress of
cleanup?  On a related note: I'm able to stop all client access to the
cluster until things are happy again...is there anything I can do to make
move/repair/cleanup go faster?

FWIW my problems came from trying to move nodes between EC2 availability
zones...which led to
1) killing a node and recreating it in another availability zone
2) new node had different local ip address so cluster thought old node was
just down and we had a new node...

I did the removetoken on the dead node and gave the new node
oldToken-1...but things still got weird and I ended up spending a couple of
days cleaning up (which seems odd for only about 300 gig total data).

Anyway, any suggestions for monitoring / speeding up cleanup would be
appreciated.

Brian Tarbox


Cassandra 1.1.11 compression: how to tell if it works?

2013-05-07 Thread Oleg Dulin

I have a column family with really wide rows set to use Snappy like this:

compression_options = {'sstable_compression' : 
'org.apache.cassandra.io.compress.SnappyCompressor'}  

My understanding is that if a file is compressed I should not be able 
to use the "strings" command to view its contents. But it seems like I can 
view the contents like this:


strings *-Data.db 


At what point does compression start? How can I confirm it is working?
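For what it's worth, a sketch of why `strings` can still print readable fragments even with compression enabled: compression only applies to SSTables written after the setting takes effect (on flush or compaction), some sibling components of Data.db are not compressed, and compressed bytes can contain short printable runs by chance. A self-contained illustration, using zlib as a stand-in for Snappy (which is not in the Python standard library):

```python
import re
import zlib

def printable_bytes(data: bytes, min_len: int = 4) -> int:
    """Total bytes in runs of >= min_len printable ASCII characters,
    roughly what the Unix `strings` tool would print."""
    return sum(len(run) for run in re.findall(rb"[ -~]{%d,}" % min_len, data))

plain = ("user:alice counter:12345 " * 200).encode()
packed = zlib.compress(plain)

# The raw data is entirely printable; the compressed form keeps only
# short accidental printable runs, so `strings`-style output collapses.
print(printable_bytes(plain), ">>", printable_bytes(packed))
```

If `strings` on a freshly compacted `*-Data.db` still shows long stretches of your row data, the table is probably not being compressed.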


--
Regards,
Oleg Dulin
NYC Java Big Data Engineer
http://www.olegdulin.com/

HintedHandoff

2013-05-07 Thread Kanwar Sangha
Hi - I had a question on hinted handoff.  We have 2 DCs configured with overall 
RF = 2 (DC1:1, DC2:1) and 4 nodes in each DC (8 nodes total across the 2 DCs).

Now we do a write with CL = ONE and Hinted Handoff enabled.


*If node 'X' in DC1, which is a replica node, is down and a write 
comes with CL = ONE to DC1, the coordinator node will write the hint, and the 
data will also be written to the replica node in DC2? Is this correct?

*If yes, then when we try to do a read of this data with CL = 
LOCAL_QUORUM from DC1, it will fail (since the data was only written as a hint) 
and we will need to read it from the other DC?
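As a side note on the read question, Cassandra's quorum is floor(RF/2) + 1, computed against the replication factor the consistency level applies to, so with DC1:1 a LOCAL_QUORUM read in DC1 needs its single local replica and will fail while node X is down. A tiny sketch of the arithmetic:

```python
def quorum(rf: int) -> int:
    # Cassandra's quorum size: a strict majority of the replicas counted.
    return rf // 2 + 1

# With the layout from the question (DC1:1, DC2:1, overall RF = 2):
print(quorum(1))  # LOCAL_QUORUM within one DC: needs the 1 local replica
print(quorum(2))  # cluster-wide QUORUM at RF = 2: needs both replicas
```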

Thanks,
Kanwar



Re: SSTables not opened on new cluster

2013-05-07 Thread Robert Coli
On Tue, May 7, 2013 at 4:26 AM, Philippe wrote:
> Definitely knew that for major releases, didn't expect it for a minor
> release at all.

This sort of incompatibility is definitely more common between major
versions, but not unheard of within minor series.

=Rob


CQL3 Data Model Question

2013-05-07 Thread Keith Wright
Hi all,

I was hoping you could provide some assistance with a data modeling 
question (my apologies if a similar question has already been posed).  I have 
time based data that I need to store on a per customer (aka app id ) basis so 
that I can easily return it in sorted order by event time.  The data in 
question is being written at high volume (~50K / sec) and I am concerned about 
the cardinality of using either app id or event time as the row key as either 
will likely result in hot spots.  Here is the table definition I am 
considering:

create table organic_events (
event_id UUID,
app_id INT,
event_time TIMESTAMP,
user_id INT,
….
PRIMARY KEY (app_id, event_time, event_id)
)  WITH CLUSTERING ORDER BY (event_time desc);

So that I can query as follows, which will naturally sort the results 
by time descending:

select * from organic_events where app_id = 1234 and event_time <= '2012-01-01' 
and event_time > '2012-01-01';

Anyone have an idea of the best way to accomplish this?  I was considering the 
following:

 *   Making the row key a concatenation of app id and 0-100 using a mod on 
event id to get the value.  When getting data I would just fetch all keys given 
the mods (app_id in (1234_0,1234_1,1234_2, etc).  This would alleviate the 
"hot" key issue but still seems expensive and a little hacky
 *   I tried removing app_id from the primary key all together (using primary 
key of user_id, event_time, event_id) and making app_id a secondary index.  I 
would need to sort by time on the client.  The above query is valid however 
running a query is VERY slow as I believe it needs to fetch every row key that 
matches the index which is quite expensive (I get a timeout in cqlsh).
 *   Create a different column family for each app id (I.e. 
1234_organic_events).  Note that we could easily have 1000s of application ids.

Thanks!


Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
We use PlayOrm to do 60,000 different streams which are all time series and use 
the virtual column families of PlayOrm so they are all in one column family.  
We then partition by time as well.  I don't believe that we really have any 
hotspots from what I can tell.

Dean



Re: CQL3 Data Model Question

2013-05-07 Thread Keith Wright
So in that case I would create a different column family for each app id
and then a "time bucket" key as the row key with perhaps an hour
resolution?  Something like this:

create table 123_organic_events (
   hour timestamp,
   event_id UUID,
   app_id INT,
   event_time TIMESTAMP,
   user_id INT,
   ….
   PRIMARY KEY (hour, event_time, event_id)
)  WITH CLUSTERING ORDER BY (event_time desc);



Is this what others are doing?


On 5/7/13 4:18 PM, "Hiller, Dean"  wrote:

>We use PlayOrm to do 60,000 different streams which are all time series
>and use the virtual column families of PlayOrm so they are all in one
>column family.  We then partition by time as well.  I don't believe that
>we really have any hotspots from what I can tell.
>
>Dean



Re: CQL3 Data Model Question

2013-05-07 Thread Hiller, Dean
PlayOrm is not yet on CQL3, and Cassandra doesn't work well with 10,000+
CFs; we went down that path and Cassandra couldn't cope. So we have one
Cassandra CF holding 60,000 virtual CFs thanks to PlayOrm, plus a few other
CFs.

But yes, we bucket into hour or month or whatever depending on your rates,
and have an exact timestamp as well.  That is one option.  You can
virtualize without PlayOrm by just prefixing the row key with the device id
each time and reversing that on reads, of course.  I am not sure if you
need to partition by time after that, as that depends on the number of
rows per device.

Later,
Dean

On 5/7/13 2:42 PM, "Keith Wright"  wrote:

>So in that case I would create a different column family for each app id
>and then a "time bucket" key as the row key with perhaps an hour
>resolution?  Something like this:
>
>create table 123_organic_events (
>   hour timestamp,
>   event_id UUID,
>   app_id INT,
>   event_time TIMESTAMP,
>   user_id INT,
>   ….
>   PRIMARY KEY (hour, event_time, event_id)
>)  WITH CLUSTERING ORDER BY (event_time desc);
>
>
>
>Is this what others are doing?

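To sketch the first option from this thread (the mod-bucketed row key), here is a minimal illustration; the bucket count and helper names are mine, not from the posts. Writes mod the event id into a small fixed set of buckets so one busy app_id spreads over several partitions, and reads fan out over all buckets (e.g. with a key IN (...) clause) and merge by time on the client:

```python
import uuid

NUM_BUCKETS = 16  # illustrative; tune to write rate and cluster size

def partition_key(app_id: int, event_id: uuid.UUID) -> str:
    # Mod the event id into a bucket so a single hot app_id is
    # spread across NUM_BUCKETS partitions instead of one wide row.
    return f"{app_id}_{event_id.int % NUM_BUCKETS}"

def all_partition_keys(app_id: int) -> list[str]:
    # Reads query every bucket for the app and merge-sort the
    # per-bucket, time-ordered results on the client.
    return [f"{app_id}_{b}" for b in range(NUM_BUCKETS)]

print(all_partition_keys(1234)[:3])
```

The trade-off is exactly the one Keith notes: hot keys are gone, but every read costs NUM_BUCKETS partition lookups.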


backup strategy

2013-05-07 Thread Kanwar Sangha
Hi - If we have RF=2 in a 4-node cluster, how do we ensure that the backup 
taken covers only 1 copy of the data?  In other words, is it possible for us to 
take a backup from only 2 nodes, not all 4, and still have at least 1 copy of 
the data?
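A toy model may help here; it assumes single-token, SimpleStrategy-style placement, where each range is stored on its owning node and the next RF-1 nodes on the ring, and it does not hold with vnodes or rack-aware placement. Under that assumption, with RF = 2 on 4 nodes, snapshotting every other node still captures at least one copy of every range:

```python
N, RF = 4, 2  # 4-node ring, replication factor 2

def replicas(rng: int) -> set[int]:
    # SimpleStrategy-style placement: range `rng` lives on its owning
    # node and the next RF - 1 nodes clockwise around the ring.
    return {(rng + k) % N for k in range(RF)}

backed_up = {0, 2}  # snapshot only every other node
covered = all(replicas(r) & backed_up for r in range(N))
print(covered)  # True: every range has at least one backed-up replica
```

With 1.2-era vnodes each node owns many scattered ranges, so in practice you would snapshot every node unless you can enumerate the ranges explicitly.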

Thanks,
Kanwar





Re: how to monitor nodetool cleanup?

2013-05-07 Thread Michael Morris
Not sure about making things go faster, but you should be able to monitor
it with nodetool compactionstats.

Thanks,

Mike


On Tue, May 7, 2013 at 12:43 PM, Brian Tarbox wrote:

> I'm recovering from a significant failure and so am doing lots of nodetool
> move, removetoken, repair and cleanup.
>
> For most of these I can do "nodetool netstats" to monitor progress but it
> doesn't show anything for cleanup...how can I monitor the progress of
> cleanup?  On a related note: I'm able to stop all client access to the
> cluster until things are happy again...is there anything I can do to make
> move/repair/cleanup go faster?
>
> FWIW my problems came from trying to move nodes between EC2 availability
> zones...which led to
> 1) killing a node and recreating it in another availability zone
> 2) new node had different local ip address so cluster thought old node was
> just down and we had a new node...
>
> I did the removetoken on the dead node and gave the new node
> oldToken-1...but things still got weird and I ended up spending a couple of
> days cleaning up (which seems odd for only about 300 gig total data).
>
> Anyway, any suggestions for monitoring / speeding up cleanup would be
> appreciated.
>
> Brian Tarbox
>
>