how to effectively drop table

2016-11-24 Thread joseph gao
Hi all,
I had a very bad system design before, which left about 1 tables in my
Cassandra cluster, and the cluster is very unstable. Now I want to redesign
the system, but dropping the old tables is painful: a single drop can take
30 seconds or even longer (170s) via cqlsh or a driver client.
Is there an efficient way to drop these unused tables? Thanks very much.

-- 
--
Joseph Gao
PhoneNum:18136950721
QQ: 409343351
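
A minimal sketch of a serial cleanup for this situation (the keyspace old_ks
and the file tables.txt, listing one unused table per line, are placeholders):
--request-timeout raises the cqlsh client-side timeout so a slow DROP
(30-170s) doesn't abort the loop.

#!/bin/sh
# Drop each unused table one at a time with a generous client timeout.
node_ip=127.0.0.1   # coordinator to connect to (placeholder)
while read -r t; do
    echo "dropping old_ks.$t"
    cqlsh "$node_ip" --request-timeout 3600 -e "DROP TABLE IF EXISTS old_ks.$t;"
done < tables.txt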


Re: how to effectively drop table

2016-11-24 Thread Vladimir Yudovin
Hi,



Is dropping the whole keyspace an option?



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 03:00:40 -0500, joseph gao wrote:

Hi all,

I had a very bad system design before, which left about 1 tables in my
Cassandra cluster, and the cluster is very unstable. Now I want to redesign
the system, but dropping the old tables is painful: a single drop can take
30 seconds or even longer (170s) via cqlsh or a driver client.

Is there an efficient way to drop these unused tables? Thanks very much.



-- 

--

Joseph Gao

PhoneNum:18136950721

QQ: 409343351

OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread techpyaasa .
Hi all,

The following exception is sometimes thrown even though all nodes are up.


SEVERE : This error occurs if there are not enough Cassandra nodes for
the required QUORUM to persist data. Please make sure enough nodes are up
at this point of time. Error Count is at 150 Exception
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
tried for query failed (tried: /192.168.198.168:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying
to acquire available connection (you may want to increase the driver number
of per-host connections)), /192.168.198.169:9042
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying
to acquire available connection (you may want to increase the driver number
of per-host connections)), /192.168.198.75:9042
(com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042]
Operation timed out)) at
com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
at com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
at

We are using C* 2.0.17 and the DataStax Java driver
cassandra-driver-core-2.1.8.jar.

In cassandra.yaml the following are set:
rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4

This exception is thrown for both READ and WRITE queries. Can someone please
help me debug this?


Thanks
Techpyaasa


Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread Jeff Jirsa
Did you already try doing what the error message indicates you should try? 

 

Is there anything in the logs on the 3 cassandra boxes listed (192.168.198.168, 
192.168.198.169, 192.168.198.75) that indicates they had problems at that time, 
perhaps GCInspector or StatusLogger messages about pauses, or any drops in 
network utilization to indicate a networking problem? 
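
A sketch of the log check suggested above, assuming logs live at
/var/log/cassandra/system.log (the default package location) and passwordless
SSH to the three nodes; adjust paths to your layout.

#!/bin/sh
# Look for GC pauses and overload indicators around the failure window.
for ip in 192.168.198.168 192.168.198.169 192.168.198.75; do
    echo "=== $ip ==="
    ssh "$ip" "grep -E 'GCInspector|StatusLogger' /var/log/cassandra/system.log | tail -20"
done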

 

 

 

From: "techpyaasa ." 
Reply-To: "user@cassandra.apache.org" 
Date: Thursday, November 24, 2016 at 1:43 AM
To: "user@cassandra.apache.org" 
Subject: OperationTimedOutException (NoHostAvailableException)

 

Hi all,

The following exception is sometimes thrown even though all nodes are up.

 SEVERE : This error occurs if there are not enough Cassandra nodes for the 
required QUORUM to persist data. Please make sure enough nodes are up at this 
point of time. Error Count is at 150 Exception 
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried 
for query failed (tried: /192.168.198.168:9042 
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying to 
acquire available connection (you may want to increase the driver number of 
per-host connections)), /192.168.198.169:9042 
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying to 
acquire available connection (you may want to increase the driver number of 
per-host connections)), /192.168.198.75:9042 
(com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042] 
Operation timed out)) at 
com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
 at com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37) at 
com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
 at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52) 
at 


We are using C* 2.0.17 and the DataStax Java driver
cassandra-driver-core-2.1.8.jar.


In cassandra.yaml the following are set:
rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4

This exception is thrown for both READ and WRITE queries. Can someone please
help me debug this?


Thanks 
Techpyaasa





Re: how to effectively drop table

2016-11-24 Thread joseph gao
Hi Vladimir,
I have to do the whole thing in two steps. Dropping the whole keyspace works
in step 2, but in step 1 I have to drop 2000 tables. All I can do is wait, or
skip step 1 and do all the work in step 2.
Anyway, thanks very much!

2016-11-24 16:25 GMT+08:00 Vladimir Yudovin :

> Hi,
>
> Is dropping the whole keyspace an option?
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 03:00:40 -0500, joseph gao wrote:
>
> Hi all,
> I had a very bad system design before, which left about 1 tables in my
> Cassandra cluster, and the cluster is very unstable. Now I want to redesign
> the system, but dropping the old tables is painful: a single drop can take
> 30 seconds or even longer (170s) via cqlsh or a driver client.
> Is there an efficient way to drop these unused tables? Thanks very much.
>
> --
> --
> Joseph Gao
> PhoneNum:18136950721
> QQ: 409343351
>
>


-- 
--
Joseph Gao
PhoneNum:15210513582
QQ: 409343351


Re: how to effectively drop table

2016-11-24 Thread Vladimir Yudovin
Actually, you don't need to drop tables before dropping the keyspace;
dropping the keyspace drops all of its tables.
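
A one-line sketch of that, assuming the old tables all live in a keyspace
named old_ks (a placeholder), with the same long client timeout as in the
per-table loop earlier in this thread:

cqlsh "$node_ip" --request-timeout 3600 -e "DROP KEYSPACE IF EXISTS old_ks;"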



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 05:38:36 -0500, joseph gao wrote:




Hi Vladimir,

I have to do the whole thing in two steps. Dropping the whole keyspace works
in step 2, but in step 1 I have to drop 2000 tables. All I can do is wait, or
skip step 1 and do all the work in step 2.

Anyway, thanks very much!




2016-11-24 16:25 GMT+08:00 Vladimir Yudovin :



Hi,



Is dropping the whole keyspace an option?



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 03:00:40 -0500, joseph gao wrote:




Hi all,

I had a very bad system design before, which left about 1 tables in my
Cassandra cluster, and the cluster is very unstable. Now I want to redesign
the system, but dropping the old tables is painful: a single drop can take
30 seconds or even longer (170s) via cqlsh or a driver client.

Is there an efficient way to drop these unused tables? Thanks very much.



-- 

--

Joseph Gao

PhoneNum:18136950721

QQ: 409343351

-- 

--

Joseph Gao

PhoneNum:15210513582

QQ: 409343351

Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread Vladimir Yudovin
>rpc_address: 0.0.0.0  , broadcast_address: 1.2.3.4

Did you try setting rpc_address to the node's IP instead of 0.0.0.0?
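
A minimal sketch of that change, assuming the config lives at
/etc/cassandra/cassandra.yaml and the init script used elsewhere in this
digest; run it on each node with that node's own IP (192.168.198.168 here is
just the first node from the error above).

sudo sed -i 's/^rpc_address: 0.0.0.0/rpc_address: 192.168.198.168/' /etc/cassandra/cassandra.yaml
sudo /etc/init.d/cassandra restart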



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 04:50:08 -0500, Jeff Jirsa wrote:




Did you already try doing what the error message indicates you should try?

 

Is there anything in the logs on the 3 cassandra boxes listed (192.168.198.168, 
192.168.198.169, 192.168.198.75) that indicates they had problems at that time, 
perhaps GCInspector or StatusLogger messages about pauses, or any drops in 
network utilization to indicate a networking problem?

 



 

From: "techpyaasa ." 
Reply-To: "user@cassandra.apache.org" 
Date: Thursday, November 24, 2016 at 1:43 AM
To: "user@cassandra.apache.org" 
Subject: OperationTimedOutException (NoHostAvailableException)

 


Hi all,

The following exception is sometimes thrown even though all nodes are up.

 SEVERE : This error occurs if there are not enough Cassandra nodes for the 
required QUORUM to persist data. Please make sure enough nodes are up at this 
point of time. Error Count is at 150 Exception 
com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried 
for query failed (tried: /192.168.198.168:9042 
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying to 
acquire available connection (you may want to increase the driver number of 
per-host connections)), /192.168.198.169:9042 
(com.datastax.driver.core.exceptions.DriverException: Timeout while trying to 
acquire available connection (you may want to increase the driver number of 
per-host connections)), /192.168.198.75:9042 
(com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042] 
Operation timed out)) at 
com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
 at 
com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
 at 
com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
 at com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52) 
at 


We are using C* 2.0.17 and the DataStax Java driver cassandra-driver-core-2.1.8.jar.

In cassandra.yaml the following are set:
rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4

This exception is thrown for both READ and WRITE queries. Can someone please
help me debug this?


Thanks 
Techpyaasa

Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread techpyaasa .
I tried, that didn't work out.. :(

On Thu, Nov 24, 2016 at 4:49 PM, Vladimir Yudovin 
wrote:

> >rpc_address: 0.0.0.0  , broadcast_address: 1.2.3.4
> Did you try setting rpc_address to the node's IP instead of 0.0.0.0?
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 04:50:08 -0500, Jeff Jirsa wrote:
>
> Did you already try doing what the error message indicates you should try?
>
>
>
> Is there anything in the logs on the 3 cassandra boxes listed
> (192.168.198.168, 192.168.198.169, 192.168.198.75) that indicates they had
> problems at that time, perhaps GCInspector or StatusLogger messages about
> pauses, or any drops in network utilization to indicate a networking
> problem?
>
>
>
>
>
>
> *From: *"techpyaasa ." 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Thursday, November 24, 2016 at 1:43 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *OperationTimedOutException (NoHostAvailableException)
>
>
>
> Hi all,
>
> The following exception is sometimes thrown even though all nodes are up.
>
>
> SEVERE : This error occurs if there are not enough Cassandra nodes for
> the required QUORUM to persist data. Please make sure enough nodes are up
> at this point of time. Error Count is at 150 Exception
> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
> tried for query failed (tried: /192.168.198.168:9042
> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
> to acquire available connection (you may want to increase the driver number
> of per-host connections)), /192.168.198.169:9042
> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
> to acquire available connection (you may want to increase the driver number
> of per-host connections)), /192.168.198.75:9042
> (com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042]
> Operation timed out)) at
> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
> at
> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
> at
> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
> at
> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
> at
>
> We are using C* 2.0.17 and the DataStax Java driver
> cassandra-driver-core-2.1.8.jar.
>
> In cassandra.yaml the following are set:
> rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4
>
> This exception is thrown for both READ and WRITE queries. Can someone please
> help me debug this?
>
>
> Thanks
> Techpyaasa
>
>
>


Re: OperationTimedOutException (NoHostAvailableException)

2016-11-24 Thread Shalom Sagges
Do you get this error on specific column families or across the whole
environment?



Shalom Sagges
DBA
T: +972-74-700-4035
 
 We Create Meaningful Connections



On Thu, Nov 24, 2016 at 1:37 PM, techpyaasa .  wrote:

> I tried, that didn't work out.. :(
>
> On Thu, Nov 24, 2016 at 4:49 PM, Vladimir Yudovin 
> wrote:
>
>> >rpc_address: 0.0.0.0  , broadcast_address: 1.2.3.4
>> Did you try setting rpc_address to the node's IP instead of 0.0.0.0?
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone  - Cloud Cassandra Hosting*
>>
>>
>> On Thu, 24 Nov 2016 04:50:08 -0500, Jeff Jirsa wrote:
>>
>> Did you already try doing what the error message indicates you should try?
>>
>>
>>
>> Is there anything in the logs on the 3 cassandra boxes listed
>> (192.168.198.168, 192.168.198.169, 192.168.198.75) that indicates they had
>> problems at that time, perhaps GCInspector or StatusLogger messages about
>> pauses, or any drops in network utilization to indicate a networking
>> problem?
>>
>>
>>
>>
>>
>>
>> *From: *"techpyaasa ." 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Thursday, November 24, 2016 at 1:43 AM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *OperationTimedOutException (NoHostAvailableException)
>>
>>
>>
>> Hi all,
>>
>> The following exception is sometimes thrown even though all nodes are up.
>>
>>
>> SEVERE : This error occurs if there are not enough Cassandra nodes for
>> the required QUORUM to persist data. Please make sure enough nodes are up
>> at this point of time. Error Count is at 150 Exception
>> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
>> tried for query failed (tried: /192.168.198.168:9042
>> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
>> to acquire available connection (you may want to increase the driver number
>> of per-host connections)), /192.168.198.169:9042
>> (com.datastax.driver.core.exceptions.DriverException: Timeout while trying
>> to acquire available connection (you may want to increase the driver number
>> of per-host connections)), /192.168.198.75:9042
>> (com.datastax.driver.core.OperationTimedOutException: [/192.168.198.75:9042]
>> Operation timed out)) at
>> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:84)
>> at
>> com.datastax.driver.core.DriverThrowables.propagateCause(DriverThrowables.java:37)
>> at
>> com.datastax.driver.core.DefaultResultSetFuture.getUninterruptibly(DefaultResultSetFuture.java:214)
>> at
>> com.datastax.driver.core.AbstractSession.execute(AbstractSession.java:52)
>> at
>>
>> We are using C* 2.0.17 and the DataStax Java driver
>> cassandra-driver-core-2.1.8.jar.
>>
>> In cassandra.yaml the following are set:
>> rpc_address: 0.0.0.0, broadcast_address: 1.2.3.4
>>
>> This exception is thrown for both READ and WRITE queries. Can someone please
>> help me debug this?
>>
>>
>> Thanks
>> Techpyaasa
>>
>>
>>
>


Re: Extremely large ValidationExecutor.MaxPoolSize in Cassandra 2.1.13

2016-11-24 Thread Paulo Motta
This is not a problem per se. It's just the maximum number of concurrent
threads allowed in the validation pool, which is Integer.MAX_VALUE; it bounds
how many simultaneous validations the node will handle. It may look too big,
but you will almost certainly kill the node long before getting anywhere
close to that number.

Incremental repair should limit the number of concurrent repairs/validations
on the same table anyway, so this shouldn't be a problem. Nevertheless, if
you're worried, you can still set it at runtime via the
CompactionManager.setMaximumValidatorThreads MBean operation.
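
A sketch of doing that with the standalone jmxterm client against JMX on the
default port 7199. The bean and attribute names below are assumptions derived
from the MBean mentioned above, and the jar name depends on the jmxterm
version you download; verify both before relying on this.

# Cap concurrent validation threads at 4 on the local node.
echo "set -b org.apache.cassandra.db:type=CompactionManager MaximumValidatorThreads 4" \
    | java -jar jmxterm-uber.jar -l localhost:7199 -n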

2016-11-24 4:20 GMT-02:00 Thomas Julian :

> Hello,
>
> We are using Cassandra 2.1.13 in a production environment. I was verifying
> Threadpools.MaxPoolSize to check for under/over-allocated pool counts.
> In our environment ValidationExecutor.MaxPoolSize is 2.147 billion.
> (The screenshot referenced here is not included in the archive.)
>
> Is this a known issue? Is it safe to ignore? Is there a problem with the
> allocation or with the metric calculation itself? Can we fix this with some
> configuration changes/patches?
>
> Any help is appreciated.
>
> Thanks,
> Julian.
>
>
>


system.NodeIdInfo - leftover from Cassandra 0.8.3 ?

2016-11-24 Thread cassandra

I've half-inherited Cassandra administration on a two-DC cluster with 8
nodes (4 in each DC).

We've just successfully completed the upgrade to 2.1.6.

On completion of upgradesstables, the only remaining "jb" sstable was in
the system keyspace, for NodeIdInfo. I notice that this table is no longer
included in the schema.

I presume the NodeIdInfo sstable can be safely deleted. Is this a safe
presumption?
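
If you do remove it, a cautious sketch is to move the files aside rather than
deleting them, with the node stopped (the data path assumes the default
layout, and the directory name may carry a table-id suffix, hence the glob):

sudo /etc/init.d/cassandra stop
sudo mkdir -p /var/lib/cassandra/nodeidinfo-backup
sudo mv /var/lib/cassandra/data/system/NodeIdInfo*/*-jb-* /var/lib/cassandra/nodeidinfo-backup/
sudo /etc/init.d/cassandra start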

Thanks
.BN


Max Capacity per Node

2016-11-24 Thread Shalom Sagges
Hi Everyone,

I have a 24 node cluster (12 in each DC) with a capacity of 3.3 TB per node
for the data directory.
I'd like to increase the capacity per node.
Can anyone tell me the maximum recommended capacity a node can use?
The disks we use are HDD, not SSD.

Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035
 
 We Create Meaningful Connections




RE: Max Capacity per Node

2016-11-24 Thread Leleu Eric
Hi,

I'm not a Cassandra expert, but according to this reference:
http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningHardware.html#planningHardware__capacity-per-node
you have already reached (even exceeded) the recommended limit for HDD.

As usual, I guess the maximum limit depends on your use case, data model, and
workload…


Regards,
Eric

From: Shalom Sagges [mailto:shal...@liveperson.com]
Sent: Thursday, 24 November 2016 14:48
To: user@cassandra.apache.org
Subject: Max Capacity per Node

Hi Everyone,

I have a 24 node cluster (12 in each DC) with a capacity of 3.3 TB per node for 
the data directory.
I'd like to increase the capacity per node.
Can anyone tell me the maximum recommended capacity a node can use?
The disks we use are HDD, not SSD.

Thanks!

Shalom Sagges
DBA
T: +972-74-700-4035

We Create Meaningful Connections





generate different sizes of request from single client

2016-11-24 Thread Vikas Jaiman
Hi all,

I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
from a single client for benchmarking Cassandra. Is there any tool for this
type of scenario?

Vikas


Re: generate different sizes of request from single client

2016-11-24 Thread Vladimir Yudovin
You can use the cassandra-stress tool.

It has options to set different load patterns.



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 13:27:59 -0500, Vikas Jaiman wrote:




Hi all,

I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
from a single client for benchmarking Cassandra. Is there any tool for this
type of scenario?



Vikas

Re: generate different sizes of request from single client

2016-11-24 Thread DuyHai Doan
Gatling + CQL plugin is really cool too

On Thu, Nov 24, 2016 at 7:46 PM, Vladimir Yudovin 
wrote:

> You can use the cassandra-stress tool.
> 
> It has options to set different load patterns.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 13:27:59 -0500, Vikas Jaiman wrote:
>
> Hi all,
> I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
> from a single client for benchmarking Cassandra. Is there any tool for this
> type of scenario?
>
> Vikas
>
>
>


Re: generate different sizes of request from single client

2016-11-24 Thread Vikas Jaiman
Hi Vladimir,

It has an option for a mixed read/write workload, but it doesn't have an
option for mixed request sizes.

Thanks,
Vikas

On Thu, Nov 24, 2016 at 7:46 PM, Vladimir Yudovin 
wrote:

> You can use the cassandra-stress tool.
> 
> It has options to set different load patterns.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 13:27:59 -0500, Vikas Jaiman wrote:
>
> Hi all,
> I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
> from a single client for benchmarking Cassandra. Is there any tool for this
> type of scenario?
>
> Vikas
>
>


Re: failure node rejoin

2016-11-24 Thread Yuji Ito
Thanks Ben,

I've reported this issue at
https://issues.apache.org/jira/browse/CASSANDRA-12955.

I'll tell you if I find anything about the data loss issue.

Regards,
yuji


On Thu, Nov 24, 2016 at 1:37 PM, Ben Slater 
wrote:

> You could certainly log a JIRA for the “failure node rejoin” issue
> (https://issues.apache.org/jira/browse/cassandra). It sounds like
> unexpected behaviour to me. However, I’m not sure it will be viewed as a
> high priority to fix given there is a clear operational work-around.
>
> Cheers
> Ben
>
> On Thu, 24 Nov 2016 at 15:14 Yuji Ito  wrote:
>
>> Hi Ben,
>>
>> I continue to investigate the data loss issue.
>> I'm investigating logs and source code and try to reproduce the data loss
>> issue with a simple test.
>> I also try my destructive test with DROP instead of TRUNCATE.
>>
>> BTW, I want to discuss the issue of the title "failure node rejoin" again.
>>
>> Will this issue be fixed? Other nodes should refuse this unexpected
>> rejoin.
>> Or should I be more careful to add failure nodes to the existing cluster?
>>
>> Thanks,
>> yuji
>>
>>
>> On Fri, Nov 11, 2016 at 1:00 PM, Ben Slater 
>> wrote:
>>
>> From a quick look I couldn’t find any defects other than the ones you’ve
>> found that seem potentially relevant to your issue (if any one else on the
>> list knows of one please chime in). Maybe the next step, if you haven’t
>> done so already, is to check your Cassandra logs for any signs of issues
>> (ie WARNING or ERROR logs) in the failing case.
>>
>> Cheers
>> Ben
>>
>> On Fri, 11 Nov 2016 at 13:07 Yuji Ito  wrote:
>>
>> Thanks Ben,
>>
>> I tried 2.2.8 and could reproduce the problem.
>> So, I'm investigating some bug fixes of repair and commitlog between
>> 2.2.8 and 3.0.9.
>>
>> - CASSANDRA-12508: "nodetool repair returns status code 0 for some errors"
>>
>> - CASSANDRA-12436: "Under some races commit log may incorrectly think it
>> has unflushed data"
>>   - related to CASSANDRA-9669, CASSANDRA-11828 (the fix of 2.2 is
>> different from that of 3.0?)
>>
>> Do you know other bug fixes related to commitlog?
>>
>> Regards
>> yuji
>>
>> On Wed, Nov 9, 2016 at 11:34 AM, Ben Slater 
>> wrote:
>>
>> There have been a few commit log bugs around in the last couple of months
>> so perhaps you’ve hit something that was fixed recently. Would be
>> interesting to know the problem is still occurring in 2.2.8.
>>
>> I suspect what is happening is that when you do your initial read
>> (without flush) to check the number of rows, the data is in memtables and
>> theoretically the commitlogs but not sstables. With the forced stop the
>> memtables are lost and Cassandra should read the commitlog from disk at
>> startup to reconstruct the memtables. However, it looks like that didn’t
>> happen for some (bad) reason.
>>
>> Good news that 3.0.9 fixes the problem so up to you if you want to
>> investigate further and see if you can narrow it down to file a JIRA
>> (although the first step of that would be trying 2.2.9 to make sure it’s
>> not already fixed there).
>>
>> Cheers
>> Ben
>>
>> On Wed, 9 Nov 2016 at 12:56 Yuji Ito  wrote:
>>
>> I tried C* 3.0.9 instead of 2.2.
>> The data loss problem hasn't happened so far (without `nodetool flush`).
>>
>> Thanks
>>
>> On Fri, Nov 4, 2016 at 3:50 PM, Yuji Ito  wrote:
>>
>> Thanks Ben,
>>
>> When I added `nodetool flush` on all nodes after step 2, the problem
>> didn't happen.
>> Did replay from old commit logs delete rows?
>>
>> Perhaps, the flush operation just detected that some nodes were down in
>> step 2 (just after truncating tables).
>> (Insertion and check in step 2 would succeed if one node was down because
>> the consistency level was SERIAL.
>> If the flush failed on more than one node, the test would retry step 2.)
>> However, if so, the problem would happen without deleting Cassandra data.
>>
>> Regards,
>> yuji
>>
>>
>> On Mon, Oct 24, 2016 at 8:37 AM, Ben Slater 
>> wrote:
>>
>> Definitely sounds to me like something is not working as expected but I
>> don’t really have any idea what would cause that (other than the fairly
>> extreme failure scenario). A couple of things I can think of to try to
>> narrow it down:
>> 1) Run nodetool flush on all nodes after step 2 - that will make sure all
>> data is written to sstables rather than relying on commit logs
>> 2) Run the test with consistency level quorum rather than serial
>> (shouldn’t be any different but quorum is more widely used so maybe there
>> is a bug that’s specific to serial)
>>
>> Cheers
>> Ben
>>
>> On Mon, 24 Oct 2016 at 10:29 Yuji Ito  wrote:
>>
>> Hi Ben,
>>
>> The test without killing nodes has been working well without data loss.
>> I've repeated my test about 200 times after removing data and
>> rebuild/repair.
>>
>> Regards,
>>
>>
>> On Fri, Oct 21, 2016 at 3:14 PM, Yuji Ito  wrote:
>>
>> > Just to confirm, are you saying:
>> > a) after operation 2, you select all and get 10

Re: generate different sizes of request from single client

2016-11-24 Thread Vladimir Yudovin
>doesn't has any option for mixed request size

As a workaround you can run two parallel tests, each with its own size.
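
A minimal sketch of that workaround with cassandra-stress (node address,
operation counts, and column sizes are illustrative; -col size=FIXED(...)
fixes the value size for each run):

#!/bin/sh
# Two stress clients in parallel: one writing ~1 KB values, one ~10 KB.
cassandra-stress write n=1000000 -col size='FIXED(1024)' -node 192.168.198.168 &
cassandra-stress write n=1000000 -col size='FIXED(10240)' -node 192.168.198.168 &
wait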



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 16:54:08 -0500, Vikas Jaiman wrote:




Hi Vladimir,



It has an option for a mixed read/write workload, but it doesn't have an
option for mixed request sizes.



Thanks,

Vikas  



On Thu, Nov 24, 2016 at 7:46 PM, Vladimir Yudovin  
wrote:



You can use the cassandra-stress tool.

It has options to set different load patterns.



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






On Thu, 24 Nov 2016 13:27:59 -0500, Vikas Jaiman wrote:




Hi all,


I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
from a single client for benchmarking Cassandra. Is there any tool for this
type of scenario?



Vikas

Re: generate different sizes of request from single client

2016-11-24 Thread Ben Slater
If targeting two different tables for the different sizes works, then I've
submitted a patch for cassandra-stress that allows you to do that:
https://issues.apache.org/jira/browse/CASSANDRA-8780

It would be nice to see someone else test it if you have the appetite to
build it and try it out.

Cheers
Ben

On Fri, 25 Nov 2016 at 16:43 Vladimir Yudovin  wrote:

> >doesn't has any option for mixed request size
> As a workaround you can run two parallel tests, each with its own size.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 16:54:08 -0500, Vikas Jaiman wrote:
>
> Hi Vladimir,
>
> It has an option for a mixed read/write workload, but it doesn't have an
> option for mixed request sizes.
>
> Thanks,
> Vikas
>
> On Thu, Nov 24, 2016 at 7:46 PM, Vladimir Yudovin 
> wrote:
>
>
> You can use the cassandra-stress tool.
> 
> It has options to set different load patterns.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
> On Thu, 24 Nov 2016 13:27:59 -0500, Vikas Jaiman wrote:
>
> Hi all,
> I want to generate two different sizes of requests (let's say 1 KB and 10 KB)
> from a single client for benchmarking Cassandra. Is there any tool for this
> type of scenario?
>
> Vikas
>
>
>


Re: repair -pr in crontab

2016-11-24 Thread wxn...@zjqunshuo.com
Hi Artur,
When I asked similar questions, someone pointed me to the links below, and
they are helpful.

See http://www.datastax.com/dev/blog/repair-in-cassandra
https://lostechies.com/ryansvihla/2015/09/25/cassandras-repair-should-be-called-required-maintenance/
https://cassandra-zone.com/understanding-repairs/

Cheers,
-Simon

From: Artur Siekielski
Date: 2016-11-10 04:22
To: user
Subject: repair -pr in crontab
Hi,
the docs give me the impression that repair should be run manually
and not put in crontab by default. Should each repair run be monitored
manually?
 
If I would like to put "repair -pr" in crontab for each node, with a few 
hour difference between the runs, are there any risks with such setup? 
Specifically:
- if two or more "repair -pr" runs on different nodes are running at the 
same time, can it cause any problems besides high load?
- can "repair -pr" be run simultaneously on all nodes at the same time?
- I'm using the default gc_grace_period of 10 days. Are there any
reasons to run repair more often than once per 10 days, in case a
previous repair fails?
- how can I monitor the start and finish times of repairs, and whether the
runs were successful? Is the "nodetool repair" command guaranteed to exit
only after the repair is finished, and does it return a status code to the
shell?
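
A minimal sketch of such a cron setup (paths and schedule are illustrative).
Note CASSANDRA-12508, mentioned elsewhere in this digest: on some versions
"nodetool repair" returns exit status 0 even for failed repairs, so check the
logs as well as the status.

#!/bin/sh
# /usr/local/bin/repair-pr.sh - wrap repair -pr so start/finish times and
# the exit status end up in a log that can be audited later.
echo "repair -pr started at $(date)"
nodetool repair -pr
status=$?
echo "repair -pr finished at $(date) with exit status $status"
exit $status

# crontab entry (stagger the hour/day per node):
# 0 2 * * 0  /usr/local/bin/repair-pr.sh >> /var/log/cassandra/repair-pr.log 2>&1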


Re: repair -pr in crontab

2016-11-24 Thread Benjamin Roth
I recommend using cassandra-reaper.
Using crons without proper monitoring will most likely not work as
expected.
There are some Reaper forks on GitHub; you have to check which one works
with your Cassandra version. The original one from Spotify only works on
2.x, not on 3.x.

On 25.11.2016 07:31, "wxn...@zjqunshuo.com" wrote:

> Hi Artur,
> When I asked similar questions, someone pointed me to the links below,
> and they are helpful.
>
> See http://www.datastax.com/dev/blog/repair-in-cassandra
> https://lostechies.com/ryansvihla/2015/09/25/cassandras-repair-should-be-
> called-required-maintenance/
> https://cassandra-zone.com/understanding-repairs/
>
> Cheers,
> -Simon
>
> *From:* Artur Siekielski 
> *Date:* 2016-11-10 04:22
> *To:* user 
> *Subject:* repair -pr in crontab
> Hi,
> the docs give me the impression that repair should be run manually
> and not put in crontab by default. Should each repair run be monitored
> manually?
>
> If I would like to put "repair -pr" in crontab for each node, with a few
> hour difference between the runs, are there any risks with such setup?
> Specifically:
> - if two or more "repair -pr" runs on different nodes are running at the
> same time, can it cause any problems besides high load?
> - can "repair -pr" be run simultaneously on all nodes at the same time?
> - I'm using the default gc_grace_period of 10 days. Are there any
> reasons to run repair more often than once per 10 days, in case a
> previous repair fails?
> - how can I monitor the start and finish times of repairs, and whether the
> runs were successful? Is the "nodetool repair" command guaranteed to exit
> only after the repair is finished, and does it return a status code to
> the shell?
>
>


Re: Does recovery continue after truncating a table?

2016-11-24 Thread Yuji Ito
Hi all,

I revised the script to reproduce the issue; with it, the issue happens more
frequently than before. Killing another node has been added to the previous
script.

 [script] 
#!/bin/sh

node1_ip=
node2_ip=
node3_ip=
node2_user=
node3_user=
rows=1

echo "consistency quorum;" > init_data.cql
for key in $(seq 0 $(expr $rows - 1))
do
echo "insert into testdb.testtbl (key, val) values($key, ) IF NOT
EXISTS;" >> init_data.cql
done

while true
do
echo "truncate the table"
cqlsh $node1_ip -e "truncate table testdb.testtbl" > /dev/null 2>&1
if [ $? -ne 0 ]; then
echo "truncating failed"
continue
else
break
fi
done

echo "kill C* process on node3"
pdsh -l $node3_user -R ssh -w $node3_ip "ps auxww | grep CassandraDaemon |
awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"

echo "insert $rows rows"
cqlsh $node1_ip -f init_data.cql > insert_log 2>&1

echo "restart C* process on node3"
pdsh -l $node3_user -R ssh -w $node3_ip "sudo /etc/init.d/cassandra start"

while true
do
echo "truncate the table again"
cqlsh $node1_ip -e "truncate table testdb.testtbl"
if [ $? -ne 0 ]; then
echo "truncating failed"
continue
else
echo "truncation succeeded!"
break
fi
done

echo "kill C* process on node2"
pdsh -l $node2_user -R ssh -w $node2_ip "ps auxww | grep CassandraDaemon |
awk '{if (\$13 ~ /cassand/) print \$2}' | xargs sudo kill -9"

cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select
count(*) from testdb.testtbl;"
sleep 10
cqlsh $node1_ip --request-timeout 3600 -e "consistency serial; select
count(*) from testdb.testtbl;"

echo "restart C* process on node2"
pdsh -l $node2_user -R ssh -w $node2_ip "sudo /etc/init.d/cassandra start"


Thanks,
yuji


On Fri, Nov 18, 2016 at 7:52 PM, Yuji Ito  wrote:

> I investigated source code and logs of killed node.
> I guess that unexpected writes are executed when truncation is being
> executed.
>
> Some writes were executed after flush (the first flush) in truncation and
> these writes could be read.
> These writes were requested as MUTATION by another node for hinted handoff.
> Their data was stored to a new memtable and flushed (the second flush) to
> a new SSTable before snapshot in truncation.
> So, the truncation discarded only old SSTables, not the new SSTable.
> That's because the ReplayPosition used for discarding SSTables was
> that of the first flush.
>
> I copied some parts of log as below.
> "##" line is my comment.
> The point is that the ReplayPosition is moved forward by the second flush.
> It means some writes are executed after the first flush.
>
> == log ==
> ## started truncation
> TRACE [SharedPool-Worker-16] 2016-11-17 08:36:04,612
> ColumnFamilyStore.java:2790 - truncating testtbl
> ## the first flush started before truncation
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:04,612
> ColumnFamilyStore.java:952 - Enqueuing flush of testtbl: 591360 (0%)
> on-heap, 0 (0%) off-heap
> INFO  [MemtableFlushWriter:1] 2016-11-17 08:36:04,613 Memtable.java:352 -
> Writing Memtable-testtbl@1863835308(42.625KiB serialized bytes, 2816 ops,
> 0%/0% of on/off-heap limit)
> ...
> DEBUG [MemtableFlushWriter:1] 2016-11-17 08:36:04,973 Memtable.java:386 -
> Completed flushing /var/lib/cassandra/data/testdb
> /testtbl-562848f0a55611e68b1451065d58fdfb/tmp-lb-1-big-Data.db
> (17.651KiB) for commitlog position ReplayPosition(segmentId=1479371760395,
> position=315867)
> ## this ReplayPosition was used for discarding SSTables
> ...
> TRACE [MemtablePostFlush:1] 2016-11-17 08:36:05,022 CommitLog.java:298 -
> discard completed log segments for ReplayPosition(segmentId=1479371760395,
> position=315867), table 562848f0-a556-11e6-8b14-51065d58fdfb
> ## end of the first flush
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:05,028
> ColumnFamilyStore.java:2823 - Discarding sstable data for truncated CF +
> indexes
> ## the second flush before snapshot
> DEBUG [SharedPool-Worker-16] 2016-11-17 08:36:05,028
> ColumnFamilyStore.java:952 - Enqueuing flush of testtbl: 698880 (0%)
> on-heap, 0 (0%) off-heap
> INFO  [MemtableFlushWriter:2] 2016-11-17 08:36:05,029 Memtable.java:352 -
> Writing Memtable-testtbl@1186728207(50.375KiB serialized bytes, 3328 ops,
> 0%/0% of on/off-heap limit)
> ...
> DEBUG [MemtableFlushWriter:2] 2016-11-17 08:36:05,258 Memtable.java:386 -
> Completed flushing /var/lib/cassandra/data/testdb
> /testtbl-562848f0a55611e68b1451065d58fdfb/tmp-lb-2-big-Data.db
> (17.696KiB) for commitlog position ReplayPosition(segmentId=1479371760395,
> position=486627)
> ...
> TRACE [MemtablePostFlush:1] 2016-11-17 08:36:05,289 CommitLog.java:298 -
> discard completed log segments for ReplayPosition(segmentId=1479371760395,
> position=486627), table 562848f0-a556-11e6-8b14-51065d58fdfb
> ## end of the second flush: position was moved
> ...
> ## only old SSTable was deleted because this SSTable was older than
> ReplayPosition(segmentId=1479371760395, position=315867)
> TRACE [N

Re: repair -pr in crontab

2016-11-24 Thread Alexander Dejanovski
Hi,

we maintain a hard fork of Reaper that works with all versions of Cassandra
up to 3.0.x : https://github.com/thelastpickle/cassandra-reaper
Just to save you some time digging into all the forks that could exist.

Cheers,

On Fri, Nov 25, 2016 at 7:37 AM Benjamin Roth 
wrote:

> I recommend using cassandra-reaper.
> Using crons without proper monitoring will most likely not work as
> expected.
> There are some Reaper forks on GitHub; you have to check which one works
> with your Cassandra version. The original one from Spotify only works on
> 2.x, not on 3.x.
>
> On 25.11.2016 07:31, "wxn...@zjqunshuo.com" wrote:
>
> Hi Artur,
> When I asked similar questions, someone pointed me to the links below,
> and they are helpful.
>
> See http://www.datastax.com/dev/blog/repair-in-cassandra
>
> https://lostechies.com/ryansvihla/2015/09/25/cassandras-repair-should-be-called-required-maintenance/
> https://cassandra-zone.com/understanding-repairs/
>
> Cheers,
> -Simon
>
> *From:* Artur Siekielski 
> *Date:* 2016-11-10 04:22
> *To:* user 
> *Subject:* repair -pr in crontab
> Hi,
> the docs give me the impression that repair should be run manually
> and not put in crontab by default. Should each repair run be monitored
> manually?
>
> If I would like to put "repair -pr" in crontab for each node, with a few
> hour difference between the runs, are there any risks with such setup?
> Specifically:
> - if two or more "repair -pr" runs on different nodes are running at the
> same time, can it cause any problems besides high load?
> - can "repair -pr" be run simultaneously on all nodes at the same time?
> - I'm using the default gc_grace_period of 10 days. Are there any
> reasons to run repair more often than once per 10 days, in case a
> previous repair fails?
> - how can I monitor the start and finish times of repairs, and whether the
> runs were successful? Is the "nodetool repair" command guaranteed to exit
> only after the repair is finished, and does it return a status code to
> the shell?
>
> --
-
Alexander Dejanovski
France
@alexanderdeja

Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com