Re: unsubscribe

2016-09-21 Thread Alain RODRIGUEZ
Hi,

Sending a message to user-unsubscr...@cassandra.apache.org is the right way
to go if you want to unsubscribe from the "Cassandra User" mailing list.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

2016-09-20 14:58 GMT+02:00 Jerry Poole :

>
>
>
>
> IMPORTANT NOTICE: This e-mail, including any and all attachments, is
> intended for the addressee or its representative only. It is confidential
> and may be under legal privilege. Any form of unauthorized use,
> publication, reproduction, copying or disclosure of the content of this
> e-mail is not permitted. If you are not the intended recipient of this
> e-mail and its contents, please notify the sender immediately by reply
> e-mail and delete this e-mail and all its attachments subsequently.
>


Re: Nodetool repair

2016-09-21 Thread Alain RODRIGUEZ
Hi George,

That's the best way I could think of to monitor repairs "out of the box".
When you're not seeing 2048 lines (in your case), it might be due to log
rotation or to a session failure. Have you had a look at repair failures?

I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.


+1, and some information about the ranges successfully repaired and the ranges
that failed would be a very good thing as well. It would then be easy to read
the repair result and know what to do next (re-run repair on some ranges, move
to the next node, etc).
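
A minimal sketch of that log check, based on the 2.0.9 log lines quoted below.
The log path is a placeholder, the patterns will need adjusting for other
versions, and rotated files (system.log.*) are not read here, so the log
rotation caveat above still applies.

    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class RepairProgress {
        public static void main(String[] args) throws Exception {
            Path log = Paths.get(args.length > 0 ? args[0]
                                                 : "/var/log/cassandra/system.log");
            // Message format taken from the log lines quoted below (C* 2.0.9).
            Pattern start = Pattern.compile(
                    "Starting repair command #(\\d+), repairing (\\d+) ranges");
            Pattern done = Pattern.compile("session completed successfully");
            int expected = 0, completed = 0;
            for (String line : Files.readAllLines(log)) {
                Matcher m = start.matcher(line);
                if (m.find()) {
                    // New repair command: reset the counters for this run.
                    expected = Integer.parseInt(m.group(2));
                    completed = 0;
                } else if (done.matcher(line).find()) {
                    completed++;
                }
            }
            System.out.printf("repair sessions completed: %d / %d expected%n",
                    completed, expected);
        }
    }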


2016-09-20 17:00 GMT+02:00 Li, Guangxing :

> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>
>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
>> wrote:
>>
>> ...
>>
>>> - The size of your data
>>> - The number of vnodes
>>> - The compaction throughput
>>> - The streaming throughput
>>> - The hardware available
>>> - The load of the cluster
>>> - ...
>>>
>>
>> I've also heard that the number of clustering keys per partition key
>> could have an impact. Might be worth investigating.
>>
>> Cheers,
>> Jens
>> --
>>
>> Jens Rantil
>> Backend Developer @ Tink
>>
>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>> For urgent matters you can reach me at +46-708-84 18 32.
>>
>
>


Re: Having secondary indices limited to analytics dc

2016-09-21 Thread Andres de la Peña
Hi,

Stratio's Lucene index works around this to offer that feature. The index can
be created with a configuration option specifying a list of data centers to be
excluded from indexing. The index is still created in those data centers, but
all write operations to it are silently ignored. It would be really nice to
have something similar at the Cassandra level.
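
A hedged example of what that looks like, executed through the DataStax Java
driver against a cluster with Stratio's cassandra-lucene-index plugin
installed. The exact option key ('excluded_data_centers'), the dummy 'lucene'
column target and every name below come from my reading of the plugin's
documentation rather than from this thread, so check the docs for your plugin
version before relying on them.

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class ExcludedDcIndex {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                                          .addContactPoint("10.0.0.1").build();
                 Session session = cluster.connect()) {
                // Index writes are silently dropped on nodes in the listed DC(s).
                session.execute(
                    "CREATE CUSTOM INDEX IF NOT EXISTS users_idx ON my_ks.users (lucene) " +
                    "USING 'com.stratio.cassandra.lucene.Index' WITH OPTIONS = {" +
                    "  'excluded_data_centers' : 'dc_transactional'," +
                    "  'schema' : '{ fields : { name : { type : \"string\" } } }'" +
                    "}");
            }
        }
    }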

2016-09-18 21:01 GMT+01:00 Bhuvan Rawal :

> Created CASSANDRA-12663; please feel free to make edits. From a bird's-eye
> view it seems a bit inefficient to keep doing computations and generating
> data which may never be put to use. (A user may never read via secondary
> indices on the primary transactional DC, but he/she is currently forced to
> create them on every DC in the cluster.)
>
> On Mon, Sep 19, 2016 at 1:05 AM, Jonathan Haddad 
> wrote:
>
>> I don't see why having per DC indexes would be an issue, from a technical
>> standpoint.  I suggest putting in a JIRA for it, it's a good idea (if it
>> doesn't exist already).  Post back to the ML with the issue #.
>>
>> On Sun, Sep 18, 2016 at 12:26 PM Bhuvan Rawal 
>> wrote:
>>
>>> Can it be possible with change log feature implemented in CASSANDRA-8844
>>> ?  i.e. to have
>>> two clusters (With different schema definitions for secondary indices) and
>>> segregating analytics workload on the other cluster with CDC log shipper
>>> enabled on parent DC which is taking care of transactional workload?
>>>
>>> On Sun, Sep 18, 2016 at 9:30 PM, Dorian Hoxha 
>>> wrote:
>>>
 Only way I know is in elassandra .
 You spin nodes in dc1 as elassandra (having data + indexes) and in dc2 as
 cassandra (having only data).

 On Sun, Sep 18, 2016 at 5:43 PM, Bhuvan Rawal 
 wrote:

> Hi,
>
> Is it possible to have secondary indices (SASI or native ones) defined on a
> table restricted to a particular DC? For instance, it is very much possible
> in MySQL to have a parent server on which writes are done without any
> indices (other than the required ones) and to have indices on the replica
> DBs; this keeps the parent database lightweight and free from building
> secondary indexes on every write.
>
> For analytics & auditing purposes it is essential to serve different access
> patterns than those modeled around a partition-key fetch. Only a limited
> number of such reads are needed by users, but if the index is enabled
> cluster-wide it will require an index write for every row written to that
> table on every single node in every DC, even the ones that may be serving
> read operations.
>
> What could be the potential means to solve this problem inside Cassandra
> (not having to ship the data off into Elasticsearch, etc.)?
>
> Best Regards,
> Bhuvan
>


>>>
>


-- 
Andrés de la Peña

Vía de las dos Castillas, 33, Ática 4, 3ª Planta
28224 Pozuelo de Alarcón, Madrid
Tel: +34 91 828 6473 // www.stratio.com // @stratiobd


Re: Using keyspaces for virtual clusters

2016-09-21 Thread Alain RODRIGUEZ
Hi Dorian,

I'm thinking of creating many keyspaces and storing them into many virtual
> datacenters (the servers will be in 1 logical datacenter, but separated by
> keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?


There are 3 main things you can do here:

1 - Use 1 DC, 200 keyspaces all using that DC
2 - Use 200 DCs, 1 keyspace per DC
3 - Use 200 clusters, each with 1 DC and 1 keyspace per client (or many
keyspaces, but all related to 1 client)

I am not sure whether you want to go with 1 or 2; my understanding is that you
meant to write "the servers will be in 1 *physical* (not logical) datacenter"
and that you are planning to do as described in 2.

This looks like a good idea to me, but for other reasons (client / workload
isolation, limited risk, independent growth for each client, visibility on
cost per client, ...).
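
For option 2, the per-client isolation comes from the replication settings:
each client keyspace replicates only into its own DC. A minimal sketch with
made-up keyspace, DC and contact point names (DataStax Java driver 3.x):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Session;

    public class PerClientKeyspace {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                                          .addContactPoint("10.0.0.1").build();
                 Session session = cluster.connect()) {
                // This client's data only lives on nodes in dc_client_042; the
                // other DCs hold no replicas for it (they still learn about the
                // schema through gossip, see below).
                session.execute(
                    "CREATE KEYSPACE IF NOT EXISTS client_042 WITH replication = " +
                    "{'class': 'NetworkTopologyStrategy', 'dc_client_042': 3}");
            }
        }
    }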

Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?
>

Yet I would not go with distinct DCs, but rather with distinct C* clusters
(different cluster names, seeds, etc.).

I see no good reason to use a virtual cluster instead of distinct clusters.
Keeping each keyspace in its own isolated datacenter would work, and the
datacenters would be fairly isolated since no information or load would be
shared, except through gossip.

Yet there are some issues with big clusters due to gossip; I have had gossip
issues in the past that affected all the DCs within a cluster. In that case
you would face a major issue that you could have avoided or limited. Plus,
when upgrading Cassandra you would have to upgrade 600 nodes fairly quickly,
whereas distinct clusters can be upgraded independently. I would then go with
either option 1 or 3.

and because I don't like having multiple cql clients/connections on my
> app-code


In this case, wouldn't it make sense for you to have per-customer app code, or
just conditional connection creation depending on the client?

I am just trying to give you some ideas.

Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)

Or is it just a simple map dcx--->ip1,ip2,ip3 ?


I just checked: all the nodes would know about every keyspace and table if
they are part of the same Cassandra cluster (in my test version, C* 3.7, this
is stored under system_schema.tables - local strategy, no replication). To
avoid that, using distinct clusters is the way to go.

https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c
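
A driver-side version of the check in the gist above, for a 3.x cluster where
the schema lives in system_schema (the contact point and the schema table
names follow that test setup):

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class SchemaVisibilityCheck {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                                          .addContactPoint("10.0.0.1").build();
                 Session session = cluster.connect()) {
                // Any node answers with the full table list, whichever DC it is in.
                for (Row row : session.execute(
                        "SELECT keyspace_name, table_name FROM system_schema.tables")) {
                    System.out.println(row.getString("keyspace_name") + "."
                            + row.getString("table_name"));
                }
            }
        }
    }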

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-09-20 22:49 GMT+02:00 Dorian Hoxha :

> Hi,
>
> I need to separate clients data into multiple clusters and because I don't
> like having multiple cql clients/connections on my app-code, I'm thinking
> of creating many keyspaces and storing them into many virtual datacenters
> (the servers will be in 1 logical datacenter, but separated by keyspaces).
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
> case scenario)?
>
> Does the cql-engine make a new connection (like "use keyspace") when
> specifying "keyspace.table" on the query ?
>
> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
> there is overhead with each keyspace + table which would probably break
> this design)
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>
> Thank you!
>


Re: unsubscribe

2016-09-21 Thread Jerry Poole
Thanks for your polite guidance.

On Sep 21, 2016, at 4:54 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

Hi,

Sending a message to user-unsubscr...@cassandra.apache.org is the right way to
go if you want to unsubscribe from the "Cassandra User" mailing list.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

2016-09-20 14:58 GMT+02:00 Jerry Poole <jerry.d.po...@acrometis.com>:




IMPORTANT NOTICE: This e-mail, including any and all attachments, is intended 
for the addressee or its representative only. It is confidential and may be 
under legal privilege. Any form of unauthorized use, publication, reproduction, 
copying or disclosure of the content of this e-mail is not permitted. If you 
are not the intended recipient of this e-mail and its contents, please notify 
the sender immediately by reply e-mail and delete this e-mail and all its 
attachments subsequently.





Re: Re : Generic keystore when enabling SSL

2016-09-21 Thread Eric Evans
On Tue, Sep 20, 2016 at 12:57 PM, sai krishnam raju potturi
 wrote:
> Due to the security policies in our company, we were asked to use 3rd party
> signed certs. Since we'll require to manage 100's of individual certs, we
> wanted to know if there is a work around with a generic keystore and
> truststore.

Can you explain what you mean by "generic keystore"?  Are you looking
to create keystores signed by a self-signed root CA (distributed via a
truststore)?

-- 
Eric Evans
john.eric.ev...@gmail.com


Re: High load on few nodes in a DC.

2016-09-21 Thread Romain Hardouin
Hi,
Do you shuffle the replicas with TokenAwarePolicy?
TokenAwarePolicy(LoadBalancingPolicy childPolicy, boolean shuffleReplicas)

Best,
Romain
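
A hedged sketch of both pieces (DataStax Java driver 3.x; the 2.1-era driver is
similar, but check the exact signatures): enabling replica shuffling in the
token-aware policy as suggested above, and asking the driver's metadata which
nodes hold a suspected hot partition, which answers the primary/replica
question in the thread below. The contact point, keyspace and key are
placeholders and a single text partition key column is assumed; running
"nodetool getendpoints <keyspace> <table> <key>" gives the same replica answer
from the server side.

    import java.nio.ByteBuffer;
    import java.util.Set;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Host;
    import com.datastax.driver.core.Metadata;
    import com.datastax.driver.core.ProtocolVersion;
    import com.datastax.driver.core.TypeCodec;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    public class HotPartitionCheck {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder()
                    .addContactPoint("10.0.0.1")
                    .withLoadBalancingPolicy(new TokenAwarePolicy(
                            DCAwareRoundRobinPolicy.builder().build(),
                            true)) // true = shuffle replicas, as asked above
                    .build()) {
                Metadata meta = cluster.getMetadata();
                // Serialize the partition key the same way the driver would.
                ByteBuffer key = TypeCodec.varchar().serialize(
                        "suspected-hot-key", ProtocolVersion.NEWEST_SUPPORTED);
                Set<Host> replicas = meta.getReplicas("my_keyspace", key);
                System.out.println("Replicas owning this partition: " + replicas);
            }
        }
    }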
 

On Tuesday, September 20, 2016 at 3:47 PM, Pranay akula  wrote:

I was able to find the hotspots causing the load, but the size of these
partitions is in KB, there are no tombstones and the number of sstables is
only 2. What else do I need to debug to find the reason for the high load on
some nodes? We are also using unlogged batches; can that be the reason? How do
I find which node is serving as the coordinator for unlogged batches? We are
using the token-aware policy.

Thanks


On Mon, Sep 19, 2016 at 12:29 PM, Pranay akula  
wrote:

I was able to see the most used partitions, but the nodes with less load are
serving more read and write requests for those particular partitions than the
nodes with high load. How can I find whether these nodes are serving as
coordinators for those read and write requests? How can I find the token range
for these particular partitions and which node is the primary for each
partition?

Thanks
On Mon, Sep 19, 2016 at 11:04 AM, Pranay akula  
wrote:

Hi Jeff,

Thanks. We are using RF 3 and Cassandra version 2.1.8.

Thanks,
Pranay.
On Mon, Sep 19, 2016 at 10:55 AM, Jeff Jirsa  wrote:

Is your replication_factor 2? Or is it 3? What version are you using?

The most likely answer is some individual partition that’s either being
written/read more than others, or is somehow impacting the cluster (wide rows
are a natural candidate). You don’t mention your version, but most modern
versions of Cassandra ship with ‘nodetool toppartitions’, which will help you
identify frequently written/read partitions; perhaps you can use that to
identify a hotspot due to some external behavior (some partition being read
thousands of times, over and over, could certainly drive up load).

- Jeff

From: Pranay akula 
Reply-To: "user@cassandra.apache.org" 
Date: Monday, September 19, 2016 at 7:53 AM
To: "user@cassandra.apache.org" 
Subject: High load on few nodes in a DC.

When our cluster was under load I am seeing 1 or 2 nodes consistently under
more load than the others in the DC. I am not seeing any GC pauses or wide
partitions. Can this be because those nodes are continuously serving as
coordinators? How can I find the reason for the high load on those two nodes?
We are using vnodes.

Thanks,
Pranay.


Re: Nodetool repair

2016-09-21 Thread Li, Guangxing
Alain,

My script actually greps through all the log files, including the rotated
system.log.* files, so it was probably due to a failed session. My script now
assumes the repair has finished (possibly due to a failure) if it does not see
any more repair-related log lines for 2 hours.

Thanks.

George.

On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ  wrote:

> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
>> ... Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>
>
> +1, and some informations about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing :
>
>> Hi,
>>
>> I am using version 2.0.9. I have been looking into the logs to see if a
>> repair is finished. Each time a repair is started on a node, I am seeing
>> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
>> StorageService.java (line 2646) Starting repair command #41, repairing 2048
>> ranges for keyspace groupmanager" in system.log. So I know that I am
>> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
>> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
>> #8b910950-7c43-11e6-88f3-f147ea74230b] session completed successfully".
>> Once I see 2048 such log lines, I know this repair has completed. But this
>> is not dependable since sometimes I am seeing less than 2048 but I know
>> there is no repair going on since I do not see any trace of repair in
>> system.log for a long time. So it seems to me that there is a clear way to
>> tell that a repair has started but there is no clear way to tell a repair
>> has ended. The only thing you can do is to watch the log and if you do not
>> see repair activity for a long time, the repair is done somehow. I am
>> wondering why the implementor did not put something in the log (e.g. ...
>> Repair command #41 has ended...) to clearly state that the repair has
>> completed.
>>
>> Thanks.
>>
>> George.
>>
>> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>>
>>> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
>>> wrote:
>>>
>>> ...
>>>
 - The size of your data
 - The number of vnodes
 - The compaction throughput
 - The streaming throughput
 - The hardware available
 - The load of the cluster
 - ...

>>>
>>> I've also heard that the number of clustering keys per partition key
>>> could have an impact. Might be worth investigating.
>>>
>>> Cheers,
>>> Jens
>>> --
>>>
>>> Jens Rantil
>>> Backend Developer @ Tink
>>>
>>> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
>>> For urgent matters you can reach me at +46-708-84 18 32.
>>>
>>
>>
>


Re: Using keyspaces for virtual clusters

2016-09-21 Thread Eric Stevens
Using keyspaces to support multi-tenancy is very close to an anti-pattern
unless there is a finite and reasonable upper bound on how many tenants you'll
support overall. Large numbers of tables come with cluster overhead and
operational complexity that you will come to regret eventually.

>and because I don't like having multiple cql clients/connections on my
app-code

You should note that although Cassandra drivers present a single logical
connection per cluster, under the hood they maintain connection pools per C*
host. You might be able to do a slightly better job of managing those pools
with a single cluster and logical connection, but I doubt it will be very
significant; it would depend on what options you have available in your
driver of choice.
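
A hedged sketch of that single logical connection: one Cluster/Session shared
by the whole app, with per-tenant routing done by qualifying the keyspace in
the statement rather than by switching connections or issuing USE. All names
here are made up; the driver serves these statements from its existing
per-host pools instead of opening anything new.

    import java.util.UUID;

    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;

    public class MultiTenantAccess {
        private static final Cluster CLUSTER =
                Cluster.builder().addContactPoint("10.0.0.1").build();
        // No default keyspace: every statement is fully qualified below.
        private static final Session SESSION = CLUSTER.connect();

        static Row findEvent(String tenantKeyspace, UUID id) {
            // The tenant switch is just a string prefix; no new connection is made.
            return SESSION.execute(
                    "SELECT * FROM " + tenantKeyspace + ".events WHERE id = ?",
                    id).one();
        }
    }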

Application logic complexity would not be greatly improved either, because you
still need to switch by tenant; whether that switch is on keyspace name or on
connection name doesn't seem like it would make much difference.

As Alain pointed out, upgrades will be painful and maybe even dangerous as
a monolithic cluster.

On Wed, Sep 21, 2016, 3:50 AM Alain RODRIGUEZ  wrote:

> Hi Dorian,
>
> I'm thinking of creating many keyspaces and storing them into many virtual
>> datacenters (the servers will be in 1 logical datacenter, but separated by
>> keyspaces).
>>
>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>> case scenario)?
>
>
> There is 3 main things you can do here
>
> 1 - Use 1 DC, 200 keyspaces using the DC
> 2 - Use 200 DC, 1 keyspace per DC.
> 3 - Use 200 cluster, 1 DC, 1 keyspace per client (or many keyspaces, but
> related to 1 client)
>
> I am not sure if you want to go with 1 or 2, my understanding is you
> wanted to write "the servers will be in 1 -*logical- **physical*
> datacenter" and you are willing to do as described in 2.
>
> This looks to be a good idea to me, but for other reasons (clients /
> workload isolation, limited risk, independent growth for each client,
> visibility on cost per client, ...)
>
> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>> case scenario)?
>>
>
> Yet I would not go with distinct DC, but rather distinct C* clusters
> (different cluster names, seeds, etc).
>
> I see no good reason to use virtual cluster instead of distinct cluster.
> Keep keyspace in distinct isolated datacenter would work. Datacenter would
> be quite isolated since no information or load would be shared, excepted
> from gossip.
>
> Yet there are some issue with big clusters due to gossip, and I had some
> issue in the past due to gossip, affecting all the DC within a cluster. In
> this case you would face a major issue, that you could have avoided or
> limited. Plus when upgrading Cassandra, you would have to upgrade 600 nodes
> quite quickly when distinct clusters can be upgraded independently. I would
> then go with either option 1 or 3.
>
> and because I don't like having multiple cql clients/connections on my
>> app-code
>
>
> In this case, wouldn't it make sense for you to have per customer app-code
> or just a conditional connection creation depending on the client?
>
> I just try to give you some ideas.
>
> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
>> there is overhead with each keyspace + table which would probably break
>> this design)
>
> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>
>
> I just checked it. All the nodes would know about every keyspace and
> table, if using the same Cassandra cluster, (in my testing version C*3.7,
> this is stored under system_schema.tables - local strategy, no
> replication). To avoid that, using distinct clusters is the way to go.
>
> https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-09-20 22:49 GMT+02:00 Dorian Hoxha :
>
>> Hi,
>>
>> I need to separate clients data into multiple clusters and because I
>> don't like having multiple cql clients/connections on my app-code, I'm
>> thinking of creating many keyspaces and storing them into many virtual
>> datacenters (the servers will be in 1 logical datacenter, but separated by
>> keyspaces).
>>
>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>> case scenario)?
>>
>> Does the cql-engine make a new connection (like "use keyspace") when
>> specifying "keyspace.table" on the query ?
>>
>> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
>> there is overhead with each keyspace + table which would probably break
>> this design)
>> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>>
>> Thank you!
>>
>
>


Re: Re : Generic keystore when enabling SSL

2016-09-21 Thread sai krishnam raju potturi
Hi Evans,

Rather than having one individual certificate for every node, we are looking
at getting one Comodo wildcard certificate and importing that into the
keystore, along with the intermediate CA provided by Comodo. As far as the
truststore is concerned, we are looking at importing the intermediate CA
provided along with the signed wildcard cert by Comodo.

So in this case we'll have just one (generic) keystore and truststore that
we'll copy to all the nodes. We've run into issues, however, and are trying to
iron them out. We are interested to know if anybody in the community has taken
a similar approach.

We are pretty much going along the lines of the following post by The Last
Pickle:
http://thelastpickle.com/blog/2015/09/30/hardening-cassandra-step-by-step-part-1-server-to-server.html.
Instead of creating our own CA, we are relying on Comodo.
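
One thing that may help iron those issues out: before copying the generic
keystore to every node, load it and print each alias with its certificate
chain, to confirm the wildcard cert is followed by the Comodo intermediate. A
plain-JDK sketch; the path and password are placeholders.

    import java.io.FileInputStream;
    import java.security.KeyStore;
    import java.security.cert.Certificate;
    import java.security.cert.X509Certificate;
    import java.util.Collections;

    public class KeystoreCheck {
        public static void main(String[] args) throws Exception {
            KeyStore ks = KeyStore.getInstance("JKS");
            try (FileInputStream in =
                         new FileInputStream("/etc/cassandra/conf/keystore.jks")) {
                ks.load(in, "changeit".toCharArray());
            }
            for (String alias : Collections.list(ks.aliases())) {
                System.out.println("alias: " + alias);
                Certificate[] chain = ks.getCertificateChain(alias);
                if (chain == null) continue; // trusted-cert entries have no chain
                for (Certificate c : chain) {
                    X509Certificate x = (X509Certificate) c;
                    System.out.println("  subject=" + x.getSubjectDN()
                            + " issuer=" + x.getIssuerDN());
                }
            }
        }
    }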

thanks
Sai

On Wed, Sep 21, 2016 at 10:30 AM, Eric Evans 
wrote:

> On Tue, Sep 20, 2016 at 12:57 PM, sai krishnam raju potturi
>  wrote:
> > Due to the security policies in our company, we were asked to use 3rd
> party
> > signed certs. Since we'll require to manage 100's of individual certs, we
> > wanted to know if there is a work around with a generic keystore and
> > truststore.
>
> Can you explain what you mean by "generic keystore"?  Are you looking
> to create keystores signed by a self-signed root CA (distributed via a
> truststore)?
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>


Re: Nodetool repair

2016-09-21 Thread Romain Hardouin
Do you see any pending AntiEntropySessions (not AntiEntropyStage) with nodetool 
tpstats on nodes?
Romain
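
For scripting that check across nodes rather than eyeballing nodetool tpstats,
a hedged JMX sketch. The MBean name below is the one tpstats read from in the
2.0/2.1 line (org.apache.cassandra.internal:type=AntiEntropySessions) and is
an assumption to verify in jconsole first; newer versions expose the same pool
under org.apache.cassandra.metrics:type=ThreadPools. The host is a placeholder
and 7199 is the default JMX port.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class RepairSessionsCheck {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            JMXConnector jmxc = JMXConnectorFactory.connect(new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi"));
            try {
                MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
                ObjectName pool = new ObjectName(
                        "org.apache.cassandra.internal:type=AntiEntropySessions");
                System.out.println("active  = " + mbs.getAttribute(pool, "ActiveCount"));
                System.out.println("pending = " + mbs.getAttribute(pool, "PendingTasks"));
            } finally {
                jmxc.close();
            }
        }
    }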
 

On Wednesday, September 21, 2016 at 4:45 PM, "Li, Guangxing"
 wrote:

Alain,

My script actually greps through all the log files, including system.log.*, so
it was probably due to a failed session. So now my script assumes the repair
has finished (possibly due to failure) if it does not see any more
repair-related logs after 2 hours.

Thanks.

George.

On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ  wrote:

Hi George,

That's the best way I could think of to monitor repairs "out of the box". When
you're not seeing 2048 lines (in your case), it might be due to log rotation
or to a session failure. Have you had a look at repair failures?

I am wondering why the implementor did not put something in the log (e.g. ...
Repair command #41 has ended...) to clearly state that the repair has
completed.

+1, and some information about the ranges successfully repaired and the ranges
that failed would be a very good thing as well. It would then be easy to read
the repair result and know what to do next (re-run repair on some ranges, move
to the next node, etc).

2016-09-20 17:00 GMT+02:00 Li, Guangxing :

Hi,

I am using version 2.0.9. I have been looking into the logs to see if a repair
is finished. Each time a repair is started on a node, I am seeing a log line
like "INFO [Thread-112920] 2016-09-16 19:00:43,805 StorageService.java (line
2646) Starting repair command #41, repairing 2048 ranges for keyspace
groupmanager" in system.log. So I know that I am expecting to see 2048 log
lines like "INFO [AntiEntropySessions:109] 2016-09-16 19:27:20,662
RepairSession.java (line 282) [repair #8b910950-7c43-11e6-88f3-f147ea74230b]
session completed successfully". Once I see 2048 such log lines, I know this
repair has completed. But this is not dependable since sometimes I am seeing
less than 2048 but I know there is no repair going on since I do not see any
trace of repair in system.log for a long time. So it seems to me that there is
a clear way to tell that a repair has started but there is no clear way to
tell a repair has ended. The only thing you can do is to watch the log and if
you do not see repair activity for a long time, the repair is done somehow. I
am wondering why the implementor did not put something in the log (e.g. ...
Repair command #41 has ended...) to clearly state that the repair has
completed.

Thanks.

George.

On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:

On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ  wrote:

...

- The size of your data
- The number of vnodes
- The compaction throughput
- The streaming throughput
- The hardware available
- The load of the cluster
- ...

I've also heard that the number of clustering keys per partition key could
have an impact. Might be worth investigating.

Cheers,
Jens
--
Jens Rantil
Backend Developer @ Tink

Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
For urgent matters you can reach me at +46-708-84 18 32.

Re: Using keyspaces for virtual clusters

2016-09-21 Thread Dorian Hoxha
@Alain
I wanted to do 2, but it looks like that won't be possible because of too much
overhead.

@Eric
Yeah that's what I was afraid of. Though I know that the client connects to
every server, I just didn't want to do the extra code.

On Wed, Sep 21, 2016 at 4:56 PM, Eric Stevens  wrote:

> Using keyspaces to support multi tenancy is very close to an anti pattern
> unless there is a finite and reasonable upper bound to how many tenants
> you'll support overall. Large numbers of tables comes with cluster overhead
> and operational complexity you will come to regret eventually.
>
> >and because I don't like having multiple cql clients/connections on my
> app-code
>
> You should note that although Cassandra drivers present a single logical
> connection per cluster, under the hood it maintains connection pools per C*
> host. You might be able to do a slightly better job of managing those pools
> as a single cluster and logical connection, but I doubt it will be very
> significant. It would depend on what options you have available in your
> driver of choice.
>
> Application logic would complexity not be greatly improved because you
> still need to switch by tenant, whether it's keyspace name or connection
> name doesn't seem like it would make much difference.
>
> As Alain pointed out, upgrades will be painful and maybe even dangerous as
> a monolithic cluster.
>
> On Wed, Sep 21, 2016, 3:50 AM Alain RODRIGUEZ  wrote:
>
>> Hi Dorian,
>>
>> I'm thinking of creating many keyspaces and storing them into many
>>> virtual datacenters (the servers will be in 1 logical datacenter, but
>>> separated by keyspaces).
>>>
>>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>>> case scenario)?
>>
>>
>> There is 3 main things you can do here
>>
>> 1 - Use 1 DC, 200 keyspaces using the DC
>> 2 - Use 200 DC, 1 keyspace per DC.
>> 3 - Use 200 cluster, 1 DC, 1 keyspace per client (or many keyspaces, but
>> related to 1 client)
>>
>> I am not sure if you want to go with 1 or 2, my understanding is you
>> wanted to write "the servers will be in 1 -*logical- **physical*
>> datacenter" and you are willing to do as described in 2.
>>
>> This looks to be a good idea to me, but for other reasons (clients /
>> workload isolation, limited risk, independent growth for each client,
>> visibility on cost per client, ...)
>>
>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>>> case scenario)?
>>>
>>
>> Yet I would not go with distinct DC, but rather distinct C* clusters
>> (different cluster names, seeds, etc).
>>
>> I see no good reason to use virtual cluster instead of distinct cluster.
>> Keep keyspace in distinct isolated datacenter would work. Datacenter would
>> be quite isolated since no information or load would be shared, excepted
>> from gossip.
>>
>> Yet there are some issue with big clusters due to gossip, and I had some
>> issue in the past due to gossip, affecting all the DC within a cluster. In
>> this case you would face a major issue, that you could have avoided or
>> limited. Plus when upgrading Cassandra, you would have to upgrade 600 nodes
>> quite quickly when distinct clusters can be upgraded independently. I would
>> then go with either option 1 or 3.
>>
>> and because I don't like having multiple cql clients/connections on my
>>> app-code
>>
>>
>> In this case, wouldn't it make sense for you to have per customer app-code
>> or just a conditional connection creation depending on the client?
>>
>> I just try to give you some ideas.
>>
>> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2 ?(since
>>> there is overhead with each keyspace + table which would probably break
>>> this design)
>>
>> Or is it just a simple map dcx--->ip1,ip2,ip3 ?
>>
>>
>> I just checked it. All the nodes would know about every keyspace and
>> table, if using the same Cassandra cluster, (in my testing version C*3.7,
>> this is stored under system_schema.tables - local strategy, no
>> replication). To avoid that, using distinct clusters is the way to go.
>>
>> https://gist.github.com/arodrime/2f4fb2133c5b242b9500860ac8c6d89c
>>
>> C*heers,
>> ---
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2016-09-20 22:49 GMT+02:00 Dorian Hoxha :
>>
>>> Hi,
>>>
>>> I need to separate clients data into multiple clusters and because I
>>> don't like having multiple cql clients/connections on my app-code, I'm
>>> thinking of creating many keyspaces and storing them into many virtual
>>> datacenters (the servers will be in 1 logical datacenter, but separated by
>>> keyspaces).
>>>
>>> Does that make sense (so growing up to 200 dcs of 3 servers each in best
>>> case scenario)?
>>>
>>> Does the cql-engine make a new connection (like "use keyspace") when
>>> specifying "keyspace.table" on the query ?
>>>
>>> Are the keyspaces+tables of dc1 stored in a cassandra node of dc2

Re: Client-side timeouts after dropping table

2016-09-21 Thread John Sanda
I was able to get metrics, but nothing stands out. When the applications
start up and a table is dropped, shortly thereafter on a subsequent write I
get a NoHostAvailableException that is caused by an
OperationTimedOutException. I am not 100% certain on which write the
timeout occurs because there are multiple apps running, but it does happen
fairly consistently almost immediately after the table is dropped. I don't
see any indication of a server side timeout or any dropped mutations being
reported in the log.
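
For anyone who wants to pull the commit log metrics Nate lists below without
wiring up a full metrics reporter, a hedged JMX sketch. The MBean names follow
the org.apache.cassandra.metrics:type=CommitLog naming and should be verified
in jconsole for your build; only the gauge-style metrics are read here, since
WaitingOnCommit and WaitingOnSegmentAllocation are timers with different
attributes. The host is a placeholder and 7199 is the default JMX port.

    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class CommitLogMetrics {
        public static void main(String[] args) throws Exception {
            String host = args.length > 0 ? args[0] : "localhost";
            JMXConnector jmxc = JMXConnectorFactory.connect(new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://" + host + ":7199/jmxrmi"));
            try {
                MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
                for (String name : new String[] {"TotalCommitLogSize", "PendingTasks"}) {
                    ObjectName on = new ObjectName(
                            "org.apache.cassandra.metrics:type=CommitLog,name=" + name);
                    System.out.println(name + " = " + mbs.getAttribute(on, "Value"));
                }
            } finally {
                jmxc.close();
            }
        }
    }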

On Tue, Sep 20, 2016 at 11:07 PM, John Sanda  wrote:

> Thanks Nate. We do not have monitoring set up yet, but I should be able to
> get the deployment updated with a metrics reporter. I'll update the thread
> with my findings.
>
> On Tue, Sep 20, 2016 at 10:30 PM, Nate McCall 
> wrote:
>
>> If you can get to them in the test env. you want to look in
>> o.a.c.metrics.CommitLog for:
>> - TotalCommitlogSize: if this hovers near commitlog_size_in_mb and never
>> goes down, you are thrashing on segment allocation
>> - WaitingOnCommit: this is the time spent waiting on calls to sync and
>> will start to climb real fast if you cant sync within sync_interval
>> - WaitingOnSegmentAllocation: how long it took to allocate a new
>> commitlog segment, if it is all over the place it is IO bound
>>
>> Try turning all the commit log settings way down for low-IO test
>> infrastructure like this. Maybe total commit log size of like 32mb with 4mb
>> segments (or even lower depending on test data volume) so they basically
>> flush constantly and don't try to hold any tables open. Also lower
>> concurrent_writes substantially while you are at it to add some write
>> throttling.
>>
>> On Wed, Sep 21, 2016 at 2:14 PM, John Sanda  wrote:
>>
>>> I have seen in various threads on the list that 3.0.x is probably best
>>> for prod. Just wondering though if there is anything in particular in 3.7
>>> to be weary of.
>>>
>>> I need to check with one of our QA engineers to get specifics on the
>>> storage. Here is what I do know. We have a blade center running lots of
>>> virtual machines for various testing. Some of those vm's are running
>>> Cassandra and the Java web apps I previously mentioned via docker
>>> containers. The storage is shared. Beyond that I don't have any more
>>> specific details at the moment. I can also tell you that the storage can be
>>> quite slow.
>>>
>>> I have come across different threads that talk to one degree or another
>>> about the flush queue getting full. I have been looking at the code in
>>> ColumnFamilyStore.java. Is perDiskFlushExecutors the thread pool I should
>>> be interested in? It uses an unbounded queue, so I am not really sure what
>>> it means for it to get full. Is there anything I can check or look for to
>>> see if writes are getting blocked?
>>>
>>> On Tue, Sep 20, 2016 at 8:41 PM, Jonathan Haddad 
>>> wrote:
>>>
 If you haven't yet deployed to prod I strongly recommend *not* using
 3.7.

 What network storage are you using?  Outside of a handful of highly
 experienced experts using EBS in very specific ways, it usually ends in
 failure.

 On Tue, Sep 20, 2016 at 3:30 PM John Sanda 
 wrote:

> I am deploying multiple Java web apps that connect to a Cassandra 3.7
> instance. Each app creates its own schema at start up. One of the schema
> changes involves dropping a table. I am seeing frequent client-side
> timeouts reported by the DataStax driver after the DROP TABLE statement is
> executed. I don't see this behavior in all environments. I do see it
> consistently in a QA environment in which Cassandra is running in docker
> with network storage, so writes are pretty slow from the get go. In my 
> logs
> I see a lot of tables getting flushed, which I guess are all of the dirty
> column families in the respective commit log segment. Then I seen a whole
> bunch of flushes getting queued up. Can I reach a point in which too many
> table flushes get queued such that writes would be blocked?
>
>
> --
>
> - John
>

>>>
>>>
>>> --
>>>
>>> - John
>>>
>>
>>
>>
>> --
>> -
>> Nate McCall
>> Wellington, NZ
>> @zznate
>>
>> CTO
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>
>
>
> --
>
> - John
>



-- 

- John


Re: Nodetool repair

2016-09-21 Thread Li, Guangxing
Romain,

I started running a new repair. If I see such behavior again, I will try
what you mentioned.

Thanks.

On Wed, Sep 21, 2016 at 9:51 AM, Romain Hardouin 
wrote:

> Do you see any pending AntiEntropySessions (not AntiEntropyStage) with
> nodetool tpstats on nodes?
>
> Romain
>
>
> Le Mercredi 21 septembre 2016 16h45, "Li, Guangxing" <
> guangxing...@pearson.com> a écrit :
>
>
> Alain,
>
> my script actually grep through all the log files, including those
> system.log.*. So it was probably due to a failed session. So now my script
> assumes the repair has finished (possibly due to failure) if it does not
> see any more repair related logs after 2 hours.
>
> Thanks.
>
> George.
>
> On Wed, Sep 21, 2016 at 3:03 AM, Alain RODRIGUEZ 
> wrote:
>
> Hi George,
>
> That's the best way to monitor repairs "out of the box" I could think of.
> When you're not seeing 2048 (in your case), it might be due to log rotation
> or to a session failure. Have you had a look at repair failures?
>
> I am wondering why the implementor did not put something in the log (e.g.
> ... Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
>
> +1, and some informations about ranges successfully repaired and the
> ranges that failed could be a very good thing as well. It would be easy to
> then read the repair result and to know what to do next (re-run repair on
> some ranges, move to the next node, etc).
>
>
> 2016-09-20 17:00 GMT+02:00 Li, Guangxing :
>
> Hi,
>
> I am using version 2.0.9. I have been looking into the logs to see if a
> repair is finished. Each time a repair is started on a node, I am seeing
> log line like "INFO [Thread-112920] 2016-09-16 19:00:43,805
> StorageService.java (line 2646) Starting repair command #41, repairing 2048
> ranges for keyspace groupmanager" in system.log. So I know that I am
> expecting to see 2048 log lines like "INFO [AntiEntropySessions:109]
> 2016-09-16 19:27:20,662 RepairSession.java (line 282) [repair
> #8b910950-7c43-11e6-88f3-f147e a74230b] session completed successfully".
> Once I see 2048 such log lines, I know this repair has completed. But this
> is not dependable since sometimes I am seeing less than 2048 but I know
> there is no repair going on since I do not see any trace of repair in
> system.log for a long time. So it seems to me that there is a clear way to
> tell that a repair has started but there is no clear way to tell a repair
> has ended. The only thing you can do is to watch the log and if you do not
> see repair activity for a long time, the repair is done somehow. I am
> wondering why the implementor did not put something in the log (e.g. ...
> Repair command #41 has ended...) to clearly state that the repair has
> completed.
>
> Thanks.
>
> George.
>
> On Tue, Sep 20, 2016 at 2:54 AM, Jens Rantil  wrote:
>
> On Mon, Sep 19, 2016 at 3:07 PM Alain RODRIGUEZ 
> wrote:
>
> ...
>
> - The size of your data
> - The number of vnodes
> - The compaction throughput
> - The streaming throughput
> - The hardware available
> - The load of the cluster
> - ...
>
>
> I've also heard that the number of clustering keys per partition key could
> have an impact. Might be worth investigating.
>
> Cheers,
> Jens
> --
> Jens Rantil
> Backend Developer @ Tink
> Tink AB, Wallingatan 5, 111 60 Stockholm, Sweden
> For urgent matters you can reach me at +46-708-84 18 32.
>
>
>
>
>
>
>


Re: Client-side timeouts after dropping table

2016-09-21 Thread Jesse Hodges
Thanks, filing this under "things I wish I'd realized sooner" :)

On Tue, Sep 20, 2016 at 10:27 PM, Jonathan Haddad  wrote:

> 3.7 falls under the Tick Tock release cycle, which is almost completely
> untested in production by experienced operators.  In the cases where it has
> been tested, there have been numerous bugs found which I (and I think most
> people on this list) consider to be show stoppers.  Additionally, the Tick
> Tock release cycle puts the operator in the uncomfortable position of
> having to decide between upgrading to a new version with new features
> (probably new bugs) or back porting bug fixes from future versions
> themselves. There will never be a 3.7.1 release which fixes bugs in 3.7
> without adding new features.
>
> https://github.com/apache/cassandra/blob/trunk/NEWS.txt
>
> For new projects I recommend starting with the recently released 3.0.9.
>
> Assuming the project changes its policy on releases (all signs point to
> yes), then by the time 4.0 rolls out a lot of the features which have been
> released in the 3.x series will have matured a bit, so it's very possible
> 4.0 will stabilize faster than the usual 6 months it takes for a major
> release.
>
> All that said, there's nothing wrong with doing compatibility & smoke
> tests against the latest 3.x release as well as 3.0 and reporting bugs back
> to the Apache Cassandra JIRA, I'm sure it would be greatly appreciated.
>
> https://issues.apache.org/jira/secure/Dashboard.jspa
>
> Jon
>
>
> On Tue, Sep 20, 2016 at 8:10 PM Jesse Hodges 
> wrote:
>
>> Can you elaborate on why not 3.7?
>>
>> On Tue, Sep 20, 2016 at 7:41 PM, Jonathan Haddad 
>> wrote:
>>
>>> If you haven't yet deployed to prod I strongly recommend *not* using
>>> 3.7.
>>>
>>> What network storage are you using?  Outside of a handful of highly
>>> experienced experts using EBS in very specific ways, it usually ends in
>>> failure.
>>>
>>> On Tue, Sep 20, 2016 at 3:30 PM John Sanda  wrote:
>>>
 I am deploying multiple Java web apps that connect to a Cassandra 3.7
 instance. Each app creates its own schema at start up. One of the schema
 changes involves dropping a table. I am seeing frequent client-side
 timeouts reported by the DataStax driver after the DROP TABLE statement is
 executed. I don't see this behavior in all environments. I do see it
 consistently in a QA environment in which Cassandra is running in docker
 with network storage, so writes are pretty slow from the get go. In my logs
 I see a lot of tables getting flushed, which I guess are all of the dirty
 column families in the respective commit log segment. Then I seen a whole
 bunch of flushes getting queued up. Can I reach a point in which too many
 table flushes get queued such that writes would be blocked?


 --

 - John

>>>
>>


understanding partitions and # of nodes

2016-09-21 Thread S Ahmed
Hello,

If you have a 10 node cluster, how does having 10 partitions or 100
partitions change how cassandra will perform?

With 10 partitions you will have 1 partition per node.
With 100 partitions you will have 10 partitions per node.

With 100 partitions I guess it helps because when you add more nodes to
your cluster, the data can be redistributed since you have more nodes.

What else are things to consider?

Thanks.


Re: understanding partitions and # of nodes

2016-09-21 Thread Jeff Jirsa
If you only have 100 partitions, then having more than (100 * RF) nodes
doesn’t help you much.

 

However, unless you’re using very specific partitioners, there’s no guarantee 
that you’ll have 1 partition per node (with 10 nodes / 10 partitions).

 

Cassandra uses the murmur3 hash (by default; md5 in old versions) to hash the
partition key and place data onto a node. You have very little control over
distribution – murmur3 and md5 are both well distributed, so you’re likely to
get a good distribution with a sufficiently high number of partitions, but
with 100 partitions you’re going to have a miserable time.

 

If your data model is such that you’re only ever going to have 100 partitions, 
your data model is broken, or you should use some other database. 

 

 

 

From: S Ahmed 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, September 21, 2016 at 2:40 PM
To: "user@cassandra.apache.org" 
Subject: understanding partitions and # of nodes

 

Hello, 

 

If you have a 10 node cluster, how does having 10 partitions or 100 partitions 
change how cassandra will perform?

 

With 10 partitions you will have 1 partition per node.

WIth 100 partitions you will have 10 partitions per node.

 

With 100 partitions I guess it helps because when you add more nodes to your 
cluster, the data can be redistributed since you have more nodes.

 

What else are things to consider?

 

Thanks.


