Re: Cluster schema version choosing

2019-05-21 Thread Aleksey Korolkov
Thanks for the feedback.
I also think the nodes choose something like "last one wins", but I could not
find any timestamp of schema creation in the system tables.
I hope it is not just the ordering of an element in a Map or List :)


On Tue, 21 May 2019 at 02:58, Stefan Miklosovic <
stefan.mikloso...@instaclustr.com> wrote:

> My guess is that the "latest" schema would be chosen, but I am
> definitely interested in an in-depth explanation.
>
> On Tue, 21 May 2019 at 00:28, Alexey Korolkov 
> wrote:
> >
> > Hello team,
> > In some circumstances my cluster was split into two schema versions
> > (half of the nodes on one version and the rest on another).
> > In the process of resolving this issue, I restarted some nodes.
> > Eventually the nodes migrated to one schema, but it was not clear why they
> chose exactly that version of the schema.
> > I haven't found any explanation of the factors on which they pick the
> schema version; please help me find the algorithm for choosing the schema,
> or the classes in the source code responsible for this.
> >
> >
> >
> >
> >
> > --
> > Sincerely yours,  Korolkov Aleksey
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>

-- 
*Sincerely yours,  **Korolkov Aleksey*


Re: Cluster schema version choosing

2019-05-21 Thread Rhys Campbell
I'd hazard a guess that the UUID contains a datetime component.

Aleksey Korolkov  wrote on Tue, 21 May 2019, 09:36:

> Thanks for the feedback.
> I also think the nodes choose something like "last one wins", but I could not
> find any timestamp of schema creation in the system tables.
> I hope it is not just the ordering of an element in a Map or List :)
>
>
> On Tue, 21 May 2019 at 02:58, Stefan Miklosovic <
> stefan.mikloso...@instaclustr.com> wrote:
>
>> My guess is that the "latest" schema would be chosen, but I am
>> definitely interested in an in-depth explanation.
>>
>> On Tue, 21 May 2019 at 00:28, Alexey Korolkov 
>> wrote:
>> >
>> > Hello team,
>> > In some circumstances my cluster was split into two schema versions
>> > (half of the nodes on one version and the rest on another).
>> > In the process of resolving this issue, I restarted some nodes.
>> > Eventually the nodes migrated to one schema, but it was not clear why they
>> chose exactly that version of the schema.
>> > I haven't found any explanation of the factors on which they pick the
>> schema version; please help me find the algorithm for choosing the schema,
>> or the classes in the source code responsible for this.
>> >
>> >
>> >
>> >
>> >
>> > --
>> > Sincerely yours,  Korolkov Aleksey
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>
>
> --
> *Sincerely yours,  **Korolkov Aleksey*
>


Re: Cluster schema version choosing

2019-05-21 Thread Aleksey Korolkov
Unfortunately not; I had the same idea, but it is not a timeuuid. Here is an
example from my cluster: 5edde338-ce0d-3ead-bbee-63010ffbee6d
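
For what it's worth, a quick way to check what kind of UUID this is (a minimal
Java sketch; the pointer to Cassandra's digest code is my reading of the
source, so please verify it):

    import java.util.UUID;

    public class SchemaVersionCheck {
        public static void main(String[] args) {
            // The schema version reported above.
            UUID schemaVersion = UUID.fromString("5edde338-ce0d-3ead-bbee-63010ffbee6d");

            // version() == 3 means a name-based (MD5) UUID: a content digest,
            // not a timeuuid, so it carries no creation timestamp and no ordering.
            System.out.println("UUID version: " + schemaVersion.version()); // prints 3

            // A name-based UUID is derived purely from the input bytes; as far
            // as I can tell, Cassandra derives the schema version the same way
            // from the serialized schema tables (SchemaKeyspace.calculateSchemaDigest
            // in 3.x, if I read the source correctly).
            UUID digest = UUID.nameUUIDFromBytes("example schema content".getBytes());
            System.out.println("Digest UUID version: " + digest.version()); // also 3
        }
    }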

On Tue, 21 May 2019 at 10:40, Rhys Campbell
 wrote:

> I'd hazard a guess that the UUID contains a datetime component.
>
> Aleksey Korolkov  wrote on Tue, 21 May 2019,
> 09:36:
>
>> Thanks for the feedback.
>> I also think the nodes choose something like "last one wins", but I could not
>> find any timestamp of schema creation in the system tables.
>> I hope it is not just the ordering of an element in a Map or List :)
>>
>>
>> On Tue, 21 May 2019 at 02:58, Stefan Miklosovic <
>> stefan.mikloso...@instaclustr.com> wrote:
>>
>>> My guess is that the "latest" schema would be chosen, but I am
>>> definitely interested in an in-depth explanation.
>>>
>>> On Tue, 21 May 2019 at 00:28, Alexey Korolkov 
>>> wrote:
>>> >
>>> > Hello team,
>>> > In some circumstances my cluster was split into two schema versions
>>> > (half of the nodes on one version and the rest on another).
>>> > In the process of resolving this issue, I restarted some nodes.
>>> > Eventually the nodes migrated to one schema, but it was not clear why
>>> they chose exactly that version of the schema.
>>> > I haven't found any explanation of the factors on which they pick the
>>> schema version; please help me find the algorithm for choosing the schema,
>>> or the classes in the source code responsible for this.
>>> >
>>> >
>>> >
>>> >
>>> >
>>> > --
>>> > Sincerely yours,  Korolkov Aleksey
>>>
>>> -
>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>
>>>
>>
>> --
>> *Sincerely yours,  **Korolkov Aleksey*
>>
>

-- 
*Sincerely yours,  **Korolkov Aleksey*


Re: Cluster schema version choosing

2019-05-21 Thread Aleksey Korolkov
It seems that we can compare the UUIDs and find the biggest or smallest.
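
Although they are comparable, I suspect (an assumption on my part, worth
verifying in the source) that a digest-style UUID gives an arbitrary ordering,
so the "biggest" one would not necessarily be the newest. A tiny sketch:

    import java.util.UUID;

    public class SchemaVersionOrdering {
        public static void main(String[] args) {
            // Two made-up schema digests; the byte values carry no notion of time.
            UUID a = UUID.nameUUIDFromBytes("schema with table t1".getBytes());
            UUID b = UUID.nameUUIDFromBytes("schema with tables t1 and t2".getBytes());

            // compareTo() orders them numerically, but the result says nothing
            // about which schema was created later.
            System.out.println(a.compareTo(b));
        }
    }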

On Tue, 21 May 2019 at 11:15, Aleksey Korolkov  wrote:

> Unfortunately not; I had the same idea, but it is not a timeuuid. Here is an
> example from my cluster: 5edde338-ce0d-3ead-bbee-63010ffbee6d
>
> On Tue, 21 May 2019 at 10:40, Rhys Campbell
>  wrote:
>
>> I'd hazard a guess that the UUID contains a datetime component.
>>
>> Aleksey Korolkov  wrote on Tue, 21 May 2019,
>> 09:36:
>>
>>> Thanks for the feedback.
>>> I also think the nodes choose something like "last one wins", but I could not
>>> find any timestamp of schema creation in the system tables.
>>> I hope it is not just the ordering of an element in a Map or List :)
>>>
>>>
>>> On Tue, 21 May 2019 at 02:58, Stefan Miklosovic <
>>> stefan.mikloso...@instaclustr.com> wrote:
>>>
>>>> My guess is that the "latest" schema would be chosen, but I am
>>>> definitely interested in an in-depth explanation.
>>>>
>>>> On Tue, 21 May 2019 at 00:28, Alexey Korolkov 
>>>> wrote:
>>>> >
>>>> > Hello team,
>>>> > In some circumstances my cluster was split into two schema versions
>>>> > (half of the nodes on one version and the rest on another).
>>>> > In the process of resolving this issue, I restarted some nodes.
>>>> > Eventually the nodes migrated to one schema, but it was not clear why
>>>> they chose exactly that version of the schema.
>>>> > I haven't found any explanation of the factors on which they pick the
>>>> schema version; please help me find the algorithm for choosing the schema,
>>>> or the classes in the source code responsible for this.
>>>> >
>>>> > --
>>>> > Sincerely yours,  Korolkov Aleksey
>>>>
>>>> -
>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>> For additional commands, e-mail: user-h...@cassandra.apache.org


>>>
>>> --
>>> *Sincerely yours,  **Korolkov Aleksey*
>>>
>>
>
> --
> *Sincerely yours,  **Korolkov Aleksey*
>


-- 
*Sincerely yours,  **Korolkov Aleksey*


Re: Optimal Heap Size Cassandra Configuration

2019-05-21 Thread Alain RODRIGUEZ
Hello,

I completely agree with Elliott above that it is hard to say what *this
cluster* needs. That said, my colleague Jon wrote a small guide on how to tune
this in most cases, or at least as a starting point. We often write posts on
questions that come up repeatedly on the mailing list; that gives us hints on
which topics are worth covering (rather than repeating the same answer here
every day :)). This was definitely one of the most 'demanded' topics, and I
believe it might be really helpful to you:
https://thelastpickle.com/blog/2018/04/11/gc-tuning.html

No one starts working on garbage collection unless they have to, because for
many it is a different world they don't know about. That was my case: I did not
touch GC for my first two years. But really, you'll see that you can reason
about GC and make things much better in some cases. The first time I changed
the GC settings, I halved the cluster size and still halved the latency. The
improvements were substantial, so it was worth the attention we put into it.

In addition, to break the ice with GC, I found http://gceasy.io to be an
excellent way to monitor and troubleshoot GC. Feed it some GC logs and it will
give you the GC throughput (the percentage of time the JVM is available, i.e.
not in a 'stop the world' pause). To give you some numbers, this should be at
least 95-98%. If your throughput is lower, chances are high that you can
'easily' improve performance there.
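
As a quick worked example (numbers made up for illustration): GC throughput is
(wall-clock time - total pause time) / wall-clock time. Over a 10-minute window
with 12 seconds of stop-the-world pauses that is (600 - 12) / 600 = 98%; at 95%
you would already be losing 30 seconds out of every 10 minutes to pauses.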

There are a lot more details in this analysis that might help you wrap your
head around GC and tune it properly.

I generally prefer using CMS, but I have seen some very successful clusters
using G1GC. G1GC is known to work better with bigger heaps. If you're going to
use 8 GB (or even 16 GB) for the heap, I would most probably stick to CMS and
tune it properly; but again, G1GC might work quite well, with much less effort,
if you can assign 16+ GB to the heap.
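
To give a rough idea of the knobs involved, here is a sketch only (these are
standard HotSpot flags; check the jvm.options or cassandra-env.sh shipped with
your Cassandra version rather than copying this):

    # Heap size (usually set min and max to the same value)
    -Xms8G
    -Xmx8G

    # CMS, the historical Cassandra default
    -XX:+UseConcMarkSweepGC
    -XX:+UseCMSInitiatingOccupancyOnly
    -XX:CMSInitiatingOccupancyFraction=75

    # Or G1, typically with a larger heap
    #-XX:+UseG1GC
    #-XX:MaxGCPauseMillis=500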

Work on a canary node (just one random node) while changing this, then observe
the logs with GCeasy. Repeat until you're happy with it; I would be happy with
about 95% to 98% GC throughput (i.e. 2 to 5% of the time in pauses). But what
really matters is that after the changes you have better latency, fewer dropped
messages, etc. You can measure the impact in GC throughput. When the workload
seems optimised enough, or you're tired of playing with GC, you can apply the
changes everywhere and observe the impact on the cluster (latency / dropped
messages / CPU load...).

Hope that helps and complements Elliott's excellent answer somewhat.
---
Alain Rodriguez - al...@thelastpickle.com
France / Spain

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

On Mon, 20 May 2019 at 23:31, Elliott Sims  wrote:

> It's not really something that can be easily calculated based on write
> rate, but more something you have to find empirically and adjust
> periodically.
> Generally speaking, I'd start by running "nodetool gcstats" or similar and
> just see what the GC pause stats look like.  If it's not pausing much or
> for long, you're good.  If it is, you'll likely need to do some tuning
> based on GC logging which may involve increasing the heap but could also
> mean decreasing it or changing the collection strategy.
>
> Generally speaking, with G1GC you can get away with just setting a larger
> heap than you really need and it's close enough to optimal.  CMS is
> theoretically more efficient, but far more complex to get tuned properly
> and tends to fail more dramatically.
>
> On Mon, May 20, 2019 at 7:38 AM Akshay Bhardwaj <
> akshay.bhardwaj1...@gmail.com> wrote:
>
>> Hi Experts,
>>
>> I have a 5 node cluster with 8 core CPU and 32 GiB RAM
>>
>> If I have a write rate of 5K TPS and a read rate of 8K TPS, I want to know
>> what the optimal heap size configuration is for each Cassandra node.
>>
>> Currently, the heap size is set at 8 GB. How can I tell whether Cassandra
>> needs more or less heap memory?
>>
>> Akshay Bhardwaj
>> +91-97111-33849
>>
>


Re: Cassandra cross dc replication row isolation

2019-05-21 Thread Alexey Knyshev
Anyone?
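
To make the question concrete, here is a minimal sketch of what I am doing
(DataStax Java driver 3.x; keyspace, table and column names are made up):

    import com.datastax.driver.core.*;

    public class CrossDcExample {
        public static void main(String[] args) {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            Session session = cluster.connect("my_keyspace");

            // Single-partition insert: all cells of the row are written together.
            Statement write = new SimpleStatement(
                    "INSERT INTO events (pk, col1, col2) VALUES (?, ?, ?)", 42, "a", "b")
                    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            session.execute(write);

            // A read in the remote DC may not see the row yet; the question is
            // whether it can ever see only some of the cells of that row.
            Statement read = new SimpleStatement(
                    "SELECT * FROM events WHERE pk = ?", 42)
                    .setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);
            Row row = session.execute(read).one();
            System.out.println(row);

            cluster.close();
        }
    }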

On Wed, 8 May 2019 at 11:37, Alexey Knyshev  wrote:

> Hi, thanks for your answers!
>
> > Are you asking if writes are atomic at the partition level? If so, yes.
> If you have N columns in a simple k/v schema and you send a write with X of
> those N columns set, all X will be updated at the same time wherever that
> write goes.
>
> Even for cross-DC replication? (And yes, this has nothing to do with CL at all.)
> I'm not familiar with Cassandra internals and implementation details, but
> if I understand correctly there are at least 3 possible implementations for
> replication:
>
>    1. Row level: each row is eventually consistent as a bunch of cells. The
>    user cannot see a mix of two concurrent updates; they can only see the row
>    as it was before or after a write, never in an intermediate state (writes
>    are linearised). This does not look like Cassandra's case, because it would
>    require a read-before-write to achieve, and that would have a huge impact
>    on Cassandra's basic properties, as it is an AP system in general.
>    2. Write level: each write is applied/replicated atomically, which
>    guarantees isolation across ONLY the columns updated by that write. Enough
>    for my case.
>    3. Cell level: each cell "lives" its own life without any synchronisation
>    with the other cells in the row (excluding the PK of course). Worst-case
>    scenario.
>
> To clarify my case a bit: I just insert rows with a unique PK (no overwrites
> happen) into the CF and hope that each row will be (eventually) replicated
> atomically, with isolation guarantees (whole row or nothing), to the other
> datacenter.
>
> On Wed, 8 May 2019 at 01:15, Avinash Mandava  wrote:
>
>> Are you asking if writes are atomic at the partition level? If so, yes. If
>> you have N columns in a simple k/v schema and you send a write with X of
>> those N columns set, all X will be updated at the same time wherever that
>> write goes.
>>
>> The CL thing is more about how tolerant you are to stale data, i.e. if
>> you write in one DC and you absolutely can't tolerate reads from a remote
>> DC showing stale data, you would have to write at EACH_QUORUM and read at
>> LOCAL_QUORUM. While I'm not one for blanket advice, and certainly you can
>> make the decision on this tradeoff, this is a last resort situation, one of
>> those "supported features" that you ought to be wary of, as it's a bit off
>> from the intended design/usage of the system.
>>
>> On Tue, May 7, 2019 at 2:58 PM Rahul Singh 
>> wrote:
>>
>>> Depends on the consistency level you are setting on write and read.
>>>
>>> What CL are you writing at and what CL are you reading at?
>>>
>>> The consistency level tells the coordinator when to send acknowledgement
>>> of a write and whether to cross DCs to confirm a write. It also tells the
>>> coordinator how many replicas to read and whether or not to cross  DCs to
>>> get consensus.
>>>
>>> E.g. LOCAL_QUORUM is different from QUORUM.
>>> LOCAL_QUORUM guarantees the data was saved to a quorum of nodes in the DC in
>>> which the coordinator accepted the write; similarly, it would only check
>>> nodes in that DC. QUORUM would check across DCs in the whole cluster.
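
(As a concrete illustration, assuming RF = 3 in each of two DCs: LOCAL_QUORUM
needs 2 replicas in the coordinator's DC, EACH_QUORUM needs 2 replicas in every
DC, and plain QUORUM needs floor((3 + 3) / 2) + 1 = 4 replicas anywhere in the
cluster.)
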
>>> On May 7, 2019, 12:11 PM -0500, Alexey Knyshev ,
>>> wrote:
>>>
>>> Hi there!
>>>
>>> Could someone please explain how a Column Family would be replicated and
>>> become "visible / readable" in the following scenario? We have multiple
>>> geo-distributed datacenters with significant latency (up to 100 ms RTT).
>>> Let's name two of them A and B and consider the following two cases:
>>>
>>>    1. Cassandra client X inserts a row into a Column Family (CF) with
>>>    Primary Key = PK (all cells are set - no nulls possible). The write
>>>    coordinator is in DC A. All cells in this write should have the same
>>>    writetime; for simplicity, let's assume the Cassandra coordinator node
>>>    sets the writetime. After some amount of time (< RTT), client Y reads the
>>>    whole row (select * ...) from the same CF with the same PK, talking to a
>>>    coordinator node in the other DC (B). Is it possible that client Y will
>>>    get some cells as NULLs - I mean, is it possible to read some
>>>    already-replicated cells and get NULLs for the others - or does Cassandra
>>>    guarantee row-level isolation / an atomic write for that insert? Assume
>>>    that the row (all cells for the same PK) will never be updated / deleted
>>>    afterwards.
>>>    2. Same as in point 1, but after the first write to PK the same client
>>>    (X) updates some columns for the same PK. Will this update be isolated /
>>>    atomically written and eventually visible in the other DC? Will a client
>>>    see the isolated state as it was before the write or after it?
>>>
>>> Thanks in advance!
>>>
>>>
>>> --
>>> linkedin.com/profile
>>> 
>>>
>>> github.com/alexeyknyshev
>>> bitbucket.org/alexeyknyshev
>>>
>>>
>>
>> --
>> www.vorstella.com
>> 408 691 8402
>>
>
>
> --
> linkedin.com/profile
> 

Unsubscribe

2019-05-21 Thread srinivas rao



Unsubscribe

2019-05-21 Thread A



Sent from Yahoo Mail for iPhone