Does LOCAL_ONE still replicate data?

2018-05-08 Thread Jakub Lida
Hi,

I want to add a new DC to an existing cluster (RF=1 per DC).
If I set the consistency level to LOCAL_ONE on all machines, will write requests
sent to the online DCs still be replicated to all DCs (including the new one
being rebuilt), while only read requests are kept from reaching the new DC?
That is basically what I want to accomplish.

Thanks in advance, Jakub


Re: Does LOCAL_ONE still replicate data?

2018-05-08 Thread Hannu Kröger
Writes are always replicated to all nodes (if they are online).

LOCAL_ONE for writes just means that the client will get an “OK” for the write only 
after at least one node in the local datacenter has acknowledged that the write is done.

If all local replicas are offline, then the write will fail even if it gets 
written in your other DC.
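
Which nodes receive a write is determined by the keyspace's replication strategy, not 
by the consistency level. As a sketch (the keyspace and DC names here are placeholders 
for your own), the per-DC replica counts come from something like:

cqlsh -e "DESCRIBE KEYSPACE my_ks;"
  ... replication = {'class': 'NetworkTopologyStrategy', 'DC1': '1', 'DC2': '1'} ...

With one replica per DC, every write is sent to the replica in each listed DC; 
LOCAL_ONE only controls how many local acknowledgements the coordinator waits for 
before replying to the client.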

Hannu

> On 8 May 2018, at 13:24, Jakub Lida  wrote:
> 
> Hi,
> 
> I want to add a new DC to an existing cluster (RF=1 per DC).
> If I set the consistency level to LOCAL_ONE on all machines, will write requests 
> sent to the online DCs still be replicated to all DCs (including the new one 
> being rebuilt), while only read requests are kept from reaching the new DC? That is 
> basically what I want to accomplish.
> 
> Thanks in advance, Jakub



Re: compaction: huge number of random reads

2018-05-08 Thread Kyrylo Lebediev
You are right, Kurt - that's what I was trying to do: lowering the compression chunk 
size and the device read-ahead.

Column-family settings: "compression = {'chunk_length_kb': '16', 
'sstable_compression': 'org.apache.cassandra.io.compress.SnappyCompressor'}"
Device read-ahead: blockdev --setra 8 

I had to fall back to the default RA of 256 and after that got large merged reads and 
low IOPS with good MB/s.
I believe it's not caused by C* settings, but by something in the filesystem / 
IO-related kernel settings (or maybe it's by design?).


I tried to emulate C* reads during compactions with dd:


**  RA=8 (4k)

# blockdev --setra 8 /dev/xvdb
# dd if=/dev/zero of=/data/ZZZ
^C16980952+0 records in
16980951+0 records out
8694246912 bytes (8.7 GB, 8.1 GiB) copied, 36.4651 s, 238 MB/s
# sync

# echo 3 > /proc/sys/vm/drop_caches
# dd if=/data/ZZZ of=/dev/null
^C846513+0 records in
846512+0 records out
433414144 bytes (433 MB, 413 MiB) copied, 21.4604 s, 20.2 MB/s   <

High IOPS in this case, I/O size = 4k.
Interestingly, setting bs=128k in dd didn't decrease the IOPS; the I/O size was 
still 4k.


** RA=256 (128k):
# blockdev --setra 256 /dev/xvdb
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/data/ZZZ of=/dev/null
^C15123937+0 records in
15123936+0 records out
7743455232 bytes (7.7 GB, 7.2 GiB) copied, 60.8407 s, 127 MB/s  <<

I/O size 128k, low IOPS, good throughput (limited by EBS bandwidth).

Writes were fine in both cases: I/O size 128k, good throughput limited by EBS 
bandwidth only.

Is the above situation typical for a small read-ahead (the "price for small fast 
reads"), or is something wrong with my setup?
[This isn't an XFS mailing list, but somebody here may know this:] Why, with a small 
RA, are even large reads (bs=128k) converted into multiple small reads?
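
One more thing I may try to narrow it down (just a sketch, not verified on this box): 
reading the file with O_DIRECT bypasses the page cache and its read-ahead logic, so dd 
should submit I/O at roughly the requested block size:

# blockdev --setra 8 /dev/xvdb
# echo 3 > /proc/sys/vm/drop_caches
# dd if=/data/ZZZ of=/dev/null bs=128k iflag=direct
# iostat -dkx 1      <- watch avgrq-sz for xvdb in another terminal

If avgrq-sz is around 128k here, the 4k splitting in the buffered test comes from the 
kernel's read-ahead window capping buffered reads, not from EBS or XFS themselves.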

Regards,
Kyrill



From: kurt greaves 
Sent: Tuesday, May 8, 2018 2:12:40 AM
To: User
Subject: Re: compaction: huge number of random reads

If you've got small partitions/small reads you should test lowering your 
compression chunk size on the table and disabling read ahead. This sounds like 
it might just be a case of read amplification.
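
Something along these lines (keyspace/table names are placeholders, 2.1 syntax; the 
device is the one from your iostat output):

cqlsh -e "ALTER TABLE my_ks.my_table WITH compression = {'sstable_compression': 'SnappyCompressor', 'chunk_length_kb': '16'};"
blockdev --setra 0 /dev/xvdb    # disable read-ahead on the data volume

Note that the new chunk size only applies to SSTables written after the change 
(existing ones keep their old chunk size until they are rewritten by compaction or 
upgradesstables), and a smaller chunk trades a bit of compression ratio and 
offset-index memory for smaller reads.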

On Tue., 8 May 2018, 05:43 Kyrylo Lebediev <kyrylo_lebed...@epam.com> wrote:

Dear Experts,


I'm observing strange behavior on a 2.1.20 cluster during compactions.


My setup is:

12 nodes  m4.2xlarge (8 vCPU, 32G RAM) Ubuntu 16.04, 2T EBS gp2.

Filesystem: XFS, blocksize 4k, device read-ahead - 4k

/sys/block/xvdb/queue/nomerges = 0

SizeTieredCompactionStrategy


After data loads, when effectively nothing else is talking to the cluster and 
compaction is the only activity, I see something like this:
$ iostat -dkx 1
...


Device:  rrqm/s  wrqm/s     r/s    w/s     rkB/s     wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
xvda       0.00    0.00    0.00   0.00      0.00      0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdb       0.00    0.00 4769.00 213.00  19076.00  26820.00    18.42     7.95   1.17    1.06    3.76   0.20 100.00

Device:  rrqm/s  wrqm/s     r/s    w/s     rkB/s     wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
xvda       0.00    0.00    0.00   0.00      0.00      0.00     0.00     0.00   0.00    0.00    0.00   0.00   0.00
xvdb       0.00    0.00 6098.00 177.00  24392.00  22076.00    14.81     6.46   1.36    0.96   15.16   0.16 100.00

Writes are fine: 177 writes/sec <-> ~22Mbytes/sec,

But for some reason compactions generate a huge number of small reads:
6098 reads/s <-> ~24Mbytes/sec.  ===>   Read size is 4k


Why am I getting a huge number of 4k reads instead of a much smaller number of 
large reads?

What could be the reason?


Thanks,

Kyrill




Re: Does LOCAL_ONE still replicate data?

2018-05-08 Thread Lucas Benevides
Yes, but keep in mind that there is write consistency and read consistency.
To prevent reads from reaching the other DC, you should set the read
consistency to LOCAL_ONE.
As Hannu Kröger said, LOCAL_ONE may be enough for you, but maybe not if
you want to be sure that your data was also written in the other DC.
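
One way to double-check that reads really stay local (a sketch; the keyspace, table
and column names are placeholders) is to enable tracing in cqlsh and look at which
nodes show up in the trace:

cqlsh> CONSISTENCY LOCAL_ONE;
cqlsh> TRACING ON;
cqlsh> SELECT * FROM my_ks.my_table WHERE id = 1;

With read consistency LOCAL_ONE, the trace should only show coordinator and replica
activity from nodes in the datacenter the client is connected to.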

Lucas B. Dias


2018-05-08 7:26 GMT-03:00 Hannu Kröger :

> Writes are always replicated to all nodes (if they are online).
>
> LOCAL_ONE for writes just means that the client will get an “OK” for the write
> only after at least one node in the local datacenter has acknowledged that the
> write is done.
>
> If all local replicas are offline, then the write will fail even if it
> gets written in your other DC.
>
> Hannu
>
>
> On 8 May 2018, at 13:24, Jakub Lida  wrote:
>
> Hi,
>
> I want to add a new DC to an existing cluster (RF=1 per DC).
> If I set the consistency level to LOCAL_ONE on all machines, will write requests
> sent to the online DCs still be replicated to all DCs (including the new one
> being rebuilt), while only read requests are kept from reaching the new DC?
> That is basically what I want to accomplish.
>
> Thanks in advance, Jakub
>
>
>


Re: Does LOCAL_ONE still replicate data?

2018-05-08 Thread shalom sagges
It's advisable to set the RF to 3 regardless of the consistency level.

If you're using RF=1 with read CL=LOCAL_ONE and a node goes down in the local DC, you
will not be able to read the data owned by that node until it comes back up.
For writes with CL=LOCAL_ONE, the write will fail (if it falls in the token
ranges of the downed node).
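
If you do go to RF=3, the change is along these lines (keyspace and DC names are
placeholders), followed by a repair so the new replicas actually receive the
existing data:

cqlsh -e "ALTER KEYSPACE my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3, 'DC2': 3};"
nodetool repair -pr my_ks    # run on every node, one node at a time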



On Tue, May 8, 2018 at 3:05 PM, Lucas Benevides  wrote:

> Yes, but keep in mind that there is write consistency and read consistency.
> To prevent reads from reaching the other DC, you should set the read
> consistency to LOCAL_ONE.
> As Hannu Kröger said, LOCAL_ONE may be enough for you, but maybe not if
> you want to be sure that your data was also written in the other DC.
>
> Lucas B. Dias
>
>
> 2018-05-08 7:26 GMT-03:00 Hannu Kröger :
>
>> Writes are always replicated to all nodes (if they are online).
>>
>> LOCAL_ONE for writes just means that the client will get an “OK” for the write
>> only after at least one node in the local datacenter has acknowledged that the
>> write is done.
>>
>> If all local replicas are offline, then the write will fail even if it
>> gets written in your other DC.
>>
>> Hannu
>>
>>
>> On 8 May 2018, at 13:24, Jakub Lida  wrote:
>>
>> Hi,
>>
>> I want to add a new DC to an existing cluster (RF=1 per DC).
>> If I set the consistency level to LOCAL_ONE on all machines, will write requests
>> sent to the online DCs still be replicated to all DCs (including the new one
>> being rebuilt), while only read requests are kept from reaching the new DC?
>> That is basically what I want to accomplish.
>>
>> Thanks in advance, Jakub
>>
>>
>>
>


missing cpu matrix in grafana

2018-05-08 Thread sunny kumar
Hi

In one of my prod clusters, the CPU usage metrics are missing in Grafana. Why is this
happening? Any ideas?