RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
Hi Benjamin,

I have 1 Cr (10 million) records in my Java ArrayList, and yes, I am writing in
sync mode. My table is as below:

CREATE TABLE XXX_YY_MMS (
date timestamp,
userid text,
time timestamp,
xid text,
addimid text,
advcid bigint,
algo bigint,
alla text,
aud text,
bmid text,
ctyid text,
bid double,
ctxid text,
devipid text,
gmid text,
ip text,
itcid bigint,
iid text,
metid bigint,
osdid text,
paid int,
position text,
pcid bigint,
refurl text,
sec text,
siid bigint,
tmpid bigint,
xforwardedfor text,
PRIMARY KEY (date, userid, time, xid)
) WITH CLUSTERING ORDER BY (userid ASC, time ASC, xid ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = '{"keys":"ALL", "rows_per_partition":"NONE"}'
AND comment = ''
AND compaction = {'class': 
'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy'}
AND compression = {'sstable_compression': 
'org.apache.cassandra.io.compress.LZ4Compressor'}
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99.0PERCENTILE';

So please let me know what I am missing.

And is the below config fine for this hardware?

concurrent_reads: 32
concurrent_writes: 64
concurrent_counter_writes: 32
compaction_throughput_mb_per_sec: 32
concurrent_compactors: 8

thanks,
Abhishek

From: Benjamin Roth [mailto:benjamin.r...@jaumo.com]
Sent: Wednesday, November 23, 2016 12:56 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra Config as per server hardware for heavy write

This is ridiculously slow for that hardware setup. Sounds like you are
benchmarking with a single thread and/or sync queries, or very large writes.
A setup like this should easily be able to handle tens of thousands of writes/s.

2016-11-23 8:02 GMT+01:00 Jonathan Haddad <j...@jonhaddad.com>:
How are you benchmarking that?
On Tue, Nov 22, 2016 at 9:16 PM Abhishek Kumar Maheshwari
<abhishek.maheshw...@timesinternet.in> wrote:
Hi,

I have 8 servers in my Cassandra cluster. Each server has 64 GB RAM, 40 cores,
and 8 SSDs. Currently I have the below config in cassandra.yaml:

concurrent_reads: 32
concurrent_writes: 64
concurrent_counter_writes: 32
compaction_throughput_mb_per_sec: 32
concurrent_compactors: 8

With this configuration, I can write 1700 requests/sec per server.

But our desired write performance is 3000-4000 requests/sec per server. As per
my understanding, the max values for these parameters can be as below:
concurrent_reads: 32
concurrent_writes: 128 (8 * 16 cores)
concurrent_counter_writes: 32
compaction_throughput_mb_per_sec: 128
concurrent_compactors: 8 or 16 (as I have 8 SSDs and 16 cores reserved for this)

Please let me know if this is fine or if I need to tune some other parameters
to speed up writes.


Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA



--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


Re: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Benjamin Roth
There is cassandra-stress to benchmark your cluster.

See docs here:
https://docs.datastax.com/en/cassandra/3.x/cassandra/tools/toolsCStress.html?hl=stress
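
For example, a quick write-only run against the cluster looks something like
this (the node address, row count, and thread count below are placeholders,
not from this thread):

    cassandra-stress write n=1000000 cl=ONE -rate threads=100 -node 10.20.30.40

This uses the tool's built-in schema, so it measures what the cluster itself
can sustain, independent of your data model.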


-- 
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer


How to Choose a Version for Upgrade

2016-11-23 Thread Shalom Sagges
Hi Everyone,

I was wondering how to choose the proper, most stable Cassandra version for
a production environment.
Should I follow the version used in DataStax Enterprise (in this case 3.0.10),
or is there a better way of figuring this out?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections




Re: How to Choose a Version for Upgrade

2016-11-23 Thread Vladimir Yudovin
Hi Shalom,

There has been a lot of discussion on this topic, but it seems that for now we
can call the 3.0.x line the most stable. If you don't need a specific feature
from the 3.x line, take 3.0.10.





Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting


RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Vladimir Yudovin
>I have 1Cr records in my Java ArrayList and yes I am writing in sync mode.

Is your Java program single-threaded?



Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting, Zero production time


RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
No, I am using 100 threads.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA


Re: How to Choose a Version for Upgrade

2016-11-23 Thread Shalom Sagges
Thanks Vladimir!


Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections





RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Vladimir Yudovin
So do you see write-speed saturation at this number of threads? Does doubling
to 200 bring an increase?





Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting, Zero production time

Re: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread siddharth verma
Hi Abhishek,
You could check whether the throttling is on the client-side queries or on
the Cassandra side.
You could also use Grafana to monitor the cluster.
As you said, you are using 100 threads; that alone doesn't tell us whether
you are pushing the Cassandra cluster to its limit.

As Benjamin suggested, you could use the cassandra-stress tool.

Lastly, if after everything (and you are sure that Cassandra seems slow) the
TPS still comes out at the numbers you reported, you could check your schema:
many rows in one partition key, read queries, read/write load, write queries
with Batch/LWT, compactions running, etc.


For checking ONLY Cassandra throughput, you could use cassandra-stress with
any schema of your choice, for example:
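
A sketch of such a run (the profile file name, node address, and thread count
are placeholders); the YAML profile holds your own CREATE TABLE plus the
column and insert distributions:

    cassandra-stress user profile=./mms_profile.yaml "ops(insert=1)" n=1000000 -rate threads=200 -node 10.20.30.40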

Regards



RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
Hi Vladimir,

I tried the same, but it doesn't increase. Also, in Grafana, average write
latency is about 10 ms.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA

Re: data not replicated on new node

2016-11-23 Thread Oleksandr Shulgin
On Tue, Nov 22, 2016 at 5:23 PM, Bertrand Brelier <
bertrand.brel...@gmail.com> wrote:

> Hello Shalom.
>
> No I really went from 3.1.1 to 3.0.9 .
>
So you've just installed version 3.0.9 and restarted with it? I wonder if
that's really supported?

Regards,
--
Alex


RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
Hi Siddharth,

For me it seems to be on the Cassandra side, because I have a list with 1 Cr
records and I am just iterating over it and executing the query.
Also, I tried with 200 threads, but the speed still doesn't increase as much
as expected. On Grafana, write latency is about 10 ms.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA

RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Vladimir Yudovin
>I have a list with 1cr record. I am just iterating on it and executing the 
query. Also, I try with 200 thread

Do you fetch each list item and hand it to a separate thread to perform the
CQL query? Also, how exactly do you connect to Cassandra?

If you use the synchronous API, it's better to create a connection pool (with
TokenAwarePolicy) and then pass each item to a separate thread.
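
A minimal sketch of that suggestion, assuming the DataStax Java driver
Cluster/Session API used elsewhere in this thread (pool sizes, and names such
as hostAddresses, li, AdLog, and toStatement, are illustrative placeholders,
not from this thread):

    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.HostDistance;
    import com.datastax.driver.core.PoolingOptions;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
    import com.datastax.driver.core.policies.TokenAwarePolicy;

    // Several TCP connections per host, so threads don't all queue on one socket
    PoolingOptions pooling = new PoolingOptions()
            .setConnectionsPerHost(HostDistance.LOCAL, 4, 8);

    Cluster cluster = Cluster.builder()
            .addContactPoints(hostAddresses)
            .withPoolingOptions(pooling)
            .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
            .build();
    Session session = cluster.connect();

    // Fixed pool of workers, each issuing synchronous writes over the shared Session
    ExecutorService workers = Executors.newFixedThreadPool(200);
    for (final AdLog adLog : li) {
        workers.submit(() -> session.execute(toStatement(adLog)));
    }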





Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting

RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
Hi,

I am submitting records to an ExecutorService; below are my client config and
code:

cluster = Cluster.builder().addContactPoints(hostAddresses)
        .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
        .withReconnectionPolicy(new ConstantReconnectionPolicy(3L))
        .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
        .build();

ExecutorService service = Executors.newFixedThreadPool(1000);
for (final AdLog adLog : li) {
    service.submit(() -> {
        // synchronous write: the worker thread blocks until the node acknowledges
        session.execute(ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog()));
        inte.incrementAndGet();  // AtomicInteger counting completed writes
    });
}



Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA

Re: data not replicated on new node

2016-11-23 Thread Malte Pickhan
Not sure if it's really related, but we experienced something similar last
Friday. I summarized it in the following issue:
https://issues.apache.org/jira/browse/CASSANDRA-12947

Best,

Malte


Reading Commit log files

2016-11-23 Thread Kamesh
Hi All,
 I am trying to read Cassandra commit log files, but am unable to do so. I am
experimenting with a 1-node cluster (laptop).

 Cassandra Version: *3.8*
 Updated cassandra.yaml with *cdc_enabled: true*

 After executing the statements below and flushing memtables, I tried reading
the commit log files, but there are no CDC events corresponding to the *test*
keyspace.

 CREATE KEYSPACE *test* WITH replication = {'class': 'SimpleStrategy',
'replication_factor': '1'};
 CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;


 INSERT INTO foo(a, b) VALUES (0, 'static0');
 INSERT INTO foo(a, b) VALUES (1, 'static1');
 INSERT INTO foo(a, b) VALUES (2, 'static2');
 INSERT INTO foo(a, b) VALUES (3, 'static3');
 INSERT INTO foo(a, b) VALUES (4, 'static4');
 INSERT INTO foo(a, b) VALUES (5, 'static5');
 INSERT INTO foo(a, b) VALUES (6, 'static6');
 INSERT INTO foo(a, b) VALUES (7, 'static7');
 INSERT INTO foo(a, b) VALUES (8, 'static8');

 Can someone please help us?

Thanks & Regards
Kamesh.


RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Vladimir Yudovin
session.execute is coming from Session session = cluster.connect(); I guess?



So actually all threads work over the same TCP connection. It's worth trying
the async API with a connection pool.
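
A minimal sketch of that async variant, reusing the names from the code above
(session, li, ktest, inte, AdLog) and assuming the same driver version; the
semaphore size is an illustrative guess, bounding the number of in-flight
requests so the client doesn't flood the cluster:

    import java.util.concurrent.Semaphore;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.ResultSetFuture;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;

    final Semaphore inFlight = new Semaphore(512);  // cap on concurrent async writes (a guess)
    for (final AdLog adLog : li) {
        inFlight.acquire();  // enclosing method must handle InterruptedException
        ResultSetFuture f = session.executeAsync(
                ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog()));
        Futures.addCallback(f, new FutureCallback<ResultSet>() {
            public void onSuccess(ResultSet rs) { inFlight.release(); inte.incrementAndGet(); }
            public void onFailure(Throwable t)  { inFlight.release(); /* log and collect for retry */ }
        });
    }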



Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting

RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
But I need to do it in sync mode as per the business requirement: if something
goes wrong, it should be replayable. That's why I am using sync mode.
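
For what it's worth, replayability does not strictly require per-write
blocking. A sketch (an illustration, not the poster's code, reusing li /
session / ktest / AdLog from the code quoted earlier) that records exactly
which writes failed so they can be replayed, while still issuing them
asynchronously:

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;
    import java.util.concurrent.CountDownLatch;
    import com.datastax.driver.core.ResultSet;
    import com.google.common.util.concurrent.FutureCallback;
    import com.google.common.util.concurrent.Futures;

    final Queue<AdLog> failed = new ConcurrentLinkedQueue<>();  // records to replay
    final CountDownLatch done = new CountDownLatch(li.size());
    for (final AdLog adLog : li) {
        Futures.addCallback(
                session.executeAsync(ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog())),
                new FutureCallback<ResultSet>() {
                    public void onSuccess(ResultSet rs) { done.countDown(); }
                    public void onFailure(Throwable t)  { failed.add(adLog); done.countDown(); }
                });
    }
    done.await();  // enclosing method must handle InterruptedException
    // "failed" now holds exactly the records that need to be replayed

In practice this would be combined with an in-flight cap like the semaphore
sketch earlier in the thread.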

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: Vladimir Yudovin [mailto:vla...@winguzone.com]
Sent: Wednesday, November 23, 2016 3:47 PM
To: user 
Subject: RE: Cassandra Config as per server hardware for heavy write

session.execute is coming from Session session = cluster.connect(); I guess?

So actually all threads work with the same TCP connection. It's worth to try 
async API with Connection Pool.

Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting


 On Wed, 23 Nov 2016 04:49:18 -0500Abhishek Kumar Maheshwari 
mailto:abhishek.maheshw...@timesinternet.in>>
 wrote 

Hi

I am submitting records to an executor service; below are my client config and
code:

cluster = Cluster.builder().addContactPoints(hostAddresses)
        .withRetryPolicy(DefaultRetryPolicy.INSTANCE)
        .withReconnectionPolicy(new ConstantReconnectionPolicy(3L))
        .withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy()))
        .build();

ExecutorService service = Executors.newFixedThreadPool(1000);
for (final AdLog adLog : li) {
    service.submit(() -> {
        session.execute(ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog()));
        inte.incrementAndGet();
    });
}



Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: Vladimir Yudovin 
[mailto:vla...@winguzone.com]
Sent: Wednesday, November 23, 2016 3:15 PM
To: user mailto:user@cassandra.apache.org>>
Subject: RE: Cassandra Config as per server hardware for heavy write

>I have a list with 1cr records. I am just iterating over it and executing the
>query. Also, I tried with 200 threads
Do you fetch each list item and hand it to a separate thread to perform the
CQL query? Also, how exactly do you connect to Cassandra?
If you use the synchronous API, it's better to create a connection pool (with
TokenAwarePolicy on each connection) and then pass each item to a separate
thread.


Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting


 On Wed, 23 Nov 2016 04:23:13 -0500Abhishek Kumar Maheshwari 
mailto:abhishek.maheshw...@timesinternet.in>>
 wrote 

Hi Siddharth,

To me it seems to be on the Cassandra side, because I have a list with 1cr
records and I am just iterating over it and executing the query.
Also, I tried with 200 threads but the speed still doesn't increase as much
as expected. On Grafana the write latency is about 10 ms.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: siddharth verma 
[mailto:sidd.verma29.l...@gmail.com]
Sent: Wednesday, November 23, 2016 2:23 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra Config as per server hardware for heavy write

Hi Abhishek,
You could check whether you are throttling queries on the client side or on
the Cassandra side.
You could also use Grafana to monitor the cluster.
As you said, you are using 100 threads, so we can't be sure whether you are
pushing the Cassandra cluster to its limit.

As Benjamin suggested, you could use the cassandra-stress tool.

Lastly, if after everything (and you are sure that Cassandra seems slow) the
TPS still comes out at the numbers you reported, you could check your schema
(many rows in one partition key), read queries, read/write load, write
queries with BATCH/LWT, compactions running, etc.

For checking ONLY Cassandra throughput, you could use cassandra-stress with
any schema of your choice.
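
For example, a write-only run along these lines (host addresses hypothetical)
gives a baseline that is independent of your client code:

cassandra-stress write n=1000000 cl=ONE -rate threads=200 -node 10.0.0.1,10.0.0.2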

Regards


On Wed, Nov 23, 2016 at 2:07 PM, Vladimir Yudovin 
mailto:vla...@winguzone.com>> wrote:
So do you see write-speed saturation at this number of threads? Does doubling
to 200 bring an increase?


Best regards, Vladimir Yudovin,
Winguzone - Cloud Cassandra Hosting, Zero 
production time


 On Wed, 23 Nov 2016 03:31:32 -0500Abhishek Kumar Maheshwari 
mailto:abhishek.maheshw...@timesinternet.in>>
 wrote 

No, I am using 100 threads.

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of Indi

Re: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Benjamin Roth
This has nothing to do with sync/async operations. An async operation is
also replayable; you receive the result in a future instead.
Have you ever dealt with async programming techniques like promises,
futures, callbacks?
Async programming does not change the fact that you get a result for your
operation, only WHERE and WHEN you get it.
Doing sync operations means the result is available in the "next line of
code", whereas an async operation means that some handler is called when the
result is there.

There are tons of articles around this in the web.
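
As an illustration of that point, here is a minimal sketch (assuming the
DataStax Java driver 3.x with Guava on the classpath; the class and queue are
hypothetical) of an async write whose failure callback keeps the statement so
the caller can replay it later:

import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;

public class ReplayableAsyncWriter {
    // Statements whose writes failed; the caller drains and re-executes these.
    private final Queue<Statement> replayQueue = new ConcurrentLinkedQueue<>();
    private final Session session;

    public ReplayableAsyncWriter(Session session) {
        this.session = session;
    }

    public void write(final Statement stmt) {
        Futures.addCallback(session.executeAsync(stmt),
            new FutureCallback<ResultSet>() {
                @Override
                public void onSuccess(ResultSet rs) {
                    // acknowledged by the cluster; nothing to replay
                }

                @Override
                public void onFailure(Throwable t) {
                    replayQueue.add(stmt); // keep it for a later replay
                }
            }, MoreExecutors.directExecutor());
    }

    public Queue<Statement> failedStatements() {
        return replayQueue;
    }
}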

2016-11-23 11:29 GMT+01:00 Abhishek Kumar Maheshwari <
abhishek.maheshw...@timesinternet.in>:

> But I need to do it in sync mode as per the business requirement. If
> something goes wrong it should be replayable. That's why I am using sync
> mode.
>
> Thanks & Regards,
> Abhishek Kumar Maheshwari
> +91- 805591 (Mobile)
> Times Internet Ltd. | A Times of India Group Company
> FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
> P Please do not print this email unless it is absolutely necessary.
> Spread environmental awareness.

RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Abhishek Kumar Maheshwari
Yes, I also tried async mode, but I got a max speed of 2500 requests/sec per
server.

ExecutorService service = Executors.newFixedThreadPool(1000);
for (final AdLog adLog : li) {
    service.submit(() -> {
        session.executeAsync(ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog()));
        inte.incrementAndGet();
    });
}
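
A possible reason for the plateau is that this loop submits writes with no
bound on the number of in-flight requests, so the client and the coordinators
queue up. A common pattern is to bound concurrency with a semaphore; a minimal
sketch (the class is hypothetical and the permit count of 512 is only a
starting point to tune):

import java.util.concurrent.Semaphore;

import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.Statement;
import com.google.common.util.concurrent.FutureCallback;
import com.google.common.util.concurrent.Futures;
import com.google.common.util.concurrent.MoreExecutors;

public class ThrottledAsyncWriter {
    private final Semaphore inFlight = new Semaphore(512); // max concurrent writes
    private final Session session;

    public ThrottledAsyncWriter(Session session) {
        this.session = session;
    }

    public void write(Statement stmt) throws InterruptedException {
        inFlight.acquire(); // blocks once 512 writes are outstanding
        Futures.addCallback(session.executeAsync(stmt),
            new FutureCallback<ResultSet>() {
                @Override
                public void onSuccess(ResultSet rs) {
                    inFlight.release();
                }

                @Override
                public void onFailure(Throwable t) {
                    inFlight.release(); // log and/or schedule a retry here
                }
            }, MoreExecutors.directExecutor());
    }
}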

Thanks & Regards,
Abhishek Kumar Maheshwari
+91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company
FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA
P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

From: Benjamin Roth [mailto:benjamin.r...@jaumo.com]
Sent: Wednesday, November 23, 2016 4:09 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra Config as per server hardware for heavy write

This has nothing to do with sync/async operations. An async operation is also
replayable; you receive the result in a future instead.
Have you ever dealt with async programming techniques like promises, futures,
callbacks?
Async programming does not change the fact that you get a result for your
operation, only WHERE and WHEN you get it.
Doing sync operations means the result is available in the "next line of code",
whereas an async operation means that some handler is called when the result is
there.

There are tons of articles around this in the web.


RE: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Vladimir Yudovin
Try to build the cluster with .withPoolingOptions.
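
A minimal sketch of what that could look like with the Java driver 3.x (class
name, pool sizes and contact point are hypothetical; tune the numbers against
your own cluster):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.HostDistance;
import com.datastax.driver.core.PoolingOptions;
import com.datastax.driver.core.Session;

public class PooledClient {
    public static void main(String[] args) {
        PoolingOptions pooling = new PoolingOptions()
            .setConnectionsPerHost(HostDistance.LOCAL, 4, 8)        // core, max per node
            .setMaxRequestsPerConnection(HostDistance.LOCAL, 1024); // in-flight per connection

        Cluster cluster = Cluster.builder()
            .addContactPoint("10.0.0.1") // replace with your hosts
            .withPoolingOptions(pooling)
            .build();
        Session session = cluster.connect();
        // ... run the write workload against this session ...
        cluster.close();
    }
}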



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






 On Wed, 23 Nov 2016 05:57:58 -0500 Abhishek Kumar Maheshwari 
 wrote 

Yes, I also tried async mode, but I got a max speed of 2500 requests/sec per
server.

ExecutorService service = Executors.newFixedThreadPool(1000);
for (final AdLog adLog : li) {
    service.submit(() -> {
        session.executeAsync(ktest.adImprLogToStatement(adLog.getAdLogType(), adLog.getAdImprLog()));
        inte.incrementAndGet();
    });
}

 

Thanks & Regards,
 Abhishek Kumar Maheshwari
 +91- 805591 (Mobile)
Times Internet Ltd. | A Times of India Group Company

FC - 6, Sector 16A, Film City,  Noida,  U.P. 201301 | INDIA

P Please do not print this email unless it is absolutely necessary. Spread 
environmental awareness.

 


Re: Reading Commit log files

2016-11-23 Thread Carlos Alonso
Hi Kamesh.

Flushing memtables to disk causes the corresponding commitlog segments to
be deleted. Once the data is flushed into SSTables it can be considered
durable (in case of a node crash, the data won't be lost), and therefore
there's no point in keeping it in the commitlog as well.

Try without flushing and see if you can see your operations there.

Regards

On Wed, 23 Nov 2016 at 11:04 Kamesh  wrote:

> Hi All,
>  I am trying to read Cassandra commit log files, but unable to do it. I am
> experimenting with a 1-node cluster (laptop).
>
>  Cassandra Version : 3.8
>  Updated cassandra.yaml with cdc_enabled: true
>
>  After executing the below statements and flushing memtables, I tried
> reading the commit log files, but there are no cdc events corresponding to
> the test keyspace.
>
>  CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
> 'replication_factor': '1'};
>  CREATE TABLE foo (a int, b text, PRIMARY KEY(a)) WITH cdc=true;
>
>
>  INSERT INTO foo(a, b) VALUES (0, 'static0');
>  INSERT INTO foo(a, b) VALUES (1, 'static1');
>  INSERT INTO foo(a, b) VALUES (2, 'static2');
>  INSERT INTO foo(a, b) VALUES (3, 'static3');
>  INSERT INTO foo(a, b) VALUES (4, 'static4');
>  INSERT INTO foo(a, b) VALUES (5, 'static5');
>  INSERT INTO foo(a, b) VALUES (6, 'static6');
>  INSERT INTO foo(a, b) VALUES (7, 'static7');
>  INSERT INTO foo(a, b) VALUES (8, 'static8');
>
>  Can someone please help us.
>
> Thanks & Regards
>
> Kamesh.
>


Re: Reading Commit log files

2016-11-23 Thread Kamesh
Hi Carlos,
 Thanks for your response.
 I performed a few insert statements and ran my application without flushing.
Still not able to read the commit logs.
 However, I am able to read the commit logs of the system and system_schema
keyspaces, but not the application keyspace (the keyspace created by me).

Thanks & Regards
Kamesh.

On Wed, Nov 23, 2016 at 5:24 PM, Carlos Alonso  wrote:

> Hi Kamesh.
>
> Flushing memtables to disk causes the corresponding commitlog segments to
> be deleted. Once the data is flushed into SSTables it can be considered
> durable (in case of a node crash, the data won't be lost), and therefore
> there's no point in keeping it in the commitlog as well.
>
> Try without flushing and see if you can see your operations there.
>
> Regards


Re: Reading Commit log files

2016-11-23 Thread Carlos Alonso
Did you configure your keyspace with durable_writes = false by any chance?
That would make operations not reach the commitlog.


On Wed, 23 Nov 2016 at 13:06 Kamesh  wrote:

> Hi Carlos,
>  Thanks for your response.
>  I performed a few insert statements and ran my application without
> flushing. Still not able to read the commit logs.
>  However, I am able to read the commit logs of the system and
> system_schema keyspaces, but not the application keyspace (the keyspace
> created by me).
>
> Thanks & Regards
>
> Kamesh.


Re: Reading Commit log files

2016-11-23 Thread Kamesh
Hi Carlos,
 durable_writes = true.

 cqlsh:test> describe test;
 CREATE KEYSPACE test WITH replication = {'class': 'SimpleStrategy',
 'replication_factor': '1'} AND durable_writes = true;

Thanks & Regards
Kamesh.

On Wed, Nov 23, 2016 at 9:10 PM, Carlos Alonso  wrote:

> Did you configure your keyspace with durable_writes = false by any
> chance? That would make operations not reach the commitlog.


Re: Cassandra Config as per server hardware for heavy write

2016-11-23 Thread Manoj Khangaonkar
Hi,

What is your write consistency setting?
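
For reference, with the Java driver the write consistency level can be set per
statement or as a cluster-wide default. A minimal sketch (class, keyspace and
table names are hypothetical):

import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.QueryOptions;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class ConsistencyExample {
    // Per-statement consistency:
    static Statement writeStatement() {
        return new SimpleStatement("INSERT INTO ks.tb (id, str) VALUES (1, 'x')")
                .setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
    }

    // Or a default for every query, passed to Cluster.builder().withQueryOptions(...):
    static QueryOptions defaultConsistency() {
        return new QueryOptions().setConsistencyLevel(ConsistencyLevel.LOCAL_ONE);
    }
}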

regards

On Wed, Nov 23, 2016 at 3:48 AM, Vladimir Yudovin 
wrote:

> Try to build the cluster with .withPoolingOptions
>
> Best regards, Vladimir Yudovin,
> Winguzone - Cloud Cassandra Hosting

Row and column level tombstones

2016-11-23 Thread Andrew Cooper
What would be returned in the following example?

Row with columns exists
Row is deleted (row tombstone)
Row key is recreated

Would columns that existed before the row delete/tombstone show back up in a 
read if the row key is recreated?
My assumption is that the row-key tombstone's timestamp is taken into
consideration on the read path, and all columns with a timestamp less than the
tombstone's are ignored in the response.
I have not dug into the codebase yet.  If anyone can shed light on this 
question from their own experiences that would be helpful.

Thanks,

-Andrew


Re: Row and column level tombstones

2016-11-23 Thread Vladimir Yudovin
You are right; only new inserts after the delete are taken into account:

CREATE KEYSPACE ks WITH replication = {'class': 'SimpleStrategy',
'replication_factor': 1};

CREATE TABLE ks.tb (id int PRIMARY KEY, str text);

INSERT INTO ks.tb (id, str) VALUES (0, '');
DELETE FROM ks.tb WHERE id = 0;
INSERT INTO ks.tb (id) VALUES (0);

SELECT * FROM ks.tb;

 id | str
----+------
  0 | null

(1 rows)
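
The deciding factor is the timestamp, which you can make visible by setting it
explicitly. A sketch with hypothetical timestamps (real writes use microsecond
timestamps):

INSERT INTO ks.tb (id, str) VALUES (0, 'old') USING TIMESTAMP 1000;
DELETE FROM ks.tb USING TIMESTAMP 2000 WHERE id = 0;
INSERT INTO ks.tb (id, str) VALUES (0, 'late') USING TIMESTAMP 1500;
SELECT * FROM ks.tb;

This SELECT returns no rows: the last insert's timestamp (1500) is lower than
the partition tombstone's (2000), so it stays shadowed even though it arrived
later in wall-clock time, which matches the assumption about the read path.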




Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






Bulk Import Question

2016-11-23 Thread Joe Olson
I'm following the Cassandra bulk import example here: 
https://github.com/yukim/cassandra-bulkload-example 

Are the Cassandra data types inet, smallint, and tinyint supported by the bulk 
import CQLSSTableWriter ? 

I can't seem to get them to work... 
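
One thing worth checking is the Java types passed to addRow: CQLSSTableWriter
expects java.net.InetAddress for inet, java.lang.Short for smallint and
java.lang.Byte for tinyint, so passing e.g. an Integer for a smallint column
fails. A minimal sketch (directory, keyspace and schema are hypothetical):

import java.net.InetAddress;

import org.apache.cassandra.io.sstable.CQLSSTableWriter;

public class BulkWriteExample {
    public static void main(String[] args) throws Exception {
        String schema = "CREATE TABLE ks.hosts (ip inet, port smallint, "
                      + "flags tinyint, PRIMARY KEY (ip))";
        String insert = "INSERT INTO ks.hosts (ip, port, flags) VALUES (?, ?, ?)";

        CQLSSTableWriter writer = CQLSSTableWriter.builder()
                .inDirectory("/tmp/ks/hosts") // must already exist
                .forTable(schema)
                .using(insert)
                .build();

        // Note the value types: InetAddress, Short, Byte (not Integer).
        writer.addRow(InetAddress.getByName("10.0.0.1"), (short) 9042, (byte) 1);
        writer.close();
    }
}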


Re: failure node rejoin

2016-11-23 Thread Yuji Ito
Hi Ben,

I continue to investigate the data loss issue.
I'm going through logs and source code and trying to reproduce the data loss
issue with a simple test.
I'm also trying my destructive test with DROP instead of TRUNCATE.

BTW, I want to discuss the issue in the title, "failure node rejoin", again.

Will this issue be fixed? Other nodes should refuse such an unexpected rejoin.
Or should I be more careful when adding failed nodes back to the existing
cluster?

Thanks,
yuji


On Fri, Nov 11, 2016 at 1:00 PM, Ben Slater 
wrote:

> From a quick look I couldn’t find any defects other than the ones you’ve
> found that seem potentially relevant to your issue (if anyone else on the
> list knows of one please chime in). Maybe the next step, if you haven’t
> done so already, is to check your Cassandra logs for any signs of issues
> (ie WARNING or ERROR logs) in the failing case.
>
> Cheers
> Ben
>
> On Fri, 11 Nov 2016 at 13:07 Yuji Ito  wrote:
>
>> Thanks Ben,
>>
>> I tried 2.2.8 and could reproduce the problem.
>> So, I'm investigating some bug fixes of repair and commitlog between
>> 2.2.8 and 3.0.9.
>>
>> - CASSANDRA-12508: "nodetool repair returns status code 0 for some errors"
>>
>> - CASSANDRA-12436: "Under some races commit log may incorrectly think it
>> has unflushed data"
>>   - related to CASSANDRA-9669, CASSANDRA-11828 (the fix of 2.2 is
>> different from that of 3.0?)
>>
>> Do you know other bug fixes related to commitlog?
>>
>> Regards
>> yuji
>>
>> On Wed, Nov 9, 2016 at 11:34 AM, Ben Slater 
>> wrote:
>>
>> There have been a few commit log bugs around in the last couple of months
>> so perhaps you’ve hit something that was fixed recently. Would be
>> interesting to know the problem is still occurring in 2.2.8.
>>
>> I suspect what is happening is that when you do your initial read
>> (without flush) to check the number of rows, the data is in memtables and
>> theoretically the commitlogs but not sstables. With the forced stop the
>> memtables are lost and Cassandra should read the commitlog from disk at
>> startup to reconstruct the memtables. However, it looks like that didn’t
>> happen for some (bad) reason.
>>
>> Good news that 3.0.9 fixes the problem so up to you if you want to
>> investigate further and see if you can narrow it down to file a JIRA
>> (although the first step of that would be trying 2.2.9 to make sure it’s
>> not already fixed there).
>>
>> Cheers
>> Ben
>>
>> On Wed, 9 Nov 2016 at 12:56 Yuji Ito  wrote:
>>
>> I tried C* 3.0.9 instead of 2.2.
>> The data loss problem hasn't happened so far (without `nodetool flush`).
>>
>> Thanks
>>
>> On Fri, Nov 4, 2016 at 3:50 PM, Yuji Ito  wrote:
>>
>> Thanks Ben,
>>
>> When I added `nodetool flush` on all nodes after step 2, the problem
>> didn't happen.
>> Did replay from old commit logs delete rows?
>>
>> Perhaps the flush operation just detected that some nodes were down in
>> step 2 (just after truncating tables).
>> (Insertion and check in step 2 would succeed if one node was down because
>> the consistency level was serial.
>> If the flush failed on more than one node, the test would retry step 2.)
>> However, if so, the problem would happen without deleting Cassandra data.
>>
>> Regards,
>> yuji
>>
>>
>> On Mon, Oct 24, 2016 at 8:37 AM, Ben Slater 
>> wrote:
>>
>> Definitely sounds to me like something is not working as expected but I
>> don’t really have any idea what would cause that (other than the fairly
>> extreme failure scenario). A couple of things I can think of to try to
>> narrow it down:
>> 1) Run nodetool flush on all nodes after step 2 - that will make sure all
>> data is written to sstables rather than relying on commit logs
>> 2) Run the test with consistency level quorum rather than serial
>> (shouldn’t be any different but quorum is more widely used so maybe there
>> is a bug that’s specific to serial)
>>
>> Cheers
>> Ben
>>
>> On Mon, 24 Oct 2016 at 10:29 Yuji Ito  wrote:
>>
>> Hi Ben,
>>
>> The test without killing nodes has been working well without data loss.
>> I've repeated my test about 200 times after removing data and
>> rebuild/repair.
>>
>> Regards,
>>
>>
>> On Fri, Oct 21, 2016 at 3:14 PM, Yuji Ito  wrote:
>>
>> > Just to confirm, are you saying:
>> > a) after operation 2, you select all and get 1000 rows
>> > b) after operation 3 (which only does updates and read) you select and
>> only get 953 rows?
>>
>> That's right!
>>
>> I've started the test without killing nodes.
>> I'll report the result to you next Monday.
>>
>> Thanks
>>
>>
>> On Fri, Oct 21, 2016 at 3:05 PM, Ben Slater 
>> wrote:
>>
>> Just to confirm, are you saying:
>> a) after operation 2, you select all and get 1000 rows
>> b) after operation 3 (which only does updates and read) you select and
>> only get 953 rows?
>>
>> If so, that would be very unexpected. If you run your tests without
>> killing nodes do you get the expected (1,000) rows?
>>
>> Cheers
>> Ben
>>
>> On Fri, 21 Oct 2016 at 17:00 Yuji Ito  wrote:
>>
>> > Are you certain your tests don’t genera

Re: failure node rejoin

2016-11-23 Thread Ben Slater
You could certainly log a JIRA for the "failure node rejoin" issue
(https://issues.apache.org/jira/browse/cassandra). It sounds like unexpected
behaviour to me. However, I'm not sure it will be viewed as a high priority
to fix given there is a clear operational work-around.

Cheers
Ben

On Thu, 24 Nov 2016 at 15:14 Yuji Ito  wrote:

> Hi Ben,
>
> I continue to investigate the data loss issue.
> I'm going through logs and source code and trying to reproduce the data
> loss issue with a simple test.
> I'm also trying my destructive test with DROP instead of TRUNCATE.
>
> BTW, I want to discuss the issue in the title, "failure node rejoin", again.
>
> Will this issue be fixed? Other nodes should refuse such an unexpected
> rejoin. Or should I be more careful when adding failed nodes back to the
> existing cluster?
>
> Thanks,
> yuji