Re: Joining a cluster of nodes having multi valued initial_token parameters.

2018-03-09 Thread Mikhail Tsaplin
I suspect that cluster was created by recovering from a snapshot. PS. I asked a related question on this mailing list. Please check subject: Removing initial_token parameter. 2018-03-08 20:02 GMT+07:00 Oleksandr Shulgin : > On Thu, Mar 8, 2018 at 1:41 PM, Mikhail Tsaplin > wrote: > >> Thank you

Removing initial_token parameter

2018-03-09 Thread Mikhail Tsaplin
Is it safe to remove the initial_token parameter on a cluster created by the snapshot restore procedure presented here: https://docs.datastax.com/en/cassandra/latest/cassandra/operations/opsSnapshotRestoreNewCluster.html ? To me, it seems that the initial_token parameter is used only when nodes are started t

Re: Removing initial_token parameter

2018-03-09 Thread kurt greaves
Correct, tokens are stored in each node's system tables after the first boot, so feel free to remove them (although it's not really necessary). On 9 Mar. 2018 20:16, "Mikhail Tsaplin" wrote: > Is it safe to remove initial_token parameter on a cluster created by > snapshot restore procedure pres
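
The removal described above can be sketched as a small text edit. This is a minimal illustration, not a recommended procedure: the function name and the sample configuration values (including the token list) are hypothetical, and the sketch comments the setting out rather than deleting it so the change is easy to revert.

```python
import re


def comment_out_initial_token(yaml_text: str) -> str:
    """Return yaml_text with any active top-level initial_token line commented out."""
    out = []
    for line in yaml_text.splitlines():
        # Match only an uncommented setting that starts in column 0.
        if re.match(r"^initial_token\s*:", line):
            out.append("# " + line)
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical sample; the token values are made up for illustration.
sample = (
    "cluster_name: 'Test Cluster'\n"
    "num_tokens: 4\n"
    "initial_token: -9000,0,3000,9000\n"
)
print(comment_out_initial_token(sample))
```

Running the function a second time leaves the text unchanged, because the already commented line no longer matches. The tokens a node has stored can be inspected with `nodetool ring` before editing the file.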

uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Yasir Saleem
Hi Team, we are facing an issue of uneven data movement across the Cassandra disks, specifically on disk03 in our case: all the other disks are at around 60% of space, but disk03 is at 87%. Here is the configuration in the yaml and the current disk space: data_file_directories: - /data/disk01/ca
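
A quick way to quantify this kind of imbalance is to compare each data directory's used percentage against the median. The sketch below is only illustrative: the function name, the 15-point tolerance, the number of disks, and the exact percentages are assumptions (the post only says "around 60%" and 87%).

```python
from statistics import median


def flag_uneven_disks(used_pct: dict, tolerance: float = 15.0) -> list:
    """Return directories whose used-space percentage exceeds the median by more than `tolerance` points."""
    mid = median(used_pct.values())
    return sorted(d for d, pct in used_pct.items() if pct - mid > tolerance)


# Hypothetical figures modelled on the description in the post.
usage = {"disk01": 60.0, "disk02": 60.0, "disk03": 87.0, "disk04": 60.0}
print(flag_uneven_disks(usage))
```

On a real node the percentages could be gathered with `shutil.disk_usage(path)` for each entry in `data_file_directories`.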

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Nicolas Guyomar
Hi, This might be a compaction which is running, have you checked that? On 9 March 2018 at 11:29, Yasir Saleem wrote: > Hi Team, > > we are facing issue of uneven data movement in cassandra disk for > specific which disk03 in our case, however all the disk are consuming > around 60% of space b

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Yasir Saleem
Thanks, Nicolas Guyomar. I am new to Cassandra; here are the properties I can see in the yaml file: # of compaction, including validation compaction. compaction_throughput_mb_per_sec: 16 compaction_large_partition_warning_threshold_mb: 100 On Fri, Mar 9, 2018 at 3:33 PM, Nicolas Guyomar wrote

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Oleksandr Shulgin
On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem wrote: > Thanks, Nicolas Guyomar > > I am new to cassandra, here is the properties which I can see in yaml file: > > # of compaction, including validation compaction. > compaction_throughput_mb_per_sec: 16 > compaction_large_partition_warning_threshol

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Yasir Saleem
Hi Alex, no active compaction, right now. On Fri, Mar 9, 2018 at 3:47 PM, Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On Fri, Mar 9, 2018 at 11:40 AM, Yasir Saleem > wrote: > >> Thanks, Nicolas Guyomar >> >> I am new to cassandra, here is the properties which I can see in yaml

Re: Amazon Time Sync Service + ntpd vs chrony

2018-03-09 Thread Kyrylo Lebediev
Thank you to all who replied so far, thank you Ben for the links you provided! From: Ben Slater Sent: Friday, March 9, 2018 12:12:09 AM To: user@cassandra.apache.org Subject: Re: Amazon Time Sync Service + ntpd vs chrony It is important to make sure you are usin

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Kyrylo Lebediev
Not sure where I heard this, but AFAIK data imbalance when multiple data_directories are in use is a known issue for older versions of Cassandra. This might be the root-cause of your issue. Which version of C* are you using? Unfortunately, don't remember in which version this imbalance issue wa

Re: Adding disk to operating C*

2018-03-09 Thread Kyrylo Lebediev
Niclas, Here is Jeff's comment regarding this: https://stackoverflow.com/a/31690279 From: Niclas Hedhman Sent: Friday, March 9, 2018 9:09:53 AM To: user@cassandra.apache.org; Rahul Singh Subject: Re: Adding disk to operating C* I am curious about the side commen

RE: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Kenneth Brotman
Yasir, How many nodes are in the cluster? What is num_tokens set to in the cassandra.yaml file? Is it just this one node doing this? What replication factor do you use that affects the ranges on that disk? Kenneth Brotman From: Kyrylo Lebediev [mailto:kyrylo_lebed...@epam.com]

Re: Adding disk to operating C*

2018-03-09 Thread Rahul Singh
Yep. Most of my arguments are the same from seeing it in production. Cassandra is used for fast writes and generally fast reads with redundancy and failover for OLTP and OLAP. It’s not just a bunch of dumb disks. You can throw crap into S3 or HDFS and analyze / report with Hive or

Re: Adding disk to operating C*

2018-03-09 Thread Jeff Jirsa
1.5 TB sounds very very conservative - 3-4T is where I set the limit at past jobs. Have heard of people doing twice that (6-8T). -- Jeff Jirsa > On Mar 8, 2018, at 11:09 PM, Niclas Hedhman wrote: > > I am curious about the side comment; "Depending on your usecase you may not > want to have

Cassandra storage: Some thoughts

2018-03-09 Thread Vangelis Koukis
Hello all, My name is Vangelis Koukis and I am a Founder and the CTO of Arrikto. I'm writing to share our thoughts on how people run distributed, stateful applications such as Cassandra on modern infrastructure, and would love to get the community's feedback and comments. The fundamental questio

Re: Adding disk to operating C*

2018-03-09 Thread Jon Haddad
I agree with Jeff - I usually advise teams to cap their density around 3TB, especially with TWCS. Read heavy workloads tend to use smaller datasets and ring size ends up being a function of performance tuning. Since 2.2 bootstrap can now be resumed, which helps quite a bit with the streami

TWCS enabling tombstone compaction

2018-03-09 Thread Lucas Benevides
Dear community, I have been using TWCS in my lab, with TTL'd data. In the debug log there is always the sentence: "TimeWindowCompactionStrategy.java:65 Disabling tombstone compactions for TWCS". Indeed, the line is always repeated. What does it actually mean? If my data gets expired, the TWCS is

Consistency level for the COPY command

2018-03-09 Thread Jai Bheemsen Rao Dhanwada
Hello, What consistency level is used when performing the COPY command through the CQL interface? I don't see anything in the documentation: https://docs.datastax.com/en/cql/3.1/cql/cql_reference/copy_r.html I am setting CONSISTENCY LEVEL at the cql level and then running a copy command, does that honor the

Re: Cassandra storage: Some thoughts

2018-03-09 Thread Rahul Singh
Interesting. Can this be used in conjunction with bare metal? As in, does it present containers in place of the “real” node until the node is up and running? -- Rahul Singh rahul.si...@anant.us Anant Corporation On Mar 9, 2018, 10:56 AM -0500, Vangelis Koukis , wrote: > Hello all, > > My name i

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Madhu B
Yasir, I think you need to run full repair in off-peak hours Thanks, Madhu > On Mar 9, 2018, at 7:20 AM, Kenneth Brotman > wrote: > > Yasir, > > How many nodes are in the cluster? > What is num_tokens set to in the Cassandra.yaml file? > Is it just this one node doing this? > What replic

Re: Cassandra storage: Some thoughts

2018-03-09 Thread Oleksandr Shulgin
On 9 Mar 2018 16:56, "Vangelis Koukis" wrote: Hello all, My name is Vangelis Koukis and I am a Founder and the CTO of Arrikto. I'm writing to share our thoughts on how people run distributed, stateful applications such as Cassandra on modern infrastructure, and would love to get the community's

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread James Shaw
We have a similar issue and I am working to solve it this weekend. In our case, STCS makes one huge table's sstable file bigger and bigger after each compaction (this is the nature of STCS compaction, nothing wrong). Even though almost all of the data has a 30-day TTL, the tombstones are not evicted since largest file is wai

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread James Shaw
Per my testing, repair does not help. Repair builds a Merkle tree to compare data and only writes a new file when there is a difference, so it produces a very, very small file at the end (which means, of course, that most data is already in sync). On Fri, Mar 9, 2018 at 10:31 PM, Madhu B wrote: > Yasir, > I think you need to run full repair in

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Madhu B
Yes it will helps,thanks James for correcting me > On Mar 9, 2018, at 9:52 PM, James Shaw wrote: > > per my testing, repair not help. > repair build Merkle tree to compare data, it only write to a new file while > have difference, very very small file at the end (of course, means most data >

Re: uneven data movement in one of the disk in Cassandra

2018-03-09 Thread Jeff Jirsa
The version here really matters. If it’s higher than 3.2, it’s probably related to this issue which places sstables for a given range in the same directory to avoid data loss on single drive failure: https://issues.apache.org/jira/browse/CASSANDRA-6696 -- Jeff Jirsa > On Mar 9, 2018, at 9:

Re: data types storage saving

2018-03-09 Thread onmstester onmstester
I've found out that blobs give no gain in storage savings! I had some 16-digit numbers which were previously saved as bigint, but after saving them as blob, the storage usage per record is still the same. Sent using Zoho Mail On Tue, 06 Mar 2018 19:18:31 +0330 Carl Mueller
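
The observation above is consistent with simple byte arithmetic. The sketch below compares the fixed 8-byte bigint encoding with the smallest byte string that can hold a 16-digit number; the helper names are hypothetical, and it assumes the blob stores the raw unsigned big-endian bytes of the number. It says nothing about per-cell overhead in the on-disk format; a variable-length value such as a blob would plausibly also need a length prefix, which is an assumption here rather than something stated in the thread.

```python
import struct


def bigint_size(n: int) -> int:
    # A bigint is a 64-bit signed integer, so it always packs to 8 bytes.
    return len(struct.pack(">q", n))


def minimal_blob_size(n: int) -> int:
    # Fewest whole bytes that hold a non-negative n as unsigned big-endian.
    return max(1, (n.bit_length() + 7) // 8)


smallest = 10**15        # smallest 16-digit decimal number
largest = 10**16 - 1     # largest 16-digit decimal number
print(bigint_size(largest), minimal_blob_size(smallest), minimal_blob_size(largest))
# prints: 8 7 7
```

Every 16-digit number therefore needs 7 raw bytes against 8 for a bigint, so at best one byte per value is at stake before any length prefix or other per-cell overhead is counted.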