Hi,
There is no DevCenter 2.x; the latest is 1.6. It would help if you provided the
jar names and the exceptions you encounter. Make sure you’re not mixing Guava
versions from other dependencies. DevCenter uses the DataStax driver to connect
to Cassandra; double-check the versions of the jars you need here:
htt
On Sun, Mar 11, 2018 at 10:31 PM, Kunal Gangakhedkar <
kgangakhed...@gmail.com> wrote:
> Hi all,
>
> We currently have a cluster in GCE for one of the customers.
> They want it to be migrated to AWS.
>
> I have set up one node in AWS to join the cluster by following:
> https://docs.datastax.co
How did you distribute your seed nodes across the whole cluster?
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Mar 12, 2018, 5:12 AM -0400, Oleksandr Shulgin
, wrote:
> > On Sun, Mar 11, 2018 at 10:31 PM, Kunal Gangakhedkar
> > wrote:
> > > Hi all,
> > >
> > > We currently have a clust
Running two instances of Apache Cassandra on the same server, each with its own
commit log disk, did not help. The combined CPU/RAM usage of both instances is
less than half of the available resources, disk usage is less than 20%, and
network is still less than 300 Mb in Rx.
Sent using Zoho Mail
If this is a universal recommendation, then shouldn’t it actually be the
default in Cassandra?
Hannu
> On 18 Jan 2018, at 00:49, Jon Haddad wrote:
>
> I *strongly* recommend disabling dynamic snitch. I’ve seen it make latency
> jump 10x.
>
> dynamic_snitch: false is your friend.
>
>
>
>> O
What’s your disk latency? What kind of disk is it?
--
Jacques-Henri Berthemet
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 10:48 AM
To: user
Subject: Re: yet another benchmark bottleneck
Running two instance of Apache Cassandra on same server, each havin
Anyone?
> On 4 Mar 2018, at 20:45, Hannu Kröger wrote:
>
> Hello,
>
> I am trying to verify and understand fully the functionality of row cache in
> Cassandra.
>
> I have been using mainly two different sources for information:
> https://github.com/apache/cassandra/blob/0db88242c66d3a7193a9ad
1.2 TB, 15K.
The latency reported by the stress tool is 7.6 ms; disk latency is 2.6 ms.
On Mon, 12 Mar 2018 14:02:29 +0330 Jacques-Henri Berthemet
wrote
What’s your disk latency? What kind of disk is it?
--
Jacques-Henri B
Any errors/warnings in the Cassandra logs? What’s your RF?
Using 300 MB/s of network bandwidth for only 130 op/s looks very high.
--
Jacques-Henri Berthemet
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 11:38 AM
To: user
Subject: RE: yet another benchmark bottle
What’s the goal? How big are your partitions , size in MB and in rows?
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Mar 12, 2018, 6:37 AM -0400, Hannu Kröger , wrote:
> Anyone?
>
> > On 4 Mar 2018, at 20:45, Hannu Kröger wrote:
> >
> > Hello,
> >
> > I am trying to verify and unders
RF=1
No errors or warnings.
Actually it's 300 Mbit/s and 130K ops/s; I missed a 'K' in the first mail. But
anyway, the point is: more than half of the node's resources (CPU, memory,
disk, network) are unused, and I can't increase write throughput.
On Mon, 12 Mar 201
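A quick back-of-the-envelope check (assuming the corrected figures of 300 Mbit/s and 130K ops/s stated above) shows the two numbers are at least mutually plausible:

```python
# Rough per-operation wire cost implied by the reported numbers.
# Assumes 300 Mbit/s Rx and 130K write ops/s, as stated in the thread.
network_bits_per_sec = 300e6   # 300 Mbit/s
ops_per_sec = 130e3            # 130K ops/s

bytes_per_op = network_bits_per_sec / ops_per_sec / 8
print(round(bytes_per_op))  # -> 288 bytes per operation, plausible for small writes
```

So the network figure is consistent with the op rate and is unlikely to be the hard limit by itself.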
Hi,
My goal is to make sure that I understand the functionality correctly and that
the documentation is accurate.
The question, in other words: is the documentation or the comment in the code
wrong (or inaccurate)?
Hannu
> On 12 Mar 2018, at 13:00, Rahul Singh wrote:
>
> What’s the goal? How bi
It makes more sense now, 130K is not that bad.
According to cassandra.yaml you should be able to increase your number of write
threads in Cassandra:
# On the other hand, since writes are almost never IO bound, the ideal
# number of "concurrent_writes" is dependent on the number of cores in
# your
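For reference, a sketch of the relevant cassandra.yaml setting; the 8-per-core figure is the rule of thumb from the yaml's own comments, and the exact value here is an assumption to be tuned:

```yaml
# cassandra.yaml (sketch): with 20 cores, the "8 * number_of_cores"
# rule of thumb would suggest something in this range.
concurrent_writes: 160
```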
No luck even with 320 write threads.
On Mon, 12 Mar 2018 14:44:15 +0330 Jacques-Henri Berthemet
wrote
It makes more sense now, 130K is not that bad.
According to cassandra.yaml you should be able to increase yo
Hi,
I understand that a well-designed Cassandra system will allow querying ANY
data within it at incredible speed, as well as ingesting data at a very
fast pace.
However, this data is going to grow until it is archived. As I see it, data
has two stages: HOT DATA, when data is accessible to be que
I may be wrong, but what I’ve read and used in the past assumes that the
“first” N rows are cached and the clustering key design is how I change what N
rows are put into memory. Looking at the code, it seems that’s the case.
The language of the comment basically says that it holds in cache what
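That behavior maps to the `rows_per_partition` caching option; as an illustration (keyspace, table, and column names here are hypothetical), the first N rows of each partition, in clustering order, are what gets cached:

```sql
-- Cache all keys plus the first 10 rows of each partition (clustering order).
CREATE TABLE ks.events (
    id uuid,
    ts timestamp,
    payload text,
    PRIMARY KEY (id, ts)
) WITH CLUSTERING ORDER BY (ts DESC)
  AND caching = {'keys': 'ALL', 'rows_per_partition': '10'};
```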
HDFS / S3 is a great place to dump this data. You can also consider other types
of compaction strategies for “COLD DATA” in less powerful C* clusters whose
purpose is write-only. C* is still better, in my opinion, for data management
than S3/HDFS. It depends on how easy you want the ret
> On 12 Mar 2018, at 14:45, Rahul Singh wrote:
>
> I may be wrong, but what I’ve read and used in the past assumes that the
> “first” N rows are cached and the clustering key design is how I change what
> N rows are put into memory. Looking at the code, it seems that’s the case.
So we agree
What happens if you increase number of client threads?
Can you add another instance of cassandra-stress on another host?
--
Jacques-Henri Berthemet
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 12:50 PM
To: user
Subject: RE: yet another benchmark bottlenec
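For example, a second stress instance on another host could be pointed at the cluster like this (the node address and counts are placeholders, not from the thread):

```shell
# Run from a separate host; 10.0.0.1 is a placeholder contact point.
cassandra-stress write n=1000000 cl=ONE -rate threads=300 -node 10.0.0.1
```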
I mentioned that I already tested increasing client threads, many stress-client
instances on one node, and two stress clients on two separate nodes; in all of
them the sum of throughputs is less than 130K. I've been tuning all aspects of
the OS and Cassandra (whatever I've seen in the config files!) for two
If throughput decreases as you add more load then it’s probably due to disk
latency; can you test SSDs? Are you using VMware ESXi?
--
Jacques-Henri Berthemet
From: onmstester onmstester [mailto:onmstes...@zoho.com]
Sent: Monday, March 12, 2018 2:15 PM
To: user
Subject: RE: yet another benchmark
I'm unclear on which versions are most popular right now. What version are you
running?
What versions should still be supported in the documentation? For example,
I'm turning my attention back to writing a section on adding a data center.
What versions should I support in that information?
I'm
In my opinion, good documentation should somehow include version-specific
pieces of information, whether it is a nodetool command that arrived in a
certain version, a parameter for something, or something else.
That would be very useful. It’s confusing if I see documentation talking about
4.0 specifics
Even with a low amount of updates, it's possible that you hit a contention
bug. A simple test would be to run multiple Cassandra nodes on the same
physical node (like splitting your 20 cores across 5 instances of Cassandra).
If you get much higher throughput, then you have an answer.
I don't think a sin
If we use DataStax’s example, we would have instructions for v3.0 and v2.1.
How’s that?
There should also be instructions for cloud platforms like AWS, but how do you
do that and stay vendor-neutral?
Kenneth Brotman
From: Hannu Kröger [mailto:hkro...@gmail.com]
Sent: Monday, Ma
The docs are in tree, meaning they are versioned, and should be written for
the version they correspond to. Trunk docs should reflect the current state
of trunk, and shouldn’t have caveats for other versions.
On Mon, Mar 12, 2018 at 8:15 AM Kenneth Brotman
wrote:
> If we use DataStax’s example, w
I see how that makes sense, Jon, but how does a user then select the
documentation for the version they are running on the Apache Cassandra web site?
Kenneth Brotman
From: Jonathan Haddad [mailto:j...@jonhaddad.com]
Sent: Monday, March 12, 2018 8:40 AM
To: user@cassandra.apache.org
Subject:
Right now they can’t.
On Mon, Mar 12, 2018 at 9:03 AM Kenneth Brotman
wrote:
> I see how that makes sense Jon but how does a user then select the
> documentation for the version they are running on the Apache Cassandra web
> site?
>
>
>
> Kenneth Brotman
>
>
>
> *From:* Jonathan Haddad [mailto:j.
It seems like the documentation in trunk for version 3.0 should include
information for users of versions 3.0 and 2.1; the documentation in 4.0 (when
it's released) should include information for users of 4.0 and at least one
previous version, etc.
How about i
Hello everyone,
Do you know if there is a Cassandra tool that performs anomaly detection?
Thank you in advance
Salvatore
Anomaly detection of what? The data inside Cassandra, or Cassandra metrics?
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Mar 12, 2018, 12:44 PM -0400, D. Salvatore , wrote:
> Hello everyone,
> Do you know if there is a Cassandra tool that performs anomaly detection?
>
> Thank you in advan
Hi Rahul,
I was mainly thinking about performance anomaly detection but I am also
interested in other types such as fault detection, data or queries
anomalies.
Thanks
2018-03-12 16:52 GMT+00:00 Rahul Singh :
> Anomaly detection of what? The data inside Cassandra, or Cassandra metrics?
>
> --
> Rah
Docs for 3.0 go in the 3.0 branch.
I’ve never heard of anyone shipping docs for multiple versions; I don’t know
why we’d do that. You can get the docs for any version you need by downloading
C*; the docs are included. I’m a firm -1 on changing that process.
Jon
> On Mar 12, 2018, at 9:19 AM,
You cannot migrate and upgrade at the same time across major versions.
Streaming is (usually) not compatible between versions.
As to the migration question, I would expect that you may need to put the
external-facing IP addresses in several places in the cassandra.yaml file. And,
yes, it would
Hello,
We have a project currently using MySQL single-node with 5-6TB of data
and some performance issues, and we plan to add data up to a total size of
maybe 25-30TB.
We are thinking of migrating to Cassandra. I have been trying to find
benchmarks or other guidelines to compare MySQL an
Hi,
On Mon, Mar 12, 2018 at 8:58 PM Oliver Ruebenacker wrote:
> We have a project currently using MySQL single-node with 5-6TB of data and
> some performance issues, and we plan to add data up to a total size of
> maybe 25-30TB.
>
There is no 'silver bullet'; Cassandra is not a 'drop-in' re
Hi Oliver,
A few years back I had a similar problem, where there was a lot of data in
MySQL and it was starting to choke. I migrated the data to Cassandra, ran
benchmarks, and blew MySQL out of the water with a small 3-node C* cluster.
If you have a use case for Cassandra the answer is yes, but keep in mi
Dear Lucas,
Those properties that result in the log message you are seeing are
properties common to all compaction strategies. See
http://cassandra.apache.org/doc/latest/operating/compaction.html#common-options. They are
*tombstone_compaction_interval
*and *tombstone_threshold*. If you didn't def
You can’t migrate and upgrade at the same time, perhaps, but you could do one
and then the other so as to end up on the new version. I’m guessing it’s an
error in the yaml file or a port not open. Is there any good reason for a
production cluster to still be on version 2.1.x?
Kenneth Brotman
Again, I'd really like to get a feel for Scylla vs Rocksandra vs Cassandra.
Isn't the driver binary protocol the easiest / least-redesign level of
storage engine swapping? Scylla, Cassandra, and Rocksandra are currently
three options. Rocksandra can expand its non-Java footprint without
rea
On 13 March 2018 at 00:06, Durity, Sean R
wrote:
> You cannot migrate and upgrade at the same time across major versions.
> Streaming is (usually) not compatible between versions.
>
I'm not trying to upgrade as of now - first priority is the migration.
We can look at version upgrade later on.
Kunal,
Please provide the following settings from the yaml files you are using:
seeds:
listen_address:
broadcast_address:
rpc_address:
endpoint_snitch:
auto_bootstrap:
Kenneth Brotman
From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Monday, March 12, 2018
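For context, these are the settings in question; a hypothetical sketch of what they might look like on a node that must be reachable across data centers (all addresses below are made up for illustration):

```yaml
# cassandra.yaml (sketch, hypothetical addresses)
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "203.0.113.10,198.51.100.20"   # seeds from both DCs
listen_address: 10.0.0.5            # private interface
broadcast_address: 203.0.113.10     # public IP for cross-DC gossip
rpc_address: 0.0.0.0
broadcast_rpc_address: 203.0.113.10
endpoint_snitch: GossipingPropertyFileSnitch
auto_bootstrap: true
```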
On 13 March 2018 at 03:28, Kenneth Brotman
wrote:
> You can’t migrate and upgrade at the same time perhaps but you could do
> one and then the other so as to end up on new version. I’m guessing it’s
> an error in the yaml file or a port not open. Is there any good reason for
> a production clus
I didn’t understand something. Are you saying you are using one data center on
Google and one on Amazon?
Kenneth Brotman
From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Monday, March 12, 2018 4:24 PM
To: user@cassandra.apache.org
Cc: Nikhil Soman
Subject: Re: [EXTERNAL] R
Quick question: If you have one cluster made of nodes of a datacenter in
AWS and a datacenter in Google, what snitch do you use?
Kenneth Brotman
Kenneth,
For AWS: Ec2Snitch (if the DC is in a single region)
For Google: better to go with GossipingPropertyFileSnitch
Thanks,
Madhu
On Mon, Mar 12, 2018 at 6:31 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:
> Quick question: If you have one cluster made of nodes of a datacenter in
> AWS and
On 13 March 2018 at 04:54, Kenneth Brotman
wrote:
> Kunal,
>
>
>
> Please provide the following setting from the yaml files you are using:
>
>
>
> seeds:
>
In GCE: seeds: "10.142.14.27"
In AWS (new node being added): seeds:
"35.196.96.247,35.227.127.245,35.196.241.232" (these are the public IP
Yes, that's correct. The customer wants us to migrate the Cassandra setup
into their AWS account.
Thanks,
Kunal
On 13 March 2018 at 04:56, Kenneth Brotman
wrote:
> I didn’t understand something. Are you saying you are using one data
> center on Google and one on Amazon?
>
>
>
> Kenneth Brotman
>
I would just go with GossipingPropertyFileSnitch (GPFS); it will work across
both data centers (I once had a test cluster with 1 DC in Azure, 1 DC in AWS,
and 1 DC in GCP using GPFS). Even if it's solely AWS, I think GPFS is
superior because you can configure virtual racks if you ever need them, while
EC
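With GPFS, each node declares its own DC and rack in cassandra-rackdc.properties; a minimal sketch (the names here are illustrative, not prescribed):

```properties
# cassandra-rackdc.properties on an AWS node (illustrative names)
dc=aws-us-east
rack=rack1
```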
GPFS
--
Jeff Jirsa
> On Mar 12, 2018, at 4:31 PM, Kenneth Brotman
> wrote:
>
> Quick question: If you have one cluster made of nodes of a datacenter in AWS
> and a datacenter in Google, what snitch do you use?
>
> Kenneth Brotman
On Mon, Mar 12, 2018 at 3:58 PM, Carl Mueller
wrote:
> Rocksandra can expand out it's non-java footprint without rearchitecting
> the java codebase. Or are there serious concerns with Datastax and the
> binary protocols?
>
>
Rocksandra should eventually become part of Cassandra. The pluggable
s
Kunal,
Is this the GCE cluster you are speaking of in the “Adding new DC?” thread?
Kenneth Brotman
From: Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
Sent: Sunday, March 11, 2018 2:18 PM
To: user@cassandra.apache.org
Subject: Re: system.size_estimates - safe to remove sstables?
Can you update changes to cassandra.yaml in version 2.1.x without restarting
the node?
Kenneth Brotman
To my knowledge, for any version, updates to cassandra.yaml will only be
applied after you restart the node.
On 13 March 2018 at 12:24, Kenneth Brotman
wrote:
> Can you update changes to cassandra.yaml in version 2.1.x without restarting
> the node?
>
>
>
> Kenneth Brotman
>
Is there a command, perhaps a nodetool command, to view the actual yaml
settings a node is using, so you can confirm it is using the changes you made
to the yaml file?
Kenneth Brotman
There’s a bit of nuance in that there are some undocumented situations in some
versions where we may reload seeds from the yaml without notice, notably when
instances come online and we decide whether or not to gossip with them.
That’s not really intended, and it is fixed in recent versions
--
Jeff J
CASSANDRA-7622 went Patch Available today
--
Jeff Jirsa
> On Mar 12, 2018, at 6:40 PM, Kenneth Brotman
> wrote:
>
> Is there a command, perhaps a nodetool command to view the actual yaml
> settings a node is using so you can confirm it is using the changes to a yaml
> file you made?
>
>
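CASSANDRA-7622 ultimately shipped in Cassandra 4.0 as virtual tables, so on 4.0+ the effective settings can be read over CQL; a sketch:

```sql
-- Cassandra 4.0+: read the live configuration from a virtual table.
SELECT name, value FROM system_views.settings WHERE name = 'concurrent_writes';
```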
You say the nicest things!
From: Jeff Jirsa [mailto:jji...@gmail.com]
Sent: Monday, March 12, 2018 6:43 PM
To: user@cassandra.apache.org
Subject: Re: command to view yaml file setting in use on console
CASSANDRA-7622 went Patch Available today
--
Jeff Jirsa
On Mar 12, 2018, at 6:4
Hello Salvatore,
On Mon, Mar 12, 2018 at 2:12 PM, D. Salvatore
wrote:
> Hi Rahul,
> I was mainly thinking about performance anomaly detection but I am also
> interested in other types such as fault detection, data or queries
> anomalies.
>
I know VividCortex (http://vividcortex.com) supports Ca
Kunal,
Sorry for asking you things you already answered. You provided a lot of good
information, and you know what you’re doing. It’s going to be something
really simple to figure out. While I read through the thread more closely, I’m
guessing we are right on top of it, so could I ask y
Kunal,
While we are looking into all this, I feel compelled to ask you to check your
security configuration now that you are using public addresses for inter-node
communication across data centers. Are you sure you are using best practices?
Kenneth Brotman
From: Kenneth Brotman [mailt
Hi Kenneth,
In addition to CASSANDRA-7622, it may help to inspect the Cassandra
*system.log* and look for the following entry:
INFO [main] ... - Node configuration:[...]
The content of "Node configuration" will have the settings the node is
using.
Regards,
Anthony
On Tue, 13 Mar 2018 at 12:
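On an existing install, that line can be pulled out with grep (the log path below is the common package-install default and may differ on your systems):

```shell
# Show the "Node configuration" line Cassandra logged at startup.
grep -m1 'Node configuration' /var/log/cassandra/system.log
```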
No, this is a different cluster.
Kunal
On 13-Mar-2018 6:27 AM, "Kenneth Brotman"
wrote:
Kunal,
Is this the GCE cluster you are speaking of in the “Adding new DC?” thread?
Kenneth Brotman
*From:* Kunal Gangakhedkar [mailto:kgangakhed...@gmail.com]
*Sent:* Sunday, March 11, 2018 2:18 PM
Cassandra is going to die in the near future (from what I see). Cassandra
is not solving the purpose; rather, people are facing issues at times,
especially in virtual environments.
We have tried a crdb (CockroachDB) cluster and migrated a few of our clusters
over to the CockroachDB environment; it seems to be workin
Kunal,
Also to check:
You should use the same list of seeds in all the yaml files: probably two from
each data center if you will have five nodes in each. All the seed node
addresses from all the data centers should be listed in each yaml file where it
says “- seeds:”. I’m not sure from your prev
I already ran two instances of Cassandra on one node; the sum of throughput is
less than 130K ops. Currently I'm suspecting network packets per second, which
it seems can't get higher than 10K pps; that actually would be the iperf limit
for packets of the same size. I'm looking for how to tune