On Wed, Feb 21, 2018 at 7:54 PM, Durity, Sean R wrote:
>
>
> However, I think the shots at Cassandra are generally unfair. When I
> started working with it, the DataStax documentation was some of the best
> documentation I had seen on any project, especially an open source one.
>
Oh, don't get m
>
> Also, I was wondering if the key cache maintains a count of how many local
> accesses a key undergoes. Such information might be very useful for
> compactions of sstables by splitting data by frequency of use so that those
> can be preferentially compacted.
No we don't currently have metrics f
Jeff,
I already addressed everything you said. Boy, would I like to bring up the
out-of-date articles on the web that trip people up and the lousy documentation
on the Apache website, but I can’t because a lot of folks don’t know me or why
I’m saying these things.
I will be making a
>
> Instead of saying "Make X better" you can quantify "Here's how we can make
> X better" in a jira and the conversation will continue with interested
> parties (opening jiras is free!). Being combative and insulting the project
> on the mailing list may help vent some frustrations but it is counter prod
Instead of saying "Make X better" you can quantify "Here's how we can make X
better" in a jira and the conversation will continue with interested parties
(opening jiras is free!). Being combative and insulting the project on the
mailing list may help vent some frustrations, but it is counterproductive
Hi all,
I'd like to deescalate a bit here.
Since this is an Apache and an OSS project, contributions come in many
forms: code, speaking/advocacy, documentation, support, project management,
and so on. None of these things come for free.
Ken, I appreciate you bringing up these usability topics; they
Also, I was wondering if the key cache maintains a count of how many local
accesses a key undergoes. Such information might be very useful for
compactions of sstables by splitting data by frequency of use so that those
can be preferentially compacted.
On Wed, Feb 21, 2018 at 5:08 PM, Carl Mueller
On Wed, Feb 21, 2018 at 2:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:
> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting the big companies could justify taking it on easy enough and
> you know actually pay the people who wo
The only progress from this point is what Jon said: enumerate and detail
your issues in jira tickets.
On Wed, Feb 21, 2018 at 4:53 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:
> Hi Akash,
>
> I get the part about outside work which is why in replying to Jeff Jirsa I
> was suggesting
Looking through the 2.1.X code I see this:
org.apache.cassandra.io.sstable.Component.java
In the enum for component types there is a CUSTOM enum value which seems to
indicate a catchall for providing metadata for sstables.
Has this been exploited... ever? I noticed in some of the patches for the
Hi Akash,
I get the part about outside work which is why in replying to Jeff Jirsa I was
suggesting the big companies could justify taking it on easy enough and you
know actually pay the people who would be working at it so those people could
have a life.
The part I don't get is the aversion t
I would second Jon in the arguments he made. Contributing outside work is
draining and really requires a lot of commitment. If someone requires
features around usability etc, just pay for it, period.
On Wed, Feb 21, 2018 at 2:20 PM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:
> Jon,
>
Jon,
Very sorry that you don't see the value of the time I'm taking for this. I
don't have demands; I do have a stern warning and I'm right Jon. Please be
very careful not to mischaracterize my words, Jon.
You suggest I put things in JIRAs, then seem to suggest that I'd be lucky if
anyone l
Slight nuance: we don't load the whole row into memory, but the column
index (and the result set, and the tombstones in the partition), which can
still spike your GC/heap (and potentially overflow the row cache, if you
have it on, which is atypical).
On Wed, Feb 21, 2018 at 1:35 PM, Carl Mueller
Cass 2.1.14 is missing some wide row optimizations done in later cass
releases IIRC.
Speculation: IN won't matter; it will load the entire wide row into memory
regardless, which might spike your GC/heap and overflow the row cache.
On Wed, Feb 21, 2018 at 2:16 PM, Gareth Collins
wrote:
> Thanks for
Hm, nodetool decommission performs the streamout of the replicated data, and
you said that was apparently without error...
But if you dropped three nodes in one AZ/rack on a five node with RF3, then
we have a missing RF factor unless NetworkTopologyStrategy fails over to
another AZ. But that would a
sorry for the idiot questions...
data was allowed to fully rebalance/repair/drain before the next node was
taken off?
did you take 1 off per rack/AZ?
On Wed, Feb 21, 2018 at 12:29 PM, Fred Habash wrote:
> One node at a time
>
> On Feb 21, 2018 10:23 AM, "Carl Mueller"
> wrote:
>
>> What is y
Ken,
Maybe it’s not clear how open source projects work, so let me try to explain.
There’s a bunch of us who either get paid by someone or volunteer in our free
time. The folks that get paid (yay!) usually take direction on what the
priorities are, and work on projects that directly affect o
It is instructive to listen to the concerns of new and existing users in order
to improve a product like Cassandra, but I think the schoolyard taunt model
isn’t the most effective.
In my experience with open and closed source databases, there are always things
that could be improved. Many have
Thanks for the response!
I could understand that being the case if the Cassandra cluster is not
loaded. Splitting the work across multiple nodes would obviously make
the query faster.
But if this was just a single node, shouldn't one IN query be faster
than multiple due to the fact that, if I und
So before buying any marketing claims from Microsoft or whoever, maybe
you should try using it extensively?
And talking about backup, have a look at DynamoDB:
http://i68.tinypic.com/n1b6yr.jpg
From my POV, if a multi-billion-dollar company like Amazon doesn't get it right
or can't make it easy for e
Josh,
To say nothing is indifference. If you care about your community, sometimes
don't you have to bring up a subject even though you know it's also temporarily
adding some discomfort?
As to opening a JIRA, I've got a very specific topic to try in mind now. An
easy one I'll work on and
RF of 3 with three racks/AZs in a single region.
On Feb 21, 2018 10:23 AM, "Carl Mueller"
wrote:
> What is your replication factor?
> Single datacenter, three availability zones, is that right?
> You removed one node at a time or three at once?
>
> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash wr
One node at a time
On Feb 21, 2018 10:23 AM, "Carl Mueller"
wrote:
> What is your replication factor?
> Single datacenter, three availability zones, is that right?
> You removed one node at a time or three at once?
>
> On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash wrote:
>
>> We have had a 15 nod
On 02/21/2018 11:56 AM, Zachary Marois wrote:
> Starting in the last two weeks (I successfully installed cassandra
> sometime in the last two weeks), I'm guessing on 2/19 when version
> 3.11.2 was released, the cassandra apt package version 3.11.1 became
> unstable. It doesn't seem to be published
Starting in the last two weeks (I successfully installed cassandra sometime in
the last two weeks), I'm guessing on 2/19 when version 3.11.2 was released, the
cassandra apt package version 3.11.1 became unstable. It doesn't seem to be
published in the http://www.apache.org/dist/cassandra/debian
Hello Apache Supporters and Enthusiasts
This is your FINAL reminder that the Call for Papers (CFP) for the
Apache EU Roadshow is closing soon. Our Apache EU Roadshow will focus on
Cloud, IoT, Apache Tomcat, Apache Http and will run from 13-14 June 2018
in Berlin.
Note that the CFP deadline has
I don't disagree with jon.
On Wed, Feb 21, 2018 at 10:27 AM, Jonathan Haddad wrote:
> The easiest way to do this is replacing one node at a time by using
> rsync. I don't know why it has to be more complicated than copying data to
> a new machine and replacing it in the cluster. Bringing up a
There's a disheartening amount of "here's where Cassandra is bad, and
here's what it needs to do for me for free" happening in this thread.
This is open-source software. Everyone is *strongly encouraged* to submit a
patch to move the needle on *any* of these things being complained about in
this t
The easiest way to do this is replacing one node at a time by using rsync.
I don't know why it has to be more complicated than copying data to a new
machine and replacing it in the cluster. Bringing up a new DC with
snapshots is going to be a nightmare in comparison.
On Wed, Feb 21, 2018 at 8:16
nodetool cfhistograms, nodetool compactionstats would be helpful
Compaction is probably behind from streaming, and reads are touching many
sstables.
--
Jeff Jirsa
> On Feb 21, 2018, at 8:20 AM, Fd Habash wrote:
>
> We have had a 15 node cluster across three zones and cluster repairs using
What is your replication factor?
Single datacenter, three availability zones, is that right?
You removed one node at a time or three at once?
On Wed, Feb 21, 2018 at 10:20 AM, Fd Habash wrote:
> We have had a 15 node cluster across three zones and cluster repairs using
> ‘nodetool repair -pr’ to
We have had a 15-node cluster across three zones, and cluster repairs using
‘nodetool repair -pr’ took about 3 hours to finish. Lately, we shrunk the
cluster to 12 nodes. Since then, the same repair job has taken up to 12 hours
to finish and, most times, it never does.
More importantly, at some point duri
DCs can be stood up with snapshotted data.
Stand up a new cluster with your old cluster snapshots:
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_snapshot_restore_new_cluster.html
Then link the DCs together.
Disclaimer: I've never done this in real life.
On Wed, Feb 21, 2
jon: I am planning on writing a custom compaction strategy. That's why the
question is here, I figured the specifics of memtable -> sstable and
cassandra internals are not a user question. If that still isn't deep
enough for the dev thread, I will move all those questions to user.
On Wed, Feb 21,
Thank you all!
On Tue, Feb 20, 2018 at 7:35 PM, kurt greaves wrote:
> Probably a lot of work but it would be incredibly useful for vnodes if
> flushing was range aware (to be used with RangeAwareCompactionStrategy).
> The writers are already range aware for JBOD, but that's not terribly
> valuab
New dc will be faster but may impact cluster performance due to streaming.
Sent from my iPhone
> On Feb 21, 2018, at 8:53 AM, Leena Ghatpande wrote:
>
> We do use LOCAL_ONE and LOCAL_QUORUM currently. But these 8 nodes need to be
> in 2 different DCs, so we would end up creating an additional 2 new
We do use LOCAL_ONE and LOCAL_QUORUM currently. But these 8 nodes need to be in
2 different DCs, so we would end up creating an additional 2 new DCs and
dropping 2.
Are there any advantages to adding a DC over one node at a time?
From: Jeff Jirsa
Sent: Wednesday, Februa
Jeff,
Check the service configuration to see what path it’s using for the JRE
execution and if it’s specifying any class path parameters. The system user may
not have the environment variables available, whereas your user may have them.
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Fe
That depends on the driver you use, but issuing separate queries asynchronously
across the cluster would be faster.
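
A minimal sketch of that fan-out idea, under stated assumptions: `fetch_one` and
`FAKE_TABLE` are hypothetical stand-ins for a single-partition query and its
data, not part of any driver. A real client would call its driver's own
asynchronous API at that point. The sketch only shows the shape of "one query
per key, run concurrently" as opposed to a single large IN().

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in data; a real application would query the cluster.
FAKE_TABLE = {"a": 1, "b": 2, "c": 3}


def fetch_one(key):
    """Hypothetical single-partition lookup, e.g. SELECT ... WHERE key = ?."""
    return key, FAKE_TABLE.get(key)


def fetch_many(keys, max_workers=8):
    """Run one lookup per key concurrently and collect the results by key."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order and yields (key, value) pairs.
        return dict(pool.map(fetch_one, keys))


if __name__ == "__main__":
    print(fetch_many(["a", "b", "c", "missing"]))
```

With per-key queries, each request can be routed to a replica that owns that
key, instead of one coordinator gathering every partition named in the IN().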
--
Rahul Singh
rahul.si...@anant.us
Anant Corporation
On Feb 20, 2018, 6:48 PM -0500, Eric Stevens , wrote:
> Someone can correct me if I'm wrong, but I believe if you do a large IN() on
On Mon, Feb 19, 2018 at 10:01 AM, Kenneth Brotman <
kenbrot...@yahoo.com.invalid> wrote:
>
> >> Cluster wide management should be a big theme in any next major release.
> >>
> >Na. Stability and testing should be a big theme in the next major release.
> >
>
> Double Na on that one Jeff. I think y
I’ve been biting my tongue because I don’t normally like to directly plug
our service on the mailing list but if you’re going to compare Cassandra to
a fully managed service from Microsoft then you really should check out
Instaclustr (www.instaclustr.com) and you’ll find that we take care of many
o
Bloom filter settings have not changed, they are default. In the table
settings bloom_filter_fp_chance = 0.01. Should I increase it?
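
For context on that trade-off, here is a rough sketch based on the textbook
Bloom filter formulas. This is an assumption for illustration: Cassandra's
actual sizing may differ from the theoretical optimum, and the function names
are hypothetical. It estimates how many bits per key a filter needs for a
given false-positive chance.

```python
import math


def bits_per_key(fp_chance):
    """Textbook optimum: bits per stored key for a target false-positive rate."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def optimal_hashes(fp_chance):
    """Textbook optimum number of hash functions for that rate."""
    return bits_per_key(fp_chance) * math.log(2)


if __name__ == "__main__":
    for p in (0.01, 0.1):
        print(p, round(bits_per_key(p), 2), round(optimal_hashes(p), 2))
```

Under these formulas, 0.01 costs about 9.6 bits per key. Raising the value
shrinks the filter but lets more reads fall through to sstables that do not
contain the key.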
DESC TABLE "PerBoxEventSeriesEventIds"
CREATE TABLE "EventsKeyspace"."PerBoxEventSeriesEventIds" (
key blob,
column1 text,
value blob,
PRIMARY KEY
For UI and interactive data exploration there is already the Cassandra
interpreter for Apache Zeppelin that is more than decent for the job
On Wed, Feb 21, 2018 at 9:19 AM, Daniel Hölbling-Inzko <
daniel.hoelbling-in...@bitmovin.com> wrote:
> But what does this video really show? That Microsoft m
But what does this video really show? That Microsoft managed to run
Cassandra as a SaaS product with nice UI?
Google did that years ago with BigTable and Amazon with DynamoDB.
I agree that we need more tools, but not so much for querying (although
that would also help a bit), but just in general t