Re: Weekly Cassandra Wrap-Up: Oct 16 Edition

Jon Haddad Mon, 16 Oct 2017 11:10:22 -0700

Regarding the stress tests: I’m starting a repo where we can keep a bunch of 
different stress profiles.  I’d like to start running them on releases before 
we agree to push them out.  If anyone has a stress profile they are willing 
to share, please get in touch with me!
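
For anyone who has not run a user-mode stress profile before, here is a rough
sketch of how a short run could be scripted. It only builds the command line
for `cassandra-stress user`. The profile name `events.yaml` and the query name
`read_by_id` are hypothetical placeholders, not files or queries from this
thread; `insert` is the built-in write operation, and any other name in `ops`
is assumed to match a query defined in your own profile. The flags are written
from memory of the 3.x tool, so please check them against
`cassandra-stress help user` on your version.

```python
import shlex


def stress_command(profile, ops, duration="10m", nodes=("127.0.0.1",), threads=8):
    """Build an argv list for a short `cassandra-stress user` run.

    profile  -- path to a stress YAML profile (hypothetical name in the demo)
    ops      -- mapping of operation name to relative weight
    duration -- how long to run, e.g. "10m"
    nodes    -- contact points for the test cluster
    threads  -- client thread count passed to -rate
    """
    if not ops:
        raise ValueError("at least one operation ratio is required")
    # Render the ratios as ops(insert=1,read_by_id=3), preserving dict order.
    ratio = ",".join(f"{name}={weight}" for name, weight in ops.items())
    return [
        "cassandra-stress", "user",
        f"profile={profile}",
        f"ops({ratio})",
        f"duration={duration}",
        "-node", ",".join(nodes),
        "-rate", f"threads={threads}",
    ]


if __name__ == "__main__":
    # Assumed example: one insert for every three reads, for ten minutes.
    argv = stress_command("events.yaml", {"insert": 1, "read_by_id": 3})
    # shlex.join quotes the ops(...) argument so it survives a shell.
    print(shlex.join(argv))
```

Keeping the command as a list makes it easy to hand to `subprocess.run` on a
test cluster, or to print and paste next to the results when sharing a profile.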




> On Oct 16, 2017, at 8:37 AM, Jeff Jirsa wrote:
> 
> I got some feedback last week that I should try this on Monday morning, so
> let's see if we can nudge a few people into action this week.
> 
> 3.0.15 and 3.11.1 are released. This is a dev list, so that shouldn't be a
> surprise to anyone here - you should have seen the votes and release
> notifications. The people working directly ON Cassandra every day are
> probably very aware of the number and nature of fixes in those versions -
> if you're not aware, the changelogs are HUGE, and some of the fixes are
> VERY IMPORTANT. So this week's wrap-up is really a reflection on the size
> of those two release changelogs.
> 
> One of the advantages of the Cassandra project is the size of the user base
> - I don't know if we have accurate counts (and some of the "surveys" are
> laughable), but we know it's on the order of thousands (probably tens of
> thousands) of companies, and some huge number of instances (not willing to
> speculate here, we know it's at least in the hundreds of thousands, may be
> well into the millions). Historically, the best stabilizer of a release was
> people upgrading their unusual use cases, finding bugs that the developers
> hadn't anticipated (and therefore tests didn't exist for those edge cases),
> reporting them, and the next release would be slightly better than the one
> before it. The chicken/egg problem here is pretty obvious, and while a lot
> of us are spending a lot of time making things better, I want to use this
> email to ask a favor (in 3 parts):
> 
> 1) If you haven't tried 3.0 or 3.11 yet, please spin it up on a test
> cluster. 3.11 would be better, 3.0 is ok too. It doesn't need to be a
> thousand node cluster, most of the weird stuff we've seen in the post-3.0
> world deals with data, not cluster size. Grab some of your prod data if you
> can, throw it into a test cluster, add a node/remove a node, tell us if it
> doesn't work.
> 2) Please run a stress workload against that test cluster, even if it's only
> 5-10 minutes. The purpose here is two-fold: like #1, it'll help us find some
> edge cases we haven't seen before, but it'll also help us identify holes in
> stress coverage. We have some tickets to add UDTs to stress (
> https://issues.apache.org/jira/browse/CASSANDRA-13260 ) and LWT (
> https://issues.apache.org/jira/browse/CASSANDRA-7960 ). Ideally your stress
> profile should be more than "80% reads 20% writes" - try to actually model
> your schema and query behavior. Do you use static columns? Do you use
> collections?  If you're unable to model your use case because of a
> deficiency in stress, open a JIRA. If things break, open a JIRA. If it
> works perfectly, I'm interested in seeing your stress yaml and results
> (please send it to me privately, don't spam the list).
> 3) If you're somehow not able to run stress because you don't have hardware
> for a spare cluster, profiling your live cluster is also incredibly useful.
> TLP has some notes on how to generate flame graphs -
> https://github.com/thelastpickle/lightweight-java-profiler - I saw one
> example from a cluster that really surprised me. There are versions and use
> cases that we know have been heavily profiled, but there are probably
> versions and use cases where nobody's ever run much in the way of
> profiling. If you're running openjdk in prod, and you're able to SAFELY
> attach a profiler to generate some flame graphs, please send those to me
> (again, privately please, I don't think the whole list needs a copy).
> 
> My hope in all of this is to build up a corpus of real world use cases (and
> real current state via profiling) that we can leverage to make testing and
> performance better going forward. If I get much in the way of response to
> either of these, I'll try to send out a summary in next week's email.
> 
> - Jeff


