Re: [DISCUSS] Potential circleci config and workflow changes

Derek Chen-Becker Mon, 24 Oct 2022 16:44:30 -0700

This could also be a pipeline parameter instead of hacking it in
generate.sh. I promise I'll have a proposal before the end of the week.


Derek

On Mon, Oct 24, 2022 at 2:13 PM Josh McKenzie <[email protected]> wrote:

> @Ekaterina: I recall us going back and forth on whether the default should
> be to require approval or not, and there not being a consensus. I'm fine not
> changing the status quo and just parameterizing that in generate.sh so
> folks can locally script how they want to set up when they alias up
> generate.sh.
>
> I'll add C-17113 to the epic as well and any other tickets anyone has in
> flight we can link up.
>
> Maybe we should remove them from the workflow when the free option is used
>
> That'd put us in the position of having a "smoke testing suite" for free
> tier users and the expectation of a committer running the full suite
> pre-merge. Which, now that I type it out, is a lot more representative of
> our current reality so we should probably do that.
>
> Noted re: the -f flag; I could have checked that but just hacked that out
> in the email spur of the moment. We could just default to low / free /
> smoke test and have -p for paid tier.
>
>
> On Mon, Oct 24, 2022, at 3:23 PM, Andrés de la Peña wrote:
>
> - Ticket for: remove -h, have -f and -p (free and paid)
>
>
> +1 to this, probably there isn't anyone using -h. There are some jobs that
> can't pass with the free option. Maybe we should remove them from the
> workflow when the free option is used. Perhaps that could save new
> contributors some confusion. Or should we leave them because a subset of
> the tests inside those jobs can still pass even with the free tier?
>
> By the way, the generate.sh script already accepts a -f flag. It's used to
> stop checking that the specified environment variables are known. It was
> meant to be a kind of general "--force" flag.
>
> On Mon, 24 Oct 2022 at 20:07, Ekaterina Dimitrova <[email protected]>
> wrote:
>
> Seems like my email crossed with Andrés's.
> My understanding is we will use the ticket CASSANDRA-17113 as
> placeholder, the work there will be rebased/reworked etc depending on what
> we agree with.
> I also agree with the other points he made. Sounds reasonable to me
>
> On Mon, 24 Oct 2022 at 15:03, Ekaterina Dimitrova <[email protected]>
> wrote:
>
> Thank you Josh
>
> So about push with/without a single click, I guess you mean to
> parameterize whether the step build needs approval or not? Pre-commit the
> new flag will use the “no-approval” version, but during development we
> still will be able to push the tests without immediately starting all
> tests, right?
> - parallelism + -h being removed - just to confirm, that means we will not
> use xlarge containers. As David confirmed, this is not needed for all jobs,
> and it is important because otherwise whoever uses a paid account will burn
> their credits faster for very similar duration runs.
>
> CASSANDRA-17930 - I will use the opportunity also to mention that many of
> the identified missing jobs in CircleCI will be there soon - Andres is
> working on all variations of the unit tests, I am doing final testing on
> fixing the Python upgrade tests (we weren’t using the right parameters and
> were running way more jobs than we should) and Derek is looking into the
> rest of the Python tests. I still need to check whether we need something regarding
> in-jvm etc, the simulator ones are running only for jdk8 for now,
> confirmed. All this should unblock us to be able to do next releases based
> on CircleCI as we agreed. Then we move to do some
> changes/additions/improvements to Jenkins. And of course, the future
> improvements we agreed on.
>
> On Mon, 24 Oct 2022 at 14:10, Josh McKenzie <[email protected]> wrote:
>
>
> Auto-run on push? Can you elaborate?
>
> Yep - instead of having to go to circle and click, when you push your
> branch the circle hook picks it up and kicks off the top level job
> automatically. I tend to be paranoid and push a lot of incremental work
> that's not ready for CI remotely so it's not great for me, but I think
> having it be optional is the Right Thing.
>
> So here's the outstanding work I've distilled from this thread:
> - Create an epic for circleci improvement work (we have a lot of little
> augments to do here; keep it organized and try and avoid redundancy)
> - Include CASSANDRA-17600 in epic umbrella
> - Include CASSANDRA-17930 in epic umbrella
> - Ticket to tune parallelism per job
>     -
>     > def java_parallelism(src_dir, kind, num_file_in_worker, include=lambda a, b: True):
>     >     d = os.path.join(src_dir, 'test', kind)
>     >     num_files = 0
>     >     for root, dirs, files in os.walk(d):
>     >         for f in files:
>     >             if f.endswith('Test.java') and include(os.path.join(root, f), f):
>     >                 num_files += 1
>     >     return math.floor(num_files / num_file_in_worker)
>     >
>     > def fix_parallelism(args, contents):
>     >     jobs = contents['jobs']
>     >
>     >     unit_parallelism                = java_parallelism(args.src, 'unit', 20)
>     >     jvm_dtest_parallelism           = java_parallelism(args.src, 'distributed', 4, lambda full, name: 'upgrade' not in full)
>     >     jvm_dtest_upgrade_parallelism   = java_parallelism(args.src, 'distributed', 2, lambda full, name: 'upgrade' in full)
>     - `TL;DR - I find all test files we are going to run, and based off a
> pre-defined variable that says “ideal” number of files per worker, I then
> calculate how many workers we need.  So unit tests are num_files / 20 ~= 35
> workers.  Can I be “smarter” by knowing which files have higher cost?
> Sure… but the “perfect” and the “average” are so similar that it wasn’t
> worth it...`
> - Ticket to combine pre-commit jobs into 1 pipeline for all JDK's
>     - Path to activate all supported JDK's for pre-commit at root
> (one-click pre-merge full validation)
>     - Path to activate per JDK below that (interim work partial validation)
> - Ticket to rename jobs in circleci
>     - Reference comment:
> https://issues.apache.org/jira/browse/CASSANDRA-17939?focusedCommentId=17617016&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17617016
>     - (buildjdk)_(runjdk)_(testsuite) format:
>     - j8_j8_jvm_dtests
>     - j8_j11_jvm_dtests
>     - j11_j11_jvm_dtest_vnode
>     etc
> - Ticket for flag in generate.sh to support auto run on push (see response
> above)
> - Ticket for: remove -h, have -f and -p (free and paid) (probably
> intersects with https://issues.apache.org/jira/browse/CASSANDRA-17600)
>
> Anything wrong w/the above or anything missed? If not, I'll go do some
> JIRA'ing.
>
>
> ~Josh
>
>
> On Fri, Oct 21, 2022, at 3:50 PM, Josh McKenzie wrote:
>
> I am cool with removing circle if apache CI is stable and works, we do
> need to solve the non-committer issue but would argue that partially exists
> in circle today (you can be a non-committer with a paid account, but you
> can’t be a non-committer with a free account)
>
> There's a few threads here:
> 1. non-committers should be able to run ci
> 2. People that have resources and want to run ci faster should be able to
> do so (assuming the ci of record could serve to be faster)
> 3. ci should be stable
>
> Thus far we haven't landed on 1 system that satisfies all 3. There's some
> background discussions brainstorming how to get there; when / if things
> come from that they'll as always be brought to the list for discussion.
>
> On Fri, Oct 21, 2022, at 1:44 PM, Ekaterina Dimitrova wrote:
>
> I agree with David with one caveat - last time I checked only some Python
> tests lack enough resources with the free tier. The rest run slower than
> with a paid account, but they do fine. In fact I use the free tier if I
> want to test only unit or in-jvm tests sometimes. I guess that is what he
> meant by partially but even being able to run the non-Python tests is a win
> IMHO. If we find a solution for all tests though… even better.
> @Derek your idea sounds interesting, I will be happy to see a proposal.
> Thank you
>
> On Fri, 21 Oct 2022 at 13:39, David Capwell <[email protected]> wrote:
>
> I am cool with removing circle if apache CI is stable and works, we do
> need to solve the non-committer issue but would argue that partially exists
> in circle today (you can be a non-committer with a paid account, but you
> can’t be a non-committer with a free account)
>
>
>
> On Oct 20, 2022, at 2:20 PM, Josh McKenzie <[email protected]> wrote:
>
> I believe its original intention to be just about CircleCI.
>
> It was but fwiw I'm good w/us exploring adjacent things regarding CI here.
> I'm planning on deep diving on the thread tomorrow and distilling a
> snapshot of the work we have a consensus on for circle and summarizing here
> so we don't lose that. Seems like it's fairly non-controversial.
>
> On Thu, Oct 20, 2022, at 5:14 PM, Mick Semb Wever wrote:
>
>
>
> On Thu, 20 Oct 2022 at 22:07, Derek Chen-Becker <[email protected]>
> wrote:
>
> Would the preclusion of non-committers also prevent us from configuring
> Jenkins to auto-test on PR independent of who opens it?
>
> One of my current concerns is that we're maintaining 2x the CI for 1x the
> benefit, and I don't currently see an easy way to unify them (perhaps a
> lack of imagination?). I know there's a long history behind the choice of
> CircleCI, so I'm not trying to be hand-wavy about all of the thought that
> went into that decision, but that decision has costs beyond just a paid
> CircleCI account. My long term, probably naive, goals for CI would be to:
>
> 1. Have a CI system that is *fully* available to *any* contributor, modulo
> safeguards to prevent abuse
>
>
>
> This thread is going off-topic, as I believe its original intention to be
> just about CircleCI.
>
> But on your point… our community CI won't be allowed (by ASF), nor have
> capacity (limited donated resources), to run pre-commit testing by anyone
> and everyone.
>
> Today, trusted contributors can be handed tokens to ci-cassandra.a.o (make
> sure to label them so they can be revoked easily), but we still face the
> issue that too many pre-commit runs impact the throughput and quality of
> the post-commit runs (though this has improved recently).
>
> It's on my wishlist to be able to: with a single command line; spin up the
> ci-cassandra.a.o stack on any k8s cluster, run any git sha through it and
> collect results, and tear it down. Variations on this would solve
> non-committers being able to repeat, use, and work on their own (or a
> separately donated) CI system, and folk/companies with money to be able to
> run their own ci-cassandra.a.o stacks for faster pre-commit turnaround
> time. Having this reproducibility of the CI system would make testing
> changes to it easier as well, so I'd expect a positive feedback loop here.
>
> I have some rough ideas on how to get started on this, if anyone would
> like to buddy up on it.
>
>
>
>
>
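
For reference, the files-per-worker heuristic quoted above can be sketched as a self-contained script. This is only an illustration under stated assumptions, not the code that generates the CircleCI config: the `max(1, ...)` floor, the `demo` helper, and the throwaway directory layout are all hypothetical additions, and the file counts are made up.

```python
import math
import os
import tempfile


def java_parallelism(src_dir, kind, files_per_worker, include=lambda full, name: True):
    """Count *Test.java files under <src_dir>/test/<kind> and derive a worker count."""
    test_root = os.path.join(src_dir, "test", kind)
    num_files = 0
    for root, _dirs, files in os.walk(test_root):
        for name in files:
            if name.endswith("Test.java") and include(os.path.join(root, name), name):
                num_files += 1
    # Assumption (not in the quoted snippet): never request fewer than one worker.
    return max(1, math.floor(num_files / files_per_worker))


def demo():
    """Create 45 empty fake unit test files and compute workers at 20 files each."""
    with tempfile.TemporaryDirectory() as src:
        unit_dir = os.path.join(src, "test", "unit")
        os.makedirs(unit_dir)
        for i in range(45):
            # Empty placeholder files; only the names matter for the count.
            open(os.path.join(unit_dir, f"Sample{i}Test.java"), "w").close()
        return java_parallelism(src, "unit", 20)


if __name__ == "__main__":
    print(demo())
```

With 45 files and 20 files per worker, the sketch asks for floor(45 / 20) = 2 workers. A missing or empty test directory yields 1 only because of the assumed lower bound; the quoted snippet would return 0 there.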

-- 
+---------------------------------------------------------------+
| Derek Chen-Becker                                             |
| GPG Key available at https://keybase.io/dchenbecker and       |
| https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
| Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
+---------------------------------------------------------------+
