Re: [DISCUSS] Potential circleci config and workflow changes

David Capwell Tue, 25 Oct 2022 09:40:27 -0700

> This could also be a pipeline parameter instead of hacking it in generate.sh



Curious how this works… I run a script that deletes all the approvals and 
removes the testing workflows… I really don’t want to use the UI at all. I 
assumed pipeline params are a UI thing, but I think the goal here for many of 
us is to ignore the UI other than looking at the results… and even that can be 
scripted...
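
On the scripting question: as far as I know, pipeline parameters are not tied to the UI. CircleCI's v2 API accepts a `parameters` object when a pipeline is triggered, provided those parameters are declared in the config. A minimal sketch follows, under loud assumptions: the parameter name `auto_approve`, the project slug and the branch are hypothetical and do not come from generate.sh or the current config, and the token is read from a `CIRCLE_TOKEN` environment variable. The request is only built here, not sent.

```python
import json
import os
import urllib.request

API_ROOT = "https://circleci.com/api/v2"


def build_trigger_request(project_slug, branch, parameters, token):
    """Build (but do not send) a 'trigger a new pipeline' request."""
    body = json.dumps({"branch": branch, "parameters": parameters}).encode("utf-8")
    return urllib.request.Request(
        url=f"{API_ROOT}/project/{project_slug}/pipeline",
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "Circle-Token": token},
    )


# All values below are hypothetical placeholders for illustration.
req = build_trigger_request(
    project_slug="gh/example-user/cassandra",
    branch="my-feature-branch",
    parameters={"auto_approve": True},  # must match a parameter declared in the config
    token=os.environ.get("CIRCLE_TOKEN", "dummy-token"),
)
print(req.get_method(), req.full_url)
# To actually trigger the pipeline: urllib.request.urlopen(req)
```

Keeping the request construction separate from sending it makes this easy to fold into an existing local script without touching the UI.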


> On Oct 24, 2022, at 4:44 PM, Derek Chen-Becker <de...@chen-becker.org> wrote:
> 
> This could also be a pipeline parameter instead of hacking it in generate.sh. 
> I promise I'll have a proposal before the end of the week.
> 
> Derek
> 
> On Mon, Oct 24, 2022 at 2:13 PM Josh McKenzie <jmcken...@apache.org> wrote:
> @Ekaterina: I recall us going back and forth on whether the default should 
> require approval or not and there not being a consensus. I'm fine not 
> changing the status quo and just parameterizing that in generate.sh so folks 
> can locally script how they want to set up when they alias up generate.sh.
> 
> I'll add C-17113 to the epic as well and any other tickets anyone has in 
> flight we can link up.
> 
>> Maybe we should remove them from the workflow when the free option is used
> That'd put us in the position of having a "smoke testing suite" for free tier 
> users and the expectation of a committer running the full suite pre-merge. 
> Which, now that I type it out, is a lot more representative of our current 
> reality so we should probably do that.
> 
> Noted re: the -f flag; I could have checked that but just hacked that out in 
> the email spur of the moment. We could just default to low / free / smoke 
> test and have -p for paid tier.
> 
> 
> On Mon, Oct 24, 2022, at 3:23 PM, Andrés de la Peña wrote:
>> - Ticket for: remove -h, have -f and -p (free and paid)
>> 
>> +1 to this, probably there isn't anyone using -h. There are some jobs that 
>> can't pass with the free option. Maybe we should remove them from the 
>> workflow when the free option is used. Perhaps that could save new 
>> contributors some confusion. Or should we leave them because a subset of the 
>> tests inside those jobs can still pass even with the free tier?
>> 
>> By the way, the generate.sh script already accepts a -f flag. It's used to 
>> stop checking that the specified environment variables are known. It was 
>> meant to be a kind of general "--force" flag.
>> 
>> On Mon, 24 Oct 2022 at 20:07, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>> Seems like my email crossed with Andres’ one. 
>> My understanding is we will use the ticket CASSANDRA-17113 as a placeholder; 
>> the work there will be rebased/reworked etc. depending on what we agree on. 
>> I also agree with the other points he made. Sounds reasonable to me.
>> 
>> On Mon, 24 Oct 2022 at 15:03, Ekaterina Dimitrova <e.dimitr...@gmail.com> wrote:
>> Thank you Josh
>> 
>> So about push with/without a single click, I guess you mean to parameterize 
>> whether the step build needs approval or not? For pre-commit, the new flag 
>> will use the “no-approval” version, but during development we will still be 
>> able to push without immediately starting all tests, right?
>> - parallelism + -h being removed - just to confirm, that means we will not 
>> use xlarge containers. As David confirmed, this is not needed for all jobs, 
>> and it is important because otherwise whoever uses a paid account will burn 
>> through their credits faster for very similar run durations. 
>> 
>> CASSANDRA-17930 - I will also use the opportunity to mention that many of 
>> the identified missing jobs in CircleCI will be there soon - Andres is 
>> working on all variations of the unit tests, I am doing final testing on 
>> fixing the Python upgrade tests (we weren’t using the right parameters and 
>> were running way more jobs than we should) and Derek is looking into the 
>> rest of the Python tests. I still need to check whether we need something 
>> regarding in-jvm etc.; the simulator ones are running only for jdk8 for now, 
>> confirmed. All this should unblock us to do the next releases based on 
>> CircleCI as we agreed. Then we move on to some 
>> changes/additions/improvements to Jenkins. And of course, the future 
>> improvements we agreed on. 
>> 
>> On Mon, 24 Oct 2022 at 14:10, Josh McKenzie <jmcken...@apache.org> wrote:
>> 
>>> Auto-run on push? Can you elaborate?
>> Yep - instead of having to go to circle and click, when you push your branch 
>> the circle hook picks it up and kicks off the top level job automatically. I 
>> tend to be paranoid and push a lot of incremental work that's not ready for 
>> CI remotely so it's not great for me, but I think having it be optional is 
>> the Right Thing.
>> 
>> So here's the outstanding work I've distilled from this thread:
>> - Create an epic for circleci improvement work (we have a lot of little 
>> augments to do here; keep it organized and try and avoid redundancy)
>> - Include CASSANDRA-17600 in epic umbrella  
>> - Include CASSANDRA-17930 in epic umbrella
>> - Ticket to tune parallelism per job  
>>     -  
>>     > def java_parallelism(src_dir, kind, num_file_in_worker, include = lambda a, b: True):
>>     >     d = os.path.join(src_dir, 'test', kind)
>>     >     num_files = 0
>>     >     for root, dirs, files in os.walk(d):
>>     >         for f in files:
>>     >             if f.endswith('Test.java') and include(os.path.join(root, f), f):
>>     >                 num_files += 1
>>     >     return math.floor(num_files / num_file_in_worker)
>>     > 
>>     > def fix_parallelism(args, contents):
>>     >     jobs = contents['jobs']
>>     > 
>>     >     unit_parallelism                = java_parallelism(args.src, 'unit', 20)
>>     >     jvm_dtest_parallelism           = java_parallelism(args.src, 'distributed', 4, lambda full, name: 'upgrade' not in full)
>>     >     jvm_dtest_upgrade_parallelism   = java_parallelism(args.src, 'distributed', 2, lambda full, name: 'upgrade' in full)
>>     - `TL;DR - I find all test files we are going to run, and based off a 
>> pre-defined variable that says the “ideal” number of files per worker, I then 
>> calculate how many workers we need.  So unit tests are num_files / 20 ~= 35 
>> workers.  Can I be “smarter” by knowing which files have higher cost?  Sure… 
>> but the “perfect” and the “average” are so similar that it wasn’t worth 
>> it...`  
>> - Ticket to combine pre-commit jobs into 1 pipeline for all JDKs
>>     - Path to activate all supported JDKs for pre-commit at root (one-click 
>> pre-merge full validation)
>>     - Path to activate per JDK below that (interim work partial validation)
>> - Ticket to rename jobs in circleci
>>     - Reference comment: 
>> https://issues.apache.org/jira/browse/CASSANDRA-17939?focusedCommentId=17617016&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17617016
>>     - (buildjdk)_(runjdk)_(testsuite) format:
>>     - j8_j8_jvm_dtests
>>     - j8_j11_jvm_dtests
>>     - j11_j11_jvm_dtest_vnode
>>     etc
>> - Ticket for flag in generate.sh to support auto run on push (see response 
>> above)
>> - Ticket for: remove -h, have -f and -p (free and paid) (probably intersects 
>> with https://issues.apache.org/jira/browse/CASSANDRA-17600)
>> 
>> Anything wrong w/the above or anything missed? If not, I'll go do some 
>> JIRA'ing.
>> 
>> 
>> ~Josh
>> 
>> 
>> On Fri, Oct 21, 2022, at 3:50 PM, Josh McKenzie wrote:
>>>> I am cool with removing circle if apache CI is stable and works, we do 
>>>> need to solve the non-committer issue but would argue that partially 
>>>> exists in circle today (you can be a non-committer with a paid account, but 
>>>> you can’t be a non-committer with a free account)
>>> There are a few threads here:
>>> 1. non-committers should be able to run ci
>>> 2. People that have resources and want to run ci faster should be able to 
>>> do so (assuming the ci of record could serve to be faster)
>>> 3. ci should be stable
>>> 
>>> Thus far we haven't landed on 1 system that satisfies all 3. There are some 
>>> background discussions brainstorming how to get there; when / if things 
>>> come from that they'll as always be brought to the list for discussion.
>>> 
>>> On Fri, Oct 21, 2022, at 1:44 PM, Ekaterina Dimitrova wrote:
>>>> I agree with David with one caveat - last time I checked only some Python 
>>>> tests lack enough resources with the free tier. The rest run slower than 
>>>> with a paid account, but they do fine. In fact I use the free tier if I 
>>>> want to test only unit or in-jvm tests sometimes. I guess that is what he 
>>>> meant by partially but even being able to run the non-Python tests is a 
>>>> win IMHO. If we find a solution for all tests though… even better.
>>>> @Derek your idea sounds interesting, I will be happy to see a proposal. 
>>>> Thank you
>>>> 
>>>> On Fri, 21 Oct 2022 at 13:39, David Capwell <dcapw...@apple.com> wrote:
>>>> I am cool with removing circle if apache CI is stable and works, we do 
>>>> need to solve the non-committer issue but would argue that partially 
>>>> exists in circle today (you can be a non-committer with a paid account, but 
>>>> you can’t be a non-committer with a free account)
>>>> 
>>>> 
>>>> 
>>>>> On Oct 20, 2022, at 2:20 PM, Josh McKenzie <jmcken...@apache.org> wrote:
>>>>> 
>>>>>> I believe its original intention was to be just about CircleCI.
>>>>> It was but fwiw I'm good w/us exploring adjacent things regarding CI 
>>>>> here. I'm planning on deep diving on the thread tomorrow and distilling a 
>>>>> snapshot of the work we have a consensus on for circle and summarizing 
>>>>> here so we don't lose that. Seems like it's fairly non-controversial.
>>>>> 
>>>>> On Thu, Oct 20, 2022, at 5:14 PM, Mick Semb Wever wrote:
>>>>>> 
>>>>>> 
>>>>>> On Thu, 20 Oct 2022 at 22:07, Derek Chen-Becker <de...@chen-becker.org> wrote:
>>>>>> Would the preclusion of non-committers also prevent us from configuring 
>>>>>> Jenkins to auto-test on PR independent of who opens it?
>>>>>> 
>>>>>> One of my current concerns is that we're maintaining 2x the CI for 1x 
>>>>>> the benefit, and I don't currently see an easy way to unify them 
>>>>>> (perhaps a lack of imagination?). I know there's a long history behind 
>>>>>> the choice of CircleCI, so I'm not trying to be hand-wavy about all of 
>>>>>> the thought that went into that decision, but that decision has costs 
>>>>>> beyond just a paid CircleCI account. My long term, probably naive, goals 
>>>>>> for CI would be to:
>>>>>> 
>>>>>> 1. Have a CI system that is *fully* available to *any* contributor, 
>>>>>> modulo safeguards to prevent abuse
>>>>>> 
>>>>>> 
>>>>>> This thread is going off-topic, as I believe its original intention was 
>>>>>> to be just about CircleCI.
>>>>>> 
>>>>>> But on your point… our community CI won't be allowed (by ASF), nor have 
>>>>>> capacity (limited donated resources), to run pre-commit testing by 
>>>>>> anyone and everyone.
>>>>>> 
>>>>>> Today, trusted contributors can be handed tokens to ci-cassandra.a.o 
>>>>>> (make sure to label them so they can be revoked easily), but we still 
>>>>>> face the issue that too many pre-commit runs impact the throughput and 
>>>>>> quality of the post-commit runs (though this has improved recently).
>>>>>> 
>>>>>> It's on my wishlist to be able, with a single command line, to spin up 
>>>>>> the ci-cassandra.a.o stack on any k8s cluster, run any git sha through 
>>>>>> it and collect results, and tear it down. Variations on this would solve 
>>>>>> non-committers being able to repeat, use, and work on their own (or a 
>>>>>> separately donated) CI system, and folk/companies with money to be able 
>>>>>> to run their own ci-cassandra.a.o stacks for faster pre-commit 
>>>>>> turnaround time. Having this reproducibility of the CI system would make 
>>>>>> testing changes to it easier as well, so I'd expect a positive feedback 
>>>>>> loop here. 
>>>>>> 
>>>>>> I have some rough ideas on how to get started on this, if anyone would 
>>>>>> like to buddy up on it.
>>> 
>> 
> 
> 
> 
> -- 
> +---------------------------------------------------------------+
> | Derek Chen-Becker                                             |
> | GPG Key available at https://keybase.io/dchenbecker and       |
> | https://pgp.mit.edu/pks/lookup?search=derek%40chen-becker.org |
> | Fngrprnt: EB8A 6480 F0A3 C8EB C1E7  7F42 AFC5 AFEE 96E4 6ACC  |
> +---------------------------------------------------------------+
>
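
The per-job parallelism arithmetic quoted in the thread above (number of test files divided by an ideal files-per-worker count) can be illustrated with a small self-contained sketch. The assumptions are mine: the file count of 700 is picked only so that it matches the "~= 35 workers" figure mentioned for unit tests, the function name `workers_needed` is hypothetical, and the `max(1, ...)` guard is an addition that is not in the quoted snippet.

```python
import math


def workers_needed(num_files, files_per_worker):
    """Number of parallel workers for a suite of num_files test files."""
    # Same floor division as the quoted snippet; the max(1, ...) guard is an
    # addition (assumption) so a suite smaller than one batch still gets a worker.
    return max(1, math.floor(num_files / files_per_worker))


print(workers_needed(700, 20))  # assumed 700 unit test files, 20 per worker
print(workers_needed(3, 4))     # fewer files than one batch
```

Without the guard, a suite with fewer files than `files_per_worker` would compute a parallelism of 0, which is worth keeping in mind if the ideal batch size is tuned upward.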
