I am strong +1 to new linters, I have been working on SpotBugs but not sent a patch due to sickness and holidays…
About the check style as the source of truth for the style guides, I am +1 to this as well… I feel that wiki is a bad place for this and we can use the check style file to generate the wiki text (no idea if there are tools for this…). About the comment on “hard” requirements… my understanding has been that new changes must follow and I know several reviewers who act this way in the review process…. If we agree on a style, why do we want to keep allowing ignoring it? > There’s always a handful of people asking for it, but notably few if any of > the full time contributors doing the majority of the core development of > Cassandra. I don’t agree with that statement, I know several committers/PMC who want to switch, but every 6 months this gets brought up and gets shot down… so we just give up trying… > I never really see a good argument articulated for the migration, besides > general hand waving that ant is old, and people like newer build systems. Ant > is working fine, so there isn’t a strong technical reason to replace it, and > there are good organisational reasons not to. I don’t know how Mick feels, but I tend to find that maintaining our current Ant build is a big waste of my time, and that every time I need to go to this layer its far more brittle than it should be…. There are pushes to refactor to try to get more order in the build (side conversation between Mick and I), but these tend to hide the complexity from most…. I strongly feel that our current build is duct taping things together, this isn’t a complaint of the maintainers but more how Ant works… And there are a few things I feel newer build systems offer that actually make my life better: 1) cache - I hate running tests in a loop as this always has the same pattern: rebuild everything from scratch for 1m to run a single test that takes 30 milliseconds…. With ant we “could” force our tasks to cache, but then we have to go do that for everything… we are basically rewriting and reimplementing what other builds have 2) consistency - CI runs tests using 1 task, but people like to run tests using a different task… they tend to have different setup and act slightly differently… I had to go improve this for jvm-dtest and see that simulator is in the same shape (if you run simulator tests using the normal CI task, they will fail; you need the bulk task that doesn’t have the same features as the CI version) 3) reusable work, avoid copy/paste - when we add new jars/artifacts we need to copy/paste a set of tasks to implement a “sub-module” or “sub-project” (depending on build tool you prefer), not only is this very easy to get wrong, it discourages splitting work into these units (we talked about moving utils/concurrent into its own jar for Accord…) 4) only having to know 1 build system - right now we are Ant + Maven; to maintain this project you need to understand both… you can get by not knowing this until you touch dependencies, and if you need tasks that also know about those dependencies… you need to see how this is taped together and grasp that logic… Maybe I am bitter as it took a whole day to get SpotBugs working with Ant… and it is just the following in other builds: Gradle: apply plugin: “com.github.spotbugs" Maven: <reporting> <plugins> <plugin> <groupId>com.github.spotbugs</groupId> <artifactId>spotbugs-maven-plugin</artifactId> </plugin> </plugins> </reporting> > On Nov 28, 2022, at 10:46 AM, Maxim Muzafarov <mmu...@apache.org> wrote: > > Thank you all for the feedback and productive discussion. > > > I couldn't have formed my thoughts on the build tools for the product > better and provided such good examples than Scott did. Rephrasing what > I wrote in the first letter, seeing Maven/Gradle in the project > underfoot, a modern IDE will take care of all the necessary files and > configurations for themselves much better than we do with scripts. I > fully agree that there is no rush with such migration, and the > databases in such cases must be more conservative than progressive, > and not change anything without strong benefits and a broad consensus > on it. I still believe this consensus can be reached in future and > when (and if) the consensus will be reached, a clear migration plan > should be developed for several releases ahead as well. There's still > a lot of work to be done here that's why I mentioned it at the end of > my proposal, so as not to pay too much attention to this question at > this moment. > I've added a link to this thread to the JIRA issue [1], so we don't > lose the insights mentioned by members above. > > > I want to take away your concerns about lints expansion for now. I > thought first of all about making all the source code-checking tools > more convenient for use with a minimal set of already existing lints > rather than adding or forcing new rules. I really want to avoid here > cases with storing multiple configurations for a single tool e.g. > having different configurations for 'optional' or 'mandatory' checks > as well as different configurations for 'production' or 'tests'. > > Thus, the ideal picture in my mind of all discussed above is : > > We have: > - checkstyle > - SpotBugs > - Sonar > > They work the same way for: > - Jenkins builds > - CirleCI builds > - GitHub pull requests > - build on the local machine > > For all that, we have the code style webpage [2] (and wiki [3]) is > pretty well described, there is no need to expand checking tools with > new rules until we will get working these tools on the minimal set of > rules. For instance, we can pick up for the checkstyle 'Unused > imports', 'Import order', for the SpotBugs 'AutoCloseable', > 'Number.valueOf', for Sonar - only reports to monitor the source code > trends. > > I agree that adding new lints require a broad consensus, so I'd like > to avoid such debatable questions for now. Moreover, even with the > lints already agreed upon, it is still risky to implement some of them > because they can contain a lot of boilerplate changes and may affect > more important fixes ready for merge. > > > So, as a first step, I can invest my time into the checkstyle tool and > make it work everywhere with the same configuration. > WDYT? > > > P.S. > > For IntelliJ with the Checkstyle Plugin it's easy to import the > checkstyle.xml the following way: > Preferences -> Code Style -> Show Scheme Actions (wheel) -> Import > scheme -> Checkstyle configuration. > > > [1] https://issues.apache.org/jira/browse/CASSANDRA-17015 > [2] https://cassandra.apache.org/_/development/code_style.html > [3] https://cwiki.apache.org/confluence/display/CASSANDRA2/CodeStyle > > On Sun, 27 Nov 2022 at 13:17, Josh McKenzie <jmcken...@apache.org> wrote: >> >> My .02 on the build discussion is we should try and keep the guts of that in >> one place, be it the other email thread or on JIRA. Some insightful points >> made on this thread but would hate to see this thread derailed on a complex >> independent topic as well as see some of these points lost on the other >> discussion. >> >> I think there needs to be a lot of community consensus on the broad >> expansion of lints that can reject patches. >> >> +1. It may be worthwhile to configure 2 tiers of lints, optional and >> required, so we can move to a more gradual process of cleaning up lint >> violations for those that are interested in that type of work. I know in the >> past we've seen value in looking at the diff in linting violations even w/a >> 1k+ noisy violation environment. >> >> >> On Fri, Nov 25, 2022, at 12:41 PM, sc...@paradoxica.net wrote: >> >> For me, the strongest arguments in favor of adopting a modern build tool >> like Maven or Gradle are their ecosystems - both explicit (in terms of >> plugins), and implicit (in terms of nearly all build tooling supporting both >> of them, but not ant). >> >> Investment in Ant - and in tooling that integrates with Ant - fell off years >> ago. This makes integrating build-phase aspects of Cassandra with other >> tooling a very frustrating task that users of most build tools get for free. >> Many tools built in the last several years don’t support it, or do so only >> as an afterthought. >> >> Two recent examples that have caused pain for me, which I suspect are felt >> by many: >> >> – Integration with internal build systems at many companies that develop >> Cassandra. Because ant has fallen into disuse, this integration is heavily >> manual instead of automatic and free. It usually requires forking the >> project’s build.xml, developing custom tooling around it, or creating a mock >> Gradle build that wraps ant lifecycle tasks (which also requires overriding >> ant tasks whose names clash). >> >> – Security toolchain integration. Many users and developers of Cassandra >> also integrate with security tooling at their respective companies. Because >> Ant has fallen into disuse, most tooling commonly used by security >> organizations doesn’t support it. SBOMs are a good example, as their >> introduction postdates ant’s decline. Maven plugins exist to generate them >> in CycloneDX and SPDX format, but no such plugins exist for ant. Cassandra >> users and developers who need them must manually write shims to produce >> SBOMs that users of modern build tools get for free. >> >> These might not be use cases anticipated by the project, but they represent >> work I suspect a large number of contributors to the project are required to >> perform to make Cassandra usable for them. >> >> The ecosystem point means that the fix for this has to be external to the >> project if the project continues to use Ant: lots of one-off scripts and >> shims unsuitable for contribution with sole maintainers at their respective >> companies to provide functionality that users of modern build tools get for >> free. It also demands continuous, incremental work to adapt to changes in >> security and build tooling in use at many companies that don’t need to be >> made for projects using well-supported tools like Maven or Gradle. >> >> – Scott >> >>> On Nov 25, 2022, at 4:56 AM, Benedict <bened...@apache.org> wrote: >>> >>> I think modules are already fairly well supported - we in effect already >>> have several? FQL, Simulator and others I think. >>> >>> I can anyway say with absolute certainty that the main impediment to >>> modularising is not build tooling, it’s the difficulty of the task, the >>> cost to the project of undertaking it, and therefore its relative payoff >>> versus other things that could be undertaken by the folk with the relevant >>> expertise to do it. >>> >>> I’ve also been around long enough to see enough efforts to broaden >>> contributions fail to have an impact, that basing costly decisions on this >>> kind of reasoning doesn’t resonate. The main impediments to contributions >>> are the complexity of the codebase (and problem domain) and our limited >>> capacity to respond promptly to high quality contributions. Until we fix >>> those fundamental issues, this kind of rearranging of chairs seems more >>> like a branding exercise to ourselves than to anyone else. >>> >>> >>>> On 25 Nov 2022, at 10:22, Miklosovic, Stefan >>>> <stefan.mikloso...@netapp.com> wrote: >>>> >>>> I agree with what you wrote. How I understand it is that migrating to >>>> Maven/Gradle makes the project more "attractive" for newcomers. If a >>>> project is built on "that old un-cool Ant", it might be a little bit >>>> off-putting and questionable if we are "stuck in the past on build systems >>>> and not progressing". >>>> >>>> So in that sense I agree this is more "marketing" rather than >>>> technological question but on the other hand, does not Maven/Gradle allow >>>> us to modularize the project better? Maybe we would like to modularize but >>>> nobody is up to that because build system makes it impossible or at least >>>> quite inconvenient to do so. Do you really think there are not any >>>> significant benefits to switch even if it "just works" now? >>>> >>>> ________________________________________ >>>> From: Benedict <bened...@apache.org> >>>> Sent: Friday, November 25, 2022 11:07 >>>> To: dev@cassandra.apache.org >>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis >>>> >>>> NetApp Security WARNING: This is an external email. Do not click links or >>>> open attachments unless you recognize the sender and know the content is >>>> safe. >>>> >>>> >>>> >>>> >>>> There’s always a handful of people asking for it, but notably few if any >>>> of the full time contributors doing the majority of the core development >>>> of Cassandra. It strikes me as something very appealing to others, but >>>> less so to those wanting to get on with development. >>>> >>>> I never really see a good argument articulated for the migration, besides >>>> general hand waving that ant is old, and people like newer build systems. >>>> Ant is working fine, so there isn’t a strong technical reason to replace >>>> it, and there are good organisational reasons not to. >>>> >>>> Why do you consider a migration inevitable? >>>> >>>> >>>> >>>>> On 25 Nov 2022, at 09:58, Miklosovic, Stefan >>>>> <stefan.mikloso...@netapp.com> wrote: >>>>> >>>>> Interesting take on Ant / no-Ant, Benedict. I am very curious how this >>>>> unfolds. My long-term perception is that changing it to something else is >>>>> more or less inevitable but if there is a broader consensus to not do >>>>> that .... well. >>>>> >>>>> ________________________________________ >>>>> From: Benedict <bened...@apache.org> >>>>> Sent: Friday, November 25, 2022 10:52 >>>>> To: dev@cassandra.apache.org >>>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis >>>>> >>>>> NetApp Security WARNING: This is an external email. Do not click links or >>>>> open attachments unless you recognize the sender and know the content is >>>>> safe. >>>>> >>>>> >>>>> >>>>> >>>>> I was in a bit of a rush last night. I should say that I’m of course +1 a >>>>> general endeavour to clean this up, and to expand our use of linters, and >>>>> I appreciate your volunteering to help out in this way Maxim. >>>>> >>>>> However, responding to Stefan, I’m pretty -1 migrating from ant to >>>>> another build system without really good reason. Migration has a real >>>>> cost to productivity for all existing contributors, and the phantom of >>>>> increasing new contributions has never paid off historically. I’m all for >>>>> easing people into participation, but not at penalty to the existing >>>>> contributor base. >>>>> >>>>> If the only reason is to make it easier to open in a different IDE, we >>>>> can perhaps have some basic build files outlining code structure for >>>>> importing, that are compatible with our canonical ant build? We could >>>>> perhaps even generate them. >>>>> >>>>> >>>>>>> On 25 Nov 2022, at 09:35, Miklosovic, Stefan >>>>>>> <stefan.mikloso...@netapp.com> wrote: >>>>>> >>>>>> For the record, I was testing that same combo Claude mentioned and it >>>>>> did not work out of the box but it is definitely possible to set up >>>>>> successfully. I do not remember the details. >>>>>> >>>>>> To replay to Maxim, it all seems good to me, roughly, but I humbly think >>>>>> it all boils down to Maven/Gradle refactoring and on top of that we can >>>>>> do all else. >>>>>> >>>>>> For example, there is (1) where the solution, besides fixing the tests, >>>>>> is to introduce an Ant task which would check this on build. That being >>>>>> said, how is that going to look like when we change Ant for something >>>>>> else? That stuff suddenly becomes obsolete. >>>>>> >>>>>> This case maybe applies to other problems we want to solve as well. I do >>>>>> not want to do something tailored for one build system just to rewrite >>>>>> it all or to spend significant amount of time on that again when we >>>>>> switch the build system. >>>>>> >>>>>> For that reason I think changing Ant for something else should be top >>>>>> priority (as I understand that it the hot topic for community for very >>>>>> long time) and then everything else should follow. We should spend time >>>>>> on things mentioned only in case they do not collide with any build >>>>>> system at all. >>>>>> >>>>>> (1) https://issues.apache.org/jira/browse/CASSANDRA-17964 >>>>>> >>>>>> Stefan >>>>>> >>>>>> ________________________________________ >>>>>> From: Claude Warren, Jr via dev <dev@cassandra.apache.org> >>>>>> Sent: Friday, November 25, 2022 10:16 >>>>>> To: dev@cassandra.apache.org >>>>>> Subject: Re: [DISCUSSION] Cassandra's code style and source code analysis >>>>>> >>>>>> NetApp Security WARNING: This is an external email. Do not click links >>>>>> or open attachments unless you recognize the sender and know the content >>>>>> is safe. >>>>>> >>>>>> >>>>>> >>>>>> +1 for the concept as a whole. I am certain I could find nits to pick >>>>>> if I looked deeply. >>>>>> >>>>>> @mck -- I did have a problem with Cassandra + Eclipse + Java11 >>>>>> (Classpath). I gave up and am spending time trying to learn IntelliJ. >>>>>> I also mentioned it in one of the discussion areas. >>>>>> >>>>>> Claude >>>>>> >>>>>>> On Thu, Nov 24, 2022 at 8:55 PM Mick Semb Wever >>>>>>> <m...@apache.org<mailto:m...@apache.org>> wrote: >>>>>> Thank you for a solid write up Maxim. And welcome to Cassandra, it's >>>>>> very positive to see you here. >>>>>> >>>>>> I whole-heartedly agree with nearly everything you write. Some input >>>>>> and questions inline. >>>>>> >>>>>> >>>>>>> >>>>>>> As you may know, the infrastructure team has disabled public sign-up >>>>>>> to ASF JIRA (the GitHub issues are recommended instead). >>>>>>> >>>>>> >>>>>> >>>>>> I suspect (based on chatter in general, but not on dev@ AFAIK) is to >>>>>> avoid GH issues and stick with jira. The sign-up hurdle we will >>>>>> document on the website: CASSANDRA-18064 >>>>>> >>>>>> >>>>>> >>>>>>> == 1. Make the checkstyle config a single point of truth for the >>>>>>> source code style. == >>>>>>> >>>>>>> The checkstyle is already used and included in the Cassandra project >>>>>>> build lifecycle (ant command line, Jenkins, CircleCI). There is no >>>>>>> need to maintain code style configurations for different types of IDEs >>>>>>> (e.g. IntelliJ inspections configuration) since the checkstyle.xml >>>>>>> file can be directly imported to IDE used by a developer. This is fair >>>>>>> for Intellij Idea, NetBeans, and Eclipse. >>>>>> >>>>>> >>>>>> Big +1 >>>>>> >>>>>> >>>>>>> So, I propose to focus on the checks themselves and checking pull >>>>>>> requests with automation scripts, rather than maintaining these >>>>>>> integrations. The benefits here are avoiding all issues with >>>>>>> maintaining configurations for different IDEs. Another good advantage >>>>>>> of this approach would be the ability to add new checkstyle rules >>>>>>> without touching IDE configuration - and such tickets will be LFH and >>>>>>> easy to commit. >>>>>>> >>>>>>> The actions points here are: >>>>>>> >>>>>>> - create an umbrella JIRA ticket for all checkstyle issues e.g. [8] >>>>>>> (or label checkstyle); >>>>>>> - add checkstyle to GitHub pull requests using GitHub actions (execute >>>>>>> ant command); >>>>>> >>>>>> >>>>>> Instead of custom GHA scripting, please use our existing >>>>>> cassandra-artifact.sh (which should already include all such checks). >>>>>> >>>>>> Something like >>>>>> https://github.com/apache/cassandra/compare/cassandra-3.11...thelastpickle:cassandra:mck/github-actions/3.11 >>>>>> >>>>>> >>>>>> >>>>>>> == 3. Enable pushing backwards build results for both Jenkins and >>>>>>> CircleCI to GitHub pull requests. == >>>>>>> >>>>>>> The goal here is to have a "green checkbox" for those GitHub pull >>>>>>> requests that have corresponding Jenkins/CircleCI runs. According to >>>>>>> the following links, it is completely possible to have those. >>>>>>> >>>>>>> https://github.com/jenkinsci/github-branch-source-plugin >>>>>>> https://circleci.com/docs/enable-checks/ >>>>>>> >>>>>>> Moreover, the GitHub Branch Source Plugin is already enabled for the >>>>>>> Cassandra project [16]. The same seems should work the same way for >>>>>>> CirleCI, but I have faced the infrastructure team comment [17] that >>>>>>> describes admin permissions are required for that, so the question is >>>>>>> still open here. I could dig a bit more once we've agreed on it. >>>>>>> >>>>>>> The actions points here are: >>>>>>> - enable Jenkins integration for GitHub pull requests; >>>>>>> - enable CircleCI integration for GitHub pull requests; >>>>>> >>>>>> >>>>>> Some folk use CircleCI, some use ci-cassandra. The green checkbox idea >>>>>> is great, so long as it's optional. We don't want PRs triggering the >>>>>> runs either (there are other mechanisms for triggering for now). >>>>>> >>>>>> >>>>>>> The actions points here are: >>>>>>> >>>>>>> - initiate a wide survey for user and dev lists, to get to know about >>>>>>> the usages; >>>>>>> - remove those configurations that are not used anymore; >>>>>>> - force migration from Ant to Gradle/Maven; >>>>>> >>>>>> >>>>>> Let's leave this out for now. There's too many unknowns here. If >>>>>> there's an IDE configuration that's broken, no one has reported it for >>>>>> ages, and no one is around to fix it, then I say we should raise the >>>>>> discussion to remove it. >>>>>> >>>>>> The Gradle/Maven migration is a hot one, contribute to that discussion >>>>>> but let's not tangle this work up with it, IMHO. >>>>>> >>>>>> Totally agree that IDE project files should be as light weight as >>>>>> possible. >>>>>> >>>>>> >>>>>>> Summarizing everything proposed above I think it is possible to >>>>>>> simplify adding small contributions easier to the codebase as well as >>>>>>> save a bunch of committer's time. >>>>>>> >>>>>>> So, >>>>>>> WDYT about the things described above? >>>>>>> Should I create a CEP for this? >>>>>> >>>>>> >>>>>> I see no need for a CEP here. An epic and tickets will work. >>>>>> Again, thanks for the input Maxim! >>>> >> >> >>