So much awesome here. Big +1 to having checkstyle be the source of truth.
On 2022/11/24 17:10:28 Maxim Muzafarov wrote: > Hello everyone, > > > First of all, thank you all for this awesome project which I have > often been inspired by. My name is Maxim Muzafarov I'm a Committer and > PMC of Apache Ignite hence you most likely don't know me as I come > from another part of the ASF. Perhaps, I did almost the same things > with B-Trees, off-heap memory management, rebalancing, checkpointing, > snapshotting, and IEPs (you are calling it CEPs) but on a slightly > different distributed database architecture. > > Nevertheless, > > I was chasing down for my first issue to get experience with Cassandra > and found a bunch of opened JIRAs related to the source code analysis > (static analysis as well as the code style). These issues still appear > in JIRA from time to time [1][2][3][4]. It seems to me there not > enough attention has been paid to this topic and all possible options > for this analysis and code style haven't been widely discussed before. > I'd like to summarize everything that I have found and offer my skills > and my experience for solving some of such issues we'll agree on. > > > = Motivation = > > The goal is to make small contributions easier and safer to apply with > GitHub PRs for both a contributor and a committer by adding automation > code checks for each new Cassandra contribution. This also will help > to reduce the time required for reviewing and applying such PRs by an > experienced developer. > > As you may know, the infrastructure team has disabled public sign-up > to ASF JIRA (the GitHub issues are recommended instead). Thus the > following things become more important if we are still interested in > attracting new contributions as it was discussed earlier [6]. > > I do not want to add extra steps to the existing workflow with code > review or make GitHub pull requests as default for patches as it also > was discussed already [7], just to improve the automation checks in it > and make checks more convenient. > > > = Proposed Solution = > > == 1. Make the checkstyle config a single point of truth for the > source code style. == > > The checkstyle is already used and included in the Cassandra project > build lifecycle (ant command line, Jenkins, CircleCI). There is no > need to maintain code style configurations for different types of IDEs > (e.g. IntelliJ inspections configuration) since the checkstyle.xml > file can be directly imported to IDE used by a developer. This is fair > for Intellij Idea, NetBeans, and Eclipse. > > So, I propose to focus on the checks themselves and checking pull > requests with automation scripts, rather than maintaining these > integrations. The benefits here are avoiding all issues with > maintaining configurations for different IDEs. Another good advantage > of this approach would be the ability to add new checkstyle rules > without touching IDE configuration - and such tickets will be LFH and > easy to commit. > > The actions points here are: > > - create an umbrella JIRA ticket for all checkstyle issues e.g. [8] > (or label checkstyle); > - add checkstyle to GitHub pull requests using GitHub actions (execute > ant command); > - include checkstyle to the build (already done); > - remove redundant code style configurations related to IDEs from the > source code e.g. [9]; > > > == 2. Add additional tooling for code analysis to the build and GitHub > pull requests. == > > The source code static analysis and automated checks have been > discussed recently in the "SpotBugs to the build" topic [10]. I'd like > to add my 50 cents here. > > Before discussing the pros and cons of each solution, let's focus on > the criteria that such solutions must meet. You can find the most > complete list of such tooling here [11]. > > From my point of view, the crucial criteria are: > - free for open-source (at least licenses should allow such usages); > - popularity in the ASF ecosystems; > - convenient integration and/or plugins for IDEs and GitHub; > - we must be able to integrate with CirleCI, and Jenkins as well as > launch from a command line; > > > === Sonar === > > pros > + this tool is free for open-source and recommended by the ASF > infrastructure team [12]; > + was already used for the Cassandra project some time ago at > sonarcloud.io [13]; > + it has GitHub pull requests analysis [14]; > > cons > - run locally requires additional configuration and TOKEN_ID due to > the analysis results stored in the ext database (infra will not > provide it for local usage); > > === SpotBugs (FindBugs) === > > pros > + license is allowed to use it and run it as a library (should be legal for > us); > + it analyses the bytecode that differs from how the checkstyle works; > + can be executed from the command line as well as integrated into the build; > > cons > - requires compiled source code; > > === PMD === > > pros > + BSD licenses more permissive than LGPL (SpotBugs); > + analyses the source code like the checkstyle does; > + have an extended rule sets for source code analyses; > > cons > - the checkstyle is already used in the project, and should be enough for now; > > === IntelliJ IDEA === > > pros > + free for open-source and can be used from the command line [15]; > + local checks are performed with the same code style configuration as > the IDE does during development; > > cons > - inconvenient for different IDEs (Eclipse, NetBeans); > > > As a summary for this section: > > There is no need to limit ourselves to the choice of such tools and > pick up only one from this list. > > For instance, it seems it will be better to configure sonarcloud.io > reports on each push commit to the trunk. Sonar already has built-in > rules (no need to maintain them) and this is easy to add with the ASF > infra team install guides using GitHub actions or Jenkins. We will be > able to see immediately the source code trends as well as new good > candidates for LFH issues to fix. > > In addition to Sonar reports, we can add SpotBugs (PMD is more like an > 'improved' checkstyle, but for now it is more than enough and there is > no need to maintain another set of rules) to the build and GitHub pull > requests with limited configuration for the first time. Thus we will > be able to add new rules one by one. > > > The actions points here are: > > - configure Sonar reports for the trunk (using GA or Jenkins); > - add SpotBugs to the build; > - add integration SpotBugs with GitHub pull requests; > > > == 3. Enable pushing backwards build results for both Jenkins and > CircleCI to GitHub pull requests. == > > The goal here is to have a "green checkbox" for those GitHub pull > requests that have corresponding Jenkins/CircleCI runs. According to > the following links, it is completely possible to have those. > > https://github.com/jenkinsci/github-branch-source-plugin > https://circleci.com/docs/enable-checks/ > > Moreover, the GitHub Branch Source Plugin is already enabled for the > Cassandra project [16]. The same seems should work the same way for > CirleCI, but I have faced the infrastructure team comment [17] that > describes admin permissions are required for that, so the question is > still open here. I could dig a bit more once we've agreed on it. > > The actions points here are: > - enable Jenkins integration for GitHub pull requests; > - enable CircleCI integration for GitHub pull requests; > > > == 4. Review IDE configs that are not used by the community. == > > Sorry for bumping up such a holy topic with IDEs that are being used > by developers, so please do not judge me strictly against it. My main > point here is "IDE should serve the source code" rather than "the > source code adapts to IDE", thus it will probably become easy to use > the project sources in an IDE once the migration from Ant to > Maven/Gradle occurs [5], but it is still possible to make things a bit > simpler today. > > Some of the configuration IDE scripts still consume some time for > their maintenance and require to be tested on each new release occurs. > These files must be always up to date if a configuration file > structure changed in a new IDE version and this is inconvenient. I > don't propose removing such scripts, just a review of what IDEs are > exactly using. > > For instance, > > Cassandra + Eclipse doesn't seem to appear so often on StackOverflow. > It could mean only two things: everything is working fine or no one is > using it (hope the first one). > https://stackoverflow.com/questions/tagged/cassandra%20eclipse?sort=Newest > > Cassandra + NetBeans appear even less frequently and the same questions occur. > https://stackoverflow.com/questions/tagged/cassandra%20netbeans?sort=Newest > > The actions points here are: > > - initiate a wide survey for user and dev lists, to get to know about > the usages; > - remove those configurations that are not used anymore; > - force migration from Ant to Gradle/Maven; > > > > Summarizing everything proposed above I think it is possible to > simplify adding small contributions easier to the codebase as well as > save a bunch of committer's time. > > So, > WDYT about the things described above? > Should I create a CEP for this? > > > > [1] https://issues.apache.org/jira/browse/CASSANDRA-9209 (Add static > analysis to report any AutoCloseable objects) > [2] https://issues.apache.org/jira/browse/CASSANDRA-17400 (Fail build > or warn when closeable reference is not closed in tests) > [3] https://issues.apache.org/jira/browse/CASSANDRA-9919 (Cleanup > closeable iterator usage) > [4] https://issues.apache.org/jira/browse/CASSANDRA-17925 (Java source > code should have sorted imports as defined in the codestyle) > [5] https://issues.apache.org/jira/browse/CASSANDRA-17015 (Migrate > build from Ant to Maven/Gradle) > [6] https://lists.apache.org/thread/1t3hr1pylhpgc5c9d2382xtnw3wgkg06 > (Attracting new contributors) > [7] https://lists.apache.org/thread/b3nowtrlpjwr46vftsb3ob38xqnrnxl1 > (Switch to using GitHub pull requests?) > [8] https://issues.apache.org/jira/browse/CASSANDRA-17925 (Java source > code should have sorted imports as defined in the codestyle) > [9] https://github.com/apache/cassandra/tree/trunk/ide/idea/inspectionProfiles > [10] https://lists.apache.org/thread/1ro1mvkpvt4vr24nw7dbpdlxo82mq3hz > (Add SpotBugs to the build) > [11] https://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis > [12] > https://cwiki.apache.org/confluence/display/INFRA/SonarCloud+for+ASF+projects > [13] https://sonarcloud.io/summary/overall?id=cassandra_180410 > [14] https://docs.sonarqube.org/latest/analysis/pull-request/ > [15] https://www.jetbrains.com/help/idea/command-line-code-inspector.html > [16] https://cwiki.apache.org/confluence/display/INFRA/ci-cassandra.apache.org > [17] > https://issues.apache.org/jira/browse/INFRA-22367?focusedCommentId=17421230 >