So much awesome here.  Big +1 to having checkstyle be the source of truth. 

On 2022/11/24 17:10:28 Maxim Muzafarov wrote:
> Hello everyone,
> 
> 
> First of all, thank you all for this awesome project which I have
> often been inspired by. My name is Maxim Muzafarov I'm a Committer and
> PMC of Apache Ignite hence you most likely don't know me as I come
> from another part of the ASF. Perhaps, I did almost the same things
> with B-Trees, off-heap memory management, rebalancing, checkpointing,
> snapshotting, and IEPs (you are calling it CEPs) but on a slightly
> different distributed database architecture.
> 
> Nevertheless,
> 
> I was chasing down for my first issue to get experience with Cassandra
> and found a bunch of opened JIRAs related to the source code analysis
> (static analysis as well as the code style). These issues still appear
> in JIRA from time to time [1][2][3][4]. It seems to me there not
> enough attention has been paid to this topic and all possible options
> for this analysis and code style haven't been widely discussed before.
> I'd like to summarize everything that I have found and offer my skills
> and my experience for solving some of such issues we'll agree on.
> 
> 
> = Motivation =
> 
> The goal is to make small contributions easier and safer to apply with
> GitHub PRs for both a contributor and a committer by adding automation
> code checks for each new Cassandra contribution. This also will help
> to reduce the time required for reviewing and applying such PRs by an
> experienced developer.
> 
> As you may know, the infrastructure team has disabled public sign-up
> to ASF JIRA (the GitHub issues are recommended instead). Thus the
> following things become more important if we are still interested in
> attracting new contributions as it was discussed earlier [6].
> 
> I do not want to add extra steps to the existing workflow with code
> review or make GitHub pull requests as default for patches as it also
> was discussed already [7], just to improve the automation checks in it
> and make checks more convenient.
> 
> 
> = Proposed Solution =
> 
> == 1. Make the checkstyle config a single point of truth for the
> source code style. ==
> 
> The checkstyle is already used and included in the Cassandra project
> build lifecycle (ant command line, Jenkins, CircleCI). There is no
> need to maintain code style configurations for different types of IDEs
> (e.g. IntelliJ inspections configuration) since the checkstyle.xml
> file can be directly imported to IDE used by a developer. This is fair
> for Intellij Idea, NetBeans, and Eclipse.
> 
> So, I propose to focus on the checks themselves and checking pull
> requests with automation scripts, rather than maintaining these
> integrations. The benefits here are avoiding all issues with
> maintaining configurations for different IDEs. Another good advantage
> of this approach would be the ability to add new checkstyle rules
> without touching IDE configuration - and such tickets will be LFH and
> easy to commit.
> 
> The actions points here are:
> 
> - create an umbrella JIRA ticket for all checkstyle issues e.g. [8]
> (or label checkstyle);
> - add checkstyle to GitHub pull requests using GitHub actions (execute
> ant command);
> - include checkstyle to the build (already done);
> - remove redundant code style configurations related to IDEs from the
> source code e.g. [9];
> 
> 
> == 2. Add additional tooling for code analysis to the build and GitHub
> pull requests. ==
> 
> The source code static analysis and automated checks have been
> discussed recently in the "SpotBugs to the build" topic [10]. I'd like
> to add my 50 cents here.
> 
> Before discussing the pros and cons of each solution, let's focus on
> the criteria that such solutions must meet. You can find the most
> complete list of such tooling here [11].
> 
> From my point of view, the crucial criteria are:
> - free for open-source (at least licenses should allow such usages);
> - popularity in the ASF ecosystems;
> - convenient integration and/or plugins for IDEs and GitHub;
> - we must be able to integrate with CirleCI, and Jenkins as well as
> launch from a command line;
> 
> 
> === Sonar ===
> 
> pros
> + this tool is free for open-source and recommended by the ASF
> infrastructure team [12];
> + was already used for the Cassandra project some time ago at
> sonarcloud.io [13];
> + it has GitHub pull requests analysis [14];
> 
> cons
> - run locally requires additional configuration and TOKEN_ID due to
> the analysis results stored in the ext database (infra will not
> provide it for local usage);
> 
> === SpotBugs (FindBugs) ===
> 
> pros
> + license is allowed to use it and run it as a library (should be legal for 
> us);
> + it analyses the bytecode that differs from how the checkstyle works;
> + can be executed from the command line as well as integrated into the build;
> 
> cons
> - requires compiled source code;
> 
> === PMD ===
> 
> pros
> + BSD licenses more permissive than LGPL (SpotBugs);
> + analyses the source code like the checkstyle does;
> + have an extended rule sets for source code analyses;
> 
> cons
> - the checkstyle is already used in the project, and should be enough for now;
> 
> === IntelliJ IDEA ===
> 
> pros
> + free for open-source and can be used from the command line [15];
> + local checks are performed with the same code style configuration as
> the IDE does during development;
> 
> cons
> - inconvenient for different IDEs (Eclipse, NetBeans);
> 
> 
> As a summary for this section:
> 
> There is no need to limit ourselves to the choice of such tools and
> pick up only one from this list.
> 
> For instance, it seems it will be better to configure sonarcloud.io
> reports on each push commit to the trunk. Sonar already has built-in
> rules (no need to maintain them) and this is easy to add with the ASF
> infra team install guides using GitHub actions or Jenkins. We will be
> able to see immediately the source code trends as well as new good
> candidates for LFH issues to fix.
> 
> In addition to Sonar reports, we can add SpotBugs (PMD is more like an
> 'improved' checkstyle, but for now it is more than enough and there is
> no need to maintain another set of rules) to the build and GitHub pull
> requests with limited configuration for the first time. Thus we will
> be able to add new rules one by one.
> 
> 
> The actions points here are:
> 
> - configure Sonar reports for the trunk (using GA or Jenkins);
> - add SpotBugs to the build;
> - add integration SpotBugs with GitHub pull requests;
> 
> 
> == 3. Enable pushing backwards build results for both Jenkins and
> CircleCI to GitHub pull requests. ==
> 
> The goal here is to have a "green checkbox" for those GitHub pull
> requests that have corresponding Jenkins/CircleCI runs. According to
> the following links, it is completely possible to have those.
> 
> https://github.com/jenkinsci/github-branch-source-plugin
> https://circleci.com/docs/enable-checks/
> 
> Moreover, the GitHub Branch Source Plugin is already enabled for the
> Cassandra project [16]. The same seems should work the same way for
> CirleCI, but I have faced the infrastructure team comment [17] that
> describes admin permissions are required for that, so the question is
> still open here. I could dig a bit more once we've agreed on it.
> 
> The actions points here are:
> - enable Jenkins integration for GitHub pull requests;
> - enable CircleCI integration for GitHub pull requests;
> 
> 
> == 4. Review IDE configs that are not used by the community. ==
> 
> Sorry for bumping up such a holy topic with IDEs that are being used
> by developers, so please do not judge me strictly against it. My main
> point here is "IDE should serve the source code" rather than "the
> source code adapts to IDE", thus it will probably become easy to use
> the project sources in an IDE once the migration from Ant to
> Maven/Gradle occurs [5], but it is still possible to make things a bit
> simpler today.
> 
> Some of the configuration IDE scripts still consume some time for
> their maintenance and require to be tested on each new release occurs.
> These files must be always up to date if a configuration file
> structure changed in a new IDE version and this is inconvenient. I
> don't propose removing such scripts, just a review of what IDEs are
> exactly using.
> 
> For instance,
> 
> Cassandra + Eclipse doesn't seem to appear so often on StackOverflow.
> It could mean only two things: everything is working fine or no one is
> using it (hope the first one).
> https://stackoverflow.com/questions/tagged/cassandra%20eclipse?sort=Newest
> 
> Cassandra + NetBeans appear even less frequently and the same questions occur.
> https://stackoverflow.com/questions/tagged/cassandra%20netbeans?sort=Newest
> 
> The actions points here are:
> 
> - initiate a wide survey for user and dev lists, to get to know about
> the usages;
> - remove those configurations that are not used anymore;
> - force migration from Ant to Gradle/Maven;
> 
> 
> 
> Summarizing everything proposed above I think it is possible to
> simplify adding small contributions easier to the codebase as well as
> save a bunch of committer's time.
> 
> So,
> WDYT about the things described above?
> Should I create a CEP for this?
> 
> 
> 
> [1] https://issues.apache.org/jira/browse/CASSANDRA-9209 (Add static
> analysis to report any AutoCloseable objects)
> [2] https://issues.apache.org/jira/browse/CASSANDRA-17400 (Fail build
> or warn when closeable reference is not closed in tests)
> [3] https://issues.apache.org/jira/browse/CASSANDRA-9919 (Cleanup
> closeable iterator usage)
> [4] https://issues.apache.org/jira/browse/CASSANDRA-17925 (Java source
> code should have sorted imports as defined in the codestyle)
> [5] https://issues.apache.org/jira/browse/CASSANDRA-17015 (Migrate
> build from Ant to Maven/Gradle)
> [6] https://lists.apache.org/thread/1t3hr1pylhpgc5c9d2382xtnw3wgkg06
> (Attracting new contributors)
> [7] https://lists.apache.org/thread/b3nowtrlpjwr46vftsb3ob38xqnrnxl1
> (Switch to using GitHub pull requests?)
> [8] https://issues.apache.org/jira/browse/CASSANDRA-17925 (Java source
> code should have sorted imports as defined in the codestyle)
> [9] https://github.com/apache/cassandra/tree/trunk/ide/idea/inspectionProfiles
> [10] https://lists.apache.org/thread/1ro1mvkpvt4vr24nw7dbpdlxo82mq3hz
> (Add SpotBugs to the build)
> [11] https://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
> [12] 
> https://cwiki.apache.org/confluence/display/INFRA/SonarCloud+for+ASF+projects
> [13] https://sonarcloud.io/summary/overall?id=cassandra_180410
> [14] https://docs.sonarqube.org/latest/analysis/pull-request/
> [15] https://www.jetbrains.com/help/idea/command-line-code-inspector.html
> [16] https://cwiki.apache.org/confluence/display/INFRA/ci-cassandra.apache.org
> [17] 
> https://issues.apache.org/jira/browse/INFRA-22367?focusedCommentId=17421230
> 

Reply via email to