I personally wouldn't mind digging a bit through the code and helping with fixing things. If there are checks people feel are particularly important, let me know; I'm happy to send a PR.
Jörn

On Thu, Jun 14, 2018 at 12:45 PM, Chesnay Schepler <ches...@apache.org> wrote:
> Note that we attempted the same thing with spotbugs in the past, but no one
> actually enabled more checks and fixed issues.
>
> This isn't an argument against trying it again, but to anyone who wants to
> introduce static checks again: please be aware that you may end up pretty
> much on your own in regards to enforcing/fixing/extending static checks.
>
> On 14.06.2018 12:20, Joern Kottmann wrote:
>> Hello,
>>
>> we introduced static code analysis two years back at Apache OpenNLP. To
>> make this work with our also-imperfect code base, we added the checker to
>> the build so that the build would fail, then disabled all checks that
>> caused failures (or fixed them). Afterwards we worked on resolving the
>> issues and enabled more and more checks over time.
>> The advantage of that approach is that once a check is enabled, no new
>> commits can be added that violate it.
>>
>> A more modern approach to this problem these days is to look at check
>> failures only on the lines changed by a commit, see infer [1]; this can be
>> done as part of a code review.
>>
>> Jörn
>>
>> [1] https://github.com/facebook/infer
>>
>> On Thu, Jun 14, 2018 at 10:23 AM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>> Looking at some of the bugs, these are all valid problems. For example,
>>> the RowTypeInfo does override hashCode but not equals. Or the
>>> RocksDBKeyedStateBackend tries to close streams which might be null in
>>> case of an exception. I agree that we won't fix all the problems at once,
>>> and it will more likely be an iterative process. And the later we start
>>> with this, the harder it's going to get, up to the point that we won't
>>> introduce it anymore. My worry is that it will end like the discussion
>>> about a consistent coding style, which in my opinion we should have
>>> introduced but never did.
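[Editor's note: Till's hashCode-without-equals finding above is a classic contract bug. This is a minimal, hypothetical sketch — not the actual RowTypeInfo code — of why overriding hashCode() while inheriting identity-based equals() misbehaves in hash-based collections:]

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical class illustrating the flagged pattern: hashCode() is
// overridden, but equals() is not.
class RowKey {
    final String[] fields;

    RowKey(String... fields) {
        this.fields = fields;
    }

    @Override
    public int hashCode() {
        // Logically equal keys hash to the same bucket...
        return Objects.hash((Object[]) fields);
    }

    // ...but without an equals() override, collections fall back to
    // Object's identity comparison, so logically equal keys are kept
    // as distinct entries.
}

public class HashCodeEqualsDemo {
    public static void main(String[] args) {
        Set<RowKey> set = new HashSet<>();
        set.add(new RowKey("a", "b"));
        set.add(new RowKey("a", "b"));
        System.out.println(set.size()); // prints 2, not the expected 1
    }
}
```

This is exactly the kind of issue that is cheap to fix once flagged, which supports the "enable checks incrementally" approach described above.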
>>>
>>> Ideally we would run this with every PR and check that we don't introduce
>>> more bugs. But I also see the benefit of having this run concurrently to
>>> the builds, so that we can check what the state is and take action if
>>> necessary. I also see the benefit of raising awareness of these things
>>> and educating people. This goes a little bit hand in hand with activating
>>> more IntelliJ inspections to avoid slips of the pen.
>>>
>>> Cheers,
>>> Till
>>>
>>> On Wed, Jun 13, 2018 at 3:15 PM Chesnay Schepler <ches...@apache.org> wrote:
>>>
>>>> We will need a significant amount of exclusions/suppressions.
>>>>
>>>> For example, 300 of the 511 vulnerabilities are caused by our TupleX
>>>> classes, with public mutable fields.
>>>> The "unclosed resource" inspection is rather simplistic and only tracks
>>>> the life-cycle within the scope of the method.
>>>> As a result, it fails for any resource factory (e.g. any method that
>>>> returns a stream), or for resources stored in fields, as one frequently
>>>> does in wrappers.
>>>>
>>>> On 13.06.2018 11:56, Piotr Nowojski wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Generally speaking, I would be in favour of adding some reasonable
>>>>> static code analysis; however, this requires a lot of effort and cannot
>>>>> be done at once. Also, if such checks fire incorrectly too often, it
>>>>> would be annoying to manually suppress the warnings/errors.
>>>>>
>>>>>> I don't really see a benefit in enabling it /continuously/.
>>>>>
>>>>> On the other hand, I do not see a point in not running it as part of
>>>>> the Travis CI and non-fast mvn builds (`mvn clean install -DskipTests`
>>>>> should perform such static checks). If such rules are not enforced by a
>>>>> tool, then there is really no point in talking about them - they will
>>>>> be very quickly ignored and abandoned.
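[Editor's note: Chesnay's point about the "unclosed resource" inspection failing on factory methods can be illustrated with a small, hypothetical sketch. The factory method below is not Flink code; it just shows why a method-scoped checker reports a false positive:]

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class ResourceFactoryDemo {

    // A simplistic, method-scoped "unclosed resource" checker sees a stream
    // created but never closed here and reports a leak, even though
    // returning it open is the whole point of a factory method: ownership
    // of the resource is transferred to the caller.
    static InputStream openDataStream(byte[] data) {
        return new ByteArrayInputStream(data);
    }

    public static void main(String[] args) throws IOException {
        // The caller closes the resource via try-with-resources, so there
        // is no real leak; the inspection simply cannot track the
        // life-cycle across method boundaries.
        try (InputStream in = openDataStream(new byte[]{1, 2, 3})) {
            System.out.println(in.read()); // prints 1
        }
    }
}
```

The same blind spot applies to resources stored in fields and closed in a wrapper's own close() method, which is why such findings need suppressions rather than fixes.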
>>>>>
>>>>> Piotrek
>>>>>
>>>>>> On 13 Jun 2018, at 10:38, Chesnay Schepler <ches...@apache.org> wrote:
>>>>>>
>>>>>> I don't really see a benefit in enabling it /continuously/.
>>>>>> This wouldn't be part of the build or CI processes, as we can't fail
>>>>>> the builds, since it happens too often that issues are improperly
>>>>>> categorized.
>>>>>>
>>>>>> Wading through these lists is time-consuming, and I very much doubt
>>>>>> that we will do it with a high or even regular frequency.
>>>>>>
>>>>>> We would always require 1-2 committers to commit to this process.
>>>>>> Thus there's no benefit in running sonarcube beyond these irregular
>>>>>> checks, as we'd just be wasting processing time.
>>>>>>
>>>>>> I suggest keeping it a manual process.
>>>>>>
>>>>>> On 13.06.2018 09:35, Till Rohrmann wrote:
>>>>>>>
>>>>>>> Hi Alex,
>>>>>>>
>>>>>>> thanks for bringing this topic up. So far the Flink project does not
>>>>>>> use a static code analysis tool, but I think it can strongly benefit
>>>>>>> from one (simply judging by the reported bugs). There was a previous
>>>>>>> discussion about enabling the ASF Sonarcube integration for Flink [1],
>>>>>>> but it was never put into practice. There is also an integration for
>>>>>>> Travis which might be interesting to look into [2]. I would be in
>>>>>>> favour of enabling this.
>>>>>>>
>>>>>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Add-Sonarqube-analysis-td14556.html
>>>>>>> [2] https://docs.travis-ci.com/user/sonarcloud/
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Tue, Jun 12, 2018 at 11:12 PM Ted Yu <yuzhih...@gmail.com> wrote:
>>>>>>>
>>>>>>>> I took a look at some of the blocker defects, e.g.
>>>>>>>>
>>>>>>>> https://sonarcloud.io/project/issues?id=org.apache.flink%3Aflink-parent&open=AWPxETxA3e-qcckj1Sl1&resolved=false&severities=BLOCKER&types=BUG
>>>>>>>>
>>>>>>>> For
>>>>>>>> ./flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java
>>>>>>>> the closing of DBOptions using try-with-resources is categorized as
>>>>>>>> a blocker by the analysis.
>>>>>>>>
>>>>>>>> I don't think that categorization is proper.
>>>>>>>>
>>>>>>>> We can locate the high-priority defects, according to consensus, and
>>>>>>>> fix those.
>>>>>>>>
>>>>>>>> Cheers
>>>>>>>>
>>>>>>>> On Tue, Jun 12, 2018 at 2:01 PM, <simeon.arkhi...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hello Flink community.
>>>>>>>>>
>>>>>>>>> I am new to the Flink project and probably don't understand a lot of
>>>>>>>>> it yet. Could you please clarify one question for me?
>>>>>>>>>
>>>>>>>>> I downloaded the Flink sources and built them from scratch. I found
>>>>>>>>> the checkstyle guidelines that every Flink developer should follow,
>>>>>>>>> which is very useful. However, I didn't find anything about static
>>>>>>>>> analysis tools like Sonarcube. I have looked through the mailing
>>>>>>>>> list archive, but without success. That seemed very strange to me.
>>>>>>>>>
>>>>>>>>> I have set up Sonarcube and run the analysis on the whole Flink
>>>>>>>>> project. After a while I got 442 bugs, 511 vulnerabilities, and more
>>>>>>>>> than 13K code smell issues. You can see them all here:
>>>>>>>>> https://sonarcloud.io/dashboard?id=org.apache.flink%3Aflink-parent
>>>>>>>>>
>>>>>>>>> I looked through some of the bugs and vulnerabilities and there are
>>>>>>>>> many important ones (in my opinion), like these:
>>>>>>>>> - 'other' is dereferenced.
>>>>>>>>>   A "NullPointerException" could be thrown; "other" is nullable here.
>>>>>>>>> - Either re-interrupt this method or rethrow the "InterruptedException".
>>>>>>>>> - Move this call to "wait()" into a synchronized block to be sure
>>>>>>>>>   the monitor on "Object" is held.
>>>>>>>>> - Refactor this code so that the Iterator supports multiple traversal.
>>>>>>>>> - Use try-with-resources or close this "JsonGenerator" in a "finally"
>>>>>>>>>   clause.
>>>>>>>>> - Cast one of the operands of this subtraction operation to a "long".
>>>>>>>>> - Make "ZERO_CALENDAR" an instance variable.
>>>>>>>>> - Add a "NoSuchElementException" for iteration beyond the end of the
>>>>>>>>>   collection.
>>>>>>>>> - Replace the call to "Thread.sleep(...)" with a call to "wait(...)".
>>>>>>>>> - Call "Optional#isPresent()" before accessing the value.
>>>>>>>>> - Change this condition so that it does not always evaluate to
>>>>>>>>>   "false". Expression is always false.
>>>>>>>>> - This class overrides "equals()" and should therefore also override
>>>>>>>>>   "hashCode()".
>>>>>>>>> - "equals(Object obj)" should test argument type.
>>>>>>>>> - Not enough arguments in LOG.debug function.
>>>>>>>>> - Remove this return statement from this finally block.
>>>>>>>>> - "notify" may not wake up the appropriate thread.
>>>>>>>>> - Remove the boxing to "Double".
>>>>>>>>> - Classes should not be compared by name.
>>>>>>>>> - "buffers" is a method parameter and should not be used for
>>>>>>>>>   synchronization.
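[Editor's note: the "either re-interrupt or rethrow the InterruptedException" item from the list above is worth a quick illustration. This is a generic, hypothetical sketch of the rule, not code taken from Flink:]

```java
public class InterruptDemo {

    // Flagged pattern: catching InterruptedException and doing nothing
    // silently drops the thread's interrupt status, so callers higher up
    // the stack never learn that a shutdown was requested.
    static void sleepSwallowing(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            // interrupt status lost here
        }
    }

    // Fixed pattern: restore the interrupt flag so callers can react.
    static void sleepRestoring(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        // Simulate a pending interrupt: sleep() then throws immediately
        // (clearing the flag), and the catch block restores it.
        Thread.currentThread().interrupt();
        sleepRestoring(10);
        System.out.println(Thread.interrupted()); // prints true (and clears the flag)
    }
}
```

Findings like this one are good candidates for the "high priority by consensus" bucket Ted suggests: the fix is mechanical and the bug is real.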
>>>>>>>>>
>>>>>>>>> Are there any plans to work on static analysis support for the Flink
>>>>>>>>> project, or was it intentionally agreed not to use static analysis
>>>>>>>>> because it is time-consuming and not worthwhile?
>>>>>>>>>
>>>>>>>>> Thank you in advance for your replies.
>>>>>>>>>
>>>>>>>>> Best Regards,
>>>>>>>>> ---
>>>>>>>>> Alex Arkhipov