Note that we attempted the same thing with SpotBugs in the past, but no one
actually enabled more checks or fixed issues.

This isn't an argument against trying it again, but to anyone who wants to introduce static checks again: please be aware that you may end up pretty much on your own with regard to enforcing/fixing/extending the static checks.

On 14.06.2018 12:20, Joern Kottmann wrote:
Hello,

we introduced static code analysis two years back at Apache OpenNLP. To
make this work with our (also not perfect) code base, we added the
checker to the build so that violations fail the build, then we disabled
all checks which caused failures (or fixed them). Afterwards we worked on
resolving the remaining issues and enabled more and more checks over time.
The advantage of that approach is that once a check is enabled, no new
commits that violate it can be added.
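
Jörn describes disabling whole checks in the build configuration; a complementary mechanism for individual findings is SpotBugs' @SuppressFBWarnings annotation. A minimal, hypothetical sketch (the class and justification are invented for illustration; assumes spotbugs-annotations is on the classpath):

```java
import edu.umd.cs.findbugs.annotations.SuppressFBWarnings;

public class LegacyCache {

    private byte[] buffer = new byte[0];

    // Silence one known finding at this site while the check itself
    // stays enabled for the rest of the code base.
    @SuppressFBWarnings(
            value = "EI_EXPOSE_REP",
            justification = "Known issue, tracked in a follow-up ticket.")
    public byte[] getRawBuffer() {
        return buffer;
    }
}
```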

A more modern approach to this problem these days is to start by looking
at check failures only on the lines changed by a commit, see Infer [1];
this can be done as part of a code review.

Jörn

[1] https://github.com/facebook/infer

On Thu, Jun 14, 2018 at 10:23 AM, Till Rohrmann <trohrm...@apache.org> wrote:
Looking at some of the reported bugs, these are all valid problems. For
example, the RowTypeInfo does override hashCode but not equals. And the
RocksDBKeyedStateBackend tries to close streams which might be null in case
of an exception. I agree that we won't fix all the problems at once and that
it will more likely be an iterative process. But the later we start with this,
the harder it is going to get, up to the point where we won't introduce it
at all. My worry is that it will end like the discussion about a
consistent coding style, which in my opinion we should have introduced but
never did.
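
For illustration, minimal sketches of the two kinds of fixes such findings usually call for (hypothetical classes, not the actual RowTypeInfo or RocksDBKeyedStateBackend code):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Objects;

class TypeDescriptor {
    private final String name;

    TypeDescriptor(String name) {
        this.name = name;
    }

    // A class that overrides hashCode() must override equals()
    // consistently, so that equal objects share the same hash code.
    @Override
    public boolean equals(Object obj) {
        return obj instanceof TypeDescriptor
                && Objects.equals(name, ((TypeDescriptor) obj).name);
    }

    @Override
    public int hashCode() {
        return Objects.hashCode(name);
    }
}

class StreamUser {
    void copyResource(Path path) throws IOException {
        InputStream in = null;
        try {
            in = Files.newInputStream(path); // may throw before 'in' is set
            // ... use the stream ...
        } finally {
            if (in != null) { // null check avoids a secondary NPE on failure
                in.close();
            }
        }
    }
}
```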

Ideally we would run this with every PR and check that we don't introduce
more bugs. But I also see the benefit of having this run concurrently with
the builds so that we can check what the current state is and take action if
necessary. I also see the benefit of raising awareness of these things and
educating people. This goes hand in hand with activating more
IntelliJ inspections to avoid many slips of the pen.

Cheers,
Till

On Wed, Jun 13, 2018 at 3:15 PM Chesnay Schepler <ches...@apache.org> wrote:

We will need a significant number of exclusions/suppressions.

For example, 300 of the 511 vulnerabilities are caused by our TupleX
classes, with public mutable fields.
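
For reference, the pattern Sonar objects to looks roughly like this (a toy sketch, not the actual Tuple2):

```java
// Public mutable fields, as on the TupleX classes: Sonar reports each
// one as a vulnerability, even though the design is deliberate here.
public class Pair<A, B> {
    public A first;
    public B second;
}
```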
The "unclosed resource" inspection is rather simplistic and only tracks
the life-cycle in the scope of the method.
As a result it fails for any resource factory (e.g. any method that
returns a stream), or resources stored in fields like one does for
frequently in wrappers.
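
A hypothetical sketch of the two shapes that trip up a method-scoped inspection (names invented for illustration):

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

class ResourceFactory {
    // Factory: the stream intentionally leaves this method open; the
    // caller must close it, but a method-scoped inspection flags it
    // as an unclosed resource.
    static InputStream openInput(Path path) throws IOException {
        return Files.newInputStream(path);
    }
}

// Wrapper: the resource lives in a field and is closed in close(),
// which a per-method life-cycle analysis cannot see either.
class CountingInputStream implements AutoCloseable {
    private final InputStream delegate;

    CountingInputStream(InputStream delegate) {
        this.delegate = delegate;
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }
}
```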

On 13.06.2018 11:56, Piotr Nowojski wrote:
Hi,

Generally speaking I would be in favour of adding some reasonable static
code analysis; however, this requires a lot of effort and cannot be done all
at once. Also, if such checks fire incorrectly too often, it would be
annoying to suppress such warnings/errors manually.
> I don't really see a benefit in enabling it /continuously/.
On the other hand, I do not see the point of not running it as part of
Travis CI and the non-fast mvn build (`mvn clean install -DskipTests` should
perform such static checks). If such rules are not enforced by a tool, then
there is really no point in talking about them - they will very quickly be
ignored and abandoned.
Piotrek

On 13 Jun 2018, at 10:38, Chesnay Schepler <ches...@apache.org> wrote:

I don't really see a benefit in enabling it /continuously/.
This wouldn't be part of the build or CI processes, as we can't fail
the builds since it happens too often that issues are improperly
categorized.
Wading through these lists is time-consuming and I very much doubt that
we will do so with a high or even regular frequency.
We would always require 1-2 committers to commit to this process.
Thus there's no benefit in running SonarQube beyond these irregular
checks, as we'd just be wasting processing time.
I suggest keeping it a manual process.

On 13.06.2018 09:35, Till Rohrmann wrote:
Hi Alex,

thanks for bringing this topic up. So far the Flink project does not use a
static code analysis tool, but I think it could strongly benefit from one
(simply judging by the reported bugs). There was a previous discussion
about enabling the ASF SonarQube integration for Flink [1], but it never
became reality. There is also an integration for Travis which might be
interesting to look into [2]. I would be in favour of enabling this.

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Add-Sonarqube-analysis-td14556.html
[2] https://docs.travis-ci.com/user/sonarcloud/

Cheers,
Till

On Tue, Jun 12, 2018 at 11:12 PM Ted Yu <yuzhih...@gmail.com> wrote:

I took a look at some of the blocker defects, e.g.

https://sonarcloud.io/project/issues?id=org.apache.flink%3Aflink-parent&open=AWPxETxA3e-qcckj1Sl1&resolved=false&severities=BLOCKER&types=BUG
For
./flink-state-backends/flink-statebackend-rocksdb/src/main/java/org/apache/flink/contrib/streaming/state/PredefinedOptions.java,
the closing of DBOptions using try-with-resources is categorized as a
blocker by the analysis.

I don't think that categorization is proper.
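
Presumably this is the usual "use try-with-resources or close this resource" rule; a minimal sketch of the pattern it asks for (hypothetical code, assuming RocksDB's org.rocksdb.DBOptions, not the actual PredefinedOptions contents):

```java
import org.rocksdb.DBOptions;

class DbOptionsSketch {
    void configure() {
        // DBOptions wraps a native handle and is AutoCloseable, so the
        // rule wants it closed deterministically.
        try (DBOptions options = new DBOptions()) {
            options.setCreateIfMissing(true);
        }
        // A method that hands the options object to its caller must not
        // close it here, which is why a blanket "blocker" severity for
        // this rule can be misleading.
    }
}
```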

We can locate the high-priority defects, according to consensus, and fix
those.

Cheers

On Tue, Jun 12, 2018 at 2:01 PM, <simeon.arkhi...@gmail.com> wrote:

Hello Flink community.

I am new to the Flink project and probably don't understand a lot of it yet.
Could you please clarify one question for me?

I downloaded the Flink sources and built them from scratch. I found the
Checkstyle guidelines that every Flink developer should follow, which is
very useful. However, I didn't find anything about static analysis tools
like SonarQube. I have looked through the mailing list archive, but without
success. That seemed very strange to me.

I have set up SonarQube and run an analysis on the whole Flink project.
After a while I got 442 bugs, 511 vulnerabilities and more than 13K code
smell issues. You can see them all here:
https://sonarcloud.io/dashboard?id=org.apache.flink%3Aflink-parent

I looked through some of the bugs and vulnerabilities, and there are many
important ones (in my opinion), like these:
- 'other' is dereferenced. A "NullPointerException" could be thrown; "other" is nullable here.
- Either re-interrupt this method or rethrow the "InterruptedException" (see the sketch after this list).
- Move this call to "wait()" into a synchronized block to be sure the monitor on "Object" is held.
- Refactor this code so that the Iterator supports multiple traversal.
- Use try-with-resources or close this "JsonGenerator" in a "finally" clause.
- Cast one of the operands of this subtraction operation to a "long".
- Make "ZERO_CALENDAR" an instance variable.
- Add a "NoSuchElementException" for iteration beyond the end of the collection.
- Replace the call to "Thread.sleep(...)" with a call to "wait(...)".
- Call "Optional#isPresent()" before accessing the value.
- Change this condition so that it does not always evaluate to "false".
- This class overrides "equals()" and should therefore also override "hashCode()".
- "equals(Object obj)" should test argument type.
- Not enough arguments in LOG.debug function.
- Remove this return statement from this finally block.
- "notify" may not wake up the appropriate thread.
- Remove the boxing to "Double".
- Classes should not be compared by name.
- "buffers" is a method parameter, and should not be used for synchronization.
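
As an example of the interruption-related findings above, the usual fix is to restore the interrupt flag (a generic sketch, not Flink code):

```java
class InterruptExample {
    private final Object lock = new Object();

    void awaitQuietly(long millis) {
        synchronized (lock) { // wait() requires holding the monitor
            try {
                lock.wait(millis);
            } catch (InterruptedException e) {
                // Re-interrupt so callers further up the stack can still
                // observe that the thread was interrupted.
                Thread.currentThread().interrupt();
            }
        }
    }
}
```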

Are there any plans to work on static analysis support for the Flink
project, or was it intentionally agreed not to use static analysis because
it is considered time-consuming and not worthwhile?

Thank you in advance for your replies.

Best Regards,
---
Alex Arkhipov



