I couldn't agree more. I'm a fan of tools that check formatting style, but
only if the tools can also apply the style. Having a tool to tell humans to
manually fix style issues is annoying at best. This has been a pain point
for me as I start to work with the Java implementation.

In Rust, we just need to remember to run "cargo fmt" before pushing changes
so it isn't so bad.

In my current day job we use Spotless [1] to format Java and Scala code
automatically and I would recommend considering it for the Java
implementation of Arrow.

Thanks,

Andy.

[1] https://github.com/diffplug/spotless

On Wed, Feb 19, 2020 at 10:59 AM Neal Richardson <
[email protected]> wrote:

> Hi all,
> Having a build fail because of a lint check or similar style requirement is
> frustrating. It adds time to the process of getting a patch finished and
> merged because you have to check back to see the build failed, switch
> contexts from whatever you'd been working on since, and go fix it. And as a
> new or infrequent contributor, it's particularly challenging because you
> have to figure out which additional dependencies (and specific versions, as
> in the case of clang-format) to install and how to run this extra tool on
> your machine.
>
> It doesn't have to be this way. With GitHub Actions, we can run workflows
> that fix style and other violations and push the fix in a commit back to
> the branch. To demonstrate this, I have a PR (
> https://github.com/apache/arrow/pull/6411) that updates the generated R
> man
> pages whenever there's a change to the inline docstrings. The existing CI
> for R package checks fail if there's a mismatch between code and docs, and
> it's all too easy to update the inline docs and forget to regenerate the
> man pages from them. I could envision doing the same kind of GHA workflow
> for the C++ linting--the current CI job already prints the diff that should
> be applied to fix the lint failures, so why not just fix it?--and even to
> implement Python black, if we decided we wanted to do that.
>
> On the PR discussion, it seems that there's some interest in doing this but
> also some concern. Some thoughts:
>
> * It only runs on your fork (apache/arrow is excluded) and it does not run
> on master
> * I've added to the commit message "[automated commit]" so that it's clear
> that you personally didn't add the commit.
> * If anyone were strongly opposed, we could add their fork to the list of
> excluded repositories.
>
> It was also pointed out to me that we have precommit hooks that we could
> extend to do this kind of work instead of GitHub Actions. I tried them out
> and found them to be largely undocumented and requiring additional
> out-of-band environment setup; ultimately I disabled them because I kept
> getting a popup telling me to go to some website and get another JDK, all
> just when editing a .R file in the r/ directory. If I as a regular
> contributor found them burdensome to set up, I'm not going to recommend
> them to a new contributor.
>
> Style guides and linting are important for large projects like Arrow, but
> we don't want to add unnecessary friction to the dev process, particularly
> for new contributors--it's challenging enough without it. From my
> experience, the best way to enforce standards and processes is to automate
> them so that contributors don't need to think about doing the right thing
> because the right thing is the easy thing.
>
> What are your thoughts? If anyone objects to using GitHub Actions in this
> way, would you be satisfied with blacklisting your fork (i.e. you don't
> want it running on your branches but you don't mind if others do)?
>
> Thanks,
> Neal
>

Reply via email to