+1 Thanks for driving this.
On Wed, Dec 16, 2020 at 7:33 PM Chesnay Schepler <ches...@apache.org> wrote: > +1 to set this up ASAP. Now's a good time to (finally) finalize this > effort with a new release cycle having started and christmas/vacations > being around the corner. > > On 12/16/2020 7:20 PM, Aljoscha Krettek wrote: > > Let's try and conclude this discussion! I've prepared a PoC that uses > > Spotless with google-java-format to do the formatting: > > https://github.com/aljoscha/flink/commits/flink-xxx-spotless > > > > To summarize from earlier discussion, the main benefits are: > > - no more worrying about code style, both as reviewer and implementer > > - everyone can configure their IDE to auto-format, there will never > > be unrelated formatting changes > > > > Also, this uses git's blame.ignoreRevsFile to add a file that tells > > git blame/IntelliJ to ignore the refactoring commit. However, you need > > to manually configure your git for that using: > > > > $ git config blame.ignoreRevsFile .git-blame-ignore-revs > > > > You can check out the PoC, look at the code in an IDE, play around > > with blame/annotations. > > > > By the way, the coding style is not configurable, it’s the Google Java > > Style, wich uses spaces for indentation. In an IDE or on github the > > code looks barely different from the previous formatting at a first > > glance. > > > > For IDE setup, I will add a guide in the README but it boils down to: > > > > - install the google-java-format plugin, enable it > > - install the Save Actions plugin, enable "Optimize Imports" and > > "Reformat File" > > > > With this, every time you save, formatting will be applied > > automatically. You will never see formatting complaints from CI > > (except for cases where you break semantical checkstyle rules, such as > > using certain imports that we don't allow.). > > > > What do you think? > > > > Best, > > Aljoscha > > > > On 19.10.20 12:36, Aljoscha Krettek wrote: > >> I don't like checkstyle because it cannot be easily applied from the > >> commandline. I'm happy to learn otherwise, though. And I'd also be > >> very happy about alternative suggestions that can do that. > >> > >> Right now, I think Spotless is the most straightforward tool for > >> that. Also I don't care so much about what style we choose in the > >> end. (If we choose one). My main motivation is that we have a common, > >> strict style that can easily applied via tooling so that we no longer > >> need to comment on coding style in PRs. > >> > >> Aljoscha > >> > >> On 09.10.20 11:11, tison wrote: > >>> +1 on locking on one codestyle and I think that is what current > >>> checkstyle > >>> rules serving. > >>> > >>> For automatic applying part, we suggest developing by IDEA and with > >>> Checkstyle Plugin on IDEA applying checkstyle suggestion is also > >>> automatic. > >>> > >>> One short coming for spotless is that we can hardly adjust rules if the > >>> project has its own issues to overcome. IIRC only several > >>> pre-defined rules > >>> and a clumsy general regex substitution rule can be used. > >>> > >>> FYI my team started with spotless but ended up with checkstyle with few > >>> rules and Checkstyle Plugin for automation. > >>> > >>> Codestyle, in another perspective, is part of cognition of developers > >>> working in project, not something we just apply before pull request. No > >>> matter how much automation introduced, most of developers will converge > >>> working with the configured codestyle. > >>> > >>> Best, > >>> tison. > >>> > >>> > >>> Kostas Kloudas <kklou...@gmail.com> 于2020年10月7日周三 下午6:37写道: > >>> > >>>> Hi all, > >>>> > >>>> +1 for enforcing "a" codestyle using "a" tool. > >>>> > >>>> As the project grows both in terms of LOCs and contributors, this > >>>> becomes more and more important as it eliminates some potential points > >>>> of friction without any additional effort. > >>>> > >>>> From the discussion, I am leaning towards having a single commit with > >>>> all the codestyle-related changes. This will avoid sprinkling > >>>> refactoring commits all over the place for the next year or more. But > >>>> if this is the price to pay for having consensus on a tool, then I am > >>>> ok with gradual implementation. I believe that the value added by > >>>> having an automated process of enforcing a codestyle exceeds the cost > >>>> of the nuisance of gradual refactoring. > >>>> > >>>> As for the actual format, I like the google-java-format but again, if > >>>> the community agrees on a different one I would not oppose that (as > >>>> long as it does not use the same amount of indentation for method args > >>>> and method body :P). > >>>> > >>>> Cheers, > >>>> Kostas > >>>> > >>>> On Wed, Oct 7, 2020 at 10:26 AM Chesnay Schepler <ches...@apache.org> > >>>> wrote: > >>>>> > >>>>> To me, ratchet seems to combine the worst aspects of both approaches. > >>>>> You get disruptive changes, but only in singular files, > >>>>> for something mundane as a typo fix or import change, which would be > >>>>> annoying to keep separate from the actual functional changes in a PR. > >>>>> > >>>>> On 10/7/2020 10:04 AM, Matthias Pohl wrote: > >>>>>> I find the ratchet feature you're suggesting interesting. But Arvid > >>>> has a > >>>>>> point referring to the blog post about ignoring revisions in git > >>>>>> blame > >>>> [1]. > >>>>>> Adding the configuration file for commits to ignore revs as proposed > >>>> in the > >>>>>> blog post makes it even easier. One problem I see is that this is > >>>>>> not > >>>>>> supported by Github (yet?) [2] as mentioned in [1]. > >>>>>> > >>>>>> Considering all that I prefer applying the code style in one go. I > >>>> have no > >>>>>> strong opinion on what codestyle is the best. > >>>>>> > >>>>>> PS: We used spotless in the project I previously worked on. It was > >>>>>> convenient to use. > >>>>>> > >>>>>> [1] > >>>>>> > >>>> > https://www.moxio.com/blog/43/ignoring-bulk-change-commits-with-git-blame > >>>> > >>>>>> [2] > >>>>>> > >>>> > https://github.community/t/support-ignore-revs-file-in-githubs-blame-view/3256 > >>>> > >>>>>> > >>>>>> On Tue, Oct 6, 2020 at 6:00 PM Aljoscha Krettek > >>>>>> <aljos...@apache.org> > >>>> wrote: > >>>>>> > >>>>>>> Maybe I wasn't very clear on how the ratchet works. The changes are > >>>>>>> gradual yes, but it doesn't leave you an option: if you touch a > >>>>>>> file > >>>> you > >>>>>>> will it will have to conform to the coding style afterwards. > >>>>>>> It's not > >>>>>>> like the previous gradual process we had before where it would be > >>>> based > >>>>>>> on people actively working towards a style. > >>>>>>> > >>>>>>> That being said, I also completely like the option of just doing > >>>>>>> one > >>>> big > >>>>>>> change commit. > >>>>>>> > >>>>>>> Regarding actual coding styles: we're a bit limited by what tools > >>>> exist. > >>>>>>> I like Spotless because it can be used to both check and apply a > >>>> style. > >>>>>>> Then you need a formatter that works with Spotless and of those we > >>>> only > >>>>>>> have the Eclipse Formatter, google-java-format, and Prettier. > >>>>>>> Prettier > >>>>>>> is a Javascript tool that I would like to avoid. Eclipse is > >>>>>>> doable but > >>>>>>> you need to fiddle with configuration files. I like > >>>>>>> google-java-format > >>>>>>> because of it's take-it-or-leave-it approach. You either use the > >>>>>>> style > >>>>>>> or you don't but it's very thorough. The downside is that it > >>>>>>> won't do > >>>>>>> tabs-only formatting. > >>>>>>> > >>>>>>> Best, > >>>>>>> Aljoscha > >>>>>>> > >>>>>>> On 06.10.20 17:43, Arvid Heise wrote: > >>>>>>>> After having written that I did a quick search, you can even > >>>>>>>> use git > >>>>>>> blame > >>>>>>>> with one big massive change commit [1], which would further > >>>>>>>> help the > >>>> idea > >>>>>>>> of "just get over with it". > >>>>>>>> > >>>>>>>> With that option, I'd even change all whitespaces if the community > >>>> thinks > >>>>>>>> that it's a better option (a separate discussion that I'll gladly > >>>> skip). > >>>>>>>> > >>>>>>>> [1] > >>>>>>>> > >>>>>>> > >>>> > https://www.moxio.com/blog/43/ignoring-bulk-change-commits-with-git-blame > >>>> > >>>>>>>> On Tue, Oct 6, 2020 at 5:38 PM Arvid Heise <ar...@ververica.com> > >>>> wrote: > >>>>>>>> > >>>>>>>>> I'm also +1 for automatically enforceable code style. > >>>>>>>>> > >>>>>>>>> I also would just go over it as Chesnay said. While it makes some > >>>>>>> changes > >>>>>>>>> a bit harder to track (inline git blame), it's easy to skip > >>>>>>>>> over in > >>>> any > >>>>>>> git > >>>>>>>>> history and if it's only one massive commit, then it's also much > >>>> easier > >>>>>>> to > >>>>>>>>> ignore than many gradual changes. Further, if we just do it once, > >>>> git > >>>>>>> blame > >>>>>>>>> will quickly become more reliable again. > >>>>>>>>> > >>>>>>>>> Btw I completely don't care about the code style as long as it > >>>>>>>>> plays > >>>>>>> well > >>>>>>>>> with IntelliJ (it used to be different, but things change :p). > >>>>>>>>> > >>>>>>>>> On Tue, Oct 6, 2020 at 5:23 PM Chesnay Schepler > >>>>>>>>> <ches...@apache.org > >>>>> > >>>>>>>>> wrote: > >>>>>>>>> > >>>>>>>>>> We shouldn't switch to spaces _now_; cutting this bit from your > >>>>>>> proposal > >>>>>>>>>> will massively simplify things and there's hardly any value in > >>>> changing > >>>>>>>>>> it. > >>>>>>>>>> > >>>>>>>>>> Also I'm getting rather tired of this constant idea of "gradual > >>>>>>>>>> application". We've been doing this for 2-3 years now since we > >>>>>>>>>> introduced Checkstyle and basically got nowhere. We should just > >>>> bite > >>>>>>> the > >>>>>>>>>> bullet and get it over with; we could've solved this whole > >>>>>>>>>> problem > >>>>>>>>>> already. > >>>>>>>>>> > >>>>>>>>>> In conclusion, I'm +1 on finally locking down the codestyle and > >>>>>>> applying > >>>>>>>>>> it immediately, I'm -1 on any gradual application scheme because > >>>> they > >>>>>>>>>> _just don't work_. > >>>>>>>>>> > >>>>>>>>>> On 10/6/2020 2:15 PM, Aljoscha Krettek wrote: > >>>>>>>>>>> Hi All, > >>>>>>>>>>> > >>>>>>>>>>> I know I know, but please keep reading because I recently > >>>>>>>>>>> learned > >>>>>>>>>>> about some new developments in the area of coding-style > >>>> automation. > >>>>>>>>>>> > >>>>>>>>>>> The tool I would propose we use is Spotless > >>>>>>>>>>> (https://github.com/diffplug/spotless). This doesn't come > >>>>>>>>>>> with a > >>>>>>>>>>> formatter but allows using other popular formatters such as > >>>>>>>>>>> google-java-format. The nice thing about Spotless is that it > >>>> serves as > >>>>>>>>>>> a verifier for CI but can also apply the configured style > >>>>>>>>>>> automatically. That is, for the programmer all she has to do is > >>>> `mvn > >>>>>>>>>>> spotless:apply` to fix any style violations. > >>>>>>>>>>> > >>>>>>>>>>> An interesting feature, which was (somewhat) recently added is > >>>>>>>>>>> "ratchet" > >>>>>>>>>>> ( > >>>>>>> > >>>> > https://github.com/diffplug/spotless/blob/main/plugin-maven/README.md#ratchet > >>>> > >>>>>>> ). > >>>>>>>>>>> With this, you can set up Spotless to only apply it's rules to > >>>> files > >>>>>>>>>>> that were changed after a configured commit. This would allow a > >>>>>>>>>>> gradual application of the new coding style instead of one big > >>>> change. > >>>>>>>>>>> > >>>>>>>>>>> If we decide to use Spotless, we would of course also have to > >>>> decide > >>>>>>>>>>> on a coding style. For this I would propose google-java-format, > >>>> which > >>>>>>>>>>> the flink-statefun project uses. The main difference from our > >>>> current > >>>>>>>>>>> "style" is that this uses spaces instead of tabs for > >>>>>>>>>>> indentation. > >>>> By > >>>>>>>>>>> default it would be 2 spaces but it can be configured to use 4 > >>>> spaces > >>>>>>>>>>> which would make code look more or less like our current style. > >>>> There > >>>>>>>>>>> are no more configuration knobs, so using tabs is not an > >>>>>>>>>>> option. > >>>>>>>>>>> > >>>>>>>>>>> Finally, why should we do this? I think most engineers agree > >>>>>>>>>>> that > >>>>>>>>>>> having a common enforced style is good to have so I only > >>>>>>>>>>> want to > >>>>>>>>>>> highlight a few thoughts here about things we could improve: > >>>>>>>>>>> > >>>>>>>>>>> - No more nits about coding style in reviews, this makes it > >>>> easier > >>>>>>>>>>> for both the reviewer and developer > >>>>>>>>>>> > >>>>>>>>>>> - No manual fixing of Checkstyle errors because Spotless > >>>>>>>>>>> can > >>>> do that > >>>>>>>>>>> automatically > >>>>>>>>>>> > >>>>>>>>>>> - Because Flink is such a big project little islands of > >>>>>>>>>>> coding > >>>> style > >>>>>>>>>>> have formed between people that commonly work on components. It > >>>> can be > >>>>>>>>>>> a nuisance when you work on a different component and then > >>>> reviewers > >>>>>>>>>>> don't like your typical coding style. And you first have to get > >>>> used > >>>>>>>>>>> to the slight differences in style when reading code. > >>>>>>>>>>> > >>>>>>>>>>> There are also downsides I see in this: > >>>>>>>>>>> > >>>>>>>>>>> - We break the history, but both "git blame" and modern > >>>> IntelliJ can > >>>>>>>>>>> ignore whitespace when attributing changes. So for files > >>>>>>>>>>> that are > >>>>>>>>>>> already "well" formatted not much would change. > >>>>>>>>>>> > >>>>>>>>>>> - In the short-term it will be harder to apply changes > >>>>>>>>>>> both to > >>>>>>> master > >>>>>>>>>>> and one of the release-x branches because formatting will be > >>>>>>>>>>> different. I think this is not too hard though because Spotless > >>>> can > >>>>>>>>>>> automatically apply the style. > >>>>>>>>>>> > >>>>>>>>>>> In summary, we would have some short-term pain with this but I > >>>> think > >>>>>>>>>>> it would be good in the long run. What are your thoughts? > >>>>>>>>>>> > >>>>>>>>>>> Best, > >>>>>>>>>>> Aljoscha > >>>>>>>>>>> > >>>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> > >>>>>>>>> Arvid Heise | Senior Java Developer > >>>>>>>>> > >>>>>>>>> <https://www.ververica.com/> > >>>>>>>>> > >>>>>>>>> Follow us @VervericaData > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> > >>>>>>>>> Join Flink Forward <https://flink-forward.org/> - The Apache > >>>>>>>>> Flink > >>>>>>>>> Conference > >>>>>>>>> > >>>>>>>>> Stream Processing | Event Driven | Real Time > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> > >>>>>>>>> Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > >>>>>>>>> > >>>>>>>>> -- > >>>>>>>>> Ververica GmbH > >>>>>>>>> Registered at Amtsgericht Charlottenburg: HRB 158244 B > >>>>>>>>> Managing Directors: Timothy Alexander Steinert, Yip Park Tung > >>>> Jason, Ji > >>>>>>>>> (Toni) Cheng > >>>>>>>>> > >>>>>>>> > >>>>> > >>>> > >>> > >> > > > > > >