Austin, I don’t think anyone questions the value of consistent code or the 
value of automatic formatting. It is quite the opposite, actually. I think 
everyone on this thread actually cares about those things :)
Was automatic formatting helpful? For sure! Did changing the line limit from 
120 to 100 on a 4+ year project brought a lot of value, for example? I am not 
sure.

- Anton

> On Jul 29, 2022, at 9:44 AM, Yufei Gu <flyrain...@gmail.com> wrote:
> 
> It is the right direction to remove the trivial part of the coding. Just 
> wondering why we change the max line length from 120 to 100.
> 
> Best,
> 
> Yufei
> 
> `This is not a contribution`
> 
> 
> On Fri, Jul 29, 2022 at 9:21 AM Austin Bennett <whatwouldausti...@gmail.com 
> <mailto:whatwouldausti...@gmail.com>> wrote:
> Probably worthwhile to really think about the long-term desired outcomes.  
> Even a large, one-time, disruption of the codebase for the sake of future 
> consistency could be quite valuable.  Remember you'll also lose fidelity with 
> history/blame.  At some point a 'breaking' ( in a different sense ) change 
> might be useful rather than anchoring on what exists.  
> 
> 
> 
> On Fri, Jul 29, 2022 at 8:59 AM Anton Okolnychyi 
> <aokolnyc...@apple.com.invalid> wrote:
> On a side note, having automatic formatting was really nice. I think it is 
> the right direction. If we only had a way to tweak a few rules like the max 
> line length, one-line Javadoc and maybe consider using the Google format for 
> 1.8 cause we are using the 1.7 format at the moment, which was not designed 
> for lambdas. That way we may have all the benefits of automatic formatting 
> with less disruption to the code.
> 
> - Anton
> 
>> On Jul 28, 2022, at 7:35 PM, Anton Okolnychyi <aokolnyc...@apple.com.INVALID 
>> <mailto:aokolnyc...@apple.com.INVALID>> wrote:
>> 
>> I had a chance to rebase one of my PRs. It is really hard to justify the new 
>> max line length of 100 chars. I think the most unreadable code is usually 
>> around places when single statements are split into multiple lines. Since 
>> Java has explicit types and we prefer full and elaborate class/var names, 
>> the new line limit alone triggered so many changes and requires splitting a 
>> lot of statements that would previously fit on a single line.
>> 
>> The max line length is one of the reasons why Palantir folks forked the 
>> Google format.
>> 
>> https://github.com/palantir/palantir-java-format 
>> <https://github.com/palantir/palantir-java-format>
>> 
>> - Anton
>> 
>> 
>>> On Jul 28, 2022, at 7:42 AM, Eduard Tudenhoefner <edu...@tabular.io 
>>> <mailto:edu...@tabular.io>> wrote:
>>> 
>>> Thanks for your honest feedback Anton, which I really appreciate. I agree 
>>> with the points you brought up.
>>> 
>>> I don't prefer the Google Java Format either, but it is at least something 
>>> that produces consistent results across tools.
>>> If you ask me, I would love to customize the code format and have it as 
>>> close as possible to what the Iceberg project had. But one of the reasons 
>>> why the format isn't customizable is because it is difficult to support 
>>> such customizations across tools.
>>> 
>>> The problem with code format is that there is no single format that can 
>>> satisfy the preferences of everybody. However, from my experience, once 
>>> people start to use any code format that produces consistent results
>>> across Eclipse/IntelliJ/cmd line, people stop worrying about code format 
>>> details.
>>> 
>>> This is also one of the reasons why the creators of Go decided to have a 
>>> code formatter built-in (https://go.dev/doc/effective_go#formatting 
>>> <https://go.dev/doc/effective_go#formatting>).
>>> 
>>> In the long run I think the decision to use Spotless + code formatting 
>>> helps, especially for new people that want to contribute and don't want to 
>>> feel frustrated due to review feedback around code style.
>>> 
>>> Eduard
>>> 
>>> On Thu, Jul 28, 2022 at 6:15 AM Anton Okolnychyi <aokolnyc...@apple.com 
>>> <mailto:aokolnyc...@apple.com>> wrote:
>>> I am late to the discussion, but I would like to express my opinion.
>>> 
>>> As someone who deeply cares about the code consistency and leaves a lot of 
>>> style-related nits, I am definitely happy to see efforts to simplify the 
>>> life of the reviewers and automate the formatting process. Explaining the 
>>> code style to new contributors consumes a lot of time and results in more 
>>> iterations on PRs. Automating this would be great!
>>> 
>>> However, I am concerned regarding the changes merged in PR #5312.
>>> 
>>> - Many changes that went in that PR were against the original code style of 
>>> the project that we, committers and contributors, tried to maintain for all 
>>> these years. I think we had a lot of well-structured and consistent places, 
>>> which attracted devs to Iceberg.
>>> - I feel the code quality dropped as some places were written in the old 
>>> style and now look really weird. If we followed the new style originally, 
>>> maybe, it would look much better as we would structure the code differently.
>>> - As a someone whose team maintains an internal fork that is very close to 
>>> master (I know we are not the only ones), I feel such a massive change will 
>>> be a challenge to cherry-pick. I am not convinced introducing automatic 
>>> formatting had to be so radical.
>>> 
>>> The first two points concern me the most. Since I am late to the 
>>> discussion, it would not be fair for me to request changes. However, I 
>>> strongly suggest the community to reconsider the approach we took while we 
>>> haven’t merged many changes on top of PR #5312. 
>>> 
>>> To sum up, I definitely support automating the formatting to simplify 
>>> reviews and help new contributors but I would try to come with a way to 
>>> make it less radical and follow whatever we had all these years. We did 
>>> discuss some of the formatting guidelines a few years ago and it is all 
>>> different now.
>>> 
>>> - Anton
>>> 
>>>> On Jul 27, 2022, at 11:40 AM, Steven Wu <stevenz...@gmail.com 
>>>> <mailto:stevenz...@gmail.com>> wrote:
>>>> 
>>>> Eduard, thank you for driving this effort! Great work!
>>>> 
>>>> On Wed, Jul 27, 2022 at 11:11 AM Eduard Tudenhoefner <edu...@tabular.io 
>>>> <mailto:edu...@tabular.io>> wrote:
>>>> Hey everyone, 
>>>> 
>>>> Google Java Format + Spotless integration have been merged as part of 
>>>> https://github.com/apache/iceberg/pull/5266 
>>>> <https://github.com/apache/iceberg/pull/5266>.
>>>> 
>>>> The big bang reformatting (aka spotlessApply) will happen as part of 
>>>> https://github.com/apache/iceberg/pull/5312 
>>>> <https://github.com/apache/iceberg/pull/5312> and once this PR is merged, 
>>>> please perform the following steps if you have open PRs: 
>>>> 
>>>> 1. Rebase your PR branch against the commit before the „Big Bang“
>>>> 2. Run `./gradlew spotlessApply`
>>>> 3. Squash all commits
>>>> 4. Rebase against the latest HEAD of the master branch
>>>> 5. Force-push your branch
>>>> 
>>>> This should hopefully produce a minimum of conflicts on PRs.
>>>> 
>>>> Additionally, make sure to install the google-java-format plugin for your 
>>>> IDE as outlined in https://github.com/apache/iceberg-docs/pull/125 
>>>> <https://github.com/apache/iceberg-docs/pull/125> so that you can format 
>>>> code inside the IDE as well.
>>>> 
>>>> Eduard
>>>> 
>>>> On Fri, Jul 15, 2022 at 12:45 PM Eduard Tudenhoefner <edu...@tabular.io 
>>>> <mailto:edu...@tabular.io>> wrote:
>>>> That's correct Ryan. The Google style is what can be applied across tools 
>>>> and Spotless is the tool that we use to check and apply that style (and 
>>>> also auto-apply the license header). The apply part is the important piece 
>>>> that Checkstyle for example is missing (as it's tedious to manually apply 
>>>> what checkstyle:check complains about).
>>>> 
>>>> Additionally, Spotless 
>>>> <https://github.com/diffplug/spotless/tree/main/plugin-gradle> supports a 
>>>> bunch of other languages, so it would even be possible to use it for 
>>>> auto-formatting Gradle files (or Scala files using scalafmt) if we wanted 
>>>> to, but for now I would probably focus on the Java codebase.
>>>> 
>>>> Eduard
>>>> 
>>>> On Thu, Jul 14, 2022 at 6:58 PM Ryan Blue <b...@tabular.io 
>>>> <mailto:b...@tabular.io>> wrote:
>>>> Okay, I understand. So it isn't that spotless is supported across tools, 
>>>> it is that google style is supported and we can use Spotless to both check 
>>>> and apply that style. Is that correct?
>>>> 
>>>> On Thu, Jul 14, 2022 at 9:56 AM Eduard Tudenhoefner <edu...@tabular.io 
>>>> <mailto:edu...@tabular.io>> wrote:
>>>> Yeah it's how Dmitri said, the Google style is most likely on purpose not 
>>>> configurable, because otherwise it's getting more difficult to achieve the 
>>>> same formatting results across IDEs & CMD line. 
>>>> 
>>>> 
>>>> On Wed, Jul 13, 2022 at 7:39 PM Dmitri Bourlatchkov 
>>>> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> 
>>>> wrote:
>>>> Yeah, the Google style has very specific wrapping limits and javadoc 
>>>> format. It is not configurable, if we go with it.
>>>> 
>>>> From my POV, using another style would be fine. However, with a custom 
>>>> style it can be hard and tricky to ensure that all IDEs enforce it the 
>>>> same way as spotless.
>>>> 
>>>> The Google style has a plugin at least for IntelliJ (and likely for 
>>>> Eclipse too) so it will behave the same way in IDE and cmd. line build.
>>>> 
>>>> On Wed, Jul 13, 2022 at 1:28 PM Ryan Blue <b...@tabular.io 
>>>> <mailto:b...@tabular.io>> wrote:
>>>> Is there any value in trying to get Spotless to match the current format 
>>>> as closely as possible? It would be great to have fewer changes.
>>>> 
>>>> For example, when I tested the apply, I see this:
>>>> 
>>>> -  /**
>>>> -   * Returns an initialized {@link AliyunProperties}
>>>> -   */
>>>> +  /** Returns an initialized {@link AliyunProperties} */
>>>> Changes like that don’t seem very valuable to me. Similarly, it looks like 
>>>> a lot of text is wrapped at a different line length. If we can alter that, 
>>>> we could easily avoid a lot of changes.
>>>> 
>>>> Ryan
>>>> 
>>>> 
>>>> On Wed, Jul 13, 2022 at 7:41 AM Eduard Tudenhoefner <edu...@tabular.io 
>>>> <mailto:edu...@tabular.io>> wrote:
>>>> As a first step, I created https://github.com/apache/iceberg/pull/5266 
>>>> <https://github.com/apache/iceberg/pull/5266> which configures Spotless to 
>>>> use the Google Java Format and also apply the correct copyright header for 
>>>> Java files.
>>>> 
>>>> Once this PR is merged, the next steps would be:
>>>> removing conflicting Checkstyle rules that are not in line with the Google 
>>>> format
>>>> formatting the entire code base via `./gradlew spotlessApply`
>>>> setting `enforceCheck` to `true` in 
>>>> https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48
>>>>  
>>>> <https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48>
>>>>  so that validation fails if code isn't properly formatted
>>>> updating docs around the current Formatter usage and how to configure 
>>>> Eclipse/IntelliJ
>>>> The first 3 steps should be done together as part of the big bang.
>>>> 
>>>> Eduard
>>>> 
>>>> On Mon, Jul 11, 2022 at 10:54 PM Ryan Blue <b...@apache.org 
>>>> <mailto:b...@apache.org>> wrote:
>>>> Okay, it sounds like there's mostly agreement for going with spotless. 
>>>> Let's try that out. We'll work on some changes to add spotless so that 
>>>> `spotlessApply` works. Then we can do the big bang migration (which I also 
>>>> agree is the best option) just before the 1.0.
>>>> 
>>>> Thanks, everyone!
>>>> 
>>>> On Mon, Jul 11, 2022 at 11:50 AM Dmitri Bourlatchkov 
>>>> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> 
>>>> wrote:
>>>> My experience with the Google Code Style + Spotless was positive too.
>>>> 
>>>> I'd be fine with another code style as long as it is "deterministic" (e.g. 
>>>> does not make changes on repeated execution) and works in IntelliJ IDEA / 
>>>> Eclipse / etc.
>>>> 
>>>> Regarding cherry-picking into older branches, I think Robert's suggestion 
>>>> can be tweaked slightly to be helpful there too:
>>>> 
>>>> 1. Checkout old branch
>>>> 2. Apply the new style (run gradle ...)
>>>> 3. Cherry-pick without committing
>>>> 4. Manually revert to old style
>>>> 5. Commit
>>>> 6. Reset to original branch HEAD
>>>> 7. Cherry pick commit 5 again
>>>> 
>>>> It's a bit lengthy and may be a tedious process, but it should allow 
>>>> applying the git-level changes mostly automatically. 
>>>> 
>>>> Cheers,
>>>> Dmitri.
>>>> 
>>>> 
>>>> On Fri, Jul 8, 2022 at 2:53 AM Robert Stupp <sn...@snazy.de 
>>>> <mailto:sn...@snazy.de>> wrote:
>>>> From my experience, it’s a big win to have automatic code formatting.
>>>> 
>>>> In projectnessie we use automatic code formatting for all languages and 
>>>> haven’t serious issues with Spotless. It is just nice to not have to bike 
>>>> shed about whitespaces, line breaks, brackets, etc. It was a bit of 
>>>> discussion, because people had bad memories from past experiences with 
>>>> automatic code formatting breaking code and introducing subtle bugs.
>>>> 
>>>> I think that using code styles that „do not allow bike shedding“ (Google 
>>>> Code Style) are a very good option.
>>>> 
>>>> So far none of us has seen issues with any of the Spotless code formatters 
>>>> that we use: XML, Kotlin/Gradle, Kotlin, Antlr4, Java, Scala - relying on 
>>>> the „standard“ settings w/o any customizations. We use this piece of code, 
>>>> externalized into an internal Gradle plugin: 
>>>> https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt
>>>>  
>>>> <https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt>
>>>>  For Iceberg, it would probably be nice to have some Groovy code 
>>>> formatting for the build scripts as well.
>>>> 
>>>> Sure, the migration will add some pain. IMHO the best option is a „big 
>>>> bang“ across the whole code base, because it happens only once. Migrating 
>>>> one module after another is a „repeated series of pains“.
>>>> 
>>>> Since the result of a `./gradlew spotlessApply` is deterministic, people 
>>>> that have open PRs could:
>>>> 1. Rebase their PR branch against the commit before the „Big Bang“
>>>> 2. Include a commit with the necessary Gradle build change (one the only 
>>>> contains the changes to add Spotless)
>>>> 3. Do the `./gradlew spotlessApply`
>>>> 4. Squash all commits in the PR-branch
>>>> 5. Rebase again - against the HEAD of the master branch
>>>> 6. Force-push PR-branch
>>>> Because git is „clever enough“ to eliminate the „duplicated/unrelated 
>>>> changes“, the final result of the above steps is just the diff with the 
>>>> changes for the open PR.
>>>> 
>>>> 
>>>>> Am 08.07.2022 um 00:59 schrieb Ryan Blue <b...@tabular.io 
>>>>> <mailto:b...@tabular.io>>:
>>>>> 
>>>>> We were just talking about this proposal internally. I think it would be 
>>>>> great to have automatic code formatting, especially since we have to 
>>>>> point out a lot of changes manually. The main question is how to get 
>>>>> there without too much disruption. This came up in our discussions around 
>>>>> the upcoming 1.0 release, since that may be a good opportunity to make 
>>>>> all of the code changes.
>>>>> 
>>>>> For background, the main concern about adding something like this is 
>>>>> applying all of the changes needed to get the existing code to conform to 
>>>>> the new style. That is really disruptive because it will cause all of the 
>>>>> PRs to need to be rebased and makes it really difficult to cherry-pick 
>>>>> changes from after the code formatting happens to branches that were 
>>>>> created before code formatting. The 1.0 release makes a good opportunity 
>>>>> because we are making other changes (removing deprecations) and will 
>>>>> hopefully have people upgrading their branches to the new major version, 
>>>>> rather than cherry picking.
>>>>> 
>>>>> This is as good a time as any to add automatic code formatting, but it's 
>>>>> up to the community: so should we refromat the project and apply spotless 
>>>>> code formatting everywhere? I'm interested to hear opinions!
>>>>> 
>>>>> Ryan
>>>>> 
>>>>> On Thu, Mar 10, 2022 at 3:00 AM Eduard Tudenhoefner <edu...@dremio.com 
>>>>> <mailto:edu...@dremio.com>> wrote:
>>>>> Hello everyone,
>>>>> 
>>>>> I would like to get the discussion started around automatic code 
>>>>> formatting + enforcing and how we get there.
>>>>> 
>>>>> Currently we use Checkstyle check to enforce formatting. However, the 
>>>>> problem with that is that you still have to manually do the actual 
>>>>> formatting.
>>>>> 
>>>>> What I would like to propose is the usage of Spotless 
>>>>> (https://github.com/diffplug/spotless 
>>>>> <https://github.com/diffplug/spotless>) for checking and enforcing Java 
>>>>> code style (it can also enforce code style for Scala, Markdown, ... btw). 
>>>>> Spotless is being used by many projects 
>>>>> (https://github.com/search?l=gradle&q=spotless&type=Code 
>>>>> <https://github.com/search?l=gradle&q=spotless&type=Code>) and comes 
>>>>> essentially with two tasks:
>>>>> * spotlessCheck: Checks that sourcecode satisfies formatting steps
>>>>> * spotlessApply: Applies code formatting steps to sourcecode in-place
>>>>> 
>>>>> 
>>>>> Code format
>>>>> 
>>>>> The problem with code format is that there is no single format that can 
>>>>> satisfy the preferences of everybody. However, from my experience, once 
>>>>> people start to use any code format that produces consistent results 
>>>>> across Eclipse/IntelliJ/cmd line, people stop worrying about code format 
>>>>> details. 
>>>>> This is also one of the reasons why the creators of Go decided to have a 
>>>>> code formatter built-in (https://go.dev/doc/effective_go#formatting 
>>>>> <https://go.dev/doc/effective_go#formatting>):
>>>>> 
>>>>> Formatting issues are the most contentious but the least consequential. 
>>>>> People can adapt to different formatting styles but it's better if they 
>>>>> don't have to, and less time is devoted to the topic if everyone adheres 
>>>>> to the same style. The problem is how to approach this Utopia without a 
>>>>> long prescriptive style guide.
>>>>> With Go we take an unusual approach and let the machine take care of most 
>>>>> formatting issues. The gofmt program (also available as go fmt, which 
>>>>> operates at the package level rather than source file level) reads a Go 
>>>>> program and emits the source in a standard style of indentation and 
>>>>> vertical alignment, retaining and if necessary reformatting comments.
>>>>> 
>>>>> 
>>>>> I would like to propose using the Google Java Format with Spotless. The 
>>>>> reason for this format is essentially that this is a widely-adopted code 
>>>>> format that is designed specifically for code reviews (since we're 
>>>>> spending more time reviewing code than writing it). 
>>>>> Additionally, it produces consistent formatting results across 
>>>>> Eclipse/IntelliJ/cmd line, which I think is another very important factor.
>>>>> 
>>>>> Thus, our initial Gradle spotless configuration could look similar to the 
>>>>> above below:
>>>>> 
>>>>> 
>>>>> pluginManager.withPlugin('com.diffplug.spotless') {
>>>>>     spotless {
>>>>>       // don't run spotlessCheck during gradle check task during the 
>>>>> transition phase
>>>>>       enforceCheck = false
>>>>>       java {
>>>>>         target 'src/main/java/**/*.java', 'src/test/java/**/*.java', 
>>>>> 'src/jmh/java/**/*.java'
>>>>>         googleJavaFormat()
>>>>>       }
>>>>>     }
>>>>> }
>>>>> 
>>>>> We don't have to use Google Java Format. Spotless also supports 
>>>>> formatting the code with other formats, but from previous experience the 
>>>>> Google Java Format seemed to be really the only one to produce consistent 
>>>>> results across Eclipse/IntelliJ/cmd line.
>>>>> 
>>>>> 
>>>>> How do we get to a point where the entire codebase is properly formatted 
>>>>> (and enforceCheck = false can be removed)?
>>>>> 
>>>>> Now this is a difficult question. Obviously we don't want to have a 
>>>>> single format-everything commit, as that would affect lots of in-flight 
>>>>> PRs.
>>>>> 
>>>>> There would have to be some form of gradual formatting, for example 
>>>>> module by module. Spotless offers something called Ratched 
>>>>> (https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet 
>>>>> <https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet>) 
>>>>> that allows to enforce code format gradually (but I'm not sure this would 
>>>>> be a good thing either).
>>>>> 
>>>>> How exactly we'd like to approach this transitioning phase this is a 
>>>>> completely separate discussion, but I feel like at least we could get the 
>>>>> ball rolling so that we make it also easier for newcomers to contribute 
>>>>> to the project, since it would be straightforward for them to make their 
>>>>> PRs adhere to the code format and also save time during PR reviews.
>>>>> 
>>>>> 
>>>>> Eduard 
>>>>> 
>>>>> 
>>>>> -- 
>>>>> Ryan Blue
>>>>> Tabular
>>>> 
>>>> 
>>>> 
>>>> -- 
>>>> Ryan Blue
>>>> 
>>>> 
>>>> -- 
>>>> Ryan Blue
>>>> Tabular
>>>> 
>>>> 
>>>> -- 
>>>> Ryan Blue
>>>> Tabular
>>> 
>> 
> 

Reply via email to