Hey everyone,

Google Java Format + Spotless integration have been merged as part of
https://github.com/apache/iceberg/pull/5266.

The big bang reformatting (aka *spotlessApply*) will happen as part of
https://github.com/apache/iceberg/pull/5312 and once this PR is merged,
please perform the following steps if you have open PRs:

1. Rebase your PR branch against the commit before the „Big Bang“
2. Run `*./gradlew spotlessApply*`
3. Squash all commits
4. Rebase against the latest *HEAD* of the master branch
5. Force-push your branch

This should hopefully produce a minimum of conflicts on PRs.

Additionally, make sure to install the *google-java-format* plugin for your
IDE as outlined in https://github.com/apache/iceberg-docs/pull/125 so that
you can format code inside the IDE as well.

Eduard

On Fri, Jul 15, 2022 at 12:45 PM Eduard Tudenhoefner <edu...@tabular.io>
wrote:

> That's correct Ryan. The Google style is what can be applied across tools
> and Spotless is the tool that we use to check and *apply* that style (and
> also auto-apply the license header). The *apply* part is the important
> piece that Checkstyle for example is missing (as it's tedious to manually
> apply what *checkstyle:check* complains about).
>
> Additionally, Spotless
> <https://github.com/diffplug/spotless/tree/main/plugin-gradle> supports a
> bunch of other languages, so it would even be possible to use it for
> auto-formatting Gradle files (or Scala files using scalafmt) if we wanted
> to, but for now I would probably focus on the Java codebase.
>
> Eduard
>
> On Thu, Jul 14, 2022 at 6:58 PM Ryan Blue <b...@tabular.io> wrote:
>
>> Okay, I understand. So it isn't that spotless is supported across tools,
>> it is that google style is supported and we can use Spotless to both check
>> and apply that style. Is that correct?
>>
>> On Thu, Jul 14, 2022 at 9:56 AM Eduard Tudenhoefner <edu...@tabular.io>
>> wrote:
>>
>>> Yeah it's how Dmitri said, the Google style is most likely on purpose
>>> not configurable, because otherwise it's getting more difficult to achieve
>>> the same formatting results across IDEs & CMD line.
>>>
>>>
>>> On Wed, Jul 13, 2022 at 7:39 PM Dmitri Bourlatchkov <
>>> dmitri.bourlatch...@dremio.com> wrote:
>>>
>>>> Yeah, the Google style has very specific wrapping limits and javadoc
>>>> format. It is not configurable, if we go with it.
>>>>
>>>> From my POV, using another style would be fine. However, with a custom
>>>> style it can be hard and tricky to ensure that all IDEs enforce it the same
>>>> way as spotless.
>>>>
>>>> The Google style has a plugin at least for IntelliJ (and likely for
>>>> Eclipse too) so it will behave the same way in IDE and cmd. line build.
>>>>
>>>> On Wed, Jul 13, 2022 at 1:28 PM Ryan Blue <b...@tabular.io> wrote:
>>>>
>>>>> Is there any value in trying to get Spotless to match the current
>>>>> format as closely as possible? It would be great to have fewer changes.
>>>>>
>>>>> For example, when I tested the apply, I see this:
>>>>>
>>>>> -  /**
>>>>> -   * Returns an initialized {@link AliyunProperties}
>>>>> -   */
>>>>> +  /** Returns an initialized {@link AliyunProperties} */
>>>>>
>>>>> Changes like that don’t seem very valuable to me. Similarly, it looks
>>>>> like a lot of text is wrapped at a different line length. If we can alter
>>>>> that, we could easily avoid a lot of changes.
>>>>>
>>>>> Ryan
>>>>>
>>>>> On Wed, Jul 13, 2022 at 7:41 AM Eduard Tudenhoefner <edu...@tabular.io>
>>>>> wrote:
>>>>>
>>>>>> As a first step, I created
>>>>>> https://github.com/apache/iceberg/pull/5266 which configures
>>>>>> Spotless to use the Google Java Format and also apply the correct 
>>>>>> copyright
>>>>>> header for Java files.
>>>>>>
>>>>>> Once this PR is merged, the next steps would be:
>>>>>>
>>>>>>    - removing conflicting Checkstyle rules that are not in line with
>>>>>>    the Google format
>>>>>>    - formatting the entire code base via `*./gradlew spotlessApply*`
>>>>>>    - setting `*enforceCheck*` to `*true*` in
>>>>>>    
>>>>>> https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48
>>>>>>    so that validation fails if code isn't properly formatted
>>>>>>    - updating docs around the current Formatter usage and how to
>>>>>>    configure Eclipse/IntelliJ
>>>>>>
>>>>>> The first 3 steps should be done together as part of the big bang.
>>>>>>
>>>>>> Eduard
>>>>>>
>>>>>> On Mon, Jul 11, 2022 at 10:54 PM Ryan Blue <b...@apache.org> wrote:
>>>>>>
>>>>>>> Okay, it sounds like there's mostly agreement for going with
>>>>>>> spotless. Let's try that out. We'll work on some changes to add 
>>>>>>> spotless so
>>>>>>> that `spotlessApply` works. Then we can do the big bang migration 
>>>>>>> (which I
>>>>>>> also agree is the best option) just before the 1.0.
>>>>>>>
>>>>>>> Thanks, everyone!
>>>>>>>
>>>>>>> On Mon, Jul 11, 2022 at 11:50 AM Dmitri Bourlatchkov <
>>>>>>> dmitri.bourlatch...@dremio.com> wrote:
>>>>>>>
>>>>>>>> My experience with the Google Code Style + Spotless was positive
>>>>>>>> too.
>>>>>>>>
>>>>>>>> I'd be fine with another code style as long as it is
>>>>>>>> "deterministic" (e.g. does not make changes on repeated execution) and
>>>>>>>> works in IntelliJ IDEA / Eclipse / etc.
>>>>>>>>
>>>>>>>> Regarding cherry-picking into older branches, I think Robert's
>>>>>>>> suggestion can be tweaked slightly to be helpful there too:
>>>>>>>>
>>>>>>>> 1. Checkout old branch
>>>>>>>> 2. Apply the new style (run gradle ...)
>>>>>>>> 3. Cherry-pick without committing
>>>>>>>> 4. Manually revert to old style
>>>>>>>> 5. Commit
>>>>>>>> 6. Reset to original branch HEAD
>>>>>>>> 7. Cherry pick commit 5 again
>>>>>>>>
>>>>>>>> It's a bit lengthy and may be a tedious process, but it should
>>>>>>>> allow applying the git-level changes mostly automatically.
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Dmitri.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Jul 8, 2022 at 2:53 AM Robert Stupp <sn...@snazy.de> wrote:
>>>>>>>>
>>>>>>>>> From my experience, it’s a big win to have automatic code
>>>>>>>>> formatting.
>>>>>>>>>
>>>>>>>>> In projectnessie we use automatic code formatting for all
>>>>>>>>> languages and haven’t serious issues with Spotless. It is just nice 
>>>>>>>>> to not
>>>>>>>>> have to bike shed about whitespaces, line breaks, brackets, etc. It 
>>>>>>>>> was a
>>>>>>>>> bit of discussion, because people had bad memories from past 
>>>>>>>>> experiences
>>>>>>>>> with automatic code formatting breaking code and introducing subtle 
>>>>>>>>> bugs.
>>>>>>>>>
>>>>>>>>> I think that using code styles that „do not allow bike shedding“
>>>>>>>>> (Google Code Style) are a very good option.
>>>>>>>>>
>>>>>>>>> So far none of us has seen issues with any of the Spotless code
>>>>>>>>> formatters that we use: XML, Kotlin/Gradle, Kotlin, Antlr4, Java, 
>>>>>>>>> Scala -
>>>>>>>>> relying on the „standard“ settings w/o any customizations. We use this
>>>>>>>>> piece of code, externalized into an internal Gradle plugin:
>>>>>>>>> https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt
>>>>>>>>>  For
>>>>>>>>> Iceberg, it would probably be nice to have some Groovy code 
>>>>>>>>> formatting for
>>>>>>>>> the build scripts as well.
>>>>>>>>>
>>>>>>>>> Sure, the migration will add some pain. IMHO the best option is a
>>>>>>>>> „big bang“ across the whole code base, because it happens only once.
>>>>>>>>> Migrating one module after another is a „repeated series of pains“.
>>>>>>>>>
>>>>>>>>> Since the result of a `./gradlew spotlessApply` is deterministic,
>>>>>>>>> people that have open PRs could:
>>>>>>>>> 1. Rebase their PR branch against the commit before the „Big Bang“
>>>>>>>>> 2. Include a commit with the necessary Gradle build change (one
>>>>>>>>> the only contains the changes to add Spotless)
>>>>>>>>> 3. Do the `./gradlew spotlessApply`
>>>>>>>>> 4. Squash all commits in the PR-branch
>>>>>>>>> 5. Rebase again - against the HEAD of the master branch
>>>>>>>>> 6. Force-push PR-branch
>>>>>>>>> Because git is „clever enough“ to eliminate the
>>>>>>>>> „duplicated/unrelated changes“, the final result of the above steps 
>>>>>>>>> is just
>>>>>>>>> the diff with the changes for the open PR.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Am 08.07.2022 um 00:59 schrieb Ryan Blue <b...@tabular.io>:
>>>>>>>>>
>>>>>>>>> We were just talking about this proposal internally. I think it
>>>>>>>>> would be great to have automatic code formatting, especially since we 
>>>>>>>>> have
>>>>>>>>> to point out a lot of changes manually. The main question is how to 
>>>>>>>>> get
>>>>>>>>> there without too much disruption. This came up in our discussions 
>>>>>>>>> around
>>>>>>>>> the upcoming 1.0 release, since that may be a good opportunity to 
>>>>>>>>> make all
>>>>>>>>> of the code changes.
>>>>>>>>>
>>>>>>>>> For background, the main concern about adding something like this
>>>>>>>>> is applying all of the changes needed to get the existing code to 
>>>>>>>>> conform
>>>>>>>>> to the new style. That is really disruptive because it will cause all 
>>>>>>>>> of
>>>>>>>>> the PRs to need to be rebased and makes it really difficult to 
>>>>>>>>> cherry-pick
>>>>>>>>> changes from after the code formatting happens to branches that were
>>>>>>>>> created before code formatting. The 1.0 release makes a good 
>>>>>>>>> opportunity
>>>>>>>>> because we are making other changes (removing deprecations) and will
>>>>>>>>> hopefully have people upgrading their branches to the new major 
>>>>>>>>> version,
>>>>>>>>> rather than cherry picking.
>>>>>>>>>
>>>>>>>>> This is as good a time as any to add automatic code formatting,
>>>>>>>>> but it's up to the community: so should we refromat the project and 
>>>>>>>>> apply
>>>>>>>>> spotless code formatting everywhere? I'm interested to hear opinions!
>>>>>>>>>
>>>>>>>>> Ryan
>>>>>>>>>
>>>>>>>>> On Thu, Mar 10, 2022 at 3:00 AM Eduard Tudenhoefner <
>>>>>>>>> edu...@dremio.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hello everyone,
>>>>>>>>>>
>>>>>>>>>> I would like to get the discussion started around automatic code
>>>>>>>>>> formatting + enforcing and how we get there.
>>>>>>>>>>
>>>>>>>>>> Currently we use Checkstyle *check* to enforce formatting.
>>>>>>>>>> However, the problem with that is that you still have to manually do 
>>>>>>>>>> the
>>>>>>>>>> actual formatting.
>>>>>>>>>>
>>>>>>>>>> What I would like to propose is the usage of *Spotless* (
>>>>>>>>>> https://github.com/diffplug/spotless) for *checking* and
>>>>>>>>>> *enforcing* Java code style (it can also enforce code style for
>>>>>>>>>> Scala, Markdown, ... btw). Spotless is being used by many projects (
>>>>>>>>>> https://github.com/search?l=gradle&q=spotless&type=Code) and
>>>>>>>>>> comes essentially with two tasks:
>>>>>>>>>> * *spotlessCheck*: Checks that sourcecode satisfies formatting
>>>>>>>>>> steps
>>>>>>>>>> * *spotlessApply*: Applies code formatting steps to sourcecode
>>>>>>>>>> in-place
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *Code format*
>>>>>>>>>>
>>>>>>>>>> The problem with code format is that there is no single format
>>>>>>>>>> that can satisfy the preferences of everybody. However, from my 
>>>>>>>>>> experience,
>>>>>>>>>> once people start to use *any* code format that produces
>>>>>>>>>> consistent results across *Eclipse**/IntelliJ/cmd line*, people
>>>>>>>>>> stop worrying about code format details.
>>>>>>>>>> This is also one of the reasons why the creators of Go decided to
>>>>>>>>>> have a code formatter built-in (
>>>>>>>>>> https://go.dev/doc/effective_go#formatting):
>>>>>>>>>>
>>>>>>>>>> *Formatting issues are the most contentious but the least
>>>>>>>>>>> consequential. People can adapt to different formatting styles but 
>>>>>>>>>>> it's
>>>>>>>>>>> better if they don't have to, and less time is devoted to the topic 
>>>>>>>>>>> if
>>>>>>>>>>> everyone adheres to the same style. The problem is how to approach 
>>>>>>>>>>> this
>>>>>>>>>>> Utopia without a long prescriptive style guide.*
>>>>>>>>>>> *With Go we take an unusual approach and let the machine take
>>>>>>>>>>> care of most formatting issues. The gofmt program (also available 
>>>>>>>>>>> as go
>>>>>>>>>>> fmt, which operates at the package level rather than source file 
>>>>>>>>>>> level)
>>>>>>>>>>> reads a Go program and emits the source in a standard style of 
>>>>>>>>>>> indentation
>>>>>>>>>>> and vertical alignment, retaining and if necessary reformatting 
>>>>>>>>>>> comments.*
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I would like to propose using the Google Java Format with
>>>>>>>>>> Spotless. The reason for this format is essentially that this is a
>>>>>>>>>> widely-adopted code format that is designed specifically for code 
>>>>>>>>>> reviews
>>>>>>>>>> (since we're spending more time reviewing code than writing it).
>>>>>>>>>> Additionally, it produces consistent formatting results across
>>>>>>>>>> *Eclipse**/IntelliJ/cmd line*, which I think is another very
>>>>>>>>>> important factor.
>>>>>>>>>>
>>>>>>>>>> Thus, our initial Gradle spotless configuration could look
>>>>>>>>>> similar to the above below:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *pluginManager.withPlugin('com.diffplug.spotless') {    spotless
>>>>>>>>>> {      // don't run spotlessCheck during gradle check task during the
>>>>>>>>>> transition phase      enforceCheck = false      java {        target
>>>>>>>>>> 'src/main/java/**/*.java', 'src/test/java/**/*.java',
>>>>>>>>>> 'src/jmh/java/**/*.java'        googleJavaFormat()      }    }}*
>>>>>>>>>>
>>>>>>>>>> We don't have to use Google Java Format. Spotless also supports
>>>>>>>>>> formatting the code with other formats, but from previous experience 
>>>>>>>>>> the
>>>>>>>>>> Google Java Format seemed to be really the only one to produce 
>>>>>>>>>> consistent
>>>>>>>>>> results across *Eclipse**/IntelliJ/cmd line*.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> *How do we get to a point where the entire codebase is properly
>>>>>>>>>> formatted (and enforceCheck = false can be removed)?*
>>>>>>>>>>
>>>>>>>>>> Now this is a difficult question. Obviously we don't want to have
>>>>>>>>>> a single *format-everything* commit, as that would affect lots
>>>>>>>>>> of in-flight PRs.
>>>>>>>>>>
>>>>>>>>>> There would have to be some form of gradual formatting, for
>>>>>>>>>> example module by module. Spotless offers something called Ratched (
>>>>>>>>>> https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet)
>>>>>>>>>> that allows to enforce code format gradually (but I'm not sure this 
>>>>>>>>>> would
>>>>>>>>>> be a good thing either).
>>>>>>>>>>
>>>>>>>>>> How exactly we'd like to approach this transitioning phase this
>>>>>>>>>> is a completely separate discussion, but I feel like at least we 
>>>>>>>>>> could get
>>>>>>>>>> the ball rolling so that we make it also easier for newcomers to 
>>>>>>>>>> contribute
>>>>>>>>>> to the project, since it would be straightforward for them to make 
>>>>>>>>>> their
>>>>>>>>>> PRs adhere to the code format and also save time during PR reviews.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Eduard
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Ryan Blue
>>>>>>>>> Tabular
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Ryan Blue
>>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Ryan Blue
>>>>> Tabular
>>>>>
>>>>
>>
>> --
>> Ryan Blue
>> Tabular
>>
>

Reply via email to