I am afraid it is not easily possible. I looked into it a bit and even if we create a custom formatting step in Spotless (easy), there is no way to simply change the line limit. We would have to fork the Google format lib. I don’t think it is worth. Let’s just stick to 100 then.
- Anton > On Jul 29, 2022, at 10:03 AM, Ryan Blue <b...@tabular.io> wrote: > > Is line length something that we can fix? That seems like a reasonable change > to Google style that we may be able to do something about. > > On Fri, Jul 29, 2022 at 9:52 AM Anton Okolnychyi > <aokolnyc...@apple.com.invalid> wrote: > Austin, I don’t think anyone questions the value of consistent code or the > value of automatic formatting. It is quite the opposite, actually. I think > everyone on this thread actually cares about those things :) > > Was automatic formatting helpful? For sure! Did changing the line limit from > 120 to 100 on a 4+ year project brought a lot of value, for example? I am not > sure. > > - Anton > >> On Jul 29, 2022, at 9:44 AM, Yufei Gu <flyrain...@gmail.com >> <mailto:flyrain...@gmail.com>> wrote: >> >> It is the right direction to remove the trivial part of the coding. Just >> wondering why we change the max line length from 120 to 100. >> >> Best, >> >> Yufei >> >> `This is not a contribution` >> >> >> On Fri, Jul 29, 2022 at 9:21 AM Austin Bennett <whatwouldausti...@gmail.com >> <mailto:whatwouldausti...@gmail.com>> wrote: >> Probably worthwhile to really think about the long-term desired outcomes. >> Even a large, one-time, disruption of the codebase for the sake of future >> consistency could be quite valuable. Remember you'll also lose fidelity >> with history/blame. At some point a 'breaking' ( in a different sense ) >> change might be useful rather than anchoring on what exists. >> >> >> >> On Fri, Jul 29, 2022 at 8:59 AM Anton Okolnychyi >> <aokolnyc...@apple.com.invalid <mailto:aokolnyc...@apple.com.invalid>> wrote: >> On a side note, having automatic formatting was really nice. I think it is >> the right direction. If we only had a way to tweak a few rules like the max >> line length, one-line Javadoc and maybe consider using the Google format for >> 1.8 cause we are using the 1.7 format at the moment, which was not designed >> for lambdas. That way we may have all the benefits of automatic formatting >> with less disruption to the code. >> >> - Anton >> >>> On Jul 28, 2022, at 7:35 PM, Anton Okolnychyi >>> <aokolnyc...@apple.com.INVALID <mailto:aokolnyc...@apple.com.INVALID>> >>> wrote: >>> >>> I had a chance to rebase one of my PRs. It is really hard to justify the >>> new max line length of 100 chars. I think the most unreadable code is >>> usually around places when single statements are split into multiple lines. >>> Since Java has explicit types and we prefer full and elaborate class/var >>> names, the new line limit alone triggered so many changes and requires >>> splitting a lot of statements that would previously fit on a single line. >>> >>> The max line length is one of the reasons why Palantir folks forked the >>> Google format. >>> >>> https://github.com/palantir/palantir-java-format >>> <https://github.com/palantir/palantir-java-format> >>> >>> - Anton >>> >>> >>>> On Jul 28, 2022, at 7:42 AM, Eduard Tudenhoefner <edu...@tabular.io >>>> <mailto:edu...@tabular.io>> wrote: >>>> >>>> Thanks for your honest feedback Anton, which I really appreciate. I agree >>>> with the points you brought up. >>>> >>>> I don't prefer the Google Java Format either, but it is at least something >>>> that produces consistent results across tools. >>>> If you ask me, I would love to customize the code format and have it as >>>> close as possible to what the Iceberg project had. But one of the reasons >>>> why the format isn't customizable is because it is difficult to support >>>> such customizations across tools. >>>> >>>> The problem with code format is that there is no single format that can >>>> satisfy the preferences of everybody. However, from my experience, once >>>> people start to use any code format that produces consistent results >>>> across Eclipse/IntelliJ/cmd line, people stop worrying about code format >>>> details. >>>> >>>> This is also one of the reasons why the creators of Go decided to have a >>>> code formatter built-in (https://go.dev/doc/effective_go#formatting >>>> <https://go.dev/doc/effective_go#formatting>). >>>> >>>> In the long run I think the decision to use Spotless + code formatting >>>> helps, especially for new people that want to contribute and don't want to >>>> feel frustrated due to review feedback around code style. >>>> >>>> Eduard >>>> >>>> On Thu, Jul 28, 2022 at 6:15 AM Anton Okolnychyi <aokolnyc...@apple.com >>>> <mailto:aokolnyc...@apple.com>> wrote: >>>> I am late to the discussion, but I would like to express my opinion. >>>> >>>> As someone who deeply cares about the code consistency and leaves a lot of >>>> style-related nits, I am definitely happy to see efforts to simplify the >>>> life of the reviewers and automate the formatting process. Explaining the >>>> code style to new contributors consumes a lot of time and results in more >>>> iterations on PRs. Automating this would be great! >>>> >>>> However, I am concerned regarding the changes merged in PR #5312. >>>> >>>> - Many changes that went in that PR were against the original code style >>>> of the project that we, committers and contributors, tried to maintain for >>>> all these years. I think we had a lot of well-structured and consistent >>>> places, which attracted devs to Iceberg. >>>> - I feel the code quality dropped as some places were written in the old >>>> style and now look really weird. If we followed the new style originally, >>>> maybe, it would look much better as we would structure the code >>>> differently. >>>> - As a someone whose team maintains an internal fork that is very close to >>>> master (I know we are not the only ones), I feel such a massive change >>>> will be a challenge to cherry-pick. I am not convinced introducing >>>> automatic formatting had to be so radical. >>>> >>>> The first two points concern me the most. Since I am late to the >>>> discussion, it would not be fair for me to request changes. However, I >>>> strongly suggest the community to reconsider the approach we took while we >>>> haven’t merged many changes on top of PR #5312. >>>> >>>> To sum up, I definitely support automating the formatting to simplify >>>> reviews and help new contributors but I would try to come with a way to >>>> make it less radical and follow whatever we had all these years. We did >>>> discuss some of the formatting guidelines a few years ago and it is all >>>> different now. >>>> >>>> - Anton >>>> >>>>> On Jul 27, 2022, at 11:40 AM, Steven Wu <stevenz...@gmail.com >>>>> <mailto:stevenz...@gmail.com>> wrote: >>>>> >>>>> Eduard, thank you for driving this effort! Great work! >>>>> >>>>> On Wed, Jul 27, 2022 at 11:11 AM Eduard Tudenhoefner <edu...@tabular.io >>>>> <mailto:edu...@tabular.io>> wrote: >>>>> Hey everyone, >>>>> >>>>> Google Java Format + Spotless integration have been merged as part of >>>>> https://github.com/apache/iceberg/pull/5266 >>>>> <https://github.com/apache/iceberg/pull/5266>. >>>>> >>>>> The big bang reformatting (aka spotlessApply) will happen as part of >>>>> https://github.com/apache/iceberg/pull/5312 >>>>> <https://github.com/apache/iceberg/pull/5312> and once this PR is merged, >>>>> please perform the following steps if you have open PRs: >>>>> >>>>> 1. Rebase your PR branch against the commit before the „Big Bang“ >>>>> 2. Run `./gradlew spotlessApply` >>>>> 3. Squash all commits >>>>> 4. Rebase against the latest HEAD of the master branch >>>>> 5. Force-push your branch >>>>> >>>>> This should hopefully produce a minimum of conflicts on PRs. >>>>> >>>>> Additionally, make sure to install the google-java-format plugin for your >>>>> IDE as outlined in https://github.com/apache/iceberg-docs/pull/125 >>>>> <https://github.com/apache/iceberg-docs/pull/125> so that you can format >>>>> code inside the IDE as well. >>>>> >>>>> Eduard >>>>> >>>>> On Fri, Jul 15, 2022 at 12:45 PM Eduard Tudenhoefner <edu...@tabular.io >>>>> <mailto:edu...@tabular.io>> wrote: >>>>> That's correct Ryan. The Google style is what can be applied across tools >>>>> and Spotless is the tool that we use to check and apply that style (and >>>>> also auto-apply the license header). The apply part is the important >>>>> piece that Checkstyle for example is missing (as it's tedious to manually >>>>> apply what checkstyle:check complains about). >>>>> >>>>> Additionally, Spotless >>>>> <https://github.com/diffplug/spotless/tree/main/plugin-gradle> supports a >>>>> bunch of other languages, so it would even be possible to use it for >>>>> auto-formatting Gradle files (or Scala files using scalafmt) if we wanted >>>>> to, but for now I would probably focus on the Java codebase. >>>>> >>>>> Eduard >>>>> >>>>> On Thu, Jul 14, 2022 at 6:58 PM Ryan Blue <b...@tabular.io >>>>> <mailto:b...@tabular.io>> wrote: >>>>> Okay, I understand. So it isn't that spotless is supported across tools, >>>>> it is that google style is supported and we can use Spotless to both >>>>> check and apply that style. Is that correct? >>>>> >>>>> On Thu, Jul 14, 2022 at 9:56 AM Eduard Tudenhoefner <edu...@tabular.io >>>>> <mailto:edu...@tabular.io>> wrote: >>>>> Yeah it's how Dmitri said, the Google style is most likely on purpose not >>>>> configurable, because otherwise it's getting more difficult to achieve >>>>> the same formatting results across IDEs & CMD line. >>>>> >>>>> >>>>> On Wed, Jul 13, 2022 at 7:39 PM Dmitri Bourlatchkov >>>>> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> >>>>> wrote: >>>>> Yeah, the Google style has very specific wrapping limits and javadoc >>>>> format. It is not configurable, if we go with it. >>>>> >>>>> From my POV, using another style would be fine. However, with a custom >>>>> style it can be hard and tricky to ensure that all IDEs enforce it the >>>>> same way as spotless. >>>>> >>>>> The Google style has a plugin at least for IntelliJ (and likely for >>>>> Eclipse too) so it will behave the same way in IDE and cmd. line build. >>>>> >>>>> On Wed, Jul 13, 2022 at 1:28 PM Ryan Blue <b...@tabular.io >>>>> <mailto:b...@tabular.io>> wrote: >>>>> Is there any value in trying to get Spotless to match the current format >>>>> as closely as possible? It would be great to have fewer changes. >>>>> >>>>> For example, when I tested the apply, I see this: >>>>> >>>>> - /** >>>>> - * Returns an initialized {@link AliyunProperties} >>>>> - */ >>>>> + /** Returns an initialized {@link AliyunProperties} */ >>>>> Changes like that don’t seem very valuable to me. Similarly, it looks >>>>> like a lot of text is wrapped at a different line length. If we can alter >>>>> that, we could easily avoid a lot of changes. >>>>> >>>>> Ryan >>>>> >>>>> >>>>> On Wed, Jul 13, 2022 at 7:41 AM Eduard Tudenhoefner <edu...@tabular.io >>>>> <mailto:edu...@tabular.io>> wrote: >>>>> As a first step, I created https://github.com/apache/iceberg/pull/5266 >>>>> <https://github.com/apache/iceberg/pull/5266> which configures Spotless >>>>> to use the Google Java Format and also apply the correct copyright header >>>>> for Java files. >>>>> >>>>> Once this PR is merged, the next steps would be: >>>>> removing conflicting Checkstyle rules that are not in line with the >>>>> Google format >>>>> formatting the entire code base via `./gradlew spotlessApply` >>>>> setting `enforceCheck` to `true` in >>>>> https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48 >>>>> >>>>> <https://github.com/apache/iceberg/blob/80318d8cfbeb0d96d0afc27c84bc3dbddde35344/baseline.gradle#L48> >>>>> so that validation fails if code isn't properly formatted >>>>> updating docs around the current Formatter usage and how to configure >>>>> Eclipse/IntelliJ >>>>> The first 3 steps should be done together as part of the big bang. >>>>> >>>>> Eduard >>>>> >>>>> On Mon, Jul 11, 2022 at 10:54 PM Ryan Blue <b...@apache.org >>>>> <mailto:b...@apache.org>> wrote: >>>>> Okay, it sounds like there's mostly agreement for going with spotless. >>>>> Let's try that out. We'll work on some changes to add spotless so that >>>>> `spotlessApply` works. Then we can do the big bang migration (which I >>>>> also agree is the best option) just before the 1.0. >>>>> >>>>> Thanks, everyone! >>>>> >>>>> On Mon, Jul 11, 2022 at 11:50 AM Dmitri Bourlatchkov >>>>> <dmitri.bourlatch...@dremio.com <mailto:dmitri.bourlatch...@dremio.com>> >>>>> wrote: >>>>> My experience with the Google Code Style + Spotless was positive too. >>>>> >>>>> I'd be fine with another code style as long as it is "deterministic" >>>>> (e.g. does not make changes on repeated execution) and works in IntelliJ >>>>> IDEA / Eclipse / etc. >>>>> >>>>> Regarding cherry-picking into older branches, I think Robert's suggestion >>>>> can be tweaked slightly to be helpful there too: >>>>> >>>>> 1. Checkout old branch >>>>> 2. Apply the new style (run gradle ...) >>>>> 3. Cherry-pick without committing >>>>> 4. Manually revert to old style >>>>> 5. Commit >>>>> 6. Reset to original branch HEAD >>>>> 7. Cherry pick commit 5 again >>>>> >>>>> It's a bit lengthy and may be a tedious process, but it should allow >>>>> applying the git-level changes mostly automatically. >>>>> >>>>> Cheers, >>>>> Dmitri. >>>>> >>>>> >>>>> On Fri, Jul 8, 2022 at 2:53 AM Robert Stupp <sn...@snazy.de >>>>> <mailto:sn...@snazy.de>> wrote: >>>>> From my experience, it’s a big win to have automatic code formatting. >>>>> >>>>> In projectnessie we use automatic code formatting for all languages and >>>>> haven’t serious issues with Spotless. It is just nice to not have to bike >>>>> shed about whitespaces, line breaks, brackets, etc. It was a bit of >>>>> discussion, because people had bad memories from past experiences with >>>>> automatic code formatting breaking code and introducing subtle bugs. >>>>> >>>>> I think that using code styles that „do not allow bike shedding“ (Google >>>>> Code Style) are a very good option. >>>>> >>>>> So far none of us has seen issues with any of the Spotless code >>>>> formatters that we use: XML, Kotlin/Gradle, Kotlin, Antlr4, Java, Scala - >>>>> relying on the „standard“ settings w/o any customizations. We use this >>>>> piece of code, externalized into an internal Gradle plugin: >>>>> https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt >>>>> >>>>> <https://github.com/projectnessie/gradle-build-plugins/blob/main/spotless/src/main/kotlin/org/projectnessie/buildtools/spotless/SpotlessHelperPlugin.kt> >>>>> For Iceberg, it would probably be nice to have some Groovy code >>>>> formatting for the build scripts as well. >>>>> >>>>> Sure, the migration will add some pain. IMHO the best option is a „big >>>>> bang“ across the whole code base, because it happens only once. Migrating >>>>> one module after another is a „repeated series of pains“. >>>>> >>>>> Since the result of a `./gradlew spotlessApply` is deterministic, people >>>>> that have open PRs could: >>>>> 1. Rebase their PR branch against the commit before the „Big Bang“ >>>>> 2. Include a commit with the necessary Gradle build change (one the only >>>>> contains the changes to add Spotless) >>>>> 3. Do the `./gradlew spotlessApply` >>>>> 4. Squash all commits in the PR-branch >>>>> 5. Rebase again - against the HEAD of the master branch >>>>> 6. Force-push PR-branch >>>>> Because git is „clever enough“ to eliminate the „duplicated/unrelated >>>>> changes“, the final result of the above steps is just the diff with the >>>>> changes for the open PR. >>>>> >>>>> >>>>>> Am 08.07.2022 um 00:59 schrieb Ryan Blue <b...@tabular.io >>>>>> <mailto:b...@tabular.io>>: >>>>>> >>>>>> We were just talking about this proposal internally. I think it would be >>>>>> great to have automatic code formatting, especially since we have to >>>>>> point out a lot of changes manually. The main question is how to get >>>>>> there without too much disruption. This came up in our discussions >>>>>> around the upcoming 1.0 release, since that may be a good opportunity to >>>>>> make all of the code changes. >>>>>> >>>>>> For background, the main concern about adding something like this is >>>>>> applying all of the changes needed to get the existing code to conform >>>>>> to the new style. That is really disruptive because it will cause all of >>>>>> the PRs to need to be rebased and makes it really difficult to >>>>>> cherry-pick changes from after the code formatting happens to branches >>>>>> that were created before code formatting. The 1.0 release makes a good >>>>>> opportunity because we are making other changes (removing deprecations) >>>>>> and will hopefully have people upgrading their branches to the new major >>>>>> version, rather than cherry picking. >>>>>> >>>>>> This is as good a time as any to add automatic code formatting, but it's >>>>>> up to the community: so should we refromat the project and apply >>>>>> spotless code formatting everywhere? I'm interested to hear opinions! >>>>>> >>>>>> Ryan >>>>>> >>>>>> On Thu, Mar 10, 2022 at 3:00 AM Eduard Tudenhoefner <edu...@dremio.com >>>>>> <mailto:edu...@dremio.com>> wrote: >>>>>> Hello everyone, >>>>>> >>>>>> I would like to get the discussion started around automatic code >>>>>> formatting + enforcing and how we get there. >>>>>> >>>>>> Currently we use Checkstyle check to enforce formatting. However, the >>>>>> problem with that is that you still have to manually do the actual >>>>>> formatting. >>>>>> >>>>>> What I would like to propose is the usage of Spotless >>>>>> (https://github.com/diffplug/spotless >>>>>> <https://github.com/diffplug/spotless>) for checking and enforcing Java >>>>>> code style (it can also enforce code style for Scala, Markdown, ... >>>>>> btw). Spotless is being used by many projects >>>>>> (https://github.com/search?l=gradle&q=spotless&type=Code >>>>>> <https://github.com/search?l=gradle&q=spotless&type=Code>) and comes >>>>>> essentially with two tasks: >>>>>> * spotlessCheck: Checks that sourcecode satisfies formatting steps >>>>>> * spotlessApply: Applies code formatting steps to sourcecode in-place >>>>>> >>>>>> >>>>>> Code format >>>>>> >>>>>> The problem with code format is that there is no single format that can >>>>>> satisfy the preferences of everybody. However, from my experience, once >>>>>> people start to use any code format that produces consistent results >>>>>> across Eclipse/IntelliJ/cmd line, people stop worrying about code format >>>>>> details. >>>>>> This is also one of the reasons why the creators of Go decided to have a >>>>>> code formatter built-in (https://go.dev/doc/effective_go#formatting >>>>>> <https://go.dev/doc/effective_go#formatting>): >>>>>> >>>>>> Formatting issues are the most contentious but the least consequential. >>>>>> People can adapt to different formatting styles but it's better if they >>>>>> don't have to, and less time is devoted to the topic if everyone adheres >>>>>> to the same style. The problem is how to approach this Utopia without a >>>>>> long prescriptive style guide. >>>>>> With Go we take an unusual approach and let the machine take care of >>>>>> most formatting issues. The gofmt program (also available as go fmt, >>>>>> which operates at the package level rather than source file level) reads >>>>>> a Go program and emits the source in a standard style of indentation and >>>>>> vertical alignment, retaining and if necessary reformatting comments. >>>>>> >>>>>> >>>>>> I would like to propose using the Google Java Format with Spotless. The >>>>>> reason for this format is essentially that this is a widely-adopted code >>>>>> format that is designed specifically for code reviews (since we're >>>>>> spending more time reviewing code than writing it). >>>>>> Additionally, it produces consistent formatting results across >>>>>> Eclipse/IntelliJ/cmd line, which I think is another very important >>>>>> factor. >>>>>> >>>>>> Thus, our initial Gradle spotless configuration could look similar to >>>>>> the above below: >>>>>> >>>>>> >>>>>> pluginManager.withPlugin('com.diffplug.spotless') { >>>>>> spotless { >>>>>> // don't run spotlessCheck during gradle check task during the >>>>>> transition phase >>>>>> enforceCheck = false >>>>>> java { >>>>>> target 'src/main/java/**/*.java', 'src/test/java/**/*.java', >>>>>> 'src/jmh/java/**/*.java' >>>>>> googleJavaFormat() >>>>>> } >>>>>> } >>>>>> } >>>>>> >>>>>> We don't have to use Google Java Format. Spotless also supports >>>>>> formatting the code with other formats, but from previous experience the >>>>>> Google Java Format seemed to be really the only one to produce >>>>>> consistent results across Eclipse/IntelliJ/cmd line. >>>>>> >>>>>> >>>>>> How do we get to a point where the entire codebase is properly formatted >>>>>> (and enforceCheck = false can be removed)? >>>>>> >>>>>> Now this is a difficult question. Obviously we don't want to have a >>>>>> single format-everything commit, as that would affect lots of in-flight >>>>>> PRs. >>>>>> >>>>>> There would have to be some form of gradual formatting, for example >>>>>> module by module. Spotless offers something called Ratched >>>>>> (https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet >>>>>> <https://github.com/diffplug/spotless/tree/main/plugin-gradle#ratchet>) >>>>>> that allows to enforce code format gradually (but I'm not sure this >>>>>> would be a good thing either). >>>>>> >>>>>> How exactly we'd like to approach this transitioning phase this is a >>>>>> completely separate discussion, but I feel like at least we could get >>>>>> the ball rolling so that we make it also easier for newcomers to >>>>>> contribute to the project, since it would be straightforward for them to >>>>>> make their PRs adhere to the code format and also save time during PR >>>>>> reviews. >>>>>> >>>>>> >>>>>> Eduard >>>>>> >>>>>> >>>>>> -- >>>>>> Ryan Blue >>>>>> Tabular >>>>> >>>>> >>>>> >>>>> -- >>>>> Ryan Blue >>>>> >>>>> >>>>> -- >>>>> Ryan Blue >>>>> Tabular >>>>> >>>>> >>>>> -- >>>>> Ryan Blue >>>>> Tabular >>>> >>> >> > > > > -- > Ryan Blue > Tabular