Hi!

I'm one of the co-authors of the KIP.
Let me answer a few questions (the thread is a bit long, so I won't
reply inline).

* Many of the comments point out that maintaining multiple BUILD
files looks like a higher cognitive burden than what Gradle currently
requires, since Gradle simply globs all the sources per module. That
is correct, and Bazel does not shy away from this. The trade-off is
that, with some additional care and fidelity in the build graph, the
tool aims to be faster through early-cutoff optimization.
(This is a bit of an odd statement since, during the transition, we've
opted to copy the existing target structure as-is into Bazel, so you
see similarly large globs. Our goal is to continue refining these
internally at Confluent into smaller targets.)

* Bazel has good support for remote caching and remote execution.
Apache Kafka, being an open source project, is eligible for some
free-tier usage from public SaaS providers such as
https://www.buildbuddy.io/
(They list a few notable open source repos they serve at
https://www.buildbuddy.io/open-source-repos/)
Of course, Apache-sponsored infrastructure would be a good investment as well :)

* Performance: This is tricky, and it's tough to replicate the
environment exactly. Intuition suggests, though, that a cold-cache
(clean) build of a fine-grained build graph in Bazel may be slower
than Gradle, since a lot of IO is spent creating many JARs.
(Although I have some interesting numbers suggesting that a much
smaller CLASSPATH noticeably improves javac; I gathered them as part of
https://fzakaria.com/2024/11/08/jvm-boot-optimization-via-javaindex.html)

* The folder structure in our demo repo (kafka-overlay) differs only
so that we can apply the overlay pattern
(https://fzakaria.com/2024/08/29/bazel-overlay-pattern.html). The
directory structure would stay the same if Bazel were adopted.
(Meta commentary: long-term it could change, though. Top-level modules
are unnecessary in Bazel since the Java packages themselves are the
build targets.)
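
To make the granularity and early-cutoff point from the first bullet
concrete, here is a toy model in Python. It is only a sketch, not how
Bazel is implemented: the target names, file contents, and the idea of
treating the first line of a file as its "public interface" are all
assumptions made up for illustration.

```python
import hashlib


def digest(*parts):
    """Stable content hash over a sequence of strings."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode())
        h.update(b"\0")
    return h.hexdigest()


def build(targets, sources, cache):
    """Build targets in order and return the names that actually reran.

    targets maps name -> (srcs, deps) and must be listed in dependency
    order. A target's cache key covers its own sources and the outputs
    of its deps, so an unchanged dep output means a cache hit downstream
    (the early cutoff).
    """
    rebuilt, outputs = [], {}
    for name, (srcs, deps) in targets.items():
        key = digest(name, *[sources[s] for s in srcs],
                     *[outputs[d] for d in deps])
        if key not in cache:
            # Toy "compile": the output models only the public interface,
            # assumed here to be the first line of each source file.
            cache[key] = digest(*[sources[s].splitlines()[0] for s in srcs])
            rebuilt.append(name)
        outputs[name] = cache[key]
    return rebuilt


# Hypothetical sources and targets, not the real Kafka layout.
SOURCES = {
    "common/Utils.java": "class Utils\nbody v1",
    "clients/Producer.java": "class Producer\nbody v1",
    "clients/Consumer.java": "class Consumer\nbody v1",
}
COARSE = {"all": (list(SOURCES), [])}  # one target globbing everything
FINE = {
    "common": (["common/Utils.java"], []),
    "producer": (["clients/Producer.java"], ["common"]),
    "consumer": (["clients/Consumer.java"], ["common"]),
}
BODY_EDIT = {"common/Utils.java": "class Utils\nbody v2"}
API_EDIT = {"common/Utils.java": "class Utils2\nbody v1"}


def files_recompiled(targets, edit):
    """Warm the cache, apply an edit, and count source files recompiled."""
    cache, sources = {}, dict(SOURCES)
    build(targets, sources, cache)
    sources.update(edit)
    return sum(len(targets[t][0]) for t in build(targets, sources, cache))


print("coarse, body edit:", files_recompiled(COARSE, BODY_EDIT))
print("fine, body edit:  ", files_recompiled(FINE, BODY_EDIT))
print("fine, API edit:   ", files_recompiled(FINE, API_EDIT))
```

In this model, a body-only edit recompiles every file under the single
globbed target but only one file with the finer targets, because the
dependents see an unchanged output and are cut off early. An interface
change still rebuilds everything downstream in both layouts.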


On Fri, Nov 22, 2024 at 9:15 AM Vince Rose <vr...@confluent.io> wrote:
>
>
>
> On Fri, Nov 22, 2024 at 10:01 AM Viktor Somogyi-Vass 
> <viktor.somo...@cloudera.com.invalid> wrote:
>>
>> Hi Vincent
>>
>> I would also like to say that I'm not an expert in Bazel either. Generally
>> I think it's good to reduce build and test times in every infra. While
>> reading up on Bazel, it seems like it's most useful in monolithic code
>> bases, such as Confluent as you mentioned in the KIP (so it's not a
>> coincidence :) ). I have a few thoughts though:
>>
>> SVV1. That 175 vs 100 seconds in the no-cache seems to be a big difference.
>> There is a performance benchmark by Gradle (
>> https://blog.gradle.org/gradle-vs-bazel-jvm) which might be biased from
>> their side, but seems to conclude different results. One thing they mention
>> in the article is that they had one BAZEL.build for a project, instead of
>> the package level granularity. In your provided example, you seem to have
>> it on a per module basis. Could this be the difference in performance? Have
>> you tried it out with different granularities for Bazel?
>>
>> SVV2. I can share the arguments in CP2 as well. Bazel would optimally need
>> a build file in every package, which you'll need to maintain manually. This
>> is an extra burden on the developer side. Based on the example you cited,
>> Gradle seems to have better JVM integration compared to Bazel as the latter
>> one tries to be a generic build tool and not tuned for the JVM. I think
>> that this could be a potential concern.
>>
>> SVV3. Before switching to a completely different build system, I think we
>> have to ask ourselves: did we exhaust any optimization in Gradle that can
>> be done, and is it really that slower to build in it? As the article above
>> mentioned, if Bazel build files are defined on package level, then the
>> package level cyclic dependencies can be eliminated. This, however, can be
>> done with Java's module system too, which would be the idiomatic Java way.
>> If this is done, perhaps it could speed up Gradle builds as well.
>>
>> Best,
>> Viktor
>>
>> On Fri, Nov 22, 2024 at 5:43 PM Gaurav Narula <ka...@gnarula.com> wrote:
>>
>> > Thanks for the KIP Vince and Farid!
>> >
>> > Glad to see Bazel being considered for Kafka and really impressive work on
>> > the demo repository!
>> >
>> > From the benchmarks, I'd imagine the build would only get faster once
>> > BUILD.bazel files don't glob the entire source/test set for a module but
>> > are more fine grained. I suppose it would be even faster if we were to use
>> > RBE [0] for executing builds/tests in parallel with caching [1]. Would be
>> > nice to get a sense of the potential improvements there if you've already
>> > done some experiments and some possible RBE options that we may use? Maybe
>> > another motivation/optimisation would be the use of target-determinator
>> > [2], previously discussed in [3] for running partial tests for PRs?
>> >
>> > I see that you mention potential changes in IDE configurations. It would
>> > be nice to mention plugins for common IDEs that support Bazel. IIRC,
>> > IntelliJ has decent Bazel support [4] but (IMHO) it's not as good as the
>> > Gradle integration. Can you perhaps discuss the potential changes in
>> > developer workflows, e.g. the need to add dependencies/files to BUILD.bazel
>> > files manually or running Gazelle to create BUILD.bazel files 
>> > automatically?
>> >
>> > I see that in the example repository you declare the versions manually
>> > [5]. I think it would be cumbersome to maintain two separate versions list
>> > and perhaps we can generate the version mapping from
>> > `gradle/dependencies.gradle`, unless they need to deviate for some reason?
>> >
>> > To add to some of @Chia's questions and perhaps answer them partially:
>> >
>> > > CP0: Can Bazel help simplify build.gradle (currently around 3826 lines)?
>> >
>> > I think some of the complexity moves to *.bzl, *.bazel files [6] [7] [8]
>> >
>> > The configuration is in a language called Starlark [9] which is a subset
>> > of Python that avoids non-determinism. I'd say the configuration is more
>> > verbose than Gradle but is easier to reason about. I often find Gradle
>> > plugins do a lot of magic and are harder to navigate.
>> >
>> > > CP1: Gradle commands are part of the public API, so in which Kafka
>> > version should we announce the deprecation of Gradle commands?
>> >
>> > My impression is once we get rid of Gradle so Phase 5?
>> >
>> > > CP2: The example (https://github.com/confluentinc/kafka-bazel) seems to
>> > modify the folder structure of the entire Kafka project. Is this change
>> > required? My concern is that such a major change should be done in a major
>> > release. If restructuring is a requirement for Bazel, does that mean we
>> > have to accept it after this KIP pass?
>> >
>> > I stumbled upon Farid's blog post [10] which talks a bit about this and my
>> > impression is the overlay can be merged into the main repository without
>> > changing its structure, potentially in a side directory as LLVM did [11]. I
>> > suppose this is what Phase 2 is about?
>> >
>> > That said, I do think refining the build graph, which includes having more
>> > fine grained BUILD.bazel files, may warrant moving some classes/packages
>> > around. This is because Bazel isn't happy with cycles in the build graph
>> > while Gradle allows them within a module.
>> >
>> > I'm looking forward to trying the overlay repository in the coming days
>> > and will follow up with more questions/feedback!
>> >
>> > Regards,
>> > Gaurav
>> >
>> > [0]: https://bazel.build/remote/rbe
>> > [1]: https://bazel.build/remote/caching
>> > [2]: https://github.com/bazel-contrib/target-determinator
>> > [3]: https://lists.apache.org/thread/xqbj7ywmwf5zfycgo1ckyhl9zs43h1dd
>> > [4]: https://github.com/bazelbuild/intellij
>> > [5]:
>> > https://github.com/confluentinc/kafka-bazel/blob/a165c512c5d43256564b53ee0be2fbfe7388d5f9/kafka-bazel/kafka-overlay/dependencies/dependencies.bzl#L24
>> > [6]:
>> > https://github.com/confluentinc/kafka-bazel/tree/a165c512c5d43256564b53ee0be2fbfe7388d5f9/kafka-bazel/kafka-overlay/tools-bazel
>> > [7]:
>> > https://github.com/confluentinc/kafka-bazel/tree/a165c512c5d43256564b53ee0be2fbfe7388d5f9/kafka-bazel/kafka-overlay/dependencies
>> > [8]:
>> > https://github.com/confluentinc/kafka-bazel/blob/a165c512c5d43256564b53ee0be2fbfe7388d5f9/kafka-bazel/WORKSPACE.bazel
>> > [9]: https://github.com/bazelbuild/starlark
>> > [10]: https://fzakaria.com/2024/08/29/bazel-overlay-pattern.html
>> > [11]: https://github.com/llvm/llvm-project/tree/main/utils/bazel
>> >
>> > > On 22 Nov 2024, at 03:39, Chia-Ping Tsai <chia7...@apache.org> wrote:
>> > >
>> > > hi Vince
>> > >
>> > > Thanks for introducing this interesting KIP! I'm not an expert in Bazel,
>> > so please be patient with my potentially odd questions:
>> > >
>> > > CP0: Can Bazel help simplify build.gradle (currently around 3826 lines)?
>> > >
>> > > CP1: Gradle commands are part of the public API, so in which Kafka
>> > version should we announce the deprecation of Gradle commands?
>> > >
>> > > CP2: The example (https://github.com/confluentinc/kafka-bazel) seems to
>> > modify the folder structure of the entire Kafka project. Is this change
>> > required? My concern is that such a major change should be done in a major
>> > release. If restructuring is a requirement for Bazel, does that mean we
>> > have to accept it after this KIP pass?
>> > >
>> > > Best,
>> > > Chia-Ping
>> > >
>> > > On 2024/11/21 21:16:50 Vince Rose wrote:
>> > >> Hey devs,
>> > >>
>> > >> I want to start a discussion about introducing Bazel as a build system
>> > for
>> > >> Apache Kafka.
>> > >>
>> > >>
>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1115%3A+Bazel+Builds
>> > >>
>> > >> -Vince
>> > >>
>> >
>> >
