As previously discussed [1], I took on the effort of trying to come up with a demo for using Bazel as a build system for C++/Python. The results [2] are a bit of a mixed bag.

I was able to construct an example that runs on my Mac and can compile and run most of the tests in "src/arrow", as well as the IPC read/write test and a Python test (test_array.py). I also have C++ Flight compiling. A demonstration of how different library locations can be selected is also available [3]. This would need a lot more work to reach the functionality that CMake currently has.

After going through this exercise I put together a list of pros and cons below.

I would like to hear from other devs:
1. Their opinions on setting this up as an alternative system (I'm willing to invest some more time in it).
2. What people think the minimum bar for merging a PR like this should be.

Pros:
1. Being able to run "bazel test python/..." and having compilation of all Python dependencies just work is a nice experience.
2. Because of the granular compilation units, it can improve developer velocity. Unit tests can depend only on the sub-components they are meant to test. They don't need to compile and relink arrow.so.
3. The built-in documentation it provides about visibility and relationships between components is nice (it has uncovered some "interesting dependencies"). I didn't make heavy use of it, but its concept of "visibility" makes it more explicit what external consumers should be depending on, and what components within the project should depend on (e.g. explicitly limiting the scope of vendored code).
4. Extensions are essentially Python, which might be easier to work with than CMake.

Cons:
1. Bazel is opinionated on C++ layout. In particular, it requires some workarounds to deal with circular .h/.cc dependencies. The two main ways of doing this are either increasing the size of compilable units [4] to span all dependencies in the cycle, or creating separate header/implementation targets; I've used both strategies in the PR. One could argue that it would be nice to reduce circular dependencies in general.
2. Bazel's Python support still seems lacking. To make the test work, I needed to explicitly include all transitive dependencies of the "pip" installed packages by hand.
3. Bazel in general doesn't seem to have wide adoption, so any customization probably won't have a whole lot of support (I've been told there are some adapters for CMake that can leverage some of the existing code).
4. It is more verbose to configure than CMake (each compilation unit needs to be spelled out with its dependencies).
5. The "packaging" story of different build artifacts still needs to be explored.

Thanks,
Micah

[1] https://lists.apache.org/thread.html/26c2a9e7e35ffc6f6ff68fbbfb38a0a33002b8e7210e8d323566f447@%3Cdev.arrow.apache.org%3E
[2] https://github.com/apache/arrow/pull/5897/files
[3] https://github.com/apache/arrow/pull/5897/files#diff-85ecc9fdaae4c714198a1c31c7748f2a
[4] https://github.com/apache/arrow/pull/5897/files#diff-c23198ffa8af9adf6825cb9c6f6e135b