Devs, I would ask that we be as conservative as possible in choosing (keeping) a build system.
For developers, packagers and, for some languages, even end-users, the build system is just another dependency. Even if CMake is not ideal, it has become quite ubiquitous, which is a huge plus.

Maybe it is possible to come up with a way of expressing the dependency relations in CMake that makes maintaining them easier. Otherwise, maybe it is possible to generate them from a (simple) description file?

Cheers,
Maarten.

> On Oct 19, 2019, at 11:22 PM, Micah Kornfield <emkornfi...@gmail.com> wrote:
>
>> Perhaps meson is also worth exploring?
>
> It could be. If someone else wants to take a look, we can compare what
> things look like in each. Recently, Bazel build rules seem like they would be
> useful for some work projects I've been dealing with, so I plan on focusing
> my exploration there.
>
> On Wed, Oct 16, 2019 at 6:27 AM Antoine Pitrou <anto...@python.org> wrote:
>>
>> Perhaps meson is also worth exploring?
>>
>> On 15/10/2019 at 23:06, Micah Kornfield wrote:
>>> Hi Wes,
>>> I agree on both counts: it won't be done in the short term, and it
>>> makes sense to tackle it incrementally. Like I said, I don't have much
>>> bandwidth at the moment but might be able to re-arrange a few things on my
>>> plate. I think some people have asked on the mailing list how they might
>>> be able to help; this might be one area that doesn't require a lot of
>>> in-depth knowledge of C++, at least for a proof of concept. I'll try to
>>> open up some JIRAs soon.
>>>
>>> Thanks,
>>> Micah
>>>
>>> On Tue, Oct 15, 2019 at 10:33 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>>>
>>>> hi Micah,
>>>>
>>>> Definitely Bazel is worth exploring, but we must be realistic about
>>>> the amount of energy (several hundred hours or more) that's been
>>>> invested in the build system we have now. So a new build system will
>>>> be a large endeavor, but hopefully can make things simpler.
>>>>
>>>> Aside from the requirements gathering process, if it is felt that
>>>> Bazel is a possible path forward in the future, it may be good to try
>>>> to break up the work into more tractable pieces. For example, a first
>>>> step would be to set up Bazel configurations to build the project's
>>>> thirdparty toolchain. Since we're reliant on ExternalProject in CMake
>>>> to do a lot of heavy lifting there for us, I imagine this (taking care
>>>> of what ThirdpartyToolchain.cmake does not) will take up a lot of the
>>>> energy
>>>>
>>>> - Wes
>>>>
>>>> On Sun, Oct 13, 2019 at 1:06 PM Micah Kornfield <emkornfi...@gmail.com> wrote:
>>>>>
>>>>> This might be taking the thread on more of a tangent, but maybe we should
>>>>> start collecting requirements for the C++ build system in general and see
>>>>> if there might be a better solution that can address some of these concerns?
>>>>> In particular, Bazel at least on the surface seems like it might be a
>>>>> better fit for some of the use cases discussed here. I know this is a big
>>>>> project (and I currently don't have much bandwidth for it) but I think if
>>>>> CMake is lacking in these areas it might be worth at least exploring
>>>>> instead of going down the path of building our own meta-build system on top
>>>>> of CMake.
>>>>>
>>>>> Requirements that I think we are targeting:
>>>>> 1. Be able to provide an out-of-the-box build system that requires as close to
>>>>>    zero dependencies as possible beyond a standard C++ toolchain (e.g. "$BUILD minimal"
>>>>>    works on any C++ developer's desktop without additional requirements)
>>>>> 2. The build system should limit configuration knobs in favor of implied
>>>>>    dependencies (e.g. "$BUILD python" automatically builds "compute",
>>>>>    "filesystem", "ipc")
>>>>> 3. The build system should be configurable to use (and have the user
>>>>>    specify) one of "System packages", "Conda packages" or source packages for
>>>>>    providing dependencies (and fallback options between the three).
>>>>> 4. The build system should be able to treat some dependencies as optional
>>>>>    (e.g. different compression libraries or allocators).
>>>>> 5. Easily allow developers to limit building unnecessary code for their
>>>>>    particular task at hand.
>>>>> 6. The build system must work across the following toolchains/platforms:
>>>>>    - Linux: g++ and clang. x86 and ARM
>>>>>    - Mac
>>>>>    - Windows (msys2 and MSVC)
>>>>>
>>>>> Thanks,
>>>>> Micah
>>>>>
>>>>> On Thu, Oct 10, 2019 at 6:09 AM Antoine Pitrou <anto...@python.org> wrote:
>>>>>>
>>>>>> Yes, we could express dependencies in a Python script and have it
>>>>>> generate a CMake module of if/else chains in cmake_modules (which we
>>>>>> would check in git to avoid having people depend on a Python install,
>>>>>> perhaps).
>>>>>>
>>>>>> Still, that is an additional maintenance burden.
>>>>>>
>>>>>> Regards
>>>>>>
>>>>>> Antoine.
>>>>>>
>>>>>> On 10/10/2019 at 14:50, Wes McKinney wrote:
>>>>>>> I guess one question we should first discuss is: who is the C++ build
>>>>>>> system for?
>>>>>>>
>>>>>>> The users who are most sensitive to benchmark-driven decision making
>>>>>>> will generally be consuming the project through pre-built binaries,
>>>>>>> like our Python or R packages. If C++ developers build the project
>>>>>>> from source and don't do a minimal read of the documentation to see
>>>>>>> what a "recommended configuration" looks like, I would say that is
>>>>>>> more their fault than ours. In the case of the ARROW_JEMALLOC option,
>>>>>>> I think it's important for C++ system integrators to be aware of the
>>>>>>> impact of the choice of memory allocator.
>>>>>>>
>>>>>>> The concern I have with the current "out of the box" experience is
>>>>>>> that people are getting the impression that "I have to build $X, $Y,
>>>>>>> and $Z -- which I don't necessarily need -- to have $CORE_FEATURE_1".
>>>>>>> They can, of course, read the documentation and learn that those
>>>>>>> things can be toggled off, but I think the user that reaches for a
>>>>>>> self-built source install is much different in general than someone
>>>>>>> who uses the project through the Linux binary packages, for example.
>>>>>>>
>>>>>>> On the subject of managing intraproject dependencies and
>>>>>>> relationships, I think we should develop a better way to express
>>>>>>> relationships between components than we have now.
>>>>>>>
>>>>>>> As an example, building the Python library assumes that various
>>>>>>> components are enabled
>>>>>>>
>>>>>>> - ARROW_COMPUTE=ON
>>>>>>> - ARROW_FILESYSTEM=ON
>>>>>>> - ARROW_IPC=ON
>>>>>>>
>>>>>>> Somewhere in the code we might have some code like
>>>>>>>
>>>>>>> if (ARROW_PYTHON)
>>>>>>>   set(ARROW_COMPUTE ON)
>>>>>>>   ...
>>>>>>> endif()
>>>>>>>
>>>>>>> This doesn't strike me as that scalable. I would rather see a
>>>>>>> dependency file like
>>>>>>>
>>>>>>> component_dependencies = {
>>>>>>>     ...
>>>>>>>     'python': ['compute', 'filesystem', 'ipc'],
>>>>>>>     ...
>>>>>>> }
>>>>>>>
>>>>>>> A helper Python script as part of the build could be used to give
>>>>>>> CMake (because CMake is a bit poor as a programming language) the list
>>>>>>> of required components based on what the user has indicated to CMake.
>>>>>>>
>>>>>>> On Thu, Oct 10, 2019 at 7:36 AM Francois Saint-Jacques
>>>>>>> <fsaintjacq...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> There's always the route of vendoring some library and not exposing
>>>>>>>> external CMake options. This would achieve the goal of
>>>>>>>> compile-out-of-the-box and enable important features in the basic
>>>>>>>> build. We also simplify dependency requirements (benefits CI or
>>>>>>>> developer). The downside is following security patches and grumpy
>>>>>>>> reactions from package maintainers. I think we should explore this
>>>>>>>> route for dependencies that match the following criteria:
>>>>>>>>
>>>>>>>> - libarrow*.so doesn't export any of the symbols of the dependency, and
>>>>>>>>   the dependency is not referenced in any public headers
>>>>>>>> - dependency is lightweight, e.g. excludes boost, openssl, grpc, llvm,
>>>>>>>>   thrift, protobuf
>>>>>>>> - dependency is not ubiquitous on major platforms and has a stable
>>>>>>>>   API, e.g. excludes libz and openssl
>>>>>>>>
>>>>>>>> A small list of candidates:
>>>>>>>> - RapidJSON (enables JSON)
>>>>>>>> - DoubleConversion (enables CSV)
>>>>>>>>
>>>>>>>> There's a precedent: Arrow already vendors small C++ libraries
>>>>>>>> (datetime, utf8cpp, variant, xxhash).
>>>>>>>>
>>>>>>>> François
>>>>>>>>
>>>>>>>> On Thu, Oct 10, 2019 at 6:03 AM Antoine Pitrou <anto...@python.org> wrote:
>>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> I'm a bit concerned that we're planning to add many additional build
>>>>>>>>> options in the quest to have a core zero-dependency build in C++.
>>>>>>>>> See for example https://issues.apache.org/jira/browse/ARROW-6633 or
>>>>>>>>> https://issues.apache.org/jira/browse/ARROW-6612.
>>>>>>>>>
>>>>>>>>> The problem is that this is creating many possible configurations and we
>>>>>>>>> will only be testing a tiny subset of them. Inevitably, users will try
>>>>>>>>> other option combinations and they'll fail building for some random
>>>>>>>>> reason. It will not be a very good user experience.
>>>>>>>>>
>>>>>>>>> Another related issue is user perception when doing a default build.
>>>>>>>>> For example https://issues.apache.org/jira/browse/ARROW-6638 proposes to
>>>>>>>>> build with jemalloc disabled by default. Inevitably, people will be
>>>>>>>>> doing benchmarks with this (publicly or not) and they'll conclude Arrow
>>>>>>>>> is not as performant as it claims to be.
>>>>>>>>>
>>>>>>>>> Perhaps we should look for another approach instead?
>>>>>>>>>
>>>>>>>>> For example we could have a single ARROW_BARE_CORE (whatever the name)
>>>>>>>>> option that when enabled (not by default) builds the tiniest minimal
>>>>>>>>> subset of Arrow. It's more inflexible, but at least it's something that
>>>>>>>>> we can reasonably test.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>>
>>>>>>>>> Antoine.
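
For reference, the dependency-file idea from the thread (a component_dependencies mapping read by a helper Python script, which could also generate a CMake module of if() blocks to check into git) might look roughly like the sketch below. This is only an illustration under assumptions: the function names resolve_components and render_cmake, the "example_bindings" entry and the resulting ARROW_EXAMPLE_BINDINGS option are hypothetical and exist only to show a transitive dependency. Only the 'python' entry comes from the thread.

```python
# Minimal sketch, not the project's actual build tooling.
# Only the "python" entry is taken from the thread; "example_bindings" is a
# hypothetical component added to demonstrate transitive resolution.
COMPONENT_DEPENDENCIES = {
    "python": ["compute", "filesystem", "ipc"],
    "example_bindings": ["python"],  # hypothetical
}


def resolve_components(requested, dependencies=COMPONENT_DEPENDENCIES):
    """Return the requested components plus everything they imply, sorted."""
    resolved = set()
    pending = list(requested)
    while pending:
        component = pending.pop()
        if component in resolved:
            continue  # already handled; this also guards against cycles
        resolved.add(component)
        pending.extend(dependencies.get(component, []))
    return sorted(resolved)


def render_cmake(dependencies=COMPONENT_DEPENDENCIES):
    """Render one if() block per component that enables all implied components.

    Each block lists the full transitive closure, so the generated blocks do
    not depend on the order in which CMake evaluates them.
    """
    blocks = []
    for component in sorted(dependencies):
        implied = [
            name
            for name in resolve_components([component], dependencies)
            if name != component
        ]
        body = "".join(f"  set(ARROW_{name.upper()} ON)\n" for name in implied)
        blocks.append(f"if(ARROW_{component.upper()})\n{body}endif()\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(resolve_components(["python"]))
    print(render_cmake())
```

Checking the output of render_cmake() into cmake_modules, as suggested in the thread, would keep Python out of the dependencies of a normal build; the script would only need to be re-run when the mapping changes.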