There's always the route of vendoring some libraries and not exposing external CMake options for them. This would achieve the goal of compiling out of the box and would enable important features in the basic build. It would also simplify dependency requirements, which benefits both CI and developers. The downsides are having to follow security patches ourselves and grumpy reactions from package maintainers. I think we should explore this route for dependencies that match the following criteria:

- libarrow*.so doesn't export any of the dependency's symbols, and the
  dependency is not referenced in any public headers
- the dependency is lightweight, e.g. this excludes boost, openssl, grpc,
  llvm, thrift, protobuf
- the dependency is not ubiquitous on major platforms and has a stable
  API, e.g. this excludes libz and openssl

A small list of candidates:

- RapidJSON (enables JSON)
- DoubleConversion (enables CSV)

There's a precedent: Arrow already vendors small C++ libraries (datetime,
utf8cpp, variant, xxhash).

François

On Thu, Oct 10, 2019 at 6:03 AM Antoine Pitrou <anto...@python.org> wrote:
>
>
> Hi all,
>
> I'm a bit concerned that we're planning to add many additional build
> options in the quest to have a core zero-dependency build in C++.
> See for example https://issues.apache.org/jira/browse/ARROW-6633 or
> https://issues.apache.org/jira/browse/ARROW-6612.
>
> The problem is that this is creating many possible configurations and we
> will only be testing a tiny subset of them. Inevitably, users will try
> other option combinations and they'll fail building for some random
> reason. It will not be a very good user experience.
>
> Another related issue is user perception when doing a default build.
> For example https://issues.apache.org/jira/browse/ARROW-6638 proposes to
> build with jemalloc disabled by default. Inevitably, people will be
> doing benchmarks with this (publicly or not) and they'll conclude Arrow
> is not as performant as it claims to be.
>
> Perhaps we should look for another approach instead?
>
> For example we could have a single ARROW_BARE_CORE (whatever the name)
> option that when enabled (not by default) builds the tiniest minimal
> subset of Arrow. It's more inflexible, but at least it's something that
> we can reasonably test.
>
> Regards
>
> Antoine.