Hi Richard,

On Thu, Nov 7, 2019 at 9:59 AM Richard Bachmann
<richard.bachm...@cern.ch> wrote:
>
> Hello,
> I'm contacting you on behalf of the LCG Releases team at CERN. We
> provide a common software stack for LHCb, ATLAS and others to be used at
> CERN and the worldwide computing grid.
>
> Right now we're looking into optimizing the way we're building Apache
> Arrow (C++ & Python) and its dependencies. Ideally we'd like to build
> Arrow using only the minimum of necessary dependencies to run it, and to
> use packages already installed in the stack to fulfill these
> dependencies. The former would be nice to keep the stack clean, the
> latter would help us avoid duplication and failing builds due to mirrors
> going offline.
>
> Our builds currently run with the ARROW_DEPENDENCY_SOURCE=AUTO
> <https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst>
> setting, which results in duplicate and non-essential packages being
> downloaded by Arrow, as well as dependency on external mirrors. Setting
> it to SYSTEM allows us to avoid the downloads, but then the build
> process fails due to missing unused dependencies.

I'm surprised to hear this based on what I know about the build system
and from extensive local development.

Can you show the exact CMake invocation you are using and indicate
which unused dependencies are being downloaded?

This minimal Docker build shows (unless something has recently been
broken) that the project can be built with only a small number of
third-party dependencies:

https://github.com/apache/arrow/tree/master/cpp/examples/minimal_build

Note that we also support a fully "offline" build, which allows
third-party dependencies to be built in an air-gapped environment:

https://github.com/apache/arrow/blob/master/docs/source/developers/cpp.rst#offline-builds

> Do you know if there is a recommended way to achieve this? The problem
> seems to stem from the fact that all listed dependencies are downloaded,
> whether they are needed or not. We have considered patching out the
> non-essential dependencies ('double-conversion', 'GTEST', etc.) from the
> dependency list, as well as formally adding the unneeded dependencies to
> the stack in order to run with the SYSTEM setting. However, if there is
> a proper way to do it we would of course prefer to follow that course of
> action.

We'll be able to know more based on how you're calling CMake and with
what options, but the build system should not be downloading any
dependencies that are not needed.
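
As a point of comparison, here is a minimal sketch of a configure step
that resolves dependencies from system packages and turns off the
optional components that pull in extra libraries (GTest, for instance,
should only be looked for when tests or benchmarks are enabled). The
source path, the ARROW_SRC and RUN_CMAKE variables, the helper function
name, and the particular set of options are assumptions for illustration,
not your actual setup. By default the script only prints the command.

```shell
#!/usr/bin/env bash
# Sketch only: paths and the chosen option set are assumptions.
set -euo pipefail

# Hypothetical location of the Arrow C++ sources.
ARROW_SRC="${ARROW_SRC:-./arrow/cpp}"

# Emit one CMake argument per line.
arrow_cmake_args() {
  local args=(
    -DCMAKE_BUILD_TYPE=Release
    -DARROW_DEPENDENCY_SOURCE=SYSTEM   # use installed packages, no downloads
    -DARROW_BUILD_TESTS=OFF            # unit tests are what need GTest
    -DARROW_BUILD_BENCHMARKS=OFF
    -DARROW_PYTHON=ON                  # needed for pyarrow
  )
  # The developer docs also describe per-package overrides of the form
  # <Package>_SOURCE, e.g. -DProtobuf_SOURCE=BUNDLED.
  printf '%s\n' "${args[@]}"
}

# Dry run by default; set RUN_CMAKE=1 to actually configure.
if [ "${RUN_CMAKE:-0}" = "1" ]; then
  mapfile -t cmake_args < <(arrow_cmake_args)   # requires bash 4+
  cmake "${cmake_args[@]}" "$ARROW_SRC"
else
  echo "cmake $(arrow_cmake_args | tr '\n' ' ')$ARROW_SRC"
fi
```

If a configure step like this still tries to download something, the
CMake output listing which package it failed to find would be the most
useful thing to include in your reply.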

>
> Any help would be very appreciated.
> Kind regards:
>
>      - Richard Bachmann
>
