Hi,

I notice I haven't followed up on this.
I've now created https://github.com/apache/arrow/issues/46196 to track the removal of ARROW_USE_PRECOMPILED_HEADERS.

Regards

Antoine.


Le 02/10/2024 à 19:13, Felipe Oliveira Carvalho a écrit :
I say we remove it.

Arrow has great header hygiene (compared to other codebases I've worked
on). With a little bit more effort we can probably eliminate long header
include chains.

--
Felipe

On Wed, Oct 2, 2024 at 6:53 AM Antoine Pitrou <anto...@python.org> wrote:


Hello,

Long ago, we added a ARROW_USE_PRECOMPILED_HEADERS to the Arrow C++
CMake options in the hope of speeding up builds by reducing C++ header
parsing time.

However, we later started to use a concurrent (*) solution added in
CMake itself: CMAKE_UNITY_BUILD, which merges batches of source files
together before compiling them, achieving a similar objective.

While using unity builds is almost effortless and a no-regret solution,
precompiled headers are more delicate as they have a cost of their own:

1) they introduce an additional step at the top of the dependency graph,
which most later steps depend on

2) including them can be surprisingly expensive, so tuning the set of
headers included in a precompiled header is a bit of a guessing game

3) they come with their own non-trivial disk space overhead
(https://github.com/apache/arrow/pull/44288 shows that by disabling
precompiled headers on PyArrow, we can save 2GB on the size of a Docker
container)

CMAKE_UNITY_BUILD is now fully supported on all the CMake versions we
allow with Arrow C++, so there might not be much point in keeping
ARROW_USE_PRECOMPILED_HEADERS around.

What do you think? Should we deprecate ARROW_USE_PRECOMPILED_HEADERS and
remove it in a release or two?

Regards

Antoine.


(*) Though unity builds and precompiled headers can be used together,
the additional benefits are probably minor compared to enabling only one
of them.



Reply via email to