Hello,

Long ago, we added a ARROW_USE_PRECOMPILED_HEADERS to the Arrow C++ CMake options in the hope of speeding up builds by reducing C++ header parsing time.

However, we later started to use a concurrent (*) solution added in CMake itself: CMAKE_UNITY_BUILD, which merges batches of source files together before compiling them, achieving a similar objective.

While using unity builds is almost effortless and a no-regret solution, precompiled headers are more delicate as they have a cost of their own:

1) they introduce an additional step at the top of the dependency graph, which most later steps depend on

2) including them can be surprisingly expensive, so tuning the set of headers included in a precompiled header is a bit of a guessing game

3) they come with their own non-trivial disk space overhead (https://github.com/apache/arrow/pull/44288 shows that by disabling precompiled headers on PyArrow, we can save 2GB on the size of a Docker container)

CMAKE_UNITY_BUILD is now fully supported on all the CMake versions we allow with Arrow C++, so there might not be much point in keeping ARROW_USE_PRECOMPILED_HEADERS around.

What do you think? Should we deprecate ARROW_USE_PRECOMPILED_HEADERS and remove it in a release or two?

Regards

Antoine.


(*) Though unity builds and precompiled headers can be used together, the additional benefits are probably minor compared to enabling only one of them.

Reply via email to