This seems reasonable as long as it is actually feasible (i.e., the dependencies are cleanly separable).
A while ago I had a proof of concept bazel build working that was able to automatically build the changes together.

On Monday, August 16, 2021, David Li <lidav...@apache.org> wrote:
> I support this. In the past I had to effectively do this manually to build
> Arrow/PyArrow in a monorepo (to build for multiple Python versions
> simultaneously without having conflicting copies of Arrow for each Python
> version). From what I remember, there's some usage of Arrow-internal
> headers that need to be replaced, but fortunately they were all very simple
> to replace.
>
> Though in my personal experience, it wasn't often that I needed to touch
> src/arrow/python.
>
> -David
>
> On Mon, Aug 16, 2021, at 11:08, Alessandro Molina wrote:
> > PyArrow is currently full Cython codebase, but in reality it relies on some
> > classes and functions that are implemented in C++ within the src/python
> > directory ( https://github.com/apache/arrow/tree/master/cpp/src/arrow/python
> > ). Especially for numpy/pandas conversion code that has to interface with
> > Numpy arrays data at low level.
> >
> > When working in the area of PyArrow it's not uncommon that you end up
> > jumping back and forth between the Arrow C++ codebase for Python and
> > PyArrow and you can also end up with, sometimes hard to catch, integration
> > issues if you forgot to recompile libarrow even if you are working on a
> > Python only change.
> >
> > I'm wondering if it wouldn't make life easier for contributors if the
> > src/arrow/python directory was moved into pyarrow and we made PyArrow able
> > to build it.
> >
> > That would probably reduce risk of integration issues as rebuilding pyarrow
> > alone would probably be enough for most python specific changes (as it
> > would also rebuild the Python specific C++).
> >
> > I think that moving src/arrow/python into pyarrow would also make the
> > codebase more cohesive which would lower the barrier for new contributors
> > looking for how to fix a pyarrow specific issue.
> >
> > Unless there is any major side effect (outside of having to build a bit
> > more complex build scripts for pyarrow, but it's already CMake based, so
> > building some C++ shouldn't be a big deal) that I'm missing, it seems that
> > the benefits of having all Python related code into a single place would
> > surpass the side effects.
> >
> > Also I'm not sure how widespread it is the requirement of Python from C++,
> > but it seems to me that if we moved all Python specific code into pyarrow
> > we could make libarrow decoupled from Python. Which might make it easier to
> > deal with Virtualenvs or debug versions of python as you wouldn't have to
> > deal with Python3_EXECUTABLE etc when building libarrow.
> >
> > Any thoughts?