+1 for a JIRA to track this. I looked into it a little bit just out of curiosity.
I passed --verbose to pip to get insight into what's going on in the "Installing build dependencies..." step. I did this for both 0.15.1 and 0.16.0; they took 4:10 and 5:57 respectively. It looks like 0.16.0 spent 2:43 installing numpy, a step that is absent from the 0.15.1 log. I'm not sure what changed to cause this.

I collected the logs with the following command (note that it relies on ts from moreutils to add timestamps):

    python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all: --verbose 2>&1 | ts | tee /tmp/0.16.0.log

I found the numpy difference and measured its runtime by grepping for "Running setup.py" in these logs. The logs are uploaded to Google Drive:
https://drive.google.com/drive/folders/1rPoYAsVul3HGdrviiCLGPf_P8dOlBCd1?usp=sharing

On Fri, May 29, 2020 at 5:49 AM Wes McKinney <wesmck...@gmail.com> wrote:

> hi Valentyn,
>
> This is the first I've ever heard of anyone doing what you are doing,
> so safe to say that we've given little to no consideration to this use
> case. We have been focused on providing binary packages for pip and
> conda. Could you please open a JIRA and provide more detailed
> information about what you are seeing?
>
> Thanks
> Wes
>
> On Thu, May 28, 2020 at 4:47 PM Valentyn Tymofieiev
> <valen...@google.com.invalid> wrote:
> >
> > Hi Arrow dev community,
> >
> > Do you have any insight why
> >
> > python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary
> > :all:
> >
> > takes several minutes to execute? From the output we can see that pip get
> > stuck on:
> >
> > File was already downloaded /tmp/pyarrow-0.16.0.tar.gz
> > Installing build dependencies ... |
> >
> > There is a significant increase in runtime between 0.15.1 and 0.16.0. I
> > suspect some build dependencies need to be installed before pip
> > understands the dependencies of pyarrow. Is there some inefficiency in
> > Avro's setup.py that is causing this?
> >
> > Thanks,
> > Valentyn
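
For anyone who wants to repeat the timing measurement on the logs, here is a rough Python sketch of the 'grep for "Running setup.py"' step. It is only an illustration under several assumptions: that ts was run with its default timestamp format (e.g. "May 29 05:10:00"), that the year is 2020 (ts does not record it), and that the start and end of a package's setup.py run both show up as lines containing "Running setup.py" and the package name. The function names and the two sample log lines are made up for the example and are not copied from the uploaded logs.

```python
from datetime import datetime, timedelta

# Assumption: ts was run with its default format, "%b %d %H:%M:%S".
TS_FORMAT = "%Y %b %d %H:%M:%S"
TS_WIDTH = len("May 29 05:10:00")
ASSUMED_YEAR = "2020"  # ts omits the year, so one is supplied here

# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    "May 29 05:10:00   Running setup.py install for numpy: started",
    "May 29 05:11:30   some unrelated verbose output",
    "May 29 05:12:43   Running setup.py install for numpy: finished",
]


def matching_events(lines, needle="Running setup.py"):
    """Return (timestamp, message) pairs for lines containing needle."""
    events = []
    for line in lines:
        if needle not in line:
            continue
        stamp = datetime.strptime(
            ASSUMED_YEAR + " " + line[:TS_WIDTH], TS_FORMAT
        )
        events.append((stamp, line[TS_WIDTH:].strip()))
    return events


def elapsed(events, package):
    """Time between the first and last matching event naming package."""
    stamps = [stamp for stamp, message in events if package in message]
    if len(stamps) < 2:
        return None  # not enough events to measure a duration
    return stamps[-1] - stamps[0]


if __name__ == "__main__":
    print(elapsed(matching_events(SAMPLE), "numpy"))
```

To run it against a real log, the lines can be read with open("/tmp/0.16.0.log") and passed to matching_events in place of SAMPLE. If the actual pip output words these lines differently, the needle and the package match would need adjusting.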