It's possible it's related to https://github.com/apache/arrow/commit/6a583e553de28e3341987911bb63fc19f99a6fb0#diff-23eeeb4347bdd26bfc6b7ee9a3b755dd
Is the issue still present with 0.17.0 or 0.17.1? In any case, please do
open an issue if it is not resolved in master and/or the latest releases.

On Fri, May 29, 2020 at 10:41 AM Brian Hulette <bhule...@apache.org> wrote:
>
> +1 for a JIRA to track this. I looked into it a little bit just out of
> curiosity.
>
> I passed --verbose to pip to get insight into what's going on in the
> "Installing build dependencies..." step. I did this for both 0.15.1 and
> 0.16.0. They took 4:10 and 5:57 respectively. It looks like 0.16.0 spent
> 2:43 installing numpy, which is absent from the 0.15.1 log. I'm not sure
> what changed to cause this.
>
> I collected logs with the following command (note it relies on ts in
> moreutils for adding timestamps):
>
>   python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all: --verbose 2>&1 | ts | tee /tmp/0.16.0.log
>
> I found the numpy difference and measured its runtime by grepping for
> "Running setup.py" in these logs.
>
> The logs are uploaded to Google Drive:
> https://drive.google.com/drive/folders/1rPoYAsVul3HGdrviiCLGPf_P8dOlBCd1?usp=sharing
>
> On Fri, May 29, 2020 at 5:49 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Valentyn,
> >
> > This is the first I've ever heard of anyone doing what you are doing,
> > so safe to say that we've given little to no consideration to this use
> > case. We have been focused on providing binary packages for pip and
> > conda. Could you please open a JIRA and provide more detailed
> > information about what you are seeing?
> >
> > Thanks
> > Wes
> >
> > On Thu, May 28, 2020 at 4:47 PM Valentyn Tymofieiev
> > <valen...@google.com.invalid> wrote:
> > >
> > > Hi Arrow dev community,
> > >
> > > Do you have any insight into why
> > >
> > >   python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all:
> > >
> > > takes several minutes to execute? From the output we can see that pip
> > > gets stuck on:
> > >
> > >   File was already downloaded /tmp/pyarrow-0.16.0.tar.gz
> > >   Installing build dependencies ... |
> > >
> > > There is a significant increase in runtime between 0.15.1 and 0.16.0. I
> > > suspect some build dependencies need to be installed before pip
> > > understands the dependencies of pyarrow. Is there some inefficiency in
> > > pyarrow's setup.py that is causing this?
> > >
> > > Thanks,
> > > Valentyn
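
For anyone who wants to repeat the measurement described in the thread (timestamping pip's verbose output with ts and looking at the "Running setup.py" lines), here is a rough sketch of how the per-step times could be pulled out of such a log. It assumes ts was run with its default "%b %d %H:%M:%S" prefix, and it treats the time between one "Running setup.py" line and the next (or the end of the log) as that step's duration. The function names and the year parameter are made up for illustration; this is not the script that was actually used.

```python
from datetime import datetime

# Assumption: `ts` was run without arguments, so each line starts with a
# 15-character stamp such as "May 29 10:41:03" (no year).
TS_FORMAT = "%b %d %H:%M:%S"
TS_WIDTH = 15


def parse_line(line, year):
    """Return (timestamp, message) for a ts-prefixed line, or None."""
    try:
        # Prepend a year so the parsed date is unambiguous.
        stamp = datetime.strptime(f"{year} {line[:TS_WIDTH]}", "%Y " + TS_FORMAT)
    except ValueError:
        return None  # line does not start with a timestamp
    return stamp, line[TS_WIDTH:].strip()


def setup_py_durations(lines, marker="Running setup.py", year=2020):
    """Return [(message, seconds)] for every line containing `marker`.

    The duration of each marker line is measured up to the next marker
    line, or up to the last timestamped line for the final marker.
    """
    events = []
    last_stamp = None
    for line in lines:
        parsed = parse_line(line, year)
        if parsed is None:
            continue
        stamp, message = parsed
        last_stamp = stamp
        if marker in message:
            events.append((stamp, message))

    durations = []
    for i, (stamp, message) in enumerate(events):
        end = events[i + 1][0] if i + 1 < len(events) else last_stamp
        durations.append((message, (end - stamp).total_seconds()))
    return durations
```

Something like `setup_py_durations(open("/tmp/0.16.0.log"))` would then list each "Running setup.py" line with the number of seconds before the next one, which should make a slow step such as the numpy build easy to spot.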