It's possible it's related to

https://github.com/apache/arrow/commit/6a583e553de28e3341987911bb63fc19f99a6fb0#diff-23eeeb4347bdd26bfc6b7ee9a3b755dd

Is the issue still present with 0.17.0 or 0.17.1? In any case please
do open an issue if it is not resolved in master and/or the latest
releases.
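
If it helps when filing the issue, here is a rough Python sketch of the
measurement described in the quoted message below: it reads a log that was
piped through ts and reports how long each "Running setup.py" line took
until the next such line (or the end of the log). Several things here are
assumptions rather than anything confirmed in this thread: that ts was run
with its default "%b %d %H:%M:%S" timestamp format, that the gap between
consecutive marker lines is a fair proxy for runtime, and the helper names
and the year argument, which are made up for illustration.

```python
from datetime import datetime

# Assumption: ts was used with its default format, e.g. "May 29 10:41:03".
TS_FORMAT = "%Y %b %d %H:%M:%S"
TS_WIDTH = len("May 29 10:41:03")
MARKER = "Running setup.py"  # the string grepped for in the logs


def parse_line(line, year=2020):
    """Split one ts-prefixed line into (datetime, message text).

    ts does not print a year by default, so one is supplied by the caller.
    """
    stamp, text = line[:TS_WIDTH], line[TS_WIDTH:].strip()
    return datetime.strptime(f"{year} {stamp}", TS_FORMAT), text


def setup_py_durations(lines, year=2020):
    """Return (message, seconds) for each line containing MARKER.

    The duration runs from the marker line to the next marker line, or to
    the last timestamped line of the log if there is no later marker.
    """
    events = []
    for line in lines:
        try:
            events.append(parse_line(line.rstrip("\n"), year))
        except ValueError:
            continue  # skip lines without a parseable timestamp
    marks = [i for i, (_, text) in enumerate(events) if MARKER in text]
    results = []
    for n, start in enumerate(marks):
        end = marks[n + 1] if n + 1 < len(marks) else len(events) - 1
        elapsed = (events[end][0] - events[start][0]).total_seconds()
        results.append((events[start][1], elapsed))
    return results


if __name__ == "__main__":
    # Path taken from the tee command in the quoted message.
    with open("/tmp/0.16.0.log") as log:
        for message, seconds in setup_py_durations(log):
            print(f"{seconds:8.0f}s  {message}")
```

Running the same script over the 0.15.1 and 0.16.0 logs and comparing the
output should show which build dependency accounts for the extra time.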

On Fri, May 29, 2020 at 10:41 AM Brian Hulette <bhule...@apache.org> wrote:
>
> +1 for a JIRA to track this. I looked into it a little bit just out of
> curiosity.
>
> I passed --verbose to pip to get insight into what's going on in the
> "Installing build dependencies..." step. I did this for both 0.15.1 and
> 0.16.0. They took 4:10 and 5:57 respectively. It looks like 0.16.0 spent
> 2:43 installing numpy, which is absent from the 0.15.1 log. I'm not sure
> what changed to cause this.
>
> I collected logs with the following command (note it relies on ts in
> moreutils for adding timestamps):
>   python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all: --verbose 2>&1 | ts | tee /tmp/0.16.0.log
> I found the numpy difference and measured its runtime by grepping for
> "Running setup.py" in these logs.
>
> The logs are uploaded to google drive:
> https://drive.google.com/drive/folders/1rPoYAsVul3HGdrviiCLGPf_P8dOlBCd1?usp=sharing
>
> On Fri, May 29, 2020 at 5:49 AM Wes McKinney <wesmck...@gmail.com> wrote:
>
> > hi Valentyn,
> >
> > This is the first I've ever heard of anyone doing what you are doing,
> > so safe to say that we've given little to no consideration to this use
> > case. We have been focused on providing binary packages for pip and
> > conda. Could you please open a JIRA and provide more detailed
> > information about what you are seeing?
> >
> > Thanks
> > Wes
> >
> > On Thu, May 28, 2020 at 4:47 PM Valentyn Tymofieiev
> > <valen...@google.com.invalid> wrote:
> > >
> > > Hi Arrow dev community,
> > >
> > > Do you have any insight why
> > >
> > >           python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all:
> > >
> > > takes several minutes to execute? From the output we can see that pip gets
> > > stuck on:
> > >
> > >   File was already downloaded /tmp/pyarrow-0.16.0.tar.gz
> > >   Installing build dependencies ... |
> > >
> > > There is a significant increase in runtime between 0.15.1 and 0.16.0. I
> > > suspect some build dependencies need to be installed before pip
> > > understands the dependencies of pyarrow.  Is there some inefficiency in
> > > Arrow's setup.py that is causing this?
> > >
> > > Thanks,
> > > Valentyn
> >
