PyArrow has always required NumPy, so this sounds like a red herring.
If NumPy wasn't downloaded as part of the source dependencies before, that
was certainly a bug.

Regards

Antoine.


On 29/05/2020 at 18:29, Wes McKinney wrote:
> It's possible it's related to
> 
> https://github.com/apache/arrow/commit/6a583e553de28e3341987911bb63fc19f99a6fb0#diff-23eeeb4347bdd26bfc6b7ee9a3b755dd
> 
> Is the issue still present with 0.17.0 or 0.17.1? In any case please
> do open an issue if it is not resolved in master and/or the latest
> releases.
> 
> On Fri, May 29, 2020 at 10:41 AM Brian Hulette <bhule...@apache.org> wrote:
>>
>> +1 for a JIRA to track this. I looked into it a little bit just out of
>> curiosity.
>>
>> I passed --verbose to pip to get insight into what's going on in the
>> "Installing build dependencies..." step. I did this for both 0.15.1 and
>> 0.16.0. They took 4:10 and 5:57 respectively. It looks like 0.16.0 spent
>> 2:43 installing numpy, which is absent from the 0.15.1 log. I'm not sure
>> what changed to cause this.
>>
>> I collected logs with the following command (note it relies on ts in
>> moreutils for adding timestamps):
>>   python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all: --verbose 2>&1 | ts | tee /tmp/0.16.0.log
>> I found the numpy difference and measured its runtime by grepping for
>> "Running setup.py" in these logs.
>>
>> The logs are uploaded to google drive:
>> https://drive.google.com/drive/folders/1rPoYAsVul3HGdrviiCLGPf_P8dOlBCd1?usp=sharing
>>
>> On Fri, May 29, 2020 at 5:49 AM Wes McKinney <wesmck...@gmail.com> wrote:
>>
>>> hi Valentyn,
>>>
>>> This is the first I've ever heard of anyone doing what you are doing,
>>> so safe to say that we've given little to no consideration to this use
>>> case. We have been focused on providing binary packages for pip and
>>> conda. Could you please open a JIRA and provide more detailed
>>> information about what you are seeing?
>>>
>>> Thanks
>>> Wes
>>>
>>> On Thu, May 28, 2020 at 4:47 PM Valentyn Tymofieiev
>>> <valen...@google.com.invalid> wrote:
>>>>
>>>> Hi Arrow dev community,
>>>>
>>>> Do you have any insight why
>>>>
>>>>           python -m pip download --dest /tmp pyarrow==0.16.0 --no-binary :all:
>>>>
>>>> takes several minutes to execute? From the output we can see that pip gets
>>>> stuck on:
>>>>
>>>>   File was already downloaded /tmp/pyarrow-0.16.0.tar.gz
>>>>   Installing build dependencies ... |
>>>>
>>>> There is a significant increase in runtime between 0.15.1 and 0.16.0. I
>>>> suspect some build dependencies need to be installed before pip
>>>> understands the dependencies of pyarrow. Is there some inefficiency in
>>>> pyarrow's setup.py that is causing this?
>>>>
>>>> Thanks,
>>>> Valentyn
>>>
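
For anyone who wants to repeat Brian's measurement on their own logs, here is a rough sketch of the "grep for Running setup.py and compare timestamps" step in Python. It assumes each log line starts with the default `ts` timestamp (`%b %d %H:%M:%S`, for example "May 29 10:00:05") and that the English month abbreviations match the current locale. The helper names `setup_py_events` and `elapsed` are made up for this illustration, and the sample lines are invented, not copied from the uploaded logs; the exact wording pip prints may differ between versions.

```python
from datetime import datetime, timedelta

# Assumption: lines were stamped by `ts` with its default format.
TS_FORMAT = "%b %d %H:%M:%S"
TS_WIDTH = 15  # length of a stamp such as "May 29 10:00:05"


def setup_py_events(lines, year=2020):
    """Return (timestamp, message) pairs for lines mentioning 'Running setup.py'."""
    events = []
    for line in lines:
        if "Running setup.py" not in line:
            continue
        stamp, message = line[:TS_WIDTH], line[TS_WIDTH:].strip()
        # ts omits the year by default, so one is supplied explicitly.
        when = datetime.strptime(f"{year} {stamp}", f"%Y {TS_FORMAT}")
        events.append((when, message))
    return events


def elapsed(events, needle):
    """Time between the first and last matching event whose message contains needle."""
    times = [when for when, message in events if needle in message]
    if not times:
        return timedelta(0)
    return max(times) - min(times)


# Invented sample lines, for illustration only.
sample = [
    "May 29 10:00:00 Collecting numpy",
    "May 29 10:00:05   Running setup.py install for numpy: started",
    "May 29 10:02:48   Running setup.py install for numpy: finished with status 'done'",
]
print(elapsed(setup_py_events(sample), "numpy"))  # prints 0:02:43
```

To use it on a real log, pass the open file instead of the sample list, for example `setup_py_events(open("/tmp/0.16.0.log"))`. The sketch does not handle a run that crosses midnight or a custom `ts` format.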
