Is there, or should there be, some checking of digests to make sure that
what we are testing against in /tmp/test-spark really matches what we are
distributing from the archive?

On Thu, Jul 19, 2018 at 11:15 AM Sean Owen <sro...@apache.org> wrote:

> Ideally, that list is updated with each release, yes. Non-current releases
> will now always download from archive.apache.org though. But we run into
> rate-limiting problems if that gets pinged too much. So yes good to keep
> the list only to current branches.
>
> It looks like the download is cached in /tmp/test-spark, for what it's
> worth.
>
> On Thu, Jul 19, 2018 at 11:06 AM Felix Cheung <felixcheun...@hotmail.com>
> wrote:
>
>> +1, this has been problematic.
>>
>> Also, this list needs to be updated every time we make a new release?
>>
>> Plus, can we cache them on Jenkins? Maybe we can avoid downloading the
>> same thing from the Apache archive on every test run.
>>
>>
>> ------------------------------
>> *From:* Marco Gaido <marcogaid...@gmail.com>
>> *Sent:* Monday, July 16, 2018 11:12 PM
>> *To:* Hyukjin Kwon
>> *Cc:* Sean Owen; dev
>> *Subject:* Re: Cleaning Spark releases from mirrors, and the flakiness
>> of HiveExternalCatalogVersionsSuite
>>
>> +1 too
>>
>> On Tue, 17 Jul 2018, 05:38 Hyukjin Kwon, <gurwls...@gmail.com> wrote:
>>
>>> +1
>>>
>>> On Tue, Jul 17, 2018 at 7:34 AM, Sean Owen <sro...@apache.org> wrote:
>>>
>>>> Fix is committed to branches back through 2.2.x, where this test was
>>>> added.
>>>>
>>>> There is still an issue, though: I'm seeing that archive.apache.org is
>>>> rate-limiting downloads and frequently returning 503 errors.
>>>>
>>>> We can help, I guess, by avoiding testing against non-current releases.
>>>> Right now we should be testing against 2.3.1, 2.2.2, and 2.1.3, right?
>>>> 2.0.x is now effectively EOL, right?
>>>>
>>>> I can make that quick change too if everyone's amenable, in order to
>>>> prevent more failures in this test from master.
>>>>
>>>> On Sun, Jul 15, 2018 at 3:51 PM Sean Owen <sro...@gmail.com> wrote:
>>>>
>>>>> Yesterday I cleaned out old Spark releases from the mirror system --
>>>>> we're supposed to only keep the latest release from active branches out on
>>>>> mirrors. (All releases are available from the Apache archive site.)
>>>>>
>>>>> Having done so I realized quickly that the
>>>>> HiveExternalCatalogVersionsSuite relies on the versions it downloads being
>>>>> available from mirrors. It has been flaky, as sometimes mirrors are
>>>>> unreliable. I think now it will not work for any versions except 2.3.1,
>>>>> 2.2.2, 2.1.3.
>>>>>
>>>>> Because we do need to clean those releases out of the mirrors soon
>>>>> anyway, and because they're flaky sometimes, I propose adding logic to the
>>>>> test to fall back on downloading from the Apache archive site.
>>>>>
>>>>> ... and I'll do that right away to unblock
>>>>> HiveExternalCatalogVersionsSuite runs. I think it needs to be backported
>>>>> to other branches, as they will still be testing against potentially
>>>>> non-current Spark releases.
>>>>>
>>>>> Sean
>>>>>
>>>>
