Test results:
 - The test with only the compile stage is succeeding (I deleted some old
caches); the cache size is 1146.26M. See here
https://travis-ci.org/sunjincheng121/flink/caches
 - The test with the timeout raised to 1200 failed with the same error, but
I think it may be a storage problem, so I deleted more old caches and
restarted the CI. See here
https://travis-ci.org/apache/flink/builds/547136163

So it now looks as if the storage size of the cache is limited. If so, we
can add some cleanup logic for old caches (I am not sure; some validation
is needed).
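
A minimal sketch of what such cleanup logic could look like, assuming each
build keeps its cache in its own subdirectory under one cache root (as the
path in the error message suggests). The function name, the `keep` parameter,
and the "newest by modification time" policy are all hypothetical; this is
not the existing Flink Travis script.

```python
import shutil
from pathlib import Path


def prune_old_caches(cache_root, keep=3):
    """Delete all but the `keep` most recently modified subdirectories.

    Hypothetical helper: assumes one cache subdirectory per build.
    Returns the names of the directories that were removed.
    """
    if keep < 0:
        raise ValueError("keep must be non-negative")
    root = Path(cache_root)
    if not root.is_dir():
        # Nothing to clean up if the cache root does not exist.
        return []
    # Sort subdirectories newest first by modification time.
    entries = sorted(
        (p for p in root.iterdir() if p.is_dir()),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    removed = []
    for stale in entries[keep:]:
        shutil.rmtree(stale)
        removed.append(stale.name)
    return removed
```

Whether this would help depends on the storage limit actually being the
cause, which still needs to be validated.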

Best
Jincheng

jincheng sun <sunjincheng...@gmail.com> wrote on Tue, Jun 18, 2019 at 6:00 PM:

> I agree with the explanation from @Chesnay Schepler <ches...@apache.org>.
> This should be a problem with the Travis infrastructure, because we have
> not recently made any big changes to the Travis logic inside Flink.
> At present, most of the failures occur after the compile has completed. The
> cache size is only 7.7M, which means that the JARs were not successfully
> uploaded.
>
> So here is a question:
>  - Where can we check the cache storage to see if there is a problem with
> the storage?
>
> To try to find the cause of the CI issue, I am running the following
> tests:
>
>  - Delete the other test phases locally and run only the compile phase, to
> test whether the cache is uploaded normally during compilation. See here
> https://travis-ci.org/sunjincheng121/flink/builds/547155029
>  - Increase the Travis cache timeout to 1200, to test whether the cache
> fails to download because of a cache timeout. (I think this test will have
> the same result.) See here https://travis-ci.org/apache/flink/builds/547136163
>
> I will report back here after testing.
>
> Best,
> Jincheng
>
> Chesnay Schepler <ches...@apache.org> wrote on Tue, Jun 18, 2019 at 3:53 PM:
>
>> The problem is not that bad stuff is in the cache (which is the only
>> thing a cache cleaning solves), it is that the test stages don't
>> download the correct one.
>>
>> Our compile stage uploads stuff into the cache, and the subsequent test
>> builds download it again.
>>
>> Whether the upload from the compile phase is visible to the test phase
>> is basically a timing thing; it depends on the visibility guarantee that
>> the backing infrastructure provides. So far it _usually_ worked, but
>> these are naturally things that may change over time.
>>
>> On 18/06/2019 09:20, Jeff Zhang wrote:
>> > If it is a Travis caching issue, we can file an Apache infra ticket and
>> > ask them to clean the cache.
>> >
>> >
>> >
>> > Chesnay Schepler <ches...@apache.org> wrote on Tue, Jun 18, 2019 at 3:18 PM:
>> >
>> >> This is a (hopefully short-lived) hiccup on the Travis caching
>> >> infrastructure.
>> >>
>> >> There's nothing we can do to _fix_ it; if it persists we'll have to
>> >> rework our travis setup again to not rely on caching.
>> >>
>> >> On 18/06/2019 08:34, Kurt Young wrote:
>> >>> Hi dev,
>> >>>
>> >>> I noticed that all the Travis tests triggered by pull requests are
>> >>> failing with the same error:
>> >>>
>> >>> "Cached flink dir /home/travis/flink_cache/xxxxx/flink does not exist.
>> >>> Exiting build."
>> >>>
>> >>> Does anyone have a clue about what happened and how to fix this?
>> >>>
>> >>> Best,
>> >>> Kurt
>> >>>
>> >>
>>
>>
