Thanks, the actual problem is that the ActorSystem gets shut down. This
breaks the testing code. Should be fixed once
https://github.com/apache/flink/pull/1852 is merged.
On Tue, Apr 5, 2016 at 12:25 PM, Matthias J. Sax wrote:
> Happened again after your fix:
> https://travis-ci.org/apache/flink/j
Happened again after your fix:
https://travis-ci.org/apache/flink/jobs/120620482
-Matthias
On 04/01/2016 08:57 PM, Maximilian Michels wrote:
> Fixed with the resolution of https://issues.apache.org/jira/browse/FLINK-3689.
>
> On Fri, Apr 1, 2016 at 12:40 PM, Maximilian Michels wrote:
>> Hi Mat
Thanks. Just tried it out and it works :)
On 04/01/2016 08:57 PM, Maximilian Michels wrote:
> Fixed with the resolution of https://issues.apache.org/jira/browse/FLINK-3689.
>
> On Fri, Apr 1, 2016 at 12:40 PM, Maximilian Michels wrote:
>> Hi Matthias,
>>
>> Thanks for spotting the test failure.
Fixed with the resolution of https://issues.apache.org/jira/browse/FLINK-3689.
On Fri, Apr 1, 2016 at 12:40 PM, Maximilian Michels wrote:
> Hi Matthias,
>
> Thanks for spotting the test failure. It's actually a bug in the code
> and not a test problem. Fixing it.
>
> Cheers,
> Max
>
> On Fri, Apr
Hi Matthias,
Thanks for spotting the test failure. It's actually a bug in the code
and not a test problem. Fixing it.
Cheers,
Max
On Fri, Apr 1, 2016 at 9:33 AM, Ufuk Celebi wrote:
> Hey Matthias,
>
> the test has been only recently added with the resource management
> refactoring. It's probabl
Hey Matthias,
the test was only recently added with the resource management
refactoring. It's probably just too aggressive a timeout for Travis.
@Max: Did you ever see this fail?
– Ufuk
On Fri, Apr 1, 2016 at 9:24 AM, Matthias J. Sax wrote:
> Anyone seen this before? One-time thing or tes
If there is none yet, then we do. Label it with "test-stability". I think
the consensus was also to mark it as critical.
Otherwise, just add the log to the JIRA.
On Tue, Oct 6, 2015 at 2:57 PM, Matthias J. Sax wrote:
> Hi,
>
> One test just failed on current master:
> https://travis-ci.org/apac
I have a patch pending that should help with these timeout issues (and null
checks)...
On Mon, Sep 7, 2015 at 2:41 PM, Matthias J. Sax wrote:
> Please look here:
>
> https://travis-ci.org/apache/flink/jobs/79086396
>
> > Failed tests:
> > KafkaITCase>KafkaTestBase.prepare:155 Test setup failed:
+1 for a "test-stability" label and labeling these issues as "critical"
On Mon, Aug 24, 2015 at 6:31 PM, Stephan Ewen wrote:
> Pushed a fix for the StateCheckpointedITCase
>
> On Mon, Aug 24, 2015 at 12:19 PM, Maximilian Michels
> wrote:
>
>> +1 for labeling the JIRAs with "test-stability".
>>
Pushed a fix for the StateCheckpointedITCase
On Mon, Aug 24, 2015 at 12:19 PM, Maximilian Michels wrote:
> +1 for labeling the JIRAs with "test-stability".
>
> On Sat, Aug 22, 2015 at 8:21 PM, Márton Balassi
> wrote:
>
> > +1 for Vasia's suggestion
> > On Aug 22, 2015 8:07 PM, "Vasiliki Kalavri
Hi Matthias,
Thanks for reporting. The label test-stability exists now.
Cheers,
Max
On Sun, Aug 23, 2015 at 12:32 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Hi,
>
> because there is no label for failing tests yet, I just report it
> over the mailing list again. I also op
+1 for labeling the JIRAs with "test-stability".
On Sat, Aug 22, 2015 at 8:21 PM, Márton Balassi
wrote:
> +1 for Vasia's suggestion
> On Aug 22, 2015 8:07 PM, "Vasiliki Kalavri"
> wrote:
>
> > I just came across 2 more :/
> > I'm also in favor of tracking these with JIRA. How about "test-stabil
+1 for Vasia's suggestion
On Aug 22, 2015 8:07 PM, "Vasiliki Kalavri"
wrote:
> I just came across 2 more :/
> I'm also in favor of tracking these with JIRA. How about "test-stability"
> for a label?
>
> -V.
>
> On 21 August 2015 at 12:47, Matthias J. Sax >
> wrote:
>
> > I like the idea with the
I just came across 2 more :/
I'm also in favor of tracking these with JIRA. How about "test-stability"
for a label?
-V.
On 21 August 2015 at 12:47, Matthias J. Sax
wrote:
> I like the idea with the special label. Otherwise, it will be difficult
> to find the correct tickets.
>
> -Matthias
>
> O
I like the idea with the special label. Otherwise, it will be difficult
to find the correct tickets.
-Matthias
On 08/21/2015 12:15 PM, Till Rohrmann wrote:
> I'm also in favor of JIRA, because I fear that nobody will keep the wiki
> page in sync. Maybe we can assign a special label for test stabi
I'm also in favor of JIRA, because I fear that nobody will keep the wiki
page in sync. Maybe we can assign a special label for test stability to
these JIRA issues. Then we can quickly find all currently unstable test
cases.
On Fri, Aug 21, 2015 at 11:02 AM, Robert Metzger
wrote:
> I agree that w
I agree that we should look for a solution other than opening a lot of
small discussion threads on the mailing list.
When I have a test failure, I usually search my Gmail inbox to see whether
somebody else wrote something about the error already.
Creating a JIRA for each failing test might be a be
Thanks for the info.
Over the weeks I lost track of which errors and failing/unstable tests are
known and which are not. Should we start a wiki page or similar to collect
known errors? If a test fails on a known error, it can just be ignored. This
would avoid "spam" on the mailing list.
Any thoughts about this?
Sachin saw the error as well, as reported here:
https://issues.apache.org/jira/browse/FLINK-2468
I also see it from time to time. I have a WIP branch where I relaxed the
constraints for the test to pass a bit.
On Thu, Aug 20, 2015 at 10:05 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
Looks like a rare race between the cleanup (two changes) and the test
validating both changes.
I'll push a fix to make the test more reliable.
On Sun, Aug 16, 2015 at 11:04 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Hi,
>
> I hit a failing test in flink-runtime. Not sure if it
I think the YARN problem is as before, but with a longer timeout.
Before, the tests aborted when the expected output did not arrive within
60 seconds.
The timeout is now 180 seconds, which is probably so long that the deadlock
detector (5 minutes without output) kicks in.
In any case, there is something
May be an issue with the embedded YARN mini cluster...
On Mon, Aug 10, 2015 at 8:37 PM, Stephan Ewen wrote:
> I think the YARN problem is as before, but with a longer timeout.
>
> Before, when after 60 seconds the expected output did not come, the tests
> aborted.
> The timeout is now 180 second
Not sure about the YARN test... As YARN was unstable all the time I just
ignored it...
-Matthias
On 08/09/2015 09:38 PM, Ufuk Celebi wrote:
> PS what about the yarn test case... Is that one known (with that trace)?
>
> On Sunday, August 9, 2015, Ufuk Celebi wrote:
>
>> There is an issue for th
There is an issue for this from last week. Couldn't look into it last week,
will do tomorrow. Thanks for the logs. :)
On Sunday, August 9, 2015, Matthias J. Sax
wrote:
> Wrong link... sorry.
>
> https://travis-ci.org/mjsax/flink/jobs/74787655
>
>
>
> On 08/09/2015 04:02 PM, Maximilian Michels wr
PS what about the yarn test case... Is that one known (with that trace)?
On Sunday, August 9, 2015, Ufuk Celebi wrote:
> There is an issue for this from last week. Couldn't look into it last
> week, will do tomorrow. Thanks for the logs. :)
>
> On Sunday, August 9, 2015, Matthias J. Sax > wrote
Wrong link... sorry.
https://travis-ci.org/mjsax/flink/jobs/74787655
On 08/09/2015 04:02 PM, Maximilian Michels wrote:
> Hi Matthias,
>
> Is that the correct build URL? I can't spot any failing Gelly tests. The
> build appears to be stuck in the YARNSessionFIFOITCase.
>
> Cheers,
> Max
>
> O
Hi Matthias,
Is that the correct build URL? I can't spot any failing Gelly tests. The
build appears to be stuck in the YARNSessionFIFOITCase.
Cheers,
Max
On Sun, Aug 9, 2015 at 3:37 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Hi,
>
> I got a new failing test in this build (fli
I've also seen the BufferSpillerTest fail:
https://travis-ci.org/apache/flink/jobs/74057503
On Tue, 4 Aug 2015 at 14:10 Robert Metzger wrote:
> I've assigned https://issues.apache.org/jira/browse/FLINK-1680 to myself.
> Maybe Tachyon 0.7 will fix the issues.
>
> On Tue, Aug 4, 2015 at 1:57 PM,
I've assigned https://issues.apache.org/jira/browse/FLINK-1680 to myself.
Maybe Tachyon 0.7 will fix the issues.
On Tue, Aug 4, 2015 at 1:57 PM, Stephan Ewen wrote:
> Yes.
>
> We should know, though, whether this is a Java 6 bug, or a bug in our
> system that just happens to occur only with Java
Yes.
We should know, though, whether this is a Java 6 bug, or a bug in our
system that just happens to occur only with Java 6 (because of different
timings in this other engine)
On Tue, Aug 4, 2015 at 12:27 PM, Chesnay Schepler <
chesnay.schep...@fu-berlin.de> wrote:
> Aren't we dropping java 6
Aren't we dropping Java 6 support?
On 04.08.2015 12:21, Stephan Ewen wrote:
The "StateCheckpointedITCase", which also tests these guarantees thoroughly,
has not failed so far.
But we need to first rule out the BarrierBuffer. The problem is that the
bug occurs only on Java 6 and cannot be reproduce
The "StateCheckpointedITCase", which also tests these guarantees thoroughly,
has not failed so far.
But we need to first rule out the BarrierBuffer. The problem is that the
bug occurs only on Java 6 and cannot be reproduced locally...
On Tue, Aug 4, 2015 at 12:14 PM, Gyula Fóra wrote:
> Honestly I
Honestly, I don't think the partitioned state changes have anything to do
with the stability, only the reworked test case, which now tests proper
exactly-once behavior, which was missing before.
Stephan Ewen wrote (on Tue, Aug 4, 2015, 12:12):
> Yes, the build stability is super serious right now.
Yes, the build stability is super serious right now.
Here are the problems in question, and what we could do about this:
BarrierBuffer:
Barrier Buffer tests fail in Java 6 builds.
I have not found a way to diagnose that problem, yet, but if we cannot find
the issue today,
I've also seen this fail: https://travis-ci.org/apache/flink/jobs/74025862
in SuccessAfterNetworkBuffersFailureITCase
Build seems quite flaky recently.
On Tue, 4 Aug 2015 at 10:27 Matthias J. Sax
wrote:
> Rebased on:
>
>
> https://github.com/mjsax/flink/commit/fab61a1954ff1554448e826e1d273689e
Rebased on:
https://github.com/mjsax/flink/commit/fab61a1954ff1554448e826e1d273689ed520fc3
But if the gap between two rebases is large, it's hard to say what the
problem might be...
The old parent commit (i.e., the rebase before the last one) was
https://github.com/mjsax/flink/commit/148395bcd81a93bcb1
What are the commits that you rebased on? Could you maybe narrow down what
caused the regression?
On Mon, 3 Aug 2015 at 23:31 Matthias J. Sax
wrote:
> I only report failing tests after a rebase. ;)
>
> -Matthias
>
> On 08/03/2015 11:23 PM, Henry Saputra wrote:
> > Thanks for reporting it , Matth
I only report failing tests after a rebase. ;)
-Matthias
On 08/03/2015 11:23 PM, Henry Saputra wrote:
> Thanks for reporting it, Matthias. Will try to run Travis for latest Flink.
>
> Tachyon test is a bit flaky. Maybe updating to latest release could help.
>
> - Henry
>
> On Mon, Aug 3, 2015
Thanks for reporting it, Matthias. Will try to run Travis for the latest Flink.
The Tachyon test is a bit flaky. Maybe updating to the latest release could help.
- Henry
On Mon, Aug 3, 2015 at 2:18 PM, Matthias J. Sax
wrote:
> Today, not a single build completed successfully. Please see here:
>
> Flink
Seen this a few times as well.
May be something with the latest "partitioned state" changes...
On Mon, Aug 3, 2015 at 5:48 PM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
> Hi,
>
> I just hit a failing test
> (https://travis-ci.org/apache/flink/jobs/73899795). Is it known or new?
>
>
Thanks Matthias for looking into the issue.
Thank you Till for the problem formulation and the suggested steps for
solving the synchronization problem. I will look into this as soon as
possible.
Cheers,
Max
On Fri, Jul 17, 2015 at 11:18 AM, Matthias J. Sax <
mj...@informatik.hu-berlin.de> wrote:
I will open a JIRA for this. It's getting "complicated".
On 07/17/2015 11:04 AM, Till Rohrmann wrote:
> I think the problem might be related to the way the test is constructed.
> The test submits a job to the JM and then tries to poll the accumulators
> from the JM. If it does not succeed, then t
I think the problem might be related to the way the test is constructed.
The test submits a job to the JM and then tries to poll the accumulators
from the JM. If it does not succeed, then the polling is retried with a
decreasing pause in between. Furthermore, the task which updates the
accumulator
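For readers following along, the polling pattern described above can be sketched roughly as follows. This is only an illustration under assumptions, not the actual test code: the class PollWithRetry, the method poll, and its parameters are hypothetical names, no Flink API is used, and halving the pause after each failed attempt is just one way to read "decreasing pause" (the real schedule is not shown in this thread).

```java
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical helper for illustration only; not part of Flink. */
public class PollWithRetry {

    /**
     * Asks the source for a value up to maxAttempts times.
     * After each failed attempt the thread sleeps, and the pause is halved
     * (assumption: this mirrors the "decreasing pause"), never below 1 ms.
     */
    public static <T> Optional<T> poll(
            Supplier<Optional<T>> source, int maxAttempts, long initialPauseMillis) {
        long pause = initialPauseMillis;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            Optional<T> result = source.get();
            if (result.isPresent()) {
                return result;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pause);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and give up early.
                    Thread.currentThread().interrupt();
                    return Optional.empty();
                }
                pause = Math.max(1, pause / 2);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for "ask the JM for the accumulators": empty on the first two calls.
        Supplier<Optional<String>> source =
                () -> ++calls[0] < 3 ? Optional.<String>empty() : Optional.of("ready");
        Optional<String> value = poll(source, 5, 8);
        // prints "ready after 3 attempts"
        System.out.println(value.orElse("gave up") + " after " + calls[0] + " attempts");
    }
}
```

A test built this way passes or fails depending on whether the value shows up before the attempts run out, which is one way a timing-sensitive check can become flaky on a slow CI machine.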
Hi,
the test still fails. This time in both runs (Flink Travis and my own
Travis) -- only for Java 8 again:
https://travis-ci.org/apache/flink/jobs/71314132
https://travis-ci.org/mjsax/flink/jobs/71179608
-Matthias
On 07/16/2015 02:28 PM, Matthias J. Sax wrote:
> Great! I will. As 4 of 5 runs
Great! I will. As 4 of 5 runs succeeded, I cannot test it explicitly. I will
keep an eye on it in future runs.
-Matthias
On 07/16/2015 02:24 PM, Maximilian Michels wrote:
> Hi Matthias,
>
> I've pushed a fix to the master. The problem should be solved. Please tell
> me if your Travis reports an error
Hi Matthias,
I've pushed a fix to the master. The problem should be solved. Please tell
me if your Travis reports an error again. My Travis never complained :)
Cheers,
Max
On Thu, Jul 16, 2015 at 12:00 PM, Maximilian Michels wrote:
> Hi Matthias,
>
> This is indeed a timing issue when checking
Hi Matthias,
This is indeed a timing issue when checking for the results in this test.
The new accumulator implementation now continuously reports from the
running tasks to the job manager. This was merged yesterday.
The assertion that fails there is a bit strict. Actually, I've already
integrate
Hey,
this was merged yesterday. I guess it's a timing issue when verifying the
results. Can you file an issue for this?
– Ufuk
On 16 Jul 2015, at 11:30, Matthias J. Sax wrote:
> Hi,
>
> I hit another failing test (that is new to me):
>
>> Results :
>> Failed tests:
>> AccumulatorLiveIT