Hi,

Are you asking about bugs in Flink itself, in libraries that Flink uses, or in 
applications built on Flink? From what I have seen:

The most problematic bugs while developing features for Flink:

        Deadlocks and data loss caused by concurrency issues in the network 
stack, after changing some seemingly trivial things in the new-data notifications.
        Data visibility issues for concurrent writes/reads when implementing the 
S3 connector.

The most problematic type of bug in the dependencies:

        Deadlocks in an external connector (for example 
https://issues.apache.org/jira/browse/KAFKA-6132 ). Integration with external 
systems is always difficult. If you add concurrency issues to the mix…

The most problematic bug in a Flink application:

        Not realising that, for some reason, some code unknown to me was 
interrupting (SIGINT) the threads spawned by a custom SourceFunction while they 
were emitting data and the job was back pressured. Very rarely, this interrupted 
the serialisation of a record midway, which showed up on the downstream receiver 
as deserialisation errors.
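
To make that last failure mode concrete, here is a minimal sketch in plain Java. It is not Flink code: the class InterruptedEmitSketch, the emit() helper, the two-chunk framing (a 1-byte length header followed by the payload) and the capacity-1 queue standing in for a back-pressured channel are all assumptions made up for illustration. It only shows how an interrupt that lands between the two writes, and is then swallowed, can leave a torn record that the reader trips over later.

```java
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

class InterruptedEmitSketch {

    // Hypothetical framing: one record = a 1-byte length header, then the payload.
    // Returns false if an interrupt arrived before the record was complete.
    static boolean emit(BlockingQueue<byte[]> channel, String record) {
        byte[] payload = record.getBytes(StandardCharsets.UTF_8);
        try {
            channel.put(new byte[] {(byte) payload.length});
            channel.put(payload); // blocks while the channel is full (back pressure)
            return true;
        } catch (InterruptedException e) {
            // The header may already be queued, and the interrupt flag is now cleared.
            return false;
        }
    }

    static String runDemo() {
        // Capacity 1 stands in for a back-pressured downstream channel.
        BlockingQueue<byte[]> channel = new ArrayBlockingQueue<>(1);
        Thread emitter = new Thread(() -> {
            for (String record : new String[] {"hello", "world"}) {
                emit(channel, record); // bug: result ignored, so "world" follows a torn "hello"
            }
        });
        emitter.start();
        try {
            // Wait until the emitter is parked on the payload put with the header queued.
            while (!(emitter.getState() == Thread.State.WAITING && channel.size() == 1)) {
                Thread.sleep(1);
            }
            emitter.interrupt(); // plays the role of the unknown interrupting code

            int expected = channel.take()[0];   // header of "hello" announces 5 bytes
            int actual = channel.take().length; // should be a payload, is the next header
            channel.take();                     // drain the "world" payload so the emitter can finish
            emitter.join();
            return "expected " + expected + " bytes, got " + actual;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runDemo());
    }
}
```

In this sketch the problem disappears if the caller checks the result of emit() and stops, or propagates the interrupt, instead of carrying on with the next record. Whether the same remedy applies to a real SourceFunction depends on how its threads hand records to Flink.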

Piotrek

> On 1 Oct 2019, at 04:18, Konstantinos Kallas 
> <konstantinos.kal...@hotmail.com> wrote:
> 
> Hi everyone.
> 
> I wanted to ask Flink users what are the most subtle Flink bugs that 
> people have witnessed. The cause of the bugs could be anything (e.g. 
> wrong assumptions on data, parallelism of non-parallel operator, simple 
> mistakes).
> 
> We are developing a testing framework for Flink and it would be 
> interesting to have examples of difficult to spot bugs to evaluate our 
> testing framework on.
> 
> Thanks,
> Konstantinos Kallas
