I would also prefer having only one source. The PR still has both
variants so that people can check them out.

In my opinion the assumptions about interruptibility are easier to
break than the requirement of locking. Even if we get the kafka source
to work with the interruptions (which I doubt, because this fails
somewhere in their code) this would not guarantee that this will
always work in future versions. With the locking you either have the
locking, then it is correct (even for feature versions) or you don't,
then it is immediately incorrect.

On Fri, May 29, 2015 at 10:56 PM, Gyula Fóra <gyula.f...@gmail.com> wrote:
> Hey,
>
> It seems like both interfaces are pretty much capable of doing the same
> thing but work on slightly different assumptions.
>
> Isn't there a way that the kafka source can work with the interruptions? I
> think the reachedEnd/next interface is slightly easier to grasp than the
> run() with the locks. But in any case I would slightly prefer having only
> one of them if they can technically do the same thing.
>
> Also adding a new interface means we add a new streamtask implementation
> which is also getting slightly too much.
>
> What is you opinion on this?
>
> Gyula
>
>
>
> On Fri, May 29, 2015 at 6:51 PM, Aljoscha Krettek <aljos...@apache.org>
> wrote:
>
>> Hi All,
>> after finishing my pull request that should fix the problems with the
>> synchronisation of checkpoints and element emission (the reason for
>> the faulty results of the exactly-once tests) I discovered that the
>> Kafka source does not deal well with being interrupted. We recently
>> changed the SourceFunction to the reachedEnd()/next() interface, with
>> the contract that the source must be interruptible to be able to
>> perform checkpoints. Now this doesn't seem to work with Kafka. I added
>> another Source interface in my PR
>> (https://github.com/apache/flink/pull/742). This is similar to the old
>> interface of run()/cancel(), with the addition that the source must
>> acquire a lock before updating state and emitting elements. The update
>> of state and the emission of elements must happen in the same
>> synchronized block to ensure consistency. This seems to solve the
>> problem but now we have two source interfaces.
>>
>> The question is now. What do you think about the two interfaces?
>> Should we keep both? Remove one?
>>
>> Cheers,
>> Aljoscha
>>

Reply via email to