My bad, (4) is clear now.

On Mon, Jun 22, 2015 at 8:59 PM, Aljoscha Krettek <aljos...@apache.org>
wrote:

> Marton referred to the KafkaSink in (4). For sources, the job will keep
> running by reading from a different broker.
>
> On Mon, 22 Jun 2015 at 18:45 Stephan Ewen <se...@apache.org> wrote:
>
> > I would like to consolidate those as well.
> >
> > The biggest blocker, however, is that the PersistentKafkaSource never
> > commits to ZooKeeper when checkpointing is not enabled. It should at
> > least group-commit periodically in those cases.
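> >
> > As a sketch of that periodic group commit (names here are placeholders,
> > not the actual PersistentKafkaSource internals):
> >
> >     import java.util.concurrent.Executors;
> >     import java.util.concurrent.ScheduledExecutorService;
> >     import java.util.concurrent.TimeUnit;
> >
> >     // Sketch: when checkpointing is disabled, commit the current offsets
> >     // to ZooKeeper on a fixed timer instead of never committing them.
> >     public class PeriodicOffsetCommitter {
> >         private final ScheduledExecutorService timer =
> >                 Executors.newSingleThreadScheduledExecutor();
> >
> >         public void start(long intervalMs) {
> >             timer.scheduleAtFixedRate(new Runnable() {
> >                 public void run() {
> >                     commitOffsetsToZookeeper();
> >                 }
> >             }, intervalMs, intervalMs, TimeUnit.MILLISECONDS);
> >         }
> >
> >         public void stop() {
> >             timer.shutdown();
> >         }
> >
> >         private void commitOffsetsToZookeeper() {
> >             // placeholder: delegate to the consumer's actual commit call,
> >             // e.g. the high-level consumer's commitOffsets()
> >         }
> >     }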
> >
> > Concerning (4), I thought the high-level consumer (that we build
> > the PersistentKafkaSource on) handles broker failures. Apparently it does
> > not?
> >
> > On Mon, Jun 22, 2015 at 2:08 PM, Márton Balassi <
> balassi.mar...@gmail.com>
> > wrote:
> >
> > > Hey,
> > >
> > > Thanks to the effort invested in the Kafka connector, mainly by Robert
> > > and Gabor Hermann, we are going to ship a fairly nice solution for
> > > reading from and writing to Kafka with 0.9.0. This is the most
> > > prominent streaming connector currently, and rightfully so, as
> > > pipeline-level end-to-end exactly-once processing for streaming
> > > depends on it in this release.
> > >
> > > To make it even more user-friendly and efficient, I find the following
> > > tasks important:
> > >
> > > 1. Currently we ship two Kafka sources, and for historic reasons the
> > > non-persistent one is called KafkaSource and the persistent one is
> > > PersistentKafkaSource; the latter is also more difficult to find.
> > > Although the former is easier to read and a bit faster, let us enforce
> > > that people use the fault-tolerant version by default. The behavior is
> > > already documented both in the javadoc and on the Flink website.
> > > Reported by Ufuk.
> > >
> > > Proposed solution: Deprecate the non-persistent KafkaSource and maybe
> > > move the PersistentKafkaSource to org.apache.flink.streaming.kafka.
> > > (Currently it is in its subpackage, persistent.) Eventually we could
> > > even rename the current PersistentKafkaSource to KafkaSource.
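> > >
> > > The deprecation itself would be roughly a one-line change plus a
> > > javadoc pointer (a sketch with a simplified class declaration; the
> > > real class extends the connector source base class):
> > >
> > >     /**
> > >      * @deprecated Use {@link PersistentKafkaSource} instead; it
> > >      * participates in checkpointing and is the fault-tolerant variant.
> > >      */
> > >     @Deprecated
> > >     public class KafkaSource<OUT> {
> > >         // existing implementation stays unchanged
> > >     }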
> > >
> > > 2. The documentation of the streaming connectors is a bit hidden on
> > > the website. They are not included in the connectors section [1], but
> > > sit at the very end of the streaming guide. [2]
> > >
> > > Proposed solution: Move them to the connectors section and link them
> > > from the streaming guide.
> > >
> > > 3. Collocate the KafkaSource and the KafkaSink with the corresponding
> > > Brokers if possible, for improved performance. There is a ticket for
> > > this. [3]
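> > >
> > > As a rough illustration of the idea (all names hypothetical, this is
> > > not the FLINK-1673 design): when handing partitions to source subtasks,
> > > prefer partitions whose leader Broker runs on the same host.
> > >
> > >     import java.util.ArrayList;
> > >     import java.util.List;
> > >
> > >     // Hypothetical sketch: order partitions so that host-local ones
> > >     // come first. PartitionMeta is a placeholder, not a real class.
> > >     public class LocalityAwareAssigner {
> > >
> > >         public static class PartitionMeta {
> > >             final int partition;
> > >             final String leaderHost;
> > >             PartitionMeta(int partition, String leaderHost) {
> > >                 this.partition = partition;
> > >                 this.leaderHost = leaderHost;
> > >             }
> > >         }
> > >
> > >         public static List<PartitionMeta> orderByLocality(
> > >                 List<PartitionMeta> all, String localHost) {
> > >             List<PartitionMeta> local = new ArrayList<PartitionMeta>();
> > >             List<PartitionMeta> remote = new ArrayList<PartitionMeta>();
> > >             for (PartitionMeta p : all) {
> > >                 if (p.leaderHost.equals(localHost)) {
> > >                     local.add(p);
> > >                 } else {
> > >                     remote.add(p);
> > >                 }
> > >             }
> > >             local.addAll(remote);
> > >             return local;
> > >         }
> > >     }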
> > >
> > > 4. Handling Broker failures on the KafkaSink side.
> > >
> > > Currently, instead of looking for a new Broker, the sink throws an
> > > exception and thus cancels the job if the Broker fails. Assuming that
> > > the job has execution retries left and an available Broker to write
> > > to, the job comes back, finds the Broker and continues. I added a
> > > ticket for it just now. [4] Reported by Aljoscha.
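> > >
> > > A sketch of the behavior the ticket asks for (hypothetical names, not
> > > the actual KafkaSink code): retry the send against freshly discovered
> > > Brokers a few times before giving up and cancelling the job.
> > >
> > >     // All types and methods below are placeholders for illustration.
> > >     public class RetryingKafkaWriter {
> > >
> > >         interface Producer {
> > >             void send(String topic, byte[] record) throws Exception;
> > >         }
> > >
> > >         private Producer producer = recreateProducer();
> > >         private final int maxRetries = 3;
> > >
> > >         public void write(String topic, byte[] record) throws Exception {
> > >             for (int attempt = 0; ; attempt++) {
> > >                 try {
> > >                     producer.send(topic, record);
> > >                     return;
> > >                 } catch (Exception e) {
> > >                     if (attempt >= maxRetries) {
> > >                         throw e; // give up; execution retries take over
> > >                     }
> > >                     // placeholder: re-read Broker metadata and connect
> > >                     // to a currently live leader before retrying
> > >                     producer = recreateProducer();
> > >                 }
> > >             }
> > >         }
> > >
> > >         private Producer recreateProducer() {
> > >             return null; // placeholder: look up live Brokers
> > >         }
> > >     }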
> > >
> > > [1]
> > > http://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html
> > > [2]
> > > http://ci.apache.org/projects/flink/flink-docs-master/apis/streaming_guide.html#stream-connectors
> > > [3] https://issues.apache.org/jira/browse/FLINK-1673
> > > [4] https://issues.apache.org/jira/browse/FLINK-2256
> > >
> > > Best,
> > >
> > > Marton
> > >
> >
>
