Re: [Discussion] Flink Pulsar Connector

Fabian Hueske Tue, 24 Apr 2018 02:34:08 -0700

Hi Sijie, hi Pulsar community!

Thanks for the detailed overview of Pulsar.
I like the idea of adding a Pulsar connector to Flink.


As Gordon mentioned, the Flink community would like to ensure that the
connector is maintained after being added.
In our experience, connector maintenance, including fixing bugs, upgrading
to newer versions, adding new features, and reviewing contributions,
results in a lot of work for the community.
Since you have assured us that the Pulsar community is eager to help with
this effort, I think we could add the connector to Flink.
This would also be a good initiative for a tighter collaboration between
two ASF projects.

Best, Fabian

2018-04-22 7:02 GMT+02:00 Henry Saputra <[email protected]>:

> Here is the link to the Apache Flink JIRA issue for this:
>
> https://issues.apache.org/jira/browse/FLINK-9168
>
> - Henry
>
> On Fri, Apr 20, 2018 at 12:08 AM, Sijie Guo <[email protected]> wrote:
>
> > Hi Flinkers,
> >
> > As discussed with @tzulitai at apache/flink#5845
> > <https://github.com/apache/flink/pull/5845>, I am starting a discussion
> > thread about contributing Flink Pulsar connectors (both source and sink
> > connectors) from the Pulsar community to the Flink project. We'd like to
> > hear people's thoughts on this and how we can proceed.
> >
> > For people who don't know about Apache Pulsar, here is some background:
> >
> > ---
> >
> > Apache Pulsar (incubating) <https://pulsar.incubator.apache.org/> is a
> > distributed pub/sub messaging system that provides a very flexible
> > messaging model, unifying traditional queuing (e.g. SQS, RabbitMQ) and
> > high-performance streaming (e.g. Kinesis, Kafka) into one pub/sub
> > messaging model and API. It is backed by Apache BookKeeper, a scalable
> > segment/log store, which provides unbounded stream storage for Pulsar.
> > Because of its segment-centric architecture, Pulsar provides compelling
> > unbounded streaming data storage. It is good for both streaming and batch
> > processing, so I believe it fits very well into Flink's data processing
> > model. Besides that, Pulsar has a lot of advanced features coming in its
> > upcoming 2.0 release, including a built-in schema registry, topic
> > compaction, regex subscription, and tiered storage
> > <https://github.com/apache/incubator-pulsar/wiki/PIP-17:-Tiered-storage-for-Pulsar-topics>
> >  ...
> >
> > Pulsar has been developed at Yahoo since around 2012 and has been running
> > in production for 4+ years, across 10+ data centers, processing and
> > delivering billions of messages per day. It was open sourced in 2016.
> > Since then, it has been adopted by various companies. The Pulsar Slack
> > channel is very active and fast-growing. The community currently has
> > about 15 committers.
> >
> > ---
> >
> > I happened to work with ZongYang (who is also a Pulsar contributor) on
> > developing Pulsar connectors for Flink to satisfy Pulsar users' requests.
> > We would like to contribute the connector work to Flink and continue the
> > collaboration between the Flink and Pulsar communities. From the Pulsar
> > community's perspective, we are also very committed to developing
> > Pulsar's ecosystem, and willing and dedicated to developing and
> > maintaining the Flink Pulsar connectors.
> >
> > I hope this email thread gives you enough background on Pulsar and
> > clears up some of the concerns that @tzulitai raised in the JIRA ticket /
> > pull request. Looking forward to any feedback from the Pulsar community
> > and to deep collaboration between the Flink and Pulsar communities.
> >
> > Also /cc pulsar dev mailing list ([email protected]). If
> > there are any questions, pulsar devs can also help to answer.
> >
> > Thanks,
> > Sijie
> >
>
