Technically speaking, removing the old connector code is a backwards incompatible change which requires a major version bump, i.e. Flink 2.x. Given that we don't have a clear plan on when to have the next major version release, it seems unclear how long the old connector code will be there if we check it in right now. Or will we remove the old connector without a major version bump? In any case, it sounds not quite user friendly to the those who might use the old Pulsar connector. I am not sure if it is worth these potential problems in order to have the Pulsar source connector checked in one or two months earlier.
Thanks, Jiangjie (Becket) Qin On Thu, Sep 12, 2019 at 3:52 PM Stephan Ewen <se...@apache.org> wrote: > Agreed, if we check in the old code, we should make it clear that it will > be removed as soon as the FLIP-27 based version of the connector is there. > We should not commit to maintaining the old version, that would be indeed > too much overhead. > > On Thu, Sep 12, 2019 at 3:30 AM Becket Qin <becket....@gmail.com> wrote: > > > Hi Stephan, > > > > Thanks for the volunteering to help. > > > > Yes, the overhead would just be review capacity. In fact, I am not > worrying > > too much about the review capacity. That is just a one time cost. My > > concern is mainly about the long term burden. Assume we have new source > > interface ready in 1.10 with newly added Pulsar connectors in old > > interface. Later on if we migrate Pulsar to new source interface, the old > > Pulsar connector might be deprecated almost immediately after checked in, > > but we may still have to maintain two code bases. For the existing > > connectors, we have to do that anyways. But it would be good to avoid > > introducing a new connector with the same problem. > > > > Thanks, > > > > Jiangjie (Becket) Qin > > > > On Tue, Sep 10, 2019 at 6:51 PM Stephan Ewen <se...@apache.org> wrote: > > > > > Hi all! > > > > > > Nice to see this lively discussion about the Pulsar connector. > > > Some thoughts on the open questions: > > > > > > ## Contribute to Flink or maintain as a community package > > > > > > Looks like the discussion is more going towards contribution. I think > > that > > > is good, especially if we think that we want to build a similarly deep > > > integration with Pulsar as we have for example with Kafka. The > connector > > > already looks like a more thorough connector than many others we have > in > > > the repository. > > > > > > With either a repo split, or the new build system, I hope that the > build > > > overhead is not a problem. > > > > > > ## Committer Support > > > > > > Becket offered some help already, I can also help a bit. I hope that > > > between us, we can cover this. > > > > > > ## Contribute now, or wait for FLIP-27 > > > > > > As Becket said, FLIP-27 is actually making some PoC-ing progress, but > > will > > > take 2 more months, I would estimate, before it is fully available. > > > > > > If we want to be on the safe side with the contribution, we should > > > contribute the source sooner and adjust it later. That would also help > us > > > in case things get crazy towards the 1.10 feature freeze and it would > be > > > hard to find time to review the new changes. > > > What would be the overhead of contributing now? Given that the code is > > > already there, it looks like it would be only review capacity, right? > > > > > > Best, > > > Stephan > > > > > > On Tue, Sep 10, 2019 at 11:04 AM Yijie Shen <henry.yijies...@gmail.com > > > > > wrote: > > > > > > > Hi everyone! > > > > > > > > Thanks for your attention and the promotion of this work. > > > > > > > > We will prepare a FLIP as soon as possible for more specific > > discussions. > > > > > > > > For FLIP-27, it seems that we have not reached a consensus. > Therefore, > > > > I will explain all the functionalities of the existing connector in > > > > the FLIP (including Source, Sink, and Catalog) to continue our > > > > discussions in FLIP. > > > > > > > > Thanks for your kind help. > > > > > > > > Best, > > > > Yijie > > > > > > > > On Tue, Sep 10, 2019 at 9:57 AM Becket Qin <becket....@gmail.com> > > wrote: > > > > > > > > > > Hi Sijie, > > > > > > > > > > If we agree that the goal is to have Pulsar connector in 1.10, how > > > about > > > > we > > > > > do the following: > > > > > > > > > > 0. Start a FLIP to add Pulsar connector to Flink main repo as it > is a > > > new > > > > > public interface to Flink main repo. > > > > > 1. Start to review the Pulsar sink right away as there is no change > > to > > > > the > > > > > sink interface so far. > > > > > 2. Wait a little bit on FLIP-27. Flink 1.10 is going to be code > > freeze > > > in > > > > > late Nov and let's say we give a month to the development and > review > > of > > > > > Pulsar connector, we need to have FLIP-27 by late Oct. There are > > still > > > 7 > > > > > weeks. Personally I think it is doable. If FLIP-27 is not ready by > > late > > > > > Oct, we can review and check in Pulsar connector with the existing > > > source > > > > > interface. This means we will have Pulsar connector in Flink 1.10, > > > either > > > > > with or without FLIP-27. > > > > > > > > > > Because we are going to have Pulsar sink and source checked in > > > > separately, > > > > > it might make sense to have two FLIPs, one for Pulsar sink and > > another > > > > for > > > > > Pulsar source. And we can start the work on Pulsar sink right away. > > > > > > > > > > Thanks, > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > On Mon, Sep 9, 2019 at 4:13 PM Sijie Guo <guosi...@gmail.com> > wrote: > > > > > > > > > > > Thank you Bowen and Becket. > > > > > > > > > > > > What's the take from Flink community? Shall we wait for FLIP-27 > or > > > > shall we > > > > > > proceed to next steps? And what the next steps are? :-) > > > > > > > > > > > > Thanks, > > > > > > Sijie > > > > > > > > > > > > On Thu, Sep 5, 2019 at 2:43 PM Bowen Li <bowenl...@gmail.com> > > wrote: > > > > > > > > > > > > > Hi, > > > > > > > > > > > > > > I think having a Pulsar connector in Flink can be a good mutual > > > > benefit > > > > > > to > > > > > > > both communities. > > > > > > > > > > > > > > Another perspective is that Pulsar connector is the 1st > streaming > > > > > > connector > > > > > > > that integrates with Flink's metadata management system and > > Catalog > > > > APIs. > > > > > > > It'll be cool to see how the integration turns out and whether > we > > > > need to > > > > > > > improve Flink Catalog stack, which are currently in Beta, to > > cater > > > to > > > > > > > streaming source/sink. Thus I'm in favor of merging Pulsar > > > connector > > > > into > > > > > > > Flink 1.10. > > > > > > > > > > > > > > I'd suggest to submit smaller sized PRs, e.g. maybe one for > basic > > > > > > > source/sink functionalities and another for schema and catalog > > > > > > integration, > > > > > > > just to make them easier to review. > > > > > > > > > > > > > > It doesn't seem to hurt to wait for FLIP-27. But I don't think > > > > FLIP-27 > > > > > > > should be a blocker in cases where it cannot make its way into > > 1.10 > > > > or > > > > > > > doesn't leave reasonable amount of time for committers to > review > > or > > > > for > > > > > > > Pulsar connector to fully adapt to new interfaces. > > > > > > > > > > > > > > Bowen > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 3:21 AM Becket Qin < > becket....@gmail.com> > > > > wrote: > > > > > > > > > > > > > > > Hi Till, > > > > > > > > > > > > > > > > You are right. It all depends on when the new source > interface > > is > > > > going > > > > > > > to > > > > > > > > be ready. Personally I think it would be there in about a > month > > > or > > > > so. > > > > > > > But > > > > > > > > I could be too optimistic. It would also be good to hear what > > do > > > > > > Aljoscha > > > > > > > > and Stephan think as they are also involved in FLIP-27. > > > > > > > > > > > > > > > > In general I think we should have Pulsar connector in Flink > > 1.10, > > > > > > > > preferably with the new source interface. We can also check > it > > in > > > > right > > > > > > > now > > > > > > > > with old source interface, but I suspect few users will use > it > > > > before > > > > > > the > > > > > > > > next official release. Therefore, it seems reasonable to > wait a > > > > little > > > > > > > bit > > > > > > > > to see whether we can jump to the new source interface. As > long > > > as > > > > we > > > > > > > make > > > > > > > > sure Flink 1.10 has it, waiting a little bit doesn't seem to > > hurt > > > > much. > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 3:59 PM Till Rohrmann < > > > trohrm...@apache.org > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > I'm wondering what the problem would be if we committed the > > > > Pulsar > > > > > > > > > connector before the new source interface is ready. If I > > > > understood > > > > > > it > > > > > > > > > correctly, then we need to support the old source interface > > > > anyway > > > > > > for > > > > > > > > the > > > > > > > > > existing connectors. By checking it in early I could see > the > > > > benefit > > > > > > > that > > > > > > > > > our users could start using the connector earlier. > Moreover, > > it > > > > would > > > > > > > > > prevent that the Pulsar integration is being delayed in > case > > > > that the > > > > > > > > > source interface should be delayed. The only downside I see > > is > > > > the > > > > > > > extra > > > > > > > > > review effort and potential fixes which might be irrelevant > > for > > > > the > > > > > > new > > > > > > > > > source interface implementation. I guess it mainly depends > on > > > how > > > > > > > certain > > > > > > > > > we are when the new source interface will be ready. > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > Till > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 8:56 AM Becket Qin < > > > becket....@gmail.com> > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Sijie and Yijie, > > > > > > > > > > > > > > > > > > > > Thanks for sharing your thoughts. > > > > > > > > > > > > > > > > > > > > Just want to have some update on FLIP-27. Although the > FLIP > > > > wiki > > > > > > and > > > > > > > > > > discussion thread has been quiet for some time, a few > > > > committer / > > > > > > > > > > contributors in Flink community were actually prototyping > > the > > > > > > entire > > > > > > > > > thing. > > > > > > > > > > We have made some good progress there but want to update > > the > > > > FLIP > > > > > > > wiki > > > > > > > > > > after the entire thing is verified to work in case there > > are > > > > some > > > > > > > last > > > > > > > > > > minute surprise in the implementation. I don't have an > > exact > > > > ETA > > > > > > yet, > > > > > > > > > but I > > > > > > > > > > guess it is going to be within a month or so. > > > > > > > > > > > > > > > > > > > > I am happy to review the current Flink Pulsar connector > and > > > > see if > > > > > > it > > > > > > > > > would > > > > > > > > > > fit in FLIP-27. It would be good to avoid the case that > we > > > > checked > > > > > > in > > > > > > > > the > > > > > > > > > > Pulsar connector with some review efforts and shortly > after > > > > that > > > > > > the > > > > > > > > new > > > > > > > > > > Source interface is ready. > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 8:39 AM Yijie Shen < > > > > > > henry.yijies...@gmail.com > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > Thanks for all the feedback and suggestions! > > > > > > > > > > > > > > > > > > > > > > As Sijie said, the goal of the connector has always > been > > to > > > > > > provide > > > > > > > > > > > users with the latest features of both systems as soon > as > > > > > > possible. > > > > > > > > We > > > > > > > > > > > propose to contribute the connector to Flink and hope > to > > > get > > > > more > > > > > > > > > > > suggestions and feedback from Flink experts to ensure > the > > > > high > > > > > > > > quality > > > > > > > > > > > of the connector. > > > > > > > > > > > > > > > > > > > > > > For FLIP-27, we noticed its existence at the beginning > of > > > > > > reworking > > > > > > > > > > > the connector implementation based on Flink 1.9; we > also > > > > wanted > > > > > > to > > > > > > > > > > > build a connector that supports both batch and stream > > > > computing > > > > > > > based > > > > > > > > > > > on it. > > > > > > > > > > > However, it has been inactive for some time, so we > > decided > > > to > > > > > > > provide > > > > > > > > > > > a connector with most of the new features, such as the > > new > > > > type > > > > > > > > system > > > > > > > > > > > and the new catalog API first. We will pay attention to > > the > > > > > > > progress > > > > > > > > > > > of FLIP-27 continually and incorporate it with the > > > connector > > > > as > > > > > > > soon > > > > > > > > > > > as possible. > > > > > > > > > > > > > > > > > > > > > > Regarding the test status of the connector, we are > > > following > > > > the > > > > > > > > other > > > > > > > > > > > connectors' test in Flink repository and aimed to > provide > > > > > > > throughout > > > > > > > > > > > tests as we could. We are also happy to hear > suggestions > > > and > > > > > > > > > > > supervision from the Flink community to improve the > > > > stability and > > > > > > > > > > > performance of the connector continuously. > > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > > > > > Yijie > > > > > > > > > > > > > > > > > > > > > > On Thu, Sep 5, 2019 at 5:59 AM Sijie Guo < > > > guosi...@gmail.com > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > Thanks everyone for the comments and feedback. > > > > > > > > > > > > > > > > > > > > > > > > It seems to me that the main question here is about - > > > "how > > > > can > > > > > > > the > > > > > > > > > > Flink > > > > > > > > > > > > community maintain the connector?". > > > > > > > > > > > > > > > > > > > > > > > > Here are two thoughts from myself. > > > > > > > > > > > > > > > > > > > > > > > > 1) I think how and where to host this integration is > > kind > > > > of > > > > > > less > > > > > > > > > > > important > > > > > > > > > > > > here. I believe there can be many ways to achieve it. > > > > > > > > > > > > As part of the contribution, what we are looking for > > here > > > > is > > > > > > how > > > > > > > > > these > > > > > > > > > > > two > > > > > > > > > > > > communities can build the collaboration relationship > on > > > > > > > developing > > > > > > > > > > > > the integration between Pulsar and Flink. Even we can > > try > > > > our > > > > > > > best > > > > > > > > to > > > > > > > > > > > catch > > > > > > > > > > > > up all the updates in Flink community. We are still > > > > > > > > > > > > facing the fact that we have less experiences in > Flink > > > than > > > > > > folks > > > > > > > > in > > > > > > > > > > > Flink > > > > > > > > > > > > community. In order to make sure we maintain and > > deliver > > > > > > > > > > > > a high-quality pulsar-flink integration to the users > > who > > > > use > > > > > > both > > > > > > > > > > > > technologies, we need some help from the experts from > > > Flink > > > > > > > > > community. > > > > > > > > > > > > > > > > > > > > > > > > 2) We have been following FLIP-27 for a while. > > Originally > > > > we > > > > > > were > > > > > > > > > > > thinking > > > > > > > > > > > > of contributing the connectors back after integrating > > > with > > > > the > > > > > > > > > > > > new API introduced in FLIP-27. But we decided to > > initiate > > > > the > > > > > > > > > > > conversation > > > > > > > > > > > > as early as possible. Because we believe there are > more > > > > > > benefits > > > > > > > > > doing > > > > > > > > > > > > it now rather than later. As part of contribution, it > > can > > > > help > > > > > > > > Flink > > > > > > > > > > > > community understand more about Pulsar and the > > potential > > > > > > > > integration > > > > > > > > > > > points. > > > > > > > > > > > > Also we can also help Flink community verify the new > > > > connector > > > > > > > API > > > > > > > > as > > > > > > > > > > > well > > > > > > > > > > > > as other new API (e.g. catalog API). > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > Sijie > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 5:24 AM Becket Qin < > > > > > > becket....@gmail.com> > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > Hi Yijie, > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks for the interest in contributing the Pulsar > > > > connector. > > > > > > > > > > > > > > > > > > > > > > > > > > In general, I think having Pulsar connector with > > strong > > > > > > support > > > > > > > > is > > > > > > > > > a > > > > > > > > > > > > > valuable addition to Flink. So I am happy the > > shepherd > > > > this > > > > > > > > effort. > > > > > > > > > > > > > Meanwhile, I would also like to provide some > context > > > and > > > > > > recent > > > > > > > > > > > efforts on > > > > > > > > > > > > > the Flink connectors ecosystem. > > > > > > > > > > > > > > > > > > > > > > > > > > The current way Flink maintains its connector has > hit > > > the > > > > > > > > > scalability > > > > > > > > > > > bar. > > > > > > > > > > > > > With more and more connectors coming into Flink > repo, > > > we > > > > are > > > > > > > > > facing a > > > > > > > > > > > few > > > > > > > > > > > > > problems such as long build and testing time. To > > > address > > > > this > > > > > > > > > > problem, > > > > > > > > > > > we > > > > > > > > > > > > > have attempted to do the following: > > > > > > > > > > > > > 1. Split out the connectors into a separate > > repository. > > > > This > > > > > > is > > > > > > > > > > > temporarily > > > > > > > > > > > > > on hold due to potential solution to shorten the > > build > > > > time. > > > > > > > > > > > > > 2. Encourage the connectors to stay as ecosystem > > > project > > > > > > while > > > > > > > > > Flink > > > > > > > > > > > tries > > > > > > > > > > > > > to provide good support for functionality and > > > > compatibility > > > > > > > > tests. > > > > > > > > > > > Robert > > > > > > > > > > > > > has driven to create a Flink Ecosystem project > > website > > > > and it > > > > > > > is > > > > > > > > > > going > > > > > > > > > > > > > through some final approval process. > > > > > > > > > > > > > > > > > > > > > > > > > > Given the above efforts, it would be great to first > > see > > > > if we > > > > > > > can > > > > > > > > > > have > > > > > > > > > > > > > Pulsar connector as an ecosystem project with great > > > > support. > > > > > > It > > > > > > > > > would > > > > > > > > > > > be > > > > > > > > > > > > > good to hear how the Flink Pulsar connector is > tested > > > > > > currently > > > > > > > > to > > > > > > > > > > see > > > > > > > > > > > if > > > > > > > > > > > > > we can learn something to maintain it as an > ecosystem > > > > project > > > > > > > > with > > > > > > > > > > good > > > > > > > > > > > > > quality and test coverage. If the quality as an > > > ecosystem > > > > > > > project > > > > > > > > > is > > > > > > > > > > > hard > > > > > > > > > > > > > to guarantee, we may as well adopt it into the main > > > repo. > > > > > > > > > > > > > > > > > > > > > > > > > > BTW, another ongoing effort is FLIP-27 where we are > > > > making > > > > > > > > changes > > > > > > > > > to > > > > > > > > > > > the > > > > > > > > > > > > > Flink source connector architecture and interface. > > This > > > > > > change > > > > > > > > will > > > > > > > > > > > likely > > > > > > > > > > > > > land in 1.10. Therefore timing wise, if we are > going > > to > > > > have > > > > > > > the > > > > > > > > > > Pulsar > > > > > > > > > > > > > connector in main repo, I am wondering if we should > > > hold > > > > a > > > > > > > little > > > > > > > > > bit > > > > > > > > > > > and > > > > > > > > > > > > > let the Pulsar connector adapt to the new interface > > to > > > > avoid > > > > > > > > > shortly > > > > > > > > > > > > > deprecated work? > > > > > > > > > > > > > > > > > > > > > > > > > > Thanks, > > > > > > > > > > > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 4:32 PM Chesnay Schepler < > > > > > > > > > ches...@apache.org> > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > I'm quite worried that we may end up repeating > > > history. > > > > > > > > > > > > > > > > > > > > > > > > > > > > There were already 2 attempts at contributing a > > > pulsar > > > > > > > > connector, > > > > > > > > > > > both > > > > > > > > > > > > > > of which failed because no committer was getting > > > > involved, > > > > > > > > > despite > > > > > > > > > > > the > > > > > > > > > > > > > > contributor opening a dedicated discussion thread > > > > about the > > > > > > > > > > > contribution > > > > > > > > > > > > > > beforehand and getting several +1's from > > committers. > > > > > > > > > > > > > > > > > > > > > > > > > > > > We should really make sure that if we > > welcome/approve > > > > such > > > > > > a > > > > > > > > > > > > > > contribution it will actually get the attention > it > > > > > > deserves. > > > > > > > > > > > > > > > > > > > > > > > > > > > > As such, I'm inclined to recommend maintaining > the > > > > > > connector > > > > > > > > > > outside > > > > > > > > > > > of > > > > > > > > > > > > > > Flink. We could link to it from the documentation > > to > > > > give > > > > > > it > > > > > > > > more > > > > > > > > > > > > > exposure. > > > > > > > > > > > > > > With the upcoming page for sharing artifacts > among > > > the > > > > > > > > community > > > > > > > > > > > (what's > > > > > > > > > > > > > > the state of that anyway?), this may be a better > > > > option. > > > > > > > > > > > > > > > > > > > > > > > > > > > > On 04/09/2019 10:16, Till Rohrmann wrote: > > > > > > > > > > > > > > > Hi everyone, > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > thanks a lot for starting this discussion > Yijie. > > I > > > > think > > > > > > > the > > > > > > > > > > Pulsar > > > > > > > > > > > > > > > connector would be a very valuable addition > since > > > > Pulsar > > > > > > > > > becomes > > > > > > > > > > > more > > > > > > > > > > > > > and > > > > > > > > > > > > > > > more popular and it would further expand > Flink's > > > > > > > > > > interoperability. > > > > > > > > > > > Also > > > > > > > > > > > > > > > from a project perspective it makes sense for > me > > to > > > > place > > > > > > > the > > > > > > > > > > > connector > > > > > > > > > > > > > > in > > > > > > > > > > > > > > > the downstream project. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > My main concern/question is how can the Flink > > > > community > > > > > > > > > maintain > > > > > > > > > > > the > > > > > > > > > > > > > > > connector? We have seen in the past that > > connectors > > > > are > > > > > > > some > > > > > > > > of > > > > > > > > > > the > > > > > > > > > > > > > most > > > > > > > > > > > > > > > actively developed components because they need > > to > > > be > > > > > > kept > > > > > > > in > > > > > > > > > > sync > > > > > > > > > > > with > > > > > > > > > > > > > > the > > > > > > > > > > > > > > > external system and with Flink. Given that the > > > Pulsar > > > > > > > > community > > > > > > > > > > is > > > > > > > > > > > > > > willing > > > > > > > > > > > > > > > to help with maintaining, improving and > evolving > > > the > > > > > > > > connector, > > > > > > > > > > I'm > > > > > > > > > > > > > > > optimistic that we can achieve this. Hence, +1 > > for > > > > > > > > contributing > > > > > > > > > > it > > > > > > > > > > > back > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > Flink. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > Till > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Wed, Sep 4, 2019 at 2:03 AM Sijie Guo < > > > > > > > guosi...@gmail.com > > > > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >> Hi Yun, > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Since I was the main driver behind FLINK-9641 > > and > > > > > > > > FLINK-9168, > > > > > > > > > > let > > > > > > > > > > > me > > > > > > > > > > > > > > try to > > > > > > > > > > > > > > >> add more context on this. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> FLINK-9641 and FLINK-9168 was created for > > bringing > > > > > > Pulsar > > > > > > > as > > > > > > > > > > > source > > > > > > > > > > > > > and > > > > > > > > > > > > > > >> sink for Flink. The integration was done with > > > Flink > > > > > > 1.6.0. > > > > > > > > We > > > > > > > > > > > sent out > > > > > > > > > > > > > > pull > > > > > > > > > > > > > > >> requests about a year ago and we ended up > > > > maintaining > > > > > > > those > > > > > > > > > > > connectors > > > > > > > > > > > > > > in > > > > > > > > > > > > > > >> Pulsar for Pulsar users to use Flink to > process > > > > event > > > > > > > > streams > > > > > > > > > in > > > > > > > > > > > > > Pulsar. > > > > > > > > > > > > > > >> (See > > > > > > > > > https://github.com/apache/pulsar/tree/master/pulsar-flink > > > > > > > > > > ). > > > > > > > > > > > The > > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > >> 1.6 integration is pretty simple and there is > no > > > > schema > > > > > > > > > > > > > considerations. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> In the past year, we have made a lot of > changes > > in > > > > > > Pulsar > > > > > > > > and > > > > > > > > > > > brought > > > > > > > > > > > > > > >> Pulsar schema as the first-class citizen in > > > Pulsar. > > > > We > > > > > > > also > > > > > > > > > > > integrated > > > > > > > > > > > > > > with > > > > > > > > > > > > > > >> other computing engines for processing Pulsar > > > event > > > > > > > streams > > > > > > > > > with > > > > > > > > > > > > > Pulsar > > > > > > > > > > > > > > >> schema. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> It led us to rethink how to integrate with > Flink > > > in > > > > the > > > > > > > best > > > > > > > > > > way. > > > > > > > > > > > Then > > > > > > > > > > > > > > we > > > > > > > > > > > > > > >> reimplement the pulsar-flink connectors from > the > > > > ground > > > > > > up > > > > > > > > > with > > > > > > > > > > > schema > > > > > > > > > > > > > > and > > > > > > > > > > > > > > >> bring table API and catalog API as the > > first-class > > > > > > citizen > > > > > > > > in > > > > > > > > > > the > > > > > > > > > > > > > > >> integration. With that being said, in the new > > > > > > pulsar-flink > > > > > > > > > > > > > > implementation, > > > > > > > > > > > > > > >> you can register pulsar as a flink catalog and > > > > query / > > > > > > > > process > > > > > > > > > > the > > > > > > > > > > > > > event > > > > > > > > > > > > > > >> streams using Flink SQL. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> This is an example about how to use Pulsar as > a > > > > Flink > > > > > > > > catalog: > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://github.com/streamnative/pulsar-flink/blob/3eeddec5625fc7dddc3f8a3ec69f72e1614ca9c9/README.md#use-pulsar-catalog > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Yijie has also written a blog post explaining > > why > > > we > > > > > > > > > > re-implement > > > > > > > > > > > the > > > > > > > > > > > > > > flink > > > > > > > > > > > > > > >> connector with Flink 1.9 and what are the > > changes > > > we > > > > > > made > > > > > > > in > > > > > > > > > the > > > > > > > > > > > new > > > > > > > > > > > > > > >> connector: > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://medium.com/streamnative/use-apache-pulsar-as-streaming-table-with-8-lines-of-code-39033a93947f > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> We believe Pulsar is not just a simple data > sink > > > or > > > > > > source > > > > > > > > for > > > > > > > > > > > Flink. > > > > > > > > > > > > > It > > > > > > > > > > > > > > >> actually can be a fully integrated streaming > > data > > > > > > storage > > > > > > > > for > > > > > > > > > > > Flink in > > > > > > > > > > > > > > many > > > > > > > > > > > > > > >> areas (sink, source, schema/catalog and > state). > > > The > > > > > > > > > combination > > > > > > > > > > of > > > > > > > > > > > > > Flink > > > > > > > > > > > > > > >> and Pulsar can create a great streaming > > warehouse > > > > > > > > architecture > > > > > > > > > > for > > > > > > > > > > > > > > >> streaming-first, unified data processing. > Since > > we > > > > are > > > > > > > > talking > > > > > > > > > > to > > > > > > > > > > > > > > >> contribute Pulsar integration to Flink here, > we > > > are > > > > also > > > > > > > > > > > dedicated to > > > > > > > > > > > > > > >> maintain, improve and evolve the integration > > with > > > > Flink > > > > > > to > > > > > > > > > help > > > > > > > > > > > the > > > > > > > > > > > > > > users > > > > > > > > > > > > > > >> who use both Flink and Pulsar. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Hope this give you a bit more background about > > the > > > > > > pulsar > > > > > > > > > flink > > > > > > > > > > > > > > >> integration. Let me know what are your > thoughts. > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> Thanks, > > > > > > > > > > > > > > >> Sijie > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> On Tue, Sep 3, 2019 at 11:54 AM Yun Tang < > > > > > > > myas...@live.com> > > > > > > > > > > > wrote: > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >>> Hi Yijie > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> I can see that Pulsar becomes more and more > > > popular > > > > > > > > recently > > > > > > > > > > and > > > > > > > > > > > very > > > > > > > > > > > > > > >> glad > > > > > > > > > > > > > > >>> to see more people willing to contribute to > > Flink > > > > > > > > ecosystem. > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> Before any further discussion, would you > please > > > > give > > > > > > some > > > > > > > > > > > explanation > > > > > > > > > > > > > > of > > > > > > > > > > > > > > >>> the relationship between this thread to > current > > > > > > existing > > > > > > > > > JIRAs > > > > > > > > > > of > > > > > > > > > > > > > > pulsar > > > > > > > > > > > > > > >>> source [1] and sink [2] connector? Will the > > > > > > contribution > > > > > > > > > > contains > > > > > > > > > > > > > part > > > > > > > > > > > > > > of > > > > > > > > > > > > > > >>> those PRs or totally different > implementation? > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> [1] > > > > https://issues.apache.org/jira/browse/FLINK-9641 > > > > > > > > > > > > > > >>> [2] > > > > https://issues.apache.org/jira/browse/FLINK-9168 > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> Best > > > > > > > > > > > > > > >>> Yun Tang > > > > > > > > > > > > > > >>> ________________________________ > > > > > > > > > > > > > > >>> From: Yijie Shen <henry.yijies...@gmail.com> > > > > > > > > > > > > > > >>> Sent: Tuesday, September 3, 2019 13:57 > > > > > > > > > > > > > > >>> To: dev@flink.apache.org < > dev@flink.apache.org > > > > > > > > > > > > > > > > > >>> Subject: [DISCUSS] Contribute Pulsar Flink > > > > connector > > > > > > back > > > > > > > > to > > > > > > > > > > > Flink > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> Dear Flink Community! > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> I would like to open the discussion of > > > contributing > > > > > > > Pulsar > > > > > > > > > > Flink > > > > > > > > > > > > > > >>> connector [0] back to Flink. > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> ## A brief introduction to Apache Pulsar > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> Apache Pulsar[1] is a multi-tenant, > > > > high-performance > > > > > > > > > > distributed > > > > > > > > > > > > > > >>> pub-sub messaging system. Pulsar includes > > > multiple > > > > > > > features > > > > > > > > > > such > > > > > > > > > > > as > > > > > > > > > > > > > > >>> native support for multiple clusters in a > > Pulsar > > > > > > > instance, > > > > > > > > > with > > > > > > > > > > > > > > >>> seamless geo-replication of messages across > > > > clusters, > > > > > > > very > > > > > > > > > low > > > > > > > > > > > > > publish > > > > > > > > > > > > > > >>> and end-to-end latency, seamless scalability > to > > > > over a > > > > > > > > > million > > > > > > > > > > > > > topics, > > > > > > > > > > > > > > >>> and guaranteed message delivery with > persistent > > > > message > > > > > > > > > storage > > > > > > > > > > > > > > >>> provided by Apache BookKeeper. Nowadays, > Pulsar > > > has > > > > > > been > > > > > > > > > > adopted > > > > > > > > > > > by > > > > > > > > > > > > > > >>> more and more companies[2]. > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> ## The status of Pulsar Flink connector > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> The Pulsar Flink connector we are planning to > > > > > > contribute > > > > > > > is > > > > > > > > > > built > > > > > > > > > > > > > upon > > > > > > > > > > > > > > >>> Flink 1.9.0 and Pulsar 2.4.0. The main > features > > > > are: > > > > > > > > > > > > > > >>> - Pulsar as a streaming source with > > exactly-once > > > > > > > guarantee. > > > > > > > > > > > > > > >>> - Sink streaming results to Pulsar with > > > > at-least-once > > > > > > > > > > semantics. > > > > > > > > > > > (We > > > > > > > > > > > > > > >>> would update this to exactly-once as well > when > > > > Pulsar > > > > > > > gets > > > > > > > > > all > > > > > > > > > > > > > > >>> transaction features ready in its 2.5.0 > > version) > > > > > > > > > > > > > > >>> - Build upon Flink new Table API Type system > > > > > > > (FLIP-37[3]), > > > > > > > > > and > > > > > > > > > > > can > > > > > > > > > > > > > > >>> automatically (de)serialize messages with the > > > help > > > > of > > > > > > > > Pulsar > > > > > > > > > > > schema. > > > > > > > > > > > > > > >>> - Integrate with Flink new Catalog API > > > > (FLIP-30[4]), > > > > > > > which > > > > > > > > > > > enables > > > > > > > > > > > > > the > > > > > > > > > > > > > > >>> use of Pulsar topics as tables in Table API > as > > > > well as > > > > > > > SQL > > > > > > > > > > > client. > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> ## Reference > > > > > > > > > > > > > > >>> [0] > > https://github.com/streamnative/pulsar-flink > > > > > > > > > > > > > > >>> [1] https://pulsar.apache.org/ > > > > > > > > > > > > > > >>> [2] https://pulsar.apache.org/en/powered-by/ > > > > > > > > > > > > > > >>> [3] > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > > > > > > > > > > > > > > >>> [4] > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > >>> Best, > > > > > > > > > > > > > > >>> Yijie Shen > > > > > > > > > > > > > > >>> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >