Hi everyone,

I've updated the Catalog PR and made all settings lowercase. Tests have been added as well.

Hi Bowen, could you please take a look? https://github.com/apache/flink/pull/10455
For the sink part of the connector, I've made a separate PR: https://github.com/apache/flink/pull/10875. Could someone help review this?

Best,
Yijie

On Thu, Jan 9, 2020 at 8:44 AM Bowen Li <bowenl...@gmail.com> wrote:

> Hi Yijie,
>
> There's just one more concern on the yaml configs. Otherwise, I think we
> should be good to go.
>
> Can you update your PR and ensure all tests pass? I can help review and
> merge in the next couple weeks.
>
> Thanks,
> Bowen
>
> On Mon, Dec 23, 2019 at 7:03 PM Yijie Shen <henry.yijies...@gmail.com> wrote:
>
> > Hi Bowen,
> >
> > I've updated the design doc, PTAL.
> > Btw the PR for catalog is https://github.com/apache/flink/pull/10455,
> > could you please take a look?
> >
> > Best,
> > Yijie
> >
> > On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <bowenl...@gmail.com> wrote:
> >
> > > Hi Yijie,
> > >
> > > I took a look at the design doc. LGTM overall, left a few questions.
> > >
> > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <becket....@gmail.com> wrote:
> > >
> > > > Yes, you are absolutely right. Cannot believe I posted in the wrong
> > > > thread...
> > > >
> > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <imj...@gmail.com> wrote:
> > > >
> > > >> Thanks Becket for the update,
> > > >>
> > > >> But shouldn't this message be posted in FLIP-27 discussion thread[1]?
> > > >>
> > > >> Best,
> > > >> Jark
> > > >>
> > > >> [1]: http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html
> > > >>
> > > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <becket....@gmail.com> wrote:
> > > >>
> > > >> > Hi all,
> > > >> >
> > > >> > Sorry for the long belated update. I have updated FLIP-27 wiki page with
> > > >> > the latest proposals. Some noticeable changes include:
> > > >> > 1. A new generic communication mechanism between SplitEnumerator and
> > > >> > SourceReader.
> > > >> > 2. Some detail API method signature changes.
> > > >> >
> > > >> > We left a few things out of this FLIP and will address them in separate
> > > >> > FLIPs, including:
> > > >> > 1. Per split event time.
> > > >> > 2. Event time alignment.
> > > >> > 3. Fine grained failover for SplitEnumerator failure.
> > > >> >
> > > >> > Please let us know if you have any question.
> > > >> >
> > > >> > Thanks,
> > > >> >
> > > >> > Jiangjie (Becket) Qin
> > > >> >
> > > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen <henry.yijies...@gmail.com> wrote:
> > > >> >
> > > >> > > Hi everyone,
> > > >> > >
> > > >> > > I've put the catalog part design in a separate doc with more details for
> > > >> > > easier communication.
> > > >> > >
> > > >> > > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing
> > > >> > >
> > > >> > > I would love to hear your thoughts on this.
> > > >> > >
> > > >> > > Best,
> > > >> > > Yijie
> > > >> > >
> > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <henry.yijies...@gmail.com> wrote:
> > > >> > >
> > > >> > > > Hi everyone,
> > > >> > > >
> > > >> > > > Glad to receive your valuable feedback.
> > > >> > > >
> > > >> > > > I'd first separate the Pulsar catalog as another doc and show more design
> > > >> > > > and implementation details there.
> > > >> > > >
> > > >> > > > For the current FLIP-72, I would separate it into the sink part for
> > > >> > > > current work and keep the source part as future work until FLIP-27 is
> > > >> > > > finalized.
> > > >> > > >
> > > >> > > > I also replied to some of the comments in the design doc. I will rewrite the
> > > >> > > > catalog part according to Bowen's advice in both email and comments.
> > > >> > > >
> > > >> > > > Thanks for the help again.
> > > >> > > >
> > > >> > > > Best,
> > > >> > > > Yijie
> > > >> > > >
> > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <walter...@gmail.com> wrote:
> > > >> > > >
> > > >> > > >> Hi Yijie,
> > > >> > > >>
> > > >> > > >> I also agree with Jark on separating the Catalog part into another FLIP.
> > > >> > > >>
> > > >> > > >> With FLIP-27[1] also in the air, it is also probably great to split and
> > > >> > > >> unblock the sink implementation contribution.
> > > >> > > >> I would suggest either putting in a detailed implementation plan section in
> > > >> > > >> the doc, or (maybe too much separation?) splitting them into different
> > > >> > > >> FLIPs. What do you guys think?
> > > >> > > >>
> > > >> > > >> --
> > > >> > > >> Rong
> > > >> > > >>
> > > >> > > >> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
> > > >> > > >>
> > > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <imj...@gmail.com> wrote:
> > > >> > > >>
> > > >> > > >> > Hi Yijie,
> > > >> > > >> >
> > > >> > > >> > Thanks for the design document. I agree with Bowen that the catalog part
> > > >> > > >> > needs more details.
> > > >> > > >> > And I would suggest separating Pulsar Catalog as another FLIP. IMO, it has
> > > >> > > >> > little to do with source/sink.
> > > >> > > >> > Having a separate FLIP can unblock the contribution for sink (or source)
> > > >> > > >> > and keep the discussion more focused.
> > > >> > > >> > I also left some comments in the documentation.
> > > >> > > >> >
> > > >> > > >> > Thanks,
> > > >> > > >> > Jark
> > > >> > > >> >
> > > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <henry.yijies...@gmail.com> wrote:
> > > >> > > >> >
> > > >> > > >> > > Hi Bowen,
> > > >> > > >> > >
> > > >> > > >> > > Thanks for your comments. I'll add catalog details as you suggested.
> > > >> > > >> > >
> > > >> > > >> > > One more question: since we decided not to implement the source part of
> > > >> > > >> > > the connector at the moment, what can users do with a Pulsar catalog?
> > > >> > > >> > > Create a table backed by Pulsar and check existing Pulsar tables to see
> > > >> > > >> > > their schemas? Drop tables maybe?
> > > >> > > >> > >
> > > >> > > >> > > Best,
> > > >> > > >> > > Yijie
> > > >> > > >> > >
> > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <bowenl...@gmail.com> wrote:
> > > >> > > >> > >
> > > >> > > >> > > > Hi Yijie,
> > > >> > > >> > > >
> > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to 'future work'
> > > >> > > >> > > > section in the FLIP for now?
> > > >> > > >> > > >
> > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd recommend
> > > >> > > >> > > > adding more details.
> > > >> > > >> > > >
> > > >> > > >> > > > A few questions mainly regarding the proposed pulsar catalog.
> > > >> > > >> > > >
> > > >> > > >> > > > - Can you provide some background of pulsar schema registry and how it
> > > >> > > >> > > >   works?
> > > >> > > >> > > > - The proposed design of pulsar catalog is very vague now, can you
> > > >> > > >> > > >   share some details of how a pulsar catalog would work internally?
> > > >> > > >> > > >   E.g.
> > > >> > > >> > > >   - which APIs does it support exactly? E.g. I see from your
> > > >> > > >> > > >     prototype that table creation is supported but not alteration.
> > > >> > > >> > > >   - is it going to connect to a pulsar schema registry via a http
> > > >> > > >> > > >     client or a pulsar client, etc
> > > >> > > >> > > >   - will it be able to handle multiple versions of pulsar, or just
> > > >> > > >> > > >     one? How is compatibility handled between different Flink-Pulsar
> > > >> > > >> > > >     versions?
> > > >> > > >> > > >   - will it support only reading from pulsar schema registry, or
> > > >> > > >> > > >     both read/write? Will it work end-to-end in Flink SQL for users to
> > > >> > > >> > > >     create and manipulate a pulsar table such as "CREATE TABLE t WITH
> > > >> > > >> > > >     PROPERTIES(type=pulsar)" and "DROP TABLE t"?
> > > >> > > >> > > >   - Is a pulsar topic always gonna be a non-partitioned table? How is
> > > >> > > >> > > >     a partitioned topic mapped to a Flink table?
> > > >> > > >> > > > - How to map Flink's catalog/database namespace to pulsar's
> > > >> > > >> > > >   multi-tenant namespaces? I'm not very familiar with how multi tenancy
> > > >> > > >> > > >   works in pulsar, and some background context/use cases may help here
> > > >> > > >> > > >   too. E.g.
> > > >> > > >> > > >   - can a pulsar client/consumer/producer be multiple-tenant at the
> > > >> > > >> > > >     same time?
> > > >> > > >> > > >   - how does authentication work in pulsar's multi-tenancy and the
> > > >> > > >> > > >     catalog?
> > > >> > > >> > > >     Asking since I didn't see the proposed pulsar catalog has
> > > >> > > >> > > >     username/password configs
> > > >> > > >> > > >   - the FLIP seems to propose mapping a pulsar cluster and
> > > >> > > >> > > >     'tenant/namespace' respectively to Flink's 'catalog' and 'database'. I
> > > >> > > >> > > >     wonder whether it totally makes sense, or should we actually map
> > > >> > > >> > > >     "tenant" to "catalog", and "namespace" to "database"?
> > > >> > > >> > > >
> > > >> > > >> > > > Cheers,
> > > >> > > >> > > > Bowen
> > > >> > > >> > > >
> > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <henry.yijies...@gmail.com> wrote:
> > > >> > > >> > > >
> > > >> > > >> > > >> Hi everyone,
> > > >> > > >> > > >>
> > > >> > > >> > > >> Per discussion in the previous thread
> > > >> > > >> > > >> <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html>,
> > > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed discussion on the Flink
> > > >> > > >> > > >> Pulsar connector:
> > > >> > > >> > > >>
> > > >> > > >> > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
> > > >> > > >> > > >>
> > > >> > > >> > > >> In short, the connector has the following features:
> > > >> > > >> > > >>
> > > >> > > >> > > >> - Pulsar as a streaming source with exactly-once guarantee.
> > > >> > > >> > > >> - Sink streaming results to Pulsar with at-least-once semantics.
> > > >> > > >> > > >> - Build upon Flink new Table API Type system (FLIP-37
> > > >> > > >> > > >>   <https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System>),
> > > >> > > >> > > >>   and can automatically (de)serialize messages with the help of Pulsar
> > > >> > > >> > > >>   schema.
> > > >> > > >> > > >> - Integrate with Flink new Catalog API (FLIP-30
> > > >> > > >> > > >>   <https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs>),
> > > >> > > >> > > >>   which enables the use of Pulsar topics as tables in Table API as well
> > > >> > > >> > > >>   as SQL client.
> > > >> > > >> > > >>
> > > >> > > >> > > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
> > > >> > > >> > > >>
> > > >> > > >> > > >> Would love to hear your thoughts on this.
> > > >> > > >> > > >>
> > > >> > > >> > > >> Best,
> > > >> > > >> > > >> Yijie