Hi Bowen, I've done updated the design doc, PTAL. Btw the PR for catalog is https://github.com/apache/flink/pull/10455, could you please take a look?
Best, Yijie On Mon, Dec 9, 2019 at 8:44 AM Bowen Li <bowenl...@gmail.com> wrote: > Hi Yijie, > > I took a look at the design doc. LGTM overall, left a few questions. > > On Tue, Dec 3, 2019 at 10:39 PM Becket Qin <becket....@gmail.com> wrote: > > > Yes, you are absolutely right. Cannot believe I posted in the wrong > > thread... > > > > On Wed, Dec 4, 2019 at 1:46 PM Jark Wu <imj...@gmail.com> wrote: > > > >> Thanks Becket the the updating, > >> > >> But shouldn't this message be posted in FLIP-27 discussion thread[1]? > >> > >> > >> Best, > >> Jark > >> > >> [1]: > >> > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-27-Refactor-Source-Interface-td24952.html > >> > >> On Wed, 4 Dec 2019 at 12:12, Becket Qin <becket....@gmail.com> wrote: > >> > >> > Hi all, > >> > > >> > Sorry for the long belated update. I have updated FLIP-27 wiki page > with > >> > the latest proposals. Some noticeable changes include: > >> > 1. A new generic communication mechanism between SplitEnumerator and > >> > SourceReader. > >> > 2. Some detail API method signature changes. > >> > > >> > We left a few things out of this FLIP and will address them in > separate > >> > FLIPs. Including: > >> > 1. Per split event time. > >> > 2. Event time alignment. > >> > 3. Fine grained failover for SplitEnumerator failure. > >> > > >> > Please let us know if you have any question. > >> > > >> > Thanks, > >> > > >> > Jiangjie (Becket) Qin > >> > > >> > On Tue, Nov 19, 2019 at 10:28 AM Yijie Shen < > henry.yijies...@gmail.com> > >> > wrote: > >> > > >> > > Hi everyone, > >> > > > >> > > I've put the catalog part design in separate doc with more details > for > >> > > easier communication. > >> > > > >> > > > >> > > > >> > > >> > https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing > >> > > > >> > > I would love to hear your thoughts on this. > >> > > > >> > > Best, > >> > > Yijie > >> > > > >> > > On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen < > >> henry.yijies...@gmail.com> > >> > > wrote: > >> > > > >> > > > Hi everyone, > >> > > > > >> > > > Glad to receive your valuable feedbacks. > >> > > > > >> > > > I'd first separate the Pulsar catalog as another doc and show more > >> > design > >> > > > and implementation details there. > >> > > > > >> > > > For the current FLIP-72, I would separate it into the sink part > for > >> > > > current work and keep the source part as future works until we > reach > >> > > > FLIP-27 finals. > >> > > > > >> > > > I also reply to some of the comments in the design doc. I will > >> rewrite > >> > > the > >> > > > catalog part in regarding to Bowen's advice in both email and > >> comments. > >> > > > > >> > > > Thanks for the help again. > >> > > > > >> > > > Best, > >> > > > Yijie > >> > > > > >> > > > On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <walter...@gmail.com> > >> > wrote: > >> > > > > >> > > >> Hi Yijie, > >> > > >> > >> > > >> I also agree with Jark on separating the Catalog part into > another > >> > FLIP. > >> > > >> > >> > > >> With FLIP-27[1] also in the air, it is also probably great to > split > >> > and > >> > > >> unblock the sink implementation contribution. > >> > > >> I would suggest either putting in a detail implementation plan > >> section > >> > > in > >> > > >> the doc, or (maybe too much separation?) splitting them into > >> different > >> > > >> FLIPs. What do you guys think? > >> > > >> > >> > > >> -- > >> > > >> Rong > >> > > >> > >> > > >> [1] > >> > > >> > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface > >> > > >> > >> > > >> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <imj...@gmail.com> > wrote: > >> > > >> > >> > > >> > Hi Yijie, > >> > > >> > > >> > > >> > Thanks for the design document. I agree with Bowen that the > >> catalog > >> > > part > >> > > >> > needs more details. > >> > > >> > And I would suggest to separate Pulsar Catalog as another FLIP. > >> IMO, > >> > > it > >> > > >> has > >> > > >> > little to do with source/sink. > >> > > >> > Having a separate FLIP can unblock the contribution for sink > (or > >> > > source) > >> > > >> > and keep the discussion more focus. > >> > > >> > I also left some comments in the documentation. > >> > > >> > > >> > > >> > Thanks, > >> > > >> > Jark > >> > > >> > > >> > > >> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen < > >> henry.yijies...@gmail.com > >> > > > >> > > >> > wrote: > >> > > >> > > >> > > >> > > Hi Bowen, > >> > > >> > > > >> > > >> > > Thanks for your comments. I'll add catalog details as you > >> > suggested. > >> > > >> > > > >> > > >> > > One more question: since we decide to not implement source > >> part of > >> > > the > >> > > >> > > connector at the moment. > >> > > >> > > What can users do with a Pulsar catalog? > >> > > >> > > Create a table backed by Pulsar and check existing pulsar > >> tables > >> > to > >> > > >> see > >> > > >> > > their schemas? Drop tables maybe? > >> > > >> > > > >> > > >> > > Best, > >> > > >> > > Yijie > >> > > >> > > > >> > > >> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li < > bowenl...@gmail.com> > >> > > wrote: > >> > > >> > > > >> > > >> > > > Hi Yijie, > >> > > >> > > > > >> > > >> > > > Per the discussion, maybe you can move pulsar source to > >> 'future > >> > > >> work' > >> > > >> > > > section in the FLIP for now? > >> > > >> > > > > >> > > >> > > > Besides, the FLIP seems to be quite rough at the moment, > and > >> I'd > >> > > >> > > recommend > >> > > >> > > > to add more details . > >> > > >> > > > > >> > > >> > > > A few questions mainly regarding the proposed pulsar > catalog. > >> > > >> > > > > >> > > >> > > > - Can you provide some background of pulsar schema > >> registry > >> > and > >> > > >> how > >> > > >> > it > >> > > >> > > > works? > >> > > >> > > > - The proposed design of pulsar catalog is very vague > now, > >> > can > >> > > >> you > >> > > >> > > > share some details of how a pulsar catalog would work > >> > > internally? > >> > > >> > E.g. > >> > > >> > > > - which APIs does it support exactly? E.g. I see from > >> your > >> > > >> > > > prototype that table creation is supported but not > >> > > alteration. > >> > > >> > > > - is it going to connect to a pulsar schema registry > >> via a > >> > > >> http > >> > > >> > > > client or a pulsar client, etc > >> > > >> > > > - will it be able to handle multiple versions of > >> pulsar, > >> > or > >> > > >> just > >> > > >> > > > one? How is compatibility handles between different > >> > > >> Flink-Pulsar > >> > > >> > > versions? > >> > > >> > > > - will it support only reading from pulsar schema > >> > registry , > >> > > >> or > >> > > >> > > > both read/write? Will it work end-to-end in Flink SQL > >> for > >> > > >> users > >> > > >> > to > >> > > >> > > create > >> > > >> > > > and manipulate a pulsar table such as "CREATE TABLE t > >> WITH > >> > > >> > > > PROPERTIES(type=pulsar)" and "DROP TABLE t"? > >> > > >> > > > - Is a pulsar topic always gonna be a non-partitioned > >> > table? > >> > > >> How > >> > > >> > is > >> > > >> > > > a partitioned topic mapped to a Flink table? > >> > > >> > > > - How to map Flink's catalog/database namespace to > >> pulsar's > >> > > >> > > > multi-tenant namespaces? I'm not very familiar with how > >> multi > >> > > >> > tenancy > >> > > >> > > works > >> > > >> > > > in pulsar, and some background context/use cases may > help > >> > here > >> > > >> too. > >> > > >> > > E.g. > >> > > >> > > > - can a pulsar client/consumer/producer be > >> multiple-tenant > >> > > at > >> > > >> the > >> > > >> > > > same time? > >> > > >> > > > - how does authentication work in pulsar's > >> multi-tenancy > >> > and > >> > > >> the > >> > > >> > > > catalog? asking since I didn't see the proposed > pulsar > >> > > catalog > >> > > >> > has > >> > > >> > > > username/password configs > >> > > >> > > > - the FLIP seems propose mapping a pulsar cluster and > >> > > >> > > > 'tenant/namespace' respectively to Flink's 'catalog' > >> and > >> > > >> > > 'database'. I > >> > > >> > > > wonder whether it totally makes sense, or should we > >> > actually > >> > > >> map > >> > > >> > > "tenant" > >> > > >> > > > to "catalog", and "namespace" to "database"? > >> > > >> > > > > >> > > >> > > > Cheers, > >> > > >> > > > Bowen > >> > > >> > > > > >> > > >> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen < > >> > > >> henry.yijies...@gmail.com> > >> > > >> > > > wrote: > >> > > >> > > > > >> > > >> > > >> Hi everyone, > >> > > >> > > >> > >> > > >> > > >> Per discussion in the previous thread > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html > >> > > >> > > >> >, > >> > > >> > > >> I have created FLIP-72 to kick off a more detailed > >> discussion > >> > on > >> > > >> the > >> > > >> > > Flink > >> > > >> > > >> Pulsar connector: > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector > >> > > >> > > >> > >> > > >> > > >> In short, the connector has the following features: > >> > > >> > > >> > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Pulsar as a streaming source with exactly-once > guarantee. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Sink streaming results to Pulsar with at-least-once > >> > semantics. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Build upon Flink new Table API Type system (FLIP-37 > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System > >> > > >> > > >> > > >> > > >> > > >> ), and can automatically (de)serialize messages with > the > >> > help > >> > > of > >> > > >> > > Pulsar > >> > > >> > > >> schema. > >> > > >> > > >> - > >> > > >> > > >> > >> > > >> > > >> Integrate with Flink new Catalog API (FLIP-30 > >> > > >> > > >> < > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs > >> > > >> > > >> >), > >> > > >> > > >> which enables the use of Pulsar topics as tables in > Table > >> > API > >> > > as > >> > > >> > well > >> > > >> > > >> as > >> > > >> > > >> SQL client. > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > >> > > >> > https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u > >> > > >> > > >> > >> > > >> > > >> > >> > > >> > > >> Would love to here your thoughts on this. > >> > > >> > > >> > >> > > >> > > >> Best, > >> > > >> > > >> Yijie > >> > > >> > > >> > >> > > >> > > > > >> > > >> > > > >> > > >> > > >> > > >> > >> > > > > >> > > > >> > > >> > > >