Hi everyone,

I've put the catalog part of the design in a separate doc with more details for easier communication.
https://docs.google.com/document/d/1LMnABtXn-wQedsmWv8hopvx-B-jbdr8-jHbIiDhdsoE/edit?usp=sharing

I would love to hear your thoughts on this.

Best,
Yijie

On Mon, Oct 21, 2019 at 11:15 AM Yijie Shen <henry.yijies...@gmail.com> wrote:

> Hi everyone,
>
> Glad to receive your valuable feedback.
>
> I'd first separate the Pulsar catalog into another doc and show more design
> and implementation details there.
>
> For the current FLIP-72, I would separate it into the sink part as
> current work and keep the source part as future work until FLIP-27 is
> finalized.
>
> I have also replied to some of the comments in the design doc. I will rewrite
> the catalog part with regard to Bowen's advice in both email and comments.
>
> Thanks for the help again.
>
> Best,
> Yijie
>
> On Fri, Oct 18, 2019 at 12:40 AM Rong Rong <walter...@gmail.com> wrote:
>
>> Hi Yijie,
>>
>> I also agree with Jark on separating the Catalog part into another FLIP.
>>
>> With FLIP-27[1] also in the air, it is also probably great to split and
>> unblock the sink implementation contribution.
>> I would suggest either putting a detailed implementation plan section in
>> the doc, or (maybe too much separation?) splitting them into different
>> FLIPs. What do you guys think?
>>
>> --
>> Rong
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
>>
>> On Wed, Oct 16, 2019 at 9:00 PM Jark Wu <imj...@gmail.com> wrote:
>>
>> > Hi Yijie,
>> >
>> > Thanks for the design document. I agree with Bowen that the catalog part
>> > needs more details.
>> > And I would suggest separating the Pulsar Catalog into another FLIP. IMO,
>> > it has little to do with source/sink.
>> > Having a separate FLIP can unblock the contribution for sink (or source)
>> > and keep the discussion more focused.
>> > I also left some comments in the documentation.
>> >
>> > Thanks,
>> > Jark
>> >
>> > On Thu, 17 Oct 2019 at 11:24, Yijie Shen <henry.yijies...@gmail.com>
>> > wrote:
>> >
>> > > Hi Bowen,
>> > >
>> > > Thanks for your comments. I'll add catalog details as you suggested.
>> > >
>> > > One more question: since we have decided not to implement the source
>> > > part of the connector at the moment,
>> > > what can users do with a Pulsar catalog?
>> > > Create a table backed by Pulsar and check existing Pulsar tables to see
>> > > their schemas? Drop tables maybe?
>> > >
>> > > Best,
>> > > Yijie
>> > >
>> > > On Thu, Oct 17, 2019 at 1:04 AM Bowen Li <bowenl...@gmail.com> wrote:
>> > >
>> > > > Hi Yijie,
>> > > >
>> > > > Per the discussion, maybe you can move pulsar source to the 'future
>> > > > work' section in the FLIP for now?
>> > > >
>> > > > Besides, the FLIP seems to be quite rough at the moment, and I'd
>> > > > recommend adding more details.
>> > > >
>> > > > A few questions mainly regarding the proposed pulsar catalog.
>> > > >
>> > > > - Can you provide some background of pulsar schema registry and how
>> > > >   it works?
>> > > > - The proposed design of pulsar catalog is very vague now, can you
>> > > >   share some details of how a pulsar catalog would work internally?
>> > > >   E.g.
>> > > >    - which APIs does it support exactly? E.g. I see from your
>> > > >      prototype that table creation is supported but not alteration.
>> > > >    - is it going to connect to a pulsar schema registry via a http
>> > > >      client or a pulsar client, etc
>> > > >    - will it be able to handle multiple versions of pulsar, or just
>> > > >      one? How is compatibility handled between different Flink-Pulsar
>> > > >      versions?
>> > > >    - will it support only reading from pulsar schema registry, or
>> > > >      both read/write? Will it work end-to-end in Flink SQL for users
>> > > >      to create and manipulate a pulsar table such as "CREATE TABLE t
>> > > >      WITH PROPERTIES(type=pulsar)" and "DROP TABLE t"?
>> > > > - Is a pulsar topic always gonna be a non-partitioned table? How is
>> > > >   a partitioned topic mapped to a Flink table?
>> > > > - How to map Flink's catalog/database namespace to pulsar's
>> > > >   multi-tenant namespaces? I'm not very familiar with how
>> > > >   multi-tenancy works in pulsar, and some background context/use
>> > > >   cases may help here too. E.g.
>> > > >    - can a pulsar client/consumer/producer be multi-tenant at the
>> > > >      same time?
>> > > >    - how does authentication work in pulsar's multi-tenancy and the
>> > > >      catalog? asking since I didn't see the proposed pulsar catalog
>> > > >      has username/password configs
>> > > >    - the FLIP seems to propose mapping a pulsar cluster and
>> > > >      'tenant/namespace' respectively to Flink's 'catalog' and
>> > > >      'database'. I wonder whether it totally makes sense, or should
>> > > >      we actually map "tenant" to "catalog", and "namespace" to
>> > > >      "database"?
>> > > >
>> > > > Cheers,
>> > > > Bowen
>> > > >
>> > > > On Fri, Sep 20, 2019 at 1:16 AM Yijie Shen <henry.yijies...@gmail.com>
>> > > > wrote:
>> > > >
>> > > >> Hi everyone,
>> > > >>
>> > > >> Per discussion in the previous thread
>> > > >> <http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Contribute-Pulsar-Flink-connector-back-to-Flink-tc32538.html>,
>> > > >> I have created FLIP-72 to kick off a more detailed discussion on the
>> > > >> Flink Pulsar connector:
>> > > >>
>> > > >> https://cwiki.apache.org/confluence/display/FLINK/FLIP-72%3A+Introduce+Pulsar+Connector
>> > > >>
>> > > >> In short, the connector has the following features:
>> > > >>
>> > > >>    - Pulsar as a streaming source with exactly-once guarantee.
>> > > >>    - Sink streaming results to Pulsar with at-least-once semantics.
>> > > >>    - Build upon Flink's new Table API type system (FLIP-37
>> > > >>      <https://cwiki.apache.org/confluence/display/FLINK/FLIP-37%3A+Rework+of+the+Table+API+Type+System>),
>> > > >>      and can automatically (de)serialize messages with the help of
>> > > >>      Pulsar schema.
>> > > >>    - Integrate with Flink's new Catalog API (FLIP-30
>> > > >>      <https://cwiki.apache.org/confluence/display/FLINK/FLIP-30%3A+Unified+Catalog+APIs>),
>> > > >>      which enables the use of Pulsar topics as tables in Table API as
>> > > >>      well as SQL client.
>> > > >>
>> > > >> https://docs.google.com/document/d/1rES79eKhkJxrRfQp1b3u8LB2aPaq-6JaDHDPJIA8kMY/edit#heading=h.28v5v23yeq1u
>> > > >>
>> > > >> Would love to hear your thoughts on this.
>> > > >>
>> > > >> Best,
>> > > >> Yijie