My notes from the community meeting. Jerry, Matteo, and I talked about this idea:
* major problems are:
  -- the process of adding new stuff (approval, review, quality
     control/approval stuff)
  -- what to do with stuff that is not under apache umbrella (as in: code
     ownership, not license). ASF possibly does not allow that
  -- conflict resolution with commercial entities pushing their competing
     connectors
* overall it is not a strict no; more like we need to think about potential
  issues and hear other folks

some options that sounded ok were like a repo with apache stuff and an option
to add (like "brew tap") other repos at own risk

On Tue, May 25, 2021 at 1:58 PM Andrey Yegorov <andrey.yego...@datastax.com>
wrote:

> I do agree that hosting of the binaries is not an issue.
> Discovering the binaries is. Proposed repo should not host the binaries
> but the urls/checksums etc. and serve as a simple data storage for the CLI
> to use.
>
> Having multiple parties publishing their connectors is fine but it makes
> it hard for the new user to see the full extent of the ecosystem given
> little motivation for all parties to proactively notify the pulsar
> community outside of their own client base.
>
> I.e. Kafka ecosystem does not list Snowflake connector:
> https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem (Confluent
> Hub does, but let's look at FOSS)
> Snowflake's docs do:
> https://docs.snowflake.com/en/user-guide/kafka-connector-install.html
>
> Having a registry at the time when Snowflake and others start building
> Pulsar connectors will result in them notifying the community by adding
> their connectors to the registry; the motivation at that time is obvious:
> ease of installation for their customers.
>
>
> On Tue, May 25, 2021 at 11:59 AM Jerry Peng <jerry.boyang.p...@gmail.com>
> wrote:
>
>> Hello Andrey,
>>
>> Thank you for bringing this up! This is definitely an important issue!
>>
>> All of the connector binaries are already hosted on Maven central thus I
>> don't think hosting the binaries is an issue. Perhaps the key problem
>> here is about discovery.
>>
>> My thoughts:
>>
>> 1. We should document clearly on the Apache Pulsar website all the
>> connectors that we offer.
>>
>> https://pulsar.apache.org/docs/en/io-connectors/
>>
>> Seems like we already do that? If not, we should make sure to keep this
>> list up to date. Maybe the list is not visible enough to new users. If
>> so, we should figure out how to advertise the connectors we already have
>> in a better fashion.
>>
>> 2. I do like the idea of having a tool that can perhaps search and
>> install connectors automatically for you. Perhaps this is a feature we
>> can add to the existing pulsar-admin CLI tool. This feature can search
>> maven for connector binaries and download / install them if instructed
>> by the user.
>>
>> Best,
>>
>> Jerry
>>
>> On Tue, May 25, 2021 at 11:20 AM Andrey Yegorov
>> <andrey.yego...@datastax.com> wrote:
>>
>> > Hello,
>> >
>> > As Pulsar becomes increasingly popular, we will have to deal with a
>> > larger userbase looking to deploy Pulsar in a wider array of use cases,
>> > interfacing with a more diverse set of other components. To help with
>> > this, we should create a plan as a project to help community members
>> > publish and discover connectors beyond what the Pulsar PMC wants to
>> > maintain.
>> >
>> > Current plans include splitting connectors into separate repos (PIP 62)
>> > or moving under the umbrella of the projects they integrate with (as per
>> > conversations during the community meetings). This will definitely help
>> > with the build times but may negatively affect the discoverability of
>> > the connectors and ease of installation.
>> >
>> > I think Pulsar can benefit from a simple package registry that (1)
>> > hosts a list of free to use (apache or other approved license)
>> > connectors/references to the binaries, and (2) provides a CLI (e.g. via
>> > pulsar admin) to simplify discovery, download, and installation of the
>> > connectors for the new users.
>> >
>> > What do you think? Would you find something like this useful?
>> >
>> > The implementation can be as simple as another GitHub repo with a
>> > predefined structure like
>> >
>> > {connector name}/{major version}/metadata
>> >
>> > where metadata contains url to the nar, checksum, range of compatible
>> > pulsar versions, contacts, license, short description, etc.
>> >
>> > Plus the CLI that can search/list compatible connectors,
>> > download/install/update the connector.
>> >
>> > As prior art examples, one can refer to:
>> >
>> > 1. brew (package manager for MacOS)
>> >    1. Formulas/registry: https://github.com/Homebrew/homebrew-core
>> >    2. brew itself https://github.com/Homebrew/brew
>> > 2. Apache Solr
>> >    1. https://solr.apache.org/guide/8_8/package-manager.html.
>> >
>> > There is also the Helm chart repository from our cousins at the CNCF
>> > over at the Artifact Hub <https://artifacthub.io/>.
>> >
>> > I believe such a registry managed by the PMC will reduce the risk of
>> > fragmentation of the ecosystem, improve discoverability, and allow
>> > simple detection of not-up-to-date connectors for the new releases of
>> > Pulsar.
>> >
>> > --
>> > Andrey Yegorov
>> >
>
> --
> Andrey Yegorov
>

--
Andrey Yegorov
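
For illustration only, here is a rough sketch of how a CLI might consume the registry layout proposed in the original message ({connector name}/{major version}/metadata). The thread names the kinds of fields a metadata record would hold (url to the nar, checksum, range of compatible pulsar versions, license) but not a format, so every field name below, the use of SHA-256, the inclusive min/max version range, and the example connectors are assumptions made for this sketch, not anything that was agreed on.

```python
import hashlib

# Stand-in for the bytes of a downloaded .nar file (hypothetical).
payload = b"stand-in for the bytes of a downloaded .nar file"

# Hypothetical registry keyed by "{connector name}/{major version}".
# Field names are assumptions; the thread only lists the kinds of fields.
registry = {
    "example-sink/1": {
        "url": "https://example.invalid/example-sink-1.0.0.nar",
        "sha256": hashlib.sha256(payload).hexdigest(),
        "pulsar_min": "2.7.0",
        "pulsar_max": "2.8.99",
        "license": "Apache-2.0",
    },
    "example-source/2": {
        "url": "https://example.invalid/example-source-2.0.0.nar",
        "sha256": hashlib.sha256(b"other bytes").hexdigest(),
        "pulsar_min": "2.9.0",
        "pulsar_max": "2.10.99",
        "license": "Apache-2.0",
    },
}


def parse_version(text):
    """Turn '2.10.0' into (2, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def is_compatible(metadata, pulsar_version):
    """Check the Pulsar version against the record's inclusive range."""
    low = parse_version(metadata["pulsar_min"])
    high = parse_version(metadata["pulsar_max"])
    return low <= parse_version(pulsar_version) <= high


def list_compatible(entries, pulsar_version):
    """Return the sorted registry keys usable with the given Pulsar version."""
    return sorted(
        key for key, metadata in entries.items()
        if is_compatible(metadata, pulsar_version)
    )


def checksum_matches(metadata, data):
    """Compare the SHA-256 of the downloaded bytes with the recorded checksum."""
    return hashlib.sha256(data).hexdigest() == metadata["sha256"]
```

A "search/list compatible connectors" command would then amount to something like list_compatible(registry, "2.8.0"), and an install step would call checksum_matches on the downloaded bytes before placing the nar anywhere. Additional repos added at the user's own risk (the "brew tap" option from the notes) could simply be merged into the same dictionary.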