Could we add a system topic that has exactly one partition per broker?

On Thu, Apr 22, 2021 at 11:22 PM Joe Francis <j...@verizonmedia.com.invalid>
wrote:

> To be clear, I would love to have this feature. But I would not use this
> feature if it means that whenever a broker that hosts a "system topic" has a
> hiccup, it would result in an outage for N other brokers. I run 100+
> brokers/million+ topics in a cluster (hence an "audit topic" would be
> wonderful for all kinds of purposes), and would not want a "system topic"
> as a single point of failure.
>
> So you have to make this log local to the broker, or sacrifice the
> reliability of the log (best case log). A local log has its advantages: you
> can log a lot more about the system itself into it (e.g., security events
> like failed auth), but you will need to provide an aggregate view for
> the cluster as a whole from all the brokers.
>
> Joe
>
>
>
>
> On Thu, Apr 22, 2021 at 6:10 AM Joe Francis <j...@verizonmedia.com> wrote:
>
> > Completely disagree that we have accepted this risk with PIP-39. That is
> > different because it is an admin flow. A failure in a namespace policy
> > change does not affect data flow.
> >
> > What you are proposing is in the data path. Topics and subs are
> > created in the data flow path. Failure means outages. PIP-39 is not going
> > to help you there.
> >
> > Joe
> >
> > On Wed, Apr 21, 2021 at 11:10 PM Michael Marshall <mikemars...@gmail.com>
> > wrote:
> >
> >> Hi Joe,
> >>
> >> I agree there is a risk in adding more interdependencies between brokers.
> >> I will point out that we have already accepted this risk with the
> >> implementation of PIP 39, which propagates namespace policy changes to
> >> other brokers using messages sent to a system topic. However, that doesn't
> >> necessarily mean we should build more interdependencies between brokers.
> >>
> >> Here is the link to PIP 39:
> >>
> >>
> >> https://github.com/apache/pulsar/wiki/PIP-39%3A-Namespace-Change-Events
> >>
> >> I will look into the implementation of PIP 39 to better understand its
> >> design, as I think it will likely influence this feature's design.
> >>
> >> Thanks,
> >> Michael
> >>
> >> On Wed, Apr 21, 2021 at 5:50 PM Joe F <joefranc...@gmail.com> wrote:
> >>
> >> > I would be very careful about implementing such a feature, because it
> >> > introduces undesirable interdependencies. Broker processes only talk to
> >> > the metadata store or data store. This keeps brokers isolated from each
> >> > other: one broker is not dependent on the functioning of another broker.
> >> >
> >> > A broker publishing to a topic hosted on another broker (e.g., one that is
> >> > serving a "system topic") sets up an undesirable dependency, which reduces
> >> > total system resiliency and availability for the cluster. These are better
> >> > implemented as notifications off the metadata changes.
> >> >
> >> > Good feature, but needs careful thought to do it right
> >> > Joe
> >> >
> >> > On Wed, Apr 21, 2021 at 4:03 PM Michael Marshall <mikemars...@gmail.com>
> >> > wrote:
> >> >
> >> > > Thanks for your response, PengHui.
> >> > >
> >> > > I think this feature would be useful to end users for cluster management,
> >> > > which is why I want to contribute a first class feature instead of writing
> >> > > my own plugin that would add little value to the community.
> >> > >
> >> > > > With the broker interceptor you can intercept all the REST API request
> >> > > > and response, Pulsar commands between the broker and clients.
> >> > >
> >> > > Based on looking through the interceptor trait, I don't see a way to
> >> > > trigger events based on auto created/deleted topics. For example, when a
> >> > > producer connects to a broker for a nonexistent topic (assuming auto topic
> >> > > creation is allowed), a managed ledger, and thus a topic, is created
> >> > > without ever interacting with that interceptor trait. The same appears to
> >> > > be true for garbage collected topics. I think we'll need more than this
> >> > > interceptor to properly capture all cases where topics are created or
> >> > > deleted.
> >> > >
> >> > > Regarding my reference to potential further work, it does appear that low
> >> > > level auditing of connections and pulsar commands could be covered by the
> >> > > interceptor. However, it would still be on the end user to implement such
> >> > > functionality.
> >> > >
> >> > > Thanks,
> >> > > Michael
> >> > >
> >> > >
> >> > > On Wed, Apr 21, 2021 at 3:51 AM PengHui Li <codelipeng...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi Michael,
> >> > > >
> >> > > > Currently, Pulsar supports a pluggable Broker Interceptor; you can find
> >> > > > it here:
> >> > > >
> >> > > > https://github.com/apache/pulsar/blob/6704f12104219611164aa2bb5bbdfc929613f1bf/pulsar-broker/src/main/java/org/apache/pulsar/broker/intercept/BrokerInterceptor.java
> >> > > >
> >> > > > With the broker interceptor you can intercept all the REST API requests
> >> > > > and responses, as well as Pulsar commands between the broker and clients.
> >> > > > This can be used to audit the system events.
> >> > > >
> >> > > > Thanks,
> >> > > > Penghui
> >> > > > On Apr 21, 2021, 5:13 AM +0800, Michael Marshall <mikemars...@gmail.com>,
> >> > > > wrote:
> >> > > > > Hello all,
> >> > > > >
> >> > > > > I would like to propose adding a new feature to Pulsar that will require a
> >> > > > > PIP. In addition to feedback on the proposed feature, I am looking for
> >> > > > > guidance on how to go about creating the PIP. Thanks for any help you can
> >> > > > > provide.
> >> > > > >
> >> > > > > I would like to add an optional system topic where topic creation and topic
> >> > > > > deletion events are published. This feature will make it easier to leverage
> >> > > > > the auto topic creation and inactive topic deletion features by providing a
> >> > > > > way for users to reactively discover changes to topics. The largest benefit
> >> > > > > is that users won't need to poll for these updates with an admin client.
> >> > > > > Instead, they will get them as messages.
> >> > > > >
> >> > > > > I looked to see if an equivalent feature already exists, but I don't see
> >> > > > > one. For reference, the `PatternMultiTopicsConsumerImpl` currently polls
> >> > > > > for all topics in a namespace and then does set operations to compute the
> >> > > > > "new" topics to which it should subscribe. This client implementation could
> >> > > > > possibly leverage the new feature.
> >> > > > >
> >> > > > > There are still details I need to work out, like how it will work for
> >> > > > > partitioned vs unpartitioned topics and what kind of guarantees we have
> >> > > > > regarding messaging semantics (I think we'll want at least once message
> >> > > > > delivery here). I plan to include these details in the PIP with discussions
> >> > > > > about trade offs for different implementations.
> >> > > > >
> >> > > > > Does this feature sound helpful and reasonable to others? If so, is the
> >> > > > > next step to formally write a proposal in a Google Doc or to put together a
> >> > > > > doc on the Pulsar GitHub Wiki?
> >> > > > >
> >> > > > > Related and/or future work to consider in this design: I can see adding
> >> > > > > different system topics for these types of auditable system events. We
> >> > > > > currently rely on log lines as our primary way for end users to audit
> >> > > > > system events, e.g. a producer connecting to a broker or a subscription
> >> > > > > getting created, but we could instead have topics that represent streams of
> >> > > > > these different kinds of events. A persistent topic could make these audit
> >> > > > > events more durable and more structured which should lend themselves to
> >> > > > > being more easily analyzed. Further, users could choose to turn on/off
> >> > > > > these audit events, perhaps at the broker or namespace level, to fit their
> >> > > > > own needs.
> >> > > > >
> >> > > > > Let me know what you think and how I should proceed.
> >> > > > >
> >> > > > > Regards,
> >> > > > > Michael Marshall
> >> > > >
> >> > >
> >> >
> >>
> >
>
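
For anyone following the thread: the polling approach mentioned in the
original proposal (list all topics in a namespace, then use set operations
to find the "new" ones) can be sketched roughly as below. This is only an
illustration of the set-difference idea, not the actual
`PatternMultiTopicsConsumerImpl` code. The class name `TopicDiff`, the
`added`/`removed` helpers, and the sample topic names are all hypothetical,
and the two input sets stand in for the results of two consecutive polls.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper, not part of Pulsar: compares two snapshots of a
// namespace's topic list, as a polling client might do between polls.
public class TopicDiff {

    // Topics present in the latest poll but absent from the previous one.
    static Set<String> added(Set<String> previous, Set<String> current) {
        Set<String> result = new TreeSet<>(current);
        result.removeAll(previous);
        return result;
    }

    // Topics present in the previous poll but absent from the latest one.
    static Set<String> removed(Set<String> previous, Set<String> current) {
        Set<String> result = new TreeSet<>(previous);
        result.removeAll(current);
        return result;
    }

    public static void main(String[] args) {
        // Made-up topic names standing in for two consecutive admin polls.
        Set<String> previous = Set.of(
                "persistent://tenant/ns/a",
                "persistent://tenant/ns/b");
        Set<String> current = Set.of(
                "persistent://tenant/ns/b",
                "persistent://tenant/ns/c");

        System.out.println("subscribe to: " + added(previous, current));
        System.out.println("unsubscribe from: " + removed(previous, current));
    }
}
```

With the proposed system topic, a client would presumably receive these
creation and deletion events as messages instead of recomputing the
difference on every poll.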


-- 
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced
