Could we add a system topic that has exactly one partition per broker?

On Thu, Apr 22, 2021 at 11:22 PM Joe Francis <j...@verizonmedia.com.invalid> wrote:
> To be clear, I would love to have this feature. But I would not use this feature if that means whenever a broker that hosts a "system topic" has a hiccup, it would result in an outage for N other brokers. I run 100+ brokers/million+ topics in a cluster (hence an "audit topic" would be wonderful for all kinds of purposes), and would not want a "system topic" as the single point of failure.
>
> So you have to make this log local to the broker, or sacrifice the reliability of the log (best case log). Local log has its advantages - you can log a lot more about the system itself into it (e.g., security events like failed auth), but you will need to provide an aggregate view for the cluster as a whole from all the brokers.
>
> Joe
>
> On Thu, Apr 22, 2021 at 6:10 AM Joe Francis <j...@verizonmedia.com> wrote:
>
> > Completely disagree that we have accepted this risk with PIP-39. That is different because it is an admin flow. A failure in a namespace policy change does not affect data flow.
> >
> > What you are proposing is in the data path. Topics and subs are created in the data flow path. Failure means outages. PIP-39 is not going to help you there.
> >
> > Joe
> >
> > On Wed, Apr 21, 2021 at 11:10 PM Michael Marshall <mikemars...@gmail.com> wrote:
> >
> >> Hi Joe,
> >>
> >> I agree there is a risk in adding more interdependencies between brokers. I will point out that we have already accepted this risk with the implementation of PIP 39, which propagates namespace policy changes to other brokers using messages sent to a system topic. However, that doesn't necessarily mean we should build more interdependencies between brokers.
> >>
> >> Here is the link to PIP 39:
> >> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_pulsar_wiki_PIP-2D39-253A-2DNamespace-2DChange-2DEvents&d=DwIBaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=ke-uQGYh--_pwtgn13szgq-axZcRVTJXoSurefbZEk4&m=ACSaHFP9BC5MVQZZWMp0KvSJnvwUr4Jvd08xKKbQWBI&s=G_K7a-seNfGGb-Z4Wy0Q5iMrbdL2j9WCoMUWfwUH5RY&e=
> >>
> >> I will look into the implementation of PIP 39 to better understand its design, as I think it will likely influence this feature's design.
> >>
> >> Thanks,
> >> Michael
> >>
> >> On Wed, Apr 21, 2021 at 5:50 PM Joe F <joefranc...@gmail.com> wrote:
> >>
> >> > I would be very careful about implementing such a feature, because of introducing undesirable interdependencies. Broker processes only talk to the metadata store or data store. This keeps brokers isolated from each other - one broker is not dependent on the functioning of another broker.
> >> >
> >> > A broker publishing to a topic hosted on another broker (which, for example, is serving a "system topic") sets up an undesirable dependency, which reduces total system resiliency and availability for the cluster. These are better implemented as notifications off the metadata changes.
> >> >
> >> > Good feature, but needs careful thought to do it right
> >> > Joe
> >> >
> >> > On Wed, Apr 21, 2021 at 4:03 PM Michael Marshall <mikemars...@gmail.com> wrote:
> >> >
> >> > > Thanks for your response, PengHui.
> >> > >
> >> > > I think this feature would be useful to end users for cluster management, which is why I want to contribute a first class feature instead of writing my own plugin that would add little value to the community.
> >> > >
> >> > > > With the broker interceptor you can intercept all the REST API request and response, Pulsar commands between the broker and clients.
> >> > >
> >> > > Based on looking through the interceptor trait, I don't see a way to trigger events based on auto created/deleted topics. For example, when a producer connects to a broker for a nonexistent topic (assuming auto topic creation is allowed), a managed ledger, and thus a topic, is created without ever interacting with that interceptor trait. The same appears to be true for garbage collected topics. I think we'll need more than this interceptor to properly capture all cases where topics are created or deleted.
> >> > >
> >> > > Regarding my reference to potential further work, it does appear that low level auditing of connections and Pulsar commands could be covered by the interceptor. However, it would still be on the end user to implement such functionality.
> >> > >
> >> > > Thanks,
> >> > > Michael
> >> > >
> >> > > On Wed, Apr 21, 2021 at 3:51 AM PengHui Li <codelipeng...@gmail.com> wrote:
> >> > >
> >> > > > Hi Michael,
> >> > > >
> >> > > > Currently, Pulsar supports a pluggable Broker Interceptor; you can find it here:
> >> > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_apache_pulsar_blob_6704f12104219611164aa2bb5bbdfc929613f1bf_pulsar-2Dbroker_src_main_java_org_apache_pulsar_broker_intercept_BrokerInterceptor.java&d=DwIBaQ&c=sWW_bEwW_mLyN3Kx2v57Q8e-CRbmiT9yOhqES_g_wVY&r=ke-uQGYh--_pwtgn13szgq-axZcRVTJXoSurefbZEk4&m=ACSaHFP9BC5MVQZZWMp0KvSJnvwUr4Jvd08xKKbQWBI&s=6Li1guS8lImjrxPo9A0nnQAmDMnYEKHlAGqlVYvB8Ug&e=
> >> > > >
> >> > > > With the broker interceptor you can intercept all the REST API request and response, Pulsar commands between the broker and clients. This can be used to audit the system events.
> >> > > >
> >> > > > Thanks,
> >> > > > Penghui
> >> > > >
> >> > > > On Apr 21, 2021, 5:13 AM +0800, Michael Marshall <mikemars...@gmail.com>, wrote:
> >> > > > > Hello all,
> >> > > > >
> >> > > > > I would like to propose adding a new feature to Pulsar that will require a PIP. In addition to feedback on the proposed feature, I am looking for guidance on how to go about creating the PIP. Thanks for any help you can provide.
> >> > > > >
> >> > > > > I would like to add an optional system topic where topic creation and topic deletion events are published. This feature will make it easier to leverage the auto topic creation and inactive topic deletion features by providing a way for users to reactively discover changes to topics. The largest benefit is that users won't need to poll for these updates with an admin client. Instead, they will get them as messages.
> >> > > > >
> >> > > > > I looked to see if an equivalent feature already exists, but I don't see one. For reference, the `PatternMultiTopicsConsumerImpl` currently polls for all topics in a namespace and then does set operations to compute the "new" topics to which it should subscribe. This client implementation could possibly leverage the new feature.
> >> > > > >
> >> > > > > There are still details I need to work out, like how it will work for partitioned vs unpartitioned topics and what kind of guarantees we have regarding messaging semantics (I think we'll want at least once message delivery here).
> >> > > > > I plan to include these details in the PIP with discussions about trade offs for different implementations.
> >> > > > >
> >> > > > > Does this feature sound helpful and reasonable to others? If so, is the next step to formally write a proposal in a Google Doc or to put together a doc on the Pulsar GitHub Wiki?
> >> > > > >
> >> > > > > Related and/or future work to consider in this design: I can see adding different system topics for these types of auditable system events. We currently rely on log lines as our primary way for end users to audit system events, e.g. a producer connecting to a broker or a subscription getting created, but we could instead have topics that represent streams of these different kinds of events. A persistent topic could make these audit events more durable and more structured which should lend themselves to being more easily analyzed. Further, users could choose to turn on/off these audit events, perhaps at the broker or namespace level, to fit their own needs.
> >> > > > >
> >> > > > > Let me know what you think and how I should proceed.
> >> > > > >
> >> > > > > Regards,
> >> > > > > Michael Marshall

--
Jonathan Ellis
co-founder, http://www.datastax.com
@spyced