codelipenghui commented on code in PR #25242:
URL: https://github.com/apache/pulsar/pull/25242#discussion_r2891549704
##########
pip/pip-455.md:
##########
@@ -0,0 +1,125 @@
+
+---
+
+# PIP-455: Support Namespace Bundle Lookup and Topic Preloading
+
+
+---
+
+## Background Knowledge
+
+Apache Pulsar uses **namespace bundles** as the unit of ownership and load
balancing.
+
+Key concepts:
+
+- **Namespace Bundle**: A subdivision of a namespace's hash space,
representing a set of topics whose names hash into that range.
+- **Bundle Ownership**: At any given time, each bundle is owned by exactly one
broker, which is responsible for serving all topics within that bundle.
+- **Lazy Topic Loading**: By default, topics are not loaded into memory until
the first producer/consumer request arrives. This reduces startup overhead but
increases first-call latency.
+- **PulsarAdmin & pulsar-admin CLI**: The administrative interface for
managing Pulsar clusters, including operations on namespaces, topics, bundles,
etc.
+
+Currently, there is **no API to proactively look up a namespace bundle or load
all topics within a bundle**. This forces users to trigger topic creation via
producer/consumer requests, which is not suitable for:
+- Warm-up scenarios (preloading topics before traffic arrives)
+- Disaster recovery (forcing bundle ownership transfer and topic loading)
+- Observability (checking which broker actually owns a bundle)
+
+---
+
+## Motivation
+
+The current implementation of namespace and bundle management lacks support
for **explicit lookup and preloading**. This leads to several pain points:
+
+1. **No way to warm up topics**
+ In production, after a broker restart or bundle unload, topics are loaded
lazily. The first request experiences high latency due to topic metadata
loading, cursor recovery, and ownership establishment. There is no API to
proactively load topics in a bundle to avoid this cold-start penalty. Some use
cases (e.g., migration validation, pre-warming for large-scale events) require
loading all topics in a namespace.
+
+2. **Difficult to verify bundle ownership**
+ While internal lookup mechanisms exist, there is no admin-facing API to
query the owner of a specific bundle and force-load it onto the current broker.
This makes operational debugging and manual intervention cumbersome.
+
+3. **Client-Admin API inconsistency**
+ The `pulsar-admin` CLI provides `unload`, `split`, `clear-backlog` for
bundles, but no `load` or `lookup` counterpart. This asymmetry complicates
operational tooling.
+
+4. **Dependency cycle in the codebase**
+ The `LookupData` class resides in `pulsar-common`, but
`pulsar-client-admin-api` cannot depend on it directly. This forced a
workaround via a new interface to avoid cyclic dependencies.
+
+This proposal introduces **bundle-level and namespace-level lookup + load**
APIs, enabling operators to proactively control bundle ownership and topic
lifecycle.
+
+---
+
+## Goals
+
+### In Scope
+
+- Provide a new admin API to **look up a namespace bundle**, returning the
broker serving it (same as topic lookup but at bundle granularity).
+- Provide a new admin API to **load all topics in a namespace bundle** onto
the owning broker.
+- Provide a new admin API to **load all topics in a namespace** (by iterating
over its bundles).
+- Extend the `pulsar-admin namespaces` CLI with `lookup` and `lookup-bundle`
commands.
+- Introduce `LookupDataInterface` to break the cyclic dependency between
`pulsar-common` and `pulsar-client-admin-api`.
+
+
+
+## High-Level Design
+
+The core idea is to extend the existing `Namespaces` admin resource to support
**lookup operations at both namespace and bundle granularity**, with an
optional flag to trigger topic loading.
+
+### 1. New REST Endpoints
+
+**V2 :**
+
+```
+PUT /admin/v2/namespaces/{tenant}/{namespace}/lookup
Review Comment:
Could you please also define the response format of this API? Will it be a
map from each bundle to the service URL of the broker serving it?
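
One possible shape, as a sketch for discussion only: a JSON object keyed by bundle range, with the owning broker's service URL as the value. The PIP does not define this yet, so the key format, the broker URLs, and the helper `bundles_by_broker` below are all assumptions made up for illustration.

```python
import json

# Hypothetical response body for
#   PUT /admin/v2/namespaces/{tenant}/{namespace}/lookup
# One entry per bundle range -> service URL of the owning broker.
# The ranges and broker URLs are sample values, not real output.
sample_response = {
    "0x00000000_0x40000000": "pulsar://broker-1.example.com:6650",
    "0x40000000_0x80000000": "pulsar://broker-2.example.com:6650",
    "0x80000000_0xc0000000": "pulsar://broker-1.example.com:6650",
    "0xc0000000_0xffffffff": "pulsar://broker-2.example.com:6650",
}


def bundles_by_broker(response: dict[str, str]) -> dict[str, list[str]]:
    """Group bundle ranges by the broker URL that serves them."""
    grouped: dict[str, list[str]] = {}
    for bundle, url in sorted(response.items()):
        grouped.setdefault(url, []).append(bundle)
    return grouped


if __name__ == "__main__":
    # Show which bundles each broker would own under this sample response.
    print(json.dumps(bundles_by_broker(sample_response), indent=2))
```

A flat map like this would let tooling invert it per broker, as the helper does, without a second call per bundle.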
##########
pip/pip-455.md:
##########
@@ -0,0 +1,125 @@
+### 1. New REST Endpoints
+
+**V2 :**
+
+```
+PUT /admin/v2/namespaces/{tenant}/{namespace}/lookup
+PUT /admin/v2/namespaces/{tenant}/{namespace}/{bundle}/lookup
+```
+
+**V1 :**
+
+```
+PUT /admin/namespaces/{property}/{cluster}/{namespace}/lookup
Review Comment:
We don't need to support the V1 API, since we are working on removing the V1
endpoints from the repo.