This is an automated email from the ASF dual-hosted git repository.
lhotari pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/pulsar-site.git
The following commit(s) were added to refs/heads/main by this push:
new 978f44fa741 [improve][doc] Clarify producer name uniqueness and Key_Shared batching requirements (#1093)
978f44fa741 is described below
commit 978f44fa74119511339ad2c1956935fffe1e3f88
Author: kevin-pan-skydio <[email protected]>
AuthorDate: Tue Mar 17 00:17:16 2026 -0700
[improve][doc] Clarify producer name uniqueness and Key_Shared batching requirements (#1093)
---
docs/client-libraries-consumers.md                 |  6 +++
docs/client-libraries-producers.md                 | 43 ++++++++++++++++++++++
docs/concepts-clients.md                           | 18 ++++++---
docs/concepts-messaging.md                         | 10 ++++-
.../version-3.0.x/client-libraries-consumers.md    |  6 +++
versioned_docs/version-3.0.x/concepts-clients.md   |  6 +++
versioned_docs/version-3.0.x/concepts-messaging.md |  2 +-
.../version-4.0.x/client-libraries-consumers.md    |  6 +++
.../version-4.0.x/client-libraries-producers.md    | 43 ++++++++++++++++++++++
versioned_docs/version-4.0.x/concepts-clients.md   | 18 ++++++---
versioned_docs/version-4.0.x/concepts-messaging.md | 10 ++++-
.../version-4.1.x/client-libraries-consumers.md    |  6 +++
.../version-4.1.x/client-libraries-producers.md    | 43 ++++++++++++++++++++++
versioned_docs/version-4.1.x/concepts-clients.md   | 12 ++++--
versioned_docs/version-4.1.x/concepts-messaging.md | 10 ++++-
15 files changed, 220 insertions(+), 19 deletions(-)
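
The Key_Shared notes added by this commit say that the broker routes a whole batch by the key of its first message, so a batch that mixes keys can deliver a key to the wrong consumer. The sketch below is a toy model of that behavior in plain Python, not Pulsar client code: `consumer_for`, `default_batches`, `key_based_batches`, and `deliver` are hypothetical helpers, and the byte-sum hash only stands in for the broker's real key-to-consumer assignment.

```python
from collections import defaultdict


def consumer_for(key, num_consumers):
    # Toy stand-in for the broker's key-to-consumer assignment (assumption).
    return sum(key.encode("utf-8")) % num_consumers


def default_batches(messages, batch_size):
    # Default-style batching: fill batches in arrival order, mixing keys.
    return [messages[i:i + batch_size] for i in range(0, len(messages), batch_size)]


def key_based_batches(messages, batch_size):
    # Key-based batching: every batch holds messages for a single key only.
    per_key = defaultdict(list)
    for key, value in messages:
        per_key[key].append((key, value))
    batches = []
    for group in per_key.values():
        batches.extend(default_batches(group, batch_size))
    return batches


def deliver(batches, num_consumers):
    # Route each whole batch by the key of its first message and record
    # which consumers ended up receiving each key.
    seen = defaultdict(set)
    for batch in batches:
        target = consumer_for(batch[0][0], num_consumers)
        for key, _ in batch:
            seen[key].add(target)
    return seen


def keys_split_across_consumers(seen):
    # Keys that reached more than one consumer violate Key_Shared semantics.
    return sorted(key for key, consumers in seen.items() if len(consumers) > 1)


if __name__ == "__main__":
    msgs = [("a", 1), ("b", 2), ("b", 3), ("a", 4)]
    print(keys_split_across_consumers(deliver(default_batches(msgs, 2), 2)))
    print(keys_split_across_consumers(deliver(key_based_batches(msgs, 2), 2)))
```

In this model, the mixed batches send both keys to two different consumers, while the key-based batches keep each key on a single consumer. That is the property the added notes ask producers to preserve by disabling batching or using key-based batching.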
diff --git a/docs/client-libraries-consumers.md
b/docs/client-libraries-consumers.md
index f6cae3b4b24..b1555ffdf0f 100644
--- a/docs/client-libraries-consumers.md
+++ b/docs/client-libraries-consumers.md
@@ -240,6 +240,12 @@ The `Shared` subscription is different from the
`Exclusive` and `Failover` subsc
This is a new subscription type since 2.4.0 release. Create new consumers and
subscribe with `Key_Shared` subscription type.
+:::note Producer batching requirement
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). Default batching may pack messages with different keys into the same
batch, breaking Key_Shared routing semantics. See [below](#key_shared-batching)
for code examples.
+
+:::
+
````mdx-code-block
<Tabs groupId="lang-choice"
defaultValue="Java"
diff --git a/docs/client-libraries-producers.md
b/docs/client-libraries-producers.md
index 5d7f1078600..dba81e4d901 100644
--- a/docs/client-libraries-producers.md
+++ b/docs/client-libraries-producers.md
@@ -61,6 +61,49 @@ This example shows how to create a producer.
</Tabs>
````
+### Producer naming
+
+Every producer has a name that must be **unique across all Pulsar clusters**.
If you do not explicitly set a name, Pulsar generates a globally unique name
automatically. If you assign a name, the broker enforces that only one producer
with that name can publish on a topic at a time.
+
+You **must** set an explicit producer name when using [message
deduplication](cookbooks-deduplication.md). Even when deduplication is not
required, setting a meaningful producer name is recommended — it makes
debugging significantly easier because the name appears in broker logs, admin
stats, and metrics, letting you quickly trace messages back to the producing
application.
+
+````mdx-code-block
+<Tabs groupId="lang-choice"
+ defaultValue="Java"
+
values={[{"label":"Java","value":"Java"},{"label":"C++","value":"C++"},{"label":"Python","value":"Python"}]}>
+
+ <TabItem value="Java">
+
+ ```java
+ Producer<String> producer = pulsarClient.newProducer(Schema.STRING)
+ .topic("my-topic")
+ .producerName("my-unique-producer-name")
+ .create();
+ ```
+
+ </TabItem>
+
+ <TabItem value="C++">
+
+ ```cpp
+ ProducerConfiguration producerConfig;
+ producerConfig.setProducerName("my-unique-producer-name");
+ Producer producer;
+ Result result = client.createProducer("my-topic", producerConfig, producer);
+ ```
+
+ </TabItem>
+
+ <TabItem value="Python">
+
+ ```python
+ producer = client.create_producer('my-topic',
producer_name='my-unique-producer-name')
+ ```
+
+ </TabItem>
+</Tabs>
+````
+
## Publish messages
Pulsar supports both synchronous and asynchronous publishing of messages in
most clients. In some language-specific clients, such as Node.js and C#, you
can publish messages synchronously based on the asynchronous method using
language-specific mechanisms (like `await`).
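
The producer-naming section added in this hunk requires an explicit producer name for message deduplication, and the same commit explains that Pulsar uses the producer name together with the sequence ID to filter duplicates. The sketch below is a simplified model of why a stable name matters. `ToyBroker` and the producer names are hypothetical, and the model assumes the broker keeps only the highest sequence ID accepted per producer name.

```python
class ToyBroker:
    """Simplified deduplication model keyed by producer name (assumption)."""

    def __init__(self):
        # producer name -> highest sequence ID accepted so far
        self.last_sequence_id = {}
        self.stored = []

    def publish(self, producer_name, sequence_id, payload):
        last = self.last_sequence_id.get(producer_name, -1)
        if sequence_id <= last:
            # Same name and an already-seen sequence ID: drop as a duplicate.
            return False
        self.last_sequence_id[producer_name] = sequence_id
        self.stored.append(payload)
        return True


if __name__ == "__main__":
    broker = ToyBroker()
    broker.publish("orders-writer", 0, "m0")
    broker.publish("orders-writer", 1, "m1")
    # A restarted producer that reuses its name has its resend filtered out.
    print(broker.publish("orders-writer", 1, "m1"))
    # A producer that comes back under a new generated name looks like a new
    # producer, so the same resend is stored a second time.
    print(broker.publish("generated-name-2", 1, "m1"))
```

Under this model a resend is recognized only when the producer reconnects with the same name, which is why an auto-generated name that changes across restarts cannot support deduplication.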
diff --git a/docs/concepts-clients.md b/docs/concepts-clients.md
index 5ba436849d9..c57e06fafb3 100644
--- a/docs/concepts-clients.md
+++ b/docs/concepts-clients.md
@@ -13,12 +13,12 @@ Pulsar client libraries support transparent reconnection
and/or connection failo
Before an application creates a producer/consumer, the Pulsar client library
needs to initiate a setup phase including two steps:
-1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
+1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
The request could reach one of the active brokers which, by looking at the
(cached) Zookeeper metadata knows who is serving the topic or, in case nobody
is serving it, tries to assign it to the least loaded broker.
-2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
-
+2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
+
Within this connection, the client and broker exchange binary commands
from a custom protocol. At this point, the client sends a command to create
producer/consumer to the broker, which will comply after having validated the
authorization policy.
Whenever the TCP connection breaks, the client immediately re-initiates this
setup phase and keeps trying with exponential backoff to re-establish the
producer or consumer until the operation succeeds.
@@ -27,6 +27,12 @@ Whenever the TCP connection breaks, the client immediately
re-initiates this set
A producer is a process that attaches to a topic and publishes messages to a
Pulsar [broker](concepts-architecture-overview.md#broker). The Pulsar broker
processes the messages.
+### Producer naming
+
+Every producer has a name that **must be unique across all Pulsar clusters**.
If you do not explicitly assign a name when creating a producer, Pulsar
automatically generates a globally unique name. If you choose to set a name
explicitly, the broker enforces that only one producer with that name can be
publishing on a topic at any given time — attempting to create a second
producer with the same name on the same topic will fail.
+
+Explicitly naming producers is required when using [message
deduplication](cookbooks-deduplication.md), because Pulsar uses the producer
name together with the sequence ID to identify and filter duplicate messages.
It is also useful for debugging and monitoring, since the producer name appears
in metrics and admin stats.
+
### Send mode
Send mode is a mechanism determining whether producers send messages to
brokers synchronously (sync) or asynchronously (async).
@@ -146,10 +152,10 @@ try {
// Send messages within transaction
producer.newMessage(txn).value("message-1").send();
producer.newMessage(txn).value("message-2").send();
-
- // Acknowledge messages within transaction
+
+ // Acknowledge messages within transaction
consumer.acknowledgeAsync(messageId, txn);
-
+
// Commit transaction
txn.commit().get();
} catch (Exception e) {
diff --git a/docs/concepts-messaging.md b/docs/concepts-messaging.md
index b32db08fd86..53f3cf37429 100644
--- a/docs/concepts-messaging.md
+++ b/docs/concepts-messaging.md
@@ -28,7 +28,7 @@ Messages are the basic "unit" of Pulsar. They're what
producers publish to topic
| Value / data payload | The data carried by the message. All Pulsar messages
contain raw bytes, although message data can also conform to data
[schemas](schema-get-started.md).
|
| Key | The key (string type) of the message. It is a short
name of message key or partition key. Messages are optionally tagged with keys,
which is useful for features like [topic
compaction](concepts-topic-compaction.md).
|
| Properties | An optional key/value map of user-defined properties.
|
-| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, the default name is used.
|
+| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, Pulsar automatically generates a globally
unique name. If you explicitly assign a name, it **must be unique across all
Pulsar clusters**, otherwise the producer will fail to create. The broker
enforces that only one producer with the same name can be publishing on a topic
at any given time. See [Producer naming](concepts-clients.md#producer-naming)
for details. |
| Topic name | The name of the topic that the message is published
to.
|
| Schema version | The version number of the schema that the message is
produced with.
|
| Sequence ID | Each Pulsar message belongs to an ordered sequence on
its topic. The sequence ID of a message is initially assigned by its producer,
indicating its order in that sequence, and can also be customized.<br
/>Sequence ID can be used for message deduplication. If
`brokerDeduplicationEnabled` is set to `true`, the sequence ID of each message
is unique within a producer of a topic (non-partitioned) or a partition. |
@@ -657,6 +657,14 @@ Shared subscriptions do not guarantee message ordering or
support cumulative ack
The Key_Shared subscription type in Pulsar allows multiple consumers to attach
to the same subscription. But different with the Shared type, messages in the
Key_Shared type are delivered in distribution across consumers and messages
with the same key or same ordering key are delivered to only one consumer. No
matter how many times the message is re-delivered, it is delivered to the same
consumer.
+:::note Producer requirements for Key_Shared
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). The default batching strategy may pack messages with different keys into
the same batch, which breaks Key_Shared routing because the broker uses the
first message's key to route the entire batch.
+
+See [Batching for Key_Shared
Subscriptions](#batching-for-key_shared-subscriptions) for details and code
examples.
+
+:::
+

:::note
diff --git a/versioned_docs/version-3.0.x/client-libraries-consumers.md
b/versioned_docs/version-3.0.x/client-libraries-consumers.md
index 39296cbdee1..6ad34a4cb31 100644
--- a/versioned_docs/version-3.0.x/client-libraries-consumers.md
+++ b/versioned_docs/version-3.0.x/client-libraries-consumers.md
@@ -239,6 +239,12 @@ The `Shared` subscription is different from the
`Exclusive` and `Failover` subsc
This is a new subscription type since 2.4.0 release. Create new consumers and
subscribe with `Key_Shared` subscription type.
+:::note Producer batching requirement
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). Default batching may pack messages with different keys into the same
batch, breaking Key_Shared routing semantics. See [below](#key_shared-batching)
for code examples.
+
+:::
+
````mdx-code-block
<Tabs groupId="lang-choice"
defaultValue="Java"
diff --git a/versioned_docs/version-3.0.x/concepts-clients.md
b/versioned_docs/version-3.0.x/concepts-clients.md
index cb82b4a9d31..bae16b088e1 100644
--- a/versioned_docs/version-3.0.x/concepts-clients.md
+++ b/versioned_docs/version-3.0.x/concepts-clients.md
@@ -21,6 +21,12 @@ Whenever the TCP connection breaks, the client immediately
re-initiates this set
A producer is a process that attaches to a topic and publishes messages to a
Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes
the messages.
+### Producer naming
+
+Every producer has a name that **must be unique across all Pulsar clusters**.
If you do not explicitly assign a name when creating a producer, Pulsar
automatically generates a globally unique name. If you choose to set a name
explicitly, the broker enforces that only one producer with that name can be
publishing on a topic at any given time — attempting to create a second
producer with the same name on the same topic will fail.
+
+Explicitly naming producers is required when using [message
deduplication](cookbooks-deduplication.md), because Pulsar uses the producer
name together with the sequence ID to identify and filter duplicate messages.
It is also useful for debugging and monitoring, since the producer name appears
in metrics and admin stats.
+
### Send mode
Producers send messages to brokers synchronously (sync) or asynchronously
(async).
diff --git a/versioned_docs/version-3.0.x/concepts-messaging.md
b/versioned_docs/version-3.0.x/concepts-messaging.md
index 958968eb1bb..ab401891024 100644
--- a/versioned_docs/version-3.0.x/concepts-messaging.md
+++ b/versioned_docs/version-3.0.x/concepts-messaging.md
@@ -27,7 +27,7 @@ Messages are the basic "unit" of Pulsar. The following table
lists the component
| Value / data payload | The data carried by the message. All Pulsar messages
contain raw bytes, although message data can also conform to data
[schemas](schema-get-started.md).
|
| Key | The key (string type) of the message. It is a short
name of message key or partition key. Messages are optionally tagged with keys,
which is useful for features like [topic
compaction](concepts-topic-compaction.md).
|
| Properties | An optional key/value map of user-defined properties.
|
-| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, the default name is used.
|
+| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, Pulsar automatically generates a globally
unique name. If you explicitly assign a name, it **must be unique across all
Pulsar clusters**, otherwise the producer will fail to create. The broker
enforces that only one producer with the same name can be publishing on a topic
at any given time. See [Producer naming](concepts-clients.md#producer-naming)
for details. |
| Topic name | The name of the topic that the message is published
to.
|
| Schema version | The version number of the schema that the message is
produced with.
|
| Sequence ID | Each Pulsar message belongs to an ordered sequence on
its topic. The sequence ID of a message is initially assigned by its producer,
indicating its order in that sequence, and can also be customized.<br
/>Sequence ID can be used for message deduplication. If
`brokerDeduplicationEnabled` is set to `true`, the sequence ID of each message
is unique within a producer of a topic (non-partitioned) or a partition. |
diff --git a/versioned_docs/version-4.0.x/client-libraries-consumers.md
b/versioned_docs/version-4.0.x/client-libraries-consumers.md
index f6cae3b4b24..b1555ffdf0f 100644
--- a/versioned_docs/version-4.0.x/client-libraries-consumers.md
+++ b/versioned_docs/version-4.0.x/client-libraries-consumers.md
@@ -240,6 +240,12 @@ The `Shared` subscription is different from the
`Exclusive` and `Failover` subsc
This is a new subscription type since 2.4.0 release. Create new consumers and
subscribe with `Key_Shared` subscription type.
+:::note Producer batching requirement
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). Default batching may pack messages with different keys into the same
batch, breaking Key_Shared routing semantics. See [below](#key_shared-batching)
for code examples.
+
+:::
+
````mdx-code-block
<Tabs groupId="lang-choice"
defaultValue="Java"
diff --git a/versioned_docs/version-4.0.x/client-libraries-producers.md
b/versioned_docs/version-4.0.x/client-libraries-producers.md
index 5d7f1078600..dba81e4d901 100644
--- a/versioned_docs/version-4.0.x/client-libraries-producers.md
+++ b/versioned_docs/version-4.0.x/client-libraries-producers.md
@@ -61,6 +61,49 @@ This example shows how to create a producer.
</Tabs>
````
+### Producer naming
+
+Every producer has a name that must be **unique across all Pulsar clusters**.
If you do not explicitly set a name, Pulsar generates a globally unique name
automatically. If you assign a name, the broker enforces that only one producer
with that name can publish on a topic at a time.
+
+You **must** set an explicit producer name when using [message
deduplication](cookbooks-deduplication.md). Even when deduplication is not
required, setting a meaningful producer name is recommended — it makes
debugging significantly easier because the name appears in broker logs, admin
stats, and metrics, letting you quickly trace messages back to the producing
application.
+
+````mdx-code-block
+<Tabs groupId="lang-choice"
+ defaultValue="Java"
+
values={[{"label":"Java","value":"Java"},{"label":"C++","value":"C++"},{"label":"Python","value":"Python"}]}>
+
+ <TabItem value="Java">
+
+ ```java
+ Producer<String> producer = pulsarClient.newProducer(Schema.STRING)
+ .topic("my-topic")
+ .producerName("my-unique-producer-name")
+ .create();
+ ```
+
+ </TabItem>
+
+ <TabItem value="C++">
+
+ ```cpp
+ ProducerConfiguration producerConfig;
+ producerConfig.setProducerName("my-unique-producer-name");
+ Producer producer;
+ Result result = client.createProducer("my-topic", producerConfig, producer);
+ ```
+
+ </TabItem>
+
+ <TabItem value="Python">
+
+ ```python
+ producer = client.create_producer('my-topic',
producer_name='my-unique-producer-name')
+ ```
+
+ </TabItem>
+</Tabs>
+````
+
## Publish messages
Pulsar supports both synchronous and asynchronous publishing of messages in
most clients. In some language-specific clients, such as Node.js and C#, you
can publish messages synchronously based on the asynchronous method using
language-specific mechanisms (like `await`).
diff --git a/versioned_docs/version-4.0.x/concepts-clients.md
b/versioned_docs/version-4.0.x/concepts-clients.md
index 5ba436849d9..c57e06fafb3 100644
--- a/versioned_docs/version-4.0.x/concepts-clients.md
+++ b/versioned_docs/version-4.0.x/concepts-clients.md
@@ -13,12 +13,12 @@ Pulsar client libraries support transparent reconnection
and/or connection failo
Before an application creates a producer/consumer, the Pulsar client library
needs to initiate a setup phase including two steps:
-1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
+1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
The request could reach one of the active brokers which, by looking at the
(cached) Zookeeper metadata knows who is serving the topic or, in case nobody
is serving it, tries to assign it to the least loaded broker.
-2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
-
+2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
+
Within this connection, the client and broker exchange binary commands
from a custom protocol. At this point, the client sends a command to create
producer/consumer to the broker, which will comply after having validated the
authorization policy.
Whenever the TCP connection breaks, the client immediately re-initiates this
setup phase and keeps trying with exponential backoff to re-establish the
producer or consumer until the operation succeeds.
@@ -27,6 +27,12 @@ Whenever the TCP connection breaks, the client immediately
re-initiates this set
A producer is a process that attaches to a topic and publishes messages to a
Pulsar [broker](concepts-architecture-overview.md#broker). The Pulsar broker
processes the messages.
+### Producer naming
+
+Every producer has a name that **must be unique across all Pulsar clusters**.
If you do not explicitly assign a name when creating a producer, Pulsar
automatically generates a globally unique name. If you choose to set a name
explicitly, the broker enforces that only one producer with that name can be
publishing on a topic at any given time — attempting to create a second
producer with the same name on the same topic will fail.
+
+Explicitly naming producers is required when using [message
deduplication](cookbooks-deduplication.md), because Pulsar uses the producer
name together with the sequence ID to identify and filter duplicate messages.
It is also useful for debugging and monitoring, since the producer name appears
in metrics and admin stats.
+
### Send mode
Send mode is a mechanism determining whether producers send messages to
brokers synchronously (sync) or asynchronously (async).
@@ -146,10 +152,10 @@ try {
// Send messages within transaction
producer.newMessage(txn).value("message-1").send();
producer.newMessage(txn).value("message-2").send();
-
- // Acknowledge messages within transaction
+
+ // Acknowledge messages within transaction
consumer.acknowledgeAsync(messageId, txn);
-
+
// Commit transaction
txn.commit().get();
} catch (Exception e) {
diff --git a/versioned_docs/version-4.0.x/concepts-messaging.md
b/versioned_docs/version-4.0.x/concepts-messaging.md
index 223b03d8b52..ec8101ebe18 100644
--- a/versioned_docs/version-4.0.x/concepts-messaging.md
+++ b/versioned_docs/version-4.0.x/concepts-messaging.md
@@ -28,7 +28,7 @@ Messages are the basic "unit" of Pulsar. They're what
producers publish to topic
| Value / data payload | The data carried by the message. All Pulsar messages
contain raw bytes, although message data can also conform to data
[schemas](schema-get-started.md).
|
| Key | The key (string type) of the message. It is a short
name of message key or partition key. Messages are optionally tagged with keys,
which is useful for features like [topic
compaction](concepts-topic-compaction.md).
|
| Properties | An optional key/value map of user-defined properties.
|
-| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, the default name is used.
|
+| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, Pulsar automatically generates a globally
unique name. If you explicitly assign a name, it **must be unique across all
Pulsar clusters**, otherwise the producer will fail to create. The broker
enforces that only one producer with the same name can be publishing on a topic
at any given time. See [Producer naming](concepts-clients.md#producer-naming)
for details. |
| Topic name | The name of the topic that the message is published
to.
|
| Schema version | The version number of the schema that the message is
produced with.
|
| Sequence ID | Each Pulsar message belongs to an ordered sequence on
its topic. The sequence ID of a message is initially assigned by its producer,
indicating its order in that sequence, and can also be customized.<br
/>Sequence ID can be used for message deduplication. If
`brokerDeduplicationEnabled` is set to `true`, the sequence ID of each message
is unique within a producer of a topic (non-partitioned) or a partition. |
@@ -655,6 +655,14 @@ Shared subscriptions do not guarantee message ordering or
support cumulative ack
The Key_Shared subscription type in Pulsar allows multiple consumers to attach
to the same subscription. But different with the Shared type, messages in the
Key_Shared type are delivered in distribution across consumers and messages
with the same key or same ordering key are delivered to only one consumer. No
matter how many times the message is re-delivered, it is delivered to the same
consumer.
+:::note Producer requirements for Key_Shared
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). The default batching strategy may pack messages with different keys into
the same batch, which breaks Key_Shared routing because the broker uses the
first message's key to route the entire batch.
+
+See [Batching for Key_Shared
Subscriptions](#batching-for-key_shared-subscriptions) for details and code
examples.
+
+:::
+

:::note
diff --git a/versioned_docs/version-4.1.x/client-libraries-consumers.md
b/versioned_docs/version-4.1.x/client-libraries-consumers.md
index f6cae3b4b24..b1555ffdf0f 100644
--- a/versioned_docs/version-4.1.x/client-libraries-consumers.md
+++ b/versioned_docs/version-4.1.x/client-libraries-consumers.md
@@ -240,6 +240,12 @@ The `Shared` subscription is different from the
`Exclusive` and `Failover` subsc
This is a new subscription type since 2.4.0 release. Create new consumers and
subscribe with `Key_Shared` subscription type.
+:::note Producer batching requirement
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). Default batching may pack messages with different keys into the same
batch, breaking Key_Shared routing semantics. See [below](#key_shared-batching)
for code examples.
+
+:::
+
````mdx-code-block
<Tabs groupId="lang-choice"
defaultValue="Java"
diff --git a/versioned_docs/version-4.1.x/client-libraries-producers.md
b/versioned_docs/version-4.1.x/client-libraries-producers.md
index 5d7f1078600..dba81e4d901 100644
--- a/versioned_docs/version-4.1.x/client-libraries-producers.md
+++ b/versioned_docs/version-4.1.x/client-libraries-producers.md
@@ -61,6 +61,49 @@ This example shows how to create a producer.
</Tabs>
````
+### Producer naming
+
+Every producer has a name that must be **unique across all Pulsar clusters**.
If you do not explicitly set a name, Pulsar generates a globally unique name
automatically. If you assign a name, the broker enforces that only one producer
with that name can publish on a topic at a time.
+
+You **must** set an explicit producer name when using [message
deduplication](cookbooks-deduplication.md). Even when deduplication is not
required, setting a meaningful producer name is recommended — it makes
debugging significantly easier because the name appears in broker logs, admin
stats, and metrics, letting you quickly trace messages back to the producing
application.
+
+````mdx-code-block
+<Tabs groupId="lang-choice"
+ defaultValue="Java"
+
values={[{"label":"Java","value":"Java"},{"label":"C++","value":"C++"},{"label":"Python","value":"Python"}]}>
+
+ <TabItem value="Java">
+
+ ```java
+ Producer<String> producer = pulsarClient.newProducer(Schema.STRING)
+ .topic("my-topic")
+ .producerName("my-unique-producer-name")
+ .create();
+ ```
+
+ </TabItem>
+
+ <TabItem value="C++">
+
+ ```cpp
+ ProducerConfiguration producerConfig;
+ producerConfig.setProducerName("my-unique-producer-name");
+ Producer producer;
+ Result result = client.createProducer("my-topic", producerConfig, producer);
+ ```
+
+ </TabItem>
+
+ <TabItem value="Python">
+
+ ```python
+ producer = client.create_producer('my-topic',
producer_name='my-unique-producer-name')
+ ```
+
+ </TabItem>
+</Tabs>
+````
+
## Publish messages
Pulsar supports both synchronous and asynchronous publishing of messages in
most clients. In some language-specific clients, such as Node.js and C#, you
can publish messages synchronously based on the asynchronous method using
language-specific mechanisms (like `await`).
diff --git a/versioned_docs/version-4.1.x/concepts-clients.md
b/versioned_docs/version-4.1.x/concepts-clients.md
index 735d3b4b188..d52dd3d18dd 100644
--- a/versioned_docs/version-4.1.x/concepts-clients.md
+++ b/versioned_docs/version-4.1.x/concepts-clients.md
@@ -13,12 +13,12 @@ Pulsar client libraries support transparent reconnection
and/or connection failo
Before an application creates a producer/consumer, the Pulsar client library
needs to initiate a setup phase including two steps:
-1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
+1. The client attempts to determine the owner of the topic by sending an HTTP
lookup request to the broker.
The request could reach one of the active brokers which, by looking at the
(cached) Zookeeper metadata knows who is serving the topic or, in case nobody
is serving it, tries to assign it to the least loaded broker.
-2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
-
+2. Once the client library has the broker address, it creates a TCP connection
(or reuses an existing connection from the pool) and authenticates it.
+
Within this connection, the client and broker exchange binary commands
from a custom protocol. At this point, the client sends a command to create
producer/consumer to the broker, which will comply after having validated the
authorization policy.
Whenever the TCP connection breaks, the client immediately re-initiates this
setup phase and keeps trying with exponential backoff to re-establish the
producer or consumer until the operation succeeds.
@@ -27,6 +27,12 @@ Whenever the TCP connection breaks, the client immediately
re-initiates this set
A producer is a process that attaches to a topic and publishes messages to a
Pulsar [broker](concepts-architecture-overview.md#broker). The Pulsar broker
processes the messages.
+### Producer naming
+
+Every producer has a name that **must be unique across all Pulsar clusters**.
If you do not explicitly assign a name when creating a producer, Pulsar
automatically generates a globally unique name. If you choose to set a name
explicitly, the broker enforces that only one producer with that name can be
publishing on a topic at any given time — attempting to create a second
producer with the same name on the same topic will fail.
+
+Explicitly naming producers is required when using [message
deduplication](cookbooks-deduplication.md), because Pulsar uses the producer
name together with the sequence ID to identify and filter duplicate messages.
It is also useful for debugging and monitoring, since the producer name appears
in metrics and admin stats.
+
### Send mode
Send mode is a mechanism determining whether producers send messages to
brokers synchronously (sync) or asynchronously (async).
diff --git a/versioned_docs/version-4.1.x/concepts-messaging.md
b/versioned_docs/version-4.1.x/concepts-messaging.md
index fe7c2106d45..de169cc5322 100644
--- a/versioned_docs/version-4.1.x/concepts-messaging.md
+++ b/versioned_docs/version-4.1.x/concepts-messaging.md
@@ -28,7 +28,7 @@ Messages are the basic "unit" of Pulsar. They're what
producers publish to topic
| Value / data payload | The data carried by the message. All Pulsar messages
contain raw bytes, although message data can also conform to data
[schemas](schema-get-started.md).
|
| Key | The key (string type) of the message. It is a short
name of message key or partition key. Messages are optionally tagged with keys,
which is useful for features like [topic
compaction](concepts-topic-compaction.md).
|
| Properties | An optional key/value map of user-defined properties.
|
-| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, the default name is used.
|
+| Producer name | The name of the producer who produces the message. If
you do not specify a producer name, Pulsar automatically generates a globally
unique name. If you explicitly assign a name, it **must be unique across all
Pulsar clusters**, otherwise the producer will fail to create. The broker
enforces that only one producer with the same name can be publishing on a topic
at any given time. See [Producer naming](concepts-clients.md#producer-naming)
for details. |
| Topic name | The name of the topic that the message is published
to.
|
| Schema version | The version number of the schema that the message is
produced with.
|
| Sequence ID | Each Pulsar message belongs to an ordered sequence on
its topic. The sequence ID of a message is initially assigned by its producer,
indicating its order in that sequence, and can also be customized.<br
/>Sequence ID can be used for message deduplication. If
`brokerDeduplicationEnabled` is set to `true`, the sequence ID of each message
is unique within a producer of a topic (non-partitioned) or a partition. |
@@ -655,6 +655,14 @@ Shared subscriptions do not guarantee message ordering or
support cumulative ack
The Key_Shared subscription type in Pulsar allows multiple consumers to attach
to the same subscription. But different with the Shared type, messages in the
Key_Shared type are delivered in distribution across consumers and messages
with the same key or same ordering key are delivered to only one consumer. No
matter how many times the message is re-delivered, it is delivered to the same
consumer.
+:::note Producer requirements for Key_Shared
+
+When using Key_Shared subscriptions, producers **must** either **disable
batching** or **use key-based batching** (e.g., `BatcherBuilder.KEY_BASED` in
Java). The default batching strategy may pack messages with different keys into
the same batch, which breaks Key_Shared routing because the broker uses the
first message's key to route the entire batch.
+
+See [Batching for Key_Shared
Subscriptions](#batching-for-key_shared-subscriptions) for details and code
examples.
+
+:::
+

:::note