Re: [PR] MINOR: cleanup KStream JavaDocs (7/N) - repartition/to/toTable [kafka]

via GitHub Tue, 04 Feb 2025 20:56:49 -0800


mjsax commented on code in PR #18760:
URL: https://github.com/apache/kafka/pull/18760#discussion_r1942246261



##########
streams/src/main/java/org/apache/kafka/streams/kstream/KStream.java:
##########
@@ -758,185 +756,118 @@ <VR> KStream<K, VR> flatMapValues(final 
ValueMapperWithKey<? super K, ? super V,
 
     /**
      * Materialize this stream to an auto-generated repartition topic and 
create a new {@code KStream}
-     * from the auto-generated topic using default serializers, deserializers, 
and producer's default partitioning strategy.
-     * The number of partitions is determined based on the upstream topics 
partition numbers.
-     * <p>
-     * The created topic is considered as an internal topic and is meant to be 
used only by the current Kafka Streams instance.
-     * Similar to auto-repartitioning, the topic will be created with infinite 
retention time and data will be automatically purged by Kafka Streams.
-     * The topic will be named as "${applicationId}-&lt;name&gt;-repartition", 
where "applicationId" is user-specified in
-     * {@link StreamsConfig} via parameter {@link 
StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG},
+     * from the auto-generated topic.
+     *
+     * <p>The created topic is considered an internal topic and is meant to be 
used only by the current
+     * Kafka Streams instance.
+     * The topic will be named as "${applicationId}-&lt;name&gt;-repartition",
+     * where "applicationId" is user-specified in {@link StreamsConfig} via 
parameter
+     * {@link StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG},
      * "&lt;name&gt;" is an internally generated name, and "-repartition" is a 
fixed suffix.
+     * The number of partitions for the repartition topic is determined based 
on the upstream topics partition numbers.
+     * Furthermore, the topic will be created with infinite retention time and 
data will be automatically purged
+     * by Kafka Streams.
+     *
+     * <p>You can retrieve all generated internal topic names via {@link 
Topology#describe()}.
+     * To explicitly set key/value serdes, specify the number of used 
partitions or the partitioning strategy,
+     * or to customize the name of the repartition topic, use {@link 
#repartition(Repartitioned)}.
      *
-     * @return {@code KStream} that contains the exact same repartitioned 
records as this {@code KStream}.
+     * @return A {@code KStream} that contains the exact same, but 
repartitioned records as this {@code KStream}.
      */
     KStream<K, V> repartition();
 
     /**
-     * Materialize this stream to an auto-generated repartition topic and 
create a new {@code KStream}
-     * from the auto-generated topic using {@link Serde key serde}, {@link 
Serde value serde}, {@link StreamPartitioner},
-     * number of partitions, and topic name part as defined by {@link 
Repartitioned}.
-     * <p>
-     * The created topic is considered as an internal topic and is meant to be 
used only by the current Kafka Streams instance.
-     * Similar to auto-repartitioning, the topic will be created with infinite 
retention time and data will be automatically purged by Kafka Streams.
-     * The topic will be named as "${applicationId}-&lt;name&gt;-repartition", 
where "applicationId" is user-specified in
-     * {@link StreamsConfig} via parameter {@link 
StreamsConfig#APPLICATION_ID_CONFIG APPLICATION_ID_CONFIG},
-     * "&lt;name&gt;" is either provided via {@link Repartitioned#as(String)} 
or an internally
-     * generated name, and "-repartition" is a fixed suffix.
-     *
-     * @param repartitioned the {@link Repartitioned} instance used to specify 
{@link Serdes},
-     *                      {@link StreamPartitioner} which determines how 
records are distributed among partitions of the topic,
-     *                      part of the topic name, and number of partitions 
for a repartition topic.
-     * @return a {@code KStream} that contains the exact same repartitioned 
records as this {@code KStream}.
+     * See {@link #repartition()}.
      */
     KStream<K, V> repartition(final Repartitioned<K, V> repartitioned);
 
     /**
-     * Materialize this stream to a topic using default serializers specified 
in the config and producer's
-     * default partitioning strategy.
-     * The specified topic should be manually created before it is used (i.e., 
before the Kafka Streams application is
+     * Materialize this stream to a topic.
+     * The topic should be manually created before it is used (i.e., before 
the Kafka Streams application is
      * started).
      *
-     * @param topic the topic name
+     * <p>To explicitly set key/value serdes or the partitioning strategy, use 
{@link #to(String, Produced)}.
+     *
+     * @param topic
+     *        the output topic name
+     * 
+     * @see #to(TopicNameExtractor)
      */
     void to(final String topic);
 
     /**
-     * Materialize this stream to a topic using the provided {@link Produced} 
instance.
-     * The specified topic should be manually created before it is used (i.e., 
before the Kafka Streams application is
-     * started).
-     *
-     * @param topic       the topic name
-     * @param produced    the options to use when producing to the topic
+     * See {@link #to(String)}.
      */
     void to(final String topic,
             final Produced<K, V> produced);
 
     /**
-     * Dynamically materialize this stream to topics using default serializers 
specified in the config and producer's
-     * default partitioning strategy.
-     * The topic names for each record to send to is dynamically determined 
based on the {@link TopicNameExtractor}.
+     * Materialize the records of this stream to different topics.
+     * The provided {@link TopicNameExtractor} is applied to each input record 
to compute the output topic name.
+     * All topics should be manually created before they are used (i.e., before 
the Kafka Streams application is started).
+     *
+     * <p>To explicitly set key/value serdes or the partitioning strategy, use 
{@link #to(TopicNameExtractor, Produced)}.
      *
-     * @param topicExtractor    the extractor to determine the name of the 
Kafka topic to write to for each record
+     * @param topicExtractor
+     *        the extractor to determine the name of the Kafka topic to write 
to for each record
      */
     void to(final TopicNameExtractor<K, V> topicExtractor);
 
     /**
-     * Dynamically materialize this stream to topics using the provided {@link 
Produced} instance.
-     * The topic names for each record to send to is dynamically determined 
based on the {@link TopicNameExtractor}.
-     *
-     * @param topicExtractor    the extractor to determine the name of the 
Kafka topic to write to for each record
-     * @param produced          the options to use when producing to the topic
+     * See {@link #to(TopicNameExtractor)}.

Review Comment:
   It's covered above on `to(String)` as well as `to(TopicNameExtractor)`:
   ```
   <p>To explicitly set key/value serdes or the partitioning strategy, use 
{@link #to(String, Produced)}.
   ```
   
   Not sufficient, or did you miss it?
   
   > Would it make sense to refer from most generic to the most specific 
overload instead?
   
   I did consider this originally, but found it overall more complicated than 
helpful, and thought it's easier to describe the simplest overload and just 
add a "forward reference" (as quoted above, and used elsewhere) instead.
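
As a minimal sketch of the convention described above (full description on the simplest overload with a forward reference, and a bare "See" back-reference on the richer overload), consider the following. `DocumentedSink`, its `nameMapper` parameter, and `written()` are hypothetical names used only for illustration; they are not part of Kafka Streams, and the class only mimics the JavaDoc layout, not the behavior of `KStream#to`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

/** Hypothetical class for illustration only; it is not part of Kafka Streams. */
final class DocumentedSink {
    // Topic names that were "written to", in call order.
    private final List<String> written = new ArrayList<>();

    /**
     * Record a write to the given topic, using the topic name unchanged.
     *
     * <p>To explicitly transform the topic name, use {@link #to(String, UnaryOperator)}.
     *
     * @param topic
     *        the output topic name
     */
    void to(final String topic) {
        // The simplest overload delegates to the richer one with a default.
        to(topic, UnaryOperator.identity());
    }

    /**
     * See {@link #to(String)}.
     */
    void to(final String topic, final UnaryOperator<String> nameMapper) {
        written.add(nameMapper.apply(topic));
    }

    /** Return an immutable snapshot of the recorded topic names. */
    List<String> written() {
        return List.copyOf(written);
    }
}
```

With this layout a reader who lands on either overload reaches the full description in at most one hop, and the shared text lives in a single place.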



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


Re: [PR] MINOR: cleanup KStream JavaDocs (7/N) - repartition/to/toTable [kafka]