[GitHub] [kafka] vvcephei commented on a change in pull request #10810: MINOR: Improve Kafka Streams JavaDocs with regard to record metadata

GitBox Fri, 04 Jun 2021 01:21:14 -0700


vvcephei commented on a change in pull request #10810:
URL: https://github.com/apache/kafka/pull/10810#discussion_r645023265




##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/ProcessorContext.java
##########
@@ -158,32 +165,52 @@ Cancellable schedule(final Duration interval,
     void commit();
 
     /**
-     * Returns the topic name of the current input record; could be null if it 
is not
-     * available (for example, if this method is invoked from the punctuate 
call).
+     * Returns the topic name of the current input record; could be {@code 
null} if it is not
+     * available.
+     *
+     * <p> For example, if this method is invoked within a punctuate callback, 
or while processing a
+     * record that was forwarded by a punctuation callback, the record won't 
have an associated topic.
+     * Another example is
+     * {@link 
org.apache.kafka.streams.kstream.KTable#transformValues(ValueTransformerWithKeySupplier,
 String...)}
+     * (and siblings), that do not always guarantee to provide a valid topic 
name, as they might be
+     * executed "out-of-band" due to some internal optimizations applied by 
the Kafka Streams DSL.
      *
      * @return the topic name
      */
     String topic();
 
     /**
-     * Returns the partition id of the current input record; could be -1 if it 
is not
-     * available (for example, if this method is invoked from the punctuate 
call).
+     * Returns the partition id of the current input record; could be {@code 
-1} if it is not
+     * available.
+     *
+     * <p> For example, if this method is invoked within a punctuate callback, 
or while processing a
+     * record that was forwarded by a punctuation callback the record won't 
have an associated partition id.
+     * Another example is
+     * {@link 
org.apache.kafka.streams.kstream.KTable#transformValues(ValueTransformerWithKeySupplier,
 String...)}
+     * (and siblings), that do not always guarantee to provide a valid 
partition id, as they might be
+     * executed "out-of-band" due to some internal optimizations applied by 
the Kafka Streams DSL.
      *
      * @return the partition id
      */
     int partition();
 
     /**
-     * Returns the offset of the current input record; could be -1 if it is not
-     * available (for example, if this method is invoked from the punctuate 
call).
+     * Returns the offset of the current input record; could be {@code -1} if 
it is not
+     * available.
+     *
+     * <p> For example, if this method is invoked within a punctuate callback, 
or while processing a
+     * record that was forwarded by a punctuation callback, the record won't 
have an associated offset.
+     * Another example is
+     * {@link 
org.apache.kafka.streams.kstream.KTable#transformValues(ValueTransformerWithKeySupplier,
 String...)}
+     * (and siblings), that do not always guarantee to provide a valid offset, 
as they might be
+     * executed "out-of-band" due to some internal optimizations applied by 
the Kafka Streams DSL.
      *
      * @return the offset
      */
     long offset();
 
     /**
-     * Returns the headers of the current input record; could be null if it is 
not
-     * available (for example, if this method is invoked from the punctuate 
call).
+     * Returns the headers of the current input record.

Review comment:
       I see what you mean. I think it's worth providing a little more 
"context."
   
   Even in the punctuation case, it is obvious when you're specifically 
thinking about the relationship between upstream punctuation and downstream 
processors' contexts, but if you have a large Streams topology, maintained by a 
large team, you can easily have a situation where one person adds a punctuation 
that interacts poorly with logic far downstream, primarily maintained by other 
people.
   
   In that case, the downstream folks might be encountering a situation where 
they can't figure out why the headers are empty sometimes. In those cases, a 
little bread-crumb that says "this might be empty if..." can save someone hours 
or days of debugging.

##########
File path: 
streams/src/main/java/org/apache/kafka/streams/processor/internals/ProcessorRecordContext.java
##########
@@ -43,12 +43,11 @@ public ProcessorRecordContext(final long timestamp,
                                   final int partition,
                                   final String topic,
                                   final Headers headers) {
-
         this.timestamp = timestamp;
         this.offset = offset;
         this.topic = topic;
         this.partition = partition;
-        this.headers = headers;
+        this.headers = Objects.requireNonNull(headers);

Review comment:
       Yeah, as long as the tests pass, I think it's ok.

##########
File path: 
streams/test-utils/src/main/java/org/apache/kafka/streams/processor/MockProcessorContext.java
##########
@@ -319,7 +321,7 @@ public void setRecordMetadata(final String topic,
         this.topic = topic;
         this.partition = partition;
         this.offset = offset;
-        this.headers = headers;
+        this.headers = Objects.requireNonNull(headers);

Review comment:
       If you want to introduce the invariant without introducing an NPE 
regression, you could just coerce a null:
   
   ```suggestion
           this.headers = headers == null ? new RecordHeaders() : headers;
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

[GitHub] [kafka] vvcephei commented on a change in pull request #10810: MINOR: Improve Kafka Streams JavaDocs with regard to record metadata

Reply via email to