Github user XuPingyong commented on the issue:

    https://github.com/apache/flink/pull/4525
  
    @greghogan, if the object passed to `nextRecord` may be reused internally 
by the InputFormat, do the similar cases below need to be reconsidered as well?
    
    In `DataSourceTask.java`:

        OT reuse = serializer.createInstance();

        // as long as there is data to read
        while (!this.taskCanceled && !format.reachedEnd()) {
            OT returned;
            if ((returned = format.nextRecord(reuse)) != null) {
                output.collect(returned);
            }
        }
    
    And in many batch drivers:

        final MutableObjectIterator<T> in = taskContext.getInput(0);
        T value = serializer.createInstance();
        while (running && (value = in.next(value)) != null) {
            .......
        }
    
    
    In my opinion:
         1.  `null` records are meaningless, but `null` is meaningful as the 
return value of an input or format, where it signals the end. If a user only 
calls `InputFormat#nextRecord` without `InputFormat#reachedEnd`, returning 
`null` is the only way to signal the end.
         2.  It should not be necessary to consider whether the object returned 
by `InputFormat#nextRecord` may be passed in again. If an immutable object is 
returned, an exception will be thrown when it is passed back to 
`InputFormat#nextRecord` for reuse.
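
    To make the two loop shapes above concrete, here is a minimal, 
self-contained sketch. It does not use Flink's classes: `RecordSource`, 
`IntHolder`, `CountingSource` and both `drain...` methods are hypothetical 
names invented for illustration, and the source is assumed to fill the object 
it is handed and to return `null` once it is exhausted.

```java
import java.util.ArrayList;
import java.util.List;

public class ReuseLoopSketch {

    /** Hypothetical stand-in for the two InputFormat methods discussed above. */
    interface RecordSource<T> {
        boolean reachedEnd();
        T nextRecord(T reuse);
    }

    /** Mutable record type, so the source can fill the reuse object in place. */
    static final class IntHolder {
        int value;
    }

    /** Emits 1..limit by writing each value into the object it was handed. */
    static final class CountingSource implements RecordSource<IntHolder> {
        private final int limit;
        private int next = 1;

        CountingSource(int limit) {
            this.limit = limit;
        }

        @Override
        public boolean reachedEnd() {
            return next > limit;
        }

        @Override
        public IntHolder nextRecord(IntHolder reuse) {
            if (reachedEnd()) {
                return null; // null signals the end when reachedEnd() is not consulted
            }
            reuse.value = next++;
            return reuse;
        }
    }

    /** Shape of the DataSourceTask loop: the same reuse object is passed every time. */
    static List<Integer> drainWithFixedReuse(RecordSource<IntHolder> source) {
        List<Integer> out = new ArrayList<>();
        IntHolder reuse = new IntHolder();
        while (!source.reachedEnd()) {
            IntHolder returned = source.nextRecord(reuse);
            if (returned != null) {
                out.add(returned.value); // copy the value out before the next call overwrites it
            }
        }
        return out;
    }

    /** Shape of the batch-driver loop: the returned object is passed back in. */
    static List<Integer> drainFeedingBack(RecordSource<IntHolder> source) {
        List<Integer> out = new ArrayList<>();
        IntHolder value = new IntHolder();
        while ((value = source.nextRecord(value)) != null) {
            out.add(value.value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(drainWithFixedReuse(new CountingSource(3)));
        System.out.println(drainFeedingBack(new CountingSource(3)));
    }
}
```

    With this particular source both loops read the same three values. The 
second loop only works because the returned object is mutable, which is the 
concern in point 2.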
    
    @greghogan, could you please give some examples where the object passed to 
`nextRecord` is reused internally by the InputFormat?  Thanks.

