kazuyukitanimura commented on code in PR #1133: URL: https://github.com/apache/datafusion-comet/pull/1133#discussion_r1870467062
########## spark/src/main/java/org/apache/comet/CometBatchIterator.java: ########## @@ -47,12 +62,8 @@ public class CometBatchIterator { * @return the number of rows of the current batch. -1 if there is no more batch. */ public int next(long[] arrayAddrs, long[] schemaAddrs) { - boolean hasBatch = input.hasNext(); - - if (!hasBatch) { - return -1; - } - - return nativeUtil.exportBatch(arrayAddrs, schemaAddrs, input.next()); + int numRows = nativeUtil.exportBatch(arrayAddrs, schemaAddrs, currentBatch); Review Comment: Do we need something like ``` if (currentBatch == null) { return -1; } ``` ########## spark/src/main/java/org/apache/comet/CometBatchIterator.java: ########## @@ -33,12 +33,27 @@ public class CometBatchIterator { final Iterator<ColumnarBatch> input; final NativeUtil nativeUtil; + private ColumnarBatch currentBatch = null; CometBatchIterator(Iterator<ColumnarBatch> input, NativeUtil nativeUtil) { this.input = input; this.nativeUtil = nativeUtil; } + /** + * Fetch the next input batch. + * + * @return Number of rows in next batch or -1 if no batches left. + */ + public int hasNext() { + if (input.hasNext()) { + currentBatch = input.next(); Review Comment: Hmmm this will advance `input` just by calling `hasNext`, will be a problem when this gets called twice? ########## native/core/src/execution/operators/scan.rs: ########## @@ -200,6 +217,21 @@ impl ScanExec { let mut env = JVMClasses::get_env()?; + let mut timer = jvm_fetch_time.timer(); + + let num_rows: i32 = unsafe { + jni_call!(&mut env, + comet_batch_iterator(iter).has_next() -> i32)? + }; + + timer.stop(); + + if num_rows == -1 { + return Ok(InputBatch::EOF); + } Review Comment: we have the same code at line 268 Should we remove the one at line 268? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org