parthchandra commented on PR #2078: URL: https://github.com/apache/datafusion-comet/pull/2078#issuecomment-3165214392
> `Refactor the FileReader API to use an InputStream (instead of a file path)...`
> Does using a file path still have shading issues?

No, but the input to `CometVectorizedParquetReader.newCometReader` is an `InputFile`, which need not be a `HadoopInputFile` and need not be backed by an actual file. In that case the file path is invalid and the Comet `FileReader` fails. A number of Iceberg unit tests failed for this reason. The `WrappedInputFile` implementation exists to work around that.

The underlying object in `WrappedInputFile` is an `org.apache.iceberg.io.InputFile`, which is accessed via reflection to avoid creating a dependency on Iceberg. Similarly, the `org.apache.iceberg.io.SeekableInputStream` returned by `org.apache.iceberg.io.InputFile` is wrapped in `WrappedSeekableInputStream` to avoid a dependency on Iceberg.
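For readers unfamiliar with the pattern, here is a minimal sketch of wrapping an object via reflection so that the wrapper compiles without the wrapped type on the classpath. It is not the code in this PR: `ReflectiveInputFileSketch`, `FakeInputFile`, and `ReflectiveInputFile` are hypothetical names, and `FakeInputFile` is only a stand-in for a third-party input file type. The sketch assumes the wrapped object exposes no-argument `getLength()` and `newStream()` methods.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;

class ReflectiveInputFileSketch {

  /** Hypothetical stand-in for a third-party input file; not backed by a real file path. */
  public static class FakeInputFile {
    private final byte[] data;

    public FakeInputFile(byte[] data) {
      this.data = data;
    }

    public long getLength() {
      return data.length;
    }

    public InputStream newStream() {
      return new ByteArrayInputStream(data);
    }
  }

  /** Hypothetical wrapper: holds the delegate as Object and resolves methods by name at run time. */
  public static class ReflectiveInputFile {
    private final Object wrapped;

    public ReflectiveInputFile(Object wrapped) {
      this.wrapped = wrapped;
    }

    public long getLength() {
      return ((Number) call("getLength")).longValue();
    }

    public InputStream newStream() {
      return (InputStream) call("newStream");
    }

    // Looks up a public no-arg method on the delegate's runtime class and invokes it.
    private Object call(String name) {
      try {
        Method m = wrapped.getClass().getMethod(name);
        // Needed when the concrete class itself is not public.
        m.setAccessible(true);
        return m.invoke(wrapped);
      } catch (ReflectiveOperationException e) {
        throw new IllegalStateException("Reflective call failed: " + name, e);
      }
    }
  }

  public static void main(String[] args) throws IOException {
    byte[] bytes = "PAR1".getBytes(StandardCharsets.US_ASCII);
    ReflectiveInputFile file = new ReflectiveInputFile(new FakeInputFile(bytes));
    System.out.println("length = " + file.getLength());
    try (InputStream in = file.newStream()) {
      System.out.println("first byte = " + in.read());
    }
  }
}
```

The trade-off of this approach is that a missing or renamed method only surfaces at run time, so the sketch turns reflection failures into a single exception type that names the failing call.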