kazuyukitanimura commented on code in PR #1359:
URL: https://github.com/apache/datafusion-comet/pull/1359#discussion_r1958826427

##########
docs/source/user-guide/datasources.md:
##########
@@ -35,3 +35,81 @@ converted into Arrow format, allowing native execution to happen after that.
 Comet does not provide native JSON scan, but when `spark.comet.convert.json.enabled` is enabled, data is immediately converted into Arrow format, allowing native execution to happen after that.
+
+# Supported Storages
+
+## Local
+In progress
+
+## HDFS
+
+Apache DataFusion Comet native reader seamlessly scans files from remote HDFS for [supported formats](#supported-spark-data-sources)
+
+### Using experimental native DataFusion reader
+Unlike to native Comet reader the Datafusion reader fully supports nested types processing. This reader is currently experimental only
+
+To build Comet with native DataFusion reader and remote HDFS support it is required to have a JDK installed
+
+Example:
+Build a Comet for `spark-3.4` provide a JDK path in `JAVA_HOME`
+Provide the JRE linker path in `RUSTFLAGS`, the path can vary depending on the system. Typically JRE linker is a part of installed JDK
+
+```shell
+export JAVA_HOME="/opt/homebrew/opt/openjdk@11"

Review Comment:
nit: is JAVA_HOME still the requirement?

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org