comphead commented on PR #1377:
URL: https://github.com/apache/datafusion-comet/pull/1377#issuecomment-2651312412

> > From the org of [datafusion-contrib](https://github.com/datafusion-contrib?q=hdfs&type=all&language=&sort=), I see many hdfs crates, which one is best for comet?
>
> I know it was brought up in the original issue, but just plugging my pure Rust implementation: https://github.com/datafusion-contrib/hdfs-native-object-store. It's already integrated in delta-rs and delta-kernel-rs. Though this arguably could be the one time a JNI based implementation could make sense since you're guaranteed to have Java installed and probably your classpath already set correctly since you're running Spark.
>
> Doesn't seem like something that should have the implementation copied into the Comet repo, as it seems out of scope.

Hey @Kimahriman, it's nice to see you here. We checked the contrib crates https://github.com/datafusion-contrib/hdfs-native-object-store and https://github.com/datafusion-contrib/datafusion-objectstore-hdfs.

I really like the pure Rust crate because it has a smaller memory footprint and no JVM dependency, and therefore no JVM roundtrips. Can't wait for this crate to mature so we can use it.

For now, the JVM-based `libhdfs` supports more HDFS client settings, which is critical on production sites, compared to https://github.com/Kimahriman/hdfs-native?tab=readme-ov-file#supported-hdfs-settings, where it is hard to configure cluster network settings, namenode retries, etc.