JigaoLuo commented on PR #79: URL: https://github.com/apache/datafusion-site/pull/79#issuecomment-3039990579
This also ties into my initial impression that “the Embedded Index is just a hashset to speed up scans, which adds overhead to Parquet,” as I mentioned in a follow-up here: https://github.com/apache/datafusion-site/pull/79#discussion_r2186572247

I think this framing might unintentionally limit how readers perceive its potential. To address this, we could add **an Outlook section** (at either the beginning or the end of the blog) that explicitly highlights what the Embedded Index is capable of. It’s not just a hashset for pruning; in principle, it could support a wide range of use cases. Some of these use cases are also discussed here: https://github.com/apache/datafusion/issues/16374#issuecomment-3039796047

I’d be happy to help draft such an Outlook section, pending confirmation from your side.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org