Tushar7012 commented on issue #20054: URL: https://github.com/apache/datafusion/issues/20054#issuecomment-3814224884
Hi @Dandandan, I have updated the `ArrowBytesViewMap` implementation (`datafusion/physical-expr-common/src/binary_view_map.rs`) to optimize memory access for inlined values and to resolve the recent compilation errors:

1. **Single-Fetch Optimization**: Modified `insert_if_new_inner` to fetch and convert the input row value to `&[u8]` exactly once per row. This value is then reused for both hash table lookups and builder insertion, eliminating redundant memory access and byte conversions during hash collisions.
2. **API Constraints**: I investigated performing direct `u128` comparisons against existing entries in the builder. However, `GenericByteViewBuilder` does not currently expose its internal `views` buffer publicly, so I used `self.builder.get_value(idx) == value_bytes` as the fallback, which remains efficient for inlined values.
3. **Type Safety & Fixes**: Resolved the `E0599`, `E0277`, and `E0308` compilation errors by correctly leveraging `GenericByteViewArray<B>` and ensuring consistent byte-slice comparisons.

**Verification:** Ran the tests for `binary_view_map`:

- `cargo test -p datafusion-physical-expr-common --lib binary_view_map::tests`
- Result: **8 passed; 0 failed**
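
To illustrate the single-fetch idea from point 1, here is a minimal standalone sketch using only the standard library. It is not the actual `insert_if_new_inner` code: `BytesInterner`, `hash_bytes`, and the `Vec<Vec<u8>>` store are hypothetical stand-ins for `ArrowBytesViewMap`, its hasher, and `GenericByteViewBuilder`. The point is only that each row is converted to `&[u8]` once and that same slice is reused for hashing, for comparing against existing entries, and for insertion.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for the map; NOT DataFusion's `ArrowBytesViewMap`.
struct BytesInterner {
    /// hash -> indices into `values` whose bytes produced that hash
    buckets: HashMap<u64, Vec<usize>>,
    /// distinct values in first-seen order (stand-in for the view builder)
    values: Vec<Vec<u8>>,
}

/// Hypothetical helper: hash a byte slice with the std default hasher.
fn hash_bytes(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    bytes.hash(&mut hasher);
    hasher.finish()
}

impl BytesInterner {
    fn new() -> Self {
        BytesInterner {
            buckets: HashMap::new(),
            values: Vec::new(),
        }
    }

    /// Number of distinct values stored so far.
    fn len(&self) -> usize {
        self.values.len()
    }

    /// For each input row, returns the index of its distinct value,
    /// inserting the value if it has not been seen before.
    fn insert_if_new<T: AsRef<[u8]>>(&mut self, rows: &[T]) -> Vec<usize> {
        let values = &mut self.values;
        let buckets = &mut self.buckets;
        let mut out = Vec::with_capacity(rows.len());

        for row in rows {
            // Fetch and convert the row to bytes exactly once.
            let value_bytes: &[u8] = row.as_ref();
            let hash = hash_bytes(value_bytes);
            let bucket = buckets.entry(hash).or_default();

            // Reuse the same slice for every candidate that shares this hash,
            // instead of re-reading the row on each comparison.
            let found = bucket
                .iter()
                .copied()
                .find(|&idx| values[idx].as_slice() == value_bytes);

            let idx = match found {
                Some(idx) => idx,
                None => {
                    // Reuse the same slice again for the insertion.
                    values.push(value_bytes.to_vec());
                    let new_idx = values.len() - 1;
                    bucket.push(new_idx);
                    new_idx
                }
            };
            out.push(idx);
        }
        out
    }
}
```

In this sketch, calling `insert_if_new(&["a", "b", "a"])` on a fresh `BytesInterner` yields the indices `[0, 1, 0]` and stores two distinct values. The comparison `values[idx].as_slice() == value_bytes` plays the role that `self.builder.get_value(idx) == value_bytes` plays in the comment above.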
