crepererum opened a new issue, #13433: URL: https://github.com/apache/datafusion/issues/13433
# What

Migrate from `hashbrown::raw::RawTable` to `hashbrown::hash_table::HashTable`.

# Why

`RawTable` and `raw_entry` are removed in hashbrown 0.15.1. See https://github.com/apache/datafusion/pull/13256. Personally, I think it's the right thing to do because `RawTable` was a bit of a weird API.

# How

First, we need to get some memory accounting for `HashTable`. This can be done in a similar way to https://github.com/apache/datafusion/blob/e25f5e7485ffcd810f96c7be096b04b3cacf30b3/datafusion/common/src/utils/proxy.rs#L110-L111:

```rust
use hashbrown::HashTable;
use std::mem::size_of;

/// Extension trait for hashbrown's [`HashTable`] to account for allocations.
pub trait HashTableAllocExt {
    /// Item type.
    type T;

    /// Insert new element into table and increase
    /// `accounting` by any newly allocated bytes.
    ///
    /// Does nothing if an equal element is already present.
    /// Note that allocation counts capacity, not size.
    ///
    /// # Example:
    /// ```
    /// TODO: rewrite this example!
    /// ```
    fn insert_accounted(
        &mut self,
        x: Self::T,
        hasher: impl Fn(&Self::T) -> u64,
        accounting: &mut usize,
    );
}

impl<T> HashTableAllocExt for HashTable<T>
where
    T: Eq,
{
    type T = T;

    fn insert_accounted(
        &mut self,
        x: Self::T,
        hasher: impl Fn(&Self::T) -> u64,
        accounting: &mut usize,
    ) {
        let hash = hasher(&x);

        // NOTE: `find_entry` does NOT grow!
        match self.find_entry(hash, |y| y == &x) {
            // element already present, nothing to insert or account for
            Ok(_occupied) => {}
            Err(_absent) => {
                if self.len() == self.capacity() {
                    // need to request more memory
                    let bump_elements = self.capacity().max(16);
                    let bump_size = bump_elements * size_of::<T>();
                    *accounting = (*accounting).checked_add(bump_size).expect("overflow");

                    self.reserve(bump_elements, &hasher);
                }

                // still need to insert the element since first try failed
                self.entry(hash, |y| y == &x, hasher).insert(x);
            }
        }
    }
}
```

Then, migrate every `RawTable` to `HashTable`. Since `RawTable` is used all over the place, I think this should be split into multiple PRs.
Luckily, `HashTable` is already available in hashbrown 0.14.x, so we can do this iteratively before upgrading to hashbrown 0.15.x.
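
To make the accounting rule in `insert_accounted` easier to try out, here is a rough std-only sketch. It is not the proposed implementation: it assumes `std::collections::HashSet<u64>` as a stand-in for `HashTable<T>` so that it builds without the hashbrown crate, and the free function and the `example_usage` helper are hypothetical names used only for illustration. The bump logic (grow by `max(capacity, 16)` elements when the table is full, and add that many element sizes to `accounting`) mirrors the snippet in the issue.

```rust
use std::collections::HashSet;
use std::mem::size_of;

/// Hypothetical stand-in for `insert_accounted`, written against
/// `std::collections::HashSet<u64>` instead of `hashbrown::HashTable<T>`.
fn insert_accounted(set: &mut HashSet<u64>, x: u64, accounting: &mut usize) {
    // Already present: nothing is inserted, so nothing is accounted.
    if set.contains(&x) {
        return;
    }

    if set.len() == set.capacity() {
        // Table is full: account for the requested growth, then reserve it.
        let bump_elements = set.capacity().max(16);
        let bump_size = bump_elements * size_of::<u64>();
        *accounting = accounting.checked_add(bump_size).expect("overflow");

        set.reserve(bump_elements);
    }

    set.insert(x);
}

/// Inserts 1..=16 plus one duplicate and returns `(len, accounted bytes)`.
fn example_usage() -> (usize, usize) {
    let mut set = HashSet::new();
    let mut accounting = 0usize;

    for x in 1..=16u64 {
        insert_accounted(&mut set, x, &mut accounting);
    }
    // Duplicate insert: must not change the length or the accounting.
    insert_accounted(&mut set, 1, &mut accounting);

    (set.len(), accounting)
}
```

In this sketch only the first insert hits the `len == capacity` branch (an empty set has capacity 0), so 16 elements of 8 bytes each are accounted once. As in the original snippet, the accounted value is the requested growth, not the exact number of bytes the table ends up allocating.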
