timsaucer commented on PR #17350: URL: https://github.com/apache/datafusion/pull/17350#issuecomment-3239420667
I'm wondering if we can save ourselves some effort here and do this: ```rust pub struct FFI_ScalarUDF { ... /// Internal hash function result pub hash_value: u64, ... } impl From<Arc<ScalarUDF>> for FFI_ScalarUDF { fn from(udf: Arc<ScalarUDF>) -> Self { let name = udf.name().into(); let aliases = udf.aliases().iter().map(|a| a.to_owned().into()).collect(); let volatility = udf.signature().volatility.into(); let short_circuits = udf.short_circuits(); let mut state = DefaultHasher::new(); udf.hash(&mut state); let hash_value = state.finish(); let private_data = Box::new(ScalarUDFPrivateData { udf }); Self { name, aliases, volatility, short_circuits, invoke_with_args: invoke_with_args_fn_wrapper, return_type: return_type_fn_wrapper, return_field_from_args: return_field_from_args_fn_wrapper, coerce_types: coerce_types_fn_wrapper, hash_value, clone: clone_fn_wrapper, release: release_fn_wrapper, private_data: Box::into_raw(private_data) as *mut c_void, } } } impl PartialEq for ForeignScalarUDF { fn eq(&self, other: &Self) -> bool { let Self { name, aliases, udf, signature, } = self; name == &other.name && aliases == &other.aliases && signature == &other.signature && udf.hash_value == other.udf.hash_value } } impl Hash for ForeignScalarUDF { fn hash<H: Hasher>(&self, state: &mut H) { let Self { name, aliases, udf, signature, } = self; name.hash(state); aliases.hash(state); // This appears to be a hash of the hash value, but if you review how // u64 is hashed, it is just pushing the byte values into state. udf.hash_value.hash(state); signature.hash(state); } } ``` And further, I wonder if we even need `fn hash()` to include the name, aliases, and signature at all. Those all come from the FFI calls, so wouldn't they be included in the internal call to `hash`? It seems like we have an opportunity here to have a simpler path, but I'm not 100% confident I haven't overlooked some need to call `hash()` function every time. I don't think we expect these functions to have a state that is changing and thus hash is updating. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org