erratic-pattern commented on code in PR #10473:
URL: https://github.com/apache/datafusion/pull/10473#discussion_r1604294311


##########
datafusion/optimizer/src/common_subexpr_eliminate.rs:
##########
@@ -48,7 +51,42 @@ use indexmap::IndexMap;
 /// Since an identifier is likely to be copied many times, it is better that 
an identifier
 /// is small or "copy". otherwise some kinds of reference count is needed. 
String
 /// description here is not such a good choose.
-type Identifier = String;
+
+#[derive(Clone, Copy, Debug, Eq, PartialEq)]
+struct Identifier<'n> {
+    hash: u64,
+    expr: &'n Expr,
+}
+
+const SEED: RandomState = RandomState::with_seeds(0, 0, 0, 0);

Review Comment:
   do we need to worry about DOS here? docs for 
[`with_seeds`](https://docs.rs/ahash/latest/ahash/random_state/struct.RandomState.html#method.with_seeds)
 states that one of the number should be random if you need DOS resistance.
   
   Some alternatives:
   * use `RandomState::new` to automatically create seed from a random source
   * change `Identifier::new` to take a `Hasher` as a second argument. Then the 
`Hasher` can be instantiated on the  `CommonSubexprEliminate` struct.
   
   With second option you can use ~`DefaultHasher::new`~`RandomState::new` from 
the standard library. Unless we want to use specifically `ahash` for some 
reason that I don't understand.
   
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to