zhuliquan commented on PR #11455:
URL: https://github.com/apache/datafusion/pull/11455#issuecomment-2248130513
@alamb One idea: we could attach an LRU cache (the key is the regex pattern,
the value is the compiled regex) to the `RegexpLikeFunc` struct.
```rust
// Imports for the new field (assumes the `lru` crate is available).
use lru::LruCache;
use std::num::NonZeroUsize;

#[derive(Debug)]
pub struct RegexpLikeFunc {
    // Compiled regexes, keyed by pattern string.
    lru_cache: LruCache<String, regex::Regex>,
    signature: Signature,
}

impl Default for RegexpLikeFunc {
    fn default() -> Self {
        Self::new()
    }
}

impl RegexpLikeFunc {
    pub fn new() -> Self {
        use DataType::*;
        Self {
            signature: Signature::one_of(
                vec![
                    Exact(vec![Utf8, Utf8]),
                    Exact(vec![LargeUtf8, Utf8]),
                    Exact(vec![Utf8, Utf8, Utf8]),
                    Exact(vec![LargeUtf8, Utf8, Utf8]),
                ],
                Volatility::Immutable,
            ),
            // Keep at most 1024 compiled patterns.
            lru_cache: LruCache::new(NonZeroUsize::new(1024).unwrap()),
        }
    }
}
```
We can use the cache the same way `regexp_is_match_utf8` in arrow-rs uses its pattern map:
https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L50
https://github.com/apache/arrow-rs/blob/af40ea382275dba967bfabc1632fded07d2129b9/arrow-string/src/regexp.rs#L83-L95
This makes full use of each compiled regex, and the approach works for both
the scalar and the array case.
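
A minimal sketch of the lookup-or-compile step, under some assumptions. To stay standard-library only it swaps the `LruCache` for an unbounded `HashMap` (so there is no eviction), and the type parameter `C` stands in for `regex::Regex`. `PatternCache` and `get_or_compile` are hypothetical names, not existing DataFusion or arrow-rs APIs. The `Mutex` is there because the cache would probably be reached through `&self`, while inserting into it needs mutable access.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical cache of compiled patterns, keyed by the pattern string.
/// `C` stands in for `regex::Regex`; a bounded LRU map could replace the
/// unbounded `HashMap` used here.
pub struct PatternCache<C> {
    // Interior mutability so the cache can be filled through `&self`.
    entries: Mutex<HashMap<String, Arc<C>>>,
}

impl<C> PatternCache<C> {
    pub fn new() -> Self {
        Self {
            entries: Mutex::new(HashMap::new()),
        }
    }

    /// Returns the cached value for `pattern`, calling `compile` only on a miss.
    /// A failed compilation is returned to the caller and not cached.
    pub fn get_or_compile<E>(
        &self,
        pattern: &str,
        compile: impl FnOnce(&str) -> Result<C, E>,
    ) -> Result<Arc<C>, E> {
        let mut entries = self.entries.lock().unwrap();
        if let Some(hit) = entries.get(pattern) {
            return Ok(Arc::clone(hit));
        }
        let compiled = Arc::new(compile(pattern)?);
        entries.insert(pattern.to_string(), Arc::clone(&compiled));
        Ok(compiled)
    }

    /// Number of distinct patterns currently cached.
    pub fn len(&self) -> usize {
        self.entries.lock().unwrap().len()
    }
}
```

With the real types, the closure would be something like `|p| regex::Regex::new(p)`. The scalar case would call `get_or_compile` once per batch, and the array case once per row, so repeated patterns are compiled only once.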