> Meanwhile, I suggest refactoring the code to pass the regex around in
> sensitive areas. :(

I wish I could (or, more precisely, I wish I had the ability!). The Elixir
code is all generated from this delightful set of regexes and rules. It
works surprisingly well, but I suspect a performance hit on OTP 28
(benchmarking next week, after I coerce benchee to compile on OTP 28).

https://github.com/unicode-org/cldr/blob/main/common/segments/root.xml
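
For readers following along, here is a rough sketch of the two workarounds
discussed in this thread: passing an already-compiled regex into the hot
path, and memoizing the compiled regex in persistent term. The module names
`RegexCache` and `Splitter` are hypothetical and not part of Elixir or OTP.
Whether the persistent term lookup is cheap enough for generated code is
exactly the open question, so treat this as an illustration to benchmark,
not a recommendation.

```elixir
defmodule RegexCache do
  # Hypothetical helper: compiles a regex at most once per
  # {source, options} pair and keeps the result in :persistent_term.
  def fetch!(source, options \\ []) when is_binary(source) and is_list(options) do
    key = {__MODULE__, source, options}

    case :persistent_term.get(key, nil) do
      nil ->
        regex = Regex.compile!(source, options)
        # Adding a key is comparatively expensive; later reads are cheap.
        :persistent_term.put(key, regex)
        regex

      regex ->
        regex
    end
  end
end

defmodule Splitter do
  # "Passing the regex around": the caller compiles once and hands the
  # compiled regex to the code that iterates over the text.
  def split_all(lines, regex) do
    Enum.map(lines, &Regex.split(regex, &1))
  end
end

whitespace = RegexCache.fetch!("\\s+")
Splitter.split_all(["a b", "c  d"], whitespace)
```

The cache key includes the options so that the same source compiled with
different flags is stored separately.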



On Sunday, March 30, 2025 at 1:17:42 AM UTC+11 Kip wrote:

> > and the latter may have runtime impact.
> Definitely a concern for the "simple" case of a use-once regex. I was
> thinking of something like (pseudo code):
>     var = :erlang.iolist_to_binary(["__regex_", :erlang.integer_to_list(abs(:erlang.monotonic_time(:nanosecond)))])
>     quote do
>       var = if var, do: var, else: Regex.compile!(binary_or_tuple, options)
>     end
> Too much risk of performance impact? I think the BEAM optimises out the
> binding in positive cases like this?
>
> > Instead, we are discussing adding the optimization we did before
> > directly to Erlang/OTP
> Yep, I'm anxiously awaiting a good outcome from that conversation :-) 
> Thanks for encouraging the OTP team on this.
>
> On Sunday, March 30, 2025 at 12:54:32 AM UTC+11 José Valim wrote:
>
>> The memoization will only be useful if we either do variable hoisting, 
>> which are inherently limited to the current function, or we store it in 
>> persistent term. The former will require meaningful changes in the compiler 
>> and the latter may have runtime impact.
>>
>> Instead, we are discussing adding the optimization we did before directly 
>> to Erlang/OTP. Meanwhile, I suggest refactoring the code to pass the regex 
>> around in sensitive areas. :(
>>
>>
>>
>> *José Valim*
>> https://dashbit.co/
>>
>>
>> On Sat, Mar 29, 2025 at 14:29 Kip <kipc...@gmail.com> wrote:
>>
>>> TL;DR
>>> Memoize (bind to a variable) the result of `Regex.compile!/2` on OTP 28 
>>> so that it is only compiled once.
>>>
>>> Background
>>>
>>> Since in OTP 28 it's not possible to unquote a regex (~r/..../) into 
>>> code, the implementation of sigil_r on OTP 28 has to compile the regex at 
>>> runtime. In code which iterates over text using regex (for example Unicode 
>>> break algorithm, Unicode transforms and so on) this could lead to a 
>>> performance penalty. 
>>>
>>> Proposal
>>>
>>> Bind the result of Regex.compile!/2 to a variable called something like
>>> `__regex_#{hash_of_regex_string}` if it's successful. If the variable is
>>> bound, use it directly without compilation. Do performance testing to
>>> confirm that there is benefit to memoizing.
>>>
>>> I am fine to do this work if the proposal has merit.
>>>
>>>
>>> -- 
>>> You received this message because you are subscribed to the Google 
>>> Groups "elixir-lang-core" group.
>>> To unsubscribe from this group and stop receiving emails from it, send 
>>> an email to elixir-lang-co...@googlegroups.com.
>>> To view this discussion visit 
>>> https://groups.google.com/d/msgid/elixir-lang-core/afa20f7d-372c-4307-9884-7a1a931a927cn%40googlegroups.com
>>>  
>>> <https://groups.google.com/d/msgid/elixir-lang-core/afa20f7d-372c-4307-9884-7a1a931a927cn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>> .
>>>
>>
