> Meanwhile, I suggest refactoring the code to pass the regex around in sensitive areas. :(
I wish I could (or maybe more clearly, had the ability!). The Elixir code is all generated from this delightful set of regexes and rules. It works surprisingly well, but I suspect a performance hit on OTP 28 (benchmarking next week, after I coerce benchee to compile on OTP 28).

https://github.com/unicode-org/cldr/blob/main/common/segments/root.xml

On Sunday, March 30, 2025 at 1:17:42 AM UTC+11 Kip wrote:

> > and the latter may have runtime impact.
>
> Definitely a concern for the "simple" case of a use-once regex. I was
> thinking of something like (pseudo code)
>
>     var = :erlang.iolist_to_binary(["__regex_", :erlang.integer_to_list(abs(:erlang.monotonic_time(:nanosecond)))])
>     quote do
>       var = if var, do: var, else: Regex.compile!(binary_or_tuple, options)
>     end
>
> Too much risk of performance impact? I think the BEAM optimises out the
> binding in positive cases like this?
>
> > Instead, we are discussing adding the optimization we did before
> > directly to Erlang/OTP
>
> Yep, I'm anxiously awaiting a good outcome from that conversation :-)
> Thanks for encouraging the OTP team on this.
>
> On Sunday, March 30, 2025 at 12:54:32 AM UTC+11 José Valim wrote:
>
>> The memoization will only be useful if we either do variable hoisting,
>> which is inherently limited to the current function, or we store it in
>> persistent term. The former will require meaningful changes in the compiler
>> and the latter may have runtime impact.
>>
>> Instead, we are discussing adding the optimization we did before directly
>> to Erlang/OTP. Meanwhile, I suggest refactoring the code to pass the regex
>> around in sensitive areas. :(
>>
>> José Valim
>> https://dashbit.co/
>>
>> On Sat, Mar 29, 2025 at 14:29 Kip <kipc...@gmail.com> wrote:
>>
>>> TLDR;
>>> Memoize (bind to a variable) the result of `Regex.compile!/2` on OTP 28
>>> so that it is only compiled once.
>>>
>>> Background
>>>
>>> Since in OTP 28 it's not possible to unquote a regex (~r/..../) into
>>> code, the implementation of sigil_r on OTP 28 has to compile the regex at
>>> runtime. In code which iterates over text using regex (for example the
>>> Unicode break algorithm, Unicode transforms and so on) this could lead to
>>> a performance penalty.
>>>
>>> Proposal
>>>
>>> Bind the result of Regex.compile!/2 to a variable called something like
>>> `__regex_#{hash_of_regex_string}` if it's successful. If the variable is
>>> bound, use it directly without compilation. Do performance testing to
>>> confirm that there is a benefit to memoizing.
>>>
>>> I am fine to do this work if the proposal has merit.
>>>
>>> --
>>> You received this message because you are subscribed to the Google
>>> Groups "elixir-lang-core" group.
>>> To unsubscribe from this group and stop receiving emails from it, send
>>> an email to elixir-lang-co...@googlegroups.com.
>>> To view this discussion visit
>>> https://groups.google.com/d/msgid/elixir-lang-core/afa20f7d-372c-4307-9884-7a1a931a927cn%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/elixir-lang-core/ff5967fa-206c-4d03-82d3-4a95abafa3a4n%40googlegroups.com.
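
For anyone following along, here is a rough sketch of the two workarounds mentioned in this thread: compiling once and passing the regex around, and storing the compiled regex in persistent term. The module `RegexReuse` and its functions `split_all/2` and `cached_regex/2` are hypothetical names made up for illustration; only `Regex.compile!/2`, `Regex.split/3`, `:persistent_term.get/2` and `:persistent_term.put/2` are existing calls. It is not what Elixir or OTP does internally, just one way the idea might look in user code.

```elixir
defmodule RegexReuse do
  # Hypothetical module for illustration only; not part of Elixir or any library.

  # Option 1: compile once at the entry point, then pass the compiled
  # regex down to the code that runs in the hot loop.
  def split_all(lines, pattern) when is_binary(pattern) do
    regex = Regex.compile!(pattern, [:unicode])
    Enum.map(lines, &split_line(&1, regex))
  end

  # The helper receives the already compiled regex instead of building its own.
  defp split_line(line, %Regex{} = regex) do
    Regex.split(regex, line, trim: true)
  end

  # Option 2: memoize the compiled regex in :persistent_term, keyed by the
  # pattern source and options. The key shape is an assumption of this sketch.
  def cached_regex(pattern, opts \\ []) when is_binary(pattern) do
    key = {__MODULE__, pattern, opts}

    case :persistent_term.get(key, nil) do
      nil ->
        # First use: compile and store. Concurrent first calls may both
        # compile, which is harmless here because the result is equivalent.
        regex = Regex.compile!(pattern, opts)
        :persistent_term.put(key, regex)
        regex

      regex ->
        regex
    end
  end
end
```

Usage would be something like `RegexReuse.split_all(["a b  c", "d e"], "\\s+")` for the first option, or `RegexReuse.cached_regex("\\s+", [:unicode])` wherever a regex is needed in the second. Reads from persistent term are cheap, but writes are comparatively expensive, which is presumably part of the runtime impact mentioned above, so the second option only seems reasonable for a small, fixed set of patterns.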