Sorry for the noise. You're totally right, of course: memoization can only work within the context of a single function (as written). Persistent term has the issues you mentioned. And even in the cases I'm looking to improve, the code structure won't benefit enough.
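For anyone landing on this thread later, here is a minimal sketch of the "pass the regex around" refactoring suggested further down. The module and function names (RegexPassing, count_words, count_in_line) are made up for illustration and the pattern is deliberately trivial. It only shows the shape of the change: compile once at the entry point, then hand the compiled regex to the helpers instead of writing ~r/.../ inside them.

```elixir
defmodule RegexPassing do
  # Hypothetical example module, not taken from any real code base.

  # Compile the regex once, at the entry point of the hot path ...
  def count_words(lines) when is_list(lines) do
    word = Regex.compile!("[a-z]+")

    Enum.reduce(lines, 0, fn line, acc ->
      acc + count_in_line(line, word)
    end)
  end

  # ... and pass the compiled %Regex{} down, rather than using a
  # ~r sigil here that might be recompiled on every call.
  defp count_in_line(line, %Regex{} = word) do
    word
    |> Regex.scan(line)
    |> length()
  end
end
```

Whether this actually helps depends on how often the helper is called and on what OTP ends up doing, so it is worth benchmarking before and after.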
I hope the OTP team can optimise the loading of regex in a future release. Case closed.

On Sunday, March 30, 2025 at 1:26:22 AM UTC+11 Kip wrote:

> > Meanwhile, I suggest refactoring the code to pass the regex around in
> > sensitive areas. :(
>
> I wish I could (or maybe more clearly, had the ability!). The Elixir code
> is all generated from this delightful set of regex and rules. It works
> surprisingly well but I suspect a performance hit on OTP28 (benchmarking
> next week after I coerce benchee to compile on OTP 28).
>
> https://github.com/unicode-org/cldr/blob/main/common/segments/root.xml
>
> On Sunday, March 30, 2025 at 1:17:42 AM UTC+11 Kip wrote:
>
>> > and the latter may have runtime impact.
>>
>> Definitely a concern for the "simple" case of a use-once regex. I was
>> thinking of something like (pseudo code)
>>
>>     var = :erlang.iolist_to_binary(["__regex_", :erlang.integer_to_list(abs(:erlang.monotonic_time(:nanosecond)))])
>>     quote do
>>       var = if var, do: var, else: Regex.compile!(binary_or_tuple, options)
>>     end
>>
>> Too much risk of performance impact? I think the BEAM optimises out the
>> binding in positive cases like this?
>>
>> > Instead, we are discussing adding the optimization we did before
>> > directly to Erlang/OTP
>>
>> Yep, I'm anxiously awaiting a good outcome from that conversation :-)
>> Thanks for encouraging the OTP team on this.
>>
>> On Sunday, March 30, 2025 at 12:54:32 AM UTC+11 José Valim wrote:
>>
>>> The memoization will only be useful if we either do variable hoisting,
>>> which is inherently limited to the current function, or we store it in
>>> persistent term. The former will require meaningful changes in the
>>> compiler and the latter may have runtime impact.
>>>
>>> Instead, we are discussing adding the optimization we did before
>>> directly to Erlang/OTP. Meanwhile, I suggest refactoring the code to
>>> pass the regex around in sensitive areas. :(
>>>
>>> *José Valim*
>>> https://dashbit.co/
>>>
>>> On Sat, Mar 29, 2025 at 14:29 Kip <kipc...@gmail.com> wrote:
>>>
>>>> TLDR;
>>>> Memoize (bind to a variable) the result of `Regex.compile!/2` on OTP 28
>>>> so that it is only compiled once.
>>>>
>>>> Background
>>>>
>>>> Since in OTP 28 it's not possible to unquote a regex (~r/..../) into
>>>> code, the implementation of sigil_r on OTP 28 has to compile the regex
>>>> at runtime. In code which iterates over text using regex (for example
>>>> Unicode break algorithm, Unicode transforms and so on) this could lead
>>>> to a performance penalty.
>>>>
>>>> Proposal
>>>>
>>>> Bind the result of Regex.compile!/2 to a variable called something like
>>>> `__regex_#{hash_of_regex_string}` if it's successful. If the variable is
>>>> bound, use it directly without compilation. Do performance testing to
>>>> confirm that there is benefit to memoizing.
>>>>
>>>> I am fine to do this work if the proposal has merit.
>>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "elixir-lang-core" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to elixir-lang-co...@googlegroups.com.
>>>> To view this discussion visit
>>>> https://groups.google.com/d/msgid/elixir-lang-core/afa20f7d-372c-4307-9884-7a1a931a927cn%40googlegroups.com

--
You received this message because you are subscribed to the Google Groups "elixir-lang-core" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elixir-lang-core+unsubscr...@googlegroups.com.
To view this discussion visit https://groups.google.com/d/msgid/elixir-lang-core/630b4d75-1a16-43a7-ac24-c0e2f6b2ba59n%40googlegroups.com.