They don't have to be *arbitrarily* long. They actually can be quite short. 2^n is large, even for relatively small n. Again, I refer to the graphs in the blog post above. We are talking about input length of less than 100 characters to get CPU churn of *minutes*. That's not a huge limit for a search-box.
Yes, the compile-time is negligible - but it's negligible *because it's linear in the input size*. And match-times are only small, because RE2 made the conscious decision (for *exactly* the reasons I outline here) to forego features that were very common and even expected of regexp-engines, so it could guarantee a linear match time. That was novel at the time - almost no engine did that and the most commonly used most certainly didn't. When people in this thread say "regular expressions are great because they give linear match time on the matched text size", they forget that most "regular expression" engines don't actually give you that. They only give that, if you specifically design with that in mind. I also explained why "it is trivial to limit the input size to something a user could input" is simply wrong, in practice. If you have exponential growth, the difference between 20 and 30 characters (again, we are talking search-terms here, it is not at all unreasonable to look for a 30 character identifier) is three orders of magnitude. If the curve is too steep, you can't reasonably separate between sensible input lengths from attacks. If you get a nice, linear curve, that becomes trivial. Just multiply any sensible limit you could think of by 100 and you're still going to end up well within spec. And even if we ignore the DoS-potential - just considering what that does to your load-patterns, that some user's queries will be quasi free, while others will be a million times or more as expensive is a good argument to pay attention to this. Just from an operational POV. Seriously - you should *absolutely* consider the algorithmic complexity of functions you are calling with user-provided input. On Mon, Jun 8, 2020 at 11:22 PM Robert Engels <reng...@ix.netcom.com> wrote: > You can’t say it arbitrary regex inputted by a user then claim they can be > arbitrarily long - these are incompatible. For any reasonable user input > the compile time is negligible, and it is trivial to limit the input size > to something a user could input. > > On Jun 8, 2020, at 2:27 PM, Axel Wagner <axel.wagner...@googlemail.com> > wrote: > > > And that exact logic is what wouldn't work, if compilation or matching > would be exponential, for example. > Being able to say "as long as the inputs don't get astronomically long, > the running time of the algorithm will be reasonable" is *exactly* the > conclusion that a small complexity class gives you. > > On Mon, Jun 8, 2020 at 7:17 PM Robert Engels <reng...@ix.netcom.com> > wrote: > >> If the input regex string is bounded by a typical user input box - the >> compilation time will be negligible when compared to the runtimes. >> >> On Jun 8, 2020, at 11:07 AM, Axel Wagner <axel.wagner...@googlemail.com> >> wrote: >> >> >> Oh, true, I overlooked that detail. But FTR, they clarify their question >> subsequently and specifically ask whether compilation is linear in >> expression size. No Must* in that clarification. >> >> On Mon, Jun 8, 2020 at 6:05 PM Thomas Bushnell, BSG <tbushn...@google.com> >> wrote: >> >>> On Mon, Jun 8, 2020 at 12:02 PM Axel Wagner < >>> axel.wagner...@googlemail.com> wrote: >>> >>>> On Mon, Jun 8, 2020 at 5:41 PM Thomas Bushnell, BSG < >>>> tbushn...@google.com> wrote: >>>> >>>>> The OP was about MustCompile, so I think it's clear they are not using >>>>> patterns passed in by external requests. >>>>> >>>> >>>> I don't think that's clear at all. How do you assume patterns from >>>> external sources can be matched, if not by compiling them? >>>> >>> >>> "MustCompile is like Compile but panics if the expression cannot be >>> parsed." >>> >>> Thomas >>> >> -- You received this message because you are subscribed to the Google Groups "golang-nuts" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-nuts+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/golang-nuts/CAEkBMfHxYgCfPtLZoZFppmsZYmYZMts9%3Dk4Q4pKv7ygAiR5SnA%40mail.gmail.com.