Gus writes:
> Loren Wilton wrote:
> >> rule work, but the ones with the "\|\\\|" (matching "|\|") all choke:
> >>
> >> "fro|\""|" {RET("__XM_Sft_Ms_Fp_L33T");}
> >>
> >> It produces the above output, instead of "fro|\|". Not sure what it's
> >> doing. Any solution other than commenting out the offending rules?
> >
> > What version of re2c? There are reports that versions before version 12
> > "have bugs".
> > The bugs aren't specified, but I assume they could include miscompiling
> > the regexes.
>
> It's v0.12. I don't think it's re2c anyway--re2c is choking on it because
> sa-compile is producing the above .re file. That line of the .re file
> should start with "fro|\|", and if I manually edit it, it compiles fine.
>
> It's consistently interpreting escaped backslashes (\\) as either \" or
> \"", which screws re2c up because that creates an incorrect number of "'s.
> I even tried escaping it as hex "\x52" (or whatever the right number
> was--don't have my ascii table handy anymore :).

yep, this sounds like a bug in the code which extracts the "base strings"
from the full regexp: the BodyRuleBaseExtractor plugin, iirc. could you
open a bugzilla ticket?

--j.
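
For anyone attaching a reproduction to the ticket, the symptom described above (an odd number of unescaped double quotes on the generated line) can be checked mechanically. The following is only a rough sketch in Python, not part of sa-compile or re2c; the helper names are hypothetical, and it assumes that a backslash in the .re file escapes the single character after it.

```python
def count_unescaped_quotes(line: str) -> int:
    """Count double quotes that are not preceded by an escaping backslash.

    Assumption: a backslash escapes exactly the next character.
    """
    count = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == '"':
            count += 1
        i += 1
    return count


def quotes_balanced(line: str) -> bool:
    """Hypothetical check: every opening quote has a closing quote."""
    return count_unescaped_quotes(line) % 2 == 0


# The line sa-compile produced, and the manually edited line from the thread.
bad = r'"fro|\""|" {RET("__XM_Sft_Ms_Fp_L33T");}'
good = r'"fro|\|" {RET("__XM_Sft_Ms_Fp_L33T");}'

for label, line in (("generated", bad), ("hand-edited", good)):
    print(label, count_unescaped_quotes(line), quotes_balanced(line))
```

Running a check like this over the whole generated .re file should point at the lines worth quoting in the bug report. It only detects the quoting imbalance; it says nothing about where in the base-string extraction the backslash is being mangled.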