Joe Smith wrote:
> I've looked at the code a bit, and it seems like there is indeed only one point
> of contact with the rest of the suite, textsearch.cxx, which handles all types
> of text searches (normal, regexp & fuzzy), and calls Regexpr::re_search(), which
> calls re_match2() to run the actual regexp match.
>
> So the structure makes it easy to replace the regexp code in one place.
>
> Unfortunately, the way the functions work does not match well with the Boost RE
> classes, although I'm sure it would be possible with an interface layer.
>
> For example, the Boost engine handles locale-specific issues internally, whereas
> OOo's engine knows almost nothing about character case or multi-character
> sequences. Instead, it preps the text to be searched by running it through a
> filter. I don't understand the i18n & character encoding issues well enough to
> guess what that filter is actually doing or how it should be handled.

Hi Joe,
Hm, then I think a combination of those two approaches might be a winning strategy: LibO uses ICU for all that nifty transliteration stuff and whatnot. I notice that newer Boost versions also optionally support ICU; maybe that already gives us good enough coverage. I'd be tempted to just give it a whirl and add it as an optional, experimental feature for people to play with.

Cheers,

-- Thorsten
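
To make the "interface layer" idea a bit more concrete, here is a rough sketch of what an engine-neutral seam could look like. Everything in it is an assumption for illustration: the names RegexEngine, MatchRange and StdRegexEngine are hypothetical, and std::regex (C++17) is used purely as a stand-in backend so the example compiles on its own. It is not the existing Regexpr code, and a Boost/ICU backend would have to implement the same interface with proper Unicode handling.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <regex>
#include <string>

// Hypothetical result type: half-open [begin, end) offsets into the text.
struct MatchRange {
    std::size_t begin;
    std::size_t end;
};

// Hypothetical engine-neutral interface. The single caller (textsearch.cxx in
// this thread) would depend only on this, not on a concrete regex engine.
class RegexEngine {
public:
    virtual ~RegexEngine() = default;

    // Returns the first match at or after offset `start`, if any.
    virtual std::optional<MatchRange> search(const std::string& text,
                                             std::size_t start) const = 0;
};

// Stand-in backend built on std::regex. An experimental Boost/ICU backend
// would be a second class implementing the same interface.
class StdRegexEngine final : public RegexEngine {
public:
    StdRegexEngine(const std::string& pattern, bool ignoreCase)
        : m_regex(pattern,
                  ignoreCase ? std::regex::ECMAScript | std::regex::icase
                             : std::regex::ECMAScript) {}

    std::optional<MatchRange> search(const std::string& text,
                                     std::size_t start) const override {
        assert(start <= text.size());
        std::smatch m;
        const auto first = text.begin() + static_cast<std::ptrdiff_t>(start);
        if (!std::regex_search(first, text.end(), m, m_regex))
            return std::nullopt;
        // m.position(0) is relative to `first`, so shift it back by `start`.
        const std::size_t begin =
            start + static_cast<std::size_t>(m.position(0));
        return MatchRange{begin, begin + static_cast<std::size_t>(m.length(0))};
    }

private:
    std::regex m_regex;
};
```

With a seam like this, the experimental engine could be selected at build time or at run time without touching the callers. Whether the text still needs to pass through the existing transliteration filter first is exactly the open question from the quoted mail, and this sketch does not answer it.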