On 2017-11-12, The Wanderer <wande...@fastmail.fm> wrote:
>
>> (?m)(\W|^)panda.*str(\W|$)
>
> That would be expected to find only documents containing 'panda'
> followed by 'str'. To also find ones which contain 'str' followed by
> 'pandas' (and add the missing 's' back in), you'd probably want:
>
> (?m)(\W|^)(pandas.*str|str.*pandas)(\W|$)
>
> I have not tested this, but I use similar '(a.*b|b.*a)' regexes on a
> semi-regular basis for searching one of my own text archives.

I tried that, actually, following the same logic, or thought I did
(there might have been a typo somewhere), yet it produced *fewer*
results. Trying the formula again now, it seems to "work" (although the
regex is pretty useless because it matches reams and reams of stuff:
'anda', 'panda', 'pandas', 'expandable', 'str', 'struct', 'instruction',
'castration', etc. are all matched).

This produces only two hits (both from the same file):

(?m)(\W|^)\bpanda\b.*\bstr\b|\bstr\b.*\bpanda\b(\W|$)

I'm not certain how you're supposed to construct the formula so that it
matches only the literal strings (if literal is indeed the term) "panda"
and "str". I also have no idea what the expected results might look
like. To add insult to injury, I know nothing about regexes either. ;-)

> (Also, I'm not sure the '\W' bits are needed, but I don't know the field
> of what's-being-searched-for well enough to be certain about why those
> may have been added.)
>
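
For what it's worth, here is a minimal sketch of the whole-word version, written for Python's re module. That is an assumption: the thread does not say which search tool or regex flavour is in use, and the sample lines below are made up for illustration. It suggests that with '\b' in place, the '(\W|^)' and '(\W|$)' bits are redundant, and that the two orderings need to sit inside one group so the alternation applies to both as a unit.

```python
import re

# Hypothetical sample lines (not from the actual search results).
samples = [
    "expandable struct layout",             # only substrings of longer words
    "the panda ate an instruction manual",  # whole 'panda', but 'str' only inside a word
    "convert the pandas column with str",   # 'pandas', not the whole word 'panda'
    "panda column cast via str accessor",   # both whole words, 'panda' first
    "use str on a panda",                   # both whole words, 'str' first
]

# \b is a zero-width word boundary, so no (\W|^) or (\W|$) is needed.
# The (?:...) group keeps both orderings inside a single alternation.
both_words = re.compile(r"\b(?:panda\b.*\bstr|str\b.*\bpanda)\b")

# Keep only the lines in which both whole words occur, in either order.
hits = [line for line in samples if both_words.search(line)]

for line in hits:
    print(line)
```

Only the last two sample lines are kept. Writing 'pandas?' instead of 'panda' in both branches would accept either spelling, since '?' makes the preceding 's' optional.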