On Wed, Jul 18, 2018 at 7:59 PM, MRAB <pyt...@mrabarnett.plus.com> wrote:
> On 2018-07-18 22:40, Larry Martell wrote:
>>
>> On Tue, Jul 17, 2018 at 11:43 AM, Neil Cerutti <ne...@norwich.edu> wrote:
>>>
>>> On 2018-07-16, Larry Martell <larry.mart...@gmail.com> wrote:
>>>>
>>>> I had some code that did this:
>>>>
>>>> meas_regex = '_M\d+_'
>>>> meas_re = re.compile(meas_regex)
>>>>
>>>> if meas_re.search(filename):
>>>>     stuff1()
>>>> else:
>>>>     stuff2()
>>>>
>>>> I then had to change it to this:
>>>>
>>>> if meas_re.search(filename):
>>>>     if 'MeasDisplay' in filename:
>>>>         stuff1a()
>>>>     else:
>>>>         stuff1()
>>>> else:
>>>>     if 'PatternFov' in filename:
>>>>         stuff2a()
>>>>     else:
>>>>         stuff2()
>>>>
>>>> This code needs to process many tens of 1000's of files, and it
>>>> runs often, so it needs to run very fast. Needless to say, my
>>>> change has made it take 2x as long. Can anyone see a way to
>>>> improve that?
>>>
>>>
>>> Can you expand/improve the regex pattern so you don't have rescan
>>> the string to check for the presence of MeasDisplay and
>>> PatternFov? In other words, since you're already using the giant,
>>> Swiss Army sledgehammer of the re module, go ahead and use enough
>>> features to cover your use case.
>>
>>
>> Yeah, that was my first thought, but I haven't been able to come up
>> with a regex that works.
>>
>> There are 4 cases I need to detect:
>>
>> case1 = 'spam_M123_eggs_MeasDisplay_sausage'
>> case2 = 'spam_M123_eggs_sausage_and_spam'
>> case3 = 'spam_spam_spam_PatternFov_eggs_sausage_and_spam'
>> case4 = 'spam_spam_spam_eggs_sausage_and_spam'
>>
>> I thought this regex would work:
>>
>> '(_M\d+_){0,1}.*?(MeasDisplay|PatternFOV){0,1}'
>>
>> And then I could look at the match objects and see which of the 4
>> cases it was. But try as I might, I could not get it to work. Any
>> regex gurus want to tell me what I am doing wrong here?
>>
> The trick to capturing both of the parts when they are both optional is to
> use a lookahead and make it optional:
>
> r'(?=.*?(_M\d+_))?(?=.*?(MeasDisplay|PatternFov))?'
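
For anyone dissecting the pattern above, here is a minimal sketch that runs
both regexes against the four sample filenames. The names ORIGINAL, LOOKAHEAD,
CASES and classify() are made up for illustration, and the string labels stand
in for the stuff1()/stuff1a()/stuff2()/stuff2a() calls. It assumes filenames
only take the four shapes listed in the thread; a name containing both
MeasDisplay and PatternFov would report whichever appears first.

```python
import re

# The first attempt from the thread. Every piece is optional and .*? is lazy,
# so search() is satisfied by an empty match at position 0 and never looks
# further into the string. (It also spells the tag 'PatternFOV', not
# 'PatternFov'.)
ORIGINAL = re.compile(r'(_M\d+_){0,1}.*?(MeasDisplay|PatternFOV){0,1}')

# MRAB's pattern: each lookahead scans ahead from position 0 without
# consuming anything, and the trailing ? lets the match carry on with the
# group unset when the marker is absent.
LOOKAHEAD = re.compile(r'(?=.*?(_M\d+_))?(?=.*?(MeasDisplay|PatternFov))?')

CASES = [
    'spam_M123_eggs_MeasDisplay_sausage',
    'spam_M123_eggs_sausage_and_spam',
    'spam_spam_spam_PatternFov_eggs_sausage_and_spam',
    'spam_spam_spam_eggs_sausage_and_spam',
]


def classify(filename):
    """Return the name of the handler the nested if/else would have picked."""
    # match() always succeeds here because both parts are optional.
    meas, tag = LOOKAHEAD.match(filename).groups()
    if meas is not None:
        return 'stuff1a' if tag == 'MeasDisplay' else 'stuff1'
    return 'stuff2a' if tag == 'PatternFov' else 'stuff2'


if __name__ == '__main__':
    for name in CASES:
        print(name)
        print('  original :', ORIGINAL.search(name).groups())
        print('  lookahead:', LOOKAHEAD.match(name).groups())
        print('  handler  :', classify(name))
```

Whether a single regex pass is actually faster than the two `in` tests is
worth timing on real filenames before committing to it.
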
Wow! Thanks so much. This works perfectly. I don't understand it, but I
will spend some time dissecting it and I will add another tool to my
arsenal.

--
https://mail.python.org/mailman/listinfo/python-list