Re: regex pattern to extract repeating groups

MRAB Mon, 27 Aug 2018 17:56:36 -0700

On 2018-08-28 00:58, Malcolm wrote:

On 28/08/2018 7:09 AM, John Pote wrote:
On 26/08/2018 00:55, Malcolm wrote:
I am trying to understand why regex is not extracting all of the characters between two delimiters.
The complete string is the xmp IFD data extracted from a .CR2 image file.
I do have a workaround, but it's messy and possibly not future proof.
Do you mean future proof your workaround, or that Canon's .CR2 raw image files might change? I guess .CR2s won't change, but Canon have brought out the new .CR3 raw image file, for which I needed to upgrade my photo editing suite (at least I didn't, but used their tool to convert .CR3s from the camera to the digital negative format, which many photo editors can handle). Can send you a sample .CR3 if you want to compare.
Regards,
John

Thank you.

Some background
The application is for personal use. While I'm familiar with python
generally (and thanks to all who post code and answer questions), this
is the first time I have used structs to read a binary file, xml parsers
to parse some of the IFD contents and re.

First
I have now discovered that when I print the return value of re.search, the
match='...' field truncates the matched characters. To see/get all found
characters I need to use the span as indexes into the original string. I'm
not sure if this is mentioned in the re documentation. But all the
samples I've seen on the web use only small strings. This was the cause
of my question.

re.search returns a "match object". When you print it, you get what is basically a summary. If you want the matched portion of the string, use the match object's .group method:


[snip]

re_pattern = r'( *<dc:.*</dc:)'
x = re.search(re_pattern, data, re.DOTALL)
print(x.group())
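
To make the difference concrete, here is a minimal sketch comparing the printed summary with the full match. The `data` string is a made-up stand-in, since the real XMP data was snipped from the thread; the pattern is the one above. The cut-off in the summary is how CPython formats a match object's repr, so treat the exact length as an implementation detail.

```python
import re

# Hypothetical stand-in for the XMP data; the real string was snipped above.
data = (
    "<rdf:Description>\n"
    "  <dc:creator>Example Author</dc:creator>\n"
    "  <dc:rights>Example rights statement, long enough to show truncation</dc:rights>\n"
    "</rdf:Description>"
)

re_pattern = r'( *<dc:.*</dc:)'
x = re.search(re_pattern, data, re.DOTALL)

# repr() of a match object is only a summary; a long match is cut short
# in its match='...' field.
summary = repr(x)

# Either of these gives the complete matched text.
full_text = x.group()
start, end = x.span()
sliced = data[start:end]

print(summary)
print(len(full_text), full_text == sliced)
```

Slicing the original string with the span works too, as you found, but .group() returns the same text without the extra step.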
--
https://mail.python.org/mailman/listinfo/python-list
