Thank you very much, it works. I guess I didn't read the documentation carefully enough.

Arjen

On Sep 17, 3:22 pm, Jason Drew <[EMAIL PROTECTED]> wrote:
> You just need a one-character addition to your regex:
>
> regex = re.compile(r'<organisatie.*?</organisatie>', re.S)
>
> Note, there is now a question mark (?) after the .*
>
> By default, regular expressions are "greedy" and will grab as much
> text as possible when making a match. So your original expression was
> grabbing everything between the first opening tag and the last closing
> tag. The question mark says, don't be greedy, and you get the
> behaviour you need.
>
> This is covered in the documentation for the re module:
> http://docs.python.org/lib/module-re.html
>
> Jason
>
> On Sep 17, 9:00 am, duikboot <[EMAIL PROTECTED]> wrote:
> > Hello,
> >
> > I am trying to extract a list of strings from a text. I have been
> > looking at it for hours now, and googling didn't help either.
> > Could you please help me?
> >
> > >>> s = """\n<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>\n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>"""
> > >>> regex = re.compile(r'<organisatie.*</organisatie>', re.S)
> > >>> L = regex.findall(s)
> > >>> print L
> > ['<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>\n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>']
> >
> > I expected:
> > [('<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>\n<organisatie>), (<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>')]
> >
> > I must be missing something very obvious.
> >
> > Greetings Arjen
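
For anyone finding this thread later, here is a minimal sketch of the greedy versus non-greedy difference described above. It assumes Python 3 (the session quoted above used the Python 2 print statement), and the names greedy and lazy are only illustrative.

```python
import re

# Sample text from the question: two <organisatie> blocks.
s = (
    "\n<organisatie>\n<Profiel_Id>28996</Profiel_Id>\n</organisatie>"
    "\n<organisatie>\n<Profiel_Id>28997</Profiel_Id>\n</organisatie>"
)

# re.S (DOTALL) lets "." match newlines as well.
# Greedy: ".*" runs on to the LAST closing tag.
greedy = re.compile(r'<organisatie.*</organisatie>', re.S)
# Non-greedy: ".*?" stops at the FIRST closing tag it can reach.
lazy = re.compile(r'<organisatie.*?</organisatie>', re.S)

greedy_matches = greedy.findall(s)
lazy_matches = lazy.findall(s)

print(len(greedy_matches))  # 1: one match spanning both blocks
print(len(lazy_matches))    # 2: one match per block
for match in lazy_matches:
    print(repr(match))
```

If the input is real XML rather than a fragment like this, a proper parser is probably a safer choice than a regex, but for a quick extraction the non-greedy pattern does the job.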