Re: need help with re module

2007-06-22 Thread Gabriel Genellina
On Sat, 23 Jun 2007 01:12:17 -0300, samwyse <[EMAIL PROTECTED]> wrote:

> Speak for yourself. If I'm writing an HTML syntax checker, I think I'll
> skip BeautifulSoup and use something that gives me the results that I
> expect, not the results that you expect.

Sure! By the way, I'm looking for

Re: need help with re module

2007-06-22 Thread samwyse
Gabriel Genellina wrote:

> On Wed, 20 Jun 2007 17:56:30 -0300, David Wahler <[EMAIL PROTECTED]>
> wrote:
>
>> On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
>> [snip]
>> I agree that BeautifulSoup is probably the best tool for the job, but
>> this doesn't sound right to me. Since th

Re: need help with re module

2007-06-20 Thread Gabriel Genellina
On Wed, 20 Jun 2007 17:56:30 -0300, David Wahler <[EMAIL PROTECTED]> wrote:

> On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
>> On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]>
>> wrote:
>>
>> > i have that string "helloworldok" and i want to
>> > extract all the

Re: need help with re module

2007-06-20 Thread Gabriel Genellina
On Wed, 20 Jun 2007 17:24:27 -0300, John Salerno <[EMAIL PROTECTED]> wrote:

> Gabriel Genellina wrote:
>
>> py> from BeautifulSoup import BeautifulSoup
>> py> chaine = """helloworldok"""
>> py> soup = BeautifulSoup(chaine)
>> py> soup.findAll(text=True)
>> [u'hello', u'world', u'ok']
>
> Wow.

Re: need help with re module

2007-06-20 Thread David Wahler
On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:

> On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]>
> wrote:
>
> > i have that string "helloworldok" and i want to
> > extract all the text , without html tags , the result should be some
> > thing like that : helloworldok

Re: need help with re module

2007-06-20 Thread Matimus
Here is an example:

>>> import re
>>> s = "Helloworldok"
>>> matchtags = re.compile(r"<[^>]+>")
>>> matchtags.findall(s)
['', '', '']
>>> matchtags.sub('',s)
'Helloworldok'

I probably shouldn't have shown you that. It may not work for all HTML, and you should probably be looking at something like BeautifulSoup
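The archive has eaten the markup in the strings above, so here is a minimal sketch of the same regex approach on a made-up input. The tags in `s` are an assumption for illustration, not the original poster's string.

```python
import re

# Hypothetical input: the real markup was lost in the archive,
# so these tags are invented for illustration only.
s = "<b>Hello</b><i>world</i><u>ok</u>"

# A "tag" here is '<', then one or more characters that are not '>', then '>'.
matchtags = re.compile(r"<[^>]+>")

tags = matchtags.findall(s)                    # every tag-like substring
text = matchtags.sub("", s)                    # the string with tags removed
pieces = [p for p in matchtags.split(s) if p]  # the text runs between tags

print(tags)
print(text)
print(pieces)

# Where it goes wrong: a '>' inside an attribute value ends the match early.
broken = matchtags.sub("", '<a title="1 > 0">link</a>')
print(repr(broken))
```

The last call shows why this is fragile: the pattern has no idea about quoted attribute values, comments, or scripts, so it is only reasonable for markup you control.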

Re: need help with re module

2007-06-20 Thread John Salerno
Gabriel Genellina wrote:

> py> from BeautifulSoup import BeautifulSoup
> py> chaine = """helloworldok"""
> py> soup = BeautifulSoup(chaine)
> py> soup.findAll(text=True)
> [u'hello', u'world', u'ok']

Wow. That *is* beautiful. :)

-- 
http://mail.python.org/mailman/listinfo/python-list
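If installing BeautifulSoup is not an option, roughly the same list of text nodes can be collected with the standard library's `html.parser`. This is a sketch of a different technique, not what the session above uses. The class name `TextCollector` and the tags in `chaine` are assumptions, since the original markup did not survive the archive.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: gathers every text node the parser reports."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found between tags.
        self.chunks.append(data)


# Hypothetical input: tags invented for illustration only.
chaine = "<b>hello</b><i>world</i><u>ok</u>"

collector = TextCollector()
collector.feed(chaine)
collector.close()  # flush any text still buffered at end of input

print(collector.chunks)
```

Unlike a regex, the parser knows about quoted attributes and comments, though it is less forgiving of badly broken markup than BeautifulSoup is meant to be.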

Re: need help with re module

2007-06-20 Thread Gabriel Genellina
On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]> wrote:

> i have that string "helloworldok" and i want to
> extract all the text , without html tags , the result should be some
> thing like that : helloworldok
>
> i have tried that :
>
> from re import findall
>
>

Re: need help with re module

2007-06-20 Thread Matimus
On Jun 20, 9:58 am, linuxprog <[EMAIL PROTECTED]> wrote:

> hello
>
> i have that string "helloworldok" and i want to
> extract all the text , without html tags , the result should be some
> thing like that : helloworldok
>
> i have tried that :
>
> from re import findall
>
> chaine

need help with re module

2007-06-20 Thread linuxprog
hello

i have that string "helloworldok" and i want to extract all the text , without html tags , the result should be some thing like that : helloworldok

i have tried that :

from re import findall

chaine = """helloworldok"""

print findall('[a-zA-z][^(<.*>)].+?[a-zA-Z]
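The pattern above is cut off in the archive, so it can't be reproduced whole, but two of its visible pieces probably don't do what was intended. A small sketch, using only the fragments that are visible (the test strings are made up):

```python
import re

# 'A-z' runs from ASCII 65 to 122, so it also covers the six
# punctuation characters that sit between 'Z' and 'a'.
loose = re.compile(r"[a-zA-z]")
strict = re.compile(r"[a-zA-Z]")
extras = [c for c in "[\\]^_`" if loose.fullmatch(c) and not strict.fullmatch(c)]
print(extras)

# Inside [...], the characters ( < . * > ) are plain literals, so this is
# "any one character except those six", not "anything but a tag".
negated = re.compile(r"[^(<.*>)]")
allowed = [c for c in "(<.*>)x" if negated.fullmatch(c)]
print(allowed)
```

So the first class lets stray punctuation through, and the second only ever consumes a single character.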