On Sat, 23 Jun 2007 01:12:17 -0300, samwyse <[EMAIL PROTECTED]> wrote:
> Speak for yourself. If I'm writing an HTML syntax checker, I think I'll
> skip BeautifulSoup and use something that gives me the results that I
> expect, not the results that you expect.
Sure! By the way, I'm looking for
Gabriel Genellina wrote:
> On Wed, 20 Jun 2007 17:56:30 -0300, David Wahler <[EMAIL PROTECTED]>
> wrote:
>
>> On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
>>
[snip]
>> I agree that BeautifulSoup is probably the best tool for the job, but
>> this doesn't sound right to me. Since th
On Wed, 20 Jun 2007 17:56:30 -0300, David Wahler <[EMAIL PROTECTED]>
wrote:
> On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
>> On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]>
>> wrote:
>>
>> > i have that string "helloworldok" and i want to
>> > extract all the
On Wed, 20 Jun 2007 17:24:27 -0300, John Salerno
<[EMAIL PROTECTED]> wrote:
> Gabriel Genellina wrote:
>
>> py> from BeautifulSoup import BeautifulSoup
>> py> chaine = """helloworldok"""
>> py> soup = BeautifulSoup(chaine)
>> py> soup.findAll(text=True)
>> [u'hello', u'world', u'ok']
>
> Wow.
On 6/20/07, Gabriel Genellina <[EMAIL PROTECTED]> wrote:
> On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]>
> wrote:
>
> > i have that string "helloworldok" and i want to
> > extract all the text , without html tags , the result should be some
> > thing like that : helloworldok
Here is an example:
>>> import re
>>> s = "Helloworldok"
>>> matchtags = re.compile(r"<[^>]+>")
>>> matchtags.findall(s)
['', '', '']
>>> matchtags.sub('',s)
'Helloworldok'
I probably shouldn't have shown you that. It may not work for all
HTML, and you should probably be looking at something like
BeautifulSoup
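
For reference, a minimal sketch of the regex approach above as a standalone script. The markup in the input strings is hypothetical (the tags of the original string did not survive in the archive), and the second string is an assumed example of the kind of HTML where the pattern falls over:

```python
import re

# Hypothetical input: the original tags are not recoverable from the archive,
# so this markup is assumed for illustration only.
s = "<p>Hello</p><b>world</b><i>ok</i>"

# Anything that looks like a tag: '<', one or more non-'>' characters, '>'.
matchtags = re.compile(r"<[^>]+>")

tags = matchtags.findall(s)   # every tag-like substring
text = matchtags.sub("", s)   # the string with those substrings removed

print(tags)
print(text)

# Assumed failure case: a '>' inside an attribute value ends the match early,
# so part of the attribute leaks into the "text".
broken = '<a title="a>b">x</a>'
leaked = matchtags.sub("", broken)
print(leaked)
```

The second case is why a real parser is usually the safer choice for arbitrary HTML.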
Gabriel Genellina wrote:
> py> from BeautifulSoup import BeautifulSoup
> py> chaine = """helloworldok"""
> py> soup = BeautifulSoup(chaine)
> py> soup.findAll(text=True)
> [u'hello', u'world', u'ok']
Wow. That *is* beautiful. :)
--
http://mail.python.org/mailman/listinfo/python-list
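
If installing BeautifulSoup is not an option, roughly the same text extraction can be sketched with the standard library's html.parser. This is a different technique from the BeautifulSoup session quoted above, not a reimplementation of it. The class name TextCollector and the markup in chaine are assumptions made for illustration (the original tags were lost in the archive):

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical helper: collects every text node it is fed."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for each run of text between tags.
        self.chunks.append(data)


# Assumed markup, for illustration only.
chaine = "<p>hello</p><b>world</b><i>ok</i>"

parser = TextCollector()
parser.feed(chaine)
parser.close()  # flush any text still buffered at the end of the input

print(parser.chunks)
print("".join(parser.chunks))
```

Joining the chunks gives the single string the original poster asked for.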
On Wed, 20 Jun 2007 13:58:34 -0300, linuxprog <[EMAIL PROTECTED]>
wrote:
> i have that string "helloworldok" and i want to
> extract all the text , without html tags , the result should be some
> thing like that : helloworldok
>
> i have tried that :
>
> from re import findall
>
>
On Jun 20, 9:58 am, linuxprog <[EMAIL PROTECTED]> wrote:
> hello
>
> i have that string "helloworldok" and i want to
> extract all the text , without html tags , the result should be some
> thing like that : helloworldok
>
> i have tried that :
>
> from re import findall
>
> chaine
hello
i have that string "helloworldok" and i want to
extract all the text , without html tags , the result should be some
thing like that : helloworldok
i have tried that :
from re import findall
chaine = """helloworldok"""
print findall('[a-zA-z][^(<.*>)].+?[a-zA-Z]