Michiel Overtoom wrote:
> 
> Alex wrote...
>>
>>Okay, here's the general idea of the html I have to work with:
>>
>><div>
>>   noun
>>   <table class='luna'>
>>   <table class='luna'>
>>   <table class='luna'>
>>   <table class='luna'>
>>   verb
>>   <table class='luna'>
>>   <table class='luna'>
>>   <table class='luna'>
>></div>
>>
>>Okay, I left off some stuff. 
> 
> I wish you hadn't, or had at least provided a URL where I can get the page
> which you are trying to parse.  Now I don't have a valid test case to
> tinker with.  And maybe you can also show the code which you have already
> come up with.
> 
> 
>> I can easily get the tables, but it is the spans that I am having trouble
>> with.
> 
> I can't see any SPAN tags in the example you provided.
> 
> Greetings,
> 
> -- 
> "The ability of the OSS process to collect and harness
> the collective IQ of thousands of individuals across
> the Internet is simply amazing." - Vinod Vallopillil
> http://www.catb.org/~esr/halloween/halloween4.html
> 
> --
> http://mail.python.org/mailman/listinfo/python-list
> 
> 

Oh, sorry. I wrote the span tags, but they didn't show up. They were
around the noun. Here is the code I have to get the definitions alone:

import urllib
from BeautifulSoup import BeautifulSoup

class defWord:
    def __init__(self, word):
        self.word = word

        def get_defs(term):
            # Fetch the dictionary.com results page for the term and parse it.
            url = 'http://dictionary.reference.com/search?q=%s' % term
            soup = BeautifulSoup(urllib.urlopen(url))

            # Each definition is taken from the last <td> of a 'luna-Ent' table.
            for tabs in soup.findAll('table', {'class': 'luna-Ent'}):
                yield tabs.findAll('td')[-1].contents[0].string

        self.mainList = list(get_defs(self.word))

There's a bit more to it, but it doesn't matter here. As you can see, I am
using dictionary.com as the website. If you look at the HTML, the <span> tags
are where the type of the word is, and that is what I need, in order.
Alternatively, if I can figure out how many <table> tags are in between each
pair of <span> tags, that would work too.
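
To make the question concrete, here is a rough sketch of the "count the
tables between each span" idea. It is not the BeautifulSoup code above: it
uses the standard library's html.parser (Python 3) so it runs without extra
packages. The TableCounter class, the count_tables_per_type function and the
SAMPLE markup are all made up for illustration, and it assumes the word type
sits in a <span> outside the tables and that every definition is a
<table class="luna-Ent">; the real page may well differ.

```python
from html.parser import HTMLParser


class TableCounter(HTMLParser):
    """Hypothetical helper: count definition tables after each <span> label."""

    def __init__(self, table_class="luna-Ent"):
        super().__init__()
        self.table_class = table_class
        self.groups = []          # [label, table_count] pairs, in page order
        self._in_span = False     # True while inside a top-level <span>
        self._table_depth = 0     # > 0 while inside any <table>

    def handle_starttag(self, tag, attrs):
        if tag == "span" and self._table_depth == 0:
            # A span outside any table starts a new word-type group.
            self._in_span = True
            self.groups.append(["", 0])
        elif tag == "table":
            self._table_depth += 1
            classes = (dict(attrs).get("class") or "").split()
            # Only count tables that follow a label and carry the class.
            if self.table_class in classes and self.groups:
                self.groups[-1][1] += 1

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_span = False
        elif tag == "table" and self._table_depth > 0:
            self._table_depth -= 1

    def handle_data(self, data):
        if self._in_span:
            self.groups[-1][0] += data.strip()


def count_tables_per_type(html):
    """Return (word_type, number_of_tables) tuples in document order."""
    parser = TableCounter()
    parser.feed(html)
    parser.close()
    return [(label, count) for label, count in parser.groups]


# Made-up markup shaped like the outline earlier in the thread.
SAMPLE = """
<div>
  <span>noun</span>
  <table class="luna-Ent"><tr><td>1.</td><td>a definition</td></tr></table>
  <table class="luna-Ent"><tr><td>2.</td><td>another one</td></tr></table>
  <span>verb</span>
  <table class="luna-Ent"><tr><td>3.</td><td>a third</td></tr></table>
</div>
"""

print(count_tables_per_type(SAMPLE))  # [('noun', 2), ('verb', 1)]
```

With counts like these, the definitions in mainList could be sliced into
groups in the same order, provided the assumptions about the markup hold.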

If you need anything else, feel free to ask! 
-- 
View this message in context: 
http://www.nabble.com/Re%3A-Help-with-BeautifulSoup-tp18418004p18423003.html
Sent from the Python - python-list mailing list archive at Nabble.com.
