i actually realized there are 3 possible class names: food, drink, or dessert. so my question is whether i can alter your function to look like this:

def isFoodOrDrinkOrDessert(attr):
    return attr in ['food', 'drink', 'dessert']

thanks in advance for the help

Kent Johnson wrote:
> [EMAIL PROTECTED] wrote:
> > i have some html which looks like this where i want to scrape out the
> > href stuff (the www.cnn.com part)
> >
> > <div class="noFood">Cheese</div>
> > <div class="food">Blue</div>
> > <a class="btn" href = "http://www.cnn.com">
> >
> >
> > so i wrote this code which scrapes it perfectly:
> >
> > for incident in row('div', {'class':'noFood'}):
> >     b = incident.findNextSibling('div', {'class': 'food'})
> >     print b
> >     n = b.findNextSibling('a', {'class': 'btn'})
> >     print n
> >     link = n['href'] + "','"
> >
> > problem is that sometimes the 2nd tag , the <div class="food"> tag , is
> > sometimes called food, sometimes called drink.
>
> Apparently you are using Beautiful Soup. The value in the attribute
> dictionary can be a callable; try this:
>
> def isFoodOrDrink(attr):
>     return attr in ['food', 'drink']
>
> b = incident.findNextSibling('div', {'class': isFoodOrDrink})
>
> Alternately you could omit the class spec and check for it in code.
>
> Kent
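
for what it's worth, here is a minimal sketch of the three-name version of the predicate, checked against plain strings so it runs without Beautiful Soup. the constant name ALLOWED_CLASSES and the sample values are my own assumptions for illustration; the None case assumes the callable may be handed None when a tag has no class attribute.

```python
# Hypothetical constant: the three class names mentioned in the question.
ALLOWED_CLASSES = frozenset(['food', 'drink', 'dessert'])


def isFoodOrDrinkOrDessert(attr):
    # Returns True only for one of the three allowed class names.
    # None (assumed to stand for a missing class attribute) is simply
    # not in the set, so it yields False rather than raising.
    return attr in ALLOWED_CLASSES


# Check the predicate on plain values; no Beautiful Soup required.
for value in ['food', 'drink', 'dessert', 'noFood', None]:
    print(value, isFoodOrDrinkOrDessert(value))

# In the scraping loop it would be passed exactly like Kent's callable:
# b = incident.findNextSibling('div', {'class': isFoodOrDrinkOrDessert})
```

the only change from the two-name function is the extra entry, so the findNextSibling call itself should not need to change.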