Hi guys, I'm starting to learn Python and so far I'm very impressed with its possibilities. I do, however, need some help with a few things I'm trying to do and haven't yet managed to figure out by myself. Hopefully someone will be able to give me some pointers :)
First, my background: I haven't programmed seriously in over 5 years, but I recently started programming again in Delphi/Pascal scripting, and that's what I'm most familiar with right now. I'm also much more comfortable with structured programming than with OO (which isn't helping much with Python :))

Anyway, I have a very specific project in mind, which I've mostly implemented in Pascal, and I'd like to implement it in Python since the possibilities after that are much more interesting. Basically, I'm getting the HTML source from a URL and need to:
a.) find specific URLs
b.) find specific data
c.) with specific URLs, load new HTML pages and repeat.

I've managed to load the HTML source I want into an object called htmlSource using:

>>> import urllib
>>> sock = urllib.urlopen("URL Link")
>>> htmlSource = sock.read()
>>> sock.close()

I'm assuming that htmlSource is a string with \n at the end of each line.

NOTE: I've become very accustomed to the TStringList class in Delphi, so forgive me if I'm trying to work that way with Python...

Basically, I want to search through the whole string (htmlSource) for a specific keyword. When it's found, I want to know which line it's on so that I can retrieve that line, and then I should be able to parse/extract what I need using regular expressions (which I'm getting quite comfortable with). So how can this be accomplished?

The second main thing I'd like to know has to do with urllister. I'm very intrigued by its ability to automatically grab URL links from the source, but I've only managed to get it to retrieve everything, which is a lot. What are my options in terms of getting it to be more specific? Can I tell it to retrieve a URL only IF a keyword is found on the same line of the source?

Hopefully someone will be able/willing to give me a hand. I think with these roadblocks out of the way, I should be able to figure out the rest of what I need. Thanks in advance!
Benji99
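
For anyone following along, here is a rough sketch of the two things asked about above: finding which lines contain a keyword, and collecting only the links that sit on a line containing that keyword. It is only one possible approach, not a definitive answer. Assumptions: it targets Python 3 and uses the standard library's html.parser as a stand-in for urllister (which is not used here); the names find_keyword_lines and KeywordLinkLister, the sample HTML, and the keyword "Python" are all made up for illustration. In Python 3 the page would be fetched with urllib.request.urlopen, and the bytes returned by read() would need decoding to a string first; the sketch skips the download and works on a string.

```python
import re
from html.parser import HTMLParser


def find_keyword_lines(html_source, keyword):
    """Return (line_number, line_text) pairs for lines containing keyword."""
    return [
        (number, line)
        for number, line in enumerate(html_source.split("\n"), start=1)
        if keyword in line
    ]


class KeywordLinkLister(HTMLParser):
    """Collect href values of <a> tags that start on a line containing keyword."""

    def __init__(self, html_source, keyword):
        super().__init__()
        # Split on "\n" only, which is how the parser counts lines.
        self.lines = html_source.split("\n")
        self.keyword = keyword
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        line_number, _offset = self.getpos()  # 1-based line where the tag starts
        if self.keyword in self.lines[line_number - 1]:
            self.urls.extend(
                value for name, value in attrs if name == "href" and value
            )


# Hypothetical sample standing in for the downloaded page.
sample = (
    '<html>\n'
    '<a href="/news/1">Python news</a>\n'
    '<a href="/about">About us</a>\n'
    '<p>Python downloads: <a href="/dl">here</a></p>\n'
    '</html>\n'
)

hits = find_keyword_lines(sample, "Python")

# Once a line is found, a regular expression can pull out the part of interest.
first_href = re.search(r'href="([^"]*)"', hits[0][1]).group(1)

lister = KeywordLinkLister(sample, "Python")
lister.feed(sample)
lister.close()
keyword_urls = lister.urls
```

The list returned by find_keyword_lines plays roughly the role a TStringList would in Delphi: each entry carries the line number and the text, so the line can be handed straight to a regular expression. The link filter only looks at the line where the <a> tag begins, so a keyword that appears on a neighbouring line would be missed; matching on the link text instead would need a handle_data method as well.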