Was this code a complete waste of my time?

On Wed, Apr 15, 2009 at 1:09 AM, Brian <brian.min...@colorado.edu> wrote:
> On Ubuntu:
>
> sudo apt-get install python-pyrss2gen python-beautifulsoup  # download ScrapeNFeed
>
> Python:
> Not sure what's wrong with this but it's most of the code you'll need:
> -----------
> from urllib import urlopen
> from BeautifulSoup import BeautifulSoup
> from PyRSS2Gen import RSSItem, Guid
> import ScrapeNFeed
> import re
>
> url = 'http://jobs.spb.ca.gov/exams_title.cfm'
> job_html = urlopen(url).read()
> job_soup = BeautifulSoup(job_html)
> jobs = job_soup.findAll('strong', text=re.compile('.*RESEARCH.*'))
>
> class JobFeed(ScrapeNFeed.ScrapedFeed):
>     def HTML2RSS(self, headers, body):
>         items = [RSSItem(title=job,
>                          link=url,
>                          description=job_soup.h2.string.strip())
>                  for job in jobs]
>
>         # pass the RSSItem objects built above, not the raw strings
>         self.addRSSItems(items)
>
> JobFeed.load(job_soup.title.string.strip(),
>              url,
>              'jobs.rss',
>              'jobs.pickle',
>              managingEditor='',
>              )
>
> On Tue, Apr 14, 2009 at 4:17 PM, Joe Larson <j...@joelarson.com> wrote:
>
>> Hello list!
>>
>> I am a Python Beginner. I thought a good beginning project would be to use
>> the Portable Python environment http://www.portablepython.com/ with
>> Beautiful Soup http://www.crummy.com/software/BeautifulSoup/ and Scrape
>> 'N' Feed http://www.crummy.com/software/ScrapeNFeed/ to create an RSS
>> feed of this page http://jobs.spb.ca.gov/exams_title.cfm - ideally
>> filtering just for positions with the string 'Research Analyst'.
>>
>> In my day job I work on the Windows OS (hence the Portable Python) - at
>> home I use Ubuntu but also carry Portable Ubuntu as well.
>>
>> I just wanted to shoot this to the list - see if anyone had any
>> suggestions or tips. I'm reading O'Reilly's Learning Python and The Python
>> Tutorial, but it's still very challenging as this is my first programming
>> language. Thanks all! Sincerely ~ joelar
>> --
>> http://mail.python.org/mailman/listinfo/python-list
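
For comparison, here is a minimal sketch of the same scrape-and-feed idea using only the Python 3 standard library. It is not the ScrapeNFeed approach quoted above: html.parser stands in for BeautifulSoup and xml.etree.ElementTree stands in for PyRSS2Gen, and there is no pickle file to remember items already seen. The names StrongTextCollector, find_jobs and build_rss are made up for illustration, and SAMPLE_HTML is a hypothetical stand-in for the page, whose real markup may differ.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET


class StrongTextCollector(HTMLParser):
    """Collect the text of every <strong> element in a page."""

    def __init__(self):
        super().__init__()
        self._depth = 0     # > 0 while inside a <strong> element
        self._buffer = []   # text pieces of the current <strong>
        self.titles = []    # finished, stripped <strong> texts

    def handle_starttag(self, tag, attrs):
        if tag == "strong":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "strong" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                text = "".join(self._buffer).strip()
                if text:
                    self.titles.append(text)
                self._buffer = []

    def handle_data(self, data):
        if self._depth > 0:
            self._buffer.append(data)


def find_jobs(html, keyword="RESEARCH"):
    """Return the <strong> texts that contain keyword (case-sensitive)."""
    parser = StrongTextCollector()
    parser.feed(html)
    parser.close()
    return [title for title in parser.titles if keyword in title]


def build_rss(feed_title, url, job_titles):
    """Build a small RSS 2.0 document with one <item> per job title."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = feed_title
    ET.SubElement(channel, "link").text = url
    ET.SubElement(channel, "description").text = feed_title
    for job in job_titles:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = job
        ET.SubElement(item, "link").text = url
        # The title doubles as a non-permalink guid (an assumption).
        ET.SubElement(item, "guid", isPermaLink="false").text = job
    return ET.tostring(rss, encoding="unicode")


# Hypothetical sample markup; the real page layout may differ.
SAMPLE_HTML = """
<html><body>
<strong>RESEARCH ANALYST I</strong>
<strong>OFFICE TECHNICIAN</strong>
<strong>RESEARCH PROGRAM SPECIALIST</strong>
</body></html>
"""

URL = "http://jobs.spb.ca.gov/exams_title.cfm"
jobs = find_jobs(SAMPLE_HTML)
feed_xml = build_rss("Research job listings", URL, jobs)
```

To run it against the live page, the HTML string would come from urllib.request.urlopen(URL).read().decode(...) instead of SAMPLE_HTML, and feed_xml would be written to a file such as jobs.rss.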
--
http://mail.python.org/mailman/listinfo/python-list