On Sep 11, 8:20 am, Chuck <galois...@gmail.com> wrote:
> Hi all,
>
> I would like to code a simple podcast catcher in Python merely as an
> exercise in internet programming. I am a CS student and new to
> Python, but understand Java fairly well. I understand how to connect
> to a server with urlopen, but then I don't understand how to download
> the mp3, or whatever, podcast? Do I need to somehow parse the XML
> document? I really don't know. Any ideas?
>
> Thanks!
>
> Chuck

You will first have to download the RSS XML file, then parse it for the URL of the audio file itself. Something like ElementTree (xml.etree.ElementTree in the standard library) will help immensely with this part. You'll also have to keep track of what you've already downloaded.

I'd recommend taking a look at the RSS XML yourself, so you know what you have to parse out and where to find it. From there, it should be fairly easy to come up with the proper query to pull it out of the XML automatically.

As a kindness to the provider, I would recommend a fairly lengthy sleep between GETs, particularly if you want to scrape their back catalog.

Unfortunately, I no longer have the script I wrote to do just this, but the process is rather straightforward once you know where to look.

~G
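
The steps above (fetch the feed, pull out the audio URLs, skip what you already have, sleep between GETs) could be sketched roughly as follows. This is not the lost script, only a minimal illustration under a few assumptions: it uses Python 3's urllib.request, it assumes the feed is RSS 2.0 with the audio URL in each item's <enclosure url="..."> attribute (check your feed, as suggested above), and the names enclosure_urls, load_seen, catch, seen.txt, and the 30-second delay are all made up for the example.

```python
import shutil
import time
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path


def enclosure_urls(rss_text):
    """Return the url attribute of every <enclosure> element in the feed."""
    root = ET.fromstring(rss_text)
    urls = []
    for enclosure in root.iter("enclosure"):
        url = enclosure.get("url")
        if url:
            urls.append(url)
    return urls


def load_seen(seen_file):
    """Return the set of URLs recorded in seen_file (empty if it is missing)."""
    path = Path(seen_file)
    if not path.exists():
        return set()
    return set(path.read_text().splitlines())


def catch(feed_url, dest_dir="podcasts", seen_file="seen.txt", delay=30):
    """Download enclosures not yet listed in seen_file, sleeping between GETs."""
    with urllib.request.urlopen(feed_url) as response:
        rss_text = response.read()

    seen = load_seen(seen_file)
    dest = Path(dest_dir)
    dest.mkdir(exist_ok=True)

    for url in enclosure_urls(rss_text):
        if url in seen:
            continue
        # Hypothetical naming rule: last path segment, query string dropped.
        filename = url.rsplit("/", 1)[-1].split("?")[0] or "episode.mp3"
        with urllib.request.urlopen(url) as response, \
                open(dest / filename, "wb") as out:
            shutil.copyfileobj(response, out)
        # Record the URL right away so an interrupted run does not refetch it.
        with open(seen_file, "a") as log:
            log.write(url + "\n")
        seen.add(url)
        time.sleep(delay)  # be kind to the provider
```

Calling catch() with the address of a real feed would run the whole loop. Keeping the seen list as a plain text file is just the simplest thing that works; a small database would do equally well once the list grows.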