On Sep 11, 12:56 pm, Chuck <galois...@gmail.com> wrote:
> On Sep 11, 10:30 am, Falcolas <garri...@gmail.com> wrote:
> > On Sep 11, 8:20 am, Chuck <galois...@gmail.com> wrote:
> > > Hi all,
> > >
> > > I would like to code a simple podcast catcher in Python merely as an
> > > exercise in internet programming.  I am a CS student and new to
> > > Python, but understand Java fairly well.  I understand how to connect
> > > to a server with urlopen, but then I don't understand how to download
> > > the mp3, or whatever, podcast?  Do I need to somehow parse the XML
> > > document?  I really don't know.  Any ideas?
> > >
> > > Thanks!
> > >
> > > Chuck
> >
> > You will first have to download the RSS XML file, then parse that file
> > for the URL for the audio file itself. Something like eTree will help
> > immensely in this part. You'll also have to keep track of what you've
> > already downloaded.
> >
> > I'd recommend taking a look at the RSS XML yourself, so you know what
> > it is you have to parse out, and where to find it. From there, it
> > should be fairly easy to come up with the proper query to pull it
> > automatically out of the XML.
> >
> > As a kindness to the provider, I would recommend a fairly lengthy
> > sleep between GETs, particularly if you want to scrape their back
> > catalog.
> >
> > Unfortunately, I no longer have the script I created to do just such a
> > thing in the past, but the process is rather straightforward, once you
> > know where to look.
> >
> > ~G
>
> Thanks!  I will see what I can do.

I am not sure how eTree fits in.  Is that eTree.org?
--
http://mail.python.org/mailman/listinfo/python-list
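
On the eTree question: in this context it most likely means ElementTree, the XML parser that ships with Python as xml.etree.ElementTree, rather than the etree.org site. Below is a minimal sketch of the steps Falcolas describes (parse the feed for audio URLs, skip what you already have, sleep between GETs). It assumes Python 3 and an RSS 2.0 feed whose episodes carry <enclosure url="..."> elements. SAMPLE_RSS, the example.com URLs, the 30-second delay, and the helper names enclosure_urls, new_urls, and fetch_all are all made up for illustration.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical sample feed, standing in for the XML a real podcast serves.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="http://example.com/ep2.mp3" length="456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="http://example.com/ep1.mp3" length="123" type="audio/mpeg"/>
    </item>
  </channel>
</rss>"""


def enclosure_urls(rss_text):
    """Return the url attribute of every <enclosure> element, in feed order."""
    root = ET.fromstring(rss_text)
    return [enc.get("url") for enc in root.iter("enclosure") if enc.get("url")]


def new_urls(urls, seen):
    """Keep only the URLs that are not already in the 'seen' set."""
    return [url for url in urls if url not in seen]


def fetch_all(urls, seen, delay_seconds=30):
    """Download each URL into the current directory, pausing between GETs.

    The delay is an arbitrary placeholder; pick whatever is polite
    for the provider you are downloading from.
    """
    for url in urls:
        filename = url.rsplit("/", 1)[-1]  # last path segment as the file name
        urllib.request.urlretrieve(url, filename)
        seen.add(url)
        time.sleep(delay_seconds)
```

In a real run you would read the feed with urllib.request.urlopen(feed_url).read(), pass the result to enclosure_urls, filter it through new_urls, and hand what is left to fetch_all. To remember downloads between runs, the seen set could be saved to a text file with one URL per line.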