On Mar 11, 7:28 pm, Roy Smith <r...@panix.com> wrote: > In article > <239c4ad7-ac93-45c5-98d6-71a434e1c...@r21g2000yqa.googlegroups.com>, > John Salerno <johnj...@gmail.com> wrote: > > > > > > > > > > > Getting the time that the song is played is easy, because the time is > > wrapped in a <div> tag all by itself with a class attribute that has a > > specific value I can search for. But the actual song title and artist > > information is harder, because the HTML isn't quite as precise. Here's > > a sample: > > > <div class="cmPlaylistContent"> > > <strong> > > <a href="/lsp/t2995/"> > > Love Without End, Amen > > </a> > > </strong> > > <br/> > > <a href="/lsp/a436/"> > > George Strait > > </a> > > [...] > > Therefore, I appeal to your greater wisdom in these matters. Given > > this HTML, is there a "best practice" for how to refer to the song > > title and artist? > > Obviously, any attempt at screen scraping is fraught with peril. > Beautiful Soup is a great tool but it doesn't negate the fact that > you've made a pact with the devil. That being said, if I had to guess, > here's your puppy: > > > <a href="/lsp/t2995/"> > > Love Without End, Amen > > </a> > > the thing to look for is an "a" element with an href that starts with > "/lsp/t", where "t" is for "track". Likewise: > > > <a href="/lsp/a436/"> > > George Strait > > </a> > > an href starting with "/lsp/a" is probably an artist link. > > You owe the Oracle three helpings of tag soup.
Well, I had considered exactly that method, but I don't know for sure if the titles and names will always have links like that, so I didn't want to tie my programming to something so specific. But perhaps it's still better than just taking the first two strings. -- http://mail.python.org/mailman/listinfo/python-list