In article <239c4ad7-ac93-45c5-98d6-71a434e1c...@r21g2000yqa.googlegroups.com>, John Salerno <johnj...@gmail.com> wrote:
> Getting the time that the song is played is easy, because the time is > wrapped in a <div> tag all by itself with a class attribute that has a > specific value I can search for. But the actual song title and artist > information is harder, because the HTML isn't quite as precise. Here's > a sample: > > <div class="cmPlaylistContent"> > <strong> > <a href="/lsp/t2995/"> > Love Without End, Amen > </a> > </strong> > <br/> > <a href="/lsp/a436/"> > George Strait > </a> > [...] > Therefore, I appeal to your greater wisdom in these matters. Given > this HTML, is there a "best practice" for how to refer to the song > title and artist? Obviously, any attempt at screen scraping is fraught with peril. Beautiful Soup is a great tool but it doesn't negate the fact that you've made a pact with the devil. That being said, if I had to guess, here's your puppy: > <a href="/lsp/t2995/"> > Love Without End, Amen > </a> the thing to look for is an "a" element with an href that starts with "/lsp/t", where "t" is for "track". Likewise: > <a href="/lsp/a436/"> > George Strait > </a> an href starting with "/lsp/a" is probably an artist link. You owe the Oracle three helpings of tag soup. -- http://mail.python.org/mailman/listinfo/python-list