On Aug 15, 8:02 pm, Mike Paul <paul.mik...@gmail.com> wrote:
> I'm trying to scrape a dynamic page with a lot of JavaScript in it.
> In order to get all the data from the page I need to access the
> JavaScript. But I've no idea how to do it.

I'm not sure exactly what you are trying to do, but scraping websites that use a lot of JavaScript is often very problematic. The last time I did so, I had to write some pretty funky regular expressions to pick data out of the JavaScript. Fortunately, the data was directly in the JavaScript source, so I did not have to reproduce the chain of Ajax calls.

If you do need to reproduce those calls, then you almost certainly want to use a package designed for doing such things. One such package is HtmlUnit. It is a "GUI-less browser" with a built-in JavaScript engine that is designed for such scraping tasks. Unfortunately, you have to program it in Java, rather than Python. (You might be able to use Jython instead of Java, but I don't know for sure.)

|>ouglas

P.S. For scraping tasks, you probably want to parse the page with BeautifulSoup rather than work on the raw text that urllib2 returns.
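
To illustrate the regular-expression approach mentioned above, here is a minimal sketch. It is not the code from that earlier job: the page content, the variable name `chartData`, and the helper `extract_js_object` are all made up for illustration, and it assumes the data is assigned as a literal that happens to be valid JSON.

```python
import json
import re

# Hypothetical page source: the data we want sits in an inline <script>.
PAGE = """
<html><body>
<script type="text/javascript">
var chartData = {"symbol": "XYZ", "prices": [10.5, 11.0, 10.75]};
</script>
</body></html>
"""


def extract_js_object(html, var_name):
    """Return the object literal assigned to `var_name`, or None if absent.

    Illustrative helper, not part of any library.
    """
    # Match `var <name> = { ... };` and capture the braces non-greedily,
    # stopping at the first "}" that is followed by a semicolon.
    pattern = r"var\s+" + re.escape(var_name) + r"\s*=\s*(\{.*?\})\s*;"
    match = re.search(pattern, html, re.DOTALL)
    if match is None:
        return None
    # Assumption: the literal is valid JSON (double-quoted keys, no
    # trailing commas, no function calls). Real pages often are not.
    return json.loads(match.group(1))


data = extract_js_object(PAGE, "chartData")
print(data["prices"])  # [10.5, 11.0, 10.75]
```

This only works while the data is literally present in the page source. A pattern like this is also brittle: it breaks if a string inside the object contains "};" or if the site changes how it declares the variable. Once the values arrive through Ajax calls, a tool with a real JavaScript engine is the safer route.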