"Seymour" <[EMAIL PROTECTED]> writes:
> I am trying to find a way to sign onto my Wall Street Journal account
> (http://online.wsj.com/public/us) and automatically download various
> financial pages on stocks and mutual funds that I am interested in
> tracking. I have a subscription to this site and am trying to figure
[...]
> My questions are:
> 1. Is there an easier way to grab these pages from a password protected
> site, or is the use of Mechanoid a reasonable approach?
This is the first time I've heard of anybody using mechanoid. As the author of mechanize, of which mechanoid is a fork, I have always been in the dark about why its author decided to fork it (he hasn't emailed me...). I don't know if there's any activity on the mechanoid project, but I'm certainly still working on mechanize, and there's an active mailing list:

http://wwwsearch.sourceforge.net/
https://lists.sourceforge.net/lists/listinfo/wwwsearch-general

> 2. Is there an easy way of recording a web surfing session in Firefox
> to see what the browser sends to the site? I am thinking that this
> might help me better understand the Mechanoid commands, and more easily
> program it. I do a fair amount of VBA Programming in Microsoft Excel
> and have always found the Macro Recording feature a very useful
> starting point which has greatly helped me get up to speed.

With Firefox, you can use the Livehttpheaders extension:

http://livehttpheaders.mozdev.org/

The mechanize docs explain how to turn on display of the HTTP headers that it sends.

Going further, there's at least one HTTP-based recorder for twill, which actually watches your browser traffic and generates twill code for you (twill is a simple language for functional testing and scraping built on top of mechanize):

http://twill.idyll.org/
http://darcs.idyll.org/%7Et/projects/scotch/doc/

That's not an entirely reliable process, but some people might find it helpful.

I think there may be one for zope.testbrowser too (or ZopeTestBrowser (sp?), the standalone version that works without Zope) -- I'm not sure. (zope.testbrowser is also built on mechanize.) Despite the name, I'm told this can be used for scraping as well as testing.

I would imagine that it would be fairly easy to modify or extend Selenium IDE to emit mechanize or twill or zope.testbrowser (etc.) code (perhaps even without any coding; I've used too many Firefox Selenium plugins and now forget which had which features).
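Coming back to seeing what actually gets sent: the idea can be sketched with nothing but the Python standard library. To be clear, this is plain urllib (Python 3), not mechanize or mechanoid, and the helper names, form field names ("user", "password") and URLs are all made up for illustration; the real WSJ login form will certainly differ. It is only a rough sketch of "log in, keep the cookies, print the request headers as they go out".

```python
import http.cookiejar
import urllib.parse
import urllib.request


def make_debug_opener():
    """Return (opener, cookie_jar).

    The opener stores cookies between requests and, because of
    debuglevel=1, prints each outgoing request and the reply headers.
    """
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(
        urllib.request.HTTPCookieProcessor(jar),
        urllib.request.HTTPHandler(debuglevel=1),
        urllib.request.HTTPSHandler(debuglevel=1),
    )
    return opener, jar


def login(opener, login_url, username, password):
    """POST a login form. The field names here are hypothetical."""
    form = {"user": username, "password": password}
    data = urllib.parse.urlencode(form).encode("ascii")
    with opener.open(login_url, data) as response:
        return response.read()


def fetch(opener, url):
    """GET a page, sending back any cookies picked up earlier."""
    with opener.open(url) as response:
        return response.read()
```

Something like `opener, jar = make_debug_opener()`, then `login(opener, ...)` followed by `fetch(opener, ...)` against the real form URL, would show the headers on stdout. Comparing that output with what Livehttpheaders reports for the browser is usually the quickest way to spot a missing field or cookie.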
Personally I would avoid using Selenium itself to actually automate tasks, though, since unlike mechanize &c., Selenium drags in an entire browser, which brings with it some inflexibility (though not as bad as in the past). It does have advantages, though: most obviously, it knows JavaScript.

John
-- 
http://mail.python.org/mailman/listinfo/python-list