We are looking to select the language and toolset most suitable for a project that requires getting data from several websites in real time, i.e. HTML parsing/scraping. It would require full emulation of a browser, including cookie handling, automated logins, and following multiple link paths. Multithreading would be a plus, but is not a requirement.
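To make the requirement concrete, here is a rough sketch of the kind of thing we would want, using only Python's standard library (http.cookiejar for session cookies, html.parser for pulling links out of a page). This is only an illustration of the two core pieces; a real solution would also submit login forms, follow the extracted links, and handle errors:

```python
from html.parser import HTMLParser
from http.cookiejar import CookieJar
from urllib.request import build_opener, HTTPCookieProcessor


class LinkExtractor(HTMLParser):
    """Collect every href attribute found on <a> tags."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# A cookie-aware opener: cookies set by one response are automatically
# resent on subsequent requests, which is what automated logins need.
opener = build_opener(HTTPCookieProcessor(CookieJar()))
# e.g. page = opener.open("https://example.com/login").read()  (hypothetical URL)

# Extract the link paths to follow from a fetched page.
parser = LinkExtractor()
parser.feed('<html><body><a href="/login">Log in</a><a href="/data">Data</a></body></html>')
print(parser.links)  # → ['/login', '/data']
```

From there, each collected link can be fetched through the same opener so the session cookies carry across requests.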
Some Perl solutions were suggested:

- LWP::Simple
- WWW::Mechanize
- HTML::Parser
- curl and libcurl

Can you suggest equivalent solutions for Python? What are the pros and cons of Perl vs. Python for this, and why choose Python? Pointers to other tools and comparisons with the Python solutions would be much appreciated. Anyone knowledgeable about this kind of application, please share your experience to help us do this right.

With best regards,
Sanjay