Dear colleagues,
I'm trying to parse the HTML content of this webpage:
http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=&section=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011

I'm using the following code:
library(RCurl)
library(XML)
myurl <- "http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=&section=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011"

# Download the page source as a single character string
.x <- getURL(myurl)
# Parse the string as HTML; the resulting tree is printed at the prompt
htmlTreeParse(.x, asText = TRUE)

This prints approximately 15 lines of output from the HTML document and
then mysteriously stops. The command prompt does not reappear, and force
quitting R is the only option.
I'm running R 2.13 on Mac OS X 10.6, and the latest versions of the XML and
RCurl packages are installed.
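
One thing that may be worth trying, on the unconfirmed assumption that the stall comes from building and printing the full R-level tree rather than from the download itself: assign the parse result instead of letting it print, and keep it as internal nodes with useInternalNodes = TRUE. The sketch below uses a small inline HTML string (hypothetical content standing in for the fetched page) so that it runs without network access; the names html_text, doc, links and titles are illustrative only.

```r
library(XML)

# Small inline document standing in for the fetched page (hypothetical content).
html_text <- "<html><body><a href='/a'>IIM one</a><a href='/b'>IIM two</a></body></html>"

# Assign the result so the tree is not auto-printed at the prompt, and keep it
# as internal C-level nodes rather than converting it to nested R lists.
doc <- htmlTreeParse(html_text, asText = TRUE, useInternalNodes = TRUE)

# Extract only the pieces of interest instead of printing the whole tree.
links  <- xpathSApply(doc, "//a", xmlGetAttr, "href")
titles <- xpathSApply(doc, "//a", xmlValue)
```

With the real page, .x from getURL() would take the place of html_text, and the XPath expression would need adapting to that page's markup.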
Yours, Simon Kiss
