Dear colleagues,

I'm trying to parse the HTML content from this webpage:

http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=&section=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011
I am using the following code:

library(RCurl)
library(XML)

# Search-results page to download
myurl <- "http://timesofindia.indiatimes.com/searchresult.cms?sortorder=score&searchtype=2&maxrow=10&startdate=2001-01-01&enddate=2011-08-25&article=2&pagenumber=1&isphrase=no&query=IIM&searchfield=&section=&kdaterange=30&date1mm=01&date1dd=01&date1yyyy=2001&date2mm=08&date2dd=25&date2yyyy=2011"

# Fetch the raw HTML as a character string
.x <- getURL(myurl)

# Parse the string as HTML; the result is printed to the console
htmlTreeParse(.x, asText = TRUE)

The last call prints approximately 15 lines of output from the HTML document and then stops. The command-line prompt does not reappear, and force-quitting R is the only option.

I'm running R 2.13 on Mac OS X 10.6, and the latest versions of the XML and RCurl packages are installed.

Yours,
Simon Kiss

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.