On 4/25/2013 1:19 PM, Dirk Eddelbuettel wrote:
On 25 April 2013 at 13:00, Spencer Graves wrote:
| Hello:
|
|
|        What tools would you recommend for extracting the table of
| members of the US House of Representatives from
| "http://house.gov/representatives/" and
| "http://en.wikipedia.org/wiki/List_of_current_members_of_the_United_States_House_of_Representatives_by_age"?
|
|
|
|        I started writing something using getURL{RCurl}.  However, I'm
| getting bogged down manually selecting character sequences to search for
| and split on.

You could try your own sos package to search for what others have done here.
The XML package is popular for this, but the whole scheme is fraught with
little pitfalls: HTML is very definitely not a good format for data delivery,
and an HTML page is clearly no API for data access.


Thanks to Gabriel Becker and Dirk Eddelbuettel for suggesting XML: its "readHTMLTable" solves my problem.
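
For anyone reading this thread later, a minimal sketch of the readHTMLTable approach might look like the following. It parses a small inline HTML string rather than either live page, so the table contents and the object names (page, doc, tables, members) are made up for illustration. For a real page, one would presumably hand the downloaded text (for example, the result of getURL) to htmlParse instead, and the table of interest may not be the first one on the page.

```r
library(XML)  # provides htmlParse() and readHTMLTable()

# Hypothetical stand-in for a downloaded page. The rows below are
# placeholders, not data taken from house.gov or Wikipedia.
page <- '<html><body>
<table>
  <tr><th>Name</th><th>State</th><th>Party</th></tr>
  <tr><td>Member A</td><td>AL</td><td>R</td></tr>
  <tr><td>Member B</td><td>CA</td><td>D</td></tr>
</table>
</body></html>'

# Parse the HTML text into a document object.
doc <- htmlParse(page, asText = TRUE)

# readHTMLTable() on a whole document returns a list with one
# data frame per <table> element it finds.
tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)

# Assumption: the table we want is the first one in the document.
members <- tables[[1]]
print(members)
```

On a real page it is worth inspecting length(tables) and the dimensions of each element first, since layout tables and navigation boxes often show up alongside the data table.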


I confess that I tried "sos" before posting to this list, but without getting useful results: the search terms I tried returned too many matches to be useful.


And Gabriel was correct in that I should have sent the question to R-Help, but I only concluded that after sending it here.


      Thanks again.
      Spencer

Dirk

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
