On Sat, 11 Feb 2012 22:49:07 -0200, Nilza BARROS wrote: 

> I have to read data from a worksheet that is available on the Internet. I
> have been doing this by copying the worksheet from the browser.
> But I would like to be able to copy the data automatically using the url
> command.
>
> But when using the "url" command the result is the source code, I mean,
> HTML code.
> I see that the data I need is in the source code, but before thinking about
> reading the data from the HTML code I wonder if there is a package or
> another way to extract these data, since reading from the code will demand
> a lot of work and may not be so accurate.
>
> Below one can see from where I am trying to export the data:
> 
> dadoshttp://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1201_arquivos/sheet002.htm","r")

Hi Nilza, 

The URL that you posted points at a document that has
another document within it, in a frame. These files are Excel dumps into
HTML. To view the actual data you need the URIs for each data set. Those
appear at the bottom of the listing, under sc1201_arquivos/sheet001.htm
and sheet002.htm. Your code must fetch these files, not the one at
http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1202.htm [1]
which only "wraps" them. Most of what you see in the file that you
linked isn't HTML - it's JavaScript and style information for the data
living in the two separate HTML documents. 
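
To make the "wrapper vs. data files" point concrete, here is a minimal
sketch in base R that scans the text of a wrapper page for references to
the sheet files and turns them into absolute URIs. The function name
find_sheet_uris, the regular expression, and the sample wrapper text are
my own assumptions for illustration; I have not checked them against the
real page, so the pattern may need adjusting.

```r
# Hypothetical helper: list the sheet files referenced by a wrapper page.
# Assumes references look like "<name>_arquivos/sheetNNN.htm".
find_sheet_uris <- function(wrapper_html, base_url) {
  pattern <- "[A-Za-z0-9_]+_arquivos/sheet[0-9]+\\.htm"
  hits <- unique(unlist(regmatches(wrapper_html,
                                   gregexpr(pattern, wrapper_html))))
  if (length(hits) == 0) return(character(0))
  # Drop the file name from the wrapper URL to get its directory.
  base_dir <- sub("[^/]*$", "", base_url)
  paste0(base_dir, hits)
}

# Made-up wrapper text, only to show the idea; the real page will differ.
sample_wrapper <- paste0(
  '<frame src="sc1201_arquivos/sheet001.htm" name="frSheet">',
  '<link id="shLink" href="sc1201_arquivos/sheet002.htm">'
)

uris <- find_sheet_uris(
  sample_wrapper,
  "http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1202.htm"
)
print(uris)
```

For the live page you would pass the downloaded wrapper text instead of
sample_wrapper.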

You can do this in R using
the RCurl and XML packages, by pulling the specific files for each data
source. If this is a one-time thing, I'd suggest just coding something
simple that loads the data for each file. If this is something you'll
execute periodically, you'll need a bit more code to first extract the
internal data sheets (e.g. the "planilhas" at the bottom) and then
extract the actual data from each of them. 
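
As a rough sketch of the one-time approach, assuming the RCurl and XML
packages are installed and that the data sits in the first <table> of
each sheet file (I have not verified that for these pages), something
like the following could work. The function names read_sheet_html and
fetch_sheet and the sample table are made up for illustration.

```r
library(RCurl)  # getURL() downloads the page as text
library(XML)    # htmlParse() and readHTMLTable() parse the HTML

# Hypothetical helper: turn HTML text into a data frame.
# Assumes the first <table> holds the data and its first row is a header.
read_sheet_html <- function(html_text) {
  doc <- htmlParse(html_text, asText = TRUE)
  tables <- readHTMLTable(doc, header = TRUE, stringsAsFactors = FALSE)
  if (length(tables) == 0) stop("no <table> found in the document")
  tables[[1]]
}

# Hypothetical helper: download one sheet file and parse it.
fetch_sheet <- function(sheet_url) {
  read_sheet_html(getURL(sheet_url))
}

# Offline check with a made-up table; the real sheets will look different.
sample_html <- "<html><body><table>
<tr><th>hora</th><th>temp</th></tr>
<tr><td>00</td><td>21.5</td></tr>
<tr><td>03</td><td>20.9</td></tr>
</table></body></html>"
sheet <- read_sheet_html(sample_html)
print(sheet)

# Against the real file (needs network access):
# dados <- fetch_sheet("http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1201_arquivos/sheet002.htm")
```

Columns come back as character, so numeric fields would still need
as.numeric(), and Excel exports often carry extra header or blank rows
that have to be dropped by hand.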

Let me know if you want this as a one-time
thing, or as a reusable program. If you don't know how to use RCurl and
XML to parse HTML I'll be happy to help with that too. I'd just like to
know more about the scope of your question. 

Cheers, 

pr3d 

--

pr3d4t0r at #R, ##java, #awk, #python
irc.freenode.net


Links:
------
[1] http://www.mar.mil.br/dhn/chm/meteo/prev/dados/pnboia/sc1202.htm

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
