Where is readtext() from?
You could scrape the directory listings at
http://home.brisnet.org.au/~bgreen/Data/Hanson1/
and
http://home.brisnet.org.au/~bgreen/Data/Hanson2/
to recover the required file names:
library(rvest)
read_html("http://home.brisnet.org.au/~bgreen/Data/Hanson1/") |>
html_element("body") |> html_element("table") |> html_table()
will get you most of the way there; then use lapply() or a for loop to
download each file?
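
A rough, untested sketch of that download step follows. It swaps html_table() for the link targets (href attributes), on the assumption that the pages are standard Apache-style directory listings, where the displayed names can be truncated but the hrefs are complete. The helper names (keep_file_links, list_remote_files, download_all) and the local folder "hanson_files" are made up for illustration, and the filter that decides what counts as a file is a guess that may need adjusting for this site.

```r
library(rvest)

# Keep only links that look like plain files in the current directory.
# Drops NA, sort links ("?C=N;O=D"), absolute or parent paths,
# sub-directories (trailing "/") and full URLs with a scheme.
keep_file_links <- function(hrefs) {
  hrefs <- hrefs[!is.na(hrefs)]
  drop <- grepl("^[?/#]|^\\.\\.|/$|^[A-Za-z][A-Za-z0-9+.-]*:", hrefs)
  hrefs[!drop]
}

# Return the full URLs of the files linked from one directory index page.
list_remote_files <- function(dir_url) {
  hrefs <- read_html(dir_url) |>
    html_elements("a") |>
    html_attr("href")
  files <- keep_file_links(hrefs)
  # paste0() would recycle a zero-length vector to "", so return early
  if (length(files) == 0) return(character(0))
  paste0(dir_url, files)
}

# Download every file from each directory into dest/<directory name>/.
download_all <- function(dirs, dest = "hanson_files") {
  for (d in dirs) {
    out_dir <- file.path(dest, basename(d))
    dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)
    for (u in list_remote_files(d)) {
      target <- file.path(out_dir, URLdecode(basename(u)))
      # skip files already fetched; try() keeps one failure from stopping the loop
      if (!file.exists(target)) {
        try(download.file(u, target, mode = "wb", quiet = TRUE))
      }
    }
  }
  invisible(dest)
}

dirs <- c("http://home.brisnet.org.au/~bgreen/Data/Hanson1/",
          "http://home.brisnet.org.au/~bgreen/Data/Hanson2/")

# Uncomment to run (needs network access):
# download_all(dirs)
# x <- readtext::readtext(list.files("hanson_files", recursive = TRUE,
#                                    full.names = TRUE))
```

Once the files are local, readtext should be able to read them, provided they have extensions it recognises.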
On 2023-07-25 6:06 p.m., Bob Green wrote:
Hello,
I am seeking advice as to how I can download the 833 files from this
site: "http://home.brisnet.org.au/~bgreen/Data/"
I want to be able to download them to perform a textual analysis.
If the 833 files, which are in a directory with two subfolders, were on
my computer, I could read them with readtext. Using readtext on the URL,
I get the error:
> x = readtext("http://home.brisnet.org.au/~bgreen/Data/*")
Error in download_remote(file, ignore_missing, cache, verbosity) :
Remote URL does not end in known extension. Please download the file
manually.
> x = readtext("http://home.brisnet.org.au/~bgreen/Data/Dir/()")
Error in download_remote(file, ignore_missing, cache, verbosity) :
Remote URL does not end in known extension. Please download the file
manually.
Any suggestions are appreciated.
Bob
______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
--
Dr. Benjamin Bolker
Professor, Mathematics & Statistics and Biology, McMaster University
Director, School of Computational Science and Engineering
(Acting) Graduate chair, Mathematics & Statistics
> E-mail is sent at my convenience; I don't expect replies outside of
working hours.