Re: [CODE4LIB] ethics of screenscraping library opacs?

2021-11-28 Thread Peter Velikonja
As Kyle mentioned, a screenscraping method is inefficient and will get you incomplete results. As a vendor to public libraries, I routinely request (and receive) MARC dumps. Some libraries are better than others at pulling these from their ILS, but records based on MARC come from the Library of C

Re: [CODE4LIB] ethics of screenscraping library opacs?

2021-11-28 Thread Tim Spalding
I echo what others have said about getting MARC. But a few thoughts about screen scraping:
1. Obey robots.txt in all respects.
2. Obey best practices, such as waiting at least 1 second between requests.
3. If the site doesn't exclude robots, then it is absolutely getting hit already. Chances are i
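
The first two points (obey robots.txt, wait at least 1 second between requests) can be sketched in Python with the standard library's urllib.robotparser. This is only an illustration under stated assumptions, not anyone's actual crawler: the class name PoliteScraper, the user agent "example-bot", the host opac.example.org, and the sample robots.txt rules are all hypothetical. The rules are passed in as lines so the sketch runs without touching a network.

```python
import time
import urllib.robotparser


class PoliteScraper:
    """Hypothetical helper: checks robots.txt rules and spaces out requests."""

    def __init__(self, robots_lines, user_agent="example-bot", min_delay=1.0,
                 clock=time.monotonic, sleep=time.sleep):
        # Parse robots.txt content that the caller has already obtained.
        self.parser = urllib.robotparser.RobotFileParser()
        self.parser.parse(robots_lines)
        self.user_agent = user_agent
        # Use the site's Crawl-delay if it is longer than our own minimum.
        crawl_delay = self.parser.crawl_delay(user_agent)
        self.delay = max(min_delay, float(crawl_delay or 0))
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous request, if any

    def allowed(self, url):
        """True if robots.txt permits this user agent to fetch the URL."""
        return self.parser.can_fetch(self.user_agent, url)

    def wait_turn(self):
        """Sleep until at least self.delay seconds have passed since the last call."""
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
        self._last = self._clock()


# Sample rules, invented for illustration only.
SAMPLE_ROBOTS = [
    "User-agent: *",
    "Disallow: /patroninfo",
    "Crawl-delay: 2",
]

scraper = PoliteScraper(SAMPLE_ROBOTS)
for url in ("https://opac.example.org/search?q=marc",
            "https://opac.example.org/patroninfo/1"):
    print(url, "allowed" if scraper.allowed(url) else "blocked by robots.txt")
```

In real use, one would load the live file with RobotFileParser.set_url() and read(), then call wait_turn() before each request and skip any URL for which allowed() returns False.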