As Kyle mentioned, screen scraping is inefficient and will
get you incomplete results. As a vendor to public libraries, I routinely
request (and receive) MARC dumps. Some libraries are better than
others at pulling these from their ILS, but records based on MARC come
from the Library of C
I echo what others have said about getting MARC. But a few thoughts about
screen scraping:
1. Obey robots.txt in all respects.
2. Follow best practices, such as waiting at least 1 second between requests.
3. If the site doesn't exclude robots, then it is absolutely getting hit
already. Chances are i
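
To make points 1 and 2 concrete, here is a rough sketch in Python of a crawl loop that checks robots.txt before every request and pauses between requests. It is only an illustration, not anyone's production scraper. The user agent name, the example URLs, and the `fetch` callable are all assumptions; `fetch` is whatever function you use to download a page. The robots.txt handling uses the standard library's `urllib.robotparser`.

```python
import time
import urllib.robotparser

USER_AGENT = "example-catalog-bot"  # hypothetical name; substitute your own
MIN_DELAY_SECONDS = 1.0             # the "at least 1 second" floor from point 2


def load_rules(robots_txt):
    """Parse the text of a robots.txt file that has already been downloaded."""
    rules = urllib.robotparser.RobotFileParser()
    rules.parse(robots_txt.splitlines())
    return rules


def polite_crawl(urls, rules, fetch, sleep=time.sleep):
    """Yield (url, body) for each allowed URL, pausing between requests.

    `fetch` is a caller-supplied function that downloads one URL
    (an assumption of this sketch, not part of any library).
    """
    # Honor a Crawl-delay directive when it asks for more than our floor.
    delay = max(MIN_DELAY_SECONDS, float(rules.crawl_delay(USER_AGENT) or 0))
    first = True
    for url in urls:
        if not rules.can_fetch(USER_AGENT, url):
            continue  # excluded by robots.txt: skip it, never fetch it
        if not first:
            sleep(delay)  # wait between consecutive requests
        first = False
        yield url, fetch(url)
```

In real use, `fetch` would typically wrap an HTTP client call that sends the same user agent string in its request headers, and `load_rules` would be given the body of the site's own `/robots.txt`. Passing `sleep` as a parameter is just a convenience so the loop can be tested without actually waiting.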