Re: [CODE4LIB] API for book descriptions?

2018-05-16 Thread Mark Watkins
Hi Christina, You might consider the Harvard Open Metadata APIs, if they contain the works you are interested in. Free API and unlicensed content (the Amazon APIs do come with a license agreement/restrictions). Doc here: https://wiki.harvard.edu/confluence/display/LibraryStaffDoc/LibraryCloud

Re: [CODE4LIB] ai in libraries

2018-12-12 Thread Mark Watkins
chine learning experiments on it, but alas, no time).. https://emeritus.library.harvard.edu/open-metadata -- Mark Watkins Bookship (https://www.bookshipapp.com)

Re: [CODE4LIB] Keyword Extraction from Text

2019-09-17 Thread Mark Watkins
It does depend a bit on what kinds of "key terms" or "important words" you have in mind, but I have had good luck with Google's NLP APIs. They are free for small numbers of queries (in the thousands per day, if memory serves, but don't quote me on it). It does a good job of identifying people, places

Re: [CODE4LIB] ethics of screenscraping library opacs?

2021-11-30 Thread Mark Watkins
Adding to what others have said, an API will likely give you better (and faster!) results than scraping. Not sure if the Harvard Library Data covers what you need, but their data is available via API (or download), for free, in a legal and rights-respecting manner. Amazing resource. https://lib

[CODE4LIB] CODEX Hackathon @ MIT (cool books hackathon in Feb!)

2016-12-12 Thread Mark Watkins
We're hosting a CODEX Hackathon (our 3rd) at the MIT Media Lab on February 10-12, 2017. CODEX is a community of folks who want to imagine the future of books and reading: programmers, designers, writers, librarians, publishers, readers. All are welcome. It's the best intersection of books and t

Re: [CODE4LIB] OCR software

2017-07-20 Thread Mark Watkins
I recently released a book-club app called Bookship (www.bookshipapp.com), which lets users scan a page of text from a book so they can post quotes. So my use case is people taking pictures of pages with their phones and OCR-ing them. I extensively tested Tesseract (