There are several citation parsers available; you may want to try a few and see which one works best for your document. Here are some that I know of from my own experience:
1. Anystyle.io
2. GROBID
3. Excite
4. Outcite
5. biblio-glutton
6. CERMINE

Hope it helps.

On Fri, May 13, 2022 at 12:10 AM Danielle Reay <dr...@drew.edu> wrote:

> Hello,
>
> We have a faculty member looking to create a dataset from an annotated
> bibliography she compiled. Right now it exists as a word file and as a pdf.
> The entries are relatively structured with a citation and an abstract, but
> the document is about 150 pages long with multiple entries per page. Rather
> than manually copy and paste everything to create the spreadsheet/csv, I
> wanted to ask for suggestions or approaches to doing this by either
> scraping or extracting structured data from the pdf. Thanks very much in
> advance!
>
> Danielle Reay
>
> Digital Scholarship Technology Manager
> Drew University
>

--
Regards,
Vinit Kumar, PhD
Assistant Professor
Department of Library and Information Science
Babasaheb Bhimrao Ambedkar University (A Central University)
Lucknow, 226025
https://sites.google.com/view/vinitkumar
ORCID: https://orcid.org/0000-0001-8306-2087
ResearchGate: https://www.researchgate.net/profile/Vinit-Kumar-9
Google Scholar: https://scholar.google.com/citations?hl=en&user=GmSgmrkAAAAJ
Scopus: https://www.scopus.com/authid/detail.uri?authorId=57209917865