Hi,
Subject says it all: I found a book of more than 200 pages of
genealogical data (Papal Zouave army, 1860-70). There's a good-quality
PDF in the BNF Gallica, and a worse one I found on Geneanet. But: this
one has been OCRed. My own OCR capacity has only one limit, but it is an
extremely low one.
I need the data in a database or spreadsheet. That's not very
difficult: been there, done that. Most of the time my workhorse of the
last 30 years, MS Office, was sufficient. But to get to my target, I
need a text file to which I can apply a series of S&R commands to turn
it into .CSV, a route that poses no problems for me.
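To illustrate the kind of S&R step I mean, here is a rough Python
sketch. The line layout ("SURNAME, Given names, born YYYY, Place") is
purely an assumption for the example, as are the name lines_to_csv and
the column names; the real book will need its own pattern.

```python
import csv
import io
import re

# Hypothetical layout: "SURNAME, Given names, born 1842, Place".
# This pattern is an assumption, not the layout of the actual book.
LINE_PATTERN = re.compile(
    r"^(?P<surname>[^,]+),\s*(?P<given>[^,]+),"
    r"\s*born\s+(?P<year>\d{4}),\s*(?P<place>.+)$"
)


def lines_to_csv(text):
    """Convert matching lines to CSV text; also return the lines that did not match."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["surname", "given", "year", "place"])
    unmatched = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        m = LINE_PATTERN.match(line)
        if m:
            writer.writerow([
                m["surname"].strip(),
                m["given"].strip(),
                m["year"],
                m["place"].strip(),
            ])
        else:
            # Keep rejects so OCR errors can be fixed by hand later.
            unmatched.append(line)
    return out.getvalue(), unmatched
```

Lines that do not fit the pattern are collected rather than dropped, so
OCR glitches can be reviewed separately instead of silently vanishing.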
1. Adobe can't export or 'save as' the PDF, because it's somehow protected.
2. Microsoft (Word 2007 and 2016) can't open it or import it because it
is Microsoft, and can't be bothered.
3. I tried to open the PDF in LibreOffice, and when dinner was
finished, so was LO: it had fabricated a Big Beautiful Bill File.
This file sat in a nameless window overlaying LO Draw. I am not
proficient in LO, and this was the first time I encountered such a
floating window. It appeared that I could select a page in the file,
click on it and have the graphic image selected. Delete and only the
text remained. NB: text in a Draw file... Is that useful?
But it also proved error-prone: sometimes I deleted not the graphic
image, but the whole page. So besides this being a tedious and long
process, this is just not the ideal way to handle, say, 250 pages.
* If somehow I get this text without the graphics in a LO Draw file,
will I be able to make a Writer file out of it?
* Is there a better route between the PDF and a .csv file?
Leo