It's ugly, but could you use pdf.js to extract the text in a browser widget showing the PDF? http://git.macropus.org/2011/11/pdftotext/example/
Not sure what else is in pdf.js, but it looks interesting.

On Fri, Jul 8, 2016 at 10:30 AM, Paul Dupuis <p...@researchware.com> wrote:

> On 7/8/2016 11:55 AM, Colin Holgate wrote:
> > I was trying an export as spreadsheet from Acrobat Pro, but that didn’t
> > work. Doing a Save as Text from Acrobat Reader was more successful, but
> > the columns come out in a different order, and some columns get combined
> > into a single string.
>
> Over the last few years, I have spent a ridiculous amount of time
> exploring PDF access via LiveCode in every way possible. Ultimately, for
> our needs we created the XPDF external and transferred it to LiveCode,
> but we also explored JavaScript extraction from a browser,
> interapplication communication, shell command line tools, etc., etc.
>
> The reality is the PDF format is great for visually representing a
> printed page and totally sucks for text content - that is, actually
> getting the characters of the document rather than an image of the
> characters.
>
> There is NO real mapping of characters to their appearance in the PDF
> other than geometric position on the page. You get no font information,
> no size, no styles, zip. You get line breaks at the end of every visible
> line, and you can get line breaks in what appears to be the middle of
> content depending upon how the original source document was rendered
> into a PDF. Headers and footers end up in the middle of paragraphs. You
> have no real way to tell a line break from a paragraph break, and more.
>
> In truth a NEW portable document format needs to be invented that
> connects and preserves content to its appearance, but I suspect that
> people who want to keep both intact and portable are just using HTML5
> and CSS3.
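
For what it's worth, here is a rough sketch of the kind of guesswork Paul describes when all you have is geometric position. It is not how pdf.js or the XPDF external does it; the (x, y, text) fragment shape, the function names, and the tolerance values are all my own assumptions for illustration. It groups fragments into visual lines by baseline, then guesses paragraph breaks from unusually large vertical gaps.

```python
from typing import List, Tuple

# Hypothetical fragment shape: (x, y, text). y grows up the page,
# as in PDF user space, so larger y means nearer the top.
Fragment = Tuple[float, float, str]
Line = Tuple[float, str]  # (baseline y, joined text)


def group_into_lines(fragments: List[Fragment], y_tolerance: float = 2.0) -> List[Line]:
    """Group fragments whose baselines nearly match into one visual line."""
    buckets: List[Tuple[float, List[Fragment]]] = []
    # Walk fragments from the top of the page to the bottom.
    for frag in sorted(fragments, key=lambda f: -f[1]):
        if buckets and abs(buckets[-1][0] - frag[1]) <= y_tolerance:
            buckets[-1][1].append(frag)
        else:
            buckets.append((frag[1], [frag]))
    # Within each line, order fragments left to right.
    return [
        (y, " ".join(text for _, _, text in sorted(frags, key=lambda f: f[0])))
        for y, frags in buckets
    ]


def join_paragraphs(lines: List[Line], gap_factor: float = 1.5) -> List[str]:
    """Guess paragraph breaks: a gap much larger than the median line gap."""
    if not lines:
        return []
    gaps = [lines[i][0] - lines[i + 1][0] for i in range(len(lines) - 1)]
    typical = sorted(gaps)[len(gaps) // 2] if gaps else 0.0
    paragraphs = [lines[0][1]]
    for gap, (_, text) in zip(gaps, lines[1:]):
        if typical > 0 and gap > gap_factor * typical:
            paragraphs.append(text)  # large gap: assume a new paragraph
        else:
            paragraphs[-1] += " " + text  # ordinary gap: same paragraph
    return paragraphs


if __name__ == "__main__":
    sample = [
        (120.0, 700.0, "world"),
        (72.0, 700.5, "Hello"),
        (72.0, 686.0, "second line"),
        (72.0, 672.0, "third line"),
        (72.0, 644.0, "New paragraph"),
    ]
    for paragraph in join_paragraphs(group_into_lines(sample)):
        print(paragraph)
```

This is only a heuristic. It does nothing about headers and footers landing in the middle of paragraphs, and a document with uneven line spacing or multiple columns would defeat it, which is rather Paul's point.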
_______________________________________________
use-livecode mailing list
use-livecode@lists.runrev.com
Please visit this url to subscribe, unsubscribe and manage your subscription preferences:
http://lists.runrev.com/mailman/listinfo/use-livecode