Hi Alex, Dan, Gena, Annie, and Others,

I can adapt Dan's suggestion of using pdftotext to an AppleScript by putting a wrapper around the command line instruction, so you can select files in Finder and run the AppleScript to create .html versions of the selected files in the same directory. This might work for his purposes, but the problem is the requirement "I'd need line spacing preserved, and navigation could be made into headings."
It is actually difficult to find a program that maintains formatting when converting PDF back to HTML, especially for table data, although that might depend on how the PDF file was generated. The test case I use for this is the HTML page for Appendix A of the "VoiceOver Getting Started Guide" at:

http://help.apple.com/voiceover/info/guide/10.8/English.lproj/index.html

You'll notice that the columns of shortcut keys and associated actions read correctly when you use the web page. If you print out a PDF version of the page, VoiceOver reads all the entries in the first column of shortcuts and then all the entries in the column of associated actions, so each shortcut is no longer read together with its action. This does not get fixed if you convert back to HTML with pdftotext (or many other programs).

The best solution I could find, in response to a question from a member of the mac-access list whose bank statement was delivered in PDF format, was to get the trial version of Wondershare PDF Converter or PDF Converter Pro for Mac from the developer's web site. The trial version had a limit of 5 pages per conversion, which was enough for bank statements but not for general use. Note, this is not a general recommendation for that software, but it worked for that specific purpose.
To read more about this and the usage tests, see the mail archive link for my mac-access list post at:

• Re: Tables in PDF documents
http://www.mail-archive.com/mac-access%40mac-access.net/msg11985.html

To get back to this specific suggestion of pdftotext, I'll post the link to my earlier recommendation of pdftotext to Dan, which gave some suggestions for using it either as an AppleScript or as an Automator action, as well as other notes about the application:

• pdftotext utility [was Re: Xpdf for mac]
http://www.mail-archive.com/macvisionaries%40googlegroups.com/msg61916.html

The AppleScript described there can be adapted for HTML conversions in a similar way (e.g., highlight files in Finder, then run the AppleScript to batch convert files without having to use Terminal). I'll paste in the AppleScript below my signature, starting below the line "---Cut Here---" and ending with the line "end run". You can save it from the AppleScript editor under a name of your choice, like "PDF to HTML".

HTH. Cheers,

Esther

---Cut Here---
(*
Use pdftotext to create an HTML version of the selected PDF file
Created 25 January 2013; modified from PDF to Text AppleScript of 17 May 2011
*)
on run
	-- Get the file currently selected in Finder
	tell application "Finder"
		set chosenFile to the selection as alias
	end tell
	-- Run pdftotext on it; the .html file is written next to the original PDF
	do shell script "/usr/local/bin/pdftotext -htmlmeta " & quoted form of POSIX path of chosenFile
end run

On Jan 25, 2013, at 8:32 AM, - wrote:

> In terminal one can use the pdftotext program found at:
>
> http://www.bluem.net/en/mac/packages/
>
> The command to convert to html is:
>
> pdftotext file.pdf -htmlmeta
>
> The converted file has a html extension. The original files are retained as
> pdf.
>
> This can be put in a script with a loop to convert all pdf files in a
> directory.
>
> XB
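For the point in the quoted message about putting the command in a script with a loop, a minimal shell sketch might look like the following. It is only an illustration: the function name convert_pdfs and the PDFTOTEXT variable are made up for this example, it assumes pdftotext is on your PATH (otherwise point PDFTOTEXT at /usr/local/bin/pdftotext, the path used in the AppleScript), and it places the -htmlmeta option before the file name, as the AppleScript does.

```shell
#!/bin/sh
# Sketch: convert every PDF in a directory to HTML with pdftotext.
# PDFTOTEXT is an illustrative override, not part of pdftotext itself;
# it defaults to whichever pdftotext is found on the PATH.
PDFTOTEXT="${PDFTOTEXT:-pdftotext}"

# convert_pdfs DIR : run pdftotext -htmlmeta on each *.pdf in DIR
# (hypothetical helper name; DIR defaults to the current directory).
convert_pdfs() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # If the directory has no PDFs, the pattern stays literal; skip it.
        [ -e "$pdf" ] || continue
        # Name the output explicitly: file.pdf becomes file.html.
        # Quoting keeps file names with spaces intact.
        "$PDFTOTEXT" -htmlmeta "$pdf" "${pdf%.pdf}.html"
    done
}
```

After loading the function into a shell session, a call such as convert_pdfs "$HOME/Documents" would convert each PDF in that folder and leave the original files in place. The output name is given explicitly here only so the result does not depend on pdftotext's default naming.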