On Wed 27 Nov 2019 at 12:58:45 (-0500), rhkra...@gmail.com wrote:
> On Wednesday, November 27, 2019 12:07:16 PM Richard Owlett wrote:
> > I'm trying to create some spreadsheets containing nutritional data.
> >
> > Sample documents I'm using for input include:
> > > https://choosemyplate-prod.azureedge.net/sites/default/files/2WeekMenusGroceryList.pdf
> > >
> > > https://choosemyplate-prod.azureedge.net/sites/default/files/2WeekMenusAndFoodGroupContent.pdf
> > >
> > > http://www.nhlbi.nih.gov/files/docs/public/heart/new_dash.pdf
> >
> > Using "pdftotext -layout input.pdf output.txt" is tantalizingly close
> > to what I want.
> >
> > When PRINTED that visually preserves relationships.
> > I wish to copy a column of data to a spreadsheet, but can only select
> > horizontally, NOT vertically.

Perhaps look for an editor that can select rectangular blocks.
For example, emacs has rectangular variants of commands.
https://www.gnu.org/software/emacs/manual/html_node/emacs/Rectangles.html
Back in the last millennium, I was using TDE (Thomson-Davis Editor)
to do much the same in DOS.

> I didn't try your pdftotext command, so I don't know how tantalizingly close
> that got you.
>
> I opened one of the .pdfs in Okular, then switched to selection mode and
> selected a column of data using the mouse. I then copied it to a text
> editor.
> It looks very columnar to me, with the only (minor) problems being an extra
> line containing some unprintable characters (□ -- copied and pasted here, but
> they show up differently in nedit) (and a line end character).

Yes, copying directly from PDFs in xpdf also selects rectangles.
OTOH evince (by default: I'm not overly familiar with its capabilities)
appears to select by lines, even though a rectangle is displayed
while dragging the mouse.

> I'm sure that could easily be imported into a spreadsheet just specifying
> those unprintable characters as the record separator (using that text file).

Or first tidy it up in an editor.

> (I used Okular 0.14.3 as distributed in Wheezy.)
> >
> > I suspect it may not be practical in the general case.
> > Are there any examples that might come close?

You should be able to coerce any decent editor into earning you a cigar.

Cheers,
David.
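
P.S. If an editor with rectangle selection isn't to hand, the same
vertical cut can be scripted, since "pdftotext -layout" keeps each
column at roughly fixed character positions. What follows is only a
minimal Python sketch under that assumption. The function name
extract_column, the sample rows and the offsets 16 and 24 are all made
up for illustration; the real offsets would have to be read off
output.txt, and columns that drift between pages would need more care.

```python
def extract_column(text, start, end):
    """Return the non-empty cells found in character positions
    [start, end) of each line, i.e. a rectangular (vertical) cut."""
    cells = []
    for line in text.splitlines():
        cell = line[start:end].strip()
        if cell:  # skip blank lines and rows with nothing in this column
            cells.append(cell)
    return cells


# Hypothetical stand-in for a fragment of output.txt: a 16-character
# name field, then two right-aligned numeric fields.
rows = [
    ("Food", "Calories", "Sodium"),
    ("Oatmeal", "150", "5"),
    ("Banana", "105", "1"),
]
sample = "\n".join(f"{name:<16}{cal:>8}{sodium:>9}" for name, cal, sodium in rows)

# One cell per line, ready to paste into a single spreadsheet column.
print("\n".join(extract_column(sample, 16, 24)))
```

For a real file, the text would come from something like
open("output.txt").read() in place of the sample string.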