On Mon, Mar 19, 2001 at 04:26:38PM +0100, Sven LUTHER wrote:
> Hello, ...
>
> Seeing that most everything comes in pdf format these days, and that at least
> xpdf, acroread (well it is i386 only and non-free, but still usefull) and gv
> can read and display/print this format, i asked myself if would be possible to
> edit document in pdf format ?

as others have mentioned, PDF is a postscript hybrid. it'll faithfully
recreate the original document on-screen or on-printer, since it's got all
the

    moveto this spot
    set this color
    draw a line to here and there
    pick a font and size
    moveto over here
    drawtext("Hello World")

instructions therein.

the trouble is, using programs like quark xpress, graphic artists can 'kern'
their text for a more (or less) pleasing effect; when they do so, the
resulting postscript code often breaks the text at the kern point. as an
example, the word "Type" is often kerned to tuck the "y" under the "T". that
might generate pseudocode something like this:

    moveto x,y
    drawtext 'T'
    moveto h,v
    drawtext 'ype'

in the original document, the text stream knows that the characters in the
string 'Type' are consecutive and make up a whole word, and it also knows
about the kerning information needed to get 'T' to snuggle up to 'y'. in the
postscript (and in the pdf made from it) there's only a

    put 'T' over here
    put 'ype' over there

which could conceivably even be in reversed order -- so long as the human
reading the resulting display can grab the lexical intent of the word, all
is well. (ordering, in postscript, only matters when items overlap -- and if
they have no stroke and identical fill color, not even then.)

so it's hard to re-munge that sort of randomly-broken alphabetics back into
editable text. but not impossible! i can imagine some bright soul coming up
with a sort algorithm based on locale (e.g. roman = left-to-right then
top-to-bottom, vs. arabic, vs. mandarin) to re-assemble text fragments into
a likely "original text stream" format. (superscripts and subscripts throw a
monkeywrench at that concept, of course.)

asking it to also keep track of font sizing and style and so forth would be
an immense task. yet it's conceivably doable. but i don't know of any such
tool at the moment. (doesn't mean there aren't any, i just don't know of
any. could be a big difference, there.)

-- any takers? :)

-- 
It is always hazardous to ask "Why?" in science, but it is often
interesting to do so just the same.
        -- Isaac Asimov, 'The Genetic Code'

[EMAIL PROTECTED]
http://newbieDoc.sourceforge.net/ -- we need your brain!
http://www.dontUthink.com/ -- your brain needs us!
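
p.s. here's a rough sketch of the re-assembly idea from the message, for the roman (left-to-right, then top-to-bottom) case only. everything in it is made up for illustration: the `Fragment` record, its fields, the `reassemble` function, and the two tolerance values are assumptions, not part of any real pdf tool. it also assumes something else has already pulled the positioned text fragments out of the file, that y is measured downward from the top of the page, and that each fragment's width is known. superscripts, subscripts, and font tracking are ignored.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One 'put this text over here' instruction (hypothetical record)."""
    x: float      # left edge of the drawn text
    y: float      # baseline, measured downward from the top of the page
    width: float  # horizontal space the drawn text occupies
    text: str


def reassemble(fragments, line_tol=2.0, word_gap=1.5):
    """Guess the original text stream from positioned fragments.

    Fragments whose baselines differ by at most line_tol are treated as
    one line. Within a line, fragments are read left to right; a gap
    wider than word_gap is taken to be a space between words, anything
    tighter (including kerned overlap) is glued into the same word.
    """
    # Pass 1: bucket fragments into lines, top to bottom.
    lines = []
    for frag in sorted(fragments, key=lambda f: (f.y, f.x)):
        if lines and abs(frag.y - lines[-1][0].y) <= line_tol:
            lines[-1].append(frag)
        else:
            lines.append([frag])

    # Pass 2: within each line, order left to right and rejoin words.
    result = []
    for line in lines:
        line.sort(key=lambda f: f.x)
        text = line[0].text
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            text += (" " if gap > word_gap else "") + cur.text
        result.append(text)
    return result


if __name__ == "__main__":
    # 'ype' is drawn before 'T', and tucked under it (negative gap).
    page = [
        Fragment(17, 100, 20, "ype"),
        Fragment(10, 100, 8, "T"),
        Fragment(41, 100.4, 22, "here"),
    ]
    print(reassemble(page))
```

the two thresholds would really have to scale with the font size, which is exactly the bookkeeping called an immense task in the message; fixed numbers just keep the sketch short.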