Re: [Groff] Creating Word/rtf output

Ted Harding Tue, 03 Mar 2009 10:52:35 -0800

On 03-Mar-09 17:48:02, Robert Goulding wrote:
> On Tue, Mar 3, 2009 at 10:58 AM, Jeremy C. Reed <r...@reedmedia.net>
> wrote:
>>> "GNU troff?? What is that? These days the publishers really want
>>> either MSWord or TEX files. We can try the HTML result, but I
>>> predict lots of issues in the typesetting! You'll be correcting
>>> proofs for weeks.
>>
>> Maybe try troffcvt
>> http://www.snake.net/software/troffcvt/
>>
> Thanks, I'll definitely try this.  Looking through the documentation,
> it seems that it has no support for macro packages, and would not be
> able to handle footnotes in an intelligent way (it also ignores page
> position traps).  So I don't know whether it's going to be better than
> my grohtml workflow - but I'll try anything.
> 
> Will still appreciate improvements to my native-groff procedure!
> 
> Robert.


The following is my personal opinion.
I have been through the business of initially creating nice groff
formatting (resulting in PS/PDF files) only to have co-workers
and publishers insist on Word. This was years ago.

I tried troffcvt, and was very disappointed with its capabilities.
It can cope with very primitive troff, has problems with macros,
and (as far as I remember) does not recognise any of your own
macro, string or character definitions.

Seeing Jeremy's suggestion of troffcvt above, I visited the website.
I see it was last updated on 2001-01-13, and describes version 1.04.

I then went back to the ancient machine on which I installed it,
and Lo! I find that it is version 1.04, installed in June 1997!

What I found was that, even if you could get workable RTF out of
troffcvt, and import it into Word, the amount of work you had to
do to then get the Word document into decent shape was a close
approximation to re-typing it from scratch!

In the end, I wrote an 'awk' script that stripped out some troff
formatting, and recognised paragraphs so that it output each
paragraph as one continuous line (where Word is concerned, it treats
CRLF -- or used to -- as a paragraph break). In-line escapes were
kept in, as were some other things (including some macro calls)
that could be read in as if they were part of the text. Then I went
through the Word document with "Search", making by hand the formatting
changes that corresponded to the troff markup. There was also some
"find and replace" work.
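
A minimal sketch of that kind of filter, written in Python rather than
awk; it is not the original script, which is not reproduced here. The
request names in PARA_BREAKS and DROPPED are assumptions (ms-style
macros and a few layout requests), and the function name is
hypothetical. In-line escapes such as \fI are left in the text, and
unrecognised macro calls are kept on a line of their own so that they
can be found later with Word's "Search".

```python
# Assumed paragraph-breaking requests (ms-style); adjust for your macros.
PARA_BREAKS = {".PP", ".LP", ".IP", ".sp", ".bp"}
# Assumed layout-only requests that are simply dropped.
DROPPED = {".nf", ".fi", ".ll", ".in", ".ne", ".ta"}


def troff_to_paragraph_lines(source):
    """Return a list of strings, one per paragraph of the troff source."""
    paragraphs = []
    current = []

    def flush():
        # Emit the paragraph collected so far as one continuous line.
        if current:
            paragraphs.append(" ".join(current))
            current.clear()

    for raw in source.splitlines():
        line = raw.strip()
        if not line:
            flush()
            continue
        if line.startswith('.\\"'):
            continue  # troff comment line
        if line.startswith("."):
            name = line.split()[0]
            if name in PARA_BREAKS:
                flush()
            elif name in DROPPED:
                pass
            else:
                # Unknown macro call: keep it visible as its own line.
                flush()
                paragraphs.append(line)
            continue
        current.append(line)  # ordinary text; in-line escapes kept as-is

    flush()
    return paragraphs


def to_word_text(source):
    # One paragraph per CRLF-terminated line, as Word expects on import.
    return "\r\n".join(troff_to_paragraph_lines(source)) + "\r\n"
```

For example, to_word_text(open("chapter1.ms").read()) would give text in
which every paragraph is a single line ("chapter1.ms" being a made-up
file name).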

These documents were chapters for a book, so there was a lot of work.
However, it was feasible, and the best way (I think) to proceed.

Another approach I tried once was to write an 'awk' script to
convert my troff into WordPerfect 5.1. This could be done almost
perfectly, since (bless them!) WordPerfect published the full
details of the file format for WP5.1 (including tables, font
changes, non-English characters which were based on a 2-byte
encoding which I think may have been an early version of utf8).
Also, the WordPerfect 5.1 equation editor used a description
language very closely related to eqn.

The resulting WP-5.1 could be imported directly into Word (though
not always successfully, since Word tended to have its own view
of what WordPerfect meant by their format). On the other hand,
in those days, publishers were often happy to deal directly with
WordPerfect files.

Another approach I have used (when the publisher was happy with it),
one instance of which goes back to my earliest exposure to UNIX
troff in the early 1980s, is to use groff to produce "camera-ready"
copy. I have produced 3 books in this way. You need to get very
precise and detailed page layout instructions from the publisher
(and this may include fonts), but once armed with that you can
press on with groff. Groff, after all, is capable of making
arbitrary marks at arbitrary positions on the page, entirely under
your own very precise control!

These days, of course, once you have the layout instructions, the
publisher will probably be quite happy to work with the final
version in PDF format.

The overall impression that all of that left me with, though,
is that troff and Word are incompatible; and (again in my view)
Word has in-built Tourette's Syndrome.

Some publishers, indeed, will happily work with UNIX troff or
GNU groff. Not only O'Reilly (that goes without saying), but
also Harper-Collins, with whom I have done some work on their
multilingual dictionaries. They had a program to convert from
XML (which is what the dictionaries were initially composed in)
to troff. I was also able to write 'awk' routines to do the
same sort of thing, but with refinements. The resulting troff
source was then sent off to their printers, who could deal with
it as it stood (again, bless them!).
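
To give a flavour of such an XML-to-troff conversion, here is a small
Python sketch. It is purely illustrative: the element names (dict,
entry, hw, pos, tr), the use of the ms macro .LP and the function names
are all assumptions, and bear no relation to the publisher's actual
schema or program.

```python
import xml.etree.ElementTree as ET


def escape_troff(text):
    # A literal backslash must be written as \e in troff input.
    return text.replace("\\", "\\e")


def entry_to_troff(entry):
    """Format one hypothetical <entry> as a troff paragraph."""
    hw = escape_troff(entry.findtext("hw", default=""))
    pos = escape_troff(entry.findtext("pos", default=""))
    tr = escape_troff(entry.findtext("tr", default=""))
    # Headword in bold, part of speech in italic, then the translation.
    return ".LP\n\\fB" + hw + "\\fP \\fI" + pos + "\\fP " + tr


def dictionary_to_troff(xml_text):
    root = ET.fromstring(xml_text)
    return "\n".join(entry_to_troff(e) for e in root.iter("entry"))
```

The output could then be fed to groff with the ms macros in the usual
way.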

As to TeX -- I have very little experience of using that (when
I once tentatively embarked on a potential migration from troff
to TeX, I found the learning effort excessive for the end result;
in those days TeX did some things well, but troff did other
things better -- maybe it still does). But there may be mileage
in trying to convert from troff to TeX, since the basic principles
are similar.

Another possibility is to use groff to produce ".dvi" output.
If a publisher can cope with TeX, they can deal with ".dvi"
somehow. Again, this tends to come back to doing the entire
final formatting oneself, conforming to the publisher's style
and layout requirements.

Just some thoughts. I'm afraid I don't have any neat solution
for Robert's problem, and the above is based on my past experience,
some of it from a long time ago.

Best wishes to all,
Ted.

--------------------------------------------------------------------
E-Mail: (Ted Harding) <ted.hard...@manchester.ac.uk>
Fax-to-email: +44 (0)870 094 0861
Date: 03-Mar-09                                       Time: 18:52:07
------------------------------ XFMail ------------------------------
