On 03/27/2010 02:41 PM, Claudio Beccari wrote:
Dear all,
those using LyX or plain LaTeX (pdflatex) often need to convert
sources from MS Word .doc format into .lyx format.
On Linux platforms there are at least AbiWord and KWord that can open
.doc files and save them in various other formats, .tex included.
Unfortunately the LaTeX file thus obtained is pitiful.
OpenOffice will also save in LaTeX format. I have used it often myself,
but with old WordPerfect files, and you are right of course that the
output file could use some cleaning up. Much of this can be done with a
script, such as the exceedingly trivial sed script attached.
From the LyX wiki page it is possible to download a
Word2LyXMacro that works well on Windows platforms, but I did not
succeed in making it work on a Mac with MSOffice2004. On Windows the
macro performs very well, and the LyX code it produces lets LyX view
the file without problems and save it in .lyx format, of
course, but also in a pretty good LaTeX format, which in general
requires just a few minor adjustments: language, input encoding,
output font encoding, font usage (Latin Modern would be a better
default than EC if the pdflatex option is selected), and a few other
small things.
Somewhere in the description of the Debian/Ubuntu package the
wv software is suggested; apparently this software has so many
dependencies that even on an Ubuntu platform it is difficult to compile
and install, even if the wv libraries are already installed.
I would kindly suggest examining the possibility of integrating into
LyX the code needed to open, read, and edit a .doc file on any LyX
implementation (Linux, Mac, Windows), so that it can be saved in
.lyx format. Any user could then reopen the .lyx file and do with it
anything LyX is capable of.
I have thought for a while about writing some sort of doc2tex script
using OpenOffice. You could use PyUno to run OpenOffice headless, import
the doc file and then export it as LaTeX. Then one could try to do some
cleanup and, optionally, pass the resulting file to tex2lyx. But I
haven't found the time or willpower to mess with PyUno. Still, I don't
think it would be very hard for someone who knew a bit of Java.
rh
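# Cleanup rules; run with: sed -f <this file> input > output
# Each rule blanks the matched text; it does not delete the line.
s/^\\backslash.*$//g
s/^\\latex.*$//g
# [[:space:]][[:space:]]* stands in for \s+, because basic-regex sed
# treats an unescaped + as a literal plus sign.
s/^\\newline[[:space:]][[:space:]]*$//g
s/\\protected_separator[[:space:]][[:space:]]*$//g
s/\\align.*$//g
s/\\series.*$//g
# Uncomment the next line to delete the resulting empty lines:
#/^$/d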
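
The substitutions in the sed script above could also be written in
Python, for instance as the cleanup step of the doc2tex idea mentioned
earlier. This is only a sketch: the function name clean_lines and its
drop_empty parameter are hypothetical, and it assumes the \s+ in the
original patterns was meant as "one or more whitespace characters".

```python
import re

# Patterns mirroring the sed rules; every match is replaced by "".
RULES = [
    re.compile(r"^\\backslash.*$"),
    re.compile(r"^\\latex.*$"),
    re.compile(r"^\\newline\s+$"),
    re.compile(r"\\protected_separator\s+$"),
    re.compile(r"\\align.*$"),
    re.compile(r"\\series.*$"),
]


def clean_lines(text, drop_empty=False):
    """Apply each rule to each line, as sed would.

    drop_empty=True corresponds to enabling the commented-out
    /^$/d rule: lines left empty are removed from the output.
    """
    out = []
    for line in text.split("\n"):
        for rule in RULES:
            line = rule.sub("", line)
        if drop_empty and line == "":
            continue
        out.append(line)
    return "\n".join(out)
```

Calling clean_lines(open(path).read()) on an exported file would give
the cleaned text as a string, which could then be written back out
and, optionally, passed on to tex2lyx.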