On 03/27/2010 02:41 PM, Claudio Beccari wrote:
Dear all,
those using LyX or direct LaTeX (pdflatex) often need to convert sources in MS Word .doc format into .lyx format. On Linux platforms there are at least AbiWord and KWord that can open .doc files and save them in various other formats, .tex included. Unfortunately the LaTeX file thus obtained is pitiful.

OpenOffice will also save in LaTeX format. I have used it often myself, but with old WordPerfect files, and you are right of course that the output file could use some cleaning up. Much of this can be done with a script, such as the exceedingly trivial sed script attached.

From the wiki page of LyX it is possible to download a Word2LyXMacro that works well on Windows platforms, but I did not succeed in making it work on a Mac with MSOffice2004. On Windows the macro performs very well, and the LyX code produced allows LyX to view the file without problems and possibly to save it, in .lyx format of course, but also in a pretty good LaTeX format, which in general requires just a few minor adjustments: language, input encoding, output font encoding, font usage (Latin Modern would be a better default than EC if the pdflatex option is selected), and a few other small things. Somewhere in the package description for the Debian/Ubuntu package the wv software is suggested; apparently this software has so many dependencies that even on an Ubuntu platform it is difficult to compile and install, even if the wv libraries are already installed. I would kindly suggest examining the possibility of integrating into LyX the code necessary to open, read, and edit a .doc file on any LyX implementation (Linux, Mac, Windows), so as to be able to save it in .lyx format. Any user could then reopen the .lyx file and do with it anything LyX is capable of.

I have thought for a while about writing some sort of doc2tex script using OpenOffice. You could use PyUno to run OpenOffice headless, import the doc file and then export it as LaTeX. Then one could try to do some cleanup and, optionally, pass the resulting file to tex2lyx. But I haven't found the time or willpower to mess with PyUno. Still, I don't think it would be very hard for someone who knew a bit of Java.
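For what it is worth, the idea above might look roughly like the following in Python. This is only a sketch under several assumptions that are mine, not tested facts about any particular install: the office is already running headless and listening on a socket (something like soffice -headless -accept="socket,host=localhost,port=2002;urp;"), the LaTeX export filter is registered under the name LaTeX_Writer, and tex2lyx is on the PATH. The function names plan, export_latex and doc2lyx are made up for illustration.

```python
import subprocess
from pathlib import Path

# Assumed values; adjust to the local installation.
UNO_URL = "uno:socket,host=localhost,port=2002;urp;StarOffice.ComponentContext"
LATEX_FILTER = "LaTeX_Writer"  # assumed name of the LaTeX export filter


def plan(doc_path):
    """Return (source file URL, LaTeX path, LyX path) for one .doc file."""
    doc = Path(doc_path).resolve()
    return doc.as_uri(), doc.with_suffix(".tex"), doc.with_suffix(".lyx")


def export_latex(src_url, tex_path):
    """Load src_url in the running headless office and store it as LaTeX."""
    # Imported lazily so that plan() can be used without PyUno installed.
    import uno
    from com.sun.star.beans import PropertyValue

    def prop(name, value):
        p = PropertyValue()
        p.Name, p.Value = name, value
        return p

    local = uno.getComponentContext()
    resolver = local.ServiceManager.createInstanceWithContext(
        "com.sun.star.bridge.UnoUrlResolver", local)
    ctx = resolver.resolve(UNO_URL)
    desktop = ctx.ServiceManager.createInstanceWithContext(
        "com.sun.star.frame.Desktop", ctx)
    document = desktop.loadComponentFromURL(
        src_url, "_blank", 0, (prop("Hidden", True),))
    try:
        document.storeToURL(
            Path(tex_path).as_uri(), (prop("FilterName", LATEX_FILTER),))
    finally:
        document.close(True)


def doc2lyx(doc_path, run_tex2lyx=True):
    """Convert doc_path to LaTeX and, optionally, on to LyX."""
    src_url, tex_path, lyx_path = plan(doc_path)
    export_latex(src_url, tex_path)
    # Any cleanup of the exported LaTeX would go here.
    if run_tex2lyx:
        subprocess.run(["tex2lyx", str(tex_path), str(lyx_path)], check=True)
    return lyx_path
```

Calling doc2lyx("report.doc") would then leave report.tex and report.lyx next to the original, provided the assumptions above hold.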

rh

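# Each rule blanks out a matching line, or the tail of a line.
# \s+ needs GNU sed in extended-regex mode: sed -r (or -E) -f <this script>
s/^\\backslash.*$//
s/^\\latex.*$//
s/^\\newline\s+$//
s/\\protected_separator\s+$//
s/\\align.*$//
s/\\series.*$//
# Uncomment to delete the blank lines left behind:
#/^$/d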
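
For anyone who would rather keep the cleanup in the same language as a conversion script, the rules above could be ported to Python more or less as follows. This is a sketch that assumes the patterns are meant as extended regular expressions (so that \s+ means "one or more whitespace characters"); the names RULES, clean_line and clean_text are mine.

```python
import re

# One compiled pattern per sed rule, in the same order as the script.
RULES = [
    re.compile(r"^\\backslash.*$"),
    re.compile(r"^\\latex.*$"),
    re.compile(r"^\\newline\s+$"),
    re.compile(r"\\protected_separator\s+$"),
    re.compile(r"\\align.*$"),
    re.compile(r"\\series.*$"),
]


def clean_line(line):
    """Apply every rule to a single line that has no trailing newline."""
    for rule in RULES:
        line = rule.sub("", line, count=1)
    return line


def clean_text(text, drop_blank=False):
    """Clean a whole text; drop_blank mirrors the commented-out /^$/d rule."""
    lines = [clean_line(line) for line in text.split("\n")]
    if drop_blank:
        lines = [line for line in lines if line != ""]
    return "\n".join(lines)
```

With drop_blank left at False the blanked lines stay in place as empty lines, which is what the sed script does while its last rule is commented out.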
