All that seems like too much work!
I would first try converting the Word file to LaTeX, importing that into LyX, and checking the result. You will probably still have to make some manual
adjustments, but hopefully fewer.
I recently used this application:
http://www.grindeq.com/index.php?p=word2latex
It is not free, but you can download a trial version. If I am not mistaken, it works with all features enabled for a limited period of time.
Since you only want to do one conversion, it should be enough.
Good luck!
Nicolás
Steve Litt wrote:
Hi all,
I have a 300 page book written in MS Word version 97, and I have to convert it
to LyX in order to make the second edition.
I'll accept all condolences now :-)
Believe it or not, the MS Word version was written very much in what you guys
would call WYSIWYM style. I had styles for everything -- almost no appearance was
fine tuned. Obviously it's essential that all those styles transfer over into
the LyX version.
I'll accept all condolences now :-)
So here's my plan, unless someone else has a better idea.
First, I'll export to RTF.
I'll accept all condolences now :-)
Then in Vim I'll do this:
:%s/}/}\r/g
Now the RTF file will have lines that are somewhat recognizable as markup.
Next I'll look at the \stylesheet part of the RTF, and make a list of all
paragraph and character styles, sort of like this:
\fs20 Normal
\s1 heading 1
\s2 heading 2
\cs10 \additive Default Paragraph Font
\s16 myparagraphstyle
\cs17 mycharstyle
\cs18 mycharstyle2
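A list like the one above could also be pulled out of the \stylesheet group with a script. The following is a minimal Python sketch, not a full RTF parser: list_styles and its regular expressions are hypothetical helpers, and it assumes each stylesheet entry is a flat {...;} group with no nested braces, which real Word output does not always guarantee.

```python
import re

# One stylesheet entry, e.g. {\s1\ql \sbasedon0 \snext0 heading 1;}
# Assumption: entries contain no nested braces.
ENTRY = re.compile(r"\{([^{}]*);\}")
# Style number: \sN (paragraph style) or \csN (character style).
STYLE_ID = re.compile(r"\\(cs|s)(\d+)")
# Any control word (with optional numeric argument and one delimiter
# space) or any control symbol such as \*.
CONTROL = re.compile(r"\\[A-Za-z]+-?\d* ?|\\[^A-Za-z]")


def list_styles(stylesheet):
    """Return (style_tag, style_name) pairs found in an RTF stylesheet group.

    style_tag is "" for entries without a \\sN or \\csN number, which is
    how the default paragraph style (Normal) usually appears.
    """
    styles = []
    for entry in ENTRY.finditer(stylesheet):
        body = entry.group(1)
        found = STYLE_ID.search(body)
        tag = found.group(0) if found else ""
        # Whatever remains after removing control words is the style name.
        name = CONTROL.sub("", body).strip()
        styles.append((tag, name))
    return styles
```

Calling list_styles on the text of the \stylesheet group gives the number-to-name mapping needed for the substitutions; entries that come back with odd names are the ones to check by hand.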
Then, within Vim I'll run substitutions so that the text referred to by the
numbers such as \s2 is prepended with my own tags such as phdr2, and better
yet so that the text has a proper ending tag appended. This is not so simple for
three reasons:
1) There's always a bunch of gobbledygook between the \s2 and the text to
which it refers, and that must eventually be deleted.
2) There's often gobbledygook before the \s2, and that gobbledygook must
eventually be deleted.
3) It's MUCH harder to reliably put end tags at the end of the text to which
it refers. If I don't put end tags, that means I'll have a much harder time
converting it to LyX.
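The three difficulties above can be sidestepped in part by working paragraph by paragraph rather than by substitution. The following Python sketch is one way to do that, under loud assumptions: it goes straight from RTF to tagged plain text (skipping the re-import into Word), tag_paragraphs and the tag names are hypothetical, and it ignores tables, images, escaped characters, and character styles. It treats \par as the paragraph end, which is what makes a reliable end tag possible, and carries the current \sN forward until a \pard resets it.

```python
import re

# Paragraph style number, e.g. \s2 (does not match \cs17 or \sbasedon0).
STYLE_ID = re.compile(r"\\s(\d+)")
# \pard resets paragraph formatting to the default style.
RESET = re.compile(r"\\pard(?![A-Za-z])")
# \par ends a paragraph; the lookahead keeps \pard from matching.
PAR_END = re.compile(r"\\par(?![A-Za-z])")
# Control words, control symbols, and braces: the "gobbledygook" to drop.
CONTROL = re.compile(r"\\[A-Za-z]+-?\d* ?|\\[^A-Za-z]|[{}]")


def tag_paragraphs(body, tags, default="pnormal"):
    """Wrap each paragraph's text in <tag>...</tag> chosen by its \\sN style.

    tags maps style numbers to tag names, e.g. {2: "phdr2"}.
    """
    tagged = []
    current = 0  # style 0 stands for the default paragraph style
    for chunk in PAR_END.split(body):
        if RESET.search(chunk):
            current = 0
        found = STYLE_ID.search(chunk)
        if found:
            current = int(found.group(1))
        text = CONTROL.sub("", chunk).strip()
        if not text:
            continue
        tag = tags.get(current, default)
        tagged.append(f"<{tag}>{text}</{tag}>")
    return tagged
```

Because every paragraph gets both a start and an end tag, the later conversion to LyX can work line by line without guessing where a style stops.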
Next, I'll re-import the rtf into MS Word. What should happen is it re-imports
the same as it originally was, only now it has my tags. From there I should
be able to export it to plain text, and use my tags to create the LyX file
with suitable scripts. Or maybe make scripts to directly manipulate the RTF.
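As an illustration of the last step, here is a small Python sketch that turns tagged plain-text lines into LyX paragraph blocks. The tag names and the tag-to-layout mapping are hypothetical; Chapter, Section, and Standard are standard layouts in LyX's book classes, but any custom style needs a matching layout defined on the LyX side. The output is only the paragraph blocks, to be placed between \begin_body and \end_body of an existing .lyx file, and the sketch does not escape backslashes or other special characters in the text.

```python
import re

# One tagged line, e.g. <phdr2>Background</phdr2>
TAGGED = re.compile(r"<(\w+)>(.*)</\1>$")

# Hypothetical mapping from custom tags to LyX layout names.
LAYOUTS = {"phdr1": "Chapter", "phdr2": "Section", "pnormal": "Standard"}


def to_lyx_body(lines, layouts=LAYOUTS):
    """Convert tagged lines into LyX \\begin_layout ... \\end_layout blocks.

    Untagged or unknown-tag lines fall back to the Standard layout;
    blank lines are skipped.
    """
    blocks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        found = TAGGED.match(line)
        tag, text = (found.group(1), found.group(2)) if found else ("", line)
        layout = layouts.get(tag, "Standard")
        blocks.append(f"\\begin_layout {layout}\n{text}\n\\end_layout\n")
    return "\n".join(blocks)
```

Keeping the mapping in one dictionary means a new Word style only needs one new entry, plus the corresponding layout in the LyX document.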
Of course, for all my custom character and paragraph styles, I'll need to
create those styles within LyX, in a blank document, before appending the
actual content.
Then comes the cleanup. Stuff like tables and images won't convert -- I'll
need to manually do that cleanup and then run at least a rough proofread.
The good news is, because the original document used styles for almost every
appearance, fine tuning won't be necessary (hooray for styles!).
I'd estimate this to be about a week's job. That's a lot of time, but in the
end I'll have converted a 300 page book, style for style, from MS Word to
LyX.
If anyone has a better idea for converting a 300 page MS Word document to LyX,
style for style and word for word, please let me know.
Thanks
SteveT
Steve Litt
Recession Relief Package
http://www.recession-relief.US