odt conversion

stefano franchi Wed, 12 Feb 2014 07:13:09 -0800

On Wed, Feb 12, 2014 at 4:43 AM, Csikos Bela <bcsikos...@freemail.hu> wrote:


> Csikos Bela <bcsikos...@freemail.hu> wrote:
> >stefano franchi <stefano.fran...@gmail.com> wrote:
> >>My next struggle with word conversions came much sooner than I thought.
> >>Suggestions are welcome on how to tackle the conversion of a document
> >>with the following characteristics:
> >>~ 16,000 words
> >>class: article
> >>engine: LuaTex
> >>Bib: biblatex + biber
> >>no math
> >>no images
> >>no X-references, branches, etc.
> >>lots of footnotes
> >>In short, your standard Humanities article...
> >>Here is what I tried, with related problems:
> >>1. Lyx's own Xhtml
> >>a - does not know what to do with biblatex, hence all references are
> >>just bib keys and there is no bibliography
> >
> >Did you try latex2rtf?
> >
> >I don't know if it works with biblatex and footnotes but for me it worked
> >well with bibtex.
> >
> >Might be worth a try.
>
> Sorry, I did not notice you are using LuaTeX. I guess latex2rtf does not
> work with it.
>
> By the way what can it offer that latex can not?
>
>

Success!

I was finally able to do the job with tex4ht and the ooxelatex script.
ooxelatex is a script that configures tex4ht to produce output in odt
format from a xelatex source.
It took some hunting, because this script (and many other similar scripts)
has apparently been removed from the TexLive 2013 installation of tex4ht.
However, the version available on the svn repository for tex4ht did the
trick (see point 4 below for tex4ht's peculiar status in TexLive).

I now have an odt file, which I could easily convert to word's doc, with
proper footnotes and a biblatex/biber-processed bibliography. The only
difference from my original setup was to switch from luatex to xetex, but
that was painless enough. I only really need (Lua|Xe)Tex in order to work
with unicode input sources, and either does the job. I do prefer LuaTex
because of its better compatibility with microtypography when producing pdf
output, but of course that is irrelevant when converting to odt or html.
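
For anyone who wants to script the two steps described above, here is a
minimal Python sketch. It only builds and prints the commands; it does not
run them. Assumptions that are not in the message above: ooxelatex is on
the PATH and takes the source file as its only argument, the odt file is
written next to the source with the same base name, and LibreOffice's
soffice is the tool used for the odt-to-doc step. The file name
article.tex is hypothetical.

```python
import shlex
from pathlib import Path


def odt_command(tex_file):
    """Command for the tex4ht step (assumed form: ooxelatex <source file>)."""
    return ["ooxelatex", str(tex_file)]


def doc_command(tex_file):
    """Command for the odt-to-doc step, using LibreOffice in headless mode.

    Assumes the odt file has the same base name as the LaTeX source.
    """
    odt_file = Path(tex_file).with_suffix(".odt")
    return ["soffice", "--headless", "--convert-to", "doc", str(odt_file)]


def plan(tex_file):
    """Return the commands in the order they would be run."""
    return [odt_command(tex_file), doc_command(tex_file)]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in plan("article.tex"):
        print(shlex.join(command))
```

To actually execute the steps, each command list could be passed to
subprocess.run(command, check=True), so that a failure in the tex4ht step
stops the conversion before LibreOffice is called.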

Lessons learned from this experience in view of a more general lyx-doc
conversion project:

1. The production of the bibliography (and associated in-text references)
requires latex processing, hence the conversion must go from latex to
odt/doc and not from lyx to odt/doc. This may be true for other *semantic*
components of a text that require (multiple) latex processing
(X-references, indices, and so on).

1.1 This means that there are really two different use-cases for a word
conversion-tool, depending on whether the final product is doc or pdf. In
the former case, latex processing (or a simulation of latex processing
carried out from within lyx) is necessary. In the latter case it is not.
Several people may collaborate on a paper sending versions back and forth
and roundtripping between lyx and doc (Rainer's use-case, I guess) with
plenty of references, cross-references, etc, **as long as the lyx person
will produce the final pdf** (and as long as a correct system to preserve
that information through the roundtrip has been devised).

2. tex4ht can preserve all the relevant information from a latex file
because it lets latex itself do the processing instead of trying to parse
the latex file. To be more precise, it first runs latex with a special
package (tex4ht.sty) in order to produce a (modified) DVI file. Then it
runs a (java) program on the DVI file to produce (x)html, odt, docbook,
etcetera.
I wonder if a lyx-doc conversion shouldn't use the same approach, either by
relying on tex4ht itself or by trying to replicate, on a much smaller
scale, its approach. tex4ht is a very ambitious and therefore very complex
program. Perhaps a more focused (odt only) version could avoid much of the
complexity?

3. I haven't looked into the math issue. tex4ht is capable of producing
MathML from latex sources, and, according to tex4ht's own website, "The
OpenDocument code employs MathML for formulas, and XSL-FO for formatting."
I really have no idea about the meaning of that last clause or whether an
odt-MathML formula would be correctly exported to word's doc/docx format.

4. As some of you may know, tex4ht is an almost orphaned project after the
sudden and unexpected death of its creator, Eitan Gurari, in 2009. Karl
Berry and Radhakrishnan CV are maintaining the project, but there has been
very little activity since 2009. There have been frequent updates to
maintain compatibility with biblatex (which was moving very fast in those
years), but little else. Indeed the official release is still Eitan's last
of 2009. This peculiar situation may be worrisome for a conversion tool
relying on tex4ht.
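
Point 1 can be illustrated with a toy model in Python. This is only a
sketch of the idea, not how latex, biber, or tex4ht are implemented: a
first pass collects the citation keys, a separate step looks them up in
the bibliography database, and a second pass writes the resolved
references and the bibliography. A converter that skips the lookup step
can only emit the bare keys, which is the symptom described for LyX's own
XHTML export. All function names and the sample entry are hypothetical,
and only single-key \cite{...} commands are handled.

```python
import re

# Matches a single-key citation such as \cite{doe2001}.
CITE = re.compile(r"\\cite\{([^}]+)\}")


def first_pass(source):
    """Collect citation keys in order of first appearance."""
    keys = []
    for key in CITE.findall(source):
        if key not in keys:
            keys.append(key)
    return keys


def resolve(keys, database):
    """Look the keys up in the bibliography database (the bib-processor step)."""
    return {key: database[key] for key in keys if key in database}


def second_pass(source, resolved):
    """Replace each citation with its resolved form and build the bibliography.

    A key that was not resolved is left as a bare key in brackets.
    """
    def render(match):
        key = match.group(1)
        entry = resolved.get(key)
        if entry is None:
            return f"[{key}]"
        return f"({entry['author']} {entry['year']})"

    body = CITE.sub(render, source)
    bibliography = [
        f"{entry['author']} ({entry['year']}). {entry['title']}."
        for entry in resolved.values()
    ]
    return body, bibliography


if __name__ == "__main__":
    # Hypothetical source text and bibliography entry, for illustration only.
    text = r"As argued in \cite{doe2001} and \cite{missing}."
    database = {"doe2001": {"author": "Doe", "year": 2001, "title": "An Example Title"}}
    body, bibliography = second_pass(text, resolve(first_pass(text), database))
    print(body)
    print(bibliography)
```

In this model, calling second_pass with an empty resolved dictionary
reproduces the "bib keys only, no bibliography" output, which is why the
conversion has to start from the latex-processed document rather than
from the lyx file.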



Cheers,

Stefano

-- 
__________________________________________________
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies         Ph:   +1 (979) 845-2125
Texas A&M University                          Fax:  +1 (979) 845-6421
College Station, Texas, USA

stef...@tamu.edu
http://stefano.cleinias.org
