Dear LyX devels,

given the intense discussion we have had in the last few days on this
possible project, I thought I would briefly sum up some of the early
conclusions (also because some items were discussed in private
emails).
(BTW: In the following I say "Word" for the sake of brevity. I
actually mean Word XML | LibreOffice ODT.)

1. One project or two?

Is a LyX-->Word export a subset of the LyX<-->Word roundtrip?

A. If the final output is Word, the conversion to Word is a subset of
the roundtrip *if and only if* the XML output preserves LyX-only
(non-LaTeX) information (e.g. tracked changes, LyX notes, etc.).

B. If the final output is PDF, then it is not a subset. It is not
necessary to actually process the info in the .tex file with LaTeX
(e.g. the bibliography, and more). All we need to do is to make sure
that the info that LaTeX will eventually need is preserved through the
roundtrip. So, for a citation, we only need to make sure that when we
go to Word we produce something like (made up XML):
<citationCommand>
       citet
       <citationKey>
            myBibkey
       </citationKey>
</citationCommand>

and when we come back we reconstruct the proper LyX bib inset from
that information. It will then be up to LaTeX to produce the actual
citation and the corresponding reference.
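
To make the idea concrete, here is a minimal Python sketch of that
citation roundtrip. It assumes the made-up element names from the XML
above (citationCommand, citationKey); the two function names are
hypothetical, and nothing here reflects an actual Word XML or ODT
schema. It only shows that the command and the key survive the trip
without any TeX run.

```python
import xml.etree.ElementTree as ET


def citation_to_xml(command, key):
    """Encode a citation command and its key (made-up schema, see above)."""
    elem = ET.Element("citationCommand")  # hypothetical element name
    elem.text = command
    ET.SubElement(elem, "citationKey").text = key  # hypothetical element name
    return ET.tostring(elem, encoding="unicode")


def xml_to_citation(xml_text):
    """Recover (command, key) so that a LyX citation inset can be rebuilt."""
    elem = ET.fromstring(xml_text)
    # Strip whitespace so indented, hand-written XML parses the same way.
    command = (elem.text or "").strip()
    key = (elem.findtext("citationKey") or "").strip()
    return command, key


if __name__ == "__main__":
    xml_text = citation_to_xml("citet", "myBibkey")
    print(xml_text)
    print(xml_to_citation(xml_text))
```

Rebuilding the actual inset in the .lyx file from the recovered pair
is left out on purpose, since that part depends on the LyX file format
version.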

So scenario B is actually simpler, because we do not have a dependency
on LaTeX at all.
At the same time, scenario A is more important for those people who
are likely to interact with Word users (see Juergen's comments, which
I subscribe to).

In general, then, we have overlapping projects whose difference sets
are substantial:
A - The LyX-only information that needs to be somehow encoded in the XML file
B - The LaTeX-produced-only information that is missing from LyX

Preserving LyX-only information in an XML file (A) strikes me as being
substantially easier than producing the LyX-missing information (B)
for the Word file. The latter requires TeX runs, the former does not.

2. How to produce a Word output, that is, how to solve problem B above?
Since TeX is basically required to process a LyX-produced .tex file,
the following approaches are available (there may be more than three,
but these have known and working implementations):

a. Mimic a TeX run by running a TeX-like processor on the .tex file,
but target XML as output
examples: LaTeXML

b. Run LaTeX and process the resulting PDF or DVI file into XML
examples: tex4ht

c. Modify an existing TeX engine to target XML instead of PDF (or DVI)
examples: XML from ConTeXt input in LuaTeX

All three approaches are ambitious and have different shortcomings.

a. Mimicking LaTeX has the obvious problem that even once the basic
LaTeX functionality is recovered, the LaTeX packages have to be
basically recreated for the new engine. This is what happens in
LaTeXML, where you have to write "bindings" for every package you need
to support. At the moment, many packages are not supported, including
biblatex, and from the little I have seen on their mailing list adding
such support is not trivial.
On the plus side, since XML is the target, all the formatting-only
machinery of TeX can be ignored (well, in theory. The real world is
messy).

b. This approach has the advantage of bringing in support for all
LaTeX packages for free. However, parsing a DVI file with the goal of
producing XML is not trivial given the completely different design
goals of DVI and XML.

c. Finally, modifying an existing TeX engine (e.g. LuaTeX) may be the
cleanest approach, at the price of much increased complexity.

3. Should LyX<-->Word conversion be direct or use an intermediary
format (e.g. pandoc | mmd | etc.)?

This question applies mostly to the roundtrip project. The consensus
seems to be that it would be better to avoid yet another format and go
for direct conversion. On the minus side, such an approach would make
it impossible (well, more difficult) to switch back-ends for the round
trip, if so desired (see Rainer's points).



These seem to me to be the most important issues we face. I may be
forgetting some important points. If so, please correct me.
Comments of any kind are welcome.


Cheers,

Stefano
-- 
__________________________________________________
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies         Ph:   +1 (979) 845-2125
Texas A&M University                          Fax:  +1 (979) 845-6421
College Station, Texas, USA

[email protected]
http://stefano.cleinias.org
