On Mon, Mar 3, 2014 at 3:17 AM, Rainer M Krug <[email protected]> wrote:
> stefano franchi <[email protected]> writes:
>
>> Dear Lyx devels,
>
> Hi Stefano,
>
> thanks for a good summary of the discussion - I think you have
> identified the main points. I have some comments below.
>
>>
>> given the intense discussion we have had in the last few days on this
>> possible project, I thought I would briefly sum up some of the early
>> conclusions (also because some items were discussed in private
>> emails).
>> (BTW: In the following I  say "Word" for the sake of brevity.  I
>> actually mean Word XML | Libreoffice ODT)
>>
>> 1. One project or two?
>>
>> Is a LyX-->Word export a subset of the LyX<-->Word roundtrip?
>>
>> A. If the final output is Word, the conversion to Word is a subset of
>> the roundtrip *if and only if* the XML output preserves LyX-only
>> (non-LaTeX) information (e.g. tracked-changes, LyX-notes, etc).
>
> This point needs to be clarified: If one needs a semantic export, this
> is true, as all semantic information needs to be maintained in the
> round-trip as well as in the export. But not if the export should *look
> like* the latex export. The limit in these discussions is semantics, and
> not as much formatting (exceptions do apply, e.g. italics or bold of
> words, which is important in articles which contain species names which
> have to be in italics).
>

It should not.  That is: a Word-export conversion should *not* aim at
preserving formatting. It should preserve the same (often
formatting-encoded) semantic information as the round-trip converter.
The only difference is that LyX has only pointers to some of this
information (e.g. bibtex keys) and not the information itself. This is the
major problem.
(Additionally, some LyX-provided semantic information could be
shed by the exporter, such as tracked changes and so on. But this is a
minor problem.)


> Additionally: if it is a subset - perfect - but as in (B), the
> round-trip does not have to include everything, as there could be a
> semantic exporter.
>
>>
>> B. If the final output is pdf, then it is not. It is not necessary to
>> actually process the info in the .tex file with LaTeX (e.g.
>> bibliography, and more). All we need to do is to make sure that the
>> info that LaTeX will eventually need is preserved through the
>> roundtrip. So, for a citation, we only need to make sure that when we
>> go to Word we produce something like (made up XML):
>> <citationCommand>
>>        citet
>>        <citationKey>
>>             myBibkey
>>        </citationKey>
>> </citationCommand>
>>
>> and when we come back we reconstruct the proper LyX bib inset from
>> that info. It will then be up to LaTeX to produce the actual citation
>> and the corresponding reference.
>
> Agreed.
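
A minimal sketch of the citation round trip described above, in Python.
It assumes the made-up element names from the message (citationCommand,
citationKey); the function names are hypothetical and nothing here is
an actual LyX or Word schema. It only shows that the command and the
key survive the trip unchanged, with no LaTeX run involved.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def citation_to_xml(command: str, key: str) -> str:
    """Serialize a citation using the made-up XML layout from the message."""
    root = ET.Element("citationCommand")
    root.text = command
    ET.SubElement(root, "citationKey").text = key
    return ET.tostring(root, encoding="unicode")


def xml_to_citation(xml_text: str) -> Tuple[str, str]:
    """Recover (command, key) from the XML, ignoring layout whitespace."""
    root = ET.fromstring(xml_text)
    key_element = root.find("citationKey")
    if key_element is None:
        raise ValueError("citationKey element is missing")
    command = (root.text or "").strip()
    key = (key_element.text or "").strip()
    return command, key


if __name__ == "__main__":
    xml_text = citation_to_xml("citet", "myBibkey")
    print(xml_text)
    print(xml_to_citation(xml_text))
```

The importer would then rebuild the LyX citation inset from the
recovered pair; resolving the key against the bibliography is left to
LaTeX, as in the message.
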
>
>>
>> So scenario 2 is actually simpler, because we do not have a dependency
>> on LaTeX at all.
>> At the same time, scenario 1 is more important for those people who
>> are likely to interact with Word users (see Juergen's comments, which
>> I subscribe to).
>
> I would say to design the round-trip-export so that it can easily be
> extended to become a fully fledged semantic exporter.
>
>>
>> In general, then, we have overlapping projects with substantial
>> difference sets:
>> A - The LyX-only information that needs to be somehow encoded in the XML file
>> B - The Latex-produced-only information that is missing from LyX
>>
>> Preserving LyX-only information in a XML file (A) strikes me as being
>> substantially easier than producing the LyX-missing information (B)
>> for the Word file. The latter requires TeX runs, the former does not.
>
> I assume you are here referring to the last A and B, as if I understand
> correctly, the first definition of A and B is the opposite?

Yes, sorry, I switched them around. But the following refers correctly to
the A and B options just mentioned.

>
>
>>
>> 2. How to produce a Word output, that is, how to solve problem B above?
>> Since TeX is basically required to process a Lyx-produced tex file,
>> the following approaches are available (there may be more than three,
>> but these have known and working implementations):
>>
>> a. Mimic a TeX run by running a TeX-like processor on the tex file,
>> but target XML as output
>> examples: LaTeXML
>>
>> b. Run Latex and process the resulting Pdf or DVI file into XML
>> examples: tex4ht
>>
>> c. Modify an existing Tex engine to target XML instead of pdf (or dvi)
>> examples: XML from ConTeXt input in LuaTeX
>>
>> All three approaches are ambitious and have different shortcomings.
>>
>> (a) (Mimicking Latex) has the obvious problem that even once the basic
>> LaTeX functionality is recovered, the LaTeX packages have to be
>> basically recreated for the new engine. This is what happens in
>> LaTeXML, where you have to write "bindings" for every package you need
>> to support. At the moment, many packages are not supported, including
>> biblatex, and from the little I have seen on their mailing list adding
>> such support is not trivial.
>> On the plus side, since XML is the target, all the formatting-only
>> machinery of TeX can be ignored (well, in theory. Real world is messy)
>>
>> (b) This approach has the advantage of bringing in support for all
>> LaTeX packages for free. However, parsing a DVI file with the goal of
>> producing XML is not trivial given the completely different design
>> goals of DVI vs. XML.
>>
>> (c) Finally, modifying an existing TeX engine (e.g. LuaTeX) may be the
>> cleaner approach---at the price of much increased complexity.
>>
>> 3. Should  LyX<-->Word conversion be direct or use an intermediary
>> format (e.g. pandoc | mmd | etc.)?
>>
>> This question applies mostly to the roundtrip project. The consensus
>> seems to be that it would be better to avoid yet another format and go
>> for direct conversion. On the minus side, such an approach would make
>> it impossible (well, more difficult) to switch back-ends for the round
>> trip, if so desired (see Rainer's points)
>
> Unless one defines a clear software interface, which can be used by
> other converters. Effectively, this could mean to extend the LyX server
> to provide the information needed by the converter. So the parsing would
> be done in LyX (advantage: no worries about different .lyx formats) and
> the conversion into docx in the external converter.

Right, even though I am not yet sure I fully understand Rob's suggestion
about using the server.
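
To make the idea of a clear software interface concrete, here is a
rough Python sketch of the split Rainer describes: one side parses and
exposes the document, the other side converts it. Every name below
(Element, DocumentSource, InMemorySource, the render functions) is
hypothetical; none of it models the actual LyX server or its
protocol.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Protocol


@dataclass
class Element:
    """One neutral document element handed from the parser to a converter."""
    kind: str      # hypothetical kinds, e.g. "paragraph" or "citation"
    content: str


class DocumentSource(Protocol):
    """What a converter needs from the parsing side (assumed interface)."""

    def elements(self) -> Iterable[Element]:
        ...


class InMemorySource:
    """Stand-in for the parsing side; a real one would live inside LyX."""

    def __init__(self, items: List[Element]) -> None:
        self._items = items

    def elements(self) -> Iterable[Element]:
        return iter(self._items)


def render_plain(element: Element) -> str:
    """Toy back-end: citations become bracketed keys, the rest is copied."""
    if element.kind == "citation":
        return f"[{element.content}]"
    return element.content


def render_tagged(element: Element) -> str:
    """Second toy back-end: wrap each element in a tag named after its kind."""
    return f"<{element.kind}>{element.content}</{element.kind}>"


def convert(source: DocumentSource, render: Callable[[Element], str]) -> str:
    """Run any back-end over any source; neither knows about the other."""
    return "\n".join(render(element) for element in source.elements())
```

With such a boundary, switching back-ends means passing a different
render function to convert(), and changes to the .lyx format stay on
the parsing side.
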


-- 
__________________________________________________
Stefano Franchi
Associate Research Professor
Department of Hispanic Studies         Ph:   +1 (979) 845-2125
Texas A&M University                          Fax:  +1 (979) 845-6421
College Station, Texas, USA

[email protected]
http://stefano.cleinias.org
