Liviu Andronic <landronim...@gmail.com> writes:

> On Tue, Mar 4, 2014 at 12:17 AM, stefano franchi
> <stefano.fran...@gmail.com> wrote:
>>> not as much formatting (exceptions do apply, e.g. italics or bold of
>>> words, which is important in articles which contain species names which
>>> have to be in italics).
>>>
>>
>> It should not. That is: a word-export conversion should *not* aim at
>> preserving formatting. It should preserve the same (often
>> formatting-encoded) semantic information of the round-trip converter.
>>
> I agree. LyX should focus on styles only, and users should use the
> Logical Markup module for things like italic/bold. This will preserve
> the style, which the user can then configure as desired in
> Word/OpenOffice.

But this is, unfortunately, not what is happening during the export to
formats other than LaTeX. Most of the export routes (auto-configured by
LyX) try to mimic the LaTeX document to a certain degree - and this is
where some misunderstandings come in.

I would actually suggest splitting the exporter section into two:

1) Formatted output - these apply formatting and try to mimic the
   LaTeX look by fingerpainting

2) Semantic export - these try to maintain the semantic information
   and leave the formatting to the program handling the exported file.

This separation would even make it possible to have several export
paths to the same format auto-configured.

Cheers,

Rainer

>
> Liviu
>
>
>> The only difference is that LyX has only pointers for some of these
>> info (e.g. bibtex keys) and not the information itself. This is the
>> major problem.
>> (Additionally, some LyX-provided semantic information could be
>> shed by the exporter, such as tracked-changes and so on. But this is a
>> minor problem).
>>
>>
>>> Additionally: if it is a subset - perfect - but as in (B), the
>>> round-trip does not have to include everything, as there could be a
>>> semantic exporter.
>>>
>>>>
>>>> B. If the final output is pdf, then it is not. It is not necessary to
>>>> actually process the info in the .tex file with Latex (e.g.
>>>> bibliography, and more). All we need to do is to make sure that the
>>>> info that Latex will eventually need are preserved through the
>>>> roundtrip. So, for a citation, we only need to make sure that when we
>>>> go to Word we produce something like (made up XML):
>>>> <citationCommand>
>>>> citet
>>>> <citationKey>
>>>> myBibkey
>>>> </citationKey>
>>>> </citationCommand>
>>>>
>>>> and when we come back we reconstruct the proper LyX bib inset from
>>>> those info. It will then be up to Latex to produce the actual citation
>>>> and the corresponding reference.
>>>
>>> Agreed.
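
To make the citation round trip quoted above concrete, here is a minimal
sketch in Python. It assumes the made-up element names from the quoted
message (citationCommand, citationKey); the function names are
hypothetical, and the inset text only approximates what a .lyx file
contains. It is an illustration of the idea, not an existing LyX or
Word interface.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def citation_to_xml(command: str, key: str) -> str:
    """Serialize a citation using the made-up XML layout from the message."""
    root = ET.Element("citationCommand")
    root.text = command
    key_el = ET.SubElement(root, "citationKey")
    key_el.text = key
    return ET.tostring(root, encoding="unicode")


def xml_to_citation(xml_text: str) -> Tuple[str, str]:
    """Recover (command, key) from the XML, ignoring layout whitespace."""
    root = ET.fromstring(xml_text)
    command = (root.text or "").strip()
    key_el = root.find("citationKey")
    key = (key_el.text or "").strip() if key_el is not None else ""
    if not command or not key:
        raise ValueError("citation XML lacks a command or a key")
    return command, key


def lyx_citation_inset(command: str, key: str) -> str:
    """Rebuild a citation inset (approximation of the .lyx syntax)."""
    return (
        "\\begin_inset CommandInset citation\n"
        f"LatexCommand {command}\n"
        f'key "{key}"\n'
        "\n"
        "\\end_inset\n"
    )


if __name__ == "__main__":
    xml_text = citation_to_xml("citet", "myBibkey")
    command, key = xml_to_citation(xml_text)
    print(lyx_citation_inset(command, key))
```

Only the command and the key travel through the Word file; no TeX run is
involved, and LaTeX resolves the actual reference after the return trip.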
>>>
>>>>
>>>> So scenario 2 is actually simpler, because we do not have a dependency
>>>> on LaTeX at all.
>>>> At the same time, scenario 1 is more important for those people who
>>>> are likely to interact with Word users (see Juergen's comments, which
>>>> I subscribe to).
>>>
>>> I would say to design the round-trip-export so that it can easily be
>>> extended to become a fully fledged semantic exporter.
>>>
>>>>
>>>> In general, then, we have overlapping projects with substantial
>>>> difference sets:
>>>> A - The LyX-only information that needs to be somehow encoded in the XML
>>>> file
>>>> B - The Latex-produced-only information that is missing from LyX
>>>>
>>>> Preserving LyX-only information in a XML file (A) strikes me as being
>>>> substantially easier than producing the LyX-missing information (B)
>>>> for the Word file. The latter requires TeX runs, the former does not.
>>>
>>> I assume you are here referring to the last A and B, as if I understand
>>> correctly, the first definition of A and B is the opposite?
>>
>> Yes, Sorry, I switched them up. But the following refers correctly to
>> the A and B options just mentioned.
>>
>>>
>>>
>>>>
>>>> 2. How to produce a Word output, that is, how to solve problem B above?
>>>> Since TeX is basically required to process a Lyx-produced tex file,
>>>> the following approaches are available (there may be more than three,
>>>> but these have known and working implementations):
>>>>
>>>> a. Mimic a TeX run by running a TeX-like processor on the tex file,
>>>> but target XML as output
>>>> examples: LaTeXML
>>>>
>>>> b. Run Latex and process the resulting Pdf or DVI file into XML
>>>> examples: tex4ht
>>>>
>>>> c. Modify an existing Tex engine to target XML instead of pdf (or dvi)
>>>> examples: XML from Context input in LuaTeX
>>>>
>>>> All three approaches are ambitious and have different shortcomings.
>>>>
>>>> (a) (Mimicking Latex) has the obvious problem that even once the basic
>>>> LaTeX functionality is recovered, the LaTeX packages have to be
>>>> basically recreated for the new engine. This is what happens in
>>>> LaTeXML, where you have to write "bindings" for every package you need
>>>> to support. At the moment, many packages are not supported, including
>>>> biblatex, and from the little I have seen on their mailing list adding
>>>> such support is not trivial.
>>>> On the plus side, since XML is the target, all the formatting-only
>>>> machinery of TeX can be ignored (well, in theory. Real world is messy)
>>>>
>>>> b. This approach has the advantage of bringing in support for all
>>>> LaTeX packages for free. However, parsing a DVI file with the goal of
>>>> producing XML is not trivial given the completely different design
>>>> goals of DVI vs. XML
>>>>
>>>> c. Finally, modifying an existing TeX engine (e.g. LuaTeX) may be the
>>>> cleaner approach---at the price of much increased complexity.
>>>>
>>>> 3. Should LyX<-->Word conversion be direct or use an intermediary
>>>> format (e.g. pandoc | mmd | etc.)?
>>>>
>>>> This question applies mostly to the roundtrip project. The consensus
>>>> seems to be that it would be better to avoid yet another format and go
>>>> for direct conversion. On the minus side, such an approach would make
>>>> it impossible (well, more difficult) to switch back-ends for the round
>>>> trip, if so desired (see Rainer's points)
>>>
>>> Unless one defines a clear software interface, which can be used by
>>> other converters. Effectively, this could mean to extend the LyX server
>>> to provide the information needed by the converter. So the parsing would
>>> be done in LyX (advantage: no worries about different .lyx formats) and
>>> the conversion into docx in the external converter.
>>
>> Right. Even though I am not sure I fully understand Rob's suggestion
>> about using the server yet.
>>
>>
>> --
>> __________________________________________________
>> Stefano Franchi
>> Associate Research Professor
>> Department of Hispanic Studies Ph: +1 (979) 845-2125
>> Texas A&M University Fax: +1 (979) 845-6421
>> College Station, Texas, USA
>>
>> stef...@tamu.edu
>> http://stefano.cleinias.org

--
Rainer M. Krug
email: RMKrug<at>gmail<dot>com