Re: New DocBook support (0005)

Pavel Sanda Sun, 05 Jul 2020 03:00:28 -0700

On Sat, Jul 04, 2020 at 12:42:31AM +0200, Thibaut Cuvelier wrote:
> From f7004ab6518c230f6cedd89410d9059e9596a113 Mon Sep 17 00:00:00 2001
> From: Thibaut Cuvelier <cuvelier.thib...@gmail.com>
> Date: Mon, 8 Jun 2020 23:27:49 +0200
> Subject: [PATCH 5/9] New DocBook support


Now the real beef. Given the length of the patch it's impossible to stay
fully attentive throughout, so this might not be everything and we'll
probably go through more iterations, sorry...

BTW I really appreciate the verbosity of your comments, esp. on the trickier
parts of the code like in InsetIndex; I only wish the rest of the LyX
codebase was similar.

...
> +docstring authorsToDocBookAuthorGroup(docstring const & authorsString, 
> XMLStream & xs, Buffer const & buf)
...
> +            if (! parts.prefix.empty())
> +                xs << xml::StartTag("honorific") << parts.prefix << 
> xml::EndTag("honorific") << xml::CR();
> +            if (! parts.prename.empty())
> +                xs << xml::StartTag("firstname") << parts.prename << 
> xml::EndTag("firstname") << xml::CR();
> +            if (! parts.surname.empty())
> +                xs << xml::StartTag("surname") << parts.surname << 
> xml::EndTag("surname") << xml::CR();
> +            if (! parts.suffix.empty())
> +                xs << xml::StartTag("othername", "role=\"suffix\"") << 
> parts.suffix << xml::EndTag("othername") << xml::CR();

Let me voice again my concern that this kind of chaining might backfire.
operator << does not guarantee the order in which its operands are evaluated
(at least before C++17); the clear case was the previous code like:
xs << (a == 0 ? c : d) << ++a;
Different compilers can either do ++a first and then compare, or vice versa.

This case is more tricky, because it depends on whether the StartTag/EndTag
functionality depends on ordering.
If the only thing it does is syntactic sugar like TAG -> <TAG> then you are
fine; if there is some counting of nesting depth you are asking for trouble.
I couldn't tell which is the case from a quick look at the StartTag
implementation, but it seems to touch some internal structures, so we might
be in trouble (and even if we are not now, it's waiting for someone to
refactor it).

A sequence point like ';' guarantees the order, so
xs << xml::StartTag("honorific") << parts.prefix;
xs << xml::EndTag("honorific") << xml::CR();
looks like the safer variant to me. I see the construct
xs << StartTag << EndTag is everywhere, so it might be difficult to change
now. If you leave it as is, please make sure that calling EndTag earlier than
StartTag is OK, and add a comment in the class implementation for people who
might want to make it more complex one day.
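
To make the distinction concrete, here is a minimal self-contained sketch.
It is not LyX code: Stream, StartTag and EndTag are hypothetical stand-ins
for XMLStream and the xml:: tag types, and the nesting check is an assumption
about what such a class might do. As far as I understand the language rules,
each << in a chain needs the stream returned by the previous one, so the
insertions themselves are applied left to right; what is left open (before
C++17) is when the operand expressions get evaluated. The chain is therefore
only as safe as the operands are free of side effects.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins for xml::StartTag / xml::EndTag: plain values,
// constructing them has no side effects.
struct StartTag { std::string name; };
struct EndTag { std::string name; };

// Hypothetical stand-in for XMLStream that tracks the open tags.
class Stream {
public:
    Stream & operator<<(StartTag const & t)
    {
        open_.push_back(t.name);
        out_ += "<" + t.name + ">";
        return *this;
    }
    Stream & operator<<(EndTag const & t)
    {
        // Fires if an end tag is ever applied before its start tag.
        assert(!open_.empty() && open_.back() == t.name);
        open_.pop_back();
        out_ += "</" + t.name + ">";
        return *this;
    }
    Stream & operator<<(std::string const & text)
    {
        out_ += text;
        return *this;
    }
    std::string const & str() const { return out_; }
private:
    std::vector<std::string> open_; // names of the currently open tags
    std::string out_;               // serialized output so far
};

// Chained form: all state changes happen inside operator<<, and each call
// consumes the result of the previous one.
std::string chained(std::string const & text)
{
    Stream xs;
    xs << StartTag{"honorific"} << text << EndTag{"honorific"};
    return xs.str();
}

// Split form: the ';' additionally orders the evaluation of the operands,
// which matters once an operand has side effects (like ++a).
std::string split(std::string const & text)
{
    Stream xs;
    xs << StartTag{"honorific"} << text;
    xs << EndTag{"honorific"};
    return xs.str();
}
```

Both functions give the same output here because building the tag objects
changes nothing in the stream. If a tag constructor (or any other operand)
ever starts touching shared state, only the split form stays predictable on
older compilers.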

> -Buffer::ExportStatus Buffer::writeDocBookSource(odocstream & os, string 
> const & fname,
> +Buffer::ExportStatus Buffer::writeDocBookSource(odocstream & os, string 
> const & /*fname*/,

time to get rid of string const & /*fname*/ altogether? 

> @@ -79,6 +79,12 @@ public:
>       std::string const & htmlAttrib() const;
>       /// tag type, defaults to "div"
>       std::string const & htmlTag() const;
> +     /// tag type
> +     std::string const & docbookTag(bool hasTitle = false) const;
> +     /// attribute (mostly, role)
> +     std::string const & docbookAttr() const;
> +     /// caption tag (mostly, either caption or title)
> +     std::string const & docbookCaption() const;

the comments should rather go next to the variables below

>  private:
>       ///
>       std::string defaultCSSClass() const;
> @@ -120,6 +126,12 @@ private:
>       mutable std::string defaultcssclass_;
>       ///
>       docstring html_style_;
> +     ///
> +     mutable std::string docbook_tag_;
> +     ///
> +     mutable std::string docbook_caption_;
> +     ///
> +     std::string docbook_attr_;
>  };


>  //  The order of the LayoutTags enum is no more important. [asierra300396]
> @@ -104,6 +104,21 @@ enum LayoutTags {
>       LT_HTMLPREAMBLE,
>       LT_HTMLSTYLE,
>       LT_HTMLFORCECSS,
> +     LT_DOCBOOKTAG,
> +     LT_DOCBOOKATTR,
> +    LT_DOCBOOKININFO,
> +     LT_DOCBOOKWRAPPERTAG,
> +     LT_DOCBOOKWRAPPERATTR,
> +     LT_DOCBOOKSECTIONTAG,
> +    LT_DOCBOOKITEMWRAPPERTAG,
> +    LT_DOCBOOKITEMWRAPPERATTR,
> +    LT_DOCBOOKITEMTAG,
> +    LT_DOCBOOKITEMATTR,
> +    LT_DOCBOOKITEMLABELTAG,
> +    LT_DOCBOOKITEMLABELATTR,
> +     LT_DOCBOOKITEMINNERTAG,
> +     LT_DOCBOOKITEMINNERATTR,
> +     LT_DOCBOOKFORCEABSTRACTTAG,

ditto

> @@ -204,6 +219,21 @@ bool Layout::readIgnoreForcelocal(Lexer & lex, TextClass 
> const & tclass)
>               { "commanddepth",   LT_COMMANDDEPTH },
>               { "copystyle",      LT_COPYSTYLE },
>               { "dependson",      LT_DEPENDSON },
> +             { "docbookattr",             LT_DOCBOOKATTR },
> +             { "docbookforceabstracttag", LT_DOCBOOKFORCEABSTRACTTAG },
> +        { "docbookininfo",           LT_DOCBOOKININFO },
> +        { "docbookitemattr",         LT_DOCBOOKITEMATTR },
> +             { "docbookiteminnerattr",    LT_DOCBOOKITEMINNERATTR },
> +             { "docbookiteminnertag",     LT_DOCBOOKITEMINNERTAG },
> +             { "docbookitemlabelattr",    LT_DOCBOOKITEMLABELATTR },
> +             { "docbookitemlabeltag",     LT_DOCBOOKITEMLABELTAG },
> +        { "docbookitemtag",          LT_DOCBOOKITEMTAG },
> +        { "docbookitemwrapperattr",  LT_DOCBOOKITEMWRAPPERATTR },
> +        { "docbookitemwrappertag",   LT_DOCBOOKITEMWRAPPERTAG },
> +             { "docbooksectiontag",       LT_DOCBOOKSECTIONTAG },
> +             { "docbooktag",              LT_DOCBOOKTAG },
> +             { "docbookwrapperattr",      LT_DOCBOOKWRAPPERATTR },
> +             { "docbookwrappertag",       LT_DOCBOOKWRAPPERTAG },
>               { "end",            LT_END },
>               { "endlabelstring", LT_ENDLABELSTRING },
>               { "endlabeltype",   LT_ENDLABELTYPE },

ditto

> @@ -689,6 +719,66 @@ bool Layout::readIgnoreForcelocal(Lexer & lex, TextClass 
> const & tclass)
> +             case LT_DOCBOOKWRAPPERATTR:
> +                     lex >> docbookwrapperattr_;
> +                     break;
> +
> +             case LT_DOCBOOKSECTIONTAG:
> +                     lex >> docbooksectiontag_;
> +                     break;
> +
> +        case LT_DOCBOOKITEMWRAPPERTAG:
> +            lex >> docbookitemwrappertag_;
> +            break;
> +
> +        case LT_DOCBOOKITEMWRAPPERATTR:
> +            lex >> docbookitemwrapperattr_;
> +            break;
> +
> +             case LT_DOCBOOKITEMTAG:
> +                     lex >> docbookitemtag_;
> +                     break;
> +

ditto

>  string makeMarginValue(char const * side, double d)
> diff --git a/src/Layout.h b/src/Layout.h
> index ffc976d8ff..bfcb510861 100644
> --- a/src/Layout.h
> +++ b/src/Layout.h
> @@ -193,6 +193,36 @@ public:
>       ///
>       bool htmltitle() const { return htmltitle_; }
>       ///
> +     std::string const & docbookattr() const;
> +     ///
> +     std::string const & docbookininfo() const;
> +    ///
> +    std::string const & docbookwrappertag() const;
> +    ///
> +    std::string const & docbookwrapperattr() const;
> +    ///
> +    std::string const & docbooksectiontag() const;
> +    ///
> +    std::string const & docbookitemwrappertag() const;
> +    ///
> +    std::string const & docbookitemwrapperattr() const;
> +    ///
> +    std::string const & docbookitemlabeltag() const;
> +    ///
> +    std::string const & docbookitemlabelattr() const;
> +     ///
> +     std::string const & docbookiteminnertag() const;

ditto

> @@ -457,6 +487,39 @@ private:
>       bool htmllabelfirst_;
>       /// CSS information needed by this layout.
>       docstring htmlstyle_;
> +     mutable std::string docbookitemtag_;
> +     /// Roles to add to docbookitemtag_, if any (default: none).
> +     mutable std::string docbookitemattr_;
> +    /// Tag corresponding to the wrapper around an item (mainly for lists).
> +    mutable std::string docbookitemwrappertag_;
> +    /// Roles to add to docbookitemwrappertag_, if any (default: none).
> +    mutable std::string docbookitemwrapperattr_;
> +    /// Tag corresponding to this label (only for description lists;
> +    /// labels in the common sense do not exist with DocBook).
> +     mutable std::string docbookitemlabeltag_;
> +     /// Roles to add to docbooklabeltag_, if any (default: none).
> +     mutable std::string docbookitemlabelattr_;
> +     /// Tag to add within the item, around its direct content (mainly for 
> lists).
> +     mutable std::string docbookiteminnertag_;
> +     /// Roles to add to docbookiteminnertag_, if any (default: none).
> +     mutable std::string docbookiteminnerattr_;
> +    /// Tag corresponding to this wrapper around the main tag.
> +    mutable std::string docbookwrappertag_;
> +    /// Roles to add to docbookwrappertag_, if any (default: none).
> +    mutable std::string docbookwrapperattr_;
> +    /// Outer tag for this section, only if this layout represent a 
> sectionning item, including chapters (default: section).
> +    mutable std::string docbooksectiontag_;
> +    /// Whether this tag must/can/can't go into an <info> tag (default: 
> never, as it only makes sense for metadata).
> +     mutable std::string docbookininfo_;
> +    /// whether this element (root or not) does not accept text without a 
> section(i.e. the first text that is met
> +     /// in LyX must be considered as the abstract if this is true); this 
> text must be output with the specific tag
> +     /// held by this attribute
> +    mutable std::string docbookforceabstracttag_;

ditto

> diff --git a/src/OutputParams.h b/src/OutputParams.h
> index 1ad36722d0..0244a0ea41 100644
> --- a/src/OutputParams.h
> +++ b/src/OutputParams.h
> @@ -16,6 +16,7 @@
>  #include "Changes.h"
>  
>  #include <memory>
> +#include <unordered_set>
>  
> +     /// Anchors that should not be output (LyX-side identifier, not 
> DocBook-side).
> +     std::unordered_set<docstring> docbook_anchors_to_ignore;

Adding new dependencies to .h files is not the best thing for compilation
speed. Wouldn't std::vector, already included via Changes.h, be enough (do
we care about uniqueness)?
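
A minimal sketch of what I mean, under stated assumptions: std::string stands
in for docstring, and the names AnchorList, isAnchorIgnored and
addAnchorToIgnore are made up for illustration. With a plain vector, lookup
is a linear scan, and uniqueness only holds if we enforce it on insertion.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// std::string stands in for docstring; all names here are hypothetical.
typedef std::vector<std::string> AnchorList;

// Linear scan: O(n) per lookup, fine for a handful of anchors.
bool isAnchorIgnored(AnchorList const & anchors, std::string const & id)
{
    return std::find(anchors.begin(), anchors.end(), id) != anchors.end();
}

// Keeps the list duplicate-free; returns true if the anchor was added.
bool addAnchorToIgnore(AnchorList & anchors, std::string const & id)
{
    if (isAnchorIgnored(anchors, id))
        return false;
    anchors.push_back(id);
    return true;
}
```

Whether this is good enough depends on how many anchors a document can
accumulate; for large counts the hashed container would win on lookup time.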

> +void Paragraph::simpleDocBookOnePar(Buffer const & buf,
...
> +        // FIXME XHTML
> +        // Other such tags? What about the other text ranges?
> +
> +        vector<xml::EndFontTag>::const_iterator cit = tagsToClose.begin();
> +        vector<xml::EndFontTag>::const_iterator cen = tagsToClose.end();
> +        for (; cit != cen; ++cit)
> +            xs << *cit;
...

> +    // FIXME XHTML
> +    // I'm worried about what happens if a branch, say, is itself
> +    // wrapped in some font stuff. I think that will not work.
> +    xs.closeFontTags();

Just to clarify: is FIXME XHTML here because we need to fix this
in other routines implementing XHTML, or because this routine is shared?

> +    void simpleDocBookOnePar(Buffer const & buf,
> +                                                      XMLStream &,
> +                                          OutputParams const & runparams,
> +                                          Font const & outerfont,
> +                             bool start_paragraph = true,
> +                             bool close_paragraph = true,
> +                                          pos_type initial = 0) const;

whitespace

> diff --git a/src/insets/InsetBibtex.cpp b/src/insets/InsetBibtex.cpp
> index 3bfc593013..368d4f7ef6 100644
> --- a/src/insets/InsetBibtex.cpp
> +++ b/src/insets/InsetBibtex.cpp
> @@ -51,6 +50,11 @@
>  #include "support/textutils.h"
>  
>  #include <limits>
> +#include <map>
> +#include <regex>

Careful here. I was never part of that discussion, but there were some
issues with the regex implementation in different compilers, so we provide
#include "support/regex.h"
and I think it should be used consistently across our codebase.
(someone else might chime in?)

> +void InsetBibtex::docbook(XMLStream & xs, OutputParams const &) const
> +     // Header for bibliography (title required).
> +     xs << xml::StartTag("bibliography") << xml::CR()
> +        << xml::StartTag("title")
> +        << reflabel
> +        << xml::EndTag("title") << xml::CR();

chains everywhere in this function..

> +void InsetCaption::docbook(XMLStream &, OutputParams const &) const
>  {
> -     int ret;
> -     os << "<title>";
> -     ret = InsetText::docbook(os, runparams);
> -     os << "</title>\n";
> -     return ret;
> +     // This function should never be called (rather InsetFloat::docbook, 
> the titles should be skipped in floats).

adding a LYXERR warning?

> +void InsetFloat::docbook(XMLStream & xs, OutputParams const & runparams) 
> const
> +{
> +     // Determine whether the float has a title or not. For this, iterate 
> through the paragraphs and look
> +    // for an InsetCaption. Do the same for labels and subfigures.
> +    // The caption and the label for each subfigure is handled by recursive 
> calls.
> +    const InsetCaption* caption = nullptr;
> +    const InsetLabel* label = nullptr;
> +    std::vector<const InsetBox *> subfigures;
> +
> +    auto it = paragraphs().begin();
> +    auto end = paragraphs().end();
> +     for (; it != end; ++it) {

whitespace

> +    if (!ftype.docbookAttr().empty()) {
> +        if (!attr.empty())
> +            attr += " ";
> +        attr += from_utf8(ftype.docbookAttr());
> +    }
> +
> +     xs << xml::StartTag(ftype.docbookTag(caption != nullptr), attr) << 
> xml::CR();
> +    if (caption != nullptr) {

ditto

> diff --git a/src/insets/InsetGraphics.cpp b/src/insets/InsetGraphics.cpp
> index 0daae4c921..50a5b58fa4 100644
> --- a/src/insets/InsetGraphics.cpp
> +++ b/src/insets/InsetGraphics.cpp
> @@ -96,6 +96,7 @@ TODO
>  #include <algorithm>
>  #include <sstream>
>  #include <tuple>

unrelated, but I wonder whether <tuple> is ever used here

> +void InsetHyperlink::docbook(XMLStream & xs, OutputParams const &) const
>  {
> -     os << "<ulink url=\""
> -        << subst(getParam("target"), from_ascii("&"), from_ascii("&amp;"))
> -        << "\">"
> +     xs << xml::StartTag("link", "xlink:href=\"" + subst(getParam("target"), 
> from_ascii("&"), from_ascii("&amp;")) + "\"")
>          << xml::escapeString(getParam("name"))
> -        << "</ulink>";
> -     return 0;
> +        << xml::EndTag("link");

chains

> +             // Handle the index terms (including the specific index for 
> this entry).
> +             xs << xml::StartTag("indexterm", attrs);
> +             if (terms.size() > 0) { // hasEndRange has no content.
> +                     xs << xml::StartTag("primary")
> +                        << terms[0]
> +                        << xml::EndTag("primary");
> +             }
> +             if (terms.size() > 1) {
> +                     xs << xml::StartTag("secondary")
> +                        << terms[1]
> +                        << xml::EndTag("secondary");
> +             }
> +             if (terms.size() > 2) {
> +                     xs << xml::StartTag("tertiary")
> +                        << terms[2]
> +                        << xml::EndTag("tertiary");
> +             }
> +
> +             // Handle see and see also.
> +             if (!see.empty()) {
> +                     xs << xml::StartTag("see")
> +                        << see
> +                        << xml::EndTag("see");
> +             }
> +
> +             if (!seeAlsoes.empty()) {
> +                     for (auto & entry : seeAlsoes) {
> +                             xs << xml::StartTag("seealso")
> +                                << entry
> +                                << xml::EndTag("seealso");

chains

> +void InsetListings::docbook(XMLStream & xs, OutputParams const & rp) const
> +     if (!caption.empty())
> +             xs << xml::StartTag("bridgehead")
> +                << XMLStream::ESCAPE_NONE
> +                << caption
> +                << xml::EndTag("bridgehead");

chains

> +void InsetNewline::docbook(XMLStream & xs, OutputParams const & runparams) 
> const
>  {
> -     os << '\n';
> -     return 0;
> +     if (runparams.docbook_in_par) {
> +             xs.closeFontTags();
> +             xs << xml::EndTag("para");
> +             xs << xml::StartTag("para");

chains

> -// FIXME This should be changed to use the TOC. Perhaps
> -// that could be done when XHTML output is added.
> -int InsetPrintNomencl::docbook(odocstream & os, OutputParams const &) const
> +void InsetPrintNomencl::docbook(XMLStream & xs, OutputParams const & 
> runparams) const
..
> +     xs << xml::StartTag("glossary")
> +        << xml::CR()
> +        << xml::StartTag("title")
> +        << toclabel
> +        << xml::EndTag("title")
> +        << xml::CR();
> +
> +     EntryMap::const_iterator eit = entries.begin();
> +     EntryMap::const_iterator const een = entries.end();
> +     for (; eit != een; ++eit) {
> +             NomenclEntry const & ne = eit->second;
> +             string const parid = ne.par->magicLabel();
> +
> +             xs << xml::StartTag("glossentry", "xml:id=\"" + parid + "\"")
> +                << xml::CR()
> +                << xml::StartTag("glossterm")
> +                << ne.symbol
> +                << xml::EndTag("glossterm")
> +                << xml::CR()
> +                << xml::StartTag("glossdef")
> +                << xml::CR()
> +                << xml::StartTag("para")
> +                << ne.desc
> +                << xml::EndTag("para")
> +                << xml::CR()
> +                << xml::EndTag("glossdef")
> +                << xml::CR()
> +                << xml::EndTag("glossentry")
> +                << xml::CR();

chains

> -int InsetRef::docbook(odocstream & os, OutputParams const & runparams) const
> +void InsetRef::docbook(XMLStream & xs, OutputParams const &) const
..
> +     if (!name.empty()) {
> +             docstring attr = from_utf8("linkend=\"") + linkend + 
> from_utf8("\"");
> +
> +             xs << xml::StartTag("link", to_utf8(attr))
> +                << name
> +                << xml::EndTag("link");
> +             return;

chains

> --- a/src/output_docbook.h
> +++ b/src/output_docbook.h
> +#include "support/docstream.h"
...
> +#include <deque>
> +#include <memory>

Are these includes necessary in the header?

> --- a/src/xml.cpp
> +++ b/src/xml.cpp
> @@ -32,6 +32,7 @@
> +#include <iostream>

used?

Thanks,
Pavel
-- 
lyx-devel mailing list
lyx-devel@lists.lyx.org
http://lists.lyx.org/mailman/listinfo/lyx-devel
