Re: writer

Allan Rae Sat, 6 Feb 1999 01:15:21 -0500
WARNING: this is even longer than Lars' original post!
With all the traffic on lyx-devel lately I hope this doesn't
slip through the cracks :-)
 
Lars Gullik Bjønnes wrote:
> 
> Currently only caring about insets, and writing to file. (Could also
> be used to write to the painter)
> 
> By doing it like this we can hide all writing and formatting detail
> inside the classes derived from writer (we can use whatever method we
> like; format classes from Asger would possibly be usable)
> 
> Gains:
>         - simplified code in insets
>         - we ensure that all insets can be output with all different
>           writers
>         - when adding new insets, the compiler will barf unless you
>           implement the new writer method in all derived classes.
> 
> Ok, this is likely not the perfect way to do this, but IMO at least
> one thing should be kept from this: the absence of a "code"
> enum/field.
> 
> There should be no code anywhere that cares what kind of writer it got
> passed.

This is definitely something we need to care about.
The difficulty we have is that writers have to know about each and
every inset type and have a specific method to handle each type.
I'm inclined to agree with JMarc (?? I deleted that email so I'll just 
blame JMarc for now ;) that this scheme may as well have methods like:
        writeInsetUrl(InsetUrl const &);
and access the components of InsetUrl using "standard" inset operators
or methods like getContents or such.  This admittedly ties them even
closer to the Insets though.  But using such a scheme we could have 
overloaded write() methods or better yet overloaded operator<< methods.
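
A minimal sketch of the overloaded write() idea, with operator<< layered on top. The inset classes here are hypothetical stand-ins (only getContents is named above), std::string replaces LString, and the output goes to a string buffer instead of a file:

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-ins for the real insets.
class InsetUrl {
public:
    explicit InsetUrl(std::string const & url) : url_(url) {}
    std::string const & getContents() const { return url_; }
private:
    std::string url_;
};

class InsetLabel {
public:
    explicit InsetLabel(std::string const & name) : name_(name) {}
    std::string const & getContents() const { return name_; }
private:
    std::string name_;
};

// One write() overload per inset type replaces writeInsetUrl,
// writeInsetLabel and so on.
class Writer {
public:
    virtual ~Writer() {}
    virtual void write(InsetUrl const &) = 0;
    virtual void write(InsetLabel const &) = 0;
};

class LaTeXWriter : public Writer {
public:
    void write(InsetUrl const & url) override {
        out_ << "\\url{" << url.getContents() << "}\n";
    }
    void write(InsetLabel const & label) override {
        out_ << "\\label{" << label.getContents() << "}\n";
    }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;  // stands in for the output file
};

// The operator<< flavour just forwards to the matching overload.
template <typename InsetT>
Writer & operator<<(Writer & wr, InsetT const & inset)
{
    wr.write(inset);
    return wr;
}
```

With this, `lw << InsetUrl("http://www.lyx.org") << InsetLabel("sec:intro");` picks the overload at compile time, and a writer that forgets an inset type still fails to compile.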

Lars' sample code, for easier comparison:
> // Pure Virtual class.
> class Writer {
>         writeInsetUrl(<data args in inseturl>) = 0;
> };
> 
> class LaTeXWriter : Writer {
>         writeInsetUrl(<data args in inseturl>) {
>                 filestr << "\\url{" << args << "}" << endl;
>         }
> };
>
[...other examples...]
>
> class PainterWriter : Writer {
>         writeInsetUrl(<data args in inseturl>) {
>                 painter.drawButton(text text);
>         }
> };
> 
> Think of a verbatim inset:
> 
> InsetVerbatim::write(Writer wri) {
>         wri.writeInsetVerbatim(vector<verbatimvalidobjects> vec);
> }
> 
> LaTeXWriter::writeInsetVerbatim(vector<verbatimvalidobjects> vec) {
>         filestr << "\\begin{verbatim}" << endl;
>         for (a = vec.begin(); a != vec.end(); ++a) {
>                 (*a).write(this);

What exactly does (*a) hold?
I'd have thought it would be an LString, but it looks like you
have other insets in there -- hmmm... multiple paragraphs in a
verbatim inset, one insetparagraph each perhaps?

>         }
>         filestr.endineol();
>         filestr << "\\end{verbatim}" << endl;
> }
> 
> Well I don't know anymore, but it seemed like nice idea when I was
> taking a shower. Especially since it hid all the hairy stuff.

Trying to come up with an alternate scheme is difficult.  Below
is a scheme that tries to push the writers and insets to be more 
independent.  It does however introduce some complications for the 
writers (and maybe maintenance difficulties later -- unless we are 
careful in implementing the state machines). It all involves making
writers look like ostreams.  I'll show you some code (no warranty 
implied ;) :

class Writer {
public:
        enum WriterStyles {command_start, command_end, command_option_start,
                command_option_end, verbatim_start, verbatim_end...};
        Writer(LString const &);  //filename
        virtual Writer & operator<< (LString const &) = 0;
        virtual Writer & operator<< (int const &) = 0;
        virtual Writer & operator<< (WriterStyles const &) = 0;
        ...
};

class LaTeXWriter : public Writer {
public:
        LaTeXWriter(LString const &);  // filename
        virtual Writer & operator<< (LString const &);
        virtual Writer & operator<< (int const &);
        virtual Writer & operator<< (WriterStyles const &);
        ...
private:
        enum LWriterState {default_state, in_environment, ...};
        auto_ptr<ostream> our_output_file; 
};

Writer & LaTeXWriter::operator<< (WriterStyles const & ws)
{
        static LWriterState state = default_state;
        switch (state) {
        case default_state:
                switch (ws) {
                case verbatim_start:
                        state = in_environment;
                        *our_output_file << "\\begin{verbatim}";
                // we could get fancy and use a number of different
                // optimizations such as:
                //      enviro_name = "verbatim";
                // and then catching all environment ends and
                // only having one output statement.
                        break; 
                case *_end:
                        // error no _end's allowed in default_state
                        ...
                        break;
                ...
                }
                break;  // don't fall through into in_environment
        case in_environment:
                switch (ws) {
                case verbatim_end:
                        state = default_state;
                        *our_output_file << "\\end{verbatim}"
                                         << endl; 
                // another thing we'd probably do is to do our
                // initial output to an LString buffer like we do
                // now so we can look backwards in the output
                // and break long lines or add/remove '\n'.
                        break;
                ...
                }
                break;
        ...
        }
        return *this;
}

class InsetVerbatim : public Inset {
public:
        virtual Writer & write(Writer &) const;
        ...
};

Writer & InsetVerbatim::write(Writer & wr) const
{
        return wr << Writer::verbatim_start 
                  << contents 
                  << Writer::verbatim_end;
}

Of course it'd be nice to be able to write:
        LaTeXWriter lwr("somefile.tex");
        lwr << preamble_and_stuff();
        // preamble_and_stuff() could actually be in an 
        // InsetPreamble or something similar.
        for (document_structure::const_iterator iter = buffer.begin();
             iter != buffer.end();
             ++iter) {
                lwr << (*iter);
        }
        lwr << closing_of_document();
        
where (*iter) is effectively any inset.  So we therefore need to add:

Writer & operator<< (Writer & wr, InsetVerbatim const & iv)
{
        return iv.write(wr);
}

and likewise for all the other different insets *or* I think the
following will work:

Writer & operator<< (Writer & wr, Inset const & inset)
{
        return inset.write(wr);
}

Thus giving us a double-dispatch but if we inline this one it
shouldn't cost much in time or space.
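
A compilable sketch of the scheme so far, cut down to the verbatim case. The assumptions are: std::string replaces LString, a string buffer replaces the output file, the enum is passed by value, and the writer state is a data member rather than a function-local static (a local static would be shared by every LaTeXWriter instance). The error handling via exceptions is illustrative only:

```cpp
#include <sstream>
#include <stdexcept>
#include <string>

class Writer {
public:
    enum WriterStyles { verbatim_start, verbatim_end };
    virtual ~Writer() {}
    virtual Writer & operator<<(std::string const &) = 0;
    virtual Writer & operator<<(WriterStyles) = 0;
};

class LaTeXWriter : public Writer {
public:
    LaTeXWriter() : state_(default_state) {}

    Writer & operator<<(std::string const & s) override {
        out_ << s;
        return *this;
    }

    Writer & operator<<(WriterStyles ws) override {
        switch (state_) {
        case default_state:
            if (ws != verbatim_start)
                throw std::logic_error("*_end seen in default_state");
            state_ = in_environment;
            out_ << "\\begin{verbatim}\n";
            break;
        case in_environment:
            if (ws != verbatim_end)
                throw std::logic_error("nested environment not handled");
            state_ = default_state;
            out_ << "\n\\end{verbatim}\n";
            break;
        }
        return *this;
    }

    std::string str() const { return out_.str(); }

private:
    enum State { default_state, in_environment };
    State state_;             // per-writer state, not a local static
    std::ostringstream out_;  // stands in for the output file
};

class Inset {
public:
    virtual ~Inset() {}
    virtual Writer & write(Writer &) const = 0;
};

class InsetVerbatim : public Inset {
public:
    explicit InsetVerbatim(std::string const & c) : contents_(c) {}
    Writer & write(Writer & wr) const override {
        return wr << Writer::verbatim_start << contents_
                  << Writer::verbatim_end;
    }
private:
    std::string contents_;
};

// First dispatch on the inset type; the writer's virtual operator<<
// calls inside write() provide the second.
inline Writer & operator<<(Writer & wr, Inset const & inset)
{
    return inset.write(wr);
}
```

Here `lw << iv;` on a LaTeXWriter and an InsetVerbatim produces the begin/end pair around the contents, and the inset never learns which writer it was given.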

The scheme above makes the writers look somewhat uglier than Lars'
scheme does but it should also be a bit more independent of the 
insets.  The hardest part is going to be the state machines in each
of the writers' WriterStyles handlers.  For example:

class PainterWriter : public Writer {
public:
        PainterWriter(Painter &);
        virtual Writer & operator<< (LString const &);
        virtual Writer & operator<< (int const &);
        virtual Writer & operator<< (WriterStyles const &);
        ...
private:
        enum PWriterState {default_state, in_button, ...};
};
        
Writer & PainterWriter::operator<< (WriterStyles const & ws)
{
        static PWriterState state = default_state;
        // we shouldn't need to know the current state in any of
        // the other overloaded operator<< but if we do we'll
        // need to make state into a member function with a local
        // static PWriterState.  Note that it *wouldn't* be a
        // static member function.
        switch (state) {
        case in_button:
                // everything until the next WriterStyles::*_end
                // will be in a button -- of course there is an
                // implicit assumption here that we can't have
                // buttons within buttons.
                switch (ws) {
                case ref_end:
                case url_end:
                case label_end:
                        painter.drawButton(data);
                        // where data is a variable that we filled
                        // with incoming data while waiting for 
                        // an *_end.
                        state = default_state;
                }
                break;
        ...
        }
        return *this;
}

Unfortunately whatever we come up with is going to have some ugly
bits or hairy stuff.  The best thing about this scheme is that state
machines are fairly easy to develop and are self documenting.
In fact we could use an external tool to maintain the state machine
design and generate skeleton code.
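
One way to keep such a state machine self-documenting is to write it as a transition table, which is also the form a generator tool could emit. This is only a sketch: the states, styles and emitted strings below are illustrative placeholders, not a proposed set.

```cpp
#include <cstddef>
#include <string>

enum Style { url_start, url_end, verbatim_start, verbatim_end };
enum State { default_state, in_button, in_environment, error_state };

// One row per legal transition: in state `from`, on style `on`,
// append `emit` to the output and move to state `to`.
struct Transition {
    State from;
    Style on;
    State to;
    char const * emit;
};

Transition const transitions[] = {
    { default_state,  url_start,      in_button,      "[button: " },
    { in_button,      url_end,        default_state,  "]" },
    { default_state,  verbatim_start, in_environment, "<verbatim>" },
    { in_environment, verbatim_end,   default_state,  "</verbatim>" },
};

// Look up the transition; anything not in the table is an error,
// e.g. an *_end arriving in default_state.
State step(State current, Style style, std::string & out)
{
    std::size_t const n = sizeof(transitions) / sizeof(transitions[0]);
    for (std::size_t i = 0; i != n; ++i) {
        if (transitions[i].from == current && transitions[i].on == style) {
            out += transitions[i].emit;
            return transitions[i].to;
        }
    }
    return error_state;
}
```

Adding an inset that needs a new state then means adding rows to the table rather than another nested switch.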

The third option which I'm still inclined to prefer (for its 
simplicity) is to have each Inset provide writer specific methods.
Such a scheme wouldn't make it any harder for the writer to keep a
check of the output (to break lines at appropriate points or insert 
extra spaces etc.).  It would also be possible to use an ostream
syntax for the writers.  Let's call this Option3.
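
A rough sketch of what Option3 could look like. The method names latex() and ascii() and the two thin writer classes are assumptions made for illustration, and std::string again replaces LString:

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Each inset knows how to render itself for every output format.
class Inset {
public:
    virtual ~Inset() {}
    virtual void latex(std::ostream &) const = 0;
    virtual void ascii(std::ostream &) const = 0;
};

class InsetUrl : public Inset {
public:
    explicit InsetUrl(std::string const & url) : url_(url) {}
    void latex(std::ostream & os) const override {
        os << "\\url{" << url_ << "}";
    }
    void ascii(std::ostream & os) const override {
        os << "<" << url_ << ">";
    }
private:
    std::string url_;
};

// The writers stay thin: they own the output and pick the method,
// which still leaves them a place to post-process the text.
class LaTeXWriter {
public:
    LaTeXWriter & operator<<(Inset const & inset) {
        inset.latex(out_);
        return *this;
    }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;
};

class AsciiWriter {
public:
    AsciiWriter & operator<<(Inset const & inset) {
        inset.ascii(out_);
        return *this;
    }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;
};
```

The cost shows up when a new writer is added: every inset needs one more method.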

Now to analyse some other considerations:
Adding a new inset:
        Lars' scheme:
                each writer needs a new writeInsetXXX
                hence a change to writer.h
                and a recompile of everything
                Difficulties:  how to reuse common areas of code?
        Allan's scheme:
                Probably add new WriterStyles (and support for them)
                Probably recompile everything.
                Difficulties:  maybe an extra state in state machines
        Option3:
                implement new inset
                compile only new inset
                (assumes we don't need to implement any extra
                 support in writers -- if we do we'll end up
                 recompiling everything.)

Adding a new writer:
        Allan's scheme and Lars' scheme:
                implement new writer class
                compile only new writer class
                Difficulties:
                  Lars' scheme - each writeInset...
                  Allan's scheme - each <<(WriterStyles), since
                  the other << methods are much simpler.
        Option3:
                as above and add new methods for each and
                every inset.  Then recompile everything.
                Difficulties: each new writer specific method.


How often do we add new writers?
        (If we start with ASCII, LaTeX, SGMLDocBook and HTML4_0,
         would we need any others?  XML?  Man pages and a number of
         other things can be generated by SGMLTools, either now or
         in the future.)
How often do we add new insets? 
        (I'd say more often than new writers)
Can any of these schemes support generic writers or insets?
        (eg. embedding other applications in an InsetExternApp)


Other comments:
I like the iostream appearance of my scheme with the overloaded
operator<<.  Unfortunately, I doubt we could modify Lars' scheme
to use overloaded operator<< unless we changed it to be
overloaded on inset types.  I think my scheme needs something better
than the WriterStyles enum to configure the Writer stream.  Maybe
something similar to the ostream manipulators (setw() and the like).
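
A sketch of the manipulator idea, using the same function-pointer trick std::ostream uses for argument-less manipulators such as std::endl (setw() takes an argument and works by returning a helper object instead). The beginVerbatim/endVerbatim/text hooks are hypothetical names, and std::string replaces LString:

```cpp
#include <sstream>
#include <string>

class Writer {
public:
    virtual ~Writer() {}
    // Hypothetical per-writer hooks that replace the enum switch.
    virtual void beginVerbatim() = 0;
    virtual void endVerbatim() = 0;
    virtual void text(std::string const &) = 0;

    // A manipulator is just a function taking and returning a Writer.
    Writer & operator<<(Writer & (*manip)(Writer &)) {
        return manip(*this);
    }
    Writer & operator<<(std::string const & s) {
        text(s);
        return *this;
    }
};

inline Writer & verbatim_start(Writer & wr)
{
    wr.beginVerbatim();
    return wr;
}

inline Writer & verbatim_end(Writer & wr)
{
    wr.endVerbatim();
    return wr;
}

class LaTeXWriter : public Writer {
public:
    void beginVerbatim() override { out_ << "\\begin{verbatim}\n"; }
    void endVerbatim() override { out_ << "\n\\end{verbatim}\n"; }
    void text(std::string const & s) override { out_ << s; }
    std::string str() const { return out_.str(); }
private:
    std::ostringstream out_;  // stands in for the output file
};
```

Usage stays stream-like, `lw << verbatim_start << "x = 1;" << verbatim_end;`, and a new style becomes a new free function plus a hook, not a new enum value in a shared header.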


-- 
Allan. (ARRae)