Re: [sage-devel] Re: BipartiteGraph status

Robert Miller Mon, 01 Mar 2010 12:18:43 -0800

A.  Change BipartiteGraph so that it doesn't inherit from Graph.

B.  Make BipartiteGraph handling an integral part of the Graph class...

C. ... BipartiteGraph is a subclass of Graph and overrides methods
when necessary.

On Mon, Mar 1, 2010 at 9:55 AM, Ryan Hinton <iob...@email.com> wrote:
> It looks like you're also supporting option (C) for solving problem
> (2): very interesting. Since you wrote most of this code, your opinion
> is the most important to me.  But I personally prefer both other
> options over (C).  Option (A) puts the maintenance responsibility
> completely on BipartiteGraph developers (a.k.a. users).  Option (B)
> puts the maintenance responsibility on Graph/GenericGraph developers.
> Option (C) just assumes everything works.

Option C does not just assume everything works. For example, if you
make "side" a required option to add_vertex, and you call a function
which uses add_vertex without that argument, it will blow up, loudly
and honestly. If it blows up, that means that nobody has ever tried it
before and fixed it. It is true (maybe) that some things will appear
to work when in fact they don't make sense or give the wrong answer.
However it should be relatively easy to peruse the code now, to see
which functions are like this, and fix them, so the real issue at hand
is when someone writes a new function for Graph without being aware of
BipartiteGraph's.

> A Graph method that do something bad for a BG will *only* complain
> when someone writes a doctest *in graph.py* for this method *using a
> BipartiteGraph*.  Or when someone writes a doctest in
> bipartite_graph.py for a method that only appears in graph.py.  Or,
> most likely, when someone starts using a BG and finds that some
> methods just blow up.

I don't think that this is necessarily the picture. For one, your
recent work has made many more people who would be reviewing new graph
functions aware that BG's are around, and should be checked. Also,
what is the real harm of having a function on Graph's that doesn't
work for BG's but nobody ever tries to use for BG's?

> I like option (A) because it acknowledges that BipartiteGraph and
> Graph have important differences.  The functionality of BG will lag
> that of Graph, but explicitly so -- if a method exists for a BG, it
> should work with high confidence.  I like option (B) because it is the
> similar to the solution for DiGraph: it allows for the important
> differences to be handled where necessary, and anyone working in
> graph.py will quickly become aware of the cases that need to be
> handled.  (Notice the surprise of several on this list that
> BipartiteGraph even existed!)

Let's clarify a bit: DiGraph and Graph both inherit from GenericGraph.
This is already a mixture of what you are calling option A and B, for
DiGraphs. Perhaps this approach makes sense for BipartiteGraphs too,
but the fact is that a BipartiteGraph *is* a Graph, and that
relationship should be preserved by class hierarchy. Also, the
problems won't be solved by making BipartiteGraph a GenericGraph,
since that will just push some of the functions out of the way, but
not others.

> It seems like we should either separate the classes as in (A) so that
> someone working in graph.py doesn't have to know or worry about
> bipartite_graph.py, or integrate them as in (B) so Graph and
> GenericGraph developers are expected to account for (and more likely
> be aware of) the BipartiteGraph case.  Now that I've presented it as a
> viable option, the wishful thinking of (C) appears unsustainable.

I don't understand what is wishful about option C. If a BG has a
function inherited from Graph which is buggy on BG input, then that
function has a bug and needs to be fixed. Maybe it is wishful to
believe that people will fix bugs as they find them, but I prefer this
to major class obfuscation and additional complications in the code
which will most likely cause more bugs.

> I think option (A) gives the fewest bugs, and option (C) the most.  I
> think option (A) implies slightly more work now simply because a large
> number of one-line methods (with documentation and doctests copied
> from graph.py) would need to be written.  I would put options (B) and
> (C) as roughly equal work now.

Here's my breakdown of your proposed options:

Work now:
Option A & B: Both of these are a lot of work now. And there is nobody
who is obviously ready to do the work, all of which should be done at
around the same time, with deprecations, etc.
Option C: This just requires finishing #1941 and checking that the
rest of the functions make sense. No major overhauling, just fixing
some bugs.

Bugs:
Option A: This will introduce more bugs, guaranteed. Any time you
rearrange class structure, in my experience, it takes a long time to
track down all the bugs, some of which may be quite subtle. However,
adding new functions to Graph will not introduce bugs to
BipartiteGraph (although it may to Graph).
Option B: Basically the same a A. New bugs guaranteed, subtle and a
long time to fix them all. I think this option is even more prone to
subtle bugs and endless work than A. Adding new functions to Graph or
BipartiteGraph will likely introduce bugs, since author and reviewer
are likely to forget checks all over the place.
Option C: This approach only fixes bugs, and will not introduce new
ones. At worst, some functions on BG's which used to give wrong
answers will complain loudly instead. Adding new functions to Graph
may introduce bugs in BG's (or in Graphs).

> Work later comes from changes or additions to graph.py.
>
> i.  A new method is added.
>  A.  A BipartiteGraph user who discovers the new method and wants to
> use it needs to copy the documentation, make a few small changes to
> the doctest, and handle the call appropriately (likely 1-line
> delegation).

This means massive amounts of code duplication. Any functions on BG's
(nearly all the graph functions, potentially) will be duplicated.

>  B.  The graph.py developer is expected to handle the bipartite case.

This may introduce a lot of resistance to people who want to develop
for graphs in Sage. "What is all this complicated infrastructure? I
don't care about BipartiteGraphs, and I don't have time, so I'll just
use my patch myself and not submit it."

>  C.  A BipartiteGraph user discovers the new method and tries to use
> it.  It might work, it might raise an exception, or it might give a
> wrong answer.

I believe that option C can be implemented so that new methods rarely
give wrong answers. By making all the basic functions (i.e.
add_vertex) fail loudly if they are not being used "Bipartite"-ly, we
can make it very likely that such a new function will also blow up,
which is the optimal solution for me.

> ii.  The algorithm for an existing method is changed.
>  A.  There will be a BipartiteGraph doctest for this same method, so
> an exception or wrong result will usually be discovered before
> release.
>  B.  Again, the graph.py developer should have included a
> BipartiteGraph doctest and handled any necessary special cases.
>  C.  Existing, correct code using BipartiteGraphs may begin to raise
> exceptions or silently give wrong results at the next release.

This disadvantage to option C can be overcome by including doctests
for the functions which work for BG's which actually test the function
for BG's, in the docstring in the graph code itself. Any function
which is *expected* to work for BG's should be tested for BG's, and
there are plenty of places in the documentation these tests could be
put.

>> Back when #1941 was created, this thorough review was done. It may
>> need to be updated.
>
> I think it does.  There are lots of methods in graph.py; at least a
> few need updating that are absent from the current list.

> I'm not sure what class-handling capabilities you are referring to,
> but I'm not a Python expert.  I don't see this as a language issue.
> Python doesn't get in the way, but it certainly doesn't solve the
> problem.  Well, perhaps we could use some of the introspective
> features of Python in option (A) to indicate when Graph and
> BipartiteGraph don't have the same set of methods.

To me, having {{{isinstance(BipartiteGraph(), Graph) is True}}} is the
only right way to go. All over Sage, the "is a" relationship is very
carefully preserved, and makes sense mathematically and in terms of
code. I think the main thing you are worried about in option C is
people writing new code for Graphs and inadvertently introducing bugs
in BipartiteGraphs. My main answer is documentation: if it is expected
to work, it is tested, and those tests protect its proper functioning.
If there is no test, then there is no reason to expect it to work,
before or after the patch. And also, new code always introduces bugs,
you just have to pick your poison and decide how much work you want to
do.

-- 
Robert L. Miller
http://www.rlmiller.org/

-- 
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to 
sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org

Re: [sage-devel] Re: BipartiteGraph status

Reply via email to