Dear all,

I am working with several people on software for doing matroid theory. Our goal 
is to produce a Sage package. In our implementation, it makes sense (both 
mathematically and computationally) to have a set object that is aware of the 
universe it lives in (which we call the "ground set"). Hence I set out to 
create a pair of classes: a Powerset_object which serves as parent, and a 
Set_object_in_powerset. The latter would be a subclass of 
Set_object_enumerated, which means that the end user need not even notice it's 
there.

The idea is that, for small sets, we can use bit-packing: the bits of an 
integer correspond to the elements. This gives blazing-fast intersection, 
union, etc. For end users, the Set_object_in_powerset would behave just like a 
Set_object_enumerated, since it is a subclass.

Here is an example of the speed-up that bit-packing can provide:

%time
S = set(range(9))
t = 0
for S1 in subsets(S):
    for S2 in subsets(S):
        if len(set(S1).intersection(S2)) > 0:
            t += 1
t

gives CPU time: 0.85 s,  Wall time: 0.85 s

%time
t = 0
for s1 in range(0,2**9):
    for s2 in range(0,2**9):
        if s1 & s2:
            t += 1
t

gives CPU time: 0.08 s,  Wall time: 0.08 s

Now what happens if we use Sage's Set_objects instead of Python's?

%time
S = Set(range(9))
t = 0
for S1 in S.subsets():
    for S2 in S.subsets():
        if len(S1.intersection(S2)) > 0:
            t += 1
t

gives CPU time: 44.94 s,  Wall time: 45.02 s

Oops! If you profile this code, it turns out that those 44 extra seconds are 
all spent in Set_object.__init__(). I did not yet experiment with Cython, but I 
fear that calling the __init__ method of Set_object is not going to be sped up 
by this.

Now how do I solve this problem? There seem to be several options:
1) Sacrifice compatibility with Sage's Set category, meaning less 
user-friendliness
2) Write fast code, and create a wrapper in Sage. The extra layer will add to 
the complexity of the code, and questions by the end user along the lines of 
"list all bases of this matroid" would still be slow, even if the internal 
implementation is fast, due to conversions.
3) ???

I hope there is a third option. Would it be possible to subclass 
Set_object_enumerated, but NOT call the parent's initialization function?

Kind regards,

Stefan van Zwam.

-- 
To post to this group, send an email to sage-devel@googlegroups.com
To unsubscribe from this group, send an email to 
sage-devel+unsubscr...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/sage-devel
URL: http://www.sagemath.org

Reply via email to