[sage-devel] Re: Constructive discussion on the sage development model

Paul-Olivier Dehaye Mon, 26 Nov 2012 03:13:02 -0800

My understanding of aspect-oriented programming in general: it encourages 
the developer to realize that there are several concerns that will be 
present in a large project, and to organize the project so as to minimize, 
at any location in the code, the number of concerns whose concrete 
implementation is relevant. It is particularly important in large projects 
where participants have very diverse knowledge, backgrounds and skills. I 
would venture it could serve also as a defensive strategy against what I 
see as the major drawback of the open-source development model: openness of 
the codebase means that everyone is equally (not-)responsible for 
explaining, documenting, fixing bugs, etc. Identifying these concerns 
(which are just fuzzy words and not programming constructs) and structuring 
the code (/concern1, /concern2, ...) around them helps guide people to the 
relevant places in the code (the directory structure explains what it does 
to humans), talk constructively about this concern on a mailing list, reuse 
code, and I think fix bugs as well (no patch spanning dozens of files).


How this would apply to sage: 

The first step is to identify the concerns as I started doing in my 
previous email, so we can more easily talk about them, see how they 
interact and explain them. I advocate doing that for a bit before any code 
is written (I know full well many others would object to this and say "let 
the code do the talking"). Doing it this way helps plan strategically, 
avoids making "mistakes", and clarifies when we have made one. Having a 
plan also helps explain it. 
Don't confuse a plan with more docs buried in the code. There is 
documentation everywhere that explains what that particular class or method 
is doing, but nothing at a higher level: where is sage's outline as it 
stands? (the first part of a "State of Sage" speech?). For instance, where is 
the answer to a user's question: "Right now, how do I add information about 
my particular mathematical interest?" Is it spread over the docs of 5 or so 
base classes that s/he would have to find, and absent from the docs of a few 
others that s/he would have to know to ignore because they are obsolete?

Even more importantly, 
Where's sage's long term plan? 
(the second part of "State of Sage" speech, the response to the first part?)

For instance, where is the answer to a CAS-developer's question: "How do we 
want in the long run to assist people in adding information about their 
particular mathematical interest?"
Let me take a stab at an answer for my ideal system (not unrealistic but 
would require tweaks to several key components, among others to the 
coercion model).

- Separate the Categories framework as it currently exists into two 
parts. The first would be a "Mathematical abstraction" layer, where very 
abstract mathematical information is implemented.
- Force the mathematical developer to think in terms of mathematics first, 
i.e. require a category to be specified in order to be able to construct a 
Parent. 
- Doing it this way helps us automatically create a lot of tests first, 
before anything concrete has been written (the developer need not be 
concerned with writing the tests, they are magically created). Obviously the 
new Parent would fail all of them. That's ok though. Developing that Parent 
would be easier, precisely because the failed tests would provide 
specifications and lead development. This testing layer is currently the 
second part of the Categories framework. We would have gained here because 
we would have successfully separated part of the testing from the 
representation and automated it. Part of the testing should be driven 
exclusively by the mathematical abstraction (in a commutative group, is a*b 
== b*a, for any pair a and b?) 
- Specify the coercions that should exist at the "Mathematical abstraction" 
level. 
- Specify for each new Parent how to coerce from the Category down to the 
Parent. 
- All the coercions and tests are automatically generated from this 
information. This breaks down coercions between representations as 
compositions of two separate kinds of coercions, each simpler to implement. 
It scatters the coercion code maybe even more, but it minimizes the number 
of concerns at any one location in the code. The developer need not be 
concerned with all the various representations of the abstract mathematical 
category s/he is representing, which might require significant investment 
into someone else's technical code. S/he deals only with "their own" 
representation and the mathematical abstraction. 
(PS: it might already be possible to use the coercion model in this way. If 
so, please please please point me to where it is explained. In any case, my 
point stands as I was not able to find this despite searching.)
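
To make the bullets above a bit more concrete, here is a minimal sketch in 
plain Python. It is not Sage's category framework or coercion model: every 
name in it (CyclicGroupZ12, Parent, generated_tests, coerce, to_abstract, 
from_abstract) is hypothetical, and integers in range(12) stand in for the 
"abstract" elements. The tests come only from the axioms, and a coercion 
between two representations is the composition of two simpler maps.

```python
import itertools


class CyclicGroupZ12:
    """Hypothetical abstraction layer: axioms of Z/12Z, no representation."""
    axioms = {
        "commutativity": lambda P, a, b: P.add(a, b) == P.add(b, a),
        "neutral element": lambda P, a, b: P.add(a, P.zero()) == a,
    }


class Parent:
    """Hypothetical base class: a category must be given before anything else."""
    category = None

    def __init__(self):
        if self.category is None:
            raise TypeError("a Parent must specify its category")

    def _todo(self, *args):
        raise NotImplementedError

    # A freshly declared Parent implements nothing yet.
    samples = zero = add = to_abstract = from_abstract = _todo


def generated_tests(parent):
    """Check every axiom of the category on all pairs of sample elements."""
    report = {}
    for name, law in parent.category.axioms.items():
        try:
            elements = parent.samples()
            pairs = itertools.product(elements, repeat=2)
            report[name] = all(law(parent, a, b) for a, b in pairs)
        except NotImplementedError:
            report[name] = False  # nothing concrete has been written yet
    return report


def coerce(x, source, target):
    """Representation -> abstraction -> representation."""
    if source.category is not target.category:
        raise TypeError("no common mathematical abstraction")
    return target.from_abstract(source.to_abstract(x))


class LeastResidues(Parent):
    """Elements stored as 0..11."""
    category = CyclicGroupZ12
    def samples(self): return list(range(12))
    def zero(self): return 0
    def add(self, a, b): return (a + b) % 12
    def to_abstract(self, x): return x
    def from_abstract(self, r): return r


class BalancedResidues(Parent):
    """Elements stored as -5..6."""
    category = CyclicGroupZ12
    def samples(self): return list(range(-5, 7))
    def zero(self): return 0
    def to_abstract(self, x): return x % 12
    def from_abstract(self, r): return r if r <= 6 else r - 12
    def add(self, a, b): return self.from_abstract((a + b) % 12)


class Unfinished(Parent):
    """Only the category is known so far; all generated tests fail."""
    category = CyclicGroupZ12
```

With these definitions, generated_tests(Unfinished()) reports every axiom 
as failing, generated_tests(LeastResidues()) reports every axiom as 
passing, and coerce(11, LeastResidues(), BalancedResidues()) gives -1 
without either class knowing about the other.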

Maybe you disagree with my answer, maybe not. Right now there is no way of 
knowing what the official strategic plan for sage is, of flagging concerns 
with it, and thus of making adjustments (for instance by highlighting, as 
above, that more modularity might be desirable for certain parts). 

Once we agree on the modularity we want in the code, we can decide how to 
modify what we have to achieve separation of concerns. In Python, the 
answer would most likely be "with decorators and metaclasses" (indeed, as 
you pointed out, cached_function and so on isolate one aspect of a 
function). Ideally, I think, the decorators would be used to add tons of 
flags (relating to concerns separate from the main one at that part of the 
code), and the metaclass would alter methods according to the flags 
encountered. Swapping the metaclass but using the same class code (with 
tons of decorators) is a conceptually simple method to get a class factory, 
generating classes with the same core functionality but different 
implementations. This makes development more modular and hence easier. 
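
A rough sketch of what I mean, assuming nothing about Sage itself: the 
decorator concern() only attaches a flag, and the metaclass decides what 
the flag means. All names here (concern, PlainMeta, LoggingMeta, 
build_gcd_class) are made up for illustration, and logging is just one 
possible concern.

```python
import functools


def concern(name):
    """Decorator that only records a flag; it does not change behavior."""
    def mark(func):
        func._concerns = getattr(func, "_concerns", frozenset()) | {name}
        return func
    return mark


class PlainMeta(type):
    """Metaclass that ignores all flags: methods run exactly as written."""


class LoggingMeta(type):
    """Metaclass that rewrites every method flagged with 'logging'."""

    def __new__(mcls, name, bases, namespace):
        for attr, value in list(namespace.items()):
            if callable(value) and "logging" in getattr(value, "_concerns", ()):
                namespace[attr] = mcls._logged(value)
        return super().__new__(mcls, name, bases, namespace)

    @staticmethod
    def _logged(func):
        @functools.wraps(func)
        def wrapper(self, *args):
            result = func(self, *args)
            self.log.append((func.__name__, args, result))
            return result
        return wrapper


def build_gcd_class(metaclass):
    """Class factory: the same class body, built with a chosen metaclass."""
    class Gcd(metaclass=metaclass):
        def __init__(self):
            self.log = []

        @concern("logging")
        def gcd(self, a, b):
            # The main concern only: Euclid's algorithm.
            while b:
                a, b = b, a % b
            return a
    return Gcd


Plain = build_gcd_class(PlainMeta)
Logged = build_gcd_class(LoggingMeta)
```

Both Plain().gcd(12, 18) and Logged().gcd(12, 18) return 6, but only the 
Logged instance records the call in its log. The body of Gcd never mentions 
how logging is done.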

Again, I can see opposition to this along a few main lines:
 - "It would be nice, but it is a dream": We can figure this out. 
 - "What's the point": I can explain more of the benefits, for the coercion 
model for instance. 
 - "This will slow down the one class I am using a lot": Not if it's done 
right. In fact, if it's done very right it would speed everything up 
considerably, unless you start doing things that are very weird for a user 
to do (iterating over possible class constructions in sage, for instance, 
which is less weird if you are testing).
 - "This is programming by committee": I am suggesting having _one_ 
high-level discussion archived somewhere, that we can tweak as we go. I 
don't think that is excessive. 
 - "Wait, you were talking about aspects, where are those?": Aspects are 
concerns that have been identified, isolated, and turned into programming 
constructs that hide their implementation details from their target user. 
For instance cached_method. Once we have clearly identified the concerns, 
we can figure out how to turn them into aspects.
 - "This is a lot of talking, no programming": Not yet, and that's fine. I 
am sure part of a large long-term project is communicating to the 
developers what the big picture is. We don't need to have that discussion 
only by inference from written (buggy) code.
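
To illustrate what "turning a concern into an aspect" looks like, here is 
a toy caching decorator. It is a hypothetical stand-in written for this 
email, not Sage's cached_method, and it only handles hashable positional 
arguments. The point is that the method body contains no caching code.

```python
import functools


def cached(method):
    """Toy caching aspect: stores results per instance, keyed by arguments."""
    @functools.wraps(method)
    def wrapper(self, *args):
        store = self.__dict__.setdefault("_cached_values", {})
        key = (method.__name__, args)
        if key not in store:
            store[key] = method(self, *args)
        return store[key]
    return wrapper


class Fibonacci:
    def __init__(self):
        self.evaluations = 0  # counts how often the body really runs

    @cached
    def term(self, n):
        # Only the mathematics lives here; persistence is handled elsewhere.
        self.evaluations += 1
        return n if n < 2 else self.term(n - 1) + self.term(n - 2)
```

Calling Fibonacci().term(30) returns 832040 and runs the body once for 
each n from 0 to 30, instead of over a million times without the 
decorator.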

Paul

On Friday, November 23, 2012 5:02:23 PM UTC+1, Paul-Olivier Dehaye wrote:
>
> Only trying to foster a (constructive) discussion, by suggesting a way to 
> talk about the development...
>
> I think issues in our development model could be resolved if we adhered 
> strictly to principles of aspect oriented software development.
> http://en.wikipedia.org/wiki/Aspect-oriented_software_development
> Our "business" is to "serve the best mathematical information for our 
> users" (what best means depends on the user and his/her goal at that 
> instant: teach/learn/research/...). Here is a list I have been compiling in 
> the past few days of different issues that pop up regularly on math-devel 
> and might be useful to consider as "cross-cutting concerns" in the sense of 
> that wikipedia link:
>
> - mathematical abstraction (parts of the categories framework will help 
> structure all this abstraction in the long run, but we have actually spread 
> our collective mathematical knowledge all over the place)
> - logging
> - documentation
> - testing
> - persistence (pickling, cached_function, cached_method, databases, etc. 
> For info:
> ~> grep -R "loads(dumps"  * | wc -l
> 1368
> )
> - exactness/correctness (for instance the issue Nathann is raising in this 
> thread)
> - speed
> - coercions/types 
> - inputs of the sage system (notebook, preparser, command line interface, 
> ...)
> - outputs of the sage system (notebook, latex, other CAS)
> - deprecation
> - cross-compatibility (for instance with Solaris)
> - parallelism
> - precision issues
> - tutorials
> - conventions (French or English notation for tableaux, arithmetic or 
> analytic normalization for L-function, D4 or D8?)
> - actually organizing and performing computations
> - explaining how results were obtained (whether or not to send warnings 
> for instance, is it conjectural)
> - street credit for algorithm discovery, implementation, data, ...
> - version control (at the moment we do this with outside tools)
> - patch submission (at the moment we do this with outside tools)
> - bug report
> - development model (version control, patch submission, bug reports are 
> parts of this, also interferes with the testing aspect)
> - use case (building a great system in the long run, doing research right 
> now, teaching)
> - ...
>
> The last point is meant to highlight that we write code not only for many 
> different people but also that each person might have more than one goal in 
> mind.
>
> I think organizing our code into aspects would help fix bugs (localizing 
> changes into fewer files) and move faster. It would also help welcome new 
> people to sage (minimizing investment needed). It would clarify things too 
> (for instance, the categories framework, as it currently exists, is an 
> aggregation of several of the aspects listed above: strong flavor of 
> mathematical abstraction, Parent/Element, and like everywhere else, some 
> testing and documenting). 
>
> The development gets tricky whenever code is written and people read it 
> with a different set of concerns in mind. They are forced to: we have no 
> mechanism to define most of these concerns, and thus no way of making code 
> dependent on the state of any of those concerns. 
>
> We should be able to identify atomic aspects by simple discussion, on this 
> list for instance. 
>
> This email is simply meant as a suggestion on how to frame the discussion. 
> Please do not shut down this avenue of thinking purely on the grounds that 
>  - "It would be nice, but it is a dream": I have made no attempt to 
> explain how one might concretely implement this (we are of course  already 
> separating some of the concerns above). You might think it's a dream, but 
> please consider that at least one person is thinking about how to do it. 
>  - "What's the point": Sorry you missed it, re-read the email.
>  - "This will slow down the one class I am using a lot": Not if it's done 
> right.
>
> Paul
>
>
>
