Let me share some more thoughts on the interplay between modularization and 
the review process. I'm afraid this is going to be long... TL;DR: 
modularization would help keep Sage moving forward.

Our review process discourages patchbombs, and for good reason. For one, no 
one wants to review patchbombs. But, most importantly, it is too risky to 
drop a patchbomb into the next release, having the doctests and a couple of 
human eyes as the only safety guarantee.

The consequence of this is that we modify old code by carefully 
*sprinkling* diffs here and there, breaking up major changes into minor, 
innocent-looking patches. I know something about it: I'm doing it right now 
for the PARI interface. There is nothing fundamentally bad about this, and, 
eventually, old code gets fully rewritten and the codebase gets improved. 
Only, this is time-consuming, and, by definition, it leads to *patchy* code.

What I want to highlight is that the lack of modularity, combined with the 
review process, discourages major rewrites too much. Some examples:
- How many modules in Sage haven't moved to the category framework yet 
(assuming that the move would make sense for the module)?
- With all the respect I owe to the huge work done by William, John, and 
the other contributors to elliptic curves in Sage, I've found myself many 
times thinking "This design is so bad! I want to rewrite it from scratch".

On the opposite side, take the sage-coding project: it is making huge leaps 
forward, designing non-trivial mathematical interfaces at an impressive 
pace. How do they manage it? By writing from scratch, at times even by 
writing outside of Sage. They have no (or, rather, few) pre-existing 
interfaces to cater for, and they can simply drop small tickets with 
half-working code because no other code is depending on them. Most of 
Sage's code was developed in bursts like this one. Only, after some time, 
you may realize that there is a better design for what you've written some 
years before, and that's when you're stuck with the problem of redesigning 
while maintaining old code at the same time.

Constantly rewriting from scratch is silly, and that's why the review 
process is there. But never rewriting any part of a 10-year-old codebase 
is also silly. By making it impossible to drop patchbombs, we're slowing 
down progress. If we made Sage more modular, if it were easy to build code 
*outside* of Sage and then progressively bring it inside Sage once it is 
mature, then rewriting whole modules would become much easier.

Let's say I'm set on rewriting the elliptic curve modules, and that I've 
gathered a small team of enthusiasts to help me. Here are my choices right 
now:
- Announce my plan on sage-devel, seek consensus, maybe write a white 
paper, modify sage.schemes.elliptic_curves little by little with small 
patches. This would likely keep breaking the public API at each minor 
release until the rewrite is done. If we keep faith and get to the end, we 
would end up with *patchy* code.
- Announce my plan on sage-devel, seek consensus, start development in a 
sage.schemes.elliptic_curves.new package, review the small tickets within the team. 
When the package is ready, go back to sage-devel, propose to replace the 
old package with the new one, wait a couple of years. Eventually the 
rewrite will be accepted, old code will break, and the only way for users to 
keep the old interface will be to stop upgrading Sage. Or, more likely, the 
rewrite will stay in .new, it will be forgotten, and rot there for the rest 
of its life.

Now, suppose that elliptic curves were a *core* Sage module, shipped with 
Sage, but one which you can nevertheless uninstall. Here's how I would go about it: 
Start the new elliptic module outside of Sage, play with it, redesign it a 
few times. When it starts getting serious, write about it on sage-devel, 
seek consensus, document the public API. When the module gets stable 
enough, start advertising it, tell people they can try it out with `sage -i 
defeo/elliptic_curves`. When the module has gained some traction, go back 
to sage-devel and propose to replace the core package with it. Have more 
people look at it, reach feature parity with the core package, wait some 
minor releases. Eventually the replacement gets accepted, a lot of warnings 
are issued to the users, the new code goes in at the next *major* release, 
old code can still run (at least for some time) by doing `sage -u 
elliptic_curves` and `sage -i elliptic_curves_old`, I become the main 
maintainer of the elliptic curves core package, John is happy because he 
does not have to do maintenance anymore. Or maybe the replacement is not 
accepted, then I write an incendiary email on sage-devel saying all Sage 
devs are fascists, and I'm quitting the community, and tell all my 
colleagues to `sage -u elliptic_curves` and `sage -i 
defeo/elliptic_curves`. The community has lost a (bad) developer, but no 
code was lost in the process :)

I understand that some people here are scared of dispersion: what if we end 
up with a dozen, half-baked, badly maintained, concurrent elliptic curves 
modules? Wouldn't it be better to make all these developers contribute to 
Sage directly? This is a legitimate fear, and a realistic scenario if the 
community is not properly organized. Better tools can only lead to better 
code if they are properly used. I hope that my exposé gave you an 
impression of how we could use more modularization for good, and will make 
you embrace that vision.

Technical details are coming, as soon as I'm finished writing the report on 
Sage Days 77.

Luca
