Let me share some more thoughts on the interplay between modularization and the review process. I'm afraid this is going to be long... TL;DR: modularization is good for keeping Sage moving forward.
Our review process discourages patchbombs, and for good reason. For one, no one wants to review patchbombs. But, most importantly, it is too risky to drop a patchbomb into the next release with the doctests and a couple of human eyes as the only safety guarantee.

The consequence is that we modify old code by carefully *sprinkling* diffs here and there, breaking up major changes into minor, innocent-looking patches. I know something about it: I'm doing it right now for the Pari interface. There is nothing fundamentally bad about this and, eventually, old code gets fully rewritten and the codebase improves. Only, this is time-consuming and, by definition, it leads to *patchy* code.

What I want to highlight is that the combination of non-modularization and the review process discourages major rewrites too much. Some examples:

- How many modules in Sage haven't moved to the category framework yet (assuming that the move would make sense for the module)?
- With all the respect I owe to the huge work done by William, John, and the other contributors to elliptic curves in Sage, I've found myself many times thinking "This design is so bad! I want to rewrite it from scratch".

On the opposite side, take the sage-coding project: it is making huge leaps forward, designing non-trivial mathematical interfaces at an impressive pace. How do they manage it? By writing from scratch, at times even by writing outside of Sage. They have no (or, rather, few) pre-existing interfaces to cater for, and they can simply drop small tickets with half-working code because no other code depends on them.

Most of Sage's code was developed in bursts like this one. Only, after some time, you may realize that there is a better design for what you wrote some years before, and that's when you're stuck with the problem of redesigning while maintaining old code at the same time.

Constantly rewriting from scratch is silly, and that's why the review process is there.
But never rewriting parts of a codebase that is 10 years old is also silly. By making it impossible to drop patchbombs, we're slowing down progress.

If we made Sage more modular, if it were easy to build code *outside* of Sage and then progressively bring it inside Sage once it is mature, then rewriting whole modules would become much easier.

Let's say I'm set on rewriting the elliptic curve modules, and that I've gathered a small team of enthusiasts to help me. Here are my choices right now:

- Announce my plan on sage-devel, seek consensus, maybe write a white paper, modify sage.schemes.elliptic_curves little by little with small patches. This would likely keep breaking the public API at each minor release until the rewrite is done. If we keep the faith and get to the end, we would end up with *patchy* code.
- Announce my plan on sage-devel, seek consensus, start development in a sage.schemes.elliptic_curves.new package, review the small tickets inside Sage. When the package is ready, go back to sage-devel, propose to replace the old package with the new one, wait a couple of years. Eventually the rewrite will be accepted, old code will break, and the only way for users to keep the old interface will be to stop upgrading Sage. Or, most likely, the rewrite will stay in .new, be forgotten, and rot there for the rest of its life.

Now, suppose that elliptic curves were a *core* Sage module, shipped with Sage, but one you could nevertheless uninstall. Here's how I would go about it:

Start the new elliptic module outside of Sage, play with it, redesign it a few times. When it starts getting serious, write about it on sage-devel, seek consensus, document the public API. When the module gets stable enough, start advertising it, tell people they can try it out with `sage -i defeo/elliptic_curves`. When the module has gained some traction, go back to sage-devel and propose to replace the core package with it.
Have more people look at it, reach feature parity with the core package, wait some minor releases. Eventually the replacement gets accepted, a lot of warnings are issued to users, and the new code goes in at the next *major* release. Old code can still run (at least for some time) by doing `sage -u elliptic_curves` and `sage -i elliptic_curves_old`. I become the main maintainer of the elliptic curves core package, and John is happy because he does not have to do maintenance anymore.

Or maybe the replacement is not accepted. Then I write an incendiary email on sage-devel saying all Sage devs are fascists and that I'm quitting the community, and I tell all my colleagues to `sage -u elliptic_curves` and `sage -i defeo/elliptic_curves`. The community has lost a (bad) developer, but no code was lost in the process :)

I understand that some people here are scared of dispersion: what if we end up with a dozen half-baked, badly maintained, competing elliptic curves modules? Wouldn't it be better to make all these developers contribute to Sage directly? This is a legitimate fear, and a realistic scenario if the community is not properly organized. Better tools can only lead to better code if they are properly used.

I hope that my exposé has given you an impression of how we could use more modularization for good, and that it will make you embrace that vision. Technical details are coming, as soon as I've finished writing the report on Sage Days 77.

Luca