On Fri, Feb 20, 2004 at 11:02:55AM +0100, Igor Genibel wrote:
> * Jeroen van Wolffelaar <[EMAIL PROTECTED]> [2004-02-18 19:30:05 +0100]:
>
> [I was very busy the last few weeks, so I apologize for my silence]
n/p.

> Hi Jeroen,
>
> thanks a lot for this contribution I cannot entirely integrate in
> developer.php because neither on klecker.d.o nor on master.d.o php4-cgi
> is installed.
> So the convert-to-db probably needs to be rewritten in another language.

Or alternatively, one could request php4-cgi to be installed on
klecker... shouldn't be too much of a deal. A bit hackish though, but
one could even have that code in developer.php itself and execute it only
when the .db is outdated; that means that the first request after an
update takes 1 sec longer than usual, not really a big deal (performance
is always at least as good as it is currently :-), since currently
bugs.txt is read anyway). Heck, one could even have a special mode in
developer.php, called via wget from the bugs update. Otoh, it's very
easy code, so it _also_ can be rewritten indeed (perl seems to be the
best choice imho).

> Moreover, the developer.php (and all its backend) really needs to be
> rewritten because it is really ugly and the performance is really slow.
> That's why I started some weeks ago on a complete rewrite in order to
> provide static html files (based on xml transformation) in order to
> increase the performance.

I'm from the "don't touch what's not broken" school. I do agree the code
is ugly, but performance can be easily improved by having a sane
datastructure, i.e. something else than the textfile-parsing-gibberish
there is now. With only the bugs.txt -> .db improvement, developer.php
is already very usable, performance wise. The ddpo.py code I didn't dare
to touch, as I'm no python hacker. I personally believe I'm best at
designing code (higher level) and directions etc., more than at writing
code (though I usually write it myself too, but not always).

In any case, I think statically generating .html pages is not the way to go.
With dynamically, but efficiently, generated pages, one is very
flexible, can have any selection of packages, without real performance
impact (a bit of html-generating php code with some db lookups is quite
fast), without the need for a lengthy 'generate all pages one _might_ be
requesting' process, which is already done too often imho, while it
isn't needed. Especially updates can all be done independently with
separate .db files as source of information. Added bonus is that you get
page design and data retrieval separated for free, so data can be reused
anywhere by anyone. I'm especially thinking about making the PTS info
and the developer.php info cooperate in data retrieval, rather than both
doing it their own way.

Of course, current 'extract' needs to be better designed. Since I've now
a copy, I could write a proposal on a better data structure.

> I think it's time for this piece of code to be available on alioth
> because I want more people to work on it (redesign, recode, ...)
>
> So Jeroen, I would be glad to see you, if you are interested, involved
> in the project, and you will see that all the ddpo code is entirely
> available in the qa cvs tree.

Cool :), ok.

> For the moment, I will continue to maintain it as it is in this cvs tree
> and try to improve its performance, ... and start the project on alioth
> in order to provide a really better tool than it is now.

Wouldn't it be better to work in qa's cvs tree? Reuse the ddpo dirs etc,
and ditch ddpo.py when it's unneeded, and simply rewrite parts of
developer.php whenever there is a better interface for retrieving data?

I don't think the current design of 'process to generate data', and
dynamically generated page on top of that is broken, it's only the
implementation that can use improvement. It's always better to try to
redo only things that are broken; starting from scratch would be a waste
of time imho, while it currently does work (though not very fast, and
especially not easily extendible).
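
To make the "convert bugs.txt to a .db, but only when the .db is
outdated" idea a bit more concrete, here is a minimal sketch. It is in
Python purely for illustration (the real thing would more likely live in
developer.php or a small perl script). Everything specific in it is an
assumption: the one "package count" pair per line layout of bugs.txt, the
use of an sqlite3 file as a stand-in for the .db, and all function names.

```python
import os
import sqlite3


def db_is_outdated(txt_path, db_path):
    """True when the .db is missing or older than the text file."""
    if not os.path.exists(db_path):
        return True
    return os.path.getmtime(db_path) < os.path.getmtime(txt_path)


def convert_to_db(txt_path, db_path):
    """Rebuild the lookup table from the text file.

    Assumed (hypothetical) input format: one 'package count' pair per line.
    """
    tmp_path = db_path + ".new"
    if os.path.exists(tmp_path):
        os.remove(tmp_path)
    conn = sqlite3.connect(tmp_path)
    conn.execute("CREATE TABLE bugs (package TEXT PRIMARY KEY, count INTEGER)")
    with open(txt_path) as src:
        for line in src:
            fields = line.split()
            if len(fields) != 2 or not fields[1].isdigit():
                continue  # skip blank or malformed lines
            conn.execute(
                "INSERT OR REPLACE INTO bugs VALUES (?, ?)",
                (fields[0], int(fields[1])),
            )
    conn.commit()
    conn.close()
    # Swap the finished file into place so readers never see a half-built .db.
    os.replace(tmp_path, db_path)


def bug_count(package, txt_path, db_path):
    """Look up one package, rebuilding the .db first if it is stale."""
    if db_is_outdated(txt_path, db_path):
        convert_to_db(txt_path, db_path)
    conn = sqlite3.connect(db_path)
    row = conn.execute(
        "SELECT count FROM bugs WHERE package = ?", (package,)
    ).fetchone()
    conn.close()
    return row[0] if row else 0
```

Only the first request after an update pays for the rebuild; every other
request is a single keyed lookup instead of a full parse of the text
file. The same pattern would apply to each separate .db file, so each
source of information can be refreshed independently.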

My proposal (think first (1&2), act later (3&4)):
1) document what info is going into extract, and where that's all coming
   from
2) think of a good way of storing that data, so that it can be retrieved
   efficiently
3) split extract into multiple scripts (I prefer no python personally)
   that retrieve those data, and put it in an efficient form
4) modify developer.php to use the better data access, and change the
   logic a bit, so that other selection criteria for packages can be
   implemented.

IMHO, this can be done without much work, as it should be.

--Jeroen

--
Jeroen van Wolffelaar
[EMAIL PROTECTED] (also for Jabber & MSN; ICQ: 33944357)
http://Jeroen.A-Eskwadraat.nl