Package: popularity-contest
Severity: normal

Hi,
I'm trying to parse the raw statistics available on http://popcon.debian.org/ and ran into some problems.

Firstly, while it is fine that the fields are separated by multiple spaces, field values should then not contain spaces themselves. Unfortunately they do, for example in the package name "Not in Sid". This is similar to bug report #574743, which asks for package names to be sanitized before they are put into the statistics.

Secondly, and going together with package name sanitization (which, as the example above shows, is needed to keep the data parsable), some obviously bogus entries can be removed entirely, like the "Not in Sid" example from above. No such package exists. If you want to include the information, it would be better to put it in a commented line, as you already do for the header of the file, where you use # as the comment character.

Thirdly, at the end of the file there is one long line consisting only of minus characters. Could this line also be commented out with a #? The same goes for the very last line, which presents a total. It is not necessary to put a "rank" on this line (it has the same rank as the last entry), and it is not necessary to have this line at all, because any machine parsing the rest of the file can easily generate it. If you want this line for human consumption, you can simply prefix it with a # to make it a comment.

Would you welcome a patch fixing these issues?

cheers,
josch
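
To illustrate the three problems described above, here is a minimal Python sketch of the workarounds a parser currently needs. It is only a sketch under assumptions: the column layout (rank, name, a run of integer counts, an optional maintainer in parentheses), the literal name "Total" on the last line, the sample rows and the names ROW and parse_popcon are all made up for illustration and not taken from the actual popcon files.

```python
import re

# Assumed row layout (hypothetical): rank, name, integer counts,
# optional "(maintainer)". The name is matched lazily so that a name
# containing spaces, such as "Not in Sid", still ends up in one field.
ROW = re.compile(
    r"^(?P<rank>\d+)\s+(?P<name>.+?)\s+"
    r"(?P<counts>\d+(?:\s+\d+)*)"
    r"(?:\s+\((?P<maintainer>.*)\))?\s*$"
)


def parse_popcon(lines):
    """Return a list of (rank, name, counts, maintainer) tuples."""
    rows = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # empty line or comment/header line
        if set(stripped) == {"-"}:
            continue  # separator line made only of minus characters
        match = ROW.match(stripped)
        if match is None:
            continue  # unparsable line, ignored in this sketch
        if match.group("name") == "Total":
            continue  # assumed label of the summary line
        counts = [int(c) for c in match.group("counts").split()]
        rows.append((int(match.group("rank")), match.group("name"),
                     counts, match.group("maintainer")))
    return rows


# Made-up sample data, only shaped like the file described above.
SAMPLE = """\
#rank name inst vote old recent no-files (maintainer)
1     dpkg            200 150 30 15 5 (Dpkg Developers)
2     Not in Sid      120   4 100 10 6 (Unknown)
-----------------------------------------
2     Total           320 154 130 25 11
"""

rows = parse_popcon(SAMPLE.splitlines())
for rank, name, counts, maintainer in rows:
    # A name containing whitespace cannot be a real package name.
    flag = " (bogus name?)" if " " in name else ""
    print(rank, repr(name), counts, maintainer, flag)
```

If the separator and total lines were prefixed with # and names never contained spaces, all of this would reduce to skipping comment lines and calling split() on the rest.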