UDD contains names where spaces are not stripped
Hi, by chance I realised that the uploaders table contains some names where names are not stripped: udd=> select '"' || u.name || '"' as name_with_spaces, uploader from uploaders u where name like '% ' or name like ' %' ; name_with_spaces | uploader --+--- " Mehdi Dogguy" | Mehdi Dogguy " David Paleino" | David Paleino " Stéphane Glondu" | Stéphane Glondu " Stefano Zacchiroli"| Stefano Zacchiroli " Stefano Zacchiroli"| Stefano Zacchiroli " Stefano Zacchiroli"| Stefano Zacchiroli " Stefano Zacchiroli"| Stefano Zacchiroli " Stefano Zacchiroli"| Stefano Zacchiroli "Andreas Tille "| Andreas Tille " LI Daobing"| LI Daobing " David Paleino" | David Paleino " Stefano Zacchiroli"| Stefano Zacchiroli " Nikita V. Youshchenko" | Nikita V. Youshchenko " Nikita V. Youshchenko" | Nikita V. Youshchenko " Nikita V. Youshchenko" | Nikita V. Youshchenko " Nikita V. Youshchenko" | Nikita V. Youshchenko " Nikita V. Youshchenko" | Nikita V. Youshchenko "Colin Tuckley " | Colin Tuckley "Colin Tuckley " | Colin Tuckley "Colin Tuckley " | Colin Tuckley (20 rows) This causes slight errors when counting uploads of people. My guess is this is due to some old importer code (I've checked the hit for my name which is a pretty old upload). Thus I wonder whether it might be the easiest fix to simply fix this with some proper UPDATE statement to remove unneeded spaces. This statement is doing the trick in my local clone: UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' ' || email WHERE name like ' %' or name like '% ' ; If I'm not misleaded historic uploads will not importet from scratch so this would cure the situation. Otherwise users need to always remember adding some trim(name) when dealing with the uploaders.name column not to mention that it gets even harder to deal with the uploader column that might feature extra spaces in the middle. What do you think? Kind regards Andreas. -- http://fam-tille.de
Appstream data not in UTF-8?
Hello Matthias, tracker.debian.org has been failing to import the appstream metadata for a while (since November 26th) with this exception: Task UpdateAppStreamStatsTask failed with the following traceback. Traceback (most recent call last): File "/srv/tracker.debian.org/distro-tracker/distro_tracker/core/tasks/base.py", line 378, in run_task task.execute() File "/srv/tracker.debian.org/distro-tracker/distro_tracker/core/tasks/base.py", line 256, in execute call_methods_with_prefix(self, 'execute_') File "/srv/tracker.debian.org/distro-tracker/distro_tracker/core/utils/misc.py", line 44, in call_methods_with_prefix method(*args, **kwargs) File "/srv/tracker.debian.org/distro-tracker/distro_tracker/vendor/debian/tracker_tasks.py", line 889, in execute_main self._load_appstream_hint_stats(component, arch, all_stats) File "/srv/tracker.debian.org/distro-tracker/distro_tracker/vendor/debian/tracker_tasks.py", line 728, in _load_appstream_hint_stats hints_json = get_resource_text(url, force_update=self.force_update) File "/srv/tracker.debian.org/distro-tracker/distro_tracker/core/utils/http.py", line 356, in get_resource_text return content.decode(encoding) Exception Type: UnicodeDecodeError Exception Value: 'utf-8' codec can't decode byte 0xcc in position 990416: invalid continuation byte Request data not supplied Can you look into this and fix the data? (the precise position of the error in the stream varied over the days, here you have the position as of today, December 7th 09:36 UTC) This refers to those two lines: url = 'https://appstream.debian.org/hints/sid/{}/Hints-{}.json.gz' \ .format(section, arch) hints_json = get_resource_text(url, force_update=self.force_update) arch is "amd64", an section is likely "main" but it's not impossible that it's in one of the other sections. Thank you in advance for your help! -- ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog ⣾⠁⢠⠒⠀⣿⡁ ⢿⡄⠘⠷⠚⠋The Debian Handbook: https://debian-handbook.info/get/ ⠈⠳⣄ Debian Long Term Support: https://deb.li/LTS
Re: Appstream data not in UTF-8?
On Thu, 07 Dec 2023, Raphael Hertzog wrote: > tracker.debian.org has been failing to import the appstream metadata for a > while (since November 26th) with this exception: Quick correction. The first time it failed that way was on November 19th at 08:13 UTC. Cheers, -- ⢀⣴⠾⠻⢶⣦⠀ Raphaël Hertzog ⣾⠁⢠⠒⠀⣿⡁ ⢿⡄⠘⠷⠚⠋The Debian Handbook: https://debian-handbook.info/get/ ⠈⠳⣄ Debian Long Term Support: https://deb.li/LTS
Re: Appstream data not in UTF-8?
Am Thu, Dec 07, 2023 at 11:03:29AM +0100 schrieb Raphael Hertzog: > On Thu, 07 Dec 2023, Raphael Hertzog wrote: > > tracker.debian.org has been failing to import the appstream metadata for a > > while (since November 26th) with this exception: > > Quick correction. The first time it failed that way was on November 19th > at 08:13 UTC. I do not remember the time exactly but my gut feeling says this is pretty close to the bookworm upgrade. Kind regards Andreas. -- http://fam-tille.de
Processed: some bug triaging
Processing commands for cont...@bugs.debian.org: > # to avoid these *.debian.org bugs showing up against the release > # lets tag them as sid only (although technically that's not totally > # correct either) > tags 388141 = sid Bug #388141 [www.debian.org] www.debian.org SPI copyright claim not legally valid until all contributors are contacted for relicensing Removed tag(s) jessie, bullseye, bookworm, buster-ignore, stretch, trixie, and buster. > tags 548024 sid Bug #548024 [www.debian.org] packages.debian.org: mirror doesn't close old databases Added tag(s) sid. > tags 1037324 sid Bug #1037324 [www.debian.org] d.o/CD/live refers to images no longer built Added tag(s) sid. > tags 1055043 sid Bug #1055043 [qa.debian.org] Debian carnivore: port from Python 2 to 3 Added tag(s) sid. > tags 1051661 sid Bug #1051661 [ftp.debian.org] /usr/bin/qemu-system-ppc: Package not installable Added tag(s) sid. > tags 939147 sid Bug #939147 [ftp.debian.org] gcc-7-source 7.4.0-11 vanished from the archive Added tag(s) sid. > tags 1023811 sid Bug #1023811 [ftp.debian.org] golang-github-grpc-ecosystem-go-grpc-middleware_1.3.0-1 sources changed in the archive? Added tag(s) sid. > # apparently "sarge" isn't enough to not have it affect testing > tags 497471 sid Bug #497471 [cdimage.debian.org] sarge images have syslinux binaries without source Added tag(s) sid. > tags 507706 sid etch Bug #507706 [cdimage.debian.org] Missing sources for d-i components/kernel of etch-n-half images Added tag(s) etch and sid. > # let's connect this > affects 1025650 src:nmap Bug #1025650 [ftp.debian.org] ftp.debian.org: Nmap Public Source License Version 0.94 - Is it DFSG-compliant? Added indication that 1025650 affects src:nmap > # might be better to add some words > reassign 993274 release-notes Bug #993274 [upgrade-reports] machine unbootable after upgrade Bug reassigned from package 'upgrade-reports' to 'release-notes'. Ignoring request to alter found versions of bug #993274 to the same values previously set Ignoring request to alter fixed versions of bug #993274 to the same values previously set > # seems like a (hopefully old) grub bug > reassign 1006369 src:grub2 2.06-3~deb10u1 Bug #1006369 [upgrade-reports] mini-NAS v1 currently not system-upgradable to Buster Bug reassigned from package 'upgrade-reports' to 'src:grub2'. Ignoring request to alter found versions of bug #1006369 to the same values previously set Ignoring request to alter fixed versions of bug #1006369 to the same values previously set Bug #1006369 [src:grub2] mini-NAS v1 currently not system-upgradable to Buster Marked as found in versions grub2/2.06-3~deb10u1. > thanks Stopping processing here. Please contact me if you need assistance. -- 1006369: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1006369 1023811: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1023811 1025650: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1025650 1037324: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1037324 1051661: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1051661 1055043: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1055043 388141: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=388141 497471: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=497471 507706: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=507706 548024: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=548024 939147: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=939147 993274: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=993274 Debian Bug Tracking System Contact ow...@bugs.debian.org with problems
Re: UDD contains names where spaces are not stripped
On 07/12/23 at 09:58 +0100, Andreas Tille wrote: > Hi, > > by chance I realised that the uploaders table contains some names where names > are not stripped: > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from > uploaders u where name like '% ' or name like ' %' ; > name_with_spaces | uploader > --+--- > " Mehdi Dogguy" | Mehdi Dogguy > " David Paleino" | David Paleino > " Stéphane Glondu" | Stéphane Glondu > " Stefano Zacchiroli"| Stefano Zacchiroli > " Stefano Zacchiroli"| Stefano Zacchiroli > " Stefano Zacchiroli"| Stefano Zacchiroli > " Stefano Zacchiroli"| Stefano Zacchiroli > " Stefano Zacchiroli"| Stefano Zacchiroli > "Andreas Tille "| Andreas Tille > " LI Daobing"| LI Daobing > " David Paleino" | David Paleino > " Stefano Zacchiroli"| Stefano Zacchiroli > " Nikita V. Youshchenko" | Nikita V. Youshchenko > " Nikita V. Youshchenko" | Nikita V. Youshchenko > " Nikita V. Youshchenko" | Nikita V. Youshchenko > " Nikita V. Youshchenko" | Nikita V. Youshchenko > " Nikita V. Youshchenko" | Nikita V. Youshchenko > "Colin Tuckley " | Colin Tuckley > "Colin Tuckley " | Colin Tuckley > "Colin Tuckley " | Colin Tuckley > (20 rows) > > > This causes slight errors when counting uploads of people. My guess is this > is due to some old importer code (I've checked the hit for my name which > is a pretty old upload). Thus I wonder whether it might be the easiest > fix to simply fix this with some proper UPDATE statement to remove unneeded > spaces. This statement is doing the trick in my local clone: > >UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' ' || > email WHERE name like ' %' or name like '% ' ; > > If I'm not misleaded historic uploads will not importet from scratch so > this would cure the situation. Otherwise users need to always remember > adding some trim(name) when dealing with the uploaders.name column not > to mention that it gets even harder to deal with the uploader column > that might feature extra spaces in the middle. > > What do you think? Hi, Uploaders is refreshed every few hours from archive data, so a one-time UPDATE would not help. UDD usually tries to preserve inaccuracies, so those might be interesting for QA work. In your case, why don't you use the email address to identify uploaders? (possibly combining it with the carnivore data to identify different emails belonging to the same person ?) Lucas
Re: UDD contains names where spaces are not stripped
Am Thu, Dec 07, 2023 at 07:59:38PM +0100 schrieb Lucas Nussbaum: > On 07/12/23 at 09:58 +0100, Andreas Tille wrote: > > Hi, > > > > by chance I realised that the uploaders table contains some names where > > names > > are not stripped: > > > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from > > uploaders u where name like '% ' or name like ' %' ; > > name_with_spaces | uploader > > --+--- > > " Mehdi Dogguy" | Mehdi Dogguy > > " David Paleino" | David Paleino > > " Stéphane Glondu" | Stéphane Glondu > > " Stefano Zacchiroli"| Stefano Zacchiroli > > " Stefano Zacchiroli"| Stefano Zacchiroli > > " Stefano Zacchiroli"| Stefano Zacchiroli > > " Stefano Zacchiroli"| Stefano Zacchiroli > > " Stefano Zacchiroli"| Stefano Zacchiroli > > "Andreas Tille "| Andreas Tille > > " LI Daobing"| LI Daobing > > " David Paleino" | David Paleino > > " Stefano Zacchiroli"| Stefano Zacchiroli > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > "Colin Tuckley " | Colin Tuckley > > "Colin Tuckley " | Colin Tuckley > > "Colin Tuckley " | Colin Tuckley > > (20 rows) > > ... > >UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' ' || > > email WHERE name like ' %' or name like '% ' ; > > > > Uploaders is refreshed every few hours from archive data, so a one-time > UPDATE would not help. UDD usually tries to preserve inaccuracies, so > those might be interesting for QA work. OK. > In your case, why don't you use the email address to identify uploaders? Since this also does not work: udd=> SELECT count(*), uploader FROM uploaders WHERE name ilike '%tille%' GROUP BY uploader; count | uploader ---+ 1 | Andreas Tille 1 | Andreas Tille 8785 | Andreas Tille (3 Zeilen) > (possibly combining it with the carnivore data to identify different emails > belonging to the same person ?) I could fiddle around with carnivore but that's overkill for thst purpose and I insist that not stripping blanks from names does not make any sense, IMHO. (1 Zeile) BTW: I found udd=> SELECT count(*), name FROM (SELECT CASE WHEN changed_by_name = '' THEN maintainer_name ELSE changed_by_name END AS name FROM upload_history) uh WHERE name ilike '%tille%' group by name; count | name ---+--- 16524 | Andreas Tille (1 Zeile) So why do I have 8707 uploads per uploaders but 16524 per upload_history? Is my assumption wrong that both values should match (modulo some wrongly spelled names) Kind regards Andreas. -- http://fam-tille.de
Re: UDD contains names where spaces are not stripped
On 07/12/23 at 20:24 +0100, Andreas Tille wrote: > Am Thu, Dec 07, 2023 at 07:59:38PM +0100 schrieb Lucas Nussbaum: > > On 07/12/23 at 09:58 +0100, Andreas Tille wrote: > > > Hi, > > > > > > by chance I realised that the uploaders table contains some names where > > > names > > > are not stripped: > > > > > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from > > > uploaders u where name like '% ' or name like ' %' ; > > > name_with_spaces | uploader > > > --+--- > > > " Mehdi Dogguy" | Mehdi Dogguy > > > " David Paleino" | David Paleino > > > " Stéphane Glondu" | Stéphane Glondu > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > "Andreas Tille "| Andreas Tille > > > " LI Daobing"| LI Daobing > > > " David Paleino" | David Paleino > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > "Colin Tuckley " | Colin Tuckley > > > "Colin Tuckley " | Colin Tuckley > > > "Colin Tuckley " | Colin Tuckley > > > (20 rows) > > > ... > > >UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' ' > > > || email WHERE name like ' %' or name like '% ' ; > > > > > > > Uploaders is refreshed every few hours from archive data, so a one-time > > UPDATE would not help. UDD usually tries to preserve inaccuracies, so > > those might be interesting for QA work. > > OK. > > > In your case, why don't you use the email address to identify uploaders? > > Since this also does not work: > > udd=> SELECT count(*), uploader FROM uploaders WHERE name ilike '%tille%' > GROUP BY uploader; > count | uploader > ---+ > 1 | Andreas Tille > 1 | Andreas Tille > 8785 | Andreas Tille > (3 Zeilen) > > > (possibly combining it with the carnivore data to identify different emails > > belonging to the same person ?) > > I could fiddle around with carnivore but that's overkill for thst > purpose and I insist that not stripping blanks from names does not make > any sense, IMHO. (1 Zeile) > > > BTW: I found > > udd=> SELECT count(*), name FROM (SELECT CASE WHEN changed_by_name = '' THEN > maintainer_name ELSE changed_by_name END AS name FROM upload_history) uh > WHERE name ilike '%tille%' group by name; > count | name > ---+--- > 16524 | Andreas Tille > (1 Zeile) > > So why do I have 8707 uploads per uploaders but 16524 per upload_history? > > Is my assumption wrong that both values should match (modulo some wrongly > spelled names) If you look at the uploaders table, there are three columns: - 'uploader', than contains the raw data - 'name' and 'email' that contain the parsed (and trimmed) data udd=> select uploader, name, email, count(*) from uploaders where uploader ilike '%tille%' group by 1,2,3; uploader | name | email | count +-+--+--- Andreas Tille| Andreas Tille | ti...@debian.org | 8785 Andreas Tille| Andreas Tille | andr...@an3as.eu | 1 Andreas Tille| Andreas Tille | ti...@debian.org | 1 So, just use name and/or email? Lucas
Re: UDD contains names where spaces and quotes are not stripped
Am Thu, Dec 07, 2023 at 08:36:12PM +0100 schrieb Lucas Nussbaum: > On 07/12/23 at 20:24 +0100, Andreas Tille wrote: > > Am Thu, Dec 07, 2023 at 07:59:38PM +0100 schrieb Lucas Nussbaum: > > > On 07/12/23 at 09:58 +0100, Andreas Tille wrote: > > > > > > > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from > > > > uploaders u where name like '% ' or name like ' %' ; > > > > name_with_spaces | uploader > > > > --+--- > > > > " Mehdi Dogguy" | Mehdi Dogguy > > > > " David Paleino" | David Paleino > > > > " Stéphane Glondu" | Stéphane Glondu > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > "Andreas Tille "| Andreas Tille > > > > " LI Daobing"| LI Daobing > > > > " David Paleino" | David Paleino > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > "Colin Tuckley " | Colin Tuckley > > > > "Colin Tuckley " | Colin Tuckley > > > > "Colin Tuckley " | Colin Tuckley > > > > (20 rows) > > > > ... > > > >UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' ' > > > > || email WHERE name like ' %' or name like '% ' ; > > > > > > > > > > BTW: I found > > > > udd=> SELECT count(*), name FROM (SELECT CASE WHEN changed_by_name = '' > > THEN maintainer_name ELSE changed_by_name END AS name FROM upload_history) > > uh WHERE name ilike '%tille%' group by name; > > count | name > > ---+--- > > 16524 | Andreas Tille > > (1 Zeile) > > > > So why do I have 8707 uploads per uploaders but 16524 per upload_history? ??? > > Is my assumption wrong that both values should match (modulo some wrongly > > spelled names) Could you please comment on these different results? > If you look at the uploaders table, there are three columns: > - 'uploader', than contains the raw data > - 'name' and 'email' that contain the parsed (and trimmed) data > > udd=> select uploader, name, email, count(*) from uploaders where uploader > ilike '%tille%' group by 1,2,3; > uploader | name | email | > count > +-+--+--- > Andreas Tille| Andreas Tille | ti...@debian.org | > 8785 > Andreas Tille| Andreas Tille | andr...@an3as.eu | > 1 > Andreas Tille| Andreas Tille | ti...@debian.org | > 1 > > So, just use name and/or email? Well, I do not seek for a solution for this (non-)problem. I simply think that not stripping values from spaces before injecting these into UDD is wrong. I simply stumbled upon this when I did the query above. I stumbled upon another reason which might be even worse: select distinct done, done_name, done_email, owner, owner_name, owner_email from archived_bugs where done_name like '%"%' or owner_name like '%"%' order by done_name; done | done_name | done_email | owner | owner_name | owner_email ---+-+-+-+-+-- | | der...@debian.org | "vane...@gmail.com" | "vane...@gmail.com" | vane...@gmail.com
Re: UDD contains names where spaces and quotes are not stripped
On 08/12/23 at 07:39 +0100, Andreas Tille wrote: > Am Thu, Dec 07, 2023 at 08:36:12PM +0100 schrieb Lucas Nussbaum: > > On 07/12/23 at 20:24 +0100, Andreas Tille wrote: > > > Am Thu, Dec 07, 2023 at 07:59:38PM +0100 schrieb Lucas Nussbaum: > > > > On 07/12/23 at 09:58 +0100, Andreas Tille wrote: > > > > > > > > > > udd=> select '"' || u.name || '"' as name_with_spaces, uploader from > > > > > uploaders u where name like '% ' or name like ' %' ; > > > > > name_with_spaces | uploader > > > > > --+--- > > > > > " Mehdi Dogguy" | Mehdi Dogguy > > > > > " David Paleino" | David Paleino > > > > > " Stéphane Glondu" | Stéphane Glondu > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > "Andreas Tille "| Andreas Tille > > > > > " LI Daobing"| LI Daobing > > > > > " David Paleino" | David Paleino > > > > > " Stefano Zacchiroli"| Stefano Zacchiroli > > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > > " Nikita V. Youshchenko" | Nikita V. Youshchenko > > > > > "Colin Tuckley " | Colin Tuckley > > > > > "Colin Tuckley " | Colin Tuckley > > > > > "Colin Tuckley " | Colin Tuckley > > > > > (20 rows) > > > > > ... > > > > >UPDATE uploaders SET name = trim(name), uploader = trim(name) || ' > > > > > ' || email WHERE name like ' %' or name like '% ' ; > > > > > > > > > > > > > > BTW: I found > > > > > > udd=> SELECT count(*), name FROM (SELECT CASE WHEN changed_by_name = '' > > > THEN maintainer_name ELSE changed_by_name END AS name FROM > > > upload_history) uh WHERE name ilike '%tille%' group by name; > > > count | name > > > ---+--- > > > 16524 | Andreas Tille > > > (1 Zeile) > > > > > > So why do I have 8707 uploads per uploaders but 16524 per upload_history? > > ??? upload_history contains all uploads ever made to Debian. uploaders contains packages currently in the suite (not superseded by another upload) > > So, just use name and/or email? > > Well, I do not seek for a solution for this (non-)problem. I simply > think that not stripping values from spaces before injecting these into > UDD is wrong. I simply stumbled upon this when I did the query above. It has been like that for about 15 years. I'm not sure changing the API because you think is wrong is a good idea. > I stumbled upon another reason which might be even worse: > > select distinct done, done_name, done_email, owner, owner_name, owner_email > from archived_bugs where done_name like '%"%' or owner_name like '%"%' order > by done_name; > done > | done_name > | done_email > | owner > | > owner_name | owner_email > > ---+-+-+-+-+-- > > | > | der...@debian.org > | "vane...@gmail.com" > | "vane...@gmail.com" > | vane...@gmail.com > > | > | twer...@debian.org > | "Varun Hiremath" > | "Varun Hiremath" > | varunhirem...@gmail.co