On Thu, Jan 21, 2010 at 04:07:19PM +0100, Andreas Tille wrote:
> On Thu, Jan 21, 2010 at 11:54:31PM +0900, Charles Plessy wrote:
> >
> > I will try to provide drafts for
> > the loading in UDD. But I never programmed in Python, so I do not expect it
> > will work out of the box. Hopefully, it will save you some typing.
>
> That's a good way to push me for helping you instead of waiting until I
> find time to do it from scratch. Just ask in case of trouble.

Hi Andreas and everybody,

today I took a couple of hours to study the UDD and Python (and snakes and
Greek mythology, thanks to the Wikipedia syndrome). I attached to this email a
draft for a bibliographic reference gatherer, “bibref_gatherer.py”.

Although in my previous emails I described a tab-delimited export format from
the upstream-metadata.d.n system, I realised that this is not robust if a
field happens to contain a tab. Instead of re-inventing the wheel with quoting
mechanisms, I simply switched the exchange format to YAML.

http://upstream-metadata.debian.net/for_UDD/biblio.yaml

The above file contains triples to be loaded in a table of the UDD. They
provide the information needed to feed the Blends web sentinel with
bibliographic information.

Since I do not run a local copy of the UDD, I did not test the attached
gatherer. Please treat it as a stub. It is meant to be used with the following
patch to the UDD configuration file.

Index: config-org.yaml
===================================================================
--- config-org.yaml	(revision 1680)
+++ config-org.yaml	(working copy)
@@ -19,6 +19,7 @@
     ddtp: module udd.ddtp_gatherer
     ftpnew: module udd.ftpnew_gatherer
     screenshots: module udd.screenshot_gatherer
+    bibref: module udd.bibref_gatherer
     dehs: module udd.dehs_gatherer
     ldap: module udd.ldap_gatherer
     wannabuild: module udd.wannabuild_gatherer
@@ -528,6 +529,14 @@
   table: screenshots
   screenshots_json: /org/udd.debian.org/mirrors/screenshots/screenshots.json
 
+bibref:
+  type: bibref
+  update-command: /org/udd.debian.org/udd/scripts/fetch_bibref.sh
+  path: /org/udd.debian.org/mirrors/bibref
+  cache: /org/udd.debian.org/mirrors/cache
+  table: bibref
+  bibref_yaml: /org/udd.debian.org/mirrors/bibref/bibref.yaml
+
 wannabuild:
   type: wannabuild
   wbdb: "dbname=wanna-build host=localhost port=5433 user=guest"

Please tell me what you think about it, and whether you would like me to commit
the whole to the UDD sources.
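
To illustrate the tab problem mentioned above, here is a minimal sketch that uses only the Python standard library. The package name, key and value are made up for the example, and the json module merely stands in for a YAML serialiser (any format with proper quoting behaves the same way here); it is not the code that produces biblio.yaml.

```python
import json

# Hypothetical (package, key, value) triple whose value contains a tab.
triple = ("example-package", "Reference-Title", "A title\twith a tab")

# Tab-delimited export: the embedded tab yields a spurious extra field.
tab_line = "\t".join(triple)
tab_fields = tab_line.split("\t")

# Export with quoting (json as a stand-in for YAML): the tab is escaped,
# so the triple is recovered unchanged.
quoted_line = json.dumps(list(triple))
quoted_fields = json.loads(quoted_line)

print(len(tab_fields))
print(len(quoted_fields))
```

With the tab-delimited line, a reader expecting three fields can no longer tell where the value starts, which is why a format with its own quoting rules is the safer choice.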

Have a nice weekend,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan

#!/usr/bin/env python

"""
This script imports bibliographic references from upstream-metadata.debian.net.
"""

from gatherer import gatherer
from sys import stderr
from yaml import safe_load_all

online=0

def get_gatherer(connection, config, source):
  return bibref_gatherer(connection, config, source)

class bibref_gatherer(gatherer):
  """
  Bibliographic references from upstream-metadata.debian.net.
  """

  def __init__(self, connection, config, source):
    gatherer.__init__(self, connection, config, source)
    self.assert_my_config('table')
    my_config = self.my_config

    #start harassing the DB, preparing the final inserts and making place
    #for the new data:
    cur = self.cursor()
    query = "DELETE FROM %s" % my_config['table']
    cur.execute(query)
    query = """PREPARE bibref_insert (text, text, text) AS
                   INSERT INTO %s
                   (package, key, value)
                    VALUES ($1, $2, $3)""" % (my_config['table'])
    cur.execute(query)

  def run(self):
    my_config = self.my_config
    cur = self.cursor()

    bibref_file = my_config['bibref_yaml']
    fp = open(bibref_file, 'r')
    result = fp.read()
    fp.close()

    query = """EXECUTE bibref_insert
                    (%(package)s, %(key)s, %(value)s)"""
    for res in safe_load_all(result):
      # each YAML document is expected to be a (package, key, value) triple
      package, key, value = res
      params = {'package': package, 'key': key, 'value': value}
      try:
        cur.execute(query, params)
      except UnicodeEncodeError, err:
        print >>stderr, "Unable to inject data for package %s. %s" % (package, err)
        print >>stderr, "-->", res

    cur.execute("DEALLOCATE bibref_insert")
    cur.execute("ANALYZE %s" % my_config['table'])

# vim:set et tabstop=2:
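
Since the gatherer above could not be tested against a real UDD, the following is a rough way to exercise the same "empty the table, then insert the triples" logic on its own. It is only a sketch under assumptions: sqlite3 stands in for the UDD's PostgreSQL database (so PREPARE, EXECUTE and ANALYZE are replaced by a plain parameterised INSERT), load_triples is a hypothetical helper, and the sample triples are invented.

```python
import sqlite3

def load_triples(connection, table, triples):
    """Empty `table`, then insert (package, key, value) triples into it.

    The table name is interpolated as in the gatherer; it is assumed to
    come from a trusted configuration file, never from user input.
    """
    cur = connection.cursor()
    cur.execute("DELETE FROM %s" % table)
    cur.executemany(
        "INSERT INTO %s (package, key, value) VALUES (?, ?, ?)" % table,
        triples)
    connection.commit()

# In-memory database with a stale row that the reload should remove.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bibref (package TEXT, key TEXT, value TEXT)")
conn.execute("INSERT INTO bibref VALUES ('stale', 'old', 'row')")

# Invented sample data in the (package, key, value) shape.
triples = [
    ("example-package", "Reference-Title", "A made-up title"),
    ("example-package", "Reference-Year", "2010"),
]
load_triples(conn, "bibref", triples)

rows = conn.execute(
    "SELECT package, key, value FROM bibref ORDER BY key").fetchall()
print(rows)
```

After the call the table holds only the freshly loaded triples, which is the behaviour the gatherer aims for on each run.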