[Laszlo explicitly in CC because I do not know whether you followed
these longish mails]

Hi,

On Mon, Feb 20, 2012 at 10:08:20PM -0500, Yaroslav Halchenko wrote:
> > The alternative approach could perfectly be to seek for files matching
> >     /usr/share/doc/*/upstream
> > and do the BibTeX generation afterwards.
>
> yeap -- that was the idea behind dbib_collect -- to gather from
> all possible places, but if we converge on /upstream, and it would (or
> does already?) allow multiple entries, trigger-generated snippets-cat-er
> might be preferable eliminating the need for an additional user-land
> tool.

Currently debian/upstream explicitly allows only one entry, and Laszlo
expressed some criticism about this.  So I would like to address this
point here explicitly.  For me the question is how to handle multiple
entries sanely, and how and where we use them.

Currently, by design, tasks pages display only one reference entry.  If
you put multiple entries into a tasks file, the last one wins.  Even
worse, if you tried something like

    Published-Authors: Alois Schlögl, Clemens Brunner
    Published-DOI: 10.1109/MC.2008.407
    Published-In: Computer, 41(10): 44-50
    Published-Title: BioSig: A Free and Open Source Software Library for BCI Research
    Published-URL: http://pub.ist.ac.at/~schloegl/publications/Schloegl2007_BCI_Software.pdf
    Published-Year: 2008
    Published-Authors: Some other Author
    Published-Title: Some stupid title

you would end up with

    Author: Some other Author
    Title: Some stupid title
    DOI: 10.1109/MC.2008.407
    In: Computer, 41(10): 44-50
    URL: http://pub.ist.ac.at/~schloegl/publications/Schloegl2007_BCI_Software.pd
    Year: 2008

that is, a mix of fields from both references.  This is by design of
the RFC 822 parser, where the last entry with a certain name wins.  So
in tasks pages we do not have any reasonable means to specify more than
one reference.
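
The "last entry wins" effect can be reproduced with a few lines of
Python.  This is only a minimal sketch and not the parser the tasks
pages actually use: parse_stanza() is a hypothetical helper, it assumes
that fields are collected into a plain dictionary, and it keeps the
field names as written after stripping the Published- prefix.

```python
def parse_stanza(text):
    """Parse 'Key: value' lines into a dict.

    A repeated key simply overwrites the earlier value, which is the
    "last entry wins" behaviour described above.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip() or ":" not in line:
            continue  # this sketch skips blank or malformed lines
        key, value = line.split(":", 1)  # split at the first colon only
        fields[key.strip()] = value.strip()
    return fields


# Two references written one after the other in a single stanza.
STANZA = """\
Published-Authors: Alois Schlögl, Clemens Brunner
Published-DOI: 10.1109/MC.2008.407
Published-Title: BioSig: A Free and Open Source Software Library for BCI Research
Published-Year: 2008
Published-Authors: Some other Author
Published-Title: Some stupid title
"""

PREFIX = "Published-"
reference = {
    key[len(PREFIX):]: value
    for key, value in parse_stanza(STANZA).items()
    if key.startswith(PREFIX)
}

for key, value in reference.items():
    print(f"{key}: {value}")
```

The resulting single "reference" carries the authors and title of the
second entry but the DOI and year of the first one.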

Based on this "feature" I suggested designing the bibref table like

    CREATE TABLE bibref (
        package text NOT NULL,
        key     text NOT NULL,
        value   text NOT NULL,
        PRIMARY KEY (package, key)
    );

to explicitly prevent duplicate (package, key) pairs and so ensure data
integrity (we had at least one case in the past where some package,key
pair occurred twice and broke my attempt to create the tasks pages).
If we keep this design of a flexible package,key,value table, which can
easily adapt to new keys, the only chance I see would be to do

    CREATE TABLE bibref (
        package text NOT NULL,
        key     text NOT NULL,
        value   text NOT NULL,
        rank    int  NOT NULL,
        PRIMARY KEY (package, key, rank)
    );

From the tasks pages' point of view this would require some changes,
but I'd regard it as doable.

Remark: Currently the algorithm for parsing the references is: take
references from the tasks file only if you did not find references in
UDD.  In tasks files I see no chance to specify more than one
reference, so the only way to specify more than one is debian/upstream
via UDD.

In short: I see a chance for implementing multiple references via
debian/upstream - UDD - tasks pages when defining some ranking.

However, handling multiple references is asking for additional trouble
in other use cases as well.  Take, for instance, finding a key for the
BibTeX database.  With only one reference this would be pretty simple:
just take the package name as the key and be done with it.  For
multiple references you somehow need to invent a key, and for the
moment I do not see any handy way to do this.  We could rely on the
sequence in which the references are given, which could also serve as
the rank value for the UDD table.  We could also use a key like
<package><rank> based on this sequence, but I'd consider all this a bit
hackish - better suggestions are welcome.

We also need to make sure that the different references are properly
separated inside the YAML file.
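
The ranked table and the <package><rank> key idea can be tried out as
follows.  This is a minimal sketch under several assumptions: SQLite
(via Python's sqlite3 module) stands in for the real UDD database, the
package name "examplepkg" and the lower-case field names are invented
for illustration, and the rank is simply the position of a reference in
the order given, starting at 1.

```python
import sqlite3

# References per package, in the order they are given upstream.
# "examplepkg" is a hypothetical package name.
REFERENCES = {
    "examplepkg": [
        {"authors": "Alois Schlögl, Clemens Brunner", "year": "2008"},
        {"authors": "Some other Author", "title": "Some stupid title"},
    ],
}

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE bibref (
        package text NOT NULL,
        key     text NOT NULL,
        value   text NOT NULL,
        rank    int  NOT NULL,
        PRIMARY KEY (package, key, rank)
    )
    """
)

# The position in the list serves as rank (assumption: first entry = 1).
for package, refs in REFERENCES.items():
    for rank, ref in enumerate(refs, start=1):
        conn.executemany(
            "INSERT INTO bibref (package, key, value, rank) VALUES (?, ?, ?, ?)",
            [(package, key, value, rank) for key, value in ref.items()],
        )

# Regroup the rows into one entry per reference, keyed by <package><rank>.
entries = {}
query = "SELECT package, rank, key, value FROM bibref ORDER BY package, rank"
for package, rank, key, value in conn.execute(query):
    entries.setdefault(f"{package}{rank}", {})[key] = value

# The primary key still rejects a duplicate key within one reference.
try:
    conn.execute(
        "INSERT INTO bibref (package, key, value, rank) VALUES (?, ?, ?, ?)",
        ("examplepkg", "authors", "duplicate", 1),
    )
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(sorted(entries), duplicate_rejected)
```

One known weakness of plain concatenation is that a package name ending
in a digit makes the key ambiguous, which is part of why the scheme
feels hackish.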
I have no experience with YAML files, but I have seen Laszlo inserting
'-' signs in front of the first entry of each separate reference.  I
guess a usual YAML parser will just do the right thing, and I simply
assume that this will work flawlessly.

In short: If we really want to support multiple references, we need to
clarify the use cases and the implementation details first.  Personally
I am not really convinced that we cannot get by with one major
reference per package, nor that the trouble we would need to deal with
is worth the effort for a few exceptions.  However, I do not consider
myself a final user of those references, and if users raise honest
arguments I would easily be convinced to help implement this.

> > packages with renamed files).  If you are asking: "Why, this should be
> > installed?" I would say: "You are right, probably nobody has really
> > thought about this."  I would fully agree that it could add extra
> > information to the doc inside a binary package - so why not installing
> > it.
>
> ;-)

BTW, every developer is free to mention debian/upstream in debian/docs
for the moment - we have just neglected to do this so far.

> > Despite this my plan should work with or without the installation of
> > the files.  I would like to do something like this:
> >
> >   debian/control:
> >      Build-Depends: upstream-to-bibref-helper
> >      # for sure the package needs a better name
>
> e.g. debian-bibliography-tools... ? ;-)
> or may be the whole debian-bibliography could be abbreviated as
> debbib..., then debbib-tools

I'm not great at inventing names - so any sane suggestion which is not
too longish (just to save the energy in pressing keys :-)) would be
fine.  The only thing we should decide when choosing a name is whether
we want to restrict it explicitly to bibliography, or whether we rather
stick to a more generic "upstream" name which leaves more flexibility
in case we need to handle some other upstream data.

> > database.
> > Please note that this is just a sketch which should be
> > enhanced and perhaps / probably the debian.bib file should end up at a
> > better place where bibtex files will be automatically searched for etc -
> > but these are implementation details.
>
> IIRC I have looked for such a place and there were no suitable one (I
> could be wrong), so in that preliminary debian-bibliography
> package we placed /usr/share/bib/debian.bib with the intent to seek
> adding /usr/share/bib into default BIBINPUTS.

I do not consider /usr/share the right place for autogenerated data,
and the file produced by the method I described is autogenerated.

> Here debian.bib
> http://anonscm.debian.org/gitweb/?p=pkg-exppsy/debian-bibliography.git;a=blob;hb=HEAD;f=bib/debian.bib
> is not a compilation of software bib references but
> rather ready-to-use entries for debian documents (e.g. papers) and
> some wiki pages.  With http://wiki.debian.org/CategoryPublication
> we can now extend it automatically each "release" with relevant
> publication entries (script yet TODO).
> ...

So this is rather a manually compiled list of references, which is
something else than what we discussed before about fetching the
bibliographic data from debian/upstream, right?  Or do you want to
freeze the bibliographic data at a certain point in time inside a
source package and then upload this package with the references?  I'd
regard this method as a possible alternative, with the drawback of not
being perfectly up to date - I am asking just to make sure I understand
you correctly.

Kind regards

      Andreas.

-- 
http://fam-tille.de

