On 03/01/2012 09:05 PM, Drew Frank wrote:
> Hi,
>
> Bug #6366 was reported over a year ago to address a deficiency in "join":
> it is unable to join on fields that are sorted numerically (rather than
> lexicographically). The bug report has a patch attached -- I applied it to
> the current Git head, cleaned up a few things, and added a couple tests.
>
> In response to previous discussions on the issue, my two cents:
> * "sort" includes several other sorting criteria, but they're not likely to
> be useful in this context.
> * This patch doesn't implement the functionality of sort's "-g" option, so
> "-n" is appropriate.
> * If two values have different string representations but compare as equal
> due to lack of precision...well, the person doing the join should be aware
> of the limits of floating point representations and use the "sort, join,
> sort -n" strategy. I doubt this would come up much in practice though.
>
> The lack of this option seems like an unusually conspicuous wart and I'd
> love to see it addressed. Please let me know if I can be of any more help
> to that effect.
>
> Best,
> Drew Frank
>
> * src/join.c: add new flags and implement numeric comparison feature.
> * tests/misc/join: add two tests for numerically sorted key fields.
> This patch is based on code written by Alex Shinn
> <Alex Shinn <at> gmail.com>
> ---
> src/join.c | 22 +++++++++++++++++++---
> tests/misc/join | 6 ++++++
> 2 files changed, 25 insertions(+), 3 deletions(-)

Missing documentation in NEWS and doc/coreutils.texi.

Based on just the diffstat, this patch is non-trivial, so we'd need
copyright assignment to the FSF from both you and from Alex Shinn, as
co-authors of this material. Is that something you are interested in
pursuing? I'm refraining from reviewing the patch itself, in case we
need to come up with a clean-room reimplementation.

--
Eric Blake ebl...@redhat.com +1-919-301-3266
Libvirt virtualization library http://libvirt.org