Hello Tamir,

I think the simplest way to do this is with Pig.
http://wiki.apache.org/pig/PigOverview

Pig provides an SQL-like interface on top of Hadoop and supports the kind of operation you need to run on this data quite easily.

Thanks,
---
Peeyush

On Tue, 2009-03-24 at 13:33 +0200, Tamir Kamara wrote:
> Hi,
>
> We need to implement a Join with a between operator instead of an equal.
> What we are trying to do is search a file for a key where the key falls
> between two fields in the search file like this:
>
> main file (ip, a, b):
> (80, zz, yy)
> (125, vv, bb)
>
> search file (from-ip, to-ip, d, e):
> (52, 75, xxx, yyy)
> (78, 98, aaa, bbb)
> (99, 115, xxx, ddd)
> (125, 130, hhh, aaa)
> (150, 162, qqq, sss)
>
> the outcome should be in the form (ip, a, b, d, e):
> (80, zz, yy, aaa, bbb)
> (125, vv, bb, eee, hhh)
>
> We could convert the ip ranges in the search file to single record ips and
> then do a regular join, but the number of single ips is huge and this is
> probably not a good way.
> What would be a good course for doing this in hadoop ?
>
>
> Thanks,
> Tamir
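
For reference, here is a rough sketch of the lookup that a "between" join comes down to, written in plain Python rather than Pig so the logic is easy to follow. It is only an illustration, not a Hadoop or Pig job: the function name between_join is made up for this example, and it assumes the ranges in the search file do not overlap (true for the sample data, but not stated in the question) and that the search file fits in memory. The idea is to sort the ranges by from-ip and binary-search each ip, instead of expanding every range into single ips.

```python
from bisect import bisect_right

# Sample rows from the question.
# main file: (ip, a, b); search file: (from_ip, to_ip, d, e)
main_rows = [(80, "zz", "yy"), (125, "vv", "bb")]
search_rows = [
    (52, 75, "xxx", "yyy"),
    (78, 98, "aaa", "bbb"),
    (99, 115, "xxx", "ddd"),
    (125, 130, "hhh", "aaa"),
    (150, 162, "qqq", "sss"),
]


def between_join(main_rows, search_rows):
    """Join each main row to the search row whose [from_ip, to_ip] contains ip.

    Assumption (hypothetical, not from the thread): ranges do not overlap,
    so each ip matches at most one range. Unmatched ips are dropped,
    as in an inner join.
    """
    ranges = sorted(search_rows)          # sort by from_ip
    starts = [r[0] for r in ranges]       # from_ip values for binary search
    joined = []
    for ip, a, b in main_rows:
        # Index of the last range whose from_ip is <= ip, or -1 if none.
        i = bisect_right(starts, ip) - 1
        if i >= 0 and ip <= ranges[i][1]:  # also check ip <= to_ip
            _, _, d, e = ranges[i]
            joined.append((ip, a, b, d, e))
    return joined


result = between_join(main_rows, search_rows)
for row in result:
    print(row)
```

With the sample rows, ip 80 falls in the range 78 to 98 and ip 125 falls in the range 125 to 130, so the sketch pairs 80 with (aaa, bbb) and 125 with (hhh, aaa). If the search file is small enough, the same lookup could in principle run inside each mapper over the main file; whether that holds for your data is an assumption you would need to check.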