Hi,

We need to implement a join that uses a between condition instead of equality.
For each key in a main file, we want to find the record in a search file
whose two range fields enclose that key, like this:

main file (ip, a, b):
(80, zz, yy)
(125, vv, bb)

search file (from-ip, to-ip, d, e):
(52, 75, xxx, yyy)
(78, 98, aaa, bbb)
(99, 115, xxx, ddd)
(125, 130, hhh, aaa)
(150, 162, qqq, sss)

The outcome should be in the form (ip, a, b, d, e):
(80, zz, yy, aaa, bbb)
(125, vv, bb, hhh, aaa)
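
To make the intended semantics concrete, here is a minimal single-machine
sketch in Python. It is not a Hadoop job. It assumes that both from-ip and
to-ip are inclusive, that the ranges in the search file do not overlap, and
that the search file fits in memory. The name between_join is made up for
illustration.

```python
from bisect import bisect_right

# Search file rows (from_ip, to_ip, d, e), copied from the example above.
search_rows = [
    (52, 75, "xxx", "yyy"),
    (78, 98, "aaa", "bbb"),
    (99, 115, "xxx", "ddd"),
    (125, 130, "hhh", "aaa"),
    (150, 162, "qqq", "sss"),
]

# Main file rows (ip, a, b), copied from the example above.
main_rows = [(80, "zz", "yy"), (125, "vv", "bb")]


def between_join(main_rows, search_rows):
    """Join each main row to the range row whose [from_ip, to_ip] contains ip.

    Assumes inclusive bounds and non-overlapping ranges (both assumptions).
    Main rows that fall in no range are dropped, as in an inner join.
    """
    ranges = sorted(search_rows)           # order by from_ip
    starts = [r[0] for r in ranges]        # from_ip values for binary search
    joined = []
    for ip, a, b in main_rows:
        # Index of the last range whose from_ip is <= ip, or -1 if none.
        i = bisect_right(starts, ip) - 1
        if i >= 0 and ip <= ranges[i][1]:  # check the upper bound as well
            _, _, d, e = ranges[i]
            joined.append((ip, a, b, d, e))
    return joined


result = between_join(main_rows, search_rows)
```

For ip 80 this picks the (78, 98) range and yields (80, zz, yy, aaa, bbb).
Each lookup is a binary search over the sorted range starts, so the ranges
never have to be expanded into single ips.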

We could expand the ip ranges in the search file into one record per ip and
then do a regular equality join, but the number of individual ips is huge,
so this is probably not a good approach.
What would be a good way to do this in Hadoop?


Thanks,
Tamir
