In addition to other suggestions, you could also take a look at
building a Cascading job with a custom Joiner class.
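
For reference, here is a minimal sketch of the between lookup itself, in
plain Python rather than Cascading's API. It assumes the search file is small
enough to hold in memory, that its ranges do not overlap, and that both
bounds are inclusive. The name between_join is hypothetical, and the sample
rows are copied from the question quoted below.

```python
from bisect import bisect_right

# Sample rows copied from the quoted question.
main_rows = [(80, "zz", "yy"), (125, "vv", "bb")]
search_rows = [
    (52, 75, "xxx", "yyy"),
    (78, 98, "aaa", "bbb"),
    (99, 115, "xxx", "ddd"),
    (125, 130, "hhh", "aaa"),
    (150, 162, "qqq", "sss"),
]


def between_join(main_rows, search_rows):
    """Join each (ip, a, b) to the (from_ip, to_ip, d, e) range containing ip.

    Assumption: ranges do not overlap and both bounds are inclusive.
    Rows whose ip falls in no range are dropped (inner-join behaviour).
    """
    ranges = sorted(search_rows)          # order by from_ip
    starts = [r[0] for r in ranges]
    joined = []
    for ip, a, b in main_rows:
        i = bisect_right(starts, ip) - 1  # last range starting at or before ip
        if i >= 0 and ip <= ranges[i][1]:
            _, _, d, e = ranges[i]
            joined.append((ip, a, b, d, e))
    return joined


if __name__ == "__main__":
    for row in between_join(main_rows, search_rows):
        print(row)
```

Each lookup is a binary search over the sorted range starts, so the ranges
never need to be expanded into single IPs. The same per-record logic could
run inside a mapper, or inside a custom Joiner, if the search file is shipped
to every task.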

- John

On Tue, Mar 24, 2009 at 7:33 AM, Tamir Kamara <tamirkam...@gmail.com> wrote:
> Hi,
>
> We need to implement a join with a between operator instead of an equality.
> What we are trying to do is look up each key from the main file in a search
> file, matching when the key falls between two fields of a search record:
>
> main file (ip, a, b):
> (80, zz, yy)
> (125, vv, bb)
>
> search file (from-ip, to-ip, d, e):
> (52, 75, xxx, yyy)
> (78, 98, aaa, bbb)
> (99, 115, xxx, ddd)
> (125, 130, hhh, aaa)
> (150, 162, qqq, sss)
>
> the outcome should be in the form (ip, a, b, d, e):
> (80, zz, yy, aaa, bbb)
> (125, vv, bb, hhh, aaa)
>
> We could expand the IP ranges in the search file into single-IP records and
> then do a regular join, but the number of single IPs is huge, so this is
> probably not a good approach.
> What would be a good way to do this in Hadoop?
>
>
> Thanks,
> Tamir
>
