Hi
I have a small problem with a custom join that I need some help with.
Maybe I'm also approaching the problem the wrong way.
So basically I have two datasets.
A simplified example: each element of the first dataset has a start and an 
end value. The second dataset is just a list of ordered numbers, each with 
some value (the value is ignored in the example).
Example
One = {3,6},{5,7}
Two = 1,2,3,4,5,6,7
What I need is a sort of custom join that brings to each element of the first 
dataset all elements from the second that are within its range.
Something like: join where one.start <= two.number <= one.end
So {3,6} from one would only need to "see" 3,4,5,6
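To make the intended semantics concrete, here is a minimal reference sketch in plain Python (not tied to any framework). The function name `range_join` and the `end_inclusive` flag are made up for illustration; the default follows the `<=` condition above, and whether the end bound should really be inclusive is an assumption.

```python
def range_join(ranges, numbers, end_inclusive=True):
    """Naive reference: pair every (start, end) range with the numbers inside it.

    This compares every range with every number, so it only documents the
    expected result; it is not meant to scale.
    """
    result = {}
    for start, end in ranges:
        if end_inclusive:
            result[(start, end)] = [n for n in numbers if start <= n <= end]
        else:
            result[(start, end)] = [n for n in numbers if start <= n < end]
    return result


one = [(3, 6), (5, 7)]
two = [1, 2, 3, 4, 5, 6, 7]
joined = range_join(one, two)
```

Passing `end_inclusive=False` gives the half-open variant, in case the end value is not supposed to match.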
A regular join does not work out of the box here, as the join key is sort of 
"dynamic": it depends on the range in one.
I could just use a map over the first dataset and broadcast the second into 
the mapper, which can then select the required elements. However, my 
assumption is that the second dataset might actually be very large as well, 
while the number of qualifying elements from two for each range will be small.
Is there something I could do in this particular case?
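One possible direction, as a rough sketch only: turn the range condition into a regular equi-join by bucketing. Every number in two is keyed by a bucket id, every range in one is emitted once per bucket it overlaps, the two sides are joined on the bucket id, and the real range condition is applied as a filter afterwards. The code below simulates this in plain Python with dictionaries; `BUCKET_WIDTH`, `bucket_of` and `bucketed_range_join` are hypothetical names, integer keys and inclusive bounds are assumed, and a suitable bucket width would depend on the data.

```python
from collections import defaultdict

BUCKET_WIDTH = 4  # hypothetical tuning knob: roughly the typical range length


def bucket_of(n, width=BUCKET_WIDTH):
    """Map an integer to its bucket id (floor division also handles negatives)."""
    return n // width


def bucketed_range_join(ranges, numbers, width=BUCKET_WIDTH):
    # Side "two": key every number by its bucket id.
    numbers_by_bucket = defaultdict(list)
    for n in numbers:
        numbers_by_bucket[bucket_of(n, width)].append(n)

    result = defaultdict(list)
    for start, end in ranges:
        # Side "one": emit the range once per bucket it overlaps
        # (this would be a flatMap in a dataflow framework).
        for b in range(bucket_of(start, width), bucket_of(end, width) + 1):
            # Equi-join on the bucket id, then filter with the real condition.
            for n in numbers_by_bucket.get(b, []):
                if start <= n <= end:
                    result[(start, end)].append(n)
    # Ranges without any match do not appear (inner-join behaviour).
    return dict(result)


one = [(3, 6), (5, 7)]
two = [1, 2, 3, 4, 5, 6, 7]
bucketed = bucketed_range_join(one, two)
```

With this layout each range is only compared against the numbers in the few buckets it touches, so the second dataset would not need to be broadcast. The trade-off is that very long ranges get replicated into many buckets.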
Thanks a lot
Johannes
