[ https://issues.apache.org/jira/browse/FLINK-685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot closed FLINK-685.
--------------------------------
    Resolution: Auto Closed

This issue was labeled "stale-minor" 7 days ago and has not received any updates so I have gone ahead and closed it. If you are still affected by this or would like to raise the priority of this ticket, please re-open it, remove the label "auto-closed", and raise the ticket priority accordingly.

> Add support for semi-joins
> --------------------------
>
>                 Key: FLINK-685
>                 URL: https://issues.apache.org/jira/browse/FLINK-685
>             Project: Flink
>          Issue Type: New Feature
>          Components: API / DataSet
>            Reporter: GitHub Import
>            Assignee: pietro pinoli
>            Priority: Minor
>              Labels: auto-closed, github-import, stale-assigned
>
> A semi-join is basically a join filter. One input is "filtering" and the
> other one is "filtered".
> A tuple of the "filtered" input is emitted exactly once if the "filtering"
> input has one (or more) tuples with matching join keys. That means that the
> output of a semi-join has the same type as the "filtered" input and the
> "filtering" input is completely discarded.
> In order to support a semi-join, we need to add an additional physical
> execution strategy that ensures that a tuple of the "filtered" input is
> emitted only once if the "filtering" input has more than one tuple with
> matching keys. Furthermore, a couple of optimizations compared to standard
> joins can be done, such as storing only the keys and not the full tuples of
> the "filtering" input in a hash table.
> ---------------- Imported from GitHub ----------------
> Url: https://github.com/stratosphere/stratosphere/issues/685
> Created by: [fhueske|https://github.com/fhueske]
> Labels: enhancement, java api, runtime,
> Milestone: Release 0.6 (unplanned)
> Created at: Mon Apr 14 12:05:29 CEST 2014
> State: open



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
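
For readers who want the semi-join semantics from the issue description in concrete form, here is a minimal sketch in plain Java. It is not Flink code and does not use any Flink API. The class name SemiJoinSketch, the method semiJoin, and the sample data are hypothetical and exist only for illustration. The sketch assumes a hash-based strategy: it stores only the join keys of the "filtering" input in a hash set and then probes that set once per "filtered" tuple, so duplicate keys on the "filtering" side cannot cause duplicate output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

/** Hypothetical illustration of a hash-based semi-join; not part of Flink. */
public class SemiJoinSketch {

    /**
     * Returns every element of the "filtered" input whose key also occurs in
     * the "filtering" input. The output has the element type of "filtered".
     */
    public static <L, R, K> List<L> semiJoin(
            List<L> filtered,
            List<R> filtering,
            Function<L, K> filteredKey,
            Function<R, K> filteringKey) {
        // Build phase: keep only the keys of the "filtering" input.
        // Duplicate keys collapse in the set, and full tuples are never stored.
        Set<K> keys = new HashSet<>();
        for (R r : filtering) {
            keys.add(filteringKey.apply(r));
        }
        // Probe phase: each "filtered" tuple is tested once and therefore
        // emitted at most once, however many "filtering" tuples matched.
        List<L> out = new ArrayList<>();
        for (L l : filtered) {
            if (keys.contains(filteredKey.apply(l))) {
                out.add(l);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample data (assumed): records of the form "key:value".
        List<String> filtered = Arrays.asList("1:apple", "2:banana", "3:cherry");
        // Key 1 appears twice; key 4 has no partner in "filtered".
        List<Integer> filtering = Arrays.asList(1, 1, 3, 4);

        List<String> result = semiJoin(
                filtered,
                filtering,
                s -> Integer.parseInt(s.substring(0, s.indexOf(':'))),
                i -> i);

        System.out.println(result); // prints [1:apple, 3:cherry]
    }
}
```

A regular inner join on the same data would emit "1:apple" twice, once for each matching "filtering" tuple. Building a key set first is one simple way to avoid that; a real execution strategy in the runtime would also have to handle memory limits and spilling, which this sketch ignores.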