Matthias J. Sax created FLINK-2320:
--------------------------------------

    Summary: Enable DataSet DataStream Joins
    Key: FLINK-2320
    URL: https://issues.apache.org/jira/browse/FLINK-2320
    Project: Flink
    Issue Type: New Feature
    Reporter: Matthias J. Sax

Currently, DataSets and DataStreams cannot be joined with each other. This feature should include the following:

  - extend the Streaming API to allow one join input to be a DataSet
    * in a first step, the DataSet can be limited to a DataSource
    * later on, a full Flink program could compute the DataSet
      -> maybe a Flink program could be used to update the Join-DataSet periodically (if the base data changed), including "synchronized" switching from the old to the new DataSet; update triggered by user/time/base-data-change?
  - in the first version, an inner equi-join should be sufficient
    * the DataSet is used as the build side for the Hash-Join
    * extend the current Hash-Join to consume a DataStream as probe input
  - for full programs computing the DataSet input, it might be helpful to extend the optimizer?
  - What about other joins? What join algorithm do we need to support (full/left/right) outer joins for a Set-Stream-Join? What about theta joins?
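
To make the proposed first version more concrete, here is a minimal sketch in plain Java of an inner equi-join where a bounded collection (standing in for the DataSet) is the build side and records arriving one at a time (standing in for the DataStream) are the probe side. It does not use any Flink API: the class name `SetStreamHashJoin`, its constructor parameters, and the `probe` method are hypothetical names chosen for illustration, and the sketch ignores memory management, spilling, parallelism, and DataSet updates.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

/**
 * Hypothetical sketch of a Set-Stream inner equi-join (not Flink API).
 * K = join key, B = build-side (DataSet) record, P = probe-side (stream) record,
 * R = joined result.
 */
public class SetStreamHashJoin<K, B, P, R> {

    // Hash table over the bounded build side, grouped by join key.
    private final Map<K, List<B>> buildTable = new HashMap<>();
    private final Function<P, K> probeKey;
    private final BiFunction<B, P, R> joiner;

    /** Build phase: consume the whole bounded input once, before any probing. */
    public SetStreamHashJoin(Iterable<B> buildSide,
                             Function<B, K> buildKey,
                             Function<P, K> probeKey,
                             BiFunction<B, P, R> joiner) {
        for (B record : buildSide) {
            buildTable.computeIfAbsent(buildKey.apply(record), k -> new ArrayList<>())
                      .add(record);
        }
        this.probeKey = probeKey;
        this.joiner = joiner;
    }

    /**
     * Probe phase: called once per arriving stream record.
     * Returns one result per matching build record; an empty list means
     * the stream record is dropped (inner-join semantics).
     */
    public List<R> probe(P streamRecord) {
        List<R> results = new ArrayList<>();
        List<B> matches = buildTable.get(probeKey.apply(streamRecord));
        if (matches != null) {
            for (B match : matches) {
                results.add(joiner.apply(match, streamRecord));
            }
        }
        return results;
    }
}
```

Because the build side is bounded, the hash table can be completed before the first stream record is probed, which is why an inner equi-join is the natural first step. Outer joins that must emit unmatched build-side records are harder, since an unbounded probe input never signals that a build record will stay unmatched.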