Thanks, Vino & Hequn.

On Mon, Jul 16, 2018 at 5:47 PM Hequn Cheng <chenghe...@gmail.com> wrote:

> Hi Shivam,
>
> I think the non-window stream-stream join can solve your problem.
> The non-window join will store all data from both inputs and output joined
> results. The semantics of non-window join is exactly the same with batch
> join.
> One important thing to note is that the state of join might grow
> infinitely depending on the number of distinct input rows, so please
> provide a query configuration with valid retention interval[1] to prevent
> excessive state size.
>
> Let me know If you have any other confusions.
>
> Best, Hequn
>
> [1]
> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming.html#idle-state-retention-time
>
> On Mon, Jul 16, 2018 at 5:18 PM, Shivam Sharma <28shivamsha...@gmail.com>
> wrote:
>
>> Hi Vino,
>>
>> First I want to tell you that we are working on Flink SQL so there is no
>> chance to use Data Stream API.
>>
>> I will give one example of my use case here:-
>>
>> Let's say we have two Kafka Topics:
>>
>>    1. UserName to UserId Mapping => {"userName": "shivam", "userId": 123}
>>    2. User transactions information in which username is coming. => {"user":
>>    "shivam", "transactionAmount": 3250}
>>
>> Final result should be like this  => {"user": "shivam", "userId": 123,
>> "transactionAmount": 3250}
>>
>> SQL Query for this: SELECT t2.user, t1.userID, t2.transactionAmount from
>> userTable as t1 join transactionTable as t2 on t1.userName = t2.user
>>
>> Now, whenever a transaction happens then we need to add UserId also in
>> the record using Flink SQL. We need to join these two streams. So need to
>> store userName to id mapping somewhere like in RocksDB
>>
>> Thanks
>>
>> On Mon, Jul 16, 2018 at 12:04 PM vino yang <yanghua1...@gmail.com> wrote:
>>
>>> Hi Shivam,
>>>
>>> Can you provide more details about your use case? The join for batch or
>>> streaming? which join type (window or non-window or stream-dimension table
>>> join)?
>>>
>>> If it is stream-dimension table join and the table is huge, use Redis
>>> or some cache based on memory, can help to process your problem. And you
>>> can customize the flink's physical plan (like Hequn said) and use async
>>> operator to optimize access to the third-party system.
>>>
>>> Thanks,
>>> Vino yang.
>>>
>>> 2018-07-16 9:17 GMT+08:00 Hequn Cheng <chenghe...@gmail.com>:
>>>
>>>> Hi Shivam,
>>>>
>>>> Currently, fink sql/table-api support window join and non-window
>>>> join[1].
>>>> If your requirements are not being met by sql/table-api, you can also
>>>> use the datastream to implement your own logic. You can refer to the
>>>> non-window join implement as an example[2][3].
>>>>
>>>> Best, Hequn
>>>>
>>>> [1]
>>>> https://ci.apache.org/projects/flink/flink-docs-master/dev/table/sql.html#joins
>>>> [2]
>>>> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/plan/nodes/datastream/DataStreamJoin.scala
>>>> [3]
>>>> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/join/NonWindowInnerJoin.scala
>>>>
>>>> On Sun, Jul 15, 2018 at 11:29 PM, Shivam Sharma <
>>>> 28shivamsha...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> We have one use case in which we need to persist Table in Flink which
>>>>> can be later used to join with other tables. This table can be huge so we
>>>>> need to store it in off-heap but faster access. Any suggestions regarding
>>>>> this?
>>>>>
>>>>> --
>>>>> Shivam Sharma
>>>>> Data Engineer @ Goibibo
>>>>> Indian Institute Of Information Technology, Design and Manufacturing
>>>>> Jabalpur
>>>>> Mobile No- (+91) 8882114744
>>>>> Email:- 28shivamsha...@gmail.com
>>>>> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
>>>>> <https://www.linkedin.com/in/28shivamsharma>*
>>>>>
>>>>
>>>>
>>>
>>
>> --
>> Shivam Sharma
>> Data Engineer @ Goibibo
>> Indian Institute Of Information Technology, Design and Manufacturing
>> Jabalpur
>> Mobile No- (+91) 8882114744
>> Email:- 28shivamsha...@gmail.com
>> LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
>> <https://www.linkedin.com/in/28shivamsharma>*
>>
>
>

-- 
Shivam Sharma
Data Engineer @ Goibibo
Indian Institute Of Information Technology, Design and Manufacturing
Jabalpur
Mobile No- (+91) 8882114744
Email:- 28shivamsha...@gmail.com
LinkedIn:-*https://www.linkedin.com/in/28shivamsharma
<https://www.linkedin.com/in/28shivamsharma>*

Reply via email to