Tracking ID in log4j MDC

2020-12-01 Thread Anil K
Hi All, Is it possible to have a tracking id in MDC that will be shared across chained users defined operations like Filter, KeySelector, Flat map, Process function, and Producer? Tracking id will be read from headers of Kafka Message, which if possible plan to set to MDC in log4j. Right now I a

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-12-01 Thread Xingbo Huang
Hi Pierre, I wrote a PyFlink implementation, you can see if it meets your needs: from pyflink.datastream import StreamExecutionEnvironment from pyflink.table import StreamTableEnvironment, EnvironmentSettings, DataTypes from pyflink.table.udf import udf def test(): env = StreamExecutionEnv

Re: Join a datastream with tables stored in Hive

2020-12-01 Thread Leonard Xu
Hi, Krzysztof > * I have a high pace stream of events coming in Kafka. > * I have some dimension tables stored in Hive. These tables are changed > daily. I can keep a snapshot for each day. For this use case, Flink supports temporal join the latest hive partition as temporal table now, you

Re: Join a datastream with tables stored in Hive

2020-12-01 Thread Leonard Xu
Hi, Maciej > > I didn't find a SQL solution to this problem. > Now Flink provides the SQL solution, you can see the doc[1], the Flink-1.12 document link that posted by Chesnay should have updated but not..., I’ll check the document of 1.12. Best, Leonard [1] https://ci.apache.org/projects/f

Re: State Processor API SQL State

2020-12-01 Thread Yun Tang
Hi Dom, + user mail list Once you got to know the state descriptor, I think you could query the join state. The state name is easy to get via [1], it should be "left-records" and "right-records", and you could check what kind of join and whether has unique key to decide what kind of state (val

Re: Performance consequence of leftOuterJoinLateral

2020-12-01 Thread Danny Chan
Hi, Rex ~ For "leftOuterJoinLateral" do you mean join a table function through lateral table ? If it is, yes, the complexity is O(1) for each probe key of LHS. The table function evaluate the extra columns and append it to the left columns. Rex Fenley 于2020年12月2日周三 上午7:54写道: > Hello, > > I'm cu

Re: Filter Null in Array in SQL Connector

2020-12-01 Thread Rex Fenley
There's no stack trace, there's literally just the exception logged and it's nonobvious. It looked like flink was just stuck and not processing any data the first time we ran into the problem until we dug deeper. After I get through this next phase of work (1 to 2 weeks) I'll be sure to slice off t

Performance consequence of leftOuterJoinLateral

2020-12-01 Thread Rex Fenley
Hello, I'm curious if there's any performance consequence of using a TableFunction + leftOuterJoinLateral to create some new columns vs creating each column individually? I'm hoping that lookup for a row with leftOuterJoinLateral is essentially O(1), so as soon as the TableFunction is done it jus

Re: Join a datastream with tables stored in Hive

2020-12-01 Thread Maciej Bryński
Hi, There is an implementation only for temporal tables which needs some Java/Scala coding (no SQL-only implementation). On the same page there is annotation: Attention Flink does not support event time temporal table joins currently. So this is the reason, I'm asking this question. My use case: I

Re: Join a datastream with tables stored in Hive

2020-12-01 Thread Chesnay Schepler
According to the documentation this is already implemented. On 12/1/2020 3:53 PM, maverick wrote: Hi Kurt, Is there any Jira task for tracking progress of adding event time support

Partitioned tables in SQL client configuration.

2020-12-01 Thread Maciek Próchniak
Hello, I try to configure SQL Client to query partitioned ORC data on local filesystem. I have directory structure like that: /tmp/table1/startdate=2020-11-28 /tmp/table1/startdate=2020-11-27 etc. If I run SQL Client session and create table by hand: create table tst (column1 string, star

Re: Join a datastream with tables stored in Hive

2020-12-01 Thread maverick
Hi Kurt, Is there any Jira task for tracking progress of adding event time support to temporal joins ? Regards, Maciek -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Filter Null in Array in SQL Connector

2020-12-01 Thread Danny Chan
My local test indicates that the debezium-json works correctly with your given schema and example record, can you give more detailed exception stack trace and a record that can reproduce this problem ? Rex Fenley 于2020年12月1日周二 上午7:15写道: > Hello, > > Any updates on this bug? > > Thanks! > > On Fr

Re: PyFlink - Scala UDF - How to convert Scala Map in Table API?

2020-12-01 Thread Pierre Oberholzer
Hi Xingbo, That would mean giving up on using Flink (table) features on the content of the parsed JSON objects, so definitely a big loss. Let me know if I missed something. Thanks ! Le mar. 1 déc. 2020 à 07:26, Xingbo Huang a écrit : > Hi Pierre, > > Have you ever thought of declaring your ent