Re: [Discuss] Semantics of event time for state TTL

2019-04-15 Thread Yu Li
Thanks for initiating the discussion and wrap-up the conclusion Andrey, and thanks all for participating. Just to confirm, that for the out-of-order case, the conclusion is to update the data and timestamp with the currently-being-processed record w/o checking whether it's an old data, right? In t

Re: Netty channel closed at AKKA gated status

2019-04-15 Thread zhijiang
Hi Wenrui, You might further check whether there exists network connection issue between job master and target task executor if you confirm the target task executor is still alive. Best, Zhijiang -- From:Biao Liu Send Time:2019年4月

Re: Can back pressure data be gathered by Flink metric system?

2019-04-15 Thread zhijiang
Hi Henry, Thanks for the explanation. I am not sure whether it is feasible on your side to monitor the backpressure via restful api provided by flink. Some experience on my side to share. We ever monitored the backpressure via the metrics of outqueue length/usage on producer side and inqueue le

Is that possible to specify join algorithm hint in Flink SQL

2019-04-15 Thread yinhua.dai
Hi team, I know we can specify the join algorithm hint with dataset API https://ci.apache.org/projects/flink/flink-docs-release-1.0/apis/batch/dataset_transformations.html#join-algorithm-hints But wondering if this is possible to support with the SQL API? We have market data with a currency id(a

Re: Netty channel closed at AKKA gated status

2019-04-15 Thread Biao Liu
Hi Wenrui, If a task manager is killed (kill -9), it would have no chance to log anything. If the task manager exits since connection timeout, there would be something in log file. So it is probably killed by other user or operating system. Please check the log of operating system. BTW, I don't thi

Re: Can back pressure data be gathered by Flink metric system?

2019-04-15 Thread 徐涛
Hi Zhijiang, Because I want to know the current and the trend of backpressure status in Flink Job. Like other index such as latency, I can monitor it, and show it in graph by getting data from metric. Now using the metric to get the backpressure data is the simplest way I can think. Bes

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Kurt Young
I think you can simply copy the source codes to your project if maven dependency can not be used. Best, Kurt On Mon, Apr 15, 2019 at 9:47 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hi again Kurt, > > could you please help me with the pom.xml file? I have included all Table > e

Re: Netty channel closed at AKKA gated status

2019-04-15 Thread Wenrui Meng
There is no exception or any warning in the task manager `'athena592-phx2/10.80.118.166:44177'` log. In addition, the host was not shut down either in cluster monitor dashboard. It probably requires to turn on DEBUG log to get more useful information. If the task manager gets killed, I assume there

Re: S3 Bucket Source

2019-04-15 Thread Steven Nelson
That looks like exactly what I needed. Thanks! -Steve On Mon, Apr 15, 2019 at 3:42 PM Addison Higham wrote: > Hi Steven, > > Usually, what you want to do is something like this: > > Instead of a `SourceFunction` use a `RichParallelSourceFunction` and as an > argument to that function, you might

Re: S3 Bucket Source

2019-04-15 Thread Addison Higham
Hi Steven, Usually, what you want to do is something like this: Instead of a `SourceFunction` use a `RichParallelSourceFunction` and as an argument to that function, you might have a list of prefixes you want to consume in parallel. The `RichParallelSourceFunction` has a a method called `getRunt

Re: Sink data into java stream variable

2019-04-15 Thread Oytun Tez
Hi Soheil, This is a tricky question that requires much more context due to distributed nature of Flink. Take a look at [1], especially BufferingSink example which I will *assume* behaves similar to your need. Oytun [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/state

inconsistent module descriptor for hadoop uber jar (Flink 1.8)

2019-04-15 Thread Hao Sun
Hi, I can not find the root cause of this, I think hadoop version is mixed up between libs somehow. --- ERROR --- java.text.ParseException: inconsistent module descriptor file found in ' https://repo1.maven.org/maven2/org/apache/flink/flink-shaded-hadoop2-uber/2.8.3-1.8.0/flink-shaded-hadoop2-uber

Sink data into java stream variable

2019-04-15 Thread Soheil Pourbafrani
Hi, In Flink Stream processing can we sink data into java stream array?

Re: Identify orphan records after joining two streams

2019-04-15 Thread Hequn Cheng
Hi Averell, > I feel that it's over-complicated I think a Table API or SQL[1] job can also achieve what you want. Probably more simple and takes up less code. The workflow looks like: 1. union all two source tables. You may need to unify the schema of the two tables as union all can only used to u

Re: Flink JDBC: Disable auto-commit mode

2019-04-15 Thread Rong Rong
+1, Thanks Konstantinos for opening the ticket. This would definitely be a useful feature. -- Rong On Mon, Apr 15, 2019 at 7:34 AM Fabian Hueske wrote: > Great, thank you! > > Am Mo., 15. Apr. 2019 um 16:28 Uhr schrieb Papadopoulos, Konstantinos < > konstantinos.papadopou...@iriworldwide.com>:

S3 Bucket Source

2019-04-15 Thread Steven Nelson
I am working on a process to do some compaction of files in S3. I read a bucket full of files key them, pull them all into a window, then remove older versions of the file. The files are not organized inside the bucket, they are simply name by guid. I can iterate them using a custom Source that jus

Re: Flink JDBC: Disable auto-commit mode

2019-04-15 Thread Fabian Hueske
Great, thank you! Am Mo., 15. Apr. 2019 um 16:28 Uhr schrieb Papadopoulos, Konstantinos < konstantinos.papadopou...@iriworldwide.com>: > Hi Fabian, > > > > I opened the following issue to track the improvement proposed: > > https://issues.apache.org/jira/browse/FLINK-12198 > > > > Best, > > Konst

RE: Flink JDBC: Disable auto-commit mode

2019-04-15 Thread Papadopoulos, Konstantinos
Hi Fabian, I opened the following issue to track the improvement proposed: https://issues.apache.org/jira/browse/FLINK-12198 Best, Konstantinos From: Papadopoulos, Konstantinos Sent: Δευτέρα, 15 Απριλίου 2019 12:30 μμ To: Fabian Hueske Cc: Rong Rong ; user Subject: RE: Flink JDBC: Disable aut

Re: [Discuss] Semantics of event time for state TTL

2019-04-15 Thread Andrey Zagrebin
Hi everybody, Thanks a lot for your detailed feedback on this topic. It looks like we can already do some preliminary wrap-up for this discussion. As far as I see we have the following trends: *Last access timestamp: **Event timestamp of currently being processed record* *Current timestamp to c

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Felipe Gutierrez
Hi again Kurt, could you please help me with the pom.xml file? I have included all Table ecosystem dependencies and the flink-table-runtime-blink as well. However the class org.apache.flink.table.runtime.context.ExecutionContext is still not found. I guess I am missing some dependency, but I do no

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Felipe Gutierrez
oh, yes. I just saw. I will use 1.9 then. thanks *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com * On Mon, Apr 15, 2019 at 3:23 PM Kurt Young wrote: > It's because all blink codes are not shipped with

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Kurt Young
It's because all blink codes are not shipped with 1.8.0, they current only available in 1.9-SNAPSHOT. Best, Kurt On Mon, Apr 15, 2019 at 7:18 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hi, > > what are the artifacts that I have to import on maven in order to use > Blink Api? >

Identify orphan records after joining two streams

2019-04-15 Thread Averell
Hello, I have two data streams, and want to join them using a tumbling window. Each of the streams would have at most one record per window. There is also a requirement to log/save the records that don't have a companion from the other stream. What would be the best option for my case? Would that

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Felipe Gutierrez
Hi, what are the artifacts that I have to import on maven in order to use Blink Api? I am using Flink 1.8.0 and I am trying to import blink code to use its ExecutionContext

Re: Hbase Connector failed when deployed to yarn

2019-04-15 Thread Fabian Hueske
That's great! Thank you. Let me know if you have any questions. Fabian Am Mo., 15. Apr. 2019 um 11:32 Uhr schrieb Hai : > Hi Fabian: > > > OK ,I am glad to do that. > > > Regards > > Original Message > *Sender:* Fabian Hueske > *Recipient:* hai > *Cc:* user; Yun Tang > *Date:* Monday, Apr 15,

Re: Hbase Connector failed when deployed to yarn

2019-04-15 Thread Hai
Hi Fabian: OK ,I am glad to do that. Regards Original Message Sender: Fabian Hueske Recipient: hai Cc: user; Yun Tang Date: Monday, Apr 15, 2019 17:16 Subject: Re: Hbase Connector failed when deployed to yarn Hi, The Jira issue is still unassigned. Would you be up to work on a fix?

RE: Flink JDBC: Disable auto-commit mode

2019-04-15 Thread Papadopoulos, Konstantinos
Hi Fabian, Glad to hear that you agree for such an improvement. Of course, I can handle it. Best, Konstantinos From: Fabian Hueske Sent: Δευτέρα, 15 Απριλίου 2019 11:56 πμ To: Papadopoulos, Konstantinos Cc: Rong Rong ; user Subject: Re: Flink JDBC: Disable auto-commit mode Hi Konstantinos,

Re: Hbase Connector failed when deployed to yarn

2019-04-15 Thread Fabian Hueske
Hi, The Jira issue is still unassigned. Would you be up to work on a fix? Best, Fabian Am Fr., 12. Apr. 2019 um 05:07 Uhr schrieb hai : > Hi, Tang: > > > Thaks for your reply, will this issue fix soon?I don’t think put > flink-hadoop-compatibility > jar under FLINK_HOME/lib is a elegant soluti

Re: [DISCUSS] Create a Flink ecosystem website

2019-04-15 Thread Robert Metzger
Hey Daryl, thanks a lot for posting a link to this first prototype on the mailing list! I really like it! Becket: Our plan forward is that Congxian is implementing the backend for the website. He has already started with the work, but needs at least one more week. [Re-sending this email because

Re: Can back pressure data be gathered by Flink metric system?

2019-04-15 Thread Biao Liu
Hi Henry, I have just checked the back pressure relevant codes. It is indeed not included in metric system. As a workaround way, you can manually trigger the back pressure tracking through RESTful API (see details in [1]) periodically. And plot with the data returned. BTW I think it's a reasonable

Re: Flink JDBC: Disable auto-commit mode

2019-04-15 Thread Fabian Hueske
Hi Konstantinos, This sounds like a useful extension to me. Would you like to create a Jira issue and contribute the improvement? In the meantime, you can just fork the code of JDBCInputFormat and adjust it to your needs. Best, Fabian Am Mo., 15. Apr. 2019 um 08:53 Uhr schrieb Papadopoulos, Kon

Re: Join of DataStream and DataSet

2019-04-15 Thread Fabian Hueske
Hi Reminia, What Hequn said is correct. However, I would *not* use a regular but model the problem as a time-versioned table join. A regular join will materialize both inputs which is probably not want you want to do for a stream. For a time-versioned table join, only the time-versioned table wou

Re: Apache Flink - Question about broadcast state pattern usage

2019-04-15 Thread Fabian Hueske
Hi, I think your understanding is correct. Having multiple map states for a broadcasted stream gives more flexibility. You can have states of different key and value types and store completely different information in them. Fabian Am Fr., 12. Apr. 2019 um 20:30 Uhr schrieb M Singh : > Hi Fabi

Re: Apache Flink - How to destroy global window and release it's resources

2019-04-15 Thread Fabian Hueske
Hi, Aljoscha know the implementation best (since he implemented it). >From my understanding (Aljoscha please correct me if I'm wrong), all Flink managed state is removed (given that user-defined state is correctly cleaned up). However, for each key, a window and a trigger object might be kept (th

Re: Does HadoopOutputFormat create MapReduce job internally?

2019-04-15 Thread Fabian Hueske
Hi Morven, You posted the same question a few days ago and it was also answered correctly. Please do not repost the same question again. You can reply to the earlier thread if you have a follow up question. To answer your question briefly: No, Flink does not trigger a MapReduce job. The whole job

Re: How would I use OneInputStreamOperator to deal with data skew?

2019-04-15 Thread Felipe Gutierrez
Cool, thanks Kurt! *-* *- Felipe Gutierrez* *- skype: felipe.o.gutierrez* *- **https://felipeogutierrez.blogspot.com * * * On Mon, Apr 15, 2019 at 6:06 AM Kurt Young wrote: > Hi, > > You can checkout the bundle opera