Re: timeout exception when consuming from kafka

2019-07-22 Thread Yitzchak Lieberman
Hi. Another question - what will happen during a triggered checkpoint if one of the Kafka brokers is unavailable? I will appreciate your insights. Thanks. On Mon, Jul 22, 2019 at 12:42 PM Yitzchak Lieberman < yitzch...@sentinelone.com> wrote: > Hi. > > I'm running a Flink application (version 1.8

Re: apache flink: Why checkpoint coordinator takes long time to get completion

2019-07-22 Thread Zili Chen
Hi Xiangyu, Could you share the corresponding JIRA that fixed this issue? Best, tison. Xiangyu Su wrote on Fri, Jul 19, 2019 at 8:47 PM: > btw. it seems like this issue has been fixed in 1.8.1 > > On Fri, 19 Jul 2019 at 12:21, Xiangyu Su wrote: > >> Ok, thanks. >> >> and this time-consuming until now alway

Re: Execution environments for testing: local vs collection vs mini cluster

2019-07-22 Thread Biao Liu
Hi Juan, I'm not sure what you really want. Before giving some suggestions, could you answer the questions below first? 1. Do you want to write a unit test (or integration test) case for your project or for Flink? Or just want to run your job locally? 2. Which mode do you want to test? DataStream

Execution environments for testing: local vs collection vs mini cluster

2019-07-22 Thread Juan Rodríguez Hortalá
Hi, In https://ci.apache.org/projects/flink/flink-docs-stable/dev/local_execution.html and https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/runtime/minicluster/MiniCluster.html I see there are 3 ways to create an execution environment for testing: - StreamExecut
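
For reference, a minimal sketch (not part of the original message) of the two lighter-weight options, assuming Flink 1.8-style APIs; the mini-cluster option comes up in the next thread:

    import org.apache.flink.api.java.ExecutionEnvironment;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class LocalEnvironmentsSketch {
      public static void main(String[] args) {
        // Local environment: runs the whole job inside this JVM, multi-threaded.
        StreamExecutionEnvironment localEnv =
            StreamExecutionEnvironment.createLocalEnvironment(2);

        // Collection environment (DataSet/batch API only): executes over Java
        // collections, single-threaded, useful for very small functional tests.
        ExecutionEnvironment collectionEnv =
            ExecutionEnvironment.createCollectionsEnvironment();
      }
    }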

MiniClusterResource class not found using AbstractTestBase

2019-07-22 Thread Juan Rodríguez Hortalá
Hi, I'm trying to use AbstractTestBase in a test in order to use the mini cluster. I'm using specs2 with Scala, so I cannot extend AbstractTestBase because I also have to extend org.specs2.Specification, so I'm trying to access the mini cluster directly using Specs2 BeforeAll to initialize it as f
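
A minimal sketch of driving the mini cluster without extending AbstractTestBase, assuming flink-test-utils' MiniClusterWithClientResource (Flink 1.7+) is on the test classpath; the before()/after() calls below mirror what the JUnit rule would do and could be invoked from specs2's beforeAll/afterAll:

    import org.apache.flink.runtime.testutils.MiniClusterResourceConfiguration;
    import org.apache.flink.test.util.MiniClusterWithClientResource;

    public class MiniClusterLifecycleSketch {
      static final MiniClusterWithClientResource CLUSTER =
          new MiniClusterWithClientResource(
              new MiniClusterResourceConfiguration.Builder()
                  .setNumberTaskManagers(1)
                  .setNumberSlotsPerTaskManager(4)
                  .build());

      public static void main(String[] args) throws Exception {
        CLUSTER.before();   // what the JUnit @ClassRule would normally call
        try {
          // submit jobs against the mini cluster here
        } finally {
          CLUSTER.after();
        }
      }
    }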

Re: Job Manager becomes irresponsive if the size of the session cluster grows

2019-07-22 Thread Prakhar Mathur
On Mon, Jul 22, 2019, 16:08 Prakhar Mathur wrote: > Hi, > > We enabled GC logging, here are the logs > > [GC (Allocation Failure) [PSYoungGen: 6482015K->70303K(6776832K)] > 6955827K->544194K(20823552K), 0.0591479 secs] [Times: user=0.09 sys=0.00, > real=0.06 secs] > [GC (Allocation Failure) [PSYo

Fwd: FsStateBackend,hdfs rpc api too much,FileCreated and FileDeleted is for what?

2019-07-22 Thread 陈Darling
Hi Yun Tang, Your suggestion is very important to us. Following it, we have advised users to increase the checkpoint interval (1 to 5 minutes) and to set state.backend.fs.memory-threshold=10k. But we only have one HDFS cluster, and we are trying to reduce HDFS API calls. I don't know if there
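
A minimal sketch of those two suggestions in Java, assuming the FsStateBackend; the HDFS path is a placeholder:

    import java.net.URI;
    import org.apache.flink.runtime.state.filesystem.FsStateBackend;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

    public class CheckpointTuningSketch {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint less often (here every 5 minutes) to cut down HDFS RPC traffic.
        env.enableCheckpointing(5 * 60 * 1000);

        // State below 10 KB is kept inline in the checkpoint metadata instead of
        // being written as separate HDFS files -- the programmatic counterpart of
        // state.backend.fs.memory-threshold: 10k in flink-conf.yaml.
        env.setStateBackend(new FsStateBackend(new URI("hdfs:///flink/checkpoints"), 10 * 1024));
      }
    }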

Fwd: FsStateBackend,hdfs rpc api too much,FileCreated and FileDeleted is for what?

2019-07-22 Thread 陈Darling
Hi, In my understanding, the CreateFile and FileCreated APIs are different; FileCreated looks more like a check API, but I can't find where it is called in the source. I don't understand when the FileCreated API is called and what it is for. Is FileCreated an HDFS-internal confirmation API? FLINK-11696 is to re

Re:AW: Re:Unable to build Flink1.10 from source

2019-07-22 Thread Haibo Sun
Please check whether the following profile section exists in "flink-filesystems/flink-mapr-fs/pom.xml". If not, you should pull the latest code and try to compile again. If yes, please share the latest error message, as it may be different from before.

Re: [SURVEY] How many people implement Flink job based on the interface Program?

2019-07-22 Thread Jeff Zhang
Hi Flavio, Based on the discussion in the tickets you mentioned above, the program-class attribute was a mistake and the community intends to replace it with main-class. Deprecating the Program interface is part of the work on the new Flink client API. IIUC, your requirements are not so complicated. We c

GroupBy result delay

2019-07-22 Thread Fanbin Bu
Hi, I have a Flink sql streaming job defined by: SELECT user_id , hop_end(created_at, interval '30' second, interval '1' minute) as bucket_ts , count(name) as count FROM event WHERE name = 'signin' GROUP BY user_id , hop(created_at, interval '30' second, interval '1' minute) there is a
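
A cleaned-up sketch of the query above as it might be wired into a job, assuming the pre-blink planner; the table-environment factory method differs across Flink versions, and the `event` registration here is only illustrative:

    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.streaming.api.TimeCharacteristic;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
    import org.apache.flink.table.api.Table;
    import org.apache.flink.table.api.java.StreamTableEnvironment;
    import org.apache.flink.types.Row;

    public class HopWindowSketch {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        // Older releases use TableEnvironment.getTableEnvironment(env) instead.
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        DataStream<Tuple3<Long, String, Long>> events = env
            .fromElements(Tuple3.of(1L, "signin", 0L), Tuple3.of(1L, "signin", 10_000L))
            .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple3<Long, String, Long>>() {
              @Override
              public long extractAscendingTimestamp(Tuple3<Long, String, Long> e) {
                return e.f2;
              }
            });

        // created_at.rowtime exposes the element timestamp as an event-time attribute to SQL.
        tEnv.registerDataStream("event", events, "user_id, name, created_at.rowtime");

        Table result = tEnv.sqlQuery(
            "SELECT user_id, "
            + " HOP_END(created_at, INTERVAL '30' SECOND, INTERVAL '1' MINUTE) AS bucket_ts, "
            + " COUNT(name) AS cnt "   // `count` is a reserved word, hence the rename
            + "FROM event WHERE name = 'signin' "
            + "GROUP BY user_id, HOP(created_at, INTERVAL '30' SECOND, INTERVAL '1' MINUTE)");

        tEnv.toRetractStream(result, Row.class).print();
        env.execute("hop window sketch");
      }
    }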

AW: Re:Unable to build Flink1.10 from source

2019-07-22 Thread Yebgenya Lazarkhosrouabadi
Hi, I used the command mvn clean package -DskipTests -Punsafe-mapr-repo , but it didn't work; I get the same error. Regards Yebgenya Lazar From: Haibo Sun Sent: Monday, July 22, 2019 04:40 To: Yebgenya Lazarkhosrouabadi Cc: user@flink.apache.org Subject: Re: Unable to build Flink 1.10 from

Re: Timestamp(timezone) conversion bug in non blink Table/SQL runtime

2019-07-22 Thread Shuyi Chen
Hi Lasse, Thanks for the reply. If your input is in epoch time, you are not getting local time; instead, you are getting a wrong time that does not make sense. For example, if the user input value is 0 (which means 00:00:00 UTC on 1 January 1970), and your local timezone is UTC-8, converting 00:0

Will StreamingFileSink.forBulkFormat(...) support overriding OnCheckpointRollingPolicy?

2019-07-22 Thread Elkhan Dadashov
Hi folks, Will StreamingFileSink.forBulkFormat(...) support overriding OnCheckpointRollingPolicy? Does anyone use StreamingFileSink *with checkpointing disabled* for writing Parquet output files? The output Parquet files are generated, but they are empty and stay in the *in-progress* state, even when t
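
For context, a minimal sketch of such a bulk-format Parquet sink, assuming Flink 1.8+ with flink-parquet; the path and POJO are placeholders. Bulk formats roll only on checkpoint, which is why parts stay in-progress when checkpointing is off:

    import org.apache.flink.core.fs.Path;
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters;
    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

    public class ParquetSinkSketch {
      public static class MyEvent {
        public long id = 1L;
        public String name = "example";
      }

      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpointing must be enabled for in-progress Parquet parts to be finalized.
        env.enableCheckpointing(60_000, CheckpointingMode.EXACTLY_ONCE);

        StreamingFileSink<MyEvent> sink = StreamingFileSink
            .forBulkFormat(new Path("hdfs:///tmp/output"),
                ParquetAvroWriters.forReflectRecord(MyEvent.class))
            .build();

        env.fromElements(new MyEvent()).addSink(sink);
        env.execute("parquet sink sketch");
      }
    }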

Re: Timestamp(timezone) conversion bug in non blink Table/SQL runtime

2019-07-22 Thread Lasse Nedergaard
Hi. I have encountered the same problem: when I input epoch time to the window table function and then use window.start and window.end, the output is not in epoch time but in local time, and I traced the problem to the same internal function as you. Best regards, Lasse Nedergaard

Re: Job submission timeout with no error info.

2019-07-22 Thread Fakrudeen Ali Ahmed
It turns out the actual issue was a configuration problem, and we just had to pore over the JobManager log carefully. We were using HDFS [really an API on top of a Windows blob store] as the source, and we didn't provide the server location, so it took the path prefix as the server. Only thing here would have been F

Timestamp(timezone) conversion bug in non blink Table/SQL runtime

2019-07-22 Thread Shuyi Chen
Hi all, Currently, in the non-blink Table/SQL runtime, Flink uses SqlFunctions.internalToTimestamp(long v) from Calcite to convert event time (a long) to java.sql.Timestamp. However, as discussed on the recent Calcite mailing list (Jul. 19, 2019), SqlFunctions.internalToTimestamp() assumes the in
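
A minimal sketch that reproduces the conversion in isolation, assuming Calcite's calcite-core is on the classpath; the printed value is expected to shift with the JVM's default timezone, per the discussion referenced above:

    import java.sql.Timestamp;
    import java.util.TimeZone;
    import org.apache.calcite.runtime.SqlFunctions;

    public class TimestampShiftSketch {
      public static void main(String[] args) {
        // 0L is 00:00:00 UTC on 1 January 1970 when interpreted as epoch milliseconds.
        Timestamp ts = SqlFunctions.internalToTimestamp(0L);
        // On a machine whose default timezone is not UTC, ts.getTime() comes out
        // shifted by the local offset, i.e. a different absolute instant.
        System.out.println(TimeZone.getDefault().getID() + " -> " + ts + " (" + ts.getTime() + ")");
      }
    }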

Re: Job submission timeout with no error info.

2019-07-22 Thread Fakrudeen Ali Ahmed
Thanks Andrey. The environment we run in [an Azure HDInsight cluster] only supports Flink 1.4.2 for now, so I can't run 1.8 there. I can run with 1.8 in a different environment [on Kubernetes rather than YARN, though] and report the results. Thanks, -Fakrudeen (define (sqrte n xn eph) (if (> ep

Re: Job submission timeout with no error info.

2019-07-22 Thread Andrey Zagrebin
Hi Fakrudeen, Thanks for sharing the logs. Could you also try it with Flink 1.8? Best, Andrey On Sat, Jul 20, 2019 at 12:44 AM Fakrudeen Ali Ahmed wrote: > Hi Andrey, > > > > > > Flink version: 1.4.2 > > Please find the client log attached and job manager log is at: job > manager log >

Use batch and stream environment in a single pipeline

2019-07-22 Thread Andres Angel
Hello everyone, I need to create a table from a stream environment, and, thinking of a pure SQL approach, I was wondering if I can create a few of the enrichment tables in a batch environment and register only the streaming payload in the streaming table environment. I tried to create a batch table environment with
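
A hedged sketch of one workaround: registering the bounded enrichment data in the same streaming table environment (the pre-blink planner does not let a batch and a stream TableEnvironment share one pipeline). Names are placeholders and the factory method varies by Flink version:

    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.table.api.java.StreamTableEnvironment;

    public class EnrichmentTableSketch {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // Older releases use TableEnvironment.getTableEnvironment(env) instead.
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Bounded enrichment data turned into a (finite) stream and registered as a table.
        DataStream<Tuple2<Long, String>> countries =
            env.fromElements(Tuple2.of(1L, "US"), Tuple2.of(2L, "DE"));
        tEnv.registerDataStream("countries", countries, "country_id, country_code");

        // The streaming payload would be registered the same way and joined in SQL.
      }
    }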

Re: Extending REST API with new endpoints

2019-07-22 Thread Oytun Tez
I did take a look at it, but things got out of hand very quickly from there on :D I see that WebSubmissionExtension implements WebMonitorExtension, but WebSubmissionExtension is then used in DispatcherRestEndpoint, which I don't know how to manipulate/extend... How can I plug my Extension int

Re: Extending REST API with new endpoints

2019-07-22 Thread Oytun Tez
I simply want to open up endpoints to query QueryableStates. What I had in mind was to give operators an interface to implement their own QueryableState controllers, e.g. serializers etc. We are trying to use Flink in more of an "application framework" fashion, so extensibility helps a lot. As the
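
For context, a minimal sketch of querying such state from outside the job, assuming the flink-queryable-state-client-java artifact, a proxy reachable on port 9069, and a state registered under "my-queryable-state"; all names here are placeholders:

    import java.util.concurrent.CompletableFuture;
    import org.apache.flink.api.common.JobID;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.common.typeinfo.TypeHint;
    import org.apache.flink.api.common.typeinfo.TypeInformation;
    import org.apache.flink.queryablestate.client.QueryableStateClient;

    public class QueryableStateSketch {
      public static void main(String[] args) throws Exception {
        QueryableStateClient client = new QueryableStateClient("taskmanager-host", 9069);

        // Must match the descriptor used when the state was made queryable in the job.
        ValueStateDescriptor<Long> descriptor = new ValueStateDescriptor<>(
            "my-queryable-state", TypeInformation.of(new TypeHint<Long>() {}));

        CompletableFuture<ValueState<Long>> future = client.getKvState(
            JobID.fromHexString("00000000000000000000000000000000"),
            "my-queryable-state",
            "some-key",
            TypeInformation.of(new TypeHint<String>() {}),
            descriptor);

        System.out.println(future.get().value());
      }
    }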

Re: [SURVEY] How many people implement Flink job based on the interface Program?

2019-07-22 Thread Flavio Pompermaier
Hi Tison, we use a modified version of the Program interface to enable a web UI to properly detect and run the Flink jobs contained in a jar, plus their parameters. As stated in [1], we detect multiple main classes per jar by handling an extra comma-separated Manifest entry (i.e. 'Main-classes'). As menti

[SURVEY] How many people implement Flink job based on the interface Program?

2019-07-22 Thread Zili Chen
Hi guys, We want to get an accurate idea of how many people are implementing Flink jobs based on the interface Program, and how they actually implement it. The reason I ask for this survey comes from the thread [1] where we noticed this code path is stale and less useful than it should be. As it is an

Re: Time extracting in flink

2019-07-22 Thread Andy Hoang
Thanks Biao, just want to not reinvent the wheel :) > On Jul 22, 2019, at 4:29 PM, Biao Liu wrote: > > Hi Andy, > > As far as I know, Flink does not support feature like that. > > I would suggest recording and calculating the time in user code. > For example, add a timestamp field (maybe an

timeout exception when consuming from kafka

2019-07-22 Thread Yitzchak Lieberman
Hi. I'm running a Flink application (version 1.8.0) that uses FlinkKafkaConsumer to fetch topic data and perform transformation on the data, with state backend as below: StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); env.enableCheckpointing(5_000, Ch
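
A minimal sketch of the setup being described, assuming Flink 1.8 and the universal Kafka connector; topic, brokers, and timeout values are placeholders:

    import java.util.Properties;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.streaming.api.CheckpointingMode;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class KafkaConsumerSketch {
      public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(5_000, CheckpointingMode.EXACTLY_ONCE);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1:9092,broker2:9092");
        props.setProperty("group.id", "my-group");
        // Kafka client timeout that typically surfaces as a TimeoutException
        // when a broker is unreachable.
        props.setProperty("request.timeout.ms", "60000");

        env.addSource(new FlinkKafkaConsumer<>("my-topic", new SimpleStringSchema(), props))
            .print();

        env.execute("kafka consumer sketch");
      }
    }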

Re: Time extracting in flink

2019-07-22 Thread Biao Liu
Hi Andy, As far as I know, Flink does not support a feature like that. I would suggest recording and calculating the time in user code. For example, add a timestamp field (maybe an array) to your record and stamp a timestamp onto it at each processing step. Andy Hoang wrote on Mon, Jul 22, 2019 at 4:49 PM: > Hi
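
A minimal sketch of that suggestion, with placeholder record and field names:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.flink.api.common.functions.MapFunction;

    public class StampTimestamps {
      public static class Event {
        public String payload;
        public List<Long> processingTimes = new ArrayList<>();
      }

      // Reusable mapper that appends the current wall-clock time at this stage.
      // Usage: stream.map(new StampFunction()) after each operator of interest.
      public static class StampFunction implements MapFunction<Event, Event> {
        @Override
        public Event map(Event e) {
          e.processingTimes.add(System.currentTimeMillis());
          return e;
        }
      }
    }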

Time extracting in flink

2019-07-22 Thread Andy Hoang
Hi guys, I'm trying to write ELK logs for Flink; this helps us store/calculate the processing time of a group of operators for business auditing. I read about process_function and "Debugging Windows & Event Time" in the docs. They focus on "keyed" events and monitoring via the web UI/metrics, whereas I want

Re: Extending REST API with new endpoints

2019-07-22 Thread Biao Liu
Hi, As far as I know, the RESTful handler is not pluggable. And I don't see a strong reason from your description to do so. Could you explain more about your requirement? Oytun Tez 于2019年7月20日周六 上午4:36写道: > Yep, I scanned all of the issues in Jira and the codebase, I couldn't find > a way to p