Re: read a finite number of messages from Kafka using Kafka connector without extending it?

2019-02-18 Thread Konstantin Knauf
Hi Yu, I am not aware of a way to use the FlinkKafkaConsumer to generate a finite data stream. You could, of course, use a FilterFunction or FlatMapFunction to filter out events outside of the time interval right after the Kafka Source. This way you would not need to modify it, but you have to sto

Re: Reading messages from start - new job submission

2019-02-18 Thread sohimankotia
Hi, Which version of flink you r using ? Reset offset to earliest : https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-consumers-start-position-configuration Thanks -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: [ANNOUNCE] Apache Flink 1.7.2 released

2019-02-18 Thread Robert Metzger
Thank you Gordon! Help spread the word here: https://twitter.com/ApacheFlink/status/1097416946688102401 On Mon, Feb 18, 2019 at 7:41 AM Dian Fu wrote: > Great job. It's great to have a more stable 1.7 release available. Thanks > @Gordon for making it happen. > > Regards, > Dian > > 在 2019年2月1

Re: StreamingFileSink causing AmazonS3Exception

2019-02-18 Thread Kostas Kloudas
Hi Padarn, This is the jira issue: https://issues.apache.org/jira/browse/FLINK-11187 and the fix, as you can see, was first included in version 1.7.2. Cheers, Kostas On Mon, Feb 18, 2019 at 3:49 AM Padarn Wilson wrote: > Hi Addison, Kostas, Steffan, > > I am also encountering this exact issue

Re: Confusion in Heartbeat configurations

2019-02-18 Thread zhijiang
Hi sohimankotia, In order not to strongly rely on the akka implementation, flink implements the heartbeat mechanism for health monitor for the components of TaskExecutor, JobMaster and ResourceManager from FLIP6. So you can see two sets of heartbeat setting, one is for akka internal implementat

Re: [Meetup] Apache Flink+Beam+others in Seattle. Feb 21.

2019-02-18 Thread Fabian Hueske
Thank you Pablo! Am Fr., 15. Feb. 2019 um 20:42 Uhr schrieb Pablo Estrada : > Hello everyone, > There is an upcoming meetup happening in the Google Seattle office, on > February 21st, starting at 5:30pm: > https://www.meetup.com/seattle-apache-flink/events/258723322/ > > People will be chatting a

Submitting job to Flink on yarn timesout on flip-6 1.5.x

2019-02-18 Thread Richard Deurwaarder
Hello, I am trying to upgrade our job from flink 1.4.2 to 1.7.1 but I keep running into timeouts after submitting the job. The flink job runs on our hadoop cluster and starts using Yarn. Relevant config options seem to be: jobmanager.rpc.port: 55501 recovery.jobmanager.port: 55502 yarn.applic

Re: [Table] Types of query result and tablesink do not match error

2019-02-18 Thread Fabian Hueske
Hi François, I had a look at the code and the GenericTypeInfo checks equality by comparing the classes the represent (Class == Class). Class does not override the default implementation of equals, so this is an instance equality check. The check can evaluate to false, if Map was loaded by two diff

Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

2019-02-18 Thread Jark Wu
Thanks Stephan for the proposal and a big +1 to this! I also think it's a good idea to add a link of discussion/FLIP/JIRA to each item as Zhijiang mentioned above. This would be a great help for keeping track of progress and joining in the discussion easily. Best, Jark On Fri, 15 Feb 2019 at 11:

Re: Is group.id required in Kafka connector for offsets to be stored in checkpoint?

2019-02-18 Thread Konstantin Knauf
Hi David, Hi Sohi, this should not be the case. If a savepoint/checkpoint is provided, Flink should always take the offsets from the state regardless of the `group.id` provided. Which Flink version and which FlinkKafkaConsumer version do you use? Best, Konstantin On Mon, Feb 18, 2019 at 5:50 AM

Re: Is group.id required in Kafka connector for offsets to be stored in checkpoint?

2019-02-18 Thread sohimankotia
Yes Konstantin Knauf-2 . You are right . -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Confusion in Heartbeat configurations

2019-02-18 Thread sohimankotia
Thanks Zhijiang . Sorry to ask again . So both set of heartbeats are implementing same feature . If Yes , which one has highest priority to detect failure . If no , can you explain little more or point to some references to understand difference . Thanks Sohi -- Sent from: http://apache-fli

Re: Reading messages from start - new job submission

2019-02-18 Thread avilevi
Hi , I'm using 1.7.0 Cheers -- Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/

Re: Reduce one event under multiple keys

2019-02-18 Thread Fabian Hueske
Hi Stephen, Sorry for the late response. If you don't need to match open and close events, your approach of using a flatMap to fan-out for the hierarchical folder structure and a window operator (or two for open and close) for counting and aggregating should be a good design. Best, Fabian Am Mo.

Re: Limit in batch flink sql job

2019-02-18 Thread Fabian Hueske
Thanks for pointing this out! This is indeed a bug in the documentation. I'll fix that. Thank you, Fabian Am Mi., 13. Feb. 2019 um 02:04 Uhr schrieb yinhua.dai < yinhua.2...@outlook.com>: > OK, thanks. > It might be better to update the document which has the following example > that confused m

Re: KafkaTopicPartition internal class treated as generic type serialization

2019-02-18 Thread Fabian Hueske
Hi Eric, I did a quick search in our Jira to check if this is a known issue but didn't find anything. Maybe Gordon (in CC) knows a bit more about this problem. Best, Fabian Am Fr., 15. Feb. 2019 um 11:08 Uhr schrieb Eric Troies : > Hi, I'm having the exact same issue with flink 1.4.0 using scal

Re: Test FileUtilsTest.testDeleteDirectory failed when building Flink

2019-02-18 Thread Fabian Hueske
Hi Paul, Which components (Flink, JDK, Docker base image, ...) are you upgrading and which versions do you come from? I think it would be good to check how (and with which options) the JVM in the container is started. Best, Fabian Am Fr., 15. Feb. 2019 um 09:50 Uhr schrieb Paul Lam : > Hi all,

Dataset sampling

2019-02-18 Thread Flavio Pompermaier
Hi to all, is there any plan to support different sampling techniques? This would be very helpful when interactive table API will be available.. Best, Flavio

Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

2019-02-18 Thread Shaoxuan Wang
Hi Stephan, Thanks for summarizing the work&discussions into a roadmap. It really helps users to understand where Flink will forward to. The entire outline looks good to me. If appropriate, I would recommend to add another two attracting categories in the roadmap. *Flink ML Enhancement* - Refac

Re: Flink 1.6 Yarn Session behavior

2019-02-18 Thread Jins George
Thank you Gary. That was helpful. Thanks, Jins George On 2/17/19 10:03 AM, Gary Yao wrote: Hi Jins George, Every TM brings additional overhead, e.g., more heartbeat messages. However, a cluster with 28 TMs would not be considered big as there are users that are running Flink applications on tho

Re: Dataset sampling

2019-02-18 Thread Fabian Hueske
Hi Flavio, I'm not aware of any particular plan to add sampling operators to the Table API or SQL. However, I agree. It would be a good feature. Best, Fabian Am Mo., 18. Feb. 2019 um 15:44 Uhr schrieb Flavio Pompermaier < pomperma...@okkam.it>: > Hi to all, > is there any plan to support differ

Flink Streaming Job with OutputFormat stops without error message

2019-02-18 Thread Marke Builder
Hi, I'm using a flink streaming job which read from kafka and write to hbase with the OutputFormat. Like: https://github.com/apache/flink/blob/master/flink-connectors/flink-hbase/src/test/java/org/apache/flink/addons/hbase/example/HBaseWriteStreamExample.java But after a certain time, the job end

subscribe

2019-02-18 Thread Artur Mrozowski
art...@gmail.com

Re: ProducerFencedException when running 2 jobs with FlinkKafkaProducer

2019-02-18 Thread Rohan Thimmappa
Hi Tzu-Li, Any updated on this. This is consistently reproducible. Same jar - Separate source topic to Separate destination topic. This sort of blocker for flink upgrada. i tried with 1.7.2 but no luck. org.apache.flink.streaming.connectors.kafka.FlinkKafka011Exception: Failed to send data to

Re: ProducerFencedException when running 2 jobs with FlinkKafkaProducer

2019-02-18 Thread Tzu-Li (Gordon) Tai
Hi, I just saw a JIRA opened for this: https://issues.apache.org/jira/browse/FLINK-11654. The JIRA ticket's description matches what I had in mind and can confirm the bug assessment. Unfortunately, I currently do not have the capacity to provide a fix and test for this. For the meantime, I've mad

The submitting is hanging when register a hdfs file as registerCacheFile in 1.7 based on RestClusterClient

2019-02-18 Thread Joshua Fan
Hi, all As the title says, the submitting is always hanging there when the cache file is not reachable, actually because the RestClient uses a java.io.File to get the cache file. I use RestClusterClient to submit job in Flink 1.7. Below is instructions shown in https://ci.apache.org/projects/fli

Re: subscribe

2019-02-18 Thread Fabian Hueske
Hi Artur, In order to subscribe to Flink's user mailing list you need to send a mail to user-subscr...@flink.apache.org Best, Fabian Am Mo., 18. Feb. 2019 um 20:34 Uhr schrieb Artur Mrozowski : > art...@gmail.com >