Re: OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
Things improved during bootstrapping and when the event volume was larger, but as soon as the traffic slowed down the events are getting stuck (buffered?) at the OVER operator for a very long time. Several hours. On Fri, Aug 23, 2019 at 5:04 PM Vinod Mehra wrote: > (Forgot to mention that we are

Re: Problem with Flink on Yarn

2019-08-23 Thread Rong Rong
It seems like your Kerberos server is starting to issue invalid tokens to your job manager. Can you share how your Kerberos settings are configured? This might also relate to how your KDC servers are configured. -- Rong On Fri, Aug 23, 2019 at 7:00 AM Zhu Zhu wrote: > Hi Juan, > > Have you tried

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Got it. Thanks Till & Zili. +1 for having the release notes cover such issues. On Fri, Aug 23, 2019 at 11:01 PM Oytun Tez wrote: > Hi all, > > We also had to rollback our upgrade effort for 2 reasons: > > - Official Docker container is not ready yet > - This artefact is not published with > sc

Re: Per Partition Watermarking source idleness

2019-08-23 Thread Eduardo Winpenny Tejedor
Hi Prakhar, Everything is probably working as expected: if a partition does not receive any messages, then the watermark of the operator does not advance (as it is the minimum across all partitions). You'll need to define a strategy for the watermark to advance even when no messages are received f
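A minimal sketch of one such strategy, assuming Flink 1.6-era APIs (the event type, field names, and timeout values below are hypothetical): a periodic watermark assigner that keeps advancing once its partition/subtask has been idle for a configured amount of time.

```scala
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks
import org.apache.flink.streaming.api.watermark.Watermark

// Hypothetical event type; replace with the real schema.
case class MyEvent(id: String, eventTimeMillis: Long)

class IdleAwareWatermarkAssigner(idleTimeoutMs: Long, maxOutOfOrdernessMs: Long)
    extends AssignerWithPeriodicWatermarks[MyEvent] {

  private var maxEventTime = Long.MinValue
  private var lastEventSeenAt = Long.MinValue

  override def extractTimestamp(element: MyEvent, previousElementTimestamp: Long): Long = {
    maxEventTime = math.max(maxEventTime, element.eventTimeMillis)
    lastEventSeenAt = System.currentTimeMillis()
    element.eventTimeMillis
  }

  override def getCurrentWatermark(): Watermark = {
    if (maxEventTime == Long.MinValue) {
      // Nothing seen yet: hold the watermark back.
      new Watermark(Long.MinValue)
    } else if (System.currentTimeMillis() - lastEventSeenAt > idleTimeoutMs) {
      // Idle for a while: advance based on processing time so downstream
      // windows are not held back forever by this idle partition.
      new Watermark(System.currentTimeMillis() - maxOutOfOrdernessMs)
    } else {
      new Watermark(maxEventTime - maxOutOfOrdernessMs)
    }
  }
}
```

If the assigner is handed to the Kafka consumer itself (consumer.assignTimestampsAndWatermarks(...)), it is applied per partition, which is where the idleness handling matters most. Note that advancing on processing time trades completeness for progress: late data from the idle partition may be dropped.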

Re: OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
(Forgot to mention that we are using Flink 1.4) Update: Earlier the OVER operator was assigned a parallelism of 64. I reduced it to 1 and the problem went away! Now the OVER operator is not filtering/buffering the events anymore. Can someone explain this please? Thanks, Vinod On Fri, Aug 23, 20

Re: [SURVEY] How do you use high-availability services in Flink?

2019-08-23 Thread Aleksandar Mastilovic
Hi all, Since I’m currently working on an implementation of HighAvailabilityServicesFactory I thought it would be good to report here about my experience so far. Our use case is cloud based, where we package Flink and our supplementary code into a docker image, then run those images through Ku

flink sql syntax for accessing object

2019-08-23 Thread Fanbin Bu
Hi, I have a table whose schema is a Scala case class or a Map. How do I access the field? I tried the following and it doesn't work. case class MyObject(myField: String) case class Event(myObject: MyObject, myMap: Map[String, String]) table = tableEnv.fromDataStream[Event](myStream, 'myObject, '
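Not an authoritative answer, but for the nested case-class part a sketch like the following (Flink 1.9-era Scala Table API assumed; table and field names are illustrative) shows the two usual access paths: dot notation in SQL and get(...) in the Table API. Whether the Map-typed field can be addressed with myMap['key'] in SQL depends on whether that field is actually mapped to a SQL MAP type; a plain Scala Map may end up as an opaque generic type.

```scala
import org.apache.flink.streaming.api.scala._
import org.apache.flink.table.api.scala._

case class MyObject(myField: String)
case class Event(myObject: MyObject)

object NestedFieldAccessSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tableEnv = StreamTableEnvironment.create(env)

    val myStream = env.fromElements(Event(MyObject("a")), Event(MyObject("b")))

    // Register the stream without flattening; the case class becomes a nested row.
    val table = tableEnv.fromDataStream(myStream)
    tableEnv.registerTable("events", table)

    // SQL: nested row fields are addressed with dot notation.
    val viaSql = tableEnv.sqlQuery("SELECT myObject.myField FROM events")

    // Table API: composite fields are addressed with get(...).
    val viaTableApi = table.select('myObject.get("myField"))

    viaSql.toAppendStream[String].print()
    env.execute("nested field access sketch")
  }
}
```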

OVER operator filtering out records

2019-08-23 Thread Vinod Mehra
We have a SQL-based Flink job which consumes a very low volume stream (1 or 2 events in a few hours): SELECT user_id, COUNT(*) OVER (PARTITION BY user_id ORDER BY rowtime RANGE INTERVAL '30' DAY PRECEDING) as count_30_days, COALESCE(occurred_at, logged_at) AS latency_marker, rowtime FRO

kinesis table connector support

2019-08-23 Thread Fanbin Bu
Hi, It looks like the Flink table connectors do not include `kinesis` (only FileSystem, Kafka, ES); see https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/connect.html#table-connectors . I also found some examples for Kafka: https://eventador.io/blog/flink_table_and_sql_api_with_apache_flin
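There doesn't seem to be a built-in Kinesis table descriptor in those versions, so one workaround (a sketch only; stream name, region, record format and parsing are placeholders) is to read Kinesis with the DataStream connector and register the resulting stream as a table:

```scala
import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer
import org.apache.flink.streaming.connectors.kinesis.config.{AWSConfigConstants, ConsumerConfigConstants}
import org.apache.flink.table.api.scala._

// Illustrative record type; real parsing of the Kinesis payload is elided.
case class Click(userId: String, url: String)

object KinesisToTableSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val tableEnv = StreamTableEnvironment.create(env)

    val props = new Properties()
    props.setProperty(AWSConfigConstants.AWS_REGION, "us-east-1")
    props.setProperty(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST")
    // Credentials come from the default AWS provider chain unless configured explicitly.

    // Read raw records with the DataStream Kinesis connector...
    val raw: DataStream[String] =
      env.addSource(new FlinkKinesisConsumer[String]("my-stream", new SimpleStringSchema(), props))

    // ...turn them into a typed stream (placeholder parsing)...
    val clicks: DataStream[Click] = raw.map(line => Click(line, ""))

    // ...and register it so SQL / Table API queries can run against it.
    tableEnv.registerDataStream("clicks", clicks)
    val counts = tableEnv.sqlQuery("SELECT userId, COUNT(*) FROM clicks GROUP BY userId")

    counts.toRetractStream[(String, Long)].print()
    env.execute("kinesis to table sketch")
  }
}
```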

Use logback instead of log4j

2019-08-23 Thread Vishwas Siravara
Hi, From the Flink doc, in order to use logback instead of log4j: "Users willing to use logback instead of log4j can just exclude log4j (or delete it from the lib/ folder)." https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/logging.html . However, when I delete it from the lib a

Hive version in Flink

2019-08-23 Thread Yebgenya Lazarkhosrouabadi
Hello, I'm using Flink on Cloudera-quickstart-vm-5.13 and need to access the Hive tables. The Hive version on Cloudera is 1.1.0, but in order to access the data of the Hive tables, a higher version of Hive is needed. Unfortunately it is not possible to easily change the version of Hive on C
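For context, once a supported Hive version is available, the Flink 1.9 Hive integration is used roughly like the sketch below (catalog name, conf dir and version string are placeholders; this assumes the flink-connector-hive dependency and the Blink planner, and will not help with Hive 1.1.0 itself):

```scala
import org.apache.flink.table.api.{EnvironmentSettings, TableEnvironment}
import org.apache.flink.table.catalog.hive.HiveCatalog

object HiveCatalogSketch {
  def main(args: Array[String]): Unit = {
    // Blink planner in batch mode is the usual choice for reading Hive tables in 1.9.
    val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inBatchMode().build()
    val tableEnv = TableEnvironment.create(settings)

    // Point the catalog at the Hive metastore via its conf directory.
    val hive = new HiveCatalog("myhive", "default", "/etc/hive/conf", "2.3.4")
    tableEnv.registerCatalog("myhive", hive)
    tableEnv.useCatalog("myhive")

    val result = tableEnv.sqlQuery("SELECT * FROM some_hive_table")
    // ...continue with the query / write out the result as needed.
  }
}
```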

type error with generics ..

2019-08-23 Thread Debasish Ghosh
Hello - I have the following call to addSource where I pass a Custom SourceFunction .. env.addSource( new CollectionSourceFunctionJ(data, TypeInformation.of(new TypeHint(){})) ) where data is List and CollectionSourceFunctionJ is a Scala case class .. case class CollectionSourceFunctionJ[T](d
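In case it is useful to readers, the usual way around erasure in this situation is to build the TypeInformation for the concrete element type (e.g. with Scala's createTypeInformation or a TypeHint over a concrete type) and pass it along explicitly. A rough sketch under those assumptions; the Data element type and the simplified CollectionSourceFunctionJ below are stand-ins for the classes in the question:

```scala
import java.util.{Arrays => JArrays, List => JList}

import org.apache.flink.api.common.typeinfo.TypeInformation
import org.apache.flink.streaming.api.functions.source.SourceFunction
import org.apache.flink.streaming.api.scala._

import scala.collection.JavaConverters._

case class Data(value: String)

// Simplified stand-in for the source in the question: it just replays a Java list.
case class CollectionSourceFunctionJ[T](data: JList[T], ti: TypeInformation[T])
    extends SourceFunction[T] {
  override def run(ctx: SourceFunction.SourceContext[T]): Unit =
    data.asScala.foreach(ctx.collect)
  override def cancel(): Unit = ()
}

object TypedSourceSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val data: JList[Data] = JArrays.asList(Data("a"), Data("b"))

    // Build TypeInformation for the concrete type instead of an erased TypeHint.
    val ti: TypeInformation[Data] = createTypeInformation[Data]

    val stream: DataStream[Data] = env.addSource(CollectionSourceFunctionJ(data, ti))(ti)
    stream.print()
    env.execute("typed source sketch")
  }
}
```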

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Oytun Tez
Hi all, We also had to rollback our upgrade effort for 2 reasons: - Official Docker container is not ready yet - This artefact is not published with scala: org.apache.flink:flink-queryable-state-client-java_2.11:jar:1.9.0 --- Oytun Tez *M O T A W O R D* The World's Fastest Human Translatio

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
Hi Till, Did we mention this in the release notes (or maybe a previous release note, where we did the exclusion)? Best, tison. Till Rohrmann wrote on Fri, Aug 23, 2019 at 10:28 PM: > Hi Gavin, > > if I'm not mistaken, then the community excluded the Scala FlinkShell > since a couple of versions for Scala 2.12. The p

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Till Rohrmann
Hi Gavin, if I'm not mistaken, the community has excluded the Scala FlinkShell for Scala 2.12 for a couple of releases now. The problem seems to be that some of the tests failed. See here [1] for more information. [1] https://issues.apache.org/jira/browse/FLINK-10911 Cheers, Till On Fri, Aug 23,

Re: Problem with Flink on Yarn

2019-08-23 Thread Zhu Zhu
Hi Juan, Have you tried a Flink release built with Hadoop 2.7 or a later version? If you are using Flink 1.8/1.9, it should be the pre-bundled Hadoop 2.7+ jar, which can be found on the Flink download page. I think YARN-3103 is about AMRMClientImpl.class and it is in the Flink shaded Hadoop jar. Thanks, Z

Are there any news on custom trigger support for SQL/Table API?

2019-08-23 Thread Theo Diefenthal
Hi there, I am currently evaluating letting our experienced system users write Flink SQL queries directly. Currently, all the queries our users need are implemented programmatically. There is one major problem preventing us from just giving SQL to our users directly. Almost all queries of our users ar

Re: I'm not able to make a stream-stream Time windows JOIN in Flink SQL

2019-08-23 Thread Theo Diefenthal
Hi Fabian, hi Zhenghua, Thank you for your suggestions and for telling me that I was on the right track. It is good to know how to find out whether something yields a time-bounded or a regular join. @Fabian: Regarding your suggested first option: Isn't that exactly what my first try was? With this T
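For readers of the thread, the shape that the planner recognizes as a time-bounded (interval) join rather than a regular join is an equi-join predicate plus a bound that ties the two rowtime attributes to each other. A hedged sketch of that pattern (table and column names are made up, and both tables are assumed to be registered on tableEnv with event-time attributes):

```scala
// Assumes tables Orders(orderId, o_rowtime) and Shipments(orderId, s_rowtime)
// are already registered, with o_rowtime/s_rowtime as rowtime attributes.
val joined = tableEnv.sqlQuery(
  """
    |SELECT o.orderId, o.o_rowtime, s.s_rowtime
    |FROM Orders o, Shipments s
    |WHERE o.orderId = s.orderId
    |  AND s.s_rowtime BETWEEN o.o_rowtime AND o.o_rowtime + INTERVAL '1' HOUR
  """.stripMargin)
```

If the predicate does not give both a lower and an upper bound relating the two rowtime attributes, the planner treats the query as a regular join that keeps both inputs in state indefinitely.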

Problem with Flink on Yarn

2019-08-23 Thread Juan Gentile
Hello! We are running Flink on Yarn and we are currently getting the following error: 2019-08-23 06:11:01,534 WARN org.apache.hadoop.security.UserGroupInformation - PriviledgedActionException as: (auth:KERBEROS) cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.se

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
I used the package from the Apache official site, with mirror [1]; the difference is I used the Scala 2.12 version. I also tried to build from source for both Scala 2.11 and 2.12; when building 2.12 the FlinkShell.class is in the flink-dist jar file, but after running mvn clean package -Dscala-2.12 this class was remov

Per Partition Watermarking source idleness

2019-08-23 Thread Prakhar Mathur
Hi, We are using Flink v1.6. We are facing data loss issues while consuming data from older offsets in Kafka with windowing. We are exploring a per-partition watermarking strategy. But we noticed that when we are trying to consume from multiple topics, if any of the partitions is not receiving dat

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
I downloaded the 1.9.0 dist from here [1] and didn't see the issue. Where did you download it? Could you try to download the dist from [1] and see whether the problem persists? Best, tison. [1] http://mirrors.tuna.tsinghua.edu.cn/apache/flink/flink-1.9.0/flink-1.9.0-bin-scala_2.11.tgz Gavin Lee wrote on Aug

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Thanks for your reply @Zili. I'm afraid it's not the same issue. I found that FlinkShell.class is not included in the flink-dist jar file in the 1.9.0 version. The class file can't be found anywhere inside the jar, nor in the opt or lib directory under the root folder of the Flink distribution. On Fri, Aug 23, 2019 at

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Zili Chen
Hi Gavin, I also found a problem in the shell script: if the directory contains whitespace, then the final command to run is incorrect. Could you check the final command to be executed? FYI, here is the ticket [1]. Best, tison. [1] https://issues.apache.org/jira/browse/FLINK-13827 Gavin Lee wrote on Fri, Aug 23, 2019 at

Re: Can I use watermarkers to have a global trigger of different ProcessFunction's?

2019-08-23 Thread David Anderson
If you want to use event time processing with in-order data, then you can use an AscendingTimestampExtractor [1]. David [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_timestamp_extractors.html#assigners-with-ascending-timestamps On Thu, Aug 22, 2019 at 4:03 PM Felipe Gutie
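As a small illustration of [1], a sketch of attaching an AscendingTimestampExtractor to an in-order stream (the event type and field names are made up):

```scala
import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor
import org.apache.flink.streaming.api.scala._

// Hypothetical in-order event type with an epoch-millis timestamp field.
case class Reading(sensorId: String, timestampMillis: Long, value: Double)

object AscendingTimestampsSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    val readings: DataStream[Reading] = env.fromElements(
      Reading("s1", 1000L, 0.1), Reading("s1", 2000L, 0.2))

    // For strictly in-order data, the watermark simply trails the last timestamp.
    val withTimestamps = readings.assignTimestampsAndWatermarks(
      new AscendingTimestampExtractor[Reading] {
        override def extractAscendingTimestamp(r: Reading): Long = r.timestampMillis
      })

    withTimestamps.print()
    env.execute("ascending timestamps sketch")
  }
}
```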

Re: [ANNOUNCE] Apache Flink 1.9.0 released

2019-08-23 Thread Gavin Lee
Why does bin/start-scala-shell.sh local return the following error? bin/start-scala-shell.sh local Error: Could not find or load main class org.apache.flink.api.scala.FlinkShell For Flink 1.8.1 and previous versions, there were no such issues. On Fri, Aug 23, 2019 at 2:05 PM qi luo wrote: > Congratulations and thanks

checkpoint failure suddenly even state size is into 10 mb around

2019-08-23 Thread Sushant Sawant
Hi all, I'm facing two issues which I believe are co-related. 1. The Kafka source shows high back pressure. 2. Sudden checkpoint failure for an entire day until restart. My job does the following: a. Read from Kafka b. Async I/O to an external system c. Dump into Cassandra, Elasticsearch Checkpointin
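One thing that may be worth double-checking in a pipeline like this is the timeout and capacity of the async stage, since an overloaded external system plus a large in-flight capacity can translate into exactly the back pressure and stalled checkpoints described here. A rough sketch of those knobs, assuming the Flink 1.6-era Scala async API (the enrichment logic is a placeholder):

```scala
import java.util.concurrent.TimeUnit

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.scala.async.{AsyncFunction, ResultFuture}

import scala.concurrent.{ExecutionContext, Future}
import scala.util.{Failure, Success}

// Placeholder for the call to the external system.
class EnrichFunction extends AsyncFunction[String, String] {
  @transient implicit lazy val ec: ExecutionContext = ExecutionContext.global

  override def asyncInvoke(input: String, resultFuture: ResultFuture[String]): Unit = {
    Future {
      // the external lookup would go here
      input.toUpperCase
    }.onComplete {
      case Success(v) => resultFuture.complete(Iterable(v))
      case Failure(t) => resultFuture.completeExceptionally(t)
    }
  }
}

object AsyncStageSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val input: DataStream[String] = env.fromElements("a", "b", "c")

    // Timeout and capacity bound how much work can be in flight: too large a
    // capacity lets slow calls pile up, too small a capacity throttles the source.
    val enriched = AsyncDataStream.unorderedWait(
      input, new EnrichFunction, 5000L, TimeUnit.MILLISECONDS, 100)

    enriched.print()
    env.execute("async stage sketch")
  }
}
```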