Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Yang Wang
Hi Pankaj, Using `-yid` to submit a Flink job to an existing YARN session cluster is always a safe approach. For example, `flink run -d -yid application_1234 examples/streaming/WordCount.jar`. Maybe the magic properties file will be removed in the future. Best, Yang Pankaj Chand wrote on Fri, Dec 13, 2019

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Pankaj Chand
Vino and Kostas: Thank you for the info! I was using Flink 1.9.1 with Pre-bundled Hadoop 2.7.5. Cloudlab has quarantined my cluster experiment without notice 😕, so I'll let you know if and when they allow me to access the files in the future. regards, Pankaj On Thu, Dec 12, 2019 at 8:35 AM Ko

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread 杨光
I did some tests using createTypeInformation[MyClassToAnalyze]. It works fine with some simple case classes but throws "could not find implicit value" or "constructor _UrlType in class _UrlType cannot be accessed in <$anon: org.apache.flink.api.scala.typeutils.ScalaCaseClassSerializer" ex
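A minimal Scala sketch of the two usual causes behind those errors (assumed here, since the full code is not shown): the implicit TypeInformation macro needs the Scala API import in scope, and a case class handled by the ScalaCaseClassSerializer should be declared top-level so that its constructor stays accessible. The UrlEvent class below is hypothetical.

    import org.apache.flink.streaming.api.scala._  // provides the implicit TypeInformation macro

    // Top-level case class: analyzed as a case-class type with an accessible constructor,
    // unlike a class declared inside a method or anonymous class.
    case class UrlEvent(url: String, hits: Int)

    object CaseClassTypeInfoSketch {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.fromElements(UrlEvent("https://flink.apache.org", 1), UrlEvent("https://flink.apache.org", 2))
          .keyBy(_.url)
          .reduce((a, b) => UrlEvent(a.url, a.hits + b.hits))
          .print()
        env.execute("case class TypeInformation sketch")
      }
    }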

How to understand create watermark for Kafka partitions

2019-12-12 Thread qq
Hi all, I am confused about watermarks for Kafka partitions. As far as I know, watermarks are created at the data stream level, so why do people also say a watermark is created for each Kafka topic partition? In my tests, watermarks were also created globally, even when I ran my job in parallel. And assign watermarks on
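A minimal sketch (Flink 1.9-era API; the topic, broker address and timestamp extraction are assumptions) of the distinction being asked about: assigning the timestamp/watermark extractor on the Kafka consumer itself makes the source track one watermark per Kafka partition and emit their minimum, whereas assigning it on the resulting DataStream gives plain per-subtask, stream-level watermarks.

    import java.util.Properties

    import org.apache.flink.api.common.serialization.SimpleStringSchema
    import org.apache.flink.streaming.api.TimeCharacteristic
    import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

    object PerPartitionWatermarks {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

        val props = new Properties()
        props.setProperty("bootstrap.servers", "localhost:9092") // hypothetical broker
        props.setProperty("group.id", "watermark-demo")

        val consumer = new FlinkKafkaConsumer[String]("events", new SimpleStringSchema(), props)

        // Assigner attached to the consumer -> per-partition watermarks inside the source.
        // Attaching the same assigner to env.addSource(consumer) instead would produce
        // per-subtask watermarks computed after the partitions are already merged.
        consumer.assignTimestampsAndWatermarks(
          new BoundedOutOfOrdernessTimestampExtractor[String](Time.seconds(5)) {
            override def extractTimestamp(element: String): Long =
              element.split(",")(0).toLong // hypothetical: event time is the first CSV field
          })

        env.addSource(consumer).print()
        env.execute("per-partition watermarks sketch")
      }
    }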

Re: State Processor API: StateMigrationException for keyed state

2019-12-12 Thread vino yang
Hi pwestermann, Can you share the relevant detailed exception message? Best, Vino pwestermann wrote on Fri, Dec 13, 2019 at 2:00 AM: > I am trying out the new State Processor API but I am having trouble with > keyed state (this is for Flink 1.9.1 with RocksDB on S3 as the backend). > I can read keyed sta

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
Ok I think I identified the issue: 1. I accidentally bundled another version of slf4j in my job jar, which results in some incompatibility with the slf4j jar bundled with flink/bin. Apparently slf4j in this case defaults to something that ignores the conf? Once I removed slf4j from my job jar, the
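A small sketch of the fix Li Peng describes, assuming an sbt build (the coordinates and versions are illustrative): marking the logging binding as provided keeps a second slf4j binding out of the fat job jar, so only the one shipped in the Flink distribution is on the classpath.

    // build.sbt (sketch): keep slf4j/log4j bindings out of the job jar
    libraryDependencies ++= Seq(
      "org.apache.flink" %% "flink-streaming-scala" % "1.9.1" % Provided,
      "org.slf4j"         % "slf4j-log4j12"          % "1.7.15" % Provided
    )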

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
Hey ouywl, interesting, I figured something like that would happen. I actually replaced all the log4j-x files with the same config I originally posted, including log4j-console, but that didn't change the behavior either. Hey Yang, yes I verified the properties files are as I configured, and that t

State Processor API: StateMigrationException for keyed state

2019-12-12 Thread pwestermann
I am trying out the new State Processor API but I am having trouble with keyed state (this is for Flink 1.9.1 with RocksDB on S3 as the backend). I can read keyed state for simple key types such as Strings, but whenever I try to read state with a more complex key type - such as a named tuple ty
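A hedged sketch of reading keyed state with a case-class key, under these assumptions: the operator uid, state name, key and value types below are all made up, and the explicit TypeInformation overload of readKeyedState is used so the reader builds the same Scala case-class serializer that wrote the key, instead of falling back to a Kryo/POJO serializer (a common source of StateMigrationException).

    import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
    import org.apache.flink.api.java.{ExecutionEnvironment => JavaEnv}
    import org.apache.flink.api.scala._
    import org.apache.flink.configuration.Configuration
    import org.apache.flink.contrib.streaming.state.RocksDBStateBackend
    import org.apache.flink.state.api.Savepoint
    import org.apache.flink.state.api.functions.KeyedStateReaderFunction
    import org.apache.flink.util.Collector

    // Hypothetical composite key, standing in for the "named tuple" key in the question.
    case class DeviceKey(tenant: String, deviceId: Long)

    class CounterReader extends KeyedStateReaderFunction[DeviceKey, (DeviceKey, Long)] {
      @transient private var counter: ValueState[Long] = _

      override def open(parameters: Configuration): Unit = {
        counter = getRuntimeContext.getState(
          new ValueStateDescriptor[Long]("counter", createTypeInformation[Long])) // hypothetical state name
      }

      override def readKey(key: DeviceKey,
                           ctx: KeyedStateReaderFunction.Context,
                           out: Collector[(DeviceKey, Long)]): Unit =
        out.collect((key, counter.value()))
    }

    object ReadKeyedStateSketch {
      def main(args: Array[String]): Unit = {
        val env = JavaEnv.getExecutionEnvironment
        val savepoint = Savepoint.load(env, "s3://bucket/savepoints/sp-1",   // hypothetical path
          new RocksDBStateBackend("s3://bucket/checkpoints"))

        savepoint
          .readKeyedState("my-operator-uid", new CounterReader,
            createTypeInformation[DeviceKey], createTypeInformation[(DeviceKey, Long)])
          .print()
      }
    }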

Join a datastream with tables stored in Hive

2019-12-12 Thread Krzysztof Zarzycki
Hello dear Flinkers, If this kind of question has already been asked on the list, I'm sorry for the duplicate; feel free to just point me to the thread. I have to solve a probably pretty common case of joining a datastream to a dataset. Let's say I have the following setup: * I have a high-pace stream of events

Re: Flink ML feature

2019-12-12 Thread Rong Rong
Hi guys, Yes, as Till mentioned, the community is working on a new ML library and we are working closely with the Alink project to bring in the algorithms. You can find more information regarding the new ML design architecture in FLIP-39 [1]. One of the major changes is that the new ML library [2] wi

Re: Flink 1.9.1 KafkaConnector missing data (1M+ records)

2019-12-12 Thread Kostas Kloudas
Hi Harrison, Really sorry for the late reply. Do you have any insight on whether the missing records were read by the consumer and just the StreamingFileSink failed to write their offsets, or whether the Kafka consumer did not even read them or dropped them for some reason? I'm asking this in order to narro

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Kostas Kloudas
Hi Pankaj, When you start a session cluster with the bin/yarn-session.sh script, Flink will create the cluster and then write a "Yarn Properties file" named ".yarn-properties-YOUR_USER_NAME" in the directory: either the one specified by the option "yarn.properties-file.location" in the flink-conf.
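In practice (a sketch assuming the default location under java.io.tmpdir and a made-up application id): after the YARN application has been killed, removing that stale properties file, or pointing the client at a cluster explicitly with `-yid`, prevents later submissions from silently targeting the dead session.

    # the session was killed, e.g. with: yarn application -kill application_1234
    rm -f /tmp/.yarn-properties-$USER          # drop the leftover "Yarn Properties file"
    # or skip the properties file entirely by naming the target cluster:
    flink run -d -yid application_5678 examples/streaming/WordCount.jar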

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread vino yang
Hi Pankaj, Can you tell us which Flink version you are using? And can you share the Flink client and JobManager logs with us? This information would help us locate your problem. Best, Vino Pankaj Chand wrote on Thu, Dec 12, 2019 at 7:08 PM: > Hello, > > When using Flink on YARN in session mode, each Flin

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread ouywl
@Li Peng, I found your problem. Your start command uses the "start-foreground" argument, so it runs `exec "${FLINK_BIN_DIR}"/flink-console.sh ${ENTRY_POINT_NAME} "${ARGS[@]}"`, and in flink-console.sh the code is: log_setting=("-Dlog4j.configuration=file:${FLINK_CONF_DIR}/log4j-cons

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread Timo Walther
Hi, the serializers are created from TypeInformation, so you can simply inspect the type information, e.g. by using this in the Scala API: val typeInfo = createTypeInformation[MyClassToAnalyze] and going through the object with a debugger. Actually, I don't understand why scala.Tuple2 is tr
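A minimal sketch of that suggestion without the debugger step: print the TypeInformation and the serializer it creates (MyClassToAnalyze stands in for the poster's class; the nested Tuple2 field is made up).

    import org.apache.flink.api.common.ExecutionConfig
    import org.apache.flink.api.scala._

    case class MyClassToAnalyze(id: Long, nested: (String, Int)) // hypothetical fields

    object InspectTypeInfo {
      def main(args: Array[String]): Unit = {
        val typeInfo = createTypeInformation[MyClassToAnalyze]
        println(typeInfo)                                        // how Flink analyzed the type
        println(typeInfo.createSerializer(new ExecutionConfig))  // which serializer will actually be used
      }
    }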

Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Pankaj Chand
Hello, When using Flink on YARN in session mode, each Flink job client would automatically know the YARN cluster to connect to. It says this somewhere in the documentation. So, I killed the Flink session cluster by simply killing the YARN application using the "yarn kill" command. However, when s

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Pankaj Chand
Thank you, Chesnay! On Thu, Dec 12, 2019 at 5:46 AM Chesnay Schepler wrote: > Yes, when a cluster was started it takes a few seconds for (any) metrics > to be available. > > On 12/12/2019 11:36, Pankaj Chand wrote: > > Hi Vino, > > Thank you for the links regarding backpressure! > > I am current

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Chesnay Schepler
Yes, when a cluster was started it takes a few seconds for (any) metrics to be available. On 12/12/2019 11:36, Pankaj Chand wrote: Hi Vino, Thank you for the links regarding backpressure! I am currently using code to get metrics by calling REST API via curl. However, many times the REST API

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Pankaj Chand
Hi Vino, Thank you for the links regarding backpressure! I am currently using code to get metrics by calling the REST API via curl. However, many times the REST API call returns an empty JSON object/array. Piped through jq (for filtering JSON), it produces a null value, which is breaking my code. Exa
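One way to tolerate that start-up window (a sketch; the port, metric name and jq filter are assumptions) is to poll until the REST API actually returns a value, since metrics only register a few seconds after the cluster starts:

    JM=http://localhost:8081
    until value=$(curl -s "$JM/jobmanager/metrics?get=Status.JVM.CPU.Load" | jq -er '.[0].value'); do
      sleep 2   # an empty array (null after the jq filter) means the metric is not registered yet
    done
    echo "JVM CPU load: $value"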

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread 杨光
Actually, the original source code has too many third-party classes and is hard to simplify. The question I want to ask is: is there any way for me to find out which class is serialized/deserialized by which serializer class, so that we can tune it or write a custom serializer to improve performance? Yun Tang wrote on Dec 12, 2019
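One small sketch of surfacing that mapping without digging through third-party sources (disableGenericTypes is a real ExecutionConfig switch; the surrounding program is illustrative): with generic types disabled, building the job fails with an exception naming every class that would otherwise fall back to the generic Kryo serializer, which are exactly the candidates for tuning or a custom serializer.

    import org.apache.flink.streaming.api.scala._

    object SerializerAudit {
      def main(args: Array[String]): Unit = {
        val env = StreamExecutionEnvironment.getExecutionEnvironment
        // Fail fast for any type that is not handled by a dedicated (case class / POJO) serializer.
        env.getConfig.disableGenericTypes()
        // ... build the real pipeline here; Kryo-serialized types now surface as exceptions
        //     at graph-build time instead of silently costing serialization performance.
      }
    }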

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-12 Thread Zhu Zhu
Thanks Hequn for driving the release and everyone who made this release possible! Thanks, Zhu Zhu Wei Zhong wrote on Thu, Dec 12, 2019 at 3:45 PM: > Thanks Hequn for being the release manager. Great work! > > Best, > Wei > > On Dec 12, 2019, at 15:27, Jingsong Li wrote: > Thanks Hequn for your driving, 1.8.3 fixed