Reading/writing Kafka message keys using DDL pyFlink

2020-07-24 Thread Manas Kale
Hi, How do I read/write Kafka message keys using DDL? I have not been able to see any documentation for the same. Thanks!

Re: Reading/writing Kafka message keys using DDL pyFlink

2020-07-24 Thread Leonard Xu
Hi, Kale Unfortunately Flink SQL does not support read/write Kafka message keys yet, there is a FLIP[1] to discuss this feature. Best Leonard Xu [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-107%3A+Reading+table+columns+from+different+parts+of+source+records#FLIP107:Readingtableco

Re: Kafka connector with PyFlink

2020-07-24 Thread Wojciech Korczyński
Hi, thank you for your answer, it is very helpful. Currently my python program looks like: from pyflink.datastream import StreamExecutionEnvironment from pyflink.dataset import ExecutionEnvironment from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment, StreamTableEnvironment fro

Re: GenericData cannot be cast to type scala.Product

2020-07-24 Thread Aljoscha Krettek
For anyone following this: the discussion is happening on the Jira issue: https://issues.apache.org/jira/browse/FLINK-18478 Best, Aljoscha On 23.07.20 15:32, Georg Heiler wrote: Hi, as a follow up to https://issues.apache.org/jira/browse/FLINK-18478 I now face a class cast exception. The repr

Re: Kafka connector with PyFlink

2020-07-24 Thread Xingbo Huang
Hi Wojtek, The following ways of using Pyflink is my personal recommendation: 1. Use DDL[1] to create your source and sink instead of the descriptor way, because as of flink 1.11, there are some bugs in the descriptor way. 2. Use `execute_sql` for single statement, use `create_statement_set` for

Re: Reading/writing Kafka message keys using DDL pyFlink

2020-07-24 Thread Manas Kale
Okay, thanks for the info! On Fri, Jul 24, 2020 at 2:11 PM Leonard Xu wrote: > Hi, Kale > > Unfortunately Flink SQL does not support read/write Kafka message keys > yet, there is a FLIP[1] to discuss this feature. > > Best > Leonard Xu > [1] > https://cwiki.apache.org/confluence/display/FLINK/FL

REST API randomly returns Not Found for an existing job

2020-07-24 Thread Tomasz Dudziak
Hi, I have come across an issue related to GET /job/:jobId endpoint from monitoring REST API in Flink 1.9.0. A few seconds after successfully starting a job and confirming its status as RUNNING, that endpoint would return 404 (Not Found). Interestingly, querying immediately again (literally a m

Re: REST API randomly returns Not Found for an existing job

2020-07-24 Thread Chesnay Schepler
How reproducible is this problem / how often does it occur? How is the cluster deployed? Is anything else happening to the cluster around that that time (like a JobMaster failure)? On 24/07/2020 13:28, Tomasz Dudziak wrote: Hi, I have come across an issue related to GET /job/:jobId endpoint

Re: Flink DataStream[String] kafkacosumer avro streaming file sink

2020-07-24 Thread Robert Metzger
Thank you for your question. I responded on StackOverflow. Let's finish the discussion there. On Fri, Jul 24, 2020 at 5:07 AM Vijayendra Yadav wrote: > Hi Flink Team, > > *FLINK Streaming:* I have DataStream[String] from kafkaconsumer > > DataStream stream = env > .addSource(new FlinkKafkaCo

Re: Flink 1.11 job stop with save point timeout error

2020-07-24 Thread Robert Metzger
Hi Ivan, thanks a lot for your message. Can you post the JobManager log here as well? It might contain additional information on the reason for the timeout. On Fri, Jul 24, 2020 at 4:03 AM Ivan Yang wrote: > Hello everyone, > > We recently upgrade FLINK from 1.9.1 to 1.11.0. Found one strange be

Re: How to start flink standalone session on windows ?

2020-07-24 Thread Chesnay Schepler
Flink no longer runs natively on Windows; you will have to use some unix-like environment like WSL or cygwin. On 24/07/2020 04:27, wangl...@geekplus.com.cn wrote: There's no  start-cluster.bat and flink.bat in bin directory. So how can i start flink on windowns OS? Thanks, Lei -

Re: Flink app cannot restart

2020-07-24 Thread Robert Metzger
Hi Rainie, I believe we need the full JobManager log to understand what's going on with your job. The logs you've provided so far only tell us that a TaskManager has died (which is expected, when a node goes down). What is interesting to see is what's happening next: are we having enough resources

Re: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-24 Thread Robert Metzger
Hey Tom, I'm not aware of any patterns for this problem, sorry. Intuitively, I would send dead letters to a separate Kafka topic. Best, Robert On Wed, Jul 22, 2020 at 7:22 PM Tom Fennelly wrote: > Thanks Chen. > > I'm thinking about errors that occur while processing a record/message > that s

Re: Flink 1.11 Simple pipeline (data stream -> table with egg -> data stream) failed

2020-07-24 Thread Timo Walther
Hi Dmytro, `StreamTableEnvironment` does not support batch mode currently. Only `TableEnvironment` supports the unified story. I saw that you disabled the check in the `create()` method. This check exists for a reason. For batch execution, the planner sets specific properties on the stream g

Re: Recommended pattern for implementing a DLQ with Flink+Kafka

2020-07-24 Thread Tom Fennelly
Thanks Robert. On Fri, Jul 24, 2020 at 2:32 PM Robert Metzger wrote: > Hey Tom, > > I'm not aware of any patterns for this problem, sorry. Intuitively, I > would send dead letters to a separate Kafka topic. > > Best, > Robert > > > On Wed, Jul 22, 2020 at 7:22 PM Tom Fennelly > wrote: > >> Than

Re: Kafka connector with PyFlink

2020-07-24 Thread Wojciech Korczyński
Hi, I've done like you recommended: from pyflink.datastream import StreamExecutionEnvironment from pyflink.dataset import ExecutionEnvironment from pyflink.table import TableConfig, DataTypes, BatchTableEnvironment, StreamTableEnvironment, ScalarFunction from pyflink.table.descriptors import Sche

Printing effective config for flint 1.11 cli

2020-07-24 Thread Senthil Kumar
Hello, My understanding is that flink consumes the config from the config file as well as those specified via the -D option. I assume that the -D will override the values from the config file? Is there a way to somehow see what the effective config is? i.e. print all of the config values that f

Re: REST API randomly returns Not Found for an existing job

2020-07-24 Thread Kostas Kloudas
Hi Tomasz, Thanks a lot for reporting this issue. If you have verified that the job is running AND that the REST server is also up and running (e.g. check the overview page) then I think that this should not be happening. I am cc'ing Chesnay who may have an additional opinion on this. Cheers, Kos

Re: AllwindowStream and RichReduceFunction

2020-07-24 Thread Flavio Pompermaier
In my reduce function I want to compute some aggregation on the sub-results of a map-partition (that I tried to migrate from DataSet to DataStream without success). The original code was something like: input.mapPartition(new RowToStringSketches(sketchMapSize)) // .reduce(new SketchesStri

Re: Question on Pattern Matching

2020-07-24 Thread Kostas Kloudas
Hi Basanth, If I understand your usecase correctly: 1) you want to find all A->B->C->D 2) for all of them you want to know how long it took to complete 3) if one completed within X it is considered ok, if not, it is considered problematic 4) you want to count each one of them One way to go is th

Re: Changing watermark in the middle of a flow

2020-07-24 Thread Kostas Kloudas
Hi Lorenzo, If you want to learn how Flink uses watermarks, it is worth checking [1]. Now in a nutshell, what a watermark will do in a pipeline is that it may fire timers that you may have registered, or windows that you may have accumulated. If you have no time-sensitive operations between the f

Re: Printing effective config for flint 1.11 cli

2020-07-24 Thread Kostas Kloudas
Hi Senthil, You can see the configuration from the WebUI or you can get from the REST API[1]. In addition, if you enable debug logging, you will have a line starting with "Effective executor configuration:" in your client logs (although I am not 100% sure if this will contain all the configuration

Re: Flink 1.11 Simple pipeline (data stream -> table with egg -> data stream) failed

2020-07-24 Thread Dmytro Dragan
Hi Timo, Thank you for response. Well, it was working. We have a number of pipelines in production which reuse DataStream and Table API parts on Flink 1.10, both for stream and batch. The same that simple case without aggregation would work in Flink 1.11 But let`s assume there are some incompati

Re: Kafka connector with PyFlink

2020-07-24 Thread Xingbo Huang
Hi Wojciech, In many cases, you can make sure that your code can run correctly in local mode, and then submit the job to the cluster for testing. For how to add jar packages in local mode, you can refer to the doc[1]. Besides, you'd better use blink planner which is the default planner. For how to

Flink Session TM Logs

2020-07-24 Thread Richard Moorhead
When running a flink session on YARN, task manager logs for a job are not available after completion. How do we locate these logs?

SourceReaderBase not part of flink-core 1.11.1

2020-07-24 Thread Yuval Itzchakov
Hi, I'm implementing a custom SourceReader and want to base it on SourceReaderBase. However, it seems like while SourceReader and Source are part of `flink-core:1.11.1`, SourceReaderBase is not? Do I need an external package for it? -- Best Regards, Yuval Itzchakov.

Re: Flink state reconciliation

2020-07-24 Thread Kostas Kloudas
Hi Alex, Maybe Seth (cc'ed) may have an opinion on this. Cheers, Kostas On Thu, Jul 23, 2020 at 12:08 PM Александр Сергеенко wrote: > > Hi, > > We use so-called "control stream" pattern to deliver settings to the Flink > job using Apache Kafka topics. The settings are in fact an unlimited stre

RE: REST API randomly returns Not Found for an existing job

2020-07-24 Thread Tomasz Dudziak
Yes, the job was running and the REST server as well. No JobMaster failures noticed. I used a test cluster deployed on a bunch of VM's and bare metal boxes. I am afraid, I can no longer reproduce this issue. It occurred a couple days ago and lasted for an entire day with jobs being quite often er

Re: SourceReaderBase not part of flink-core 1.11.1

2020-07-24 Thread Robert Metzger
SourceReaderBase seems to be in flink-connector-base. On Fri, Jul 24, 2020 at 5:59 PM Yuval Itzchakov wrote: > Hi, > I'm implementing a custom SourceReader and want to base it on > SourceReaderBase. However, it seems like while SourceReader and Source are > part of `flink-core:1.11.1`, SourceRea

Fwd: Flink Session TM Logs

2020-07-24 Thread Richard Moorhead
-- Forwarded message - From: Robert Metzger Date: Fri, Jul 24, 2020 at 1:09 PM Subject: Re: Flink Session TM Logs To: Richard Moorhead I accidentally replied to you directly, not to the user@ mailing list. Is it okay for you to publish the thread on the list again? On Fri, Ju

Re: REST API randomly returns Not Found for an existing job

2020-07-24 Thread Kostas Kloudas
Thanks a lot for the update Tomasz and keep up posted if it happens again. Kostas On Fri, Jul 24, 2020 at 6:37 PM Tomasz Dudziak wrote: > > Yes, the job was running and the REST server as well. No JobMaster failures > noticed. > I used a test cluster deployed on a bunch of VM's and bare metal b

Re: Flink 1.11 job stop with save point timeout error

2020-07-24 Thread Ivan Yang
Hi Robert, Below is the job manager log after issuing the “flink stop” command 2020-07-24 19:24:12,388 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator[] - Triggering checkpoint 1 (type=CHECKPOINT) @ 1595618652138 for job 853c59916ac33dfbf46503b33289929e.

Flink 1.11.1 - job manager exists with exit code 0

2020-07-24 Thread Alexey Trenikhun
Hello, I've Flink 1.11.1 session cluster running via docker compose, I upload job jar, when I submit job jobmanager exits without any errors in log: ... {"@timestamp":"2020-07-25T04:32:54.007Z","@version":"1","message":"Starting execution of job katana-fsp (64ff3943fdc5024c5beef1612518c627) und