Re: Job Logs - Yarn Application Mode

2022-05-19 Thread Shengkai Fang
Thanks for Biao's explanation. Best, Shengkai Biao Geng 于2022年5月20日周五 11:16写道: > Hi there, > @Zain, Weihua's suggestion should be able to fulfill the request to check > JM logs. If you do want to use YARN cli for running Flink applications, it > is possible to check JM's log with the YARN comma

Re: Could not copy native libraries - Permission denied

2022-05-19 Thread yu'an huang
What is your deployment mode, on yarn, Kubernetes or standalone? Can you provide more logs about this error? > On 18 May 2022, at 4:07 PM, Zain Haider Nemati wrote: > > Hi, > We are using flink version 1.13 with a kafka source and a kinesis sink with a > parallelism of 3. > On submitting the

Re: Applying backpressure to limit state memory consumption

2022-05-19 Thread yu'an huang
H Robini, In my experience, the state size of memory state backend is limit by the heap memory. See this link for details: https://nightlies.apache.org/flink/flink-docs-master/docs/ops/state/state_backends/ “When deciding between HashMapStateBackend and RocksDB, it is a choice between performan

Re: Job Logs - Yarn Application Mode

2022-05-19 Thread Biao Geng
Hi there, @Zain, Weihua's suggestion should be able to fulfill the request to check JM logs. If you do want to use YARN cli for running Flink applications, it is possible to check JM's log with the YARN command like: *yarn logs -applicationId application_xxx_yyy -am -1 -logFiles jobmanager.log* For

Re: Does kafka key is supported in kafka sink table

2022-05-19 Thread Shengkai Fang
Hi. Yes. Flink supports to write the value to the Kafka record key parts. You just need to specify which column belongs to the key in the WITH blocks, e.g. ``` CREATE TABLE kafka_sink ( ... ) WITH ( `key.fields` = 'id' ); ``` [1] https://nightlies.apache.org/flink/flink-docs-master/docs/conne

Re: Job Logs - Yarn Application Mode

2022-05-19 Thread Shengkai Fang
Hi. I am not familiar with the YARN application mode. Because the job manager is started when submit the jobs. So how can users know the address of the JM? Do we need to look up the Yarn UI to search the submitted job with the JobID? Best, Shengkai Weihua Hu 于2022年5月20日周五 10:23写道: > Hi, > You

Re: How to KafkaConsume from Particular Partition in Flink(version 1.14.4)

2022-05-19 Thread Shengkai Fang
Hi, If you use SQL API, you can specify the partition in the DDL[1] and filter out the record that you don't need. ``` CREATE TABLE KafkaSource ( ... `partition` METADATA ) WITH ( ... ); SELECT * FROM KafkaSource WHERE partition = 1; ``` Best, Shengkai [1] https://nightlies.apache.o

Flink - SQL Tumble End on event time not returning any result

2022-05-19 Thread Raghunadh Nittala
Hi Team, I have a Flink job that consumes from a kafka topic and tries to create windows (using Tumble) based on few columns like eventId and eventName. Kafka topic has data in format of comma separated values like below: event1,Util1,1647614467000,0.12 event1,Util1,1647614527000,0.26 event1,Util

Re: Job Logs - Yarn Application Mode

2022-05-19 Thread Weihua Hu
Hi, You can get the logs from Flink Web UI if job is running. Best, Weihua > 2022年5月19日 下午10:56,Zain Haider Nemati 写道: > > Hey All, > How can I check logs for my job when it is running in application mode via > yarn

Re: Incorrect checkpoint id used when job is recovering

2022-05-19 Thread yuxia
There's a simliar issue FLINK-19816[1] [1] [ https://issues.apache.org/jira/browse/FLINK-19816 | https://issues.apache.org/jira/browse/FLINK-19816 ] Best regards, Yuxia 发件人: "tao xiao" 收件人: "User" 发送时间: 星期四, 2022年 5 月 19日 下午 9:16:34 主题: Re: Incorrect checkpoint id used when job is rec

Re: Kinesis Sink - Data being received with intermittent breaks

2022-05-19 Thread Ber, Jeremy
Hi Zain— Are you seeing any data loss present within the Flink Dashboard subtasks of each task? On the bottom of your dashboard you should see data going from each blue box to the next. Is this a comprehensive set of data? Meaning do you see 80M from the source -> first operator -> second opera

Re: Checkpointing - Job Manager S3 Head Requests are 404s

2022-05-19 Thread Aeden Jameson
Great, that all makes sense to me. Thanks again. On Thu, May 19, 2022 at 11:42 AM David Anderson wrote: > > Sure, happy to try to help. > > What's happening with the hadoop filesystem is that before it writes each key > it checks to see if the "parent directory" exists by checking for a key with

Re: Checkpointing - Job Manager S3 Head Requests are 404s

2022-05-19 Thread David Anderson
Sure, happy to try to help. What's happening with the hadoop filesystem is that before it writes each key it checks to see if the "parent directory" exists by checking for a key with the prefix up to the last "/", and if that key isn't found it then creates empty marker files to cause of that pare

Re: Checkpointing - Job Manager S3 Head Requests are 404s

2022-05-19 Thread Aeden Jameson
Thanks for the response David. That's the conclusion I came to as well. The Hadoop plugin behavior doesn't appear to reflect more recent changes to S3 like strong read-after-write consistency, https://aws.amazon.com/blogs/aws/amazon-s3-update-strong-read-after-write-consistency/ . Given the impro

Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread Aeden Jameson
Thanks for the response David. I'm using Flink 1.13.5. >> For point 1 the behavior you are seeing is what is expected. Great. That's what I concluded after digging into things a little more. This helps me be sure I just didn't miss some other configuration. Thank you. >> For point 2, I'm not sur

Re: Checkpointing - Job Manager S3 Head Requests are 404s

2022-05-19 Thread David Anderson
Aeden, this is probably happening because you are using the Hadoop implementation of S3. The Hadoop S3 filesystem tries to imitate a filesystem on top of S3. In so doing it makes a lot of HEAD requests. These are expensive, and they violate read-after-create visibility, which is what you seem to b

Job Logs - Yarn Application Mode

2022-05-19 Thread Zain Haider Nemati
Hey All, How can I check logs for my job when it is running in application mode via yarn

Re: HTTP REST API as Ingress/Egress

2022-05-19 Thread Austin Cawley-Edwards
Hi Himanshu, Unfortunately, this is not supported by Statefun, though this type of application should not be too difficult to using something like the Kafka Request/Reply pattern[1], and putting that in front of a Statefun cluster. Best, Austin [1]: https://dzone.com/articles/synchronous-kafka-u

Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread David Anderson
Aeden, I want to expand my answer after having re-read your question a bit more carefully. For point 1 the behavior you are seeing is what is expected. With hadoop the metadata written by the job manager will literally include "_entropy_" in its path, while this will be replaced in paths of any a

Re: How to KafkaConsume from Particular Partition in Flink(version 1.14.4)

2022-05-19 Thread Weihua Hu
Hi Harshit, FlinkKafkaConsumer does not support consuming a particular partition of a topic. Best, Weihua > 2022年5月18日 下午5:02,harshit.varsh...@iktara.ai 写道: > > particular

Introducing Statefun Tsukuyomi

2022-05-19 Thread Tymur Yarosh
Hello guys, I've created a library to help narrow integration testing of Stateful Functions applications written in Java. It utilizes Stateful Functions' RequestReply protocol and provides Java DSL to test the function. Statefun Tsukuyomi sets up the function under test with the initial state,

Applying backpressure to limit state memory consumption

2022-05-19 Thread Robin Cassan
Hey all! I have a conceptual question on the DataStream API: when using an in-memory state backend (like the HashMapStateBackend), how can you ensure that the hashmap won't grow uncontrollably until OutOfMemory happens? In my case, I would be consuming from a Kafka topic, into a SessionWindow. The

Re: Incorrect checkpoint id used when job is recovering

2022-05-19 Thread tao xiao
Hi team, Can anyone shed some light? On Sat, May 14, 2022 at 8:56 AM tao xiao wrote: > Hi team, > > Does anyone have any ideas? > > On Thu, May 12, 2022 at 9:20 PM tao xiao wrote: > >> Forgot to mention the Flink version is 1.13.2 and we use kubernetes >> native mode >> >> On Thu, May 12, 2022

Re: Confusing S3 Entropy Injection Behavior

2022-05-19 Thread David Anderson
This sounds like it could be FLINK-17359 [1]. What version of Flink are you using? Another likely explanation arises from the fact that only the checkpoint data files (the ones created and written by the task managers) will have the _entropy_ replaced. The job manager does not inject entropy into

Re:Window aggregation fails after upgrading to Flink 1.15

2022-05-19 Thread Xuyang
Hi, can you provide the DDL of your source table? I test your query in my idea and it works. Here is my code. createtemporarytable myTable( userid int, pageid int, p_userid string, rowtime asproctime() ) with ( 'connector'='datagen' ); SELECT window_start, window_end, userid, count(pageid) AS

Re: Flink application on yarn cluster - main method not found

2022-05-19 Thread Weihua Hu
Hi, Which version of flink are you using? It looks like there is a conflict between the flink version of the cluster and the version in userjar Best, Weihua > 2022年5月19日 下午4:49,Zain Haider Nemati 写道: > > Hi, > Im running flink application on yarn cluster it is giving me this error, it > is w

Re: Flink application on yarn cluster - main method not found

2022-05-19 Thread Zain Haider Nemati
Hi Folks, Would appreciate it if someone could help me out with this ! Cheers On Thu, May 19, 2022 at 1:49 PM Zain Haider Nemati wrote: > Hi, > Im running flink application on yarn cluster it is giving me this error, > it is working fine on standalone cluster. Any idea what could be causing > t

Final reminder: ApacheCon North America call for presentations closing soon

2022-05-19 Thread Rich Bowen
[Note: You're receiving this because you are subscribed to one or more Apache Software Foundation project mailing lists.] This is your final reminder that the Call for Presetations for ApacheCon North America 2022 will close at 00:01 GMT on Monday, May 23rd, 2022. Please don't wait! Get your talk

Re: HTTP REST API as Ingress/Egress

2022-05-19 Thread Himanshu Sareen
Hi All, It will be of great help if someone can share views. As per application design. Synchronous access to a stateful fucntion. 1. Application will access/invoke a stateful function via a HTTP call. 2. Application will wait for an response. 3. Once Stateful function completes the ex

Flink Stateful Function - Regex Match in State Key

2022-05-19 Thread Himanshu Sareen
Hi All, My understanding is Flink uses exact match on key to fetch/load state in a stateful functions. But is it possible to use a regex expression in target-id or as a key to a stateful function, thus fetching/loading all matching states. Regards, Himanshu

Flink application on yarn cluster - main method not found

2022-05-19 Thread Zain Haider Nemati
Hi, Im running flink application on yarn cluster it is giving me this error, it is working fine on standalone cluster. Any idea what could be causing this? Exception in thread "main" java.lang.NoSuchMethodError: org.apache.flink.client.deployment.application.ClassPathPackagedProgramRetriever.newBu

RE: Checkpoint directories not cleared as TaskManagers run

2022-05-19 Thread Schwalbe Matthias
Hi James, Let me give some short answers, (there is documentation that better describes this): >> - why do taskmanagers create the chk-x directory but only the jobmanager can >> delete it? Shouldn’t the jobmanager be the only component creating and >> deleting these directories? That would see

Re: Kafka Sink Key and Value Avro Schema Usage Issues

2022-05-19 Thread Peter Schrott
Hi Ghiy, I am not quite sure about your actual problem, why the schema is not generated as expected. I also needed to work with the Kafka keys in the business logic, therefore I found a way to deserialize and serialize the key along with the event itself by overriding KafkaRecord[De]Serialization