Unsubscribe

2024-05-12 Thread Mark Petronic

Re: Combining multiple stages into a multi-stage processing pipeline

2024-04-07 Thread Mark Petronic
are each of the 21 scores to the overall score and conditionally out.collect() only the List for the record sets below threshold. Then, subsequent stages can rinse and repeat this process in one clean job graph. Thanks again for you thoughts. They really helped light the light bulb for me :) Mark O

Combining multiple stages into a multi-stage processing pipeline

2024-04-06 Thread Mark Petronic
ices so I am just looking for some confirmation that I am heading down a reasonable path for this design. Thank you in advance, Mark

Guidance on general design approach for Flink

2024-01-30 Thread Mark Petronic
I am working on a new application to perform real time anomaly detection using Flink. I am relatively new to Flink but have already one application in production that is fairly basic and of low throughput. This next one will be more complex and much higher throughput. My query is about handling la

Best practice way to conditionally discard a window and not serialize the results

2023-10-30 Thread Mark Petronic
I am reading stats from Kinesis, deserializing them into a stat POJO and then doing something like this using an aggregated window with no defined processWindow function: timestampedStats .keyBy(v -> v.groupKey()) .window(TumblingEventTimeWindows.of(Time.seconds(appCfg.get

Checkpoint/Savepoint restore of S3 file reads using continuous read mode

2023-10-06 Thread Mark Petronic
I am trying to understand the Flink design pattern for consuming files from S3 continuously as they appear. I have written the below minimal program to do that and it works as expected wrt detecting newly-uploaded S3 files within the configured 5 second monitoring poll period. Then it just prints t

Unsubscribe

2023-07-04 Thread Mark Petronic

Web UI not working with createLocalEnvironmentWithWebUI()

2023-04-16 Thread Mark Petronic
did and it made no difference. org.apache.flink flink-runtime-web ${flink.version} *Can anyone help me understand why the web UI does not work in createLocalEnvironmentWithWebUI()?* Thanks, Mark

Re: IntervalJoin invisibly becomes a regular Join - why?

2023-03-16 Thread mark
l interval join is retained? Many thanks in advance for any insight that anyone can offer. Regards, Mark On Wed, 15 Mar 2023 at 16:02, Leonard Xu wrote: > > > > CREATE TEMPORARY VIEW filteredResults AS > > SELECT * from suspiciousOrders WHERE small_ts > large_ts; > >

IntervalJoin invisibly becomes a regular Join - why?

2023-03-15 Thread mark
erted to an (unsafe) regular join - is that true? Many thanks in advance. Regards, Mark Nuttall

subscribe

2023-03-15 Thread mark
subscribe

Flink Operator create Session Job unsuccessfully

2022-11-30 Thread Mark Lee
Hi, I'm creating a session job using FlinkOperator. The CRD as follows: # Licensed to the Apache Software Foundation (ASF) under one # or more contributor license agreements. See the NOTICE file # dist

How to make flink operator as a cluster operator?

2022-11-28 Thread Mark Lee
Hi all, How to make flink operator as a cluster operator? And How to register it to CVO(Cluster Version Operator)? I didn't find any code or configure file in flink-operator's code. Thank you. Mark

How can I deploy a flink cluster with 4 TaskManagers?

2022-11-25 Thread Mark Lee
Hi all, How can I deploy a flink cluster with 1 Job Manager and 4 Task Managers using FlinkDeployment CR? Such sample in Flink Operator can only create 1 Task Manager. apiVersion: flink.apache.org/v1beta1 kind: FlinkDeployment metadata: name: basic-session-deployment-only-example spec: im

Flink Operator in an off-line k8s enviroment

2022-11-22 Thread Mark Lee
Hi all, I installed flink operator following https://nightlies.apache.org/flink/flink-kubernetes-operator-docs-main/docs/ try-flink-kubernetes-operator/quick-start/. helm repo add flink-operator-repo https://downloads.apache.org/flink/flink-kubernetes-operator-1.2.0/ helm install flink-kubernete

答复: Deploy Flink Operator in an k8s enviroment without helm?

2022-11-17 Thread Mark Lee
Thanks Gyula, My basic OS does not have a helm client and don’t allow install it. Could I deploy flink operator in such situation? 发件人: user-return-51648-lifuqiong00=126@flink.apache.org 代表 Gyula Fóra 发送时间: 2022年11月18日 13:26 收件人: Biao Geng 抄送: Mark Lee ; user@flink.apache.org 主题

答复: Deploy Flink Operator in an k8s enviroment without helm?

2022-11-17 Thread Mark Lee
basic OS does not supply in my scenario. 发件人: Biao Geng 发送时间: 2022年11月18日 13:12 收件人: Mark Lee ; user@flink.apache.org 主题: Re: Deploy Flink Operator in an k8s enviroment without helm? Hi Mark, I believe you can do that without helm. Just like that you can install some software in CentOS

Deploy Flink Operator in an k8s enviroment without helm?

2022-11-17 Thread Mark Lee
Hi all, I am trying to deploy flink operator followed Quick Start . But it seems need helm client installed in Kubernetes environment. Could we deploy flink operator without

答复: Flink Operator in Golang?

2022-11-17 Thread Mark Lee
I got it, Thanks Zhanghao! 发件人: user-return-51640-lifuqiong00=126@flink.apache.org 代表 zhanghao.c...@outlook.com 发送时间: 2022年11月17日 23:36 收件人: Mark Lee ; user@flink.apache.org 主题: Re: Flink Operator in Golang? Hi Mark, 1. Directly quoting from https://cwiki.apache.org

Flink Operator in Golang?

2022-11-17 Thread Mark Lee
Hi, I found we already have Flink operator implemented by java. But I have two questions: 1. If we can implement Flink operator using golang? Is there some hidden difficult traps? 2. We can submit Java jar jobs or sql jobs, can we submit golang jobs? Thank you.

Help with completing Flink SQL please!

2022-01-20 Thread mark
in advance for any help anyone can offer. Regards, Mark

Securing Stateful Functions

2021-10-15 Thread mark
based `org.apache.flink.statefun.sdk.io.Router`. We need help please in knowing how to set up the Router's client connection with the necessary client-side certificates. Very many thanks in advance! Regards, Mark

Re: Resource changed on src filesystem after upgrade

2021-01-29 Thread Mark Davis
thus there is a a timestamp difference. I guess a lot has changed since 1.8 in the YARN deployment area. Too bad there is no clear instruction how to submit a job programmatically every time I have to reverse engineer CliFrontend. Sorry for the confusion and thanks! Mark

Resource changed on src filesystem after upgrade

2021-01-17 Thread Mark Davis
comes from the Hadoop code. I am running a very old version of the HDP platform v.2.6.5 so it might be the one to blame. But the code was working perfectly fine before the upgrade, so I am confused. Could you please advise. Thank you! Mark

Re: Anomalous spikes in aggregations of keyed data

2020-11-30 Thread Kegel, Mark
At the moment we checkpoint every minute. I can turn this frequency down but I’m not sure that will fix/hide the issue. Mark From: Arvid Heise Date: Monday, November 30, 2020 at 2:33 PM To: Kegel, Mark Cc: user@flink.apache.org Subject: Re: Anomalous spikes in aggregations of keyed data Hi

Anomalous spikes in aggregations of keyed data

2020-11-30 Thread Kegel, Mark
We have a high volume (600-700 shards) kinesis data stream that we are doing a simple keying and aggregation on. The logic is very simple: kinesis source, key by fields (A,B,C), window (1-minute, tumbling), aggregate by summing over integer field R, connect to sink. We are seeing some anomalous

Re: Resource leak in DataSourceNode?

2020-08-30 Thread Mark Davis
DataSourceNode not calling the close() method. Cheers, Mark ‐‐‐ Original Message ‐‐‐ On Thursday, August 27, 2020 3:30 PM, Robert Metzger wrote: > Hi Mark, > > Thanks a lot for your message and the good investigation! I believe you've > found a bug in Flink. I filed an issu

Resource leak in DataSourceNode?

2020-08-26 Thread Mark Davis
Hi, I am trying to investigate a problem with non-released resources in my application. I have a stateful application which submits Flink DataSetjobs using code very similar to the code in CliFrontend. I noticed what I am getting a lot of non-closed connections to my data store (HBase in my ca

Re: Asynchronous I/O poor performance

2020-07-09 Thread Mark Zitnik
Jul 2020, 23:31 Arvid Heise, wrote: > Hi Mark, > > Async wait operators cannot be chained to sources so the messages go > through the network stack. Thus, having some latency is normal and cannot > be avoided. It can be tuned though, but I don't think that this is the > iss

Re: Asynchronous I/O poor performance

2020-07-06 Thread Mark Zitnik
t,3L,TimeUnit.SECONDS) resultStream.print() println(env.getConfig.getAutoWatermarkInterval) env.execute("Flink Scala API Skeleton") } } is this normal behavior? On Mon, Jul 6, 2020 at 2:45 PM Benchao Li wrote: > Hi Mark, > > According to your data, I think the config of Async

Re: Asynchronous I/O poor performance

2020-07-05 Thread Mark Zitnik
Hi Benchao The capacity is 100 Parallelism is 8 Rpc req is 20ms Thanks On Sun, 5 Jul 2020, 6:16 Benchao Li, wrote: > Hi Mark, > > Could you give more details about your Flink job? > - the capacity of AsyncDataStream > - the parallelism of AsyncDataStream operator > - the t

Fwd: Asynchronous I/O poor performance

2020-07-04 Thread Mark Zitnik
Hi In my flink application I need to enrich data using AsyncDataStream.unorderedWait but I am getting poor perforce at the beginning I was just working with http call, but I have switched to grpc, I running on 8 core node and getting total of 3200 events per second my service that I am using is no

Re: Run command after Batch is finished

2020-06-09 Thread Mark Davis
every output must check whether other outputs finished already... Quite cumbersome. Also there is a problem with exceptions - the OutputFormats can be never open and never closed. Mark ‐‐‐ Original Message ‐‐‐ On Monday, June 8, 2020 5:50 PM, Chesnay Schepler wrote: > This goes in t

Re: Run command after Batch is finished

2020-06-07 Thread Mark Davis
with the batch code. Maybe a hook to a job manager or similar. I do not plan to run anything heavy there, just some formal cleanups. Is there something like this? Thank you! Mark ‐‐‐ Original Message ‐‐‐ On Saturday, June 6, 2020 4:29 PM, Jeff Zhang wrote: > It would run in

Re: Run command after Batch is finished

2020-06-06 Thread Mark Davis
Hi Jeff, Thank you very much! That is exactly what I need. Where the listener code will run in the cluster deployment(YARN, k8s)? Will it be sent over the network? Thank you! Mark ‐‐‐ Original Message ‐‐‐ On Friday, June 5, 2020 6:13 PM, Jeff Zhang wrote: > You can try JobListe

Run command after Batch is finished

2020-06-05 Thread Mark Davis
- in this mode the finally block is never run. Thank you! Mark

Re: Lack of KeyedBroadcastStateBootstrapFunction

2020-03-30 Thread Mark Niehe
Hi Gordan and Seth, Thanks for explanation and opening up the ticket. I'll add some details in the ticket to explain what we're trying to do which will hopefully add some context. -- <http://segment.com/> Mark Niehe · Software Engineer Integrations <https://segment.com

Lack of KeyedBroadcastStateBootstrapFunction

2020-03-23 Thread Mark Niehe
ction. Because these are the only two options, it's not possible to bootstrap both keyed and broadcast state for the same operator. Are there any plans to add that functionality or did I miss it entirely when going through the API docs? Thanks, -- <http://segment.com/> Mark Niehe · S

Flink Release Security Workflow

2020-03-18 Thread Mark Hapner
Are there any docs/links that describe the security workflow for a Flink release? For instance, the static code scan workflow; pen test workflow; security review of new features; etc. The reason for the question is to better understand how to include Flink within the security workflow of a prod

State Processor API Keyed State

2020-02-18 Thread Mark Niehe
eration.java:141) at org.apache.flink.contrib.streaming.state.RocksDBKeyedStateBackendBuilder.build(RocksDBKeyedStateBackendBuilder.java:270) ... 13 more Has anyone come across this before and figured out a fix? Any help you can give would be greatly appreciated! Thanks, -- <http://segment.com/> Mark Niehe · Software En

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-02-03 Thread Mark Harris
lkFormat? Best regards, Mark From: Kostas Kloudas Sent: 03 February 2020 15:39 To: Mark Harris Cc: Piotr Nowojski ; Cliff Resnick ; David Magalhães ; Till Rohrmann ; flink-u...@apache.org Subject: Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-02-03 Thread Mark Harris
Hi Kostas, Sorry, stupid question: How do I set that for a StreamingFileSink? Best regards, Mark From: Kostas Kloudas Sent: 03 February 2020 14:58 To: Mark Harris Cc: Piotr Nowojski ; Cliff Resnick ; David Magalhães ; Till Rohrmann ; flink-u...@apache.org

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-02-03 Thread Mark Harris
progress was being made. Increasing the memory available to the TM seems to have fixed the problem. I think the DeleteOnExit problem will mean it needs to be restarted every few weeks, but that's acceptable for now. Thanks again, Mark From: Mark Harris

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-01-30 Thread Mark Harris
something else may be taking up the taskmanagers memory which didn't make it into that heap dump. I plan to repeat the analysis on a heapdump created by -XX:+HeapDumpOnOutOfMemoryError shortly. Best regards, Mark From: Piotr Nowojski Sent: 30 Janua

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-01-30 Thread Mark Harris
e a factor? Best regards, Mark From: Piotr Nowojski Sent: 27 January 2020 16:16 To: Cliff Resnick Cc: David Magalhães ; Mark Harris ; Till Rohrmann ; flink-u...@apache.org ; kkloudas Subject: Re: GC overhead limit exceeded, memory full of DeleteOnExit hook

Re: GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-01-27 Thread Mark Harris
Hi Piotr, Thanks for the link to the issue. Do you know if there's a workaround? I've tried setting the following in my core-site.xml: ​fs.s3a.fast.upload.buffer=true To try and avoid writing the buffer files, but the taskmanager breaks with the same problem. Best rega

GC overhead limit exceeded, memory full of DeleteOnExit hooks for S3a files

2020-01-21 Thread Mark Harris
single taskmanager / job that uses the StreamingFileSink crashed with the GC overhead limit exceeded error. I've had a look for advice on handling this error more broadly without luck. Any suggestions or advice gratefully received. Best regards, Mark Harris The information contained i

Re: DataSet API: HBase ScannerTimeoutException and double Result processing

2019-11-25 Thread Mark Davis
g of the HBase exception thrown from ClientScanner.next() is correct. If the call to mapResultToOutType(Result) finished without an error there is no need to restart from the same row. The new scanner should start from the next row. Is that so or am I missing something? Best regards, Mark

DataSet API: HBase ScannerTimeoutException and double Result processing

2019-11-23 Thread Mark Davis
track of all processed Results is not feasible in my case. Here is a simple job demonstrating an issue (HBase scan and RPC timeouts are set to 60 sec) Thank you! Best regards, Mark public static void main(String[] args) throws Exception { ExecutionEnvironment env

Re: Flink zookeeper HA problem

2019-03-11 Thread Harris, Mark
Sometimes it's the simplest things - the 40 or so jobs we have seem to take longer to reload on cluster start up than in flink 1.6, and it was timing out. Increasing the value for the timeout over 5 minutes and everything works again. From: Harris, Mark Sen

Flink zookeeper HA problem

2019-03-07 Thread Harris, Mark
bug would be really gratefully received. Best regards, Mark Harris The information contained in or attached to this email is intended only for the use of the individual or entity to which it is addressed. If you are not the intended recipient, or a person responsible for delivering it to th

Re: KafkaException or ExecutionStateChange failure on job startup

2018-10-26 Thread Mark Harris
Hi Dominik Setting that bit of configuration seems to have done the trick for the MXBean exception. Many thanks for your help. Best regards, Mark On Tue, 23 Oct 2018 at 14:41, Dominik Wosiński wrote: > Hey Mark, > > Do You use more than 1 Kafka consumer for Your jobs? I think thi

KafkaException or ExecutionStateChange failure on job startup

2018-10-23 Thread Mark Harris
heir "Exception" tab in the jobmanager. Is there something that we need to fix in our setup? Are there any implications around missing metrics etc? Best regards, Mark Harris -- hivehome.com <http://www.hivehome.com> Hive | London | Cambridge | Houston | Toronto The infor

Re: Avro serialization problem after updating to flink 1.6.0

2018-10-23 Thread Mark Harris
nk 1.3.2 and 1.6.1? Best regards, Mark On Thu, 4 Oct 2018 at 14:03, Aljoscha Krettek wrote: > Hi, > > can you check whether AlertEvent actually has a field called "SCHEMA$"? > You can do that via > javap path/to/AlertEvent.class > > Best, > Aljoscha >

Avro serialization problem after updating to flink 1.6.0

2018-09-27 Thread Mark Harris
, or work around it would be gratefully received. Best regards, Mark Harris -- hivehome.com <http://www.hivehome.com> Hive | London | Cambridge | Houston | Toronto The information contained in or attached to this email is confidential and intended only for the use of th