Interesting. So if I understand correctly, basically you limited the
parallelism of the sources in order to avoid running the job with constant
backpressure, and then scaled up the windows to maximize throughput.
On Tue, May 4, 2021 at 11:23 PM vishalovercome wrote:
> In one of my jobs, windowin
Hi Sumeet,
This feature is supported in 1.13.0 which was just released and so there is no
documentation about it in 1.12.
Regards,
Dian
> On May 4, 2021, at 2:09 AM, Sumeet Malhotra wrote:
>
> Hi,
>
> Is keyed state [1] supported by PyFlink yet? I can see some code for it in
> the Flink master branch, bu
Hi Prasanna,
in the latest Flink version (1.13.0) I couldn't find these dependencies.
Which version of Flink are you looking at? What you could check is whether
one of these dependencies is contained in one of Flink's shaded
dependencies [1].
[1] https://github.com/apache/flink-shaded
Cheers,
Till
Thanks for managing the release. +1. I like the focus on improving operations
with this version.
From: "Matthias Pohl"
To: "Etienne Chauchot"
CC: "dev" , "Dawid Wysakowicz" ,
"user" , annou...@apache.org
Sent: Tuesday, May 4, 2021 21:53:31
Subject: Re: [ANNOUNCE] Apache Flink 1.13.0
Thanks Dian. Yes, I hadn't looked at the 1.13.0 documentation earlier.
On Wed, May 5, 2021 at 1:46 PM Dian Fu wrote:
> Hi Sumeet,
>
> This feature is supported in 1.13.0 which was just released and so there
> is no documentation about it in 1.12.
>
> Regards,
> Dian
>
> On May 4, 2021, at 2:09 AM, Sumeet Malhotra wrote:
Hi,
I would like to split streamed data from Kafka into 2 streams based on some
filter criteria, using PyFlink Table API. As described here [1], a way to
do this is to use .filter() which should split the stream for parallel
processing.
Does this approach work in Table API as well? I'm doing the
Hi Sumeet,
Yes, this approach also works in Table API.
Could you share which API you use to execute the job? For jobs with multiple
sinks, StatementSet should be used. You could refer to [1] for more details on
this.
Regards,
Dian
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1
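For reference, a minimal sketch of that pattern in the Java Table API (table and sink names here are hypothetical; PyFlink's TableEnvironment offers the same methods as create_statement_set/add_insert):

```java
import static org.apache.flink.table.api.Expressions.$;

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.StatementSet;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class SplitByFilter {
    public static void main(String[] args) {
        TableEnvironment tEnv = TableEnvironment.create(
            EnvironmentSettings.newInstance().inStreamingMode().build());

        // Hypothetical source table, registered elsewhere via DDL.
        Table source = tEnv.from("transactions");

        // Split one input into two streams with complementary filters.
        Table large = source.filter($("amount").isGreater(100));
        Table small = source.filter($("amount").isLessOrEqual(100));

        // One StatementSet so both inserts run as a single job.
        StatementSet set = tEnv.createStatementSet();
        set.addInsert("sink_a", large);
        set.addInsert("sink_b", small);
        set.execute();
    }
}
```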
Hi Robert,
Due to the performance issues with the state processor API, I would probably
like to give it up and am trying StreamingFileSink in a streaming manner.
However, I need to store the files on GCS, where I encountered the error
below. It looks like Flink doesn't support GCS for
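For context, the StreamingFileSink part on its own looks roughly like this (the bucket name is hypothetical, and writing to a gs:// path additionally requires a GCS filesystem implementation, such as the Hadoop GCS connector, to be available to Flink):

```java
import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

// Row-format sink writing strings to a (hypothetical) GCS bucket.
StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path("gs://my-bucket/output"),
                  new SimpleStringEncoder<String>("UTF-8"))
    .build();
```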
Hi,
can you check the client log in the "log/" directory?
The Flink client will try to access the K8s API server to retrieve the
endpoint of the jobmanager. For that, the pod needs to have permissions
(through a service account) to make such calls to K8s. My hope is that the
logs or previous messag
One of these (plexus-utils) is afaik used by maven, so the scanner is
potentially scanning the wrong thing. Or you are scanning all
dependencies downloaded during the build of Flink, including everything
used by various plugins of the build process & maven itself.
On 5/5/2021 11:08 AM, Till Ro
Hi,
> Could it be somehow partition info isn't up to date on TM when job is
> restarting?
Partition info should be up to date or become so eventually - but this
is assuming that JM is able to detect the failure.
The latter may not be the case, as Sihan You wrote previously:
> The strange thing i
To all Scala 2.12 users,
Due to a mistake during the release process of Flink 1.12.3, jars
intended to be built against Scala 2.12 were actually built against
Scala 2.11.
This affects all jars published to maven central; the convenience
binaries are not affected.
Scala 2.12 users are advised
Can someone provide a bit more explanation of why replacing
com.google.common.base.Objects.hashCode() with toString().hashCode()
makes it work?
Hi,
I'm assuming it's just a workaround for changing fields. The string
representation happens to be stable while the underlying values change.
It's best practice to use completely immutable types. If you have similar
issues, you should double-check that nothing can be changed in your data
type or
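As an illustration of that best practice, a hypothetical fully immutable key type; because no field can change after construction, its hashCode is stable for the lifetime of the record:

```java
// Hypothetical key type: all fields final, no setters, hashCode derived
// only from immutable state.
public final class OrderKey {
    private final String orderId;

    public OrderKey(String orderId) {
        this.orderId = orderId;
    }

    @Override
    public int hashCode() {
        return orderId.hashCode(); // stable: orderId can never change
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof OrderKey && ((OrderKey) o).orderId.equals(orderId);
    }
}
```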
Thanks for the reply. Here is an updated exception with DEBUG on. It
appears to be timing out:
2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.kubeclient.FlinkKubeClientFactory [] -
Setting namespace of Kubernetes client to cmdaa
2021-05-05 16:56:19,700 DEBUG
org.apache.flink.kubernetes.
Well, I was thinking you could have avoided overwhelming your internal
services by using something like Flink's async i/o operator, tuned to limit
the total number of concurrent requests. That way the pipeline could have
uniform parallelism without overwhelming those services, and then you'd
rely o
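A rough sketch of that setup (the service client and its query method are hypothetical placeholders):

```java
import java.util.Collections;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

import org.apache.flink.streaming.api.datastream.AsyncDataStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.async.ResultFuture;
import org.apache.flink.streaming.api.functions.async.RichAsyncFunction;

public class ThrottledEnrichment extends RichAsyncFunction<String, String> {

    @Override
    public void asyncInvoke(String key, ResultFuture<String> resultFuture) {
        // MyServiceClient.query() is a hypothetical non-blocking call.
        CompletableFuture
            .supplyAsync(() -> MyServiceClient.query(key))
            .thenAccept(result -> resultFuture.complete(Collections.singleton(result)));
    }

    public static DataStream<String> enrich(DataStream<String> input) {
        // The final capacity argument (100) caps in-flight requests, so the
        // pipeline backpressures instead of overwhelming the service.
        return AsyncDataStream.unorderedWait(
            input, new ThrottledEnrichment(), 30, TimeUnit.SECONDS, 100);
    }
}
```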
Hi Vishal,
WRT “bring down our internal services” - a common pattern with making requests
to external services is to measure latency, and throttle (delay) requests in
response to increased latency.
You’ll see this discussed frequently on web crawling forums as an auto-tuning
approach.
Typical
Okay, it appears to have resolved 10.43.0.1:30081 as the address of the
JobManager. Most likely, the container cannot access this address. Can you
validate this from within the container?
If I understand the Flink documentation correctly, you should be able to
manually specify rest.address, rest.
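A sketch of that override in flink-conf.yaml (both values are placeholders to adjust to an address the client can actually reach):

```yaml
rest.address: <reachable-jobmanager-address>
rest.port: 8081
```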
Hi Ragini,
Since this exception is coming from the HBase client, I assume the issue
has nothing to do with Flink directly.
I would recommend carefully studying the HBase client configuration
parameters, maybe set up a simple Java application that "hammers" data into
HBase at a maximum rate to under
Hi,
Bit of background: I have a stream of customers who have purchased some
product, reading these transactions from a Kafka topic. I want to aggregate
the number of products the customer has purchased in a particular duration
(say 10 seconds) and write to a sink.
I am using session windows to ac
Yes. While back-pressure would eventually ensure high throughput, hand-tuning
parallelism became necessary because the job with high source parallelism
would immediately bring down our internal services - not giving Flink enough
time to adjust the in-rate. Plus, running all operators at such a hi
Hi Arvid,
I sent a separate mail titled "Session Windows - not working as expected";
closing this thread.
Please have a look when you have a few minutes, much appreciated.
Regards,
Swagat
On Wed, May 5, 2021 at 7:24 PM Swagat Mishra wrote:
> Hi Arvid,
>
> Tried a small POC to reproduce the b
Hi Arvid,
I sent a separate mail titled "Session Windows - not working as expected"
(to the user community).
All other details are here if you need them; closing this thread.
Please have a look when you have a few minutes, much appreciated.
Regards,
Swagat
On Thu, May 6, 2021 at 1:50 AM Swagat Mis
Adding the code for CustomWatermarkGenerator:

@Override
public void onEvent(Customer customer, long timestamp, WatermarkOutput watermarkOutput) {
    // Track the highest event time seen so far.
    currentMaxTimestamp = Math.max(currentMaxTimestamp, customer.getEventTime());
}

@Override
public void onPeriodicEmit(WatermarkOutput watermarkOutput) {
    // Periodically emit a watermark trailing the highest event time seen.
    watermarkOutput.emitWatermark(new Watermark(currentMaxTimestamp - 1));
}
This seems to be working fine in processing time but doesn't work in event
time. Is there an issue with the way the watermark is defined or do we
need to set up timers?
Please advise.
WORKING:
customerStream
    .keyBy((KeySelector) Customer::getIdentifier)
    .window(ProcessingTimeSessionWindows.withGap(Time.seconds(10)))
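For the event-time variant, a minimal sketch of how the custom generator could be wired in (the 10-second gap and the aggregate function are assumptions, not taken from the thread):

```java
customerStream
    .assignTimestampsAndWatermarks(
        WatermarkStrategy
            .<Customer>forGenerator(ctx -> new CustomWatermarkGenerator())
            // The assigner tells Flink which field carries the event time.
            .withTimestampAssigner((customer, ts) -> customer.getEventTime()))
    .keyBy(Customer::getIdentifier)
    .window(EventTimeSessionWindows.withGap(Time.seconds(10)))
    .aggregate(new ProductCountAggregate()); // hypothetical AggregateFunction
```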
Thanks Dawid & Guowei for the great work, thanks everyone involved.
Best,
Leonard
> On May 5, 2021, at 17:12, Theo Diefenthal wrote:
>
> Thanks for managing the release. +1. I like the focus on improving operations
> with this version.
>
> From: "Matthias Pohl"
> To: "Etienne Chauchot"
> CC: "dev" ,
Hi Yik San,
You can check whether the execution environment used is
`LocalStreamEnvironment`; you can get the class object corresponding to
the underlying Java object through py4j in PyFlink. You can take a look
at the example I wrote below; I hope it helps.
```
from pyflink.table impo
Hi Shipeng,
Matthias is correct. FLINK-18202 should address this topic. There is
already a pull request there which is in good shape. You can also download
the PR and build the format jar yourself, and then it should work with
Flink 1.12.
Best,
Jark
On Mon, 3 May 2021 at 21:41, Matthias Pohl wr
It seems that you are using the NodePort to expose the rest service. If you
only want to access the Flink UI/rest inside the K8s cluster,
then I would suggest setting "kubernetes.rest-service.exposed.type" to
"ClusterIP", because we are using the K8s master node to
construct the JobManager rest endpoint
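For example, in flink-conf.yaml:

```yaml
kubernetes.rest-service.exposed.type: ClusterIP
```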
Hi Ragini,
How did you submit your job? The exception here is most likely caused
by `flink-clients` not being included in the classpath on the client side.
If the job is submitted via the Flink CLI, namely `flink run -c <main-class> xx.jar`,
it should be included by default, and if the job is submitted programmatically,
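In the programmatic case, one thing to check is that the flink-clients artifact is declared as a dependency. A sketch of what that could look like in Maven (the Scala suffix and version are assumptions matching a 1.13.0 / Scala 2.12 setup):

```xml
<!-- Hypothetical coordinates; match the version/Scala suffix to your cluster. -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-clients_2.12</artifactId>
  <version>1.13.0</version>
</dependency>
```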
Hi Sumeet,
I think you might first convert the table back to a DataStream [1],
then define the timestamp and watermark with
`assignTimestampsAndWatermarks(...)`,
and then convert it back to a table [2].
Best,
Yun
[1]
https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/common/
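A sketch of that round trip in the Java API (`tEnv` is a StreamTableEnvironment; the "ts" column name and the 5-second out-of-orderness bound are assumptions):

```java
// 1. Table -> DataStream.
DataStream<Row> stream = tEnv.toDataStream(inputTable);

// 2. Attach timestamps and watermarks on the stream.
DataStream<Row> withWatermarks = stream.assignTimestampsAndWatermarks(
    WatermarkStrategy.<Row>forBoundedOutOfOrderness(Duration.ofSeconds(5))
        .withTimestampAssigner((row, ts) -> (long) row.getField("ts")));

// 3. DataStream -> Table, exposing the stream's watermark to the planner.
Table result = tEnv.fromDataStream(
    withWatermarks,
    Schema.newBuilder()
        .columnByMetadata("rowtime", "TIMESTAMP_LTZ(3)")
        .watermark("rowtime", "SOURCE_WATERMARK()")
        .build());
```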
Hi Xingbo,
Thank you!
On Thu, May 6, 2021 at 10:01 AM Xingbo Huang wrote:
> Hi Yik San,
> You can check whether the execution environment used is
> `LocalStreamEnvironment` and you can get the class object corresponding to
> the corresponding java object through py4j in PyFlink. You can take a
Hi,
You could trigger a savepoint via the REST API [1] or refer to SavepointITCase [2]
to see how to trigger a savepoint in test code.
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/ops/rest_api/#jobs-jobid-savepoints
[2]
https://github.com/apache/flink/blob/c688bf3c83e72155ccf5
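For reference, triggering a savepoint through the REST API is a POST to the savepoints endpoint; a sketch (host, job id, and target directory are placeholders):

```
# Returns a trigger id that can be polled at /jobs/<job-id>/savepoints/<trigger-id>.
curl -X POST http://<jobmanager>:8081/jobs/<job-id>/savepoints \
  -H "Content-Type: application/json" \
  -d '{"target-directory": "s3://bucket/savepoints", "cancel-job": false}'
```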
Thanks Dawid & Guowei as the release managers, and everyone who has
contributed to this release.
Thank you~
Xintong Song
On Thu, May 6, 2021 at 9:51 AM Leonard Xu wrote:
> Thanks Dawid & Guowei for the great work, thanks everyone involved.
>
> Best,
> Leonard
>
> On May 5, 2021, at 17:12, Theo Diefenthal wrote:
Hi Users,
Is there a way that checkpointed data from Flink 1.9 can be read using the
State Processor API?
The docs [1] say: when reading operator state, users specify the operator
uid, the state name, and the type information.
What is the type for the Kafka operator, which needs to be specified whil
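As a shape-of-the-API sketch only (the uid, state name, and state type below are placeholders, not the actual values used by the Kafka connector):

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
ExistingSavepoint savepoint =
    Savepoint.load(env, "hdfs:///path/to/checkpoint", new MemoryStateBackend());

// Sources typically keep their offsets in union list state; names/types
// here are placeholders.
DataSet<Tuple2<String, Long>> offsets = savepoint.readUnionState(
    "kafka-source-uid",
    "some-state-name",
    TypeInformation.of(new TypeHint<Tuple2<String, Long>>() {}));
```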
Hi,
Is there a working example somewhere that I can refer to for writing Avro
entities to Flink state, as well as Avro serialization in KafkaConsumer/Producer?
I tried to use Avro entities directly, but there is an issue beyond Apache Avro
1.7.7 in that the entities created have a serialVersionUid.
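For the Kafka side, one commonly used building block is flink-avro's AvroDeserializationSchema; a minimal sketch (MyAvroRecord stands in for a generated SpecificRecord class):

```java
import java.util.Properties;

import org.apache.flink.formats.avro.AvroDeserializationSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder

// Deserialize Kafka records directly into the generated Avro class.
FlinkKafkaConsumer<MyAvroRecord> consumer = new FlinkKafkaConsumer<>(
    "my-topic",
    AvroDeserializationSchema.forSpecific(MyAvroRecord.class),
    props);
```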
Thanks to Dawid and Guowei for the great work, and thanks to everyone involved
in this release.
Best
Yun Tang
From: Xintong Song
Sent: Thursday, May 6, 2021 12:08
To: user ; dev
Subject: Re: [ANNOUNCE] Apache Flink 1.13.0 released
Thanks Dawid & Guowei as the rele
Hi,
Is your CustomerGenerator setting the event timestamp correctly? Are your
evictors evicting too early?
You can try to add some debug output into the watermark assigner and see if
it's indeed progressing as expected.
On Thu, May 6, 2021 at 12:48 AM Swagat Mishra wrote:
> This seems to be wo
Yes, the customer generator is setting the event timestamp correctly, as I see
below. I debugged and found that the events are getting late, so they are
never processed. I.e., in the window operator the method this.isWindowLate(
actualWindow) evaluates to false for the rest of the events
except the fi