from:"Felipe Gutierrez"

Error on deploying Flink docker image with Kubernetes (minikube) and automatically launch a stream WordCount job.

2020-09-18 Thread Felipe Gutierrez

loyment/kubernetes.html [2] https://github.com/apache/flink/blob/master/flink-examples/flink-examples-streaming/src/main/java/org/apache/flink/streaming/examples/wordcount/WordCount.java -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: Error on deploying Flink docker image with Kubernetes (minikube) and automatically launch a stream WordCount job.

2020-09-23 Thread Felipe Gutierrez

thanks Yang, I got to put it to work in the way that you said. https://github.com/felipegutierrez/explore-flink Best, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Thu, Sep 24, 2020 at 6:59 AM Yang Wang wrote: > > Hi Felipe, > &g

How can I increase the parallelism on the Table API for Streaming Aggregation?

2020-10-08 Thread Felipe Gutierrez

e [2]. Thanks, Felipe [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html [2] https://github.com/felipegutierrez/explore-flink/blob/master/docker/ops-playground-image/java/explore-flink/src/main/java/org/sense/flink/examples/table/TaxiRideCountTable.java -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How can I increase the parallelism on the Table API for Streaming Aggregation?

2020-10-09 Thread Felipe Gutierrez

thanks! I will test -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Thu, Oct 8, 2020 at 6:19 PM Khachatryan Roman wrote: > > Hi Felipe, > > Your source is not parallel so it doesn't make sense to make local group > operato

Stream aggregation using Flink Table API (Blink plan)

2020-11-09 Thread Felipe Gutierrez

ead of an unbounded window as the example presents? Thanks! Felipe [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html#split-distinct-aggregation [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html#minibatch-aggregation -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-09 Thread Felipe Gutierrez

I realized that I forgot the image. Now it is attached. -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Mon, Nov 9, 2020 at 1:41 PM Felipe Gutierrez wrote: > > Hi community, > > I am testing the "Split Distinct Aggregation"

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-10 Thread Felipe Gutierrez

it deterministic? [1] https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/runtime/tasks/ProcessingTimeCallback.html Thanks -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Tue, Nov 10, 2020 at 7:55 AM Jark Wu

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-10 Thread Felipe Gutierrez

I see, thanks Timo -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Tue, Nov 10, 2020 at 3:22 PM Timo Walther wrote: > > Hi Felipe, > > with non-deterministic Jark meant that you never know if the mini batch > timer (every 3 s) or

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-12 Thread Felipe Gutierrez

uot;id" : 3, "ship_strategy" : "FORWARD", "side" : "second" } ] }, { "id" : 5, "type" : "LocalGroupAggregate(groupBy=[taxiId], select=[taxiId, COUNT(passengerCnt) AS count$0])", "pact"

Re: Stream aggregation using Flink Table API (Blink plan)

2020-11-12 Thread Felipe Gutierrez

I see. now it has different query plans. It was documented on another page so I got confused. Thanks! -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Thu, Nov 12, 2020 at 12:41 PM Jark Wu wrote: > > Hi Felipe, > > The defa

Help on the Split Distinct Aggregation from Table API

2020-12-10 Thread Felipe Gutierrez

ation but still with backpressure. Then when I change to split optimization I get low backpressure. Is there something wrong with my query or my data? Thanks, Felipe [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html#split

Re: Help on the Split Distinct Aggregation from Table API

2020-12-10 Thread Felipe Gutierrez

I just realized that i have to use the dayOfTheYear on the gropuBy. I will test again. On Thu, 10 Dec 2020, 18:48 Felipe Gutierrez, wrote: > Hi, > > I am trying to understand and simulate the "Split Distinct > Aggregation" [1] from Table API. I am executing the quer

Trying to simulate the Split Distinct Aggregation optimizations from Table API

2020-12-14 Thread Felipe Gutierrez

roughput reaches only 4K. I think that the problem is in my data that the query with distinct is consuming. So, how should I prepare the data to see the optimization of split distinct take effect? Thanks, Felipe [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming

Why do I get this error when instantiating an akka ActorSystem for the Flink-JobManager?

2021-01-18 Thread Felipe Gutierrez

oader.loadClass(ClassLoader.java:418) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:351) ... 20 more [1] https://github.com/apache/flink/blob/master/flink-runtime/src/main/java/org/apache/flink/runtime/jobmaster/JobManagerRunnerI

what is the hash function that Flink creates the UID?

2020-03-02 Thread Felipe Gutierrez

. AFAIK, Flink does not provide a way to create an operator ID that has the operator name included [1][2]. Is there a specific reason for that? [1] https://issues.apache.org/jira/browse/FLINK-8592 [2] https://issues.apache.org/jira/browse/FLINK-9653 *--* *-- Felipe Gutierrez* *-

Setting the operator-id to measure percentile latency over several jobs

2020-03-05 Thread Felipe Gutierrez

ways the same hash when I restart the job, but I would like to set its name. [1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#latency-tracking [2] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#rest-api-int

Re: Setting the operator-id to measure percentile latency over several jobs

2020-03-05 Thread Felipe Gutierrez

thanks! I was wondering why the operator name is not implemented for the latency metrics, because for the other metrics it is implemented. but thanks anyway! *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <ht

How do I get the value of 99th latency inside an operator?

2020-03-05 Thread Felipe Gutierrez

Flink source code to export those values to my own operator. Nevertheless, it is what I need. Kind Regards, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>*

Backpressure and 99th percentile latency

2020-03-05 Thread Felipe Gutierrez

hour the 99th percentile latency got down to milliseconds. Is that normal? Please see the figure attached. [1] https://flink.apache.org/2019/07/23/flink-network-stack-2.html#latency-tracking Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https

Re: Backpressure and 99th percentile latency

2020-03-06 Thread Felipe Gutierrez

throughput of the sources yet. I am changing the size of the window without restart the job. But I guess they have the same meaning for this question. [1] https://flink.apache.org/2019/07/23/flink-network-stack-2.html -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https

Re: Backpressure and 99th percentile latency

2020-03-09 Thread Felipe Gutierrez

default latency tracking from Flink). Thanks for the insight points! Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Sat, Mar 7, 2020 at 4:36 PM Zhijiang wrote: > > Thanks for the feedback Felipe! > Regarding with your belo

How do I get the outPoolUsage value inside my own stream operator?

2020-03-16 Thread Felipe Gutierrez

e. Even when I click on the Backpressure UI Interface. Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How do I get the outPoolUsage value inside my own stream operator?

2020-03-17 Thread Felipe Gutierrez

metricGroup = taskMetricGroup.getGroup("buffers"); Gauge gauge = (Gauge) metricGroup.getMetric("outPoolUsage"); if (gauge != null && gauge.getValue() != null) { float outPoolUsage = gauge.getValue().floatValue(); this.outPoolUsageHistogram.update((long) (outPoolUsage * 100))

Is there a good benchmark for Flink Stream API?

2020-04-20 Thread Felipe Gutierrez

/taxiData.html [2] https://github.com/dataArtisans/flink-benchmarks Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

How do I get the IP of the master and slave files programmatically in Flink?

2020-05-20 Thread Felipe Gutierrez

ry rest.address: " + jobMasterConfiguration.getConfiguration().getValue(restAddressOption)); System.out.println("rpcService: " + rpcService.getAddress()); Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-21 Thread Felipe Gutierrez

- > > > > Ververica GmbH | Invalidenstrasse 115, 10115 Berlin, Germany > > > > -- > > > > Ververica GmbH > > Registered at Amtsgericht Charlottenburg: HRB 158244 B > > Managing Directors: Timothy Alexander Steinert, Yip Park Tung Jason, Ji >

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-22 Thread Felipe Gutierrez

("rest.address") .stringType() .noDefaultValue(); String restAddress = this.getRuntimeContext().getTaskEnvironment().getTaskManagerInfo().getConfiguration().getValue(restAddressOption); Thanks! -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On F

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-25 Thread Felipe Gutierrez

ok, I see. Do you suggest a better approach to send messages from the JobManager to the TaskManagers and my specific operator? Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Mon, May 25, 2020 at 4:23 AM Yangze Guo wrote: > >

How can I set the parallelism higher than the task slot number in more machines?

2020-05-25 Thread Felipe Gutierrez

llelism(16) but I got the same result. 32 subtasks of the same operator. Thanks, Felipe [1] https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.html#taskmanager-numberoftaskslots -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How can I set the parallelism higher than the task slot number in more machines?

2020-05-25 Thread Felipe Gutierrez

Solved! that was because I was using slotSharingGroup() in all operators to ensure that they stay in the same task slot. I guess Flink was creating dummy operators to ensure that. Thanks anyway. -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Mon

Executing a controllable benchmark in Flink

2020-05-27 Thread Felipe Gutierrez

ark for stream processing? Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

How do I make sure to place operator instances in specific Task Managers?

2020-05-28 Thread Felipe Gutierrez

ot; and "slotSharingGroup()" to define it but both source01 and source02 are placed in TM-01 and map01 is split into 2 TMs. The same with map02. Thanks, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How do I make sure to place operator instances in specific Task Managers?

2020-05-29 Thread Felipe Gutierrez

because I am measuring one operator (all instances) and I want to place its downstream operators in another machine in order to use network channels. -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Fri, May 29, 2020 at 4:59 AM Weihua Hu wrote

Re: How do I make sure to place operator instances in specific Task Managers?

2020-05-29 Thread Felipe Gutierrez

ke this bellow on the four TMs. taskmanager.numberOfTaskSlots: 4 parallelism.default: 4 Maybe if I use different numberOfTaskSlots on different TMs would it work? -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Fri, May 29, 2020 at 9:00 AM Fe

Re: Executing a controllable benchmark in Flink

2020-05-29 Thread Felipe Gutierrez

tartTime + this.delayInNanoSeconds; while (System.nanoTime() < deadLine) ; } Thanks! -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Fri, May 29, 2020 at 12:46 PM Robert Metzger wrote: > > Hi Felipe, > > the file is just 80 MBs. It is

Re: How do I make sure to place operator instances in specific Task Managers?

2020-06-03 Thread Felipe Gutierrez

rom flink-conf.yaml). So I need 8 slots in each TM. When I use one slotSharingGroup for source, map, and flatmap, and other slotSharingGroup for the reducer, and parallelism of 16, somehow Grafana is showing to me more than 16 parallel instances of the operators. \Felipe -- -- Felipe Gutierrez

How to add org.joda.time.DateTime in the StreamExecutionEnvironment to the TaxiRide training example?

2020-06-12 Thread Felipe Gutierrez

cises/blob/master/src/main/java/com/ververica/flinktraining/exercises/datastream_java/datatypes/TaxiRide.java [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/custom_serializers.html -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: How to add org.joda.time.DateTime in the StreamExecutionEnvironment to the TaxiRide training example?

2020-06-13 Thread Felipe Gutierrez

.java:390) at org.apache.flink.runtime.rest.RestClient.lambda$submitRequest$3(RestClient.java:374) at java.util.concurrent.CompletableFuture.uniCompose(CompletableFuture.java:966) at java.util.concurrent.CompletableFuture$UniCompose.tryFire(CompletableFuture.java:940) ... 4 mor

Re: How to add org.joda.time.DateTime in the StreamExecutionEnvironment to the TaxiRide training example?

2020-06-13 Thread Felipe Gutierrez

Registrar.apply(FlinkScalaKryoInstantiator.scala:98) ~[classes/:?] at org.apache.flink.runtime.types.AllScalaRegistrar.apply(FlinkScalaKryoInstantiator.scala:172) ~[classes/:?] at org.apache.flink.runtime.types.FlinkScalaKryoInstantiator.newKryo(FlinkScalaKryoInstantiator.scala:84) ~[classes/:?] ...

Re: How to add org.joda.time.DateTime in the StreamExecutionEnvironment to the TaxiRide training example?

2020-06-13 Thread Felipe Gutierrez

he problem. > > Cheers, > Till > > On Sat, Jun 13, 2020 at 12:09 PM Felipe Gutierrez > wrote: >> >> Hi, I tried to change the joda.time maven version to be the same of >> the flink-training example and I am getting this e

Have already anyone tried to monitor Flink applications using VisualVM?

2020-06-15 Thread Felipe Gutierrez

ied to monitor Flink applications using VisualVM? Thanks, Felipe [1] https://stackoverflow.com/questions/16023507/why-isnt-visualvm-showing-all-the-normal-tabs -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com

Re: Have already anyone tried to monitor Flink applications using VisualVM?

2020-06-15 Thread Felipe Gutierrez

oOpSecurityContext.java:30) at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:966) -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Mon, Jun 15, 2020 at 12:30 PM Felipe Gutierrez wrote: > > Hi, > > I want to run a flink job wi

what to consider when testing a data stream application using the TPC-H benchmark data?

2020-06-22 Thread Felipe Gutierrez

/flink-examples-batch/src/main/java/org/apache/flink/examples/java/relational/TPCHQuery3.java [2] https://github.com/apache/flink/blob/master/flink-examples/flink-examples-batch/src/main/java/org/apache/flink/examples/java/relational/TPCHQuery10.java Thanks, Felipe -- -- Felipe Gutierrez -- skype

Re: what to consider when testing a data stream application using the TPC-H benchmark data?

2020-06-22 Thread Felipe Gutierrez

. [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/table/tuning/streaming_aggregation_optimization.html Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Mon, Jun 22

Re: what to consider when testing a data stream application using the TPC-H benchmark data?

2020-06-23 Thread Felipe Gutierrez

onsider for sure! [1] https://stackoverflow.com/q/62061643/2096986 Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Mon, Jun 22, 2020 at 4:13 PM Arvid Heise wrote: > If y

Timeout when using RockDB to handle large state in a stream app

2020-06-29 Thread Felipe Gutierrez

ker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: java.util.concurrent.TimeoutException: Heartbeat of TaskManager with id cb1091d792f52ca4743f345790d87dd5 timed out. ... 26 more Thanks, Felipe -- -- Felipe Gutierrez -- skyp

Re: Timeout when using RockDB to handle large state in a stream app

2020-06-30 Thread Felipe Gutierrez

d out that GC is causing the problem, but I still haven't managed to > solve this. > > > > On Mon, Jun 29, 2020 at 12:39 PM Felipe Gutierrez > wrote: > > Hi community, > > I am trying to run a stream application with large state in a > standalone flink cluste

Re: Timeout when using RockDB to handle large state in a stream app

2020-07-03 Thread Felipe Gutierrez

yes. I agree. because RocsDB will spill data to disk if there is not enough space in memory. Thanks -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Fri, Jul 3, 2020 at 8:27 AM Yun Tang wrote: > > Hi Felipe, > > I noticed my previous mai

Re: Timeout when using RockDB to handle large state in a stream app

2020-07-06 Thread Felipe Gutierrez

rez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/tpch/TPCHQuery10.java -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.com On Fri, Jul 3, 2020 at 9:01 AM Felipe Gutierrez wrote: > > yes. I agree. because RocsDB will spil

Re: Timeout when using RockDB to handle large state in a stream app

2020-07-07 Thread Felipe Gutierrez

other thing that is not working is this parameter that when I set it I get an JVM argument error and the TM does not start. taskmanager.memory.task.heap.size: 2048m # default: 1024m # Flink error Best, Felipe -- -- Felipe Gutierrez -- skype: felipe.o.gutierrez -- https://felipeogutierrez.blogspot.co

How to know (in code) how many times the job restarted?

2021-06-10 Thread Felipe Gutierrez

Hello community, Is it possible to know programmatically how many times my Flink stream job restarted since it was running? My use case is like this. I have an Unit test that uses checkpoint and I throw one exception in a MapFunction for a given time, i.e.: for the 2 seconds ahead. Because Flink

Re: How to know (in code) how many times the job restarted?

2021-06-11 Thread Felipe Gutierrez

lease tell me :). This was the way that I solved Thanks Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Thu, Jun 10, 2021 at 5:41 PM Roman Khachatryan wrote: > Hi Felipe, > > You can use getRuntimeContext().getAttemptNumber() [1] (but beware > that

Re: How to know (in code) how many times the job restarted?

2021-06-13 Thread Felipe Gutierrez

I just realised that only the ProcessFunction is enough. I don't need the CheckpointFunction. On Fri, 11 Jun 2021, 18:11 Felipe Gutierrez, wrote: > Cool! > > I did using this example > https://ci.apache.org/projects/flink/flink-docs-release-1.4/dev/stream/state/state.h

Re: How to know (in code) how many times the job restarted?

2021-06-15 Thread Felipe Gutierrez

text.html#isRestored-- *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Mon, Jun 14, 2021 at 3:39 PM Roman Khachatryan wrote: > You can also use accumulators [1] to collect the number of restarts > (and then access it via client); but side outputs should work as well. > > [1] > >

Save state on a CoGroupFunction and recover it after a failure

2021-06-15 Thread Felipe Gutierrez

Hi, I have a problem on my stream pipeline where the events on a CoGroupFunction are not restored after the application crashes. The application is like this: stream01.coGroup(stream02) .where(...).equalTo(...) .window(TumblingEventTimeWindows.of(1 minute)) .apply(new MyCoGroupFunction()) .proces

How to use onTimer() on event stream for *ProcessFunction?

2021-06-16 Thread Felipe Gutierrez

ocs-release-1.13/docs/dev/datastream/operators/process_function/#example [2] https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/learn-flink/event_driven/#the-ontimer-method *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez*

Re: Save state on a CoGroupFunction and recover it after a failure

2021-06-16 Thread Felipe Gutierrez

e the regular window from Flink. is that correct? Thanks *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Wed, Jun 16, 2021 at 2:16 PM Robert Metzger wrote: > Hi Felipe, > > Which data source are you using? > > > Then, in the MyCoGroupFunction there are only events

Re: Save state on a CoGroupFunction and recover it after a failure

2021-06-16 Thread Felipe Gutierrez

is to use a CoProcessFunction where I can update the state of events arriving at CoProcessFunction.processElement1 and CoProcessFunction.processElement2. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Wed, Jun 16, 2021 at 4:28 PM Felipe Gutierrez < felipe.o.gutier...@gmail.co

How to make onTimer() trigger on a CoProcessFunction after a failure?

2021-06-17 Thread Felipe Gutierrez

Hi community, I have implemented a join function using CoProcessFunction with CheckpointedFunction to recover from failures. I added some debug lines to check if it is restoring and it does. Before the crash, I process events that fall at processElement2. I create snapshots at snapshotState(), the

Re: How to know (in code) how many times the job restarted?

2021-06-17 Thread Felipe Gutierrez

/05/03/release-1.13.0.html) that I cannot see `isRestored()` equals true on integration tests? *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Thu, Jun 17, 2021 at 4:09 PM Arvid Heise wrote: > Does your ProcessFunction has state? If not it would be in line with the > documen

Re: How to use onTimer() on event stream for *ProcessFunction?

2021-06-17 Thread Felipe Gutierrez

cks if it is indeed the last timer or not before > outputting elements. > > The other implementation always outputs elements independent of additional > timers/elements being added. > > On Wed, Jun 16, 2021 at 4:08 PM Felipe Gutierrez < > felipe.o.gutier...@gmail.com&

Re: How to know (in code) how many times the job restarted?

2021-06-17 Thread Felipe Gutierrez

you please share the test code? > > I think the returned value might depend on the level on which the > tests are executed. If it's a regular job then it should return the > correct value (as with cluster). If the environment in which the code > is executed is mocked then it can

Re: How to know (in code) how many times the job restarted?

2021-06-18 Thread Felipe Gutierrez

ons.max(restoreList); LOG.info("restarts: " + max); restartsState.add(max + 1); } } } *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Thu, Jun 17, 2021 at 11:17 PM Roman Khachatryan wrote: > Thanks for sharing, &g

Re: How to know (in code) how many times the job restarted?

2021-06-18 Thread Felipe Gutierrez

aybe, do I have to configure some parameters to work with the state on integration tests? *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* On Fri, Jun 18, 2021 at 9:31 AM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > No, it didn't work. > > The "cont

Re: How to make onTimer() trigger on a CoProcessFunction after a failure?

2021-06-18 Thread Felipe Gutierrez

props); kafkaSource.assignTimestampsAndWatermarks( WatermarkStrategy. .forBoundedOutOfOrderness(Duration.ofSeconds(20))); Thanks, Felipe > > Best, > Piotrek > > czw., 17 cze 2021 o 13:46 Felipe Gutierrez > napisał(a): > >> Hi community, >&

Re: How to make onTimer() trigger on a CoProcessFunction after a failure?

2021-06-18 Thread Felipe Gutierrez

ial one, it's not well documented. For > an example you would need to take a look in the Flink code itself by > finding existing implementations of the `AbstractStreamOperator` or > `OneInputStreamOperator`. > > Best, > Piotrek > > pt., 18 cze 2021 o 12:49 Felipe Gutie

Re: How to know (in code) how many times the job restarted?

2021-06-18 Thread Felipe Gutierrez

: 1 > in the logs. > > These settings probably differ on the cluster and there is some > unrelated exception which causes a restart. > > Regards, > Roman > > On Fri, Jun 18, 2021 at 12:20 PM Felipe Gutierrez > wrote: > > > > I investigated a little bit

Re: How to know (in code) how many times the job restarted?

2021-06-21 Thread Felipe Gutierrez

> getRuntimeContext().getAttemptNumber() would be simpler and more > reliable. > > Regards, > Roman > > On Fri, Jun 18, 2021 at 6:23 PM Felipe Gutierrez > wrote: > > > > > > > > On Fri, Jun 18, 2021 at 5:40 PM Roman Khachatryan > wrote:

Re: How to make onTimer() trigger on a CoProcessFunction after a failure?

2021-06-21 Thread Felipe Gutierrez

9] >= [2021-06-21 16:57:21.672] Attempts restart: 1 Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.836 sec On Fri, Jun 18, 2021 at 2:46 PM Piotr Nowojski wrote: > I'm glad I could help, I hope it will solve your problem :) > > Best, > Piotrek > > pt., 18

Re: How to make onTimer() trigger on a CoProcessFunction after a failure?

2021-06-24 Thread Felipe Gutierrez

lable .coGroup(MyCoGroupFunction) works as a charm. Thank you again for the clarifications! Felipe On Mon, Jun 21, 2021 at 5:18 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote: > Hello Piotr, > > Could you please help me to ensure that I am implementing it in the > correct

How to implement different Join strategies for a Flink stream application?

2019-08-09 Thread Felipe Gutierrez

inExercise.java ) If I want to decide whether I use BradCastJoin or HashJoin or any other Join algorithm, which way do you think it is better? is there any other example code that I could borrow? Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierr

Implementing a low level join

2019-08-13 Thread Felipe Gutierrez

/projects/flink/flink-docs-release-1.8/dev/stream/operators/process_function.html#low-level-joins [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/state/broadcast_state.html *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com

Re: Implementing a low level join

2019-08-14 Thread Felipe Gutierrez

k-docs-stable/dev/stream/state/broadcast_state.html > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.8/dev/table/sql.html#joins > [3] > https://github.com/apache/flink/tree/master/flink-table/flink-table-planner/src/main/scala/org/apache/flink/table/runtime/join > > &

Re: Implementing a low level join

2019-08-15 Thread Felipe Gutierrez

I see, I am gonna try this. Thanks Hequn *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Thu, Aug 15, 2019 at 4:01 AM Hequn Cheng wrote: > Hi Felipe, > > If I understand correctly

Re: Implementing a low level join

2019-08-15 Thread Felipe Gutierrez

, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Thu, Aug 15, 2019 at 9:42 AM Fabian Hueske wrote: > Hi, > > Just to clarify. You cannot dynamically switch the join strategy w

Re: Implementing a low level join

2019-08-15 Thread Felipe Gutierrez

Thanks for the advice. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Thu, Aug 15, 2019 at 9:59 AM Fabian Hueske wrote: > Hi Felipe, > > No, this is not possible (with reasonab

Can I use watermarkers to have a global trigger of different ProcessFunction's?

2019-08-21 Thread Felipe Gutierrez

merService().registerEventTimeTimer() and what is the logic that I should use in the onTimer() method? [1] https://github.com/felipegutierrez/explore-flink/blob/master/src/main/java/org/sense/flink/examples/stream/valencia/ValenciaDataSkewedBloomFilterJoinExample.java#L47 Thanks, Felipe *--* *-- Felipe Gutierrez*

Re: Can I use watermarkers to have a global trigger of different ProcessFunction's?

2019-08-22 Thread Felipe Gutierrez

thanks for the detail explanation! I removed my implementation of the watermark which is not necessary in my case. I will only use Watermarkers if I am dealing with out of order events. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <ht

Flink & Mesos don't launch Job and Task managers

2019-09-05 Thread Felipe Gutierrez

ager.tasks.mem: 4096 taskmanager.heap.mb: 3500 [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/deployment/mesos.html#mesos-without-dcos Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>*

Re: Flink & Mesos don't launch Job and Task managers

2019-09-05 Thread Felipe Gutierrez

my bad. Flink allocates task managers dynamically. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Thu, Sep 5, 2019 at 5:24 PM Felipe Gutierrez < felipe.o.gutier...@gmail.com> wrote:

How to access a file with Flink application running on Mesos?

2019-09-05 Thread Felipe Gutierrez

org.apache.flink.runtime.taskmanager.Task.run(Task.java:530) at java.lang.Thread.run(Thread.java:748) *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>*

How do I start a Flink application on my Flink+Mesos cluster?

2019-09-06 Thread Felipe Gutierrez

.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at akka.dispatch.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at akka.dispatch.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.

Re: How do I start a Flink application on my Flink+Mesos cluster?

2019-09-10 Thread Felipe Gutierrez

s" to be equal or less the available cores on a single node of the cluster. I am not sure about this parameter, but only after this configuration it worked. Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez

Re: How do I start a Flink application on my Flink+Mesos cluster?

2019-09-12 Thread Felipe Gutierrez

om/apache/flink/blob/0a405251b297109fde1f9a155eff14be4d943887/flink-mesos/src/main/java/org/apache/flink/mesos/runtime/clusterframework/MesosTaskManagerParameters.java#L344 > > On Tue, Sep 10, 2019 at 10:41 AM Felipe Gutierrez < > felipe.o.gutier...@gmail.com> wrote: > >> I

Error "Failed to load native Mesos library from" when I run Flink on a compiled version of Apache Mesos

2019-09-17 Thread Felipe Gutierrez

to load native Mesos library from /usr/java/packages/lib/amd64:/usr/lib/x86_64-linux-gnu/jni:/lib/x86_64-linux-gnu:/usr/lib/x86_64-linux-gnu:/usr/lib/jni:/lib:/usr/lib Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez

Re: Client for Monitoring API!

2019-09-18 Thread Felipe Gutierrez

yes. you can use prometheus+Grafana. https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#prometheus-orgapacheflinkmetricsprometheusprometheusreporter https://felipeogutierrez.blogspot.com/2019/04/monitoring-apache-flink-with-prometheus.html Felipe On 2019/09/18 11:36:3

Re: Error "Failed to load native Mesos library from" when I run Flink on a compiled version of Apache Mesos

2019-09-19 Thread Felipe Gutierrez

BlobServer - Stopped BLOB server at 0.0.0.0:37375 *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Wed, Sep 18, 2019 at 4:53 AM Rui Li wrote: > Hey Felipe, > > I haven't tr

Re: What is the right way to use the physical partitioning strategy in Data Streams?

2019-09-23 Thread Felipe Gutierrez

des between these partitioners to satisfy your > requirement. For example, > `sourceDataStream.rebalance().map(...).keyby(0).sum(1).print();` > > Thanks, > Biao /'bɪ.aʊ/ > > > > On Thu, 19 Sep 2019 at 16:49, Felipe Gutierrez < > felipe.o.gutier...@gmail.com> wrote: > &

Re: What is the right way to use the physical partitioning strategy in Data Streams?

2019-09-23 Thread Felipe Gutierrez

code. But now I want to tackle data skew by altering the way Flink partition keys using KeyedStream. [1] https://felipeogutierrez.blogspot.com/2019/08/implementing-dynamic-combiner-mini.html *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.

Re: What is the right way to use the physical partitioning strategy in Data Streams?

2019-09-24 Thread Felipe Gutierrez

them. [1] https://issues.apache.org/jira/browse/FLINK-1725 Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Mon, Sep 23, 2019 at 3:47 PM Biao Liu wrote: > Wow, that's rea

Difference between windows in Spark and Flink

2019-10-10 Thread Felipe Gutierrez

differences between their physical operators running in the cluster? [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/windows.html#windows [2] https://spark.apache.org/docs/latest/streaming-programming-guide.html#window-operations Thanks, Felipe *--* *-- Felipe Gutierrez

Re: Difference between windows in Spark and Flink

2019-10-11 Thread Felipe Gutierrez

guide.html#shuffle-operations Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Thu, Oct 10, 2019 at 7:25 PM Yun Tang wrote: > Hi Felipe > > Generally speaking, the key

Re: Difference between windows in Spark and Flink

2019-10-11 Thread Felipe Gutierrez

that is nice. So, only by this Flink shuffles fewer data them Spark. Now I need to plug Prometheus and Grafana to show it. Thanks Yun for your help! *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.

PreAggregate operator with timeout trigger

2019-10-28 Thread Felipe Gutierrez

timeout trigger. I am confused if I need to extend Trigger on MyPreAggregate-AbstractUdfStreamOperator or if I have to put a window as a parameter of the operator class MyPreAggregate-AbstractUdfStreamOperator. what is the best approach? Thanks, Felipe *--* *-- Felipe Gutierrez* *-- skype

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez

/Timer.html Thanks! *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Wed, Oct 30, 2019 at 2:59 PM Piotr Nowojski wrote: > Hi, > > If you want to register a processing/event time

How can I get the backpressure signals inside my function or operator?

2019-11-05 Thread Felipe Gutierrez

gregation and flush tuples or when I keep pre aggregating. It is something like the "credit based control on the network stack" [2]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/monitoring/metrics.html#default-shuffle-service [2] https://www.youtube.com/watch?v=AbqatHF3tZ

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez

@Gyula, I am afraid I haven't got your point. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Tue, Nov 5, 2019 at 10:11 AM Gyula Fóra wrote: > You might have to introduce some du

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Felipe Gutierrez

, I can aggregate earlier if I reach a number of keys. *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Tue, Nov 5, 2019 at 10:29 AM Gyula Fóra wrote: > Hi! > Sorry I should have given

Re: How can I get the backpressure signals inside my function or operator?

2019-11-05 Thread Felipe Gutierrez

o not rely on some external storage? [1] https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/back_pressure.html#sampling-threads *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com <https://felipeogutierrez.blogspot.com>* On Tue

What metrics can I see the root cause of "Buffer pool is destroyed" message?

2019-11-06 Thread Felipe Gutierrez

6) at org.apache.flink.runtime.io.network.api.writer.ChannelSelectorRecordWriter.emit(ChannelSelectorRecordWriter.java:60) at org.apache.flink.streaming.runtime.io.RecordWriterOutput.pushToRecordWriter(RecordWriterOutput.java:107) *--* *-- Felipe Gutierrez* *-- skype: felipe.o.gutierrez* *--* *https://felipeogutierrez.blogspot.com &

1 2 >

1 - 100 of 176 matches

Mail list logo