Re: Slot request bulk is not fulfillable! Could not allocate the required slot within slot request timeout

2022-11-03 Thread Ori Popowski
> Best regards, > > Martijn > > [1] > https://cloud.google.com/dataproc/docs/concepts/versioning/dataproc-release-2.0 > > On Sun, Oct 2, 2022 at 3:38 AM Ori Popowski wrote: > >> Hi, >> >> We're using Flink 2.10.2 on Google Dataproc. >> >>

Re: Adjusted frame length exceeds 2147483647

2022-03-18 Thread Ori Popowski
's just the size of the > request that is suspicious. > > [1] https://issues.apache.org/jira/browse/FLINK-24923 > > On Thu, Mar 17, 2022 at 5:29 PM Ori Popowski wrote: > >> This issue did not repeat, so it may be a network issue >> >> On Thu, Mar 17,

Re: Adjusted frame length exceeds 2147483647

2022-03-17 Thread Ori Popowski
ts a bug in Flink. Could it be that there was some network issue? > > Matthias > > On Tue, Mar 15, 2022 at 6:52 AM Ori Popowski wrote: > >> I am running a production job for at least 1 year, and today I got this >> error: >> >> >> org.apache.fl

Adjusted frame length exceeds 2147483647

2022-03-14 Thread Ori Popowski
I am running a production job for at least 1 year, and today I got this error: org.apache.flink.runtime.io.network.netty.exception.LocalTransportException: Adjusted frame length exceeds 2147483647: 2969686273 - discarded (connection to 'flink-session-playback-prod-1641716499-sw-6q8p.c.data-prod-

Re: Huge backpressure when using AggregateFunction with Session Window

2021-10-21 Thread Ori Popowski
> … much simplified > > > > When I started with similar questions I spent quite some time in the > debugger, breaking into the windowing functions and going up the call > stack, in order to understand how Flink works … time well spent > > > > > > I hope this

Re: Huge backpressure when using AggregateFunction with Session Window

2021-10-21 Thread Ori Popowski
sion and state backend are you using? > > Regards, > Timo > > On 20.10.21 16:17, Ori Popowski wrote: > > I have a simple Flink application with a simple keyBy, a SessionWindow, > > and I use an AggregateFunction to incrementally aggregate a result, and > > write to a Sink.

Huge backpressure when using AggregateFunction with Session Window

2021-10-20 Thread Ori Popowski
I have a simple Flink application with a simple keyBy, a SessionWindow, and I use an AggregateFunction to incrementally aggregate a result, and write to a Sink. Some of the requirements involve accumulating lists of fields from the events (for example, all URLs), so not all the values in the end s
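As background for the discussion, the incremental-aggregation contract can be sketched in plain Scala (the names mirror Flink's AggregateFunction, but this is standalone code, not the Flink interface). Note how a list-accumulating accumulator grows with the session: with the RocksDB backend every update deserializes and reserializes the whole accumulator, which is a plausible source of the backpressure described above.

```scala
// Pure-Scala sketch of an incremental aggregation that also collects a
// per-event field (here: URLs). Names and shapes are illustrative only.
case class Acc(urls: List[String], count: Long)

def createAccumulator(): Acc = Acc(Nil, 0L)

// Each add() touches the entire accumulator; with large lists this is
// where the serialization cost concentrates.
def add(url: String, acc: Acc): Acc = Acc(url :: acc.urls, acc.count + 1)

def merge(a: Acc, b: Acc): Acc = Acc(a.urls ::: b.urls, a.count + b.count)

def getResult(acc: Acc): (Long, List[String]) = (acc.count, acc.urls.reverse)
```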

Re: Exception: SequenceNumber is treated as a generic type

2021-10-18 Thread Ori Popowski
rite the class of Kinesis while > creating the fat jar (there should be warning and you should double-check > that your SequenceNumber wins). > > On Thu, Oct 14, 2021 at 3:22 PM Ori Popowski wrote: > >> Thanks for answering. >> >> Not sure I understood the hack sug

Re: Exception: SequenceNumber is treated as a generic type

2021-10-14 Thread Ori Popowski
> > Dawid > > [1] https://issues.apache.org/jira/browse/FLINK-24549 > > [2] > https://ci.apache.org/projects/flink/flink-docs-release-1.12/ops/monitoring/back_pressure.html > On 14/10/2021 12:41, Ori Popowski wrote: > > I'd appreciate if someone could advice on t

Re: Exception: SequenceNumber is treated as a generic type

2021-10-14 Thread Ori Popowski
I'd appreciate if someone could advice on this issue. Thanks On Tue, Oct 12, 2021 at 4:25 PM Ori Popowski wrote: > Hi, > > I have a large backpressure in a somewhat simple Flink application in > Scala. Using Flink version 1.12.1. > > To find the source of the problem,

Exception: SequenceNumber is treated as a generic type

2021-10-12 Thread Ori Popowski
Hi, I have a large backpressure in a somewhat simple Flink application in Scala. Using Flink version 1.12.1. To find the source of the problem, I want to eliminate all classes with generic serialization, so I set pipeline.generic-types=false in order to spot those classes and write a serializer
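The setting mentioned above can also be applied cluster-wide; a minimal config sketch (the equivalent programmatic call is Flink's `ExecutionConfig#disableGenericTypes`):

```yaml
# flink-conf.yaml — fail fast on any type that would fall back to Kryo
# generic serialization, so each offending class surfaces as an exception
# instead of silently degrading throughput.
pipeline.generic-types: false
```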

Table API, accessing nested fields

2020-11-10 Thread Ori Popowski
How can I access nested fields e.g. in select statements? For example, this won't work: val table = tenv .fromDataStream(stream) .select($"context.url", $"name") What is the correct way? Thanks.

Re: SQL aggregation functions inside the Table API

2020-11-09 Thread Ori Popowski
ported: > > val table = tenv.fromDataStream(stream) > > val newTable = tenv.sqlQuery(s"SELECT ... FROM $table") > > So switching between Table API and SQL can be done fluently. > > I hope this helps. > > Regards, > Timo > > > On 09.11.20 14:33, Ori Popowski

SQL aggregation functions inside the Table API

2020-11-09 Thread Ori Popowski
Hi, Some functions only exist in the SQL interface and are missing from the Table API. For example LAST_VALUE(expression) [1] I still want to use this function in my aggregation, and I don't want to implement a user-defined function. Can I combine an SQL expression inside my Table API? For examp

Re: Native memory allocation (mmap) failed to map 1006567424 bytes

2020-10-29 Thread Ori Popowski
> > [1] > https://issues.apache.org/jira/browse/FLINK-18712?focusedCommentId=17189138&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-17189138 > > [2] > https://docs.oracle.com/javase/8/docs/technotes/guides/troubleshoot/tooldescr007.html > >

Re: Native memory allocation (mmap) failed to map 1006567424 bytes

2020-10-29 Thread Ori Popowski
end.rocksdb.memory.managed` is either not explicitly configured, > or configured to `true`. > > In addition, I noticed that you are using Flink 1.10.0. You might want to > upgrade to 1.10.2, to include the latest bug fixes on the 1.10 release. > > Thank you~ > > Xintong Son

Re: Native memory allocation (mmap) failed to map 1006567424 bytes

2020-10-29 Thread Ori Popowski
e does not necessarily all go to the managed memory. You > can also try increasing the `jvm-overhead`, simply to leave more native > memory in the container in case there are other other significant native > memory usages. > > Thank you~ > > Xintong Song > > > > On

Re: Native memory allocation (mmap) failed to map 1006567424 bytes

2020-10-28 Thread Ori Popowski
whether > the Flink process indeed uses more memory than expected. This could be > achieved via: > - Run the `top` command > - Look into the `/proc/meminfo` file > - Any container memory usage metrics that are available to your Yarn > cluster > > Thank you~ > > Xintong

Native memory allocation (mmap) failed to map 1006567424 bytes

2020-10-27 Thread Ori Popowski
After the job is running for 10 days in production, TaskManagers start failing with: Connection unexpectedly closed by remote task manager Looking in the machine logs, I can see the following error: = Java processes for user hadoop = OpenJDK 64-Bit Server VM warning: INFO

Re: Is it possible that late events are processed before the window?

2020-10-07 Thread Ori Popowski
downstream operators which > consume the side output for late events. > > Cheers, > Till > > On Wed, Oct 7, 2020 at 2:32 PM Ori Popowski wrote: > >> After creating a toy example I think that I've got the concept of >> lateDataOutput wrong. >> >>

Re: Is it possible that late events are processed before the window?

2020-10-07 Thread Ori Popowski
w for that specific key. On Wed, Oct 7, 2020 at 2:42 PM Ori Popowski wrote: > I've made an experiment where I use an evictor on the main window (not the > late one), only to write a debug file when the window fires (I don't > actually evict events, I've made it so I can wri

Re: Is it possible that late events are processed before the window?

2020-10-07 Thread Ori Popowski
sionPlayback, late = true)) So, to repeat the question, is that normal? And if not - how can I fix this? Thanks On Tue, Oct 6, 2020 at 3:44 PM Ori Popowski wrote: > > I have a job with event-time session window of 30 minutes. > > I output late events to side output, where I have a tumb

Is it possible that late events are processed before the window?

2020-10-06 Thread Ori Popowski
I have a job with event-time session window of 30 minutes. I output late events to side output, where I have a tumbling processing time window of 30 minutes. I observe that the late events are written to storage before the "main" events. I wanted to know if it's normal before digging into the co

How can I drop events which are late by more than X hours/days?

2020-09-24 Thread Ori Popowski
I need to drop elements which are delayed by more than a certain amount of time from the current watermark. I wanted to create a FilterFunction where I get the current watermark, and if the difference between the watermark and my element's timestamp is greater than X - drop the element. However,
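The crux of the question above is that a plain FilterFunction exposes no watermark, whereas a ProcessFunction does (via `ctx.timerService().currentWatermark()`). The drop condition itself is simple arithmetic — a minimal pure-Scala sketch, with `maxLatenessMs` as a hypothetical threshold:

```scala
// Returns true when the element is later than the allowed threshold
// relative to the current watermark; such elements would be filtered out
// inside a ProcessFunction, where the watermark is accessible.
def shouldDrop(eventTimestampMs: Long,
               currentWatermarkMs: Long,
               maxLatenessMs: Long): Boolean =
  currentWatermarkMs - eventTimestampMs > maxLatenessMs
```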

Watermark advancement in late side output

2020-09-21 Thread Ori Popowski
Let's say I have an event-time stream with a window and a side output for late data, and in the side output of the late data, I further assign timestamps and do windowing - what is the watermark situation here? The main stream has its own watermark advancement but the side output has its own. Do t

Re: sideOutputLateData doesn't work with map()

2020-09-17 Thread Ori Popowski
out.collect(elements.map(_._1).toList) } }) stream .getSideOutput(tag) .map(a => s"late: $a") .print() stream .map(list => list :+ 42) .print() senv.execute() On Thu, Sep 17, 2020 at 3:32 PM Ori Popowski wrote: > Hi, >

sideOutputLateData doesn't work with map()

2020-09-17 Thread Ori Popowski
Hi, I have this simple flow: val senv = StreamExecutionEnvironment.getExecutionEnvironment senv.setStreamTimeCharacteristic(TimeCharacteristic.EventTime) val tag = OutputTag[Tuple1[Int]]("late") val stream = senv .addSource(new SourceFunction[Int] { override def run(ctx: SourceFunction.Sour

Re: How Flink distinguishes between late and in-time events?

2020-08-20 Thread Ori Popowski
atest watermark, it is marked as a late > event [1] > > Piotrek > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.11/concepts/timely-stream-processing.html#lateness > > Thu, 20 Aug 2020 at 17:13 Ori Popowski wrote: > >> In the documentation >

How Flink distinguishes between late and in-time events?

2020-08-20 Thread Ori Popowski
In the documentation it states that: *[…], Flink keeps the state of windows until their allowed lateness expires. Once this happens, Flink removes the window and deletes its state, as also d
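The rule the quoted documentation describes can be written down directly: roughly, a window's state is removed once the watermark passes the window end plus the allowed lateness, and elements arriving after that point are dropped (or side-outputted). A pure-Scala sketch of that condition — an approximation of Flink's internal check, not its exact code:

```scala
// A window (and any element targeting it) is expired once the watermark
// has advanced past windowEnd + allowedLateness. Before that point, a
// late-but-within-lateness element can still trigger a late firing.
def windowStateExpired(windowEndMs: Long,
                       allowedLatenessMs: Long,
                       watermarkMs: Long): Boolean =
  watermarkMs > windowEndMs + allowedLatenessMs
```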

Re: Key group is not in KeyGroupRange

2020-07-22 Thread Ori Popowski
Jul 21, 2020 at 1:46 PM Ori Popowski wrote: > I should have mentioned, I've opened a bug for it > https://issues.apache.org/jira/browse/FLINK-18637. So the discussion > moved there. > > On Tue, Jul 14, 2020 at 2:03 PM Ori Popowski wrote: > >> I'm getting this

Re: Key group is not in KeyGroupRange

2020-07-21 Thread Ori Popowski
I should have mentioned, I've opened a bug for it https://issues.apache.org/jira/browse/FLINK-18637. So the discussion moved there. On Tue, Jul 14, 2020 at 2:03 PM Ori Popowski wrote: > I'm getting this error when creating a savepoint. I've read in > https://issues.apache.

Key group is not in KeyGroupRange

2020-07-14 Thread Ori Popowski
I'm getting this error when creating a savepoint. I've read in https://issues.apache.org/jira/browse/FLINK-16193 that it's caused by unstable hashcode or equals on the key, or improper use of reinterpretAsKeyedStream. My key is a string and I don't use reinterpretAsKeyedStream, so what's going on?

Re: Savepoint fails due to RocksDB 2GiB limit

2020-07-13 Thread Ori Popowski
(List(event), Some(count + 1)) else (List.empty, Some(count)) } .keyBy(…) Using .aggregate(…, new MyProcessFunction) while using an aggregation to aggregate the events into a list, worked really bad and caused serious performance issues. Thanks! On Sun, Jul 12, 2020 at 10:32 AM Ori Popo

Re: Savepoint fails due to RocksDB 2GiB limit

2020-07-12 Thread Ori Popowski
ok for you? >> >> [1] >> https://ci.apache.org/projects/flink/flink-docs-release-1.11/dev/stream/operators/process_function.html#the-keyedprocessfunction >> >> Best, >> Congxian >> >> >> Ori Popowski 于2020年7月8日周三 下午8:30写道:

Savepoint fails due to RocksDB 2GiB limit

2020-07-08 Thread Ori Popowski
I've asked this question in https://issues.apache.org/jira/browse/FLINK-9268 but it's been inactive for two years so I'm not sure it will be visible. While creating a savepoint I get a org.apache.flink.util.SerializedThrowable: java.lang.NegativeArraySizeException. It's happening because some of m

Re: Heartbeat of TaskManager timed out.

2020-07-07 Thread Ori Popowski
. > > I'm not familiar with Scala. Just curious, if what you suspect is true, is > it a bug of Scala? > > Thank you~ > > Xintong Song > > > > On Tue, Jul 7, 2020 at 1:41 PM Ori Popowski wrote: > >> Hi, >> >> I just wanted to update that the

Re: Heartbeat of TaskManager timed out.

2020-07-06 Thread Ori Popowski
ems just resolved. Checkpoints, the heartbeat timeout, and also the memory and CPU utilization. I still need to confirm my suspicion towards Scala's flatten() though, since I haven't "lab-tested" it. [1] https://github.com/NetLogo/NetLogo/issues/1830 On Sun, Jul 5, 2020 at 2:21 P

Re: Heartbeat of TaskManager timed out.

2020-07-02 Thread Ori Popowski
ecutor - The heartbeat of > JobManager with id bc59ba6a > > No substantial amount memory was freed after that. > > If this memory usage pattern is expected, I'd suggest to: > 1. increase heap size > 2. play with PrintStringDeduplicationStatistics and UseStringDeduplication > fl

Re: Heartbeat of TaskManager timed out.

2020-07-01 Thread Ori Popowski
10/ops/memory/mem_migration.html > > On Sun, Jun 28, 2020 at 10:12 PM Ori Popowski wrote: > >> Thanks for the suggestions! >> >> > i recently tried 1.10 and see this error frequently. and i dont have >> the same issue when running with 1.9.1 >> I did downgrad

Re: Timeout when using RockDB to handle large state in a stream app

2020-06-29 Thread Ori Popowski
Hi there, I'm currently experiencing the exact same issue. http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Heartbeat-of-TaskManager-timed-out-td36228.html I've found out that GC is causing the problem, but I still haven't managed to solve this. On Mon, Jun 29, 2020 at 12:3

Re: Heartbeat of TaskManager timed out.

2020-06-28 Thread Ori Popowski
efore the timeout check. >- Is there any metrics monitoring the network condition between the JM >and timeouted TM? Possibly any jitters? > > > Thank you~ > > Xintong Song > > > [1] > https://ci.apache.org/projects/flink/flink-docs-release-1.10/ops/config.ht

Heartbeat of TaskManager timed out.

2020-06-25 Thread Ori Popowski
Hello, I'm running Flink 1.10 on EMR and reading from Kafka with 189 partitions and I have parallelism of 189. Currently running with RocksDB, with checkpointing disabled. My state size is appx. 500gb. I'm getting sporadic "Heartbeat of TaskManager timed out" errors with no apparent reason. I c

Re: State leak

2020-06-24 Thread Ori Popowski
ent, limit it with some > time threshold. If it's defined with inactivity gap and your client sends > infinite events, you could limit the session length to enforce a new > session (may be simpler done on the client side). > > Hope this helps, > Rafi > > > On Tue, Jun 23, 20

State leak

2020-06-23 Thread Ori Popowski
Hi, When working with an ever growing key-space (let's say session ID), and a SessionWindow with a ProcessFunction - should we worry about the state growing indefinitely? Or does the window make sure to clean state after triggers? Thanks

Re: [EXTERNAL] Re: Renaming the metrics

2020-06-23 Thread Ori Popowski
link/commit/fd8e1f77a83a3ae1253da53596d22471bb6fe902 > > and > > > https://github.com/cslotterback/flink/commit/ce3797ea46f3321885c4352ecc36b9385b7ca0ce > > > > This isn’t what I’d call ideal, but it gets the job done. I would love a > generic flink-approved method of confi

Re: Renaming the metrics

2020-06-22 Thread Ori Popowski
ojects/flink/flink-docs-master/monitoring/metrics.html#prometheuspushgateway-orgapacheflinkmetricsprometheusprometheuspushgatewayreporter > > On Mon, Jun 22, 2020 at 9:01 AM Ori Popowski wrote: > >> I have two Flink clusters sending metrics via Prometheus and they share >> all the metric names (i.e. >> f

Renaming the metrics

2020-06-22 Thread Ori Popowski
I have two Flink clusters sending metrics via Prometheus and they share all the metric names (i.e. flink_taskmanager_job_task_operator_currentOutputWatermark). I want to change the flink_ prefix to something else to distinguish between the clusters (maybe the job-name). How can I do it? Thanks.
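The `flink_` prefix is fixed by the Prometheus reporters themselves, so the usual workaround is to attach a distinguishing label rather than rename the metrics. Assuming the PushGateway reporter is in use, a hedged flink-conf.yaml sketch (the jobName/groupingKey values are hypothetical placeholders):

```yaml
metrics.reporter.promgateway.class: org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter
# hypothetical values — choose a per-cluster name so the two clusters'
# series can be told apart by label instead of by metric-name prefix
metrics.reporter.promgateway.jobName: playback-cluster-a
metrics.reporter.promgateway.groupingKey: cluster=playback-cluster-a
```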

Re: EventTimeSessionWindow firing too soon

2020-06-16 Thread Ori Popowski
ing. You could also get the current > watemark when using a ProcessWindowFunction and also emit that in the > records that you're printing, for debugging. > > What is that TimestampAssigner you're using for your timestamp > assigner/watermark extractor? > > Best, > Al

Re: EventTimeSessionWindow firing too soon

2020-06-16 Thread Ori Popowski
events 11:31-11:31 13 events Again, this is one session. How can we explain this? Why does Flink create 4 distinct windows within 8 minutes? I'm really lost here, I'd appreciate some help. On Tue, Jun 16, 2020 at 2:17 PM Ori Popowski wrote: > Hi, thanks for answering. > > &g

Re: EventTimeSessionWindow firing too soon

2020-06-16 Thread Ori Popowski
t offset, so you consume > historical data and Flink is catching-up. > > Regarding: *My event-time timestamps also do not have big gaps* > > Just to verify, if you do keyBy sessionId, do you check the gaps of > events from the same session? > > Rafi > > > On Tue, Jun 16,

Re: EventTimeSessionWindow firing too soon

2020-06-15 Thread Ori Popowski
> >> Hi >> >> I think it maybe you use the event time, and the timestamp between your >> event data is bigger than 30minutes, maybe you can check the source data >> timestamp. >> >> Best, >> Yichao Yang >> >> -----

EventTimeSessionWindow firing too soon

2020-06-15 Thread Ori Popowski
I'm using Flink 1.10 on YARN, and I have a EventTimeSessionWindow with a gap of 30 minutes. But as soon as I start the job, events are written to the sink (I can see them in S3) even though 30 minutes have not passed. This is my job: val stream = senv .addSource(new FlinkKafkaConsumer("…",
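For reference, event-time session semantics can be illustrated without Flink at all: events whose timestamps lie within the gap of each other merge into one session, regardless of arrival order, and a session closes once the watermark passes its last timestamp plus the gap. A firing that looks "too soon" in wall-clock terms is typically the watermark racing ahead while the job catches up on old Kafka offsets. A pure-Scala sketch of the gap-based grouping:

```scala
// Groups timestamps into sessions: consecutive (sorted) timestamps more
// than gapMs apart start a new session. Illustrative only — Flink does
// this incrementally via merging windows, not by sorting.
def sessionize(timestampsMs: Seq[Long], gapMs: Long): List[List[Long]] =
  timestampsMs.sorted.foldLeft(List.empty[List[Long]]) {
    case (Nil, ts)                                   => List(List(ts))
    case (cur :: done, ts) if ts - cur.head <= gapMs => (ts :: cur) :: done
    case (acc, ts)                                   => List(ts) :: acc
  }.map(_.reverse).reverse
```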

Re: Support for Flink in EMR 6.0

2020-05-04 Thread Ori Popowski
wever, if you have a EMR 6.0.0 with Hadoop 3 deployed, you should be > able deploy vanilla Flink 1.10.0 there as well! > > [1] > https://issues.apache.org/jira/browse/FLINK-11086?focusedCommentId=17088936&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comm

Support for Flink in EMR 6.0

2020-05-04 Thread Ori Popowski
Hi, EMR 6.0.0 has been released [1], and this release ignores Apache Flink (as well as other applications). Are there any plans to add support for Apache Flink for EMR 6.0.0 in the future? Thanks. [1] https://aws.amazon.com/about-aws/whats-new/2020/04/amazon-emr-announces-emr-release-6-with-new