Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Dominik Wosiński
Hey, I've got a question regarding the transaction failures in EXACTLY_ONCE flow with Flink 1.15.3 with Confluent Cloud Kafka. The case is that there is a FlinkKafkaProducer in EXACTLY_ONCE setup with default *transaction.timeout.ms *of 15min. During the processing

StreamQueryConfig vs TemporalTableFunction

2020-04-20 Thread Dominik Wosiński
Hey, I wanted to ask whether the TemporalTableFunctions are subject to StreamQueryConfig retention? I was pretty sure that they are not, but I have recently noticed weird behavior in one of my jobs that suggests that they indeed are. Thanks for answers, Best Regards, Dom.

Objects with fields that are not serializable

2020-04-14 Thread Dominik Wosiński
Hey, I have a question about using classes with fields that are not serializable in DataStream. Basically, I would like to use the Java's Optional in DataStream. So Say I have a class *Data *that has several optional fields and I would like to have *DataStream*. I don't think this should cause any

Re: How to consume kafka from the last offset?

2020-03-26 Thread Dominik Wosiński
Hey, Are You completely sure you mean *auto.offset.reset ?? *False is not valid setting for that AFAIK. Best, Dom. czw., 26 mar 2020 o 08:38 Jim Chen napisał(a): > Thanks! > > I made a mistake. I forget to set the auto.offset.reset=false. It's my > fault. > > Dominik Wo

Re: How to consume kafka from the last offset?

2020-03-25 Thread Dominik Wosiński
Hi Jim, Well, *auto.offset.reset *is only used when there is no offset saved for this *group.id * in Kafka. So, if You want to read the data from the latest record (and by latest I mean the newest here) You should assign the *group.id * that was not previously used

Re: Issues with Watermark generation after join

2020-03-24 Thread Dominik Wosiński
table partitioning as defined by *partitionBy ??* 2) Assuming that this is instance of parallel operator, does this mean that we need output from ALL operators so that the watermark progresses and the output is generated? Best Regards, Dom. wt., 24 mar 2020 o 10:01 Timo Walther napisał(a): > Hi

Re: Issues with Watermark generation after join

2020-03-19 Thread Dominik Wosiński
I have created a simple minimal reproducible example that shows what I am talking about: https://github.com/DomWos/FlinkTTF/tree/sql-ttf It contains a test that shows that even if the output is in order which is enforced by multiple sleeps, then for parallelism > 1 there is no output and for paral

Re: Timestamp Erasure

2020-03-19 Thread Dominik Wosiński
ails in the doc [1]. > > Best, > Jark > > [1]: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/time_attributes.html#during-datastream-to-table-conversion-1 > > On Thu, 19 Mar 2020 at 02:38, Dominik Wosiński wrote: > >> Hey, >> I just want

Timestamp Erasure

2020-03-18 Thread Dominik Wosiński
Hey, I just wanted to ask about one thing about timestamps. So, currently If I have a KeyedBroadcastProcess function followed by Temporal Table Join, it works like a charm. But, say I want to delay emitting some of the results due to any reason. So If I *registerProcessingTimeTimer* and any elemen

Re: Issues with Watermark generation after join

2020-03-17 Thread Dominik Wosiński
Hey sure, the original Temporal Table SQL is: |SELECT e.*, f.level as level FROM | enablers AS e, | LATERAL TABLE (Detectors(e.timestamp)) AS f | WHERE e.id= f.id |"" And the previous SQL query to join A&B is something like : SELECT * | FROM A te, | B s | WHERE s.id = te.id AND s.level = te.leve

Re: Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
put was generated. So, I have switched to process function to see whether the watermarks are reaching this stage. Best Regards, Dom. pon., 16 mar 2020 o 19:46 Theo Diefenthal napisał(a): > Hi Dominik, > > I had the same once with a custom processfunction. My processfunction > buffered t

Fwd: AfterMatchSkipStrategy for timed out patterns

2020-03-16 Thread Dominik Wosiński
Hey all, I was wondering whether for CEP the *AfterMatchSkipStrategy *is applied during matching or if simply the results are removed after the match. The question is the result of the experiments I was doing with CEP. Say I have the readings from some sensor and I want to detect events over some

Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
Hey, I have noticed a weird behavior with a job that I am currently working on. I have 4 different streams from Kafka, lets call them A, B, C and D. Now the idea is that first I do SQL Join of A & B based on some field, then I create append stream from Joined A&B, let's call it E. Then I need to as

Re: Implementing a tick service

2020-01-21 Thread Dominik Wosiński
Hey, you have access to context in `onTimer` so You can easily reschedule the timer when it is fired. Best, Dom.

HadoopInputFormat

2019-11-06 Thread Dominik Wosiński
Hey, I wanted to ask if the *HadoopInputFormat* does currently support some custom partitioning scheme ? Say I have 200 files in HDFS each having the partitioning key in name, can we ATM use HadoopInputFormat to distribute reading to multiple TaskManagers using the key ?? Best Regards, Dom.

WebUI show custom config

2019-06-21 Thread Dominik Wosiński
Hey, I am building jobs that use Typesafe Config under the hood. The configs tend to grow big. I was wondering whether there is a possibility to use WebUI to show the config that the job was run with, currently the only idea is to log the config and check it inside the logs, but with dozens of jobs

Re: kafka corrupt record exception

2019-04-25 Thread Dominik Wosiński
Hey, Sorry for such a delay, but I have missed this message. Basically, technically you could have Kafka broker installed in version say 1.0.0 and using FlinkKafkaConsumer08. This could technically create issues. I'm not sure if You can automate the process of skipping corrupted messages, as You

Re: Flink Control Stream

2019-04-25 Thread Dominik Wosiński
Thanks for help Till, I thought so, but I wanted to be sure. Best Regards, Dom.

Flink Control Stream

2019-04-24 Thread Dominik Wosiński
Hey, I wanted to use the control stream to dynamically adjust parameters of the tasks. I know that it is possible to use *connect()* and *BroadcastState *to obtain such a thing. But I would like to have the possibility to control the parameters inside the *AsyncFunction. *Like specific timeout for

Re: kafka corrupt record exception

2019-04-02 Thread Dominik Wosiński
Hey, As far as I understand the error is not caused by the deserialization but really by the polling of the message, so custom deserialization schema won't really help in this case. There seems to be an error in the messages in Your topic. You can see here

Re: JSON to CEP coversion

2019-01-22 Thread Dominik Wosiński
Hey Anish, I have done some abstraction over the logic of CEP, but with the use of Apache Bahir[1], which introduces SIddhi CEP[2][ engine that allows SQL like definitions of the logic. Best, Dom. [1] https://github.com/apache/bahir [2] https://github.com/wso2/siddhi wt., 22 sty 2019 o 20:20 as

Re: Getting RemoteTransportException

2019-01-17 Thread Dominik Wosiński
*Hey,* As for the question about taskmanager.network.netty.server.numThreads . It is the size of the thread pool that will be used by the netty server. The default value is -1, which

Re: Passing vm options

2019-01-07 Thread Dominik Wosiński
Hey, AFAIK, Flink supports dynamic properties currently only on YARN and not really in standalone mode. If You are using YARN it should indeed be possible to set such configuration. If not, then I am afraid it is not possible. Best Regards, Dom. pon., 7 sty 2019 o 09:01 Avi Levi napisał(a): >

Re: Kafka consumer, is there a way to filter out messages using key only?

2018-12-27 Thread Dominik Wosiński
Hey, AFAIK, returning null from deserialize function in FlinkKafkaConsumer will indeed filter the message out and it won't be further processed. Best Regards, Dom. śr., 19 gru 2018 o 11:06 Dawid Wysakowicz napisał(a): > Hi, > > I'm afraid that there is no out-of-the box solution for this, but w

Re: Changes in Flink 1.6.2

2018-11-30 Thread Dominik Wosiński
aph.getJobGraph().getJobID it > will work. > > Best, > > Dawid > On 30/11/2018 15:19, Boris Lublinsky wrote: > > Dominik, > Any feedback on this? > > Boris Lublinsky > FDP Architect > boris.lublin...@lightbend.com > https://www.lightbend.com/ > > O

Re: Flink SQL

2018-11-30 Thread Dominik Wosiński
Hey, Not exactly sure by what you mean by "nothing" but generally the concept is. The data is fed to the dynamic table and the result of the query creates another dynamic table. So, if the resulting query returns an empty table, no data will indeed be written to the S3. Not sure if this was what Y

Re: Changes in Flink 1.6.2

2018-11-28 Thread Dominik Wosiński
Hey, Could you show the message that You are getting? Best Regards, Dom. śr., 28 lis 2018 o 19:08 Boris Lublinsky napisał(a): > > > > Prior to Flink version 1.6.2 including 1.6.1 > env.getStreamGraph.getJobGraph was happily returning currently defined > Graph, but in 1.6.2 this fails to compile

Re: Can JDBCSinkFunction support exectly once?

2018-11-21 Thread Dominik Wosiński
Hey, As far as I know, the function needs to implement the *TwoPhaseCommitFunction* and not the *CheckpointListener. JDBCSinkFunction *does not implement the two-phase commit, so currently it does not support exactly once. Best Regards, Dom. śr., 21 lis 2018 o 11:07 Jocean shi napisał(a): > Hi

Re: Store Predicate or any lambda in MapState

2018-11-21 Thread Dominik Wosiński
Hey Jayant, I don't really think that the sole fact of using Predicate should cause the *ClassNotFoundException* that You are talking about. The exception may come from the fact that some libraries are missing from Your cluster environment. Have You tried running the job locally to verify that the

Re: Kinesis Connector - NoClassDefFoundError

2018-11-20 Thread Dominik Wosiński
Hey, Have you updated the versions both on the environment and the dependency on the job? >From my personal experience, 95 % of such issues is due to the mismatch between Flink versions on the cluster you are using and Your job. Best Regards, Dom. wt., 20 lis 2018 o 07:41 Steve Bistline napisał

Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey, Thanks for the info, I haven't noticed that. I was just going through older messages with no responses. Best Regards, Dom.

Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
PS. Could You also post the whole log for the application run ?? Best Regards, Dom. czw., 15 lis 2018 o 11:04 Dominik Wosiński napisał(a): > Hey, > > DId You try to run any other job on your setup? Also, could You please > tell what are the sources you are trying to use, do all m

Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
Hey, DId You try to run any other job on your setup? Also, could You please tell what are the sources you are trying to use, do all messages come from Kafka?? >From the first look, it seems that the JobManager can't connect to one of the TaskManagers. Best Regards, Dom. pon., 12 lis 2018 o 17:1

Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey, Could You please show a sample data that You want to process? This would help in verifying the issue. Best Regards, Dom. wt., 13 lis 2018 o 13:58 Jeff Zhang napisał(a): > Hi, > > I hit the following error when I try to use kafka connector in flink table > api. There's very little document

Re: Manual trigger the window in fold operator or incremental aggregation

2018-10-24 Thread Dominik Wosiński
Hey Zhen Li, What are You trying to do exactly? Maybe there is a more suitable method than manually triggering windows available in Flink. Best Regards, Dom. śr., 24 paź 2018 o 09:25 Dawid Wysakowicz napisał(a): > Hi Zhen Li, > > As far as I know that is not possible. For such custom handling

Re: Kafka connector error: This server does not host this topic-partition

2018-10-23 Thread Dominik Wosiński
Hey Alexander, It seems that this issue occurs when the broker is down and the partition is selecting the new leader AFAIK. There is one JIRA issue I have found, not sure if that's what are You looking for: https://issues.apache.org/jira/browse/KAFKA-6221 This issue is connected with Kafka itself

Re: KafkaException or ExecutionStateChange failure on job startup

2018-10-23 Thread Dominik Wosiński
Hey Mark, Do You use more than 1 Kafka consumer for Your jobs? I think this relates to the known issue in Kafka: https://issues.apache.org/jira/browse/KAFKA-3992. The problem is that if You don't provide client ID for your *KafkaConsumer* Kafka assigns one, but this is done in an unsynchronized wa

Re: Mapstatedescriptor

2018-10-13 Thread Dominik Wosiński
Hey, It's the name for the whole descriptor. Not the keys, it means that no other descriptor should be created with the same name. Best Regards, Dom. Sob., 13.10.2018, 09:50 użytkownik Szymon napisał: > > > Hi, i have a question about MapStateDescriptor used to create MapState. > I have a keyed

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
atherAPIRequest > > > > On Fri, 12 Oct 2018 at 16:21, Dominik Wosiński wrote: > >> Hey, >> What is the exact issue that you are facing and the Flink version that >> you are using ?? >> >> >> Best Regards, >> Dom. >> >> pt., 12 pa

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey, What is the exact issue that you are facing and the Flink version that you are using ?? Best Regards, Dom. pt., 12 paź 2018 o 16:11 Krishna Kalyan napisał(a): > Hello All, > > I need some help making async API calls. I have tried the following code > below. > > class AsyncWeatherAPIReques

Re: Duplicates in self join

2018-10-08 Thread Dominik Wosiński
, Dominik. pon., 8 paź 2018 o 08:00 Eric L Goodman napisał(a): > What is the best way to avoid or remove duplicates when joining a stream > with itself? I'm performing a streaming temporal triangle computation and > the first part is to find triads of two edges of the form vertexA-

ODP: How to add jvm Options when using Flink dashboard?

2018-09-05 Thread Dominik Wosiński
Hey, You can’t as Chesnay have already said, but for your usecase you could use arguments instead of JVM option and they will work equally good. Best Regards, Dom. Wysłane z aplikacji Poczta dla Windows 10 Od: Chesnay Schepler Wysłano: środa, 5 września 2018 11:43 Do: zpp; user@flink.apache.or

ODP: API for delayed/scheduled interval input source for integrationtests

2018-09-01 Thread Dominik Wosiński
Hey, Maybe it would be a good idea to create somekind of test source for DataStream that allows writing elements to stream directly. Similarly like it’s done for reactive libraries sources. This would make creating tests a lot easier for Flink. Best Regards, Dom. Wysłane z aplikacji Poczta dl

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey ;) I have received one response that was sent directly to my email and not to user group : > Hi Dominik, > > I think you can put the unserializable fields into RichFunctions and > initiate them in the `open` method, so the the fields won’t need to be > serialized with the

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
Hey Paul, Yeah that is possible, but I was asking in terms of serialization schema. So I would really want to avoid RichFunction :) Best Regards, Dominik. pon., 27 sie 2018 o 10:23 Chesnay Schepler napisał(a): > The null check in the method is the general-purpose way of solving it. >

Dealing with Not Serializable classes in Java

2018-08-26 Thread Dominik Wosiński
) { if(objectMapper == null) { objectMapper = new ObjectMapper().registerModule(new JavaTimeModule()); } ... } But I was wondering whether do You have any prettier option of doing this? Thanks, Dominik.

Fwd: would you join a Slack workspace for Flink?

2018-08-26 Thread Dominik Wosiński
-- Forwarded message - From: Dominik Wosiński Date: niedz., 26 sie 2018 o 15:12 Subject: ODP: would you join a Slack workspace for Flink? To: Hequn Cheng Hey, I have been facing this issue for multiple open source projects and discussions. Slack in my opinion has two main

Re: Flink checkpointing to Google Cloud Storage

2018-08-21 Thread Dominik Wosiński
Hey, >From my perspective, such issues always meant clashing dependencies in case of Flink. Have you checked the full dependency tree if there are no issues there ? Best Regards, Dominik.

Re: Cluster die when one of the TM killed

2018-08-20 Thread Dominik Wosiński
Hey, Can You please provide a little more information about your setup and maybe logs showing when the crash occurs? Best Regards, Dominik 2018-08-20 16:23 GMT+02:00 Siew Wai Yow : > Hi, > > > When one of the task manager is killed, the whole cluster die, is this > something e

Re: Flink not rolling log files

2018-08-17 Thread Dominik Wosiński
log4j.appender.file.layout=org.apache.log4j.PatternLayout log4j.appender.file.layout.ConversionPattern=%d{-MM-dd HH:mm:ss,SSS} %-5p %-60c %x - %m%n log4j.appender.file.layout.DatePattern='.'-MM-dd Best Regards, Dominik.

Re: Flink Jobmanager Failover in HA mode

2018-08-17 Thread Dominik Wosiński
I have faced this issue, but in 1.4.0 IIRC. This seems to be related to https://issues.apache.org/jira/browse/FLINK-10011. What was the status of the jobs when the main Job Manager has been stopped ? 2018-08-17 17:08 GMT+02:00 Helmut Zechmann : > Hi all, > > we have a problem with flink 1.5.2 hig

Re: Initial value of ValueStates of primitive types

2018-08-17 Thread Dominik Wosiński
[Boolean])) If you will now do : print(test.value()) It will indeed print the *null*. But if You will do : val myTest = test.value() print(test.value()) It will now print *false *instead; Best Regards, Dominik. 2018-08-17 11:13 GMT+02:00 Averell : > Hi, > > In Flink's documents

ODP: docker, error NoResourceAvailableException..

2018-08-15 Thread Dominik Wosiński
Slots: 2 This will give you 2 Task Slots with only 1 Task Manager. But you will need to somehow override config in the container, possibly using : https://docs.docker.com/storage/volumes/ Regards, Dominik. Od: shyla deshpande Wysłano: środa, 15 sierpnia 2018 07:23 Do: user Temat: docker, error N

Re: How to set log level using Flink Docker Image

2018-06-21 Thread Dominik Wosiński
You can for example mount the *conf* directory using docker volumes. You would need to have *logback.xml* and then mount it as *conf/logback.xml *inside the image. Locally You could do this using *docker-compose.yml*, for mounting volumes in kubernetes refer to this page: https://kubernetes.io/doc

Blob Server Removes Failed Jobs Immediately

2018-06-20 Thread Dominik Wosiński
Hello, I'm not sure whether the problem is connected with bad configuration or it's some inconsistency in the documentation but according to this document: https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture . *I*f a job fails, all non-HA files' refCoun

Flink 1.4 with cassandra-connector: Shading error

2017-12-18 Thread dominik
iling to see something. I don't include netty in my dependencies, that could of course fix it, but I'm suspecting I will run into more dependency problems then. Here are my dependencies: https://gist.github.com/anonymous/565a0ad017976502a62b2919115b31fd What additional information can I provide? Thanks, Dominik

Re: S3 Access in eu-central-1

2017-11-28 Thread Dominik Bruhn
Hey Stephan, Hey Steve, that was the right hint, adding that open to the Java-Options fixed the problem. Maybe we should add this somehow to our Flink Wiki? Thanks! Dominik On 28/11/17 11:55, Stephan Ewen wrote: Got a pointer from Steve that this is answered on Stack Overflow here: https

Re: S3 Access in eu-central-1

2017-11-27 Thread Dominik Bruhn
Hey, can anyone give a hint? Does anyone have flink running with an S3 Bucket in Frankfurt/eu-central-1 and can share his config and setup? Thanks, Dominik > On 22. Nov 2017, at 17:52, domi...@dbruhn.de wrote: > > Hey everyone, > I'm trying since hours to get Flink 1.3.2 (down

S3 Access in eu-central-1

2017-11-22 Thread dominik
414.7K May 17 2012 httpclient-4.2.jar -rw---1 root root 218.0K May 1 2012 httpcore-4.2.jar -rw-rw-r--1 1005 1006 478.4K Jul 28 14:50 log4j-1.2.17.jar -rw-rw-r--1 1005 10068.7K Jul 28 14:50 slf4j-log4j12-1.7.7.jar Can anyone give me any hints? Thanks, Dominik

Tooling for resuming from checkpoints

2017-11-22 Thread dominik
checkpoint? Or at least a tool/commandline which we can use to validate that a checkpoint is valid so we can pick the latest one? How are others handling this? Manually? Would be happy to get some input there, Dominik

Re: Setting operator parallelism of a running job - Flink 1.2

2017-04-21 Thread Dominik Safaric
conditions. However, it is not the case. Best, Dominik > On 21 Apr 2017, at 15:36, Aljoscha Krettek wrote: > > Hi, > changing the parallelism is not possible while a job is running (currently). > What you would have to do to change the parallelism is create a savepoint and >

Setting operator parallelism of a running job - Flink 1.2

2017-04-21 Thread Dominik Safaric
slots available? If not, can someone change the parallelism of a job while in the restart mode in order to allow the job to continue? Thanks, Dominik

Re: Flink 1.2 time window operation

2017-03-30 Thread Dominik Safaric
fired. Hence, I this behaviour that you’ve described and we’ve expected did not occur. If it would help, I can share the source code and a detail Flink configuration. Cheers, Dominik > On 30 Mar 2017, at 13:09, Tzu-Li (Gordon) Tai wrote: > > Hi, > > Thanks for the clarific

Re: Flink 1.2 time window operation

2017-03-29 Thread Dominik Safaric
Hi Gordon, The job was run using processing time. The Kafka broker version I’ve used was 0.10.1.1. Dominik > On 30 Mar 2017, at 08:35, Tzu-Li (Gordon) Tai wrote: > > Hi Dominik, > > Was the job running with processing time or event time? If event time, how > are

Flink 1.2 time window operation

2017-03-27 Thread Dominik Safaric
Hi all, Lately I’ve been investigating onto the performance characteristics of Flink part of our internal benchmark. Part of this we’ve developed and deployed an application that pools data from Kafka, groups the data by a key during a fixed time window of a minute. In total, the topic that t

Re: Benchmarking streaming frameworks

2017-03-23 Thread Dominik Safaric
_ver3.pdf <https://people.eecs.berkeley.edu/~kubitron/courses/cs262a-F14/projects/reports/project11_report_ver3.pdf> https://hal.inria.fr/hal-01347638/document <https://hal.inria.fr/hal-01347638/document> Regards, Dominik > On 23 Mar 2017, at 11:09, Giselle van Dongen > wrote: > > Dea

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Dominik Safaric
I’m not using YARN but instead of starting the cluster using bin/start-cluster.sh > On 8 Mar 2017, at 15:32, Ufuk Celebi wrote: > > On Wed, Mar 8, 2017 at 3:19 PM, Dominik Safaric > wrote: >> The cluster consists of 4 workers and a master node. > > Are you starting t

Re: flink/cancel & shutdown hooks

2017-03-08 Thread Dominik Safaric
I’m deploying the job from the master node of the cluster itself using bin/flink run -c . The cluster consists of 4 workers and a master node. Dominik > On 8 Mar 2017, at 15:16, Ufuk Celebi wrote: > > How are you deploying your job? > > Shutdown hooks are execut

flink/cancel & shutdown hooks

2017-03-07 Thread Dominik Safaric
the shutdown hook gets executed, however I do not see the same behaviour when stopping the Flink application using bin/flink cancel . Considering there are no exceptions thrown from the shutdown thread, what could the root cause of this be? Thanks, Dominik

Re: FlinkKafkaConsumer010 - creating a data stream of type DataStream>

2017-03-07 Thread Dominik Safaric
Hi Gordon, Thanks for the advice. Following it I’ve implemented the Keyed(De)SerializationSchema and am able to further emit the metadata to downstream operators. Regards, Dominik > On 7 Mar 2017, at 07:08, Tzu-Li (Gordon) Tai wrote: > > Hi Dominik, > > I would recommend

FlinkKafkaConsumer010 - creating a data stream of type DataStream>

2017-03-06 Thread Dominik Safaric
nce, Dominik

Memory Limits: MiniCluster vs. Local Mode

2017-03-03 Thread dominik
e local mode? What are the use-cases in which you would choose one over the other? 2. Is there an example how to use the MiniCluster? I see that I need a JobGraph, how do I get one? 3. What are the tuning parameters to limit the memory consumption of the MiniCluster (and maybe the local mode)? Thanks for your help, Dominik

TaskManager failure detection

2017-02-22 Thread Dominik Safaric
? Up to my knowledge, the state should be restored using the CheckpointCoordinator or ExecutionGraph. Correct me if I’m wrong. Thanks in advance, Dominik

Flink 1.2 Maven dependency

2017-02-09 Thread Dominik Safaric
com/artifact/org.apache.flink/flink-core>). Could someone explain why there isn’t a Maven dependency available yet? Thanks, Dominik

Debugging, logging and measuring operator subtask performance

2017-01-25 Thread Dominik Safaric
threads? What approach would you recommend? Thanks in advance, Dominik

Re: benchmarking flink streaming

2017-01-25 Thread Dominik Safaric
Hi Stephan, As I’m already familiar with the latency markers of Flink 1.2, there is one question that bothers me in regard to them - how does Flink measure end-to-end latency when dealing with e.g. aggregations? Suppose you have a topology ingesting data from Kafka, and you want to output fre

Set Parallelism and keyBy

2016-12-26 Thread Dominik Bruhn
of the keys to the tasks: The "ExpensiveOperation" is now not executed on the same nodes anymore all the time (visible by the prefixes in the print()). What am I doing wrong? Is the only chance to set the whole parallelism of the whole flink job to 16? Thanks, have nice holidays, Dominik

Re: Who's hiring, December 2016

2016-12-16 Thread Dominik Bruhn
job ad, feel free to ping me, Dominik On 16.12.2016 14:46, Kostas Tzoumas wrote: Hi folks, As promised, here is the first thread for Flink-related job positions. If your organization is hiring people on Flink-related positions do reply to this thread with a link for applications. data

Flink 1.1.3 RollingSink - understanding output blocks/parallelism

2016-12-14 Thread Dominik Safaric
the fact that this will result in 6 distinct block files, whereas I would like to have a single file containing all of the output values from the DataStream. Regards, Dominik

Flink 1.1.3 RollingSink - mismatch in the number of records consumed/produced

2016-12-12 Thread Dominik Safaric
records more then consumed/available. What is the cause of this behaviour? Regards, Dominik

Re: Partitioning operator state

2016-12-08 Thread Dominik Safaric
? Dominik > On 8 Dec 2016, at 10:04, Stefan Richter wrote: > > Hi Dominik, > > as Gordon’s response only covers keyed-state, I will briefly explain what > happens for non-keyed operator state. In contrast to Flink 1.1, Flink 1.2 > checkpointing does not write a single blac

Partitioning operator state

2016-12-07 Thread Dominik Safaric
Hi everyone, In the case of scaling out a Flink cluster, how does Flink handle operator state partitioning of a staged topology? Regards, Dominik

Cannot connect to the JobManager - Flink 1.1.3 cluster mode

2016-11-23 Thread Dominik Safaric
. Since the cluster I am running has a VNET configured, could SSH be bypassed or is it a must? Thanks in advance, Dominik

Re: Flink Material & Papers

2016-11-21 Thread Dominik Safaric
applicable to Flink as well. Regards, Dominik > On 21 Nov 2016, at 17:53, Hanna Prinz wrote: > > Guten Abend everyone, > > I’m currently writing a term paper about Flink at the HTW Berlin and I wanted > to ask you if you can help with papers (or other material) about Flin

Running the JobManager and TaskManager on the same node in a cluster

2016-11-16 Thread Dominik Safaric
Hi, It is generally recommended for streaming engines, also including Flink to run a separate master node - in the case of Flink, the JobManager. However, why should one in Flink run the JobManager on a separate node? Performance wise, the JobManager isn’t intense unlike of course TaskManager

TaskManager log thread

2016-11-11 Thread Dominik Safaric
If taskmanager.debug.memory.startLogThread is set to true, where does the task manager output the logs to? Unfortunately I couldn’t find this information in the documentation, hence the question. Thanks in advance, Dominik

Release Process

2016-11-03 Thread Dominik Bruhn
policy here? When can I expect to see this in the a official release? How are the rules here? Thanks for a clarification, Dominik [1]: https://github.com/apache/flink/pull/2373 [2]: https://issues.apache.org/jira/browse/FLINK-4394

Re: Flink Kafka 0.10.0.0 connector

2016-11-03 Thread Dominik Safaric
parameterTool.getRequired("topic"), > new SimpleStringSchema(), > parameterTool.getProperties())); > > // write kafka stream to standard out. > messageStream.print(); > > env.execute("Read from Kafka example"); > > On Thu,

Re: Flink Kafka 0.10.0.0 connector

2016-11-03 Thread Dominik Safaric
assume that the FlinkKafkaConsumer instance should be of type SourceFunction. However, the same even happened while building the FlinkKafkaConsumer09. Any hint what might be going on? I’ve build the jar distribution as a clean maven package (without running the tests). Thanks, Dominik >

Flink Kafka 0.10.0.0 connector

2016-11-03 Thread Dominik Safaric
anyone could provide some guidance once how to successfully build the Flink Kafka connector supporting Kafka 0.10.x versions. Thanks in advance, Dominik

BoundedOutOfOrdernessTimestampExtractor and timestamps in the future

2016-11-01 Thread Dominik Bruhn
is there a better approach? In general, how does Flink handle readings from the future? Thanks, Dominik -- Dominik

TimeWindow Trigger which only fires when the values have changed

2016-10-04 Thread Dominik Bruhn
c/main/java/com/dataartisans/beam_comparison/customTriggers/EventTimeTriggerWithEarlyAndLateFiring.java Thanks, Dominik

Re: Consuming Messages from Kafka

2016-04-26 Thread Dominik Choma
Hi, You can check if any messages are going through dataflow on flink web dashboard https://flink.apache.org/img/blog/new-dashboard-screenshot.png <https://flink.apache.org/img/blog/new-dashboard-screenshot.png> Dominik Choma > Wiadomość napisana przez Conlin, Joshua [USA] w dniu

Re: Understanding Sliding Windows

2016-04-26 Thread Dominik Choma
Piyush, You created sliding window witch is triggered every 10 seconds Flink fires up this window every 10 seconds, without waiting at 5 min buffer to be filled up It seems to me that first argument is rather "maximum data buffer retention" than " the initial threshold" Domi