Hi Avi,
As described in the documentation: "If offsets could not be found for a
partition, the auto.offset.reset setting in the properties will be used." For
starting from group offsets, the property "auto.offset.reset" will ONLY be
respected when the group offset cannot be found for a partition.
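That interaction can be sketched with plain consumer properties (the broker address, group id, and topic below are placeholders, and the Flink connector calls are only indicated in comments, since they need the connector on the classpath):

```java
import java.util.Properties;

public class KafkaStartOffsets {
    public static Properties consumerProps() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder broker
        props.setProperty("group.id", "my-group");                // placeholder group id
        // Only consulted when NO committed group offset exists for a partition:
        props.setProperty("auto.offset.reset", "earliest");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps();
        System.out.println(props.getProperty("auto.offset.reset"));
        // With the Flink Kafka connector you would then do something like
        // (not compiled here):
        //   FlinkKafkaConsumer<String> c =
        //       new FlinkKafkaConsumer<>("topic", schema, props);
        //   c.setStartFromGroupOffsets(); // committed offsets first,
        //                                 // auto.offset.reset as fallback
    }
}
```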
DataStream.assignTimestampsAndWatermarks will add a watermark generator
operator after each source operator (if their parallelism is the same, which is
true for the code you showed), and so if one instance of the source operator
has no data, the corresponding watermark generator operator cannot gen
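The effect can be illustrated with a small plain-Java model of how a downstream operator combines per-instance watermarks (the min-over-all-inputs rule is real Flink behavior; the two-element array is just a stand-in for two parallel instances):

```java
import java.util.Arrays;

// Each parallel watermark generator only advances when *it* sees events, so an
// instance that receives no data keeps the combined watermark pinned at
// Long.MIN_VALUE: downstream operators take the minimum over all inputs.
public class WatermarkModel {
    static long[] instanceWatermarks = {Long.MIN_VALUE, Long.MIN_VALUE};

    static void onEvent(int instance, long ts) {
        instanceWatermarks[instance] = Math.max(instanceWatermarks[instance], ts);
    }

    static long globalWatermark() {
        // The watermark seen downstream is the minimum over all input channels.
        return Arrays.stream(instanceWatermarks).min().getAsLong();
    }

    public static void main(String[] args) {
        onEvent(0, 1_000L);
        onEvent(0, 2_000L);
        // Instance 1 never receives data, so the global watermark cannot advance.
        System.out.println(globalWatermark() == Long.MIN_VALUE);
    }
}
```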
Hi,
I am running a Flink job in IntelliJ IDEA with the mini cluster (not
submitting it to a Flink cluster) for convenience.
Now I have put my custom log config files (both log4j.properties and
logback.xml) in src/main/resources/, but they do not work. Are there any
solutions?
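For reference, a minimal log4j.properties along these lines (console appender only; whether it is actually picked up depends on which SLF4J binding ends up on the IDE classpath, since having both log4j.properties and logback.xml means the two bindings can shadow each other):

```properties
log4j.rootLogger=INFO, console
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{HH:mm:ss,SSS} %-5p %c - %m%n
```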
Thanks Ken,
That was my first instinct as well, but...
To run on the cluster I am building an uber jar, for which I am fixing the
Kafka clients jar version.
I am also fixing the version of Kafka,
so I do not know where another version could come from.
Boris Lublinsky
FDP Architect
boris.lublin...@lightbend.com
Hi Boris,
I haven’t seen this exact error, but I have seen similar errors caused by
multiple versions of jars on the classpath.
When I’ve run into this particular "XXX is not an instance of YYY" problem, it
often seems to be caused by a jar that I should have marked as provided in my
pom.
Tho
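For what it's worth, the usual pom change looks like this (the artifact coordinates here are only an example):

```xml
<dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-streaming-java_2.11</artifactId>
    <version>1.7.1</version>
    <!-- provided: available at compile time but excluded from the uber jar,
         so the version shipped with the cluster is the only one on the classpath -->
    <scope>provided</scope>
</dependency>
```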
Thanks Konstantin
Unfortunately it does not work
The snippet from task manager yaml is
containers:
- name: taskmanager
image: {{ .Values.image }}:{{ .Values.imageTag }}
imagePullPolicy: {{ .Values.imagePullPolicy }}
args:
- taskmanager -Dtaskmanager.host=$(K8S_POD_IP)
ports:
- name:
Konstantin,
After experimenting with this for a while, I got to the root cause of the
problem.
I am running a version of a Taxi ride travel prediction as my sample.
It works fine in IntelliJ,
but when I try to put it in Docker (the standard Debian 1.7 image)
it fails with the following error
Hi!
Currently I am using Flink 1.4.2.
class TSWM implements AssignerWithPunctuatedWatermarks&lt;POJO&gt; {
    long maxTS = Long.MIN_VALUE;

    @Override
    public Watermark checkAndGetNextWatermark(POJO event, long extractedTimestamp) {
        maxTS = Math.max(maxTS, event.TS);
        return new Watermark(maxTS);
    }

    @Override
    public long extractTimestamp(POJO event, long previousTimestamp) {
        return event.TS;
    }
}
Hello,
I’m trying to track the number of currently-in-state windows in a keyed,
windowed stream (stream.keyBy(…).window(…).trigger(…).process(…)) using Flink
metrics. Are there any built in? Or any good approaches for collecting this
data?
Thanks,
Andrew
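As far as I know there is no built-in metric for this; a common workaround is to maintain your own counter in the Trigger or ProcessWindowFunction and expose it as a gauge. A minimal plain-Java model of the bookkeeping (the Flink metric registration is only sketched in comments, since it needs the runtime; note the counter would be per parallel instance):

```java
import java.util.concurrent.atomic.AtomicLong;

public class OpenWindowCounter {
    // In Flink you would expose this via getRuntimeContext().getMetricGroup()
    //   .gauge("openWindows", openWindows::get)
    // inside the Trigger or ProcessWindowFunction's open() method.
    static final AtomicLong openWindows = new AtomicLong();

    static void onWindowCreated() { openWindows.incrementAndGet(); }  // e.g. Trigger.onElement for a new window
    static void onWindowPurged()  { openWindows.decrementAndGet(); }  // e.g. Trigger.clear / window fired

    public static void main(String[] args) {
        onWindowCreated();
        onWindowCreated();
        onWindowPurged();
        System.out.println(openWindows.get()); // two opened, one closed
    }
}
```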
Thanks for the answer,
But my question is: why do I need to set
myConsumer.setStartFromEarliest(); if I set the property
setProperty("auto.offset.reset", "earliest") in the consumer properties?
I want the consumer to start reading from earliest only if offsets could not
be found, as stated in th
Though I am explicitly assigning watermarks with
DataStream.assignTimestampsAndWatermarks and I see all the data flowing
through that... so shouldn't that override the watermarks from the original
source?
On Tue, 19 Feb 2019 at 15:59, Martin, Nick wrote:
> Yeah, that’s expected/known. Watermarks
Hi Timo,
That’s great, thank you very much. If I’d like to contribute, is it best to
wait until the roadmap has been published? And is this the best list to ask on,
or is the development mailing list better?
Many thanks,
John
Sent from my iPhone
> On 19 Feb 2019, at 16:29, Timo Walther wrot
Hi John,
you are right that there has not been much progress around these two FLIPs in
the last years, mostly due to a shift of priorities. However, with the big
Blink code contribution from Alibaba and joint development forces for a
unified batch and streaming runtime [1], it is very likely that al
We've just published a first attempt (on Flink 1.6.2) that extracts some
descriptive statistics from a batch dataset [1].
Any feedback is welcome.
Best,
Flavio
[1] https://github.com/okkam-it/flink-descriptive-stats
On Thu, Feb 14, 2019 at 11:19 AM Flavio Pompermaier
wrote:
> No effort in this d
Yeah, that’s expected/known. Watermarks for the empty partition don’t advance,
so the window in your window function never closes.
There’s a ticket open to fix it
(https://issues.apache.org/jira/browse/FLINK-5479) for the kafka connector, but
in general any time one parallel instance of a sourc
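Until that ticket is fixed, one workaround I have seen is a periodic watermark assigner that falls back to wall-clock time when a partition has been silent for some threshold. This is only a sketch of the idea (the `maxIdleMs` threshold is an assumption, and note the trade-off: an idle-advanced watermark can mark genuinely late-arriving events as late):

```java
// Sketch of an idleness workaround: track the last wall-clock time an event
// arrived; if the source has been idle longer than maxIdleMs, advance the
// watermark from processing time instead of event time.
public class IdleAwareWatermark {
    final long maxIdleMs;
    long maxTs = Long.MIN_VALUE;
    long lastEventWallClock;

    public IdleAwareWatermark(long maxIdleMs, long now) {
        this.maxIdleMs = maxIdleMs;
        this.lastEventWallClock = now;
    }

    public void onEvent(long ts, long now) {
        maxTs = Math.max(maxTs, ts);
        lastEventWallClock = now;
    }

    // Called periodically (in Flink: getCurrentWatermark()).
    public long currentWatermark(long now) {
        if (now - lastEventWallClock > maxIdleMs) {
            return now - maxIdleMs; // idle: advance despite no data
        }
        return maxTs;
    }
}
```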
Hi Rong,
Thank you for JIRA.
Understood that it may be solved in a future release; I'll comment on the
ticket in case of further input.
All the best
François
On Sat, 9 Feb 2019 at 00:57, Rong Rong wrote:
> Hi François,
>
> I just did some research and seems like this is in fact a Stringify issue.
>
Hi Fabian,
After a bit more documentation reading, I have a better understanding of how
the InputFormat interface works.
Indeed, I'd better wrap a custom InputFormat implementation in my source.
This article helps a lot
https://brewing.codes/2017/02/06/implementing-flink-batch-data-connector/
conne
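For anyone following along, the InputFormat lifecycle that article walks through can be modeled in plain Java (Flink's actual interface is org.apache.flink.api.common.io.InputFormat with InputSplit objects; this standalone sketch just mirrors the createInputSplits / open / reachedEnd / nextRecord sequence):

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java model of the InputFormat lifecycle:
// createInputSplits -> for each split: open(split) -> loop { reachedEnd? nextRecord } -> done.
public class ListInputFormat {
    private Iterator<String> it;

    public List<List<String>> createInputSplits() {
        // In Flink, splits describe partitions of the input (files, ranges, ...).
        return Arrays.asList(Arrays.asList("a", "b"), Arrays.asList("c"));
    }

    public void open(List<String> split) { it = split.iterator(); }
    public boolean reachedEnd()          { return !it.hasNext(); }
    public String nextRecord()           { return it.next(); }

    public static void main(String[] args) {
        ListInputFormat fmt = new ListInputFormat();
        int count = 0;
        for (List<String> split : fmt.createInputSplits()) {
            fmt.open(split);
            while (!fmt.reachedEnd()) {
                fmt.nextRecord();
                count++;
            }
        }
        System.out.println(count);
    }
}
```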
Hmmm, my suspicions are now quite high. I created a file source that just
replays the events straight, and then I get more results.
On Tue, 19 Feb 2019 at 11:50, Stephen Connolly <
stephen.alan.conno...@gmail.com> wrote:
> Hmmm after expanding the dataset such that there was additional data that
> e
/pipeline/prod/20190219/15/part-1550570070550-57-1
for DFSClient_NONMAPREDUCE_-706922361_126 on because
DFSClient_NONMAPREDUCE_-706922361_126 is already the current lease holder.
at
org.apache.hadoop.hdfs.server.namenode.FSNamesystem.recoverLeaseInternal(FSNamesystem.java:3109)
at
Hmmm after expanding the dataset such that there was additional data that
ended up on shard-0 (everything in my original dataset was coincidentally
landing on shard-1) I am now getting output... should I expect this kind of
behaviour if no data arrives at shard-0 ever?
On Tue, 19 Feb 2019 at 11:14
Hi All,
Does anyone know what the current status is for FLIP-16 (loop fault tolerance)
and FLIP-15 (redesign iterations) please? I can see lots of work back in 2016,
but it all seemed to stop and go quiet since about March 2017. I see iterations
as offering very interesting capabilities for Fli
Hi, I’m having a strange situation and I would like to know where I should
start trying to debug.
I have set up a configurable swap in source, with three implementations:
1. A mock implementation
2. A Kafka consumer implementation
3. A Kinesis consumer implementation
From injecting a log and no
Hi Boris,
without looking at the entrypoint in much detail, generally there should
not be a race condition there:
* if the taskmanagers cannot connect to the resourcemanager they will
retry (per default the timeout is 5 mins)
* if the JobManager does not get enough resources from the ResourceMan
Hi Boris,
the solution is actually simpler than it sounds from the ticket. The only
thing you need to do is to set the "taskmanager.host" to the Pod's IP
address in the Flink configuration. The easiest way to do this is to pass
this config dynamically via a command-line parameter.
The Deployment
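A sketch of how that might look in the Deployment spec (assuming K8S_POD_IP is exposed via the Downward API; note that each argument is its own list element, rather than one space-separated string):

```yaml
env:
- name: K8S_POD_IP
  valueFrom:
    fieldRef:
      fieldPath: status.podIP
args:
- taskmanager
- "-Dtaskmanager.host=$(K8S_POD_IP)"
```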
Hi,
I have a stream of buildings, and each building has a foreign key reference to
a municipality. The municipalities data set is quite static. Both are placed on
Kafka topics. I want to enrich each building with the municipality name.
The FLIP-17 proposal would be ideal for this use case, but it's still just a
pr
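Until side inputs land, a common pattern is to connect the two keyed streams and keep the municipality name in keyed state. A plain-Java model of that bookkeeping (in Flink this would be buildings.keyBy(fk).connect(municipalities.keyBy(id)).flatMap(new RichCoFlatMapFunction<...>) with a ValueState per key; the HashMap below stands in for that keyed state, and all names are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

// Model of stream enrichment via connected streams: municipality updates
// populate per-key state (flatMap2); building events look the name up (flatMap1).
public class EnrichmentModel {
    static final Map<Long, String> municipalityState = new HashMap<>();

    // flatMap2: municipality side updates the state.
    static void onMunicipality(long id, String name) {
        municipalityState.put(id, name);
    }

    // flatMap1: building side reads the state; buildings arriving before their
    // municipality need a fallback (or buffering in state until it arrives).
    static String onBuilding(long municipalityFk, String building) {
        String name = municipalityState.getOrDefault(municipalityFk, "UNKNOWN");
        return building + " in " + name;
    }

    public static void main(String[] args) {
        onMunicipality(1L, "Springfield");
        System.out.println(onBuilding(1L, "TownHall"));
    }
}
```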
Will do, thanks!
On Tue, Feb 19, 2019 at 8:57 AM Fabian Hueske wrote:
> Hi Artur,
>
> In order to subscribe to Flink's user mailing list you need to send a mail
> to user-subscr...@flink.apache.org
>
> Best, Fabian
>
> Am Mo., 18. Feb. 2019 um 20:34 Uhr schrieb Artur Mrozowski <
> art...@gmail.c