Hi Vasia,
yes, but is it shared, or only independent within each Function?
If I set the aggregator in the VertexUpdateFunction, then the newly set value is not
visible in the MessagingFunction.
Or am I doing something wrong? I would like to have a shared aggregator to
normalize vertices.
> On 13.05.2016 at
Hi Prateek,
https://github.com/dataArtisans/yahoo-streaming-benchmark/blob/master/flink-benchmarks/src/main/java/flink/benchmark/utils/ThroughputLogger.java
https://github.com/dataArtisans/yahoo-streaming-benchmark/blob/master/flink-benchmarks/src/main/java/flink/benchmark/utils/AnalyzeTool.java
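For a rough idea of the in-stream measurement approach, here is a minimal sketch
(a simplified stand-in, not the code behind the links above; the class name,
String element type, and reporting interval are made up):

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.util.Collector;

// Pass-through operator that counts elements per parallel subtask and prints the
// observed rate every LOG_EVERY elements. Attach it anywhere in the pipeline.
public class SimpleThroughputLogger implements FlatMapFunction<String, String> {

    private static final long LOG_EVERY = 100_000; // hypothetical reporting interval

    private long count = 0;
    private long lastCount = 0;
    private long lastTime = -1;

    @Override
    public void flatMap(String element, Collector<String> out) throws Exception {
        count++;
        if (count % LOG_EVERY == 0) {
            long now = System.currentTimeMillis();
            if (lastTime > 0) {
                double rate = (count - lastCount) * 1000.0 / (now - lastTime);
                System.out.println("Observed throughput: " + rate + " elements/sec");
            }
            lastTime = now;
            lastCount = count;
        }
        out.collect(element); // forward the element unchanged
    }
}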
Hi Lydia,
aggregators registered through the ScatterGatherConfiguration are
accessible both in the VertexUpdateFunction and in the MessagingFunction.
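For example, a minimal sketch of that setup (hypothetical names and types; the
classes come from org.apache.flink.graph.spargel and
org.apache.flink.api.common.aggregators, and the configuration is passed to
runScatterGatherIteration together with the two functions):

// Register a shared aggregator under a name; both functions can look it up by that name.
ScatterGatherConfiguration config = new ScatterGatherConfiguration();
config.registerAggregator("vertexSum", new DoubleSumAggregator());
// pass 'config' as the last argument of graph.runScatterGatherIteration(...)

// Vertex update side: write to the aggregator, read the previous superstep's aggregate.
public final class MyUpdater extends VertexUpdateFunction<Long, Double, Double> {
    @Override
    public void updateVertex(Vertex<Long, Double> vertex, MessageIterator<Double> messages) {
        DoubleSumAggregator agg = getIterationAggregator("vertexSum");
        agg.aggregate(vertex.getValue());
        DoubleValue previous = getPreviousIterationAggregate("vertexSum");
        // previous.getValue() holds what was aggregated during the previous superstep
        setNewVertexValue(vertex.getValue());
    }
}

// Messaging side: the same named aggregate is accessible here too.
public final class MyMessenger extends MessagingFunction<Long, Double, Double, Double> {
    @Override
    public void sendMessages(Vertex<Long, Double> vertex) {
        DoubleValue previous = getPreviousIterationAggregate("vertexSum");
        sendMessageToAllNeighbors(vertex.getValue() / Math.max(previous.getValue(), 1.0));
    }
}

Note that, as far as I know, a value aggregated during superstep n only becomes
readable via getPreviousIterationAggregate() in superstep n+1, not within the same
superstep.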
Cheers,
-Vasia.
On 12 May 2016 at 20:08, Lydia Ickler wrote:
> Hi,
>
> I have a question regarding the Aggregators of a Scatter-Gather Iteration.
I don't think the ValueState defined in one map function is accessible in
another map function; that is my understanding. Also be aware that an
instance of the map function will be created for each of your tuples in
your stream. I had a similar use case where I had to pass in some state
from one
Hello all,
Let's say I want to hold some state value derived during one
transformation, and then use that same state value in a subsequent
transformation. For example:
myStream
.keyBy(fieldID) // Some field ID, may be 0
.map(new MyStatefulMapper())
.map(new MySubsequentMapper())
Now, I defi
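A minimal sketch of what MyStatefulMapper could look like with keyed state
(hypothetical types and names, assuming Flink 1.0's ValueStateDescriptor(name,
type, default) constructor). Since the state is scoped to this operator and key,
the second mapper can only see what is carried inside the emitted records:

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;

// Keeps a per-key running sum and emits it with every record, so that a
// downstream operator can use the derived value without sharing state.
public class MyStatefulMapper
        extends RichMapFunction<Tuple2<Integer, Long>, Tuple2<Integer, Long>> {

    private transient ValueState<Long> runningSum;

    @Override
    public void open(Configuration parameters) {
        ValueStateDescriptor<Long> descriptor =
                new ValueStateDescriptor<>("runningSum", Long.class, 0L);
        runningSum = getRuntimeContext().getState(descriptor);
    }

    @Override
    public Tuple2<Integer, Long> map(Tuple2<Integer, Long> value) throws Exception {
        long newSum = runningSum.value() + value.f1;
        runningSum.update(newSum);
        // emit the derived state so the next mapper can consume it
        return new Tuple2<>(value.f0, newSum);
    }
}

MySubsequentMapper would then read the derived value from the incoming tuple
rather than from shared state.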
...btw I found this (in folder:
"flink/flink-streaming-connectors/flink-connector-elasticsearch2.pom.xml")
:
2.2.1
I changed it to version 2.3.2 and of course rebuilt with the command "mvn
clean install -DskipTests"
...but nothing changed.
2016-05-12 22:39 GMT+02:00 rafal green :
> Sorr
On Thu, May 12, 2016 at 10:44 PM, Andrew Whitaker
wrote:
> From what I've observed, most of the time when Flink can't successfully
> restore a checkpoint it throws an exception saying as much. I was expecting
> to see that behavior here. Could someone explain why this "works" (as in,
> flink accep
"Flink can't successfully restore a checkpoint" should be "Flink can't
successfully restore a savepoint".
On Thu, May 12, 2016 at 3:44 PM, Andrew Whitaker <
andrew.whita...@braintreepayments.com> wrote:
> Hi,
>
> I was recently experimenting with savepoints and various situations in
> which they
Hi,
I was recently experimenting with savepoints and various situations in
which they succeed or fail. I expected this example to fail:
https://gist.github.com/AndrewWhitaker/fa46db04066ea673fe0eda232f0a5ce1
Basically, the first program runs with a count window. The second program
is identical e
If I can throw in my 2 cents, I agree with what Elias says. Without that
feature (not partitioning already partitioned Kafka data), Flink is in a bad
position for common simpler processing that doesn't involve shuffling at
all, for example a simple readKafka-enrich-writeKafka pipeline. Systems like the
new
Hi Gordon,
Thanks for the advice - it works perfectly, but only in the Elasticsearch case.
This pom version works for Elasticsearch 2.2.1.
org.apache.flink
flink-connector-elasticsearch2_${scala.version}
1.1-SNAPSHOT
jar
false
${project.build.directory}/classes
org/apache/flink/**
Thanks,
I'll try to reproduce it in some test by myself...
maciek
On 12/05/2016 18:39, Ufuk Celebi wrote:
The issue is here: https://issues.apache.org/jira/browse/FLINK-3902
(My "explanation" before dosn't make sense actually and I don't see a
reason why this should be related to having many s
Hello,
I was reading about Flink's checkpointing and wanted to check if I correctly
understood the usage of barriers for exactly-once processing.
1) An operator does alignment by buffering records coming after a barrier
until it receives the barrier from all upstream operator instances.
2) A barrier is alw
Hi,
I have a question regarding the Aggregators of a Scatter-Gather Iteration.
Is it possible to have a global aggregator that is accessible in
VertexUpdateFunction() and MessagingFunction() at the same time?
Thanks in advance,
Lydia
Hi Prateek,
regarding throughput, what about simply filling the input Kafka topic
with some (a lot of) messages and monitoring (e.g. with
http://quantifind.github.io/KafkaOffsetMonitor/) how quickly Flink can
work off the lag? The messages should be representative of your use
case, of course.
Latency is
Hi
How can I measure the throughput and latency of my application in Flink 1.0.2?
Regards
Prateek
Hi to all,
while running a job that writes parquet-thrift files, I got this exception (in a
Task Manager):
io.netty.channel.nio.NioEventLoop - Unexpected
exception in the selector loop.
java.lang.OutOfMemoryError: Java heap space
2016-05-12 18:49:11,302 WARN
org.jboss.netty.ch
The issue is here: https://issues.apache.org/jira/browse/FLINK-3902
(My "explanation" before dosn't make sense actually and I don't see a
reason why this should be related to having many state handles.)
On Thu, May 12, 2016 at 3:54 PM, Ufuk Celebi wrote:
> Hey Maciek,
>
> thanks for reporting th
Hi Lydia,
there is no dedicated Gelly API method that performs normalization. If you
know the max value, then a mapVertices() would suffice. Otherwise, you can
get the DataSet of vertices with getVertices() and apply any kind of
operation supported by the DataSet API on it.
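A rough sketch of the second option (types are made up; assumes the absolute max
can be collected to the client as below, and DataSet#collect() throws Exception,
so this would live in a method that declares it):

// Find the absolute max of all vertex values, then rescale with mapVertices().
DataSet<Vertex<Long, Double>> vertices = graph.getVertices();

final double absMax = vertices
        .map(new MapFunction<Vertex<Long, Double>, Double>() {
            @Override
            public Double map(Vertex<Long, Double> v) {
                return Math.abs(v.getValue());
            }
        })
        .reduce(new ReduceFunction<Double>() {
            @Override
            public Double reduce(Double a, Double b) {
                return Math.max(a, b);
            }
        })
        .collect().get(0);

Graph<Long, Double, Double> normalized = graph.mapVertices(
        new MapFunction<Vertex<Long, Double>, Double>() {
            @Override
            public Double map(Vertex<Long, Double> v) {
                return v.getValue() / absMax;
            }
        });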
Best,
-Vasia.
On May 1
Thanks for the help. I now use a FoldFunction and a WindowFunction in conjunction
and it works fine, though I wish there were a less complicated way to
do this.
cheers Martin
On Thu, May 12, 2016 at 11:59 AM, Fabian Hueske wrote:
> Hi Martin,
>
> You can use a FoldFunction and a WindowFunction to p
Hey Maciek,
thanks for reporting this. Having files linger around looks like a bug to me.
The idea behind having the recursive flag set to false in the
AbstractFileStateHandle.discardState() call is that the
FileStateHandle is actually just a single file and not a directory.
The second call tryin
I see. But even if you had an operator (A,B)->(A,B), it would not
be possible to block A if B does not deliver any data, because of
Flink's internal design.
You would need a custom solution: something like a map (one
for each stream) that uses a side-communication channel (i.e., exte
Yes, this is generally a viable design, and is actually something we
started off with.
The problem in our case is, however, that either of the streams can
occasionally (due to external producer's issues) get stuck for an arbitrary
period of time, up to several hours. Buffering the other one during
That is correct. But there is no reason to throttle an input stream.
If you implement an outer join, you will have two in-memory buffers
holding the records of each stream for your "time window". Each time you
receive a watermark, you can remove all "expired" records from the
buffer of the other str
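To make the bookkeeping concrete, a very rough sketch of the two buffers and the
watermark-driven eviction (plain Java, deliberately not tied to the exact
TwoInputStreamOperator API; all names are made up and the emission of un-joined
records is left out):

import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// One buffer per input, keyed by event timestamp. A record is joined on arrival
// against the other buffer; a watermark on one input tells us which records of
// the *other* buffer can never be matched anymore and may be evicted (for an
// outer join, they would first be emitted with a null counterpart).
public class OuterJoinBuffers<L, R> {

    private final long joinWindowMillis;
    private final NavigableMap<Long, List<L>> leftBuffer = new TreeMap<>();
    private final NavigableMap<Long, List<R>> rightBuffer = new TreeMap<>();

    public OuterJoinBuffers(long joinWindowMillis) {
        this.joinWindowMillis = joinWindowMillis;
    }

    public void addLeft(long timestamp, L record) {
        leftBuffer.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(record);
        // ... probe rightBuffer.subMap(timestamp - joinWindowMillis, true,
        //                              timestamp + joinWindowMillis, true) and emit joins
    }

    public void addRight(long timestamp, R record) {
        rightBuffer.computeIfAbsent(timestamp, t -> new ArrayList<>()).add(record);
        // ... probe leftBuffer symmetrically
    }

    // Watermark on the right input: no right record with a smaller timestamp will
    // arrive anymore, so left records older than (watermark - window) are expired.
    public void onRightWatermark(long watermark) {
        leftBuffer.headMap(watermark - joinWindowMillis, false).clear();
    }

    // Symmetric case for a watermark on the left input.
    public void onLeftWatermark(long watermark) {
        rightBuffer.headMap(watermark - joinWindowMillis, false).clear();
    }
}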
Hmm, probably I don't really get how Flink's execution model works. As far
as I understand, the preferred way to throttle down stream consumption is
to simply have an operator with a conditional Thread.sleep() inside.
Wouldn't calling sleep() in either of TwoInputStreamOperator's
processWatermarkN(
I cannot follow completely. TwoInputStreamOperator defines two methods
to process watermarks, one for each stream.
So you can sync both streams within the outer join operator you plan to
implement.
-Matthias
On 05/11/2016 05:00 PM, Alexander Gryzlov wrote:
> Hello,
>
> We're implementing a streamin
Good to know :)
On 12 May 2016 at 11:16, Simone Robutti
wrote:
> Ok, I tested it and it works on the same example. :)
>
> 2016-05-11 12:25 GMT+02:00 Vasiliki Kalavri :
>
>> Hi Simone,
>>
>> Fabian has pushed a fix for the streaming TableSources that removed the
>> Calcite Stream rules [1].
>> Th
Yes, this should work.
On Tue, 10 May 2016 at 19:01 Srikanth wrote:
> Yes, will work.
> I was trying another route of having a "finalize & purge trigger" that will
>i) onElement - Register for event time watermark but not alter nested
> trigger's TriggerResult
> ii) OnEventTime - Always pu
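A rough sketch of such a wrapping trigger (assuming the Flink 1.0 Trigger
interface with onElement/onEventTime/onProcessingTime; the class name is made up):

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.Window;

// Delegates to a nested trigger, but additionally registers an event-time timer
// for the end of the window and always fires-and-purges when that timer comes.
public class FinalizeAndPurgeTrigger<T, W extends Window> implements Trigger<T, W> {

    private final Trigger<T, W> nested;

    public FinalizeAndPurgeTrigger(Trigger<T, W> nested) {
        this.nested = nested;
    }

    @Override
    public TriggerResult onElement(T element, long timestamp, W window, TriggerContext ctx)
            throws Exception {
        // register for the end-of-window watermark, but keep the nested decision untouched
        ctx.registerEventTimeTimer(window.maxTimestamp());
        return nested.onElement(element, timestamp, window, ctx);
    }

    @Override
    public TriggerResult onEventTime(long time, W window, TriggerContext ctx) throws Exception {
        // when the watermark passes the end of the window: always finalize and purge
        return TriggerResult.FIRE_AND_PURGE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, W window, TriggerContext ctx) throws Exception {
        return nested.onProcessingTime(time, window, ctx);
    }
}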
Hi Tarandeep,
the AvroInputFormat was recently extended to support GenericRecords. [1]
You could also try to run the latest SNAPSHOT version and see if it works
for you.
Cheers, Fabian
[1] https://issues.apache.org/jira/browse/FLINK-3691
2016-05-12 10:05 GMT+02:00 Tarandeep Singh :
> I think I
Hi Martin,
You can use a FoldFunction and a WindowFunction to process the same
window. The FoldFunction is eagerly applied, so the window state is only
one element. When the window is closed, the aggregated element is given to
the WindowFunction where you can add start and end time. The iterator
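For reference, a sketch of that combination (assuming Flink 1.0's
apply(initialValue, FoldFunction, WindowFunction) overload on WindowedStream; the
Tuple2<String, Long> input, the 5-minute window, and the output layout are made up):

// input is a DataStream<Tuple2<String, Long>>, keyed by the String field.
// Output: (windowStart, windowEnd, sum) per key and window. The fold keeps the
// running sum as the only window state; the window function attaches the times.
DataStream<Tuple3<Long, Long, Long>> result = input
        .keyBy(0)
        .timeWindow(Time.minutes(5))
        .apply(
                new Tuple3<Long, Long, Long>(0L, 0L, 0L),
                new FoldFunction<Tuple2<String, Long>, Tuple3<Long, Long, Long>>() {
                    @Override
                    public Tuple3<Long, Long, Long> fold(Tuple3<Long, Long, Long> acc,
                                                         Tuple2<String, Long> value) {
                        return new Tuple3<>(acc.f0, acc.f1, acc.f2 + value.f1);
                    }
                },
                new WindowFunction<Tuple3<Long, Long, Long>, Tuple3<Long, Long, Long>, Tuple, TimeWindow>() {
                    @Override
                    public void apply(Tuple key, TimeWindow window,
                                      Iterable<Tuple3<Long, Long, Long>> folded,
                                      Collector<Tuple3<Long, Long, Long>> out) {
                        Tuple3<Long, Long, Long> agg = folded.iterator().next();
                        out.collect(new Tuple3<>(window.getStart(), window.getEnd(), agg.f2));
                    }
                });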
Hi,
I am running the following sample code to understand how iteration and
broadcast work in a streaming context.
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment();
env.setParallelism(4);
long i = 5;
Data
Ok, I tested it and it works on the same example. :)
2016-05-11 12:25 GMT+02:00 Vasiliki Kalavri :
> Hi Simone,
>
> Fabian has pushed a fix for the streaming TableSources that removed the
> Calcite Stream rules [1].
> The reported error does not appear anymore with the current master. Could
> you
No, I am using Kafka 0.8.0.2. I did some experiments with changing the
parallelism from 4 to 16; now the lag has been reduced to 20 min from 2 hours,
and the CPU utilization (load avg) has gone up from 20-30% to 50-60%, so
parallelism does seem to play a role in reducing the processing lag in
Flink, as I e
Great :)
On Thu, May 12, 2016 at 10:01 AM, Palle wrote:
> Hi guys.
>
> Thanks for helping out.
>
> We downgraded to HBase 0.98 and resolved some classpath issues and then it
> worked.
>
> /Palle
>
> - Original message -
>
> *From:* Stephan Ewen
> *To:* user@flink.apache.org
> *Date:*
Hi all,
If I have a Graph g: Graph g
and I would like to normalize all vertex values by the absolute max of all
vertex values, what API function would I choose?
Thanks in advance!
Lydia
Hi,
we have a streaming job with quite a large state (a few GB); we're using
the FSStateBackend and we're storing checkpoints in HDFS.
What we observe is that very often old checkpoints are not discarded
properly. In the Hadoop logs I can see:
2016-05-10 12:21:06,559 INFO BlockStateChange: BLOCK* addToInvalidat
The Flink PMC is pleased to announce the availability of Flink 1.0.3.
The official release announcement:
http://flink.apache.org/news/2016/05/11/release-1.0.3.html
Release binaries:
http://apache.openmirror.de/flink/flink-1.0.3/
Please update your Maven dependencies to the new 1.0.3 version and
I think I found a workaround. Instead of reading Avro files as
GenericRecords, if I read them as specific records and then use a map to
convert (typecast) them as GenericRecord, the problem goes away.
I ran some tests and so far this workaround seems to be working in my local
setup.
-Tarandeep
O
Hi guys.
Thanks for helping out.
We downgraded to HBase 0.98 and resolved some classpath issues and then
it worked.
/Palle
- Original message -
> From: Stephan Ewen
> To: user@flink.apache.org
> Date: Wed, 11 May 2016 17:19
> Subject: Re: HBase write problem
>
> Just to narrow down