Hi Robert,

I've been trying to build the "performance" project against various versions of Flink, but the build keeps failing. It seems that I need both the KafkaZKStringSerializer class and the FlinkKafkaConsumer082 class to build the project, but no branch has both of them. KafkaZKStringSerializer existed in the 0.9.0-x branches but was deleted in the 0.9.1-x branches, and FlinkKafkaConsumer082 goes the other way, so the two never exist in the same branch. I'm guessing you were using a snapshot somewhere between 0.9.0 and 0.9.1. Could you tell me the SHA you were using?
Regards,
Eric

On Wed, Aug 24, 2016 at 4:57 PM, Robert Metzger <rmetz...@apache.org> wrote:

> Hi,
>
> Version 0.10-SNAPSHOT is pretty old. The snapshot repository of Apache
> probably doesn't keep old artifacts around forever.
> Maybe you can migrate the tests to Flink 0.10.0, or maybe even to a higher
> version.
>
> Regards,
> Robert
>
> On Wed, Aug 24, 2016 at 10:32 PM, Eric Fukuda <e.s.fuk...@gmail.com>
> wrote:
>
>> Hi Max, Robert,
>>
>> Thanks for the advice. I'm trying to build the "performance" project, but
>> failing with the following error. Is there a solution for this?
>>
>> [ERROR] Failed to execute goal on project streaming-state-demo: Could not
>> resolve dependencies for project
>> com.dataartisans.flink:streaming-state-demo:jar:1.0-SNAPSHOT: Failure to find
>> org.apache.flink:flink-connector-kafka-083:jar:0.10-SNAPSHOT in
>> https://repository.apache.org/content/repositories/snapshots/ was cached
>> in the local repository, resolution will not be reattempted until the
>> update interval of apache.snapshots has elapsed or updates are forced ->
>> [Help 1]
>>
>>
>> On Wed, Aug 24, 2016 at 8:12 AM, Robert Metzger <rmetz...@apache.org>
>> wrote:
>>
>>> Hi Eric,
>>>
>>> Max is right, the tool has been used for a different benchmark [1]. The
>>> throughput logger that should produce the right output is this one [2].
>>> Very recently, I've opened a pull request for adding metric-measuring
>>> support into the engine [3]. Maybe that's helpful for your experiments.
>>>
>>>
>>> [1] http://data-artisans.com/high-throughput-low-latency-and-exactly-once-stream-processing-with-apache-flink/
>>> [2] https://github.com/dataArtisans/performance/blob/master/flink-jobs/src/main/java/com/github/projectflink/streaming/Throughput.java#L203
>>> [3] https://github.com/apache/flink/pull/2386
>>>
>>>
>>> On Wed, Aug 24, 2016 at 2:04 PM, Maximilian Michels <m...@apache.org>
>>> wrote:
>>>
>>>> I believe the AnalyzeTool is for processing logs of a different
>>>> benchmark.
>>>>
>>>> CC Jamie and Robert who worked on the benchmark.
>>>>
>>>> On Wed, Aug 24, 2016 at 3:25 AM, Eric Fukuda <e.s.fuk...@gmail.com>
>>>> wrote:
>>>> > Hi,
>>>> >
>>>> > I'm trying to benchmark Flink without Kafka as mentioned in this post
>>>> > (http://data-artisans.com/extending-the-yahoo-streaming-benchmark/). After
>>>> > running flink.benchmark.state.AdvertisingTopologyFlinkState with
>>>> > user.local.event.generator in localConf.yaml set to 1, I ran
>>>> > flink.benchmark.utils.AnalyzeTool giving
>>>> > flink-1.0.1/log/flink-[username]-jobmanager-0-[servername].log as a
>>>> > command-line argument. I got the following output and it does not have the
>>>> > information about the latency.
>>>> >
>>>> >
>>>> > ================= Latency (0 reports ) =====================
>>>> > ================= Throughput (1 reports ) =====================
>>>> > ====== null (entries: 10150)=======
>>>> > Mean throughput 639078.5018497099
>>>> > Exception in thread "main" java.lang.IndexOutOfBoundsException: toIndex = 2
>>>> >         at java.util.ArrayList.subListRangeCheck(ArrayList.java:962)
>>>> >         at java.util.ArrayList.subList(ArrayList.java:954)
>>>> >         at flink.benchmark.utils.AnalyzeTool.main(AnalyzeTool.java:133)
>>>> >
>>>> >
>>>> > Reading the code in AnalyzeTool.java, I found that it's looking for lines
>>>> > that include "Latency" in the log file, but apparently it's not finding any.
>>>> > I tried grepping the log file, and couldn't find any either. I have one
>>>> > server that runs both JobManager and Task Manager and another server that
>>>> > runs Redis, and they are connected through a network with each other.
>>>> >
>>>> > I think I have to do something to read the data stored in Redis before
>>>> > running AnalyzeTool, but can't figure out what. Does anyone know how to get
>>>> > the latency information?
>>>> >
>>>> > Thanks,
>>>> > Eric
>>>>
>>>
>>
>