Re: Referencing Global Window across flink jobs

2017-07-09 Thread Konstantin Knauf
Hi Vijay, thanks for sharing the code. To my knowledge the only way to access the state of one job in another job right now is Queryable State, which in this case seems impractical. Why do you want to perform the apply functions in separate Flink jobs? In the same job I would just perform all agg

Re: Best practices for assigning operator UIDs

2017-07-09 Thread Gyula Fóra
Hi Jared, The only thing that matters is that UIDs are unique within one JobGraph. Its completely fine to use the same uids in two separate jobs. Beyond this I would go with simple uids that dont contain parts of the logic, because maybe you want to change the logic (expression, or add new topic

Re: External checkpoints not getting cleaned up/discarded - potentially causing high load

2017-07-09 Thread Jared Stehler
We are using the rocksDB state backend. We had not activated incremental checkpointing, but in the course of debugging this, we ended up doing so, and also moving back to S3 from EFS as it appeared that EFS was introducing large latencies. I will attempt to provide some profiler data as we are a

Best practices for assigning operator UIDs

2017-07-09 Thread Jared Stehler
I have some confusion around how best to assign UIDs to operators. The documentation states simply that they are important, but stops short of recommending what if any stateful information should go into the name. For example, if the same code is used to create two separate job graphs, should th

Re: problems starting the training exercise TaxiRideCleansing on local cluster

2017-07-09 Thread Günter Hipler
Thanks for response. My classpath contains a version mvn dependency:build-classpath [INFO] Scanning for projects... [INFO] [INFO] [INFO] Building Flink Quickstart Job 0.1 [INFO] -

Re: flink kafka consumer lag

2017-07-09 Thread Kien Truong
Hi, You should setup a metric reporter to collect Flink's metrics. https://ci.apache.org/projects/flink/flink-docs-release-1.2/monitoring/metrics.html There's a lot of useful information in the metrics, including the consumer lags. I'm using the Graphite reporter with InfluxDB for storage +

problems starting the training exercise TaxiRideCleansing on local cluster

2017-07-09 Thread Günter Hipler
Hi, sorry for this newbie question... I'm following the data artisans exercises and wanted to run the TaxiRide Cleansing job on my local cluster (version 1.3.1) (http://training.data-artisans.com/exercises/rideCleansing.html) While this is possible within my IDE the cluster throws an exceptio