How can I confirm a savepoint is used for a new job?

2018-03-21 Thread Hao Sun
Do we have any logs in JM/TM indicate the job is using a savepoint I passed in when I submit the job? Thanks

how does SQL mode work with PopularPlaces example?

2018-03-21 Thread James Yu
Hi, I am following the PopularPlacesSQL example ( http://training.data-artisans.com/exercises/popularPlacesSql.html), but I am unable to understand why the following statement will pickup events with START flag only. "SELECT " + "toCoords(cell), wstart, wend, isStart, popCnt " + "FROM " + "(SELEC

Re: entrypoint for executing job in task manager

2018-03-21 Thread Steven Wu
Thanks, let me clarify the requirement. Sorry that it wasn't clear in the original email. Here is our setup. these 3 dirs are added to classpath * flink/lib: core flink jars (like flink-dist_2.11, flink-shaded-hadoop2-uber) * spaaslib: many jars pulled in our internal platform * jobs: a single fa

Re: Strange behavior on filter, group and reduce DataSets

2018-03-21 Thread Fabian Hueske
Hi, That was a bit too early. I found an issue with my approach. Will come back once I solved that. Best, Fabian 2018-03-21 23:45 GMT+01:00 Fabian Hueske : > Hi, > > I've opened a pull request [1] that should fix the problem. > It would be great if you could try change and report back whether i

Re: Strange behavior on filter, group and reduce DataSets

2018-03-21 Thread Fabian Hueske
Hi, I've opened a pull request [1] that should fix the problem. It would be great if you could try change and report back whether it fixes the problem. Thank you, Fabian [1] https://github.com/apache/flink/pull/5742 2018-03-21 9:49 GMT+01:00 simone : > Hi all, > > an update: following Stephan

InterruptedException when async function is cancelled

2018-03-21 Thread Ken Krugler
Hi all, When I cancel a job that has async functions, I see this sequence in the TaskManager logs: 2018-03-21 14:51:34,471 INFO org.apache.flink.runtime.taskmanager.Task - Attempting to cancel task AsyncFunctionName (1/1) (fcb7bbe7cd89f1167f8a656b0f2fdaf9). 2018-03-21 14:5

Out off memory when catching up

2018-03-21 Thread Lasse Nedergaard
Hi. When our jobs are catching up they read with a factor 10-20 times normal rate but then we loose our task managers with OOM. We could increase the memory allocation but is there a way to figure out how high rate we can consume with the current memory and slot allocation and a way to limit t

Re: ListCheckpointed function - what happens prior to restoreState() being called?

2018-03-21 Thread Ken Krugler
Hi Fabian, > On Mar 20, 2018, at 6:38 AM, Fabian Hueske wrote: > > Hi Ken, > > The documentation page describes that first the state is restored / > initialized and then the function's open() method is called [1]. Yes, thanks - my question was about ListCheckpointed.restoreState(), which doe

Re: Confluent Schema Registry DeserializationSchema

2018-03-21 Thread dim5b
I added kafka tomy dependencies although i am not sure why this would be required... seems to work org.apache.kafka kafka_${kafka.scala.version} ${kafka.version} This is my full dependency l

Re: Error running on Hadoop 2.7

2018-03-21 Thread ashish pok
Hi Piotrek, At this point we are simply trying to start a YARN session.  BTW, we are on Hortonworks HDP 2.6 which is on 2.7 Hadoop if anyone has experienced similar issues.  We actually pulled 2.6 binaries for the heck of it and ran into same issues.  I guess we are left with getting non-hadoop bi

Re: entrypoint for executing job in task manager

2018-03-21 Thread Stephan Ewen
It would be great to understand a bit more what the exact requirements here are, and what setup you use. I am not a dependency injection expert, so let me know if what I am suggesting here is complete bogus. *(1) Fix set of libraries for Dependency Injection, or dedicated container images per ap

Re: Error running on Hadoop 2.7

2018-03-21 Thread Piotr Nowojski
Hi, > Does some simple word count example works on the cluster after the upgrade? If not, maybe your job is pulling some dependency that’s causing this version conflict? Piotrek > On 21 Mar 2018, at 16:52, ashish pok wrote: > > Hi Piotrek, > > Yes, this is a brand new Prod environment. 2.6

Re: Confluent Schema Registry DeserializationSchema

2018-03-21 Thread Piotr Nowojski
Hi, It looks like to me that kafka.utils.VerifiableProperties comes from org.apache.kafka:kafka package - please check and solve (if possible) dependency conflicts in your pom.xml regarding this package. Probably there is some version collision. Piotrek > On 21 Mar 2018, at 16:40, dim5b wro

Re: Error running on Hadoop 2.7

2018-03-21 Thread ashish pok
Hi Piotrek, Yes, this is a brand new Prod environment. 2.6 was in our lab. Thanks, -- Ashish On Wed, Mar 21, 2018 at 11:39 AM, Piotr Nowojski wrote: Hi, Have you replaced all of your old Flink binaries with freshly downloaded Hadoop 2.7 versions? Are you sure that something hasn't mix in

Re: Record Delivery Guarantee with Kafka 1.0.0

2018-03-21 Thread Stephan Ewen
This should be handled by Flink. The system does flush records on checkpoints and does not confirm a checkpoint before all flushes are acked back. Did you turn on checkpointing? Without that, Flink cannot give guarantees for exactly the reason you outlined above. On Wed, Mar 14, 2018 at 9:34 PM

Confluent Schema Registry DeserializationSchema

2018-03-21 Thread dim5b
I trying to connect to schema registry and deserialize the project. I am building my project and on mvn build i get the error class file for kafka.utils.VerifiableProperties not found... import io.confluent.kafka.schemaregistry.client.CachedSchemaRegistryClient; import io.confluent.kafka.sch

Re: Error running on Hadoop 2.7

2018-03-21 Thread Piotr Nowojski
Hi, Have you replaced all of your old Flink binaries with freshly downloaded Hadoop 2.7 versions? Are you sure that something hasn't mix in the process? Does some simple word count example works on the cluster after the upgrade? Piotrek > On 21 Mar 20

Re: Standalone cluster instability

2018-03-21 Thread Piotr Nowojski
Hi, Does the issue really happen after 48 hours? Is there some indication of a failure in TaskManager log? If you will be still unable to solve the problem, please provide full TaskManager and JobManager logs. Piotrek > On 21 Mar 2018, at 16:00, Alexander Smirnov > wrote: > > One more ques

Error running on Hadoop 2.7

2018-03-21 Thread ashish pok
Hi All, We ran into a roadblock in our new Hadoop environment, migrating from 2.6 to 2.7. It was supposed to be an easy lift to get a YARN session but doesnt seem like :) We definitely are using 2.7 binaries but it looks like there is a call here to a private methos which screams runtime incompa

Re: Migration to Flip6 Kubernetes

2018-03-21 Thread Eron Wright
It would be helpful to expand on how, in job mode, the job graph would be produced. The phrase 'which contains the single job you want to execute' has a few meanings; I believe Till means a serialized job graph, not an executable JAR w/ main method. Till is that correct? On Tue, Mar 20, 2018 at

Re: Migration to Flip6 Kubernetes

2018-03-21 Thread Edward Rojas
Hi Till, Thanks for the information. We are using the session cluster and is working quite good, but we would like to benefit from the new approach of per-job mode in order to have a better control over the jobs that are running on the cluster. I took a look to the YarnJobClusterEntrypoint and I

Re: Standalone cluster instability

2018-03-21 Thread Alexander Smirnov
One more question - I see a lot of line like the following in the logs [2018-03-21 00:30:35,975] ERROR Association to [akka.tcp:// fl...@qafdsflinkw811.nn.five9lab.com:35320] with UID [1500204560] irrecoverably failed. Quarantining address. (akka.remote.Remoting) [2018-03-21 00:34:15,208] WARN Ass

Re: Kafka ProducerFencedException after checkpointing

2018-03-21 Thread Piotr Nowojski
Hi, But that’s exactly the case: producer’s transaction timeout starts when the external transaction starts - but FlinkKafkaProducer011 keeps an active Kafka transaction for the whole period between checkpoints. As I wrote in the previous message: > in case of failure, your timeout must also b

Standalone cluster instability

2018-03-21 Thread Alexander Smirnov
Hello, I've assembled a standalone cluster of 3 task managers and 3 job managers(and 3 ZK) following the instructions at https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/deployment/cluster_setup.html and https://ci.apache.org/projects/flink/flink-docs-release-1.4/ops/jobmanager_hig

Re: Apache Zookeeper vs Flink Zookeeper

2018-03-21 Thread Alexander Smirnov
Thanks Gary, appreciate your quick response On Wed, Mar 21, 2018 at 2:33 PM Gary Yao wrote: > Hi Alex, > > You can use vanilla Apache ZooKeeper. The class FlinkZooKeeperQuorumPeer > is only > used if you start ZooKeeper via the provided script bin/zookeeper.sh. > FlinkZooKeeperQuorumPeer does no

Re: Apache Zookeeper vs Flink Zookeeper

2018-03-21 Thread Gary Yao
Hi Alex, You can use vanilla Apache ZooKeeper. The class FlinkZooKeeperQuorumPeer is only used if you start ZooKeeper via the provided script bin/zookeeper.sh. FlinkZooKeeperQuorumPeer does not add any functionality except creating ZooKeeper's myid file. Best, Gary On Wed, Mar 21, 2018 at 12:02

Apache Zookeeper vs Flink Zookeeper

2018-03-21 Thread Alexander Smirnov
Hi, For standalone cluster configuration, is it possible to use vanilla Apache Zookeeper? I saw there's a wrapper around it in Flink - FlinkZooKeeperQuorumPeer. Is it mandatory to use it? Thank you, Alex

Re: Strange behavior on filter, group and reduce DataSets

2018-03-21 Thread simone
Hi all, an update: following Stephan directives on how to diagnose the issue, making Person immutable, the problem does not occur. Simone. On 20/03/2018 20:20, Stephan Ewen wrote: To diagnose that, can you please check the following:   - Change the Person data type to be immutable (final f

Re: Is Hadoop 3.0 integration planned?

2018-03-21 Thread Stephan Ewen
That is definitely a good thing to have, would like to have a discussion about how to approach that after 1.5 is released. On Wed, Mar 21, 2018 at 5:39 AM, Jayant Ameta wrote: > > Jayant Ameta >

Re: Queryable State

2018-03-21 Thread Kostas Kloudas
Hi Vishal, As Fabian said, queryable state is just a feature that exposes the state kept within Flink, and it is not made to replace functionality that would otherwise be made by a sink. In the future the functionality will definitely evolve but for there are no discussions currently, for keepi