Hi everyone,
I'm getting the following error when trying to restore from a savepoint.
Below is the output from the flink CLI, and the attachment contains a TM log.
I didn't make any changes to the app between taking the savepoint and the
restore attempt. All window operators have been assigned unique ID strings.
Could you ple
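(For reference, a minimal sketch of what assigning stable operator IDs looks like in the Scala DataStream API; the event type, key selector, and window below are illustrative assumptions, not the poster's actual job:)

import org.apache.flink.streaming.api.TimeCharacteristic
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

case class Reading(sensorId: String, value: Double, ts: Long) // assumed type

object UidExample {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)

    env.fromElements(Reading("a", 1.0, 1L), Reading("a", 2.0, 2L))
      .assignAscendingTimestamps(_.ts)
      .keyBy(_.sensorId)
      .window(TumblingEventTimeWindows.of(Time.hours(1)))
      .reduce((a, b) => Reading(a.sensorId, a.value + b.value, b.ts))
      .uid("hourly-sensor-window") // stable ID matched against the savepoint on restore
      .print()

    env.execute("uid example")
  }
}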
Hi Averell,
Are you trying to scale the job up, i.e., did you increase the job
parallelism? Have you increased the job's max parallelism by chance? If so,
this is not supported. The max parallelism parameter is used to create
key groups that can then be assigned to parallel operators. This
parameter
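(A minimal sketch of the two settings being distinguished here; the values are illustrative assumptions:)

import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment

// Max parallelism fixes the number of key groups; unlike the regular
// parallelism, it must not change between taking a savepoint and restoring.
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setMaxParallelism(128) // upper bound for any future rescaling
env.setParallelism(64)     // may be changed on restore, up to 128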
Hi,
adding to Dawid's questions, it would also be very helpful to know which Flink
version was used to create the savepoint, which Flink version was used in the
restore attempt, and whether the savepoint was moved or modified. Outside of
potential conflicts with those things, I would not expect anything
Hi,
I don’t think that’s exposed on the TaskManager.
Maybe it would simplify things a bit if you implemented this as a single
“JobManager” health check instead of multiple TaskManager health checks - for
example, verify that the expected number of TaskManagers is registered. It might
cover your case
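(A hedged sketch of such a single health check, assuming the JobManager REST API is reachable; the host, port, and expected count are assumptions:)

import scala.io.Source

object TaskManagerHealthCheck {
  def main(args: Array[String]): Unit = {
    val expected = 4 // assumed cluster size
    val json = Source.fromURL("http://jobmanager:8081/overview").mkString
    // crude extraction; a real check would use a JSON library
    val registered = """"taskmanagers":(\d+)""".r
      .findFirstMatchIn(json).map(_.group(1).toInt).getOrElse(0)
    if (registered < expected) sys.exit(1) // fail the health check
  }
}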
Hi,
Was this happening in an older Flink version? Could you post the circumstances
in which the job was moved to a new TM (full JobManager logs and TaskManager
logs would be helpful)? I'm suspecting that those leftover files might have
something to do with local recovery.
Piotrek
> On 9 Oct 2
Hi again,
Glad that you solved your problem :)
Splitting code into smaller functions has its advantages, but more
operators/tasks mean more overhead for the JobManager/TaskManager to manage
them. Usually that's not a big issue, but as I said, you were running your
cluster on extremely low memory
Thnx Piotr. I agree, that would work. It's a bit of a chicken-and-egg
problem, since at that point we can't just spin up a task manager and have
it register itself; we need to have flinkmanager know how many task
managers should be there. A bit more logic, but doable. Thnx for the tip.
Cheers,
Baris
Hi Stefan, Dawid,
I hadn't changed anything in the configuration. Env's parallelism stayed at
64. Some source/sink operators have a parallelism of 1 to 8. I'm using Flink
1.7-SNAPSHOT, with the code pulled from the master branch about 5 days back.
Savepoint was saved to either S3 or HDFS (I tried multiple time
You’re welcome :)
> On 10 Oct 2018, at 10:28, Bariša wrote:
>
> Thnx Piotr. I agree, that would work. It's a bit chicken and the egg problem,
> since at that point we can't just spin up a task manager, and have it
> register itself, we need to have flinkmanager know how many task managers
> s
Hi Averell,
In the logs there are some “Split Reader: Custom File Source:” entries.
Is this a custom source you implemented?
Also, is your keySelector deterministic, with proper equals and hashCode methods?
Cheers,
Kostas
> On Oct 10, 2018, at 10:50 AM, Averell wrote:
>
> Hi Stefan, Dawid,
>
> I hadn'
In which method are you calling getRuntimeContext()? This method can
only be used after open() has been called.
On 09.10.2018 17:09, Ahmad Hassan wrote:
Hi,
We want to use MapState inside a fold function to keep a map of all
products that we see in a 24-hour window, to store huge state in rocksd
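(A hedged sketch of the pattern being pointed at: acquire the state handle in open(), where the runtime context is available, rather than in the constructor or a field initializer. The event and state types are assumptions; this must run on a keyed stream:)

import org.apache.flink.api.common.functions.RichFlatMapFunction
import org.apache.flink.api.common.state.{MapState, MapStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.util.Collector

case class Product(id: String) // hypothetical input type

class ProductTracker extends RichFlatMapFunction[Product, Long] {
  @transient private var seen: MapState[String, java.lang.Boolean] = _

  override def open(parameters: Configuration): Unit = {
    // getRuntimeContext() is valid here, after the context has been set
    seen = getRuntimeContext.getMapState(new MapStateDescriptor(
      "seen-products", classOf[String], classOf[java.lang.Boolean]))
  }

  override def flatMap(p: Product, out: Collector[Long]): Unit = {
    seen.put(p.id, true)
    out.collect(1L)
  }
}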
Yes, it would be good to post your code.
Are you using a FoldFunction in a window (if yes, what window) or as a
running aggregate?
In general, collecting state in a FoldFunction is usually not something
that you should do. Did you consider using an AggregateFunction?
Fabian
On Wed, Oct 10, 2018
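(A hedged sketch of the AggregateFunction alternative; the event type and the aggregate computed are assumptions:)

import org.apache.flink.api.common.functions.AggregateFunction

case class PEvent(productId: String) // hypothetical event type

// Counts distinct products per window without collecting every element,
// the way a state-accumulating FoldFunction would.
class DistinctProducts extends AggregateFunction[PEvent, Set[String], Int] {
  override def createAccumulator(): Set[String] = Set.empty
  override def add(e: PEvent, acc: Set[String]): Set[String] = acc + e.productId
  override def getResult(acc: Set[String]): Int = acc.size
  override def merge(a: Set[String], b: Set[String]): Set[String] = a ++ b
}

It would be applied as stream.keyBy(...).window(...).aggregate(new DistinctProducts), so only the accumulator is kept as window state.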
Hi Xuefu,
Welcome to the Flink community and thanks for starting this discussion!
Better Hive integration would be really great!
Can you go into details of what you are proposing? I can think of a couple
ways to improve Flink in that regard:
* Support for Hive UDFs
* Support for Hive metadata cat
Hi Kostas,
Yes, I modified ContinuousFileMonitoringFunction to add one more
ListState. The error might/should have come from that, but I haven't
been able to find out why.
All of my keyed streams are defined by Scala tuples, like keyBy(r =>
(r.customer_id, r.address)), and the fields used as
You restore your job with the custom source from a savepoint taken without the
custom source?
> On Oct 10, 2018, at 11:34 AM, Averell wrote:
>
> Hi Kostas,
>
> Yes, I modified ContinuousFileMonitoringFunction to add one more
> ListState. The error might/should have come from that, but I haven
invoke FlinkKafkaProducer011 constructor in scala:

val producer = new FlinkKafkaProducer011[PVEvent.Entity](
  appConf.getPvEventTopic,
  new PvEventSerializeSchema,
  producerProps,
  Optional.of(FlinkRebalancePartitioner[PVEvent.Entity]))

and the constructor is:

/**
 * Creates a FlinkKafkaProducer
Hi Kostas,
No, the same code was used.
I (1) started the job, (2) created a savepoint, (3) cancelled the job, (4)
restored the job with the same command as in (1) with the addition "-s
".
Regards,
Averell
--
Sent from: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/
Hi,
what is the exact error message you are getting?
Best,
Dawid
On 10/10/18 11:51, 远远 wrote:
> invoke FlinkKafkaProducer011 constructor in scala:
> val producer = new
> FlinkKafkaProducer011[PVEvent.Entity](appConf.getPvEventTopic, new
> PvEventSerializeSchema, producerProps,
> Optional.of
IDEA only shows a "cannot resolve constructor" error message, without any other
error message.
Dawid Wysakowicz wrote on Wed, Oct 10, 2018 at 5:55 PM:
> Hi,
>
> what is the exact error message you are getting?
>
> Best,
>
> Dawid
>
> On 10/10/18 11:51, 远远 wrote:
>
> invoke FlinkKafkaProducer011 constructor in scala:
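(One plausible cause, offered strictly as an assumption: the call above passes the partitioner type itself where the four-argument constructor expects an Optional holding a partitioner instance. A hedged sketch of the corrected invocation, reusing the poster's own class names:)

import java.util.Optional

val producer = new FlinkKafkaProducer011[PVEvent.Entity](
  appConf.getPvEventTopic,
  new PvEventSerializeSchema,
  producerProps,
  Optional.of(new FlinkRebalancePartitioner[PVEvent.Entity])) // note "new"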
Hi,
We are using Flink 1.6.1 on YARN with RocksDB as the backend, incrementally
checkpointed to HDFS (for data and timers).
The job reads events from Kafka (~1 billion events per day), constructs user
sessions using an EventTimeSessionWindow coupled with a late-firing trigger
and a WindowFunction with Ag
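(A hedged sketch of the topology described, with assumed types and a 30-minute gap; the poster's actual trigger and WindowFunction are not shown:)

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows
import org.apache.flink.streaming.api.windowing.time.Time

case class Click(userId: String, ts: Long) // hypothetical event type

def sessions(events: DataStream[Click]): DataStream[Click] =
  events
    .keyBy(_.userId)
    .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
    .reduce((a, b) => if (a.ts > b.ts) a else b) // keep latest event per session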
Hi Pawel,
As far as I know, the application attempt is incremented if the application
master fails and a new one is brought up. Therefore, what you are seeing
should not happen. I have just deployed on AWS EMR 5.17.0 (Hadoop 2.8.4) and
killed the container running the application master – the cont
Hi,
I would assume that the problem of blocked processing during a checkpoint is
caused by [1], because you mentioned the use of RocksDB incremental checkpoints,
and it could be that you use them in combination with heap-based timers. This is
the one combination that currently still uses a sync
Hi Rinat,
Thanks for discussing this idea. Yes, I think this would be a good feature.
Can you open a Jira issue and describe the feature?
Thanks, Fabian
On Thu, Oct 4, 2018 at 19:28, Rinat wrote:
> Hi mates, in accordance with the contract of
> org.apache.flink.formats.avro.Deserializati
Hi Averell,
I'd go with approach 2). As of Flink 1.6.0 you can delete timers.
But even if you are on a pre-1.6 version, a ProcessFunction would be the
way to go, IMO.
You don't need to register a timer for each event.
Instead, you can register the first timer with the first event and have a
state
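(A hedged sketch of that pattern with processing-time timers: register a single timer on the first event per key instead of one per event. The types and the one-hour timeout are assumptions:)

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.util.Collector

case class Evt(key: String) // hypothetical event type

class FirstEventTimer extends KeyedProcessFunction[String, Evt, String] {
  @transient private var timerTs: ValueState[java.lang.Long] = _

  override def open(parameters: Configuration): Unit =
    timerTs = getRuntimeContext.getState(
      new ValueStateDescriptor("timer-ts", classOf[java.lang.Long]))

  override def processElement(e: Evt,
      ctx: KeyedProcessFunction[String, Evt, String]#Context,
      out: Collector[String]): Unit = {
    if (timerTs.value() == null) { // first event for this key
      val ts = ctx.timerService().currentProcessingTime() + 60 * 60 * 1000
      ctx.timerService().registerProcessingTimeTimer(ts)
      timerTs.update(ts)
    }
  }

  override def onTimer(ts: Long,
      ctx: KeyedProcessFunction[String, Evt, String]#OnTimerContext,
      out: Collector[String]): Unit = {
    out.collect(s"timeout for key ${ctx.getCurrentKey}")
    timerTs.clear()
  }
}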
Hi,
what's the difference between partitions and subpartitions?
Thanks.
CM
Hi Fabian, I have created the issue,
https://issues.apache.org/jira/browse/FLINK-10525
Thx !
> On 10 Oct 2018, at 16:47, Fabian Hueske wrote:
>
> Hi Rinat,
>
> Thanks for discussing this idea. Yes, I think this would be a good feature.
> Can you open a Jira issue and describe the feature?
>
Hi Borys,
I remember that another user reported a similar issue recently [1] –
attached
to the ticket you can find his log file. If I recall correctly, we concluded
that YARN returned the containers very quickly. At the time, Flink's debug
level logs were inconclusive because we did not log the re
Hi everyone,
it seems Flink currently only supports global watermarks, which is a problem
for my use case.
Many sensors send data which might be buffered for days in upstream systems
before arriving at the Flink job.
The job keys the stream by sensor. If other sensors send values in the
meantime
You are correct that watermarks are not tracked per key. You are dealing
with events with a high degree of delay variability, which is usually not a
good match for event-time processing as implemented in Flink.
You could use event-time processing and configure a very large window
allowed lateness
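(A hedged sketch of that approach: a large allowed lateness lets very late sensor data still update its window, while anything later lands in a side output. The window size and the 7-day bound are illustrative assumptions:)

import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.assigners.TumblingEventTimeWindows
import org.apache.flink.streaming.api.windowing.time.Time

case class Measurement(sensorId: String, v: Double) // hypothetical type

val lateTag = OutputTag[Measurement]("too-late")

def windowed(events: DataStream[Measurement]): DataStream[Measurement] =
  events
    .keyBy(_.sensorId)
    .window(TumblingEventTimeWindows.of(Time.hours(1)))
    .allowedLateness(Time.days(7))     // late data re-fires the window
    .sideOutputLateData(lateTag)       // anything later than 7 days goes here
    .reduce((a, b) => Measurement(a.sensorId, a.v + b.v))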
Hi,
We are facing an issue in standalone HA mode in Flink 1.4.0 where the
TaskManager restarts and is not able to register with the JobManager. It
times out awaiting an *AcknowledgeRegistration/AlreadyRegistered* message from
the JobManager actor and keeps sending *RegisterTaskManager* messages. The logs
at
Hi Xuefu,
I appreciate this proposal, and like Fabian, I think it would be better if you
could give more details of the plan.
Thanks, vino.
Fabian Hueske wrote on Wed, Oct 10, 2018 at 5:27 PM:
> Hi Xuefu,
>
> Welcome to the Flink community and thanks for starting this discussion!
> Better Hive integration would be re
Hello,
I have a job that reads messages from Kafka, processes them, and writes
back to Kafka. This job works fine on Flink 1.3.2. I upgraded the cluster to
1.6.1 but now see the error below. Has anyone faced a similar issue?
I have updated all the dependencies to use
1.6.1
org.apache.flink
f
Hi, Chris:
Where are these terms from? Are they from the Flink source code?
On Wed, Oct 10, 2018 at 10:18 PM Chris Miller wrote:
> Hi,
>
> what's the difference between partitions and subpartitions?
>
>
>
> Thanks.
>
>
>
> CM
>
>
--
Renjie Liu
Software Engineer, MVAD
Hi,
A partition is the output of a JobVertex, which you can simply think of as
containing one operator. In the real world, a JobVertex runs in parallel;
each parallel instance outputs some data, and that per-instance output is
conceptually called a subpartition.
Best,
Kurt
On Thu, Oct 11, 2018 at 10:27 AM Renjie Liu wrote:
> Hi, Chris:
>
Hi Fabian/Vino,
Thank you very much for your encouragement and inquiry. Sorry that I didn't see
Fabian's email until I read Vino's response just now. (Somehow Fabian's went to
the spam folder.)
My proposal contains long-term and short-term goals. Nevertheless, the effort
will focus on the followin
Hi Chandu,
Which mode does your Flink run in?
In addition, can you check whether flink-metrics-core is included in the
classpath of the Flink runtime environment?
Thanks, vino.
Chandu Kempaiah wrote on Thu, Oct 11, 2018 at 9:51 AM:
>
> Hello,
>
> I am have a job that reads messages from kafka, processes them
Hi,
I changed the configurations as below, and it looked fine when the job started.
But no results were emitted when the window ended, after the job had been
running for about six hours, and there were no errors or exceptions.
How can I troubleshoot this?
Yours,
September
From: 潘 功森
Sent: Wednesday, Oc
Hi,
I'm trying to run an integration test of my Flink application. My test code
looks like this:

public class HttpsCsvIngestorTest extends AbstractTestBase {

    private final static Logger LOG =
        LoggerFactory.getLogger(HttpsCsvIngestorTest.class);

    @Test
    public void testHttpsCsvIngestio
Would it maybe make sense to provide Flink as an engine on Hive
(“flink-on-Hive”)? E.g., to address 4, 5, 6, 8, 9, 10. This could be more
loosely coupled than integrating Hive into all possible Flink core modules and
thus introducing a very tight dependency on Hive in the core.
1, 2, 3 could be achieved via
Flink is running as a standalone cluster in high-availability mode. My
application jar is a fat jar which has all the necessary dependencies
included.
I will check once again and verify by adding the flink-metrics-core to the
classpath.
Thanks
Chandu
On Wed, Oct 10, 2018 at 8:38 PM vino yang wrot
I'm using Flink 1.5.0. The test gives the same error even with Flink 1.6.0.
Also, I introduced a Thread.sleep(3); before the assert statement. That
didn't help either.
Regards,
James
On Thu, Oct 11, 2018 at 11:11 AM James Isaac
wrote:
> Hi,
>
> I'm trying to run an integration test of my fl
Hi gongsen,
Are you using event time semantics? If so, the problem is possibly
related to watermarks.
Since I don't know the details of your program, it's hard to draw a
conclusion. You can check whether your watermarks are advancing normally.
Thanks, vino.
潘 功森 wrote on Thu, Oct 11, 2018 at 12:12 PM:
>
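(A hedged sketch of making watermark progress explicit with a bounded out-of-orderness extractor; the 10-second bound and the timestamp field are assumptions:)

import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.api.windowing.time.Time

case class Ev(ts: Long) // hypothetical event type

def withWatermarks(events: DataStream[Ev]): DataStream[Ev] =
  events.assignTimestampsAndWatermarks(
    new BoundedOutOfOrdernessTimestampExtractor[Ev](Time.seconds(10)) {
      // watermark trails the max seen timestamp by 10s; if timestamps never
      // advance, windows will never fire
      override def extractTimestamp(e: Ev): Long = e.ts
    })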