Hi Min,
A complete checkpoint contains a snapshot of all states, and when recovering from
a checkpoint, all the states will be restored from it. From what you described, I
guess that when the job manager gets killed there is an ongoing but not yet
completed checkpoint. Maybe the doc[1] can be helpful.
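For reference, a minimal sketch (the interval and cleanup mode here are placeholder choices, not taken from your job) of enabling externalized checkpoints so that a completed checkpoint survives a JobManager failure:

import org.apache.flink.streaming.api.environment.CheckpointConfig;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// trigger a checkpoint every 60 seconds
env.enableCheckpointing(60_000);
// keep the externalized checkpoint even when the job is cancelled or fails
env.getCheckpointConfig().enableExternalizedCheckpoints(
        CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);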
Hi Timo,
I have gotten the nested value by changing the Schema definition like this:
Schema schemaDesc1 = new Schema()
    .field("str2", Types.STRING)
    .field("tablestr", Types.STRING).from("table")
    .field("obj1", Types.ROW_NAMED(new String[]{"rkey", "val", "lastTime"},
        Types.STRI
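For completeness, a hedged sketch of what the full definition might look like; the nested field types below are my own assumptions, since the original message is cut off here:

Schema schemaDesc1 = new Schema()
    .field("str2", Types.STRING)
    .field("tablestr", Types.STRING).from("table")
    .field("obj1", Types.ROW_NAMED(new String[]{"rkey", "val", "lastTime"},
        Types.STRING, Types.STRING, Types.SQL_TIMESTAMP));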
Hi Sen,
I don't see
high-availability: zookeeper
in your Flink configuration. However, this is mandatory for an HA setup. By
default "none" is used, and the ZK configuration is ignored. The log also
hints that you are using StandaloneLeaderElectionService instead of the
ZooKeeper implementation.
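For example, a minimal HA section in flink-conf.yaml could look like this (the quorum address and storage directory are placeholders):

high-availability: zookeeper
high-availability.zookeeper.quorum: zk-host-1:2181,zk-host-2:2181
high-availability.storageDir: hdfs:///flink/ha/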
Thanks Piotrek.
It seems the question has not been solved. I will try to use
TIMESTAMPADD(timeUnit, integer, datetime) instead.
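For example (the table and column names here are only placeholders):

SELECT TIMESTAMPADD(HOUR, 8, event_time) AS local_time FROM MyTable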
Best
Sen
> On Mar 4, 2019, at 11:29 PM, Piotr Nowojski wrote:
>
> Hi,
>
> I think that Flink SQL works currently only in UTC, so the 8 hours difference
> is a result of y
It sort of makes sense that broadcast state is not available with
WindowedStream. But if I need some dynamic global state in
MyProcessWindowFunction, what are my options?
Ajay
From: "Aggarwal, Ajay"
Date: Monday, March 4, 2019 at 4:36 PM
To: "user@flink.apache.org"
Subject: Broadcast state wi
Is it possible to use broadcast state with windowing? My job looks like the following:
inputStream
    .keyBy("some-key")
    .window(TumblingEventTimeWindows.of(Time.seconds(Properties.WINDOW_SIZE)))
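Presumably the pipeline then continues with the window function named above, roughly like this (a sketch only):

inputStream
    .keyBy("some-key")
    .window(TumblingEventTimeWindows.of(Time.seconds(Properties.WINDOW_SIZE)))
    .process(new MyProcessWindowFunction());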
Thanks Piotr.
I didn't realize that the email attachment wasn't working, so the example I
was referring to was this figure from the Flink website:
https://ci.apache.org/projects/flink/flink-docs-stable/fig/slot_sharing.svg
So I am trying to run multiple jobs concurrently in a cluster -- the jobs are
identical
Hi,
I have a question about how to keep a ValueState after a Flink 1.7.2 cluster has
crashed.
My Flink job is simple:
1) read dummy events (an event only has a string Id) from a Kafka source.
2) do a count on input events and save it as a ValueState (see the sketch after this list)
3) set up an externalized checkpoint running every
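For reference, a hedged sketch of step 2 only; the class, field, and type choices are my own illustration, not taken from the original job:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public static class CountPerId extends KeyedProcessFunction<String, String, Long> {

    // current count for the active key; restored from the checkpoint on recovery
    private transient ValueState<Long> count;

    @Override
    public void open(Configuration parameters) {
        count = getRuntimeContext().getState(
                new ValueStateDescriptor<>("count", Long.class));
    }

    @Override
    public void processElement(String id, Context ctx, Collector<Long> out) throws Exception {
        Long current = count.value();
        current = (current == null) ? 1L : current + 1;
        count.update(current);
        out.collect(current);
    }
}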
What happens when the flink job interacts with a user managed database and
hence has some state outside of flink? In these situations when a flink job is
recovered from last successful checkpoint, this external state will not be in
sync with the recovered flink state. In most cases it will be ah
Hi there,
As far as a runtime for students, it seems like docker is your best bet.
However, you could have them instead package a jar using some interface
(for example, see
https://ci.apache.org/projects/flink/flink-docs-release-1.7/dev/packaging.html,
which details the `Program` interface) and th
Till,
I will send you the complete log offline. We don't know how to reliably
reproduce the problem, but it did happen quite frequently, like once every
couple of days. Let me see if I can cherry-pick the fix/commit to the 1.7
branch.
Thanks,
Steven
On Mon, Mar 4, 2019 at 5:55 AM Till Rohrmann w
Hi to all,
I was trying to test the EventCountJob[1] on Flink 1.7.2, but there is no
longer a QueryableStateOptions.SERVER_ENABLE. How can I enable the query
server on LocalFlinkMiniCluster?
I've tried to change it to
config.setString(QueryableStateOptions.SERVER_PORT_RANGE, "9067");
but it is ignored.
Seye, are you running Flink and Zookeeper in Docker? I’ve had problems with
Jobmanagers not resolving the hostnames for Zookeeper when starting a stack on
Docker.
From: Till Rohrmann [mailto:trohrm...@apache.org]
Sent: Monday, March 04, 2019 7:02 AM
To: Seye Jin
Cc: user
Subject: EXT :Re: Flin
Hi,
Flink SQL JSON format supports nested formats like the schema that you
posted. Maybe the renaming with `from()` does not work as expected. Did you
try it without the `from()`, where schema fields are equal to the JSON fields?
Alternatively, you could also define the schema only and use the
`deriv
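If the truncated suggestion is the derived-schema route, a hedged sketch could look like the following; the connector and the nested field types are my assumptions, not part of the original reply:

tableEnv
    .connect(new Kafka() /* ... connector properties ... */)
    .withFormat(new Json().deriveSchema())  // derive the JSON format from the table schema
    .withSchema(new Schema()
        .field("str2", Types.STRING)
        .field("obj1", Types.ROW_NAMED(new String[]{"rkey", "val", "lastTime"},
            Types.STRING, Types.STRING, Types.SQL_TIMESTAMP)))
    .inAppendMode()
    .registerTableSource("myTable");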
Hi,
You should be able to use legacy mode for this:
https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#legacy
However, note that this option will disappear in the near future and there is a
JIRA ticket to address this issue:
https://issues.apache.org/jira/browse/FLINK-11815
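If I remember the key correctly, that is a single entry in flink-conf.yaml (please double-check it against the page above):

mode: legacy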
The amount of data you’re checkpointing (if it’s really limited to file lists)
still seems too small to cause timeouts, unless there’s some other issue with
either your configuration or where data is being written (thus my previous
question #1).
— Ken
> On Mar 3, 2019, at 10:56 PM, LINZ, Arna
Hi,
I think that Flink SQL works currently only in UTC, so the 8 hours difference
is a result of you using GMT+8 time stamps somewhere. Please take a look at
this thread:
http://mail-archives.apache.org/mod_mbox/flink-user/201711.mbox/%3c2e1eb190-26a0-b288-39a4-683b463f4...@apache.org%3E
I thi
Hi,
I have never used this code, but the ML library depends heavily on Scala, so I
wouldn’t recommend using it with Java.
However if you want to go this way (I’m not sure if that’s possible), you would
have to pass the implicit parameters manually somehow (I don’t know how to do
that from Java).
Hi,
Before the actual question, and just to make sure my assumptions are
correct, my understanding is that Flink's current behaviour is:
- Job Manager failure under a highly available setup: a standby Job
Manager takes over, no impact on the job
- Task Manager failure, with enough free slots in
Hi Sen,
Are you using the default MemoryStateBackend [1]? As far as I know, it does
not support JobManager failover. If you are already using FsStateBackend or
RocksDBStateBackend, please send JM logs.
Best,
Gary
[1]
https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/state/state_bac
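For example, switching to the FsStateBackend in code could look like this (the checkpoint URI is a placeholder):

import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));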
Hey all,
Thanks for the replies. The issues we were running into (which are not
specific to Docker):
- Students who changed the template incorrectly caused the container to fail.
- We give full points if the output matches our solutions (and none
otherwise), but it would be nice if we could give partial grades p
Hi Seye,
usually, Flink's web UI should be accessible after a successful leader
election. Could you share with us the cluster logs to see what's going on?
Without this information it is hard to tell what's going wrong.
What you could also do is to check the ZooKeeper znode which represents the
cl
Hello,
We run flink as a standalone cluster. When moving from flink 1.3 to 1.6, we
noticed a change in the scheduling behavior. Where previously parallel subtasks
of a job seemed to be allocated round-robin around our cluster, Flink 1.6
appears to want to deploy as many subtasks to the same hos
Hi Steven,
is this the tail of the logs or are there other statements following? I
think your problem could indeed be related to FLINK-11537. Is it possible
to somehow reliably reproduce this problem? If yes, then you could try out
the RC for Flink 1.8.0 which should be published in the next days.
Hi Wouter,
We are using Docker Compose (Flink JM, Flink TM, Kafka, Zookeeper) setups
for our trainings and it is working very well.
We have an additional container that feeds a Kafka topic via the
commandline producer to simulate a somewhat realistic behavior.
Of course, you can do it without Kafk
Hi,
I’m not sure if I understand your question/concerns.
As Rong Rong explained, key selector is used to assign records to window
operators.
Within the keyed context, you do not have access to other keys/values in your
operator/functions, so your reduce/process/… functions when processing key:1
won’t b
Hi,
With just this information it might be difficult to help.
Please look for some additional logs (has Flink managed to log anything?)
or some standard output/errors. I would guess this might be some relatively
simple mistake in configuration, like file/directory read/write/execute
permis
Hello everyone,
I have a job which is writing some streams into parquet files in S3. I use
Flink 1.7.2 on EMR 5.21.
My job had been running well, but suddenly it failed to make a checkpoint
with the full stack trace mentioned below. After that failure, the job
restarted from the last successful ch
Hi,
Are you asking whether that’s the behaviour, or have you actually
observed this issue? I’m not entirely sure, but I would guess that the Sink
tasks would be distributed randomly across the cluster, but maybe I’m mixing
this issue with resource allocations for Task Managers. Maybe Til
It would help to understand the current issues that you have with this
approach. I used a similar approach (not with Flink, but with a similar big data
technology) some years ago.
> On 04.03.2019 at 11:32, Wouter Zorgdrager wrote:
>
> Hi all,
>
> I'm working on a setup to use Apache Flink in an as
Hi,
What Flink version are you using?
Generally speaking, Flink might not be the best fit if you have records fanning out; this
may significantly increase checkpointing time.
However, you might want to first identify what’s causing the long GC times. If there
are long GC pauses, this should be the first thing
Hi,
I couldn’t find any references to your question, nor have I seen such a use
case, but:
Re 1.
It looks like it could work
Re 2.
It should work as well, but just try to use StreamingFileSink (see the sketch
after this list)
Re 3.
For a custom source/sink function, if you do not care about data processing guarantees
it’s quite
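Regarding Re 2, a minimal StreamingFileSink sketch (the path, record type, and the `stream` variable are placeholders), available since Flink 1.6:

import org.apache.flink.api.common.serialization.SimpleStringEncoder;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;

StreamingFileSink<String> sink = StreamingFileSink
    .forRowFormat(new Path("hdfs:///output"), new SimpleStringEncoder<String>("UTF-8"))
    .build();
stream.addSink(sink);  // `stream` stands for your DataStream<String>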
Hi all,
I'm working on a setup to use Apache Flink in an assignment for a Big Data
(bachelor) university course and I'm interested in your view on this. To
sketch the situation:
- > 200 students follow this course
- students have to write some (simple) Flink applications using the
DataStream API;
Hi Gary,
Yes, I enabled checkpoints in my program.
> On Mar 4, 2019, at 3:03 AM, Gary Yao wrote:
>
> Hi Sen,
>
> Did you set a restart strategy [1]? If you enabled checkpoints [2], the fixed-
> delay strategy will be used by default.
>
> Best,
> Gary
>
> [1]
> https://ci.apache.org/project
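For illustration, the fixed-delay restart strategy mentioned above can also be set in code, roughly like this (the attempt count and delay are placeholders):

import java.util.concurrent.TimeUnit;
import org.apache.flink.api.common.restartstrategy.RestartStrategies;
import org.apache.flink.api.common.time.Time;

env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
        3,                                // number of restart attempts
        Time.of(10, TimeUnit.SECONDS)));  // delay between attempts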
Thanks for your reply, I fixed the problem by adding a new user.
Root is not available.
> On Mar 4, 2019, at 11:47 AM, sam peng <624645...@qq.com> wrote:
>
>
> I'd like to ask everyone about a problem with running an MR job on Hadoop.
>
> We previously set up a single-node Hadoop, and it ran fine.
>
> Now we have moved Hadoop into the production environment, with the hadoop directory mounted on a disk, and Flume can collect the Kafka data normally.
>
> But running the MR program reports an error:
>
> <11_28_35__0