Hi,
I am submitting jobs to the cluster (using the remote execution environment)
from multiple threads. I am getting the following exception:
java.util.ConcurrentModificationException
at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:909)
at java.util.ArrayList$Itr.next(ArrayLis
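The stack trace shows an ArrayList iterator detecting a structural modification mid-iteration, which is the symptom you get when several threads share and mutate the same list (for example, internal state of a single shared execution environment). A Flink-independent sketch of that failure mode, and of why a copy-on-write collection (or, in practice, one environment per thread or synchronized submission) avoids it:

```java
import java.util.ArrayList;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

public class ComodificationDemo {

    // Returns true if iterating the list while mutating it throws
    // ConcurrentModificationException (the "fail-fast" behaviour from
    // the stack trace above).
    public static boolean failsFast(List<Integer> list) {
        list.clear();
        list.add(1);
        list.add(2);
        list.add(3);
        try {
            for (Integer i : list) {
                list.add(i); // structural modification during iteration
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // ArrayList's iterator fails fast, as in the reported trace.
        System.out.println(failsFast(new ArrayList<>()));
        // CopyOnWriteArrayList iterates over a snapshot, so it does not.
        System.out.println(failsFast(new CopyOnWriteArrayList<>()));
    }
}
```

The practical fix on the Flink side is usually to give each submitting thread its own execution environment, or to serialize submissions behind a lock, rather than sharing one environment across threads.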
Hi Vishal,
I have the same concern about savepointing in BucketingSink.
As for your question, I think before the pending files get cleared in
handleRestoredBucketState, they are finalized in notifyCheckpointComplete:
https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-fi
This is great!
On Thu, Feb 15, 2018 at 2:50 PM Bowen Li wrote:
> Congratulations everyone!
>
> On Thu, Feb 15, 2018 at 10:04 AM, Tzu-Li (Gordon) Tai wrote:
>
>> The Apache Flink community is very happy to announce the release of
>> Apache Flink 1.4.1, which is the first bugfix release for the
Congratulations everyone!
On Thu, Feb 15, 2018 at 10:04 AM, Tzu-Li (Gordon) Tai
wrote:
> The Apache Flink community is very happy to announce the release of Apache
> Flink 1.4.1, which is the first bugfix release for the Apache Flink 1.4
> series.
>
>
> Apache Flink® is an open-source stream pro
Thank you, Kostas, for your input. We will try to integrate an optimizer
into Flink and will get back to you in case we get stuck.
Regards.
On Thu, 15 Feb 2018 at 19:11 Kostas Kloudas
wrote:
> Hi Sahil,
>
> Currently CEP does not support multi-query optimizations out-of-the-box.
> In some cases you can
The Apache Flink community is very happy to announce the release of Apache
Flink 1.4.1, which is the first bugfix release for the Apache Flink 1.4 series.
Apache Flink® is an open-source stream processing framework for distributed,
high-performing, always-available, and accurate data streaming a
The off-heap usage reported in the task manager UI can be misleading, because
it does not include the memory used by native libraries like RocksDB, which
can be huge if you have a large stateful job.
Regards,
Kien
Sent from TypeApp
On Feb 16, 2018, at 00:33, Pawel Bartoszek
wrote:
>Tha
Thanks Kien. I will at least play with the setting :) We use Hadoop (S3) as
a checkpoint store. In our case off-heap memory is around 300 MB as reported
on the task manager statistics page.
On 15 Feb 2018 at 17:24, "Kien Truong" wrote:
> Hi,
>
> The relevant setting is:
>
> containerized.heap-cutoff-rat
Hi,
The relevant setting is:
containerized.heap-cutoff-ratio: (Default 0.25) Percentage of heap
space to remove from containers started by YARN. When a user requests a
certain amount of memory for each TaskManager container (for example 4
GB), we cannot pass this amount as the maximum hea
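The cutoff arithmetic just described can be sketched as follows. The 600 MB floor is `containerized.heap-cutoff-min`'s documented default, stated here as an assumption; note that for a 56000 MB request the formula leaves 42000 MB, which matches the 12409m + 29591m (heap + direct memory) total observed elsewhere in this thread:

```java
public class HeapCutoff {

    // Sketch of the YARN heap cutoff arithmetic described above:
    // a fixed ratio of the container memory is removed, subject to a
    // minimum cutoff (assumed default: 600 MB).
    public static long heapSizeMb(long containerMb, double cutoffRatio, long cutoffMinMb) {
        long cutoffMb = Math.max(cutoffMinMb, (long) (containerMb * cutoffRatio));
        return containerMb - cutoffMb;
    }

    public static void main(String[] args) {
        // A 4 GB container loses 1024 MB to the cutoff, leaving 3072 MB.
        System.out.println(heapSizeMb(4096, 0.25, 600));
        // A 56000 MB container loses 14000 MB, leaving 42000 MB.
        System.out.println(heapSizeMb(56000, 0.25, 600));
    }
}
```

With taskmanager.memory.off-heap enabled, the remaining budget is then split further between -Xmx and -XX:MaxDirectMemorySize, which is why the sum of those two flags (rather than -Xmx alone) approaches the post-cutoff figure.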
I also tried setting taskmanager.memory.off-heap to true.
I still get around 42 GB (heap + direct memory):
yarn 56827 837 16.6 16495964 10953748 ? Sl 16:53 34:10
/usr/lib/jvm/java-openjdk/bin/java -Xms12409m -Xmx12409m
-XX:MaxDirectMemorySize=29591m
Cheers,
Pawel
On 15 February 2018 at
Hi,
I have a question regarding the configuration of the task manager heap size
when running a YARN session on EMR.
I am running 2 task managers on m4.4xlarge (64 GB RAM). I would like to use
as much of that memory as possible for the task manager heap.
However, when requesting 56000 MB when starting the YARN ac
Hello Flinker!
I know that one should appropriately set the number of Network Buffers (NB)
that their Flink deployment will use. Apart from that, I am wondering whether
one might change/manipulate the specific sequence of data records in the NB
in order to optimize the performance of their application.
Hi Sahil,
Currently CEP does not support multi-query optimizations out-of-the-box.
In some cases you can do manual optimizations to your code, but there is
no optimizer involved.
Cheers,
Kostas
> On Feb 15, 2018, at 11:12 AM, Sahil Arora wrote:
>
> Hi Timo,
> Thanks a lot for the help. I will
Alright, I have checkpoint saving implemented that way. I will apply the
same pattern to JobGraphs.
On Thu, Feb 15, 2018 at 11:13 AM, Fabian Hueske wrote:
> Hi,
>
> all data is stored in a distributed file system or object store (HDFS, S3,
> Ceph, ...) and ZooKeeper only stores pointers to th
Hi,
all data is stored in a distributed file system or object store (HDFS, S3,
Ceph, ...) and ZooKeeper only stores pointers to that data.
Cheers, Fabian
2018-02-15 11:08 GMT+01:00 Krzysztof Białek:
> Alright, just came across the first real-life problem with my Consul HA
> implementation.
> I
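The pointer-based layout Fabian describes can be sketched with an in-memory map standing in for the ZooKeeper/Consul KV store and a local directory standing in for HDFS/S3/Ceph (all class and method names here are illustrative, not Flink's actual classes):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

public class PointerStateStore {

    // Stands in for the ZooKeeper/Consul KV store; it only ever holds
    // small pointers (paths), never the serialized payload itself.
    private final Map<String, String> kv = new HashMap<>();

    // Stands in for the distributed file system (HDFS, S3, Ceph, ...).
    private final Path blobDir;

    public PointerStateStore(Path blobDir) {
        this.blobDir = blobDir;
    }

    // Write the (possibly large) payload to the file system and store
    // only its location under the key.
    public void put(String key, byte[] payload) {
        try {
            Path blob = blobDir.resolve(key + ".blob");
            Files.write(blob, payload);
            kv.put(key, blob.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public byte[] get(String key) {
        try {
            return Files.readAllBytes(Paths.get(kv.get(key)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Size of what actually lives in the KV store for this key.
    public int pointerSize(String key) {
        return kv.get(key).length();
    }

    public static void main(String[] args) throws Exception {
        PointerStateStore store =
                new PointerStateStore(Files.createTempDirectory("blobs"));
        store.put("jobgraph-1", new byte[2_000_000]); // well over 512 kB
        System.out.println(store.pointerSize("jobgraph-1")); // tiny pointer
    }
}
```

The design point: the KV store's per-entry size limit (512 kB in Consul, ~1 MB per zNode in ZK) never applies to the payload, only to the path string, so arbitrarily large JobGraphs fit.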
Hi Timo,
Thanks a lot for the help. I will be looking forward to a reply from Kostas
to be clearer on this.
On Mon, 12 Feb 2018 at 10:01 pm, Timo Walther wrote:
> Hi Sahil,
>
> I'm not a CEP expert but I will loop in Kostas (in CC). In general, the
> example that you described can be easily done
Hi,
Did you ever get to solve this issue? I'm getting the same error.
On 1.3.2 I used to run the fat jar standalone, without any job
submission, and it worked just fine.
It looked like it used an embedded MiniCluster.
After just changing the dependencies to 1.4.0 we started getting these
errors.
Alright, just came across the first real-life problem with my Consul HA
implementation.
In the Consul KV store there is a limit of 512 kB per entry, and the JobGraph
of one of my apps exceeded it.
In ZK there seems to be a similar zNode limit of 1 MB.
How did you work around it? Or maybe I serialize the JobGraph wr
Hi,
We are running a few jobs on YARN, and in case of some failure (that the job
could not recover from on its own) we want to use the last successful external
checkpoint to restore the job from manually. The problem is that
${state.checkpoints.dir} contains checkpoint directories for all jobs that
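Assuming the layout described above (a subdirectory per job under ${state.checkpoints.dir}, each holding chk-<n> checkpoint directories; the exact naming is an assumption for illustration), picking the newest checkpoint of one job for a manual restore might look like:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.Optional;
import java.util.stream.Stream;

public class LatestCheckpoint {

    // Given one job's checkpoint directory containing chk-<n>
    // subdirectories, return the one with the highest checkpoint
    // number -- the candidate for a manual restore.
    public static Optional<Path> latest(Path jobCheckpointDir) {
        try (Stream<Path> entries = Files.list(jobCheckpointDir)) {
            return entries
                    .filter(p -> p.getFileName().toString().startsWith("chk-"))
                    .max(Comparator.comparingLong(
                            p -> Long.parseLong(
                                    p.getFileName().toString().substring(4))));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path jobDir = Files.createTempDirectory("job");
        Files.createDirectory(jobDir.resolve("chk-9"));
        Files.createDirectory(jobDir.resolve("chk-10"));
        System.out.println(latest(jobDir).get().getFileName());
    }
}
```

The restore itself would then be something like `flink run -s <path-to-chk-n> job.jar`, using the CLI's savepoint/checkpoint restore flag.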
Hi,
AFAIK, the JobGraph itself is not stored in ZK but in HDFS. ZK only stores a
handle to the serialised JobGraph.
Best,
Aljoscha
> On 15. Feb 2018, at 04:59, Chirag Dewan wrote:
>
> Thanks a lot Aljoscha.
>
> I was doing a silly mistake. TaskManagers can now register with JobManager.
>
>
Or even easier:
You can specify the type after the map call:
eventStream.map({e: Event => toRow(e)})(Types.ROW_NAMED(...))
Regards,
Timo
On 2/15/18 at 9:55 AM, Timo Walther wrote:
Hi,
In your case you don't have to convert to row if you don't want to.
The Table API will do automatic co
Hi,
In your case you don't have to convert to row if you don't want to. The
Table API will do automatic conversion once the stream of Event is
converted into a table. However, this only works if Event is a POJO.
If you want to specify your own type information, your MapFunction can
implement the R