Re: Resource under-utilization when using RocksDb state backend [SOLVED]

2016-12-08 Thread Aljoscha Krettek
Great to hear! On Fri, 9 Dec 2016 at 01:02 Cliff Resnick wrote: > It turns out that most of the time in RocksDBFoldingState was spent on > serialization/deserialization. RocksDb read/write was performing well. By > moving from Kryo to custom serialization we were able to increase > throughput dramatically.

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Aljoscha Krettek
I commented on the issue with a way that should work. On Fri, 9 Dec 2016 at 01:00 Chesnay Schepler wrote: > Done. https://issues.apache.org/jira/browse/FLINK-5299 > > On 08.12.2016 16:50, Ufuk Celebi wrote: > > Would you like to open an issue for this for starters, Chesnay? Would be > good to fix for the upcoming release even.

Re: separation of JVMs for different applications

2016-12-08 Thread Manu Zhang
If there is no existing JIRA for standalone v2.0, may I open a new one? Thanks, Manu On Wed, Dec 7, 2016 at 12:39 PM Manu Zhang wrote: > Good to know that. > > Is it the "standalone setup v2.0" section? The wiki page has no > Google-Doc-like change history. > Any jiras opened for that

Re: Serializers and Schemas

2016-12-08 Thread Matt
Hi people, This is what I was talking about regarding a generic de/serializer for POJO classes [1]. The Serde class in [2] can be used in both Kafka [3] and Flink [4], and it works out of the box for any POJO class. Do you see anything wrong in this approach? Any way to improve it? Cheers, Matt

Re: dataartisans flink training maven build error

2016-12-08 Thread Conny Gu
Hi all, I think I found the solution: I used a new version of Ubuntu and installed the latest version of Maven, and then everything worked. best regards, Conny 2016-12-08 17:51 GMT+01:00 Conny Gu [via Apache Flink User Mailing List archive.] : > Hi all, > > I am trying to follow the

Re: Partitioning operator state

2016-12-08 Thread Dominik Safaric
Dear Stefan, Thanks for the clarification. How is the state recovered in the case of a task failure, though? Assume there is a topology of 10 workers and a worker dies. How exactly will the state be distributed across the workers after restarting the entire execution? Dominik

dataartisans flink training maven build error

2016-12-08 Thread Conny Gu
Hi all, I am trying to follow the dataartisans GitHub training tutorial, http://dataartisans.github.io/flink-training/devEnvSetup.html but I got errors while setting up the development environment. The errors happened at the step "Clone and build the flink-training-exercises project": git clone

Re: Resource under-utilization when using RocksDb state backend [SOLVED]

2016-12-08 Thread Cliff Resnick
It turns out that most of the time in RocksDBFoldingState was spent on serialization/deserialization. RocksDb read/write was performing well. By moving from Kryo to custom serialization we were able to increase throughput dramatically. Load is now where it should be. On Mon, Dec 5, 2016 at 1:15 PM,
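The thread does not show Cliff's actual serializer, but the gist of the Kryo-to-custom move can be sketched with plain `java.io` streams: a hand-written binary format with fixed-width fields and no class metadata or reflection, which is what a custom Flink TypeSerializer would do internally. The `Event` POJO and its fields are invented for illustration.

```java
import java.io.*;

// Hypothetical event POJO; the class and field names are illustrative,
// not taken from the original thread.
class Event {
    final long timestamp;
    final int userId;
    final double value;
    Event(long timestamp, int userId, double value) {
        this.timestamp = timestamp; this.userId = userId; this.value = value;
    }
}

public class CustomSerde {
    // Fixed-width fields, no class metadata or reflection -- the kind of
    // hand-rolled format that can beat a generic framework like Kryo.
    static byte[] serialize(Event e) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        DataOutputStream out = new DataOutputStream(buf);
        out.writeLong(e.timestamp);
        out.writeInt(e.userId);
        out.writeDouble(e.value);
        return buf.toByteArray();   // always exactly 8 + 4 + 8 = 20 bytes
    }

    static Event deserialize(byte[] bytes) throws IOException {
        DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
        return new Event(in.readLong(), in.readInt(), in.readDouble());
    }

    public static void main(String[] args) throws IOException {
        Event e = new Event(1481212800000L, 42, 3.14);
        Event back = deserialize(serialize(e));
        System.out.println(back.userId + " @ " + back.timestamp);
    }
}
```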

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
Done. https://issues.apache.org/jira/browse/FLINK-5299 On 08.12.2016 16:50, Ufuk Celebi wrote: Would you like to open an issue for this for starters, Chesnay? Would be good to fix for the upcoming release even. On 8 December 2016 at 16:39:58, Chesnay Schepler (ches...@apache.org) wrote: It would be neat if we could support arrays as keys directly

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Ufuk Celebi
Would you like to open an issue for this for starters, Chesnay? Would be good to fix for the upcoming release even. On 8 December 2016 at 16:39:58, Chesnay Schepler (ches...@apache.org) wrote: > It would be neat if we could support arrays as keys directly; it should > boil down to checking the key type

Re: conditional dataset output

2016-12-08 Thread Chesnay Schepler
Hello Lars, The only other way I can think of doing this is to wrap the used outputformat in a custom format, which calls open on the wrapped outputformat when you receive the first record. This should work, but it is quite hacky as it interferes with the format life-cycle
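The lazy-open wrapper pattern Chesnay describes can be sketched with a made-up `SimpleFormat` interface standing in for Flink's `OutputFormat` (in Flink the wrapper would also delegate `configure()` and use `writeRecord()`); all names here are illustrative:

```java
// Invented minimal stand-in for Flink's OutputFormat interface.
interface SimpleFormat {
    void open();
    void write(String record);
    void close();
}

// Wrapper that only opens the real format once the first record
// arrives, so an empty dataset never creates an (empty) output file.
class LazyOpenFormat implements SimpleFormat {
    private final SimpleFormat wrapped;
    private boolean opened = false;

    LazyOpenFormat(SimpleFormat wrapped) { this.wrapped = wrapped; }

    @Override public void open() {
        // Deliberately a no-op: defer opening until the first record.
    }

    @Override public void write(String record) {
        if (!opened) {          // first record: open the real format now
            wrapped.open();
            opened = true;
        }
        wrapped.write(record);
    }

    @Override public void close() {
        if (opened) {           // no records -> real format never opened,
            wrapped.close();    // so nothing to close and no file created
        }
    }
}
```

This is the "hacky" part of the thread's suggestion: `open()` no longer happens where the framework expects it, which is why it interferes with the format life-cycle.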

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Chesnay Schepler
It would be neat if we could support arrays as keys directly; it should boil down to checking the key type and in case of an array injecting a KeySelector that calls Arrays.hashCode(array). This worked for me when I ran into the same issue while experimenting with some stuff. The batch API can
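The underlying problem is that Java arrays use identity-based `equals`/`hashCode`, so two arrays with the same contents hash to different key groups. A minimal self-contained sketch of the fix Chesnay describes, with a wrapper key that restores value semantics (the class name is invented):

```java
import java.util.Arrays;

public class ArrayKeyDemo {
    // Wrapper key with value-based equals/hashCode -- the effect a
    // KeySelector calling Arrays.hashCode(array) achieves in Flink.
    static final class ByteArrayKey {
        final byte[] bytes;
        ByteArrayKey(byte[] bytes) { this.bytes = bytes; }
        @Override public boolean equals(Object o) {
            return o instanceof ByteArrayKey
                && Arrays.equals(bytes, ((ByteArrayKey) o).bytes);
        }
        @Override public int hashCode() { return Arrays.hashCode(bytes); }
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};
        // Raw arrays use identity semantics: same contents, "different" keys.
        System.out.println(a.equals(b));                                     // false
        // Wrapped, both records land in the same key group.
        System.out.println(new ByteArrayKey(a).equals(new ByteArrayKey(b))); // true
    }
}
```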

conditional dataset output

2016-12-08 Thread lars . bachmann
Hi, let's assume I have a dataset that, depending on the input data and different filter operations, can be empty. Now I want to output the dataset to HD, but I want files to be created only if the dataset is not empty. If the dataset is empty I don't want any files. The default

Re: Parallelism and stateful mapping with Flink

2016-12-08 Thread Ufuk Celebi
@Aljoscha: I remember that someone else ran into this, too. Should we address arrays as keys specifically in the API? Prohibit? Document this? – Ufuk On 7 December 2016 at 17:41:40, Andrew Roberts (arobe...@fuze.com) wrote: > Sure! > > (Aside, it turns out that the issue was using an `Array[By

Re: Replace Flink job while cluster is down

2016-12-08 Thread Ufuk Celebi
With HA enabled, Flink checks the configured ZooKeeper node for pre-existing jobs and checkpoints when starting. What Stefan meant was that you can configure a different ZooKeeper node, which will start the cluster with a clean state. You can check the available config options here:  https://ci
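A hedged sketch of what "configure a different ZooKeeper node" could look like in `flink-conf.yaml`; the key names follow the `recovery.*` naming used by Flink 1.1 and the quorum/path values are placeholders, so check them against the configuration docs for your version:

```yaml
# flink-conf.yaml (Flink 1.1-era key names; values are illustrative)
recovery.mode: zookeeper
recovery.zookeeper.quorum: zk1:2181,zk2:2181,zk3:2181
# Pointing the root path at a fresh ZooKeeper node starts the cluster
# with a clean state instead of recovering previously submitted jobs.
recovery.zookeeper.path.root: /flink-fresh
```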

RE: Collect() freeze on yarn cluster on strange recover/deserialization error

2016-12-08 Thread Ufuk Celebi
Great! :) On 8 December 2016 at 15:28:05, LINZ, Arnaud (al...@bouyguestelecom.fr) wrote: > -yjm works, and suits me better than a global flink-conf.yaml parameter. I've > looked for > a command line parameter like that, but I missed it in the doc, my mistake. > Thanks, > Arnaud > > -Mes

RE: Collect() freeze on yarn cluster on strange recover/deserialization error

2016-12-08 Thread LINZ, Arnaud
-yjm works, and suits me better than a global flink-conf.yaml parameter. I've looked for a command line parameter like that, but I missed it in the doc, my mistake. Thanks, Arnaud -Original Message- From: Ufuk Celebi [mailto:u...@apache.org] Sent: Thursday, 8 December 2016 14:43 To: LIN

Re: Replace Flink job while cluster is down

2016-12-08 Thread Al-Isawi Rami
Hi Stefan, Yes, a cluster of 3 machines. Version 1.1.1 I did not understand the difference between "remove entry from zookeeper" and "using flink zookeeper namespaces feature". Eventually, I started the cluster and it did recover the old program. However, I was fast enough to click Cancel in

RE: Collect() freeze on yarn cluster on strange recover/deserialization error

2016-12-08 Thread Ufuk Celebi
Good point with the collect() docs. Would you mind opening a JIRA issue for that? I'm not sure whether you can specify it via that key for YARN. Can you try to use -yjm 8192 when submitting the job? Looping in Robert who knows best whether this config key is picked up or not for YARN. – Ufuk
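The `-yjm` suggestion amounts to setting the JobManager container memory per submission instead of globally; a hedged example invocation (jar path and `-ytm` value are illustrative):

```shell
# -yjm sets the YARN JobManager container memory in MB for this
# submission, overriding the global setting in flink-conf.yaml;
# -ytm does the same for the TaskManager containers.
./bin/flink run -m yarn-cluster -yjm 8192 -ytm 4096 ./examples/MyJob.jar
```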

RE: Collect() freeze on yarn cluster on strange recover/deserialization error

2016-12-08 Thread LINZ, Arnaud
Hi Ufuk, Yes, I have a large set of data to collect for a data science job that cannot be distributed easily. Increasing akka.framesize did get rid of the collect hang (maybe you should highlight this parameter in the collect() documentation, 10 MB is not that big), thanks. However my j

Re: Recursive directory traversal with TextInputFormat

2016-12-08 Thread Ufuk Celebi
Looping in Kostas who recently worked on the continuous file inputs. @Kostas: do you have an idea what's happening here? – Ufuk On 8 December 2016 at 08:43:32, Lukas Kircher (lukas.kirc...@uni-konstanz.de) wrote: > Hi Stefan, > > thanks for your answer. > > > I think there is a field in Fil

RE: Collect() freeze on yarn cluster on strange recover/deserialization error

2016-12-08 Thread Ufuk Celebi
I also don't get why the job is recovering, but the oversized message is very likely the cause of the freezing collect, because the data set is gathered via Akka. You can configure the frame size via "akka.framesize", which defaults to 10485760b (10 MB). Is the collected result larger than that
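Raising the frame size is a one-line change in `flink-conf.yaml`; the 100 MB value below is an arbitrary example, not a recommendation from the thread:

```yaml
# flink-conf.yaml -- raise the Akka frame size so large collect()
# results fit in a single message (default is 10485760b, i.e. 10 MB).
# 104857600b = 100 * 1024 * 1024 bytes.
akka.framesize: 104857600b
```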

Re: [DISCUSS] "Who's hiring on Flink" monthly thread in the mailing lists?

2016-12-08 Thread Robert Metzger
Thank you for speaking up Kanstantsin. I really don't want to downgrade the experience on the user@ list. I wonder if jobs@flink would be a too narrowly-scoped mailing list. Maybe we could also start a community@flink (alternatively also general@) mailing list for everything relating to the broade

Re: Partitioning operator state

2016-12-08 Thread Stefan Richter
Hi Dominik, as Gordon’s response only covers keyed-state, I will briefly explain what happens for non-keyed operator state. In contrast to Flink 1.1, Flink 1.2 checkpointing does not write a single blackbox object (e.g. ONE object that is a set of all kafka offsets is emitted), but a list of bl
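The benefit of checkpointing a list instead of one blackbox blob is that the elements can be redistributed across subtasks when the parallelism changes. A minimal self-contained sketch of a round-robin redistribution in that spirit; the offset strings and the exact assignment scheme are illustrative, not Flink's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

public class StateRedistribution {
    // Distribute checkpointed state elements (e.g. Kafka partition
    // offsets) round-robin over the subtasks of the restored job.
    static List<List<String>> redistribute(List<String> checkpointed, int newParallelism) {
        List<List<String>> subtasks = new ArrayList<>();
        for (int i = 0; i < newParallelism; i++) {
            subtasks.add(new ArrayList<>());
        }
        for (int i = 0; i < checkpointed.size(); i++) {
            subtasks.get(i % newParallelism).add(checkpointed.get(i));
        }
        return subtasks;
    }

    public static void main(String[] args) {
        // Five hypothetical "partition@offset" entries, restored with
        // a parallelism of 2 instead of whatever wrote the checkpoint.
        List<String> offsets = List.of("p0@17", "p1@42", "p2@7", "p3@99", "p4@3");
        System.out.println(redistribute(offsets, 2));
    }
}
```

With a single blackbox object none of this splitting would be possible, which is the contrast Stefan draws between Flink 1.1 and 1.2 checkpointing.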