Taskmanager memory

2015-12-09 Thread Kruse, Sebastian
Hi everyone, I am currently looking into how Flink can coexist and interoperate with other frameworks in a cluster, such as plain single-machine processes or Spark. Tachyon seems to be a nice solution to exchange data between them. However, I think it is a problem that Flink's taskmanagers al

Re: Taskmanager memory

2015-12-09 Thread Fabian Hueske
Hi Sebastian, There is no way to return memory from a Flink process except shutting the process down. I think YARN could help in your setup. In a YARN setup, you can flexibly start and stop Flink sessions with different configurations (memory, TMs, slots) or run a single job. When running a single

Re: Taskmanager memory

2015-12-09 Thread Stephan Ewen
@Sebastian: Getting memory away from the JVM is always tricky, completely independent of pre-allocation of managed memory or lazy allocation. But here is something that may work: - Start Flink in streaming mode - that will make it allocate managed memory lazily - Set the memory to offheap memo

Re: Question about DataStream serialization

2015-12-09 Thread Stephan Ewen
The object's hash is additionally scrambled using a typical hash function (like murmur hash) to guard against bad hash functions... On Wed, Dec 9, 2015 at 2:37 AM, Radu Tudoran wrote: > Hi, > > Is the partition function used by the ".keyBy(Object)" of the form: > > Object.hash % getNumber

Community choice for Hadoop Summit Europe 2016

2015-12-09 Thread Kostas Tzoumas
Hi everyone, Just a reminder, the community vote for the Hadoop Summit Europe 2016 talks in Dublin is still open until December 15. There is a very good number of talks around Flink submitted, here are the ones that mention "flink" in their abstract: https://hadoopsummit.uservoice.com/search?filt

Re: Using memory logging in Flink

2015-12-09 Thread Till Rohrmann
I assume you're looking in the taskmanager log file for the memory usage logging statements, right? Cheers, Till On Wed, Dec 9, 2015 at 12:15 AM, Filip Łęczycki wrote: > Hi, > > Thank you for your reply! > > I have made sure I restarted the TaskManager after changing config, but it > didn't res

Re: Question about DataStream serialization

2015-12-09 Thread Aljoscha Krettek
Right now, it is exactly "Object.hash % getNumberOfParallelSubtasks()"... > On 09 Dec 2015, at 02:37, Radu Tudoran wrote: > > Object.hash % getNumberOfParallelSubtasks()
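
The two replies in this thread describe key-to-subtask assignment as a hash taken modulo the number of parallel subtasks, optionally with the hash scrambled first. The following is a minimal sketch of both variants, not Flink's actual code. The class and method names are hypothetical, the scrambling step uses the MurmurHash3 32-bit finalizer as one example of "a typical hash function", and Math.floorMod is an assumption made here so that negative hash codes still map to a valid subtask index.

```java
// Hypothetical illustration only; none of these names come from Flink.
final class KeyPartitionSketch {

    // MurmurHash3 32-bit finalizer: spreads the bits of a weak hash code.
    static int scramble(int h) {
        h ^= h >>> 16;
        h *= 0x85ebca6b;
        h ^= h >>> 13;
        h *= 0xc2b2ae35;
        h ^= h >>> 16;
        return h;
    }

    // Plain variant: hash code modulo the number of parallel subtasks.
    // floorMod (an assumption of this sketch) keeps the result in [0, parallelism).
    static int plainPartition(Object key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    // Scrambled variant: mix the hash code first, then take the modulo.
    static int scrambledPartition(Object key, int parallelism) {
        return Math.floorMod(scramble(key.hashCode()), parallelism);
    }

    public static void main(String[] args) {
        int parallelism = 4;
        for (String key : new String[] {"a", "b", "c", "d"}) {
            System.out.println(key
                    + " -> plain " + plainPartition(key, parallelism)
                    + ", scrambled " + scrambledPartition(key, parallelism));
        }
    }
}
```

With the plain variant, keys whose hash codes share a common factor with the parallelism pile up on a few subtasks; the scrambling step is the kind of guard against bad hash functions mentioned above.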

Re: Taskmanager memory

2015-12-09 Thread Fabian Hueske
Streaming mode with on-heap memory won't help because the JVM allocates all memory but doesn't convert it to managed memory internally, right? Is offheap memory actually freed after it has been allocated as managed memory? Does this happen after a job finishes? 2015-12-09 10:44 GMT+01:00 Stephan

Re: Using memory logging in Flink

2015-12-09 Thread Stephan Ewen
Hi Filip! Someone else just used the memory logging with the exact described settings - it worked. There is probably some mixup, you may be looking into the wrong log file, or may be setting the value in a different config... Stephan On Wed, Dec 9, 2015 at 10:58 AM, Till Rohrmann wrote: > I a

Re: Taskmanager memory

2015-12-09 Thread Stephan Ewen
Off heap memory is freed when the memory consuming operators release the memory. The Java process releases that memory then on the next GC, as far as I know. On Wed, Dec 9, 2015 at 11:01 AM, Fabian Hueske wrote: > Streaming mode with on-heap memory won't help because the JVM allocates > all mem
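
The behaviour described in this reply (off-heap memory going back to the operating system only once the owning objects are garbage collected) can be observed outside Flink with a plain direct ByteBuffer. This is a minimal sketch under stated assumptions: the class name is hypothetical, the buffer merely stands in for an off-heap memory segment, and System.gc() is only a hint, so the final reading is not guaranteed to drop on every JVM.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

// Hypothetical illustration only; this is not Flink's memory manager.
final class DirectMemorySketch {

    // Bytes currently reserved by the JVM's "direct" buffer pool, or -1 if that pool is not reported.
    static long directMemoryUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directMemoryUsed();

        // Stand-in for an off-heap segment handed to an operator.
        ByteBuffer segment = ByteBuffer.allocateDirect(64 * 1024 * 1024);
        long during = directMemoryUsed();

        // "Release" the segment: drop the only reference, then hint a GC.
        segment = null;
        System.gc();
        long after = directMemoryUsed();

        System.out.println("before=" + before + " during=" + during + " after=" + after);
    }
}
```

Dropping the reference alone changes nothing; the native memory is returned only when the collector actually reclaims the buffer object, which is why the reply ties the release to the next GC.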

Re: Taskmanager memory

2015-12-09 Thread Kruse, Sebastian
Thanks for your answers. So the problem with on-heap memory would be that the JVM would not shrink its already allocated heap even if it is largely unused? Pertaining to the streaming-mode: If I run Flink in that mode, can I still submit batch jobs? Because that's what I want to do. Thanks, S

Re: Taskmanager memory

2015-12-09 Thread Fabian Hueske
Yes, streaming mode supports batch jobs as well. The difference is that in streaming mode, managed memory is lazily allocated. This is because the streaming runtime does not use managed memory but only heap memory. 2015-12-09 11:55 GMT+01:00 Kruse, Sebastian : > Thanks for your answers. So the pr

Re: Features with major priority/future release/s

2015-12-09 Thread Maximilian Michels
Hi Ovidiu, This is still subject to community discussion. We created a wiki page to keep track of the essential features to be included in 1.0: https://cwiki.apache.org/confluence/display/FLINK/1.0+Release Otherwise, everything which is marked "Fix version 1.0.0" in JIRA is a candidate to be incl

Re: Using memory logging in Flink

2015-12-09 Thread Filip Łęczycki
Hi Stephan, Till, I am looking for the memory usage info in flink__jobmanager... log. As there are no other log files in /log dir I assumed this is the right place to look. I will try to set up a new clean instance and check whether it will work. Thank you very much for your assistance! Best regard

Re: Using memory logging in Flink

2015-12-09 Thread Till Rohrmann
How do you start the flink cluster? If it's a standalone cluster and you don't use a shared directory, then you'll find the log of the taskmanager on the machine on which the taskmanager runs. If you use YARN then you can activate log aggregation to retrieve the log easily after the job has finishe

Re: Taskmanager memory

2015-12-09 Thread Stephan Ewen
BTW, for 1.0, this is consolidated into one single mode... On Wed, Dec 9, 2015 at 1:45 PM, Fabian Hueske wrote: > Yes, streaming mode supports batch jobs as well. > The difference is that in streaming mode, managed memory is lazily > allocated. This is because the streaming runtime does not use

Using S3 as state backend

2015-12-09 Thread Brian Chhun
Hello, I'm trying to set up an HA cluster and I'm running into issues using S3 as the state backend. This is raised during startup: 2015-12-09T19:23:36.430724+00:00 i-1ec317c4 docker/jobmanager01-d3174d6[1207]: java.io.IOException: No file system found with scheme s3, referenced in file URI 's3://

Re: Flink Storm

2015-12-09 Thread Matthias J. Sax
Hi Naveen, just for completeness: Max fixed this bug today and we also updated the documentation. As you are using a SNAPSHOT version, you do not need to include "flink-java" any more if you update to the latest version containing the fix. Furthermore, *do not* include "storm-core" as a dependenc

Re: Using S3 as state backend

2015-12-09 Thread Ufuk Celebi
Hey Brian, did you follow the S3 setup guide? https://ci.apache.org/projects/flink/flink-docs-master/apis/example_connectors.html You have to set the fs.hdfs.hadoopconf property and add fs.s3.impl org.apache.hadoop.fs.s3native.NativeS3FileSystem to core-site.xml – Ufuk > On 09 Dec 2015, at
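
The core-site.xml entry named in this reply could look roughly like the fragment below. This is a sketch under stated assumptions: only the property name and value come from the message, while the surrounding layout is the usual Hadoop configuration file structure. Credentials and any other S3 settings are outside what the message shows and are left out.

```xml
<!-- core-site.xml, placed in the directory that fs.hdfs.hadoopconf points to -->
<configuration>
  <property>
    <!-- Maps the s3:// scheme to Hadoop's native S3 file system -->
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
  </property>
</configuration>
```

The fs.hdfs.hadoopconf property itself belongs in the Flink configuration and would point at the directory holding this file, for example a hypothetical /path/to/hadoop/conf.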

Size of a window without explicit trigger/evictor

2015-12-09 Thread Nirmalya Sengupta
Hello Fabian, a small question: during the course of our recent conversation on the behaviour of window, trigger and evictor, you had mentioned that if I - the application programmer - do not attach a trigger to a window, Flink will attach one by itself. This trigger ensures that the size of the wi