Re: How to preserve KeyedDataStream

2015-11-03 Thread Martin Neumann
Here is some code of my current solution; if there is a better way of doing it, let me know. KeyedStream hostStream = inStream .keyBy(t -> t.getHost()); KeyedStream,String> freqStream = hostStream .timeWindow(Time.of(WINDOW_SIZE, TimeUnit.MILLISECONDS)) .fold(new Tuple2(0, ""), new Count()
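
For reference, a cleaned-up sketch of what the snippet above appears to do, assuming a Flink 1.x-style DataStream API, a hypothetical LogEntry type with a getHost() accessor, and a WINDOW_SIZE constant. It is a reconstruction, not the poster's exact code; the fold accumulates a (count, host) tuple per host and window (window fold was later deprecated in favor of aggregate).

    import java.util.concurrent.TimeUnit;
    import org.apache.flink.api.common.functions.FoldFunction;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.windowing.time.Time;

    // inStream: DataStream<LogEntry>; LogEntry and WINDOW_SIZE are hypothetical
    DataStream<Tuple2<Integer, String>> freqStream = inStream
        // key the stream by host so every host gets its own windows
        .keyBy(new KeySelector<LogEntry, String>() {
            @Override
            public String getKey(LogEntry t) {
                return t.getHost();
            }
        })
        .timeWindow(Time.of(WINDOW_SIZE, TimeUnit.MILLISECONDS))
        // fold every element into a running (count, host) tuple, i.e. the "Count" function
        .fold(new Tuple2<Integer, String>(0, ""),
              new FoldFunction<LogEntry, Tuple2<Integer, String>>() {
            @Override
            public Tuple2<Integer, String> fold(Tuple2<Integer, String> acc, LogEntry value) {
                return new Tuple2<>(acc.f0 + 1, value.getHost());
            }
        });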

Re: Running continuously on yarn with kerberos

2015-11-03 Thread Niels Basjes
Update on the status so far: I suspect I found a problem in a secure setup. I have created a very simple Flink topology consisting of a streaming Source (that outputs the timestamp a few times per second) and a Sink (that puts that timestamp into a single record in HBase). Running this on a non-
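
For context, a minimal sketch of the kind of topology described above: a source that emits the current timestamp a few times per second and a sink that writes it into a single HBase row. This is an illustrative reconstruction, not Niels's actual code; the table name, column family, and row key are made up, and the Kerberos-specific setup is not shown.

    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
    import org.apache.flink.streaming.api.functions.source.SourceFunction;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Put;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;

    StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

    // source: emit the current timestamp a few times per second
    DataStream<Long> timestamps = env.addSource(new SourceFunction<Long>() {
        private volatile boolean running = true;

        @Override
        public void run(SourceContext<Long> ctx) throws Exception {
            while (running) {
                ctx.collect(System.currentTimeMillis());
                Thread.sleep(250);
            }
        }

        @Override
        public void cancel() {
            running = false;
        }
    });

    // sink: overwrite a single HBase record with the latest timestamp
    timestamps.addSink(new RichSinkFunction<Long>() {
        private transient Connection connection;
        private transient Table table;

        @Override
        public void open(Configuration parameters) throws Exception {
            connection = ConnectionFactory.createConnection(HBaseConfiguration.create());
            table = connection.getTable(TableName.valueOf("flink_heartbeat")); // hypothetical table
        }

        @Override
        public void invoke(Long value) throws Exception {
            Put put = new Put(Bytes.toBytes("heartbeat"));                     // fixed row key
            put.addColumn(Bytes.toBytes("f"), Bytes.toBytes("ts"), Bytes.toBytes(value));
            table.put(put);
        }

        @Override
        public void close() throws Exception {
            if (table != null) { table.close(); }
            if (connection != null) { connection.close(); }
        }
    });

    env.execute("timestamp-to-hbase");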

Re: How to preserve KeyedDataStream

2015-11-03 Thread Aljoscha Krettek
Hi, where are you storing the results of each window computation? Maybe you could also store them from inside a custom WindowFunction where you just count the elements and then store the results. On the other hand, adding a (1) field and doing a window reduce (à la WordCount) is going to be wa
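
A minimal sketch of the WindowFunction approach described above, counting the elements of each (host, window) pair without adding an artificial (1) field. It assumes a Flink 1.x-style API and the same hypothetical LogEntry type and WINDOW_SIZE constant as in the sketch above; storing the result (or updating a per-host model) could happen inside apply() instead of emitting a tuple.

    import java.util.concurrent.TimeUnit;
    import org.apache.flink.api.java.functions.KeySelector;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
    import org.apache.flink.streaming.api.windowing.time.Time;
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
    import org.apache.flink.util.Collector;

    DataStream<Tuple2<String, Long>> counts = inStream
        .keyBy(new KeySelector<LogEntry, String>() {
            @Override
            public String getKey(LogEntry t) {
                return t.getHost();
            }
        })
        .timeWindow(Time.of(WINDOW_SIZE, TimeUnit.MILLISECONDS))
        .apply(new WindowFunction<LogEntry, Tuple2<String, Long>, String, TimeWindow>() {
            @Override
            public void apply(String host, TimeWindow window, Iterable<LogEntry> values,
                              Collector<Tuple2<String, Long>> out) throws Exception {
                long count = 0;
                for (LogEntry ignored : values) {   // just count the elements of the window
                    count++;
                }
                out.collect(new Tuple2<>(host, count));
            }
        });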

Re: Could not load the task's invokable class.

2015-11-03 Thread Saleh
Hi Stephan, Thanks for the response. Actually I am using Maven, and below are all the Flink dependencies I have in my pom.xml file: org.apache.flink:flink-core:0.9.1 (scope: provided)

Re: Running on a firewalled Yarn cluster?

2015-11-03 Thread Niels Basjes
Great! I'll watch the issue and give it a test once I see a working patch. Niels Basjes On Tue, Nov 3, 2015 at 1:03 PM, Maximilian Michels wrote: > Hi Niels, > > Thanks a lot for reporting this issue. I think it is a very common setup > in corporate infrastructure to have restrictive firewall

Re: Running on a firewalled Yarn cluster?

2015-11-03 Thread Maximilian Michels
Hi Niels, Thanks a lot for reporting this issue. I think it is a very common setup in corporate infrastructure to have restrictive firewall settings. For Flink 1.0 (and probably in a minor 0.10.X release) we will have to address this issue to ensure proper integration of Flink. I've created a JIR

How to preserve KeyedDataStream

2015-11-03 Thread Martin Neumann
Hi, I want to do the following:
1. Split a stream of incoming logs by host address.
2. For each key, create time-based windows.
3. Count the number of items in the window.
4. Feed it into a statistical model that is maintained for each host.
Since I don't have fields to sum upon, I use a (win
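
One possible shape for step 4, sketched under the assumption of a Flink 1.x-style keyed-state API and a hypothetical, serializable HostModel class (with made-up update() and score() methods) standing in for the statistical model. It consumes a (host, count) stream such as the ones sketched earlier in this thread and keeps one model instance per host key.

    import org.apache.flink.api.common.functions.RichFlatMapFunction;
    import org.apache.flink.api.common.state.ValueState;
    import org.apache.flink.api.common.state.ValueStateDescriptor;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.configuration.Configuration;
    import org.apache.flink.streaming.api.datastream.DataStream;
    import org.apache.flink.util.Collector;

    // hostCounts: DataStream<Tuple2<String, Long>> of (host, windowCount)
    DataStream<String> scored = hostCounts
        .keyBy(0)   // key by host again so the state below is scoped per host
        .flatMap(new RichFlatMapFunction<Tuple2<String, Long>, String>() {
            private transient ValueState<HostModel> model;   // one model per host key

            @Override
            public void open(Configuration parameters) {
                model = getRuntimeContext().getState(
                        new ValueStateDescriptor<>("host-model", HostModel.class));
            }

            @Override
            public void flatMap(Tuple2<String, Long> hostCount, Collector<String> out) throws Exception {
                HostModel m = model.value();
                if (m == null) {
                    m = new HostModel();                     // hypothetical statistical model
                }
                m.update(hostCount.f1);                      // feed the window count into the model
                model.update(m);
                out.collect(hostCount.f0 + " score=" + m.score(hostCount.f1));
            }
        });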

Re: Running on a firewalled Yarn cluster?

2015-11-03 Thread Niels Basjes
Hi, I forgot to answer your other question: On Mon, Nov 2, 2015 at 4:34 PM, Robert Metzger wrote: > so the problem is that you can not submit a job to Flink using the > "/bin/flink" tool, right? > I assume Flink and its TaskManagers properly start and connect to each > other (the number of Task

Re: Long ids from String

2015-11-03 Thread Martin Junghanns
Hi, sounds like RDF problems to me :) To build an index you could do the following:
triplet := <source, edge, target>
(0) build the set of all triplets (with strings): triplets = triplets1.union(triplets2)
(1) assign unique long ids to each vertex: vertices = triplets.flatMap() => [<source>, <target>, ...].distinct(); vertexWithID = vertic
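
A rough DataSet sketch of the procedure above, assuming the triplets are Tuple3<String, String, String> with the source string in f0 and the target string in f2; it also covers the follow-up question about translating both sides of the reference so that f2 == f0 joins keep working on the Long ids. This is only an illustration of the union / distinct / zipWithUniqueId / join pattern, not code from the thread.

    import org.apache.flink.api.common.functions.FlatMapFunction;
    import org.apache.flink.api.common.functions.JoinFunction;
    import org.apache.flink.api.java.DataSet;
    import org.apache.flink.api.java.tuple.Tuple2;
    import org.apache.flink.api.java.tuple.Tuple3;
    import org.apache.flink.api.java.utils.DataSetUtils;
    import org.apache.flink.util.Collector;

    // (0) build the set of all (string) triplets
    DataSet<Tuple3<String, String, String>> triplets = triplets1.union(triplets2);

    // (1) collect the distinct vertex strings (source f0 and target f2 of every triplet)
    DataSet<String> vertices = triplets
        .flatMap(new FlatMapFunction<Tuple3<String, String, String>, String>() {
            @Override
            public void flatMap(Tuple3<String, String, String> t, Collector<String> out) {
                out.collect(t.f0);
                out.collect(t.f2);
            }
        })
        .distinct();

    // (2) assign a unique Long id to every vertex string: (id, vertexString)
    DataSet<Tuple2<Long, String>> vertexWithId = DataSetUtils.zipWithUniqueId(vertices);

    // (3) replace the source string (f0) with its Long id ...
    DataSet<Tuple3<Long, String, String>> sourceTranslated = triplets
        .join(vertexWithId).where(0).equalTo(1)
        .with(new JoinFunction<Tuple3<String, String, String>, Tuple2<Long, String>,
                               Tuple3<Long, String, String>>() {
            @Override
            public Tuple3<Long, String, String> join(Tuple3<String, String, String> t,
                                                     Tuple2<Long, String> v) {
                return new Tuple3<>(v.f0, t.f1, t.f2);
            }
        });

    // ... and then the target string (f2) as well
    DataSet<Tuple3<Long, String, Long>> translated = sourceTranslated
        .join(vertexWithId).where(2).equalTo(1)
        .with(new JoinFunction<Tuple3<Long, String, String>, Tuple2<Long, String>,
                               Tuple3<Long, String, Long>>() {
            @Override
            public Tuple3<Long, String, Long> join(Tuple3<Long, String, String> t,
                                                   Tuple2<Long, String> v) {
                return new Tuple3<>(t.f0, t.f1, v.f0);
            }
        });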

Re: Long ids from String

2015-11-03 Thread Fabian Hueske
Converting String ids into Long ids can be quite expensive, so you should make sure it pays off. The safe way to do it is to get all unique String ids (project, distinct), do zipWithUniqueId, and join all DataSets that have the String id with the new long id. So it is a full sort for the unique an

Re: Long ids from String

2015-11-03 Thread Flavio Pompermaier
Hi Martin, thanks for the suggestion, but unfortunately in my use case I have another problem: I have to join triplets when f2==f0. Is there any way to also translate the references? I.e. if I have 2 tuples, when I apply that function I obtain something like <1,>, <2,>. I'd like to be able to join the

Re: Long ids from String

2015-11-03 Thread Martin Junghanns
Hi Flavio, If you just want to assign a unique Long identifier to each element in your dataset, you can use the DataSetUtils.zipWithUniqueId() method [1]. Best, Martin [1] https://github.com/apache/flink/blob/master/flink-java/src/main/java/org/apache/flink/api/java/utils/DataSetUtils.java#L131
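
For reference, the call shape of that method (a fragment, assuming a hypothetical DataSet<String> named stringIds): it pairs every element with a generated Long, so a join is still needed to push the new ids back into the original tuples, as in the sketch earlier in this thread.

    // withIds: f0 = generated unique Long id, f1 = the original element
    DataSet<Tuple2<Long, String>> withIds = DataSetUtils.zipWithUniqueId(stringIds);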

Long ids from String

2015-11-03 Thread Flavio Pompermaier
Hi to all, I was reading the thread about the Neo4j connector and an old question came to my mind. In my Flink job I have Tuples with String ids that I use to join on, and I'd like to convert them to long (because Flink should improve the memory usage and the processing time quite a lot, if I'm not wro