date:20160713

Ability to partition logs per pipeline

2016-07-13 Thread Chawla,Sumit

Hi All Does flink provide any ability to streamline logs being generated from a pipeline. How can we keep the logs from two pipelines separate so that its easy to debug the pipeline execution (something dynamic to automatically partition the logs per pipeline) Regards Sumit Chawla

Re: Parameters to Control Intra-node Parallelism

2016-07-13 Thread Ovidiu-Cristian MARCU

Hi, I would pay attention to the memory settings such that heap+off-heap+network buffers can be served from your node’s RAM for both TMs. Also, there is some correlation between the number of buffers, parallelism and your workflow’s operators. The suggestion to be used for the numberOfBuffers d

Re: Issue with running Flink Python jobs on cluster

2016-07-13 Thread Geoffrey Mon

Hello, Here is the TaskManager log on pastebin: http://pastebin.com/XAJ56gn4 I will look into whether the files were created. By the way, the cluster is made with virtual machines running on BlueData EPIC. I don't know if that might be related to the problem. Thanks, Geoffrey On Wed, Jul 13, 2

how to get rid of null pointer exception in collection in DataStream

2016-07-13 Thread subash basnet

Hello all, I need to collect the centroids to find out the nearest center for each point. DataStream points = newDataStream.map(new getPoints()); DataStream *centroids* = newCentroidDataStream.map(new TupleCentroidConverter()); ConnectedIterativeStreams loop = points.iterate().withFeedbackType(Cen

Re: Data point goes missing within iteration

2016-07-13 Thread Biplob Biswas

Hi, Sorry for the late reply, was trying different stuff on my code. And from what I observed, its very weird for me. So after experimentation, I found out that when I increase the number of centroids, the number of data points forwarded decreases, when I lower the umber of centroids, the datapo

Re: HDFS to Kafka

2016-07-13 Thread Aljoscha Krettek

Hi, this does not work right now because FileInputFormat does not allow setting the "enumerateNestedFiles" field directly and the Configuration is completely ignored in Flink streaming jobs. Cheers, Aljoscha On Wed, 13 Jul 2016 at 11:06 Robert Metzger wrote: > Hi Dominique, > > In Flink 1.1 we'

How large a Flink cluster can have?

2016-07-13 Thread Yan Chou Chen

FAQ[1], mailing list[2], and the powered by page[3] doesn't find related information. Just out of curiosity, what is the current largest Flink cluster size running in production? For instance, long time ago yahoo [4] ran 4k hadoop nodes in production. Thanks [1]. https://flink.apache.org/faq.html

Re: Writing in flink clusters

2016-07-13 Thread Chesnay Schepler

Hello, Is that the complete error message? I'm a bit surprised it does not explicitly name any file name. If it really doesn't we should change that. Regards, Chesnay Schepler On 13.07.2016 15:35, Alexis Gendronneau wrote: Hi Roy, Have you looked on the nodes in charge of sink tasks ? You s

Re: Writing in flink clusters

2016-07-13 Thread Debaditya Roy

Thanks . Will check. :-) On Wed, Jul 13, 2016 at 3:35 PM, Alexis Gendronneau wrote: > Hi Roy, > > Have you looked on the nodes in charge of sink tasks ? You should be able > to find them on flink web interface by clicking on the sink taks. If you > get the OVERWRITE error, your output is certain

Re: Writing in flink clusters

2016-07-13 Thread Alexis Gendronneau

Hi Roy, Have you looked on the nodes in charge of sink tasks ? You should be able to find them on flink web interface by clicking on the sink taks. If you get the OVERWRITE error, your output is certainly somewhere. By the way, when using distributed mode it is easier to use an output like HDFS. T

Re: countWindow custom WindowFunction

2016-07-13 Thread Ciar, David B.

Hi Vishnu, Thank you for the pointers/modified example, that was really helpful and it is working as expected now. I took another look through the documentation and found in the "Window" section for streaming data, the "Recipes for building windows" sub-section, where it shows the countWindo

Writing in flink clusters

2016-07-13 Thread Debaditya Roy

Hello users, I have written and executed a flink program in a cluster. The program was supposed to write to some text file as a sink, however I cannot find the text files in the target directory of the cluster nodes, but when I reexecute the program second time, it gives me the predictable error:

Re: countWindow custom WindowFunction

2016-07-13 Thread Vishnu Viswanath

Hi David, countWindow(size,slide) creates a GlobalWindow, not a TimeWindow. Also you have to use Tuple instead of Tuple2. class SequentialDeltaCheck extends WindowFunction[RawObservation, String, Tuple, GlobalWindow]{ def apply(key: Tuple, window: GlobalWindow, input: Iterable[RawObservation],

countWindow custom WindowFunction

2016-07-13 Thread Ciar, David B.

Hello everyone, I'm relatively new to using Apache Flink and Scala, and am just getting to grips with some of the basic functionality both provide. I've hit a wall trying to implement a custom WindowFunction over a keyed countWindow however, and hoped someone may have a pointer. The full cod

Re: Data point goes missing within iteration

2016-07-13 Thread Ufuk Celebi

Any update? On Wed, Jul 6, 2016 at 5:55 PM, Ufuk Celebi wrote: > I couldn't tell anything from the code. I would suggest to reduce it > to a minimal example with Integers where you do the same thing flow > structure wise (with simple types) and let's check that again. > > On Wed, Jul 6, 2016 at 9

Re: Issue with running Flink Python jobs on cluster

2016-07-13 Thread Chesnay Schepler

Hello Geoffrey, How often does this occur? Flink distributes the user-code and the python library using the Distributed Cache. Either the file is deleted right after being created for some reason, or the DC returns a file name before the file was created (which shouldn't happen, it should b

Re: custom scheduler in Flink?

2016-07-13 Thread Aljoscha Krettek

Hi, I'm afraid there is no documentation about schedulers, especially at this low level. Maybe this new design proposal would of interest for you, though: https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures In there is a link to the mailing list dis

Re: Apache Flink 1.0.3 IllegalStateException Partiction State on chaining

2016-07-13 Thread ANDREA SPINA

Hi everybody, increasing the akka.ask.timeout solved the second issue. Anyway that was a warning about a congestioned network. So I worked to improve the algorithm. Increasing the numberOfBuffers and the corresponding size solved the first issue, thus now I can run with the full DOP. In my case ena

Re: HDFS to Kafka

2016-07-13 Thread Robert Metzger

Hi Dominique, In Flink 1.1 we've reworked the reading of static files in the DataStream API. There is now a method for passing any FileInputFormat: readFile(fileInputFormat, path, watchType, interval, pathFilter, typeInfo). I guess you can pass a FileInputFormat with the recursive enumeration enab

Re: sampling function

2016-07-13 Thread Le Quoc Do

Hi Till, I have created the JIRA: https://issues.apache.org/jira/browse/FLINK-4205 Thank you, Do On Tue, Jul 12, 2016 at 6:05 PM, Till Rohrmann wrote: > Stratified sampling would also be beneficial for the DataSet API. I think > it would be best if this method is also added to DataSetUtils or

Ability to partition logs per pipeline

Re: Parameters to Control Intra-node Parallelism

Re: Issue with running Flink Python jobs on cluster

how to get rid of null pointer exception in collection in DataStream

Re: Data point goes missing within iteration

Re: HDFS to Kafka

How large a Flink cluster can have?

Re: Writing in flink clusters

Re: Writing in flink clusters

Re: Writing in flink clusters

Re: countWindow custom WindowFunction

Writing in flink clusters

Re: countWindow custom WindowFunction

countWindow custom WindowFunction

Re: Data point goes missing within iteration

Re: Issue with running Flink Python jobs on cluster

Re: custom scheduler in Flink?

Re: Apache Flink 1.0.3 IllegalStateException Partiction State on chaining

Re: HDFS to Kafka

Re: sampling function

20 matches

Site Navigation

Mail list logo

Footer information