Hi All
Does Flink provide any ability to streamline the logs generated from a
pipeline? How can we keep the logs from two pipelines separate, so that
it's easy to debug each pipeline's execution (something dynamic to
automatically partition the logs per pipeline)?
Regards
Sumit Chawla
Hi,
I would pay attention to the memory settings, such that heap + off-heap + network
buffers can be served from your node's RAM for both TMs.
Also, there is some correlation between the number of buffers, the parallelism,
and your workflow's operators. The suggestion to be used for the numberOfBuffers
d
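For what it's worth, here is a back-of-the-envelope sketch of the sizing rule
of thumb from the Flink configuration docs (slots-per-TM^2 * #TMs * 4); the
cluster shape below is only an assumed example, not your actual setup:

public class NetworkBufferSizing {
    public static void main(String[] args) {
        int slotsPerTaskManager = 4;   // assumed slots per TM
        int taskManagers = 2;          // the two TMs mentioned above
        // Rule of thumb from the Flink docs for taskmanager.network.numberOfBuffers
        int numberOfBuffers = slotsPerTaskManager * slotsPerTaskManager
                * taskManagers * 4;
        // Each buffer is one memory segment (32 KB by default in Flink 1.x),
        // so this memory must fit in RAM next to heap and off-heap.
        long bufferMemoryBytes = numberOfBuffers * 32L * 1024;
        System.out.println("taskmanager.network.numberOfBuffers: " + numberOfBuffers);
        System.out.println("approx. buffer memory: "
                + bufferMemoryBytes / (1024 * 1024) + " MB");
    }
}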
Hello,
Here is the TaskManager log on pastebin:
http://pastebin.com/XAJ56gn4
I will look into whether the files were created.
By the way, the cluster is made with virtual machines running on BlueData
EPIC. I don't know if that might be related to the problem.
Thanks,
Geoffrey
On Wed, Jul 13, 2
Hello all,
I need to collect the centroids to find out the nearest center for each
point.
DataStream points = newDataStream.map(new getPoints());
DataStream centroids = newCentroidDataStream.map(new
TupleCentroidConverter());
ConnectedIterativeStreams loop =
points.iterate().withFeedbackType(Cen
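For context, a minimal sketch of how such a loop is usually wired up and
closed in the DataStream API; Point, Centroid, and the selection logic are
placeholders standing in for the poster's own classes:

import java.util.ArrayList;
import java.util.List;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.IterativeStream.ConnectedIterativeStreams;
import org.apache.flink.streaming.api.functions.co.CoFlatMapFunction;
import org.apache.flink.util.Collector;

// Connect the point stream with a Centroid-typed feedback edge.
ConnectedIterativeStreams<Point, Centroid> loop =
        points.iterate().withFeedbackType(Centroid.class);

DataStream<Centroid> updatedCentroids = loop
        .flatMap(new CoFlatMapFunction<Point, Centroid, Centroid>() {
            private final List<Centroid> centroids = new ArrayList<>();

            @Override
            public void flatMap1(Point p, Collector<Centroid> out) {
                // assign p to the nearest centroid collected so far and
                // emit an updated centroid when it moves (logic omitted)
            }

            @Override
            public void flatMap2(Centroid c, Collector<Centroid> out) {
                centroids.add(c); // initial and fed-back centroids arrive here
            }
        });

// Feed the updated centroids back into the iteration head.
loop.closeWith(updatedCentroids);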
Hi,
Sorry for the late reply, I was trying different stuff in my code. And from
what I observed, it's very weird to me.
So after experimenting, I found out that when I increase the number of
centroids, the number of data points forwarded decreases, and when I lower the
number of centroids, the datapo
Hi,
this does not work right now because FileInputFormat does not allow setting
the "enumerateNestedFiles" field directly and the Configuration is
completely ignored in Flink streaming jobs.
Cheers,
Aljoscha
On Wed, 13 Jul 2016 at 11:06 Robert Metzger wrote:
> Hi Dominique,
>
> In Flink 1.1 we'
The FAQ [1], the mailing list [2], and the powered-by page [3] don't turn up
related information. Just out of curiosity, what is the current
largest Flink cluster size running in production? For instance, a long
time ago Yahoo [4] ran 4k Hadoop nodes in production.
Thanks
[1]. https://flink.apache.org/faq.html
Hello,
Is that the complete error message? I'm a bit surprised it does not
explicitly name any file. If it really doesn't, we should change that.
Regards,
Chesnay Schepler
On 13.07.2016 15:35, Alexis Gendronneau wrote:
Hi Roy,
Have you looked at the nodes in charge of the sink tasks? You s
Thanks. Will check. :-)
On Wed, Jul 13, 2016 at 3:35 PM, Alexis Gendronneau wrote:
> Hi Roy,
>
> Have you looked at the nodes in charge of the sink tasks? You should be able
> to find them in the Flink web interface by clicking on the sink tasks. If you
> get the OVERWRITE error, your output is certainly somewhere.
Hi Roy,
Have you looked at the nodes in charge of the sink tasks? You should be able
to find them in the Flink web interface by clicking on the sink tasks. If you
get the OVERWRITE error, your output is certainly somewhere.
By the way, when using distributed mode, it is easier to use an output like
HDFS. T
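As a rough sketch of that suggestion (the result stream and the HDFS path are
placeholders):

import org.apache.flink.core.fs.FileSystem.WriteMode;

// Overwrite existing output so re-runs don't fail; with parallelism > 1 the
// path becomes a directory with one file per sink subtask, written on
// whichever node runs that subtask.
result.writeAsText("hdfs:///user/roy/output", WriteMode.OVERWRITE)
      .setParallelism(1); // optional: a single output file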
Hi Vishnu,
Thank you for the pointers and the modified example; that was really helpful,
and it is working as expected now.
I took another look through the documentation and found, in the "Window"
section for streaming data, the "Recipes for building windows" sub-section,
where it shows the countWindo
Hello users,
I have written and executed a Flink program on a cluster. The program was
supposed to write to some text file as a sink; however, I cannot find the
text files in the target directory on the cluster nodes, and when I
re-execute the program a second time, it gives me the predictable error:
Hi David,
countWindow(size, slide) creates a GlobalWindow, not a TimeWindow. Also, you
have to use Tuple instead of Tuple2.
class SequentialDeltaCheck extends WindowFunction[RawObservation,
    String, Tuple, GlobalWindow] {
  def apply(key: Tuple, window: GlobalWindow,
            input: Iterable[RawObservation], out: Collector[String])
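For comparison, the equivalent shape in Java (RawObservation and the delta
logic are placeholders for the poster's own code):

import org.apache.flink.api.java.tuple.Tuple;
import org.apache.flink.streaming.api.functions.windowing.WindowFunction;
import org.apache.flink.streaming.api.windowing.windows.GlobalWindow;
import org.apache.flink.util.Collector;

public class SequentialDeltaCheck
        implements WindowFunction<RawObservation, String, Tuple, GlobalWindow> {
    @Override
    public void apply(Tuple key, GlobalWindow window,
                      Iterable<RawObservation> input, Collector<String> out) {
        // compare consecutive observations in `input` and emit a result string
    }
}

// usage on a keyed count window:
// stream.keyBy(0).countWindow(10, 5).apply(new SequentialDeltaCheck());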
Hello everyone,
I'm relatively new to using Apache Flink and Scala, and am just getting to
grips with some of the basic functionality both provide. I've hit a wall
trying to implement a custom WindowFunction over a keyed countWindow, however,
and hoped someone may have a pointer. The full cod
Any update?
On Wed, Jul 6, 2016 at 5:55 PM, Ufuk Celebi wrote:
> I couldn't tell anything from the code. I would suggest reducing it
> to a minimal example with Integers where you do the same thing,
> flow-structure-wise (with simple types), and let's check that again.
>
> On Wed, Jul 6, 2016 at 9
Hello Geoffrey,
How often does this occur?
Flink distributes the user code and the Python library using the
Distributed Cache.
Either the file is deleted right after being created for some reason, or
the DC returns a file name before the file was created (which shouldn't
happen; it should b
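For reference, a small sketch of the Distributed Cache mechanism described
above: a file is registered on the environment and resolved on the worker at
runtime. The path and cache name are placeholders:

import java.io.File;
import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// Register a file to be shipped to the workers.
env.registerCachedFile("hdfs:///libs/python-dist.zip", "pyLib");

DataSet<String> data = env.fromElements("a", "b");
data.map(new RichMapFunction<String, String>() {
    @Override
    public void open(Configuration parameters) throws Exception {
        // If the cache misbehaved as described above, this lookup could hand
        // back a path to a file that does not (yet) exist on the worker.
        File lib = getRuntimeContext().getDistributedCache().getFile("pyLib");
    }

    @Override
    public String map(String value) {
        return value;
    }
}).print();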
Hi,
I'm afraid there is no documentation about schedulers, especially at this
low level. Maybe this new design proposal would be of interest to you,
though:
https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures
In
there is a link to the mailing list dis
Hi everybody,
increasing the akka.ask.timeout solved the second issue. Anyway, that was a
warning about a congested network, so I worked on improving the algorithm.
Increasing the numberOfBuffers and the corresponding buffer size solved the
first issue, so now I can run with the full DOP.
In my case ena
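For anyone hitting the same thing, a sketch of those settings applied to a
local environment for testing; on a cluster they would go into flink-conf.yaml
instead, and the values below are illustrative, not the poster's actual
numbers:

import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.configuration.Configuration;

Configuration conf = new Configuration();
conf.setString("akka.ask.timeout", "60 s");                    // default is 10 s
conf.setInteger("taskmanager.network.numberOfBuffers", 4096);  // more buffers
conf.setInteger("taskmanager.memory.segment-size", 65536);     // bigger buffers (bytes)

ExecutionEnvironment env = ExecutionEnvironment.createLocalEnvironment(conf);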
Hi Dominique,
In Flink 1.1 we've reworked the reading of static files in the DataStream
API.
There is now a method for passing any FileInputFormat:
readFile(fileInputFormat, path, watchType, interval, pathFilter, typeInfo).
I guess you can pass a FileInputFormat with the recursive enumeration
enab
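A rough sketch of that approach (the path is a placeholder, and the exact
readFile overload varies a bit across 1.x releases):

import org.apache.flink.api.common.typeinfo.BasicTypeInfo;
import org.apache.flink.api.java.io.TextInputFormat;
import org.apache.flink.core.fs.Path;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.source.FileProcessingMode;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

TextInputFormat format = new TextInputFormat(new Path("hdfs:///data/input"));
format.setNestedFileEnumeration(true); // recurse into sub-directories

DataStream<String> lines = env.readFile(
        format,
        "hdfs:///data/input",
        FileProcessingMode.PROCESS_ONCE,
        -1,                              // poll interval; unused for PROCESS_ONCE
        BasicTypeInfo.STRING_TYPE_INFO);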
Hi Till,
I have created the JIRA: https://issues.apache.org/jira/browse/FLINK-4205
Thank you,
Do
On Tue, Jul 12, 2016 at 6:05 PM, Till Rohrmann wrote:
> Stratified sampling would also be beneficial for the DataSet API. I think
> it would be best if this method is also added to DataSetUtils or