In my opinion, the streaming process can be perfectly simulated on a
single node. You can set up a message distribution system like Kafka on a
single node and you can run Spark on a single node; the only thing you
need to change when running on a cluster is the environment.
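As a quick sketch of that idea (Spark here, since the thread mentions it;
the SPARK_MASTER variable name is just an illustration): the same job code
runs on one node and on a cluster, and only the master URL changes.

import org.apache.spark.{SparkConf, SparkContext}

// Identical job code for a laptop and a cluster; only the master URL
// (the "environment") differs. SPARK_MASTER is a hypothetical variable.
val conf = new SparkConf()
  .setAppName("single-node-streaming-simulation")
  .setMaster(sys.env.getOrElse("SPARK_MASTER", "local[2]")) // e.g. "spark://host:7077" on a cluster
val sc = new SparkContext(conf)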
Hi,
I am relatively new to the development process of Apache Flink. Where
can I start to help you develop Flink?
Kind regards,
Kevin
http://flink.apache.org/how-to-contribute.html
If you have follow-up questions, just go for it :)
-Matthias
Hi,
I am currently working for an organization which uses Apache
Spark as its main data processing framework. Now the organization is
wondering whether Apache Flink is better at processing their data than
Apache Spark. Therefore, I am evaluating Apache Flink and comparing it
to Apache Spark.
http://www.slideshare.net/GyulaFra/largescale-stream-processing-in-the-hadoop-ecosystem
[6] http://www.slideshare.net/GyulaFra/largescale-stream-processing-in-the-hadoop-ecosystem-hadoop-summit-2016-60887821
Hi,
I am currently facing strange behaviour of the FlinkKafkaConsumer09
class. I am using Flink 1.0.3.
These are my properties:
val properties = new Properties()
properties.setProperty("bootstrap.servers", config.urlKafka)
properties.setProperty("group.id", COLLECTOR_NAME)
properties.setPrope
You can use properties.setProperty("auto.offset.reset", "earliest") to
achieve the same behavior.
Kafka keeps track of the offsets per group id. If you have already
read from a topic with a certain group id and want to restart from the
smallest offset available, you need to generate a unique group id.
Cheers,
Max
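A minimal sketch of that combination (the broker address, topic name, and
the UUID-suffixed group id are illustrative assumptions):

import java.util.{Properties, UUID}
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer09
import org.apache.flink.streaming.util.serialization.SimpleStringSchema

// A fresh group id has no committed offsets, so auto.offset.reset
// kicks in and the consumer starts from the earliest offset.
val properties = new Properties()
properties.setProperty("bootstrap.servers", "localhost:9092") // assumed broker
properties.setProperty("group.id", s"collector-${UUID.randomUUID()}")
properties.setProperty("auto.offset.reset", "earliest")

val env = StreamExecutionEnvironment.getExecutionEnvironment
val stream = env.addSource(
  new FlinkKafkaConsumer09[String]("my-topic", new SimpleStringSchema(), properties))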
Hi all,
I am trying to keep track of the biggest value in a stream. I do this by
using the iterative step mechanism of Apache Flink. However, I get an
exception that checkpointing is not supported for iterative jobs. Why
can't this be enabled? My iterative stream is also quite small: only one
Is it possible to discard events that are out-of-order (in terms of
event time)?
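Event-time windows already drop elements that arrive later than the
allowed lateness. For a non-windowed stream, one sketch (assuming a Flink
version with ProcessFunction, 1.2+, and that timestamps/watermarks are
already assigned) is to compare each element's timestamp to the current
watermark:

import org.apache.flink.streaming.api.functions.ProcessFunction
import org.apache.flink.util.Collector

// Drops elements whose timestamp is already behind the current watermark,
// i.e. elements that arrived out of order "too late". Assumes timestamps
// have been assigned upstream (ctx.timestamp() is null otherwise).
class DropLateEvents[T] extends ProcessFunction[T, T] {
  override def processElement(value: T,
                              ctx: ProcessFunction[T, T]#Context,
                              out: Collector[T]): Unit = {
    if (ctx.timestamp() >= ctx.timerService().currentWatermark()) {
      out.collect(value) // still on time w.r.t. the watermark
    } // else: discard silently
  }
}

It would be attached with stream.process(new DropLateEvents[MyEvent]),
where MyEvent is a placeholder for the actual element type.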
A few questions for you (see the sketch after this list):
- Can I make this more efficient?
- Is there a way of mixing datasets and datastreams? That would be
really awesome (for at least this use case).
- Is there a way to ensure checkpoints, since I am using an iterative
stream here?
- Can I get rid of the TumblingProcessingTimeWindows? Because in fact,
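On the checkpointing question, a sketch of one alternative that avoids the
iteration entirely: a keyed reduce keeps the running maximum as ordinary
keyed state, which checkpointing does cover. The constant key and example
values are illustrative; note the constant key funnels all elements
through a single task, so this step has no parallelism.

import org.apache.flink.streaming.api.scala._

// Running maximum without an iterative stream: the reduce state is the
// largest value seen so far and is restored from checkpoints on failure.
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.enableCheckpointing(10000) // checkpoint every 10 seconds

val maxSoFar = env.fromElements(3L, 9L, 4L, 17L, 11L)
  .keyBy(_ => 0)                    // single logical partition
  .reduce((a, b) => math.max(a, b)) // emits the max seen so far per element

maxSoFar.print()
env.execute("running-max sketch")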
Hi!
Welcome to the community :-)!
On 01.08.2016 09:51, Ufuk Celebi wrote:
On Sun, Jul 31, 2016 at 8:07 PM, Neelesh Salian wrote:
I am Neelesh Salian; I recently joined the Flink community and I wanted to
take this opportunity to formally introduce myself.
Thanks and welcome! :-)
Hi,
I have the following use case:
1. Group by a specific field.
2. Get a list of all messages belonging to the group.
3. Count the number of records in the group.
With the DataSet API, this is fairly easy to do (see
http://stackoverflow.com/questions/38745446/apache-flink
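A sketch of those three steps with the DataSet API (the Message type and
its field names are made up for illustration):

import org.apache.flink.api.scala._
import org.apache.flink.util.Collector

case class Message(group: String, body: String) // hypothetical record type

val env = ExecutionEnvironment.getExecutionEnvironment
val messages: DataSet[Message] = env.fromElements(
  Message("a", "m1"), Message("a", "m2"), Message("b", "m3"))

// 1. group by a field, 2. collect the group's messages, 3. count them
val result = messages
  .groupBy("group")
  .reduceGroup { (it: Iterator[Message], out: Collector[(String, Seq[String], Int)]) =>
    val msgs = it.toSeq
    out.collect((msgs.head.group, msgs.map(_.body), msgs.size))
  }

result.print()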
Hi,
Today I will be giving a presentation about Apache Flink, and for the
use cases at my company, Apache Flink performs better than Apache
Spark. There is only one issue I encountered: the lack of support for
(Meta)data Driven Window Triggers.
I would like to start a discussion about this.
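As a concrete starting point for that discussion, a sketch of a
data-driven trigger built on Flink's existing Trigger API (the Event type
and its isEndOfBatch marker field are hypothetical):

import org.apache.flink.streaming.api.windowing.triggers.{Trigger, TriggerResult}
import org.apache.flink.streaming.api.windowing.windows.TimeWindow

case class Event(key: String, isEndOfBatch: Boolean) // hypothetical element type

// Fires (and purges) a window whenever an element carries an
// end-of-batch marker, instead of firing purely on time.
class MarkerTrigger extends Trigger[Event, TimeWindow] {
  override def onElement(e: Event, timestamp: Long, window: TimeWindow,
                         ctx: Trigger.TriggerContext): TriggerResult =
    if (e.isEndOfBatch) TriggerResult.FIRE_AND_PURGE else TriggerResult.CONTINUE

  override def onProcessingTime(time: Long, window: TimeWindow,
                                ctx: Trigger.TriggerContext): TriggerResult =
    TriggerResult.CONTINUE

  override def onEventTime(time: Long, window: TimeWindow,
                           ctx: Trigger.TriggerContext): TriggerResult =
    TriggerResult.CONTINUE

  override def clear(window: TimeWindow, ctx: Trigger.TriggerContext): Unit = {}
}

It would be attached with .window(...).trigger(new MarkerTrigger) on a
keyed stream.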