Hi,
The optimizer internals described in this document [1] are probably not
up to date.
Could you please confirm whether this is still valid:
“The following optimizations are not performed:
Join reordering (or operator reordering in general): Joins / Filters / Reducers
are not re-ordered in Flink. This
Hi All,
I wonder if Flink is the right tool for processing historical time series
data, e.g. many small files.
Our use case: we have clickstream histories (time series) of many users. We
would like to calculate user specific sliding count window aggregates over past
periods for a sample of users
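For illustration, the windowing logic behind such an aggregate can be sketched in plain Java (a hand-rolled sketch, not Flink API; in a Flink job one would key the stream by user and apply a sliding count window instead):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hand-rolled sketch (not Flink API): a sliding count window of `size`
// elements that slides by `slide`, emitting the window average each fire.
public class SlidingCountWindowSketch {

    static List<Double> slidingAverages(List<Integer> clicks, int size, int slide) {
        List<Double> out = new ArrayList<>();
        Deque<Integer> window = new ArrayDeque<>();
        int sinceLastFire = 0;
        for (int c : clicks) {
            window.addLast(c);
            if (window.size() > size) {
                window.removeFirst();  // evict the oldest element
            }
            sinceLastFire++;
            // Fire once the window is full, then again every `slide` elements.
            if (window.size() == size && sinceLastFire >= slide) {
                double sum = 0;
                for (int v : window) sum += v;
                out.add(sum / size);
                sinceLastFire = 0;
            }
        }
        return out;
    }
}
```

The sketch only shows the eviction and firing logic for one user's click history; per-user parallelism is what keying the stream would give you in Flink.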
Hi,
I have a flink streaming application and I want to count records received
per second (as a way of measuring the throughput of my application).
However, I am using the EventTime time characteristic, as follows:
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime)
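One simple way to measure throughput, sketched here in plain Java (class and method names are made up for illustration, not Flink API), is to bucket each record by the wall-clock second in which it arrives:

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch (not Flink API): count records per wall-clock second.
// In a Flink job this logic could live inside a map function, with the
// timestamp taken from System.currentTimeMillis() on arrival.
public class ThroughputCounter {

    private final Map<Long, Long> countsPerSecond = new HashMap<>();

    public void record(long arrivalTimestampMillis) {
        countsPerSecond.merge(arrivalTimestampMillis / 1000, 1L, Long::sum);
    }

    public long countAt(long second) {
        return countsPerSecond.getOrDefault(second, 0L);
    }
}
```

Note that for measuring throughput the buckets should be based on processing (arrival) time; bucketing by event time would measure the data's own timeline rather than the rate at which the job processes records.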
Hi,
Thank you for the responses.
I am not sure if I will be able to use a Fold/Reduce function, but I will
keep that in mind.
One more question: what is the implication of having a key that splits
the data into windows of very small size (i.e., a large number of small
windows)?
Thanks and Regards
Hi Max,
thank you for the fast reply, and sorry: I use flink-1.0.3.
Yes, I tested on a dummy dataset with numOfBuffers = 16384 and a decreased
degree of parallelism, and this solved the first exception. However, on the
80GiB dataset I still struggle with the second exception.
Regards,
Andrea
2016-06-
Hi Anton,
I would suggest you simply put your moving average code in a
MapFunction where you can keep track of the current average using a
class field.
Cheers,
Max
On Fri, Jun 24, 2016 at 10:05 PM, Anton wrote:
> Hello
>
> I'm currently trying to learn Flink. And so far am really impressed by i
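Max's suggestion of keeping the running average in a class field could look roughly like this (a plain-Java sketch; the class name is made up, and in Flink the fields would live inside your MapFunction instance):

```java
// Plain-Java sketch of a moving (running) average kept in class fields.
// In a Flink MapFunction, `count` and `mean` would be fields of the
// function instance (assuming parallelism 1, so all records hit one copy).
public class MovingAverage {

    private long count = 0;
    private double mean = 0.0;

    // Returns the average of all values seen so far.
    public double map(double value) {
        count++;
        mean += (value - mean) / count;  // incremental mean update
        return mean;
    }
}
```

The incremental update avoids keeping all seen values around: only the running mean and the element count are stored.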
Hi Andrea,
The number of network buffers should be sufficient. Actually, assuming
you have 16 task slots on each of the 25 nodes, it should be enough to
have 16^2 * 25 * 4 = 25600 network buffers.
See
https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#background
So we have
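The rule of thumb from the linked configuration page can be written down as a one-liner (a sketch of the arithmetic only, not a Flink utility):

```java
// Sketch of the network-buffer rule of thumb from the Flink config docs:
// #slots-per-TaskManager^2 * #TaskManagers * 4.
public class NetworkBuffers {

    public static long recommended(long slotsPerTaskManager, long taskManagers) {
        return slotsPerTaskManager * slotsPerTaskManager * taskManagers * 4;
    }
}
```

For 16 slots on each of 25 nodes this gives 16^2 * 25 * 4 = 25600 buffers.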
Hi everyone,
I am running some Flink experiments with Peel benchmark
http://peel-framework.org/ and I am struggling with exceptions: the
environment is a 25-node cluster, 16 cores per node. The dataset is
~80GiB and is located on HDFS 2.7.1.
At the beginning I tried with 400 as the degree of parall
Hi,
one thing to add: if you use a ReduceFunction or a FoldFunction for your
window the state will not grow with bigger window sizes or larger numbers
of elements because the result is eagerly computed. In that case, state
size is only dependent on the number of individual keys.
Cheers,
Aljoscha
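The point above about eager (incremental) aggregation can be illustrated with a plain-Java sketch: with a reduce, the window state per key is a single running value, no matter how many elements arrive (the class name is made up for illustration, not Flink API):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Illustrative sketch (not Flink API): incremental aggregation keeps one
// running value per key, so state size depends on #keys, not #elements.
public class IncrementalWindowState {

    private final Map<String, Long> state = new HashMap<>();
    private final BinaryOperator<Long> reducer;

    public IncrementalWindowState(BinaryOperator<Long> reducer) {
        this.reducer = reducer;
    }

    public void add(String key, long value) {
        state.merge(key, value, reducer);  // combine with the running value
    }

    public long get(String key) {
        return state.getOrDefault(key, 0L);
    }

    public int stateSize() {
        return state.size();  // grows with the number of keys only
    }
}
```

Adding a thousand elements for one key still leaves exactly one stored value, which is why state size does not grow with window size when a reduce is used.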
Hi,
It might lead to flooding, yes. But I'm afraid it's the only way to go
right now.
Cheers,
Aljoscha
On Mon, 27 Jun 2016 at 17:57 Biplob Biswas wrote:
> Hi,
>
> I was afraid of buffering because I am not sure when the second map
> function
> would get data, wouldn't the first map be flooded wi
Hi Vishnu,
RocksDB allows for storing the window contents on disk when the state of a
window becomes too big.
BUT when the window is triggered and your window function is applied to
that big window,
then all of its state is loaded into memory.
So although during the window formation ph