In my code, is the config of the ExecutionEnv alright?
> On Aug 11, 2016, at 8:47 PM, Dong-iL, Kim wrote:
>
>
> My code and log are as below.
>
>
> val getExecuteEnv: StreamExecutionEnvironment = {
>   val env =
>     StreamExecutionEnvironment.getExecutionEnvironment.enableCheckpointing(100
Sorry, I meant: streaming cannot use combiners (repeated below).
---
Streaming cannot use combiners. The aggregations happen on the trigger.
The elements being aggregated are only known after the trigger delivers the
elements to the evaluation function.
Since windows can overlap and even assignment …
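For reference: even without combiners, a ReduceFunction on a keyed (non-evicting) window is applied incrementally as elements arrive, after the keyBy() shuffle, so Flink keeps one running value per key and window instead of buffering all elements. A minimal Java sketch, assuming an input of type DataStream<Tuple2<String, Long>> (all names invented):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.windowing.time.Time;

// Incrementally summed counts per key and one-minute window.
// The shuffle still happens first, so this is not a map-side combiner.
DataStream<Tuple2<String, Long>> counts = input
    .keyBy(0)
    .timeWindow(Time.minutes(1))
    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
        @Override
        public Tuple2<String, Long> reduce(Tuple2<String, Long> a,
                                           Tuple2<String, Long> b) {
            return new Tuple2<>(a.f0, a.f1 + b.f1);
        }
    });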
Kostas,
Good catch! That makes it work! Thank you so much for the help.
Regards,
Buvana
-Original Message-
From: Kostas Kloudas [mailto:k.klou...@data-artisans.com]
Sent: Thursday, August 11, 2016 11:22 AM
To: user@flink.apache.org
Subject: Re: flink - Working with State example
Hi Buvana,
I am wondering if Flink makes use of combiners to pre-reduce a keyed and
windowed stream before shuffling the data among workers.
I.e. will it use a combiner in something like:
stream.flatMap {...}
.assignTimestampsAndWatermarks(...)
.keyBy(...)
.timeWindow(...)
.trigger(…
Hi Davran,
unfortunately, you found a bug. I created an issue for it (
https://issues.apache.org/jira/browse/FLINK-4385). You could convert the
timestamp to a long value as a workaround.
Table table1 = tableEnv.fromDataSet(dataSet1);
Table table2 = tableEnv.fromDataSet(dataSet2);
Table table = …
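A minimal sketch of the workaround (the POJO and its fields are invented for illustration): represent the timestamp as epoch milliseconds, so the table column becomes a BIGINT rather than a value affected by FLINK-4385:

// Invented POJO: keep the event time as a long (epoch millis)
// instead of java.sql.Timestamp, which hits the bug.
public class Info {
    public String name;
    public long eventTime; // was: java.sql.Timestamp

    public Info() {} // Flink POJOs need a public no-arg constructor
}

DataSet<Info> dataSet1 = env.fromCollection(infos1);
Table table1 = tableEnv.fromDataSet(dataSet1); // eventTime maps to BIGINT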
"If Window B is a Folding Window and does not have an evictor then it should
not keep the list of all received elements."
Agreed! Upon closer inspection, the behavior I'm describing is only present
when using EvictingWindowOperator, not when using WindowOperator. I misread
line 382 of WindowOperator …
I have two tables created from data sets:
List infos0 = …
List infos1 = …
DataSet dataSet0 = env.fromCollection( infos0 );
DataSet dataSet1 = env.fromCollection( infos1 );
tableEnv.registerDataSet( "table0", dataSet0 );
tableEnv.registerDataSet( "table1", dataSet1
Hi everyone,
I've been trying to write unit tests for my data stream bolts (map, flatMap,
apply, etc.); however, the results I've been getting are strange. The code for
testing is here (running with scalatest and sbt):
https://gist.github.com/dbciar/7469adfea9e6442cdc9568aed07095ff
It runs t…
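Without running the gist it is hard to say more, but a common way to make such assertions deterministic is to force parallelism 1 and collect the results in a static-list sink; a small Java sketch of the pattern (all names invented):

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.sink.SinkFunction;

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class PipelineTest {

    // Collects results into a static list; static, because Flink
    // serializes the sink instance before executing it.
    public static class CollectSink implements SinkFunction<Long> {
        static final List<Long> values =
            Collections.synchronizedList(new ArrayList<Long>());

        @Override
        public void invoke(Long value) throws Exception {
            values.add(value);
        }
    }

    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
            StreamExecutionEnvironment.createLocalEnvironment();
        env.setParallelism(1); // deterministic ordering for the assertion

        env.generateSequence(1, 3).addSink(new CollectSink());
        env.execute("collect-sink-test");

        // assert on CollectSink.values here (JUnit, scalatest, ...)
        System.out.println(CollectSink.values);
    }
}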
Hi Buvana,
At first glance, your snapshotState() should return a Double.
Kostas
> On Aug 11, 2016, at 4:57 PM, Ramanan, Buvana (Nokia - US)
> wrote:
>
> Thank you Kostas & Ufuk. I run into the following compilation error when I
> use the checkpointed interface. Pasting the code & message as follows:
Hi Shannon,
thanks for the clarification. If Window B is a Folding Window and does not
have an evictor then it should not keep the list of all received elements.
Could you maybe post the section of the log that shows which window operator
is used for Window B? I'm looking for something like this: …
Thank you Kostas & Ufuk. I run into the following compilation error when I use
the checkpointed interface. Pasting the code & message as follows:
Is the Serializable definition supposed to be from java.io.Serializable or
somewhere else?
Thanks again,
Buvana
Exactly as Ufuk suggested, if you are not grouping your stream by key,
you should use the checkpointed interface.
The reason I asked before whether you are using keyBy() is that keyBy() is what
implicitly sets the keySerializer and scopes your (keyed) state to a specific
key.
If there is …
This only works for keyed streams, you have to use keyBy().
You can use the Checkpointed interface instead
(https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/state.html#checkpointing-instance-fields).
On Thu, Aug 11, 2016 at 3:35 PM, Ramanan, Buvana (Nokia - US)
wrote:
> Hi Kostas,
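For contrast, a sketch of the keyed-state variant Ufuk describes (Flink 1.1 API; class and names invented). This only works downstream of keyBy(); otherwise getState() fails at runtime:

import org.apache.flink.api.common.functions.RichFlatMapFunction;
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.util.Collector;

// Keyed state: one "prev" value is kept per key.
public class KeyedDiff
        extends RichFlatMapFunction<Tuple2<String, Double>, Tuple2<String, Double>> {

    private transient ValueState<Double> prev;

    @Override
    public void open(Configuration parameters) {
        prev = getRuntimeContext().getState(
            new ValueStateDescriptor<>("prev", Double.class, null));
    }

    @Override
    public void flatMap(Tuple2<String, Double> in,
                        Collector<Tuple2<String, Double>> out) throws Exception {
        Double p = prev.value();
        if (p != null) {
            out.collect(Tuple2.of(in.f0, in.f1 - p)); // per-key x[t] - x[t-1]
        }
        prev.update(in.f1);
    }
}

// usage: stream.keyBy(0).flatMap(new KeyedDiff());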
Do you also set fs.hdfs.hadoopconf in flink-conf.yaml
(https://ci.apache.org/projects/flink/flink-docs-master/setup/config.html#common-options)?
On Thu, Aug 11, 2016 at 2:47 PM, Dong-iL, Kim wrote:
> Hi.
> In this case, I used a standalone cluster (aws EC2) and I want to connect to a
> remote HDFS machine (aws EMR).
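For reference, the relevant flink-conf.yaml entry points Flink at the directory containing core-site.xml and hdfs-site.xml (the path here is only an example):

fs.hdfs.hadoopconf: /etc/hadoop/conf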
Hi Kostas,
Here is my code. All I am trying to compute is (x[t] - x[t-1]), where x[t] is
the current value of the incoming sample and x[t-1] is the previous value of
the incoming sample. I store the current value in the state store (‘prev_tuple’)
so that I can use it for the computation in the next cycle.
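A minimal sketch of that computation on the Checkpointed interface of Flink 1.0/1.1 (class name invented), with snapshotState() returning a Double as Kostas points out above:

import org.apache.flink.api.common.functions.FlatMapFunction;
import org.apache.flink.streaming.api.checkpoint.Checkpointed;
import org.apache.flink.util.Collector;

// Emits x[t] - x[t-1]; the previous sample is the only operator state.
// The state type must be java.io.Serializable; java.lang.Double is,
// and snapshotState() must return exactly that type.
public class DiffFunction
        implements FlatMapFunction<Double, Double>, Checkpointed<Double> {

    private Double prev; // x[t-1]; null until the first sample arrives

    @Override
    public void flatMap(Double value, Collector<Double> out) {
        if (prev != null) {
            out.collect(value - prev);
        }
        prev = value;
    }

    @Override
    public Double snapshotState(long checkpointId, long checkpointTimestamp) {
        return prev;
    }

    @Override
    public void restoreState(Double state) {
        prev = state;
    }
}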
Indeed, using the same parallelism corrected the output. Thank you!
On Thu, Aug 11, 2016 at 2:34 PM, Stephan Ewen wrote:
> Hi!
>
> The source runs in parallel (n tasks), but the sink has a parallelism of 1.
> The sink hence has to merge the parallel streams from the source, which
> happens based on …
Hi.
In this case, I used a standalone cluster (aws EC2) and I want to connect to a
remote HDFS machine (aws EMR).
I registered the location of core-site.xml as below.
Does it need other properties?
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://…:8020</value>
</property>
<property>
  <name>hadoop.security.authentication</name>
  <value>simple</value>
</property>
Hi!
The source runs in parallel (n tasks), but the sink has a parallelism of 1.
The sink hence has to merge the parallel streams from the source, which
happens based on the arrival speed of the streams, i.e., it's not deterministic.
That's why you see the lines being mixed.
Try running source and sink with the same parallelism.
Hi!
Do you register the Hadoop config in the Flink configuration?
Also, do you use Flink standalone or on YARN?
Stephan
On Tue, Aug 9, 2016 at 11:00 AM, Dong-iL, Kim wrote:
> Hi.
> I’m trying to set an external HDFS as the state backend.
> My OS user name is ec2-user; the HDFS user is hadoop.
> There is …
Hi all,
When I use out.collect() twice inside a flatMap, the output is sometimes
randomly skewed. Take this example:
final StreamExecutionEnvironment env =
StreamExecutionEnvironment.createLocalEnvironment();
env.generateSequence(1, 10)
.flatMap((Long t, Collector out) -> {
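A runnable approximation of the snippet above (Java 8; the Collector's element type is assumed to be String, since the generics were stripped by the archive), including Stephan's fix of matching sink and source parallelism:

final StreamExecutionEnvironment env =
    StreamExecutionEnvironment.createLocalEnvironment();

env.generateSequence(1, 10)
    .flatMap((Long t, Collector<String> out) -> {
        out.collect("first:" + t);  // two collects per input element
        out.collect("second:" + t);
    })
    .returns(String.class)          // lambdas lose the generic type info
    .print()
    .setParallelism(env.getParallelism()); // match the source parallelism

env.execute();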
Hi!
A global shared variable is not something that is offered by Flink right
now. It is not part of the system, because it is not really part of the
stream or state derived from individual streams. It is also quite hard to
do efficiently and in a general-purpose way.
I see that it is a useful tool in several …
Just to add a drawback to solution 2): you may have some issues because window
boundaries may not be aligned. For example, the elements of a day window may be
split between the last day of a month and the first day of the next month.
Kostas
> On Aug 11, 2016, at 2:21 PM, Kostas Kloudas
> wrote:
>
Hi Shannon,
From what I understand, you want to have your results windowed by different
durations, e.g. by minute, by day,
by month, and you use the evictor to decide which elements should go into each
window. If I am correct, then I do not
think that you need the evictor, which bounds …
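The mail is truncated, but the evictor-free version of "one result per duration" is simply one window operator per duration over the same stream. A short Java sketch, assuming a DataStream<Event> events whose invented Event type has fields "id" and "value"; note there is no built-in month window, which is where the alignment caveat from Kostas's other mail above applies:

// One windowed aggregation per duration, no evictor needed.
DataStream<Event> perMinute = events
    .keyBy("id")
    .timeWindow(Time.minutes(1))
    .sum("value");

DataStream<Event> perDay = events
    .keyBy("id")
    .timeWindow(Time.days(1))
    .sum("value");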
Hi,
Sameer is right about the snapshotting. The CEP operator behaves more or
less like a FlatMap operator that keeps some more complex state internally.
Snapshotting works the same as with any other operator.
Cheers,
Aljoscha
On Thu, 11 Aug 2016 at 00:54 Sameer W wrote:
> Mans,
>
> I think at t
My code and log are as below.
val getExecuteEnv: StreamExecutionEnvironment = {
val env =
StreamExecutionEnvironment.getExecutionEnvironment.enableCheckpointing(1)
env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime)
env.getCheckpointConfig.setCheckp
What do you mean by "lost", exactly?
You call value() and it returns a value (!= null/defaultValue) and you
call it again and it returns null/defaultValue for the same key with
no update in between?
On Thu, Aug 11, 2016 at 11:59 AM, Kostas Kloudas
wrote:
> Hello,
>
> Could you share the code of the job you are running?
Hello,
Could you share the code of the job you are running?
With only this information I am afraid we cannot help much.
Thanks,
Kostas
> On Aug 11, 2016, at 11:55 AM, Dong-iL, Kim wrote:
>
> Hi.
> I’m using Flink 1.0.3 on AWS EMR.
> Sporadically, the value of a ValueState is lost.
> What is a starting point for solving this problem?
Hi.
I’m using Flink 1.0.3 on AWS EMR.
Sporadically, the value of a ValueState is lost.
What is a starting point for solving this problem?
Thank you.
Hello Buvana,
Can you share a bit more detail on your operator and how you are using it?
For example, are you using keyBy() before your custom operator?
Thanks a lot,
Kostas
> On Aug 10, 2016, at 10:03 PM, Ramanan, Buvana (Nokia - US)
> wrote:
>
> Hello,
>
> I am utilizing the code snippet …
The Flink PMC is pleased to announce the availability of Flink 1.1.1.
The Maven artifacts published on Maven central for the previous 1.1.0
version had a Hadoop dependency issue. No Hadoop 1 specific version
(with version 1.1.0-hadoop1) was deployed and the 1.1.0 artifacts have
a dependency on Hadoop 2.x.
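Picking up the fixed artifacts should only require bumping the version in your pom.xml, e.g. (Scala 2.10 artifact suffix shown; adjust to your build):

<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-streaming-java_2.10</artifactId>
  <version>1.1.1</version>
</dependency>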