Hi,
There was a bug in my code: I was assigning the timestamps incorrectly, and that
is why it looked like early events were assigned processing time.
Surprisingly enough, my test works OK with early events. In fact I
have modified my test data generator to generate early events or late
events, and
Hi,
Maybe this is already in the documentation, sorry if I'm asking something
obvious. I was thinking that if you have event time then you can also have
early events, which would be events whose extracted timestamp is in the
future. This might happen in practice, for example, in sensors with a skew
Hi!
My topology below seems to work when I comment out all the lines with
ContinuousEventTimeTrigger, but prints nothing when the line is in
there. Can I coGroup two large time windows that use a different
trigger time than the window size? (even if the
ContinuousEventTimeTrigger doesn't
Hi Steven,
As Robert said some of our jobs have state sizes around a TB or more. We
use the RocksDB state backend with some configs tuned to perform well on
SSDs (you can get some tips here:
https://www.youtube.com/watch?v=pvUqbIeoPzM).
We checkpoint our state to Ceph (similar to HDFS but this is
Hi Dominik,
Your observation is right, running the JobManager and TaskManager on the
same node is no problem. If that machine fails, both services will be
affected, but as long as you have infrastructure in place (YARN for
example) to start them somewhere else, nothing bad will happen.
Regarding
Hi Tomas,
I'm really not an RDF processing expert, but since nobody responded for 4
days, I'll try to give you some pointers:
I know that there've been discussions regarding RDF processing on this
mailing list before.
Check out this one for example:
http://apache-flink-user-mailing-list-archive.23
Hi Steven,
According to this presentation, King.com is using Flink with terabytes of
state:
http://flink-forward.org/wp-content/uploads/2016/07/Gyulo-Fo%CC%81ra-RBEA-Scalable-Real-Time-Analytics-at-King.compressed.pdf
(see Page 4 specifically)
For the 90GB experiment, what is the expected time fo
Hi Craig,
I also received only this email (and I'm a moderator of the dev@ list, so
the message never made it into Apache's infra).
When this issue was first reported [1][2] I asked on the Maven mailing list
what was going on [3]. I think this JIRA contains the most information on the
issue: https://
Hi,
I have written a program that connects to the example stock ticker stream on
AWS Kinesis and filters out those related to the tech sector. I have tried it on my
local machine running `sbt run' and everything seems OK.
Then I moved to AWS EMR (emr-5.1.0). I've installed the Flink
distribution ins