Hi Tony,
In the documentation on keyed vs. non-keyed windows it says that keying will
split the stream into parallel keyed streams, with windows being evaluated in
parallel across the keys. I take that to mean that each key gets its own
window, managed independently.
https://ci.apache.org/
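To make my reading concrete, roughly the minimal sketch below (the tuple fields mirror the sample data further down; the 5-minute window and names are just placeholders, not my actual job) is what I'd expect to keep one independent window per key:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.time.Time;

public class KeyedWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // stand-in source; the real job reads these records from Kafka
        DataStream<Tuple2<String, Integer>> events = env.fromElements(
                Tuple2.of("Marcus", 1), Tuple2.of("Suzy", 2), Tuple2.of("Alicia", 3));

        events
            .keyBy(0)                    // one logical window per distinct name
            .timeWindow(Time.minutes(5)) // tumbling processing-time window
            .sum(1)                      // each key's window aggregates independently
            .print();

        env.execute("keyed window sketch");
    }
}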
Sure. I'm going to use a name as the key in this example and just a number as
the aggregated value. This is the sample input data:
12:00
{"name": "Marcus", "value": 1}
12:01
{"name": "Suzy", "value": 2}
12:03
{"name": "Alicia", "value": 3}
12:04
{"name": "Ben", "value": 1}
12:06
{"name": "Alicia", "val
I am using processing time, so it is using the default timestamps and
watermarks. I am running with a parallelism of 3, and I can see each operator
running at a parallelism of 3 in the Web UI. I am pulling data from a Kafka
topic with 12 partitions.
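For reference, the environment and source setup is essentially the sketch below (broker address, group id, and topic name are placeholders; the parallelism, time characteristic, and partition count are as described above):

import java.util.Properties;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer010;
import org.apache.flink.streaming.util.serialization.SimpleStringSchema;

public class KafkaSourceSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.ProcessingTime); // default timestamps/watermarks
        env.setParallelism(3); // 12 Kafka partitions spread over 3 source subtasks

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
        props.setProperty("group.id", "window-test");             // placeholder

        // the JSON records shown above arrive as strings; parsing, keying and windowing omitted here
        DataStream<String> raw = env.addSource(
                new FlinkKafkaConsumer010<>("events", new SimpleStringSchema(), props));
        raw.print();

        env.execute("kafka source sketch");
    }
}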
I ended up combining all the patterns into one giant CEP pattern and then
filtering the output of that pattern instead. That way there is only one
RocksDB instance, which leads to large checkpoints instead of lots of small
ones. This seems to work, but I still need to do more testing around
wheth
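In case the shape of it helps anyone, it looks roughly like the sketch below (imports omitted; 1.3-style conditions; the Event/Alert types and the rule helpers are placeholders rather than my real rules):

// One broad pattern instead of ~300 separate CEP.pattern calls; matches are
// classified and filtered afterwards. "keyedInput" is the keyed event stream.
Pattern<Event, ?> combined = Pattern.<Event>begin("first")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) {
                return e.couldStartAnyRule(); // placeholder condition
            }
        })
        .followedBy("second")
        .where(new SimpleCondition<Event>() {
            @Override
            public boolean filter(Event e) {
                return e.couldEndAnyRule();   // placeholder condition
            }
        })
        .within(Time.minutes(10));            // placeholder time bound

DataStream<Alert> alerts = CEP.pattern(keyedInput, combined)
        .select(new PatternSelectFunction<Event, Alert>() {
            @Override
            public Alert select(Map<String, List<Event>> match) {
                // placeholder: work out which logical rule this match belongs to
                return classify(match.get("first").get(0), match.get("second").get(0));
            }
        })
        .filter(new FilterFunction<Alert>() {
            @Override
            public boolean filter(Alert a) {
                return a.matchedSomeRule();   // drop matches that belong to no rule
            }
        });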
There are only 3 nodes in the HDFS cluster, and running fsck reports the
filesystem as healthy.
$ hdfs fsck
/user/hadoop/flink/checkpoints/dc2aee563bebce76e420029525c37892/chk-43/
17/04/28 16:24:59 WARN util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable
The top-level exception is similar to the one in this Jira issue, but the root
exception is different. That issue says it was fixed in 1.2.0, which is what
I'm using:
https://issues.apache.org/jira/browse/FLINK-5663
This is the stack trace I'm getting when checkpointing to HDFS. It happens
about once every 3 checkpoints, and I don't see it when running without parallelism.
AsynchronousException{java.lang.Exception: Could not materialize checkpoint
6 for operator KeyedCEPPatternOperator -> Flat Map -> Map -> Sink: Unna
Starting ~300 CEP patterns with a parallelism of 6, since there are 6 partitions
on a Kafka topic. Checkpointing with RocksDB to Hadoop at an interval of 50
seconds. The cluster is HA with 2 JMs and 5 TMs. Getting the following exception:
java.io.IOException: Error creating ColumnFamilyHandle.
at
org.a
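For reference, the checkpointing setup boils down to the sketch below (the HDFS path matches the checkpoint directory in the fsck output above; sources, the ~300 patterns, and sinks are omitted):

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetupSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setParallelism(6);          // one subtask per Kafka partition
        env.enableCheckpointing(50000); // 50 second checkpoint interval
        env.setStateBackend(new RocksDBStateBackend("hdfs:///user/hadoop/flink/checkpoints"));
        // ... sources, CEP patterns and sinks go here ...
    }
}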
Ok, I will try using Flink 1.3
I do have a within clause on all the patterns, and I am calling CEP.pattern on
each one. On the output I am adding a Kafka sink. Since all the patterns are
going to the same sink, I was wondering if there is a better way to do it
rather than having that overhead.
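Concretely, per pattern it currently looks roughly like the sketch below (imports omitted; the patterns list, Event type, and serialization are placeholders), which is where the per-pattern overhead comes from:

// Current shape: one CEP.pattern call per rule, all writing to the same Kafka topic.
FlinkKafkaProducer010<String> alertSink =
        new FlinkKafkaProducer010<>("localhost:9092", "alerts", new SimpleStringSchema()); // placeholders

for (Pattern<Event, ?> rule : patterns) {   // the full list of rules, each with a within(...) clause
    CEP.pattern(keyedInput, rule)
       .select(new PatternSelectFunction<Event, String>() {
           @Override
           public String select(Map<String, List<Event>> match) {
               return match.toString();     // placeholder serialization
           }
       })
       .addSink(alertSink);                 // same sink/topic for every rule's output
}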
For the memory issues with 1.2, I do
I'm trying to run multiple independent CEP patterns. They're basic patterns,
just one input followed by another, and my Flink job runs fine when just
using one pattern. If I try to scale this up to add multiple CEP patterns, 200
for example, I start getting memory errors on my cluster. I can definitel
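Each pattern is just a two-step sequence along the lines of the sketch below (1.2-style where(FilterFunction); the Event type and conditions are placeholders):

Pattern<Event, ?> rule = Pattern.<Event>begin("start")
        .where(new FilterFunction<Event>() {
            @Override
            public boolean filter(Event e) {
                return "A".equals(e.getType()); // placeholder condition
            }
        })
        .followedBy("end")
        .where(new FilterFunction<Event>() {
            @Override
            public boolean filter(Event e) {
                return "B".equals(e.getType()); // placeholder condition
            }
        })
        .within(Time.minutes(10));              // placeholder time bound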