Hi all,
Say I assign timestamps to a stream and then apply a transformation like
this:
stream.keyBy(0).timeWindow(Time.hours(5)).reduce(count).timeWindowAll(Time.days(1)).apply(transformation)
Now, when the first window is applied, events are aggregated based on their
timestamps, but I don't und…
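For context, the whole pipeline might look like the following (an untested
sketch assuming Flink 1.1-style event-time APIs and a stream of (key, 1L)
tuples; the source elements, timestamps, and window functions are
placeholders for the real ones):

import org.apache.flink.api.common.functions.ReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

DataStream<Tuple2<String, Long>> counts = env
    .fromElements(Tuple2.of("user-1", 1L), Tuple2.of("user-2", 1L))
    .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {
        private long currentMax = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(Tuple2<String, Long> element, long previousTimestamp) {
            long ts = System.currentTimeMillis(); // placeholder: use the event's own timestamp
            currentMax = Math.max(currentMax, ts);
            return ts;
        }

        @Override
        public Watermark getCurrentWatermark() {
            return new Watermark(currentMax); // assumes no out-of-order events
        }
    })
    .keyBy(0)
    .timeWindow(Time.hours(5))
    .reduce(new ReduceFunction<Tuple2<String, Long>>() {
        @Override
        public Tuple2<String, Long> reduce(Tuple2<String, Long> a, Tuple2<String, Long> b) {
            return Tuple2.of(a.f0, a.f1 + b.f1); // per-key count within each 5h window
        }
    });

counts
    .timeWindowAll(Time.days(1))
    .apply(new AllWindowFunction<Tuple2<String, Long>, String, TimeWindow>() {
        @Override
        public void apply(TimeWindow window, Iterable<Tuple2<String, Long>> values,
                          Collector<String> out) {
            long total = 0;
            for (Tuple2<String, Long> v : values) {
                total += v.f1;
            }
            out.collect("daily total: " + total); // the daily transformation over all keys
        }
    });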
Hi,
It looks like the within() clause of CEP uses tumbling windows. I could get
it to use sliding windows by using an upstream pipeline which applies sliding
windows and produces repeating elements (one copy per sliding window), and
applying a watermark assigner on the resulting stream with the elements
duplicat…
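Roughly, that workaround looks like this (an untested sketch; `input` is
assumed to be a DataStream<Tuple2<String, Long>> that carries its event
timestamp in f1 and already has timestamps/watermarks assigned):

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.AssignerWithPeriodicWatermarks;
import org.apache.flink.streaming.api.functions.windowing.AllWindowFunction;
import org.apache.flink.streaming.api.watermark.Watermark;
import org.apache.flink.streaming.api.windowing.assigners.SlidingEventTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

DataStream<Tuple2<String, Long>> duplicated = input
    // Every element is re-emitted once per sliding window it falls into,
    // so a downstream CEP within() over the duplicates behaves like a
    // sliding window rather than a tumbling one.
    .windowAll(SlidingEventTimeWindows.of(Time.minutes(10), Time.minutes(1)))
    .apply(new AllWindowFunction<Tuple2<String, Long>, Tuple2<String, Long>, TimeWindow>() {
        @Override
        public void apply(TimeWindow window, Iterable<Tuple2<String, Long>> values,
                          Collector<Tuple2<String, Long>> out) {
            for (Tuple2<String, Long> e : values) {
                out.collect(e);
            }
        }
    })
    // The duplicates arrive out of order across overlapping windows, so
    // watermarks have to be re-assigned with some tolerance.
    .assignTimestampsAndWatermarks(new AssignerWithPeriodicWatermarks<Tuple2<String, Long>>() {
        private long currentMax = Long.MIN_VALUE;

        @Override
        public long extractTimestamp(Tuple2<String, Long> e, long previousTimestamp) {
            currentMax = Math.max(currentMax, e.f1);
            return e.f1;
        }

        @Override
        public Watermark getCurrentWatermark() {
            // lag the watermark by one window size to cover the duplicates
            return new Watermark(currentMax - Time.minutes(10).toMilliseconds());
        }
    });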
Hi,
For the error: this is what I get when I run the .jar built by mvn clean package:
java.lang.NoClassDefFoundError: org/bytedeco/javacpp/opencv_core$Mat
at loc.video.Job.main(Job.java:29)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.inv…
I want to load previous state, and I understand I could do this by
specifying a savepoint.
Is there a way to do this automatically, given that I do not change my code
(jar)?
--
Cheers,
Shaosu
Out of curiosity I've tried this locally by adding the following
dependencies to my Maven project:
<dependency>
    <groupId>org.bytedeco</groupId>
    <artifactId>javacpp</artifactId>
    <version>1.2.2</version>
</dependency>
<dependency>
    <groupId>org.bytedeco.javacpp-presets</groupId>
    <artifactId>opencv</artifactId>
    <version>3.1.0-1.2</version>
</dependency>
With this, running mvn clean package works as expected.
On Tue, Jul 26, 2016 at 7:09 PM, Ufuk Celebi…
What error message do you get from Maven?
On Tue, Jul 26, 2016 at 4:39 PM, Debaditya Roy wrote:
> Hello,
>
> I am using the jar builder from the IntelliJ IDE (the mvn one was causing
> problems). After that, I executed it successfully locally, but remotely it
> is causing a problem.
>
> Warm Regards,
>
Hi Robert,
Are you able to simplify your function input/output types? Flink
aggressively serializes the data stream, and complex types such as ArrayList
and BitSet will be much slower to process. Are you able to restructure the
lists as groupings of elements?
Greg
On Mon, Jul 25, 2016 at…
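To illustrate Greg's suggestion (a hypothetical sketch with String keys and
Long values standing in for the real types): instead of shipping one
serialized ArrayList per record, keep one element per record and let the
grouping reconstruct the list boundaries.

import org.apache.flink.api.common.functions.GroupReduceFunction;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

// Instead of DataSet<Tuple2<String, ArrayList<Long>>>, keep one element
// per record; flat tuples serialize much more cheaply than generic
// collection types such as ArrayList or BitSet.
DataSet<Tuple2<String, Long>> elements = env.fromElements(
    Tuple2.of("a", 1L), Tuple2.of("a", 2L), Tuple2.of("b", 3L));

elements
    .groupBy(0) // the former "list" is now a group under its key
    .reduceGroup(new GroupReduceFunction<Tuple2<String, Long>, Tuple2<String, Long>>() {
        @Override
        public void reduce(Iterable<Tuple2<String, Long>> values,
                           Collector<Tuple2<String, Long>> out) {
            String key = null;
            long sum = 0;
            // iterate the group where the ArrayList was iterated before
            for (Tuple2<String, Long> v : values) {
                key = v.f0;
                sum += v.f1;
            }
            out.collect(Tuple2.of(key, sum));
        }
    });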
Hello,
I am using the jar builder from the IntelliJ IDE (the mvn one was causing
problems). After that, I executed it successfully locally, but remotely it
is causing a problem.
Warm Regards,
Debaditya
On Tue, Jul 26, 2016 at 4:36 PM, Ufuk Celebi wrote:
> Yes, the BlobCache on each TaskManager node…
Yes, the BlobCache on each TaskManager node should fetch it from the
JobManager. How are you packaging your JAR?
On Tue, Jul 26, 2016 at 4:32 PM, Debaditya Roy wrote:
> Hello users,
>
> I am having a problem while running my Flink program in a cluster. It gives
> me an error that it is unable to…
Hello users,
I am having a problem while running my Flink program in a cluster. It gives
me an error that it is unable to find an .so file in a tmp directory.
Caused by: java.lang.UnsatisfiedLinkError: no jniopencv_core in
java.library.path
at java.lang.ClassLoader.loadLibrary(ClassLoader.jav…
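If the OpenCV natives come from the JavaCPP presets, one thing worth trying
(an untested sketch, not a verified fix) is letting JavaCPP extract and load
the bundled library explicitly instead of relying on java.library.path; this
requires the platform-specific native JAR on the TaskManagers' classpath:

import org.bytedeco.javacpp.Loader;
import org.bytedeco.javacpp.opencv_core;

public class Job {
    public static void main(String[] args) throws Exception {
        // Ask JavaCPP to extract the bundled native library for this
        // platform into a temp directory and load it, before any
        // OpenCV class is touched.
        Loader.load(opencv_core.class);
        // ... rest of the Flink program ...
    }
}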
Hi,
Well, what we are doing at King is trying to solve a similar problem. It
would be great if you could read the blog post, because it goes into detail
about the actual implementation, but let me recap here quickly:
We are building a stream processing system that data scientists and other
developer…
Thanks.
Hi Gyula, anything you can share on this?
Aparup
On 7/26/16, 4:38 AM, "Ufuk Celebi" wrote:
>On Mon, Jul 25, 2016 at 5:38 AM, Aparup Banerjee (apbanerj) wrote:
>> We are building a stream processing system using Apache Beam on top of Flink
>> using the Flink Runner. Our pipelines…
Thank you. That clears it up.
I meant savepoints. Sorry, I used the term snapshots in its place :-).
Thanks,
Sameer
On Tue, Jul 26, 2016 at 8:33 AM, Ufuk Celebi wrote:
> On Tue, Jul 26, 2016 at 2:15 PM, Sameer W wrote:
> > 1. Calling clear() on the KV state is only possible for snapshots, right…
On Tue, Jul 26, 2016 at 2:15 PM, Sameer W wrote:
> 1. Calling clear() on the KV state is only possible for snapshots, right? Do
> you control that for checkpoints too?
What exactly do you mean by snapshots vs. checkpoints?
> 2. Assuming that the user has no control over the checkpoint process o…
Thanks Ufuk,
That was very helpful, but it raised a few more questions :-):
1. Calling clear() on the KV state is only possible for snapshots, right? Do
you control that for checkpoints too?
2. Assuming that the user has no control over the checkpoint process
outside of controlling the checkpoi…
Are you using the DataSet or DataStream API?
Yes, most Flink transformations operate on single tuples, but you can
work around it:
- You could write a custom source function, which emits records that
contain X points; see the sketch below
(https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/index.html#dat…
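A hypothetical sketch of such a source for the DataStream API (the batch
size X and the readNextPoint() helper are placeholders for the real input):

import java.util.ArrayList;
import java.util.List;
import org.apache.flink.streaming.api.functions.source.SourceFunction;

// Emits one record per batch of X points instead of one record per
// point, so downstream transformations see whole groups at once.
public class PointBatchSource implements SourceFunction<List<double[]>> {

    private static final int X = 100;   // assumed batch size
    private volatile boolean running = true;

    @Override
    public void run(SourceContext<List<double[]>> ctx) throws Exception {
        List<double[]> batch = new ArrayList<>(X);
        while (running) {
            batch.add(readNextPoint()); // hypothetical: read one point from the input
            if (batch.size() == X) {
                ctx.collect(new ArrayList<>(batch));
                batch.clear();
            }
        }
    }

    @Override
    public void cancel() {
        running = false;
    }

    private double[] readNextPoint() {
        // placeholder for the actual point-reading logic
        return new double[] {Math.random(), Math.random()};
    }
}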
Hi Suma Cherukuri,
I also replied to your question on the dev list, but I repeat the answer here
just in case you missed it.
From what I understand, you have many small files and you want to
aggregate them into bigger ones containing the logs of the last 24h.
As Max said, RollingSinks will allow…
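For reference, a minimal RollingSink setup might look like this (an untested
sketch; assumes the flink-connector-filesystem dependency and a
DataStream<String> of log lines named logLines):

import org.apache.flink.streaming.connectors.fs.DateTimeBucketer;
import org.apache.flink.streaming.connectors.fs.RollingSink;

// Write all records of a day into one bucket directory, so the many
// small inputs end up appended into larger daily part files.
RollingSink<String> sink = new RollingSink<>("/path/to/aggregated-logs");
sink.setBucketer(new DateTimeBucketer("yyyy-MM-dd")); // one directory per day
sink.setBatchSize(1024 * 1024 * 128);                 // roll part files at ~128 MB

logLines.addSink(sink);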
On Mon, Jul 25, 2016 at 5:38 AM, Aparup Banerjee (apbanerj) wrote:
> We are building a stream processing system using Apache Beam on top of Flink
> using the Flink Runner. Our pipelines take Kafka streams as sources, and
> can write to multiple sinks. The system needs to be tenant aware. Tenants…
On Mon, Jul 25, 2016 at 10:09 AM, Claudia Wegmann wrote:
> To 3) Would an approach similar to King/RBEA even be possible combined with
> Flink CEP? As I understand it, patterns have to be defined in Java code and
> therefore have to be recompiled? Am I overlooking something important?
Pulling in Till (…
On Mon, Jul 25, 2016 at 8:50 PM, Sameer W wrote:
> The question is, when using really long windows (in hours), if the state of the
> window gets very large over time, would the size of the RocksDB get larger?
> Would replication to HDFS start causing performance bottlenecks? Also, would
> this need a cons…
Hello,
I want to calculate daily access count using Flink streaming.
Flink's TumblingProcessingTimeWindow assigns events to windows of
00:00 GMT to 23:59 GMT each day, but I live in Japan (GMT+09:00) and
want the date boundaries to be at 15:00 GMT (00:00 JST).
Do I have to implement my own WindowAssigner?
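For what it's worth, newer Flink versions add an offset parameter to the
built-in window assigners (an untested sketch, assuming a version that
supports it; accesses stands for a DataStream<Tuple2<String, Long>> of
(page, 1L) events). On versions without the offset, a custom WindowAssigner
is indeed the way to go.

import org.apache.flink.streaming.api.windowing.assigners.TumblingProcessingTimeWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

// A daily window shifted by -9 hours has its boundaries at 15:00 GMT,
// i.e. 00:00 JST, so each window covers one Japanese calendar day.
accesses
    .keyBy(0)
    .window(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-9)))
    .sum(1);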
+1 to what Gábor said. The hash combine will be part of the upcoming
1.1 release, too.
This could be further amplified by the blocking intermediate results,
whose very simplistic implementation writes out many different
files, which can lead to a lot of random I/O.
– Ufuk
On Tue, Jul 26…
Hello Robert,
> Is there something I could do to optimize the grouping?
You can try to make your `RichGroupReduceFunction` implement the
`GroupCombineFunction` interface, so that Flink can do combining
before the shuffle, which might significantly reduce the network load.
(How much the comb…
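A minimal sketch of that (assuming the input is a DataSet<Tuple2<String, Long>>
of key/count pairs; combine() can simply delegate to reduce(), because
partial sums merge safely):

import org.apache.flink.api.common.functions.GroupCombineFunction;
import org.apache.flink.api.common.functions.RichGroupReduceFunction;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.util.Collector;

public class SumReducer
        extends RichGroupReduceFunction<Tuple2<String, Long>, Tuple2<String, Long>>
        implements GroupCombineFunction<Tuple2<String, Long>, Tuple2<String, Long>> {

    @Override
    public void reduce(Iterable<Tuple2<String, Long>> values,
                       Collector<Tuple2<String, Long>> out) {
        String key = null;
        long sum = 0;
        for (Tuple2<String, Long> v : values) {
            key = v.f0;
            sum += v.f1;
        }
        out.collect(Tuple2.of(key, sum));
    }

    // Runs on each sender before the shuffle; emits partial sums that
    // the reduce side can safely aggregate again.
    @Override
    public void combine(Iterable<Tuple2<String, Long>> values,
                        Collector<Tuple2<String, Long>> out) {
        reduce(values, out);
    }
}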