Re: Reconstruct object through partial select query

2019-05-10 Thread Shahar Cizer Kobrinsky
Hi Fabian, I have a trouble implementing the type for this operation, i wonder how i can do that. So given generic type T i want to create a TypeInformation for: class TaggedEvent { String[] tags T originalEvent } Was trying a few different things but not sure how to do it. Doesn't seem lik

Re: Use case for StreamingFileSink: Different parquet writers within the Sink

2019-05-10 Thread Kailash Dayanand
Hello, I was able to solve this based by creating a data model where all the incoming events are added into a message envelope and writing a Sink for a dataStream containing these message envelopes. Also I ended up creating parquet writers not when constructing the parquetWriter but instead inside

[no subject]

2019-05-10 Thread an0
> Q2: after a, map(A), and map(B) would work fine. Assign watermarks > immediatedly after a keyBy() is not a good idea, because 1) the records are > shuffled and it's hard to reasoning about ordering, and 2) you lose the > KeyedStream property and would have to keyBy() again (unless you use > inter

Re: I want to use MapState on an unkeyed stream

2019-05-10 Thread an0
Got it, thanks. On 2019/05/10 10:20:40, Fabian Hueske wrote: > Hi, > > RocksDB is only used as local state store. Operator state is not stored in > RocksDB but only on the TM JVM heap. > When a checkpoint is taken, the keyed state from RocksDB and the operator > state from the heap are both cop

Re: Rich and incrementally aggregating window functions

2019-05-10 Thread an0
Thanks. I know reimplementing windowing myself will work but that's a very bad last resort. I believe it is a very reasonable request. But since someone else has already filed a Jira and it was closed as Won't Fix[1], I won't bother refiling it again. I'll try something else first. [1] https:

Re: Checkpointing and save pointing

2019-05-10 Thread Fabian Hueske
For checkpoints and savepoints, the JM and all TMs need access to the same storage system. This can be shared NFS that is mounted on each machine. Best, Fabian Am Fr., 10. Mai 2019 um 15:15 Uhr schrieb Boris Lublinsky < boris.lublin...@lightbend.com>: > For now is a regular Link cluster, > But e

Re: Checkpointing and save pointing

2019-05-10 Thread Boris Lublinsky
For now is a regular Link cluster, But even there I want to use both check and save pointing. We do not want to use Hadoop, but rather shared fs - NFS/Gluster. I was trying to see whether volumes need to be mounted only for Job manager or both. HA is the next step. Trying to find the code saving c

Re: How to export all not-null keyed ValueState

2019-05-10 Thread Averell Huyen Levan
Thank you very much, Fabian. Regards, Averell On Fri, May 10, 2019 at 9:46 PM Fabian Hueske wrote: > Hi Averell, > > I'd go with your approach any state access (given that you use RocksDB > keyed state) or deduplication of messages is going to be more expensive > than a simple cast. > > Best, F

Re: How to export all not-null keyed ValueState

2019-05-10 Thread Fabian Hueske
Hi Averell, I'd go with your approach any state access (given that you use RocksDB keyed state) or deduplication of messages is going to be more expensive than a simple cast. Best, Fabian Am Fr., 10. Mai 2019 um 13:08 Uhr schrieb Averell Huyen Levan < lvhu...@gmail.com>: > Hi Fabian, > > Thanks

Re: How to export all not-null keyed ValueState

2019-05-10 Thread Averell Huyen Levan
Hi Fabian, Thanks for that. However, as I mentioned in my previous email, that implementation requires a lot of typecasting/object wrapping. I tried to broadcast that Toggle stream - the toggles will be saved as a MapState, and whenever an export trigger record arrived, I send out that MapState. T

Re: How to export all not-null keyed ValueState

2019-05-10 Thread Fabian Hueske
Hi Averell, Ah, sorry. I had assumed the toggle events where broadcasted anyway. Since you had both streams keyed, your current solution looks fine to me. Best, Fabian Am Fr., 10. Mai 2019 um 03:13 Uhr schrieb Averell Huyen Levan < lvhu...@gmail.com>: > Hi Fabian, > > Sorry, but I am still conf

Re:

2019-05-10 Thread Fabian Hueske
Hi, Again answers below ;-) Am Do., 9. Mai 2019 um 17:12 Uhr schrieb an0 : > You are right, thanks. But something is still not totally clear to me. > I'll reuse your diagram with a little modification: > > DataStream a = ... > a.map(A).map(B).keyBy().timeWindow(C) > > and execute this with p

Re: Reduce key state

2019-05-10 Thread Fabian Hueske
Hi Frank, By default, Flink does not remove any state. It is the responsibility of the developer to ensure that an application does not leak state. Typically, you would use timers [1] to discard state that expired and is not useful anymore. In the last release 1.8, we added lazy cleanup strategie

Re: I want to use MapState on an unkeyed stream

2019-05-10 Thread Fabian Hueske
Hi, RocksDB is only used as local state store. Operator state is not stored in RocksDB but only on the TM JVM heap. When a checkpoint is taken, the keyed state from RocksDB and the operator state from the heap are both copied to a persistent data store (HDFS, S3, ...). I was trying to find the do

Re: Checkpointing and save pointing

2019-05-10 Thread Fabian Hueske
Hi Boris, Is your question is in the context of replacing Zookeeper by a different service for highly-available setups or are you setting up a regular Flink cluster? Best, Fabian Am Mi., 8. Mai 2019 um 06:20 Uhr schrieb Congxian Qiu < qcx978132...@gmail.com>: > Hi, Boris > > TM will also need