Re: Question about Checkpoint

2017-06-26 Thread Tzu-Li (Gordon) Tai
Hi Desheng, Welcome to the community! What you’re asking alludes the question: How does Flink support end-to-end (from external source to external sink, e.g. Kafka to database) exactly-once delivery? Whether or not that is supported depends on the guarantees of the source and sink and how they

Question about Checkpoint

2017-06-26 Thread ZalaCheung
Hi Flink Community, I am new to Flink and now looking at checkpoint of Flink. After reading the document, I am still confused. Here is scene: I have a datastream finally flow to a database sink. I will update one of the field in database based on the incomming stream. I have now complete a sna

Re: How to run a Flink job in EMR?

2017-06-26 Thread Chris Schneider
Hi Craig, Thanks for providing that information (and apologies for the delay in responding). 1) I was able to create a Step that downloaded my jar, thanks. 2) I was able to create another Step that passed my jar to the flink script, thanks. 3) However, I wasn’t successful with my attempt at s

Re: Appending Windowed Aggregates to Events

2017-06-26 Thread Ted Yu
In case no-sql store is considered, please take a look at base Cheers Original message From: "Jain, Ankit" Date: 6/26/17 12:41 PM (GMT-08:00) To: Tim Stearn , user@flink.apache.org Subject: Re: Appending Windowed Aggregates to Events You could load the historical data as fl

Re: Appending Windowed Aggregates to Events

2017-06-26 Thread Jain, Ankit
You could load the historical data as flink state and then look up the state with the key derived from input record. That should serve like a join in relational world. You may also want to think about keeping the writes and querying isolated. Especially if your windows are going to be long (eg ca

Re: MapR libraries shading issue

2017-06-26 Thread ani.desh1512
Pasting my reply from the MapR community thread: So, I have found a temporary workaround for this. Heres what I did: For the client, I found out the password of /opt/mapr/conf/ssl_trustore via ssl-client.xml For both Datadog and Amazon S3, I found out the certs chain that they were using (i could

Re: Checkpointing with RocksDB as statebackend

2017-06-26 Thread Vinay Patil
Hi Stephan, I have upgraded to Flink 1.3.0 to test RocksDB with incremental checkpointing (PredefinedOptions used is FLASH_SSD_OPTIMIZED) I am currently creating a YARN session and running the job on EMR having r3.4xlarge instances (122GB of memory), I have observed that it is utilizing almost a

Re: MapR libraries shading issue

2017-06-26 Thread Jörn Franke
The error that you mentioned seem to indicate that some certificates of certification authorities could not be found. You may want to add them to the trust store of the application. > On 26. Jun 2017, at 16:55, ani.desh1512 wrote: > > As Stephan pointed out, this seems more like a MapR libs me

Re: Latest spark yahoo benchmark

2017-06-26 Thread nragon
Yes, indeed. That's why we choose Flink instead all the others. This post was just pure curiosity to see spark trying to migrate into a pure streaming engine. -- View this message in context: http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Latest-spark-yahoo-benchmark-tp13

Re: MapR libraries shading issue

2017-06-26 Thread ani.desh1512
As Stephan pointed out, this seems more like a MapR libs meddling with some jar. As I had mentioned in the original question, I run across the same problem when i use the aws sdk jar in my program. The error is as follows: /shaded.com.amazonaws.SdkClientException: Unable to execute HTTP request: s

Re: Latest spark yahoo benchmark

2017-06-26 Thread Stephan Ewen
@nragon - I think this is a classical "benchmarketing" post. A few thoughts on that - Everyone can tune their system to be best. We ran Flink with even higher throughput than that: https://image.slidesharecdn.com/benchmark-mapr-160407212254/95/extending-the-yahoo-streaming-benchmark-mapr-benchma

Re: Flink CEP not emitting timed out events properly

2017-06-26 Thread Biplob Biswas
Hi Kostas, I ended up setting my currentMaxTimestamp = Math.max(timestamp, currentMaxTimestamp); to currentMaxTimestamp = Math.min(timestamp, currentMaxTimestamp); and changing this : if(firstEventFlag && (currentTime - systemTimeSinceLastModification > 1)){ systemTimeSinceLastModi

Re: Appending Windowed Aggregates to Events

2017-06-26 Thread Fabian Hueske
Hi Tim, your requirements seem to be similar to what you can do in SQL with an OVER window aggregation. OVER window aggregates are computed for each row and enrich the existing fields with aggregates. The aggregates are computed over a range of rows that precede (or follow) the current row. This m

RE: A way to purge an empty session

2017-06-26 Thread Gwenhael Pasquiers
Thanks ! I didn’t know of this function and indeed it seems to match my needs better than Windows. And I’ll be able to clear my state once it’s empty (and re-create it when necessary). B.R. From: Fabian Hueske [mailto:fhue...@gmail.com] Sent: lundi 26 juin 2017 12:13 To: Gwenhael Pasquiers Cc

Re: About nodes number on Flink

2017-06-26 Thread Timo Walther
If you really what to run one operation per node. You start 1 TaskManager with 1 slot on every node. For each operation you set a new chain and a new slot sharing group. Timo Am 23.06.17 um 15:03 schrieb AndreaKinn: Hi Timo, thanks for your answer. I think my elaboration are not too much heav

Re: Window data retention - Appending to previous data and processing

2017-06-26 Thread Aljoscha Krettek
Hi, I think you should be able to do this by using: * GlobalWindows as your window function * a custom Trigger that fires on every element, sets a timer for your cleanup time, and purges when the cleanup timer fires * a ProcessWindowFunction, to so that you always get all the contents of the

Re: flink doesn't seem to serialize generated Avro pojo

2017-06-26 Thread Aljoscha Krettek
Hi Bart, Is serialisation failing in a Flink job (for example when writing to Kafka) or just in the main() method when experimenting with serialisation? Best, Aljoscha > On 26. Jun 2017, at 11:06, Bart van Deenen wrote: > > It turns out that the Flink documentation has a whole section about t

Re: MapR libraries shading issue

2017-06-26 Thread Stephan Ewen
Okay, just curious because the guy mentioned the behavior changes with removing the MapR dependencies. Maybe these dependencies change the trust-store or certificate-store providers? On Mon, Jun 26, 2017 at 2:35 PM, Chesnay Schepler wrote: > This looks more like a certification problem as descr

Re: MapR libraries shading issue

2017-06-26 Thread Chesnay Schepler
This looks more like a certification problem as described here: https://github.com/square/okhttp/issues/2746 I don't think that shading could have anything to do with this. On 26.06.2017 00:09, ani.desh1512 wrote: I am trying to use Flink (1.3.0) with MapR(5.2.1). Accordingly, I built Flink fo

Re: A way to purge an empty session

2017-06-26 Thread Fabian Hueske
Hi Gwenhael, have you considered to use a ProcessFunction? With a ProcessFunction you have direct access to state (that you register yourself) and you can register timers that trigger a callback function when they expire. So you can cleanup state when you receive an element or when a timer expires

Re: Gelly - bipartite graph runs vertex-centric

2017-06-26 Thread Vasiliki Kalavri
Hi Marc, the BipartiteGraph type doesn't support vertex-centric iterations yet. You can either represent your bipartite graph using the Graph type and e.g. having an extra attribute in the vertex value to distinguish between top and bottom vertices or define your own custom delta iteration on top

Re: flink doesn't seem to serialize generated Avro pojo

2017-06-26 Thread Bart van Deenen
It turns out that the Flink documentation has a whole section about this problem: https://goo.gl/DMs9Dx Now let's find the solution!. Bart