dataset dataframe join

2016-06-15 Thread Vishnu Viswanath
Hi all, is there any workaround/hack to join a DataSet with a DataStream, since https://issues.apache.org/jira/browse/FLINK-2320 is still in progress? Regards, Vishnu

Re: API request to submit job takes over 1hr

2016-06-15 Thread Robert Metzger
Hi, Regarding Shannon's first point: I agree. We can improve the user experience a lot, and documenting the behavior is the first step we should take here. I see your points. I agree that we should use a separate thread for running the main method and report better to the front end what's happening.

Re: Send to all in gelly scatter

2016-06-15 Thread Vasiliki Kalavri
I forgot the reference [1] :S Here it is: [1] https://ci.apache.org/projects/flink/flink-docs-master/apis/batch/libs/gelly.html#iteration-abstractions-comparison On 15 June 2016 at 20:59, Vasiliki Kalavri wrote: > Hi Alieh, > > you can send a message from any vertex to any other vertex if you k

Re: Send to all in gelly scatter

2016-06-15 Thread Vasiliki Kalavri
Hi Alieh, you can send a message from any vertex to any other vertex if you know the vertex ID. In [1] you will find a table that compares the update logic and communication scope for all gelly iteration models. Bear in mind though, that sending a message from all vertices to all other vertices is
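
To make the point about addressing messages by vertex ID concrete, here is a minimal sketch in plain Python. It is not the Gelly API: `scatter`, `send_targets`, and `all_to_all_message_count` are hypothetical names used only for illustration. It also shows the simple arithmetic of the all-to-all case: with n vertices, each superstep produces n * (n - 1) messages.

```python
def scatter(vertex_ids, send_targets):
    """Deliver one message per (sender, target) pair, addressed by vertex ID.

    vertex_ids: list of all vertex IDs (hypothetical input format).
    send_targets: function mapping a sender ID to the IDs it messages.
    Returns a dict: target ID -> list of sender IDs (the inbox).
    """
    inbox = {v: [] for v in vertex_ids}
    for sender in vertex_ids:
        for target in send_targets(sender):
            # The target does not have to be a neighbor of the sender;
            # knowing its ID is enough in this simplified model.
            inbox[target].append(sender)
    return inbox


def all_to_all_message_count(n):
    """Messages per superstep if every vertex messages every other vertex."""
    return n * (n - 1)


ids = [1, 2, 3, 4]
inbox = scatter(ids, lambda s: [t for t in ids if t != s])
total_messages = sum(len(msgs) for msgs in inbox.values())
```

The message count grows quadratically with the number of vertices, so the all-to-all pattern gets expensive quickly on larger graphs.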

Re: Gelly Scatter/Gather - Vertex update

2016-06-15 Thread Vasiliki Kalavri
Hi Alieh, the scatter-gather model is built on top of Flink delta iterations precisely so that vertices that do not need to participate in the computation of a given superstep can be deactivated. If you want all vertices to participate in all iterations of scatter-gather, you can send d
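
The deactivation behavior described above can be illustrated with a minimal sketch in plain Python. It is not the Gelly API and does not use Flink delta iterations; `scatter_gather_sssp` and the `(src, dst, weight)` edge format are hypothetical. It assumes a simplified model in which a vertex scatters messages only if its value changed in the previous superstep.

```python
import math


def scatter_gather_sssp(edges, num_vertices, source, max_supersteps=100):
    """Single-source shortest paths in a scatter-gather style.

    edges: list of (src, dst, weight) tuples (hypothetical input format).
    Returns (distances, active_counts), where active_counts[i] is the
    number of vertices that scattered messages in superstep i.
    """
    dist = [math.inf] * num_vertices
    dist[source] = 0.0
    out_edges = {}
    for src, dst, w in edges:
        out_edges.setdefault(src, []).append((dst, w))

    active = {source}  # only vertices whose value changed stay active
    active_counts = []
    for _ in range(max_supersteps):
        if not active:
            break  # no vertex changed, so the iteration has converged
        active_counts.append(len(active))
        # Scatter: active vertices send candidate distances to neighbors.
        inbox = {}
        for v in active:
            for dst, w in out_edges.get(v, []):
                inbox.setdefault(dst, []).append(dist[v] + w)
        # Gather: a vertex updates only if a message improves its value;
        # vertices that do not change are inactive in the next superstep.
        active = set()
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < dist[v]:
                dist[v] = best
                active.add(v)
    return dist, active_counts


example_edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 5.0), (2, 3, 1.0)]
distances, active_counts = scatter_gather_sssp(example_edges, 4, source=0)
```

In this model the work per superstep is proportional to the number of active vertices rather than to the size of the whole graph.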

Re: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-15 Thread Maximilian Michels
Just had a quick chat with Ufuk. The issue is that in 1.x the Yarn properties file is loaded regardless of whether "-m yarn-cluster" is specified on the command-line. This loads the dynamic properties from the Yarn properties file and applies all configuration of the running (session) cluster clust

Re: Application log on Yarn FlinkCluster

2016-06-15 Thread Theofilos Kakantousis
Great then, I will look into my configuration. Thanks for your help! Cheers, Theofilos On 6/15/2016 2:00 PM, Maximilian Michels wrote: You should also see TaskManager output in the logs. I just verified this using Flink 1.0.3 with Hadoop 2.7.1. I executed the Iterate example and it aggregated

Re: Checkpoint takes long with FlinkKafkaConsumer

2016-06-15 Thread Ufuk Celebi
Hey Hironori, thanks for reporting this. Could you please update this thread when you have more information from the Kafka list? – Ufuk On Wed, Jun 15, 2016 at 2:48 PM, Hironori Ogibayashi wrote: > Kostas, > > Thank you for your advice. I have posted my question to the Kafka mailing > list. >

Re: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-15 Thread Maximilian Michels
Hi Arnaud, One issue per thread please. That makes things a lot easier for us :) Something positive first: We are reworking the resuming of existing Flink Yarn applications. It'll be much easier to resume a cluster using simply the Yarn ID or re-discovering the Yarn session using the properties fi

Re: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-15 Thread Ufuk Celebi
I've created an issue here: https://issues.apache.org/jira/browse/FLINK-4079 Hopefully it will be fixed in 1.1 and we can provide a bugfix for 1.0.4. On Wed, Jun 15, 2016 at 3:14 PM, LINZ, Arnaud wrote: > Ooopsss > My mistake, snapshot/restore does work in a local env, I've had a weird > con

Re: LongMaxAggregator in gelly scatter/gather

2016-06-15 Thread Ufuk Celebi
Yes, it's possible. On Wed, Jun 15, 2016 at 10:58 AM, Alieh Saeedi wrote: > Hi everybody > Is it possible to use org.apache.giraph.aggregators.LongMaxAggregator as an > aggregator of gelly scatter/gather? > > Thanks in advance

Re: LongMaxAggragator()

2016-06-15 Thread Ufuk Celebi
Can you please provide more context? LongMaxAggragator has either been removed or you are referring to a Giraph class. On Wed, Jun 15, 2016 at 10:52 AM, Alieh Saeedi wrote: > Hi everybody > > Why does LongSumAggragator() work while LongMaxAggragator() is no longer > known?! > > Thanks in advance

Re: Send to all in gelly scatter

2016-06-15 Thread Ufuk Celebi
I think you can only send messages to the directly connected vertices. On Tue, Jun 14, 2016 at 5:30 PM, Alieh Saeedi wrote: > Hi > Is it possible to send a message to all vertices in gelly scatter? How can the > IDs of all vertices be found? > > Thanks in advance

Re: NotSerializableException

2016-06-15 Thread Ufuk Celebi
Here's the issue https://issues.apache.org/jira/browse/FLINK-4078 On Mon, Jun 13, 2016 at 12:27 PM, Aljoscha Krettek wrote: > Nope, I think there is neither a fix nor an open issue for this right now. > > On Mon, 13 Jun 2016 at 11:31 Maximilian Michels wrote: >> >> Is there an issue or a fix for

RE: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-15 Thread LINZ, Arnaud
Ooopsss My mistake, snapshot/restore does work in a local env, I've had a weird configuration issue! But I still have the property file path issue :) -Original Message- From: LINZ, Arnaud Sent: Wednesday, 15 June 2016 14:35 To: 'user@flink.apache.org' Subject: RE: Yarn batch not w

Re: Checkpoint takes long with FlinkKafkaConsumer

2016-06-15 Thread Hironori Ogibayashi
Kostas, Thank you for your advice. I have posted my question to the Kafka mailing list. I think the Kafka brokers are fine because there are no errors on the producer side at 15,000 msg/sec and, according to OS metrics, all of my brokers receive almost the same amount of network traffic. Thanks, Hironori 2016-06-14

RE: Yarn batch not working with standalone yarn job manager once a persistent, HA job manager is launched ?

2016-06-15 Thread LINZ, Arnaud
Hi, I haven't had the time to investigate the bad configuration file path issue yet (if you have any idea why yarn.properties-file.location is ignored, you are welcome), but I'm facing another HA problem. I'm trying to make my custom streaming sources HA-compliant by implementing snapshotState

Re: Migrating from one state backend to another

2016-06-15 Thread Josh
Hi Aljoscha, Thanks, that makes sense. I will start using RocksDB right away then. Josh On Wed, Jun 15, 2016 at 1:01 PM, Aljoscha Krettek wrote: > Hi, > right now migrating from one state backend to another is not possible. I > have it in the back of my head, however, that we should introduce a

Re: Migrating from one state backend to another

2016-06-15 Thread Aljoscha Krettek
Hi, right now migrating from one state backend to another is not possible. I have it in the back of my head, however, that we should introduce a common serialized representation of state to make this possible in the future. (Both for checkpoints and savepoints, which use the same mechanism undernea

Re: Application log on Yarn FlinkCluster

2016-06-15 Thread Maximilian Michels
You should also see TaskManager output in the logs. I just verified this using Flink 1.0.3 with Hadoop 2.7.1. I executed the Iterate example and it aggregated correctly including the TaskManager logs. I'm wondering, is there anything in the Hadoop logs of the ResourceManager/NodeManager that could

Re: Custom Barrier?

2016-06-15 Thread Aljoscha Krettek
Hi, when you have a parallel input stream (for example, multiple Kafka partitions that you read from), would you have the super events (A-Start, B-Start and so on) in all of the parallel streams? If the answer is yes, then you can probably abuse the watermarks mechanism to deal with it. If not, then

LongMaxAggregator in gelly scatter/gather

2016-06-15 Thread Alieh Saeedi
Hi everybody, Is it possible to use org.apache.giraph.aggregators.LongMaxAggregator as an aggregator of gelly scatter/gather? Thanks in advance

LongMaxAggragator()

2016-06-15 Thread Alieh Saeedi
Hi everybody, Why does LongSumAggragator() work while LongMaxAggragator() is no longer known?! Thanks in advance

Re: Application log on Yarn FlinkCluster

2016-06-15 Thread Theofilos Kakantousis
Hi, By yarn aggregated log I mean Yarn log aggregation is enabled and the log I'm referring to is the one returned by `yarn logs -applicationId `. When running a Spark job for example on the same setup, the yarn aggregated log contains all the information printed out by the application. Chee

Re: Application log on Yarn FlinkCluster

2016-06-15 Thread Maximilian Michels
Please use `yarn logs -applicationId ` to retrieve the logs. If you have enabled log aggregation, this will give you all container logs concatenated. Cheers, Max On Wed, Jun 15, 2016 at 12:24 AM, Theofilos Kakantousis wrote: > Hi Max, > > The runBlocking(..) problem was due to a Netty depen