[jira] [Created] (FLINK-3784) Unexpected results using collect() in RichMapPartitionFunction

2016-04-19 Thread Sergio Ramírez (JIRA)
Sergio Ramírez created FLINK-3784:
-

 Summary: Unexpected results using collect() in 
RichMapPartitionFunction
 Key: FLINK-3784
 URL: https://issues.apache.org/jira/browse/FLINK-3784
 Project: Flink
  Issue Type: Bug
  Components: DataSet API, Machine Learning Library, Scala API
Affects Versions: 1.0.0
 Environment: Debian 8.3
Reporter: Sergio Ramírez


The following Scala code produces unexpected records when it transposes a simple 
matrix formed by LabeledVector. For each new key (feature, partition) a 
different number of records is emitted, even though every pair should yield the 
same count because the data is dense (see the result for a sample dataset 
below).

def mapPartition(it: java.lang.Iterable[LabeledVector], out: Collector[((Int, Int), Int)]): Unit = {
  // needs: import scala.collection.JavaConverters._ (for asScala)
  val index = getRuntimeContext().getIndexOfThisSubtask() // partition index
  var ninst = 0
  for (reg <- it.asScala) {
    requireByteValues(reg.vector) // validate that the feature values are byte-valued
    ninst += 1                    // count the records in this partition
  }
  // emit the same per-partition count once per feature
  for (i <- 0 until nFeatures) out.collect((i, index) -> ninst)
}
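For context, a self-contained sketch (assumed, not taken verbatim from the report) of how such a function would be wired into a job; requireByteValues is omitted here because its definition is not part of the report:

{code:scala}
import org.apache.flink.api.common.functions.RichMapPartitionFunction
import org.apache.flink.api.scala._
import org.apache.flink.ml.common.LabeledVector
import org.apache.flink.util.Collector
import scala.collection.JavaConverters._

class PartitionCounter(nFeatures: Int)
    extends RichMapPartitionFunction[LabeledVector, ((Int, Int), Int)] {
  override def mapPartition(it: java.lang.Iterable[LabeledVector],
                            out: Collector[((Int, Int), Int)]): Unit = {
    val index = getRuntimeContext.getIndexOfThisSubtask // partition index
    val ninst = it.asScala.size                         // records in this partition
    for (i <- 0 until nFeatures) out.collect((i, index) -> ninst)
  }
}

// usage, given data: DataSet[LabeledVector]
// val counts = data.mapPartition(new PartitionCounter(nFeatures))
{code}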

Result: 
Attribute 10, first eight partitions: 
((10,0),201),((10,1),200),((10,2),201),((10,3),200),((10,4),200),((10,5),201),((10,6),201),((10,7),201)

Attribute 12, first eight partitions: 
((12,0),201),((12,1),201),((12,2),201),((12,3),200),((12,4),201),((12,5),200),((12,6),200),((12,7),201)
 





Re: GSoC Project Proposal Draft: Code Generation in Serializers

2016-04-19 Thread Fabian Hueske
Hi Gabor,

you are right, a codegen serializer module would depend on flink-core and
in the current design flink-core would need to know about the type infos /
serializers / comparators.

Decoupling the implementations of type infos, serializers, and comparators
from flink-core and resolving the cyclic dependency is exactly what the
plugin architecture would be for.
Maybe this can be done by some mechanism to dynamically load
TypeInformations for types with overridden serializers / comparators.
This would require some design document and discussion in the community.

Cheers, Fabian
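(For illustration, a minimal Scala sketch of the kind of runtime compilation Janino enables, which such a plugin could build on; the generated class and method names below are made up for the example:)

import org.codehaus.janino.SimpleCompiler

// Compile a generated class at runtime and load it via Janino's class loader.
val source =
  """public final class GeneratedDoubler {
    |  public static int apply(int x) { return 2 * x; }
    |}""".stripMargin

val compiler = new SimpleCompiler()
compiler.cook(source) // parse and compile the source string

val clazz = compiler.getClassLoader.loadClass("GeneratedDoubler")
val doubled = clazz.getMethod("apply", classOf[Int]).invoke(null, Int.box(21)) // 42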





2016-04-18 21:19 GMT+02:00 Gábor Horváth :

> Unfortunately, making code generation a separate module would introduce a
> cyclic dependency: code generation requires the TypeInfo, which is available
> in flink-core, and flink-core requires the generated serializers from the
> code generation module. Do you have a solution for this?
>
> I think that if we can come up with a solution, I will implement it as a
> separate Scala module; otherwise I will stick to Java.
>
> BR,
> Gábor
>
> On 18 April 2016 at 12:40, Fabian Hueske  wrote:
>
> > +1 for not mixing Java and Scala in flink-core.
> >
> > Maybe it makes sense to implement the code-generated serializers /
> > comparators as a separate module which can be plugged in. This could be
> > pure Scala.
> > In general, I think it would be good to have some kind of "version
> > management" for serializers in place. With features such as savepoints that
> > depend on the implementation of serializers, it would be good to have a
> > mechanism to switch between implementations.
> >
> > Best, Fabian
> >
> > 2016-04-18 10:01 GMT+02:00 Chiwan Park :
> >
> > > Yes, I know Janino is a pure Java project. I meant that if we add Scala
> > > code to flink-core, we would have to add a Scala dependency to flink-core,
> > > which could be confusing.
> > >
> > > Regards,
> > > Chiwan Park
> > >
> > > > On Apr 18, 2016, at 2:49 PM, Márton Balassi <
> balassi.mar...@gmail.com>
> > > wrote:
> > > >
> > > > Chiwan, just to clarify Janino is a Java project. [1]
> > > >
> > > > [1] https://github.com/aunkrig/janino
> > > >
> > > > On Mon, Apr 18, 2016 at 3:40 AM, Chiwan Park 
> > > wrote:
> > > >
> > > >> I prefer to avoid Scala dependencies in flink-core. If flink-core
> > > >> includes Scala dependencies, a Scala version suffix (_2.10 or _2.11)
> > > >> would have to be added. I think that users could be confused.
> > > >>
> > > >> Regards,
> > > >> Chiwan Park
> > > >>
> > > >>> On Apr 17, 2016, at 3:49 PM, Márton Balassi <
> > balassi.mar...@gmail.com>
> > > >> wrote:
> > > >>>
> > > >>> Hi Gábor,
> > > >>>
> > > >>> I think that adding the Janino dep to flink-core should be fine, as it
> > > >>> has quite slim dependencies [1,2] which are generally orthogonal to
> > > >>> Flink's main dependency line (it is also already used elsewhere).
> > > >>>
> > > >>> As for mixing Scala code that is used from the Java parts of the same
> > > >>> Maven module, I am skeptical. We have seen IDE compilation issues with
> > > >>> projects using this setup and have decided that the community-wide
> > > >>> potential IDE setup pain outweighs the individual implementation
> > > >>> convenience with Scala.
> > > >>>
> > > >>> [1]
> > > >>>
> > > >>
> > >
> >
> https://repo1.maven.org/maven2/org/codehaus/janino/janino-parent/2.7.8/janino-parent-2.7.8.pom
> > > >>> [2]
> > > >>>
> > > >>
> > >
> >
> https://repo1.maven.org/maven2/org/codehaus/janino/janino/2.7.8/janino-2.7.8.pom
> > > >>>
> > > >>> On Sat, Apr 16, 2016 at 5:51 PM, Gábor Horváth <
> xazax@gmail.com>
> > > >> wrote:
> > > >>>
> > >  Hi!
> > > 
 > > >  The Table API already uses code generation and the Janino compiler [1].
 > > >  Is it a dependency that is ok to add to flink-core? If it is, I think I
 > > >  will use the same in order to be consistent with the other code
 > > >  generation efforts.
 > > > 
 > > >  I started to look at the Table API code generation [2] and it uses Scala
 > > >  extensively. There are several Scala features that can make Java code
 > > >  generation easier, such as pattern matching and string interpolation. I
 > > >  did not see any Scala code in flink-core yet. Is it ok to implement the
 > > >  code generation inside flink-core using Scala?
> > > 
> > >  Regards,
> > >  Gábor
> > > 
> > >  [1] http://unkrig.de/w/Janino
> > >  [2]
> > > 
> > > 
> > > >>
> > >
> >
> https://github.com/apache/flink/blob/master/flink-libraries/flink-table/src/main/scala/org/apache/flink/api/table/codegen/CodeGenerator.scala
> > > 
> > >  On 18 March 2016 at 19:37, Gábor Horváth 
> > wrote:
> > > 
> > > > Thank you! I finalized the project.
> > > >
> > > >
> > > > On 18 March 2016 at 10:29, Márton Balassi <
> > balassi.mar...@gmail.com>
> > > > wrot

Re: Bug in Scala Shell

2016-04-19 Thread Nikolaas s
Hi Trevor,

this is a bug and was introduced before the streaming shell:
https://issues.apache.org/jira/browse/FLINK-3701

I suspect something changed somewhere else in the Flink code that breaks
the Scala shell. The Scala shell tests never execute the same program twice,
so this went unnoticed.

The streaming shell does seem to work. If you need the batch shell, I
checked version 1.0 and it works there.

best,
Nikolaas



2016-04-18 23:35 GMT+02:00 Trevor Grant :

> I was trying out the new scala-shell with streaming support...
>
> The following code executes correctly the first time I run it:
>
> val survival = benv.readCsvFile[(String, String, String,
> String)]("file:///home/trevor/gits/datasets/haberman/haberman.data")
> survival.count()
>
> However, if I call survival.count() again, or run the entire code again, I
> get the following error:
>
> java.lang.NullPointerException
> at
>
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1074)
> at
>
> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1004)
> at
>
> org.apache.flink.api.java.ScalaShellRemoteEnvironment.execute(ScalaShellRemoteEnvironment.java:70)
> at
>
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
> at
>
> org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:638)
> at org.apache.flink.api.scala.DataSet.count(DataSet.scala:528)
> at .<init>(<console>:62)
> at .<clinit>(<console>)
> at .<init>(<console>:7)
> at .<clinit>(<console>)
> at $print(<console>)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at
>
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at
>
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
> at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
> at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
> at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
> at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
> at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
> at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
> at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
> at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
> at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
> at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
> at
>
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
> at
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
> at
> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
> at
>
> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
> at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
> at org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:199)
> at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:127)
> at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala)
>
>
> Is this me or is this a bug? or both?
>
> Thanks,
>
> tg
>
> Trevor Grant
> Data Scientist
> https://github.com/rawkintrevo
> http://stackexchange.com/users/3002022/rawkintrevo
> http://trevorgrant.org
>
> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>


Re: Adding custom monitoring to Flink

2016-04-19 Thread Till Rohrmann
Hi Maxim,

I think the corresponding JIRA issue is
https://issues.apache.org/jira/browse/FLINK-456

Cheers,
Till

On Thu, Apr 14, 2016 at 10:50 PM, Maxim  wrote:

> I don't have a full list of metrics, but everything that is related to
> runtime performance and possible bottlenecks of the system. All
> interprocess communication counters, errors, latencies, checkpoint sizes
> and checkpointing latencies. Buffer allocations and releases, etc.
> As we aggregate ourselves we can produce multiple views of the same metric:
> min, max, tp99, tp99.9, top n, etc.
>
> Could you point to the doc/Jira/diff for your change?
>
>
> On Thu, Apr 14, 2016 at 12:32 PM, Chesnay Schepler 
> wrote:
>
> > I'm currently working on a metric system that
> > a) exposes several TaskManager metrics
> > b) allows gathering metrics in various parts of a task, most notably
> > user-defined functions.
> >
> > The first version makes these metrics available via JMX on each
> > TaskManager.
> > While a mechanism to make that pluggable is /planned/ there are no
> details
> > on that yet.
> >
> > I /guess/ once it is merged you should be able to modify one of the
> > classes so that the data is directly exported to your tool, but I would
> > have to know more about it to make a definite assessment.
> >
> > There are no plans to funnel all those metrics unaggregated through
> > Flink's accumulator mechanism;
> > only a selection that will be aggregated locally and on the JobManager to
> > display in the Dashboard.
> >
> > Out of curiosity, what metrics are you interested in?
> >
> >
> > On 14.04.2016 20:59, Maxim wrote:
> >
> >> Hi!
> >> I'm looking into integrating Flink into our stack and one of the
> >> requirements is to report metrics to an internal system. The current
> >> Accumulators are not adequate to provide the visibility we need to run
> >> such a system in production. We want much more information about the
> >> internal cluster state and ability to calculate aggregates ourselves.
> The
> >> core reporting API accepts a metric name, metric type (gauge, counter,
> >> timer) and a set of key value pairs that act as dimensions.
> >>
> >> The ideal solution for us would report the metrics through such API and
> >> provide default binding to existing Accumulators, but allow overriding
> it
> >> to our internal reporting client.
> >>
> >> Is this something that could be added to Flink, or are there other plans
> >> for monitoring?
> >>
> >> Thanks!
> >>
> >> Maxim.
> >>
> >>
> >
>
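(A hypothetical sketch of the reporting-API shape Maxim describes, i.e. a metric name, a type, and key/value dimensions; the trait and type names below are illustrative, not an existing Flink or internal API:)

trait MetricReporter {
  def report(name: String,
             metricType: MetricType,
             value: Double,
             dimensions: Map[String, String]): Unit
}

sealed trait MetricType
case object Gauge   extends MetricType
case object Counter extends MetricType
case object Timer   extends MetricType

// A default binding could forward to the existing accumulators, while an
// internal implementation could aggregate (min, max, tp99, ...) before shipping.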


[jira] [Created] (FLINK-3785) Add REST resource to query information about checkpointing

2016-04-19 Thread Konstantin Knauf (JIRA)
Konstantin Knauf created FLINK-3785:
---

 Summary: Add REST resource to query information about checkpointing
 Key: FLINK-3785
 URL: https://issues.apache.org/jira/browse/FLINK-3785
 Project: Flink
  Issue Type: New Feature
  Components: Web Client
Reporter: Konstantin Knauf


For monitoring purposes it would be nice to be able to query information about 
the last n checkpoints via the REST API, i.e. ID, trigger time, state size and 
duration.
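A hypothetical sketch of the data such a resource could return (the endpoint path and field names are illustrative, not an existing API):

{code:scala}
// e.g. GET /jobs/:jobid/checkpoints?limit=n
case class CheckpointSummary(
  id: Long,          // checkpoint ID
  triggerTime: Long, // trigger timestamp in epoch millis
  stateSize: Long,   // total state size in bytes
  duration: Long)    // checkpoint duration in millis
{code}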





[jira] [Created] (FLINK-3786) Add BigDecimal and BigInteger as Basic types

2016-04-19 Thread Timo Walther (JIRA)
Timo Walther created FLINK-3786:
---

 Summary: Add BigDecimal and BigInteger as Basic types
 Key: FLINK-3786
 URL: https://issues.apache.org/jira/browse/FLINK-3786
 Project: Flink
  Issue Type: New Feature
  Components: Core
Reporter: Timo Walther
Assignee: Timo Walther


We already had the discussion on the mailing list some months ago about adding 
BigDecimal and BigInteger as basic types.

Especially for business or scientific applications it 
makes sense to support the BigInteger and BigDecimal types natively. In 
my opinion they are as important as Date or Void and should be added as 
BasicTypes. The Table API would also benefit from it.

http://mail-archives.apache.org/mod_mbox/flink-dev/201511.mbox/%3c564cad71.8070...@apache.org%3E
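A hypothetical usage sketch once the types are supported natively (not current API):

{code:scala}
import org.apache.flink.api.scala._

val env = ExecutionEnvironment.getExecutionEnvironment
val amounts = env.fromElements(
  new java.math.BigDecimal("19.99"),
  new java.math.BigDecimal("0.01"))
amounts.reduce(_ add _).print() // exact arithmetic, no double rounding
{code}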






[jira] [Created] (FLINK-3787) Yarn client does not report unfulfillable container constraints

2016-04-19 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-3787:


 Summary: Yarn client does not report unfulfillable container 
constraints
 Key: FLINK-3787
 URL: https://issues.apache.org/jira/browse/FLINK-3787
 Project: Flink
  Issue Type: Improvement
  Components: YARN Client
Affects Versions: 1.1.0
Reporter: Till Rohrmann
Priority: Minor


If the number of virtual cores for a Yarn container is not fulfillable, then 
the {{TaskManager}} won't be started. This is only reported in the logs but not 
in the {{FlinkYarnClient}}. Thus, the user will see a started {{JobManager}} 
with no connected {{TaskManagers}}. Since the log aggregation is only available 
after the Yarn job has been stopped, there is no easy way for the user to 
detect what's going on.

This problem is aggravated by the fact that the number of virtual cores is 
coupled to the number of slots if no explicit value has been set for the 
virtual cores. Therefore, it might happen that the Yarn deployment fails 
because of the virtual cores even though the user has never set a value for 
them (the user might even not know about the virtual cores).

I think it would be good to check if the virtual cores constraint is 
fulfillable. If not, then the user should receive a clear message that the 
Flink cluster cannot be deployed (similar to the memory constraints).  
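A sketch of the suggested client-side check (the wiring and names are illustrative, not Flink's actual client code; only the Hadoop YarnClient calls are real API):

{code:scala}
import org.apache.hadoop.yarn.api.records.NodeState
import org.apache.hadoop.yarn.client.api.YarnClient
import scala.collection.JavaConverters._

def validateVcores(yarnClient: YarnClient, requestedVcores: Int): Unit = {
  // Largest vcore capacity offered by any running node in the cluster.
  val maxVcores = yarnClient.getNodeReports(NodeState.RUNNING).asScala
    .map(_.getCapability.getVirtualCores)
    .foldLeft(0)(math.max)
  if (requestedVcores > maxVcores) {
    throw new IllegalArgumentException(
      s"Requested $requestedVcores virtual cores per container, but no node " +
      s"offers more than $maxVcores. The Flink cluster cannot be deployed.")
  }
}
{code}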





Re: Flink Interpreter w/ yarn-session

2016-04-19 Thread Till Rohrmann
Hi Andrea,

I think your problem should be fixed with the PRs [1,2]. I've tested it
locally on my yarn cluster and it worked.

[1] https://github.com/apache/flink/pull/1904
[2] https://github.com/apache/flink/pull/1914

Cheers,
Till

On Tue, Apr 19, 2016 at 2:16 PM, Till Rohrmann  wrote:

> I think this is another issue you’ve detected. I already spotted some
> suspicious code in the yarn deployment section. If I’m not mistaken, then
> flink-conf.yaml is read too late and is, thus, not respected. I’ll verify
> it and if valid, then I’ll open another issue and fix it.
>
> Thanks for your patience and thorough reporting. It helps a lot :-)
>
> Cheers,
> Till
> ​
>
> On Tue, Apr 19, 2016 at 2:12 PM, Andrea Sella 
> wrote:
>
>> No, I tried it via the scala-shell, as you can see in the attachment.
>>
>> Regards,
>> Andrea
>>
>> 2016-04-19 14:08 GMT+02:00 Till Rohrmann :
>>
>>> Hi Andrea,
>>>
>>> thanks for testing it. How did you submit the job this time? Via
>>> Zeppelin?
>>>
>>> Cheers,
>>> Till
>>>
>>> On Tue, Apr 19, 2016 at 12:51 PM, Andrea Sella <
>>> andrea.se...@radicalbit.io> wrote:
>>>
 Hi Till,

 I've used your branch fixScalaShell to test the scala-shell with our HA
 cluster, but it doesn't work. Same error as before:

 2016-04-19 06:40:35,030 WARN  org.apache.flink.yarn.YarnJobManager
  - Discard message
 LeaderSessionMessage(null,SubmitJob(JobGraph(jobId:
 aa5b034e10a850d863642a24aab75d9c),EXECUTION_RESULT_AND_STATE_CHANGES))
 because the expected leader session ID
 Some(bc706707-2bab-4b82-b7a7-1426dce696a7) did not equal the received
 leader session ID None.

 If I submit a simple job, it works, so I don't think it is a problem with
 our environment.

 Cheers,
 Andrea

 2016-04-18 18:41 GMT+02:00 Till Rohrmann :

> Cool, that helps a lot :-)
>
> On Mon, Apr 18, 2016 at 6:06 PM, Andrea Sella <
> andrea.se...@radicalbit.io> wrote:
>
>> Hi Till,
>>
>> Don't worry, I am going to test the PR in our HA environment.
>>
>> Cheers,
>> Andrea
>>
>>
>> 2016-04-18 17:46 GMT+02:00 Till Rohrmann :
>>
>>> Hi Andrea,
>>>
>>> sorry, I saw your mail too late. I already fixed the problem and opened a
>>> PR [1] for it. I hope you haven't invested too much time in it yet.
>>>
>>> [1] https://github.com/apache/flink/pull/1904
>>>
>>> Cheers,
>>> Till
>>>
>>> On Mon, Apr 18, 2016 at 11:19 AM, Andrea Sella <
>>> andrea.se...@radicalbit.io> wrote:
>>>
 Hi Till,
 Thanks for the support, I will take the issue and starting to work
 on it asap.

 Regards,
 Andrea

 2016-04-18 10:32 GMT+02:00 Till Rohrmann :

> Hi Andrea,
>
> I think the problem is simply that it has not been correctly
> implemented. I just checked and I think the user configuration is not 
> given
> to the PlanExecutor which is internally created. I’ve opened an
> issue for that [1].
>
> [1] https://issues.apache.org/jira/browse/FLINK-3774
>
> Cheers,
> Till
> ​
>
> On Fri, Apr 15, 2016 at 4:58 PM, Andrea Sella <
> andrea.se...@radicalbit.io> wrote:
>
>> Hi Till,
>>
>> I've tried the Scala-Shell with our HA cluster, no luck again.
>>
>> Cheers,
>> Andrea
>>
>> 2016-04-15 14:43 GMT+02:00 Andrea Sella <
>> andrea.se...@radicalbit.io>:
>>
>>> Hi Till,
>>>
>>> I am using a branched version of 1.0.1 where I cherry-picked
>>> FLINK-2935
>>> 
>>>  to
>>> use FlinkILoop with Configuration. My Flink interpreter is here
>>> ,
>>> I've started tweaking just two days ago and as I can see there is a
>>> Zeppelin issue
>>>  to provide a
>>> FlinkInterpreter working with Yarn, and I need it too.
>>>
>>> Thanks,
>>> Andrea
>>>
>>>
>>>
>>> 2016-04-15 14:20 GMT+02:00 Till Rohrmann :
>>>
 Hi Andrea,

 which version of Flink are you using with Zeppelin? How do you
 pass the Flink configuration to the FlinkILoop? Could you maybe 
 show me
 your version of Zeppelin (code).

 According to the log, the ScalaShellRemoteEnvironment didn't
 get the Flink configuration with the HA settings. Therefore, it 
 still tries
 to connec

Re: Bug in Scala Shell

2016-04-19 Thread Maximilian Michels
Hi Trevor,

As Nikolaas pointed out, this is a regression in the recent master.
The current implementation of the ExecutionConfig doesn't expect multiple
executions. We have a pending fix [1] which should be in soon.

Cheers,
Max

[1] https://github.com/apache/flink/pull/1913

On Tue, Apr 19, 2016 at 11:58 AM, Nikolaas s
 wrote:
> Hi Trevor,
>
> this is a bug and was introduced before the streaming shell:
> https://issues.apache.org/jira/browse/FLINK-3701
>
> I suspect something changed somewhere else in the Flink code that breaks
> the Scala shell. The Scala shell tests never execute the same program twice,
> so this went unnoticed.
>
> The streaming shell does seem to work. If you need the batch shell, I
> checked version 1.0 and it works there.
>
> best,
> Nikolaas
>
>
>
> 2016-04-18 23:35 GMT+02:00 Trevor Grant :
>
>> I was trying out the new scala-shell with streaming support...
>>
>> The following code executes correctly the first time I run it:
>>
>> val survival = benv.readCsvFile[(String, String, String,
>> String)]("file:///home/trevor/gits/datasets/haberman/haberman.data")
>> survival.count()
>>
>> However, if I call survival.count() again, or run the entire code again, I
>> get the following error:
>>
>> java.lang.NullPointerException
>> at
>>
>> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1074)
>> at
>>
>> org.apache.flink.api.java.ExecutionEnvironment.createProgramPlan(ExecutionEnvironment.java:1004)
>> at
>>
>> org.apache.flink.api.java.ScalaShellRemoteEnvironment.execute(ScalaShellRemoteEnvironment.java:70)
>> at
>>
>> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:898)
>> at
>>
>> org.apache.flink.api.scala.ExecutionEnvironment.execute(ExecutionEnvironment.scala:638)
>> at org.apache.flink.api.scala.DataSet.count(DataSet.scala:528)
>> at .<init>(<console>:62)
>> at .<clinit>(<console>)
>> at .<init>(<console>:7)
>> at .<clinit>(<console>)
>> at $print(<console>)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>>
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>>
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:734)
>> at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:983)
>> at scala.tools.nsc.interpreter.IMain.loadAndRunReq$1(IMain.scala:573)
>> at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:604)
>> at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:568)
>> at scala.tools.nsc.interpreter.ILoop.reallyInterpret$1(ILoop.scala:760)
>> at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:805)
>> at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:717)
>> at scala.tools.nsc.interpreter.ILoop.processLine$1(ILoop.scala:581)
>> at scala.tools.nsc.interpreter.ILoop.innerLoop$1(ILoop.scala:588)
>> at scala.tools.nsc.interpreter.ILoop.loop(ILoop.scala:591)
>> at
>>
>> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:882)
>> at
>> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>> at
>> scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:837)
>> at
>>
>> scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
>> at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:837)
>> at org.apache.flink.api.scala.FlinkShell$.startShell(FlinkShell.scala:199)
>> at org.apache.flink.api.scala.FlinkShell$.main(FlinkShell.scala:127)
>> at org.apache.flink.api.scala.FlinkShell.main(FlinkShell.scala)
>>
>>
>> Is this me or is this a bug? or both?
>>
>> Thanks,
>>
>> tg
>>
>> Trevor Grant
>> Data Scientist
>> https://github.com/rawkintrevo
>> http://stackexchange.com/users/3002022/rawkintrevo
>> http://trevorgrant.org
>>
>> *"Fortunate is he, who is able to know the causes of things."  -Virgil*
>>


[jira] [Created] (FLINK-3788) Local variable values are not distributed to task runners

2016-04-19 Thread Andreas C. Osowski (JIRA)
Andreas C. Osowski created FLINK-3788:
-

 Summary: Local variable values are not distributed to task runners
 Key: FLINK-3788
 URL: https://issues.apache.org/jira/browse/FLINK-3788
 Project: Flink
  Issue Type: Bug
  Components: DataSet API
Affects Versions: 1.0.1, 1.0.0
 Environment: Scala 2.11.8
Sun JDK 1.8.0_65 or OpenJDK 1.8.0_77
Fedora 25, 4.6.0-0.rc2.git3.1.fc25.x86_64
Reporter: Andreas C. Osowski


Variable values of non-elementary types aren't captured and distributed to the 
task runners, so they remain null and cause NPEs upon access when running on a 
cluster. Running locally through `flink-clients` works fine.

Changing the parallelism or disabling the closure cleaner doesn't seem to have 
any effect.

Minimal example, also see the attached archive.
{code:scala}
case class IntWrapper(a1: Int)
val wrapped = IntWrapper(42)
env.readTextFile("myTextFile.txt").map(line => wrapped.toString).collect
{code}
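A possible workaround sketch (an assumption, not part of the report): pass the value through an explicit, serializable function object instead of relying on closure capture.

{code:scala}
import org.apache.flink.api.common.functions.RichMapFunction
import org.apache.flink.api.scala._

// The wrapped value travels as a field of the serialized function object.
class WrapToString(wrapped: IntWrapper) extends RichMapFunction[String, String] {
  override def map(line: String): String = wrapped.toString
}

env.readTextFile("myTextFile.txt").map(new WrapToString(IntWrapper(42))).collect()
{code}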






[jira] [Created] (FLINK-3789) Overload methods which trigger program execution to allow naming job

2016-04-19 Thread Greg Hogan (JIRA)
Greg Hogan created FLINK-3789:
-

 Summary: Overload methods which trigger program execution to allow 
naming job
 Key: FLINK-3789
 URL: https://issues.apache.org/jira/browse/FLINK-3789
 Project: Flink
  Issue Type: Improvement
  Components: Java API
Affects Versions: 1.1.0
Reporter: Greg Hogan
Assignee: Greg Hogan
Priority: Minor


Overload the following functions to additionally accept a job name to pass to 
{{ExecutionEnvironment.execute(String)}}.
* {{DataSet.collect()}}
* {{DataSet.count()}}
* {{DataSetUtils.checksumHashCode(DataSet)}}
* {{GraphUtils.checksumHashCode(Graph)}}
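A hypothetical usage sketch of the proposed overloads (illustrative, not current API):

{code:scala}
val n = dataSet.count("nightly-count")        // forwards to env.execute("nightly-count")
val rows = dataSet.collect("nightly-collect") // forwards to env.execute("nightly-collect")
{code}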





Re: [VOTE] Release Apache Flink 1.0.2 (RC3)

2016-04-19 Thread Robert Metzger
Thank you for creating another bugfix release for the 1.0 line, Ufuk!

+1 for releasing this proposed RC.


- Checked some flink-dist jars for correctly shaded guava classes
- Started Flink in local mode and ran some examples
- Checked the staging repository
  - Checked the quickstarts for Scala 2.10 and the right Hadoop profiles.



On Mon, Apr 18, 2016 at 5:29 PM, Ufuk Celebi  wrote:

> Dear Flink community,
>
> Please vote on releasing the following candidate as Apache Flink version
> 1.0.2.
>
> The commit to be voted on:
> d39af152a166ddafaa2466cdae82695880893f3e
>
> Branch:
> release-1.0.2-rc3 (see
>
> https://git1-us-west.apache.org/repos/asf/flink/?p=flink.git;a=shortlog;h=refs/heads/release-1.0.2-rc3
> )
>
> The release artifacts to be voted on can be found at:
> http://home.apache.org/~uce/flink-1.0.2-rc3/
>
> The release artifacts are signed with the key with fingerprint 9D403309:
> http://www.apache.org/dist/flink/KEYS
>
> The staging repository for this release can be found at:
> https://repository.apache.org/content/repositories/orgapacheflink-1092
>
> -
>
> The vote is open for the next 72 hours and passes if a majority of at least
> three +1 PMC votes are cast.
>
> The vote ends on Thursday April 21, 2016.
>
> [ ] +1 Release this package as Apache Flink 1.0.2
> [ ] -1 Do not release this package because ...
>
> ===
>
> The following commits have been added since the 1.0.1 release (excluding
> docs),
> most notably a performance optimization for the RocksDB state backend and
> a fix
> for proper passing of dynamic YARN properties to the Client.
>
> * 5987eb6 - [FLINK-3657] [dataSet] Change access of
> DataSetUtils.countElements() to 'public' (5 hours ago) 
> * b4b08ca - [FLINK-3762] [core] Enable Kryo reference tracking (3 days
> ago) 
> * 5b69dd8 - [FLINK-3732] [core] Fix potential null deference in
> ExecutionConfig#equals() (3 days ago) 
> * aadc5fa - [FLINK-3760] Fix StateDescriptor.readObject (19 hours ago)
> 
> * ea50ed3 - [FLINK-3716] [kafka consumer] Decreasing socket timeout so
> testFailOnNoBroker() will pass before JUnit timeout (29 hours ago)
> 
> * ff38202 - [FLINK-3730] Fix RocksDB Local Directory Initialization
> (29 hours ago) 
> * 4f9c198 - [FLINK-3712] Make all dynamic properties available to the
> CLI frontend (3 days ago) 
> * 1554c9b - [FLINK-3688] WindowOperator.trigger() does not emit
> Watermark anymore (6 days ago) 
> * 43093e3 - [FLINK-3697] Properly access type information for nested
> POJO key selection (6 days ago) 
> * 17909aa - [FLINK-3654] Disable Write-Ahead-Log in RocksDB State (7
> days ago) 
> * e0dc5c1 - [FLINK-3595] [runtime] Eagerly destroy buffer pools on
> cancelling (10 days ago) 
>