[RESULT] [VOTE] Release Apache Flink 1.1.4 (RC1)

2016-11-16 Thread Ufuk Celebi
This RC has been cancelled in favour of RC2.

I will wait with RC2 until (at least) the following fixes are in:

- https://issues.apache.org/jira/browse/FLINK-5073
- https://issues.apache.org/jira/browse/FLINK-5013
- https://issues.apache.org/jira/browse/FLINK-5065

If anything else comes up, please mention it in this thread.

– Ufuk

On 15 November 2016 at 21:00:32, Till Rohrmann (trohrm...@apache.org) wrote:
> I would also like to include a fix for
> https://issues.apache.org/jira/browse/FLINK-5073 into the new RC if
> possible.
>  
> Cheers,
> Till
>  
> On Tue, Nov 15, 2016 at 12:06 PM, Tzu-Li (Gordon) Tai wrote:
>  
> > +1 for cancelling RC1.
> >
> > I would also like to merge FLINK-5013 (Kinesis connector not working on
> > old EMR versions) so that it is included in 1.1.4.
> > The PR (https://github.com/apache/flink/pull/2787) has been reviewed and
> > can be merged soon.
> >
> > Best,
> > Gordon
> >
> >
> > On November 15, 2016 at 6:56:42 PM, Stefan Richter
> > (s.rich...@data-artisans.com) wrote:
> >
> > +1 for canceling.
> >
> > > On 15.11.2016 at 10:49, Till Rohrmann wrote:
> > >
> > > +1 for Ufuk's proposal to cancel the vote in favour of a new RC. +1 for
> > > including FLINK-5065. I'm currently backporting the fix.
> > >
> > > On Tue, Nov 15, 2016 at 9:08 AM, Ufuk Celebi wrote:
> > >
> > >> Furthermore, I think this issue is a blocker:
> > >>
> > >> https://issues.apache.org/jira/browse/FLINK-5065: Resource leaks in
> > case
> > >> of lost checkpoint messages
> > >>
> > >> – Ufuk
> > >>
> > >> On 14 November 2016 at 19:15:49, Ufuk Celebi (u...@apache.org) wrote:
> > >>> -1
> > >>>
> > >>> (1) We found an issue with the task cancellation safety net
> > >>> configuration. A fix is pending here:
> > >>> https://github.com/apache/flink/pull/2794 Since it was one of the
> > >>> main fixes for this release, I vote to cancel the RC in favour of a
> > >>> new one.
> > >>>
> > >>> (2) We merged a fix that changes the YARN class loader behaviour. I
> > >>> think it's better to revert that change since changed class loading
> > >>> behaviour in minor releases can be quite unexpected:
> > >>> https://github.com/apache/flink/pull/2795
> > >>>
> > >>> Do you agree with (2)? I will wait with the [RESULT] until I get some
> > >>> feedback on these points.
> > >>>
> > >>> ---
> > >>>
> > >>> The good news is that we can carry over most of the tests applied by
> > >>> Fabian and Gyula (except the YARN tests).
> > >>>
> > >>> – Ufuk
> > >>>
> > >>> On 14 November 2016 at 17:39:24, Fabian Hueske (fhue...@gmail.com) wrote:
> >  +1
> > 
> >  - Checked hashes and signatures of release artifacts
> >  - Checked commit diff against v1.1.3: Did not find any added or changed
> >    dependencies
> >  - Built from source with default settings for Scala 2.10 and 2.11 on
> >    macOS, Java 8
> > 
> >  Cheers, Fabian
> > 
> >  2016-11-13 14:39 GMT+01:00 Gyula Fóra :
> > 
> > > Hi,
> > >
> > > +1 from me.
> > >
> > > Tested:
> > >
> > > - Built from source for Hadoop 2.7.3
> > > - Tested running on YARN (with some fairly complex topologies)
> > > - Savepoint app running on 1.1.3 -> restored successfully on 1.1.4
> > >
> > > Cheers,
> > > Gyula
> > >
> > > Ufuk Celebi wrote (on 11 Nov 2016, Fri, at 10:10):
> > >
> > > Dear Flink community,
> > >
> > > Please vote on releasing the following candidate as Apache Flink
> > > version 1.1.4.
> > >
> > > The commit to be voted on:
> > > 3c1024a (http://git-wip-us.apache.org/repos/asf/flink/commit/3c1024a)
> > >
> > > Branch:
> > > release-1.1.4-rc1
> > > (https://git1-us-west.apache.org/repos/asf/flink/repo?p=flink.git;a=shortlog;h=refs/heads/release-1.1.4-rc1)
> > >
> > > The release artifacts to be voted on can be found at:
> > > http://people.apache.org/~uce/flink-1.1.4-rc1/
> > >
> > > The release artifacts are signed with the key with fingerprint 9D403309:
> > > http://www.apache.org/dist/flink/KEYS
> > >
> > > The staging repository for this release can be found at:
> > > https://repository.apache.org/content/repositories/orgapacheflink-1107
> > >
> > > -
> > >
> > > The voting time is at least three days and the vote passes if a
> > > majority of at least three +1 PMC votes are cast.
> > >
> > > The vote ends on Tuesday, November 15th, 2016, counting the weekend
> > > as a single day.
> > >
> > > [ ] +1 Release this package as Apache Flink 1.1.4
> > > [ ] -1 Do not release this package, because ...
> > >
> > 
> > >>>
> > >>>
> > >>
> > >>
> >
> >
>  



Re: [DISCUSS] FLIP-14: Loops API and Termination

2016-11-16 Thread Paris Carbone
Thanks for reviewing, Gyula.

One thing that is still up for discussion is whether we should remove the old 
iterations API completely or simply mark it as deprecated until v2.0.
Also, I'm not sure what the best process is now. We have the changes ready. 
Should I copy the FLIP to the wiki and open the PRs, or wait a few more days in 
case someone has objections?

@Stephan, what is your take on our interpretation of the approach you 
suggested? Should we proceed or is there anything that you do not find nice?

Paris

> On 15 Nov 2016, at 10:01, Gyula Fóra  wrote:
> 
> Hi Paris,
> 
> I like the proposed changes to the iteration API; I think this cleans up
> things in the Java API without any strict restriction (it was never a
> problem in the Scala API).
> 
> The termination algorithm based on the proposed scoped loops seems to be
> fairly simple and looks good :)
> 
> Cheers,
> Gyula
> 
> Paris Carbone wrote (on 14 Nov 2016, Mon, at 8:50):
> 
>> That would be great Shi! Let's take that offline.
>> 
>> Anyone else interested in the iteration changes? It would be nice to
>> incorporate these into v1.2 if possible, so I'm counting on your reviews
>> asap.
>> 
>> cheers,
>> Paris
>> 
>> On Nov 14, 2016, at 2:59 AM, xiaogang.sxg wrote:
>> 
>> Hi Paris
>> 
>> Unfortunately, the project is not public yet.
>> But I can provide you with a primitive implementation of the update
>> protocol from the paper. It's implemented in Storm. Since the protocol
>> assumes the communication channels between different tasks are dual, I
>> think it's not easy to adapt it to Flink.
>> 
>> Regards
>> Xiaogang
>> 
>> 
>> On 12 November 2016, at 03:03, Paris Carbone <par...@kth.se> wrote:
>> 
>> Hi Shi,
>> 
>> Naiad/Timely Dataflow and other projects use global coordination, which is
>> very convenient for asynchronous progress tracking in general, but it has
>> some downsides in production systems that count on in-flight
>> transactional control mechanisms and rollback recovery guarantees. This is
>> why we generally prefer decentralized approaches (despite their own
>> downsides).
>> 
>> Regarding synchronous/structured iterations, this is a bit off topic and
>> they are a bit of a different story as you already know.
>> We maintain a graph streaming (gelly-streams) library on Flink that you
>> might find interesting [1]. Vasia, another Flink committer, is also working
>> on that, among others.
>> You can keep an eye on it since we are planning to use this project as a
>> showcase for a new way of doing structured and fixpoint iterations on
>> streams in the future.
>> 
>> P.S. Many thanks for sharing your publication; it was an interesting read.
>> Do you happen to have your source code public? We could most certainly use
>> it in a benchmark soon.
>> 
>> [1] https://github.com/vasia/gelly-streaming
>> 
>> 
>> On 11 Nov 2016, at 19:18, SHI Xiaogang <shixiaoga...@gmail.com> wrote:
>> 
>> Hi, Fouad
>> 
>> Thank you for the explanation. Now the centralized method seems correct to
>> me.
>> The passing of StatusUpdate events will lead to synchronous iterations, and
>> we are using the information in each iteration to terminate the
>> computation.
>> 
>> Actually, I prefer the centralized method because in many applications, the
>> convergence may depend on some global statistics.
>> For example, a PageRank program may terminate the computation when 99% of
>> the vertices have converged.
>> I think those learning programs which cannot reach the fixed point
>> (oscillating around the fixed point) can benefit a lot from such features.
>> The decentralized method makes it hard to support such convergence
>> conditions.
>> 
>> 
>> Another concern is that Flink cannot produce periodic results in an
>> iteration over infinite data streams.
>> Take a concrete example: given an edge stream constructing a graph, the
>> user may need the PageRank weight of each vertex in the graphs formed at
>> certain instants.
>> Currently Flink does not provide any input or iteration information to
>> users, making it hard for users to implement such real-time iterative
>> applications.
>> Such features are supported in both Naiad and Tornado. I think Flink should
>> support it as well.
>> 
>> What do you think?
>> 
>> Regards
>> Xiaogang
>> 
>> 
>> 2016-11-11 19:27 GMT+08:00 Fouad Ali <fouad.alsay...@gmail.com>:
>> 
>> Hi Shi,
>> 
>> It seems that you are referring to the centralized algorithm which is no
>> longer the proposed version.
>> In the decentralized version (check last doc) there is no master node or
>> global coordination involved.
>> 
>> Let us keep this discussion to the decentralized one if possible.
>> 
>> To answer your points on the previous approach, there is a catch in your
>> trace at t7. Here is what is happening:
>> - Head, as well as RS, will receive a 'BroadcastStatusUpdate' from the
>> runtime (see 2.1 in the steps).
>> - RS and Heads will broadcast Statu

[jira] [Created] (FLINK-5078) Introduce annotations for classes copied from Calcite

2016-11-16 Thread Timo Walther (JIRA)
Timo Walther created FLINK-5078:
---

 Summary: Introduce annotations for classes copied from Calcite
 Key: FLINK-5078
 URL: https://issues.apache.org/jira/browse/FLINK-5078
 Project: Flink
  Issue Type: Improvement
  Components: Table API & SQL
Reporter: Timo Walther


We have already copied several classes from Calcite because of missing features 
or bugs. In order to track those classes, update them when bumping up the 
Calcite version, or check whether they have become obsolete, it might be useful 
to introduce a special annotation to mark those classes, perhaps together with 
a standardized comment format indicating which lines have been modified.
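
A minimal sketch of what such an annotation could look like; the name, 
retention, and fields here are illustrative assumptions, not an existing 
Flink API:

{code}
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Marks a class that was copied from Apache Calcite, so it can be found,
 * diffed, and possibly removed when the Calcite dependency is bumped.
 */
@Documented
@Retention(RetentionPolicy.SOURCE)
@Target(ElementType.TYPE)
public @interface CopiedFromCalcite {
    /** Calcite version the class was copied from. */
    String version();
    /** Fully qualified name of the original Calcite class. */
    String original();
    /** Short description of the local modifications, e.g. affected lines. */
    String modifications() default "";
}
{code}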



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5079) Failed to submit job to YARN cluster

2016-11-16 Thread Ufuk Celebi (JIRA)
Ufuk Celebi created FLINK-5079:
--

 Summary: Failed to submit job to YARN cluster
 Key: FLINK-5079
 URL: https://issues.apache.org/jira/browse/FLINK-5079
 Project: Flink
  Issue Type: Bug
Affects Versions: 1.1.3
Reporter: Ufuk Celebi


{code}
*@*:~/flink/build-target$ bin/flink run -p 60 ___.jar .
^Chadoop@uce-testing-master-vm:~/flink/build-target$ bin/flink run -p 60 ___.jar
2016-11-16 11:01:47,646 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Found YARN properties file /tmp/.yarn-properties-hadoop
2016-11-16 11:01:47,646 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Found YARN properties file /tmp/.yarn-properties-hadoop
Found YARN properties file /tmp/.yarn-properties-hadoop
2016-11-16 11:01:47,683 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Using Yarn application id from YARN properties 
application_1479288266115_0002
2016-11-16 11:01:47,683 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Using Yarn application id from YARN properties 
application_1479288266115_0002
Using Yarn application id from YARN properties application_1479288266115_0002
2016-11-16 11:01:47,683 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- YARN properties set default parallelism to 60
2016-11-16 11:01:47,683 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- YARN properties set default parallelism to 60
YARN properties set default parallelism to 60
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Found YARN properties file /tmp/.yarn-properties-hadoop
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Found YARN properties file /tmp/.yarn-properties-hadoop
Found YARN properties file /tmp/.yarn-properties-hadoop
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Using Yarn application id from YARN properties 
application_1479288266115_0002
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- Using Yarn application id from YARN properties 
application_1479288266115_0002
Using Yarn application id from YARN properties application_1479288266115_0002
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- YARN properties set default parallelism to 60
2016-11-16 11:01:47,684 INFO  org.apache.flink.yarn.cli.FlinkYarnSessionCli 
- YARN properties set default parallelism to 60
YARN properties set default parallelism to 60
2016-11-16 11:01:47,718 INFO  org.apache.hadoop.yarn.client.RMProxy 
- Connecting to ResourceManager at ___/10.240.0.54:8032
2016-11-16 11:01:47,859 INFO  org.apache.flink.yarn.YarnClusterDescriptor   
- Found application JobManager host name '___' and port '38915' 
from supplied application id 'application_1479288266115_0002'
Cluster configuration: Yarn cluster with application id 
application_1479288266115_0002
Using address 10.240.0.49:38915 to connect to JobManager.
JobManager web interface address ___/proxy/application_1479288266115_0002/
Starting execution of program
2016-11-16 11:01:47,903 INFO  org.apache.flink.yarn.YarnClusterClient   
- Starting program in interactive mode
Using checkpointing interval 1 and mode EXACTLY_ONCE
2016-11-16 11:01:48,139 INFO  org.apache.flink.yarn.YarnClusterClient   
- Waiting until all TaskManagers have connected
Waiting until all TaskManagers have connected
2016-11-16 11:01:48,140 INFO  org.apache.flink.yarn.YarnClusterClient   
- Starting client actor system.
2016-11-16 11:01:48,725 INFO  org.apache.flink.yarn.YarnClusterClient   
- TaskManager status (60/1)
TaskManager status (60/1)
2016-11-16 11:01:48,725 INFO  org.apache.flink.yarn.YarnClusterClient   
- All TaskManagers are connected
All TaskManagers are connected
2016-11-16 11:01:48,726 INFO  org.apache.flink.yarn.YarnClusterClient   
- Submitting job with JobID: 3fd357c3a8352e0bc5c504b8300afa47. 
Waiting for job completion.
Submitting job with JobID: 3fd357c3a8352e0bc5c504b8300afa47. Waiting for job 
completion.
Connected to JobManager at 
Actor[akka.tcp://flink@10.240.0.49:38915/user/jobmanager#-1077240075]
^C2016-11-16 11:02:42,929 INFO  org.apache.flink.yarn.YarnClusterClient 
  - Shutting down YarnClusterClient from the client shutdown hook
2016-11-16 11:02:42,929 INFO  org.apache.flink.yarn.YarnClusterClient   
- Disconnecting YarnClusterClient from ApplicationMaster
{code}

I have 60 task managers. The client says {{(60/1)}} (it should actually be 
{{1/60}}) task managers available and then nothing happens. I have logs 
available that I can share.

Re: [Discuss] State Backend use external HBase storage

2016-11-16 Thread Till Rohrmann
Hi Jinkui,

the file system state backend and the RocksDB state backend can be
configured (and usually should be) such that they store their checkpoint
data on a reliable storage system such as HDFS. Then you also have the
reliability guarantees.
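
For illustration, a minimal sketch of such a setup, assuming the 1.1.x
streaming API and a placeholder HDFS path:

{code}
import org.apache.flink.runtime.state.filesystem.FsStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointToHdfs {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.getExecutionEnvironment();
        // Checkpoint every 10 seconds.
        env.enableCheckpointing(10000);
        // Store checkpoint data on a reliable file system such as HDFS.
        env.setStateBackend(new FsStateBackend("hdfs:///flink/checkpoints"));
        // The RocksDB backend can be pointed at HDFS the same way, e.g.:
        // env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints"));
    }
}
{code}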

Of course, one can start adding more state backends to Flink. At some point
in time there was the idea to write a Cassandra-backed state backend [1],
for example. Similarly, one could think about an HBase-backed state backend.

[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Cassandra-statebackend-td12690.html


Cheers,
Till

On Wed, Nov 16, 2016 at 3:10 AM, shijinkui  wrote:

> Hi, All
>
> At present Flink has three state backends: memory, file system, and RocksDB.
> MemoryStateBackend transfers the snapshot to the JobManager, limited to 5MB
> by default. Even when setting it bigger, that is not suitable for very big
> state storage.
> HDFS can meet the reliability guarantee, but it's slow. The file system and
> RocksDB backends are fast, but they have no reliability guarantee.
> So none of the three state backends offers a reliability guarantee.
>
> Can we have an HBase state backend, providing a reliability guarantee for
> state snapshots?
> For the user, it would just mean creating a new HBaseStateBackend object and
> providing the HBase parameters and optimization configuration.
> HBase or other distributed key-value stores may be heavyweight storage, but
> we would only use the HBase client to read/write asynchronously.
>
> -Jinkui Shi
>


Re: [DISCUSS] FLIP-14: Loops API and Termination

2016-11-16 Thread Gyula Fóra
I am not completely sure whether we should deprecate the old API for 1.2 or
remove it completely. Personally, I am in favor of removing it; I don't
think it is a huge burden to move to the new one if it makes for a much
nicer user experience.

I think you can go ahead and add the FLIP to the wiki and open the PR so we
can start the review if you have it ready anyway.

Gyula

Paris Carbone wrote (on 16 Nov 2016, Wed, at 11:55):

> Thanks for reviewing, Gyula.
>
> One thing that is still up for discussion is whether we should remove the
> old iterations API completely or simply mark it as deprecated until v2.0.
> Also, I'm not sure what the best process is now. We have the changes ready.
> Should I copy the FLIP to the wiki and open the PRs, or wait a few more
> days in case someone has objections?
>
> @Stephan, what is your take on our interpretation of the approach you
> suggested? Should we proceed or is there anything that you do not find nice?
>
> Paris
>
> > On 15 Nov 2016, at 10:01, Gyula Fóra  wrote:
> >
> > Hi Paris,
> >
> > I like the proposed changes to the iteration API; I think this cleans up
> > things in the Java API without any strict restriction (it was never a
> > problem in the Scala API).
> >
> > The termination algorithm based on the proposed scoped loops seems to be
> > fairly simple and looks good :)
> >
> > Cheers,
> > Gyula
> >
> > Paris Carbone wrote (on 14 Nov 2016, Mon, at 8:50):
> >
> >> That would be great Shi! Let's take that offline.
> >>
> >> Anyone else interested in the iteration changes? It would be nice to
> >> incorporate these into v1.2 if possible, so I'm counting on your reviews
> >> asap.
> >>
> >> cheers,
> >> Paris
> >>
> >> On Nov 14, 2016, at 2:59 AM, xiaogang.sxg wrote:
> >>
> >> Hi Paris
> >>
> >> Unfortunately, the project is not public yet.
> >> But I can provide you with a primitive implementation of the update
> >> protocol from the paper. It's implemented in Storm. Since the protocol
> >> assumes the communication channels between different tasks are dual, I
> >> think it's not easy to adapt it to Flink.
> >>
> >> Regards
> >> Xiaogang
> >>
> >>
> >> On 12 November 2016, at 03:03, Paris Carbone <par...@kth.se> wrote:
> >>
> >> Hi Shi,
> >>
> >> Naiad/Timely Dataflow and other projects use global coordination, which
> >> is very convenient for asynchronous progress tracking in general, but it
> >> has some downsides in production systems that count on in-flight
> >> transactional control mechanisms and rollback recovery guarantees. This
> >> is why we generally prefer decentralized approaches (despite their own
> >> downsides).
> >>
> >> Regarding synchronous/structured iterations, this is a bit off topic and
> >> they are a bit of a different story as you already know.
> >> We maintain a graph streaming (gelly-streams) library on Flink that you
> >> might find interesting [1]. Vasia, another Flink committer, is also
> >> working on that, among others.
> >> You can keep an eye on it since we are planning to use this project as a
> >> showcase for a new way of doing structured and fixpoint iterations on
> >> streams in the future.
> >>
> >> P.S. Many thanks for sharing your publication; it was an interesting
> >> read. Do you happen to have your source code public? We could most
> >> certainly use it in a benchmark soon.
> >>
> >> [1] https://github.com/vasia/gelly-streaming
> >>
> >>
> >> On 11 Nov 2016, at 19:18, SHI Xiaogang <shixiaoga...@gmail.com> wrote:
> >>
> >> Hi, Fouad
> >>
> >> Thank you for the explanation. Now the centralized method seems correct
> >> to me.
> >> The passing of StatusUpdate events will lead to synchronous iterations,
> >> and we are using the information in each iteration to terminate the
> >> computation.
> >>
> >> Actually, I prefer the centralized method because in many applications,
> >> the convergence may depend on some global statistics.
> >> For example, a PageRank program may terminate the computation when 99% of
> >> the vertices have converged.
> >> I think those learning programs which cannot reach the fixed point
> >> (oscillating around the fixed point) can benefit a lot from such
> >> features.
> >> The decentralized method makes it hard to support such convergence
> >> conditions.
> >>
> >>
> >> Another concern is that Flink cannot produce periodic results in an
> >> iteration over infinite data streams.
> >> Take a concrete example: given an edge stream constructing a graph, the
> >> user may need the PageRank weight of each vertex in the graphs formed at
> >> certain instants.
> >> Currently Flink does not provide any input or iteration information to
> >> users, making it hard for users to implement such real-time iterative
> >> applications.
> >> Such features are supported in both Naiad and Tornado. I think Flink
> >> should support it as well.
> >>
> >> What do you think?
> >>
> >> Regards
> >> Xiaogang
> >>
> >>
> >> 2016-

[jira] [Created] (FLINK-5080) Cassandra connector ignores saveAsync result onSuccess

2016-11-16 Thread Jakub Nowacki (JIRA)
Jakub Nowacki created FLINK-5080:


 Summary: Cassandra connector ignores saveAsync result onSuccess
 Key: FLINK-5080
 URL: https://issues.apache.org/jira/browse/FLINK-5080
 Project: Flink
  Issue Type: Bug
  Components: Cassandra Connector
Affects Versions: 1.1.3
Reporter: Jakub Nowacki


When a record is saved to Cassandra, a ResultSet may be returned to the 
callback given to saveAsync; e.g. when we do {{INSERT ... IF NOT EXISTS}}, a 
ResultSet is returned with the column {{applied: false}} if the record already 
exists and the new record has not been inserted. Thus, we lose data in such 
cases.

The minimal solution would be to log the result. The best solution would be to 
add the possibility of passing a custom callback; this way we could deal with 
success or failure in a more customized way. Another solution is to add the 
possibility to pass onSuccess and onFailure functions, which would be executed 
inside the callback.
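
For illustration, a minimal sketch of what a result-checking callback could 
look like, assuming the DataStax Java driver and Guava futures; the class 
name and logging are illustrative:

{code}
import com.datastax.driver.core.ResultSet;
import com.google.common.util.concurrent.FutureCallback;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class LoggingResultCallback implements FutureCallback<ResultSet> {
    private static final Logger LOG =
            LoggerFactory.getLogger(LoggingResultCallback.class);

    @Override
    public void onSuccess(ResultSet rs) {
        // For conditional statements such as INSERT ... IF NOT EXISTS,
        // the driver reports whether the mutation was actually applied.
        if (!rs.wasApplied()) {
            LOG.warn("Record was not applied (e.g. it already exists).");
        }
    }

    @Override
    public void onFailure(Throwable t) {
        LOG.error("Failed to save record to Cassandra.", t);
    }
}
{code}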



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5081) unable to set yarn.maximum-failed-containers with flink one-time YARN setup

2016-11-16 Thread Nico Kruber (JIRA)
Nico Kruber created FLINK-5081:
--

 Summary: unable to set yarn.maximum-failed-containers with flink 
one-time YARN setup
 Key: FLINK-5081
 URL: https://issues.apache.org/jira/browse/FLINK-5081
 Project: Flink
  Issue Type: Bug
  Components: Startup Shell Scripts
Affects Versions: 1.1.4
Reporter: Nico Kruber


When letting Flink set up YARN for a one-time job, it apparently does not 
deliver the {{yarn.maximum-failed-containers}} parameter to YARN as the 
{{yarn-session.sh}} script does. Adding it to conf/flink-conf.yaml, as 
suggested in 
https://ci.apache.org/projects/flink/flink-docs-master/setup/yarn_setup.html#recovery-behavior-of-flink-on-yarn,
also does not work.

example:
{code:none}
flink run -m yarn-cluster -yn 3 -yjm 1024 -ytm 4096 .jar --parallelism 3 
-Dyarn.maximum-failed-containers=100
{code}
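
For comparison, the CLI normally accepts dynamic properties before the user 
jar via the {{-yD}} flag; a sketch of such an invocation (assuming {{-yD}} is 
forwarded to the YARN descriptor; whether that works for one-time jobs is 
part of what this issue should clarify):

{code:none}
flink run -m yarn-cluster -yn 3 -yjm 1024 -ytm 4096 \
    -yD yarn.maximum-failed-containers=100 .jar --parallelism 3
{code}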



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Request for edit rights in Jira

2016-11-16 Thread limcheehau lim
Hi folks,

I would like to request edit rights in Jira so that I can assign the
issue that I'm working on (https://issues.apache.org/jira/browse/FLINK-3551)
to myself.

My Jira username is ch33hau

Thanks =)
Regards,
Chee Hau


[jira] [Created] (FLINK-5082) Pull ExecutorService lifecycle management out of the JobManager

2016-11-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5082:


 Summary: Pull ExecutorService lifecycle management out of the 
JobManager
 Key: FLINK-5082
 URL: https://issues.apache.org/jira/browse/FLINK-5082
 Project: Flink
  Issue Type: Bug
  Components: JobManager
Affects Versions: 1.1.3, 1.2.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.2.0, 1.1.4


The {{JobManager}} receives an {{ExecutorService}}, used to run its futures, as 
a constructor parameter. Even though the {{ExecutorService}} comes from 
outside, the {{JobManager}} shuts the executor service down when the 
{{JobManager}} terminates. This is clearly sub-optimal behaviour which can also 
lead to {{RejectedExecutionExceptions}}.

I propose to move the {{ExecutorService}} lifecycle management out of the 
{{JobManager}}.
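
A minimal sketch of the proposed ownership model; the factory and run calls 
are hypothetical placeholders (commented out), not actual Flink API. The 
point is that the component creating the executor also shuts it down:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ExecutorLifecycleSketch {
    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(8);
        try {
            // Hypothetical wiring: the JobManager only uses the executor,
            // it does not own it.
            // JobManager jobManager = createJobManager(executor);
            // runUntilTerminated(jobManager);
        } finally {
            // The creator owns the lifecycle; the JobManager never calls
            // shutdown(), so reuse cannot cause RejectedExecutionExceptions.
            executor.shutdown();
        }
    }
}
{code}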



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Request for edit rights in Jira

2016-11-16 Thread Vasiliki Kalavri
Hi!

I've given you contributor rights and assigned the issue to you. You can
also assign issues to yourself now :)

-Vasia.

On 16 November 2016 at 17:37, limcheehau lim  wrote:

> Hi folks,
>
> I would like to request edit rights in Jira so that I can assign the
> issue that I'm working on (https://issues.apache.org/jira/browse/FLINK-3551)
> to myself.
>
> My Jira username is ch33hau
>
> Thanks =)
> Regards,
> Chee Hau
>


[jira] [Created] (FLINK-5083) Race condition in Rolling/Bucketing Sink pending files cleanup

2016-11-16 Thread Cliff Resnick (JIRA)
Cliff Resnick created FLINK-5083:


 Summary: Race condition in Rolling/Bucketing Sink pending files 
cleanup
 Key: FLINK-5083
 URL: https://issues.apache.org/jira/browse/FLINK-5083
 Project: Flink
  Issue Type: Bug
  Components: DataStream API
Affects Versions: 1.1.3, 1.2.0
Reporter: Cliff Resnick


In both the open and restore methods there is code that:

1. gets a recursive listing from baseDir
2. iterates over the listing and checks filenames based on subtaskIndex and 
other criteria to find pending or in-progress files; if found, deletes them.

The problem is that the recursive listing gets all files for all 
subtaskIndexes. The race occurs when #hasNext is called as part of the 
iteration: a hidden existence check is made on the "next" file, which may have 
been deleted by another task after the listing but before the iteration, so an 
error is thrown and the job fails.

Depending on the number of pending files, this condition may outlast the number 
of job retries, each failing on a different file.

A solution would be to use #listStatus instead. The Hadoop FileSystem supports 
a PathFilter in its #listStatus calls, but not in the recursive #listFiles 
call. The cleanup is performed from the baseDir, so the recursive listing 
would have to be done in Flink.

This touches on another issue. Over time, the directory listing is bound to get 
very large, and re-listing everything from the baseDir may get increasingly 
expensive, especially if the FS is S3. Maybe we can have a Bucketer callback to 
return a list of cleanup root directories based on the current file? I'm 
guessing most people are using time-based bucketing, so there's only so much of 
a period where cleanup will matter. If so, then this would also solve the above 
recursive listing problem.
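
For illustration, a minimal sketch of such a Flink-side recursive listing 
with a PathFilter, assuming Hadoop's FileSystem API; the filter itself would 
encode the subtaskIndex checks:

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.PathFilter;

public final class FilteredListing {
    /** Recursively lists files under dir, applying the filter to files only. */
    public static List<FileStatus> listFiles(FileSystem fs, Path dir, PathFilter filter)
            throws IOException {
        List<FileStatus> result = new ArrayList<>();
        for (FileStatus status : fs.listStatus(dir)) {
            if (status.isDirectory()) {
                // Always recurse into directories; the filter only selects files.
                result.addAll(listFiles(fs, status.getPath(), filter));
            } else if (filter.accept(status.getPath())) {
                // Only matching files are returned, so the later iteration
                // never touches files owned by other subtasks.
                result.add(status);
            }
        }
        return result;
    }
}
{code}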

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5084) Replace Java Table API integration tests by unit tests

2016-11-16 Thread Fabian Hueske (JIRA)
Fabian Hueske created FLINK-5084:


 Summary: Replace Java Table API integration tests by unit tests
 Key: FLINK-5084
 URL: https://issues.apache.org/jira/browse/FLINK-5084
 Project: Flink
  Issue Type: Task
  Components: Table API & SQL
Reporter: Fabian Hueske
Priority: Minor


The Java Table API is a wrapper on top of the Scala Table API. 
Instead of operating directly with Expressions like the Scala API, the Java API 
accepts a String parameter which is parsed into Expressions.

We could therefore replace the Java Table API ITCases with tests that check 
that the parsing step produces a valid logical plan.

This could be done by creating two {{Table}} objects for an identical query, 
once with the Scala Expression API and once with the Java String API, and 
comparing the logical plans of both {{Table}} objects. Basically something like 
the following:

{code}
val ds1 = CollectionDataSets.getSmall3TupleDataSet(env).toTable(tEnv, 'a, 'b, 'c)
val ds2 = CollectionDataSets.get5TupleDataSet(env).toTable(tEnv, 'd, 'e, 'f, 'g, 'h)

val joinT1 = ds1.join(ds2).where('b === 'e).select('c, 'g)
val joinT2 = ds1.join(ds2).where("b = e").select("c, g")

val lPlan1 = joinT1.logicalPlan
val lPlan2 = joinT2.logicalPlan

Assert.assertEquals("Logical Plans do not match", lPlan1, lPlan2)
{code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (FLINK-5085) Execute CheckpointCoordinator's state discard calls asynchronously

2016-11-16 Thread Till Rohrmann (JIRA)
Till Rohrmann created FLINK-5085:


 Summary: Execute CheckpointCoordinator's state discard calls 
asynchronously
 Key: FLINK-5085
 URL: https://issues.apache.org/jira/browse/FLINK-5085
 Project: Flink
  Issue Type: Bug
  Components: State Backends, Checkpointing
Affects Versions: 1.1.3, 1.2.0
Reporter: Till Rohrmann
Assignee: Till Rohrmann
 Fix For: 1.2.0, 1.1.4


Under certain circumstances, the {{CheckpointCoordinator}} discards pending 
checkpoints or state handles. These discard operations can involve a blocking 
IO operation if the underlying state handle refers to a file which has to be 
deleted. In order not to block the calling thread, we should execute these 
calls in a dedicated IO executor.
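
A minimal sketch of offloading such a discard call to a dedicated IO executor; 
the {{StateHandle}} interface here is an illustrative stand-in, not the actual 
Flink type:

{code}
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncDiscardSketch {
    // Dedicated executor so blocking file deletes never stall the
    // checkpoint coordinator's calling thread.
    private final ExecutorService ioExecutor = Executors.newFixedThreadPool(4);

    interface StateHandle { // illustrative stand-in
        void discardState() throws Exception;
    }

    void discardAsync(final StateHandle handle) {
        ioExecutor.execute(new Runnable() {
            @Override
            public void run() {
                try {
                    handle.discardState(); // may perform blocking file IO
                } catch (Exception e) {
                    // Log and continue; a failed discard must not fail the job.
                }
            }
        });
    }
}
{code}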



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Request for edit rights in Jira

2016-11-16 Thread limcheehau lim
Thanks Vasia! =)

On Thu, Nov 17, 2016 at 4:01 AM, Vasiliki Kalavri  wrote:

> Hi!
>
> I've given you contributor rights and assigned the issue to you. You can
> also assign issues to yourself now :)
>
> -Vasia.
>
> On 16 November 2016 at 17:37, limcheehau lim  wrote:
>
> > Hi folks,
> >
> > I would like to request edit rights in Jira so that I can assign the
> > issue that I'm working on (https://issues.apache.org/jira/browse/FLINK-3551)
> > to myself.
> >
> > My Jira username is ch33hau
> >
> > Thanks =)
> > Regards,
> > Chee Hau
> >
>


Request for permissions in JIRA

2016-11-16 Thread Boris Osipov
Hi folks!
I've completed https://issues.apache.org/jira/browse/FLINK-5006 and want to 
resolve it. Could you please give me sufficient permissions in JIRA?
My user name in JIRA is Boris_Osipov

Best regards,
Boris Osipov