[jira] [Commented] (FLINK-4256) Fine-grained recovery
[ https://issues.apache.org/jira/browse/FLINK-4256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413090#comment-15413090 ]

wenlong.lyu commented on FLINK-4256:
------------------------------------

Thanks for explaining, you are right about pre-computing. I still have another concern: I think it is quite a special case for a job to be splittable at the ExecutionJobVertex level; in practice it may only happen in batch job graphs with blocking edges.

> Fine-grained recovery
> ---------------------
>
> Key: FLINK-4256
> URL: https://issues.apache.org/jira/browse/FLINK-4256
> Project: Flink
> Issue Type: Improvement
> Components: JobManager
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 1.2.0
>
> When a task fails during execution, Flink currently resets the entire execution graph and triggers complete re-execution from the last completed checkpoint. This is more expensive than just re-executing the failed tasks. In many cases, more fine-grained recovery is possible.
> The full description and design is in the corresponding FLIP:
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-1+%3A+Fine+Grained+Recovery+from+Task+Failures

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
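The comment above is about splitting an execution graph into independently restartable parts along blocking edges. As a rough illustration of the idea discussed in FLIP-1 (a toy sketch with hypothetical names, not Flink's actual implementation), such failover regions can be computed as connected components over pipelined edges, with blocking edges acting as region boundaries:

```python
from collections import defaultdict

def failover_regions(vertices, edges):
    """Group vertices into failover regions: connected components over
    pipelined edges only. Blocking edges separate regions, so a batch
    graph with blocking boundaries splits into independently restartable
    parts. edges: list of (src, dst, kind), kind "pipelined" or "blocking".
    """
    adjacency = defaultdict(list)
    for src, dst, kind in edges:
        if kind == "pipelined":          # blocking edges are region boundaries
            adjacency[src].append(dst)
            adjacency[dst].append(src)
    regions, seen = [], set()
    for vertex in vertices:
        if vertex in seen:
            continue
        stack, region = [vertex], set()  # plain DFS over pipelined connectivity
        while stack:
            node = stack.pop()
            if node in region:
                continue
            region.add(node)
            stack.extend(adjacency[node])
        seen |= region
        regions.append(region)
    return regions
```

With a blocking edge between "map" and "reduce", a failure in "reduce" would only require restarting its own region, not the source/map region.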
[GitHub] flink issue #2330: FLINK-4311 Fixed several problems in TableInputFormat
Github user nielsbasjes commented on the issue: https://github.com/apache/flink/pull/2330 Question: Is this change good? Or do you have more things that I need to change before it can be committed? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4311) TableInputFormat fails when reused on next split
[ https://issues.apache.org/jira/browse/FLINK-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413108#comment-15413108 ]

ASF GitHub Bot commented on FLINK-4311:
---------------------------------------

Github user nielsbasjes commented on the issue:

    https://github.com/apache/flink/pull/2330

    Question: Is this change good? Or do you have more things that I need to change before it can be committed?

> TableInputFormat fails when reused on next split
> ------------------------------------------------
>
> Key: FLINK-4311
> URL: https://issues.apache.org/jira/browse/FLINK-4311
> Project: Flink
> Issue Type: Bug
> Affects Versions: 1.0.3
> Reporter: Niels Basjes
> Assignee: Niels Basjes
> Priority: Critical
>
> We have written a batch job that uses data from HBase by means of using the TableInputFormat.
> We have found that this class sometimes fails with this exception:
> {quote}
> java.lang.RuntimeException: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:208)
>     at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
>     at org.apache.hadoop.hbase.client.ClientScanner.nextScanner(ClientScanner.java:295)
>     at org.apache.hadoop.hbase.client.ClientScanner.initializeScannerInConstruction(ClientScanner.java:160)
>     at org.apache.hadoop.hbase.client.ClientScanner.<init>(ClientScanner.java:155)
>     at org.apache.hadoop.hbase.client.HTable.getScanner(HTable.java:821)
>     at org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:152)
>     at org.apache.flink.addons.hbase.TableInputFormat.open(TableInputFormat.java:47)
>     at org.apache.flink.runtime.operators.DataSourceTask.invoke(DataSourceTask.java:147)
>     at org.apache.flink.runtime.taskmanager.Task.run(Task.java:559)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.util.concurrent.RejectedExecutionException: Task org.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture@4f4efe4b rejected from java.util.concurrent.ThreadPoolExecutor@7872d5c1[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 1165]
>     at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2047)
>     at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:823)
>     at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1369)
>     at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.submit(ResultBoundedCompletionService.java:142)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.addCallsForCurrentReplica(ScannerCallableWithReplicas.java:269)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:165)
>     at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
>     at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
>     ... 10 more
> {quote}
> As you can see, the ThreadPoolExecutor was terminated at this point.
> We tracked it down to the fact that
> # the configure method opens the table
> # the open method obtains the result scanner
> # the close method closes the table.
> If a second split arrives on the same instance then the open method will fail because the table has already been closed.
> We also found that this error varies with the versions of HBase that are used.
> I have also seen this exception:
> {quote}
> Caused by: java.io.IOException: hconnection-0x19d37183 closed
>     at org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation.locateRegion(ConnectionManager.java:1146)
>     at org.apache.hadoop.hbase.client.RpcRetryingCallerWithReadReplicas.getRegionLocations(RpcRetryingCallerWithReadReplicas.java:300)
>     ... 37 more
> {quote}
> I found that the [documentation of the InputFormat interface|https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/api/common/io/InputFormat.html] clearly states:
> {quote}
> IMPORTANT NOTE: Input formats must be written such that an instance can be opened again after it was closed. That is due to the fact that the input format is used for potentially multiple splits. After a split is done, the format's close function is invoked and, if another split is available, the open function is invo
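The InputFormat contract quoted above (an instance must be reopenable after close, because one instance handles many splits) can be illustrated with a minimal sketch. All names below are hypothetical stand-ins for the HBase/Flink classes, not the real API:

```python
class ReusableTableInputFormat:
    """Sketch of an input-format instance that survives open/close cycles.

    Buggy pattern: create the connection once in configure() and close it
    in close(); the second open() then fails on a closed connection.
    Fix: (re)create the connection in open(), release it in close().
    """

    def __init__(self, connection_factory):
        self._connection_factory = connection_factory
        self._connection = None

    def configure(self, parameters):
        # Intentionally no connection here: configure() runs once per
        # instance, but open()/close() run once per input split.
        pass

    def open(self, split):
        # A fresh connection per split keeps the instance reopenable.
        self._connection = self._connection_factory()
        return self._connection.scan(split)

    def close(self):
        if self._connection is not None:
            self._connection.close()
            self._connection = None


class FakeConnection:
    """Stand-in for an HBase connection, for demonstration only."""
    def __init__(self):
        self.closed = False

    def scan(self, split):
        if self.closed:
            raise RuntimeError("connection closed")  # the reported failure mode
        return iter([("row", split)])

    def close(self):
        self.closed = True


def rows_for_splits(splits):
    """Process several splits with ONE format instance, as the runtime does."""
    fmt = ReusableTableInputFormat(FakeConnection)
    fmt.configure({})
    rows = []
    for split in splits:
        scanner = fmt.open(split)  # must work even after a prior close()
        rows.extend(scanner)
        fmt.close()
    return rows
```

With the buggy configure-time connection, the second call to open() would hit the "connection closed" error, matching the stack traces above.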
[jira] [Resolved] (FLINK-4332) Savepoint Serializer mixed read()/readFully()
[ https://issues.apache.org/jira/browse/FLINK-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-4332.
---------------------------------
    Resolution: Fixed

Fixed in
  - 1.2.0 via 3f3bab10b9ca68eb31a7ef5a31e49145b51006fd
  - 1.1.1 via 19de8ec01a9ec2b3ac0fdf0052b780f970b9bcd1

> Savepoint Serializer mixed read()/readFully()
> ---------------------------------------------
>
> Key: FLINK-4332
> URL: https://issues.apache.org/jira/browse/FLINK-4332
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Critical
> Fix For: 1.2.0, 1.1.1
>
> The {{SavepointV1Serializer}} accidentally used {{InputStream.read(byte[], int, int)}} where it should use {{InputStream.readFully(byte[], int, int)}}
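The distinction matters because a single read() call may legally return fewer bytes than requested without reaching the end of the stream; only a loop (what Java's readFully does) guarantees the buffer is filled. A sketch of the difference in Python, using a stream that deliberately delivers one byte per call:

```python
import io

class TrickleStream(io.RawIOBase):
    """A readable stream that hands out at most one byte per low-level read,
    mimicking a network stream that delivers data in small chunks."""
    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, buffer):
        chunk = self._inner.read(1)        # deliberately short reads
        buffer[:len(chunk)] = chunk
        return len(chunk)


def read_fully(stream, n):
    """Java's DataInput.readFully, in Python: loop until exactly n bytes
    have been read, raising on a premature end of stream."""
    out = bytearray()
    while len(out) < n:
        chunk = stream.read(n - len(out))
        if not chunk:
            raise EOFError("stream ended after %d of %d bytes" % (len(out), n))
        out.extend(chunk)
    return bytes(out)
```

A naive single read() against such a stream silently returns a short buffer, which for a savepoint deserializer means corrupted state rather than an error.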
[jira] [Closed] (FLINK-4320) Fix misleading ScheduleMode names
[ https://issues.apache.org/jira/browse/FLINK-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-4320.
-------------------------------

> Fix misleading ScheduleMode names
> ---------------------------------
>
> Key: FLINK-4320
> URL: https://issues.apache.org/jira/browse/FLINK-4320
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Minor
> Fix For: 1.2.0
>
> The {{ScheduleMode}} has non-intuitive option names and offers non-implemented options.
[jira] [Resolved] (FLINK-4320) Fix misleading ScheduleMode names
[ https://issues.apache.org/jira/browse/FLINK-4320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-4320.
---------------------------------
    Resolution: Fixed

Fixed via cd98e85ddd3c35e5900713266fc38916b53f172d (commit has wrong JIRA issue tag)

> Fix misleading ScheduleMode names
> ---------------------------------
>
> Key: FLINK-4320
> URL: https://issues.apache.org/jira/browse/FLINK-4320
> Project: Flink
> Issue Type: Sub-task
> Components: Distributed Coordination
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Minor
> Fix For: 1.2.0
>
> The {{ScheduleMode}} has non-intuitive option names and offers non-implemented options.
[jira] [Closed] (FLINK-4332) Savepoint Serializer mixed read()/readFully()
[ https://issues.apache.org/jira/browse/FLINK-4332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-4332.
-------------------------------

> Savepoint Serializer mixed read()/readFully()
> ---------------------------------------------
>
> Key: FLINK-4332
> URL: https://issues.apache.org/jira/browse/FLINK-4332
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Critical
> Fix For: 1.2.0, 1.1.1
>
> The {{SavepointV1Serializer}} accidentally used {{InputStream.read(byte[], int, int)}} where it should use {{InputStream.readFully(byte[], int, int)}}
[jira] [Closed] (FLINK-4333) Name mixup in Savepoint versions
[ https://issues.apache.org/jira/browse/FLINK-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen closed FLINK-4333.
-------------------------------

> Name mixup in Savepoint versions
> --------------------------------
>
> Key: FLINK-4333
> URL: https://issues.apache.org/jira/browse/FLINK-4333
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Trivial
> Fix For: 1.2.0
>
> The {{SavepointV0}} is serialized with the {{SavepointV1Serializer}}
[jira] [Resolved] (FLINK-4333) Name mixup in Savepoint versions
[ https://issues.apache.org/jira/browse/FLINK-4333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen resolved FLINK-4333.
---------------------------------
    Resolution: Fixed

Fixed via 9a84b04f076f9cdc2fd0037fcc89f31edc596bdd

> Name mixup in Savepoint versions
> --------------------------------
>
> Key: FLINK-4333
> URL: https://issues.apache.org/jira/browse/FLINK-4333
> Project: Flink
> Issue Type: Bug
> Components: State Backends, Checkpointing
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Priority: Trivial
> Fix For: 1.2.0
>
> The {{SavepointV0}} is serialized with the {{SavepointV1Serializer}}
[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413199#comment-15413199 ]

ASF GitHub Bot commented on FLINK-4334:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2341

    Thanks, this makes a lot of sense. We should merge this for Flink 1.1.1.

    For Flink 1.2.0, we should actually fix the root issue, which is that the hadoop1 artifact has a scala suffix even though it does not need one. I think this was introduced by accident.

> Shaded Hadoop1 jar not fully excluded in Quickstart
> ---------------------------------------------------
>
> Key: FLINK-4334
> URL: https://issues.apache.org/jira/browse/FLINK-4334
> Project: Flink
> Issue Type: Bug
> Components: Quickstarts
> Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3
> Reporter: Shannon Carey
>
> The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 1.0.0 (see https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), but the quickstart POMs both refer to it as flink-shaded-hadoop1.
> If using "-Pbuild-jar", the problem is not encountered.
[GitHub] flink issue #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoop1 in q...
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2341

    Thanks, this makes a lot of sense. We should merge this for Flink 1.1.1.

    For Flink 1.2.0, we should actually fix the root issue, which is that the hadoop1 artifact has a scala suffix even though it does not need one. I think this was introduced by accident.
[jira] [Assigned] (FLINK-4329) Streaming File Source Must Correctly Handle Timestamps/Watermarks
[ https://issues.apache.org/jira/browse/FLINK-4329?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kostas Kloudas reassigned FLINK-4329:
-------------------------------------
    Assignee: Kostas Kloudas

> Streaming File Source Must Correctly Handle Timestamps/Watermarks
> ------------------------------------------------------------------
>
> Key: FLINK-4329
> URL: https://issues.apache.org/jira/browse/FLINK-4329
> Project: Flink
> Issue Type: Bug
> Components: Streaming Connectors
> Affects Versions: 1.1.0
> Reporter: Aljoscha Krettek
> Assignee: Kostas Kloudas
> Fix For: 1.1.1
>
> The {{ContinuousFileReaderOperator}} does not correctly deal with watermarks, i.e. they are just passed through. This means that when the {{ContinuousFileMonitoringFunction}} closes and emits a {{Long.MAX_VALUE}} watermark, that watermark can "overtake" the records that are to be emitted in the {{ContinuousFileReaderOperator}}. Together with the new "allowed lateness" setting in the window operator this can lead to elements being dropped as late.
> Also, {{ContinuousFileReaderOperator}} does not correctly assign ingestion timestamps since it is not technically a source but looks like one to the user.
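The "overtaking" problem described above can be sketched with a toy model (this is an illustration of the failure mode, not the ContinuousFileReaderOperator code): if the final Long.MAX_VALUE watermark is forwarded immediately while records are still queued in the reader, a downstream event-time operator treats those records as late and drops them.

```python
MAX_WATERMARK = float("inf")

def drain(reader_queue, final_watermark, hold_watermark):
    """Toy reader operator. Records sit in a queue; the final watermark either
    overtakes them (hold_watermark=False, the buggy pass-through behavior)
    or is held back until the queue is drained (hold_watermark=True)."""
    records = [("record", t) for t in reader_queue]
    if hold_watermark:
        return records + [("watermark", final_watermark)]
    return [("watermark", final_watermark)] + records

def window_filter(stream):
    """Toy event-time operator: drops any record at or behind the last
    watermark, the way a window with bounded allowed lateness would."""
    last_wm, kept, dropped = float("-inf"), [], []
    for kind, value in stream:
        if kind == "watermark":
            last_wm = max(last_wm, value)
        elif value <= last_wm:
            dropped.append(value)   # late: the watermark already passed it
        else:
            kept.append(value)
    return kept, dropped
```

Running both orderings through the filter shows why the watermark must be held back until the reader's queue is empty.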
[jira] [Updated] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-4334:
--------------------------------
    Fix Version/s: 1.1.1

> Shaded Hadoop1 jar not fully excluded in Quickstart
> ---------------------------------------------------
>
> Key: FLINK-4334
> URL: https://issues.apache.org/jira/browse/FLINK-4334
> Project: Flink
> Issue Type: Bug
> Components: Quickstarts
> Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3
> Reporter: Shannon Carey
> Fix For: 1.1.1
>
> The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink 1.0.0 (see https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), but the quickstart POMs both refer to it as flink-shaded-hadoop1.
> If using "-Pbuild-jar", the problem is not encountered.
[jira] [Created] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact
Stephan Ewen created FLINK-4337:
-----------------------------------

Summary: Remove unnecessary Scala suffix from Hadoop1 artifact
Key: FLINK-4337
URL: https://issues.apache.org/jira/browse/FLINK-4337
Project: Flink
Issue Type: Bug
Components: Build System
Affects Versions: 1.1.0
Reporter: Stephan Ewen
Fix For: 1.2.0

The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala dependency.
[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground
[ https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413252#comment-15413252 ]

Ismaël Mejía commented on FLINK-4326:
-------------------------------------

Well, it is good to know that there is interest around the foreground mode. It is probably a good idea to invite [~greghogan] to the discussion since he reviewed my previous PR. What do you think? Should I rebase my previous patch and create a PR for this, or does any of you have a better idea of how to do it?

> Flink start-up scripts should optionally start services on the foreground
> --------------------------------------------------------------------------
>
> Key: FLINK-4326
> URL: https://issues.apache.org/jira/browse/FLINK-4326
> Project: Flink
> Issue Type: Improvement
> Components: Startup Shell Scripts
> Affects Versions: 1.0.3
> Reporter: Elias Levy
>
> This has previously been mentioned in the mailing list, but has not been addressed. Flink start-up scripts start the job and task managers in the background. This makes it difficult to integrate Flink with most process supervisory tools and init systems, including Docker. One can get around this via hacking the scripts or manually starting the right classes via Java, but it is a brittle solution.
> In addition to starting the daemons in the foreground, the start-up scripts should use exec instead of running the commands, so as to avoid forks. Many supervisory tools assume the PID of the process to be monitored is that of the process it first executes, and fork chains make it difficult for the supervisor to figure out what process to monitor. Specifically, jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and flink-daemon.sh should exec java.
[GitHub] flink pull request #2342: FLINK-4253 - Rename "recovery.mode" config key to ...
GitHub user ramkrish86 opened a pull request:

    https://github.com/apache/flink/pull/2342

    FLINK-4253 - Rename "recovery.mode" config key to "high-availability" (Ram)

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes.

    - [ ] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    - [ ] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed

    I ran `mvn clean verify` - all the tests in flink-runtime passed, and these tests failed:

    ```
    Failed tests:
      BlobServerDeleteTest.testDeleteAll:157 DELETE operation failed: Server side error: Unable to delete directory C:\Users\rsvasude\AppData\Local\Temp\blobStore-18502f30-ee19-4c4c-9cb7-7b51c9bdeffb\job_801e21ed42b26de3c813cfe4917d029d.
      LeaderChangeStateCleanupTest.testReelectionOfSameJobManager:244 TaskManager should not be able to register at JobManager.
    ```

    I think it is an environment issue. I ran the other tests changed as part of this PR and they all seem to pass.

    Handled backward compatibility if the config file has the older config `recovery.mode`. The same has been handled in the `config.sh` script also. Suggestions/feedback welcome.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ramkrish86/flink FLINK-4253

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2342.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2342

----
commit 0acc3c4ae131bcd8735493227f1f3df18adae1b3
Author: Ramkrishna
Date:   2016-08-09T09:18:12Z

    FLINK-4253 - Rename "recovery.mode" config key to "high-availability" (Ram)
[GitHub] flink issue #2338: [FLINK-4316] [core] [hadoop compatibility] Make flink-cor...
Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2338

    +1 to merge, once the failing tests are fixed.
    I think the exception to the API stability check is okay in this case. The class is still in the same java package.

    These are the test failures:
    ```
    Results :

    Failed tests:
      PojoTypeExtractionTest.testPojoWC:203->checkWCPojoAsserts:244 position of field complex.valueType wrong expected:<2> but was:<5>

    Tests in error:
      TypeInfoParserTest.testMultiDimensionalArray:321 » IllegalArgument String coul...
      TypeInfoParserTest.testPojoType:190 » IllegalArgument String could not be pars...
    ```
[jira] [Commented] (FLINK-4253) Rename "recovery.mode" config key to "high-availability"
[ https://issues.apache.org/jira/browse/FLINK-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413258#comment-15413258 ]

ASF GitHub Bot commented on FLINK-4253:
---------------------------------------

GitHub user ramkrish86 opened a pull request:

    https://github.com/apache/flink/pull/2342

    FLINK-4253 - Rename "recovery.mode" config key to "high-availability" (Ram)

    Thanks for contributing to Apache Flink. Before you open your pull request, please take the following check list into consideration. If your changes take all of the items into account, feel free to open your pull request. For more information and/or questions please refer to the [How To Contribute guide](http://flink.apache.org/how-to-contribute.html). In addition to going through the list, please provide a meaningful description of your changes.

    - [ ] General
      - The pull request references the related JIRA issue ("[FLINK-XXX] Jira title text")
      - The pull request addresses only one issue
      - Each commit in the PR has a meaningful commit message (including the JIRA id)
    - [ ] Documentation
      - Documentation has been added for new functionality
      - Old documentation affected by the pull request has been updated
      - JavaDoc for public methods has been added
    - [ ] Tests & Build
      - Functionality added by the pull request is covered by tests
      - `mvn clean verify` has been executed successfully locally or a Travis build has passed

    I ran `mvn clean verify` - all the tests in flink-runtime passed, and these tests failed:

    ```
    Failed tests:
      BlobServerDeleteTest.testDeleteAll:157 DELETE operation failed: Server side error: Unable to delete directory C:\Users\rsvasude\AppData\Local\Temp\blobStore-18502f30-ee19-4c4c-9cb7-7b51c9bdeffb\job_801e21ed42b26de3c813cfe4917d029d.
      LeaderChangeStateCleanupTest.testReelectionOfSameJobManager:244 TaskManager should not be able to register at JobManager.
    ```

    I think it is an environment issue. I ran the other tests changed as part of this PR and they all seem to pass.

    Handled backward compatibility if the config file has the older config `recovery.mode`. The same has been handled in the `config.sh` script also. Suggestions/feedback welcome.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ramkrish86/flink FLINK-4253

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2342.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2342

----
commit 0acc3c4ae131bcd8735493227f1f3df18adae1b3
Author: Ramkrishna
Date:   2016-08-09T09:18:12Z

    FLINK-4253 - Rename "recovery.mode" config key to "high-availability" (Ram)

> Rename "recovery.mode" config key to "high-availability"
> ---------------------------------------------------------
>
> Key: FLINK-4253
> URL: https://issues.apache.org/jira/browse/FLINK-4253
> Project: Flink
> Issue Type: Improvement
> Reporter: Ufuk Celebi
> Assignee: ramkrishna.s.vasudevan
>
> Currently, HA is configured via the following configuration keys:
> {code}
> recovery.mode: STANDALONE // No high availability (HA)
> recovery.mode: ZOOKEEPER // HA
> {code}
> This could be more straightforward by simply renaming the key to {{high-availability}}. Furthermore, the term {{STANDALONE}} is overloaded. We already have standalone cluster mode.
> {code}
> high-availability: NONE // No HA
> high-availability: ZOOKEEPER // HA via ZooKeeper
> {code}
> The {{recovery.mode}} configuration keys would have to be deprecated before completely removing them.
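The backward-compatibility handling described in the PR can be sketched generically (a hypothetical helper, not Flink's actual configuration code): read the new key first, fall back to the deprecated one with a warning, and map the renamed values.

```python
import warnings

# Mapping of deprecated recovery.mode values to new high-availability values,
# as proposed in the issue: STANDALONE becomes NONE, ZOOKEEPER stays.
VALUE_MAPPING = {"standalone": "none", "zookeeper": "zookeeper"}

def resolve_ha_mode(config):
    """Resolve the HA mode from a config dict, preferring the new
    'high-availability' key and falling back to the deprecated
    'recovery.mode' key with a deprecation warning."""
    if "high-availability" in config:
        return config["high-availability"].lower()
    if "recovery.mode" in config:
        warnings.warn(
            "'recovery.mode' is deprecated; use 'high-availability'",
            DeprecationWarning,
        )
        return VALUE_MAPPING[config["recovery.mode"].lower()]
    return "none"   # default: no high availability
```

The deprecation window lets existing config files keep working for a release cycle before the old key is removed.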
[jira] [Commented] (FLINK-4316) Make flink-core independent of Hadoop
[ https://issues.apache.org/jira/browse/FLINK-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413259#comment-15413259 ]

ASF GitHub Bot commented on FLINK-4316:
---------------------------------------

Github user rmetzger commented on the issue:

    https://github.com/apache/flink/pull/2338

    +1 to merge, once the failing tests are fixed.
    I think the exception to the API stability check is okay in this case. The class is still in the same java package.

    These are the test failures:
    ```
    Results :

    Failed tests:
      PojoTypeExtractionTest.testPojoWC:203->checkWCPojoAsserts:244 position of field complex.valueType wrong expected:<2> but was:<5>

    Tests in error:
      TypeInfoParserTest.testMultiDimensionalArray:321 » IllegalArgument String coul...
      TypeInfoParserTest.testPojoType:190 » IllegalArgument String could not be pars...
    ```

> Make flink-core independent of Hadoop
> --------------------------------------
>
> Key: FLINK-4316
> URL: https://issues.apache.org/jira/browse/FLINK-4316
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 1.2.0
>
> We want to gradually reduce the hard and heavy mandatory dependencies on Hadoop. Hadoop will still be part of (most) Flink downloads, but the API projects should not have a hard dependency on Hadoop.
> I suggest to start with {{flink-core}}, because it only depends on Hadoop for the {{Writable}} type, to support seamless operation of Hadoop types.
> I propose to move all {{WritableTypeInfo}}-related classes to the {{flink-hadoop-compatibility}} project and access them via reflection in the {{TypeExtractor}}.
> That way, {{Writable}} types will be supported out of the box if users have the {{flink-hadoop-compatibility}} project in the classpath.
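The reflection-based approach proposed above, where core code probes for an optional module and enables support only when it is present, can be sketched in Python with importlib (module and class names below are hypothetical; Flink does the equivalent with Java reflection in the TypeExtractor):

```python
import importlib

def load_optional_type_info(module_name, class_name):
    """Try to load a type-info class from an optional compatibility module.
    Returns the class if the module is importable, otherwise None, so the
    core package keeps no hard dependency on the optional one."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None   # optional dependency absent: feature stays disabled
    return getattr(module, class_name, None)

# Core code path: support is enabled out of the box when the optional
# compatibility module (hypothetical name) is on the path.
writable_type_info = load_optional_type_info("hadoop_compat", "WritableTypeInfo")
hadoop_support_enabled = writable_type_info is not None
```

The key property is that the lookup fails soft: users without the compatibility package lose only the optional feature, not the whole core.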
[jira] [Commented] (FLINK-3298) Streaming connector for ActiveMQ
[ https://issues.apache.org/jira/browse/FLINK-3298?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413270#comment-15413270 ]

Robert Metzger commented on FLINK-3298:
---------------------------------------

I've started a discussion on this issue on the dev@ list today.

> Streaming connector for ActiveMQ
> --------------------------------
>
> Key: FLINK-3298
> URL: https://issues.apache.org/jira/browse/FLINK-3298
> Project: Flink
> Issue Type: New Feature
> Components: Streaming Connectors
> Reporter: Mohit Sethi
> Assignee: Ivan Mushketyk
> Priority: Minor
[GitHub] flink issue #2338: [FLINK-4316] [core] [hadoop compatibility] Make flink-cor...
Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2338

    Good point. I'll fix those tests...
[jira] [Commented] (FLINK-4316) Make flink-core independent of Hadoop
[ https://issues.apache.org/jira/browse/FLINK-4316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413275#comment-15413275 ]

ASF GitHub Bot commented on FLINK-4316:
---------------------------------------

Github user StephanEwen commented on the issue:

    https://github.com/apache/flink/pull/2338

    Good point. I'll fix those tests...

> Make flink-core independent of Hadoop
> --------------------------------------
>
> Key: FLINK-4316
> URL: https://issues.apache.org/jira/browse/FLINK-4316
> Project: Flink
> Issue Type: Bug
> Components: Core
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Assignee: Stephan Ewen
> Fix For: 1.2.0
>
> We want to gradually reduce the hard and heavy mandatory dependencies on Hadoop. Hadoop will still be part of (most) Flink downloads, but the API projects should not have a hard dependency on Hadoop.
> I suggest to start with {{flink-core}}, because it only depends on Hadoop for the {{Writable}} type, to support seamless operation of Hadoop types.
> I propose to move all {{WritableTypeInfo}}-related classes to the {{flink-hadoop-compatibility}} project and access them via reflection in the {{TypeExtractor}}.
> That way, {{Writable}} types will be supported out of the box if users have the {{flink-hadoop-compatibility}} project in the classpath.
[jira] [Created] (FLINK-4338) Implement Slot Pool
Stephan Ewen created FLINK-4338:
-----------------------------------

Summary: Implement Slot Pool
Key: FLINK-4338
URL: https://issues.apache.org/jira/browse/FLINK-4338
Project: Flink
Issue Type: Sub-task
Components: Cluster Management
Affects Versions: 1.1.0
Reporter: Stephan Ewen
Fix For: 1.2.0

Implement the slot pool as described in the FLIP-6 document:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
[jira] [Updated] (FLINK-4338) Implement Slot Pool
[ https://issues.apache.org/jira/browse/FLINK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-4338:
--------------------------------
    Issue Type: Bug  (was: Sub-task)
        Parent:     (was: FLINK-4319)

> Implement Slot Pool
> -------------------
>
> Key: FLINK-4338
> URL: https://issues.apache.org/jira/browse/FLINK-4338
> Project: Flink
> Issue Type: Bug
> Components: Cluster Management
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Fix For: 1.2.0
>
> Implement the slot pool as described in the FLIP-6 document:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
[jira] [Updated] (FLINK-4338) Implement Slot Pool
[ https://issues.apache.org/jira/browse/FLINK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stephan Ewen updated FLINK-4338:
--------------------------------
    Issue Type: New Feature  (was: Bug)

> Implement Slot Pool
> -------------------
>
> Key: FLINK-4338
> URL: https://issues.apache.org/jira/browse/FLINK-4338
> Project: Flink
> Issue Type: New Feature
> Components: Cluster Management
> Affects Versions: 1.1.0
> Reporter: Stephan Ewen
> Fix For: 1.2.0
>
> Implement the slot pool as described in the FLIP-6 document:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=65147077
[jira] [Created] (FLINK-4339) Implement Slot Pool Core
Stephan Ewen created FLINK-4339:
-----------------------------------

Summary: Implement Slot Pool Core
Key: FLINK-4339
URL: https://issues.apache.org/jira/browse/FLINK-4339
Project: Flink
Issue Type: Sub-task
Components: Cluster Management
Affects Versions: 1.1.0
Reporter: Stephan Ewen
Fix For: 1.2.0

Implements the core slot structures and behavior of the {{SlotPool}}:
  - pool of available slots
  - request slots and response if slot is available in pool
  - return / deallocate slots
[jira] [Commented] (FLINK-4072) EventTimeWindowCheckpointingITCase.testSlidingTimeWindow fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1541#comment-1541 ] Robert Metzger commented on FLINK-4072: --- Once again: https://s3.amazonaws.com/archive.travis-ci.org/jobs/150862032/log.txt {code} Caused by: java.lang.AssertionError: Window start: -100 end: 900 expected:<404550> but was:<30870> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:642) at org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:578) at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:39) at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:176) at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at java.lang.Thread.run(Thread.java:745) {code} > EventTimeWindowCheckpointingITCase.testSlidingTimeWindow fails on Travis > > > Key: FLINK-4072 > URL: https://issues.apache.org/jira/browse/FLINK-4072 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.1.0 >Reporter: Till Rohrmann >Assignee: Aljoscha Krettek >Priority: Critical > Labels: test-stability > > The test case {{EventTimeWindowCheckpointingITCase.testSlidingTimeWindow}} > failed on Travis. > https://s3.amazonaws.com/archive.travis-ci.org/jobs/137498497/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (FLINK-4072) EventTimeWindowCheckpointingITCase.testSlidingTimeWindow fails on Travis
[ https://issues.apache.org/jira/browse/FLINK-4072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1541#comment-1541 ] Robert Metzger edited comment on FLINK-4072 at 8/9/16 10:19 AM: Once again: https://s3.amazonaws.com/archive.travis-ci.org/jobs/150862032/log.txt and https://s3.amazonaws.com/archive.travis-ci.org/jobs/150862038/log.txt {code} Caused by: java.lang.AssertionError: Window start: -100 end: 900 expected:<404550> but was:<30870> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:642) at org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:578) at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:39) at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:176) at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at java.lang.Thread.run(Thread.java:745) {code} was (Author: rmetzger): Once again: https://s3.amazonaws.com/archive.travis-ci.org/jobs/150862032/log.txt {code} Caused by: java.lang.AssertionError: Window start: -100 end: 900 expected:<404550> but was:<30870> at org.junit.Assert.fail(Assert.java:88) at org.junit.Assert.failNotEquals(Assert.java:743) at org.junit.Assert.assertEquals(Assert.java:118) at org.junit.Assert.assertEquals(Assert.java:555) at org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:642) at 
org.apache.flink.test.checkpointing.EventTimeWindowCheckpointingITCase$ValidatingSink.invoke(EventTimeWindowCheckpointingITCase.java:578) at org.apache.flink.streaming.api.operators.StreamSink.processElement(StreamSink.java:39) at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:176) at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:66) at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:266) at org.apache.flink.runtime.taskmanager.Task.run(Task.java:584) at java.lang.Thread.run(Thread.java:745) {code} > EventTimeWindowCheckpointingITCase.testSlidingTimeWindow fails on Travis > > > Key: FLINK-4072 > URL: https://issues.apache.org/jira/browse/FLINK-4072 > Project: Flink > Issue Type: Bug > Components: Tests >Affects Versions: 1.1.0 >Reporter: Till Rohrmann >Assignee: Aljoscha Krettek >Priority: Critical > Labels: test-stability > > The test case {{EventTimeWindowCheckpointingITCase.testSlidingTimeWindow}} > failed on Travis. > https://s3.amazonaws.com/archive.travis-ci.org/jobs/137498497/log.txt -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2323: [FLINK-2090] toString of CollectionInputFormat takes long...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2323 Looks good, merging this... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge
[ https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413355#comment-15413355 ] ASF GitHub Bot commented on FLINK-2090: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2323 Looks good, merging this... > toString of CollectionInputFormat takes long time when the collection is huge > - > > Key: FLINK-2090 > URL: https://issues.apache.org/jira/browse/FLINK-2090 > Project: Flink > Issue Type: Improvement >Reporter: Till Rohrmann >Assignee: Ivan Mushketyk >Priority: Minor > > The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on > its underlying {{Collection}}. Thus, {{toString}} is called for each element > of the collection. If the {{Collection}} contains many elements or the > individual {{toString}} calls for each element take a long time, then the > string generation can take a considerable amount of time. [~mikiobraun] > noticed that when he inserted several jBLAS matrices into Flink. > The {{toString}} method is mainly used for logging statements in > {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and > in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it > is necessary to print the complete content of the underlying {{Collection}} > or if it's not enough to print only the first 3 elements in the {{toString}} > method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
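The truncation suggested at the end of the ticket ("print only the first 3 elements") could look like the sketch below. The helper name and cutoff are illustrative, not Flink's actual {{CollectionInputFormat}} code:

```java
import java.util.Collection;
import java.util.Iterator;

// Sketch of the idea in FLINK-2090: cap toString at the first few
// elements of the backing Collection instead of rendering all of them.
public class TruncatedToString {
    static final int MAX_ELEMENTS = 3;

    public static String describe(Collection<?> data) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<?> it = data.iterator();
        for (int i = 0; i < MAX_ELEMENTS && it.hasNext(); i++) {
            if (i > 0) sb.append(", ");
            sb.append(it.next());
        }
        // Indicate that more elements exist without rendering them.
        if (it.hasNext()) sb.append(", ...");
        return sb.append(']').toString();
    }
}
```

This keeps the log statements in {{DataSourceNode}} and {{JobGraphGenerator}} cheap even for huge collections, since at most three element `toString` calls are made.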
[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge
[ https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413358#comment-15413358 ] ASF GitHub Bot commented on FLINK-2090: --- Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2323 Awesome! Thank you. > toString of CollectionInputFormat takes long time when the collection is huge > - > > Key: FLINK-2090 > URL: https://issues.apache.org/jira/browse/FLINK-2090 > Project: Flink > Issue Type: Improvement >Reporter: Till Rohrmann >Assignee: Ivan Mushketyk >Priority: Minor > > The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on > its underlying {{Collection}}. Thus, {{toString}} is called for each element > of the collection. If the {{Collection}} contains many elements or the > individual {{toString}} calls for each element take a long time, then the > string generation can take a considerable amount of time. [~mikiobraun] > noticed that when he inserted several jBLAS matrices into Flink. > The {{toString}} method is mainly used for logging statements in > {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and > in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it > is necessary to print the complete content of the underlying {{Collection}} > or if it's not enough to print only the first 3 elements in the {{toString}} > method. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2323: [FLINK-2090] toString of CollectionInputFormat takes long...
Github user mushketyk commented on the issue: https://github.com/apache/flink/pull/2323 Awesome! Thank you. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Resolved] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-4334. - Resolution: Fixed Fix Version/s: (was: 1.1.1) 1.1.2 Fixed for 1.1.2 via fc5b58d29486d9a7e0053a508274a46de97c73aa > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-4334. --- > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact
[ https://issues.apache.org/jira/browse/FLINK-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen reassigned FLINK-4337: --- Assignee: Stephan Ewen > Remove unnecessary Scala suffix from Hadoop1 artifact > - > > Key: FLINK-4337 > URL: https://issues.apache.org/jira/browse/FLINK-4337 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.1.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.2.0 > > > The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala > dependency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #2343: [FLINK-4337] [build] Remove unnecessary Scala Suff...
GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/2343 [FLINK-4337] [build] Remove unnecessary Scala Suffix from Hadoop 1 shaded artifact The shaded Hadoop 1 dependency was versioned with a Scala Version Suffix, even though it does not depend on Scala. This pull request removes that suffix and simplifies the POM files accordingly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink hadoop_1_no_suffix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2343.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2343 commit dc6defb8cc9c6a74725b374c3de155b879adb83f Author: Stephan Ewen Date: 2016-08-09T11:01:33Z [FLINK-4337] [build] Remove unnecessary Scala Suffix from Hadoop 1 shaded artifact --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink issue #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoop1 in q...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2341 I merged this into the 1.1 release branch. For the 1.2 releases, we should use this fix: #2343 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4337) Remove unnecessary Scala suffix from Hadoop1 artifact
[ https://issues.apache.org/jira/browse/FLINK-4337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413426#comment-15413426 ] ASF GitHub Bot commented on FLINK-4337: --- GitHub user StephanEwen opened a pull request: https://github.com/apache/flink/pull/2343 [FLINK-4337] [build] Remove unnecessary Scala Suffix from Hadoop 1 shaded artifact The shaded Hadoop 1 dependency was versioned with a Scala Version Suffix, even though it does not depend on Scala. This pull request removes that suffix and simplifies the POM files accordingly. You can merge this pull request into a Git repository by running: $ git pull https://github.com/StephanEwen/incubator-flink hadoop_1_no_suffix Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2343.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2343 commit dc6defb8cc9c6a74725b374c3de155b879adb83f Author: Stephan Ewen Date: 2016-08-09T11:01:33Z [FLINK-4337] [build] Remove unnecessary Scala Suffix from Hadoop 1 shaded artifact > Remove unnecessary Scala suffix from Hadoop1 artifact > - > > Key: FLINK-4337 > URL: https://issues.apache.org/jira/browse/FLINK-4337 > Project: Flink > Issue Type: Bug > Components: Build System >Affects Versions: 1.1.0 >Reporter: Stephan Ewen >Assignee: Stephan Ewen > Fix For: 1.2.0 > > > The hadoop1 artifacts have a "_2.10" Scala suffix, but have no Scala > dependency. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoop1 in q...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2341 @rehevkor5 Since the Apache bot does not automatically close this pull request, could you close it manually? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413427#comment-15413427 ] ASF GitHub Bot commented on FLINK-4334: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2341 I merged this into the 1.1 release branch. For the 1.2 releases, we should use this fix: #2343 > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413429#comment-15413429 ] ASF GitHub Bot commented on FLINK-4334: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2341 @rehevkor5 Since the Apache bot does not automatically close this pull request, could you close it manually? > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #2344: [hotfix][build] Remove Scala suffix from Hadoop1 s...
GitHub user rmetzger opened a pull request: https://github.com/apache/flink/pull/2344 [hotfix][build] Remove Scala suffix from Hadoop1 shading project It seems that the hadoop1 shaded artifact has a scala suffix, which is not needed anymore. You can merge this pull request into a Git repository by running: $ git pull https://github.com/rmetzger/flink remove_scala_suffix_hd1 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2344.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2344 commit feb494971563d2c61655c9c9aa1d670cbacfdaf9 Author: Robert Metzger Date: 2016-08-09T12:35:53Z [hotfix][build] Remove Scala suffix from Hadoop1 shading project --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners
[ https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413471#comment-15413471 ] Aljoscha Krettek commented on FLINK-4282: - In my opinion the offset parameter should just change how windows are aligned, i.e. whether hourly windows start on the hour or, say, at quarter past the hour. In concrete terms: without an offset we would have hourly windows 12:00-12:59, 13:00-13:59 and so on. With an offset of 15 minutes we would have windows 12:15-13:14, 13:15-14:14 and so on. Putting the offset into the window and putting elements in windows where their timestamp is not actually contained in the window range can be problematic, as happens in your first example with processing time = 5. > Add Offset Parameter to WindowAssigners > --- > > Key: FLINK-4282 > URL: https://issues.apache.org/jira/browse/FLINK-4282 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Aljoscha Krettek > > Currently, windows are always aligned to EPOCH, which basically means days > are aligned with GMT. This is somewhat problematic for people living in > different timezones. > An offset parameter would allow adapting the window assigner to the timezone. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
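The alignment-only semantics argued for in the comment above can be made concrete with a one-line formula: the offset shifts where window boundaries fall, but every element still lands in a window that actually contains its timestamp. This is an illustrative helper, not the assigner code itself:

```java
// Sketch of offset-aligned tumbling windows as discussed in FLINK-4282.
public class WindowAlignment {
    // Start of the window containing `timestamp`, for tumbling windows of
    // `windowSize` ms whose boundaries are shifted by `offset` ms.
    public static long windowStart(long timestamp, long offset, long windowSize) {
        // The extra `+ windowSize` keeps the modulo non-negative for
        // timestamps that fall before the first offset boundary.
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }
}
```

With a 60 ms window and a 15 ms offset, a timestamp of 20 falls into the window starting at 15, i.e. windows are ..., [-45, 15), [15, 75), ... and each element stays inside its own window.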
[GitHub] flink issue #2251: [FLINK-4212] [scripts] Lock PID file when starting daemon...
Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2251 Merging this ... --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4212) Lock PID file when starting daemons
[ https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413473#comment-15413473 ] ASF GitHub Bot commented on FLINK-4212: --- Github user greghogan commented on the issue: https://github.com/apache/flink/pull/2251 Merging this ... > Lock PID file when starting daemons > --- > > Key: FLINK-4212 > URL: https://issues.apache.org/jira/browse/FLINK-4212 > Project: Flink > Issue Type: Improvement > Components: Startup Shell Scripts >Affects Versions: 1.1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > > As noted on the mailing list (0), when multiple TaskManagers are started in > parallel (using pdsh) there is a race condition on updating the pid: 1) the > pid file is first read to parse the process' index, 2) the process is > started, and 3) on success the daemon pid is appended to the pid file. > We could use a tool such as {{flock}} to lock on the pid file while starting > the Flink daemon. > 0: > http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
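The race described in the ticket (read index, start process, append pid) disappears if the read-modify-append runs under an exclusive lock on the pid file. The shell scripts use {{flock}}; the same idea is sketched here in Java with an advisory `FileLock` (an illustration of the locking pattern, not the actual script fix):

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Sketch of the FLINK-4212 fix: serialize concurrent writers by taking
// an exclusive lock on the pid file for the duration of the append.
public class PidFileAppender {
    public static void appendPid(Path pidFile, long pid) throws IOException {
        try (FileChannel channel = FileChannel.open(pidFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
             FileLock lock = channel.lock()) {
            // While the lock is held, no other locking writer can interleave,
            // so each pid ends up on its own line.
            channel.write(ByteBuffer.wrap(
                    (pid + System.lineSeparator()).getBytes(StandardCharsets.UTF_8)));
        }
    }
}
```

`flock` (and `FileLock`) are advisory, so the guarantee holds only if every writer of the pid file takes the same lock, which is why the scripts wrap both the read and the append.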
[jira] [Created] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
Aljoscha Krettek created FLINK-4340: --- Summary: Remove RocksDB Semi-Async Checkpoint Mode Key: FLINK-4340 URL: https://issues.apache.org/jira/browse/FLINK-4340 Project: Flink Issue Type: Improvement Components: State Backends, Checkpointing Affects Versions: 1.1.0 Reporter: Aljoscha Krettek Assignee: Aljoscha Krettek This seems to be causing too many problems and is also incompatible with the upcoming key-group/sharding changes that will allow rescaling of keyed state. Once this is done we can also close FLINK-4228. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2002: Support for bz2 compression in flink-core
Github user mtanski commented on the issue: https://github.com/apache/flink/pull/2002 Now that 1.1 one is out, is it possible to get this in? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...
GitHub user aljoscha opened a pull request: https://github.com/apache/flink/pull/2345 [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing too many problems and is also incompatible with the upcoming key-group/sharding changes. R: @StephanEwen for review, should be fairly easy, though You can merge this pull request into a Git repository by running: $ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2345.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2345 commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd Author: Aljoscha Krettek Date: 2016-08-09T13:16:59Z [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing too many problems and is also incompatible with the upcoming key-group/sharding changes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413493#comment-15413493 ] ASF GitHub Bot commented on FLINK-4340: --- GitHub user aljoscha opened a pull request: https://github.com/apache/flink/pull/2345 [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. R: @StephanEwen for review, should be fairly easy, though You can merge this pull request into a Git repository by running: $ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2345.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2345 commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd Author: Aljoscha Krettek Date: 2016-08-09T13:16:59Z [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. > Remove RocksDB Semi-Async Checkpoint Mode > - > > Key: FLINK-4340 > URL: https://issues.apache.org/jira/browse/FLINK-4340 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.1.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek > > This seems to be causing to many problems and is also incompatible with the > upcoming key-group/sharding changes that will allow rescaling of keyed state. > Once this is done we can also close FLINK-4228. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2345 Technically, `HDFSCopyFromLocal` and `HDFSCopyToLocal` are now unused. Should we remove them? They might be useful for some stuff in the future. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...
GitHub user aljoscha reopened a pull request: https://github.com/apache/flink/pull/2345 [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. R: @StephanEwen for review, should be fairly easy, though You can merge this pull request into a Git repository by running: $ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2345.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2345 commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd Author: Aljoscha Krettek Date: 2016-08-09T13:16:59Z [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] flink pull request #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint ...
Github user aljoscha closed the pull request at: https://github.com/apache/flink/pull/2345 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413497#comment-15413497 ] ASF GitHub Bot commented on FLINK-4340: --- Github user aljoscha closed the pull request at: https://github.com/apache/flink/pull/2345 > Remove RocksDB Semi-Async Checkpoint Mode > - > > Key: FLINK-4340 > URL: https://issues.apache.org/jira/browse/FLINK-4340 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.1.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek > > This seems to be causing to many problems and is also incompatible with the > upcoming key-group/sharding changes that will allow rescaling of keyed state. > Once this is done we can also close FLINK-4228. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413496#comment-15413496 ] ASF GitHub Bot commented on FLINK-4340: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2345 Technically, `HDFSCopyFromLocal` and `HDFSCopyToLocal` are now unused. Should we remove them? They might be useful for some stuff in the future. > Remove RocksDB Semi-Async Checkpoint Mode > - > > Key: FLINK-4340 > URL: https://issues.apache.org/jira/browse/FLINK-4340 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.1.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek > > This seems to be causing to many problems and is also incompatible with the > upcoming key-group/sharding changes that will allow rescaling of keyed state. > Once this is done we can also close FLINK-4228. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413498#comment-15413498 ] ASF GitHub Bot commented on FLINK-4340: --- GitHub user aljoscha reopened a pull request: https://github.com/apache/flink/pull/2345 [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. R: @StephanEwen for review, should be fairly easy, though You can merge this pull request into a Git repository by running: $ git pull https://github.com/aljoscha/flink rocksdb/remove-semi-async Alternatively you can review and apply these changes as the patch at: https://github.com/apache/flink/pull/2345.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2345 commit 521f93c01bc1b99805d41eee5ab9f5279f52c4bd Author: Aljoscha Krettek Date: 2016-08-09T13:16:59Z [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode This was causing to many problems and is also incompatible with the upcoming key-group/sharding changes. > Remove RocksDB Semi-Async Checkpoint Mode > - > > Key: FLINK-4340 > URL: https://issues.apache.org/jira/browse/FLINK-4340 > Project: Flink > Issue Type: Improvement > Components: State Backends, Checkpointing >Affects Versions: 1.1.0 >Reporter: Aljoscha Krettek >Assignee: Aljoscha Krettek > > This seems to be causing to many problems and is also incompatible with the > upcoming key-group/sharding changes that will allow rescaling of keyed state. > Once this is done we can also close FLINK-4228. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] flink pull request #2251: [FLINK-4212] [scripts] Lock PID file when starting...
Github user greghogan closed the pull request at: https://github.com/apache/flink/pull/2251 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (FLINK-4212) Lock PID file when starting daemons
[ https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413516#comment-15413516 ] ASF GitHub Bot commented on FLINK-4212: --- Github user greghogan closed the pull request at: https://github.com/apache/flink/pull/2251 > Lock PID file when starting daemons > --- > > Key: FLINK-4212 > URL: https://issues.apache.org/jira/browse/FLINK-4212 > Project: Flink > Issue Type: Improvement > Components: Startup Shell Scripts >Affects Versions: 1.1.0 >Reporter: Greg Hogan >Assignee: Greg Hogan > > As noted on the mailing list (0), when multiple TaskManagers are started in > parallel (using pdsh) there is a race condition on updating the pid: 1) the > pid file is first read to parse the process' index, 2) the process is > started, and 3) on success the daemon pid is appended to the pid file. > We could use a tool such as {{flock}} to lock on the pid file while starting > the Flink daemon. > 0: > http://mail-archives.apache.org/mod_mbox/flink-user/201607.mbox/%3CCA%2BssbKXw954Bz_sBRwP6db0FntWyGWzTyP7wJZ5nhOeQnof3kg%40mail.gmail.com%3E
[jira] [Closed] (FLINK-4212) Lock PID file when starting daemons
[ https://issues.apache.org/jira/browse/FLINK-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Hogan closed FLINK-4212. - Resolution: Implemented Implemented in 46b427fac9cfceca7839fc93f06ba758101f4fee
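The read-modify-append race described in FLINK-4212 can be serialized with any advisory file lock. As an illustrative sketch only (the actual Flink shell scripts would use {{flock}}; the class and method names here are hypothetical), the same idea in Java with `java.nio` file locks:

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PidFileLock {
    // Append a daemon PID while holding an exclusive lock on the pid file,
    // so that starters launched in parallel (e.g. via pdsh) cannot interleave
    // their read/append cycles.
    static void appendPidLocked(Path pidFile, long pid) throws IOException {
        try (FileChannel ch = FileChannel.open(pidFile,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.APPEND);
             FileLock lock = ch.lock()) {
            ch.write(ByteBuffer.wrap((pid + "\n").getBytes()));
        }
    }
}
```

Whichever starter acquires the lock first appends its PID and releases; a concurrent starter blocks in `ch.lock()` instead of reading a half-updated file.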
[jira] [Resolved] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge
[ https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen resolved FLINK-2090. - Resolution: Fixed Fix Version/s: 1.2.0 Fixed via b5d58934d7124e0076e588e74485a60e7c1f484b Thank you for the contribution > toString of CollectionInputFormat takes long time when the collection is huge > - > > Key: FLINK-2090 > URL: https://issues.apache.org/jira/browse/FLINK-2090 > Project: Flink > Issue Type: Improvement >Reporter: Till Rohrmann >Assignee: Ivan Mushketyk >Priority: Minor > Fix For: 1.2.0 > > > The {{toString}} method of {{CollectionInputFormat}} calls {{toString}} on > its underlying {{Collection}}. Thus, {{toString}} is called for each element > of the collection. If the {{Collection}} contains many elements or the > individual {{toString}} calls for each element take a long time, then the > string generation can take a considerable amount of time. [~mikiobraun] > noticed that when he inserted several jBLAS matrices into Flink. > The {{toString}} method is mainly used for logging statements in > {{DataSourceNode}}'s {{computeOperatorSpecificDefaultEstimates}} method and > in {{JobGraphGenerator.getDescriptionForUserCode}}. I'm wondering whether it > is necessary to print the complete content of the underlying {{Collection}} > or if it's not enough to print only the first 3 elements in the {{toString}} > method.
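The fix the issue suggests — printing only a prefix of the collection — can be sketched as follows. The method name and the cutoff are illustrative, not necessarily what the merged patch uses:

```java
import java.util.Collection;
import java.util.Iterator;

public class TruncatedToString {
    // Render at most `limit` elements and elide the rest, so that logging a
    // huge collection no longer costs a toString() call per element.
    static String toTruncatedString(Collection<?> collection, int limit) {
        StringBuilder sb = new StringBuilder("[");
        Iterator<?> it = collection.iterator();
        for (int i = 0; i < limit && it.hasNext(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(it.next());
        }
        if (it.hasNext()) {
            sb.append(", ..."); // mark the elided tail
        }
        return sb.append("]").toString();
    }
}
```

For a five-element list with a limit of 3 this renders `[1, 2, 3, ...]` in constant work per log statement, regardless of collection size.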
[GitHub] flink pull request #2323: [FLINK-2090] toString of CollectionInputFormat tak...
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2323
[GitHub] flink issue #2231: [FLINK-4035] Bump Kafka producer in Kafka sink to Kafka 0...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2231 Sorry for joining this discussion late. I've been on vacation. I also stumbled across the code duplicates. I'll check out the code from this pull request and see if there's a good way of re-using most of the 0.9 connector code.
[jira] [Commented] (FLINK-4035) Bump Kafka producer in Kafka sink to Kafka 0.10.0.0
[ https://issues.apache.org/jira/browse/FLINK-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413518#comment-15413518 ] ASF GitHub Bot commented on FLINK-4035: --- Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2231 Sorry for joining this discussion late. I've been on vacation. I also stumbled across the code duplicates. I'll check out the code from this pull request and see if there's a good way of re-using most of the 0.9 connector code. > Bump Kafka producer in Kafka sink to Kafka 0.10.0.0 > --- > > Key: FLINK-4035 > URL: https://issues.apache.org/jira/browse/FLINK-4035 > Project: Flink > Issue Type: Bug > Components: Kafka Connector >Affects Versions: 1.0.3 >Reporter: Elias Levy >Priority: Minor > > Kafka 0.10.0.0 introduced protocol changes related to the producer. > Published messages now include timestamps and compressed messages now include > relative offsets. As it is now, brokers must decompress publisher compressed > messages, assign offset to them, and recompress them, which is wasteful and > makes it less likely that compression will be used at all.
[jira] [Closed] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge
[ https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephan Ewen closed FLINK-2090. ---
[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners
[ https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413519#comment-15413519 ] Aditi Viswanathan commented on FLINK-4282: -- Also, when we want to assign time zones, System.currentTimeMillis() will always give the epoch time, irrespective of time zones. So we'd have to use the epoch time, correct the window size and triggering time, and assign the element to the specified time zone. This is the same case as the scenario I mentioned with the 5 seconds, because the processing time will be in UTC and won't directly fall into the other time zone buckets. That's why I've modified the triggering so that, whatever the offset, it will still trigger correctly after the specified window size. > Add Offset Parameter to WindowAssigners > --- > > Key: FLINK-4282 > URL: https://issues.apache.org/jira/browse/FLINK-4282 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Aljoscha Krettek > > Currently, windows are always aligned to EPOCH, which basically means days > are aligned with GMT. This is somewhat problematic for people living in > different timezones. > An offset parameter would allow to adapt the window assigner to the timezone.
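A window-assigner offset of the kind discussed in FLINK-4282 usually boils down to shifting the alignment term in the start-of-window computation. A hedged sketch of that arithmetic (illustrative only, not the code from the eventual patch):

```java
public class WindowOffset {
    // Start of the window containing `timestamp`, with tumbling windows of
    // `windowSize` shifted by `offset` from the epoch (all in milliseconds).
    // With offset == 0 this degenerates to plain epoch/UTC alignment.
    static long windowStart(long timestamp, long offset, long windowSize) {
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }
}
```

For example, with 5 ms windows and no offset, timestamp 12 falls into the window starting at 10; with an offset of 1, the windows become [1, 6), [6, 11), [11, 16), so the same timestamp maps to a start of 11 — which is how an offset can realign "days" to a non-GMT timezone.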
[jira] [Commented] (FLINK-2090) toString of CollectionInputFormat takes long time when the collection is huge
[ https://issues.apache.org/jira/browse/FLINK-2090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413521#comment-15413521 ] ASF GitHub Bot commented on FLINK-2090: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2323
[GitHub] flink issue #2002: Support for bz2 compression in flink-core
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2002 I think this looks good. +1 to merge. As a followup, we can upgrade the `commons-compression` version via dependency management (if it is backwards compatible, which apache commons libs usually are).
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 @gyfora You mean if the "full async" is slower than the "semi async"?
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413544#comment-15413544 ] ASF GitHub Bot commented on FLINK-4340: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 The "full async" takes more time but runs completely in the background, so performs better in most cases than "semi async".
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 The "full async" takes more time but runs completely in the background, so performs better in most cases than "semi async".
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413537#comment-15413537 ] ASF GitHub Bot commented on FLINK-4340: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Hi, isn't this way of checkpointing much slower than the semi-async version?
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413542#comment-15413542 ] ASF GitHub Bot commented on FLINK-4340: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 @gyfora You mean if the "full async" is slower than the "semi async"?
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Some of the benefits we lose on restore. Especially for very large states this can be pretty serious. Maybe this is required for the sharding to some extent but I don't see this as completely straightforward, in terms of which one is better in production.
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413547#comment-15413547 ] ASF GitHub Bot commented on FLINK-4340: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Some of the benefits we lose on restore. Especially for very large states this can be pretty serious. Maybe this is required for the sharding to some extent but I don't see this as completely straightforward, in terms of which one is better in production.
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 But you are right it is probably more important to keep the latency down for the running programs, and for that the fully async seems to be strictly better
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413548#comment-15413548 ] ASF GitHub Bot commented on FLINK-4340: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 But you are right it is probably more important to keep the latency down for the running programs, and for that the fully async seems to be strictly better
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 +1 for this There seems to be an issue with the RocksDB backup engine, so we should probably discourage that mode even in current releases. I would also remove the `HDFSCopyFromLocal` and `HDFSCopyToLocal` utils. We should not have dead code in the repository and we can always re-add them from the git history if needed.
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413536#comment-15413536 ] ASF GitHub Bot commented on FLINK-4340: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2345 +1 for this There seems to be an issue with the RocksDB backup engine, so we should probably discourage that mode even in current releases. I would also remove the `HDFSCopyFromLocal` and `HDFSCopyToLocal` utils. We should not have dead code in the repository and we can always re-add them from the git history if needed.
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413563#comment-15413563 ] ASF GitHub Bot commented on FLINK-4340: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Good thing about the way fully async checkpoints are restored though is that it is very trivial to insert some state adaptor code :)
[jira] [Commented] (FLINK-4326) Flink start-up scripts should optionally start services on the foreground
[ https://issues.apache.org/jira/browse/FLINK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413551#comment-15413551 ] Greg Hogan commented on FLINK-4326: --- Using {{exec}} solves the problem of not knowing the PID until the daemon has launched but doesn't allow for removing the PID after termination. What level of monitoring is performed by the supervisor? Is this simply a "is this process still alive" or more complicated like tracking cpu and memory usage? > Flink start-up scripts should optionally start services on the foreground > - > > Key: FLINK-4326 > URL: https://issues.apache.org/jira/browse/FLINK-4326 > Project: Flink > Issue Type: Improvement > Components: Startup Shell Scripts >Affects Versions: 1.0.3 >Reporter: Elias Levy > > This has previously been mentioned in the mailing list, but has not been > addressed. Flink start-up scripts start the job and task managers in the > background. This makes it difficult to integrate Flink with most process > supervisory tools and init systems, including Docker. One can get around > this via hacking the scripts or manually starting the right classes via Java, > but it is a brittle solution. > In addition to starting the daemons in the foreground, the start-up scripts > should use exec instead of running the commands, so as to avoid forks. Many > supervisory tools assume the PID of the process to be monitored is that of > the process it first executes, and fork chains make it difficult for the > supervisor to figure out what process to monitor. Specifically, > jobmanager.sh and taskmanager.sh should exec flink-daemon.sh, and > flink-daemon.sh should exec java.
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2345 Jip, that's also good. 😃
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413575#comment-15413575 ] ASF GitHub Bot commented on FLINK-4340: --- Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2345 Jip, that's also good. 😃
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Good thing about the way fully async checkpoints are restored though is that it is very trivial to insert some state adaptor code :)
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 But I wonder what would happen in a scenario with a lot of state. Semi-async: short local copy time at every snapshot + very fast restore. Fully async: no copy time + very slow restore (puts, sorting data, recreating the index, etc.)
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413579#comment-15413579 ] ASF GitHub Bot commented on FLINK-4340: --- Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 But I wonder what would happen in a scenario with a lot of state. Semi-async: short local copy time at every snapshot + very fast restore. Fully async: no copy time + very slow restore (puts, sorting data, recreating the index, etc.)
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user gyfora commented on the issue: https://github.com/apache/flink/pull/2345 Hi, isn't this way of checkpointing much slower than the semi-async version?
[GitHub] flink issue #2345: [FLINK-4340] Remove RocksDB Semi-Async Checkpoint Mode
Github user wenlong88 commented on the issue: https://github.com/apache/flink/pull/2345 @aljoscha we use the RocksDB checkpoint mechanism to do the semi-async checkpoint, which uses hard links to make a checkpoint and costs very little I/O and time in the synchronous phase. This works well even when the state is large, and using the checkpoint dir to restore is also very fast, since no extra I/O is needed if the task manager didn't change.
[jira] [Commented] (FLINK-4340) Remove RocksDB Semi-Async Checkpoint Mode
[ https://issues.apache.org/jira/browse/FLINK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413596#comment-15413596 ] ASF GitHub Bot commented on FLINK-4340: --- Github user wenlong88 commented on the issue: https://github.com/apache/flink/pull/2345 @aljoscha we use the RocksDB checkpoint mechanism to do the semi-async checkpoint, which uses hard links to make a checkpoint and costs very little I/O and time in the synchronous phase. This works well even when the state is large, and using the checkpoint dir to restore is also very fast, since no extra I/O is needed if the task manager didn't change.
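The hard-link mechanism wenlong88 refers to can be illustrated with plain `java.nio`: because RocksDB's SST files are immutable, a checkpoint can be a directory of hard links, so the synchronous phase copies directory entries rather than data. This sketch conveys only the idea — it is not RocksDB's actual `Checkpoint` implementation, and the class name is hypothetical:

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class HardLinkCheckpoint {
    // "Checkpoint" a directory of immutable files by hard-linking each file
    // into a checkpoint directory: no payload bytes are copied, which keeps
    // the synchronous snapshot phase cheap even for large state.
    static void checkpoint(Path dataDir, Path checkpointDir) throws IOException {
        Files.createDirectories(checkpointDir);
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dataDir)) {
            for (Path file : files) {
                Files.createLink(checkpointDir.resolve(file.getFileName()), file);
            }
        }
    }
}
```

Restoring from such a checkpoint on the same machine is equally cheap, which matches the "very fast restore, since no extra I/O" point above; the trade-off debated in this thread is what happens when the restore must move to another task manager.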
[GitHub] flink pull request #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoo...
Github user rehevkor5 closed the pull request at: https://github.com/apache/flink/pull/2341
[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413638#comment-15413638 ] ASF GitHub Bot commented on FLINK-4334: --- Github user rehevkor5 closed the pull request at: https://github.com/apache/flink/pull/2341 > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has had artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered.
[GitHub] flink issue #2341: [FLINK-4334] [quickstarts] Correctly exclude hadoop1 in q...
Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/2341 You got it @StephanEwen
[jira] [Commented] (FLINK-4334) Shaded Hadoop1 jar not fully excluded in Quickstart
[ https://issues.apache.org/jira/browse/FLINK-4334?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413637#comment-15413637 ] ASF GitHub Bot commented on FLINK-4334: --- Github user rehevkor5 commented on the issue: https://github.com/apache/flink/pull/2341 You got it @StephanEwen > Shaded Hadoop1 jar not fully excluded in Quickstart > --- > > Key: FLINK-4334 > URL: https://issues.apache.org/jira/browse/FLINK-4334 > Project: Flink > Issue Type: Bug > Components: Quickstarts >Affects Versions: 1.1.0, 1.0.1, 1.0.2, 1.0.3 >Reporter: Shannon Carey > Fix For: 1.1.2 > > > The Shaded Hadoop1 jar has had artifactId flink-shaded-hadoop1_2.10 since Flink > 1.0.0 (see > https://github.com/apache/flink/commit/2c4e4d1ffaf4107fb802c90858184fc10af66837), > but the quickstart POMs both refer to it as flink-shaded-hadoop1. > If using "-Pbuild-jar", the problem is not encountered.
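Concretely, the bug amounts to an exclusion that names the pre-1.0.0 (unsuffixed) artifactId. A hedged sketch of what the corrected quickstart POM entry would look like (the exact surrounding dependency/plugin context may differ from this fragment):

```xml
<!-- Illustrative fragment only: the exclusion must use the
     Scala-suffixed artifactId the jar has had since Flink 1.0.0,
     not the old unsuffixed name "flink-shaded-hadoop1". -->
<exclusion>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-shaded-hadoop1_2.10</artifactId>
</exclusion>
```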
[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...
Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2305 I just thought of a different trick: We could add a second variant of the `apply(..)` function (for example called `with(...)` as in the DataSet API) and have the proper return type there (calling apply() and casting). We can then immediately deprecate the `with()` function to indicate that it is a temporary workaround and is to be replaced by `apply(...)` in Flink 2.0.
[GitHub] flink issue #2344: [hotfix][build] Remove Scala suffix from Hadoop1 shading ...
Github user uce commented on the issue: https://github.com/apache/flink/pull/2344 Duplicate of #2343?
[jira] [Commented] (FLINK-4271) There is no way to set parallelism of operators produced by CoGroupedStreams
[ https://issues.apache.org/jira/browse/FLINK-4271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413641#comment-15413641 ] ASF GitHub Bot commented on FLINK-4271: --- Github user StephanEwen commented on the issue: https://github.com/apache/flink/pull/2305 I just thought of a different trick: We could add a second variant of the `apply(..)` function (for example called `with(...)` as in the DataSet API) and have the proper return type there (calling apply() and casting). We can then immediately deprecate the `with()` function to indicate that it is a temporary workaround and is to be replaced by `apply(...)` in Flink 2.0. > There is no way to set parallelism of operators produced by CoGroupedStreams > > > Key: FLINK-4271 > URL: https://issues.apache.org/jira/browse/FLINK-4271 > Project: Flink > Issue Type: Bug > Components: DataStream API >Reporter: Wenlong Lyu >Assignee: Jark Wu > > Currently, CoGroupedStreams packages the map/keyBy/window operators behind a > human-friendly interface, like: > dataStreamA.cogroup(streamB).where(...).equalsTo().window().apply(). Both the > intermediate operators and the final window operators cannot be accessed by > users, and we cannot set attributes of the operators, which makes co-group > hard to use in a production environment.
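The workaround Stephan describes can be sketched in plain Java. These are illustrative stand-in types, not the actual Flink classes: `apply(...)` must keep its too-general declared return type for binary compatibility, so a second method `with(...)` does the same work, casts to the specific type that exposes `setParallelism()`, and is deprecated from day one to signal it is temporary.

```java
// Illustrative sketch of the "deprecated with()" pattern: apply() keeps the
// general return type; with() delegates to apply() and casts to the concrete
// type so callers can chain setParallelism().
class DataStream<T> {
    DataStream<T> apply(String fn) {
        // in reality this builds the windowed co-group operator;
        // the runtime type is the concrete subclass
        return new SingleOutputStreamOperator<>();
    }

    /** Temporary workaround; to be replaced by apply() in a later major version. */
    @Deprecated
    @SuppressWarnings("unchecked")
    SingleOutputStreamOperator<T> with(String fn) {
        return (SingleOutputStreamOperator<T>) apply(fn);
    }
}

class SingleOutputStreamOperator<T> extends DataStream<T> {
    int parallelism = -1;

    SingleOutputStreamOperator<T> setParallelism(int p) {
        parallelism = p;
        return this;
    }
}
```

The cast is safe only because `apply()` always returns the concrete subclass at runtime; that invariant is what makes the trick viable until the declared return type can change in a breaking release.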
[GitHub] flink issue #2344: [hotfix][build] Remove Scala suffix from Hadoop1 shading ...
Github user rmetzger commented on the issue: https://github.com/apache/flink/pull/2344 True. Thank you, I'll close this one.
[GitHub] flink pull request #2344: [hotfix][build] Remove Scala suffix from Hadoop1 s...
Github user rmetzger closed the pull request at: https://github.com/apache/flink/pull/2344
[jira] [Commented] (FLINK-4282) Add Offset Parameter to WindowAssigners
[ https://issues.apache.org/jira/browse/FLINK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413645#comment-15413645 ] Aljoscha Krettek commented on FLINK-4282: - The {{WindowAssigner}} should not try to shift the timestamp around or do anything fancy. Specifying a timezone should just be another way of specifying where hours start with respect to UTC. (I brought this up just because there are some "exotic" time zones that are not shifted by exact hours compared to UTC.) > Add Offset Parameter to WindowAssigners > --- > > Key: FLINK-4282 > URL: https://issues.apache.org/jira/browse/FLINK-4282 > Project: Flink > Issue Type: Improvement > Components: Streaming >Reporter: Aljoscha Krettek > > Currently, windows are always aligned to EPOCH, which basically means days > are aligned with GMT. This is somewhat problematic for people living in > different timezones. > An offset parameter would allow adapting the window assigner to the timezone.
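Offset-aware window alignment reduces to simple arithmetic, which also covers the "exotic" zones Aljoscha mentions (e.g. UTC+5:45) since the offset need not be a whole number of hours. The following is an illustrative sketch, not the Flink implementation; the formula is the usual epoch-alignment computation with the offset shifting the boundary:

```java
// Illustrative: compute the start of the window containing `timestamp`,
// where window boundaries are aligned to `offset` (ms from epoch) instead
// of to the epoch itself. With offset == 0 this is plain epoch alignment.
public class WindowOffset {
    static long windowStart(long timestamp, long offset, long windowSize) {
        // subtract the distance from `timestamp` back to the previous
        // offset-aligned boundary; the `+ windowSize` keeps the modulo
        // non-negative for timestamps just after the offset
        return timestamp - (timestamp - offset + windowSize) % windowSize;
    }

    public static void main(String[] args) {
        long hour = 3_600_000L, day = 24 * hour;
        long ts = 3 * day + 7 * hour; // 07:00 UTC on day 3
        // epoch-aligned daily window starts at 00:00 UTC
        System.out.println(windowStart(ts, 0, day));
        // UTC+5:45 (Nepal): daily window boundary shifts to 05:45 UTC
        System.out.println(windowStart(ts, 5 * hour + 45 * 60_000L, day));
    }
}
```

No timestamp is ever shifted; only the placement of the window boundaries changes, which matches the comment's point that a timezone is just another way to specify the offset.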
[GitHub] flink pull request #2051: [FLINK-3779] Add support for queryable state
Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2051
[jira] [Commented] (FLINK-3779) Add support for queryable state
[ https://issues.apache.org/jira/browse/FLINK-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413647#comment-15413647 ] ASF GitHub Bot commented on FLINK-3779: --- Github user asfgit closed the pull request at: https://github.com/apache/flink/pull/2051 > Add support for queryable state > --- > > Key: FLINK-3779 > URL: https://issues.apache.org/jira/browse/FLINK-3779 > Project: Flink > Issue Type: Improvement > Components: Distributed Coordination >Reporter: Ufuk Celebi >Assignee: Ufuk Celebi > > Flink offers state abstractions for user functions in order to guarantee > fault-tolerant processing of streams. Users can work with both > non-partitioned (Checkpointed interface) and partitioned state > (getRuntimeContext().getState(ValueStateDescriptor) and other variants). > The partitioned state interface provides access to different types of state > that are all scoped to the key of the current input element. This type of > state can only be used on a KeyedStream, which is created via stream.keyBy(). > Currently, all of this state is internal to Flink and used in order to > provide processing guarantees in failure cases (e.g. exactly-once processing). > The goal of Queryable State is to expose this state outside of Flink by > supporting queries against the partitioned key value state. > This will help to eliminate the need for distributed operations/transactions > with external systems such as key-value stores, which are often the bottleneck > in practice. Exposing the local state to the outside moves a good part of the > database work into the stream processor, allowing both high-throughput > queries and immediate access to the computed state. > This is the initial design doc for the feature: > https://docs.google.com/document/d/1NkQuhIKYmcprIU5Vjp04db1HgmYSsZtCMxgDi_iTN-g. > Feel free to comment.
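The core idea in the issue description, keeping partitioned state inside the stream processor and letting external readers query it directly instead of mirroring it into a key-value store, can be shown with a deliberately tiny, self-contained Java sketch. All names here are illustrative; this is not the Flink queryable-state API.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model of queryable state: the processing path updates per-key state
// (state scoped to the key of the current element, as on a KeyedStream),
// and a separate query path reads that state without any external store.
public class QueryableStateSketch {
    private final Map<String, Long> keyedCounts = new ConcurrentHashMap<>();

    // "processing" side: update the state for the current element's key
    void process(String key) {
        keyedCounts.merge(key, 1L, Long::sum);
    }

    // "query" side: external readers see the current value for a key
    long query(String key) {
        return keyedCounts.getOrDefault(key, 0L);
    }
}
```

The real feature must additionally route a query to whichever task manager holds the key's partition; this sketch only illustrates why exposing local state removes the round trip to an external key-value store.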
[GitHub] flink issue #2305: [FLINK-4271] [DataStreamAPI] Enable CoGroupedStreams and ...
Github user aljoscha commented on the issue: https://github.com/apache/flink/pull/2305 Wouldn't this suffer from the same problem as the "casting solution"? People would use `apply` and then wonder why there is no `setParallelism`, not bothering to read the Javadoc to find out that there is also `with`.