The RC artifacts are only deployed to the Maven Central Repository when the RC is promoted to a release. As written in the 1.8.0 RC1 voting email [1], you can find the Maven artifacts and the Flink binaries here:

- https://repository.apache.org/content/repositories/orgapacheflink-1210/
- https://dist.apache.org/repos/dist/dev/flink/flink-1.8.0-rc1/

Alternatively, you can apply the patch yourself and build Flink 1.7 from sources [2]. On my machine this takes around 10 minutes if tests are skipped.

Best,
Gary

[1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-0-release-candidate-1-td27637.html
[2] https://ci.apache.org/projects/flink/flink-docs-release-1.7/flinkDev/building.html#build-flink
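For anyone taking the build-from-source route, the commands boil down to checking out the release branch and running Maven with tests skipped. A minimal sketch; the branch name is an assumption, and the flags are those from the building guide in [2]:

    git clone https://github.com/apache/flink.git
    cd flink
    git checkout release-1.7   # assumed release branch for a 1.7 build
    # Skipping tests keeps the build in the ~10 minute range mentioned above.
    mvn clean install -DskipTests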
On Tue, Mar 12, 2019 at 4:01 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:

> Do you have a Maven repository (at Maven Central) set up for the 1.8 release candidate? We could test it for you.
>
> Without 1.8, and with this exit code, we are essentially held up.
>
> On Tue, Mar 12, 2019 at 10:56 AM Gary Yao <g...@ververica.com> wrote:
>
>> Nobody can tell with 100% certainty. We want to give the RC some exposure first, and there is also a release process that is prescribed by the ASF [1]. You can look at past releases to get a feeling for how long the release process lasts [2].
>>
>> [1] http://www.apache.org/legal/release-policy.html#release-approval
>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/template/NamlServlet.jtp?macro=search_page&node=1&query=%5BVOTE%5D+Release&days=0
>>
>> On Tue, Mar 12, 2019 at 3:38 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>
>>> And when is the 1.8.0 release expected?
>>>
>>> On Tue, Mar 12, 2019 at 10:32 AM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>
>>>> :) That makes so much more sense. Is k8s-native Flink a part of this release?
>>>>
>>>> On Tue, Mar 12, 2019 at 10:27 AM Gary Yao <g...@ververica.com> wrote:
>>>>
>>>>> Hi Vishal,
>>>>>
>>>>> This issue was fixed recently [1], and the patch will be released with 1.8. If the Flink job gets cancelled, the JVM should exit with code 0. There is a release candidate [2], which you can test.
>>>>>
>>>>> Best,
>>>>> Gary
>>>>>
>>>>> [1] https://issues.apache.org/jira/browse/FLINK-10743
>>>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/VOTE-Release-1-8-0-release-candidate-1-td27637.html
>>>>>
>>>>> On Tue, Mar 12, 2019 at 3:21 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>>
>>>>>> Thanks Vijay,
>>>>>>
>>>>>> This is the larger issue. The cancellation routine is itself broken.
>>>>>>
>>>>>> On cancellation, Flink does remove the checkpoint counter:
>>>>>>
>>>>>> 2019-03-12 14:12:13,143 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Removing /checkpoint-counter/00000000000000000000000000000000 from ZooKeeper
>>>>>>
>>>>>> but exits with a non-zero code:
>>>>>>
>>>>>> 2019-03-12 14:12:13,477 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Terminating cluster entrypoint process StandaloneJobClusterEntryPoint with exit code 1444.
>>>>>>
>>>>>> That, I think, is an issue. A cancelled job is a complete job, and thus the exit code should be 0 for k8s to mark it complete.
>>>>>>
>>>>>> On Tue, Mar 12, 2019 at 10:18 AM Vijay Bhaskar <bhaskar.eba...@gmail.com> wrote:
>>>>>>
>>>>>>> Yes Vishal. That's correct.
>>>>>>>
>>>>>>> Regards
>>>>>>> Bhaskar
>>>>>>>
>>>>>>> On Tue, Mar 12, 2019 at 7:14 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>>>>
>>>>>>>> This is really not cool, but here you go. This seems to work. Agreed that this cannot be this painful. The cancel does not exit with an exit code of 0, and thus the job has to be deleted manually. Vijay, does this align with what you have had to do?
>>>>>>>>
>>>>>>>> - Take a save point. This returns a request id:
>>>>>>>>
>>>>>>>> curl --header "Content-Type: application/json" --request POST --data '{"target-directory":"hdfs://nn-crunchy:8020/tmp/xyz14","cancel-job":false}' https://*************/jobs/00000000000000000000000000000000/savepoints
>>>>>>>>
>>>>>>>> - Make sure the save point succeeded:
>>>>>>>>
>>>>>>>> curl --request GET https://****************/jobs/00000000000000000000000000000000/savepoints/2c053ce3bea31276aa25e63784629687
>>>>>>>>
>>>>>>>> - Cancel the job:
>>>>>>>>
>>>>>>>> curl --request PATCH https://***************/jobs/00000000000000000000000000000000?mode=cancel
>>>>>>>>
>>>>>>>> - Delete the job and deployment:
>>>>>>>>
>>>>>>>> kubectl delete -f manifests/bf2-PRODUCTION/job-cluster-job-deployment.yaml
>>>>>>>> kubectl delete -f manifests/bf2-PRODUCTION/task-manager-deployment.yaml
>>>>>>>>
>>>>>>>> - Edit job-cluster-job-deployment.yaml; add/edit:
>>>>>>>>
>>>>>>>> args: ["job-cluster", "--fromSavepoint", "hdfs://************/tmp/xyz14/savepoint-000000-1d4f71345e22", "--job-classname", .........
>>>>>>>>
>>>>>>>> - Restart:
>>>>>>>>
>>>>>>>> kubectl create -f manifests/bf2-PRODUCTION/job-cluster-job-deployment.yaml
>>>>>>>> kubectl create -f manifests/bf2-PRODUCTION/task-manager-deployment.yaml
>>>>>>>>
>>>>>>>> - Make sure, from the UI, that it restored from the specific save point.
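The save point trigger and status check in the first two steps above can be scripted against the documented 1.7 REST endpoints. A minimal sketch, assuming jq is available; the host and target directory are placeholders:

    FLINK_HOST="https://flink.example.com"   # placeholder REST endpoint
    JOB_ID="00000000000000000000000000000000"

    # Trigger a savepoint without cancelling; the response carries a request-id.
    REQUEST_ID=$(curl -s -H "Content-Type: application/json" -X POST \
      -d '{"target-directory":"hdfs://nn-crunchy:8020/tmp/xyz14","cancel-job":false}' \
      "$FLINK_HOST/jobs/$JOB_ID/savepoints" | jq -r '."request-id"')

    # Poll the trigger until it is no longer IN_PROGRESS.
    while [ "$(curl -s "$FLINK_HOST/jobs/$JOB_ID/savepoints/$REQUEST_ID" | jq -r '.status.id')" = "IN_PROGRESS" ]; do
      sleep 2
    done

    # On success, the savepoint path is reported under operation.location.
    curl -s "$FLINK_HOST/jobs/$JOB_ID/savepoints/$REQUEST_ID" | jq -r '.operation.location'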
>>>>>>>>
>>>>>>>> On Tue, Mar 12, 2019 at 7:26 AM Vijay Bhaskar <bhaskar.eba...@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yes, it's supposed to work, but unfortunately it was not working. The Flink community needs to respond to this behavior.
>>>>>>>>>
>>>>>>>>> Regards
>>>>>>>>> Bhaskar
>>>>>>>>>
>>>>>>>>> On Tue, Mar 12, 2019 at 3:45 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Aah. Let me try this out and I will get back to you. Though I would assume that save point with cancel is a single atomic step, rather than a save point *followed* by a cancellation (else why would that be an option). Thanks again.
>>>>>>>>>>
>>>>>>>>>> On Tue, Mar 12, 2019 at 4:50 AM Vijay Bhaskar <bhaskar.eba...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Vishal,
>>>>>>>>>>>
>>>>>>>>>>> yarn-cancel isn't meant only for YARN clusters; it works for all clusters. It's the recommended command.
>>>>>>>>>>>
>>>>>>>>>>> Use the following command to issue a save point:
>>>>>>>>>>>
>>>>>>>>>>> curl --header "Content-Type: application/json" --request POST --data '{"target-directory":"hdfs://*********:8020/tmp/xyz1","cancel-job":false}' https://************.ingress.*******/jobs/00000000000000000000000000000000/savepoints
>>>>>>>>>>>
>>>>>>>>>>> Then issue yarn-cancel. After that, follow the process to restore the save point.
>>>>>>>>>>>
>>>>>>>>>>> Regards
>>>>>>>>>>> Bhaskar
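As far as I recall, the yarn-cancel call referred to here is a plain GET against the job, kept as a compatibility endpoint for clients that cannot issue PATCH or DELETE. A sketch under that assumption; the host is a placeholder, and the endpoint should be verified against your Flink version's REST API reference:

    # Presumed legacy-compatibility endpoint (plain GET):
    curl --request GET "https://flink.example.com/jobs/00000000000000000000000000000000/yarn-cancel"

    # The documented equivalent on Flink 1.5+ is the PATCH shown earlier in the thread:
    curl --request PATCH "https://flink.example.com/jobs/00000000000000000000000000000000?mode=cancel"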
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Mar 12, 2019 at 2:11 PM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hello Vijay,
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you for the reply. This, though, is a k8s deployment (rather than YARN), but maybe they follow the same lifecycle. I issue a save point with cancel as documented here: https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/rest_api.html#jobs-jobid-savepoints, a straight-up
>>>>>>>>>>>>
>>>>>>>>>>>> curl --header "Content-Type: application/json" --request POST --data '{"target-directory":"hdfs://*********:8020/tmp/xyz1","cancel-job":true}' https://************.ingress.*******/jobs/00000000000000000000000000000000/savepoints
>>>>>>>>>>>>
>>>>>>>>>>>> I would assume that after taking the save point the JVM should exit; after all, the k8s deployment is of kind: Job, and if it is a job cluster then a cancellation should exit the JVM and hence the pod. It does seem to do some things right. It stops a bunch of stuff (the JobMaster, the SlotPool, the ZooKeeper coordinator, etc.). It also removes the checkpoint counter, but it does not exit the job. And after a little bit the job is restarted, which does not make sense and is absolutely not the right thing to do (to me at least).
>>>>>>>>>>>>
>>>>>>>>>>>> Further, if I delete the deployment and the job from k8s and restart the job and deployment fromSavePoint, it refuses to honor the fromSavePoint. I have to delete the ZK chroot for it to consider the save point.
>>>>>>>>>>>>
>>>>>>>>>>>> Thus the process of cancelling and resuming from a SP on a k8s job cluster deployment seems to be:
>>>>>>>>>>>>
>>>>>>>>>>>> - cancel with save point as defined here: https://ci.apache.org/projects/flink/flink-docs-release-1.7/monitoring/rest_api.html#jobs-jobid-savepoints
>>>>>>>>>>>> - delete the job manager job and task manager deployments from k8s almost immediately
>>>>>>>>>>>> - clear the ZK chroot for the 0000000...... job, and maybe the checkpoints directory
>>>>>>>>>>>> - resumeFromCheckPoint
>>>>>>>>>>>>
>>>>>>>>>>>> Can somebody confirm that this is indeed the process?
>>>>>>>>>>>>
>>>>>>>>>>>> Logs are attached.
>>>>>>>>>>>>
>>>>>>>>>>>> 2019-03-12 08:10:43,871 INFO org.apache.flink.runtime.jobmaster.JobMaster - Savepoint stored in hdfs://*********:8020/tmp/xyz3/savepoint-000000-6d5bdc9b53ae. Now cancelling 00000000000000000000000000000000.
>>>>>>>>>>>> 2019-03-12 08:10:43,871 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job anomaly_echo (00000000000000000000000000000000) switched from state RUNNING to CANCELLING.
>>>>>>>>>>>> 2019-03-12 08:10:44,227 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Completed checkpoint 10 for job 00000000000000000000000000000000 (7238 bytes in 311 ms).
>>>>>>>>>>>> 2019-03-12 08:10:44,232 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Barnacle Anomalies Kafka topic -> Map -> Sink: Logging Sink (1/1) (e2d02ca40a9a6c96a0c1882f5a2e4dd6) switched from RUNNING to CANCELING.
>>>>>>>>>>>> 2019-03-12 08:10:44,274 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Source: Barnacle Anomalies Kafka topic -> Map -> Sink: Logging Sink (1/1) (e2d02ca40a9a6c96a0c1882f5a2e4dd6) switched from CANCELING to CANCELED.
>>>>>>>>>>>> 2019-03-12 08:10:44,276 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph - Job anomaly_echo (00000000000000000000000000000000) switched from state CANCELLING to CANCELED.
>>>>>>>>>>>> 2019-03-12 08:10:44,276 INFO org.apache.flink.runtime.checkpoint.CheckpointCoordinator - Stopping checkpoint coordinator for job 00000000000000000000000000000000.
>>>>>>>>>>>> 2019-03-12 08:10:44,277 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Shutting down
>>>>>>>>>>>> 2019-03-12 08:10:44,323 INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint - Checkpoint with ID 8 at 'hdfs://nn-crunchy:8020/tmp/xyz2/savepoint-000000-859e626cbb00' not discarded.
>>>>>>>>>>>> 2019-03-12 08:10:44,437 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCompletedCheckpointStore - Removing /k8s_anomalyecho/k8s_anomalyecho/checkpoints/00000000000000000000000000000000 from ZooKeeper
>>>>>>>>>>>> 2019-03-12 08:10:44,437 INFO org.apache.flink.runtime.checkpoint.CompletedCheckpoint - Checkpoint with ID 10 at 'hdfs://*************:8020/tmp/xyz3/savepoint-000000-6d5bdc9b53ae' not discarded.
>>>>>>>>>>>> 2019-03-12 08:10:44,447 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Shutting down.
>>>>>>>>>>>> 2019-03-12 08:10:44,447 INFO org.apache.flink.runtime.checkpoint.ZooKeeperCheckpointIDCounter - Removing /checkpoint-counter/00000000000000000000000000000000 from ZooKeeper
>>>>>>>>>>>> 2019-03-12 08:10:44,463 INFO org.apache.flink.runtime.dispatcher.MiniDispatcher - Job 00000000000000000000000000000000 reached globally terminal state CANCELED.
>>>>>>>>>>>> 2019-03-12 08:10:44,467 INFO org.apache.flink.runtime.jobmaster.JobMaster - Stopping the JobMaster for job anomaly_echo(00000000000000000000000000000000).
>>>>>>>>>>>> 2019-03-12 08:10:44,468 INFO org.apache.flink.runtime.entrypoint.ClusterEntrypoint - Shutting StandaloneJobClusterEntryPoint down with application status CANCELED. Diagnostics null.
>>>>>>>>>>>> 2019-03-12 08:10:44,468 INFO org.apache.flink.runtime.jobmaster.MiniDispatcherRestEndpoint - Shutting down rest endpoint.
>>>>>>>>>>>> 2019-03-12 08:10:44,473 INFO org.apache.flink.runtime.leaderretrieval.ZooKeeperLeaderRetrievalService - Stopping ZooKeeperLeaderRetrievalService /leader/resource_manager_lock.
>>>>>>>>>>>> 2019-03-12 08:10:44,475 INFO org.apache.flink.runtime.jobmaster.JobMaster - Close ResourceManager connection d38c6e599d16415a69c65c8b2a72d9a2: JobManager is shutting down..
>>>>>>>>>>>> 2019-03-12 08:10:44,475 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Suspending SlotPool.
>>>>>>>>>>>> 2019-03-12 08:10:44,476 INFO org.apache.flink.runtime.jobmaster.slotpool.SlotPool - Stopping SlotPool.
>>>>>>>>>>>> 2019-03-12 08:10:44,476 INFO org.apache.flink.runtime.resourcemanager.StandaloneResourceManager - Disconnect job manager a0dcf8aaa3fadcfd6fef49666d7344ca@akka.tcp://flink@anomalyecho:6123/user/jobmanager_0 for job 00000000000000000000000000000000 from the resource manager.
>>>>>>>>>>>> 2019-03-12 08:10:44,477 INFO org.apache.flink.runtime.leaderelection.ZooKeeperLeaderElectionService - Stopping ZooKeeperLeaderElectionService ZooKeeperLeaderElectionService{leaderPath='/leader/00000000000000000000000000000000/job_manager_lock'}.
>>>>>>>>>>>>
>>>>>>>>>>>> After a little bit:
>>>>>>>>>>>>
>>>>>>>>>>>> Starting the job-cluster
>>>>>>>>>>>> used deprecated key `jobmanager.heap.mb`, please replace with key `jobmanager.heap.size`
>>>>>>>>>>>> Starting standalonejob as a console application on host anomalyecho-mmg6t.
>>>>>>>>>>>> ..
>>>>>>>>>>>> ..
>>>>>>>>>>>>
>>>>>>>>>>>> Regards.
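For what it's worth, the "clear the ZK chroot" step above can be done with the stock ZooKeeper CLI against the paths that show up in these logs. A rough sketch under stated assumptions (a ZooKeeper 3.4-era rmr command, a placeholder ensemble address, and paths read off the log lines; the actual paths depend on high-availability.zookeeper.path.root and high-availability.cluster-id, so this is not a verified recovery procedure):

    # Remove HA state for the fixed job id so a restart honors --fromSavepoint.
    zkCli.sh -server zk-host:2181 <<'EOF'
    rmr /k8s_anomalyecho/k8s_anomalyecho/checkpoints/00000000000000000000000000000000
    rmr /checkpoint-counter/00000000000000000000000000000000
    EOF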
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Mar 12, 2019 at 3:25 AM Vijay Bhaskar <bhaskar.eba...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Vishal,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Save point with cancellation internally uses the /cancel REST API, which is not a stable API; it always exits with 404. The best way to issue it is:
>>>>>>>>>>>>>
>>>>>>>>>>>>> a) First issue the save point REST API
>>>>>>>>>>>>> b) Then issue the /yarn-cancel REST API (as described in http://mail-archives.apache.org/mod_mbox/flink-user/201804.mbox/%3c0ffa63f4-e6ed-42d8-1928-37a7adaaa...@apache.org%3E)
>>>>>>>>>>>>> c) Then, when resuming your job, provide the save point path returned by (a) as an argument to the run-jar REST API
>>>>>>>>>>>>>
>>>>>>>>>>>>> The above is the smoother way.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Regards
>>>>>>>>>>>>> Bhaskar
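Step (c) applies to session clusters, where the job is started by running a previously uploaded jar. A minimal sketch of that call, assuming the documented /jars/:jarid/run endpoint; the host, jar id, and savepoint path below are placeholders:

    # <jar-id> comes from POST /jars/upload; savepointPath is the path returned
    # by the savepoint trigger in (a).
    curl --request POST \
      "https://flink.example.com/jars/<jar-id>/run?savepointPath=hdfs://nn-crunchy:8020/tmp/xyz1/savepoint-000000-abcdef&allowNonRestoredState=false"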
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Mar 12, 2019 at 2:46 AM Vishal Santoshi <vishal.santo...@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> There are some issues I see and would want to get some feedback on.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. On cancellation with save point with a target directory, the k8s job does not exit (it is not a deployment). I would assume that on cancellation the JVM should exit, after cleanup etc., and thus the pod should too. That does not happen, and so the job pod remains live. Is that expected?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. To resume from a save point, it seems that I have to delete the job id (0000000000....) from ZooKeeper (this is HA), else it defaults to the latest checkpoint no matter what.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am kind of curious as to what, in 1.7.2, is the tested process of cancelling with a save point and resuming, and what is the cogent story around the job id (it defaults to 000000000000..). Note that --job-id does not work with 1.7.2, so even though that does not make sense, I still cannot provide a new job id.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Regards,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Vishal.