Hi Eric, Thanks for confirming what we suspected all along!
On Tue, Aug 9, 2022 at 10:56 AM Loon, Eric van (ITOP DI) - KLM <eric-van.l...@klm.com> wrote:

> Hi Zoltan,
>
> I checked with the developer and basically there's no difference between
> a CANCEL PROC and a CANCEL REPLICATION. It might be improved in a future
> release.
>
> Kind regards,
> Eric van Loon
> Core Infra
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Zoltan Forray
> Sent: Tuesday, August 9, 2022 14:25
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Cancel Session with EXTREME PREJUDICE (a.k.a. FORCE)
>
> Hi Eric,
>
> Thank you for confirming what my co-worker and I have been experiencing
> and that we aren't alone with these issues. We would never have jumped
> from 8.1.12 to 8.1.14 if it were not for log4j and the plethora of other
> possible intrusions!
>
> I have seen way too many APARs related to STGRULE and have no desire to
> inflict even more pain upon ourselves! These are (were) standard
> administrative schedule or console "replicate node" commands using node
> groups or sometimes just a single node trying to reconcile "errors"
> caused by canceling replication.
>
> This brings up another question - what is the purpose of the "cancel
> replication" command if it causes such damage that subsequent
> replications issue these dire warnings about *"detected partially
> replicated data from a previous replication operation. This might result
> in extended processing time while the server is replicating"*? IMO, it
> sounds like a regular "cancel process" would do about the same. I
> understand about special commands like "cancel expiration", which allows
> subsequent expire inventory commands to resume processing where the
> "canceled expiration" left off, but cancel replication doesn't seem to
> offer much benefit - or is this another APAR waiting for mitigation?
>
> On Tue, Aug 9, 2022 at 2:54 AM Loon, Eric van (ITOP DI) - KLM <eric-van.l...@klm.com> wrote:
>
> > Hi Zoltan,
> >
> > This all sounds so familiar. In my opinion, all releases after 8.1.12
> > are the most buggy versions IBM ever released (yes, maybe even more
> > buggy than the infamous 6.1). I'm in close contact with somebody from
> > development and it already resulted in 6 or 7 APARs, and still not
> > everything is working OK.
> > The crashes, the hanging replications, the slow replications, the
> > stale sessions which can't be canceled, I have seen them all...
> > One question: are you using 'traditional' node replication or are you
> > using stgrule replication? The latter contains a nasty bug: when
> > replication fails or gets canceled, the next replication runs VERY
> > slowly. The only way to fix this is by running a special script I
> > received from support, along with several DB2 commands... No permanent
> > fix available yet.
> > But to come back to your question: I have never been able to cancel
> > those sessions; the only way to get rid of them is by bouncing the
> > server.
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Core Infra
> >
> > -----Original Message-----
> > From: ADSM: Dist Stor Manager <ADSM-L@VM.MARIST.EDU> On Behalf Of Zoltan Forray
> > Sent: Monday, August 8, 2022 16:08
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: Cancel Session with EXTREME PREJUDICE (a.k.a. FORCE)
> >
> > First off, we do not know how anyone can run replication of data from
> > a FILE-based storage pool (yes, we know that CONTAINERS fixes
> > everything🙄) to another server. Every attempt we have made usually
> > ends up in a mess that we have to undo/clean up. We find it is very
> > slow (10G on both ends) and the replication processes never seem to
> > finish/end.
> >
> > We have observed that no matter how many or how few replication
> > sessions we start, most of them seem to go idle/wait (e.g.
> > MAXSESSIONS=10 starts 20 sessions to the target server, of which 16+
> > become idle even though there is 4 TB to replicate).
> >
> > Since we need to get off magnetic tape (moving to a new building with
> > restricted space so existing ATL has to go!), we have been using the
> > offsite server as Virtual Volumes and creating offsite backups to it.
> > This was working pretty well until we started experiencing server
> > crashes/cores after upgrading to 8.1.14 (support confirmed a bug -
> > sent us an eFix for it - we were continuing to have intermittent
> > crashes - support discovered another related bug via RECONCILE VOLUMES
> > command - just installed another eFix that is supposed to address the
> > crashes).
> >
> > While waiting for a fix for the original server problem, we decided to
> > try to transition back to replication - only to have more problems
> > than the crashes. We have had to bounce the target and source servers
> > multiple times due to replication sessions that won't go away/end as
> > well as the performance issues I mentioned above.
> >
> > Did I mention the issues with the 8.1.15 Linux client also related to
> > replication?
> >
> > Since we have the shared stgpools/reconcile volumes eFix (8.1.14.110)
> > installed on all servers, we have decided to go back to virtual volumes.
> >
> > Now back to the subject of this post. Right now I have 4 replication
> > sessions on the target server that say they are doing something (i.e.
> > not in a WAIT), but in reality have been hung since August 1st (we had
> > installed the eFix but forgot to disable the admin command that kicks
> > off replication). There are no replication sessions on the source
> > server.
> >
> > Every attempt to cancel the ghost sessions on the target server reports
> > that they can't be canceled.
> >
> > So before we bounce it one more time, we were wondering if there is a
> > super-secret "cancel session with force" we are not aware of?
> >
> > --
> > *Zoltan Forray*
> > Enterprise Backup Administrator
> > VMware Systems Administrator
> > Enterprise Compute & Storage Platforms Team
> > VCU Infrastructure Services
> > www.ucc.vcu.edu
> > zfor...@vcu.edu - 804-828-4807
> > Don't be a phishing victim - VCU and other reputable organizations
> > will never use email to request that you reply with your password,
> > social security number or confidential personal information. For more
> > details visit http://phishing.vcu.edu/
> > <https://adminmicro2.questionpro.com>
> > ********************************************************
> > For information, services and offers, please visit our web site:
> > http://www.klm.com. This e-mail and any attachment may contain
> > confidential and privileged material intended for the addressee only.
> > If you are not the addressee, you are notified that no part of the
> > e-mail or any attachment may be disclosed, copied or distributed, and
> > that any other action related to this e-mail or attachment is strictly
> > prohibited, and may be unlawful. If you have received this e-mail by
> > error, please notify the sender immediately by return e-mail, and
> > delete this message.
> >
> > Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or
> > its employees shall not be liable for the incorrect or incomplete
> > transmission of this e-mail or any attachments, nor responsible for any
> > delay in receipt.
> > Koninklijke Luchtvaart Maatschappij N.V.
> > (also known as KLM Royal Dutch Airlines) is registered in Amstelveen,
> > The Netherlands, with registered number 33014286
> > ********************************************************

--
*Zoltan Forray*
Enterprise Backup Administrator
VMware Systems Administrator
Enterprise Compute & Storage Platforms Team
VCU Infrastructure Services
www.ucc.vcu.edu
zfor...@vcu.edu - 804-828-4807
Don't be a phishing victim - VCU and other reputable organizations will never use email to request that you reply with your password, social security number or confidential personal information. For more details visit http://phishing.vcu.edu/ <https://adminmicro2.questionpro.com>