[GitHub] [cloudstack-kubernetes-provider] onitake commented on issue #1: Code transfer of SWISS TXT cloudstack-cloud-controller-manager to the Apache project
onitake commented on issue #1: Code transfer of SWISS TXT cloudstack-cloud-controller-manager to the Apache project
URL: https://github.com/apache/cloudstack-kubernetes-provider/pull/1#issuecomment-529341143

@rhtyd Any update?
Re: 4.13 rbd snapshot delete failed
A quick feedback from my side. I've never had a properly working snapshot delete with Ceph. Every week or so I have to manually delete all Ceph snapshots. However, the NFS secondary storage snapshots are deleted just fine. I've been using CloudStack for 5+ years and it has always been the case. I am currently running 4.11.2 with Ceph 13.2.6-1xenial.

Andrei

- Original Message -
> From: "Andrija Panic"
> To: "Gabriel Beims Bräscher"
> Cc: "users" , "dev"
> Sent: Sunday, 8 September, 2019 19:17:59
> Subject: Re: 4.13 rbd snapshot delete failed
>
> Thanks, Gabriel, for the extensive feedback.
> Actually my ex-company added the code to really delete an RBD snapshot back in
> 2016 or so; it was part of 4.9 if I'm not mistaken. So I expect the code is there,
> but probably some exception is happening, or there is a regression...
>
> Cheers
>
> On Sun, Sep 8, 2019, 09:31 Gabriel Beims Bräscher wrote:
>
>> Thanks for the feedback, Andrija. It looks like delete was not fully
>> supported then (am I missing something?). I will take a look into this and
>> open a PR adding proper support for RBD snapshot deletion if necessary.
>>
>> Regarding the rollback, I have tested it several times and it worked;
>> however, I see a weak point in the Ceph rollback implementation.
>>
>> It looks like Li Jerry was able to execute the rollback without any
>> problem. Li, could you please post here the log output: "Attempting to
>> rollback RBD snapshot [name:%s], [pool:%s], [volumeid:%s],
>> [snapshotid:%s]"? Andrija will not be able to see that log, as the exception
>> happens prior to it; the only way for you to check those values is via remote
>> debugging. If you are able to post those values, it would also help in
>> sorting out what is wrong.
>>
>> I am checking the code base, running a few tests, and evaluating the log
>> that you (Andrija) sent. What I can say for now is that it looks like the
>> parameter "snapshotRelPath = snapshot.getPath()" [1] is a critical piece of
>> code that can definitely break the rollback execution flow. My tests had
>> pointed to a pattern, but now I see other possibilities. I will probably
>> add a few parameters to the rollback/revert command instead of using the
>> path, or review the path life-cycle and the different execution flows in
>> order to make it safer to use.
>> [1]
>> https://github.com/apache/cloudstack/blob/50fc045f366bd9769eba85c4bc3ecdc0b7035c11/plugins/hypervisors/kvm/src/main/java/com/cloud/hypervisor/kvm/resource/wrapper
>>
>> A few details on the test environments and the Ceph/RBD version:
>> CloudStack, KVM, and Ceph nodes are running on Ubuntu 18.04.
>> Ceph version 13.2.5 (cbff874f9007f1869bfd3821b7e33b2a6ffd4988) mimic (stable).
>> RADOS Block Devices has had snapshot rollback support since Ceph v10.0.2
>> [https://github.com/ceph/ceph/pull/6878].
>> Rados-java [https://github.com/ceph/rados-java] has supported snapshot
>> rollback since 0.5.0; rados-java 0.5.0 is the version used by CloudStack 4.13.0.0.
>>
>> I will be updating here soon.
>>
>> On Sun, 8 Sep 2019 at 12:28, Wido den Hollander wrote:
>>
>>>
>>> On 9/8/19 5:26 AM, Andrija Panic wrote:
>>> > Many releases ago, deleting a Ceph volume snapshot was also only deleting it in the
>>> > DB, so RBD performance became terrible with many tens of (i.e. hourly)
>>> > snapshots. I'll try to verify this on 4.13 myself, but Wido and the guys
>>> > will know better...
>>>
>>> I pinged Gabriel and he's looking into it. He'll get back to it.
>>>
>>> Wido
>>>
>>> > On Sat, Sep 7, 2019, 08:34 li jerry wrote:
>>> >
>>> >> I found it had nothing to do with storage.cleanup.delay and
>>> >> storage.cleanup.interval.
>>> >>
>>> >> The reason is that when DeleteSnapshotCmd is executed, because the RBD
>>> >> snapshot has no copy on secondary storage, it only changes the
>>> >> database information and does not go to the primary storage to delete
>>> >> the snapshot.
>>> >>
>>> >> Log===
>>> >>
>>> >> 2019-09-07 23:27:00,118 DEBUG [c.c.a.ApiServlet]
>>> >> (qtp504527234-17:ctx-2e407b61) (logid:445cbea8) ===START=== 192.168.254.3
>>> >> -- GET
>>> >> command=deleteSnapshot&id=0b50eb7e-4f42-4de7-96c2-1fae137c8c9f&response=json&_=1567869534480
>>> >>
>>> >> 2019-09-07 23:27:00,139 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) CIDRs from
>>> >> which account 'Acct[2f96c108-9408-11e9-a820-0200582b001a-admin]' is allowed
>>> >> to perform API calls: 0.0.0.0/0,::/0
>>> >>
>>> >> 2019-09-07 23:27:00,204 DEBUG [c.c.a.ApiServer]
>>> >> (qtp504527234-17:ctx-2e407b61 ctx-679fd276) (logid:445cbea8) Retrieved
>>> >> cmdEventType from job info: SNAPSHOT.DELETE
>>> >>
>>> >> 2019-09-07 23:27:00,217 INFO [o.a.c.f.j.i.AsyncJobMonitor]
>>> >> (API-Job-Executor-2:ctx-f0843047 job-1378) (logid:c34a368a) Add job-1378
>>> >> into job monitoring
>>> >>
>>> >> 2019-09-07 23:27:00,219 DEBUG
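For illustration, the sketch below shows roughly what it takes to remove an RBD snapshot on the primary storage pool itself via the rados-java bindings referenced in this thread, as opposed to only marking it removed in the CloudStack database. It is a minimal, hypothetical example, not CloudStack's actual deletion code: the monitor address, cephx key, pool, image, and snapshot names are placeholders, and error handling is reduced to try/finally cleanup.

import com.ceph.rados.IoCTX;
import com.ceph.rados.Rados;
import com.ceph.rbd.Rbd;
import com.ceph.rbd.RbdImage;

public class RbdSnapshotDeleteSketch {

    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- substitute your own monitor, key, pool, image.
        final String monHost = "mon1.example.com";
        final String authKey = "<cephx-key>";
        final String pool = "cloudstack-primary";
        final String imageName = "volume-0000";   // RBD image backing the CloudStack volume
        final String snapName = "snap-hourly-42"; // snapshot to remove on the Ceph side

        Rados rados = new Rados("admin");
        rados.confSet("mon_host", monHost);
        rados.confSet("key", authKey);
        rados.connect();

        IoCTX io = rados.ioCtxCreate(pool);
        try {
            Rbd rbd = new Rbd(io);
            RbdImage image = rbd.open(imageName);
            try {
                // Protected snapshots (e.g. clone parents) must be unprotected first.
                if (image.snapIsProtected(snapName)) {
                    image.snapUnprotect(snapName);
                }
                // This call actually frees the snapshot on the cluster; skipping it is
                // what leaves orphaned snapshots behind and degrades RBD performance.
                image.snapRemove(snapName);
            } finally {
                rbd.close(image);
            }
        } finally {
            rados.ioCtxDestroy(io);
        }
    }
}

Whatever shape the eventual fix takes inside CloudStack, it is this removal against the cluster that keeps per-image snapshot counts, and with them RBD performance, under control.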
Re: 4.13 rbd snapshot delete failed
Thanks for the feedback, Andrija and Andrei. I have opened issue #3590 for the snapshot rollback issue raised by Andrija. I will be investigating both issues:
- RBD snapshot revert #3590 (https://github.com/apache/cloudstack/issues/3590)
- RBD snapshot deletion #3586 (https://github.com/apache/cloudstack/issues/3586)

Cheers,
Gabriel

On Mon, 9 Sep 2019 at 09:41, Andrei Mikhailovsky wrote:

> A quick feedback from my side. I've never had a properly working snapshot
> delete with Ceph. Every week or so I have to manually delete all Ceph
> snapshots. However, the NFS secondary storage snapshots are deleted just
> fine. I've been using CloudStack for 5+ years and it has always been the
> case. I am currently running 4.11.2 with Ceph 13.2.6-1xenial.
>
> Andrei
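Until the deletion issue (#3586) is resolved, the weekly manual cleanup Andrei describes can at least be checked or scripted along the lines of the sketch below. It is again a hypothetical rados-java example with placeholder connection details and image name: it lists the snapshots actually present on an RBD image, so they can be compared with what the CloudStack listSnapshots API reports, and optionally purges them.

import com.ceph.rados.IoCTX;
import com.ceph.rados.Rados;
import com.ceph.rbd.Rbd;
import com.ceph.rbd.RbdImage;
import com.ceph.rbd.jna.RbdSnapInfo;

import java.util.List;

public class RbdSnapshotAuditSketch {

    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- substitute your own monitor, key, pool, image.
        final String pool = "cloudstack-primary";
        final String imageName = "volume-0000";
        final boolean purge = false; // set to true to delete instead of only listing

        Rados rados = new Rados("admin");
        rados.confSet("mon_host", "mon1.example.com");
        rados.confSet("key", "<cephx-key>");
        rados.connect();

        IoCTX io = rados.ioCtxCreate(pool);
        try {
            Rbd rbd = new Rbd(io);
            RbdImage image = rbd.open(imageName);
            try {
                // List what is really on the cluster; compare with the snapshots
                // CloudStack still reports to spot ones removed only from the DB.
                List<RbdSnapInfo> snaps = image.snapList();
                for (RbdSnapInfo snap : snaps) {
                    System.out.println("RBD snapshot still on cluster: " + snap.name);
                    if (purge) {
                        if (image.snapIsProtected(snap.name)) {
                            image.snapUnprotect(snap.name);
                        }
                        image.snapRemove(snap.name);
                    }
                }
            } finally {
                rbd.close(image);
            }
        } finally {
            rados.ioCtxDestroy(io);
        }
    }
}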