Hello,
In case it helps, here is what I have tried so far to re-enable the dead CTs:
# prlctl stop ldap2
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot
lock the Container
)
# cat /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck
6227
resuming
# ps auwx | grep 6227
root 6227 0.0 0.0 92140 6984 ? S 15:10 0:00 /usr/sbin/vzctl resume 144dc737-b4e3-4c03-852c-25a6df06cee4
# kill -9 6227
=> still cannot stop the CT (same "Cannot lock the Container" error)
# df | grep 144dc737-b4e3-4c03-852c-25a6df06cee4
/dev/ploop11432p1 10188052 2546636 7100848 27% /vz/root/144dc737-b4e3-4c03-852c-25a6df06cee4
none 1048576 0 1048576 0% /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/dump/Dump/.criu.cgyard.56I2ls
# umount /dev/ploop11432p1
# ploop check -F /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Reopen rw /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds
Error in ploop_check (check.c:663): Dirty flag is set
# ploop mount /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/DiskDescriptor.xml
Error in ploop_mount_image (ploop.c:2495): Image /vz/private/144dc737-b4e3-4c03-852c-25a6df06cee4/root.hdd/root.hds already used by device /dev/ploop11432
# df -H | grep ploop11432
=> nothing
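Before killing the process named in a lock file, it can help to check whether that PID is still alive. The following is a minimal sketch, assuming the two-line lock file layout shown above (PID on the first line, operation on the second). The function name lock_info is hypothetical and is not part of vzctl or prlctl.

```shell
#!/bin/sh
# Sketch only: report which process holds a container lock file.
# Assumes the layout seen above: line 1 = PID, line 2 = operation.
lock_info() {
    lockfile=$1
    [ -r "$lockfile" ] || { echo "no lock file"; return 1; }
    pid=$(sed -n '1p' "$lockfile")
    op=$(sed -n '2p' "$lockfile")
    # kill -0 sends no signal; it only tests whether the PID exists.
    # Run as root, otherwise a process owned by another user looks stale.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "pid=$pid op=$op state=alive"
    else
        echo "pid=$pid op=$op state=stale"
    fi
}
```

For example, `lock_info /vz/lock/144dc737-b4e3-4c03-852c-25a6df06cee4.lck` would print one summary line for the lock shown above.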
I am lost; any help would be appreciated.
Thanks.
On 06/07/2020 at 15:37, Jehan Procaccia IMT wrote:
Hello,
I am back to the initial problem discussed in this thread. Since I
updated to OpenVZ release 7.0.14 (136) | Virtuozzo Linux release 7.8.0
(609), I am also seeing CTs in a corrupted state.
I don't see exactly the same error as the one Kevin Drysdale mentions
below (ploop/fsck), but I can neither enter certain CTs nor stop them:
[root@olb ~]# prlctl stop trans8
Stopping the CT...
Failed to stop the CT: PRL_ERR_VZCTL_OPERATION_FAILED (Details: Cannot lock the Container
)
[root@olb ~]# prlctl enter trans8
Unable to get init pid
enter into CT failed
exited from CT 02faecdd-ddb6-42eb-8103-202508f18256
For those CTs that fail to enter or stop, I noticed that there is a
second filesystem mounted, at a path ending in /dump/Dump/.criu.cgyard.4EJB8c:
[root@olb ~]# df -H | grep 02faecdd-ddb6-42eb-8103-202508f18256
/dev/ploop53152p1 11G 2,2G 7,7G 23% /vz/root/02faecdd-ddb6-42eb-8103-202508f18256
none 537M 0 537M 0% /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/dump/Dump/.criu.cgyard.4EJB8c
[root@olb ~]# prlctl list | grep 02faecdd-ddb6-42eb-8103-202508f18256
{02faecdd-ddb6-42eb-8103-202508f18256} running 157.159.196.17 CT isptrans8
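To find every CT that still has such a leftover .criu.cgyard mount, the mount table can be filtered for that path pattern. This is a rough sketch, assuming the /vz/private/<uuid>/dump/Dump/ layout seen in the df output above. The function name cgyard_cts is hypothetical.

```shell
#!/bin/sh
# Sketch only: list the UUIDs of CTs that have a .criu.cgyard.* mount.
# Reads a file in /proc/mounts format (defaults to /proc/mounts itself).
cgyard_cts() {
    mounts=${1:-/proc/mounts}
    # Field 2 of each line is the mount point.
    awk '{print $2}' "$mounts" |
        sed -n 's|^/vz/private/\([^/]*\)/dump/Dump/\.criu\.cgyard\..*$|\1|p' |
        sort -u
}
```

Called without arguments on the hardware node, it prints one UUID per line, which can then be compared with the output of prlctl list.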
I rebooted the whole hardware node; here is the related vzctl.log
since the reboot:
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Removing the stale lock file /vz/lock/02faecdd-ddb6-42eb-8103-202508f18256.lck
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Mount image: /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Opening delta /vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds
2020-07-06T15:10:38+0200 : Adding delta dev=/dev/ploop53152 img=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd/root.hds (rw)
2020-07-06T15:10:39+0200 : Mounted /dev/ploop53152p1 at /vz/root/02faecdd-ddb6-42eb-8103-202508f18256 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:39+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Container is mounted
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Setting permissions for image=/vz/private/02faecdd-ddb6-42eb-8103-202508f18256/root.hdd
2020-07-06T15:10:40+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Configure memguarantee: 0%
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:18:12+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
2020-07-06T15:19:49+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Cannot lock the Container
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : Unable to get init pid
2020-07-06T15:25:33+0200 vzctl : CT 02faecdd-ddb6-42eb-8103-202508f18256 : enter into CT failed
On another CT that fails to enter or stop, the logs are similar, with
additional criu errors:
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:10:38+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:38+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:10:39+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
2020-07-06T15:10:41+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:10:41+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
2020-07-06T15:10:57+0200 vzeventd : Run: /etc/vz/vzevent.d/ve-stop id=4ae48335-5b63-475d-8629-c8d742cb0ba0
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (03.038774) Error (criu/util.c:666): exited, status=4
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446513) 1: Error (criu/files.c:230): Empty list on file desc id 0x1f(5)
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (14.446518) 1: Error (criu/files.c:231): BUG at criu/files.c:231
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (15.589529) Error (criu/cr-restore.c:1612): 7130 killed by signal 11: Segmentation fault
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : (15.604550) Error (criu/cr-restore.c:2614): Restoring FAILED.
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : The restore log was saved in /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/dump/Dump/restore.log
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : criu exited with rc=17
2020-07-06T15:10:57+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Unmount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd (190)
2020-07-06T15:10:57+0200 : Unmounting file system at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0
2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is unmounted
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Failed to restore the Container
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Restoring the Container ...
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:11:31+0200 : Opening delta /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds
2020-07-06T15:11:31+0200 : Adding delta dev=/dev/ploop36049 img=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd/root.hds (rw)
2020-07-06T15:11:31+0200 : Mounted /dev/ploop36049p1 at /vz/root/4ae48335-5b63-475d-8629-c8d742cb0ba0 fstype=ext4 data=',balloon_ino=12'
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Container is mounted
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
2020-07-06T15:11:31+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Configure memguarantee: 0%
2020-07-06T15:14:18+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : Unable to get init pid
2020-07-06T15:14:18+0200 vzctl : CT 4ae48335-5b63-475d-8629-c8d742cb0ba0 : enter into CT failed
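To pull only the criu failures for one CT out of a long vzctl.log, a simple filter may be enough. This is a minimal sketch, assuming each log entry is a single line in the actual file (the excerpts in this mail may be wrapped) and that the log lives at /var/log/vzctl.log. The function name criu_errors is hypothetical.

```shell
#!/bin/sh
# Sketch only: show the CRIU error lines that vzctl logged for one CT.
# $1 = CT UUID, $2 = log file (defaults to /var/log/vzctl.log).
criu_errors() {
    ct=$1
    log=${2:-/var/log/vzctl.log}
    # -F matches the UUID as a fixed string; the second grep keeps only
    # the "Error (criu/...)" lines and the final criu exit code.
    grep -F "CT $ct :" "$log" | grep -E 'Error \(criu/|criu exited with rc='
}
```

For example, `criu_errors 4ae48335-5b63-475d-8629-c8d742cb0ba0` would print only the criu error entries and the rc=17 line from the excerpt above.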
In prl-disp.log:
07-06 15:10:30.797 F /virtuozzo:4836:4836/ register CT: 4ae48335-5b63-475d-8629-c8d742cb0ba0
07-06 15:10:38.717 F /disp:4836:6163/ Processing command 'DspCmdVmStartEx' 1036 for CT uuid='{4ae48335-5b63-475d-8629-c8d742cb0ba0}'
07-06 15:10:38.738 I /virtuozzo:4836:6234/ /usr/sbin/vzctl resume 4ae48335-5b63-475d-8629-c8d742cb0ba0
07-06 15:10:48.542 I /disp:4836:5196/ vzevent: state=6, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
07-06 15:10:57.364 I /disp:4836:5196/ vzevent: state=8, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
07-06 15:10:57.475 I /disp:4836:5196/ vzevent: state=12, envid=4ae48335-5b63-475d-8629-c8d742cb0ba0
07-06 15:11:31.161 F /virtuozzo:4836:6234/ /usr/sbin/vzctl utility failed: /usr/sbin/vzctl resume 4ae48335-5b63-475d-8629-c8d742cb0ba0 [6]
Mount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
Setting permissions for image=/vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd
Unmount image: /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/root.hdd (190)
The restore log was saved in /vz/private/4ae48335-5b63-475d-8629-c8d742cb0ba0/dump/Dump/restore.log
07-06 15:11:31.162 I /virtuozzo:4836:6234/ /usr/sbin/vzctl start 4ae48335-5b63-475d-8629-c8d742cb0ba0
Is this related to the update? How can I re-enable those CTs?
Thanks.
On 29/06/2020 at 12:30, Kevin Drysdale wrote:
Hello,
After updating one of our OpenVZ VPS hosting nodes at the end of
last week, we've started to have issues with corruption apparently
occurring inside containers. Issues of this nature have never
affected the node previously, and there do not appear to be any
hardware issues that could explain this.
Specifically, a few hours after updating, we began to see containers
experiencing errors such as this in the logs:
[90471.678994] EXT4-fs (ploop35454p1): error count since last fsck: 25
[90471.679022] EXT4-fs (ploop35454p1): initial error at time 1593205255: ext4_ext_find_extent:904: inode 136399
[90471.679030] EXT4-fs (ploop35454p1): last error at time 1593232922: ext4_ext_find_extent:904: inode 136399
[95189.954569] EXT4-fs (ploop42983p1): error count since last fsck: 67
[95189.954582] EXT4-fs (ploop42983p1): initial error at time 1593210174: htree_dirblock_to_tree:918: inode 926441: block 3683060
[95189.954589] EXT4-fs (ploop42983p1): last error at time 1593276902: ext4_iget:4435: inode 1849777
[95714.207432] EXT4-fs (ploop60706p1): error count since last fsck: 42
[95714.207447] EXT4-fs (ploop60706p1): initial error at time 1593210489: ext4_ext_find_extent:904: inode 136272
[95714.207452] EXT4-fs (ploop60706p1): last error at time 1593231063: ext4_ext_find_extent:904: inode 136272
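The "error at time" values in these kernel messages are Unix epoch seconds, so converting them makes it easier to line them up with the update and migration times. This is a small sketch, assuming GNU date and one kernel message per line on standard input. The function names epoch_to_utc and ext4_error_times are hypothetical.

```shell
#!/bin/sh
# Sketch only: convert the epoch timestamps in EXT4-fs error messages.
epoch_to_utc() {
    # GNU date syntax; BSD date would need `date -u -r "$1"` instead.
    date -u -d "@$1" '+%Y-%m-%d %H:%M:%S UTC'
}

# Reads kernel log lines on stdin and prints one converted time for each
# "initial error at time N:" or "last error at time N:" entry.
ext4_error_times() {
    sed -n 's/.*error at time \([0-9][0-9]*\):.*/\1/p' |
        while read -r ts; do
            epoch_to_utc "$ts"
        done
}
```

For instance, `epoch_to_utc 1593205255` prints `2020-06-26 21:00:55 UTC`, and `dmesg | ext4_error_times` would convert every such timestamp in the kernel log.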
Shutting the containers down and manually mounting and e2fsck'ing
their filesystems did clear these errors, but each of the containers
(which were mostly used for running Plesk) had widespread issues
with corrupt or missing files after the fsck's completed,
necessitating their being restored from backup.
Concurrently, we also began to see messages like this appearing in
/var/log/vzctl.log, which again have never appeared at any point
prior to this update being installed:
/var/log/vzctl.log:2020-06-26T21:05:19+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288448/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:09:41+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288450/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:16:22+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288451/root.hdd/root.hds' is sparse
/var/log/vzctl.log:2020-06-26T21:19:57+0100 : Error in fill_hole (check.c:240): Warning: ploop image '/vz/private/8288452/root.hdd/root.hds' is sparse
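One way to check independently whether an image file has holes is to compare its allocated size with its apparent size. This is a rough heuristic sketch, assuming GNU stat; it is not how ploop itself performs the check, and on filesystems with transparent compression a dense file can also appear sparse. The function name is_sparse is hypothetical.

```shell
#!/bin/sh
# Sketch only: succeed (exit 0) if a file occupies fewer bytes on disk
# than its apparent size, which suggests it contains holes.
is_sparse() {
    # GNU stat: %s = apparent size in bytes, %b = allocated blocks,
    # %B = size in bytes of each block reported by %b.
    info=$(stat -c '%s %b %B' "$1") || return 2
    size=${info%% *}
    rest=${info#* }
    blocks=${rest%% *}
    bsize=${rest#* }
    [ $(( blocks * bsize )) -lt "$size" ]
}
```

For example, `is_sparse /vz/private/8288448/root.hdd/root.hds && echo sparse` would flag the first image named in the warnings above if it still has holes.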
The basic procedure we follow when updating our nodes is as follows:
1. Update the standby node we keep spare for this process
2. vzmigrate all containers from the live node being updated to the
standby node
3. Update the live node
4. Reboot the live node
5. vzmigrate the containers from the standby node back to the live
node they originally came from
The only tool that has touched these containers is 'vzmigrate'
itself, so I'm at something of a loss to explain why the root.hdd
images for these containers contain sparse gaps. We have never
created sparse images ourselves, as we have always been aware that
OpenVZ does not support their use for a container's hard drive
image. The fact that these images have suddenly become sparse at the
same time they have started to exhibit filesystem corruption is
somewhat concerning.
We can restore all affected containers from backups, but I wanted to
get in touch with the list to see if anyone else at any other site
has experienced these or similar issues after applying the 7.0.14
(136) update.
Thank you,
Kevin Drysdale.
_______________________________________________
Users mailing list
Users@openvz.org
https://lists.openvz.org/mailman/listinfo/users