On Thu, 7 Nov 2019 16:14:43 +0800 Daniel Cho <daniel...@qnap.com> wrote:
> Hi Lukas,
> Thanks for your reply.
>
> However, we test the question 1 with steps below the error message, we
> notice the secondary VM's image will break while it reboots.
> Here is the error message.
> -------------------------------------------------------------------
> [ 1.280299] XFS (sda1): Mounting V5 Filesystem
> [ 1.428418] input: ImExPS/2 Generic Explorer Mouse as /devices/platform/i8042/serio1/input/input2
> [ 1.501320] XFS (sda1): Starting recovery (logdev: internal)
> [ 1.504076] tsc: Refined TSC clocksource calibration: 3492.211 MHz
> [ 1.505534] Switched to clocksource tsc
> [ 2.031027] XFS (sda1): Internal error XFS_WANT_CORRUPTED_GOTO at line 1635 of file fs/xfs/libxfs/xfs_alloc.c. Caller xfs_free_extent+0xfc/0x130 [xfs]
> [ 2.032743] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
> [ 2.033982] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [ 2.035882] Call Trace:
> [ 2.036494] [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> [ 2.037315] [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> [ 2.038150] [<ffffffffc0138e6c>] ? xfs_free_extent+0xfc/0x130 [xfs]
> [ 2.039046] [<ffffffffc01362da>] xfs_free_ag_extent+0x20a/0x780 [xfs]
> [ 2.039920] [<ffffffffc0138e6c>] xfs_free_extent+0xfc/0x130 [xfs]
> [ 2.040768] [<ffffffffc01a7736>] xfs_trans_free_extent+0x26/0x60 [xfs]
> [ 2.041642] [<ffffffffc019fade>] xlog_recover_process_efi+0x17e/0x1c0 [xfs]
> [ 2.042558] [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [ 2.043771] [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> [ 2.044650] [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> [ 2.045518] [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [ 2.046341] [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
> [ 2.047260] [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [ 2.048116] [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> [ 2.048881] [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [ 2.050105] [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> [ 2.050906] [<ffffffff81207349>] mount_fs+0x39/0x1b0
> [ 2.051963] [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> [ 2.059431] [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> [ 2.060283] [<ffffffff81226483>] do_mount+0x233/0xaf0
> [ 2.061081] [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> [ 2.061844] [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> [ 2.062619] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> [ 2.063512] XFS (sda1): Internal error xfs_trans_cancel at line 984 of file fs/xfs/xfs_trans.c. Caller xlog_recover_process_efi+0x18e/0x1c0 [xfs]
> [ 2.065260] CPU: 0 PID: 300 Comm: mount Not tainted 3.10.0-693.11.6.el7.x86_64 #1
> [ 2.066489] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.1-0-ga5cab58e9a3f-prebuilt.qemu.org 04/01/2014
> [ 2.068023] Call Trace:
> [ 2.068590] [<ffffffff816a5ea1>] dump_stack+0x19/0x1b
> [ 2.069403] [<ffffffffc01794eb>] xfs_error_report+0x3b/0x40 [xfs]
> [ 2.070318] [<ffffffffc019faee>] ? xlog_recover_process_efi+0x18e/0x1c0 [xfs]
> [ 2.071538] [<ffffffffc019594d>] xfs_trans_cancel+0xbd/0xe0 [xfs]
> [ 2.072429] [<ffffffffc019faee>] xlog_recover_process_efi+0x18e/0x1c0 [xfs]
> [ 2.073339] [<ffffffffc01a1e37>] xlog_recover_process_efis.isra.30+0x77/0xe0 [xfs]
> [ 2.074561] [<ffffffffc01a5761>] xlog_recover_finish+0x21/0xb0 [xfs]
> [ 2.075421] [<ffffffffc0198894>] xfs_log_mount_finish+0x34/0x50 [xfs]
> [ 2.076301] [<ffffffffc018ef21>] xfs_mountfs+0x5d1/0x8b0 [xfs]
> [ 2.077128] [<ffffffffc017d220>] ? xfs_filestream_get_parent+0x80/0x80 [xfs]
> [ 2.078049] [<ffffffffc0191d6b>] xfs_fs_fill_super+0x3bb/0x4d0 [xfs]
> [ 2.078900] [<ffffffff81206ad0>] mount_bdev+0x1b0/0x1f0
> [ 2.079667] [<ffffffffc01919b0>] ? xfs_test_remount_options.isra.11+0x70/0x70 [xfs]
> [ 2.080883] [<ffffffffc01906d5>] xfs_fs_mount+0x15/0x20 [xfs]
> [ 2.081687] [<ffffffff81207349>] mount_fs+0x39/0x1b0
> [ 2.082457] [<ffffffff811a7d45>] ? __alloc_percpu+0x15/0x20
> [ 2.083258] [<ffffffff81223f77>] vfs_kern_mount+0x67/0x110
> [ 2.084057] [<ffffffff81226483>] do_mount+0x233/0xaf0
> [ 2.084797] [<ffffffff811a2cfb>] ? strndup_user+0x4b/0xa0
> [ 2.085568] [<ffffffff812270c6>] SyS_mount+0x96/0xf0
> [ 2.086324] [<ffffffff816b89fd>] system_call_fastpath+0x16/0x1b
> [ 2.087161] XFS (sda1): xfs_do_force_shutdown(0x8) called from line 985 of file fs/xfs/xfs_trans.c. Return address = 0xffffffffc0195966
> [ 2.088795] XFS (sda1): Corruption of in-memory data detected. Shutting down filesystem
> [ 2.090273] XFS (sda1): Please umount the filesystem and rectify the problem(s)
> [ 2.091519] XFS (sda1): Failed to recover EFIs
> [ 2.092299] XFS (sda1): log mount finish failed
> [FAILED] Failed to mount /sysroot.
> .
> .
> .
> Generating "/run/initramfs/rdsosreport.txt"
> [ 2.178103] blk_update_request: I/O error, dev fd0, sector 0
> [ 2.246106] blk_update_request: I/O error, dev fd0, sector 0
> -------------------------------------------------------------------
>
> Here is the replicated steps:
> *1. Start primary VM with command, and do every thing you want on PVM*
> qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio -vnc :5 \
> -device piix3-usb-uhci,id=puu -device usb-tablet,id=ut -name primary \
> -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
> -device rtl8139,id=e0,netdev=hn0 \
> -drive if=ide,id=colo-disk0,driver=quorum,read-pattern=fifo,vote-threshold=1,children.0.file.filename=$image_path,children.0.driver=qcow2
> *2. Add the device and object to PVM with qmp command*
> {'execute':'qmp_capabilities'}
> {"execute":"chardev-add", "arguments":{ "id" : "mirror0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9003" } } } } }}
> {"execute":"chardev-add", "arguments":{ "id" : "compare1", "backend" : { "type" : "socket", "data" : { "server": true, "wait": true, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9004" } } } } }}
> {"execute":"chardev-add", "arguments":{ "id" : "compare0", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
> {"execute":"chardev-add", "arguments":{ "id" : "compare0-0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9001" } } } } }}
> {"execute":"chardev-add", "arguments":{ "id" : "compare_out", "backend" : { "type" : "socket", "data" : { "server": true, "wait": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } }}
> {"execute":"chardev-add", "arguments":{ "id" : "compare_out0", "backend" : { "type" : "socket", "data" : { "server": false, "addr": { "type": "inet", "data":{ "host": "127.0.0.1", "port": "9005" } } } } } }
> {"execute":"object-add", "arguments":{ "qom-type" : "filter-mirror", "id" : "m0", "props": { "netdev": "hn0", "outdev" : "mirror0", "queue" : "tx" } } }
> {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire0", "props": { "netdev": "hn0", "indev" : "compare_out", "queue" : "rx" } } }
> {"execute":"object-add", "arguments":{ "qom-type" : "filter-redirector", "id" : "redire1", "props": { "netdev": "hn0", "outdev" : "compare0", "queue" : "rx" } } }
> {"execute":"object-add", "arguments":{ "qom-type" : "iothread", "id" : "iothread1", "props": {} } }
> {"execute":"object-add", "arguments":{ "qom-type" : "colo-compare", "id" : "comp0", "props": { "primary_in" : "compare0-0", "secondary_in" : "compare1", "outdev" : "compare_out0", "iothread" : "iothread1"} } }
> *3. Start the secondary VM with command*
> qemu-system-x86_64 -enable-kvm -cpu qemu64,+kvmclock -m 2048 -qmp stdio \
> -vnc :6 -device piix3-usb-uhci -device usb-tablet -name secondary \
> -netdev tap,id=hn0,vhost=off,helper=/usr/local/ceph/libexec/qemu-bridge-helper \
> -device rtl8139,id=e0,netdev=hn0 \
> -chardev socket,id=red0,host=127.0.0.1,port=9003,reconnect=1 \
> -chardev socket,id=red1,host=127.0.0.1,port=9004,reconnect=1 \
> -object filter-redirector,id=f1,netdev=hn0,queue=tx,indev=red0 \
> -object filter-redirector,id=f2,netdev=hn0,queue=rx,outdev=red1 \
> -object filter-rewriter,id=rew0,netdev=hn0,queue=all \
> -drive if=none,id=colo-disk0,file.filename=$image_path,driver=qcow2,\
> node-name=node1 \
> -drive if=ide,id=active-disk0,driver=replication,mode=secondary,file.driver=qcow2,\
> top-id=active-disk0,file.file.filename=active-disk.qcow2,\
> file.backing.driver=qcow2,file.backing.file.filename=hidden-disk.qcow2,\
> file.backing.backing=colo-disk0,node-name=node2 \
> -incoming tcp:0:9998
> *4. As the document create rbd server and do migrate with qmp command*
> [image: image.png]
> *5. Kill the PVM and failover to SVM*
> [image: image.png]
> *6. Reboot the secondary VM, then we will get the error.*
> It is high possibility to occur this error.
>
> Therefore, we can solve the image problem by *xfs_repair*, then reboot the
> VM it will work.
> Command:
> xfs_repair -L /dev/sda1
>
> Do you have any idea to occur this problem?

Hi Daniel,
The disks have to be synchronized before you can start COLO.
So try something like this:

{'execute': 'drive-mirror', 'arguments':{ 'device': 'colo-disk0', 'job-id': 'resync', 'target': 'nbd://SECONDARY:?/colo-disk0', 'mode': 'existing', 'format': 'raw', 'sync': 'full'} }

Then, after the job is ready:

{'execute': 'stop'}
{'execute': 'block-job-cancel', 'arguments':{ 'device': 'resync'} }

And then you can add the replication driver and start colo.

Regards,
Lukas Straub
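
For anyone scripting the resync, the ordering above (start the mirror, wait until the job is ready, then stop the guest and cancel the job) could be sketched roughly as below. This is only an illustration, not part of QEMU: the helper names build_resync_commands, is_job_ready and run_resync are made up here, send/recv stand for whatever QMP transport you already use, and it assumes that "the job is ready" is signalled by a BLOCK_JOB_READY event whose data.device matches the job id.

```python
import json


def build_resync_commands(target, device="colo-disk0", job_id="resync"):
    """Return (start, finish) lists of QMP commands for the disk resync.

    `target` is the NBD URL of the secondary's disk export; it is passed in
    unchanged because the host and port depend on your setup.
    """
    start = [{
        "execute": "drive-mirror",
        "arguments": {
            "device": device,
            "job-id": job_id,
            "target": target,
            "mode": "existing",
            "format": "raw",
            "sync": "full",
        },
    }]
    finish = [
        {"execute": "stop"},
        {"execute": "block-job-cancel", "arguments": {"device": job_id}},
    ]
    return start, finish


def is_job_ready(message, job_id="resync"):
    """True if a decoded QMP message is the ready event for our job (assumed shape)."""
    return (message.get("event") == "BLOCK_JOB_READY"
            and message.get("data", {}).get("device") == job_id)


def run_resync(send, recv, target, device="colo-disk0", job_id="resync"):
    """Drive the sequence over a hypothetical transport.

    send(str) writes one JSON command; recv() returns one JSON message string.
    """
    start, finish = build_resync_commands(target, device, job_id)
    for cmd in start:
        send(json.dumps(cmd))
    # Skip command replies and unrelated events until the mirror job is ready.
    while not is_job_ready(json.loads(recv()), job_id):
        pass
    for cmd in finish:
        send(json.dumps(cmd))
```

Only after run_resync returns would you add the replication driver and start COLO, as described above. The target would be the same 'nbd://SECONDARY:?/colo-disk0' style URL with the host and port filled in for your environment.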