Hello guys,

I found data loss when flattening a cloned image on giant (0.87.2). The
problem can be easily reproduced by running the following script:

ceph osd pool create wuxingyi 1 1
rbd create --image-format 2 wuxingyi/disk1.img --size 8
# write "FOOBAR" at offset 0
python writetooffset.py disk1.img 0 FOOBAR
rbd snap create wuxingyi/disk1.img@SNAPSHOT
rbd snap protect wuxingyi/disk1.img@SNAPSHOT
echo "start cloning"
rbd clone wuxingyi/disk1.img@SNAPSHOT wuxingyi/CLONEIMAGE
# write "WUXINGYI" at offset 4M of the cloned image
python writetooffset.py CLONEIMAGE $((4*1048576)) WUXINGYI
rbd snap create wuxingyi/CLONEIMAGE@CLONEDSNAPSHOT
# modify the data at offset 4M of the cloned image
python writetooffset.py CLONEIMAGE $((4*1048576)) HEHEHEHE
echo "start flattening CLONEIMAGE"
rbd flatten wuxingyi/CLONEIMAGE
echo "before rollback"
rbd export wuxingyi/CLONEIMAGE && hexdump -C CLONEIMAGE
rm -f CLONEIMAGE
rbd snap rollback wuxingyi/CLONEIMAGE@CLONEDSNAPSHOT
echo "after rollback"
rbd export wuxingyi/CLONEIMAGE && hexdump -C CLONEIMAGE
rm -f CLONEIMAGE

where writetooffset.py is a simple Python script that writes the given data at
the given offset of the image:

#!/usr/bin/python
#coding=utf-8
import sys
import rbd
import rados

# usage: writetooffset.py <image> <offset> <data>
cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('wuxingyi')
rbd_inst = rbd.RBD()
image = rbd.Image(ioctx, sys.argv[1])
image.write(sys.argv[3], int(sys.argv[2]))

The output is something like:

before rollback
Exporting image: 100% complete...done.
00000000  46 4f 4f 42 41 52 00 00  00 00 00 00 00 00 00 00  |FOOBAR..........|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00400000  48 45 48 45 48 45 48 45  00 00 00 00 00 00 00 00  |HEHEHEHE........|
00400010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00800000
Rolling back to snapshot: 100% complete...done.
after rollback
Exporting image: 100% complete...done.
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00400000  57 55 58 49 4e 47 59 49  00 00 00 00 00 00 00 00  |WUXINGYI........|
00400010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00800000

We can easily see that the first object of the image is lost after the
rollback: offset 0 should still hold "FOOBAR", which the clone inherited from
the parent snapshot when CLONEDSNAPSHOT was taken. I found that the data loss
happens during flattening: only a "head" version of the first object is
created, whereas a "snapid" version of the object should also be created and
written when flattening.

But when I run this script on upstream code, I cannot hit the problem. I
looked through the upstream code but could not find which commit fixes this
bug. I also noticed that the whole state machine dealing with RBD layering has
changed a lot since the giant release.

Could you please give me some hints on which commits I should backport?

Thanks!
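
P.S. For anyone who wants to check the exported images without reading the
hexdump output by eye, here is a rough sketch of a helper that lists which
objects of an exported file contain non-zero data. It is not part of the
reproduction above: the name nonzero_objects is made up for illustration, and
it assumes 4 MiB objects (the RBD default, order 22), which matches the
offsets in the dump.

```python
#!/usr/bin/python
# coding=utf-8
# Hypothetical helper (not part of the original report): list which
# fixed-size objects of an exported image file contain non-zero data.
import sys

# Assumption: the image uses 4 MiB objects (the RBD default, order 22).
OBJECT_SIZE = 4 * 1024 * 1024


def nonzero_objects(path, object_size=OBJECT_SIZE):
    """Return indices of object-sized chunks that hold any non-zero byte."""
    found = []
    with open(path, 'rb') as f:
        index = 0
        while True:
            chunk = f.read(object_size)
            if not chunk:
                break
            # Stripping NUL bytes leaves nothing when the chunk is all zeros.
            if chunk.strip(b'\0'):
                found.append(index)
            index += 1
    return found


if __name__ == '__main__':
    for name in sys.argv[1:]:
        print('%s: objects with data: %s' % (name, nonzero_objects(name)))
```

To use it, keep a copy of each export instead of removing it right away and
pass the copies as arguments. With the output shown above, the export taken
before the rollback has data in objects 0 and 1, while the export taken after
the rollback has data only in object 1, although object 0 should still hold
"FOOBAR".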
                                                                            
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
