Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

Sean Tue, 05 May 2015 12:14:35 -0700


Hello Yehuda and the rest of the mailing list.

My main question currently is: why are the bucket index and the object manifest ever different? Based on how we are uploading data, I do not think that the rados gateway should ever know the full file size without having all of the objects within ceph at one point in time. So after the multipart is marked as completed, the Rados gateway should cat through all of the objects and make a complete part, correct?




Secondly,

I think I am not understanding the process to grab all of the parts correctly. To continue to use my example file "86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam" in bucket tcga_cghub_protected, I would be using the following to grab the prefix:

prefix=$(radosgw-admin object stat --bucket=tcga_cghub_protected --object=86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam | grep -iE '"prefix"' | awk -F"\"" '{print $4}')

This should take everything between the quotes for the prefix key and give me the value.



In this case:

"prefix":"86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S",



So

lacadmin@kh10-9:~$ echo ${prefix}

86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S
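Note that the "\/" in the value above is just JSON's escaped form of "/". A minimal sketch of the same extraction that also undoes that escaping, so the result can be matched as a fixed string, might look like this. The function name extract_prefix is hypothetical, and it assumes the stat output carries the "prefix" key on a single line, as in the excerpt above:

```shell
# Hypothetical helper: read `radosgw-admin object stat` JSON on stdin and
# print the value of the "prefix" key with JSON's "\/" turned back into "/".
# Assumes the key and its value sit on one line, as in the excerpt above.
extract_prefix() {
    grep -iE '"prefix"' \
        | awk -F'"' '{print $4}' \
        | sed 's#\\/#/#g'
}

# Intended use (not run here, needs a live cluster):
#   prefix=$(radosgw-admin object stat --bucket=BUCKET --object=OBJECT | extract_prefix)
```

With the escapes removed, the prefix can be passed to grep -F so that no character in it is treated as a regular expression.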

From here I list all of the objects in the .rgw.buckets pool and grep for that prefix, which yields 1335 objects. If I cat all of these objects together I only end up with a 5468160 byte file, which is 2G short of what the object manifest says it should be. If I grab the file and tail the Rados gateway log I end up with 1849 objects, and when I sum them all up I end up with 7744771642, which is the same size that the manifest reports. I understand that this does nothing other than verify the manifest's accuracy, but I still find it interesting. The missing chunks may still exist in ceph outside of the object manifest and tagged with the same prefix, correct? Or am I misunderstanding something?
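The size comparison described above can be done without concatenating the objects. This is only a sketch under assumptions: the function name sum_sizes is hypothetical, and it assumes each input line ends with the object's size in bytes, which is how I understand the output of "rados stat" to look (for example ".rgw.buckets/NAME mtime ..., size 4194304"):

```shell
# Hypothetical helper: sum the last whitespace-separated field of every
# input line. Assumes that field is a size in bytes (e.g. `rados stat` output).
# "%.0f" is used so totals above 2^31 print correctly in every awk variant.
sum_sizes() {
    awk '{ total += $NF } END { printf "%.0f\n", total }'
}

# Intended use (not run here, needs a live cluster; $prefix as extracted earlier):
#   rados -p .rgw.buckets ls | grep -F "$prefix" \
#       | while read -r obj; do rados -p .rgw.buckets stat "$obj"; done \
#       | sum_sizes
```

The printed total can then be compared directly with the size reported by the object manifest.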

We have over 40384 files in the tcga_cghub_protected bucket and only 66 of these files are suffering from this truncation issue. What I need to know is: is this happening on the gateway side or on the client side? Next, I need to know what possible actions can occur where the bucket index and the object manifest would be mismatched like this, as 40318 out of 40384 are working without issue.

The truncated files are of all different sizes (5 megabytes - 980 gigabytes) and the truncation seems to be all over. By "all over" I mean some files are missing the first few bytes that should read "bam" and some are missing parts in the middle.

Our upload code uses mmap to stream chunks of the file to the Rados gateway via a multipart upload, but nowhere on the client side do we have a direct reference to the files we are using, nor do we specify the size in any way. So where is the gateway getting the correct complete file size from, and how is the bucket index showing the intended file size?

This implies that, at some point in time, ceph was able to see all of the parts of the file and calculate the correct total size. This to me seems like a rados gateway bug regardless of how the file is being uploaded. I think that the RGW should be able to be fuzzed and still store the data correctly.

Why is the bucket list not matching the bucket index, and how can I verify that the data is not being corrupted by the RGW or, worse, after it is committed to ceph?

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
