Hello Yehuda and the rest of the mailing list.
My main question currently is: why are the bucket index and the object
manifest ever different? Based on how we are uploading data, I do not
think that the Rados gateway should ever know the full file size without
having had all of the objects within ceph at one point in time. So after
the multipart upload is marked as completed, the Rados gateway should cat
through all of the objects and make a complete part, correct?
Secondly, I think I am not understanding the process to grab all of the
parts correctly. To continue with my example file
"86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam"
in bucket tcga_cghub_protected, I would use the following to grab the
prefix:
prefix=$(radosgw-admin object stat --bucket=tcga_cghub_protected \
  --object="86b6fad8-3c53-465f-8758-2009d6df01e9/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam" \
  | grep -iE '"prefix"' | awk -F"\"" '{print $4}')
This should take everything between the quotes after the prefix key and
give me the value.
In this case:
"prefix":
"86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S",
So
lacadmin@kh10-9:~$ echo ${prefix}
86b6fad8-3c53-465f-8758-2009d6df01e9\/TCGA-A2-A0T7-01A-21D-A099-09_IlluminaGA-DNASeq_exome.bam.2\/YAROhWaAm9LPwCHeP55cD4CKlLC0B4S
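The echo output above still carries the JSON escaping (\/ standing for /). A regex grep generally tolerates that, but a fixed-string match (grep -F) against object names would not, and the dots in the value act as wildcards in a regex. As a minimal sketch only, the extraction could unescape the value first. extract_prefix is a hypothetical helper name, and the sketch assumes the "prefix" key and its value sit on one line of the radosgw-admin output:

```shell
# Hypothetical helper: read `radosgw-admin object stat` JSON on stdin and
# print the manifest prefix with JSON-escaped slashes (\/) turned into /.
# Assumes the "prefix" key and its value are on the same line.
extract_prefix() {
    grep -iE '"prefix"' | awk -F'"' '{print $4}' | sed 's|\\/|/|g'
}
```

The result can then be used as a literal string, for example by piping the stat output through extract_prefix and matching object names with grep -F -- "$prefix".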
From here I list all of the objects in the .rgw.buckets pool and grep
for that prefix, which yields 1335 objects. If I cat all of these
objects together, I only end up with a 5468160 byte file, which is 2G
short of what the object manifest says it should be. If I grab the file
and tail the Rados gateway log, I end up with 1849 objects, and when I
sum them all up I end up with 7744771642, which is the same size that
the manifest reports. I understand that this does nothing other than
verify the manifest's accuracy, but I still find it interesting. The
missing chunks may still exist in ceph outside of the object manifest
and tagged with the same prefix, correct? Or am I misunderstanding
something?
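The count-and-sum comparison described above can be sketched roughly as follows. sum_matching is a hypothetical helper, and the input format (one "name size" pair per line, names without whitespace) is an assumption for illustration, not something a Ceph tool prints directly:

```shell
# Hypothetical helper: count and sum the objects whose name contains a
# given literal string.
# stdin: one "<object-name> <size-in-bytes>" pair per line (assumed format).
# $1:    the literal string to look for (no regex interpretation).
# Prints "<object-count> <total-bytes>".
sum_matching() {
    PREFIX="$1" awk '
        index($1, ENVIRON["PREFIX"]) > 0 { total += $2; n++ }
        END { printf "%d %.0f\n", n, total }'
}
```

The two numbers it prints can be compared directly with the object count and the size that the manifest reports.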
We have over 40384 files in the tcga_cghub_protected bucket and only 66
of these files are suffering from this truncation issue. What I need to
know is: is this happening on the gateway side or on the client side?
Next, I need to know what possible actions could leave the bucket index
and the object manifest mismatched like this, given that 40318 out of
40384 are working without issue.
The truncated files are of all different sizes (5 megabytes - 980
gigabytes) and the truncation seems to be all over. By "all over" I mean
some files are missing the first few bytes that should read "bam" and
some are missing parts in the middle.
Our upload code uses mmap to stream chunks of the file to the Rados
gateway via a multipart upload, but nowhere on the client side do we
have a direct reference to the files we are using, nor do we specify
the size in any way. So where is the gateway getting the correct complete
file size from, and how is the bucket index showing the intended file size?
This implies that, at some point in time, ceph was able to see all of
the parts of the file and calculate the correct total size. This to me
seems like a rados gateway bug regardless of how the file is being
uploaded. I think that the RGW should be able to be fuzzed and still
store the data correctly.
Why is the bucket list not matching the bucket index and how can I
verify that the data is not being corrupted by the RGW or worse, after
it is committed to ceph?
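One check that needs no knowledge of the gateway internals is to download an object again and compare it with the local source. A rough sketch, assuming both copies are available as local files; compare_copies is a hypothetical helper and both paths are placeholders:

```shell
# Hypothetical helper: compare a local source file ($1) with a copy
# downloaded back from the gateway ($2).
# Returns 0 and prints "identical" only if size and content both match.
compare_copies() {
    src_size=$(wc -c < "$1")
    copy_size=$(wc -c < "$2")
    if [ "$src_size" -ne "$copy_size" ]; then
        echo "size mismatch: source=$src_size copy=$copy_size"
        return 1
    fi
    # cmp -s is silent and only sets the exit status.
    if ! cmp -s "$1" "$2"; then
        echo "same size, different content"
        return 1
    fi
    echo "identical"
}
```

A size mismatch points at truncation, while equal sizes with differing content would point at corruption somewhere along the path.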
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com