Hi Kevin,
On 1/17/2019 10:50 PM, KEVIN MICHAEL HRPCEK wrote:
Hey,
I recall reading about this somewhere but I can't find it in the docs
or list archive and confirmation from a dev or someone who knows for
sure would be nice. What I recall is that bluestore has a max 4GB file
size limit based on the design of bluestore not the
osd_max_object_size setting. The bluestore source seems to suggest
that by setting the OBJECT_MAX_SIZE to a 32bit max, giving an error if
osd_max_object_size is > OBJECT_MAX_SIZE, and not writing the data if
offset+length >= OBJECT_MAX_SIZE. So it seems like the in-OSD object
size integer can't exceed 32 bits, which is 4GB, like FAT32. Am I
correct, or am I reading all this wrong?
You're correct, BlueStore doesn't support objects larger than
OBJECT_MAX_SIZE (i.e. 4GB).
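To make the two quoted checks concrete, here is a minimal standalone sketch of the same comparisons. It is not the BlueStore code itself: kObjectMaxSize, check_osd_max_object_size, check_write and demo_checks are hypothetical names made up for illustration. Only the 0xffffffff value and the >= comparisons come from the BlueStore.cc snippets quoted in this thread.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>

// Same value as the OBJECT_MAX_SIZE define quoted from BlueStore.cc: 2^32 - 1 bytes.
constexpr std::uint64_t kObjectMaxSize = 0xffffffffULL;

// Hypothetical helper mirroring the sanity check on the config value:
// osd_max_object_size must stay strictly below the limit.
constexpr int check_osd_max_object_size(std::uint64_t osd_max_object_size) {
  return osd_max_object_size >= kObjectMaxSize ? -EINVAL : 0;
}

// Hypothetical helper mirroring the write-path check:
// a write is refused when its end offset reaches the limit.
constexpr int check_write(std::uint64_t offset, std::uint64_t length) {
  return offset + length >= kObjectMaxSize ? -E2BIG : 0;
}

// Small usage example showing the boundary behaviour of both checks.
inline void demo_checks() {
  assert(check_write(0, kObjectMaxSize - 1) == 0);       // one byte under: accepted
  assert(check_write(0, kObjectMaxSize) == -E2BIG);      // reaches the limit: refused
  assert(check_osd_max_object_size(128ULL << 20) == 0);  // e.g. 128 MiB: accepted
  assert(check_osd_max_object_size(13ULL << 30) == -EINVAL);  // 13 GiB: refused
}
```

Because both comparisons use >=, the largest extent that passes ends one byte short of 0xffffffff, i.e. slightly under 4GB.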
If bluestore has a hard 4GB object limit, using radosstriper to break
up an object would work, but does using an EC pool that breaks up the
object into shards smaller than OBJECT_MAX_SIZE have the same effect as
radosstriper to get around a 4GB limit? We use rados directly and
would like to move to bluestore, but we have some large objects <= 13G
that may need attention if this 4GB limit does exist and an EC pool
doesn't get around it.
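As a rough illustration of the client-side splitting idea, the sketch below cuts one logical object into consecutive pieces that each stay under a chosen size. This is plain contiguous chunking written for illustration, not libradosstriper's actual striping layout, and Piece and split_object are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One piece of a larger logical object (hypothetical type for illustration).
struct Piece {
  std::uint64_t index;   // position of the piece in the sequence
  std::uint64_t offset;  // byte offset within the logical object
  std::uint64_t length;  // number of bytes in this piece
};

// Hypothetical helper: cut total_size bytes into consecutive pieces of at
// most piece_size bytes each. Only the last piece may be shorter.
inline std::vector<Piece> split_object(std::uint64_t total_size,
                                       std::uint64_t piece_size) {
  assert(piece_size > 0);
  std::vector<Piece> pieces;
  std::uint64_t index = 0;
  for (std::uint64_t off = 0; off < total_size; off += piece_size, ++index) {
    pieces.push_back({index, off, std::min(piece_size, total_size - off)});
  }
  return pieces;
}
```

For example, split_object(13ULL << 30, 1ULL << 30) yields 13 pieces of 1 GiB each, all well below OBJECT_MAX_SIZE.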
Theoretically, splitting the object using EC might help. But I'm not sure
whether one needs to raise osd_max_object_size above 4GB to permit 13GB
objects in an EC pool. If that's needed, then the osd_max_object_size <
OBJECT_MAX_SIZE constraint is violated and BlueStore wouldn't start.
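For the shard-size side of this, a back-of-the-envelope sketch follows. It assumes an object is spread evenly over the k data chunks of the EC profile and ignores stripe-width padding, so the numbers are approximate. approx_shard_size and min_data_chunks are hypothetical names. It does not answer whether the osd_max_object_size check would still reject the 13GB logical object.

```cpp
#include <cassert>
#include <cstdint>

// Same value as the OBJECT_MAX_SIZE define quoted from BlueStore.cc.
constexpr std::uint64_t kObjectMaxSize = 0xffffffffULL;
constexpr std::uint64_t kGiB = 1ULL << 30;

// Approximate bytes per data shard, assuming an even split over k data
// chunks (rounded up) and no stripe padding. This is an assumption, not
// the exact on-disk layout.
inline std::uint64_t approx_shard_size(std::uint64_t object_size, unsigned k) {
  assert(k > 0);
  return (object_size + k - 1) / k;
}

// Smallest k for which the approximate shard size would pass the quoted
// write-path check (end offset strictly below OBJECT_MAX_SIZE).
inline unsigned min_data_chunks(std::uint64_t object_size) {
  unsigned k = 1;
  while (approx_shard_size(object_size, k) >= kObjectMaxSize) {
    ++k;
  }
  return k;
}
```

Under these assumptions, min_data_chunks(13 * kGiB) is 4: with k=3 each shard is still about 4.33 GiB, while with k=4 it drops to 3.25 GiB.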
https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L88
#define OBJECT_MAX_SIZE 0xffffffff // 32 bits
https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L4395
// sanity check(s)
auto osd_max_object_size =
  cct->_conf.get_val<Option::size_t>("osd_max_object_size");
if (osd_max_object_size >= (size_t)OBJECT_MAX_SIZE) {
  derr << __func__ << " osd_max_object_size >= 0x" << std::hex << OBJECT_MAX_SIZE
       << "; BlueStore has hard limit of 0x" << OBJECT_MAX_SIZE << "." << std::dec
       << dendl;
  return -EINVAL;
}
https://github.com/ceph/ceph/blob/master/src/os/bluestore/BlueStore.cc#L12331
if (offset + length >= OBJECT_MAX_SIZE) {
  r = -E2BIG;
} else {
  _assign_nid(txc, o);
  r = _do_write(txc, c, o, offset, length, bl, fadvise_flags);
  txc->write_onode(o);
}
Thanks!
Kevin
--
Kevin Hrpcek
NASA SNPP Atmosphere SIPS
Space Science & Engineering Center
University of Wisconsin-Madison
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
Thanks,
Igor