Hi ceph-users,

After reading through the GC-related code, I am thinking of using a much larger value for "rgw gc max objs" (such as 997), and I don't see any side effects from increasing this value. Did I miss anything?

Thanks,
Guang

Begin forwarded message:

> From: redm...@tracker.ceph.com
> Subject: [rgw - Bug #7073] (New) "rgw gc max objs" should have a prime number as default value
> Date: December 31, 2013 3:28:53 PM GMT+08:00
>
> Issue #7073 has been reported by Guang Yang.
>
> Bug #7073: "rgw gc max objs" should have a prime number as default value
>
> Author: Guang Yang
> Status: New
> Priority: Normal
> Assignee:
> Category:
> Target version:
> Source: other
> Backport:
> Tags:
> Severity: 3 - minor
> Reviewed:
>
> Recently, while troubleshooting increasing latency on our Ceph cluster, we
> observed that a couple of gc objects were hotspots that slowed down the
> entire OSD. After checking the .rgw.gc pool, we found that a couple of gc
> objects had tens of thousands of entries while other gc objects had none.
>
> The cause is a bad default value (32) for "rgw gc max objs".
>
> The data flow is:
> 1. Each object has an object ID with the pattern
>    {client_id}.{eachreqincrease_by_1_number}, for example
>    0_default.4351.24557.
> 2. Each delete request needs to set a gc entry for the object, as follows:
>    2.1 Hash the object ID to figure out which gc object to use (0-31).
>    2.2 Set two entries for that gc object.
>
> The problem comes from step 2.1: because the default max objs is 32, the
> hashed value of each string (object tag) is taken mod 32, which results in
> an uneven distribution. It should definitely use a prime number to get an
> even distribution.
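
A minimal sketch of step 2.1, not the actual RGW code: the helper below (gc_index is a hypothetical name) reuses the string hash from the simulation program in this report and reduces it modulo the configured number of gc objects. Because 32 is a power of two, n % 32 keeps only the low five bits of the hash, and for this hash those bits depend only on whether each trailing digit is odd or even.

```cpp
#include <cassert>
#include <string>

// String hash copied from the simulation program in this report.
unsigned str_hash(const char* str, unsigned length) {
    unsigned long hash = 0;
    while (length--) {
        unsigned char c = *str++;
        hash = (hash + (c << 4) + (c >> 4)) * 11;
    }
    return static_cast<unsigned>(hash);
}

// Hypothetical helper: index of the gc object chosen for an object tag.
unsigned gc_index(const std::string& tag, unsigned max_objs) {
    assert(max_objs > 0);  // a modulus of zero would be undefined
    return str_hash(tag.c_str(), static_cast<unsigned>(tag.size())) % max_objs;
}
```

For example, gc_index("0_default.4351.24557", 32) and gc_index("0_default.4351.24559", 32) return the same index in this sketch, because the two tags differ only in a final digit that is odd in both cases.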
>
> I wrote a small program to simulate the above:
>
> #include <iostream>
> #include <sstream>
> #include <string>
> using namespace std;
>
> unsigned str_hash(const char* str, unsigned length) {
>   unsigned long hash = 0;
>   while (length--) {
>     unsigned char c = *str++;
>     hash = (hash + (c << 4) + (c >> 4)) * 11;
>   }
>   return hash;
> }
>
> int main() {
>   int gc_old[32] = {0};  // entries per gc object with max objs = 32
>   int gc_new[31] = {0};  // entries per gc object with max objs = 31
>   string base("0_default.4351.");
>   ostringstream os;
>   for (int i = 0; i < 10000; ++i) {
>     // clear() resets only the stream's state flags; the buffer keeps
>     // the digits written so far
>     os.clear();
>     os << i;
>     string tag = base + os.str();
>     unsigned n = str_hash(tag.c_str(), tag.size());
>     gc_old[n % 32]++;
>     gc_new[n % 31]++;
>   }
>
>   cout << "with use max objs 32..." << endl;
>   for (int i = 0; i < 32; ++i) {
>     cout << "gc." << i << " " << gc_old[i] << endl;
>   }
>   cout << "with use max objs 31..." << endl;
>   for (int i = 0; i < 31; ++i) {
>     cout << "gc." << i << " " << gc_new[i] << endl;
>   }
>   return 0;
> }
>
> The output of the program is:
>
> with use max objs 32...
> gc.0 0
> gc.1 0
> gc.2 2317
> gc.3 58
> gc.4 0
> gc.5 0
> gc.6 68
> gc.7 57
> gc.8 0
> gc.9 0
> gc.10 68
> gc.11 57
> gc.12 0
> gc.13 0
> gc.14 67
> gc.15 57
> gc.16 0
> gc.17 0
> gc.18 2319
> gc.19 55
> gc.20 0
> gc.21 0
> gc.22 69
> gc.23 57
> gc.24 0
> gc.25 0
> gc.26 4569
> gc.27 58
> gc.28 0
> gc.29 0
> gc.30 68
> gc.31 56
> with use max objs 31...
> gc.0 322
> gc.1 287
> gc.2 307
> gc.3 315
> gc.4 345
> gc.5 333
> gc.6 333
> gc.7 323
> gc.8 297
> gc.9 324
> gc.10 316
> gc.11 354
> gc.12 313
> gc.13 331
> gc.14 314
> gc.15 312
> gc.16 335
> gc.17 320
> gc.18 337
> gc.19 317
> gc.20 316
> gc.21 340
> gc.22 330
> gc.23 322
> gc.24 306
> gc.25 350
> gc.26 332
> gc.27 327
> gc.28 309
> gc.29 292
> gc.30 341
>
> To avoid the hotspot, we should choose a prime number as the default value
> and clearly document that users who change the value should also choose a
> prime number for better performance.
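
To compare candidate values such as 32, 31, or 997 before changing the setting, a sketch along the following lines may help. It assumes tags of the form 0_default.4351.N that are rebuilt from scratch for each N with std::to_string, which differs from the stream handling in the program above, so its counts need not match the listed output. bucket_histogram and nonempty_buckets are hypothetical names, and the hash is the one from this report, not necessarily what a given RGW release uses.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// String hash copied from the simulation program in this report.
unsigned str_hash(const char* str, unsigned length) {
    unsigned long hash = 0;
    while (length--) {
        unsigned char c = *str++;
        hash = (hash + (c << 4) + (c >> 4)) * 11;
    }
    return static_cast<unsigned>(hash);
}

// Hypothetical helper: number of entries per gc object for the tags
// 0_default.4351.0 .. 0_default.4351.(n_tags - 1).
std::vector<unsigned> bucket_histogram(unsigned max_objs, unsigned n_tags) {
    assert(max_objs > 0);  // a modulus of zero would be undefined
    std::vector<unsigned> counts(max_objs, 0);
    const std::string base("0_default.4351.");
    for (unsigned i = 0; i < n_tags; ++i) {
        const std::string tag = base + std::to_string(i);
        const unsigned n =
            str_hash(tag.c_str(), static_cast<unsigned>(tag.size()));
        ++counts[n % max_objs];
    }
    return counts;
}

// Hypothetical helper: how many gc objects received at least one entry.
std::size_t nonempty_buckets(const std::vector<unsigned>& counts) {
    std::size_t used = 0;
    for (unsigned c : counts) {
        if (c != 0) ++used;
    }
    return used;
}
```

Under these assumptions, nonempty_buckets(bucket_histogram(32, 10000)) is at most 8, since only the number of digits and the parity of their sum reach the low five bits of the hash. Calling the same helpers with 31 or 997 shows how many gc objects a prime value actually uses for this tag pattern.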

_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com