On Mon, Apr 17 2017, gordon chung wrote:

Hi Gordon,
> i've started to implement multiple buckets and the initial tests look
> promising. here's some things i've done:
>
> - dropped the scheduler process and allow processing workers to figure
>   out tasks themselves
> - each sack is now handled fully (not counting anything added after the
>   processing worker starts)
> - the number of sacks is static
>
> after the above, i've been testing it and it works pretty well. i'm
> able to process 40K metrics, 60 points each, in 8-10 mins with 54
> workers, when it took significantly longer before.

Great!

> the issues i've run into:
>
> - dynamic sack size
> making the number of sacks dynamic is a concern. previously, we said to
> have the sack size in the conf file. the concern is that changing that
> option incorrectly actually 'corrupts' the db into a state it cannot
> recover from: it will constantly have stray unprocessed measures. if we
> change the db path incorrectly, we don't actually corrupt anything, we
> just lose data. we've said we don't want sack mappings in the indexer,
> so it seems to me the only safe solution is to make the sack size
> static and only changeable by hacking?

Not hacking, just that we need a proper tool to rebalance it. As I
already wrote, I think it's good enough to have this documented and set
to a moderately good value by default (e.g. 4096). There's no need to
store it in a configuration file; it should be stored in the storage
driver itself, written when the storage is initialized via
`gnocchi-upgrade', to avoid any mistake.

> - sack distribution
> to distribute sacks across workers, i initially implemented consistent
> hashing. the issue i noticed is that because the hashring inherently
> has a non-uniform distribution[1], i would have workers sitting idle
> because they were given fewer sacks, while other workers were still
> working.
>
> i also tried to implement jump hash[2], which improved distribution
> and is, in theory, less memory intensive as it does not maintain a
> hash table.
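(For reference, the jump hash in [2] is tiny; here is a Python sketch of the published Lamping & Veach algorithm, with their 64-bit LCG constant. It keeps no ring or table in memory at all:)

```python
def jump_hash(key, num_buckets):
    """Jump consistent hash: maps a 64-bit integer key to a bucket in
    [0, num_buckets).  When num_buckets grows from n to n+1, a key
    either keeps its bucket or moves to the new bucket n."""
    b, j = -1, 0
    while j < num_buckets:
        b = j
        # 64-bit linear congruential step (constant from the paper).
        key = (key * 2862933555777941757 + 1) % (1 << 64)
        j = int((b + 1) * ((1 << 31) / ((key >> 33) + 1)))
    return b
```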
> while better at distribution, it is still not completely uniform, and
> similarly, the fewer sacks per worker, the worse the distribution.
>
> lastly, i tried just simple locking, where each worker is completely
> unaware of any other worker and handles all sacks. it locks the sack it
> is working on, so if another worker tries to work on it, it will just
> skip. this effectively puts an additional requirement on the locking
> system (in my case redis), as each worker will make x lock requests
> where x is the number of sacks. so if we have 50 workers and 2048
> sacks, that is ~102K requests per cycle, in addition to the n lock
> requests per metric (10K-1M metrics?). it does guarantee that if a
> worker is free and there is work to be done, it will do it.
>
> i guess the question i have is: by using a non-uniform hash, it seems
> we possibly gain less load at the expense of efficiency/'speed'. the
> number of sacks/tasks we have is stable; it won't really change. the
> number of metricd workers may change, but not constantly. lastly, the
> number of sacks per worker will always be relatively low (10:1, 100:1,
> assuming the max number of sacks is 2048). given these conditions, do
> we need consistent/jump hashing? is it better to just modulo sacks and
> ensure 'uniform' distribution, allowing a 'larger' set of buckets to be
> reshuffled when workers are added?

What about using the hashring with replicas (e.g. 3 by default) and a
lock per sack? This should largely reduce the number of lock attempts
you see. If you have 2048 sacks divided across 50 workers with 3
replicas each, each process cares about roughly 123 sacks
(2048 × 3 / 50), so each might send ~123 acquire() attempts, i.e. about
6,150 acquire requests in total, roughly 17 times fewer than 102K. This
also solves the non-uniform distribution problem, as having replicas
makes sure every node gets some work.
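To make that concrete, here is a toy sketch (this is not the tooz API; the class and function names are made up for illustration, and a threading.Lock stands in for the Redis/tooz sack lock): each sack maps to `replicas` workers on the ring, and each worker only attempts a non-blocking lock on the sacks it is assigned, skipping any sack another replica already holds.

```python
import bisect
import hashlib
import threading


def _hash(value):
    # Stable hash so every worker computes the same ring.
    return int(hashlib.md5(value.encode()).hexdigest(), 16)


class HashRing(object):
    """Toy consistent-hash ring where each key is served by `replicas`
    distinct nodes, so every sack has several candidate workers."""

    def __init__(self, nodes, replicas=3, points_per_node=128):
        self.replicas = min(replicas, len(nodes))
        self._ring = sorted(
            (_hash("%s-%d" % (node, p)), node)
            for node in nodes
            for p in range(points_per_node))
        self._hashes = [h for h, _ in self._ring]

    def get_nodes(self, key):
        """Return the `replicas` distinct nodes responsible for key."""
        nodes = []
        i = bisect.bisect(self._hashes, _hash(key))
        while len(nodes) < self.replicas:
            node = self._ring[i % len(self._ring)][1]
            if node not in nodes:
                nodes.append(node)
            i += 1
        return nodes


def process_my_sacks(ring, worker, locks, process):
    """Walk only the sacks the ring assigns to this worker, and skip
    any sack whose lock another replica already holds."""
    done = []
    for sack, lock in enumerate(locks):
        if worker not in ring.get_nodes("sack-%d" % sack):
            continue
        if lock.acquire(blocking=False):  # held elsewhere? just skip
            try:
                process(sack)
                done.append(sack)
            finally:
                lock.release()
    return done
```

With 50 workers and 2048 sacks each worker then only issues acquire() on its ~123 assigned sacks instead of on all 2048, and because every sack has several replica holders, an idle worker can still pick up a sack whose other holders are busy.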
You can then probably remove the per-metric lock too: it is only used
when processing new measures (where the sack lock is enough) and when
expunging metrics. You can safely use the same per-sack lock for
expunging metrics; we may just need to move that out of the janitor?

Something to think about!

Cheers,
-- 
Julien Danjou
-- Free Software hacker
-- https://julien.danjou.info
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev