On Wed, Feb 5, 2014 at 2:21 PM, Josh Durgin <josh.dur...@inktank.com> wrote:
> On 02/05/2014 01:23 PM, Craig Lewis wrote:
>> On 2/4/14 20:02 , Josh Durgin wrote:
>>> From the log it looks like you're hitting the default maximum number of
>>> entries to be processed at once per shard. This was intended to prevent
>>> one really busy shard from blocking progress on syncing other shards,
>>> since the remainder will be synced the next time the shard is processed.
>>> Perhaps the default is too low though, or the idea should be scrapped
>>> altogether since you can sync other shards in parallel.
>>> For your particular usage, since you're updating the same few buckets,
>>> the max entries limit is hit constantly. You can increase it with
>>> max-entries: 1000000 in the config file or --max-entries 10000000 on
>>> the command line.
>>> Josh
>> This doesn't appear to have any effect:
>> root@ceph1c:/etc/init.d# grep max-entries /etc/ceph/radosgw-agent.conf
>> max-entries: 1000000
>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after'
>> /var/log/ceph/radosgw-agent.log  | tail -1
>> 2014-02-05T13:11:03.915 2743:INFO:radosgw_agent.worker:bucket instance
>> "live-2:us-west-1.35026898.2" has 1000 entries after
>> "00000206789.410535.3"
>> Neither does --max-entries 10000000:
>> root@ceph1c:/etc/init.d# ps auxww | grep radosgw-agent | grep max-entries
>> root     19710  6.0  0.0  74492 18708 pts/3    S    13:22 0:00
>> /usr/bin/python /usr/bin/radosgw-agent --incremental-sync-delay=10
>> --max-entries 10000000 -c /etc/ceph/radosgw-agent.conf
>> root@ceph1c:/etc/init.d# egrep 'has [0-9]+ entries after'
>> /var/log/ceph/radosgw-agent.us-west-1.us-central-1.log  | tail -1
>> 2014-02-05T13:22:58.577 21626:INFO:radosgw_agent.worker:bucket instance
>> "live-2:us-west-1.35026898.2" has 1000 entries after
>> "00000207788.411542.2"
>> I guess I'll look into that too, since I'll be in that area of the code.
> It seems to be a hardcoded limit on the server side to prevent a single
> osd operation from taking too long (see cls_log_list() in ceph.git
> src/cls/cls_log.cc).

Right, that part is intentional. Otherwise a single osd operation might take too long.

> This should probably be fixed in radosgw, but you could work around it
> with a loop in radosgw-agent.

No need to change the radosgw side; the limit is intentional. The agent
should assume responses are paged and behave accordingly.
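
To illustrate what "behave accordingly" could look like, here is a rough sketch of a paging loop. This is not the actual radosgw-agent code: fetch_all_entries, make_fake_server, the "id" field, and the marker format are all made up for the example, and the fake server only imitates the 1000-entry cap seen in the logs above.

```python
def fetch_all_entries(fetch_page, marker="", max_entries=1000):
    """Collect every log entry after `marker`, one page at a time.

    `fetch_page(marker, max_entries)` is a hypothetical callable that
    returns a list of entries, each a dict with an "id" usable as the
    next marker. The server may return fewer entries than requested.
    """
    entries = []
    while True:
        page = fetch_page(marker, max_entries)
        if not page:
            # An empty page is the only reliable end-of-log signal here.
            break
        entries.extend(page)
        # Resume after the last entry we have seen.
        marker = page[-1]["id"]
    return entries, marker


def make_fake_server(total, server_cap=1000):
    """Stand-in for the gateway: never returns more than `server_cap`."""
    log = [{"id": "%011d" % i} for i in range(1, total + 1)]

    def fetch_page(marker, max_entries):
        start = int(marker) if marker else 0
        limit = min(max_entries, server_cap)
        return log[start:start + limit]

    return fetch_page


if __name__ == "__main__":
    fetch = make_fake_server(total=2500)
    # Asking for a million entries still yields pages of at most 1000.
    entries, last_marker = fetch_all_entries(fetch, max_entries=1000000)
    print(len(entries), last_marker)
```

Note that the loop stops on an empty page rather than on a short one. Since the server can cap a page below the requested size, a page shorter than max_entries does not mean the log is exhausted.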

ceph-users mailing list
