In my cluster I saw that the problematic objects were uploaded by a
specific application (onedata), which I think used to upload the files
with something like:

rados --pool <pool> put <objectname> <filename>

Now (since Luminous?) the default maximum object size is 128 MB, but if I
am not mistaken it was 100 GB before.
This would explain why I have such big objects around (which indeed have
old timestamps).
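
For what it's worth, one way to double-check the limit and an object's
actual size (a rough sketch; the OSD id, pool, and object name are
placeholders):

ceph daemon osd.<id> config get osd_max_object_size
rados -p <pool> stat <objectname>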

Cheers, Massimo

On Wed, Jan 15, 2020 at 7:06 PM Liam Monahan <l...@umiacs.umd.edu> wrote:

> I just changed my max object size to 256 MB and scrubbed, and the errors
> went away.  I’m not sure what can be done to reduce the size of these
> objects, though, if it really is a problem.  Our cluster has dynamic bucket
> index resharding turned on, but that resharding process shouldn’t help if
> non-index objects are what is over the limit.
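>
> For reference, the change was something like this (a sketch, assuming the
> centralized config database; 268435456 bytes = 256 MB):
>
> ceph config set osd osd_max_object_size 268435456
> ceph pg deep-scrub 9.20e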
>
> I don’t think a pg repair would do anything unless the config tunables are
> adjusted.
>
> On Jan 15, 2020, at 10:56 AM, Massimo Sgaravatto <
> massimo.sgarava...@gmail.com> wrote:
>
> I never changed the default value for that attribute.
>
> I don’t understand why I have such big objects around.
>
> I am also wondering what a pg repair would do in such a case.
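>
> Presumably that would be something like:
>
> ceph pg repair 9.20e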
>
> Il mer 15 gen 2020, 16:18 Liam Monahan <l...@umiacs.umd.edu> ha scritto:
>
>> Thanks for that link.
>>
>> Do you have a default osd max object size of 128 MB?  I’m thinking about
>> doubling that limit to 256 MB on our cluster.  Our largest object is only
>> about 10% over that limit.
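>>
>> In case it helps, a rough sketch of one way to list the largest objects
>> in a pool (pool name is a placeholder, and this is slow on big pools):
>>
>> rados -p <pool> ls | while read -r obj; do
>>     rados -p <pool> stat "$obj"
>> done | awk '{ print $NF, $1 }' | sort -rn | head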
>>
>> On Jan 15, 2020, at 3:51 AM, Massimo Sgaravatto <
>> massimo.sgarava...@gmail.com> wrote:
>>
>> I guess this is coming from:
>>
>> https://github.com/ceph/ceph/pull/30783
>>
>> introduced in Nautilus 14.2.5
>>
>> On Wed, Jan 15, 2020 at 8:10 AM Massimo Sgaravatto <
>> massimo.sgarava...@gmail.com> wrote:
>>
>>> As I wrote here:
>>>
>>>
>>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/037909.html
>>>
>>> I saw the same after an update from Luminous to Nautilus 14.2.6
>>>
>>> Cheers, Massimo
>>>
>>> On Tue, Jan 14, 2020 at 7:45 PM Liam Monahan <l...@umiacs.umd.edu>
>>> wrote:
>>>
>>>> Hi,
>>>>
>>>> I am getting one inconsistent object on our cluster with an
>>>> inconsistency error that I haven’t seen before.  This started happening
>>>> during a rolling upgrade of the cluster from 14.2.3 -> 14.2.6, but I am not
>>>> sure that’s related.
>>>>
>>>> I was hoping to know what the error means before trying a repair.
>>>>
>>>> [root@objmon04 ~]# ceph health detail
>>>> HEALTH_ERR noout flag(s) set; 1 scrub errors; Possible data damage: 1
>>>> pg inconsistent
>>>> OSDMAP_FLAGS noout flag(s) set
>>>> OSD_SCRUB_ERRORS 1 scrub errors
>>>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>>>>     pg 9.20e is active+clean+inconsistent, acting [509,674,659]
>>>>
>>>> rados list-inconsistent-obj 9.20e --format=json-pretty
>>>> {
>>>>     "epoch": 759019,
>>>>     "inconsistents": [
>>>>         {
>>>>             "object": {
>>>>                 "name":
>>>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>>>>                 "nspace": "",
>>>>                 "locator": "",
>>>>                 "snap": "head",
>>>>                 "version": 692875
>>>>             },
>>>>             "errors": [
>>>>                 "size_too_large"
>>>>             ],
>>>>             "union_shard_errors": [],
>>>>             "selected_object_info": {
>>>>                 "oid": {
>>>>                     "oid":
>>>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>>>>                     "key": "",
>>>>                     "snapid": -2,
>>>>                     "hash": 3321413134,
>>>>                     "max": 0,
>>>>                     "pool": 9,
>>>>                     "namespace": ""
>>>>                 },
>>>>                 "version": "281183'692875",
>>>>                 "prior_version": "281183'692874",
>>>>                 "last_reqid": "client.34042469.0:206759091",
>>>>                 "user_version": 692875,
>>>>                 "size": 146097278,
>>>>                 "mtime": "2017-07-03 12:43:35.569986",
>>>>                 "local_mtime": "2017-07-03 12:43:35.571196",
>>>>                 "lost": 0,
>>>>                 "flags": [
>>>>                     "dirty",
>>>>                     "data_digest",
>>>>                     "omap_digest"
>>>>                 ],
>>>>                 "truncate_seq": 0,
>>>>                 "truncate_size": 0,
>>>>                 "data_digest": "0xf19c8035",
>>>>                 "omap_digest": "0xffffffff",
>>>>                 "expected_object_size": 0,
>>>>                 "expected_write_size": 0,
>>>>                 "alloc_hint_flags": 0,
>>>>                 "manifest": {
>>>>                     "type": 0
>>>>                 },
>>>>                 "watchers": {}
>>>>             },
>>>>             "shards": [
>>>>                 {
>>>>                     "osd": 509,
>>>>                     "primary": true,
>>>>                     "errors": [],
>>>>                     "size": 146097278
>>>>                 },
>>>>                 {
>>>>                     "osd": 659,
>>>>                     "primary": false,
>>>>                     "errors": [],
>>>>                     "size": 146097278
>>>>                 },
>>>>                 {
>>>>                     "osd": 674,
>>>>                     "primary": false,
>>>>                     "errors": [],
>>>>                     "size": 146097278
>>>>                 }
>>>>             ]
>>>>         }
>>>>     ]
>>>> }
>>>>
>>>> Thanks,
>>>> Liam
>>>> —
>>>> Senior Developer
>>>> Institute for Advanced Computer Studies
>>>> University of Maryland
>>>>
>>>
>>
>