Re: [ceph-users] PG inconsistent with error "size_too_large"

2020-01-15 Thread Massimo Sgaravatto
I guess this is coming from:

https://github.com/ceph/ceph/pull/30783

introduced in Nautilus 14.2.5

On Wed, Jan 15, 2020 at 8:10 AM Massimo Sgaravatto <
massimo.sgarava...@gmail.com> wrote:

> As I wrote here:
>
>
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/037909.html
>
> I saw the same after an update from Luminous to Nautilus 14.2.6
>
> Cheers, Massimo
>
> On Tue, Jan 14, 2020 at 7:45 PM Liam Monahan  wrote:
>
>> Hi,
>>
>> I am getting one inconsistent object on our cluster with an inconsistency
>> error that I haven’t seen before.  This started happening during a rolling
>> upgrade of the cluster from 14.2.3 -> 14.2.6, but I am not sure that’s
>> related.
>>
>> I was hoping to know what the error means before trying a repair.
>>
>> [root@objmon04 ~]# ceph health detail
>> HEALTH_ERR noout flag(s) set; 1 scrub errors; Possible data damage: 1 pg
>> inconsistent
>> OSDMAP_FLAGS noout flag(s) set
>> OSD_SCRUB_ERRORS 1 scrub errors
>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>> pg 9.20e is active+clean+inconsistent, acting [509,674,659]
>>
>> rados list-inconsistent-obj 9.20e --format=json-pretty
>> {
>> "epoch": 759019,
>> "inconsistents": [
>> {
>> "object": {
>> "name":
>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>> "nspace": "",
>> "locator": "",
>> "snap": "head",
>> "version": 692875
>> },
>> "errors": [
>> "size_too_large"
>> ],
>> "union_shard_errors": [],
>> "selected_object_info": {
>> "oid": {
>> "oid":
>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>> "key": "",
>> "snapid": -2,
>> "hash": 3321413134,
>> "max": 0,
>> "pool": 9,
>> "namespace": ""
>> },
>> "version": "281183'692875",
>> "prior_version": "281183'692874",
>> "last_reqid": "client.34042469.0:206759091",
>> "user_version": 692875,
>> "size": 146097278,
>> "mtime": "2017-07-03 12:43:35.569986",
>> "local_mtime": "2017-07-03 12:43:35.571196",
>> "lost": 0,
>> "flags": [
>> "dirty",
>> "data_digest",
>> "omap_digest"
>> ],
>> "truncate_seq": 0,
>> "truncate_size": 0,
>> "data_digest": "0xf19c8035",
>> "omap_digest": "0x",
>> "expected_object_size": 0,
>> "expected_write_size": 0,
>> "alloc_hint_flags": 0,
>> "manifest": {
>> "type": 0
>> },
>> "watchers": {}
>> },
>> "shards": [
>> {
>> "osd": 509,
>> "primary": true,
>> "errors": [],
>> "size": 146097278
>> },
>> {
>> "osd": 659,
>> "primary": false,
>> "errors": [],
>> "size": 146097278
>> },
>> {
>> "osd": 674,
>> "primary": false,
>> "errors": [],
>> "size": 146097278
>> }
>> ]
>> }
>> ]
>> }
>>
>> Thanks,
>> Liam
>> —
>> Senior Developer
>> Institute for Advanced Computer Studies
>> University of Maryland


[ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-15 Thread Aaron
Seeing a weird mount issue.  Some info:

No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.3 LTS
Release: 18.04
Codename: bionic

Ubuntu 18.04.3 with kernel 4.15.0-74-generic
Ceph 14.2.5 & 14.2.6

With ceph-common, ceph-base, etc. installed:

ceph/stable,now 14.2.6-1bionic amd64 [installed]
ceph-base/stable,now 14.2.6-1bionic amd64 [installed]
ceph-common/stable,now 14.2.6-1bionic amd64 [installed,automatic]
ceph-mds/stable,now 14.2.6-1bionic amd64 [installed]
ceph-mgr/stable,now 14.2.6-1bionic amd64 [installed,automatic]
ceph-mgr-dashboard/stable,stable,now 14.2.6-1bionic all [installed]
ceph-mon/stable,now 14.2.6-1bionic amd64 [installed]
ceph-osd/stable,now 14.2.6-1bionic amd64 [installed]
libcephfs2/stable,now 14.2.6-1bionic amd64 [installed,automatic]
python-ceph-argparse/stable,stable,now 14.2.6-1bionic all [installed,automatic]
python-cephfs/stable,now 14.2.6-1bionic amd64 [installed,automatic]

I created a user via the get-or-create command, so I now have a user and
secret, but when I try to mount on these Ubuntu nodes the mount fails.

The mount command I run for testing is:
sudo mount -t ceph -o
name=user-20c5338c-34db-11ea-b27a-de7033e905f6,secret=AQC6dhpeyczkDxAAhRcr7oERUY4BcD2NCUkuNg==
10.10.10.10:6789:/work/20c5332d-34db-11ea-b27a-de7033e905f6 /tmp/test

I get the error:
couldn't finalize options: -34

From some tracking down, it comes from get_secret_option() in
common/secrets.c, and the corresponding Linux system error is:

#define ERANGE  34  /* Math result not representable */

Now the weird part: when I remove all of the above libs, the mount command
works. I know that there are ceph.ko modules in the Ubuntu filesystems
directory, and that Ubuntu comes with some understanding of how to mount a
CephFS filesystem. So that explains how it can mount CephFS... but what I don't
understand is why I'm getting that -34 error with the 14.2.5 and 14.2.6 libs
installed. I didn't have this issue with 14.2.3 or 14.2.4.
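
For what it's worth, a possible workaround to try (untested here, and only a
sketch) is to stop passing the secret on the command line and point mount.ceph
at a secretfile instead:

# write the CephX key to a root-only file
sudo sh -c 'echo "AQC6dhpeyczkDxAAhRcr7oERUY4BcD2NCUkuNg==" > /etc/ceph/user.secret'
sudo chmod 600 /etc/ceph/user.secret

# mount with secretfile= instead of secret= so the key is read from disk
sudo mount -t ceph \
  -o name=user-20c5338c-34db-11ea-b27a-de7033e905f6,secretfile=/etc/ceph/user.secret \
  10.10.10.10:6789:/work/20c5332d-34db-11ea-b27a-de7033e905f6 /tmp/test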

Cheers,
Aaron


[ceph-users] low io with enterprise SSDs ceph luminous - can we expect more? [klartext]

2020-01-15 Thread Stefan Bauer
Folks,



I would like to thank you again for your help with the performance speedup of
our Ceph cluster.

The customer just reported that the database is around 40% faster than before,
without changing any hardware.

This really kicks ass now! :)

We measured subop_latency (avgtime) on our OSDs and were able to reduce the
latency from 2.5 ms to 0.7 ms.
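
In case anyone wants to reproduce the measurement, this is roughly how the
counter can be read (a sketch; osd.0 is just an example daemon, and the JSON
path may differ slightly between releases):

# dump the perf counters of one OSD via its admin socket and pull out subop_latency
ceph daemon osd.0 perf dump | jq '.osd.subop_latency'
# "avgtime" is the average sub-operation latency in seconds (0.0007 ~ 0.7 ms)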



:p

Cheers

Stefan

-Original Message-
From: Виталий Филиппов
Sent: Tuesday, 14 January 2020 10:28
To: Wido den Hollander; Stefan Bauer
CC: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] low io with enterprise SSDs ceph luminous - can we
expect more? [klartext]

...disable signatures and rbd cache. I didn't mention it in the email to not 
repeat myself. But I have it in the article :-)
--
With best regards,
Vitaliy Filippov


[ceph-users] OSD's hang after network blip

2020-01-15 Thread Nick Fisk
Hi All,

Running 14.2.5, currently experiencing some network blips isolated to a single
rack, which is under investigation. However, it appears that following a
network blip, random OSDs in unaffected racks are sometimes not recovering from
the incident and are left running in a zombie state. The OSDs appear to be
running from a process perspective, but the cluster thinks they are down and
they will not rejoin the cluster until the OSD process is restarted, which
incidentally takes a lot longer than usual (the systemctl command takes a
couple of minutes to complete).

If the OSD is left in this state, CPU and memory usage of the process appears
to climb, but it never rejoins, at least not in the several hours I have left
them like this. I'm not exactly sure what the OSD is trying to do during this
period. There's nothing in the logs during this hung state to indicate that
anything is happening, but I will try to inject more verbose logging next time
it occurs.
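
In case it's useful, the plan is just to bump the OSD debug level at runtime
next time it happens, something along these lines (a sketch):

# raise debug logging on the hung daemon without restarting it
ceph tell osd.43 injectargs '--debug_osd 20 --debug_ms 1'
# or, if the cluster already considers the OSD down, via its local admin socket
ceph daemon osd.43 config set debug_osd 20
ceph daemon osd.43 config set debug_ms 1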

Not sure if anybody has come across this before or has any ideas? In the past,
as long as OSDs have been running they have always rejoined following any
network issues.

Nick

Sample from OSD and cluster logs below. The blip happened at 12:06; I restarted
the OSD at 12:26.

OSD logs from the OSD that hung (note this OSD was not directly affected by the
network outage):
2020-01-15 12:06:32.234 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::14]:6838 osd.71 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::13]:6854 osd.49 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::13]:6834 osd.51 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::13]:6862 osd.52 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::13]:6875 osd.53 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::13]:6894 osd.54 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:33.194 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no reply from [*:*:*:5::14]:6838 osd.71 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.034 7f419480a700  0 log_channel(cluster) log [WRN] : Monitor daemon marked osd.43 down, but it is still running
2020-01-15 12:06:34.034 7f419480a700  0 log_channel(cluster) log [DBG] : map e2342992 wrongly marked me down at e2342992
2020-01-15 12:06:34.034 7f419480a700  1 osd.43 2342992 start_waiting_for_healthy
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::13]:6854 osd.49 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::13]:6834 osd.51 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::13]:6862 osd.52 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::13]:6875 osd.53 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::13]:6894 osd.54 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)
2020-01-15 12:06:34.198 7f41a1023700 -1 osd.43 2342992 heartbeat_check: no reply from [*:*:*:5::14]:6838 osd.71 ever on either front or back, first ping sent 2020-01-15 12:06:11.411216 (oldest deadline 2020-01-15 12:06:31.411216)

Cluster logs
2020-01-15 12:06:09.740607 mon.mc-ceph-mon1 (mon.0) 531400 : cluster [DBG] osd.43 reported failed by osd.57
2020-01-15 12:06:09.945163 mon.mc-ceph-mon1 (mon.0) 531683 : cluster [DBG] osd.43 reported failed by osd.63
2020-01-15 12:06:09.945287 mon.mc-ceph-mon1 (mon.0) 531684 : cluster [INF] osd.43 f

Re: [ceph-users] PG inconsistent with error "size_too_large"

2020-01-15 Thread Liam Monahan
Thanks for that link.

Do you have a default osd max object size of 128M?  I’m thinking about doubling 
that limit to 256MB on our cluster.  Our largest object is only about 10% over 
that limit.
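
For reference, this is roughly how the current limit and the offending object
can be checked (a sketch: osd.509 is the primary from the PG above, and the
pool name is a placeholder for whatever data pool id 9 maps to):

# per-OSD threshold that the new scrub check compares object sizes against
ceph daemon osd.509 config get osd_max_object_size
# default is 128 MiB = 134217728 bytes

# size of the object the scrub flagged (146097278 bytes ~ 139 MiB, roughly 9% over)
rados -p <data-pool> stat \
  '2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff'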

> On Jan 15, 2020, at 3:51 AM, Massimo Sgaravatto 
>  wrote:
> 
> I guess this is coming from:
> 
> https://github.com/ceph/ceph/pull/30783 
> 
> 
> introduced in Nautilus 14.2.5
> 
> On Wed, Jan 15, 2020 at 8:10 AM Massimo Sgaravatto 
> mailto:massimo.sgarava...@gmail.com>> wrote:
> As I wrote here:
> 
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/037909.html 
> 
> 
> I saw the same after an update from Luminous to Nautilus 14.2.6
> 
> Cheers, Massimo
> 
> On Tue, Jan 14, 2020 at 7:45 PM Liam Monahan  > wrote:
> Hi,
> 
> I am getting one inconsistent object on our cluster with an inconsistency 
> error that I haven’t seen before.  This started happening during a rolling 
> upgrade of the cluster from 14.2.3 -> 14.2.6, but I am not sure that’s 
> related.
> 
> I was hoping to know what the error means before trying a repair.
> 
> [root@objmon04 ~]# ceph health detail
> HEALTH_ERR noout flag(s) set; 1 scrub errors; Possible data damage: 1 pg 
> inconsistent
> OSDMAP_FLAGS noout flag(s) set
> OSD_SCRUB_ERRORS 1 scrub errors
> PG_DAMAGED Possible data damage: 1 pg inconsistent
> pg 9.20e is active+clean+inconsistent, acting [509,674,659]
> 
> rados list-inconsistent-obj 9.20e --format=json-pretty
> {
> "epoch": 759019,
> "inconsistents": [
> {
> "object": {
> "name": 
> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
> "nspace": "",
> "locator": "",
> "snap": "head",
> "version": 692875
> },
> "errors": [
> "size_too_large"
> ],
> "union_shard_errors": [],
> "selected_object_info": {
> "oid": {
> "oid": 
> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
> "key": "",
> "snapid": -2,
> "hash": 3321413134,
> "max": 0,
> "pool": 9,
> "namespace": ""
> },
> "version": "281183'692875",
> "prior_version": "281183'692874",
> "last_reqid": "client.34042469.0:206759091",
> "user_version": 692875,
> "size": 146097278,
> "mtime": "2017-07-03 12:43:35.569986",
> "local_mtime": "2017-07-03 12:43:35.571196",
> "lost": 0,
> "flags": [
> "dirty",
> "data_digest",
> "omap_digest"
> ],
> "truncate_seq": 0,
> "truncate_size": 0,
> "data_digest": "0xf19c8035",
> "omap_digest": "0x",
> "expected_object_size": 0,
> "expected_write_size": 0,
> "alloc_hint_flags": 0,
> "manifest": {
> "type": 0
> },
> "watchers": {}
> },
> "shards": [
> {
> "osd": 509,
> "primary": true,
> "errors": [],
> "size": 146097278
> },
> {
> "osd": 659,
> "primary": false,
> "errors": [],
> "size": 146097278
> },
> {
> "osd": 674,
> "primary": false,
> "errors": [],
> "size": 146097278
> }
> ]
> }
> ]
> }
> 
> Thanks,
> Liam
> —
> Senior Developer
> Institute for Advanced Computer Studies
> University of Maryland


Re: [ceph-users] PG inconsistent with error "size_too_large"

2020-01-15 Thread Massimo Sgaravatto
I never changed the default value for that attribute.

I don't understand why I have such big objects around.

I am also wondering what a pg repair would do in such a case.

On Wed, 15 Jan 2020 at 16:18, Liam Monahan wrote:

> Thanks for that link.
>
> Do you have a default osd max object size of 128M?  I’m thinking about
> doubling that limit to 256MB on our cluster.  Our largest object is only
> about 10% over that limit.
>
> On Jan 15, 2020, at 3:51 AM, Massimo Sgaravatto <
> massimo.sgarava...@gmail.com> wrote:
>
> I guess this is coming from:
>
> https://github.com/ceph/ceph/pull/30783
>
> introduced in Nautilus 14.2.5
>
> On Wed, Jan 15, 2020 at 8:10 AM Massimo Sgaravatto <
> massimo.sgarava...@gmail.com> wrote:
>
>> As I wrote here:
>>
>>
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/037909.html
>>
>> I saw the same after an update from Luminous to Nautilus 14.2.6
>>
>> Cheers, Massimo
>>
>> On Tue, Jan 14, 2020 at 7:45 PM Liam Monahan  wrote:
>>
>>> Hi,
>>>
>>> I am getting one inconsistent object on our cluster with an
>>> inconsistency error that I haven’t seen before.  This started happening
>>> during a rolling upgrade of the cluster from 14.2.3 -> 14.2.6, but I am not
>>> sure that’s related.
>>>
>>> I was hoping to know what the error means before trying a repair.
>>>
>>> [root@objmon04 ~]# ceph health detail
>>> HEALTH_ERR noout flag(s) set; 1 scrub errors; Possible data damage: 1 pg
>>> inconsistent
>>> OSDMAP_FLAGS noout flag(s) set
>>> OSD_SCRUB_ERRORS 1 scrub errors
>>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>>> pg 9.20e is active+clean+inconsistent, acting [509,674,659]
>>>
>>> rados list-inconsistent-obj 9.20e --format=json-pretty
>>> {
>>> "epoch": 759019,
>>> "inconsistents": [
>>> {
>>> "object": {
>>> "name":
>>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>>> "nspace": "",
>>> "locator": "",
>>> "snap": "head",
>>> "version": 692875
>>> },
>>> "errors": [
>>> "size_too_large"
>>> ],
>>> "union_shard_errors": [],
>>> "selected_object_info": {
>>> "oid": {
>>> "oid":
>>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>>> "key": "",
>>> "snapid": -2,
>>> "hash": 3321413134,
>>> "max": 0,
>>> "pool": 9,
>>> "namespace": ""
>>> },
>>> "version": "281183'692875",
>>> "prior_version": "281183'692874",
>>> "last_reqid": "client.34042469.0:206759091",
>>> "user_version": 692875,
>>> "size": 146097278,
>>> "mtime": "2017-07-03 12:43:35.569986",
>>> "local_mtime": "2017-07-03 12:43:35.571196",
>>> "lost": 0,
>>> "flags": [
>>> "dirty",
>>> "data_digest",
>>> "omap_digest"
>>> ],
>>> "truncate_seq": 0,
>>> "truncate_size": 0,
>>> "data_digest": "0xf19c8035",
>>> "omap_digest": "0x",
>>> "expected_object_size": 0,
>>> "expected_write_size": 0,
>>> "alloc_hint_flags": 0,
>>> "manifest": {
>>> "type": 0
>>> },
>>> "watchers": {}
>>> },
>>> "shards": [
>>> {
>>> "osd": 509,
>>> "primary": true,
>>> "errors": [],
>>> "size": 146097278
>>> },
>>> {
>>> "osd": 659,
>>> "primary": false,
>>> "errors": [],
>>> "size": 146097278
>>> },
>>> {
>>> "osd": 674,
>>> "primary": false,
>>> "errors": [],
>>> "size": 146097278
>>> }
>>> ]
>>> }
>>> ]
>>> }
>>>
>>> Thanks,
>>> Liam
>>> —
>>> Senior Developer
>>> Institute for Advanced Computer Studies
>>> University of Maryland


Re: [ceph-users] OSD's hang after network blip

2020-01-15 Thread Nick Fisk
On Wednesday, January 15, 2020 14:37 GMT, "Nick Fisk"  wrote: 
 
> Hi All,
> 
> Running 14.2.5, currently experiencing some network blips isolated to a 
> single rack which is under investigation. However, it appears following a 
> network blip, random OSD's in unaffected racks are sometimes not recovering 
> from the incident and are left running in a zombie state. The OSD's 
> appear to be running from a process perspective, but the cluster thinks they 
> are down and will not rejoin the cluster until the OSD process is restarted, 
> which incidentally takes a lot longer than usual (systemctl command takes a 
> couple of minutes to complete).
> 
> If the OSD is left in this state, CPU and memory usage of the process appears 
> to climb, but never rejoins, at least for several hours that I have left 
> them. Not exactly sure what the OSD is trying to do during this period. 
> There's nothing in the logs during this hung state to indicate that anything 
> is happening, but I will try and inject more verbose logging next time it 
> occurs.
> 
> Not sure if anybody has come across this before or any ideas? In the past as 
> long as OSD's have been running they have always rejoined following any 
> network issues.
> 
> Nick
> 
> Sample from OSD and cluster logs below. Blip happened at 12:06, I restarted 
> OSD at 12:26
> 
> OSD Logs from OSD that hung (Note this OSD was not directly affected by 
> network outage)
> 2020-01-15 12:06:32.234 7f41a1023700 -1 osd.43 2342991 heartbeat_check: no 
> reply from [*:*:*:5::14]:6838 osd.71 ever on either front or back, first ping 
> sent 2020-01-15 12:06:1


 
It's just happened again, and I managed to pull this out of debug_osd 20:

2020-01-15 16:29:01.464 7ff1763df700 10 osd.87 2343121 handle_osd_ping osd.182 v2:[2a03:25e0:253:5::76]:6839/8394683 says i am down in 2343138
2020-01-15 16:29:01.464 7ff1763df700 10 osd.87 2343121 handle_osd_ping osd.184 v2:[2a03:25e0:253:5::76]:6814/7394522 says i am down in 2343138
2020-01-15 16:29:01.464 7ff1763df700 10 osd.87 2343121 handle_osd_ping osd.190 v2:[2a03:25e0:253:5::76]:6860/5986687 says i am down in 2343138
2020-01-15 16:29:01.668 7ff1763df700 10 osd.87 2343121 handle_osd_ping osd.19 v2:[2a03:25e0:253:5::12]:6815/5153900 says i am down in 2343138

And this from the daemon status output:
sudo ceph daemon osd.87 status
{
    "cluster_fsid": "c1703b54-b4cd-41ab-a3ba-4fab241b62f3",
    "osd_fsid": "0cd8fe7d-17be-4982-b76f-ef1cbed0c19b",
    "whoami": 87,
    "state": "waiting_for_healthy",
    "oldest_map": 2342407,
    "newest_map": 2343121,
    "num_pgs": 218
}

So the OSD doesn't seem to be getting the latest map from the mons. Map 2343138
obviously has osd.87 marked down, hence the error messages from the osd_pings.
But I'm guessing the latest map the OSD has, 2343121, still has it marked up, so
it never tries to "re-connect"?
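
One way to confirm that (a sketch) is to compare the OSD's newest map epoch
with the cluster's current epoch while it is stuck:

# current osdmap epoch according to the monitors
ceph osd dump | head -n 1
# vs. the newest map the stuck OSD has actually processed (2343121 above)
sudo ceph daemon osd.87 status
# if the daemon-side epoch stays behind while the cluster's epoch keeps moving,
# the OSD really is sitting on a stale map until it gets restarted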

Seems similar to this post from a few years back, which didn't seem to reach
any resolution:
https://www.spinics.net/lists/ceph-devel/msg31788.html

Also found this PR for Nautilus, which suggests it might be a fix for the
issue, but it should already be part of the release I'm running:
https://github.com/ceph/ceph/pull/23958

Nick



Re: [ceph-users] PG inconsistent with error "size_too_large"

2020-01-15 Thread Liam Monahan
I just changed my max object size to 256MB and scrubbed, and the errors went
away.  I'm not sure what can be done to reduce the size of these objects,
though, if it really is a problem.  Our cluster has dynamic bucket index
resharding turned on, but that resharding process won't help if non-index
objects are the ones over the limit.

I don't think a pg repair would do anything unless the config tunables are
adjusted.
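
For the record, the change amounted to something like this (a sketch, assuming
the centralized config store is in use rather than ceph.conf overrides):

# raise the scrub size-check threshold from 128 MiB to 256 MiB for all OSDs
ceph config set osd osd_max_object_size 268435456
# re-run a deep scrub on the previously inconsistent PG and re-check health
ceph pg deep-scrub 9.20e
ceph health detail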

> On Jan 15, 2020, at 10:56 AM, Massimo Sgaravatto 
>  wrote:
> 
> I never changed the default value for that attribute
> 
> I am missing why I have such big objects around 
> 
> I am also wondering what a pg repair would do in such case
> 
> Il mer 15 gen 2020, 16:18 Liam Monahan  > ha scritto:
> Thanks for that link.
> 
> Do you have a default osd max object size of 128M?  I’m thinking about 
> doubling that limit to 256MB on our cluster.  Our largest object is only 
> about 10% over that limit.
> 
>> On Jan 15, 2020, at 3:51 AM, Massimo Sgaravatto 
>> mailto:massimo.sgarava...@gmail.com>> wrote:
>> 
>> I guess this is coming from:
>> 
>> https://github.com/ceph/ceph/pull/30783 
>> 
>> 
>> introduced in Nautilus 14.2.5
>> 
>> On Wed, Jan 15, 2020 at 8:10 AM Massimo Sgaravatto 
>> mailto:massimo.sgarava...@gmail.com>> wrote:
>> As I wrote here:
>> 
>> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2020-January/037909.html 
>> 
>> 
>> I saw the same after an update from Luminous to Nautilus 14.2.6
>> 
>> Cheers, Massimo
>> 
>> On Tue, Jan 14, 2020 at 7:45 PM Liam Monahan > > wrote:
>> Hi,
>> 
>> I am getting one inconsistent object on our cluster with an inconsistency 
>> error that I haven’t seen before.  This started happening during a rolling 
>> upgrade of the cluster from 14.2.3 -> 14.2.6, but I am not sure that’s 
>> related.
>> 
>> I was hoping to know what the error means before trying a repair.
>> 
>> [root@objmon04 ~]# ceph health detail
>> HEALTH_ERR noout flag(s) set; 1 scrub errors; Possible data damage: 1 pg 
>> inconsistent
>> OSDMAP_FLAGS noout flag(s) set
>> OSD_SCRUB_ERRORS 1 scrub errors
>> PG_DAMAGED Possible data damage: 1 pg inconsistent
>> pg 9.20e is active+clean+inconsistent, acting [509,674,659]
>> 
>> rados list-inconsistent-obj 9.20e --format=json-pretty
>> {
>> "epoch": 759019,
>> "inconsistents": [
>> {
>> "object": {
>> "name": 
>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>> "nspace": "",
>> "locator": "",
>> "snap": "head",
>> "version": 692875
>> },
>> "errors": [
>> "size_too_large"
>> ],
>> "union_shard_errors": [],
>> "selected_object_info": {
>> "oid": {
>> "oid": 
>> "2017-07-03-12-8b980d5b-23de-41f9-8b14-84a5bbc3f1c9.31293422.4-activedns-diff",
>> "key": "",
>> "snapid": -2,
>> "hash": 3321413134,
>> "max": 0,
>> "pool": 9,
>> "namespace": ""
>> },
>> "version": "281183'692875",
>> "prior_version": "281183'692874",
>> "last_reqid": "client.34042469.0:206759091",
>> "user_version": 692875,
>> "size": 146097278,
>> "mtime": "2017-07-03 12:43:35.569986",
>> "local_mtime": "2017-07-03 12:43:35.571196",
>> "lost": 0,
>> "flags": [
>> "dirty",
>> "data_digest",
>> "omap_digest"
>> ],
>> "truncate_seq": 0,
>> "truncate_size": 0,
>> "data_digest": "0xf19c8035",
>> "omap_digest": "0x",
>> "expected_object_size": 0,
>> "expected_write_size": 0,
>> "alloc_hint_flags": 0,
>> "manifest": {
>> "type": 0
>> },
>> "watchers": {}
>> },
>> "shards": [
>> {
>> "osd": 509,
>> "primary": true,
>> "errors": [],
>> "size": 146097278
>> },
>> {
>> "osd": 659,
>> "primary": false,
>> "errors": [],
>> "size": 146097278
>> },
>> {
>> "osd": 674,
>> "primary": false,
>> "errors": [],
>> "size": 146097278
>> }
>> ]
>> }
>> ]
>> }

[ceph-users] Mon crashes virtual void LogMonitor::update_from_paxos(bool*)

2020-01-15 Thread Kevin Hrpcek
Hey all,

One of my mons has been having a rough time for the last day or so. It started
with a crash and restart that I didn't notice about a day ago, and now it won't
start. Where it crashes has changed over time, but it is now stuck on the last
error below. I've tried to get more information out of it with debug logging
and gdb, but I haven't seen anything that makes the root cause obvious.

Right now it is crashing at line 103 in
https://github.com/ceph/ceph/blob/mimic/src/mon/LogMonitor.cc#L103, which is
part of the mon preinit step. As best I can tell right now, it is having a
problem with a map version. I'm considering rebuilding the mon's store, though
I don't see any clear signs of corruption.
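
If it does come to that, the least invasive approach I can see (a sketch,
assuming the remaining mons still form a quorum) is to throw this mon's store
away and let it resync from its peers rather than rebuilding the store by hand:

# on sephmon5, with the other monitors healthy and in quorum
systemctl stop ceph-mon@sephmon5
mv /var/lib/ceph/mon/ceph-sephmon5 /var/lib/ceph/mon/ceph-sephmon5.bad  # keep a copy

# recreate an empty store and let the mon sync everything from the quorum
ceph mon getmap -o /tmp/monmap
ceph auth get mon. -o /tmp/mon.keyring
ceph-mon --mkfs -i sephmon5 --monmap /tmp/monmap --keyring /tmp/mon.keyring
chown -R ceph:ceph /var/lib/ceph/mon/ceph-sephmon5
systemctl start ceph-mon@sephmon5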

It bails at assert(err == 0);

  // walk through incrementals
  while (version > summary.version) {
bufferlist bl;
int err = get_version(summary.version+1, bl);
assert(err == 0);
assert(bl.length());

Has anyone seen similar or have any ideas?

ceph 13.2.8

Thanks!
Kevin


The first crash/restart

Jan 14 20:47:11 sephmon5 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/mon/Monitor.cc:
 In function 'bool Monitor::_scrub(ScrubResult*, 
std::pair, std::basic_string >*, int*)' thread 
7f5b54680700 time 2020-01-14 20:47:11.618368
Jan 14 20:47:11 sephmon5 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/mon/Monitor.cc:
 5225: FAILED assert(err == 0)
Jan 14 20:47:11 sephmon5 ceph-mon: ceph version 13.2.8 
(5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
Jan 14 20:47:11 sephmon5 ceph-mon: 1: (ceph::__ceph_assert_fail(char const*, 
char const*, int, char const*)+0x14b) [0x7f5b6440b87b]
Jan 14 20:47:11 sephmon5 ceph-mon: 2: (()+0x26fa07) [0x7f5b6440ba07]
Jan 14 20:47:11 sephmon5 ceph-mon: 3: (Monitor::_scrub(ScrubResult*, 
std::pair*, int*)+0xfa6) [0x55c3230a1896]
Jan 14 20:47:11 sephmon5 ceph-mon: 4: 
(Monitor::handle_scrub(boost::intrusive_ptr)+0x25e) 
[0x55c3230aa01e]
Jan 14 20:47:11 sephmon5 ceph-mon: 5: 
(Monitor::dispatch_op(boost::intrusive_ptr)+0xcaf) 
[0x55c3230c73ff]
Jan 14 20:47:11 sephmon5 ceph-mon: 6: (Monitor::_ms_dispatch(Message*)+0x732) 
[0x55c3230c8152]
Jan 14 20:47:11 sephmon5 ceph-mon: 7: (Monitor::ms_dispatch(Message*)+0x23) 
[0x55c3230edcc3]
Jan 14 20:47:11 sephmon5 ceph-mon: 8: (DispatchQueue::entry()+0xb7a) 
[0x7f5b644ca24a]
Jan 14 20:47:11 sephmon5 ceph-mon: 9: 
(DispatchQueue::DispatchThread::entry()+0xd) [0x7f5b645684bd]
Jan 14 20:47:11 sephmon5 ceph-mon: 10: (()+0x7e65) [0x7f5b63749e65]
Jan 14 20:47:11 sephmon5 ceph-mon: 11: (clone()+0x6d) [0x7f5b6025d88d]

Then a couple more crashes/restarts about 11 hours later with this trace

-10001> 2020-01-15 09:36:35.796 7f9600fc7700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/mon/LogMonitor.cc:
 In function 'void LogMonitor::_create_sub_incremental(MLog*, int, version_t)' 
thread 7f9600fc7700 time 2020-01-15 09:36:35.796354
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/13.2.8/rpm/el7/BUILD/ceph-13.2.8/src/mon/LogMonitor.cc:
 673: FAILED assert(err == 0)

 ceph version 13.2.8 (5579a94fafbc1f9cc913a0f5d362953a5d9c3ae0) mimic (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x14b) [0x7f9610d5287b]
 2: (()+0x26fa07) [0x7f9610d52a07]
 3: (LogMonitor::_create_sub_incremental(MLog*, int, unsigned long)+0xb54) 
[0x55aeb09e2f94]
 4: (LogMonitor::check_sub(Subscription*)+0x506) [0x55aeb09e3806]
 5: (Monitor::handle_subscribe(boost::intrusive_ptr)+0x10ed) 
[0x55aeb098973d]
 6: (Monitor::dispatch_op(boost::intrusive_ptr)+0x3cd) 
[0x55aeb09b0b1d]
 7: (Monitor::_ms_dispatch(Message*)+0x732) [0x55aeb09b2152]
 8: (Monitor::ms_dispatch(Message*)+0x23) [0x55aeb09d7cc3]
 9: (DispatchQueue::entry()+0xb7a) [0x7f9610e1124a]
 10: (DispatchQueue::DispatchThread::entry()+0xd) [0x7f9610eaf4bd]
 11: (()+0x7e65) [0x7f9610090e65]
 12: (clone()+0x6d) [0x7f960cba488d]

-10001> 2020-01-15 09:36:35.797 7f95fffc5700  1 -- 10.1.9.205:6789/0 >> - 
conn(0x55aec5dd0600 :6789 s=STATE_ACCEPTING pgs=0 cs=0 l=0)._process_connection 
sd=47 -
-10001> 2020-01-15 09:36:35.798 7f9600fc7700 -1 *** Caught signal (Aborted) **
 in thread 7f9600fc7700 thread_name:ms_dispatch


And now the mon no longer starts with this trace

  -261> 2020-01-15 16:36:46.084 7f0946674a00 10 
mon.sephmon5@-1(probing).paxosservice(logm 0..86521000) refresh
  -261> 2020-01-15 16:36:46.084 7f0946674a00 10 mon.sephmon5@-1(probing).log 
v86521000 update_from_paxos
  -261> 2020-01-15 16:36:46.084 7f0946674a00 10 mon.sephmon5@-1(prob