Dear Cephers...

Today our Ceph cluster reported a couple of scrub errors for inconsistent 
pgs. We upgraded from 9.2.0 (Infernalis) to 10.2.2 (Jewel) two days ago.

# ceph health detail 
HEALTH_ERR 2 pgs inconsistent; 2 scrub errors; crush map has legacy tunables 
(require bobtail, min is firefly)
pg 6.39c is active+clean+inconsistent, acting [2,60,32]
pg 6.263 is active+clean+inconsistent, acting [56,39,6]
2 scrub errors
crush map has legacy tunables (require bobtail, min is firefly); see 
http://ceph.com/docs/master/rados/operations/crush-map/#tunables
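
(As an aside, to see which OSD is the primary for a pg without a full 'ceph pg 
dump', I think 'ceph pg map' is enough; the first OSD in the acting set is the 
primary, and that is where the scrub errors end up in the log:)

# ceph pg map 6.263
# ceph pg map 6.39c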

We started by looking at pg 6.263. The errors appeared only in the osd.56 
logs, not in those of the other OSDs in the acting set.

# cat  ceph-osd.56.log-20160629 | grep -Hn 'ERR' 
(standard input):8569:2016-06-29 08:09:50.952397 7fd023322700 -1 
log_channel(cluster) log [ERR] : scrub 6.263 
6:c645f18e:::100002a343d.00000000:head on disk size (1836) does not match 
object info size (41242) adjusted for ondisk to (41242)
(standard input):8602:2016-06-29 08:11:11.227865 7fd023322700 -1 
log_channel(cluster) log [ERR] : 6.263 scrub 1 errors
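
Since we are now on Jewel, I also tried the newer inconsistency reporting in 
rados, which (as far as I understand it) lists the problematic object and the 
size each replica reports, as long as the scrub information is still available:

# rados list-inconsistent-obj 6.263 --format=json-pretty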

So, we ran 'ceph pg repair 6.263'.

Eventually, that pg went back to 'active+clean':

# ceph pg dump | grep ^6.263
dumped all in format plain
6.263   10845   0       0       0       0       39592671010     3037    3037    
active+clean    2016-06-30 02:13:00.455293      1005'2126237    1005:2795768    
[56,39,6]       56      [56,39,6]       56      1005'2076134    2016-06-30 
02:13:00.455256      1005'2076134    2016-06-30 02:13:00.455256
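
Since the full 'ceph pg dump' row is a bit unwieldy, 'ceph pg dump pgs_brief' 
(if I am not mistaken) gives a narrower view with just the state and acting 
set:

# ceph pg dump pgs_brief | grep ^6.263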

However, in the logs I found:

2016-06-30 02:03:03.992240 osd.56 192.231.127.226:6801/21569 278 : cluster 
[INF] 6.263 repair starts
2016-06-30 02:13:00.455237 osd.56 192.231.127.226:6801/21569 279 : cluster 
[INF] 6.263 repair ok, 0 fixed

I did not like the '0 fixed'. 
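
For reference, the repair start/result messages also end up in the cluster log 
on the monitor hosts, so one can grep there instead of the per-OSD logs 
(assuming the default /var/log/ceph/ceph.log location):

# grep -E '6\.263 (scrub|repair)' /var/log/ceph/ceph.log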

Inspecting a bit more, I found that the object flagged by the scrub is 
changing size on all the involved OSDs. For example, on osd.56 (the same is 
true on osd.39 and osd.6) consecutive 'ls -l' commands showed:

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 8602 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
[root@rccephosd8 ceph]# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 170 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 15436 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 26044 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 14076 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 31110 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 20230 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 23392 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 0 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6

# ls -l 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
-rw-r--r-- 1 ceph ceph 41412 Jun 30 02:53 
/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
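
Rather than repeating 'ls -l' by hand, a small loop makes the size changes 
easier to watch over time (just a sketch, using the same path as above):

OBJ=/var/lib/ceph/osd/ceph-56/current/6.263_head/DIR_3/DIR_6/DIR_2/DIR_A/100002a343d.00000000__head_718FA263__6
# print size in bytes and mtime once per second
while true; do stat -c '%s %y' "$OBJ"; sleep 1; done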

From the size checks I did before applying the repair, I know that the size of 
the object should be 41412. The initial error also says that.
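
To compare with the size recorded in the object info, 'rados stat' should show 
it directly. I am assuming here that pool 6 is our CephFS data pool, given the 
<inode-hex>.<stripe> style object name; <poolname> and /mnt/cephfs are 
placeholders for the actual pool name and mount point. If the assumption 
holds, the hex prefix should also map back to the file on the filesystem:

# rados -p <poolname> stat 100002a343d.00000000
# find /mnt/cephfs -inum $((16#100002a343d))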

So what is actually going on here?

Cheers
G.

 