[ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?

2017-11-19 Thread Tracy Reed
One of my 9 ceph osd nodes just spontaneously rebooted.  

This particular osd server only holds 4% of total storage. 

Why, after it has come back up and rejoined the cluster, does ceph
health say that 60% of my objects are misplaced? I'm wondering if I
have something set up wrong in my cluster. This cluster has been
operating well for the most part for about a year, but I have noticed
this sort of behavior before. This is going to take many hours to
recover. Ceph 10.2.3.
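
For reference, the state dumps I paste in my follow-ups come from the
standard commands (nothing exotic about how I gathered them):

ceph osd dump   # per-OSD state, weights, and pg_temp mappings
ceph osd tree   # CRUSH hierarchy with the reweight column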

Thanks for any insights you may be able to provide!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?

2017-11-19 Thread Tracy Reed
6:6850/5091 10.0.5.16:6851/5091 exists,up 
03e2b2bb-b7a6-4d28-8ff4-b73f208812e2
osd.75 up   in  weight 0.992599 up_from 112255 up_thru 119791 down_at 112241 
last_clean_interval [94225,112240) 10.0.5.16:6832/4480 10.0.5.16:6833/4480 
10.0.5.16:6834/4480 10.0.5.16:6835/4480 exists,up 
0a4e41b8-2cd5-496f-910c-a2fd7732a42e
osd.76 up   in  weight 1 up_from 112254 up_thru 119770 down_at 112241 
last_clean_interval [94247,112240) 10.0.5.16:6844/4907 10.0.5.16:6845/4907 
10.0.5.16:6846/4907 10.0.5.16:6847/4907 exists,up 
ea9fc955-6f28-4ca6-b06b-0bd4f178f22e
osd.77 up   in  weight 0.940018 up_from 112249 up_thru 119569 down_at 112241 
last_clean_interval [94260,112240) 10.0.5.16:6820/3813 10.0.5.16:6821/3813 
10.0.5.16:6822/3813 10.0.5.16:6823/3813 exists,up 
361ae33c-39b3-4572-b5a9-69bdfbbb4c3f
osd.78 up   in  weight 0.962784 up_from 112249 up_thru 119411 down_at 112241 
last_clean_interval [94277,112240) 10.0.5.16:6816/3628 10.0.5.16:6817/3628 
10.0.5.16:6818/3628 10.0.5.16:6819/3628 exists,up 
1902cb84-465c-4fab-bac0-9a5d1fa9a5ae
osd.79 up   in  weight 1 up_from 112246 up_thru 119541 down_at 112241 
last_clean_interval [94304,112240) 10.0.5.16:6812/3497 10.0.5.16:6813/3497 
10.0.5.16:6814/3497 10.0.5.16:6815/3497 exists,up 
5d4041db-398d-4fbb-a672-029444f3d974
osd.80 up   in  weight 0.993805 up_from 113233 up_thru 119411 down_at 113231 
last_clean_interval [112253,113230) 10.0.5.16:6852/169626 10.0.5.16:6854/169626 
10.0.5.16:6856/169626 10.0.5.16:6857/169626 exists,up 
9e6d643b-8a2e-43d6-b112-126dd162ad0b
osd.81 up   in  weight 1 up_from 112245 up_thru 119500 down_at 112241 
last_clean_interval [94345,112240) 10.0.5.16:6808/3355 10.0.5.16:6809/3355 
10.0.5.16:6810/3355 10.0.5.16:6811/3355 exists,up 
144a1a6f-9a11-4950-a54f-7f649989fac7
osd.82 up   in  weight 1 up_from 112253 up_thru 119429 down_at 112241 
last_clean_interval [94354,112240) 10.0.5.16:6840/4733 10.0.5.16:6841/4733 
10.0.5.16:6842/4733 10.0.5.16:6843/4733 exists,up 
991499cf-8334-4b46-97fb-535c8a703e45
osd.83 up   in  weight 1 up_from 112249 up_thru 119788 down_at 112241 
last_clean_interval [94371,112240) 10.0.5.16:6824/3944 10.0.5.16:6825/3944 
10.0.5.16:6826/3944 10.0.5.16:6827/3944 exists,up 
22819ad7-ab30-42a2-97fb-4d8fdd1937aa
osd.84 up   in  weight 1 up_from 112245 up_thru 119411 down_at 112241 
last_clean_interval [94384,112240) 10.0.5.16:6800/3071 10.0.5.16:6801/3071 
10.0.5.16:6802/3071 10.0.5.16:6803/3071 exists,up 
91da19ca-4948-4d2a-baa0-7a075f59d2ea
osd.85 up   in  weight 1 up_from 112243 up_thru 119571 down_at 112241 
last_clean_interval [94396,112240) 10.0.5.16:6804/3196 10.0.5.16:6805/3196 
10.0.5.16:6806/3196 10.0.5.16:6807/3196 exists,up 
b79d7033-fdf6-4f4d-97bf-26a24f903b98
pg_temp 0.0 [52,16,73]
pg_temp 0.1 [61,26,77]
pg_temp 0.5 [84,48,29]
pg_temp 0.6 [77,70,46]
pg_temp 0.7 [29,73,46]
pg_temp 0.8 [61,16,73]
pg_temp 0.9 [67,83,47]
pg_temp 0.b [83,0,49]
pg_temp 0.c [0,64,77]
pg_temp 0.e [67,0,69]
< a couple thousand more lines like these pg_temp lines >

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?

2017-11-19 Thread Tracy Reed
   1.0 
 64  1.81799 osd.64   up  0.55638  1.0 
 63  1.81799 osd.63   up  0.60434  1.0 
 -8  5.45398 host ceph07   
 65  1.81799 osd.65   up  0.52611  1.0 
 67  1.81799 osd.67   up  0.61052  1.0 
 70  1.81799 osd.70   up  0.56075  1.0 
 -9  2.69798 host ceph08   
  4  0.90900 osd.4up  0.45261  1.0 
  5  0.87999 osd.5up  0.46480  1.0 
 16  0.90900 osd.16   up  0.48987  1.0 
-10 29.09595 host ceph10   
 66  1.81850 osd.66 down0  1.0 
 71  1.81850 osd.71 down0  1.0 
 72  1.81850 osd.72 down0  1.0 
 73  1.81850 osd.73   up  0.89394  1.0 
 74  1.81850 osd.74   up  1.0  1.0 
 75  1.81850 osd.75   up  0.99260  1.0 
 76  1.81850 osd.76   up  1.0  1.0 
 77  1.81850 osd.77   up  0.94002  1.0 
 78  1.81850 osd.78   up  0.96278  1.0 
 79  1.81850 osd.79   up  1.0  1.0 
 80  1.81850 osd.80   up  0.99380  1.0 
 81  1.81850 osd.81   up  1.0  1.0 
 82  1.81850 osd.82   up  1.0  1.0 
 83  1.81850 osd.83   up  1.0  1.0 
 84  1.81850 osd.84   up  1.0  1.0 
 85  1.81850 osd.85   up  1.0  1.0 


-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] Reboot 1 OSD server, now ceph says 60% misplaced?

2017-11-19 Thread Tracy Reed
On Sun, Nov 19, 2017 at 02:41:56AM PST, Gregory Farnum spake thusly:
> Okay, so the hosts look okay (although very uneven numbers of OSDs).
> 
> But the sizes are pretty wonky. Are the disks really that mismatched
> in size? I note that many of them in host10 are set to 1.0, but most
> of the others are some fraction less than that.

Yes, they are that mismatched. This is a very mix-and-match cluster we
built out of what we had lying around. I know that isn't ideal.
Possibly due to the large mismatch in disk sizes (although I had always
expected CRUSH to manage it better, given the default weighting
proportional to size), we used to run into situations where the small
disks would fill up even when the large disks were barely at 50%. So
back in June we ran bc-ceph-reweight-by-utilization.py fairly frequently
for a few days until things were happy and stable, and it stayed that
way until tonight's incident.

I'm pretty sure you are right: The weights got reset to defaults causing
lots of movement. I had forgotten that ceph osd reweight is not a
persistent setting. So it looks like once things settle I need to adjust
crush weights appropriately and set reweights back to 1 to make this
permanent.
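
(For the archives, the two knobs are separate commands; a minimal sketch
with hypothetical values:

ceph osd crush reweight osd.73 1.81850   # CRUSH weight, persistent in the map
ceph osd reweight 73 1.0                 # the 0-1 override that got reset

The crush weight is what should reflect disk size permanently; the plain
reweight is the per-OSD override that bc-ceph-reweight-by-utilization.py
was adjusting.)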

That explains it. Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] Can't activate OSD

2016-10-03 Thread Tracy Reed
Hello all,

Over the past few weeks I've been trying to go through the Quick Ceph Deploy 
tutorial at:

http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/

just trying to get a basic 2 OSD ceph cluster up and running. Everything seems
to go well until I get to the:

ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc

part. It never actually seems to activate the OSD and eventually times out:

[ceph02][DEBUG ] connection detected need for sudo
[ceph02][DEBUG ] connected to host: ceph02 
[ceph02][DEBUG ] detect platform information from remote host
[ceph02][DEBUG ] detect machine type
[ceph02][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
[ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[ceph02][DEBUG ] find the location of an executable
[ceph02][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
--mark-init systemd --mount /dev/sdc
[ceph02][WARNIN] main_activate: path = /dev/sdc
[ceph02][WARNIN] No data was received after 300 seconds, disconnecting...
[ceph02][INFO  ] checking OSD status...
[ceph02][DEBUG ] find the location of an executable
[ceph02][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph02][INFO  ] Running command: sudo systemctl enable ceph.target
[ceph03][DEBUG ] connection detected need for sudo
[ceph03][DEBUG ] connected to host: ceph03 
[ceph03][DEBUG ] detect platform information from remote host
[ceph03][DEBUG ] detect machine type
[ceph03][DEBUG ] find the location of an executable
[ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
[ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc
[ceph_deploy.osd][DEBUG ] will use init type: systemd
[ceph03][DEBUG ] find the location of an executable
[ceph03][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
--mark-init systemd --mount /dev/sdc
[ceph03][WARNIN] main_activate: path = /dev/sdc
[ceph03][WARNIN] No data was received after 300 seconds, disconnecting...
[ceph03][INFO  ] checking OSD status...
[ceph03][DEBUG ] find the location of an executable
[ceph03][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
--format=json
[ceph03][INFO  ] Running command: sudo systemctl enable ceph.target

Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 and
ceph03 (OSD servers).

ceph log is here:

http://pastebin.com/A2kP28c4

This is CentOS 5. iptables and selinux are both off. When I first started doing
this the volume would be left mounted in the tmp location on the OSDs. But I
have since upgraded my version of ceph and now nothing is left mounted on the
OSD but it still times out.
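
One thing I can try to narrow it down: run the same command directly on the
OSD node, so the verbose ceph-disk output isn't lost when ceph-deploy's ssh
channel gives up after 300 seconds (this is just the command from the log
above, run locally):

sudo /usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc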

Please let me know if there is any other info I can provide which might help.
Any help you can offer is greatly appreciated! I've been stuck on this for
weeks. Thanks!

-- 
Tracy Reed




Re: [ceph-users] Can't activate OSD

2016-10-03 Thread Tracy Reed
Oops, I said CentOS 5 (old habit, ran it for years!). I meant CentOS 7. And I'm
running the following Ceph package versions from the ceph repo:

[root@ceph02 ~]# rpm -qa | grep -i ceph
libcephfs1-10.2.3-0.el7.x86_64
ceph-common-10.2.3-0.el7.x86_64
ceph-mon-10.2.3-0.el7.x86_64
ceph-release-1-1.el7.noarch
python-cephfs-10.2.3-0.el7.x86_64
ceph-selinux-10.2.3-0.el7.x86_64
ceph-osd-10.2.3-0.el7.x86_64
ceph-mds-10.2.3-0.el7.x86_64
ceph-radosgw-10.2.3-0.el7.x86_64
ceph-base-10.2.3-0.el7.x86_64
ceph-10.2.3-0.el7.x86_64

On Mon, Oct 03, 2016 at 03:34:50PM PDT, Tracy Reed spake thusly:
> Hello all,
> 
> Over the past few weeks I've been trying to go through the Quick Ceph Deploy 
> tutorial at:
> 
> http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/
> 
> just trying to get a basic 2 OSD ceph cluster up and running. Everything seems
> to go well until I get to the:
> 
> ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc
> 
> part. It never actually seems to activate the OSD and eventually times out:
> 
> [ceph02][DEBUG ] connection detected need for sudo
> [ceph02][DEBUG ] connected to host: ceph02 
> [ceph02][DEBUG ] detect platform information from remote host
> [ceph02][DEBUG ] detect machine type
> [ceph02][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
> [ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc
> [ceph_deploy.osd][DEBUG ] will use init type: systemd
> [ceph02][DEBUG ] find the location of an executable
> [ceph02][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
> --mark-init systemd --mount /dev/sdc
> [ceph02][WARNIN] main_activate: path = /dev/sdc
> [ceph02][WARNIN] No data was received after 300 seconds, disconnecting...
> [ceph02][INFO  ] checking OSD status...
> [ceph02][DEBUG ] find the location of an executable
> [ceph02][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
> --format=json
> [ceph02][INFO  ] Running command: sudo systemctl enable ceph.target
> [ceph03][DEBUG ] connection detected need for sudo
> [ceph03][DEBUG ] connected to host: ceph03 
> [ceph03][DEBUG ] detect platform information from remote host
> [ceph03][DEBUG ] detect machine type
> [ceph03][DEBUG ] find the location of an executable
> [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
> [ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc
> [ceph_deploy.osd][DEBUG ] will use init type: systemd
> [ceph03][DEBUG ] find the location of an executable
> [ceph03][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
> --mark-init systemd --mount /dev/sdc
> [ceph03][WARNIN] main_activate: path = /dev/sdc
> [ceph03][WARNIN] No data was received after 300 seconds, disconnecting...
> [ceph03][INFO  ] checking OSD status...
> [ceph03][DEBUG ] find the location of an executable
> [ceph03][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
> --format=json
> [ceph03][INFO  ] Running command: sudo systemctl enable ceph.target
> 
> Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 
> and
> ceph03 (OSD servers).
> 
> ceph log is here:
> 
> http://pastebin.com/A2kP28c4
> 
> This is CentOS 5. iptables and selinux are both off. When I first started 
> doing
> this the volume would be left mounted in the tmp location on the OSDs. But I
> have since upgraded my version of ceph and now nothing is left mounted on the
> OSD but it still times out.
> 
> Please let me know if there is any other info I can provide which might help.
> Any help you can offer is greatly appreciated! I've been stuck on this for
> weeks. Thanks!
> 
> -- 
> Tracy Reed





-- 
Tracy Reed




[ceph-users] Ceph consultants?

2016-10-05 Thread Tracy Reed

Hello all,

Any independent Ceph consultants out there? We have been trying to get Ceph
going and it's been very slow going. We don't have anything working yet after a
month! 

We really can't waste much more time on this by ourselves. At this point we're
looking to pay someone for a few hours to get us over the initial roadblock and
advise us occasionally as we move forward. Probably just a few hours of work
but if there's an experienced ceph person out there looking to make a little
extra money please drop me a line.

Thanks!

-- 
Tracy Reed




Re: [ceph-users] Ceph consultants?

2016-10-05 Thread Tracy Reed
On Wed, Oct 05, 2016 at 01:17:52PM PDT, Peter Maloney spake thusly:
> What do you need help with specifically? Setting up ceph isn't very
> complicated... just fixing it when things go wrong should be. What type
> of scale are you working with, and do you already have hardware? Or is
> the problem more to do with integrating it with clients?

Hi Peter,

I agree, setting up Ceph isn't very complicated. I posted to the list on
10/03/16 with the initial problem I have run into under the subject "Can't
activate OSD". Please refer to that thread as it has logs, details of my setup,
etc.

I started working on this about a month ago then spent several days on it and a
few hours with a couple different people on IRC. Nobody has been able to figure
out how to get my OSD activated. I took a couple weeks off and now I'm back at
it as I really need to get this going soon.

Basically, I'm following the quickstart guide at
http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/ and when I run the
command to activate the OSDs like so:

ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc

I get this in the ceph-deploy log:

[2016-10-03 15:16:10,193][ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 
7.2.1511 Core
[2016-10-03 15:16:10,193][ceph_deploy.osd][DEBUG ] activating host ceph03 disk 
/dev/sdc
[2016-10-03 15:16:10,193][ceph_deploy.osd][DEBUG ] will use init type: systemd
[2016-10-03 15:16:10,194][ceph03][DEBUG ] find the location of an executable
[2016-10-03 15:16:10,200][ceph03][INFO  ] Running command: sudo 
/usr/sbin/ceph-disk -v activate --mark-init systemd --mount /dev/sdc
[2016-10-03 15:16:10,377][ceph03][WARNING] main_activate: path = /dev/sdc
[2016-10-03 15:21:10,380][ceph03][WARNING] No data was received after 300 
seconds, disconnecting...
[2016-10-03 15:21:15,387][ceph03][INFO  ] checking OSD status...
[2016-10-03 15:21:15,401][ceph03][DEBUG ] find the location of an executable
[2016-10-03 15:21:15,472][ceph03][INFO  ] Running command: sudo /bin/ceph 
--cluster=ceph osd stat --format=json
[2016-10-03 15:21:15,698][ceph03][INFO  ] Running command: sudo systemctl 
enable ceph.target

More details in other thread.

Where am I going wrong here?

Thanks!

-- 
Tracy Reed




[ceph-users] SOLVED Re: Can't activate OSD

2016-10-06 Thread Tracy Reed
SOLVED!

Thanks to a very kind person from this list who helped me debug, we found that
when I created the VLAN on the switch I didn't set it to allow jumbo frames.
This was preventing the OSDs from activating because some traffic was being
blocked. Once I fixed that, everything started working. Sometimes it really
helps to have a second pair of eyes. So this wasn't a Ceph problem at all,
really.
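
(For anyone who hits the same thing: a quick end-to-end jumbo frame check
is a don't-fragment ping sized for a 9000-byte MTU, i.e. 9000 minus 28
bytes of IP/ICMP headers; the target address here is just one of my OSD
hosts:

ping -M do -s 8972 10.0.5.16

If the switch or VLAN isn't passing jumbo frames this fails while normal
pings still succeed, which is exactly why only some traffic was blocked.)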

Thanks!

On Mon, Oct 03, 2016 at 03:39:45PM PDT, Tracy Reed spake thusly:
> Oops, I said CentOS 5 (old habit, ran it for years!). I meant CentOS 7. And 
> I'm
> running the following Ceph package versions from the ceph repo:
> 
> [root@ceph02 ~]# rpm -qa | grep -i ceph
> libcephfs1-10.2.3-0.el7.x86_64
> ceph-common-10.2.3-0.el7.x86_64
> ceph-mon-10.2.3-0.el7.x86_64
> ceph-release-1-1.el7.noarch
> python-cephfs-10.2.3-0.el7.x86_64
> ceph-selinux-10.2.3-0.el7.x86_64
> ceph-osd-10.2.3-0.el7.x86_64
> ceph-mds-10.2.3-0.el7.x86_64
> ceph-radosgw-10.2.3-0.el7.x86_64
> ceph-base-10.2.3-0.el7.x86_64
> ceph-10.2.3-0.el7.x86_64
> 
> On Mon, Oct 03, 2016 at 03:34:50PM PDT, Tracy Reed spake thusly:
> > Hello all,
> > 
> > Over the past few weeks I've been trying to go through the Quick Ceph 
> > Deploy tutorial at:
> > 
> > http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/
> > 
> > just trying to get a basic 2 OSD ceph cluster up and running. Everything 
> > seems
> > to go well until I get to the:
> > 
> > ceph-deploy osd activate ceph02:/dev/sdc ceph03:/dev/sdc
> > 
> > part. It never actually seems to activate the OSD and eventually times out:
> > 
> > [ceph02][DEBUG ] connection detected need for sudo
> > [ceph02][DEBUG ] connected to host: ceph02 
> > [ceph02][DEBUG ] detect platform information from remote host
> > [ceph02][DEBUG ] detect machine type
> > [ceph02][DEBUG ] find the location of an executable
> > [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
> > [ceph_deploy.osd][DEBUG ] activating host ceph02 disk /dev/sdc
> > [ceph_deploy.osd][DEBUG ] will use init type: systemd
> > [ceph02][DEBUG ] find the location of an executable
> > [ceph02][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
> > --mark-init systemd --mount /dev/sdc
> > [ceph02][WARNIN] main_activate: path = /dev/sdc
> > [ceph02][WARNIN] No data was received after 300 seconds, disconnecting...
> > [ceph02][INFO  ] checking OSD status...
> > [ceph02][DEBUG ] find the location of an executable
> > [ceph02][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
> > --format=json
> > [ceph02][INFO  ] Running command: sudo systemctl enable ceph.target
> > [ceph03][DEBUG ] connection detected need for sudo
> > [ceph03][DEBUG ] connected to host: ceph03 
> > [ceph03][DEBUG ] detect platform information from remote host
> > [ceph03][DEBUG ] detect machine type
> > [ceph03][DEBUG ] find the location of an executable
> > [ceph_deploy.osd][INFO  ] Distro info: CentOS Linux 7.2.1511 Core
> > [ceph_deploy.osd][DEBUG ] activating host ceph03 disk /dev/sdc
> > [ceph_deploy.osd][DEBUG ] will use init type: systemd
> > [ceph03][DEBUG ] find the location of an executable
> > [ceph03][INFO  ] Running command: sudo /usr/sbin/ceph-disk -v activate 
> > --mark-init systemd --mount /dev/sdc
> > [ceph03][WARNIN] main_activate: path = /dev/sdc
> > [ceph03][WARNIN] No data was received after 300 seconds, disconnecting...
> > [ceph03][INFO  ] checking OSD status...
> > [ceph03][DEBUG ] find the location of an executable
> > [ceph03][INFO  ] Running command: sudo /bin/ceph --cluster=ceph osd stat 
> > --format=json
> > [ceph03][INFO  ] Running command: sudo systemctl enable ceph.target
> > 
> > Machines involved are ceph-deploy (deploy server), ceph01 (monitor), ceph02 
> > and
> > ceph03 (OSD servers).
> > 
> > ceph log is here:
> > 
> > http://pastebin.com/A2kP28c4
> > 
> > This is CentOS 5. iptables and selinux are both off. When I first started 
> > doing
> > this the volume would be left mounted in the tmp location on the OSDs. But I
> > have since upgraded my version of ceph and now nothing is left mounted on 
> > the
> > OSD but it still times out.
> > 
> > Please let me know if there is any other info I can provide which might 
> > help.
> > Any help you can offer is greatly appreciated! I've been stuck on this for
> > weeks. Thanks!
> > 
> > -- 
> > Tracy Reed
> 
> 
> 
> 
> 
> -- 
> Tracy Reed





-- 
Tracy Reed




[ceph-users] Monitor troubles

2016-11-01 Thread Tracy Reed
atus
[2016-10-05 14:48:54,272][ceph01][DEBUG ] 

[2016-10-05 14:48:54,273][ceph01][DEBUG ] status for monitor: mon.ceph01
[2016-10-05 14:48:54,274][ceph01][DEBUG ] {
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "election_epoch": 5, 
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "extra_probe_peers": [], 
[2016-10-05 14:48:54,275][ceph01][DEBUG ]   "monmap": {
[2016-10-05 14:48:54,276][ceph01][DEBUG ] "created": "2016-09-05 
01:22:09.228315", 
[2016-10-05 14:48:54,276][ceph01][DEBUG ] "epoch": 1, 
[2016-10-05 14:48:54,276][ceph01][DEBUG ] "fsid": 
"3e84db5d-3dc8-4104-89e7-da23c103ef50", 
[2016-10-05 14:48:54,276][ceph01][DEBUG ] "modified": "2016-09-05 
01:22:09.228315", 
[2016-10-05 14:48:54,277][ceph01][DEBUG ] "mons": [
[2016-10-05 14:48:54,277][ceph01][DEBUG ]   {
[2016-10-05 14:48:54,277][ceph01][DEBUG ] "addr": "10.0.5.2:6789/0", 
[2016-10-05 14:48:54,277][ceph01][DEBUG ] "name": "ceph01", 
[2016-10-05 14:48:54,278][ceph01][DEBUG ] "rank": 0
[2016-10-05 14:48:54,278][ceph01][DEBUG ]   }
[2016-10-05 14:48:54,279][ceph01][DEBUG ] ]
[2016-10-05 14:48:54,279][ceph01][DEBUG ]   }, 
[2016-10-05 14:48:54,280][ceph01][DEBUG ]   "name": "ceph01", 
[2016-10-05 14:48:54,280][ceph01][DEBUG ]   "outside_quorum": [], 
[2016-10-05 14:48:54,281][ceph01][DEBUG ]   "quorum": [
[2016-10-05 14:48:54,282][ceph01][DEBUG ] 0
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   ], 
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "rank": 0, 
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "state": "leader", 
[2016-10-05 14:48:54,282][ceph01][DEBUG ]   "sync_provider": []
[2016-10-05 14:48:54,283][ceph01][DEBUG ] }
[2016-10-05 14:48:54,283][ceph01][DEBUG ] 

[2016-10-05 14:48:54,283][ceph01][INFO  ] monitor: mon.ceph01 is running

But the cluster worked just fine until I tried adding two more monitors.

In the troubleshooting section "Recovering a Monitor’s Broken monmap" I
thought I would try extracting a monmap, with the idea that maybe I
would learn something or possibly change the fsid on ceph01.

[root@ceph01 ~]# ceph-mon -i mon.ceph01 --extract-monmap /tmp/monmap
monitor data directory at '/var/lib/ceph/mon/ceph-mon.ceph01' does not
exist: have you run 'mkfs'?

So that didn't get me anything either.
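
(In hindsight I suspect the -i flag wants the bare monitor id, not the
mon.-prefixed name: the data directory is /var/lib/ceph/mon/<cluster>-<id>,
so passing mon.ceph01 made it look for ceph-mon.ceph01. Presumably this
would have found it:

ceph-mon -i ceph01 --extract-monmap /tmp/monmap
)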

mon log on ceph01 contains repetitions of:
2016-11-01 21:34:33.588396 7ff029c70700  0 mon.ceph01@0(probing) e2 
handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 
3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:35.739479 7ff029c70700  0 mon.ceph01@0(probing) e2 
handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 
3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:35.936020 7ff024f3f700  0 -- 10.0.5.2:6789/0 >> 
10.0.5.5:0/3093707402 pipe(0x7ff03d57e800 sd=20 :6789 s=0 pgs=0 cs=0 l=0 
c=0x7ff03d81e580).accept peer addr is really 10.0.5.5:0/3093707402 (socket is 
10.0.5.5:44360/0)
2016-11-01 21:34:37.890073 7ff029c70700  0 mon.ceph01@0(probing) e2 
handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 
3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:40.043113 7ff029c70700  0 mon.ceph01@0(probing) e2 
handle_probe ignoring fsid e2e43abc-e634-4a04-ae24-0c486a035b6e != 
3e84db5d-3dc8-4104-89e7-da23c103ef50
2016-11-01 21:34:40.554165 7ff02a471700  0 mon.ceph01@0(probing).data_health(0) 
update_stats avail 96% total 51175 MB, used 1850 MB, avail 49324 MB

while mon log on ceph02 contains repetitions of:
2016-11-01 21:34:11.327458 7f33f4284700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2016-11-01 21:34:11.327623 7f33f4284700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2016-11-01 21:34:12.451514 7f33f4284700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd='mon_status' args=[]: dispatch
2016-11-01 21:34:12.451683 7f33f4284700  0 log_channel(audit) log [DBG] : 
from='admin socket' entity='admin socket' cmd=mon_status args=[]: finished
2016-11-01 21:34:12.780988 7f33f1715700  0 mon.ceph02@0(probing) e0 
handle_probe ignoring fsid 3e84db5d-3dc8-4104-89e7-da23c103ef50 != 
e2e43abc-e634-4a04-ae24-0c486a035b6e

Any ideas how to recover from this situation are greatly appreciated!

-- 
Tracy Reed




Re: [ceph-users] Monitor troubles

2016-11-01 Thread Tracy Reed
On Tue, Nov 01, 2016 at 09:36:16PM PDT, Tracy Reed spake thusly:
> I initially setup my ceph cluster on CentOS 7 with just one monitor. The
> monitor runs on an osd server (not ideal, will change soon).  I've

Sorry, forgot to add that I'm running the following ceph version from
the ceph repo:

# rpm -qa|grep ceph
libcephfs1-10.2.3-0.el7.x86_64
ceph-release-1-1.el7.noarch
ceph-mds-10.2.3-0.el7.x86_64
ceph-radosgw-10.2.3-0.el7.x86_64
python-cephfs-10.2.3-0.el7.x86_64
ceph-common-10.2.3-0.el7.x86_64
ceph-selinux-10.2.3-0.el7.x86_64
ceph-mon-10.2.3-0.el7.x86_64
ceph-10.2.3-0.el7.x86_64
ceph-base-10.2.3-0.el7.x86_64
ceph-osd-10.2.3-0.el7.x86_64


-- 
Tracy Reed




Re: [ceph-users] Monitor troubles

2016-11-03 Thread Tracy Reed
After a lot of messing about I have manually created a monmap and got
the two new monitors working, for a total of three. But to do that I had
to delete the first monitor, which for some reason was coming up with a
bogus fsid after I manipulated the monmap (which I checked, and it had
the correct fsid). So I recreated it with the right fsid, and after that
I had three monitors with quorum. I had to manually put the keys in
place on the recreated monitor too.

But now all of my OSDs have disappeared! Apparently the mon I deleted
was storing some special knowledge of the OSDs?

The mon log says:

2016-11-03 18:31:26.612012 7f7139529700  0 mon.ceph01@0(leader).data_health(14) 
update_stats avail 96% total 51175 MB, used 1744 MB, avail 49430 MB
2016-11-03 18:31:26.679911 7f7138d28700  0 cephx server osd.3: couldn't find 
entity name: osd.3
2016-11-03 18:31:26.876589 7f7138d28700  0 cephx server osd.6: couldn't find 
entity name: osd.6
2016-11-03 18:31:26.996219 7f7138d28700  0 cephx server osd.14: couldn't find 
entity name: osd.14
2016-11-03 18:31:27.016283 7f7138d28700  0 cephx server osd.41: couldn't find 
entity name: osd.41
2016-11-03 18:31:27.016406 7f7138d28700  0 cephx server osd.37: couldn't find 
entity name: osd.37
2016-11-03 18:31:27.016606 7f7138d28700  0 cephx server osd.40: couldn't find 
entity name: osd.40
2016-11-03 18:31:27.017276 7f7138d28700  0 cephx server osd.48: couldn't find 
entity name: osd.48
2016-11-03 18:31:27.291934 7f7138d28700  0 cephx server osd.4: couldn't find 
entity name: osd.4
2016-11-03 18:31:27.292598 7f7138d28700  0 cephx server osd.5: couldn't find 
entity name: osd.5
2016-11-03 18:31:27.339803 7f7138d28700  0 cephx server osd.7: couldn't find 
entity name: osd.7

So how do I tell the mon about the OSDs?

Any pointers are greatly appreciated.
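
(One approach I'm considering, assuming each OSD still has its keyring on
disk: re-register the cephx keys with the rebuilt mons, roughly, for each
OSD id N:

ceph auth add osd.N osd 'allow *' mon 'allow profile osd' \
    -i /var/lib/ceph/osd/ceph-N/keyring

I'd want a second opinion before running that across the whole cluster.)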

-- 
Tracy Reed




[ceph-users] What is in the mon leveldb?

2018-03-26 Thread Tracy Reed
Hello all,

It seems I have underprovisioned storage space for my mons, and my
/var/lib/ceph/mon filesystem is getting full. When I first started using
ceph this only took up tens of megabytes; I assumed it would stay that
way, and 5G for this filesystem seemed luxurious. Little did I know that
the mon was going to be storing multiple gigs of data! That's still a
trivial amount, of course, but larger than what I expected, and now I
have to do some work to rebuild my monitors on bigger storage.

I'm curious: Exactly what is being stored and is there any way to trim
it down a bit? It has slowly grown over time. I've already run a compact
on it which gained me only a few percent.
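
(For reference, the compact I mentioned was the standard admin command, if
memory serves:

ceph tell mon.ceph01 compact

and I understand mon_compact_on_start = true in ceph.conf will do the same
at every mon restart.)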

Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] What is in the mon leveldb?

2018-03-27 Thread Tracy Reed
On Mon, Mar 26, 2018 at 11:15:34PM PDT, Wido den Hollander spake thusly:
> The MONs keep a history of OSDMaps and other maps. Normally these maps
> are trimmed from the database, but if one or more PGs are not
> active+clean the MONs will keep a large history to get old OSDs up to
> speed which might be needed to bring that PGs to a clean state again.
>
> What is the status of your Ceph cluster (ceph -s) and what version are
> you running?

Ah...well. That leads to my next question which may resolve this issue:

Current state of my cluster is:

  health: HEALTH_WARN
  recovery 1230/13361271 objects misplaced (0.009%)

and no recovery is happening. I'm not sure why. This hasn't happened
before. But the mon db had been growing since long before this
circumstance.

Any idea why it might be stuck like this? I suppose I need to clear this
up before I can know if this is the cause of the disk usage.
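
(In case it helps to diagnose, here is what I've been looking at; standard
commands, nothing exotic:

ceph health detail          # names the PGs behind the misplaced objects
ceph pg dump_stuck unclean  # PGs that aren't progressing, with their state
)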

> And yes, make sure your MONs do have a tens of GBs available should they
> need it for a very long recovery.

Yeah...I've temporarily moved the store.db to another disk and symlinked
it back but I'm working towards rebuilding my mons.

> For example, I'm working on a 2200 OSD cluster which has been doing a
> recovery operation for a week now and the MON DBs are about 50GB now.

Wow. My cluster is only around 70 OSDs.

Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] What is in the mon leveldb?

2018-03-27 Thread Tracy Reed
>   health: HEALTH_WARN
>   recovery 1230/13361271 objects misplaced (0.009%)
> 
> and no recovery is happening. I'm not sure why. This hasn't happened
> before. But the mon db had been growing since long before this
> circumstance.

Hmm, ok: the recent trouble started a few days ago when we removed a
node containing 4 OSDs from the cluster. The OSDs on that node were shut
down but were not removed from the crush map, so apparently this caused
some issues. I just removed the OSDs properly and now recovery is
happening. Unfortunately it now says 30% of my objects are misplaced, so
I'm looking at 24 hours of recovery. Maybe the store.db will be smaller
when it finally finishes.
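
(By "removed the OSDs properly" I mean the classic sequence, roughly, for
each of the four stopped OSD ids N:

ceph osd crush remove osd.N   # drop it from the CRUSH map
ceph auth del osd.N           # remove its cephx key
ceph osd rm N                 # remove it from the osdmap

The crush remove step is what triggers the data movement.)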

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] mgr dashboard differs from ceph status

2018-05-03 Thread Tracy Reed
My ceph status says:

  cluster:
id: b2b00aae-f00d-41b4-a29b-58859aa41375
health: HEALTH_OK
 
  services:
mon: 3 daemons, quorum ceph01,ceph03,ceph07
mgr: ceph01(active), standbys: ceph-ceph07, ceph03
osd: 78 osds: 78 up, 78 in
 
  data:
pools:   4 pools, 3240 pgs
objects: 4384k objects, 17533 GB
usage:   53141 GB used, 27311 GB / 80452 GB avail
pgs: 3240 active+clean
 
  io:
client:   4108 kB/s rd, 10071 kB/s wr, 27 op/s rd, 331 op/s wr

but my mgr dashboard web interface says:


Health
Overall status: HEALTH_WARN

PG_AVAILABILITY: Reduced data availability: 2563 pgs inactive


Anyone know why the discrepancy? Hopefully the dashboard is very
mistaken! Everything seems to be operating normally. If I had 2/3 of my
pgs inactive, I'm sure all of the rbds backing my VMs would be blocked, etc.
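
(A quick cross-check from the CLI, which should print nothing if the
dashboard is the one that's wrong:

ceph pg dump_stuck inactive
)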

I'm running ceph-12.2.4-0.el7.x86_64 on CentOS 7. Almost all filestore
except for one OSD which recently had to be replaced which I made
bluestore. I plan to slowly migrate everything over to bluestore over
the course of the next month.

Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] ceph mgr module not working

2018-05-03 Thread Tracy Reed
Hello all,

I can seemingly enable the balancer ok:

$ ceph mgr module enable balancer

but if I try to check its status:

$ ceph balancer status
Error EINVAL: unrecognized command

or turn it on:

$ ceph balancer on
Error EINVAL: unrecognized command

$ which ceph
/bin/ceph
$ rpm -qf /bin/ceph
ceph-common-12.2.4-0.el7.x86_64

So it's not like I'm running an old version of the ceph command which
wouldn't know about the balancer.
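
(One thing I plan to check is whether the active mgr actually loaded the
module, since enabling it alone doesn't prove that:

ceph mgr module ls
)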

I'm running ceph-12.2.4-0.el7.x86_64 on CentOS 7. Almost all filestore
except for one OSD which recently had to be replaced which I made
bluestore. I plan to slowly migrate everything over to bluestore over
the course of the next month.

Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] Place on separate hosts?

2018-05-04 Thread Tracy Reed
I've been using ceph for nearly a year, and one of the things I ran into
quite a while back was that ceph seems to place copies of objects on
different OSDs, but sometimes those OSDs can be on the same host by
default. Is that correct? I discovered this by taking down one host and
having some pgs become inactive.

So I guess you could say I want my failure domain to be the host, not
the OSD.

How would I accomplish this? I understand it involves changing the crush
map.  I've been reading over
http://docs.ceph.com/docs/master/rados/operations/crush-map/ and it
still isn't clear to me what needs to change. I expect I need to change
the default replicated_ruleset which I'm still running:

$ ceph osd crush rule dump
[
{
"rule_id": 0,
"rule_name": "replicated_ruleset",
"ruleset": 0,
"type": 1,
"min_size": 1,
"max_size": 10,
"steps": [
{
"op": "take",
"item": -1,
"item_name": "default"
},
{
"op": "chooseleaf_firstn",
"num": 0,
"type": "host"
},
{
"op": "emit"
}
]
}
]


And that I need something like:

ceph osd crush rule create-replicated <rule-name> <root> <failure-domain> <class>

then:

ceph osd pool set <pool-name> crush_rule <rule-name>

but I'm not sure what the values of <root>, <failure-domain>, and <class>
would be in my situation. Maybe:

ceph osd crush rule create-replicated different-host default <failure-domain> <class>


but I don't know what <failure-domain> or <class> should be just from
inspecting my current crush map.
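
(For reference, the luminous syntax is

ceph osd crush rule create-replicated <rule-name> <root> <failure-domain> [<class>]

so presumably something like:

ceph osd crush rule create-replicated different-host default host

with the device class argument optional.)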

Suggestions are greatly appreciated!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] Place on separate hosts?

2018-05-04 Thread Tracy Reed
On Fri, May 04, 2018 at 12:08:35AM PDT, Tracy Reed spake thusly:
> I've been using ceph for nearly a year and one of the things I ran into
> quite a while back was that it seems like ceph is placing copies of
> objects on different OSDs but sometimes those OSDs can be on the same
> host by default. Is that correct? I discovered this by taking down one
> host and having some pgs become inactive. 

Actually, this (admittedly ancient) document:

https://jcftang.github.io/2012/09/06/going-from-replicating-across-osds-to-replicating-across-hosts-in-a-ceph-cluster/

says "As the default CRUSH map replicates across OSD’s I wanted to try
replicating data across hosts just to see what would happen." This would
seem to align with my experience as far as the default goes. However,
this:

http://docs.ceph.com/docs/master/rados/operations/crush-map/

says:

"When you deploy OSDs they are automatically placed within the CRUSH map
under a host node named with the hostname for the host they are running
on. This, combined with the default CRUSH failure domain, ensures that
replicas or erasure code shards are separated across hosts and a single
host failure will not affect availability."

How can I tell which way mine is configured? I could post the whole
crushmap if necessary but it's a bit large to copy and paste.

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] Place on separate hosts?

2018-05-04 Thread Tracy Reed
On Fri, May 04, 2018 at 12:18:15AM PDT, Tracy Reed spake thusly:
> https://jcftang.github.io/2012/09/06/going-from-replicating-across-osds-to-replicating-across-hosts-in-a-ceph-cluster/

> How can I tell which way mine is configured? I could post the whole
> crushmap if necessary but it's a bit large to copy and paste.

To further answer my own question (sorry for the spam) the above linked
doc says this should do what I want:

step chooseleaf firstn 0 type host

which is what I already have in my crush map. So it looks like the
default is as I want it. In which case I wonder why I had the problem
previously... I guess the only way to know for sure is to stop one osd
node and see what happens.
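
(A less disruptive check than stopping a node, assuming I'm reading the
crushtool man page right: simulate the mappings offline and eyeball
whether any PG puts two replicas on OSDs of the same host:

ceph osd getcrushmap -o /tmp/crushmap
crushtool -i /tmp/crushmap --test --rule 0 --num-rep 3 --show-mappings | head
)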

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] mgr dashboard differs from ceph status

2018-05-07 Thread Tracy Reed
On Mon, May 07, 2018 at 12:13:00AM PDT, Janne Johansson spake thusly:
> > mgr: ceph01(active), standbys: ceph-ceph07, ceph03
> 
> Don't know if it matters, but the naming seems different even though I guess
> you are running mgr's on the same nodes as the mons, but ceph07 is called
> "ceph-ceph07" in the mgr list.

Yes, I did make a mistake when I started the manager on that one and
provided it with an inconsistent name. That has been corrected. I have
also since restarted all of the managers, but the problem persists. I'm
not in a position to do any debugging on it right now, but I will try to
look into it more in the morning.

Thanks for the feedback!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] rbd map hangs

2018-06-06 Thread Tracy Reed

Hello all! I'm running luminous with old-style (non-bluestore) OSDs. The
clients are still ceph 10.2.9, though; I haven't been able to upgrade those yet.

Occasionally I have access to rbds hang on the client such as right now.
I tried to dd a VM image into a mapped rbd and it just hung.

Then I tried to map a new rbd and that hangs also.

How would I troubleshoot this? /var/log/ceph is empty, nothing in
/var/log/messages or dmesg etc.

I just discovered:

find /sys/kernel/debug/ceph -type f -print -exec cat {} \;

which produces (among other seemingly innocuous things, let me know if
anyone wants to see the rest):

osd2(unknown sockaddr family 0) 0%(doesn't exist) 100%

which seems suspicious.

rbd ls works reliably, as does rbd create. The cluster is healthy.

But the processes which hung trying to access that mapped rbd appear to
be completely unkillable. What else should I check?
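
(The debugfs files that turned out to be useful, for anyone else digging:

cat /sys/kernel/debug/ceph/*/osdc    # in-flight requests, per OSD
cat /sys/kernel/debug/ceph/*/osdmap  # the client's view of the osdmap
cat /sys/kernel/debug/ceph/*/monc    # mon session state
)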

Thanks!


-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
et-alloc-hint,write
16271496osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
16271497osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
16271498osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
16271499osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
16271500osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32154589osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32154590osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32155075osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32155250osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32156442osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
32156983osd12.5d28d036 rbd_data.1c55496b8b4567.08cb 
set-alloc-hint,write
33982347osd12.678ef636 rbd_data.93285b6b8b4567.139d 
set-alloc-hint,write
34517953osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517955osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517956osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517957osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517958osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517959osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517960osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517961osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34517963osd79   0.e321b924 rbd_data.51f32238e1f29.0d30  
   set-alloc-hint,write
34533231osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533233osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533234osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533235osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533236osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533237osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533238osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533239osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
34533241osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de  
   set-alloc-hint,write
/sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monc
have osdmap 232455
want next osdmap


Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
On Thu, Jun 07, 2018 at 08:40:50AM PDT, Ilya Dryomov spake thusly:
> > Kernel is Linux cpu04.mydomain.com 3.10.0-229.20.1.el7.x86_64 #1 SMP Tue 
> > Nov 3 19:10:07 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
> 
> This is a *very* old kernel.

It's what's shipping with CentOS/RHEL 7 and probably what the vast
majority of people are using aside from perhaps the Ubuntu LTS people.
Does anyone really still compile their own latest kernels? Back in the
mid-90's I'd compile a new kernel at the drop of a hat. But now it has
gotten so complicated with so many options and drivers etc. that it's
actually pretty hard to get it right.

> These lines indicate in-flight requests.  Looks like there may have
> been a problem with osd1 in the past, as some of these are much older
> than others.  Try bouncing osd1 with "ceph osd down 1" (it should
> come back up automatically) and see if that clears up this batch.

Thanks!

-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




Re: [ceph-users] rbd map hangs

2018-06-07 Thread Tracy Reed
xists, up)100%
osd61   10.0.5.3:680352%(exists, up)100%
osd62   10.0.5.12:6800   42%(exists, up)100%
osd63   10.0.5.12:6819   46%(exists, up)100%
osd64   10.0.5.12:6809   44%(exists, up)100%
osd65   10.0.5.13:6800   44%(exists, up)100%
osd66   (unknown sockaddr family 0)   0%(doesn't exist) 100%
osd67   10.0.5.13:6808   50%(exists, up)100%
osd68   10.0.5.4:680441%(exists, up)100%
osd69   10.0.5.4:680039%(exists, up)100%
osd70   10.0.5.13:6804   42%(exists, up)100%
osd71   (unknown sockaddr family 0)   0%(doesn't exist) 100%
osd72   (unknown sockaddr family 0)   0%(doesn't exist) 100%
osd73   10.0.5.16:6826   92%(exists, up)100%
osd74   10.0.5.16:6846  100%(exists, up)100%
osd75   10.0.5.16:6811   98%(exists, up)100%
osd76   10.0.5.16:6815  100%(exists, up)100%
osd77   10.0.5.16:6835   93%(exists, up)100%
osd78   10.0.5.16:6802   97%(exists, up)100%
osd79   10.0.5.16:6858  100%(exists, up)100%
osd80   10.0.5.16:6839   91%(exists, up)100%
osd81   10.0.5.16:6801  100%(exists, up)100%
osd82   10.0.5.16:6820   99%(exists, up)100%
osd83   10.0.5.16:6852   98%(exists, up)100%
osd84   10.0.5.16:6862   93%(exists, up)100%
osd85   10.0.5.16:6800   96%(exists, up)100%
/sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monmap
epoch 12
mon010.0.5.2:6789
mon110.0.5.4:6789
mon210.0.5.13:6789
/sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/osdc
34533231osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533233osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533234osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533235osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533236osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533237osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533238osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533239osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34533241osd73   0.f0ae1f02 rbd_data.51f32238e1f29.13de 
set-alloc-hint,write
34919983osd67   0.f4cdfa38 rbd_header.51f32238e1f29 
5613'998386622791680watch
34919984osd62.5aca5ef2 rbd_header.93285b6b8b4567 
4422885'943544185389056 watch
34919985osd67   2.4dbc6037 rbd_header.5f75476b8b4567 
28922'998386622791680   watch
34919986osd12.ba8d973e rbd_header.dd3b556b8b4567 
5305738'894263730634752 watch
/sys/kernel/debug/ceph/b2b00aae-f00d-41b4-a29b-58859aa41375.client31276017/monc
have osdmap 232501
want next osdmap


-- 
Tracy Reed
http://tracyreed.org
Digital signature attached for your safety.




[ceph-users] virt-install into rbd hangs during Anaconda package installation

2017-02-06 Thread Tracy Reed
This is what I'm doing on my CentOS 7/KVM/virtlib server:

rbd create --size 20G pool/vm.mydomain.com

rbd map pool/vm.mydomain.com --name client.admin

virt-install --name vm.mydomain.com --ram 2048 --disk 
path=/dev/rbd/pool/vm.mydomain.com  --vcpus 1  --os-type linux --os-variant 
rhel6 --network bridge=dmz --graphics none --console pty,target_type=serial 
--location http://repo.mydomain.com/centos/7/os/x86_64 --extra-args 
"ip=en0:dhcp ks=http://repo.mydomain.com/ks/ks.cfg.vm console=ttyS0  
ksdevice=eth0 
inst.repo=http://10.0.10.5/http://repo.mydomain.com/centos/7/os/x86_64";

And then it creates partitions, filesystems (xfs), and
starts installing packages. 9 times out of 10 it hangs while
installing packages. And I have no idea why. I can't kill
the VM. 

Trying to destroy it shows:

virsh # destroy vm.mydomain.com
error: Failed to destroy domain vm.mydomain.com
error: Failed to terminate process 19629 with SIGKILL:
Device or resource busy

and then virsh ls shows:

127   vm.mydomain.com   in shutdown

The log for this vm in
/var/log/libvirt/qemu/vm.mydomain.com contains only:

2017-02-06 08:14:12.256+: starting up libvirt version:
2.0.0, package: 10.el7_3.2 (CentOS BuildSystem
<http://bugs.centos.org>, 2016-12-06-19:53:38,
c1bm.rdu2.centos.org), qemu version: 1.5.3
(qemu-kvm-1.5.3-105.el7_2.7), hostname: cpu01.mydomain.com
LC_ALL=C
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
secclass2.mydomain.com -S -machine
pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu
SandyBridge,+vme,+f16c,+rdrand,+fsgsbase,+smep,+erms -m
2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1
-uuid 5dadf01e-b996-411f-b95f-26ce6b790bae -nographic
-no-user-config -nodefaults -chardev
socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-127-secclass2.mydomain./monitor.sock,server,nowait
-mon chardev=charmonitor,id=monitor,mode=control -rtc
base=utc,driftfix=slew -global
kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot
-global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1
-boot strict=on -kernel
/var/lib/libvirt/boot/virtinst-vmlinuz.9Ax4zt -initrd
/var/lib/libvirt/boot/virtinst-initrd.img.ALJE43 -append
'ip=en0:dhcp ks=http://util1.mydomain.com/ks/ks.cfg.vm.
console=ttyS0  ksdevice=eth0
inst.repo=http://10.0.10.5/http://util1.mydomain.com/centos/7/os/x86_64
method=http://util1.mydomain.com/centos/7/os/x86_64'
-device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7
-device
ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5
-device
ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1
-device
ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2
-device
virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4
-drive
file=/dev/rbd/security-class/secclass2.mydomain.com,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native
-device
virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
-netdev tap,fd=55,id=hostnet0,vhost=on,vhostfd=57 -device
virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:87:d2:12,bus=pci.0,addr=0x3
-chardev pty,id=charserial0 -device
isa-serial,chardev=charserial0,id=serial0 -chardev
socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-127-secclass2.mydomain./org.qemu.guest_agent.0,server,nowait
-device
virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=channel0,name=org.qemu.guest_agent.0
-device usb-tablet,id=input0,bus=usb.0,port=1 -device
virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x7 -msg
timestamp=on
char device redirected to /dev/pts/24 (label charserial0)
qemu: terminating on signal 15 from pid 23385

Any ideas? If this is a libvirt/kvm problem I'll take it to the
appropriate forum but we can install into iscsi LUNs with no problem at
all.

Someone on IRC mentioned that mkfs's discard pass starts zeroing the rbd
image, which can take a long time, but that should happen in the
background and not hang the whole VM forever, right?

Thanks for any insight you can provide!

-- 
Tracy Reed




Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation

2017-02-07 Thread Tracy Reed
Weird. Now the VMs that were hung in interruptible wait state have
disappeared. No idea why.

Additional information:

ceph-mds-10.2.3-0.el7.x86_64
python-cephfs-10.2.3-0.el7.x86_64
ceph-osd-10.2.3-0.el7.x86_64
ceph-radosgw-10.2.3-0.el7.x86_64
libcephfs1-10.2.3-0.el7.x86_64
ceph-common-10.2.3-0.el7.x86_64
ceph-base-10.2.3-0.el7.x86_64
ceph-10.2.3-0.el7.x86_64
ceph-selinux-10.2.3-0.el7.x86_64
ceph-mon-10.2.3-0.el7.x86_64

cluster b2b00aae-f00d-41b4-a29b-58859aa41375
 health HEALTH_OK
 monmap e11: 3 mons at 
{ceph01=10.0.5.2:6789/0,ceph03=10.0.5.4:6789/0,ceph07=10.0.5.13:6789/0}
election epoch 76, quorum 0,1,2 ceph01,ceph03,ceph07
 osdmap e14396: 70 osds: 66 up, 66 in
flags sortbitwise,require_jewel_osds
  pgmap v7116569: 1664 pgs, 3 pools, 7876 GB data, 1969 kobjects
23648 GB used, 24310 GB / 47958 GB avail
1661 active+clean
   2 active+clean+scrubbing+deep
   1 active+clean+scrubbing
  client io 839 kB/s wr, 0 op/s rd, 159 op/s wr


On Mon, Feb 06, 2017 at 06:57:23PM PST, Tracy Reed spake thusly:
> This is what I'm doing on my CentOS 7/KVM/virtlib server:
> 
> rbd create --size 20G pool/vm.mydomain.com
> 
> rbd map pool/vm.mydomain.com --name client.admin
> 
> virt-install --name vm.mydomain.com --ram 2048 --disk 
> path=/dev/rbd/pool/vm.mydomain.com  --vcpus 1  --os-type linux --os-variant 
> rhel6 --network bridge=dmz --graphics none --console pty,target_type=serial 
> --location http://repo.mydomain.com/centos/7/os/x86_64 --extra-args 
> "ip=en0:dhcp ks=http://repo.mydomain.com/ks/ks.cfg.vm console=ttyS0  
> ksdevice=eth0 
> inst.repo=http://10.0.10.5/http://repo.mydomain.com/centos/7/os/x86_64";
> 
> And then it creates partitions, filesystems (xfs), and
> starts installing packages. 9 times out of 10 it hangs while
> installing packages. And I have no idea why. I can't kill
> the VM. 
> 
> Trying to destroy it shows:
> 
> virsh # destroy vm.mydomain.com
> error: Failed to destroy domain vm.mydomain.com
> error: Failed to terminate process 19629 with SIGKILL:
> Device or resource busy
> 
> and then virsh ls shows:
> 
> 127   vm.mydomain.com   in shutdown
> 
> The log for this vm in
> /var/log/libvirt/qemu/vm.mydomain.com contains only:
> 
> 2017-02-06 08:14:12.256+: starting up libvirt version:
> 2.0.0, package: 10.el7_3.2 (CentOS BuildSystem
> <http://bugs.centos.org>, 2016-12-06-19:53:38,
> c1bm.rdu2.centos.org), qemu version: 1.5.3
> (qemu-kvm-1.5.3-105.el7_2.7), hostname: cpu01.mydomain.com
> LC_ALL=C
> PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
> QEMU_AUDIO_DRV=none /usr/libexec/qemu-kvm -name
> secclass2.mydomain.com -S -machine
> pc-i440fx-rhel7.0.0,accel=kvm,usb=off -cpu
> SandyBridge,+vme,+f16c,+rdrand,+fsgsbase,+smep,+erms -m
> 2048 -realtime mlock=off -smp 1,sockets=1,cores=1,threads=1
> -uuid 5dadf01e-b996-411f-b95f-26ce6b790bae -nographic
> -no-user-config -nodefaults -chardev
> socket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-127-secclass2.mydomain./monitor.sock,server,nowait
> -mon chardev=charmonitor,id=monitor,mode=control -rtc
> base=utc,driftfix=slew -global
> kvm-pit.lost_tick_policy=discard -no-hpet -no-reboot
> -global PIIX4_PM.disable_s3=1 -global PIIX4_PM.disable_s4=1
> -boot strict=on -kernel
> /var/lib/libvirt/boot/virtinst-vmlinuz.9Ax4zt -initrd
> /var/lib/libvirt/boot/virtinst-initrd.img.ALJE43 -append
> 'ip=en0:dhcp ks=http://util1.mydomain.com/ks/ks.cfg.vm.
> console=ttyS0  ksdevice=eth0
> inst.repo=http://10.0.10.5/http://util1.mydomain.com/centos/7/os/x86_64
> method=http://util1.mydomain.com/centos/7/os/x86_64'
> -device ich9-usb-ehci1,id=usb,bus=pci.0,addr=0x5.0x7
> -device
> ich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x5
> -device
> ich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x5.0x1
> -device
> ich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x5.0x2
> -device
> virtio-serial-pci,id=virtio-serial0,bus=pci.0,addr=0x4
> -drive
> file=/dev/rbd/security-class/secclass2.mydomain.com,format=raw,if=none,id=drive-virtio-disk0,cache=none,aio=native
> -device
> virtio-blk-pci,scsi=off,bus=pci.0,addr=0x6,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1
> -netdev tap,fd=55,id=hostnet0,vhost=on,vhostfd=57 -device
> virtio-net-pci,netdev=hostnet0,id=net0,mac=52:54:00:87:d2:12,bus=pci.0,addr=0x3
> -chardev pty,id=charserial0 -device
> isa-serial,chardev=charserial0,id=serial0 -chardev
> socket,id=charchannel0,path=/var/lib/libvirt/qemu/channel/target/domain-127-secclass2.mydomain./org.qemu.guest_agent.0,server,nowait
> -device
> virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,id=ch

Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation

2017-02-07 Thread Tracy Reed
On Tue, Feb 07, 2017 at 12:25:08AM PST, koukou73gr spake thusly:
> On 2017-02-07 10:11, Tracy Reed wrote:
> > Weird. The VMs that were hung in an uninterruptible wait state have now
> > disappeared. No idea why.
> 
> Have you tried the same procedure but with local storage instead?

Yes. I have local storage and iSCSI storage and they both install just
fine.

-- 
Tracy Reed




Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation

2017-02-08 Thread Tracy Reed
On Wed, Feb 08, 2017 at 10:57:38AM PST, Shinobu Kinjo spake thusly:
> If you were able to reproduce the issue intentionally under some
> particular condition (which I have no idea about at the moment), it
> would be helpful.

The issue is very reproducible. It hangs every time: any install I do
with virt-install hangs at some point during the process. I have
reproduced it three times this morning already.

> There were some ML threads previously regarding a *similar* issue.
> 
>  # google "libvirt rbd issue"

I found:

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-September/004179.html

which suggested file descriptors as the problem. That's good to know for
when my cluster gets bigger, but I have only 70 OSDs, and the number of
fds in use never exceeded 90 while the soft limit is 1024.
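
Checking that is straightforward; a sketch along these lines (the pid
selection and paths are illustrative) shows both the count and the
limits actually in effect for the qemu process:

pid=$(pgrep -f qemu-kvm | head -1)
ls /proc/$pid/fd | wc -l                  # fds currently open by qemu
grep 'Max open files' /proc/$pid/limits   # soft/hard limits in effect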

My problem also manifests itself a little differently than described in
that post. I can dd large machine images into rbd all day long with no
problems. In fact, I am considering bypassing Anaconda kickstart installs
for the moment and just copying one of the machine images that did get
successfully installed, but this is not our normal deployment workflow,
so it is not ideal. Plus I'm still concerned there is an actual
underlying problem, or something I am not understanding, which may bite
us later.

That post also mentions jumbo frames. We have jumbo frames enabled
everywhere. We did have a problem months ago getting ceph up and running
initially because we forgot to tell the switch to use jumbo frames, and
we learned our lesson on that.
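
For anyone wanting to verify this end to end, a quick sketch (the
interface name and OSD address are examples):

ip link show eth0 | grep -o 'mtu [0-9]*'   # confirm the interface MTU is 9000
ping -M do -s 8972 -c 3 10.0.5.16          # 9000 bytes minus 28 bytes of IP/ICMP
                                           # headers; failures here mean some hop
                                           # is dropping jumbo frames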

Not sure what else I can look at. I'm not seeing any clues.

-- 
Tracy Reed




[ceph-users] How safe is ceph pg repair these days?

2017-02-17 Thread Tracy Reed
I have a 3-replica cluster. A couple of times I have run into
inconsistent PGs. I googled it, and the ceph docs and various blogs say
to run a repair first. But a couple of people on IRC and a mailing list
thread from 2015 say that ceph blindly copies the primary over the
secondaries and calls it good.

http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html

I sure hope that isn't the case. If so, it would seem highly
irresponsible to implement such a naive command and call it "repair". I
have recently learned how to properly analyze the OSD logs and manually
fix these things, but not before having run repair on a dozen
inconsistent PGs, so now I'm worried about what sort of corruption I may
have introduced. Repairing things by hand is a simple heuristic based on
comparing the size or checksum (as indicated by the logs) for each of
the 3 copies and figuring out which is correct. Presumably the two
copies out of three that match should win, and the odd object out should
be deleted, since having the exact same kind of error on two different
OSDs is highly improbable. I don't understand why ceph repair wouldn't
have done this all along.
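
For clarity, the by-hand procedure I'm describing looks roughly like
this on filestore OSDs (the PG id, object name, and paths are examples,
not from a real incident):

ceph health detail | grep inconsistent    # find the inconsistent PG, e.g. 2.37
grep ERR /var/log/ceph/ceph-osd.*.log     # see which object the scrub flagged
# on each of the three OSD hosts holding PG 2.37, checksum the replica:
find /var/lib/ceph/osd/ceph-*/current/2.37_head -name '*objectname*' -exec md5sum {} \;
# two matching checksums win; the odd copy out is the one to remove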

What is the current best practice in the use of ceph repair?

Thanks!

-- 
Tracy Reed




Re: [ceph-users] How safe is ceph pg repair these days?

2017-02-17 Thread Tracy Reed
Well, that's the question... is that safe? The mailing list post I
linked (possibly outdated) says that what you just suggested is
definitely NOT safe. Is the mailing list post wrong? Has the situation
changed? Exactly what does ceph repair do now? I suppose I could go dig
into the code, but I'm not an expert and would hate to get it wrong and
post possibly bogus info to the list for other newbies to find and worry
about and possibly lose their data.

On Fri, Feb 17, 2017 at 06:08:39PM PST, Shinobu Kinjo spake thusly:
> if ``ceph pg deep-scrub <pg-id>`` does not work
> then
>   do ``ceph pg repair <pg-id>``
> 
> 
> On Sat, Feb 18, 2017 at 10:02 AM, Tracy Reed  wrote:
> > I have a 3 replica cluster. A couple times I have run into inconsistent
> > PGs. I googled it and ceph docs and various blogs say run a repair
> > first. But a couple people on IRC and a mailing list thread from 2015
> > say that ceph blindly copies the primary over the secondaries and calls
> > it good.
> >
> > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001370.html
> >
> > I sure hope that isn't the case. If so it would seem highly
> > irresponsible to implement such a naive command called "repair". I have
> > recently learned how to properly analyze the OSD logs and manually fix
> > these things but not before having run repair on a dozen inconsistent
> > PGs. Now I'm worried about what sort of corruption I may have
> > introduced. Repairing things by hand is a simple heuristic based on
> > comparing the size or checksum (as indicated by the logs) for each of
> > the 3 copies and figuring out which is correct. Presumably matching two
> > out of three should win and the odd object out should be deleted since
> > having the exact same kind of error on two different OSDs is highly
> > improbable. I don't understand why ceph repair wouldn't have done this
> > all along.
> >
> > What is the current best practice in the use of ceph repair?
> >
> > Thanks!
> >
> > --
> > Tracy Reed
> >

-- 
Tracy Reed




Re: [ceph-users] How safe is ceph pg repair these days?

2017-02-22 Thread Tracy Reed
On Mon, Feb 20, 2017 at 02:12:52PM PST, Gregory Farnum spake thusly:
> Hmm, I went digging in and sadly this isn't quite right. 

Thanks for looking into this! This is the answer I was afraid of. Aren't
all of those blog entries that talk about using repair, and the ceph
docs themselves, putting people's data at risk? It seems like the only
responsible way to deal with inconsistent PGs is to dig into the OSD
log, look at the reason for the inconsistency, examine the data on disk,
determine which copy is good and which is bad, and delete the bad one?
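
(As an aside, jewel at least exposes the scrub findings directly, so the
good/bad determination doesn't have to start from the logs. Assuming the
build includes these commands, with 2.37 standing in for the affected PG
and the pool name purely illustrative:

rados list-inconsistent-pg rbd                        # inconsistent PGs in pool "rbd"
rados list-inconsistent-obj 2.37 --format=json-pretty # per-shard sizes and digests

the per-shard output shows which copy disagrees.)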

> The code has a lot of internal plumbing to allow more smarts than were
> previously feasible and the erasure-coded pools make use of them for
> noticing stuff like local corruption. Replicated pools make an attempt
> but it's not as reliable as one would like and it still doesn't
> involve any kind of voting mechanism.

This is pretty surprising. I would have thought a best two out of three
voting mechanism in a triple replicated setup would be the obvious way
to go. It must be more difficult to implement than I suppose.

> A self-inconsistent replicated primary won't get chosen. A primary is
> self-inconsistent when its digest doesn't match the data, which
> happens when:
> 1) the object hasn't been written since it was last scrubbed, or
> 2) the object was written in full, or
> 3) the object has only been appended to since the last time its digest
> was recorded, or
> 4) something has gone terribly wrong in/under LevelDB and the omap
> entries don't match what the digest says should be there.

At least there's some sort of basic heuristic which attempts to do the
right thing even if the whole process isn't as thorough as it could be.
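
(If I read that right, the heuristic has the most to work with when the
digests are fresh, i.e. after a recent deep scrub of the PG, e.g.:

ceph pg deep-scrub 2.37

with 2.37 standing in for the PG in question, before deciding whether to
run repair.)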

> David knows more and can correct me if I'm missing something. He's also
> working on interfaces for scrub that are more friendly in general and
> allow administrators to make more fine-grained decisions about
> recovery in ways that cooperate with RADOS.

These will be very welcome improvements! 

-- 
Tracy Reed

