Re: [ceph-users] Strange configuration with many SAN and few servers

2014-11-08 Thread Mario Giammarco
Gregory Farnum writes: > > > and then to "replace the server" you could just mount the LUNs somewhere else and turn on the OSDs. You would need to set a few config options (like the one that automatically updates crush location on boot), but it shouldn't be too difficult. Thank you for your r
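The approach Greg describes can be sketched roughly as follows. This is a hedged sketch, not the thread's exact procedure: the device path and OSD id are hypothetical examples, and `osd crush update on start` is the config option that updates the CRUSH location when the daemon boots.

```shell
# On the replacement server -- device path and OSD id are examples.
#
# ceph.conf should allow the OSD to update its own CRUSH location:
#   [osd]
#   osd crush update on start = true

# Mount the SAN LUN holding the OSD's data at the standard location...
mount /dev/mapper/san-lun-0 /var/lib/ceph/osd/ceph-12

# ...then start the daemon; it registers its new location on boot.
service ceph start osd.12
```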

Re: [ceph-users] RBD kernel module for CentOS?

2014-11-08 Thread Alexandre DERUMIER
Hi >>Can anyone point me to a RBD kmod for CentOS? http://gitbuilder.ceph.com/kmod-rpm-rhel7beta-x86_64-basic/ref/rhel7/x86_64/ - Mail original - De: "Bruce McFarland" À: ceph-users@lists.ceph.com Envoyé: Vendredi 7 Novembre 2014 20:13:37 Objet: [ceph-users] RBD kernel module for

[ceph-users] [URGENT] My CEPH cluster is dying (due to "incomplete" PG)

2014-11-08 Thread Chu Duc Minh
My ceph cluster has a pg in state "incomplete" and I cannot query it any more. *# ceph pg 6.9d8 query* (hangs forever) All my volumes may lose data because of this PG. # ceph pg dump_stuck inactive ok pg_stat objects mip degrmispunf bytes log disklog state state_sta
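For readers hitting the same state, the usual first-pass diagnostics look roughly like this. PG id 6.9d8 is taken from the report above; the OSD id is an example, and this is a generic sketch rather than a fix.

```shell
# Generic diagnostics for a stuck/incomplete PG (sketch).
ceph health detail            # lists which PGs are incomplete and why
ceph pg dump_stuck inactive   # the command from the report above
ceph pg map 6.9d8             # which OSDs the PG currently maps to

# If "ceph pg <id> query" hangs, ask the primary OSD's admin socket
# directly instead (osd.88 is an example id):
ceph daemon osd.88 dump_ops_in_flight
```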

[ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Loic Dachary
Hi, This is a first attempt, it is entirely possible that the solution is simple or RTFM ;-) Here is the problem observed: rados --pool ec4p1 bench 120 write # the erasure coded pool Total time run: 147.207804 Total writes made: 458 Write size: 4194304 Bandwidth (MB/sec
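To reproduce the comparison being made here, the same benchmark is typically run against both pools. The pool name `ec4p1` is from the message; `ec4p1c` for the cache pool is an assumed name for illustration.

```shell
# Benchmark the erasure coded pool (as in the message)...
rados --pool ec4p1 bench 120 write
# ...then run the identical workload against the cache pool
# ("ec4p1c" is an assumed name) to compare bandwidth.
rados --pool ec4p1c bench 120 write

# Remove the benchmark objects afterwards.
rados --pool ec4p1 cleanup
rados --pool ec4p1c cleanup
```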

Re: [ceph-users] Strange configuration with many SAN and few servers

2014-11-08 Thread Gregory Farnum
Yep! I mean, you don't do anything to register the osd with the cluster again, you just turn it on and it goes to register its new location. -Greg On Sat, Nov 8, 2014 at 2:35 AM Mario Giammarco wrote: > Gregory Farnum writes: > > > > > > > and then to "replace the server" you could hair mount t

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Gregory Farnum
When acting as a cache pool it needs to go do a lookup on the base pool for every object it hasn't encountered before. I assume that's why it's slower. (The penalty should not be nearly as high as you're seeing here, but based on the low numbers I imagine you're running everything on an overloaded

Re: [ceph-users] [URGENT] My CEPH cluster is dying (due to "incomplete" PG)

2014-11-08 Thread Chu Duc Minh
I have no choice except to re-create this PG: # ceph pg force_create_pg 6.9d8 But it is still stuck at creating: # ceph pg dump | grep creating dumped all in format plain 6.9d8 0 0 0 0 0 0 0 0 creating2014-11-09 03:27:23.611838 0'0 0:0 []
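When force_create_pg leaves a PG stuck in "creating", it is often because CRUSH still maps it to OSDs that cannot serve it. A hedged sketch of the usual checks (OSD id is an example; the last command is destructive and not something the thread itself recommends):

```shell
# See which OSDs the stuck PG is mapped to.
ceph pg map 6.9d8

# If a mapped OSD is gone for good, marking it lost can let peering
# proceed -- this abandons its data, so it is strictly a last resort.
ceph osd lost 88 --yes-i-really-mean-it
```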

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Loic Dachary
Hi Greg, On 08/11/2014 20:19, Gregory Farnum wrote: > When acting as a cache pool it needs to go do a lookup on the base pool for every object it hasn't encountered before. I assume that's why it's slower. > (The penalty should not be nearly as high as you're seeing here, but based on > the low

[ceph-users] Cache Tier Statistics

2014-11-08 Thread Nick Fisk
Hi, Does anyone know if there are any statistics available specific to the cache tier functionality? I'm thinking along the lines of cache hit ratios. Or should I be pulling out the read statistics for backing+cache pools and assuming that if a read happens from the backing pool it was a miss and t

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Gregory Farnum
It's all about the disk accesses. What's the slow part when you dump historic and in-progress ops? On Sat, Nov 8, 2014 at 2:30 PM Loic Dachary wrote: > Hi Greg, > > On 08/11/2014 20:19, Gregory Farnum wrote:> When acting as a cache pool it > needs to go do a lookup on the base pool for every obje
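The dumps Greg is asking for come from the OSD admin socket; a sketch, with osd.0 as an example id:

```shell
# On the node hosting the suspect OSD (osd.0 is an example):
ceph daemon osd.0 dump_ops_in_flight   # ops currently being processed
ceph daemon osd.0 dump_historic_ops    # recent slow ops with per-step timestamps

# Inside each historic op, the "duration" field and the event
# timestamps show which stage (queueing, journal, sub-ops, disk)
# accounts for the latency.
```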

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Loic Dachary
On 09/11/2014 00:03, Gregory Farnum wrote: > It's all about the disk accesses. What's the slow part when you dump historic > and in-progress ops? This is what I see on g1 (6% iowait) root@g1:~# ceph daemon osd.0 dump_ops_in_flight { "num_ops": 0, "ops": []} root@g1:~# ceph daemon osd.0 dump

Re: [ceph-users] Troubleshooting an erasure coded pool with a cache tier

2014-11-08 Thread Gregory Farnum
On Sat, Nov 8, 2014 at 3:24 PM, Loic Dachary wrote: > > > On 09/11/2014 00:03, Gregory Farnum wrote: >> It's all about the disk accesses. What's the slow part when you dump >> historic and in-progress ops? > > This is what I see on g1 (6% iowait) Yeah, you're going to need to do some data collat

Re: [ceph-users] Cache Tier Statistics

2014-11-08 Thread Jean-Charles Lopez
Hi Nick If my brain doesn't fail me you can try ceph daemon osd.{id} perf dump ceph report (not 100% sure if cache stats are in Rgds JC On Saturday, November 8, 2014, Nick Fisk wrote: > Hi, > > > > Does anyone know if there any statistics available specific to the cache > tier functionality,
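Building on JC's suggestion: the per-OSD perf counters include tier-related counters, but which ones exist varies by version, so grepping the dump is a reasonable first pass. The counter names below are from memory and should be verified against your build.

```shell
# Dump all perf counters for one OSD and filter the cache-tier ones
# (osd.0 is an example id).
ceph daemon osd.0 perf dump | grep -i tier

# Counters such as tier_promote, tier_flush and tier_evict (names may
# differ per version) can be combined with op_r / op_w to estimate a
# rough hit ratio.
```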

[ceph-users] How to remove hung object

2014-11-08 Thread Tuân Tạ Bá
Hi all, I want to remove a hung object. When using "rados -p volumes rm rbd_data.3955c5cdbb2ea.2832", I can't remove it because that object is in an incomplete pg. How can I remove the hung object? (any way?) 2014-11-09 09:00:57.005459 7f34f42de700 10 osd.88 pg_epoch: 112398 *pg[6.9d8(* v 1093
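A remove cannot succeed while the PG is incomplete; the PG has to peer first. A hedged sketch of the order of operations (PG id and object name are from the message above):

```shell
# First find out what is blocking peering for the PG holding the object.
ceph health detail | grep 6.9d8
ceph pg map 6.9d8

# Only once the PG is active again should the original removal work:
rados -p volumes rm rbd_data.3955c5cdbb2ea.2832
```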