[ceph-users] SSD-Cache Tier + RBD-Cache = Filesystem corruption?

2016-02-06 Thread Udo Waechter
Hello,

I am experiencing totally weird filesystem corruptions with the
following setup:

* Ceph infernalis on Debian8
* 10 OSDs (5 hosts) with spinning disks
* 4 OSDs (1 host, with SSDs)

The SSDs are new in my setup and I am trying to set up a cache tier.

Now, with the spinning disks, Ceph has been running for about a year without
any major issues. Replacing disks and all that went fine.

Ceph is used by rbd+libvirt+kvm with

rbd_cache = true
rbd_cache_writethrough_until_flush = true
rbd_cache_size = 128M
rbd_cache_max_dirty = 96M

Also, in libvirt, I have

cachemode=writeback enabled.

So far so good.

Now, I've added the SSD cache tier to the picture with "cache-mode
writeback".

The SSD machine also has the "deadline" scheduler enabled.

Suddenly VMs start to corrupt their filesystems (all ext4) with "Journal
failed".
Trying to reboot the machines ends in "No bootable drive".
Using parted and testdisk on the image mapped via rbd reveals that the
partition table is gone.

testdisk finds the proper partitions; afterwards, e2fsck "repairs" the
filesystem beyond usability.

This does not happen to all machines; it happens to those that actually
do some or most of the IO:

elasticsearch, MariaDB+Galera, postgres, backup, GIT

Or so I thought, until yesterday one of my LDAP servers died, and that one
is not doing IO.

Could it be that rbd caching + qemu writeback cache + Ceph cache tier
writeback are not playing well together?

I've read through some older mails on the list, where people had similar
problems and suspected something like that.

What are the proper/right settings for rbd/qemu/libvirt?

libvirt: cachemode=none (writeback?)
rbd: rbd_cache = false
SSD tier: cache-mode writeback

?
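
(In config terms I guess that would mean something like this on the client
side; just a sketch of what I mean, not something I have tested yet:

[client]
rbd cache = false

plus cache='none' on the libvirt disk, while keeping the tier's cache-mode at
writeback.)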

Thanks for any help,
udo.





Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)

2016-02-06 Thread Zoltan Arnold Nagy
Hi,

Are these bare metal nodes or VMs?

For VMs I suggest you just attach rbd data disks and then let HDFS do its magic.
Just make sure you're not replicating 9x (3x on ceph + 3x on hadoop).
If it's VMs, you can just do the same with krbd, just make sure to run a recent
enough kernel :-)

Basically putting HDFS on RBDs.

> On 05 Feb 2016, at 13:42, Jose M wrote:
> 
> Hi Zoltan, thanks for the answer.
> 
> Because replacing hdfs:// with ceph:// and using CephFS doesn't work for all
> Hadoop components out of the box (at least in my tests); for example I had
> issues with HBase, then with Yarn, Hue, etc. (I'm using the Cloudera
> distribution but I also tried with separate components). And besides the need
> to add jars and bindings to each node to get them to work, there are a lot of
> places (XML files, configuration) where the "hdfs for ceph" replacement needs
> to be made.
> 
> Given these issues, I thought that mounting Ceph as a local directory and
> then using these "virtual dirs" as the Hadoop dfs dirs would be easier and
> would work better (fewer configuration problems, and only changing the dfs
> dirs will make all components work without any further changes).
> 
> Of course I can be totally wrong, and it's a core change to do this, which is
> why I think I should ask here first :)
> 
> Thanks!
> 
> PS: If you are asking why I'm trying to use Ceph here, well, it's because we
> were given an infrastructure with the possibility to use a big Ceph storage
> cluster that's working really, really well (but as an object store; it wasn't
> used with Hadoop until now).
> 
> 
> From: Zoltan Arnold Nagy
> Sent: Thursday, 04 February 2016 06:07 p.m.
> To: John Spray
> Cc: Jose M; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)
>  
> Might be totally wrong here, but it’s not layering them but replacing hdfs:// 
> URLs with ceph:// URLs so all the mapreduce/spark/hbase/whatever is on top 
> can use CephFS directly which is not a bad thing to do (if it works) :-)
> 
>> On 02 Feb 2016, at 16:50, John Spray wrote:
>> 
>> On Tue, Feb 2, 2016 at 3:42 PM, Jose M wrote:
>>> Hi,
>>> 
>>> 
>>> One simple question: the Ceph docs say that to use Ceph as an HDFS
>>> replacement, I can use the CephFS Hadoop plugin
>>> (http://docs.ceph.com/docs/master/cephfs/hadoop/).
>>> 
>>> 
>>> What I would like to know is whether, instead of using the plugin, I can
>>> mount Ceph in fstab and then point the HDFS dirs (namenode, datanode, etc.)
>>> to these mounted "ceph" dirs instead of native local dirs.
>>> 
>>> I understand that this may involve more configuration steps (configuring
>>> fstab on each node), but will this work? Is there any problem with this
>>> type of configuration?
>> 
>> Without being a big HDFS expert, it seems like you would be
>> essentially putting one distributed filesystem on top of another
>> distributed filesystem.  I don't know if you're going to find anything
>> that breaks as such, but it's probably not a good idea.
>> 
>> John
>> 
>>> 
>>> Thanks in advance,
>>> 
>>> 
>>> 


Re: [ceph-users] SSD-Cache Tier + RBD-Cache = Filesystem corruption?

2016-02-06 Thread Alexandre DERUMIER
>>Could it be that rbd caching + qemu writeback cache + Ceph cache tier
>>writeback are not playing well together?


rbd_cache=true is the same as qemu writeback.

Setting cache=writeback in qemu configures librbd with rbd_cache=true.
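
A minimal sketch of that mapping on the qemu command line (the pool/image
names are just placeholders):

qemu-system-x86_64 ... \
  -drive file=rbd:rbd/vm-disk:id=admin,format=raw,if=virtio,cache=writeback

cache=writeback turns rbd_cache on for that drive; cache=none runs it with
rbd_cache=false.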



If you have fs corruption, it seems that flushes from the guest are not
reaching the final storage correctly.
I have never had problems with rbd_cache=true.

Maybe it's a bug with the SSD cache tier...



- Original Message -
From: "Udo Waechter" 
To: "ceph-users" 
Sent: Saturday, 6 February 2016 11:31:51
Subject: [ceph-users] SSD-Cache Tier + RBD-Cache = Filesystem corruption?



Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)

2016-02-06 Thread Zoltan Arnold Nagy
Hi,

Please keep the list on CC as I guess others might be interested as well, if 
you don’t mind.

For VMs one can use rbd-backed block devices, and for bare-metal nodes where
there is no abstraction one can use krbd - notice the k there, it stands for
"kernel". krbd is the in-kernel driver, as there is no other abstraction layer
there to provide the functionality, unlike in the VM case where qemu can use
librbd to implement it.

If it’s a VM environment anyway then just attach the volumes from the 
underlying ceph and you are good to go. Make sure to attach multiple volumes on 
the same node for better performance - if you want 10TB per node for example, 
I’d mount 10x1TB.

I’d keep the replication in HDFS as that gives it “fake” data locality so 
better processing in MR/Spark workloads; just make sure you do map the nodes to 
different physical zones in your ceph cluster; way I’d do it is split the nodes 
across the racks, so let’s say 10 VM on rack 1, 10 VM on rack 2, 10VM on rack3, 
and set up the crush rules to keep the data on that particular volume within 
the rack.

This basically maps your underlying failure domains from Ceph to HDFS.
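
A hypothetical sketch of that last part, assuming the racks already exist as
CRUSH buckets named rack1/rack2/rack3 (names made up for the example):

ceph osd crush rule create-simple hdfs-rack1 rack1 host
ceph osd pool create hdfs-rack1 512 512 replicated hdfs-rack1

and then put the volumes for the rack-1 VMs on that pool, and likewise for the
other racks.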

> On 06 Feb 2016, at 14:07, Jose M wrote:
> 
> Hi Zoltan! Thanks for the tips :)
> 
> I suppose the krbd is for bare-metal nodes? I ask because you say "for VMs"
> for both of them ;)
> 
> A couple of questions if you don't mind. This cloud is based on Apache
> CloudStack, so I understand I should go the VM way, right? And why is it
> better not to use krbd if they are VMs?
> 
> I understand I should disable replication in one of them. Would your
> recommendation be to do it in Ceph or in Hadoop? Ceph seems to be more
> reliable, but I have doubts whether any Hadoop feature would misbehave if
> replication is disabled "from the Hadoop side".
> 
> Thanks!
> From: Zoltan Arnold Nagy
> Sent: Friday, 05 February 2016 02:21 p.m.
> To: Jose M
> Subject: Re: [ceph-users] Ceph and hadoop (fstab insted of CephFS)

[ceph-users] CEPH health issues

2016-02-06 Thread Jeffrey McDonald
Hi,

I'm seeing lots of issues with my CEPH installation. The health of the
system is degraded and many of the OSDs are down.

# ceph -v
ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)

#   ceph health
HEALTH_ERR 2002 pgs degraded; 14 pgs down; 180 pgs inconsistent; 14 pgs
peering; 1 pgs stale; 2002 pgs stuck degraded; 14 pgs stuck inactive; 1 pgs
stuck stale; 2320 pgs stuck unclean; 2002 pgs stuck undersized; 2002 pgs
undersized; 100 requests are blocked > 32 sec; recovery 3802/531925830
objects degraded (7.150%); recovery 48881596/531925830 objects misplaced
(9.190%); 12623 scrub errors; 11/320 in osds are down; noout flag(s) set

Log for one of the down OSDs shows:

-5> 2016-02-05 19:10:45.294873 7fd4d58e4700  1 -- 10.31.0.3:6835/157558
--> 10.31.0.5:0/3796 -- osd_ping(ping_reply e144138 stamp 2016-02-05
19:10:45.286934) v2 -- ?+
0 0x4359a00 con 0x2bc9ac60
-4> 2016-02-05 19:10:45.294915 7fd4d70e7700  1 -- 10.31.0.67:6835/157558
--> 10.31.0.5:0/3796 -- osd_ping(ping_reply e144138 stamp 2016-02-05
19:10:45.286934) v2 -- ?
+0 0x27e21800 con 0x2bacd700
-3> 2016-02-05 19:10:45.341383 7fd4e2ea8700  0
filestore(/var/lib/ceph/osd/ceph-299)  error (39) Directory not empty not
handled on operation 0x12c88178 (6494115.0.1,
 or op 1, counting from 0)
-2> 2016-02-05 19:10:45.341477 7fd4e2ea8700  0
filestore(/var/lib/ceph/osd/ceph-299) ENOTEMPTY suggests garbage data in
osd data dir
-1> 2016-02-05 19:10:45.341493 7fd4e2ea8700  0
filestore(/var/lib/ceph/osd/ceph-299)  transaction dump:
{
"ops": [
{
"op_num": 0,
"op_name": "remove",
"collection": "70.532s3_head",
"oid": "532\/\/head\/\/70\/18446744073709551615\/3"
},
{
"op_num": 1,
"op_name": "rmcoll",
"collection": "70.532s3_head"
}
]
}

 0> 2016-02-05 19:10:45.343794 7fd4e2ea8700 -1 os/FileStore.cc: In
function 'unsigned int
FileStore::_do_transaction(ObjectStore::Transaction&, uint64_t, int, ThreadP
ool::TPHandle*)' thread 7fd4e2ea8700 time 2016-02-05 19:10:45.341673
os/FileStore.cc: 2757: FAILED assert(0 == "unexpected error")

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x8b) [0xbc60eb]
 2: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
 3: (FileStore::_do_transactions(std::list >&, unsigned long,
ThreadPool::TPHandle*)+0x64) [0x92a3a4]
 4: (FileStore::_do_op(FileStore::OpSequencer*,
ThreadPool::TPHandle&)+0x16a) [0x92a52a]
 5: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb6b4e]
 6: (ThreadPool::WorkThread::entry()+0x10) [0xbb7bf0]
 7: (()+0x8182) [0x7fd4ef916182]
 8: (clone()+0x6d) [0x7fd4ede8147d]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 keyvaluestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
  max_recent 1
  max_new 1000
  log_file /var/log/ceph/ceph-osd.299.log
--- end dump of recent events ---
2016-02-05 19:10:45.441428 7fd4e2ea8700 -1 *** Caught signal (Aborted) **
 in thread 7fd4e2ea8700

 ceph version 0.94.5 (9764da52395923e0b32908d83a9f7304401fee43)
 1: /usr/bin/ceph-osd() [0xacd7ba]
 2: (()+0x10340) [0x7fd4ef91e340]
 3: (gsignal()+0x39) [0x7fd4eddbdcc9]
 4: (abort()+0x148) [0x7fd4eddc10d8]
 5: (__gnu_cxx::__verbose_terminate_handler()+0x155) [0x7fd4ee6c8535]
 6: (()+0x5e6d6) [0x7fd4ee6c66d6]
 7: (()+0x5e703) [0x7fd4ee6c6703]
 8: (()+0x5e922) [0x7fd4ee6c6922]
 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char
const*)+0x278) [0xbc62d8]
 10: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long,
int, ThreadPool::TPHandle*)+0xa52) [0x923d12]
 11: (FileStore::_do_transactions(std::list >&, unsigned long,
ThreadPool::TPHandle*)+0x64) [0x92a3a4
]
 12: (FileStore::_do_op(FileStore::OpSequencer*,
ThreadPool::TPHandle&)+0x16a) [0x92a52a]
 13: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa5e) [0xbb6b4e]
 14: (ThreadPool::WorkThread::entry()+0x10) [0xbb7bf0]
 15: (()+0x8182) [0x7fd4ef916182]
 16: (clone()+0x6d) [0x7fd4ede8147d]
 NOTE: a copy of the executable, or `objdump -rdS ` is needed
to interpret this.

--- begin dump of recent e

Re: [ceph-users] Ceph mirrors wanted!

2016-02-06 Thread Tyler Bishop
Covered, except that the Dreamhost mirror is constantly down or broken.

I can add ceph.mirror.beyondhosting.net for it.

Tyler Bishop 
Chief Technical Officer 
513-299-7108 x10 



tyler.bis...@beyondhosting.net 


If you are not the intended recipient of this transmission you are notified 
that disclosing, copying, distributing or taking any action in reliance on the 
contents of this information is strictly prohibited.

- Original Message -
From: "Wido den Hollander" 
To: "Tyler Bishop" 
Cc: "ceph-users" 
Sent: Saturday, February 6, 2016 2:46:50 AM
Subject: Re: [ceph-users] Ceph mirrors wanted!

> On 6 February 2016 at 00:08, Tyler Bishop wrote:
> 
> 
> I have ceph pulling down from eu. What *origin* should I set up rsync to
> automatically pull from?
> 
> download.ceph.com is consistently broken.
> 

download.ceph.com should be your best guess, since that is closest.

The US however seems covered with download.ceph.com although we might set up
us-east and us-west.

I see that Ceph is currently in a subfolder called 'Ceph' and that is not
consistent with the other mirrors.

Can that be fixed so that it matches the original directory structure?

Wido

> - Original Message -
> From: "Tyler Bishop" 
> To: "Wido den Hollander" 
> Cc: "ceph-users" 
> Sent: Friday, February 5, 2016 5:59:20 PM
> Subject: Re: [ceph-users] Ceph mirrors wanted!
> 
> We would be happy to mirror the project.
> 
> http://mirror.beyondhosting.net
> 
> 
> - Original Message -
> From: "Wido den Hollander" 
> To: "ceph-users" 
> Sent: Saturday, January 30, 2016 9:14:59 AM
> Subject: [ceph-users] Ceph mirrors wanted!
> 
> Hi,
> 
> My PR was merged with a script to mirror Ceph properly:
> https://github.com/ceph/ceph/tree/master/mirroring
> 
> Currently there are 3 (official) locations where you can get Ceph:
> 
> - download.ceph.com (Dreamhost, US)
> - eu.ceph.com (PCextreme, Netherlands)
> - au.ceph.com (Digital Pacific, Australia)
> 
> I'm looking for more mirrors to become official mirrors so we can easily
> distribute Ceph.
> 
> Mirrors do go down and it's always nice to have a mirror local to you.
> 
> I'd like to have one or more mirrors in Asia, Africa and/or South
> America if possible. Anyone able to host there? Other locations are
> welcome as well!
> 
> A few things which are required:
> 
> - 1Gbit connection or more
> - Native IPv4 and IPv6
> - HTTP access
> - rsync access
> - 2TB of storage or more
> - Monitoring of the mirror/source
> 
> You can easily mirror Ceph yourself with this script I wrote:
> https://github.com/ceph/ceph/blob/master/mirroring/mirror-ceph.sh
> 
> eu.ceph.com and au.ceph.com use it to sync from download.ceph.com. If
> you want to mirror Ceph locally, please pick a mirror local to you.
> 
> Please refer to these guidelines:
> https://github.com/ceph/ceph/tree/master/mirroring#guidelines
> 
> -- 
> Wido den Hollander
> 42on B.V.
> Ceph trainer and consultant
> 
> Phone: +31 (0)20 700 9902
> Skype: contact42on


[ceph-users] can't get rid of stale+active+clean pgs by no means

2016-02-06 Thread Nikola Ciprich
Hi,

I'm still struggling with health problems of my cluster.

I still have 2 stale+active+clean PGs and one creating PG.
I've just stopped all nodes and started them all again,
and those PGs still remain.

I think I've read all related discussions and docs, and tried
virtually everything I thought could help (and be safe).

Querying those stale PGs hangs, while the OSDs which should be acting
for them are running. I can't figure out what could be wrong.
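
(What I mean by querying, with a made-up pg id just as an example:

ceph pg dump_stuck stale
ceph pg map 2.3f
ceph pg 2.3f query   <- this is what hangs
)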

Does anyone have an idea what to try?

I'm running the latest hammer (0.94.5) on CentOS 6.

thanks a lot in advance

cheers

nik



-- 
-
Ing. Nikola CIPRICH
LinuxBox.cz, s.r.o.
28.rijna 168, 709 00 Ostrava

tel.:   +420 591 166 214
fax:+420 596 621 273
mobil:  +420 777 093 799
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-




Re: [ceph-users] CEPH health issues

2016-02-06 Thread Tyler Bishop
You need to get your OSDs back online.
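
Something like this should show which ones are down and on which hosts, so you
can restart those daemons (the exact restart command depends on your distro and
init system):

ceph osd tree | grep down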




From: "Jeffrey McDonald"  
To: ceph-users@lists.ceph.com 
Sent: Saturday, February 6, 2016 8:18:06 AM 
Subject: [ceph-users] CEPH health issues 
