Hm. It seems that the cache pool quotas have not been set. At least I'm sure I didn't set them. Maybe they have a default setting.

# ceph osd pool get-quota cache
quotas for pool 'cache':
  max objects: N/A
  max bytes  : N/A

But I did set target_max_bytes:

# ceph osd pool set cache target_max_bytes 1000000000000

Could that be the reason?
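For reference, the two settings are different knobs: a pool quota (max_objects/max_bytes) hard-limits the pool, while target_max_bytes only tells the cache-tiering agent when to start flushing and evicting. A minimal sketch for comparing the two on this cluster, assuming the cache pool is named 'cache' as above and that this release's `ceph osd pool get` exposes the cache-tier fields (otherwise `ceph osd dump --format json-pretty` lists each pool's target_max_bytes):

# show any quota on the cache pool (N/A means none is set)
ceph osd pool get-quota cache

# show the cache-tier flush/evict targets
ceph osd pool get cache target_max_bytes
ceph osd pool get cache target_max_objects

# Alexey's suggested workaround from the reply below, applied to this pool name
ceph osd pool set-quota cache max_bytes 0
ceph osd pool set-quota cache max_objects 0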
On Wed, Feb 24, 2016 at 4:08 PM, Alexey Sheplyakov <asheplya...@mirantis.com> wrote:
> Hi,
>
> > 0> 2016-02-24 04:51:45.884445 7fd994825700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*, ceph::buffer::list::iterator&, OSDOp&, ObjectContextRef&, bool)' thread 7fd994825700 time 2016-02-24 04:51:45.870995
> > osd/ReplicatedPG.cc: 5558: FAILED assert(cursor.data_complete)
> >
> > ceph version 0.80.11-8-g95c4287 (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
>
> This one looks familiar: http://tracker.ceph.com/issues/13098
>
> A quick work around is to unset the cache pool quota:
>
> ceph osd pool set-quota $cache_pool_name max_bytes 0
> ceph osd pool set-quota $cache_pool_name max_objects 0
>
> The problem has been properly fixed in infernalis v9.1.0, and (partially) in hammer (v0.94.6 which will be released soon).
>
> Best regards,
> Alexey
>
>
> On Wed, Feb 24, 2016 at 5:37 AM, Alexander Gubanov <sht...@gmail.com> wrote:
> > Hi,
> >
> > Every time 2 of 18 OSDs are crashing. I think it's happening when run PG replication because crashing only 2 OSDs and every time they're are the same.
> >
> >  0> 2016-02-24 04:51:45.884445 7fd994825700 -1 osd/ReplicatedPG.cc: In function 'int ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*, ceph::buffer::list::iterator&, OSDOp&, ObjectContextRef&, bool)' thread 7fd994825700 time 2016-02-24 04:51:45.870995
> > osd/ReplicatedPG.cc: 5558: FAILED assert(cursor.data_complete)
> >
> > ceph version 0.80.11-8-g95c4287 (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> > 1: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*, ceph::buffer::list::iterator&, OSDOp&, std::tr1::shared_ptr<ObjectContext>&, bool)+0xffc) [0x7c1f7c]
> > 2: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> > 3: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62) [0x814622]
> > 4: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8) [0x815098]
> > 5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4) [0x81a3f4]
> > 6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> > 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> > 8: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x61cba3]
> > 9: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> > 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> > 11: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> > 12: (()+0x7dc5) [0x7fd9ad03edc5]
> > 13: (clone()+0x6d) [0x7fd9abd2828d]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.3.log
> > --- end dump of recent events ---
> >
> > 2016-02-24 04:51:45.944447 7fd994825700 -1 *** Caught signal (Aborted) **
> > in thread 7fd994825700
> >
> > ceph version 0.80.11-8-g95c4287 (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> > 1: /usr/bin/ceph-osd() [0x9a24f6]
> > 2: (()+0xf100) [0x7fd9ad046100]
> > 3: (gsignal()+0x37) [0x7fd9abc675f7]
> > 4: (abort()+0x148) [0x7fd9abc68ce8]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fd9ac56b9d5]
> > 6: (()+0x5e946) [0x7fd9ac569946]
> > 7: (()+0x5e973) [0x7fd9ac569973]
> > 8: (()+0x5eb93) [0x7fd9ac569b93]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8d9df]
> > 10: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*, ceph::buffer::list::iterator&, OSDOp&, std::tr1::shared_ptr<ObjectContext>&, bool)+0xffc) [0x7c1f7c]
> > 11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> > 12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62) [0x814622]
> > 13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8) [0x815098]
> > 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4) [0x81a3f4]
> > 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> > 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> > 17: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x61cba3]
> > 18: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> > 19: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> > 20: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> > 21: (()+0x7dc5) [0x7fd9ad03edc5]
> > 22: (clone()+0x6d) [0x7fd9abd2828d]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- begin dump of recent events ---
> >     -5> 2016-02-24 04:51:45.904559 7fd995026700  5 -- op tracker -- , seq: 19230, time: 2016-02-24 04:51:45.904559, event: started, request: osd_op(osd.13.12097:806246 rb.0.218d6.238e1f29.000000010db3@snapdir [list-snaps] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone e13252) v4
> >     -4> 2016-02-24 04:51:45.904598 7fd995026700  1 -- 172.16.0.1:6801/419703 --> 172.16.0.3:6844/12260 -- osd_op_reply(806246 rb.0.218d6.238e1f29.000000010db3 [list-snaps] v0'0 uv27683057 ondisk = 0) v6 -- ?+0 0x9f90800 con 0x1b7838c0
> >     -3> 2016-02-24 04:51:45.904616 7fd995026700  5 -- op tracker -- , seq: 19230, time: 2016-02-24 04:51:45.904616, event: done, request: osd_op(osd.13.12097:806246 rb.0.218d6.238e1f29.000000010db3@snapdir [list-snaps] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone e13252) v4
> >     -2> 2016-02-24 04:51:45.904637 7fd995026700  5 -- op tracker -- , seq: 19231, time: 2016-02-24 04:51:45.904637, event: reached_pg, request: osd_op(osd.13.12097:806247 rb.0.218d6.238e1f29.000000010db3 [copy-get max 8388608] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone e13252) v4
> >     -1> 2016-02-24 04:51:45.904673 7fd995026700  5 -- op tracker -- , seq: 19231, time: 2016-02-24 04:51:45.904673, event: started, request: osd_op(osd.13.12097:806247 rb.0.218d6.238e1f29.000000010db3 [copy-get max 8388608] 3.94c2bed2 ack+read+ignore_cache+ignore_overlay+map_snap_clone e13252) v4
> >      0> 2016-02-24 04:51:45.944447 7fd994825700 -1 *** Caught signal (Aborted) **
> > in thread 7fd994825700
> >
> > ceph version 0.80.11-8-g95c4287 (95c4287b5d24b762bc8538633c5bb2918ecfe4dd)
> > 1: /usr/bin/ceph-osd() [0x9a24f6]
> > 2: (()+0xf100) [0x7fd9ad046100]
> > 3: (gsignal()+0x37) [0x7fd9abc675f7]
> > 4: (abort()+0x148) [0x7fd9abc68ce8]
> > 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) [0x7fd9ac56b9d5]
> > 6: (()+0x5e946) [0x7fd9ac569946]
> > 7: (()+0x5e973) [0x7fd9ac569973]
> > 8: (()+0x5eb93) [0x7fd9ac569b93]
> > 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1ef) [0xa8d9df]
> > 10: (ReplicatedPG::fill_in_copy_get(ReplicatedPG::OpContext*, ceph::buffer::list::iterator&, OSDOp&, std::tr1::shared_ptr<ObjectContext>&, bool)+0xffc) [0x7c1f7c]
> > 11: (ReplicatedPG::do_osd_ops(ReplicatedPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x4171) [0x809f21]
> > 12: (ReplicatedPG::prepare_transaction(ReplicatedPG::OpContext*)+0x62) [0x814622]
> > 13: (ReplicatedPG::execute_ctx(ReplicatedPG::OpContext*)+0x5f8) [0x815098]
> > 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x3dd4) [0x81a3f4]
> > 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x66d) [0x7b4ecd]
> > 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a5) [0x600ee5]
> > 17: (OSD::OpWQ::_process(boost::intrusive_ptr<PG>, ThreadPool::TPHandle&)+0x203) [0x61cba3]
> > 18: (ThreadPool::WorkQueueVal<std::pair<boost::intrusive_ptr<PG>, std::tr1::shared_ptr<OpRequest> >, boost::intrusive_ptr<PG> >::_void_process(void*, ThreadPool::TPHandle&)+0xac) [0x660f2c]
> > 19: (ThreadPool::worker(ThreadPool::WorkThread*)+0xb20) [0xa7def0]
> > 20: (ThreadPool::WorkThread::entry()+0x10) [0xa7ede0]
> > 21: (()+0x7dc5) [0x7fd9ad03edc5]
> > 22: (clone()+0x6d) [0x7fd9abd2828d]
> > NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> >
> > --- logging levels ---
> >    0/ 5 none
> >    0/ 1 lockdep
> >    0/ 1 context
> >    1/ 1 crush
> >    1/ 5 mds
> >    1/ 5 mds_balancer
> >    1/ 5 mds_locker
> >    1/ 5 mds_log
> >    1/ 5 mds_log_expire
> >    1/ 5 mds_migrator
> >    0/ 1 buffer
> >    0/ 1 timer
> >    0/ 1 filer
> >    0/ 1 striper
> >    0/ 1 objecter
> >    0/ 5 rados
> >    0/ 5 rbd
> >    0/ 5 journaler
> >    0/ 5 objectcacher
> >    0/ 5 client
> >    0/ 5 osd
> >    0/ 5 optracker
> >    0/ 5 objclass
> >    1/ 3 filestore
> >    1/ 3 keyvaluestore
> >    1/ 3 journal
> >    0/ 5 ms
> >    1/ 5 mon
> >    0/10 monc
> >    1/ 5 paxos
> >    0/ 5 tp
> >    1/ 5 auth
> >    1/ 5 crypto
> >    1/ 1 finisher
> >    1/ 5 heartbeatmap
> >    1/ 5 perfcounter
> >    1/ 5 rgw
> >    1/10 civetweb
> >    1/ 5 javaclient
> >    1/ 5 asok
> >    1/ 1 throttle
> >   -2/-2 (syslog threshold)
> >   -1/-1 (stderr threshold)
> >   max_recent     10000
> >   max_new         1000
> >   log_file /var/log/ceph/ceph-osd.3.log
> > --- end dump of recent events ---
> >
> > --
> > Alexander Gubanov
> >
> > _______________________________________________
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> >
>

--
Alexander Gubanov
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
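The traces above end with ceph's standard note that a copy of the executable, or `objdump -rdS <executable>`, is needed to interpret the raw frame addresses. A minimal sketch of doing that on the affected OSD host, assuming /usr/bin/ceph-osd is the binary that produced the trace, the matching debuginfo package is installed, and the executable is not position-independent so the in-binary addresses such as 0x7c1f7c resolve directly:

# annotated disassembly of the whole OSD binary, for reference
objdump -rdS /usr/bin/ceph-osd > ceph-osd.asm

# or resolve a single frame address, e.g. the fill_in_copy_get frame [0x7c1f7c]
addr2line -C -f -e /usr/bin/ceph-osd 0x7c1f7c

In this case the assert line itself (osd/ReplicatedPG.cc: 5558: FAILED assert(cursor.data_complete)) already identifies the failing check, so symbol resolution mainly serves to confirm the running build matches the reported version.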