----- Message from Haomai Wang <[email protected]> ---------
   Date: Tue, 19 Aug 2014 12:28:27 +0800
   From: Haomai Wang <[email protected]>
Subject: Re: [ceph-users] ceph cluster inconsistency?
     To: Kenneth Waegeman <[email protected]>
     Cc: Sage Weil <[email protected]>, [email protected]
On Mon, Aug 18, 2014 at 7:32 PM, Kenneth Waegeman <[email protected]> wrote:

----- Message from Haomai Wang <[email protected]> ---------
   Date: Mon, 18 Aug 2014 18:34:11 +0800
   From: Haomai Wang <[email protected]>
Subject: Re: [ceph-users] ceph cluster inconsistency?
     To: Kenneth Waegeman <[email protected]>
     Cc: Sage Weil <[email protected]>, [email protected]

On Mon, Aug 18, 2014 at 5:38 PM, Kenneth Waegeman <[email protected]> wrote:

Hi, I tried this after restarting the osd, but I guess that was not the aim:

# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 6adb1100 -A 100
IO error: lock /var/lib/ceph/osd/ceph-67/current//LOCK: Resource temporarily unavailable
tools/ceph_kvstore_tool.cc: In function 'StoreTool::StoreTool(const string&)' thread 7f8fecf7d780 time 2014-08-18 11:12:29.551780
tools/ceph_kvstore_tool.cc: 38: FAILED assert(!db_ptr->open(std::cerr))

When I run it after bringing the osd down, it takes a while, but it has no output.
(When running it without the grep, I'm getting a huge list.)

Oh, sorry about that! I made a mistake: the hash value (6adb1100) is stored reversed in leveldb.
So grepping for "benchmark_data_ceph001.cubone.os_5560_object789734" should help.

This gives:

[root@ceph003 ~]# ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 5560_object789734 -A 100
_GHOBJTOSEQ_:3%e0s0_head!0011BDA6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789734!head
_GHOBJTOSEQ_:3%e0s0_head!0011C027!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1330170!head
_GHOBJTOSEQ_:3%e0s0_head!0011C6FD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object227366!head
_GHOBJTOSEQ_:3%e0s0_head!0011CB03!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1363631!head
_GHOBJTOSEQ_:3%e0s0_head!0011CDF0!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1573957!head
_GHOBJTOSEQ_:3%e0s0_head!0011D02C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1019282!head
_GHOBJTOSEQ_:3%e0s0_head!0011E2B5!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1283563!head
_GHOBJTOSEQ_:3%e0s0_head!0011E511!!3!!benchmark_data_ceph001%ecubone%eos_4919_object273736!head
_GHOBJTOSEQ_:3%e0s0_head!0011E547!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1170628!head
_GHOBJTOSEQ_:3%e0s0_head!0011EAAB!!3!!benchmark_data_ceph001%ecubone%eos_4919_object256335!head
_GHOBJTOSEQ_:3%e0s0_head!0011F446!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1484196!head
_GHOBJTOSEQ_:3%e0s0_head!0011FC59!!3!!benchmark_data_ceph001%ecubone%eos_5560_object884178!head
_GHOBJTOSEQ_:3%e0s0_head!001203F3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object853746!head
_GHOBJTOSEQ_:3%e0s0_head!001208E3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object36633!head
_GHOBJTOSEQ_:3%e0s0_head!00120B37!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1235337!head
_GHOBJTOSEQ_:3%e0s0_head!001210B6!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1661351!head
_GHOBJTOSEQ_:3%e0s0_head!001210CB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object238126!head
_GHOBJTOSEQ_:3%e0s0_head!0012184C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object339943!head
_GHOBJTOSEQ_:3%e0s0_head!00121916!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1047094!head
_GHOBJTOSEQ_:3%e0s0_head!001219C1!!3!!benchmark_data_ceph001%ecubone%eos_31461_object520642!head
_GHOBJTOSEQ_:3%e0s0_head!001222BB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object639565!head
_GHOBJTOSEQ_:3%e0s0_head!001223AA!!3!!benchmark_data_ceph001%ecubone%eos_4919_object231080!head
_GHOBJTOSEQ_:3%e0s0_head!0012243C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object858050!head
_GHOBJTOSEQ_:3%e0s0_head!0012289C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object241796!head
_GHOBJTOSEQ_:3%e0s0_head!00122D28!!3!!benchmark_data_ceph001%ecubone%eos_4919_object7462!head
_GHOBJTOSEQ_:3%e0s0_head!00122DFE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object243798!head
_GHOBJTOSEQ_:3%e0s0_head!00122EFC!!3!!benchmark_data_ceph001%ecubone%eos_8961_object109512!head
_GHOBJTOSEQ_:3%e0s0_head!001232D7!!3!!benchmark_data_ceph001%ecubone%eos_31461_object653973!head
_GHOBJTOSEQ_:3%e0s0_head!001234A3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1378169!head
_GHOBJTOSEQ_:3%e0s0_head!00123714!!3!!benchmark_data_ceph001%ecubone%eos_5560_object512925!head
_GHOBJTOSEQ_:3%e0s0_head!001237D9!!3!!benchmark_data_ceph001%ecubone%eos_4919_object23289!head
_GHOBJTOSEQ_:3%e0s0_head!00123854!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1108852!head
_GHOBJTOSEQ_:3%e0s0_head!00123971!!3!!benchmark_data_ceph001%ecubone%eos_5560_object704026!head
_GHOBJTOSEQ_:3%e0s0_head!00123F75!!3!!benchmark_data_ceph001%ecubone%eos_8961_object250441!head
_GHOBJTOSEQ_:3%e0s0_head!00124083!!3!!benchmark_data_ceph001%ecubone%eos_31461_object706178!head
_GHOBJTOSEQ_:3%e0s0_head!001240FA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object316952!head
_GHOBJTOSEQ_:3%e0s0_head!0012447D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object538734!head
_GHOBJTOSEQ_:3%e0s0_head!001244D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object789215!head
_GHOBJTOSEQ_:3%e0s0_head!001247CD!!3!!benchmark_data_ceph001%ecubone%eos_8961_object265993!head
_GHOBJTOSEQ_:3%e0s0_head!00124897!!3!!benchmark_data_ceph001%ecubone%eos_31461_object610597!head
_GHOBJTOSEQ_:3%e0s0_head!00124BE4!!3!!benchmark_data_ceph001%ecubone%eos_31461_object691723!head
_GHOBJTOSEQ_:3%e0s0_head!00124C9B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1306135!head
_GHOBJTOSEQ_:3%e0s0_head!00124E1D!!3!!benchmark_data_ceph001%ecubone%eos_5560_object520580!head
_GHOBJTOSEQ_:3%e0s0_head!0012534C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object659767!head
_GHOBJTOSEQ_:3%e0s0_head!00125A81!!3!!benchmark_data_ceph001%ecubone%eos_5560_object184060!head
_GHOBJTOSEQ_:3%e0s0_head!00125E77!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1292867!head
_GHOBJTOSEQ_:3%e0s0_head!00126562!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1201410!head
_GHOBJTOSEQ_:3%e0s0_head!00126B34!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1657326!head
_GHOBJTOSEQ_:3%e0s0_head!00127383!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1269787!head
_GHOBJTOSEQ_:3%e0s0_head!00127396!!3!!benchmark_data_ceph001%ecubone%eos_31461_object500115!head
_GHOBJTOSEQ_:3%e0s0_head!001277F8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object394932!head
_GHOBJTOSEQ_:3%e0s0_head!001279DD!!3!!benchmark_data_ceph001%ecubone%eos_4919_object252963!head
_GHOBJTOSEQ_:3%e0s0_head!00127B40!!3!!benchmark_data_ceph001%ecubone%eos_31461_object936811!head
_GHOBJTOSEQ_:3%e0s0_head!00127BAC!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1481773!head
_GHOBJTOSEQ_:3%e0s0_head!0012894E!!3!!benchmark_data_ceph001%ecubone%eos_5560_object999885!head
_GHOBJTOSEQ_:3%e0s0_head!00128D05!!3!!benchmark_data_ceph001%ecubone%eos_31461_object943667!head
_GHOBJTOSEQ_:3%e0s0_head!0012908A!!3!!benchmark_data_ceph001%ecubone%eos_5560_object212990!head
_GHOBJTOSEQ_:3%e0s0_head!00129519!!3!!benchmark_data_ceph001%ecubone%eos_5560_object437596!head
_GHOBJTOSEQ_:3%e0s0_head!00129716!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1585330!head
_GHOBJTOSEQ_:3%e0s0_head!00129798!!3!!benchmark_data_ceph001%ecubone%eos_5560_object603505!head
_GHOBJTOSEQ_:3%e0s0_head!001299C9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object808800!head
_GHOBJTOSEQ_:3%e0s0_head!00129B7A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object23193!head
_GHOBJTOSEQ_:3%e0s0_head!00129B9A!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1158397!head
_GHOBJTOSEQ_:3%e0s0_head!0012A932!!3!!benchmark_data_ceph001%ecubone%eos_5560_object542450!head
_GHOBJTOSEQ_:3%e0s0_head!0012B77A!!3!!benchmark_data_ceph001%ecubone%eos_8961_object195480!head
_GHOBJTOSEQ_:3%e0s0_head!0012BE8C!!3!!benchmark_data_ceph001%ecubone%eos_4919_object312911!head
_GHOBJTOSEQ_:3%e0s0_head!0012BF74!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1563783!head
_GHOBJTOSEQ_:3%e0s0_head!0012C65C!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1123980!head
_GHOBJTOSEQ_:3%e0s0_head!0012C6FE!!3!!benchmark_data_ceph001%ecubone%eos_3411_object913!head
_GHOBJTOSEQ_:3%e0s0_head!0012CCAD!!3!!benchmark_data_ceph001%ecubone%eos_31461_object400863!head
_GHOBJTOSEQ_:3%e0s0_head!0012CDBB!!3!!benchmark_data_ceph001%ecubone%eos_5560_object789667!head
_GHOBJTOSEQ_:3%e0s0_head!0012D14B!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1020723!head
_GHOBJTOSEQ_:3%e0s0_head!0012D95B!!3!!benchmark_data_ceph001%ecubone%eos_8961_object106293!head
_GHOBJTOSEQ_:3%e0s0_head!0012E3C8!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1355526!head
_GHOBJTOSEQ_:3%e0s0_head!0012E5B3!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1491348!head
_GHOBJTOSEQ_:3%e0s0_head!0012F2BB!!3!!benchmark_data_ceph001%ecubone%eos_8961_object338872!head
_GHOBJTOSEQ_:3%e0s0_head!0012F374!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1337264!head
_GHOBJTOSEQ_:3%e0s0_head!0012FBE5!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1512395!head
_GHOBJTOSEQ_:3%e0s0_head!0012FCE3!!3!!benchmark_data_ceph001%ecubone%eos_8961_object298610!head
_GHOBJTOSEQ_:3%e0s0_head!0012FEB6!!3!!benchmark_data_ceph001%ecubone%eos_4919_object120824!head
_GHOBJTOSEQ_:3%e0s0_head!001301CA!!3!!benchmark_data_ceph001%ecubone%eos_5560_object816326!head
_GHOBJTOSEQ_:3%e0s0_head!00130263!!3!!benchmark_data_ceph001%ecubone%eos_5560_object777163!head
_GHOBJTOSEQ_:3%e0s0_head!00130529!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1413173!head
_GHOBJTOSEQ_:3%e0s0_head!001317D9!!3!!benchmark_data_ceph001%ecubone%eos_31461_object809510!head
_GHOBJTOSEQ_:3%e0s0_head!0013204F!!3!!benchmark_data_ceph001%ecubone%eos_31461_object471416!head
_GHOBJTOSEQ_:3%e0s0_head!00132400!!3!!benchmark_data_ceph001%ecubone%eos_5560_object695087!head
_GHOBJTOSEQ_:3%e0s0_head!00132A19!!3!!benchmark_data_ceph001%ecubone%eos_31461_object591945!head
_GHOBJTOSEQ_:3%e0s0_head!00132BF8!!3!!benchmark_data_ceph001%ecubone%eos_31461_object302000!head
_GHOBJTOSEQ_:3%e0s0_head!00132F5B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1645443!head
_GHOBJTOSEQ_:3%e0s0_head!00133B8B!!3!!benchmark_data_ceph001%ecubone%eos_5560_object761911!head
_GHOBJTOSEQ_:3%e0s0_head!0013433E!!3!!benchmark_data_ceph001%ecubone%eos_31461_object1467727!head
_GHOBJTOSEQ_:3%e0s0_head!00134446!!3!!benchmark_data_ceph001%ecubone%eos_31461_object791960!head
_GHOBJTOSEQ_:3%e0s0_head!00134678!!3!!benchmark_data_ceph001%ecubone%eos_31461_object677078!head
_GHOBJTOSEQ_:3%e0s0_head!00134A96!!3!!benchmark_data_ceph001%ecubone%eos_31461_object254923!head
_GHOBJTOSEQ_:3%e0s0_head!001355D0!!3!!benchmark_data_ceph001%ecubone%eos_31461_object321528!head
_GHOBJTOSEQ_:3%e0s0_head!00135690!!3!!benchmark_data_ceph001%ecubone%eos_4919_object36935!head
_GHOBJTOSEQ_:3%e0s0_head!00135B62!!3!!benchmark_data_ceph001%ecubone%eos_5560_object1228272!head
_GHOBJTOSEQ_:3%e0s0_head!00135C72!!3!!benchmark_data_ceph001%ecubone%eos_4812_object2180!head
_GHOBJTOSEQ_:3%e0s0_head!00135DEE!!3!!benchmark_data_ceph001%ecubone%eos_5560_object425705!head
_GHOBJTOSEQ_:3%e0s0_head!00136366!!3!!benchmark_data_ceph001%ecubone%eos_5560_object141569!head
_GHOBJTOSEQ_:3%e0s0_head!00136371!!3!!benchmark_data_ceph001%ecubone%eos_5560_object564213!head

The 100 rows look fine to me. I found the minimum number of objects listed is 1024, so could you please run
"ceph-kvstore-tool /var/lib/ceph/osd/ceph-67/current/ list _GHOBJTOSEQ_ | grep 6adb1100 -A 1024"?
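The reversed-hash remark earlier in the thread can be sanity-checked directly against this listing: reversing the hex digits of the hash from the verbose log (6adb1100) yields exactly the 0011BDA6 prefix in the key for object789734. A minimal sketch; the plain digit-reversal is an assumption inferred from this output, not a description of the full GenericObjectMap key format:

```python
# Assumption: GenericObjectMap stores the object's 32-bit hash in the
# _GHOBJTOSEQ_ key with its hex digits reversed, which is why grepping
# for the raw hash "6adb1100" returned nothing while the object's key
# carries the 0011BDA6 prefix.
def reversed_hash_key(hex_hash: str) -> str:
    """Reverse the hex digits of an object hash, as seen in the keys."""
    return hex_hash[::-1].upper()

print(reversed_hash_key("6adb1100"))  # -> 0011BDA6, the prefix in the listing
```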
I got the output; it's in the attachment.
Or should I run this immediately after the osd crashes (because it may have rebalanced? I had already restarted the cluster)?

I don't know if it is related, but before I could do all that, I had to fix something else: a monitor ran out of disk space, using 8GB for its store.db folder (lots of sst files). Other monitors are also near that level. I never had that problem on previous setups. I recreated the monitor and now it uses 3.8GB.

There is some duplicate data that needs to be compacted.

Another idea: maybe you can make KeyValueStore's stripe size align with the EC stripe size.

How can I do that? Is there some documentation about that?

ceph --show-config | grep keyvaluestore
debug_keyvaluestore = 0/0
keyvaluestore_queue_max_ops = 50
keyvaluestore_queue_max_bytes = 104857600
keyvaluestore_debug_check_backend = false
keyvaluestore_op_threads = 2
keyvaluestore_op_thread_timeout = 60
keyvaluestore_op_thread_suicide_timeout = 180
keyvaluestore_default_strip_size = 4096
keyvaluestore_max_expected_write_size = 16777216
keyvaluestore_header_cache_size = 4096
keyvaluestore_backend = leveldb

keyvaluestore_default_strip_size is the one you want.

I haven't thought it through deeply; maybe I will try it later.

Thanks!
Kenneth

----- Message from Sage Weil <[email protected]> ---------
   Date: Fri, 15 Aug 2014 06:10:34 -0700 (PDT)
   From: Sage Weil <[email protected]>
Subject: Re: [ceph-users] ceph cluster inconsistency?
     To: Haomai Wang <[email protected]>
     Cc: Kenneth Waegeman <[email protected]>, [email protected]

On Fri, 15 Aug 2014, Haomai Wang wrote:

Hi Kenneth,

I don't find valuable info in your logs; they lack the necessary debug output from the code path that crashes. But I scanned the encode/decode implementation in GenericObjectMap and found something bad.

For example, two oids have the same hash and their names are:
A: "rb.data.123"
B: "rb-123"

At the ghobject_t compare level, A < B. But GenericObjectMap encodes "."
to "%e", so the keys in the DB are:
A: _GHOBJTOSEQ_:blah!51615000!!none!!rb%edata%e123!head
B: _GHOBJTOSEQ_:blah!51615000!!none!!rb-123!head
and there A > B.

It seems the escape function is useless and should be disabled.

I'm not sure whether Kenneth's problem is touching this bug, because this scenario only occurs when the object set is very large, making two objects share the same hash value.

Kenneth, could you find time to run "ceph-kv-store [path-to-osd] list _GHOBJTOSEQ_ | grep 6adb1100 -A 100"? ceph-kv-store is a debug tool which can be compiled from source: clone the ceph repo and run "./autogen.sh; ./configure; cd src; make ceph-kvstore-tool". "path-to-osd" should be "/var/lib/ceph/osd-[id]/current/". "6adb1100" is from your verbose log, and the next 100 rows should give the necessary info.

You can also get ceph-kvstore-tool from the 'ceph-tests' package.

Hi Sage, do you think we need to provide an upgrade function to fix it?

Hmm, we might. This only affects the key/value encoding, right? The FileStore is using its own function to map these to file names? Can you open a ticket in the tracker for this?

Thanks!
sage

On Thu, Aug 14, 2014 at 7:36 PM, Kenneth Waegeman <[email protected]> wrote:
>
> ----- Message from Haomai Wang <[email protected]> ---------
>    Date: Thu, 14 Aug 2014 19:11:55 +0800
>    From: Haomai Wang <[email protected]>
> Subject: Re: [ceph-users] ceph cluster inconsistency?
>      To: Kenneth Waegeman <[email protected]>
>
>> Could you add config "debug_keyvaluestore = 20/20" to the crashed osd
>> and replay the command causing the crash?
>>
>> I would like to get more debug infos! Thanks.
>
> I included the log in attachment!
> Thanks!
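The ordering clash described above can be illustrated with a plain-string sketch. The real ghobject_t comparator and GenericObjectMap escaping are more involved than this (the bare '%e' substitution below is a simplification), but it shows how escaping '.' can flip the lexicographic order of two names, so leveldb's key order can disagree with the order the listing code expects, which is what an assertion like assert(start <= header.oid) would trip on:

```python
# Simplified stand-in for the GenericObjectMap key escaping: only the
# '.' -> '%e' substitution mentioned in the thread is modeled here.
def escape(name: str) -> str:
    return name.replace(".", "%e")

a, b = "rb.data.123", "rb-123"

# Raw names: '.' (0x2e) sorts after '-' (0x2d), so a > b.
print(a < b)                  # False
# Escaped keys: '%' (0x25) sorts before '-' (0x2d), so escape(a) < escape(b).
print(escape(a) < escape(b))  # True
```

The relative order of the two names flips after escaping, so a store iterating escaped keys visits objects in a different order than the unescaped comparator predicts.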
>> On Thu, Aug 14, 2014 at 4:41 PM, Kenneth Waegeman
>> <[email protected]> wrote:
>>>
>>> I have:
>>>
>>> osd_objectstore = keyvaluestore-dev
>>>
>>> in the global section of my ceph.conf
>>>
>>> [root@ceph002 ~]# ceph osd erasure-code-profile get profile11
>>> directory=/usr/lib64/ceph/erasure-code
>>> k=8
>>> m=3
>>> plugin=jerasure
>>> ruleset-failure-domain=osd
>>> technique=reed_sol_van
>>>
>>> the ecdata pool has this as profile:
>>>
>>> pool 3 'ecdata' erasure size 11 min_size 8 crush_ruleset 2 object_hash
>>> rjenkins pg_num 128 pgp_num 128 last_change 161 flags hashpspool
>>> stripe_width 4096
>>>
>>> EC rule in crushmap:
>>>
>>> rule ecdata {
>>>         ruleset 2
>>>         type erasure
>>>         min_size 3
>>>         max_size 20
>>>         step set_chooseleaf_tries 5
>>>         step take default-ec
>>>         step choose indep 0 type osd
>>>         step emit
>>> }
>>> root default-ec {
>>>         id -8           # do not change unnecessarily
>>>         # weight 140.616
>>>         alg straw
>>>         hash 0  # rjenkins1
>>>         item ceph001-ec weight 46.872
>>>         item ceph002-ec weight 46.872
>>>         item ceph003-ec weight 46.872
>>> ...
>>>
>>> Cheers!
>>> Kenneth
>>>
>>> ----- Message from Haomai Wang <[email protected]> ---------
>>>    Date: Thu, 14 Aug 2014 10:07:50 +0800
>>>    From: Haomai Wang <[email protected]>
>>> Subject: Re: [ceph-users] ceph cluster inconsistency?
>>>      To: Kenneth Waegeman <[email protected]>
>>>      Cc: ceph-users <[email protected]>
>>>
>>>> Hi Kenneth,
>>>>
>>>> Could you give your configuration related to EC and KeyValueStore?
>>>> Not sure whether it's a bug in KeyValueStore.
>>>>
>>>> On Thu, Aug 14, 2014 at 12:06 AM, Kenneth Waegeman
>>>> <[email protected]> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> I was doing some tests with rados bench on an erasure-coded pool (using
>>>>> the keyvaluestore-dev objectstore) on 0.83, and I see some strange
>>>>> things:
>>>>>
>>>>> [root@ceph001 ~]# ceph status
>>>>>     cluster 82766e04-585b-49a6-a0ac-c13d9ffd0a7d
>>>>>      health HEALTH_WARN too few pgs per osd (4 < min 20)
>>>>>      monmap e1: 3 mons at
>>>>> {ceph001=10.141.8.180:6789/0,ceph002=10.141.8.181:6789/0,ceph003=10.141.8.182:6789/0},
>>>>> election epoch 6, quorum 0,1,2 ceph001,ceph002,ceph003
>>>>>      mdsmap e116: 1/1/1 up {0=ceph001.cubone.os=up:active}, 2 up:standby
>>>>>      osdmap e292: 78 osds: 78 up, 78 in
>>>>>       pgmap v48873: 320 pgs, 4 pools, 15366 GB data, 3841 kobjects
>>>>>             1381 GB used, 129 TB / 131 TB avail
>>>>>                  320 active+clean
>>>>>
>>>>> There is around 15T of data, but only 1.3T used.
>>>>>
>>>>> This is also visible in rados:
>>>>>
>>>>> [root@ceph001 ~]# rados df
>>>>> pool name  category  KB           objects  clones  degraded  unfound  rd  rd KB  wr       wr KB
>>>>> data       -         0            0        0       0         0        0   0      0        0
>>>>> ecdata     -         16113451009  3933959  0       0         0        1   1      3935632  16116850711
>>>>> metadata   -         2            20       0       0         0        33  36     21       8
>>>>> rbd        -         0            0        0       0         0        0   0      0        0
>>>>>   total used     1448266016  3933979
>>>>>   total avail  139400181016
>>>>>   total space  140848447032
>>>>>
>>>>> Another (related?) thing: if I do rados -p ecdata ls, I trigger osd
>>>>> shutdowns (each time). I get a list followed by an error:
>>>>>
>>>>> ...
>>>>> benchmark_data_ceph001.cubone.os_8961_object243839 >>>>> benchmark_data_ceph001.cubone.os_5560_object801983 >>>>> benchmark_data_ceph001.cubone.os_31461_object856489 >>>>> benchmark_data_ceph001.cubone.os_8961_object202232 >>>>> benchmark_data_ceph001.cubone.os_4919_object33199 >>>>> benchmark_data_ceph001.cubone.os_5560_object807797 >>>>> benchmark_data_ceph001.cubone.os_4919_object74729 >>>>> benchmark_data_ceph001.cubone.os_31461_object1264121 >>>>> benchmark_data_ceph001.cubone.os_5560_object1318513 >>>>> benchmark_data_ceph001.cubone.os_5560_object1202111 >>>>> benchmark_data_ceph001.cubone.os_31461_object939107 >>>>> benchmark_data_ceph001.cubone.os_31461_object729682 >>>>> benchmark_data_ceph001.cubone.os_5560_object122915 >>>>> benchmark_data_ceph001.cubone.os_5560_object76521 >>>>> benchmark_data_ceph001.cubone.os_5560_object113261 >>>>> benchmark_data_ceph001.cubone.os_31461_object575079 >>>>> benchmark_data_ceph001.cubone.os_5560_object671042 >>>>> benchmark_data_ceph001.cubone.os_5560_object381146 >>>>> 2014-08-13 17:57:48.736150 7f65047b5700 0 -- >>>>> 10.141.8.180:0/1023295 >> >>>>> 10.141.8.182:6839/4471 pipe(0x7f64fc019b20 sd=5 :0 s=1 pgs=0 cs=0 >>>>> l=1 >>>>> c=0x7f64fc019db0).fault >>>>> >>>>> And I can see this in the log files: >>>>> >>>>> -25> 2014-08-13 17:52:56.323908 7f8a97fa4700 1 -- >>>>> 10.143.8.182:6827/64670 <== osd.57 10.141.8.182:0/15796 51 ==== >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== >>>>> 47+0+0 >>>>> (3227325175 0 0) 0xf475940 con 0xee89fa0 >>>>> -24> 2014-08-13 17:52:56.323938 7f8a97fa4700 1 -- >>>>> 10.143.8.182:6827/64670 --> 10.141.8.182:0/15796 -- >>>>> osd_ping(ping_reply >>>>> e220 >>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf815b00 con >>>>> 0xee89fa0 >>>>> -23> 2014-08-13 17:52:56.324078 7f8a997a7700 1 -- >>>>> 10.141.8.182:6840/64670 <== osd.57 10.141.8.182:0/15796 51 ==== >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.323092) v2 ==== >>>>> 47+0+0 >>>>> (3227325175 0 0) 
0xf132bc0 con 0xee8a680 >>>>> -22> 2014-08-13 17:52:56.324111 7f8a997a7700 1 -- >>>>> 10.141.8.182:6840/64670 --> 10.141.8.182:0/15796 -- >>>>> osd_ping(ping_reply >>>>> e220 >>>>> stamp 2014-08-13 17:52:56.323092) v2 -- ?+0 0xf811a40 con >>>>> 0xee8a680 >>>>> -21> 2014-08-13 17:52:56.584461 7f8a997a7700 1 -- >>>>> 10.141.8.182:6840/64670 <== osd.29 10.143.8.181:0/12142 47 ==== >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== >>>>> 47+0+0 >>>>> (3355887204 0 0) 0xf655940 con 0xee88b00 >>>>> -20> 2014-08-13 17:52:56.584486 7f8a997a7700 1 -- >>>>> 10.141.8.182:6840/64670 --> 10.143.8.181:0/12142 -- >>>>> osd_ping(ping_reply >>>>> e220 >>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf132bc0 con >>>>> 0xee88b00 >>>>> -19> 2014-08-13 17:52:56.584498 7f8a97fa4700 1 -- >>>>> 10.143.8.182:6827/64670 <== osd.29 10.143.8.181:0/12142 47 ==== >>>>> osd_ping(ping e220 stamp 2014-08-13 17:52:56.583010) v2 ==== >>>>> 47+0+0 >>>>> (3355887204 0 0) 0xf20e040 con 0xee886e0 >>>>> -18> 2014-08-13 17:52:56.584526 7f8a97fa4700 1 -- >>>>> 10.143.8.182:6827/64670 --> 10.143.8.181:0/12142 -- >>>>> osd_ping(ping_reply >>>>> e220 >>>>> stamp 2014-08-13 17:52:56.583010) v2 -- ?+0 0xf475940 con >>>>> 0xee886e0 >>>>> -17> 2014-08-13 17:52:56.594448 7f8a798c7700 1 -- >>>>> 10.141.8.182:6839/64670 >> :/0 pipe(0xec15f00 sd=74 :6839 s=0 >>>>> pgs=0 >>>>> cs=0 >>>>> l=0 >>>>> c=0xee856a0).accept sd=74 10.141.8.180:47641/0 >>>>> -16> 2014-08-13 17:52:56.594921 7f8a798c7700 1 -- >>>>> 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 1 >>>>> ==== >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) v4 ==== 151+0+39 (1972163119 0 >>>>> 4174233976) 0xf3bca40 con 0xee856a0 >>>>> -15> 2014-08-13 17:52:56.594957 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.594874, event: header_read, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) 
>>>>> -14> 2014-08-13 17:52:56.594970 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.594880, event: throttled, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -13> 2014-08-13 17:52:56.594978 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.594917, event: all_read, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -12> 2014-08-13 17:52:56.594986 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 0.000000, event: dispatched, op: >>>>> osd_op(client.7512.0:1 >>>>> [pgls >>>>> start_epoch 0] 3.0 ack+read+known_if_redirected e220) >>>>> -11> 2014-08-13 17:52:56.595127 7f8a90795700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.595104, event: reached_pg, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -10> 2014-08-13 17:52:56.595159 7f8a90795700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.595153, event: started, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -9> 2014-08-13 17:52:56.602179 7f8a90795700 1 -- >>>>> 10.141.8.182:6839/64670 --> 10.141.8.180:0/1018433 -- >>>>> osd_op_reply(1 >>>>> [pgls >>>>> start_epoch 0] v164'30654 uv30654 ondisk = 0) v6 -- ?+0 0xec16180 >>>>> con >>>>> 0xee856a0 >>>>> -8> 2014-08-13 17:52:56.602211 7f8a90795700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 299, time: 2014-08-13 17:52:56.602205, event: done, op: >>>>> osd_op(client.7512.0:1 [pgls start_epoch 0] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -7> 2014-08-13 17:52:56.614839 7f8a798c7700 1 -- >>>>> 10.141.8.182:6839/64670 <== client.7512 10.141.8.180:0/1018433 2 >>>>> ==== >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected 
e220) v4 ==== 151+0+89 (3460833343 0 >>>>> 2600845095) 0xf3bcec0 con 0xee856a0 >>>>> -6> 2014-08-13 17:52:56.614864 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 2014-08-13 17:52:56.614789, event: header_read, op: >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -5> 2014-08-13 17:52:56.614874 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 2014-08-13 17:52:56.614792, event: throttled, op: >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -4> 2014-08-13 17:52:56.614884 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 2014-08-13 17:52:56.614835, event: all_read, op: >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -3> 2014-08-13 17:52:56.614891 7f8a798c7700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 0.000000, event: dispatched, op: >>>>> osd_op(client.7512.0:2 >>>>> [pgls >>>>> start_epoch 220] 3.0 ack+read+known_if_redirected e220) >>>>> -2> 2014-08-13 17:52:56.614972 7f8a92f9a700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 2014-08-13 17:52:56.614958, event: reached_pg, op: >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> -1> 2014-08-13 17:52:56.614993 7f8a92f9a700 5 -- op tracker >>>>> -- >>>>> , >>>>> seq: >>>>> 300, time: 2014-08-13 17:52:56.614986, event: started, op: >>>>> osd_op(client.7512.0:2 [pgls start_epoch 220] 3.0 >>>>> ack+read+known_if_redirected e220) >>>>> 0> 2014-08-13 17:52:56.617087 7f8a92f9a700 -1 >>>>> os/GenericObjectMap.cc: >>>>> In function 'int GenericObjectMap::list_objects(const coll_t&, >>>>> ghobject_t, >>>>> int, std::vector<ghobject_t>*, ghobject_t*)' thread 7f8a92f9a700 >>>>> time >>>>> 2014-08-13 17:52:56.615073 >>>>> os/GenericObjectMap.cc: 1118: FAILED assert(start <= header.oid) >>>>> >>>>> >>>>> ceph 
version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8) >>>>> 1: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, >>>>> int, >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x474) >>>>> [0x98f774] >>>>> 2: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, >>>>> int, >>>>> int, >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x274) [0x8c5b54] >>>>> 3: (PGBackend::objects_list_partial(hobject_t const&, int, int, >>>>> snapid_t, >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*, >>>>> hobject_t*)+0x1c9) >>>>> [0x862de9] >>>>> 4: >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) >>>>> [0x7f67f5] >>>>> 5: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) >>>>> [0x8177b3] >>>>> 6: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045] >>>>> 7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) >>>>> [0x62bf8d] >>>>> 8: (OSD::ShardedOpWQ::_process(unsigned int, >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c] >>>>> 9: (ShardedThreadPool::shardedthreadpool_worker(unsigned >>>>> int)+0x8cd) >>>>> [0xa776fd] >>>>> 10: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) >>>>> [0xa79980] >>>>> 11: (()+0x7df3) [0x7f8aac71fdf3] >>>>> 12: (clone()+0x6d) [0x7f8aab1963dd] >>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` >>>>> is >>>>> needed >>>>> to >>>>> interpret this. 
>>>>> >>>>> >>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8) >>>>> 1: /usr/bin/ceph-osd() [0x99b466] >>>>> 2: (()+0xf130) [0x7f8aac727130] >>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989] >>>>> 4: (abort()+0x148) [0x7f8aab0d7098] >>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) >>>>> [0x7f8aab9e89d5] >>>>> 6: (()+0x5e946) [0x7f8aab9e6946] >>>>> 7: (()+0x5e973) [0x7f8aab9e6973] >>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f] >>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char >>>>> const*)+0x1ef) [0xa8805f] >>>>> 10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, >>>>> int, >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x474) >>>>> [0x98f774] >>>>> 11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, >>>>> int, >>>>> int, >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x274) [0x8c5b54] >>>>> 12: (PGBackend::objects_list_partial(hobject_t const&, int, int, >>>>> snapid_t, >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*, >>>>> hobject_t*)+0x1c9) >>>>> [0x862de9] >>>>> 13: >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) >>>>> [0x7f67f5] >>>>> 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) >>>>> [0x8177b3] >>>>> 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045] >>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) >>>>> [0x62bf8d] >>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int, >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c] >>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned >>>>> int)+0x8cd) >>>>> [0xa776fd] >>>>> 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) >>>>> [0xa79980] >>>>> 20: (()+0x7df3) [0x7f8aac71fdf3] >>>>> 21: (clone()+0x6d) [0x7f8aab1963dd] >>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` >>>>> is 
>>>>> needed >>>>> to >>>>> interpret this. >>>>> >>>>> --- begin dump of recent events --- >>>>> 0> 2014-08-13 17:52:56.714214 7f8a92f9a700 -1 *** Caught >>>>> signal >>>>> (Aborted) ** >>>>> in thread 7f8a92f9a700 >>>>> >>>>> ceph version 0.83 (78ff1f0a5dfd3c5850805b4021738564c36c92b8) >>>>> 1: /usr/bin/ceph-osd() [0x99b466] >>>>> 2: (()+0xf130) [0x7f8aac727130] >>>>> 3: (gsignal()+0x39) [0x7f8aab0d5989] >>>>> 4: (abort()+0x148) [0x7f8aab0d7098] >>>>> 5: (__gnu_cxx::__verbose_terminate_handler()+0x165) >>>>> [0x7f8aab9e89d5] >>>>> 6: (()+0x5e946) [0x7f8aab9e6946] >>>>> 7: (()+0x5e973) [0x7f8aab9e6973] >>>>> 8: (()+0x5eb9f) [0x7f8aab9e6b9f] >>>>> 9: (ceph::__ceph_assert_fail(char const*, char const*, int, char >>>>> const*)+0x1ef) [0xa8805f] >>>>> 10: (GenericObjectMap::list_objects(coll_t const&, ghobject_t, >>>>> int, >>>>> std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x474) >>>>> [0x98f774] >>>>> 11: (KeyValueStore::collection_list_partial(coll_t, ghobject_t, >>>>> int, >>>>> int, >>>>> snapid_t, std::vector<ghobject_t, std::allocator<ghobject_t> >*, >>>>> ghobject_t*)+0x274) [0x8c5b54] >>>>> 12: (PGBackend::objects_list_partial(hobject_t const&, int, int, >>>>> snapid_t, >>>>> std::vector<hobject_t, std::allocator<hobject_t> >*, >>>>> hobject_t*)+0x1c9) >>>>> [0x862de9] >>>>> 13: >>>>> (ReplicatedPG::do_pg_op(std::tr1::shared_ptr<OpRequest>)+0xea5) >>>>> [0x7f67f5] >>>>> 14: (ReplicatedPG::do_op(std::tr1::shared_ptr<OpRequest>)+0x1f3) >>>>> [0x8177b3] >>>>> 15: (ReplicatedPG::do_request(std::tr1::shared_ptr<OpRequest>, >>>>> ThreadPool::TPHandle&)+0x5d5) [0x7b8045] >>>>> 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, >>>>> std::tr1::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x47d) >>>>> [0x62bf8d] >>>>> 17: (OSD::ShardedOpWQ::_process(unsigned int, >>>>> ceph::heartbeat_handle_d*)+0x35c) [0x62c56c] >>>>> 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned >>>>> int)+0x8cd) >>>>> [0xa776fd] >>>>> 19: 
(ShardedThreadPool::WorkThreadSharded::entry()+0x10) >>>>> [0xa79980] >>>>> 20: (()+0x7df3) [0x7f8aac71fdf3] >>>>> 21: (clone()+0x6d) [0x7f8aab1963dd] >>>>> NOTE: a copy of the executable, or `objdump -rdS <executable>` >>>>> is >>>>> needed >>>>> to >>>>> interpret this. >>>>> >>>>> I guess this has something to do with using the dev >>>>> Keyvaluestore? >>>>> >>>>> >>>>> Thanks! >>>>> >>>>> Kenneth >>>>> >>>>> _______________________________________________ >>>>> ceph-users mailing list >>>>> [email protected] >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>> >>>> >>>> >>>> >>>> >>>> -- >>>> Best Regards, >>>> >>>> Wheat >>> >>> >>> >>> >>> ----- End message from Haomai Wang <[email protected]> ----- >>> >>> -- >>> >>> Met vriendelijke groeten, >>> Kenneth Waegeman >>> >> >> >> >> -- >> Best Regards, >> >> Wheat > > > > ----- End message from Haomai Wang <[email protected]> ----- > > -- > > Met vriendelijke groeten, > Kenneth Waegeman > -- Best Regards, Wheat _______________________________________________ ceph-users mailing list [email protected] http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com----- End message from Sage Weil <[email protected]> ----- -- Met vriendelijke groeten, Kenneth Waegeman-- Best Regards, Wheat----- End message from Haomai Wang <[email protected]> ----- -- Met vriendelijke groeten, Kenneth Waegeman-- Best Regards, Wheat
----- End message from Haomai Wang <[email protected]> -----

--

Met vriendelijke groeten,
Kenneth Waegeman
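For reference, the stripe-size alignment discussed earlier in the thread would amount to a ceph.conf fragment like the following. This is a sketch, not a tested recommendation: it assumes the keyvaluestore strip size should simply match the ecdata pool's stripe_width of 4096 shown above, and that the option is read from the [osd] section.

```ini
; Sketch: align KeyValueStore's strip size with the EC pool's stripe_width.
; Both values are already 4096 in the configuration shown in this thread.
[osd]
keyvaluestore_default_strip_size = 4096
```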
Attachment: os_5560_object789734 (binary data)
