Please try to set bluestore_bluefs_gift_ratio to 0.0002
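For example, a minimal sketch assuming the crashing OSD is osd.N and the
Nautilus centralized config database is in use:

ceph config set osd.N bluestore_bluefs_gift_ratio 0.0002

Restarting that OSD afterwards makes sure the new value is picked up.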


On 7/9/2019 7:39 PM, Brett Chancellor wrote:
Too large for pastebin. The problem is continually crashing new OSDs. Here is the latest one.

On Tue, Jul 9, 2019 at 11:46 AM Igor Fedotov <ifedo...@suse.de> wrote:

    Could you please set debug bluestore to 20 and collect the startup log
    for this specific OSD once again?
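    As a sketch, assuming osd.N is the affected OSD, a systemd deployment,
    and the default log location:

    ceph config set osd.N debug_bluestore 20/20
    systemctl restart ceph-osd@N   # on the host carrying osd.N
    # then collect /var/log/ceph/ceph-osd.N.log from that host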


    On 7/9/2019 6:29 PM, Brett Chancellor wrote:
    I restarted most of the OSDs with the stupid allocator (6 of them
    wouldn't start unless the bitmap allocator was set), but I'm still
    seeing issues with OSDs crashing. Interestingly, it seems that the
    dying OSDs are always working on a PG from the .rgw.meta pool when
    they crash.

    Log : https://pastebin.com/yuJKcPvX

    On Tue, Jul 9, 2019 at 5:14 AM Igor Fedotov <ifedo...@suse.de> wrote:

        Hi Brett,

        In Nautilus you can do that via:

        ceph config set osd.N bluestore_allocator stupid

        ceph config set osd.N bluefs_allocator stupid

        See https://ceph.com/community/new-mimic-centralized-configuration-management/
        for more details on the new centralized way of setting configuration options.
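        As a quick sketch for confirming the change (using the same osd.N
        placeholder; the OSD still needs a restart for the new allocator
        to take effect):

        ceph config get osd.N bluestore_allocator
        ceph config get osd.N bluefs_allocator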


        A known issue with the stupid allocator is a gradual increase in
        write request latency (occurring within several days after an OSD
        restart). It is seldom observed, though. There were some posts
        about that behavior on the mailing list this year.

        Thanks,

        Igor.


        On 7/8/2019 8:33 PM, Brett Chancellor wrote:


        I'll give that a try.  Is it something like...
        ceph tell 'osd.*' bluestore_allocator stupid
        ceph tell 'osd.*' bluefs_allocator stupid

        And should I expect any issues doing this?


        On Mon, Jul 8, 2019 at 1:04 PM Igor Fedotov <ifedo...@suse.de> wrote:

            I should have read the call stack more carefully... It's not
            about lacking free space - rather, this is the bug from this
            ticket:

            http://tracker.ceph.com/issues/40080


            You should upgrade to v14.2.2 (once it's available) or
            temporarily switch to the stupid allocator as a workaround.
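            As a sketch, the same workaround can also be applied via
            ceph.conf on the OSD hosts (the OSDs must then be restarted):

            [osd]
                bluestore_allocator = stupid
                bluefs_allocator = stupid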


            Thanks,

            Igor



            On 7/8/2019 8:00 PM, Igor Fedotov wrote:

            Hi Brett,

            It looks like BlueStore is unable to allocate additional space
            for BlueFS on the main device. The device is either lacking
            free space or it's too fragmented...

            Would you share the OSD log, please?

            Also please run "ceph-bluestore-tool --path <substitute
            with path-to-osd!!!> bluefs-bdev-sizes" and share the
            output.
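            For example, a sketch using the OSD path that appears later in
            this thread (substitute the affected OSD's path):

            ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 bluefs-bdev-sizes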

            Thanks,

            Igor

            On 7/3/2019 9:59 PM, Brett Chancellor wrote:
            Hi all! Today I've had 3 OSDs stop themselves, and they are
            unable to restart, all with the same error. These OSDs are all
            on different hosts. All are running 14.2.1.

            I did try the following two commands:
            - ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-80 list > keys
              ## This failed with the same error below
            - ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 fsck
              ## After a couple of hours returned...
            2019-07-03 18:30:02.095 7fe7c1c1ef00 -1 bluestore(/var/lib/ceph/osd/ceph-80) fsck warning: legacy statfs record found, suggest to run store repair to get consistent statistic reports
            fsck success
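            The repair that the fsck warning refers to would presumably be
            the ceph-bluestore-tool repair subcommand, run while the OSD is
            stopped, e.g.:

            ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-80 repair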


            ## Error when trying to start one of the OSDs
             -12> 2019-07-03 18:36:57.450 7f5e42366700 -1 *** Caught signal (Aborted) **
             in thread 7f5e42366700 thread_name:rocksdb:low0

             ceph version 14.2.1 (d555a9489eb35f84f2e1ef49b77e19da9d113972) nautilus (stable)
             1: (()+0xf5d0) [0x7f5e50bd75d0]
             2: (gsignal()+0x37) [0x7f5e4f9ce207]
             3: (abort()+0x148) [0x7f5e4f9cf8f8]
             4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x55a7aaee96ab]
             5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x55a7aaee982a]
             6: (interval_set<unsigned long, std::map<unsigned long, unsigned long, std::less<unsigned long>, std::allocator<std::pair<unsigned long const, unsigned long> > > >::insert(unsigned long, unsigned long, unsigned long*, unsigned long*)+0x3c6) [0x55a7ab212a66]
             7: (BlueStore::allocate_bluefs_freespace(unsigned long, unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >*)+0x74e) [0x55a7ab48253e]
             8: (BlueFS::_expand_slow_device(unsigned long, std::vector<bluestore_pextent_t, mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> >&)+0x111) [0x55a7ab59e921]
             9: (BlueFS::_allocate(unsigned char, unsigned long, bluefs_fnode_t*)+0x68b) [0x55a7ab59f68b]
             10: (BlueFS::_flush_range(BlueFS::FileWriter*, unsigned long, unsigned long)+0xe5) [0x55a7ab59fce5]
             11: (BlueFS::_flush(BlueFS::FileWriter*, bool)+0x10b) [0x55a7ab5a1b4b]
             12: (BlueRocksWritableFile::Flush()+0x3d) [0x55a7ab5bf84d]
             13: (rocksdb::WritableFileWriter::Flush()+0x19e) [0x55a7abbedd0e]
             14: (rocksdb::WritableFileWriter::Sync(bool)+0x2e) [0x55a7abbedfee]
             15: (rocksdb::CompactionJob::FinishCompactionOutputFile(rocksdb::Status const&, rocksdb::CompactionJob::SubcompactionState*, rocksdb::RangeDelAggregator*, CompactionIterationStats*, rocksdb::Slice const*)+0xbaa) [0x55a7abc3b73a]
             16: (rocksdb::CompactionJob::ProcessKeyValueCompaction(rocksdb::CompactionJob::SubcompactionState*)+0x7d0) [0x55a7abc3f150]
             17: (rocksdb::CompactionJob::Run()+0x298) [0x55a7abc40618]
             18: (rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*)+0xcb7) [0x55a7aba7fb67]
             19: (rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)+0xd0) [0x55a7aba813c0]
             20: (rocksdb::DBImpl::BGWorkCompaction(void*)+0x3a) [0x55a7aba8190a]
             21: (rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)+0x264) [0x55a7abc8d9c4]
             22: (rocksdb::ThreadPoolImpl::Impl::BGThreadWrapper(void*)+0x4f) [0x55a7abc8db4f]
             23: (()+0x129dfff) [0x55a7abd1afff]
             24: (()+0x7dd5) [0x7f5e50bcfdd5]
             25: (clone()+0x6d) [0x7f5e4fa95ead]
             NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
