I tried modifying filestore_rocksdb_options both by removing compression=kNoCompression and by setting it to compression=kSnappyCompression. Leaving it at kNoCompression or removing it entirely results in the same segfault as in the previous log. Setting it to kSnappyCompression resulted in [1] this being logged and the OSD simply failing to start instead of segfaulting. Is there anything else you would suggest trying before I purge this OSD from the cluster? I'm afraid it might be something with the CentOS binaries.
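For clarity, the two variants I tried amount to this in ceph.conf (a sketch only; the base option string is the default quoted further down the thread, and the [osd] section is just one place it can live):

    [osd]
    # variant 1: drop the compression key entirely
    filestore_rocksdb_options = max_background_compactions=8,compaction_readahead_size=2097152
    # variant 2: ask for Snappy explicitly -- this is what produced [1] below
    filestore_rocksdb_options = max_background_compactions=8,compaction_readahead_size=2097152,compression=kSnappyCompression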
[1] 2018-10-01 17:10:37.134930 7f1415dfcd80 0 set rocksdb option compression = kSnappyCompression
2018-10-01 17:10:37.134986 7f1415dfcd80 -1 rocksdb: Invalid argument: Compression type Snappy is not linked with the binary.
2018-10-01 17:10:37.135004 7f1415dfcd80 -1 filestore(/var/lib/ceph/osd/ceph-1) mount(1723): Error initializing rocksdb :
2018-10-01 17:10:37.135020 7f1415dfcd80 -1 osd.1 0 OSD:init: unable to mount object store
2018-10-01 17:10:37.135029 7f1415dfcd80 -1 ** ERROR: osd init failed: (1) Operation not permitted

On Sat, Sep 29, 2018 at 1:57 PM Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote:
> I looked at one of my test clusters running Jewel on Ubuntu 16.04, and interestingly I found this (below) in one of the OSD logs, which is different from your OSD boot log, where none of the compression algorithms seem to be supported. This hints more at how rocksdb was built on CentOS for Ceph.
>
> 2018-09-29 17:38:38.629112 7fbd318d4b00 4 rocksdb: Compression algorithms supported:
> 2018-09-29 17:38:38.629112 7fbd318d4b00 4 rocksdb: Snappy supported: 1
> 2018-09-29 17:38:38.629113 7fbd318d4b00 4 rocksdb: Zlib supported: 1
> 2018-09-29 17:38:38.629113 7fbd318d4b00 4 rocksdb: Bzip supported: 0
> 2018-09-29 17:38:38.629114 7fbd318d4b00 4 rocksdb: LZ4 supported: 0
> 2018-09-29 17:38:38.629114 7fbd318d4b00 4 rocksdb: ZSTD supported: 0
> 2018-09-29 17:38:38.629115 7fbd318d4b00 4 rocksdb: Fast CRC32 supported: 0
>
> On 9/27/18, 2:56 PM, "Pavan Rallabhandi" <prallabha...@walmartlabs.com> wrote:
>
> I see Filestore symbols on the stack, so the bluestore config doesn't affect this. And the top frame of the stack hints at a RocksDB issue, and there are a whole lot of these too:
>
> "2018-09-17 19:23:06.480258 7f1f3d2a7700 2 rocksdb: [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.4/rpm/el7/BUILD/ceph-12.2.4/src/rocksdb/table/block_based_table_reader.cc:636] Cannot find Properties block from file."
>
> It really seems to be something with RocksDB on CentOS. I still think you can try removing "compression=kNoCompression" from the filestore_rocksdb_options, and/or check if rocksdb is expecting snappy to be enabled.
>
> Thanks,
> -Pavan.
>
> From: David Turner <drakonst...@gmail.com>
> Date: Thursday, September 27, 2018 at 1:18 PM
> To: Pavan Rallabhandi <prallabha...@walmartlabs.com>
> Cc: ceph-users <ceph-users@lists.ceph.com>
> Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever
>
> I got pulled away from this for a while. The error in the log is "abort: Corruption: Snappy not supported or corrupted Snappy compressed block contents" and the OSD has two settings set to snappy by default, async_compressor_type and bluestore_compression_algorithm. Does either of these settings affect the omap store?
>
> On Wed, Sep 19, 2018 at 2:33 PM Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote:
> Looks like you are running on CentOS, fwiw. We've successfully run the conversion commands on Jewel, Ubuntu 16.04.
>
> Have a feeling it's expecting the compression to be enabled; can you try removing "compression=kNoCompression" from the filestore_rocksdb_options? And/or you might want to check if rocksdb is expecting snappy to be enabled.
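For reference, the "Compression algorithms supported" lines Pavan quotes are written to the OSD's boot log by RocksDB itself, so they can be grepped out directly to see what a given binary was built with (a sketch, assuming the default log path; debug_rocksdb needs to be at 4 or higher, which I believe is the default, and osd.1 is just the OSD from the log above):

    grep -A 6 'Compression algorithms supported' /var/log/ceph/ceph-osd.1.log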
> From: David Turner <drakonst...@gmail.com>
> Date: Tuesday, September 18, 2018 at 6:01 PM
> To: Pavan Rallabhandi <prallabha...@walmartlabs.com>
> Cc: ceph-users <ceph-users@lists.ceph.com>
> Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever
>
> Here's the [1] full log from the time the OSD was started to the end of the crash dump. These logs are so hard to parse. Is there anything useful in them?
>
> I did confirm that all perms were set correctly and that the superblock was changed to rocksdb before the first time I attempted to start the OSD with its new DB. This is on a fully Luminous cluster with [2] the defaults you mentioned.
>
> [1] https://gist.github.com/drakonstein/fa3ac0ad9b2ec1389c957f95e05b79ed
> [2] "filestore_omap_backend": "rocksdb",
> "filestore_rocksdb_options": "max_background_compactions=8,compaction_readahead_size=2097152,compression=kNoCompression",
>
> On Tue, Sep 18, 2018 at 5:29 PM Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote:
> I meant the stack trace hints that the superblock still has leveldb in it, have you verified that already?
>
> On 9/18/18, 5:27 PM, "Pavan Rallabhandi" <prallabha...@walmartlabs.com> wrote:
>
> You should be able to set them under the global section, and that reminds me: since you are on Luminous already, I guess those values are already the default; you can verify from the admin socket of any OSD.
>
> But the stack trace didn't hint as if the superblock on the OSD is still considering the omap backend to be leveldb and to do with the compression.
>
> Thanks,
> -Pavan.
>
> From: David Turner <drakonst...@gmail.com>
> Date: Tuesday, September 18, 2018 at 5:07 PM
> To: Pavan Rallabhandi <prallabha...@walmartlabs.com>
> Cc: ceph-users <ceph-users@lists.ceph.com>
> Subject: EXT: Re: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever
>
> Are those settings fine to be global even if not all OSDs on a node have rocksdb as the backend? Or will I need to convert all OSDs on a node at the same time?
>
> On Tue, Sep 18, 2018 at 5:02 PM Pavan Rallabhandi <prallabha...@walmartlabs.com> wrote:
> The steps that were outlined for conversion are correct; have you tried setting some of the relevant ceph conf values too:
>
> filestore_rocksdb_options = "max_background_compactions=8;compaction_readahead_size=2097152;compression=kNoCompression"
>
> filestore_omap_backend = rocksdb
>
> Thanks,
> -Pavan.
>
> From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of David Turner <drakonst...@gmail.com>
> Date: Tuesday, September 18, 2018 at 4:09 PM
> To: ceph-users <ceph-users@lists.ceph.com>
> Subject: EXT: [ceph-users] Any backfill in our cluster makes the cluster unusable and takes forever
>
> I've finally learned enough about the OSD backend to track this issue down to what I believe is the root cause. LevelDB compaction is the common thread every time we move data around our cluster. I've ruled out PG subfolder splitting, EC doesn't seem to be the root cause of this, and it is cluster-wide as opposed to tied to specific hardware.
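As Pavan suggests above, the values an OSD is actually running with can be read back from its admin socket; a minimal check, run on the OSD's host with osd.0 as a placeholder id:

    ceph daemon osd.0 config get filestore_omap_backend
    ceph daemon osd.0 config get filestore_rocksdb_options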
> One of the first things I found after digging into leveldb omap compaction was [1] this article with a heading "RocksDB instead of LevelDB", which mentions that leveldb was replaced with rocksdb as the default db backend for filestore OSDs, and that the change was even backported to Jewel because of the performance improvements.
>
> I figured there must be a way to upgrade an OSD from leveldb to rocksdb without needing to fully backfill the entire OSD. There is [2] this article, but you need an active service account with RedHat to access it. I eventually came across [3] this article about optimizing Ceph Object Storage, which mentions migrating to rocksdb as a resolution for OSDs flapping due to omap compaction. It links to the RedHat article, but also has [4] these steps outlined in it. I tried to follow the steps, but the OSD I tested this on was unable to start with [5] this segfault. And then trying to move the OSD back to the original LevelDB omap folder resulted in [6] this in the log. I apologize that all of my logging is at log level 1; if needed I can get some higher log levels.
>
> My Ceph version is 12.2.4. Does anyone have any suggestions for how I can update my filestore backend from leveldb to rocksdb? Or if that's the wrong direction and I should be looking elsewhere? Thank you.
>
> [1] https://ceph.com/community/new-luminous-rados-improvements/
> [2] https://access.redhat.com/solutions/3210951
> [3] https://hubb.blob.core.windows.net/c2511cea-81c5-4386-8731-cc444ff806df-public/resources/Optimize Ceph object storage for production in multisite clouds.pdf
>
> [4]
> ■ Stop the OSD
> ■ mv /var/lib/ceph/osd/ceph-/current/omap /var/lib/ceph/osd/ceph-/omap.orig
> ■ ulimit -n 65535
> ■ ceph-kvstore-tool leveldb /var/lib/ceph/osd/ceph-/omap.orig store-copy /var/lib/ceph/osd/ceph-/current/omap 10000 rocksdb
> ■ ceph-osdomap-tool --omap-path /var/lib/ceph/osd/ceph-/current/omap --command check
> ■ sed -i s/leveldb/rocksdb/g /var/lib/ceph/osd/ceph-/superblock
> ■ chown ceph.ceph /var/lib/ceph/osd/ceph-/current/omap -R
> ■ cd /var/lib/ceph/osd/ceph-; rm -rf omap.orig
> ■ Start the OSD
>
> [5] 2018-09-17 19:23:10.826227 7f1f3f2ab700 -1 abort: Corruption: Snappy not supported or corrupted Snappy compressed block contents
> 2018-09-17 19:23:10.830525 7f1f3f2ab700 -1 *** Caught signal (Aborted) **
>
> [6] 2018-09-17 19:27:34.010125 7fcdee97cd80 -1 osd.0 0 OSD:init: unable to mount object store
> 2018-09-17 19:27:34.010131 7fcdee97cd80 -1 ** ERROR: osd init failed: (1) Operation not permitted
> 2018-09-17 19:27:54.225941 7f7f03308d80 0 set uid:gid to 167:167 (ceph:ceph)
> 2018-09-17 19:27:54.225975 7f7f03308d80 0 ceph version 12.2.4 (52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable), process (unknown), pid 361535
> 2018-09-17 19:27:54.231275 7f7f03308d80 0 pidfile_write: ignore empty --pid-file
> 2018-09-17 19:27:54.260207 7f7f03308d80 0 load: jerasure load: lrc load: isa
> 2018-09-17 19:27:54.260520 7f7f03308d80 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
> 2018-09-17 19:27:54.261135 7f7f03308d80 0 filestore(/var/lib/ceph/osd/ceph-0) backend xfs (magic 0x58465342)
> 2018-09-17 19:27:54.261750 7f7f03308d80 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: FIEMAP ioctl is disabled via 'filestore fiemap' config option
> 2018-09-17 19:27:54.261757 7f7f03308d80 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: SEEK_DATA/SEEK_HOLE is disabled via 'filestore seek data hole' config option
> 2018-09-17 19:27:54.261758 7f7f03308d80 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: splice() is disabled via 'filestore splice' config option
> 2018-09-17 19:27:54.286454 7f7f03308d80 0 genericfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_features: syncfs(2) syscall fully supported (by glibc and kernel)
> 2018-09-17 19:27:54.286572 7f7f03308d80 0 xfsfilestorebackend(/var/lib/ceph/osd/ceph-0) detect_feature: extsize is disabled by conf
> 2018-09-17 19:27:54.287119 7f7f03308d80 0 filestore(/var/lib/ceph/osd/ceph-0) start omap initiation
> 2018-09-17 19:27:54.287527 7f7f03308d80 -1 filestore(/var/lib/ceph/osd/ceph-0) mount(1723): Error initializing leveldb : Corruption: VersionEdit: unknown tag
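For anyone retracing this: backing the change out is essentially the [4] steps run in reverse. A rough, untested sketch (the [6] log above shows the move back did not come up cleanly here either, so treat it only as the intended shape of the revert; $ID is a placeholder for the OSD id, which the quoted paths leave blank, and it assumes omap.orig was kept):

    systemctl stop ceph-osd@$ID
    cd /var/lib/ceph/osd/ceph-$ID
    rm -rf current/omap                      # discard the converted rocksdb omap
    mv omap.orig current/omap                # restore the original leveldb omap
    sed -i s/rocksdb/leveldb/g superblock    # undo the superblock edit so the backend matches again
    chown -R ceph:ceph current/omap
    systemctl start ceph-osd@$ID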