[ceph-users] librbd on CentOS7

2017-10-23 Thread Wolfgang Lendl
Hello,

we're testing KVM on CentOS 7 as a Ceph (luminous) client.
CentOS 7 ships a librbd package in its base repository at version 0.94.5.

the question is (aside from feature support) whether we should install a
recent librbd from the ceph repositories (12.2.x) or stay with the
default one.
my main concern is performance and I'm not sure what impact the librbd
version has on it.
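
for reference, checking the installed version and pulling librbd from the
upstream luminous repo looks roughly like this - a sketch only, the repo
file name is illustrative:

rpm -q librbd1

cat > /etc/yum.repos.d/ceph-luminous.repo <<'EOF'
[ceph-luminous]
name=Ceph Luminous
baseurl=https://download.ceph.com/rpm-luminous/el7/x86_64/
enabled=1
gpgcheck=1
gpgkey=https://download.ceph.com/keys/release.asc
EOF

yum update librbd1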


wolfgang

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] bluestore - wal,db on faster devices?

2017-11-08 Thread Wolfgang Lendl
Hello,

it's clear to me that there's a performance gain from putting the journal on
a fast device (ssd, nvme) when using the filestore backend.
it's less clear when it comes to bluestore - are there any resources,
performance tests, etc. out there showing how a fast wal/db device impacts
performance?
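
for context, a bluestore OSD with the DB and WAL split out onto faster
devices is created roughly like this with ceph-volume - a sketch only,
device paths are illustrative:

ceph-volume lvm create --bluestore \
    --data /dev/sdb \
    --block.db /dev/nvme0n1p1 \
    --block.wal /dev/nvme0n1p2

(if only --block.db is given, the WAL ends up on the DB device as well)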


br
wolfgang

-- 
Wolfgang Lendl
IT Systems & Communications
Medizinische Universität Wien
Spitalgasse 23 / BT 88 /Ebene 00
A-1090 Wien
Tel: +43 1 40160-21231
Fax: +43 1 40160-921200

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] bluestore - wal,db on faster devices?

2017-11-08 Thread Wolfgang Lendl
Hi Mark,

thanks for your reply!
I'm a big fan of keeping things simple - this means there has to be
a very good reason to put the WAL and DB on a separate device, otherwise
I'll keep them colocated (and simpler).

as far as I understand, putting the WAL/DB on a faster (than hdd)
device makes more sense in cephfs and rgw environments (more metadata)
and less sense in rbd environments - correct?

br
wolfgang
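
as a side note on the client-side readahead Mark mentions below: librbd has
its own readahead knobs that can go into the client's ceph.conf - a sketch
only, the values are illustrative rather than tuned recommendations:

[client]
rbd readahead trigger requests = 10
rbd readahead max bytes = 524288
rbd readahead disable after bytes = 52428800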

On 11/08/2017 02:21 PM, Mark Nelson wrote:
> Hi Wolfgang,
>
> In bluestore the WAL serves sort of a similar purpose to filestore's
> journal, but bluestore isn't dependent on it for guaranteeing
> durability of large writes.  With bluestore you can often get higher
> large-write throughput than with filestore when using HDD-only or
> flash-only OSDs.
>
> Bluestore also stores allocation, object, and cluster metadata in the
> DB.  That, in combination with the way bluestore stores objects,
> dramatically improves behavior during certain workloads.  A big one is
> creating millions of small objects as quickly as possible.  In
> filestore, PG splitting has a huge impact on performance and tail
> latency.  Bluestore is much better just on HDD, and putting the DB and
> WAL on flash makes it better still since metadata no longer is a
> bottleneck.
>
> Bluestore does have a couple of shortcomings vs filestore currently.
> The allocator is not as good as XFS's and can fragment more over time.
> There is no server-side readahead, so small sequential read performance
> is very dependent on client-side readahead.  There are still a number of
> optimizations to various things, ranging from threading and locking in
> the shardedopwq to pglog and dup_ops, that could potentially improve
> performance.
>
> I have a blog post that we've been working on that explores some of
> these things but I'm still waiting on review before I publish it.
>
> Mark
>
> On 11/08/2017 05:53 AM, Wolfgang Lendl wrote:
>> Hello,
>>
>> it's clear to me getting a performance gain from putting the journal on
>> a fast device (ssd,nvme) when using filestore backend.
>> it's not when it comes to bluestore - are there any resources,
>> performance test, etc. out there how a fast wal,db device impacts
>> performance?
>>
>>
>> br
>> wolfgang
>>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph configuration backup - what is vital?

2017-12-12 Thread Wolfgang Lendl
hello,

I'm looking for a recommendation about which parts/configuration/etc. to
back up from a ceph cluster in case of a disaster.
I know this depends heavily on the type of disaster, and I'm not talking
about backing up the payload stored on the osds.

currently I have my admin key stored somewhere outside the cluster -
maybe there are some best practices out there?
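
for reference, the maps and keys can be dumped off-cluster with something
like this - a sketch only, file names are illustrative:

ceph mon getmap -o monmap.bin
ceph osd getmap -o osdmap.bin
ceph osd getcrushmap -o crushmap.bin
ceph auth export -o auth-export.txt    # all keyrings, including client.admin
cp /etc/ceph/ceph.conf ceph.conf.bak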


wolfgang

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] QUEMU - rbd cache - inconsistent documentation?

2018-01-19 Thread Wolfgang Lendl
hi,

I'm a bit confused after reading the official ceph docs regarding QEMU
and rbd caching.

http://docs.ceph.com/docs/master/rbd/qemu-rbd/?highlight=qemu

there's a big fat warning: 

"Important: If you set rbd_cache=true, you must set cache=writeback or
risk data loss. Without cache=writeback, QEMU will not send flush
requests to librbd. If QEMU exits uncleanly in this configuration,
filesystems on top of rbd can be corrupted."

two sections below you can find the following:

"QEMU’s cache settings override Ceph’s cache settings (including
settings that are explicitly set in the Ceph configuration file)."


I find these two statements contradictory and I'm looking for the truth - or
did I miss something?

ceph/librbd: 12.2
qemu/kvm: 2.9.0
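
for completeness, a way to combine the two settings on the QEMU side -
a sketch only, pool/image names are illustrative:

qemu -m 1024 -drive format=raw,file=rbd:rbd/vm-disk-1:rbd_cache=true,cache=writeback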


br wolfgang

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-08-29 Thread Wolfgang Lendl
Hi,

after upgrading my ceph clusters from 12.2.5 to 12.2.7, I'm experiencing random
crashes of SSD OSDs (bluestore) - it seems that HDD OSDs are not affected.
I destroyed and recreated some of the SSD OSDs, which seemed to help.

this happens on centos 7.5 (different kernels tested)

/var/log/messages: 
Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 thread_name:bstore_kv_final
Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection 
ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
libtcmalloc.so.4.4.5[7f8a997a8000+46000]
Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, code=killed, 
status=11/SEGV
Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling 
restart.
Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
/var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
libtcmalloc.so.4.4.5[7f5f430cd000+46000]
Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, code=killed, 
status=11/SEGV
Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
Aug 29 10:24:35  systemd: ceph-osd@0.service failed.

did I hit a known issue?
any suggestions are highly appreciated


br
wolfgang




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-08-30 Thread Wolfgang Lendl
Hi Alfredo,


caught some logs:
https://pastebin.com/b3URiA7p

br
wolfgang

On 2018-08-29 15:51, Alfredo Deza wrote:
> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
>  wrote:
>> Hi,
>>
>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm experiencing 
>> random crashes from SSD OSDs (bluestore) - it seems that HDD OSDs are not 
>> affected.
>> I destroyed and recreated some of the SSD OSDs which seemed to help.
>>
>> this happens on centos 7.5 (different kernels tested)
>>
>> /var/log/messages:
>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 thread_name:bstore_kv_final
>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection 
>> ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
>> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
>> code=killed, status=11/SEGV
>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling 
>> restart.
>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
>> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
>> Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
>> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
>> Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
>> Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
>> Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
>> ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
>> libtcmalloc.so.4.4.5[7f5f430cd000+46000]
>> Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, 
>> code=killed, status=11/SEGV
>> Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
>> Aug 29 10:24:35  systemd: ceph-osd@0.service failed
> These systemd messages aren't usually helpful, try poking around
> /var/log/ceph/ for the output on that one OSD.
>
> If those logs aren't useful either, try bumping up the verbosity (see
> http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#boot-time
> )
>> did I hit a known issue?
>> any suggestions are highly appreciated
>>
>>
>> br
>> wolfgang
>>
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
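
the verbosity bump Alfredo suggests can be applied at runtime to a single
OSD - a sketch, osd id illustrative:

ceph tell osd.2 injectargs '--debug_bluestore 20/20 --debug_bdev 20/20'

or put "debug bluestore = 20/20" and "debug bdev = 20/20" into the [osd]
section of ceph.conf before restarting the daemon.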

-- 
Wolfgang Lendl
IT Systems & Communications
Medizinische Universität Wien
Spitalgasse 23 / BT 88 /Ebene 00
A-1090 Wien
Tel: +43 1 40160-21231
Fax: +43 1 40160-921200




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-09-04 Thread Wolfgang Lendl
is downgrading from 12.2.7 to 12.2.5 an option? - I'm still suffering
from very frequent osd crashes.
my hopes are with 12.2.9 - but hope wasn't always my best strategy

br
wolfgang

On 2018-08-30 19:18, Alfredo Deza wrote:
> On Thu, Aug 30, 2018 at 5:24 AM, Wolfgang Lendl
>  wrote:
>> Hi Alfredo,
>>
>>
>> caught some logs:
>> https://pastebin.com/b3URiA7p
> That looks like there is an issue with bluestore. Maybe Radoslaw or
> Adam might know a bit more.
>
>
>> br
>> wolfgang
>>
>> On 2018-08-29 15:51, Alfredo Deza wrote:
>>> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
>>>  wrote:
>>>> Hi,
>>>>
>>>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm experiencing 
>>>> random crashes from SSD OSDs (bluestore) - it seems that HDD OSDs are not 
>>>> affected.
>>>> I destroyed and recreated some of the SSD OSDs which seemed to help.
>>>>
>>>> this happens on centos 7.5 (different kernels tested)
>>>>
>>>> /var/log/messages:
>>>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
>>>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 
>>>> thread_name:bstore_kv_final
>>>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general protection 
>>>> ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
>>>> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
>>>> code=killed, status=11/SEGV
>>>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
>>>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, scheduling 
>>>> restart.
>>>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
>>>> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
>>>> Aug 29 10:24:28  ceph-osd: starting osd.2 at - osd_data 
>>>> /var/lib/ceph/osd/ceph-2 /var/lib/ceph/osd/ceph-2/journal
>>>> Aug 29 10:24:35  ceph-osd: *** Caught signal (Segmentation fault) **
>>>> Aug 29 10:24:35  ceph-osd: in thread 7f5f1e790700 thread_name:tp_osd_tp
>>>> Aug 29 10:24:35  kernel: traps: tp_osd_tp[186933] general protection 
>>>> ip:7f5f43103e63 sp:7f5f1e78a1c8 error:0 in 
>>>> libtcmalloc.so.4.4.5[7f5f430cd000+46000]
>>>> Aug 29 10:24:35  systemd: ceph-osd@0.service: main process exited, 
>>>> code=killed, status=11/SEGV
>>>> Aug 29 10:24:35  systemd: Unit ceph-osd@0.service entered failed state.
>>>> Aug 29 10:24:35  systemd: ceph-osd@0.service failed
>>> These systemd messages aren't usually helpful, try poking around
>>> /var/log/ceph/ for the output on that one OSD.
>>>
>>> If those logs aren't useful either, try bumping up the verbosity (see
>>> http://docs.ceph.com/docs/master/rados/troubleshooting/log-and-debug/#boot-time
>>> )
>>>> did I hit a known issue?
>>>> any suggestions are highly appreciated
>>>>
>>>>
>>>> br
>>>> wolfgang
>>>>
>>>>
>>>>
>>>> ___
>>>> ceph-users mailing list
>>>> ceph-users@lists.ceph.com
>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>>
>> --
>> Wolfgang Lendl
>> IT Systems & Communications
>> Medizinische Universität Wien
>> Spitalgasse 23 / BT 88 /Ebene 00
>> A-1090 Wien
>> Tel: +43 1 40160-21231
>> Fax: +43 1 40160-921200
>>
>>

-- 
Wolfgang Lendl
IT Systems & Communications
Medizinische Universität Wien
Spitalgasse 23 / BT 88 /Ebene 00
A-1090 Wien
Tel: +43 1 40160-21231
Fax: +43 1 40160-921200




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-09-07 Thread Wolfgang Lendl
Hi,

the problem still exists.
for me this happens to SSD OSDs only - I recreated all of them running 12.2.8

this is what I got even on newly created OSDs after some time and a few crashes:
ceph-bluestore-tool fsck -l /root/fsck-osd.0.log --log-level=20 --path /var/lib/ceph/osd/ceph-0 --deep on

2018-09-05 10:15:42.784873 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x34dbe4
2018-09-05 10:15:42.818239 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x376ccf
2018-09-05 10:15:42.863419 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x3a4e58
2018-09-05 10:15:42.887404 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x3b7f29
2018-09-05 10:15:42.958417 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x3df760
2018-09-05 10:15:42.961275 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x3e076f
2018-09-05 10:15:43.038658 7f609a311ec0 -1 
bluestore(/var/lib/ceph/osd/ceph-137) fsck error: found stray shared blob data 
for sbid 0x3ff156

I don't know if these errors are the reason for the OSD crashes or a result
of them.
currently I'm trying to catch some verbose logs

see also Radoslaw's reply below

>This looks quite similar to #25001 [1]. The corruption *might* be caused by
>the racy SharedBlob::put() [2] that was fixed in 12.2.6. However, more logs
>(debug_bluestore=20, debug_bdev=20) would be useful. Also you might
>want to carefully use fsck -- please take a look at Igor's (CCed) post
>and Troy's response.
>
>Best regards,
>Radoslaw Zarzynski
>
>[1] http://tracker.ceph.com/issues/25001
>[2] http://tracker.ceph.com/issues/24211
>[3] http://tracker.ceph.com/issues/25001#note-6

I'll keep you updated
br wolfgang



On 2018-09-06 09:27, Caspar Smit wrote:
> Hi,
>
> These reports are kind of worrying since we have a 12.2.5 cluster too
> waiting to upgrade. Did you have a luck with upgrading to 12.2.8 or
> still the same behavior?
> Is there a bugtracker for this issue?
>
> Kind regards,
> Caspar
>
> On Tue, 4 Sep 2018 at 09:59, Wolfgang Lendl
>  <mailto:wolfgang.le...@meduniwien.ac.at>> wrote:
>
> is downgrading from 12.2.7 to 12.2.5 an option? - I'm still suffering
> from high frequent osd crashes.
> my hopes are with 12.2.9 - but hope wasn't always my best strategy
>
> br
> wolfgang
>
> On 2018-08-30 19:18, Alfredo Deza wrote:
> > On Thu, Aug 30, 2018 at 5:24 AM, Wolfgang Lendl
> >  <mailto:wolfgang.le...@meduniwien.ac.at>> wrote:
> >> Hi Alfredo,
> >>
> >>
> >> caught some logs:
> >> https://pastebin.com/b3URiA7p
> > That looks like there is an issue with bluestore. Maybe Radoslaw or
> > Adam might know a bit more.
> >
> >
> >> br
> >> wolfgang
> >>
> >> On 2018-08-29 15:51, Alfredo Deza wrote:
> >>> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
> >>>  <mailto:wolfgang.le...@meduniwien.ac.at>> wrote:
> >>>> Hi,
> >>>>
> >>>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm
> experiencing random crashes from SSD OSDs (bluestore) - it seems
> that HDD OSDs are not affected.
> >>>> I destroyed and recreated some of the SSD OSDs which seemed
> to help.
> >>>>
> >>>> this happens on centos 7.5 (different kernels tested)
> >>>>
> >>>> /var/log/messages:
> >>>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation
> fault) **
> >>>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700
> thread_name:bstore_kv_final
> >>>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470]
> general protection ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in
> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
> >>>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process
> exited, code=killed, status=11/SEGV
> >>>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered
> failed state.
> >>>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
> >>>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time
> over, scheduling restart.
> >>>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon
>

Re: [ceph-users] SSD OSDs crashing after upgrade to 12.2.7

2018-09-07 Thread Wolfgang Lendl
Hello,

got new logs - if this snip is not sufficient, I can provide the full log

https://pastebin.com/dKBzL9AW

br+thx wolfgang


On 2018-09-05 01:55, Radoslaw Zarzynski wrote:
> In the log following trace can be found:
>
>  0> 2018-08-30 13:11:01.014708 7ff2dd344700 -1 *** Caught signal
> (Segmentation fault) **
>  in thread 7ff2dd344700 thread_name:osd_srv_agent
>
>  ceph version 12.2.7 (3ec878d1e53e1aeb47a9f619c49d9e7c0aa384d5)
> luminous (stable)
>  1: (()+0xa48ec1) [0x5652900ffec1]
>  2: (()+0xf6d0) [0x7ff2f7c206d0]
>  3: (BlueStore::_wctx_finish(BlueStore::TransContext*,
> boost::intrusive_ptr&,
> boost::intrusive_ptr, BlueStore::WriteContext*,
> std::set,
> std::allocator >*)+0xb4) [0x56528ffe3954]
>  4: (BlueStore::_do_truncate(BlueStore::TransContext*,
> boost::intrusive_ptr&,
> boost::intrusive_ptr, unsigned long,
> std::set,
> std::allocator >*)+0x2c2) [0x56528fffd642]
>  5: (BlueStore::_do_remove(BlueStore::TransContext*,
> boost::intrusive_ptr&,
> boost::intrusive_ptr)+0xc6) [0x56528fffdf86]
>  6: (BlueStore::_remove(BlueStore::TransContext*,
> boost::intrusive_ptr&,
> boost::intrusive_ptr&)+0x94) [0x565289f4]
>  7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*,
> ObjectStore::Transaction*)+0x15af) [0x56529001280f]
>  8: ...
>
> This looks quite similar to #25001 [1]. The corruption *might* be caused by
> the racy SharedBlob::put() [2] that was fixed in 12.2.6. However, more logs
> (debug_bluestore=20, debug_bdev=20) would be useful. Also you might
> want to carefully use fsck -- please take a look at Igor's (CCed) post
> and Troy's response.
>
> Best regards,
> Radoslaw Zarzynski
>
> [1] http://tracker.ceph.com/issues/25001
> [2] http://tracker.ceph.com/issues/24211
> [3] http://tracker.ceph.com/issues/25001#note-6
>
> On Tue, Sep 4, 2018 at 12:54 PM, Alfredo Deza  wrote:
>> On Tue, Sep 4, 2018 at 3:59 AM, Wolfgang Lendl
>>  wrote:
>>> is downgrading from 12.2.7 to 12.2.5 an option? - I'm still suffering
>>> from high frequent osd crashes.
>>> my hopes are with 12.2.9 - but hope wasn't always my best strategy
>> 12.2.8 just went out. I think that Adam or Radoslaw might have some
>> time to check those logs now
>>
>>> br
>>> wolfgang
>>>
>>> On 2018-08-30 19:18, Alfredo Deza wrote:
>>>> On Thu, Aug 30, 2018 at 5:24 AM, Wolfgang Lendl
>>>>  wrote:
>>>>> Hi Alfredo,
>>>>>
>>>>>
>>>>> caught some logs:
>>>>> https://pastebin.com/b3URiA7p
>>>> That looks like there is an issue with bluestore. Maybe Radoslaw or
>>>> Adam might know a bit more.
>>>>
>>>>
>>>>> br
>>>>> wolfgang
>>>>>
>>>>> On 2018-08-29 15:51, Alfredo Deza wrote:
>>>>>> On Wed, Aug 29, 2018 at 2:06 AM, Wolfgang Lendl
>>>>>>  wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> after upgrading my ceph clusters from 12.2.5 to 12.2.7  I'm 
>>>>>>> experiencing random crashes from SSD OSDs (bluestore) - it seems that 
>>>>>>> HDD OSDs are not affected.
>>>>>>> I destroyed and recreated some of the SSD OSDs which seemed to help.
>>>>>>>
>>>>>>> this happens on centos 7.5 (different kernels tested)
>>>>>>>
>>>>>>> /var/log/messages:
>>>>>>> Aug 29 10:24:08  ceph-osd: *** Caught signal (Segmentation fault) **
>>>>>>> Aug 29 10:24:08  ceph-osd: in thread 7f8a8e69e700 
>>>>>>> thread_name:bstore_kv_final
>>>>>>> Aug 29 10:24:08  kernel: traps: bstore_kv_final[187470] general 
>>>>>>> protection ip:7f8a997cf42b sp:7f8a8e69abc0 error:0 in 
>>>>>>> libtcmalloc.so.4.4.5[7f8a997a8000+46000]
>>>>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service: main process exited, 
>>>>>>> code=killed, status=11/SEGV
>>>>>>> Aug 29 10:24:08  systemd: Unit ceph-osd@2.service entered failed state.
>>>>>>> Aug 29 10:24:08  systemd: ceph-osd@2.service failed.
>>>>>>> Aug 29 10:24:28  systemd: ceph-osd@2.service holdoff time over, 
>>>>>>> scheduling restart.
>>>>>>> Aug 29 10:24:28  systemd: Starting Ceph object storage daemon osd.2...
>>>>>>> Aug 29 10:24:28  systemd: Started Ceph object storage daemon osd.2.
>>>>>>> A

[ceph-users] erasure-code-profile: what's "w=" ?

2018-02-26 Thread Wolfgang Lendl
hi,

I have no idea what "w=8" means and can't find any hints in the docs ...
maybe someone can explain


ceph 12.2.2

# ceph osd erasure-code-profile get ec42
crush-device-class=hdd
crush-failure-domain=host
crush-root=default
jerasure-per-chunk-alignment=false
k=4
m=2
plugin=jerasure
technique=reed_sol_van
w=8
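
for reference, an equivalent profile is created with something like this
(profile name illustrative); as far as I can tell, w= comes from the
jerasure plugin itself - the word size of its Galois-field arithmetic -
and simply defaults to 8 without being set by hand:

ceph osd erasure-code-profile set ec42-test \
    k=4 m=2 \
    plugin=jerasure technique=reed_sol_van \
    crush-device-class=hdd crush-failure-domain=host
ceph osd erasure-code-profile get ec42-test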


thx
wolfgang
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] predict impact of crush tunables change

2019-01-22 Thread Wolfgang Lendl

dear all,

I have a luminous cluster with tunables profile "hammer" - now all my 
hammer clients are gone and I could raise the tunables level to "jewel".
is there any good way to predict the data movement caused by such a 
config change?
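
one way to estimate this offline - a sketch only, assuming crushtool in this
version supports --set-chooseleaf-stable (chooseleaf_stable=1 is the tunable
jewel adds on top of hammer); rule id and replica count are illustrative:

ceph osd getcrushmap -o crushmap.bin
crushtool -i crushmap.bin --test --rule 0 --num-rep 3 --show-mappings > before.txt
crushtool -i crushmap.bin --set-chooseleaf-stable 1 -o crushmap.jewel
crushtool -i crushmap.jewel --test --rule 0 --num-rep 3 --show-mappings > after.txt
diff before.txt after.txt | grep -c '^>'    # rough count of changed mappings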


br
wolfgang




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph balancer - Some osds belong to multiple subtrees

2019-06-26 Thread Wolfgang Lendl

Hi,

tried to enable the ceph balancer on a 12.2.12 cluster and got this:

mgr[balancer] Some osds belong to multiple subtrees: [0, 1, 2, 3, 4, 5, 6, 7, 
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 
28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 
48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 
88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 
106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 121, 
122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 137, 
138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 153, 
154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 169, 
170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 185, 
186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 201, 
202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 217, 
218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 233, 
234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 249, 
250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 265, 
266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 281, 
282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 297, 
298, 299, 300, 301, 302, 303, 304, 305]

I'm not aware of any additional subtree - maybe someone can enlighten me:

ceph balancer status
{
"active": true,
"plans": [],
"mode": "crush-compat"
}

ceph osd crush tree
ID  CLASS WEIGHT     (compat)   TYPE NAME
 -1       3176.04785            root default
 -7        316.52490  316.52490     host node0
  0   hdd    9.09560    9.09560         osd.0
  4   hdd    9.09560    9.09560         osd.4
  8   hdd    9.09560    9.09560         osd.8
 10   hdd    9.09560    9.09560         osd.10
 12   hdd    9.09560    9.09560         osd.12
 16   hdd    9.09560    9.09560         osd.16
 20   hdd    9.09560    9.09560         osd.20
 21   hdd    9.09560    9.09560         osd.21
 26   hdd    9.09560    9.09560         osd.26
 29   hdd    9.09560    9.09560         osd.29
 31   hdd    9.09560    9.09560         osd.31
 35   hdd    9.09560    9.09560         osd.35
 37   hdd    9.09560    9.09560         osd.37
 44   hdd    9.09560    9.09560         osd.44
 47   hdd    9.09560    9.09560         osd.47
 56   hdd    9.09560    9.09560         osd.56
 59   hdd    9.09560    9.09560         osd.59
 65   hdd    9.09560    9.09560         osd.65
 71   hdd    9.09560    9.09560         osd.71
 77   hdd    9.09560    9.09560         osd.77
 80   hdd    9.09560    9.09560         osd.80
 83   hdd    9.09569    9.09569         osd.83
 86   hdd    9.09560    9.09560         osd.86
 88   hdd    9.09560    9.09560         osd.88
 94   hdd   10.91409   10.91409         osd.94
 95   hdd   10.91409   10.91409         osd.95
 98   hdd   10.91409   10.91409         osd.98
 99   hdd   10.91409   10.91409         osd.99
238   hdd    9.09569    9.09569         osd.238
239   hdd    9.09569    9.09569         osd.239
240   hdd    9.09569    9.09569         osd.240
241   hdd    9.09569    9.09569         osd.241
242   hdd    9.09569    9.09569         osd.242
243   hdd    9.09569    9.09569         osd.243
 -3        316.52426  316.52426     host node1
  1   hdd    9.09560    9.09560         osd.1
  5   hdd    9.09560    9.09560         osd.5
  6   hdd    9.09560    9.09560         osd.6
 11   hdd    9.09560    9.09560         osd.11
 13   hdd    9.09560    9.09560         osd.13
 15   hdd    9.09560    9.09560         osd.15
 19   hdd    9.09560    9.09560         osd.19
 23   hdd    9.09560    9.09560         osd.23
 25   hdd    9.09560    9.09560         osd.25
 28   hdd    9.09560    9.09560         osd.28
 32   hdd    9.09560    9.09560         osd.32
 34   hdd    9.09560    9.09560         osd.34
 38   hdd    9.09560    9.09560         osd.38
 41   hdd    9.09560    9.09560         osd.41
 43   hdd    9.09560    9.09560         osd.43
 46   hdd    9.09560    9.09560         osd.46
 49   hdd    9.09560    9.09560         osd.49
 52   hdd    9.09560    9.09560         osd.52
 55   hdd    9.09560    9.09560         osd.55
 58   hdd    9.09560    9.09560         osd.58
 61   hdd    9.09560    9.09560         osd.61
 64   hdd    9.09560    9.09560         osd.64
 67   hdd    9.09560    9.09560         osd.67
 70   hdd    9.09560    9.09560         osd.70
 73   hdd    9.09560    9.09560         osd.73
 76   hdd    9.09560    9.09560         osd.76
 79   hdd    9.09560    9.09560         osd.79
 81   hdd    9.09560    9.09560         osd.81
 85   hdd    9.09560    9.09560         osd.85
 89   hdd    9.09560    9.09560         osd.89
 90   hdd   10.91409   10.91409         osd.90
 91   hdd   10.91409   10.91409         osd.91
 96   hdd   10.91409   10.91409

Re: [ceph-users] ceph balancer - Some osds belong to multiple subtrees

2019-06-27 Thread Wolfgang Lendl

thx Paul - I suspect these shadow trees are causing this misbehaviour.
I have a second luminous cluster where these balancer settings work as expected
- this working one has hdd+ssd osds
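
those shadow trees can be listed explicitly - a sketch, and depending on the
exact luminous build one or the other of these works:

ceph osd crush tree --show-shadow
ceph osd crush dump | grep '~'    # shadow buckets carry names like node0~hdd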

I cannot use the upmap balancer because of some jewel-krbd clients - at least
they are being reported as jewel clients

"client": {
"group": {
"features": "0x7010fb86aa42ada",
"release": "jewel",
"num": 1
},
"group": {
"features": "0x27018fb86aa42ada",
"release": "jewel",
"num": 3
},

is there a good way to decode the "features" value?

wolfgang


On 26.06.2019 at 13:29, Paul Emmerich wrote:
Device classes are implemented with magic invisible crush trees;
you've got two completely independent trees internally: one for crush
rules mapping to HDDs, and one for legacy crush rules not specifying a
device class.


The balancer *should* be aware of this and ignore it, but I'm not sure
about the state of the balancer on Luminous. There were quite a few
problems in older versions; lots of them have been fixed in backports.


The upmap balancer is much better than the crush-compat balancer, but 
it requires all clients to run Luminous or later.



Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Wed, Jun 26, 2019 at 10:21 AM Wolfgang Lendl 
<mailto:wolfgang.le...@meduniwien.ac.at>> wrote:


Hi,

tried to enable the ceph balancer on a 12.2.12 cluster and got this:

mgr[balancer] Some osds belong to multiple subtrees: [0, 1, 2, 3, 4, 5, 6, 
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 
27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 
47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 
67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 
105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, 
121, 122, 123, 124, 125, 126, 127, 128, 129, 130, 131, 132, 133, 134, 135, 136, 
137, 138, 139, 140, 141, 142, 143, 144, 145, 146, 147, 148, 149, 150, 151, 152, 
153, 154, 155, 156, 157, 158, 159, 160, 161, 162, 163, 164, 165, 166, 167, 168, 
169, 170, 171, 172, 173, 174, 175, 176, 177, 178, 179, 180, 181, 182, 183, 184, 
185, 186, 187, 188, 189, 190, 191, 192, 193, 194, 195, 196, 197, 198, 199, 200, 
201, 202, 203, 204, 205, 206, 207, 208, 209, 210, 211, 212, 213, 214, 215, 216, 
217, 218, 219, 220, 221, 222, 223, 224, 225, 226, 227, 228, 229, 230, 231, 232, 
233, 234, 235, 236, 237, 238, 239, 240, 241, 242, 243, 244, 245, 246, 247, 248, 
249, 250, 251, 252, 253, 254, 255, 256, 257, 258, 259, 260, 261, 262, 263, 264, 
265, 266, 267, 268, 269, 270, 271, 272, 273, 274, 275, 276, 277, 278, 279, 280, 
281, 282, 283, 284, 285, 286, 287, 288, 289, 290, 291, 292, 293, 294, 295, 296, 
297, 298, 299, 300, 301, 302, 303, 304, 305]

I'm not aware of any additional subtree - maybe someone can enlighten me:

ceph balancer status
{
 "active": true,
 "plans": [],
 "mode": "crush-compat"
}

ceph osd crush tree
 ID  CLASS WEIGHT     (compat)   TYPE NAME
  -1       3176.04785            root default
  -7        316.52490  316.52490     host node0
   0   hdd    9.09560    9.09560         osd.0
   4   hdd    9.09560    9.09560         osd.4
   8   hdd    9.09560    9.09560         osd.8
  10   hdd    9.09560    9.09560         osd.10
  12   hdd    9.09560    9.09560         osd.12
  16   hdd    9.09560    9.09560         osd.16
  20   hdd    9.09560    9.09560         osd.20
  21   hdd    9.09560    9.09560         osd.21
  26   hdd    9.09560    9.09560         osd.26
  29   hdd    9.09560    9.09560         osd.29
  31   hdd    9.09560    9.09560         osd.31
  35   hdd    9.09560    9.09560         osd.35
  37   hdd    9.09560    9.09560         osd.37
  44   hdd    9.09560    9.09560         osd.44
  47   hdd    9.09560    9.09560         osd.47
  56   hdd    9.09560    9.09560         osd.56
  59   hdd    9.09560    9.09560         osd.59
  65   hdd    9.09560    9.09560         osd.65
  71   hdd    9.09560    9.09560         osd.71
  77   hdd    9.09560    9.09560         osd.77
  80   hdd    9.09560    9.09560         osd.80
  83   hdd    9.09569    9.09569         osd.83
  86   hdd    9.09560    9.09560         osd.86
  88   hdd    9.09560    9.09560         osd.88
  94   hdd   10.91409   10.91409         osd.94
  95   hdd   1