[ceph-users] frequent Monitor down

2020-10-28 Thread Andrei Mikhailovsky
Hello everyone, I keep getting regular messages that the Monitors are going down and up: 2020-10-27T09:50:49.032431+ mon.arh-ibstorage2-ib (mon.1) 2248 : cluster [WRN] Health check failed: 1/4 mons down, quorum arh-ibstorage2-ib,arh-ibstorage3-ib,arh-ibstorage4-ib (MON_DOWN) 2020-10-27T0
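
For reference, a minimal sketch of the usual first checks for a flapping monitor; the mon id is a placeholder, substitute the host named in the MON_DOWN warning:

# ceph health detail
# ceph quorum_status --format json-pretty
# journalctl -u ceph-mon@<mon-id> --since "1 hour ago"

The first two show which mon is out of quorum and how the remaining mons see it; the journalctl call pulls that mon's own log from systemd.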

[ceph-users] Re: frequent Monitor down

2020-10-28 Thread Andrei Mikhailovsky
20 > Subject: [ceph-users] Re: frequent Monitor down > Have you looked into syslog and mon logs? > > > Quoting Andrei Mikhailovsky: > >> Hello everyone, >> >> I keep getting regular messages that the Monitors are going down and up: >> >> 2020-10-27

[ceph-users] Re: frequent Monitor down

2020-10-28 Thread Andrei Mikhailovsky
drei - Original Message - > From: "Eugen Block" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Wednesday, 28 October, 2020 20:19:15 > Subject: Re: [ceph-users] Re: frequent Monitor down > Why do you have 4 MONs in the first place?

[ceph-users] radosgw process crashes multiple times an hour

2021-01-28 Thread Andrei Mikhailovsky
Hello, I am experiencing very frequent crashes of the radosgw service. It happens multiple times every hour; as an example, over the last 12 hours we've had 35 crashes. Has anyone experienced similar behaviour with the Octopus release of the radosgw service? More info below: Radosgw service is runnin

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-01-28 Thread Andrei Mikhailovsky
r > It looks like your radosgw is using a different version of librados. In > the backtrace, the top useful line begins: > > librados::v14_2_0 > > when it should be v15.2.0, like the ceph::buffer in the same line. > > Is there an old librados lying around that didn't ge
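
A quick way to confirm a mixed librados version on the gateway host, a sketch assuming a Debian/Ubuntu package install:

# ldd $(which radosgw) | grep librados
# dpkg -l | grep -E 'librados|radosgw'

The first shows which librados shared object the radosgw binary actually resolves to; the second shows which package versions are installed.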

[ceph-users] osd recommended scheduler

2021-01-28 Thread Andrei Mikhailovsky
Hello everyone, Could someone please let me know which modern kernel disk scheduler is recommended for SSD and HDD OSDs? The information in the manuals is pretty dated and refers to schedulers that have been removed from recent kernels. Thanks Andrei

[ceph-users] Re: osd recommended scheduler

2021-02-01 Thread Andrei Mikhailovsky
Bump - Original Message - > From: "andrei" > To: "ceph-users" > Sent: Thursday, 28 January, 2021 17:09:23 > Subject: [ceph-users] osd recommended scheduler > Hello everyone, > > Could someone please let me know which modern kernel disk > scheduler is recommended and should be use

[ceph-users] Re: radosgw process crashes multiple times an hour

2021-02-01 Thread Andrei Mikhailovsky
ltiple times an hour > >> It looks like your radosgw is using a different version of librados. In >> the backtrace, the top useful line begins: >> >> librados::v14_2_0 >> >> when it should be v15.2.0, like the ceph::buffer in the same line. >> >>

[ceph-users] Re: osd recommended scheduler

2021-02-02 Thread Andrei Mikhailovsky
kyber, mq-deadline and none. Could someone please suggest which of these new schedulers the ceph team recommends for HDD drives and SSD drives? We have both drive types in use. Many thanks Andrei - Original Message - > From: "Wido den Hollander" > To: "And
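
For reference, a sketch of checking and setting the multi-queue schedulers named above; device names and the rules file path are examples only:

# cat /sys/block/sdb/queue/scheduler
# echo mq-deadline > /sys/block/sdb/queue/scheduler
# echo none > /sys/block/nvme0n1/queue/scheduler

One common way to persist the choice is a udev rule, e.g. in /etc/udev/rules.d/60-scheduler.rules:

ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="none"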

[ceph-users] Running ceph on multiple networks

2021-03-31 Thread Andrei Mikhailovsky
Hello everyone, I have a small ceph cluster consisting of 4 Ubuntu 20.04 osd servers, mainly serving rbd images to a CloudStack KVM cluster. The ceph version is 15.2.9. The network is set up so that all storage cluster traffic runs over InfiniBand QDR links (IPoIB). We've got the management ne
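
For context, the usual way to split traffic is the public/cluster network pair in ceph.conf; a minimal sketch with example subnets:

[global]
# client / management network (example subnet)
public_network = 192.168.0.0/24
# IPoIB replication network (example subnet)
cluster_network = 10.10.10.0/24

OSD replication and recovery traffic then uses the cluster network, while mons, clients and the mgr stay on the public network.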

[ceph-users] Unable to add osds with ceph-volume

2021-04-28 Thread Andrei Mikhailovsky
Hello everyone, I am running ceph version 15.2.8 on Ubuntu servers. I am using BlueStore OSDs with data on HDDs and DB and WAL on SSD drives. Each SSD has been partitioned such that it holds 5 DBs and 5 WALs. The SSDs were prepared a while back, probably when I was running ceph 13.x. I have
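
For reference, a sketch of the ceph-volume form that takes pre-created DB and WAL devices; the device and partition names are examples only:

# ceph-volume lvm create --bluestore --data /dev/sdc --block.db /dev/sdb1 --block.wal /dev/sdb2

ceph-volume accepts either raw partitions or LVs for --block.db and --block.wal; whether the old 13.x-era partitions are usable as-is is exactly the question in this thread.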

[ceph-users] cluster network change

2022-10-20 Thread Andrei Mikhailovsky
Hello cephers, I've got a few questions for the community to help us with migrating our ceph cluster from Infiniband networking to 10G Ethernet with no or minimal downtime. Please find below the details of the cluster as well as info on what we are trying to achieve. 1. Cluster Info: Ceph versi
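
For what it's worth, a sketch of switching the OSD replication network via the config database with rolling restarts; the subnet and OSD id are examples, and the mon addresses would need a separate monmap change:

# ceph config set global cluster_network 10.20.0.0/24
# systemctl restart ceph-osd@12

OSDs pick up the new cluster network address when restarted, so restarting them one at a time keeps the pools available throughout.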

[ceph-users] radosgw not working after upgrade to Quincy

2022-12-28 Thread Andrei Mikhailovsky
Hello everyone, After the upgrade from Pacific to Quincy the radosgw service is no longer listening on its network port, but the process is running. I get the following in the log: 2022-12-29T02:07:35.641+ 7f5df868ccc0 0 ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy
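
One common cause after a Pacific-to-Quincy upgrade is a leftover civetweb frontend setting, since civetweb was removed in Quincy; whether that is the cause here is not clear from the log excerpt. A minimal beast setting, with the rgw instance name as a placeholder, looks like:

[client.rgw.<hostname>]
rgw_frontends = beast port=7480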

[ceph-users] Re: radosgw not working after upgrade to Quincy

2022-12-29 Thread Andrei Mikhailovsky
Thanks, Konstantin. Will try > From: "Konstantin Shalygin" > To: "Andrei Mikhailovsky" > Cc: "ceph-users" > Sent: Thursday, 29 December, 2022 03:42:56 > Subject: Re: [ceph-users] radosgw not working after upgrade to Quincy > Hi, > Just try

[ceph-users] rgw - unable to remove some orphans

2023-01-03 Thread Andrei Mikhailovsky
Happy New Year everyone! I have a bit of an issue with removing some of the orphan objects that were identified with the rgw-orphan-list tool. Over the years rgw generated over 14 million orphans, wasting over 100TB of space, considering the overall data stored in rgw was well un
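
For reference, a sketch of removing objects from an rgw-orphan-list output file; the file name is an example, and each object should first be double-checked against radosgw-admin bucket radoslist, as the reply below suggests:

# while read -r obj; do rados -p .rgw.buckets rm "$obj"; done < orphan-list.out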

[ceph-users] Re: rgw - unable to remove some orphans

2023-01-03 Thread Andrei Mikhailovsky
he file is still in the bucket with > `radosgw-admin bucket radoslist --bucket BUCKET` > > Cheers > Boris > > On Tue, 3 Jan 2023 at 13:47, Andrei Mikhailovsky > wrote: >> >> Happy New Year everyone! >> >> I have a bit of an issue with remo

[ceph-users] Re: rgw - unable to remove some orphans

2023-01-03 Thread Andrei Mikhailovsky
is a workaround? Cheers Andrei - Original Message - > From: "EDH" > To: "Andrei Mikhailovsky" , "ceph-users" > > Sent: Tuesday, 3 January, 2023 13:36:19 > Subject: RE: rgw - unable to remove some orphans > Object index database get corru

[ceph-users] Re: rgw - unable to remove some orphans

2023-01-04 Thread Andrei Mikhailovsky
could be removed and I am left with around 2m objects which error out when trying to delete them. This is why I started this thread. Please note that this tool is experimental and should be used with great caution and care. Andrei > From: "Fabio Pasetti" > To: "

[ceph-users] rados buckets copy

2020-04-28 Thread Andrei Mikhailovsky
Hello, I have a problem with the radosgw service where the actual disk usage (ceph df shows 28TB usage) is way more than reported by radosgw-admin bucket stats (9TB usage). I have tried to get to the bottom of the problem, but no one seems to be able to help. As a last resort I will attempt to co

[ceph-users] Re: rados buckets copy

2020-04-28 Thread Andrei Mikhailovsky
Hi Manuel, My replica count is 2, hence about 10TB of unaccounted usage. Andrei - Original Message - > From: "EDH - Manuel Rios" > To: "Andrei Mikhailovsky" > Sent: Tuesday, 28 April, 2020 23:57:20 > Subject: RE: rados buckets copy > Is your replic

[ceph-users] Re: rados buckets copy

2020-04-30 Thread Andrei Mikhailovsky
Can anyone suggest the best way to copy the buckets? I don't see a command-line option in the radosgw-admin tool for that. - Original Message - > From: "Andrei Mikhailovsky" > To: "EDH" > Cc: "ceph-users" > Sent: Wednesday, 29 April,
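
Since radosgw-admin has no bucket-copy subcommand, one hedged option is to copy at the S3 level with a client such as rclone; a sketch assuming an "s3" remote is already configured against the gateway, with example bucket names:

# rclone sync s3:old-bucket s3:new-bucket --progress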

[ceph-users] Re: rados buckets copy

2020-05-07 Thread Andrei Mikhailovsky
Thanks for the suggestion! - Original Message - > From: "Szabo, Istvan (Agoda)" > To: "Andrei Mikhailovsky" , "ceph-users" > > Sent: Thursday, 7 May, 2020 03:48:04 > Subject: RE: rados buckets copy > Hi, > > You might try s3 brow

[ceph-users] RGW orphans search

2020-05-30 Thread Andrei Mikhailovsky
Hello, I am trying to clean up some wasted space (about 1/3 of the used space in the rados pool is currently unaccounted for, including the replication level). I started the search command 20 days ago ( radosgw-admin orphans find --pool=.rgw.buckets --job-id=ophans_clean1 --yes-i-really-mean-it
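
For reference, the companion subcommands for an orphans job as they existed before the tool was deprecated in favour of rgw-orphan-list; the job id is the one from the command above:

# radosgw-admin orphans list-jobs
# radosgw-admin orphans finish --job-id=ophans_clean1

list-jobs shows the jobs the cluster still knows about, and finish cleans up the job's intermediate data once the search is done.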

[ceph-users] Re: RGW orphans search

2020-05-30 Thread Andrei Mikhailovsky
Hi Manuel, Thanks for the tip. Do you know if this bug is fixed in the latest code? I was planning to upgrade to the latest major release. Cheers - Original Message - > From: "EDH" > To: "Andrei Mikhailovsky" , "ceph-users" > > Sent: Saturday

[ceph-users] Octopus missing rgw-orphan-list tool

2020-06-29 Thread Andrei Mikhailovsky
Hello, I have been struggling a lot with radosgw bucket space wastage, where about 2/3 of the utilised space is currently wasted and unaccounted for. I've tried to use the tools to find the orphan objects, but these were running in a loop for weeks on end without producing any results. Wid

[ceph-users] Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-06-29 Thread Andrei Mikhailovsky
Hello, I've upgraded ceph to Octopus (15.2.3 from repo) on one of the Ubuntu 18.04 host servers. The update caused a problem with libvirtd, which hangs when it tries to access the storage pools. The problem doesn't exist on Nautilus. The libvirtd process simply hangs. Nothing seems to happen. The

[ceph-users] Advice on SSD choices for WAL/DB?

2020-07-01 Thread Andrei Mikhailovsky
Hello, We are planning to perform a small upgrade to our cluster and slowly start adding 12TB SATA HDD drives. We need to accommodate additional SSD WAL/DB requirements as well. Currently we are considering the following: HDD Drives - Seagate EXOS 12TB SSD Drives for WAL/DB - Intel D3 S4

[ceph-users] Re: Advice on SSD choices for WAL/DB?

2020-07-01 Thread Andrei Mikhailovsky
day, 1 July, 2020 13:09:34 > Subject: [ceph-users] Re: Advice on SSD choices for WAL/DB? > Hi, > > On 7/1/20 1:57 PM, Andrei Mikhailovsky wrote: >> Hello, >> >> We are planning to perform a small upgrade to our cluster and slowly start >> adding 12TB SATA HDD d

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-06 Thread Andrei Mikhailovsky
installed is: libvirt-bin 4.0.0-1ubuntu8.1 Any idea how I can make ceph Octopus play nicely with libvirt? Cheers Andrei - Original Message - > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Monday, 29 June, 2020 20:40:01 >

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-07 Thread Andrei Mikhailovsky
d8d5ec36-3cb0-39af-8fc6-084a4abd5d28 active no real 148m54.763s user 0m0.241s sys 0m0.304s Am I the only person having these issues with libvirt and the Octopus release? Cheers - Original Message - > From: "Andrei Mikhailovsky" > To: "ceph-users"

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-07 Thread Andrei Mikhailovsky
Hi Jason, The extract from the debug log file is given below in the first message. It just repeats those lines every so often. I can't find anything else. Cheers - Original Message - > From: "Jason Dillaman" > To: "Andrei Mikhailovsky" > Cc: "cep

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-08 Thread Andrei Mikhailovsky
Jason, this is what I currently have: log_filters="1:libvirt 1:util 1:qemu" log_outputs="1:file:/var/log/libvirt/libvirtd.log" I will add 1:storage and send more logs. Thanks for trying to help. Andrei - Original Message - > From: "Jason Dillaman"
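
For clarity, the resulting /etc/libvirt/libvirtd.conf debug settings with the storage driver included would look something like the following sketch (libvirtd needs a restart afterwards):

log_filters="1:libvirt 1:util 1:qemu 1:storage"
log_outputs="1:file:/var/log/libvirt/libvirtd.log"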

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-08 Thread Andrei Mikhailovsky
cmask(SIG_BLOCK, [PIPE CHLD WINCH], [], 8) = 0 poll([{fd=5, events=POLLIN}, {fd=6, events=POLLIN}], 2, -1 It gets stuck at the last line and nothing happens. Andrei - Original Message - > From: "Jason Dillaman" > To: "Andrei Mikhailovsky"

[ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt

2020-07-08 Thread Andrei Mikhailovsky
E. Patrakov" > To: "Andrei Mikhailovsky" > Cc: "dillaman" , "ceph-users" > Sent: Wednesday, 8 July, 2020 14:50:56 > Subject: Re: [ceph-users] Re: Octopus upgrade breaks Ubuntu 18.04 libvirt > Please strace both virsh and libvirtd (you can attach to i
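
A sketch of the kind of strace attach being suggested; the output paths are examples, -f follows threads and -o writes the trace to a file:

# strace -f -p $(pidof libvirtd) -o /tmp/libvirtd.strace
# strace -f -o /tmp/virsh.strace virsh pool-list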

[ceph-users] Module crash has failed (Octopus)

2020-08-03 Thread Andrei Mikhailovsky
Hello everyone, I am running Octopus 15.2.4 and a couple of days ago noticed an ERROR state on the cluster with the following message: Module 'crash' has failed: dictionary changed size during iteration I couldn't find much info on this error. I've tried restarting the mon servers
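
For reference, a hedged sketch: the crash module runs in ceph-mgr rather than in the mons, so failing over to a standby mgr is what usually clears a failed-module state; the mgr name is a placeholder:

# ceph mgr fail <active-mgr>
# ceph crash ls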

[ceph-users] Re: Module crash has failed (Octopus)

2020-08-04 Thread Andrei Mikhailovsky
Thanks Michael. I will try it. Cheers Andrei - Original Message - > From: "Michael Fladischer" > To: "ceph-users" > Sent: Tuesday, 4 August, 2020 08:51:52 > Subject: [ceph-users] Re: Module crash has failed (Octopus) > Hi Andrei, > &

[ceph-users] RGW unable to delete a bucket

2020-08-04 Thread Andrei Mikhailovsky
Hi, I am trying to delete a bucket using the following command: # radosgw-admin bucket rm --bucket= --purge-objects However, in the console I get the following messages, about 100+ of them per second. 2020-08-04T17:11:06.411+0100 7fe64cacf080 1 RGWRados::Bucket::List::list_objects_or
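
For reference, a hedged sketch of the bucket index check that is often tried before a purge like this; the bucket name is a placeholder and it is not certain it helps in this particular case:

# radosgw-admin bucket check --bucket=<bucket> --check-objects --fix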

[ceph-users] Re: RGW unable to delete a bucket

2020-08-06 Thread Andrei Mikhailovsky
BUMP... - Original Message - > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Tuesday, 4 August, 2020 17:16:28 > Subject: [ceph-users] RGW unable to delete a bucket > Hi > > I am trying to delete a bucket using the following command

[ceph-users] Re: RGW unable to delete a bucket

2020-08-10 Thread Andrei Mikhailovsky
- Original Message - > From: "Matt Benjamin" > To: "EDH" > Cc: "Andrei Mikhailovsky" , "ceph-users" > , "Ivancich, Eric" > Sent: Thursday, 6 August, 2020 21:48:53 > Subject: Re: RGW unable to delete a bucket > Hi Folks,

[ceph-users] rgw-orphan-list

2020-08-24 Thread Andrei Mikhailovsky
While continuing my saga with the rgw orphans and dozens of terabytes of wasted space I have used the rgw-orphan-list tool. After about 45 mins the tool crashed ((( # time rgw-orphan-list .rgw.buckets Pool is ".rgw.buckets". Note: output files produced will be tagged with the current tim

[ceph-users] Re: rgw-orphan-list

2020-08-25 Thread Andrei Mikhailovsky
Bump - Original Message - > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Monday, 24 August, 2020 16:37:49 > Subject: [ceph-users] rgw-orphan-list > While continuing my saga with the rgw orphans and dozens of terabytes of > wasted > s

[ceph-users] Re: Infiniband support

2020-08-26 Thread Andrei Mikhailovsky
Rafael, We've been using ceph with IPoIB for over 7 years and it's been supported. However, I am not too sure about the native RDMA support. There have been discussions on and off for a while now, but I've not seen much. Perhaps others know. Cheers > From: "Rafael Quaglio" > To: "ceph-users" >

[ceph-users] Re: Raw use 10 times higher than data use

2019-09-26 Thread Andrei Mikhailovsky
Hi Georg, I am having a similar issue with the RGW pool, although not to the extent of a 10x discrepancy. In my case, the discrepancy is about 2-3x: my real data usage is around 6TB, but Ceph uses over 17TB. I have asked this question here, but no one seems to know the solution and how to go about

[ceph-users] Re: Raw use 10 times higher than data use

2019-09-27 Thread Andrei Mikhailovsky
ows.  We still have testing we need to perform to see if it's a good > idea as a default value.  We are also considering inlining very small > (<4K) objects in the onode itself, but that also will require > significant testing as it may put additional load on the DB as well. > &g

[ceph-users] is ceph balancer doing anything?

2020-03-04 Thread Andrei Mikhailovsky
Hello everyone, A few weeks ago I enabled the ceph balancer on my cluster as per the instructions here: https://docs.ceph.com/docs/mimic/mgr/balancer/ I am running ceph version: ceph version 13.2.6 (7b695f835b03642f85998b2ae7b6dd093d
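
A minimal sketch of confirming whether the balancer is actually active and what it thinks of the current distribution; note that upmap mode additionally requires all clients to be at luminous or later:

# ceph balancer status
# ceph balancer eval
# ceph balancer mode upmap
# ceph balancer on

status shows the mode and whether plans are being executed, and eval scores the current PG distribution, which is a quick way to tell if anything is changing.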