[ceph-users] Re: rbd object mapping

2021-08-08 Thread Konstantin Shalygin
> There are two types of "object", RBD-image-object and 8MiB-block-object. > When create a RBD image, a RBD-image-object is created and 12800 > 8MiB-block-objects > are allocated. That whole RBD-image-object is mapped to a single PG, which is > mapped > to 3 OSDs (replica 3). That means, all us

[ceph-users] Re: rbd object mapping

2021-08-09 Thread Konstantin Shalygin
> On 8 Aug 2021, at 20:10, Tony Liu wrote: > > That's what I thought. I am confused by this. > > # ceph osd map vm fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk > osdmap e18381 pool 'vm' (4) object > 'fcb09c9c-4cd9-44d8-a20b-8961c6eedf8e_disk' -> pg 4.c7a78d40 (4.0) -> up > ([4,17,6], p4) acting

[ceph-users] Re: [ Ceph ] - Downgrade path failure

2021-08-12 Thread Konstantin Shalygin
Downgrade is not supported k Sent from my iPhone > On 13 Aug 2021, at 07:23, Lokendra Rathour wrote: > > Is this supported? Please advise over the same.

[ceph-users] Re: RGW memory consumption

2021-08-13 Thread Konstantin Shalygin
Hi, > On 13 Aug 2021, at 14:10, Martin Traxl wrote: > > yesterday evening one of my rgw nodes died again, radosgw was killed by the > kernel oom killer. > > [Thu Aug 12 22:10:04 2021] Out of memory: Killed process 1376 (radosgw) > total-vm:70747176kB, anon-rss:63900544kB, file-rss:0kB, shmem-

[ceph-users] Re: RGW memory consumption

2021-08-13 Thread Konstantin Shalygin
I see... > On 13 Aug 2021, at 14:51, Martin Traxl wrote: > > We have been experiencing this behaviour ever since this cluster went productive and > gets "some load". We started with this cluster in May this year, running Ceph > 14.2.15 and already had this same issue. It just took a little longer until

[ceph-users] Re: Deployment of Monitors and Managers

2021-08-16 Thread Konstantin Shalygin
Hi > On 14 Aug 2021, at 11:06, Michel Niyoyita > wrote: > > I am going to deploy ceph in production , and I am going to deploy 3 > monitors on 3 differents hosts to make a quorum. Is there any > inconvenience if I deploy 2 managers on the same hosts where I deployed >

[ceph-users] Re: s3 select api

2021-08-31 Thread Konstantin Shalygin
Hi, S3 Select was introduced in Pacific Cheers, k Sent from my iPhone > On 1 Sep 2021, at 07:14, Szabo, Istvan (Agoda) wrote: > > Is the s3 select api working with octopus or only with pacific?

[ceph-users] Re: [Suspicious newsletter] Re: s3 select api

2021-09-01 Thread Konstantin Shalygin
Only if some developer focuses on this. But not this time k > On 1 Sep 2021, at 09:42, Szabo, Istvan (Agoda) wrote: > > Yeah but sometimes with some updates some new features come to the older > versions also, isn't it?

[ceph-users] PG merge: PG stuck in premerge+peered state

2021-09-05 Thread Konstantin Shalygin
Hi, Has anybody seen inactive PGs like this before? We got our first pool outage: PG_AVAILABILITY Reduced data availability: 2 pgs inactive pg 4.1f1 is stuck inactive for 8637.783533, current state clean+premerge+peered, last acting [312,358,331] pg 4.9f1 is stuck inactive for 8637.783331,

[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Konstantin Shalygin
What is the ceph.conf for this rbd client? k Sent from my iPhone > On 7 Sep 2021, at 19:54, Tony Liu wrote: > > > I have OpenStack Ussuri and Ceph Octopus. Sometimes, I see timeouts when creating > or deleting volumes. I can see RBD timeout from cinder-volume. Has anyone seen > such > issue? I'd lik

[ceph-users] Re: Edit crush rule

2021-09-08 Thread Konstantin Shalygin
Just create a new one with your failure domain and switch the pool rule, then delete the old rule (see the sketch below). k Sent from my iPhone > On 8 Sep 2021, at 01:11, Budai Laszlo wrote: > > Thank you for your answers. Yes, I'm aware of this option, but this is not > changing the failure domain of an existing rule.
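A sketch of that workflow (rule, pool, and failure-domain names here are hypothetical):

```
# create a new replicated rule with the desired failure domain (e.g. rack)
ceph osd crush rule create-replicated by-rack default rack
# point the pool at the new rule
ceph osd pool set mypool crush_rule by-rack
# remove the old rule once no pool references it
ceph osd crush rule rm replicated_rule
```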

[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Konstantin Shalygin
This may be just a connection string problem k > On 8 Sep 2021, at 19:59, Tony Liu wrote: > > That's what I am trying to figure out, "what exactly could cause a timeout". > User creates 10 VMs (boot on volume and an attached volume) by Terraform, > then destroy them. Repeat the same, it works

[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Konstantin Shalygin
In a previous email I asked you to show your ceph.conf... k > On 8 Sep 2021, at 22:20, Tony Liu wrote: > > Sorry Konstantin, I didn't get it. Could you elaborate a bit?

[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Konstantin Shalygin
Try to simplify it to [global] fsid = 35d050c0-77c0-11eb-9242-2cea7ff9d07c mon_host = 10.250.50.80:3300,10.250.50.81:3300,10.250.50.82:3300 And try again. We have found that on msgr2-only clusters, clients with mon_host settings without a hardcoded 3300 port may be timed out from time to
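The suggested minimal ceph.conf from this reply, laid out as a config file:

```
[global]
fsid = 35d050c0-77c0-11eb-9242-2cea7ff9d07c
mon_host = 10.250.50.80:3300,10.250.50.81:3300,10.250.50.82:3300
```

Hardcoding the 3300 (v2) port in mon_host avoids the v1/v2 probing that was suspected of causing the timeouts.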

[ceph-users] Re: debug RBD timeout issue

2021-09-08 Thread Konstantin Shalygin
I think it's just compat with legacy (v1) clusters; in the kernel it's the same. Your cluster already has msgr2 enabled, you don't need any compat settings k Sent from my iPhone > On 8 Sep 2021, at 22:53, Tony Liu wrote: > > Good to know. Thank you Konstantin! > Will test it out. > Is this some kno

[ceph-users] Re: Smarter DB disk replacement

2021-09-09 Thread Konstantin Shalygin
Ceph guarantees data consistency only when the data is written by Ceph. When an NVMe dies - we replace it and refill. For us it is normal for an osd host to take two weeks to fill k Sent from my iPhone > On 9 Sep 2021, at 17:10, Michal Strnad wrote: > > 2. When DB disk is not completely dead and has only relocated sector

[ceph-users] Re: List pg with heavily degraded objects

2021-09-10 Thread Konstantin Shalygin
Hi, Once I searched for undersized PG's with only one replica (as I remember). The snippet was left in my notes, so maybe it helps you: ceph pg dump | grep undersized | awk '{print $1 " " $17 " " $18 " " $19}' | awk -vOFS='\t' '{ print length($4), $0 }' | sort -k1,1n | cut -f2- | head k > On 10 Sep 2021, at

[ceph-users] Re: mon stucks on probing and out of quorum, after down and restart

2021-09-10 Thread Konstantin Shalygin
Yes, try to use Wido's script (remove quorum logic or execute commands by hand) https://gist.github.com/wido/561c69dc2ec3a49d1dba10a59b53dfe5 k > On 10 Sep 2021, at 14:57, mk wrote: > > I have just seen that on failed mon store

[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread Konstantin Shalygin
> On 10 Sep 2021, at 18:04, huxia...@horebdata.cn wrote: > > I am planning a Ceph Cluster (Lumninous 12.2.13) for hosting on-line courses > for one university. The data would mostly be video media and thus 4+2 EC > coded object store together with CivetWeb RADOS gateway will be utilized. > >

[ceph-users] Re: How many concurrent users can be supported by a single Rados gateway

2021-09-10 Thread Konstantin Shalygin
Nautilus is already EOL too; commits are not backported to this branch, except by companies who made products on this release and can verify patches themselves k Sent from my iPhone > On 10 Sep 2021, at 18:23, Eugen Block wrote: > Nautilus, and N will also be EOL soon

[ceph-users] Re: Bluefs spillover octopus 15.2.10

2021-09-12 Thread Konstantin Shalygin
You can try an offline compaction of the osd: ceph-kvstore-tool bluestore-kv ${osd_path} compact Then restart the osd k Sent from my iPhone > On 12 Sep 2021, at 08:13, Szabo, Istvan (Agoda) > wrote: > > I’ve looked around what to do with this but haven’t really found any > solution, so I wonder if I have thi
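A sketch of the full sequence for one OSD (the id 12 is hypothetical; the OSD must be stopped during compaction):

```
systemctl stop ceph-osd@12
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-12 compact
systemctl start ceph-osd@12
```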

[ceph-users] Re: Radosgw single side configuration

2021-09-12 Thread Konstantin Shalygin
Single-site is the default RGW configuration. Just run your first RGW - all needed pools will be created - it just works Cheers, k > On 12 Sep 2021, at 10:45, Cem Zafer wrote: > > I have been looking for documentation about single site ceph object gateway > configuration but I just found lots of

[ceph-users] osd: mkfs: bluestore_stored > 235GiB from start

2021-09-14 Thread Konstantin Shalygin
Hi, One of the OSDs, after deploy, allocates ~ 235GiB of space on its first (mkfs) run. OSD perf dump after boot without any PG's: "bluestore_allocated": 253647523840, "bluestore_stored": 252952705757, I created a tracker for this [1]; maybe someone has already encountered a similar problem? Thanks,

[ceph-users] Re: v16.2.6 Pacific released

2021-09-17 Thread Konstantin Shalygin
Hi, For some reason the backport bot didn't create a backport issue for this, and the ticket was just closed without a pacific backport k > On 17 Sep 2021, at 13:34, Adrian Nicolae wrote: > > Hi, > > Does the 16.2.6 version fix the following bug : > > https://github.com/ceph/ceph/pull/42690 >

[ceph-users] Re: v16.2.6 Pacific released

2021-09-17 Thread Konstantin Shalygin
Thanks Cory, Adrian, FYI k > On 17 Sep 2021, at 16:15, Cory Snyder wrote: > > Orchestrator issues don't get their own backport trackers because the team > lead handles these backports and does them in batches. This patch did make it > into the 16.2.6 release via this batch backport PR: > >

[ceph-users] Re: CentOS Linux 8 EOL

2021-09-17 Thread Konstantin Shalygin
Currently, on CentOS 8 Stream, we use the usual Ceph repo: [root@k8s-prod-worker0 /]# dnf info ceph-osd Last metadata expiration check: 0:00:06 ago on Fri 17 Sep 2021 08:44:30 PM +07. Available Packages Name : ceph-osd Epoch: 2 Version : 16.2.5 Release : 0.el8 Architecture :

[ceph-users] Re: Buffered io +/vs osd memory target

2021-09-18 Thread Konstantin Shalygin
Hi, Yes, the "osd target" settings is a resident memory, "buffered_io" is a cached memory (that can be freed by kernel when kernel receives malloc signal) k > On 18 Sep 2021, at 11:29, Szabo, Istvan (Agoda) > wrote: > > If we are using buffered_io does the osd memory target settings still m

[ceph-users] Re: Monitor issue while installation

2021-09-21 Thread Konstantin Shalygin
Hi, Your Ansible monitoring_group_name variable is not defined, define it first k > On 21 Sep 2021, at 12:12, Michel Niyoyita wrote: > > Hello team > > I am running a ceph cluster pacific version deployed using ansible . I > would like to add other osds but it fails once it reaches the mon

[ceph-users] Re: "Partitioning" in RGW

2021-09-28 Thread Konstantin Shalygin
Hi, Your DMZ is the S3 protocol. Access to buckets will be provided via S3 keys. Just create as many users as you need. If you definitely need a different "fake S3", I think creating separate pools and RGW instances is a way to achieve a "real DMZ" Cheers, k Sent from my iPhone > On 23 Sep 2021, at 2

[ceph-users] Re: Leader election, how to notice it?

2021-10-03 Thread Konstantin Shalygin
Hi, You can always get the leader from the quorum status: ceph quorum_status | jq -r '.quorum_leader_name' Cheers, k > On 3 Oct 2021, at 10:21, gustavo panizzo wrote: > > Instead of setting up pacemaker or similar I'd like to only run the > application in the same machine > as the leader Mon. At
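A minimal sketch of the co-location idea, assuming mon names match host short names and `myapp.service` is a hypothetical unit, run periodically from cron or a timer:

```
#!/bin/sh
# start the app only while the local mon is the quorum leader
leader=$(ceph quorum_status | jq -r '.quorum_leader_name')
if [ "$leader" = "$(hostname -s)" ]; then
    systemctl start myapp.service
else
    systemctl stop myapp.service
fi
```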

[ceph-users] Can't join new mon - lossy channel, failing

2021-10-04 Thread Konstantin Shalygin
Hi, I did a mkfs for a new mon, but the mon is stuck on probing. In the debug log I see: fault on lossy channel, failing. Is this a bad (lossy) network (crc mismatch)? 2021-10-04 16:22:24.707 7f5952761700 10 mon.mon2@-1(probing) e10 probing other monitors 2021-10-04 16:22:24.707 7f5952761700 1 -- [v2:10

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-04 Thread Konstantin Shalygin
> On 4 Oct 2021, at 16:38, Stefan Kooman wrote: > > What procedure are you following to add the mon? # ceph mon dump epoch 10 fsid 677f4be1-cd98-496d-8b50-1f99df0df670 last_changed 2021-09-11 10:04:23.890922 created 2018-05-18 20:43:43.260897 min_mon_release 14 (nautilus) 0: [v2:10.40.0.81:33

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-04 Thread Konstantin Shalygin
> On 4 Oct 2021, at 17:07, Vladimir Bashkirtsev > wrote: > > This line bothers me: > > [v2:10.40.0.81:6898/2507925,v1:10.40.0.81:6899/2507925] conn(0x560287e4 > 0x560287e56000 crc :-1 s=READY pgs=16872 cs=0 l=1 rev1=1 rx=0 > tx=0).handle_read_frame_preamble_main read frame preamble fail

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-04 Thread Konstantin Shalygin
ld do. > > > > Try -f instead of -d if you are overwhelmed with output to get mon debug > output to log file. > > > > Regards, > > Vladimir > > On 5/10/21 01:27, Konstantin Shalygin wrote: >> >>> On 4 Oct 2021, at 17:07, Vladimir Bashki

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-04 Thread Konstantin Shalygin
This cluster doesn't use cephx; the ceph.conf global settings disable it k Sent from my iPhone > On 4 Oct 2021, at 17:46, Stefan Kooman wrote: > > I'm missing the part where keyring is downloaded and used: > > ceph auth get mon. -o /tmp/keyring > ceph mon getmap -o /tmp/monmap > chown -R ceph:ceph

[ceph-users] Re: Can't join new mon - lossy channel, failing

2021-10-05 Thread Konstantin Shalygin
As a last resort we changed the IP address of this host, and the mon successfully joined the quorum. When we reverted the IP address back, the mon couldn't join; we think there is something on the switch side or on the old mons' side. From the old mons I checked connectivity to the new mon process via telnet - all works. It's good to make a som

[ceph-users] Re: Edit crush rule

2021-10-07 Thread Konstantin Shalygin
Hi, > On 7 Oct 2021, at 11:03, ceph-us...@hovr.anonaddy.com wrote: > > Following on this, are there any restriction or issues with setting a new > rule on a pool, except for the resulting backfilling? Nope > > I can't see anything specific about it in the documentation, just the command > ce

[ceph-users] Re: Do people still use LevelDBStore?

2021-10-14 Thread Konstantin Shalygin
+1, we converted all leveldb monstores to rocksdb on luminous k Sent from my iPhone > On 14 Oct 2021, at 10:42, Dan van der Ster wrote: > > +1 from users perspective too but we should warn add a HEALTH_WARN if > a cluster has leveldb monstore or filestore osds, so users know to > convert before

[ceph-users] Re: Does centos8/redhat8 support connection to luminous cluster

2021-10-15 Thread Konstantin Shalygin
Yes k Sent from my iPhone > On 15 Oct 2021, at 16:36, Malshan Peiris wrote: > > Does centos8/redhat8, using it's nautilus version client (14.2.x) support > connecting (as a rbd client) to a ceph luminous cluster (12.2.x).

[ceph-users] Re: HDD cache

2023-11-09 Thread Konstantin Shalygin
Hi Peter, > On Nov 8, 2023, at 20:32, Peter wrote: > > Anyone experienced this can advise? You can try: * check for current cache status smartctl -x /dev/sda | grep "Write cache" * turn off write cache smartctl -s wcache-sct,off,p /dev/sda * check again smartctl -x /dev/sda | grep "Write
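A sketch for checking the write-cache state on several drives at once (the device list is an assumption):

```
for dev in /dev/sd{a,b,c,d}; do
    echo "== $dev =="
    smartctl -x "$dev" | grep "Write cache"
done
```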

[ceph-users] Re: Bug fixes in 17.2.7

2023-11-20 Thread Konstantin Shalygin
Hi Tobias, This has not been merged to Quincy yet [1] k [1] https://tracker.ceph.com/issues/59730 Sent from my iPhone > On Nov 20, 2023, at 17:50, Tobias Kulschewski > wrote: > > Just wanted to ask, if the bug with the multipart upload [1] has been fixed > in 17.2.7?

[ceph-users] Re: Bug fixes in 17.2.7

2023-11-20 Thread Konstantin Shalygin
Hi, > On Nov 20, 2023, at 19:24, Tobias Kulschewski > wrote: > > do you have a rough estimate of when this will happen? Not this year, I think. For now precedence goes to 18.2.1 and the last release of Pacific. But you can request a shaman build and clone the repo for your local usage k

[ceph-users] Re: CLT Meeting minutes 2023-11-23

2023-11-23 Thread Konstantin Shalygin
Hi, > On Nov 23, 2023, at 16:10, Nizamudeen A wrote: > > RCs for reef, quincy and pacific > for next week when there is more time to discuss Just a little noise: is pacific ready? 16.2.15 should be the last release (at least that was the last plan), but [1] is still not merged. Why now ticket is clos

[ceph-users] Re: osdmaptool target & deviation calculation

2023-11-27 Thread Konstantin Shalygin
Hi, This deviation is very soft. If you want to do real upmaps you should use a deviation of 1 (see the sketch below) k Sent from my iPhone > On Nov 27, 2023, at 21:39, Robert Hish wrote: > > The result is many many OSDs with a deviation well above the > upmap_max_deviation which is at default: 5
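A sketch of an offline upmap run with deviation 1 (file names are hypothetical):

```
ceph osd getmap -o osd.map                                # grab the current osdmap
osdmaptool osd.map --upmap upmaps.sh --upmap-deviation 1  # compute upmaps
bash upmaps.sh                                            # apply the generated pg-upmap-items commands
```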

[ceph-users] Re: MDS recovery with existing pools

2023-12-11 Thread Konstantin Shalygin
Good to hear that, Eugen! CC'ed Zac for a docs mention k > On Dec 11, 2023, at 23:28, Eugen Block wrote: > > Update: apparently, we did it! > We walked through the disaster recovery steps where one of the steps was to > reset the journal. I was under the impression that the specified com

[ceph-users] Re: Ceph Nautilous 14.2.22 slow OSD memory leak?

2024-01-13 Thread Konstantin Shalygin
Hi, > On Jan 12, 2024, at 12:01, Frédéric Nass > wrote: > > Hard to tell for sure since this bug hit different major versions of the > kernel, at least RHEL's from what I know. In which RH kernel release was this issue fixed? Thanks, k

[ceph-users] Re: Ceph 16.2.14: ceph-mgr getting oom-killed

2024-01-24 Thread Konstantin Shalygin
Hi, The backport to pacific was rejected [1]; you may switch to reef when [2] is merged and released [1] https://github.com/ceph/ceph/pull/55109 [2] https://github.com/ceph/ceph/pull/55110 k Sent from my iPhone > On Jan 25, 2024, at 04:12, changzhi tan <544463...@qq.com> wrote: > > Is there an

[ceph-users] Re: pacific 16.2.15 QE validation status

2024-02-07 Thread Konstantin Shalygin
> > On Feb 7, 2024, at 16:59, Zakhar Kirpichenko wrote: > > Indeed, it looks like it's been recently reopened. Thanks for this! Hi, It was merged yesterday. Thanks for the right noise, k

[ceph-users] Re: Size return by df

2024-02-22 Thread Konstantin Shalygin
Hi, Yes you can; this is controlled by the option client quota df = false k Sent from my iPhone > On Feb 22, 2024, at 11:17, Albert Shih wrote: > > Is there any way to keep the first answer ?
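As a ceph.conf sketch on the client side (the equivalent `ceph config set client client_quota_df false` should also work, assuming centralized config is in use):

```
[client]
client quota df = false
```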

[ceph-users] Re: Monitoring Ceph Bucket and overall ceph cluster remaining space

2024-03-05 Thread Konstantin Shalygin
Hi, For RGW usage statistics you can use radosgw_usage_exporter [1] k [1] https://github.com/blemmenes/radosgw_usage_exporter Sent from my iPhone > On 6 Mar 2024, at 00:21, Michael Worsham wrote: > Is there an easy way to poll the ceph cluster buckets in a way to see how > much space is re

[ceph-users] Re: Monitoring Ceph Bucket and overall ceph cluster remaining space

2024-03-05 Thread Konstantin Shalygin
Hi, I'm not aware of what SW is, but if this software works with the Prometheus metrics format - why not. Anyway, the exporters are open source; you can modify the existing code for your environment k Sent from my iPhone > On 6 Mar 2024, at 07:58, Michael Worsham wrote: > > This looks interest

[ceph-users] Re: ceph Quincy to Reef non cephadm upgrade

2024-03-06 Thread Konstantin Shalygin
Hi, Yes, you upgrade the ceph-common package, then restart your mons k Sent from my iPhone > On 6 Mar 2024, at 21:55, sarda.r...@gmail.com wrote: > > My question is - does this mean I need to upgrade all ceph packages (ceph, > ceph-common) and restart only monitor daemon first?

[ceph-users] Re: Running dedicated RGWs for async tasks

2024-03-07 Thread Konstantin Shalygin
Hi, Yes. You need to turn off the gc and lc threads in the config for your current (client-side) RGW's, then set up your 'async tasks' RGW without client traffic. No special configuration is needed, unless you want to tune the gc and lc settings (see the sketch below) k Sent from my iPhone > On 7 Mar 2024, at 13:09, Marc Singer wrote: >
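A sketch of the split, assuming ceph.conf-managed RGWs with hypothetical instance names:

```
# client-facing RGW instances: no background garbage-collection / lifecycle work
[client.rgw.frontend1]
rgw_enable_gc_threads = false
rgw_enable_lc_threads = false

# the dedicated "async tasks" instance keeps the defaults (gc/lc threads on)
[client.rgw.async1]
```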

[ceph-users] Telemetry endpoint down?

2024-03-11 Thread Konstantin Shalygin
Hi, it seems the telemetry endpoint has been down for some days? We have connection errors from multiple places: 1:ERROR Mar 10 00:46:10.653 [564383]: opensock: Could not establish a connection to telemetry.ceph.com:443 2:ERROR Mar 10 01:48:20.061 [564383]: opensock: Could not establish a connecti

[ceph-users] Re: Telemetry endpoint down?

2024-03-11 Thread Konstantin Shalygin
Hi Greg, Seems it is up now; the last report uploaded successfully Thanks, k Sent from my iPhone > On 11 Mar 2024, at 18:57, Gregory Farnum wrote: > > We had a lab outage Thursday and it looks like this service wasn’t > restarted after that occurred. Fixed now and we’ll look at how to prevent > that

[ceph-users] Re: ceph metrics units

2024-03-14 Thread Konstantin Shalygin
Hi, > On 14 Mar 2024, at 16:44, Denis Polom wrote: > > do you know if there is some table of Ceph metrics and units that should be > used for them? > > I currently struggling with > > ceph_osd_op_r_latency_sum > > ceph_osd_op_w_latency_sum > > if they are in ms or seconds? > > Any idea ple

[ceph-users] Re: ceph metrics units

2024-03-14 Thread Konstantin Shalygin
Hi, > On 14 Mar 2024, at 19:29, Denis Polom wrote: > > so the metric itself is milliseconds and after division by _count it's in seconds? These are two metrics for long-running averages [1]; the query that produces a "seconds" unit looks like this: (irate(ceph_osd_op_r_latency_sum[1m]) / irate(ce
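The truncated query presumably divides by the matching `_count` series; the usual long-running-average pattern looks like this:

```
irate(ceph_osd_op_r_latency_sum[1m]) / irate(ceph_osd_op_r_latency_count[1m])
```

Since `_sum` accumulates total latency and `_count` the number of ops, the ratio gives the average latency per read op over the window.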

[ceph-users] Re: RGW - tracking new bucket creation and bucket usage

2024-03-15 Thread Konstantin Shalygin
Hi, > On 15 Mar 2024, at 01:07, Ondřej Kukla wrote: > > Hello I’m looking for suggestions how to track bucket creation over s3 api > and bucket usage (num of objects and size) of all buckets in time. > > In our RGW setup, we have a custom client panel, where like 85% percent of > buckets are

[ceph-users] Re: Laptop Losing Connectivity To CephFS On Sleep/Hibernation

2024-03-23 Thread Konstantin Shalygin
Hi, Yes, a samba gateway is the generic solution for end-user mounts k Sent from my iPhone > On 23 Mar 2024, at 12:10, duluxoz wrote: > > Hi Alex, and thanks for getting back to me so quickly (I really appreciate > it), > > So from what you said it looks like we've got the wrong solution.

[ceph-users] Re: Ceph object gateway metrics

2024-03-25 Thread Konstantin Shalygin
Hi, You can use the [2] exporter to get usage stats per user and per bucket, including quota usage k Sent from my iPhone > On 26 Mar 2024, at 01:38, Kushagr Gupta wrote: > > 2. https://github.com/blemmenes/radosgw_usage_exporter

[ceph-users] Re: Impact of large PG splits

2024-04-09 Thread Konstantin Shalygin
Hi Eugene! I have a case where a PG has millions of objects, like this: ``` root@host# ./show_osd_pool_pg_usage.sh | less | head id used_mbytes used_objects omap_used_mbytes omap_used_keys -- --- -- 17.c91 1213.24827

[ceph-users] Re: Impact of large PG splits

2024-04-10 Thread Konstantin Shalygin
> On 10 Apr 2024, at 01:00, Eugen Block wrote: > > I appreciate your message, it really sounds tough (9 months, really?!). But > thanks for the reassurance :-) Yes, the total "make this project great again" took 16 months, I think. This is my work. The first problem after 1M objects in a PG was a del

[ceph-users] Re: Client kernel crashes on cephfs access

2024-04-17 Thread Konstantin Shalygin
Hi, > On 9 Apr 2024, at 04:07, Xiubo Li wrote: > > Thanks for reporting this, I generated one patch to fix it. Will send it out > after testing is done. A trace from our users, but from a mainline kernel; it looks like the trace above: kernel: [ cut here ] kernel: list_add corr

[ceph-users] Re: Client kernel crashes on cephfs access

2024-04-17 Thread Konstantin Shalygin
Hi Xiubo, Seems the patch already landed in kernel 6.8.7, thanks! k Sent from my iPhone > On 18 Apr 2024, at 05:31, Xiubo Li wrote: > > Hi Konstantin, > > We have fixed it, please see > https://patchwork.kernel.org/project/ceph-devel/list/?series=842682&archive=both. > > - Xiubo

[ceph-users] Re: Ceph image delete error - NetHandler create_socket couldnt create socket

2024-04-18 Thread Konstantin Shalygin
Hi, Your shell seems to have reached the default file descriptor limit (mostly 1024) and your cluster may have more than 1000 OSDs. Try running `ulimit -n 10240` before the rbd rm task (see the sketch below) k Sent from my iPhone > On 18 Apr 2024, at 23:50, Pardhiv Karri wrote: > > Hi, > > Trying to delete images in a C
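A sketch of the workaround in one shell session (pool and image names are hypothetical):

```
ulimit -n 10240          # raise the per-shell open-files limit
rbd rm mypool/myimage    # the client now has enough descriptors for all OSD sockets
```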

[ceph-users] Re: Ceph image delete error - NetHandler create_socket couldnt create socket

2024-04-19 Thread Konstantin Shalygin
Hi, > On 19 Apr 2024, at 10:39, Pardhiv Karri wrote: > > Thank you for the reply. I tried setting ulimit to 32768 when I saw 25726 > number in lsof output and then after 2 disks deletion again it got an error > and checked lsof and which is above 35000. I'm not sure how to handle it. > I reboot

[ceph-users] Re: Ceph client cluster compatibility

2024-05-01 Thread Konstantin Shalygin
Hi, Yes, like it always does k Sent from my iPhone > On 2 May 2024, at 07:09, Nima AbolhassanBeigi > wrote: > > We are trying to upgrade our OS version from ubuntu 18.04 to ubuntu 22.04. > Our ceph cluster version is 16.2.13 (pacific). > > The problem is that the ubuntu packages for the ceph

[ceph-users] Re: unknown PGs after adding hosts in different subtree

2024-05-21 Thread Konstantin Shalygin
Hi Eugen > On 21 May 2024, at 15:26, Eugen Block wrote: > > step set_choose_tries 100 I think you should try increasing set_choose_tries to 200 (a sketch of the edit is below). Last year we had a Pacific EC 8+2 deployment across 10 racks, and even with 50 hosts the value of 100 did not work for us k
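A sketch of editing the rule offline with crushtool:

```
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt
# in crushmap.txt, change "step set_choose_tries 100" to 200 inside the rule
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```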

[ceph-users] Re: Unable to Install librados2 18.2.0 on RHEL 7 from Ceph Repository

2024-05-29 Thread Konstantin Shalygin
Hi, The last release for EL7 is Octopus (version 15); you are trying to fetch version 18 k Sent from my iPhone > On 29 May 2024, at 22:34, abdel.doui...@gmail.com wrote: > > The Ceph repository at https://download.ceph.com/ does not seem to have the > librados2 package version 18.2.0 for RHEL 7. Th

[ceph-users] Re: pacific doesn't defer small writes for pre-pacific hdd osds

2022-07-14 Thread Konstantin Shalygin
Dan, did you test redeploying one of your OSDs with the default pacific bluestore_min_alloc_size_hdd (4096)? Does this also resolve the issue (i.e. not affected when all options are at their defaults)? Thanks, k > On 14 Jul 2022, at 08:43, Dan van der Ster wrote: >

[ceph-users] Re: moving mgr in Pacific

2022-07-15 Thread Konstantin Shalygin
Hi, > On 15 Jul 2022, at 12:25, Adrian Nicolae wrote: > > Hi, > > What is the recommended procedure to move the secondary mgr to another node ? > > Thanks. > On the new node: systemctl reenable ceph-mgr@$(hostname -s) systemctl start ceph-mgr.target Good luck, k

[ceph-users] Re: v16.2.10 Pacific released

2022-07-23 Thread Konstantin Shalygin
Hi, Is this a hotfix-only release? Did no other patches targeted at 16.2.10 land here? Thanks, k Sent from my iPhone > On 22 Jul 2022, at 03:38, David Galloway wrote: > > This is a hotfix release addressing two security vulnerabilities. We > recommend all users update to this release. >

[ceph-users] Re: ceph health "overall_status": "HEALTH_WARN"

2022-07-25 Thread Konstantin Shalygin
Hi, Mimic has many HEALTH troubles like this. Mimic has been EOL for years; I suggest you upgrade to at least Nautilus 14.2.22 k > On 25 Jul 2022, at 11:45, Frank Schilder wrote: > > Hi all, > > I made a strange observation on our cluster. The command ceph status -f > json-pretty returns

[ceph-users] Re: weird performance issue on ceph

2022-07-27 Thread Konstantin Shalygin
Hi, All rbd features were added to ceph-csi last year [1] You can add the object-map feature in your options like any other: ``` imageFeatures: layering,exclusive-lock,object-map,fast-diff,deep-flatten mapOptions: ms_mode=prefer-crc ``` k [1] https://github.com/ceph/ceph-csi/pull/2514

[ceph-users] Re: linux distro requirements for reef

2022-08-10 Thread Konstantin Shalygin
Hi Ken, Will CentOS 8 Stream continue to receive packages, or will there be some barriers for Reef? Thanks, k Sent from my iPhone > On 10 Aug 2022, at 18:08, Ken Dreyer wrote: > > Hi folks, > > In the Ceph Leadership Team meeting today we discussed dropping > support for older distros in our Reef relea

[ceph-users] Re: linux distro requirements for reef

2022-08-10 Thread Konstantin Shalygin
22 at 11:35 AM Konstantin Shalygin wrote: >> >> Hi Ken, >> >> CentOS 8 Stream will continue to receive packages or have some barrires for >> R? > > No, starting with Reef, we will no longer build nor ship RPMs for > CentOS 8 Stream (and debs for Ubuntu Fo

[ceph-users] Re: cephfs and samba

2022-08-22 Thread Konstantin Shalygin
Hi Robert, > On 19 Aug 2022, at 17:11, Robert Sander wrote: > > You could easily add nodes to the CTDB cluster to distribute load there. How to do that? Add more than one public_ip? And how to tell Windows about multiple IPs? Thanks k

[ceph-users] Re: RadosGW compression vs bluestore compression

2022-08-25 Thread Konstantin Shalygin
Hi, What exactly do you mean by rgw compression? Another storage class? k Sent from my iPhone > On 21 Aug 2022, at 22:14, Danny Webb wrote: > > Hi, > > What is the difference between using rgw compression vs enabling compression > on a pool? Is there any reason why you'd use one over the othe

[ceph-users] Re: RadosGW compression vs bluestore compression

2022-08-25 Thread Konstantin Shalygin
> > vs say: > > https://www.redhat.com/en/blog/red-hat-ceph-storage-33-bluestore-compression-performance > > Cheers, > Danny > ____ > From: Konstantin Shalygin > Sent: 25 August 2022 13:23 > To: Danny Webb > Cc: ceph-users@ceph.io &g

[ceph-users] Re: Public RGW access without any LB in front?

2022-09-19 Thread Konstantin Shalygin
Hi, Actually rgw can handle SSL traffic, and updating certs is just a restart of the service. For the client it will be a connection reset; the client will make a new one. We have used the keepalived DR method for RGW's for years. The only bottleneck in this setup is input traffic limited by the LB. This also can

[ceph-users] Re: Public RGW access without any LB in front?

2022-09-19 Thread Konstantin Shalygin
Hi, > On 19 Sep 2022, at 10:38, Tobias Urdin wrote: > > Why not scaleout HAproxy by adding multiple ones and use a TCP load balancer > in front of multiple HAproxy instances or use BGP ECMP routing directly to > split > load between multiple HAproxy? Because you can do this without "TCP load b

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-10-02 Thread Konstantin Shalygin
Hi, Try to deep-scrub all PG's of your index pool (see the sketch below) k Sent from my iPhone > On 3 Oct 2022, at 03:41, Yuji Ito wrote: > Would you have any idea how to resolve this condition?
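A sketch that walks every PG of the index pool (the pool name is an assumption):

```
for pg in $(ceph pg ls-by-pool default.rgw.buckets.index -f json | jq -r '.pg_stats[].pgid'); do
    ceph pg deep-scrub "$pg"
done
```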

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-10-04 Thread Konstantin Shalygin
Hi, > On 4 Oct 2022, at 03:36, Yuji Ito (伊藤 祐司) wrote: > > After removing the index objects, I ran deep-scrub for all PGs of the index > pool. However, the problem wasn't resolved. Seems you just have large OMAPs, not 'bogus shard' objects. Try to look at the PG stats with 'show_osd_pool_pg_

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-06 Thread Konstantin Shalygin
Hi, When the perf of one drive out of 100 is unusually different, this may mean 'this drive is not like the others' and it should be replaced k Sent from my iPhone > On 7 Oct 2022, at 07:33, Zakhar Kirpichenko wrote: > > Anyone, please?

[ceph-users] Re: 16.2.10: ceph osd perf always shows high latency for a specific OSD

2022-10-07 Thread Konstantin Shalygin
Zakhar, try looking at the top slow ops in the daemon socket for this osd; you may find 'snapc' operations, for example. From the rbd head you can find the rbd image, and then look at how many snapshots are in the chain for this image. More than 10 snaps for one image can increase client op latency to tens of millis

[ceph-users] Re: mgr/prometheus module port 9283 binds only with IPv6 ?

2022-10-10 Thread Konstantin Shalygin
Hi, Do you set "mgr/prometheus//server_addr" ipv4 address in config? k > On 10 Oct 2022, at 16:56, Ackermann, Christoph > wrote: > > Hello list member > > after subsequent installation of Ceph (17.2.4) monitoring stuff we got this > error: The mgr/prometheus module at ceph1n020.int.infoserv

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-10-14 Thread Konstantin Shalygin
Hi, What you mean "strange"? It is normal, the object need only for OMAP data, not for actual data. Is only key for k,v database I see that you have lower number of objects, some of your PG don't have data at all. I suggest check your buckets for properly resharding process How this look (the i

[ceph-users] Re: monitoring drives

2022-10-14 Thread Konstantin Shalygin
Hi, You can get these metrics, even wear level, from the official smartctl_exporter [1] [1] https://github.com/prometheus-community/smartctl_exporter k Sent from my iPhone > On 14 Oct 2022, at 17:12, John Petrini wrote: > > We run a mix of Samsung and Intel SSD's, our solution was to write a > sc
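A sketch of the Prometheus side, assuming the exporter's default port (9633) and hypothetical host names:

```
scrape_configs:
  - job_name: smartctl
    static_configs:
      - targets: ['osd-host1:9633', 'osd-host2:9633']
```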

[ceph-users] Re: How to remove remaining bucket index shard objects

2022-10-19 Thread Konstantin Shalygin
These are strange stats; at least one object should exist for these OMAPs. Try to deep-scrub this PG, and try to list the objects in it: `rados ls --pgid 6.2` k Sent from my iPhone > On 18 Oct 2022, at 03:39, Yuji Ito wrote: > > Thank you for your reply. > >> the object need only for OMAP data,

[ceph-users] Re: Quincy 22.04/Jammy packages

2022-10-21 Thread Konstantin Shalygin
CC'ed David. Maybe Ilya can additionally tag someone from DevOps Thanks, k > On 20 Oct 2022, at 20:07, Goutham Pacha Ravi wrote: > > +1 > The OpenStack community is interested in this as well. We're trying to move > all our ubuntu testing to Ubuntu Jammy/22.04 [1]; and we consume packages > fr

[ceph-users] Re: Quincy 22.04/Jammy packages

2022-10-21 Thread Konstantin Shalygin
Thank you Ilya! > On 21 Oct 2022, at 21:02, Ilya Dryomov wrote: > > On Fri, Oct 21, 2022 at 12:48 PM Konstantin Shalygin wrote: >> >> CC'ed David > > Hi Konstantin, > > David has decided to pursue something else and is no longer working on > Ceph

[ceph-users] Re: Empty /var/lib/ceph/osd/ceph-$osd after reboot

2022-12-26 Thread Konstantin Shalygin
Hi, ceph-volume lvm activate --all k > On 21 Dec 2022, at 13:53, Isaiah Tang Yue Shun wrote: > > From what I understand, after creating an OSD using "ceph-volume lvm > create", we will do a "ceph-volume lvm activate" so that the systemd is > created. > > However, I found that after reboot

[ceph-users] Re: Does Replica Count Affect Tell Bench Result or Not?

2022-12-27 Thread Konstantin Shalygin
Hi, The cache was exhausted, and the drive's internal optimization is proceeding. This is not an enterprise device; you should never use it with Ceph 🙂 k Sent from my iPhone > On 27 Dec 2022, at 16:41, hosseinz8...@yahoo.com wrote: > > Thanks Anthony. I have a cluster with QLC SSD disks (Samsung QVO 860). The > cluster works for 2

[ceph-users] Re: radosgw not working after upgrade to Quincy

2022-12-28 Thread Konstantin Shalygin
Hi, Just try to read your logs: > 2022-12-29T02:07:38.953+ 7f5df868ccc0 0 WARNING: skipping unknown > framework: civetweb You are trying to use `civetweb`, which is absent in the quincy release. You need to update your configs and use `beast` instead (see the sketch below) k > On 29 Dec 2022, at 09:20, Andrei Mikhail
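A sketch of the config change (instance name and port are hypothetical):

```
[client.rgw.gateway1]
# civetweb was removed in Quincy; beast is the remaining frontend
rgw_frontends = beast port=8080
```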

[ceph-users] Re: max pool size (amount of data/number of OSDs)

2023-01-02 Thread Konstantin Shalygin
Hi Chris, The actual limits are not software. Usually Ceph teams at cloud providers or universities run out of physical resources first: racks, rack power, or network (ports, EOL switches that can't be upgraded), or hardware lifetime (there is no point in buying old hardware, and the n

[ceph-users] Re: [ERR] OSD_SCRUB_ERRORS: 2 scrub errors

2023-01-11 Thread Konstantin Shalygin
Hi, > On 10 Jan 2023, at 07:10, David Orman wrote: > > We ship all of this to our centralized monitoring system (and a lot more) and > have dashboards/proactive monitoring/alerting with 100PiB+ of Ceph. If you're > running Ceph in production, I believe host-level monitoring is critical, > abo

[ceph-users] Re: Current min_alloc_size of OSD?

2023-01-13 Thread Konstantin Shalygin
Hi, > On 12 Jan 2023, at 04:35, Robert Sander wrote: > > How can I get the current min_alloc_size of OSDs that were created with > older Ceph versions? Is there a command that shows this info from the on disk > format of a bluestore OSD? You can see this via kvstore-tool: ceph-kvstore-tool
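The truncated command presumably reads the BlueStore superblock key; a sketch, with the OSD stopped first (the OSD id is hypothetical, and the S/min_alloc_size key location is an assumption):

```
systemctl stop ceph-osd@0
ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 get S min_alloc_size
systemctl start ceph-osd@0
```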

[ceph-users] Re: Mount ceph using FQDN

2023-01-24 Thread Konstantin Shalygin
Hi, Do you think the kernel should care about DNS resolution? k > On 24 Jan 2023, at 19:07, kushagra.gu...@hsc.com wrote: > > Hi team, > > We have a ceph cluster with 3 storage nodes: > 1. storagenode1 - abcd:abcd:abcd::21 > 2. storagenode2 - abcd:abcd:abcd::22 > 3. storagenode3 - abcd:abcd:abcd:
