[ceph-users] Re: question about radosgw-admin bucket check

2022-02-16 Thread Ulrich Klein
Hi Francois,

> For the mpu's it is less important as I can fix them with some scripts.

Would you mind sharing how you get rid of these left-over mpu objects?
I’ve been trying to get rid of them without much success.

I tried "radosgw-admin bucket check --bucket --fix --check-objects", but it
didn't seem to have any effect. But I'm on Pacific.

Thanks in advance.

Ciao, Uli
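
[Editorial note: for reference, a cleanup script along the lines Francois describes could be sketched as below. This is a minimal, illustrative sketch, assuming boto3 and S3 credentials for the RGW endpoint; the function names and the one-day cutoff are assumptions, not anything from this thread. It aborts genuinely incomplete multipart uploads via the S3 API; index entries that are already orphaned would still be a job for "bucket check".]

```python
from datetime import datetime, timedelta, timezone


def select_stale_uploads(uploads, older_than=timedelta(days=1)):
    """Return (Key, UploadId) pairs for multipart uploads initiated before the cutoff."""
    cutoff = datetime.now(timezone.utc) - older_than
    return [(u["Key"], u["UploadId"]) for u in uploads if u["Initiated"] < cutoff]


def abort_stale_uploads(bucket, endpoint_url, older_than=timedelta(days=1)):
    """List incomplete multipart uploads in a bucket and abort the stale ones."""
    import boto3  # imported here so the pure helper above needs no boto3

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    # ListMultipartUploads is paginated; walk every page of incomplete uploads.
    paginator = s3.get_paginator("list_multipart_uploads")
    for page in paginator.paginate(Bucket=bucket):
        for key, upload_id in select_stale_uploads(page.get("Uploads", []), older_than):
            s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
```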

> On 15.02.2022, at 16:16, Scheurer François wrote:
> 
> Dear Ceph Experts,
> 
> 
> The documentation for this rgw command is a bit unclear:
>radosgw-admin bucket check --bucket --fix --check-objects
> 
> Is this command still maintained and safe to use? (we are still on nautilus)
> Is it working with sharded buckets? and also in multi-site?
> 
> I heard it will clear invalid mpu's from the index and correct wrong bucket 
> stats.
> 
> The most important part for us would be to correct the bucket stats.
> For the mpu's it is less important as I can fix them with some scripts.
> 
> Thank you for your help!
> 
> 
> Cheers
> Francois
> 
> 
> PS:
> I can also reshard a bucket to correct its stats, but in multi-site this would 
> also require deleting the bucket in the secondary zones and resyncing it from 
> scratch, which is sub-optimal ;-)
> 
> 
> 
> 
> --
> 
> 
> EveryWare AG
> François Scheurer
> Senior Systems Engineer
> Zurlindenstrasse 52a
> CH-8003 Zürich
> 
> tel: +41 44 466 60 00
> fax: +41 44 466 60 10
> mail: francois.scheu...@everyware.ch
> web: http://www.everyware.ch
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Tenant and user id

2022-02-16 Thread Rok Jaklič
Hi,

is it possible to get tenant and user id with some python boto3 request?

Kind regards,
Rok
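
[Editorial note: there is no single boto3 call documented to return both values for RGW. One hedged approach, assuming the gateway has STS enabled, is GetCallerIdentity; and since RGW encodes tenanted S3 users as "tenant$user", the id can be split client-side. The helper names below are illustrative, not part of any API.]

```python
def split_tenant_user(user_id):
    """Split an RGW user id of the form 'tenant$user'; tenant is '' if absent."""
    tenant, sep, user = user_id.rpartition("$")
    return (tenant, user) if sep else ("", user_id)


def whoami(endpoint_url):
    """Ask the gateway who we are via STS GetCallerIdentity (requires STS
    support to be enabled in RGW; returns a (tenant, user) tuple)."""
    import boto3

    sts = boto3.client("sts", endpoint_url=endpoint_url)
    ident = sts.get_caller_identity()  # dict with 'UserId', 'Account', 'Arn'
    return split_tenant_user(ident["UserId"])
```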
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Need feedback on cache tiering

2022-02-16 Thread Neha Ojha
Hi everyone,

We'd like to understand how many users are using cache tiering and in
which release.
The cache tiering code is not actively maintained, and there are known
performance issues with using it (documented in
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution).
We are wondering if we can deprecate cache tiering sometime soon.

Thanks,
Neha

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson
On a related note,  Intel will be presenting about their Open CAS 
software that provides caching at the block layer under the OSD at the 
weekly performance meeting on 2/24/2022 (similar to dm-cache, but with 
differences regarding the implementation).  This isn't a replacement for 
cache tiering, but has some overlapping benefits and may be of interest 
to folks in the community.  I'll send out a reminder to the list when we 
get closer to the meeting.



Mark


On 2/16/22 09:43, Neha Ojha wrote:

Hi everyone,

We'd like to understand how many users are using cache tiering and in
which release.
The cache tiering code is not actively maintained, and there are known
performance issues with using it (documented in
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution).
We are wondering if we can deprecate cache tiering sometime soon.

Thanks,
Neha

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Eugen Block

Hi,

we've noticed the warnings for quite some time now, but we're big fans  
of the cache tier. :-)
IIRC we set it up some time around 2015 or 2016 for our production  
openstack environment and it works nicely for us. We tried it without  
the cache some time after we switched to Nautilus but the performance  
was really bad, so we enabled it again. Of course, one could argue  
that we could just use SSD OSDs for the cached pool, too. But since  
the cache works fine we don't find it necessary to rebuild the entire  
pool with larger SSDs.
We're currently still on Nautilus, we want to upgrade to Octopus soon.  
But I think we would vote for keeping the cache tier. :-)


Regards,
Eugen


Zitat von Neha Ojha :


Hi everyone,

We'd like to understand how many users are using cache tiering and in
which release.
The cache tiering code is not actively maintained, and there are known
performance issues with using it (documented in
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution).
We are wondering if we can deprecate cache tiering sometime soon.

Thanks,
Neha

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Mark Nelson

Hi Eugen,


Thanks for the great feedback.  Is there anything specific about the 
cache tier itself that you like vs hypothetically having caching live 
below the OSDs?  There are some real advantages to the cache tier 
concept, but eviction over the network has definitely been one of the 
tougher aspects of how it works (imho) compared with block level caching.



Mark


On 2/16/22 10:18, Eugen Block wrote:

Hi,

we've noticed the warnings for quite some time now, but we're big fans 
of the cache tier. :-)
IIRC we set it up some time around 2015 or 2016 for our production 
openstack environment and it works nicely for us. We tried it without 
the cache some time after we switched to Nautilus but the performance 
was really bad, so we enabled it again. Of course, one could argue 
that we could just use SSD OSDs for the cached pool, too. But since 
the cache works fine we don't find it necessary to rebuild the entire 
pool with larger SSDs.
We're currently still on Nautilus, we want to upgrade to Octopus soon. 
But I think we would vote for keeping the cache tier. :-)


Regards,
Eugen


Zitat von Neha Ojha :


Hi everyone,

We'd like to understand how many users are using cache tiering and in
which release.
The cache tiering code is not actively maintained, and there are known
performance issues with using it (documented in
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution). 


We are wondering if we can deprecate cache tiering sometime soon.

Thanks,
Neha

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Need feedback on cache tiering

2022-02-16 Thread Eugen Block
There’s nothing special about our setup really. I’m also open to test  
any alternative if it improves our user experience. So it would  
probably make sense to check out the performance meeting you  
mentioned. :-)


Zitat von Mark Nelson :


Hi Eugen,


Thanks for the great feedback.  Is there anything specific about the  
cache tier itself that you like vs hypothetically having caching  
live below the OSDs?  There are some real advantages to the cache  
tier concept, but eviction over the network has definitely been one  
of the tougher aspects of how it works (imho) compared with block  
level caching.



Mark


On 2/16/22 10:18, Eugen Block wrote:

Hi,

we've noticed the warnings for quite some time now, but we're big  
fans of the cache tier. :-)
IIRC we set it up some time around 2015 or 2016 for our production  
openstack environment and it works nicely for us. We tried it  
without the cache some time after we switched to Nautilus but the  
performance was really bad, so we enabled it again. Of course, one  
could argue that we could just use SSD OSDs for the cached pool,  
too. But since the cache works fine we don't find it necessary to  
rebuild the entire pool with larger SSDs.
We're currently still on Nautilus, we want to upgrade to Octopus  
soon. But I think we would vote for keeping the cache tier. :-)


Regards,
Eugen


Zitat von Neha Ojha :


Hi everyone,

We'd like to understand how many users are using cache tiering and in
which release.
The cache tiering code is not actively maintained, and there are known
performance issues with using it (documented in
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/#a-word-of-caution).
We are wondering if we can deprecate cache tiering sometime soon.


Thanks,
Neha

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


--
Eugen Block
NDE Netzdesign und -entwicklung AG   voice: +49 40 5595175
Postfach 61 03 15   e-mail: ebl...@nde.ag
D-22423 Hamburg

  Vorstand: Jens-U. Mozdzen
  Aufsichtsratsvorsitzende: Angelika Torlée-Mozdzen
  Sitz und Registergericht: Hamburg, HRB 90934
  USt-IdNr: DE 814 013 983


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problem with Ceph daemons

2022-02-16 Thread Adam King
Is there anything useful in the rgw daemon's logs? (e.g. journalctl -xeu
ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk)

 - Adam King

On Wed, Feb 16, 2022 at 3:58 PM Ron Gage  wrote:

> Hi everyone!
>
>
>
> Looks like I am having some problems with some of my ceph RGW daemons -
> they
> won't stay running.
>
>
>
> From 'cephadm ls'.
>
>
>
> {
>
> "style": "cephadm:v1",
>
> "name": "rgw.obj0.c01.gpqshk",
>
> "fsid": "35194656-893e-11ec-85c8-005056870dae",
>
> "systemd_unit":
> "ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk",
>
> "enabled": true,
>
> "state": "error",
>
> "service_name": "rgw.obj0",
>
> "ports": [
>
> 80
>
> ],
>
> "ip": null,
>
> "deployed_by": [
>
>
> "quay.io/ceph/ceph@sha256:c3a89afac4f9c83c716af57e08863f7010318538c7e2cd911458800097f7d97d",
>
> "quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061"
>
> ],
>
> "rank": null,
>
> "rank_generation": null,
>
> "memory_request": null,
>
> "memory_limit": null,
>
> "container_id": null,
>
> "container_image_name":
> "quay.io/ceph/ceph@sha256:a39107f8d3daab4d756eabd6ee1630d1bc7f31eaa76fff41a77fa32d0b903061",
>
> "container_image_id": null,
>
> "container_image_digests": null,
>
> "version": null,
>
> "started": null,
>
> "created": "2022-02-09T01:00:53.411541Z",
>
> "deployed": "2022-02-09T01:00:52.338515Z",
>
> "configured": "2022-02-09T01:00:53.411541Z"
>
> },
>
>
>
> That whole "state: error" bit is concerning to me - and it contributing to
> the cluster status of warning (showing 6 cephadm daemons down).
>
>
>
> Can I get a hint or two on how to fix this?
>
>
> Thanks!
>
>
>
> Ron Gage
>
> Westland, MI
>
>
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Problem with Ceph daemons

2022-02-16 Thread Eugen Block
Can you retry after resetting the systemd unit? The message "Start  
request repeated too quickly." should be cleared first, then start it  
again:


systemctl reset-failed ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service
systemctl start ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service


Then check the logs again. If there's still nothing in the rgw log  
then you'll need to check the (active) mgr daemon logs for anything  
suspicious and also the syslog on that rgw host. Is the rest of the  
cluster healthy? Are rgw daemons colocated with other services?



Zitat von Ron Gage :


Adam:



Not really….



-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has begun starting up.

Feb 16 15:01:03 c01 podman[426007]:
Feb 16 15:01:04 c01 bash[426007]: 915d1e19fa0f213902c666371c8e825480e103f85172f3b15d1d5bf2427a87c9
Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+ 7f4f72ff6440  0 deferred set uid:gid to 167:167 (ceph:ceph)
Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+ 7f4f72ff6440  0 ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (st>
Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+ 7f4f72ff6440  0 framework: beast
Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+ 7f4f72ff6440  0 framework conf key: port, val: 80
Feb 16 15:01:04 c01 conmon[426038]: debug 2022-02-16T20:01:04.303+ 7f4f72ff6440  1 radosgw_Main not setting numa affinity
Feb 16 15:01:04 c01 systemd[1]: Started Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.
-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished start-up
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished starting up.
--
-- The start-up result is done.
Feb 16 15:01:04 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Main process exited, code=exited, status=98/n/a
Feb 16 15:01:05 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- The unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has entered the 'failed' state with result 'exit-code'.
Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Service RestartSec=10s expired, scheduling restart.
Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Scheduled restart job, restart counter is at 5.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Automatic restarting of the unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Feb 16 15:01:15 c01 systemd[1]: Stopped Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.
-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished shutting down
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has finished shutting down.
Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Start request repeated too quickly.
Feb 16 15:01:15 c01 systemd[1]: ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service: Failed with result 'exit-code'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- The unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has entered the 'failed' state with result 'exit-code'.
Feb 16 15:01:15 c01 systemd[1]: Failed to start Ceph rgw.obj0.c01.gpqshk for 35194656-893e-11ec-85c8-005056870dae.
-- Subject: Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has failed
-- Defined-By: systemd
-- Support: https://access.redhat.com/support
--
-- Unit ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk.service has failed.
--
-- The result is failed.



Ron Gage

Westland, MI



From: Adam King 
Sent: Wednesday, February 16, 2022 4:18 PM
To: Ron Gage 
Cc: ceph-users 
Subject: Re: [ceph-users] Problem with Ceph daemons



Is there anything useful in the rgw daemon's logs? (e.g. journalctl -xeu ceph-35194656-893e-11ec-85c8-005056870dae@rgw.obj0.c01.gpqshk)




 - Adam King