[ceph-users] Re: [EXTERNAL] Re: OSDs flapping with "_open_alloc loaded 132 GiB in 2930776 extents available 113 GiB"

2021-08-27 Thread Igor Fedotov

On 8/26/2021 4:18 PM, Dave Piper wrote:

I'll keep trying to repro and gather diags, but running in containers is making 
it very hard to run debug commands while the ceph daemons are down. Is this a 
known problem with a solution?

Sorry, I'm not aware of this issue. I don't use containers though.

In the meantime, what's the impact of running with the Bitmap Allocator instead 
of the Hybrid one?  I'm nervous about switching from the default without 
understanding what that means.


Bitmap allocator is known for worse performance and higher resulting 
fragmentation.





Dave

-Original Message-
From: Igor Fedotov 
Sent: 23 August 2021 14:22
To: Dave Piper ; ceph-users@ceph.io
Subject: Re: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded 132 
GiB in 2930776 extents available 113 GiB"

Hi Dave,

so maybe another bug in the Hybrid Allocator...

Could you please dump the free extents for your "broken" OSD(s) by issuing 
"ceph-bluestore-tool --path <osd path> --command free-dump". The OSD needs to be offline.

Preferably, collect these reports after you reproduce the issue with the hybrid 
allocator once again - hence you'll need to switch back and wait for the 
repro. But if that's impractical, it would be OK to have such a dump for the 
current state - hopefully it will reveal something interesting as well.
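
Something along these lines should do it (the path and OSD id are just examples):

  systemctl stop ceph-osd@12     # or stop the OSD container
  ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-12 --command free-dump > osd.12-free-dump.json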


Thanks in advance,

Igor


On 8/20/2021 1:50 PM, Dave Piper wrote:

Igor,

We've hit this again on ceph 15.2.13 using the default allocator. Once again, 
configuring the OSDs to use the bitmap allocator has fixed up the issue.
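
(For anyone following along: the switch is essentially just setting the allocator 
option and restarting the OSDs, e.g.:

  ceph config set osd bluestore_allocator bitmap
)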

I'm still trying to gather the full set of debug logs from the crash. I think 
again the fact I'm running in containers is the issue here; the container seems 
to be dying before we've had time to flush the log stream to file. I'll keep 
looking for a way around this.
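
(One thing I still plan to try is making the OSD log to a file on a path that is 
bind-mounted from the host, so the log survives the container - roughly:

  ceph config set osd.3 log_to_file true     # osd.3 just as an example

plus mounting /var/log/ceph from the host into the container.)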

Dave

-Original Message-
From: Igor Fedotov 
Sent: 12 August 2021 13:36
To: Dave Piper ; ceph-users@ceph.io
Subject: Re: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded 132 
GiB in 2930776 extents available 113 GiB"

Hi Dave,

thanks for the update.

I'm curious whether reverting back to default allocator on the latest release 
would be OK as well. Please try if possible.


Thanks,

Igor

On 8/12/2021 2:00 PM, Dave Piper wrote:

Hi Igor,

Just to update you on our progress.

- We've not had another repro of this since switching to bitmap allocator / 
upgrading to the latest octopus release. I'll try to gather the full set of 
diags if we do see this again.
- I think my issues with an empty /var/lib/ceph/osd/ceph-N/ folder are because 
we're running ceph in a container which uses a mounted filesystem. As soon as 
I stop the OSD, the container terminates and the filesystem disappears. There's 
probably a way to decouple the ceph process from the lifetime of the container, 
but I've not figured it out yet.

Cheers again for all your help,

Dave

-Original Message-
From: Igor Fedotov 
Sent: 26 July 2021 13:30
To: Dave Piper ; ceph-users@ceph.io
Subject: Re: [EXTERNAL] Re: [ceph-users] OSDs flapping with "_open_alloc loaded 132 
GiB in 2930776 extents available 113 GiB"

Dave,

please see inline

On 7/26/2021 1:57 PM, Dave Piper wrote:

Hi Igor,


So to get more verbose but smaller logs one can set both debug-bluestore and 
debug-bluefs to 1/20. ...

More verbose logging attached. I've trimmed the file to a single restart 
attempt to keep the filesize down; let me know if there's not enough here.

Jul 26 10:25:07 condor_sc0 docker[19100]: -9628> 2021-07-26T10:25:05.512+ 7f9b3ed48f40 20 bluefs _read got 32768
Jul 26 10:25:07 condor_sc0 docker[19100]: -9627> 2021-07-26T10:25:05.512+ 7f9b3ed48f40 10 bluefs _read h 0x563e8bd3ff80 0xb2d~8000 from file(ino 316842 size 0xe6a476e mtime 2021-07-14T15:54:21.751044+ allocated e6b extents
[1:0x112874~1,1:0x112878~1,1:0x112ad6~1,1:0x1165fa~1,1:0x11662a~1,1:0x116741~1,
1:0x116be10000~1,1:0x112a33~2,1:0x117089~1,1:0x11751e~10000,1:0x11770c~1,1:0x117776~1,
1:0x117805~1,1:0x117b2d~1,1:0x117b58~1,1:0x12a977~1,1:0x12a982~1,1:0x12a98d~1,
1:0x12a98f~1,1:0x12a993~1,1:0x12a9a0~1,1:0x12a9b3~1,1:0x12a9bd~1,1:0x12a9c3~1,
1:0x12a9c7~1,1:0x12a9ca~1,1:0x12a9cc~1,1:0x12a9df~1,1:0x12a9e9~1,1:0x12a9ec0000~1,
1:0x12a9f3~1,1:0x12a9f9~1,1:0x12a9ff~10000,1:0x12aa02~1,1:0x12aa05~1,1:0x12aa09~1,
1:0x12aa0b~1,1:0x12aa15~1,1:0x12aa22~1,1:0x12aa37~1,1:0x12aa41~1,1:0x12aa44~1,
1:0x12aa49~1,1:0x12aa4c~1,1:0x12aa51~1,1:0x12aa5a~1,1:0x12aa7b~1,1:0x12aa91~1,
1:0x12aa95~1,1:0x12aa97~1,1:0x12aa9a~1,1:0x12aa9d~1,1:0x12aab60000~1,1:0x12aac1~1,
1:0x12ad42~10

[ceph-users] A simple erasure-coding question about redundance

2021-08-27 Thread Rainer Krienke

Hello,

recently I thought about erasure coding and how to set k+m in a useful 
way also taking into account the number of hosts available for ceph. Say 
I would have this setup:


The cluster has 6 hosts and I want to allow two *hosts* to fail without 
losing data. So I might choose k+m as 4+2 with redundancy at host 
level, but isn't this a little unwise?


What would happen if:

1. two disks would fail where both failed disks are not on the same 
host? I think ceph would be able to find a PG distributed across all 
hosts avoiding the two failed disks, so ceph would be able to repair and 
reach a healthy status after a while?


2. Two complete hosts would fail, say because of broken power supplies. 
In this case ceph would no longer be able to repair the damage, because 
there are no two more "free" remaining hosts to satisfy the 4+2 rule 
(with redundancy at host level). So data would not be lost, but the 
cluster might stop delivering data and would be unable to repair and 
thus would also be unable to become healthy again?


Right or wrong?

Thanks a lot
Rainer
--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 
1001312

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A simple erasure-coding question about redundance

2021-08-27 Thread Janne Johansson
On Fri, 27 Aug 2021 at 12:43, Rainer Krienke wrote:
>
> Hello,
>
> recently I thought about erasure coding and how to set k+m in a useful
> way also taking into account the number of hosts available for ceph. Say
> I would have this setup:
>
> The cluster has 6 hosts and I want to allow two *hosts* to fail without
> loosing data. So I might choose k+m as 4+2 with redundancy at host
> level, but isn't this a little unwise?

Yes. You should have more hosts for EC 4+2, or .. less K.

> What would happen if:
> 1. two disks would fail where both failed disks are not on the same
> host? I think ceph would be able to find a PG distributed across all
> hosts avoiding the two failed disks, so ceph would be able to repair and
> reach a healthy status after a while?

Yes, if other disks are available to spill over to.

> 2. Two complete hosts would fail say because of broken power supplies.
> In this case ceph would no longer be able to repair the damage because
> there are no two more "free" remaining hosts to satisfy the 4+2 rule
> (with redundancy on host level). So data would not be lost but the
> cluster might stop delivering data and would be unable to repair and
> thus would also be unable to become healthy again?
>
> Right or wrong?

In the second case, the cluster stops until at least one new host
appears; only then can it start repairing, and only after the repairs
have rebuilt at least one more shard for your EC objects will it start
serving data again. Until that happens the cluster will also be in a
very dangerous state, in case a third drive or host fails.


-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A simple erasure-coding question about redundance

2021-08-27 Thread Eugen Block

Hi,

1. two disks would fail where both failed disks are not on the same  
host? I think ceph would be able to find a PG distributed across all  
hosts avoiding the two failed disks, so ceph would be able to repair  
and reach a healthy status after a while?


yes, if there is enough disk space and no other OSDs fail during that  
time then ceph would recover successfully and the PGs would still be  
available.



2. Two complete hosts would fail say because of broken power  
supplies. In this case ceph would no longer be able to repair the  
damage because there are no two more "free" remaining hosts to  
satisfy the 4+2 rule (with redundancy on host level). So data would  
not be lost but the cluster might stop delivering data and would be  
unable to repair and thus would also be unable to become healthy  
again?


Correct, your cluster would be in a degraded state until you have 6  
hosts again. But keep in mind that with EC your pool's min_size is  
usually k+1 so in your example your cluster would stop serving I/O the  
moment the second host fails.
The best choice would be for k+m to be smaller than the number of  
available hosts so your cluster can recover. If you want to be able to  
recover from two failed hosts, you should take that into consideration  
when choosing k and m.
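
You can check both values for a given pool, e.g. (the pool name is just an example):

  ceph osd pool get ec-pool min_size
  ceph osd pool get ec-pool erasure_code_profile
  ceph osd erasure-code-profile get <profile name shown above>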


Regards,
Eugen


Quoting Rainer Krienke:


Hello,

recently I thought about erasure coding and how to set k+m in a  
useful way also taking into account the number of hosts available  
for ceph. Say I would have this setup:


The cluster has 6 hosts and I want to allow two *hosts* to fail  
without loosing data. So I might choose k+m as 4+2 with redundancy  
at host level, but isn't this a little unwise?


What would happen if:

1. two disks would fail where both failed disks are not on the same  
host? I think ceph would be able to find a PG distributed across all  
hosts avoiding the two failed disks, so ceph would be able to repair  
and reach a healthy status after a while?


2. Two complete hosts would fail say because of broken power  
supplies. In this case ceph would no longer be able to repair the  
damage because there are no two more "free" remaining hosts to  
satisfy the 4+2 rule (with redundancy on host level). So data would  
not be lost but the cluster might stop delivering data and would be  
unable to repair and thus would also be unable to become healthy  
again?


Right or wrong?

Thanks a lot
Rainer
--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax:  
+49261287 1001312

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Debian 11 Bullseye support

2021-08-27 Thread Francesco Piraneo G.



This August, Debian testing became the new Debian stable with LTS support.

But I see that only a sid repo exists, no testing and no new stable bullseye.

Maybe someone knows when there are plans to have a bullseye build?

I cannot answer this question; however, I hope that the maintainers 
do a serious cleanup of the packages installed with ceph-manager: 
when installing it, other funny (and futile) packages are installed, like:


- docker.io (can be removed manually with no effect);

- libgfortran5 (yes! the language!)

- fonts-lyx (Latex fonts...)

- ttf-bitstream-vera (TTF!!!)

I have just one question: Why?

F.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A simple erasure-coding question about redundance

2021-08-27 Thread Rainer Krienke

Hello Janne,

thank you very much for answering my questions.

Rainer

On 27.08.21 at 12:51, Janne Johansson wrote:

On Fri, 27 Aug 2021 at 12:43, Rainer Krienke wrote:


Hello,

recently I thought about erasure coding and how to set k+m in a useful
way also taking into account the number of hosts available for ceph. Say
I would have this setup:

The cluster has 6 hosts and I want to allow two *hosts* to fail without
loosing data. So I might choose k+m as 4+2 with redundancy at host
level, but isn't this a little unwise?




--
Rainer Krienke, Uni Koblenz, Rechenzentrum, A22, Universitaetsstrasse  1
56070 Koblenz, Web: http://www.uni-koblenz.de/~krienke, Tel: +49261287 1312
PGP: http://www.uni-koblenz.de/~krienke/mypgp.html, Fax: +49261287 
1001312

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A simple erasure-coding question about redundance

2021-08-27 Thread Robert Sander

On 27.08.21 at 12:51, Janne Johansson wrote:


Yes. You should have more hosts for EC 4+2, or .. less K.


I'll second that. You should have at least k+m+2 hosts in the cluster 
for erasure coding. Not only because of redundancy but also for better 
distributing the load. EC is CPU heavy.
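
For reference, such a profile and pool would be created roughly like this (the 
names and PG count are only examples):

  ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-failure-domain=host
  ceph osd pool create ecpool 128 128 erasure ec-4-2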


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Zwangsangaben lt. §35a GmbHG:
HRB 93818 B / Amtsgericht Berlin-Charlottenburg,
Geschäftsführer: Peer Heinlein -- Sitz: Berlin
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Pacific: access via S3 / Object gateway slow for small files

2021-08-27 Thread E Taka
Hi,

thanks for the answers. My goal was to speed up the S3 interface, and
not only a single program. This was successful with this method:

However, one major disadvantage was that Cephadm considered the OSDs
as "STRAY DAEMON" and the OSDs could not be administered with the
Dashboard. What really helped was this doc:

https://docs.ceph.com/en/pacific/cephadm/osd/

1. As a prerequisite, one has to turn off the automagic creation of OSDs:

  ceph orch apply osd --all-available-devices --unmanaged=true

2. Then create a YAML specification like this and apply it:

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  rotational: 1
db_devices:
  rotational: 0

3. Delete ALL OSDs from one node:
  ceph orch osd rm <osd ids>
(and probably wait for many hours)

4. Zap those HDDs and the SSD:
ceph orch device zap <host> <device path>

5. Activate ceph-volume via
  ceph cephadm osd activate <host>

Et voilà! Now we can use the dashboard and the SSDs are used for WAL/DB.
This speeds up access to Ceph, especially the S3 API, which is
almost 10 times as fast as before.
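
(To verify that an OSD really got its DB onto the SSD, something like this should 
work - osd.0 only as an example:

  ceph osd metadata 0 | grep -i bluefs_db
)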

For Pacific++ there should be a very prominent reference to the doc
"Cephadm – OSD service", in particular from the "BlueStore Settings"
(first URL above). That would have saved me many hours of testing.

Thanks anyway!

On Tue, 24 Aug 2021 at 10:41, Janne Johansson wrote:
>
> On Tue, 24 Aug 2021 at 09:46, Francesco Piraneo G. wrote:
> > On 24.08.21 at 09:32, Janne Johansson wrote:
> > >> As a simple test I copied an Ubuntu /usr/share/doc (580 MB in 23'000 
> > >> files):
> > >> - rsync -a to a Cephfs took 2 min
> > >> - s3cmd put --recursive took over 70 min
> > >> Users reported that the S3 access is generally slow, not only with 
> > >> s3tools.
> > > Single per-object accesses and writes on S3 are slower, since they
> > > involve both client and server side checksumming, a lot of http(s)
> > > stuff before the actual operations start and I don't think there is a
> > > lot of connection reuse or pipelining being done so you are going to
> > > make some 23k requests, each taking a non-zero time to complete.
> > >
> > Question: Is Swift compatible protocol faster?
>
> Probably not, but make a few tests and find out how it works at your place.
> It's kind of easy to rig both at the same time, so you can test on exactly the
> same setup.
>
> > Use case: I have to store indefinite files quantity for a data storage
> > service; I thought object storage is the unique solution; each file is
> > identified by UUID, no metadata on file, files are chunked 4Mb size each.
>
> That sounds like a better case for S3/Swift.
>
> > In such case cephfs is the best suitable choice?
>
> One factor to add might be "will it be reachable from the outside?",
> since radosgw is kind of easy to put behind a set of load balancers,
> that can wash/clean incoming traffic and handle TLS offload and things
> like that. Putting cephfs out on the internet might have other cons.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Replacing swift with RGW

2021-08-27 Thread Michel Niyoyita
Hello ,

I have configured RGW in my Ceph cluster (deployed using ceph-ansible) and
created a sub-user to access the created containers, and I would like to
replace Swift with RGW on the OpenStack side. Can anyone help with the
configuration to be done on the OpenStack side in order to integrate those
services? I have deployed OpenStack Wallaby using Kolla-Ansible on Ubuntu
20.04, and Ceph Pacific 16.2.5 was deployed using ceph-ansible on Ubuntu 20.04.

Kindly help with the configuration or point me to the relevant documentation.

Best Regards

Michel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issue installing radosgw on debian 10

2021-08-27 Thread Dimitri Savineau
Can you try to update the `mon host` value with brackets ?

mon host = [v2:192.168.1.50:3300,v1:192.168.1.50:6789],[v2:192.168.1.51:3300
,v1:192.168.1.51:6789],[v2:192.168.1.52:3300,v1:192.168.1.52:6789]

https://docs.ceph.com/en/latest/rados/configuration/msgr2/#updating-ceph-conf-and-mon-host

Regards,

Dimitri

On Fri, Aug 27, 2021 at 11:00 AM Francesco Piraneo G. 
wrote:

>  > Installed radosgw and ceph-common via apt; modified ceph.conf as
> follows on mon1 and propagated the modified file to all hosts and
> obviously to s3.anonicloud.test.
>
>  > Yes, I forgot the ceph.conf, sorry.
>
>
> [global]
> fsid = e79c0ace-b910-40af-ab2c-ae90fa4f5dd2
> mon initial members = mon1, mon2, mon3
> mon host = v2:192.168.1.50:3300,v1:192.168.1.50:6789,
> v2:192.168.1.51:3300,v1:192.168.1.51:6789,
> v2:192.168.1.52:3300,v1:192.168.1.52:6789
> public network = 192.168.1.0/24
> cluster_network = 172.16.0.0/16
> auth cluster required = cephx
> auth service required = cephx
> auth client required = cephx
> osd journal size = 1024
> osd pool default size = 3
> osd pool default min size = 2
> osd pool default pg num = 333
> osd pool default pgp num = 333
> osd crush chooseleaf type = 1
>
> [client.rgw.s3]
> host = s3 # See note (1)
> rgw frontends = "civetweb port=80"
> rgw dns name = s3.anonicloud.test
>
>
> (1) In RedHat doc I found to put here the hostname -s; in other pages on
> the net they indicated the ip address... where is the truth?
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: tcmu-runner crashing on 16.2.5

2021-08-27 Thread Paul Giralt (pgiralt)
Ok - thanks Xiubo. Not sure I feel comfortable doing that without breaking 
something else, so will wait for a new release that incorporates the fix. In 
the meantime I’m trying to figure out what might be triggering the issue, since 
this has been running fine for months and just recently started happening. Now 
it happens fairly regularly.

I noticed that in the tcmu logs, I see the following:

2021-08-27 15:06:40.158 8:ework-thread [ERROR] 
tcmu_rbd_service_status_update:140 rbd/iscsi-pool-0001.iscsi-p0001-img-01: 
Could not update service status. (Err -107)
2021-08-27 15:06:40.158 8:ework-thread [ERROR] __tcmu_report_event:173 
rbd/iscsi-pool-0001.iscsi-p0001-img-01: Could not report events. Error -107.
2021-08-27 15:06:41.131 8:io_context_pool [WARN] tcmu_notify_lock_lost:271 
rbd/iscsi-pool-0002.iscsi-p0002-img-02: Async lock drop. Old state 5
2021-08-27 15:06:41.147 8:cmdproc-uio9 [INFO] alua_implicit_transition:592 
rbd/iscsi-pool-0002.iscsi-p0002-img-02: Starting write lock acquisition 
operation.
2021-08-27 15:06:42.132 8:ework-thread [ERROR] 
tcmu_rbd_service_status_update:140 rbd/iscsi-pool-0002.iscsi-p0002-img-02: 
Could not update service status. (Err -107)
2021-08-27 15:06:42.132 8:ework-thread [ERROR] __tcmu_report_event:173 
rbd/iscsi-pool-0002.iscsi-p0002-img-02: Could not report events. Error -107.
2021-08-27 15:06:42.216 8:ework-thread [INFO] 
tcmu_rbd_rm_stale_entries_from_blacklist:340 
rbd/iscsi-pool-0001.iscsi-p0001-img-01: removing addrs: 
{10.122.242.197:0/2251669337}
2021-08-27 15:06:42.217 8:ework-thread [ERROR] 
tcmu_rbd_rm_stale_entry_from_blacklist:322 
rbd/iscsi-pool-0001.iscsi-p0001-img-01: Could not rm blacklist entry '�(~'. 
(Err -13)
2021-08-27 15:06:42.217 8:ework-thread [INFO] 
tcmu_rbd_rm_stale_entries_from_blacklist:340 
rbd/iscsi-pool-0001.iscsi-p0001-img-01: removing addrs: 
{10.122.242.197:0/3276725458}
2021-08-27 15:06:42.218 8:ework-thread [ERROR] 
tcmu_rbd_rm_stale_entry_from_blacklist:322 
rbd/iscsi-pool-0001.iscsi-p0001-img-01: Could not rm blacklist entry ''. (Err 
-13)
2021-08-27 15:06:42.443 8:io_context_pool [WARN] tcmu_notify_lock_lost:271 
rbd/iscsi-pool-0005.iscsi-p0005-img-01: Async lock drop. Old state 5
2021-08-27 15:06:42.459 8:cmdproc-uio0 [INFO] alua_implicit_transition:592 
rbd/iscsi-pool-0005.iscsi-p0005-img-01: Starting write lock acquisition 
operation.
2021-08-27 15:06:42.488 8:ework-thread [INFO] 
tcmu_rbd_rm_stale_entries_from_blacklist:340 
rbd/iscsi-pool-0005.iscsi-p0005-img-01: removing addrs: 
{10.122.242.197:0/2189482708}
2021-08-27 15:06:42.489 8:ework-thread [ERROR] 
tcmu_rbd_rm_stale_entry_from_blacklist:322 
rbd/iscsi-pool-0005.iscsi-p0005-img-01: Could not rm blacklist entry '`"�'. 
(Err -13)

The tcmu_rbd_service_status_update is showing up in there which is the code 
that is affected by this bug. Any idea what the error -107 means? Maybe if I 
fix what is causing some of these errors, it might work around the problem. 
Also if you have thoughts on the other blacklist entry errors and what might be 
causing them, that would be greatly appreciated as well.

-Paul


On Aug 26, 2021, at 8:37 PM, Xiubo Li wrote:

On 8/27/21 12:06 AM, Paul Giralt (pgiralt) wrote:
This is great. Is there a way to test the fix in my environment?


It seems you could restart the tcmu-runner service from the container.

Since this change is not only in handler_rbd.so but also in libtcmu.so and the 
tcmu-runner binary, the whole of tcmu-runner needs to be built.

That means, I am afraid, you have to build and install it from source on the host 
and then restart the tcmu container.
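
A rough sketch of the steps (the cmake options may need adjusting for your environment):

  git clone https://github.com/open-iscsi/tcmu-runner.git
  cd tcmu-runner
  cmake -DCMAKE_INSTALL_PREFIX=/usr -DSUPPORT_SYSTEMD=ON .
  make && make install
  # then restart the tcmu container / tcmu-runner service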



-Paul


On Aug 26, 2021, at 11:05 AM, Xiubo Li wrote:


Hi Paul, Ilya,

I have fixed it in [1], please help review.

Thanks

[1] https://github.com/open-iscsi/tcmu-runner/pull/667


On 8/26/21 7:34 PM, Paul Giralt (pgiralt) wrote:
Thank you for the analysis. Can you think of a workaround for the issue?

-Paul

Sent from my iPhone

On Aug 26, 2021, at 5:17 AM, Xiubo Li 
 wrote:



Hi Paul,

There is one racy case between updating the state to the ceph cluster and 
reopening the image (which will close and then open the image): the crash should 
happen just after the image was closed and its resources were released; if the 
work queue then tries to update the state to the ceph cluster, it will trigger a 
use-after-free bug.

I will try to fix it.

Thanks


On 8/26/21 10:40 AM, Paul Giralt (pgiralt) wrote:
I will send a unicast email with the link and details.

-Paul


On Aug 25, 2021, at 10:37 PM, Xiubo Li wrote:


Hi Paul,

Please send me the detail versions of the tcmu-runner and ceph-iscsi packages 
you are using.

Thanks


On 8/26/21 10:21 AM, Paul Giralt (pgiralt) wrote:
Thank you. I did find some coredump files. Is there a way I can send these to 
you to analyze?

[root@cxcto-c240-j27-02 coredump]# ls -asl
total 71292
0 drwxr-xr-x. 2 root root  176 Aug 25 18:31 .
0 drw

[ceph-users] Howto upgrade AND change distro

2021-08-27 Thread Francois Legrand

Hello,

We are running a ceph nautilus cluster under centos 7. To upgrade to 
pacific we need to change to a more recent distro (probably debian or 
ubuntu because of the recent announcement about centos 8, but the distro 
doesn't matter very much).


However, I couldn't find a clear procedure to upgrade ceph AND the 
distro! As we have more than 100 OSDs and ~600 TB of data, we would 
like to avoid as far as possible wiping the disks and 
rebuilding/rebalancing. It seems to be possible to reinstall a server and 
reuse the OSDs, but the exact procedure remains quite unclear to me.


What is the best way to proceed? Has someone already done this and got a 
rather detailed doc on how to proceed?


Thanks for your help !

F.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Howto upgrade AND change distro

2021-08-27 Thread Matthew Vernon

Hi,

On 27/08/2021 16:16, Francois Legrand wrote:

We are running a ceph nautilus cluster under centos 7. To upgrade to 
pacific we need to change to a more recent distro (probably debian or 
ubuntu because of the recent announcement about centos 8, but the distro 
doesn't matter very much).


However, I could'nt find a clear procedure to upgrade ceph AND the 
distro !  As we have more than 100 osds and ~600TB of data, we would 
like to avoid as far as possible to wipe the disks and 
rebuild/rebalance. It seems to be possible to reinstall a server and 
reuse the osds, but the exact procedure remains quite unclear to me.


It's going to be least pain to do the operations separately, which means 
you may need to build a set of packages for one or other "end" of the 
operation, if you see what I mean?


The Debian and Ubuntu installers both have an "expert mode" which gives 
you quite a lot of control which should enable you to upgrade the OS 
without touching the OSD disks - but make sure you have backups of all 
your Ceph config!
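
As a sketch (adjust paths to your setup): before the reinstall save at least 
/etc/ceph and the bootstrap keyrings, and afterwards ceph-volume can bring the 
existing OSDs back up:

  tar czf ceph-node-backup.tgz /etc/ceph /var/lib/ceph/bootstrap-osd
  # ... reinstall the OS, reinstall the ceph packages, restore the backup ...
  ceph-volume lvm activate --all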


If you're confident (and have enough redundancy), you can set noout 
while you upgrade a machine, which will reduce the amount of rebalancing 
you have to do when it rejoins the cluster post upgrade.
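
e.g.:

  ceph osd set noout
  # ... upgrade the host and bring its OSDs back ...
  ceph osd unset noout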


Regards,

Matthew

[one good thing about Ubuntu's cloud archive is that e.g. you can get 
the same version that's default in 20.04 available as packages for 18.04 
via UCA meaning you can upgrade Ceph first, and then do the distro 
upgrade, and it's pretty painless]
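
[Roughly, on 18.04 - the exact cloud-archive pocket depends on which Ceph 
release you are after:

  sudo add-apt-repository cloud-archive:ussuri
  sudo apt update && sudo apt full-upgrade
]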


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: A simple erasure-coding question about redundance

2021-08-27 Thread Anthony D'Atri


> 
>> Yes. You should have more hosts for EC 4+2, or .. less K.
> 
> I'll second that. You should have at least k+m+2 hosts in the cluster for 
> erasure coding. Not only because of redundancy but also for better 
> distributing the load. EC is CPU heavy.
> 
> Regards

I agree operationally, but FWIW ISTR that in Pacific …. or maybe it was Octopus 
…. the default min_size was changed to K from K+1.  Perhaps that doesn’t affect 
existing pools without intervention?

With small clusters especially, there are other reasons to favor more hosts, 
even if they have to be smaller.  Ways to get there include 1U servers instead 
of 2U, not initially filling all drive slots.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: LARGE_OMAP_OBJECTS: any proper action possible?

2021-08-27 Thread Patrick Donnelly
Hi Frank,

On Wed, Aug 25, 2021 at 6:27 AM Frank Schilder  wrote:
>
> Hi all,
>
> I have the notorious "LARGE_OMAP_OBJECTS: 4 large omap objects" warning and 
> am again wondering if there is any proper action one can take except "wait it 
> out and deep-scrub (numerous ceph-users threads)" or "ignore 
> (https://docs.ceph.com/en/latest/rados/operations/health-checks/#large-omap-objects)".
>  Only for RGWs is a proper action described, but mine come from MDSes. Is 
> there any way to ask an MDS to clean up or split the objects?
>
> The disks with the meta-data pool can easily deal with objects of this size. 
> My question is more along the lines: If I can't do anything anyway, why the 
> warning? If there is a warning, I would assume that one can do something 
> proper to prevent large omap objects from being born by an MDS. What is it?

Please try the resolutions suggested in: https://tracker.ceph.com/issues/45333

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issue installing radosgw on debian 10

2021-08-27 Thread Dimitri Savineau
What ceph version is used for the cluster?

Because it looks like (according to the ceph config file) you're using
either Nautilus, Octopus or Pacific (due to the msgr v2 config).

But are you using the same version on your radosgw node?
The ceph-common and radosgw packages in buster [1][2] are based on Luminous,
which would explain why the configuration can't be parsed (Luminous doesn't
support msgr v2).
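
An easy way to compare, e.g.:

  ceph versions                  # on a cluster node, shows what the daemons run
  ceph --version                 # on the radosgw node, shows the local binaries
  dpkg -l ceph-common radosgw    # on the radosgw node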

[1] https://packages.debian.org/buster/ceph-common
[2] https://packages.debian.org/buster/radosgw

Regards,

Dimitri

On Fri, Aug 27, 2021 at 12:07 PM Francesco Piraneo G. 
wrote:

> Same error, but with the brackets! :-)
>
> # ceph -s
> server name not found: [v2:192.168.1.50:3300 (Name or service not known)
> unable to parse addrs in '[v2:192.168.1.50:3300,v1:192.168.1.50:6789],
> [v2:192.168.1.51:3300,v1:192.168.1.51:6789],
> [v2:192.168.1.52:3300,v1:192.168.1.52:6789]'
> [errno 22] error connecting to the cluster
>
>
> F.
>
> On 27.08.21 at 17:04, Dimitri Savineau wrote:
> > Can you try to update the `mon host` value with brackets ?
> >
> > mon host = [v2:192.168.1.50:3300,v1:192.168.1.50:6789],[v2:
> 192.168.1.51:3300
> > ,v1:192.168.1.51:6789],[v2:192.168.1.52:3300,v1:192.168.1.52:6789]
> >
> >
> https://docs.ceph.com/en/latest/rados/configuration/msgr2/#updating-ceph-conf-and-mon-host
> >
> > Regards,
> >
> > Dimitri
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Missing OSD in SSD after disk failure

2021-08-27 Thread David Orman
This was a bug in some versions of ceph, which has been fixed:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

You'll want to upgrade Ceph to resolve this behavior, or you can use
size or something else to filter if that is not possible.

David

On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle  wrote:
>
> Hi everyone!
> I've got a doubt, I tried searching for it in this list, but didn't find an 
> answer.
>
> I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk. The 
> deployment was done with "ceph orch apply deploy-osd.yaml", in which the file 
> "deploy-osd.yaml" contained the following:
> ---
> service_type: osd
> service_id: default_drive_group
> placement:
>   label: "osd"
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
>
> After the deployment, each HDD had an OSD and the NVMe shared the 4 OSDs, 
> plus the DB.
>
> A few days ago, an HDD broke and got replaced. Ceph detected the new disk and 
> created a new OSD for the HDD but didn't use the NVMe. Now the NVMe in that 
> server has 3 OSDs running but didn't add the new one. I couldn't find out how 
> to re-create the OSD with the exact configuration it had before. The only 
> "way" I found was to delete all 4 OSDs and create everything from scratch (I 
> didn't actually do it, as I hope there is a better way).
>
> Has anyone had this issue before? I'd be glad if someone pointed me in the 
> right direction.
>
> Currently running:
> Version
> 15.2.8
> octopus (stable)
>
> Thank you in advance and best regards,
> Eric
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Debian 11 Bullseye support

2021-08-27 Thread Gilles Mocellin
On Friday, 27 August 2021, 09:18:01 CEST, Francesco Piraneo G. wrote:
> > For this August Debian testing became a Debian stable with LTS support.
> > 
> > But I see that only sid repo exists, no testing and no new stable
> > bullseye.
> > 
> > May be some one knows, when there are plans to have a bullseye build?
> 
> I cannot answer to this question; however I hope that the maintainers
> makes a serious cleanup of the packages installed with ceph-manager:
> When installing it other funny (and futile) packages are installed, like:
> 
> - docker.io (can be removed manually with no effect);
> 
> - libgfortran5 (yes! the language!)
> 
> - fonts-lyx (Latex fonts...)
> 
> - ttf-bitstream-vera (TTF!!!)
> 
> I have just one question: Why?
> 
> F.

Aptitude can help you know why, with its "why" command!

On my workstation, for example:

$ aptitude why libgfortran5 
  

i   ardour Dépend libqm-dsp0 
i A libqm-dsp0 Dépend libatlas3-base 
i A libatlas3-base Dépend libgfortran5 (>= 8)





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Issue installing radosgw on debian 10

2021-08-27 Thread Francesco Piraneo G.

Hi all,

on a working test cluster I'm trying to install radosgw on a separate 
machine; the setup is as follows:


mon1...mon3 - 192.168.1.50 - 192.168.1.52

osd1...osd3 - 192.168.1.60 - 192.168.1.60 + Cluster network 172.16.0.0/16

radosgw - hostname - s3.anonicloud.test - IP: 192.168.1.70


Everything deployed under Debian 10.10.

Installed radosgw and ceph-common via apt; modified ceph.conf as follows 
on mon1 and propagated the modified file to all hosts and obviously to 
s3.anonicloud.test.


Side note: I run a local DNS for the anonicloud.test zone, and with dig 
the hostname resolves to the correct IP.



Also the ceph.client.admin.keyring has been copied to radosgw host.

However, when I try to ceph -s on the new host, this is the result:


# ceph -s
server name not found: v2:192.168.1.50:3300 (Name or service not known)
unable to parse addrs in 'v2:192.168.1.50:3300,v1:192.168.1.50:6789, 
v2:192.168.1.51:3300,v1:192.168.1.51:6789, 
v2:192.168.1.52:3300,v1:192.168.1.52:6789'

[errno 22] error connecting to the cluster


Obviously any other command like:


# ceph auth get-or-create client.rgw.`hostname -s` osd 'allow rwx' mon 
'allow rw' -o /var/lib/ceph/radosgw/ceph-rgw.`hostname -s`/keyring



...fails in the same way.


Hostname has been set as:


# cat /etc/hostname
s3.anonicloud.test

# cat /etc/resolv.conf
domain anonicloud.test
search anonicloud.test
nameserver 192.168.1.200 # This is my local DNS


# cat /etc/hosts
127.0.0.1    localhost
192.168.1.70    s3.anonicloud.test    s3

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters


Sidenote: IPv6 has been disabled on this machine.


Any help is appreciated.

Thanks, Francesco

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issue installing radosgw on debian 10

2021-08-27 Thread Francesco Piraneo G.
> Installed radosgw and ceph-common via apt; modified ceph.conf as 
follows on mon1 and propagated the modified file to all hosts and 
obviously to s3.anonicloud.test.


> Yes, I forgot the ceph.conf, sorry.


[global]
fsid = e79c0ace-b910-40af-ab2c-ae90fa4f5dd2
mon initial members = mon1, mon2, mon3
mon host = v2:192.168.1.50:3300,v1:192.168.1.50:6789, 
v2:192.168.1.51:3300,v1:192.168.1.51:6789, 
v2:192.168.1.52:3300,v1:192.168.1.52:6789

public network = 192.168.1.0/24
cluster_network = 172.16.0.0/16
auth cluster required = cephx
auth service required = cephx
auth client required = cephx
osd journal size = 1024
osd pool default size = 3
osd pool default min size = 2
osd pool default pg num = 333
osd pool default pgp num = 333
osd crush chooseleaf type = 1

[client.rgw.s3]
host = s3 # See note (1)
rgw frontends = "civetweb port=80"
rgw dns name = s3.anonicloud.test


(1) In the Red Hat doc I found that the output of `hostname -s` goes here; other 
pages on the net indicate the IP address instead... which is correct?


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph mds in death loop from client trying to remove a file

2021-08-27 Thread Pickett, Neale T
Well, the scan_links cleaned up all the duplicate inode messages, and now it's 
just crashing on:


 -5> 2021-08-25T16:23:54.996+ 7f8b088e4700 10 monclient: 
get_auth_request con 0x55e2cc18d400 auth_method 0
 -4> 2021-08-25T16:23:55.098+ 7f8b080e3700 10 monclient: 
get_auth_request con 0x55e2cc18dc00 auth_method 0
 -3> 2021-08-25T16:23:55.276+ 7f8b090e5700 10 monclient: 
get_auth_request con 0x55e2dae97400 auth_method 0
 -2> 2021-08-25T16:23:55.348+ 7f8b010d5700  4 mds.0.server 
handle_client_request client_request(client.26380166:232625 unlink 
#0x100028e3a23/SOME_FILENAME.zip 2021-08-25T16:15:22.665921+ RETRY=22 
caller_uid=0, caller_gid=0{}) v2
 -1> 2021-08-25T16:23:55.353+ 7f8b010d5700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mds/Server.cc:
 In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' 
thread 7f8b010d5700 time 2021-08-25T16:23:55.349217+
 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mds/Server.cc:
 7503: FAILED ceph_assert(in->first <= straydn->first)


I think the inode table has been completely reset at some point, although we've 
run scan_inodes at least twice. My guess is that there's some assertion making 
sure that the next "free" inode number is lower than the inode of some other 
object, and that's failing. I further assume what I need to do is bump the next 
free inode number to a really high value.

Am I guessing correctly?
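
(If so, I assume the command I'm looking for is something along the lines of 
cephfs-table-tool's take_inos, with the number purely illustrative:

  cephfs-table-tool 0 take_inos 10000000000

but I'd love confirmation before running it.)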

In case anybody is concerned, we aren't trying to put it back into service for 
anything other than rsync --delete. We're about to enter year 3 of the recovery 
effort!


From: Pickett, Neale T
Sent: Tuesday, August 24, 2021 15:23
To: Dan van der Ster
Cc: ceph-users@ceph.io
Subject: Re: [EXTERNAL] Re: [ceph-users] mds in death loop with [ERR] loaded 
dup inode XXX [2,head] XXX at XXX, but inode XXX already exists at XXX


Aha, I knew it was too short to be true. It seems like a client is trying to 
delete a file which is triggering all this.


There are many many lines looking like -5 and -4 here.


-5> 2021-08-24T21:17:38.293+ 7fe9a5b0b700  0 mds.0.cache.dir(0x609) 
_fetched  badness: got (but i already had) [inode 0x100028e2fe5 [...2,head] 
~mds0/stray3/100028e2fe5 auth v265487174 snaprealm=0x5608e16d3a00 s=14 nl=15 
n(v0 rc2021-05-25T15:33:03.443023+ b14 1=1+0) (iversion lock) 
0x5608e16d7180] mode 33188 mtime 2014-07-03T22:35:25.00+
-4> 2021-08-24T21:17:38.293+ 7fe9a5b0b700 -1 log_channel(cluster) log 
[ERR] : loaded dup inode 0x100028e2fe5 [2,head] v265540122 at 
~mds0/stray9/100028e2fe5, but inode 0x100028e2fe5.head v265487174 already 
exists at ~mds0/stray3/100028e2fe5
-3> 2021-08-24T21:17:38.307+ 7fe9a730e700  4 mds.0.server 
handle_client_request client_request(client.26332595:243938 unlink 
#0x100028e597b/1213982112.554772-6,128.165.213.35:2945,222.73.254.92:22.pcap 
2021-08-23T17:51:55.994745+ RETRY=209 caller_uid=0, caller_gid=0{}) v2
-2> 2021-08-24T21:17:38.307+ 7fe9a730e700 -1 log_channel(cluster) log 
[ERR] : unmatched rstat rbytes on single dirfrag 0x100028e597b, inode has 
n(v1963 rc2021-08-23T17:51:55.994745+ b2195090776 5729=5728+1), dirfrag has 
n(v1963 rc2021-08-23T17:51:55.994745+)




From: Pickett, Neale T
Sent: Tuesday, August 24, 2021 15:20
To: Dan van der Ster
Cc: ceph-users@ceph.io
Subject: Re: [EXTERNAL] Re: [ceph-users] mds in death loop with [ERR] loaded 
dup inode XXX [2,head] XXX at XXX, but inode XXX already exists at XXX


Full backtrace below. Seems pretty short for a ceph backtrace!


I'll get started on a link scan for the time being. It'll keep it from flapping 
in and out of CEPH_ERR!


-1> 2021-08-24T21:17:38.313+ 7fe9a730e700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mds/Server.cc:
 In function 'void Server::_unlink_local(MDRequestRef&, CDentry*, CDentry*)' 
thread 7fe9a730e700 time 2021-08-24T21:17:38.308917+
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.2.5/rpm/el8/BUILD/ceph-16.2.5/src/mds/Server.cc:
 7503: FAILED ceph_assert(in->first <= straydn->first)

 ceph version 16.2.5 (0883bdea7337b95e4b611c768c0279868462204a) pacific (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x158) [0x7fe9b4780b24]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x276d3e) [0x7fe9b4780d3e]
 3: (Server::_unlink_local(boost::intrusive_ptr&, CDentry*, 
CDentry*)+0x106a) [0x5608c0c338ba]

[ceph-users] Re: Issue installing radosgw on debian 10

2021-08-27 Thread Francesco Piraneo G.

Same error, but with the brackets! :-)

# ceph -s
server name not found: [v2:192.168.1.50:3300 (Name or service not known)
unable to parse addrs in '[v2:192.168.1.50:3300,v1:192.168.1.50:6789], 
[v2:192.168.1.51:3300,v1:192.168.1.51:6789], 
[v2:192.168.1.52:3300,v1:192.168.1.52:6789]'

[errno 22] error connecting to the cluster


F.

On 27.08.21 at 17:04, Dimitri Savineau wrote:

Can you try to update the `mon host` value with brackets ?

mon host = [v2:192.168.1.50:3300,v1:192.168.1.50:6789],[v2:192.168.1.51:3300
,v1:192.168.1.51:6789],[v2:192.168.1.52:3300,v1:192.168.1.52:6789]

https://docs.ceph.com/en/latest/rados/configuration/msgr2/#updating-ceph-conf-and-mon-host

Regards,

Dimitri
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RADOS + Crimson updates - August

2021-08-27 Thread Neha Ojha
Hi everyone,

I'd like to share a few updates on completed/ongoing RADOS and Crimson
projects with the community.

Significant PRs merged

- Remove allocation metadata from RocksDB - should significantly
improve small write performance
- PG Autoscaler scale-down profile - default in new clusters for
better performance out of the box (pending pacific release)
- Support in msgr 2.0 for on-wire compression for osd-osd communication
- BlueStore deferred writes behavior for large writes on spinners
(pending pacific release)
- Fix for the ceph-bluestore-tool reshard option (pending pacific release)
- New perf channel in telemetry to capture performance metrics (work
in progress)

Ongoing Projects

Crimson

- Rook integration with Crimson has started, Radek working on fixing
issues as they come up
- Seastore: lba rewrite merged
- Seastore: work continues on extent manager PR (prerequisite for
multi-device, tiering, and pmem)
- Seastore: several improvements to metrics, will be important for
evaluating options for performance improvements
- More interruptible_future stabilization

QoS

- More progress on QoS for background activities in the OSD. The
process of setting appropriate mclock parameters has been automated
and now happens once during OSD startup (pending pacific release)
- Ongoing work to capture better scrub statistics needed for QoS
- Work on client vs client QoS has started

Thanks,
Neha & Sam

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Issue installing radosgw on debian 10

2021-08-27 Thread Francesco Piraneo G.
So, now I discovered that Debian has its own ceph packages! By a silly 
mistake I added the ceph repository and installed ceph-common 
without doing an apt clean; apt update; apt upgrade -y, and this led to 
the whole cluster being on Pacific and just the radosgw on Luminous! 
:-/ For this reason ceph -s was not able to parse the ceph.conf file!


Once I ran the full update & upgrade as above, everything worked fine!

Thank you very much to have pointed me to the right direction.

Francesco


On 27.08.21 at 19:48, Dimitri Savineau wrote:

What ceph version is used for the cluster ?

Because it looks like (according to the ceph config file) that you're using
either Nautilus/Octopus/Pacific (due to the msgr v2 config)

But are you using the same version for your radosgw node ?
The ceph-common and radosgw packages in buster [1][2] are using Luminous so
that would explain why the configuration can't be parsed (doesn't support
msgr v2).

[1] https://packages.debian.org/buster/ceph-common
[2] https://packages.debian.org/buster/radosgw

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Missing OSD in SSD after disk failure

2021-08-27 Thread Eric Fahnle
Hi David! Very much appreciated your response.

I'm not sure that's the problem. I tried the following (without using 
"rotational"):

...(snip)...
data_devices:
   size: "15G:"
db_devices:
   size: ":15G"
filter_logic: AND
placement:
  label: "osdj2"
service_id: test_db_device
service_type: osd
...(snip)...

Without success. Also tried without the "filter_logic: AND" in the yaml file 
and the result was the same.

Best regards,
Eric


-Original Message-
From: David Orman [mailto:orma...@corenode.com] 
Sent: 27 August 2021 14:56
To: Eric Fahnle
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Missing OSD in SSD after disk failure

This was a bug in some versions of ceph, which has been fixed:

https://tracker.ceph.com/issues/49014
https://github.com/ceph/ceph/pull/39083

You'll want to upgrade Ceph to resolve this behavior, or you can use size or 
something else to filter if that is not possible.

David

On Thu, Aug 19, 2021 at 9:12 AM Eric Fahnle  wrote:
>
> Hi everyone!
> I've got a doubt, I tried searching for it in this list, but didn't find an 
> answer.
>
> I've got 4 OSD servers. Each server has 4 HDDs and 1 NVMe SSD disk. The 
> deployment was done with "ceph orch apply deploy-osd.yaml", in which the file 
> "deploy-osd.yaml" contained the following:
> ---
> service_type: osd
> service_id: default_drive_group
> placement:
>   label: "osd"
> data_devices:
>   rotational: 1
> db_devices:
>   rotational: 0
>
> After the deployment, each HDD had an OSD and the NVMe shared the 4 OSDs, 
> plus the DB.
>
> A few days ago, an HDD broke and got replaced. Ceph detected the new disk and 
> created a new OSD for the HDD but didn't use the NVMe. Now the NVMe in that 
> server has 3 OSDs running but didn't add the new one. I couldn't find out how 
> to re-create the OSD with the exact configuration it had before. The only 
> "way" I found was to delete all 4 OSDs and create everything from scratch (I 
> didn't actually do it, as I hope there is a better way).
>
> Has anyone had this issue before? I'd be glad if someone pointed me in the 
> right direction.
>
> Currently running:
> Version
> 15.2.8
> octopus (stable)
>
> Thank you in advance and best regards, Eric 
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an 
> email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io