[ceph-users] Re: Ceph Dashboard

2021-11-15 Thread Ernesto Puerta
Hi,

What was the error it threw? Did you intentionally set it up for HTTP? If
you're not using an L7 load balancer, you can still configure a reverse
proxy with HTTPS in both SSL passthrough and SSL termination modes, so no
need to turn HTTPS off.
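
For example, an SSL-termination snippet for nginx could look roughly like
this (just a sketch: the server_name, certificate paths and the dashboard
address are placeholders you would need to adapt):

    server {
        listen 443 ssl;
        server_name ceph-dashboard.example.com;

        ssl_certificate     /etc/nginx/certs/dashboard.crt;
        ssl_certificate_key /etc/nginx/certs/dashboard.key;

        location / {
            # the dashboard itself still speaks HTTPS on 8443
            proxy_pass https://<dashboard-host>:8443;
            proxy_set_header Host $host;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
            proxy_set_header X-Forwarded-Proto https;
        }
    }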

By default the Ceph Dashboard runs with HTTPS (8443), while the default
HTTP port is 8080. It looks like there might be a process already listening
to that port.

I suggest you check the mgr logs while reloading the dashboard and
provide any relevant data from there.
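
Something along these lines should tell you what is grabbing the port and
what the module complains about (paths and IDs below are just examples):

    ss -tlnp | grep 8443
    ceph mgr module disable dashboard
    ceph mgr module enable dashboard
    tail -f /var/log/ceph/ceph-mgr.<active-mgr>.log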

Kind Regards,
Ernesto


On Sun, Nov 14, 2021 at 7:56 AM Innocent Onwukanjo 
wrote:

> Hi!
>
> While trying to set a domain name for my company's ceph cluster, I used
> Nginx on another server to reverse proxy the public IP address of the
> dashboard and the port 8443. The domain name is from CloudFlare. The
> dashboard came up for HTTP only but threw error for HTTPS and I could not
> log in. So I removed the self signed certificate and disabled the
> dashboard.
> Re-Enabling the dashboard, I now get the error message:
>
> Error EIO: Module 'dashboard' has experienced an error and cannot handle
> commands: Timeout('Port 8443 not free on ceph.phpsandbox.io.',)
>
> Thanks.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: LVM support in Ceph Pacific

2021-11-15 Thread Janne Johansson
On Mon, 15 Nov 2021 at 10:18, MERZOUKI, HAMID wrote:
> Thanks for your answers Janne Johansson,
> "I'm sure you can if you do it even more manually with ceph-volume, but there 
> should seldom be a need to"
> Why do you think "there should seldom be a need to" ?

I meant this as a response to how to handle something like:
"I have one drive with only pvcreate run on it, one with pvcreate,
vgcreate and lastly one with pvcreate,vgcreate and lvcreate run on it
and a named LV to use for OSD data and I want the auto-setup tools to
handle this and make the required steps to make WAL on the first, DB
on the second and have data on the third"

If -for any reason- you have such a setup and these kinds of demands,
you are probably doing it wrong or you get to do it all 100% manually
for this kind of weird setup.
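
If you really do want to go that route, the fully manual version would be
something along these lines (only a sketch, the VG/LV names are made up
and you would have to create them yourself first):

  ceph-volume lvm prepare --bluestore \
      --data data_vg/data_lv \
      --block.db db_vg/db_lv \
      --block.wal wal_vg/wal_lv
  ceph-volume lvm activate --all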

As soon as you start growing into something resembling real world ceph
cluster usage with some 5-10-15 OSD hosts with X drives on them each,
you will probably get into a situation where drives actually are empty
when you get them and where the defaults and auto-setup scripts will
work out fine without having to think a lot about these things.

> "Yes, upgrades do not contain LVM management, as far as I have ever seen."
> But there will be problems if later one existent OSD must be totally 
> recreated, won't it ?

'Totally recreated' just means you get to run "sgdisk -Z" or
"ceph-volume lvm zap" once before remaking a drive into an OSD again,
adding this step to your setup routine is very simple in those cases.
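
For example (just a sketch, the device name is whatever your drive is):

  ceph-volume lvm zap --destroy /dev/sdX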

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adding a RGW realm to a single cephadm-managed ceph cluster

2021-11-15 Thread Eugen Block

Hi,

it's not entirely clear what your setup looks like: are you trying to
set up multiple RGW containers on the same host(s) to serve multiple
realms, or do you have separate RGWs for that?
You can add a second realm with a spec file or via the CLI (which you
already did). If you want to create multiple RGW containers per host
you need to specify a different port for every RGW; see the docs [1]
for some examples.


This worked just fine in my Octopus lab except for a little mistake in  
the "port" spec, apparently this


spec:
  port: 8000

doesn't work:

host1:~ # ceph orch apply -i rgw2.yaml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword  
argument 'port'


But this does:

spec:
  rgw_frontend_port: 8000

Now I have two RGW containers on each host, serving two different realms.
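
For reference, a full spec for the second realm could look something like
this (the realm/zone names, hosts and port here are just placeholders):

service_type: rgw
service_id: realm2.zone2
placement:
  hosts:
    - host1
    - host2
spec:
  rgw_realm: realm2
  rgw_zone: zone2
  rgw_frontend_port: 8000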


[1] https://docs.ceph.com/en/latest/cephadm/services/rgw/

Quoting J-P Methot:


Hi,

I'm testing out adding a second RGW realm to my single ceph cluster.  
This is not very well documented though, since obviously realms were  
designed for multi-site deployments.


Now, what I can't seem to figure out is whether I need to deploy a container
with cephadm to act as a frontend for this second realm and, if so,
how? I set a frontend port and address when I created the second
realm, but my attempts at creating an RGW container for that realm
didn't work at all, with the container just not booting up.



--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Dashboard

2021-11-15 Thread Innocent Onwukanjo
Hi Ernesto. Thanks sooo much for the reply. I am actually new to Ceph and
you are right. I will try that out right now. But please, I wanted to know
whether using a reverse proxy to map the Ceph Dashboard to a domain name is
the way to go?
Lastly, I set up my subdomain name using CloudFlare. When I run a dig on the
subdomain name I get 2 IP addresses which are not related to the public IP
address where my Ceph Dashboard runs.
Is this normal? Because when I ping the subdomain I get the IP address of
the instance.
Thanks

On Mon, 15 Nov 2021, 10:10 Ernesto Puerta,  wrote:

> Hi,
>
> What was the error it threw? Did you intentionally set it up for HTTP? If
> you're not using a L7 load balancer, you can still configure a reverse
> proxy with HTTPS in both SSL passthrough and SSL termination modes, so no
> need to turn HTTPS off.
>
> By default the Ceph Dashboard runs with HTTPS (8443), while the default
> HTTP port is 8080. It looks like there might be a process already listening
> to that port.
>
> I suggest you check the mgr logs  while reloading it and
> provide any relevant data from there.
>
> Kind Regards,
> Ernesto
>
>
> On Sun, Nov 14, 2021 at 7:56 AM Innocent Onwukanjo 
> wrote:
>
>> Hi!
>>
>> While trying to set a domain name for my company's ceph cluster, I used
>> Nginx on another server to reverse proxy the public IP address of the
>> dashboard and the port 8443. The domain name is from CloudFlare. The
>> dashboard came up for HTTP only but threw error for HTTPS and I could not
>> log in. So I removed the self signed certificate and disabled the
>> dashboard.
>> Re-Enabling the dashboard, I now get the error message:
>>
>> Error EIO: Module 'dashboard' has experienced an error and cannot handle
>> commands: Timeout('Port 8443 not free on ceph.phpsandbox.io.',)
>>
>> Thanks.
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cheap M.2 2280 SSD for Ceph

2021-11-15 Thread Varun Priolkar
Thank you!

It is hard for me to find that particular model. Kingston DC1000B is
readily available and not very expensive. Would this work well?

https://www.kingston.com/us/ssd/dc1000b-data-center-boot-ssd

It is marketed as a boot disk. Would it be OK for a home lab?

Regards,

Varun Priolkar

On Mon, 15 Nov, 2021, 4:19 am Eneko Lacunza,  wrote:

> Hi Varun,
>
> On 14/11/21 at 9:55, Varun Priolkar wrote:
> > Hello,
> >
> > I am a home user trying to build a homelab for Ceph+KVM VMs. I have
> > acquired 3 Lenovo Tiny M920q systems for this purpose. I am trying to
> > select not so expensive M.2 NVMe or M.2 SATA SSDs, but it is hard to
> > find information on which drives would have OK performance with Ceph.
> > My systems have a 1x M.2 2280 slot. The 2.5" SATA slot would be
> > unusable after I replace it with a 10/40Gbe network card.
> >
> > >From what I read online, I need to look for drives with PLP. Can
> > someone please recommend some M.2 drives to me? I am fine with buying
> > used and my budget is 200 CAD or around 160 USD per drive. Would
> > prefer capacity close to 1TB as usable capacity would be the same
> > divided by 3.
>
> Any "datacenter" SSD will do, for example:
>
>
> https://ark.intel.com/content/www/us/en/ark/products/134913/intel-ssd-d3s4510-series-960gb-m-2-80mm-sata-6gbs-3d2-tlc.html
>
> Cheers
>
>
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
>
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cheap M.2 2280 SSD for Ceph

2021-11-15 Thread Marius Leustean
Hi Varun,

I'm not an expert in SSD drives, but since you wish to build a home lab,
Ceph OSDs just need a "block device" to set up BlueStore on.
So any SSD that's recognized by your system should work fine.

On Mon, Nov 15, 2021 at 3:34 PM Varun Priolkar  wrote:

> Thank you!
>
> It is hard for me to find that particular model. Kingston DC1000B is
> readily available and not very expensive. Would this work well?
>
> https://www.kingston.com/us/ssd/dc1000b-data-center-boot-ssd
>
> It is marketed as a boot disk. Would it be OK for a home lab?
>
> Regards,
>
> Varun Priolkar
>
> On Mon, 15 Nov, 2021, 4:19 am Eneko Lacunza,  wrote:
>
> > Hi Varun,
> >
> > On 14/11/21 at 9:55, Varun Priolkar wrote:
> > > Hello,
> > >
> > > I am a home user trying to build a homelab for Ceph+KVM VMs. I have
> > > acquired 3 Lenovo Tiny M920q systems for this purpose. I am trying to
> > > select not so expensive M.2 NVMe or M.2 SATA SSDs, but it is hard to
> > > find information on which drives would have OK performance with Ceph.
> > > My systems have a 1x M.2 2280 slot. The 2.5" SATA slot would be
> > > unusable after I replace it with a 10/40Gbe network card.
> > >
> > > >From what I read online, I need to look for drives with PLP. Can
> > > someone please recommend some M.2 drives to me? I am fine with buying
> > > used and my budget is 200 CAD or around 160 USD per drive. Would
> > > prefer capacity close to 1TB as usable capacity would be the same
> > > divided by 3.
> >
> > Any "datacenter" SSD will do, for example:
> >
> >
> >
> https://ark.intel.com/content/www/us/en/ark/products/134913/intel-ssd-d3s4510-series-960gb-m-2-80mm-sata-6gbs-3d2-tlc.html
> >
> > Cheers
> >
> >
> > Eneko Lacunza
> > Zuzendari teknikoa | Director técnico
> > Binovo IT Human Project
> >
> > Tel. +34 943 569 206 | https://www.binovo.es
> > Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> >
> > https://www.youtube.com/user/CANALBINOVO
> > https://www.linkedin.com/company/37269706/
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cheap M.2 2280 SSD for Ceph

2021-11-15 Thread Mario Giammarco
You can also use consumer drives, considering that it is a home lab.
Otherwise try to find a Seagate Nytro XM1441 or XM1440.
Mario

On Mon, 15 Nov 2021 at 14:59, Eneko Lacunza wrote:

> Hi Varun,
>
> That Kingston DC grade model should work (well enough at least for a
> home lab), it has PLP.  Note I haven't used that model.
>
> Just avoid consumer drives.
>
> Cheers
>
> On 15/11/21 at 14:35, Varun Priolkar wrote:
> > Thank you!
> >
> > It is hard for me to find that particular model. Kingston DC1000B is
> > readily available and not very expensive. Would this work well?
> >
> > https://www.kingston.com/us/ssd/dc1000b-data-center-boot-ssd
> > 
> >
> > It is marketed as a boot disk. Would it be OK for a home lab?
> >
> > Regards,
> >
> > Varun Priolkar
> >
> > On Mon, 15 Nov, 2021, 4:19 am Eneko Lacunza wrote:
> >
> > Hi Varun,
> >
> > On 14/11/21 at 9:55, Varun Priolkar wrote:
> > > Hello,
> > >
> > > I am a home user trying to build a homelab for Ceph+KVM VMs. I have
> > > acquired 3 Lenovo Tiny M920q systems for this purpose. I am
> > trying to
> > > select not so expensive M.2 NVMe or M.2 SATA SSDs, but it is hard
> to
> > > find information on which drives would have OK performance with
> > Ceph.
> > > My systems have a 1x M.2 2280 slot. The 2.5" SATA slot would be
> > > unusable after I replace it with a 10/40Gbe network card.
> > >
> > > >From what I read online, I need to look for drives with PLP. Can
> > > someone please recommend some M.2 drives to me? I am fine with
> > buying
> > > used and my budget is 200 CAD or around 160 USD per drive. Would
> > > prefer capacity close to 1TB as usable capacity would be the same
> > > divided by 3.
> >
> > Any "datacenter" SSD will do, for example:
> >
> >
> https://ark.intel.com/content/www/us/en/ark/products/134913/intel-ssd-d3s4510-series-960gb-m-2-80mm-sata-6gbs-3d2-tlc.html
> >
> > Cheers
> >
> >
> > Eneko Lacunza
> > Zuzendari teknikoa | Director técnico
> > Binovo IT Human Project
> >
> > Tel. +34 943 569 206 | https://www.binovo.es 
> > Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> >
> > https://www.youtube.com/user/CANALBINOVO
> > 
> > https://www.linkedin.com/company/37269706/
> > 
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > 
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> > 
> >
>
> Eneko Lacunza
> CTO | Zuzendari teknikoa
> Binovo IT Human Project
>
> Tel. +34 943 569 206
> elacu...@binovo.es
> https://www.binovo.es
> Astigarragako Bidea, 2 - 2 izda. Oficina 10-11, 20180 Oiartzun
>
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Anybody else hitting ceph_assert(is_primary()) in PrimaryLogPG::on_local_recover during upgrades?

2021-11-15 Thread Tobias Urdin
Hello,

Is anybody else repeatedly hitting the ceph_assert(is_primary()) in
PrimaryLogPG::on_local_recover [1] when upgrading?
I've been hit by this multiple times now on Octopus, on both 15.2.11 and
15.2.15, and it is just very annoying.

Been trying to collect as much information as possible over there.

 ceph version 15.2.15 (2dfb18841cfecc2f7eb7eb2afd65986ca4d95985) octopus (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14c) [0x55a8cc3e8419]
 2: (()+0x4df5e1) [0x55a8cc3e85e1]
 3: (PrimaryLogPG::on_local_recover(hobject_t const&, ObjectRecoveryInfo const&, std::shared_ptr, bool, ceph::os::Transaction*)+0x30c) [0x55a8cc5b6ebc]
 4: (ReplicatedBackend::handle_push(pg_shard_t, PushOp const&, PushReplyOp*, ceph::os::Transaction*, bool)+0x3a2) [0x55a8cc7e2622]
 5: (ReplicatedBackend::_do_push(boost::intrusive_ptr)+0x243) [0x55a8cc7e29e3]
 6: (ReplicatedBackend::_handle_message(boost::intrusive_ptr)+0x298) [0x55a8cc7eb548]
 7: (PGBackend::handle_message(boost::intrusive_ptr)+0x4a) [0x55a8cc68732a]
 8: (PrimaryLogPG::do_request(boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x5cb) [0x55a8cc62d47b]
 9: (OSD::dequeue_op(boost::intrusive_ptr, boost::intrusive_ptr, ThreadPool::TPHandle&)+0x2f9) [0x55a8cc4ccb69]
 10: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr&, ThreadPool::TPHandle&)+0x69) [0x55a8cc708609]
 11: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x143a) [0x55a8cc4e844a]
 12: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b6) [0x55a8ccad4206]
 13: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55a8ccad6d50]
 14: (()+0x7ea5) [0x7f3ed8669ea5]
 15: (clone()+0x6d) [0x7f3ed752c9fd]

[1] https://tracker.ceph.com/issues/50608

Best regards
Tobias
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Dashboard

2021-11-15 Thread Ernesto Puerta
Hi Innocent,

Yes, a reverse proxy should work and in general it's not a bad idea when
you're exposing Ceph Dashboard to a public network. You'll also have to
manually update the "GRAFANA_FRONTEND_API_URL" option ("ceph dashboard
set-grafana-frontend-api-url ") with the public facing URL (instead of
the internal domain URL).
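
For example (the URL below is just a placeholder for whatever your proxy
exposes):

    ceph dashboard set-grafana-frontend-api-url https://ceph.example.com/grafana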

Regarding the Cloudflare configuration, I cannot help you there.

Kind Regards,
Ernesto


On Mon, Nov 15, 2021 at 1:12 PM Innocent Onwukanjo 
wrote:

> Hi Ernesto. Thanks sooo much for the reply. I am actually new to ceph and
> you are right. I would try that out right now. But please I wanted to know
> if using a reverse proxy method to map ceph Dashboard to a domain name is
> the way to go?
> Lastly, I setup my sub domain name using CloudFlare. Wen I run a dig on
> the subdomain name I get 2 IP addresses which are not related to the public
> IP address where my ceph dashboard runs.
> Is this normal? Because when I ping the subdomain I get the IP address of
> the instance.
> Thanks
>
> On Mon, 15 Nov 2021, 10:10 Ernesto Puerta,  wrote:
>
>> Hi,
>>
>> What was the error it threw? Did you intentionally set it up for HTTP? If
>> you're not using a L7 load balancer, you can still configure a reverse
>> proxy with HTTPS in both SSL passthrough and SSL termination modes, so no
>> need to turn HTTPS off.
>>
>> By default the Ceph Dashboard runs with HTTPS (8443), while the default
>> HTTP port is 8080. It looks like there might be a process already listening
>> to that port.
>>
>> I suggest you check the mgr logs  while reloading it and
>> provide any relevant data from there.
>>
>> Kind Regards,
>> Ernesto
>>
>>
>> On Sun, Nov 14, 2021 at 7:56 AM Innocent Onwukanjo 
>> wrote:
>>
>>> Hi!
>>>
>>> While trying to set a domain name for my company's ceph cluster, I used
>>> Nginx on another server to reverse proxy the public IP address of the
>>> dashboard and the port 8443. The domain name is from CloudFlare. The
>>> dashboard came up for HTTP only but threw error for HTTPS and I could not
>>> log in. So I removed the self signed certificate and disabled the
>>> dashboard.
>>> Re-Enabling the dashboard, I now get the error message:
>>>
>>> Error EIO: Module 'dashboard' has experienced an error and cannot handle
>>> commands: Timeout('Port 8443 not free on ceph.phpsandbox.io.',)
>>>
>>> Thanks.
>>> ___
>>> ceph-users mailing list -- ceph-users@ceph.io
>>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>>
>>>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mClock scheduler

2021-11-15 Thread Neha Ojha
Hi Luis,

On Mon, Nov 15, 2021 at 4:57 AM Luis Domingues  wrote:
>
> Hi,
>
> We are currently testing the mclock scheduler in a Ceph Pacific
> cluster. We did not test it heavily, but at first glance it looks good on our
> installation. Probably better than wpq. But we still have a few questions
> regarding mclock.

I am very glad to hear this.

>
> Is it ready for production, is it safe to replace wpq yet, or is it too soon
> with Pacific?

I am not aware of any particular issues that users have run into, but
I am also not sure how many have tried the mclock scheduler.
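
For anyone else who wants to try it out, switching the scheduler is
roughly something like this (note that changing osd_op_queue requires an
OSD restart to take effect, and the profile names are the Pacific ones):

  ceph config set osd osd_op_queue mclock_scheduler
  ceph config set osd osd_mclock_profile high_client_ops   # or balanced / high_recovery_ops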

>
> Will it be the default in Quincy? I found in the "latest" docs that it is the
> default, but I am not sure that it will be in Quincy.

Yes, it will be the default in Quincy.

Thanks,
Neha


>
> Thanks,
> Luis Domingues
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to minimise the impact of compaction in ‘rocksdb options’?

2021-11-15 Thread Mark Nelson

Hi,


Compaction can block reads, but on the write path you should be able to
absorb a certain amount of writes via the WAL before RocksDB starts
throttling writes.  The larger and more numerous the WAL buffers, the more
writes you can absorb, but bigger buffers also take more CPU to keep in
sorted order, and more aggregate buffer space uses more RAM, so it's a
double-edged sword.  I'd suggest looking at how much time you actually
spend in compaction.  For clusters that primarily serve block via
RBD, there's a good chance it's actually fairly minimal.  For RGW
(especially with lots of small objects and/or erasure coding) you
might be spending more time in compaction, but it's important to see how
much.
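
If you want to double-check what an OSD is actually running with,
something like this shows the effective RocksDB settings (osd.0 is just
an example):

    ceph config show osd.0 | grep rocksdb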



FWIW, you can try running the following script against your OSD log to 
see a summary of compaction events:


https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py
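
I believe it just takes the OSD log as its argument, so the invocation
would be something like this (the log path depends on your deployment):

    python3 ceph_rocksdb_log_parser.py /var/log/ceph/ceph-osd.0.log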


Mark


On 11/15/21 10:48 AM, Szabo, Istvan (Agoda) wrote:

Hello,

If I'm not mistaken, in my cluster this can block IO on the OSDs if there is a
huge amount of objects on that specific OSD.

How can I change the values to minimise the impact?

I guess it needs an OSD restart to take effect and the "rocksdb options" are
the values that need to be tuned, but what should be changed?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph Dashboard

2021-11-15 Thread Innocent Onwukanjo
Awesome, thanks Ernesto! It's working now.

On Mon, 15 Nov 2021, 17:04 Ernesto Puerta,  wrote:

> Hi Innocent,
>
> Yes, a reverse proxy should work and in general it's not a bad idea when
> you're exposing Ceph Dashboard to a public network. You'll also have to
> manually update the "GRAFANA_FRONTEND_API_URL" option ("ceph dashboard
> set-grafana-frontend-api-url ") with the public facing URL (instead
> of the internal domain URL).
>
> Regarding the Cloudflare configuration, I cannot help you there.
>
> Kind Regards,
> Ernesto
>
>
> On Mon, Nov 15, 2021 at 1:12 PM Innocent Onwukanjo 
> wrote:
>
>> Hi Ernesto. Thanks sooo much for the reply. I am actually new to ceph and
>> you are right. I would try that out right now. But please I wanted to know
>> if using a reverse proxy method to map ceph Dashboard to a domain name is
>> the way to go?
>> Lastly, I setup my sub domain name using CloudFlare. Wen I run a dig on
>> the subdomain name I get 2 IP addresses which are not related to the public
>> IP address where my ceph dashboard runs.
>> Is this normal? Because when I ping the subdomain I get the IP address of
>> the instance.
>> Thanks
>>
>> On Mon, 15 Nov 2021, 10:10 Ernesto Puerta,  wrote:
>>
>>> Hi,
>>>
>>> What was the error it threw? Did you intentionally set it up for HTTP?
>>> If you're not using a L7 load balancer, you can still configure a reverse
>>> proxy with HTTPS in both SSL passthrough and SSL termination modes, so no
>>> need to turn HTTPS off.
>>>
>>> By default the Ceph Dashboard runs with HTTPS (8443), while the default
>>> HTTP port is 8080. It looks like there might be a process already listening
>>> to that port.
>>>
>>> I suggest you check the mgr logs  while reloading it and
>>> provide any relevant data from there.
>>>
>>> Kind Regards,
>>> Ernesto
>>>
>>>
>>> On Sun, Nov 14, 2021 at 7:56 AM Innocent Onwukanjo 
>>> wrote:
>>>
 Hi!

 While trying to set a domain name for my company's ceph cluster, I used
 Nginx on another server to reverse proxy the public IP address of the
 dashboard and the port 8443. The domain name is from CloudFlare. The
 dashboard came up for HTTP only but threw error for HTTPS and I could
 not
 log in. So I removed the self signed certificate and disabled the
 dashboard.
 Re-Enabling the dashboard, I now get the error message:

 Error EIO: Module 'dashboard' has experienced an error and cannot handle
 commands: Timeout('Port 8443 not free on ceph.phpsandbox.io.',)

 Thanks.
 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] This week: Ceph User + Dev Monthly Meetup

2021-11-15 Thread Neha Ojha
Hi everyone,

This event is happening on November 18, 2021, 15:00-16:00 UTC. This
is an hour later than what I had sent in my earlier email (I hadn't
accounted for the daylight savings change, sorry!); the calendar invite
reflects the new time.

Thanks,
Neha

On Thu, Oct 28, 2021 at 11:53 AM Neha Ojha  wrote:
>
> Hi everyone,
>
> We are kicking off a new monthly meeting for Ceph users to directly
> interact with Ceph Developers. The high-level aim of this meeting is
> to provide users with a forum to:
>
> - share their experience running Ceph clusters
> - provide feedback on Ceph versions they are using
> - ask questions and raise concerns on any matters related to Ceph
> - provide documentation feedback and suggest improvements
>
> Note that this is not a meeting to discuss design ideas or feature
> improvements, we'll continue to use existing CDMs [0] for such
> discussions.
>
> The meeting details have been added to the community calendar [1]. The
> first meeting will be held on November 18, 2021, 14:00-15:00 UTC and
> the agenda is here:
> https://pad.ceph.com/p/ceph-user-dev-monthly-minutes
>
> Hope to see you there!
>
> Thanks,
> Neha
>
> [0] https://tracker.ceph.com/projects/ceph/wiki/Planning
> [1] 
> https://calendar.google.com/calendar/u/0/embed?src=9ts9c7lt7u1vic2ijvvqqlf...@group.calendar.google.com

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSDs get killed by OOM when other host goes down

2021-11-15 Thread Marius Leustean
I upgraded all the OSDs + mons to Pacific 16.2.6
All PGs have been active+clean for the last few days, but memory usage is
still quite high:

"osd_pglog": {

"items": 35066835,

"bytes": 3663079160 (3.6 GB)

},

"buffer_anon": {

"items": 346531,

"bytes": 83548293 (0.08 GB)

},

"total": {

"items": 123515722,

"bytes": 7888791573 (7.88 GB)

}


However, docker stats reports 38 GB for that container.

There is a huge gap between the RAM being used by the container and what
"ceph daemon osd.xxx dump_mempools" reports.

How can I check if trimming happens?

How can I check what else is consuming memory in the ceph-osd process?
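(I assume something like "ceph tell osd.xxx heap stats" and
"ceph tell osd.xxx heap release" is the way to look at the tcmalloc side,
and "ceph config get osd osd_memory_target" to confirm the target, but I'm
not sure that covers everything.)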

On Mon, Nov 15, 2021 at 3:50 PM Josh Baergen 
wrote:

> Hi Istvan,
>
> > So this means if we are doing some operation which involved recovery, we
> should not do another one until this trimming not done yet? Let's say I've
> added new host with full of drives, once the rebalance finished, we should
> leave the cluster to trim osdmap before I add another host?
>
> Ah, no, sorry if I gave the wrong impression. If you have Nautilus
> 14.2.12+, Octopus 15.2.5+, or Pacific, then, as long as you don't have
> any down+in OSDs, osdmaps should be trimmed.
>
> Josh
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSD spend too much time on "waiting for readable" -> slow ops -> laggy pg -> rgw stop -> worst case osd restart

2021-11-15 Thread Sage Weil
Okay, I traced one slow op through the logs, and the problem was that the
PG was laggy.  That happened because of the osd.122 that you stopped, which
was marked down in the OSDMap but *not* dead.  It looks like that happened
because the OSD took the 'clean shutdown' path instead of the fast stop.

Have you tried enabling osd_fast_shutdown = true *after* you fixed the
require_osd_release to octopus?  It would have led to slow requests when
you tested before because the new dead_epoch field in the OSDMap that the
read leases rely on was not being encoded, making peering wait for the read
lease to time out even though the stopped osd really died.

I'm not entirely sure if this is the same cluster as the earlier one.. but
given the logs you sent, my suggestion is to enable osd_fast_shutdown and
try again.  If you still get slow requests, can you capture the logs again?
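
Something like this should do it (just a sketch; adjust the config section
to wherever you set it before, which looked like "global" in your dump):

  ceph osd require-osd-release octopus            # if not already done
  ceph config set global osd_fast_shutdown true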

Thanks!
sage


On Fri, Nov 12, 2021 at 7:33 AM Manuel Lausch 
wrote:

> Hi Sage,
>
> I uploaded a lot of debug logs from the OSDs and Mons:
> ceph-post-file: 4ebc2eeb-7bb1-48c4-bbfa-ed581faca74f
>
> At 13:24:25 I stopped OSD 122 and one Minute later I started it again.
> In both cases I got slow ops.
>
> Currently I running the upstream Version (without crude patches)
> ceph version 16.2.6 (ee28fb57e47e9f88813e24bbf4c14496ca299d31) pacific
> (stable)
>
> I hope you can work with it.
>
>
> here the current config
>
> # ceph config dump
> WHO     MASK  LEVEL     OPTION                                          VALUE  RO
> global        advanced  osd_fast_shutdown                               false
> global        advanced  osd_fast_shutdown_notify_mon                    false
> global        dev       osd_pool_default_read_lease_ratio               0.80
> global        advanced  paxos_propose_interval                          1.00
>   mon         advanced  auth_allow_insecure_global_id_reclaim           true
>   mon         advanced  mon_warn_on_insecure_global_id_reclaim          false
>   mon         advanced  mon_warn_on_insecure_global_id_reclaim_allowed  false
>   mgr         advanced  mgr/balancer/active                             true
>   mgr         advanced  mgr/balancer/mode                               upmap
>   mgr         advanced  mgr/balancer/upmap_max_deviation                1
>   mgr         advanced  mgr/progress/enabled                            false  *
>   osd         dev       bluestore_fsck_quick_fix_on_mount               true
>
> # cat /etc/ceph/ceph.conf
> [global]
> # The following parameters are defined in the service.properties like
> below
> # ceph.conf.globa.osd_max_backfills: 1
>
>
>   bluefs bufferd io = true
>   bluestore fsck quick fix on mount = false
>   cluster network = 10.88.26.0/24
>   fsid = 72ccd9c4-5697-478c-99f6-b5966af278c6
>   max open files = 131072
>   mon host = 10.88.7.41 10.88.7.42 10.88.7.43
>   mon max pg per osd = 600
>   mon osd down out interval = 1800
>   mon osd down out subtree limit = host
>   mon osd initial require min compat client = luminous
>   mon osd min down reporters = 2
>   mon osd reporter subtree level = host
>   mon pg warn max object skew = 100
>   osd backfill scan max = 16
>   osd backfill scan min = 8
>   osd deep scrub stride = 1048576
>   osd disk threads = 1
>   osd heartbeat min size = 0
>   osd max backfills = 1
>   osd max scrubs = 1
>   osd op complaint time = 5
>   osd pool default flag hashpspool = true
>   osd pool default min size = 1
>   osd pool default size = 3
>   osd recovery max active = 1
>   osd recovery max single start = 1
>   osd recovery op priority = 3
>   osd recovery sleep hdd = 0.0
>   osd scrub auto repair = true
>   osd scrub begin hour = 5
>   osd scrub chunk max = 1
>   osd scrub chunk min = 1
>   osd scrub during recovery = true
>   osd scrub end hour = 23
>   osd scrub load threshold = 1
>   osd scrub priority = 1
>   osd scrub thread suicide timeout = 0
>   osd snap trim priority = 1
>   osd snap trim sleep = 1.0
>   public network = 10.88.7.0/24
>
> [mon]
>   mon allow pool delete = false
>   mon health preluminous compat warning = false
>   osd pool default flag hashpspool = true
>
>
>
>
> On Thu, 11 Nov 2021 09:16:20 -0600
> Sage Weil  wrote:
>
> > Hi Manuel,
> >
> > Before giving up and putting in an off switch, I'd like to understand
> > why it is taking as long as it is for the PGs to go active.
> >
> > Would you consider enabling debug_osd=10 and debug_ms=1 on your OSDs,
> > and debug_mon=10 + debug_ms=1 on the mons, and reproducing this
> > (without the patch applied this time of course!)?  The logging will
> > slow things down a bit but hopefully the behavior will be close
> > enough to what you see normally that we can tell what is going on
> > (and presumably picking out the pg that was most laggy will highlight
> > the source(s) of the delay).
> >
> > sage
> >
> > On Wed, Nov 10, 2021 at 4:41 AM Manuel Lausch 
> > wrote:
> >
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Adding a RGW realm to a single cephadm-managed ceph cluster

2021-11-15 Thread Eugen Block

Hi,


Couldn't init storage provider (RADOS)


I usually see this when my rgw config is wrong, can you share your rgw  
spec(s)?
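
To get at the actual RGW error you could also pull the daemon's own log,
something like this (the daemon name is taken from your paste; you may
need to pass the fsid as well):

  cephadm logs --name rgw.test.test.ceph-monitor1.twltpl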



Quoting J-P Methot:

After searching through the logs and digging into how cephadm works,
I've figured out that when cephadm tries to create the new systemd
service and launch the new RGW container, it fails with this error:


Couldn't init storage provider (RADOS)

Systemd then complains that the service doesn't exist :

ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.twltpl.service: Main process exited, code=exited,  
status=5/NOTINSTALLED


I can't seem to find further logs anywhere explaining why the storage
provider fails to init. Googling indicates issues with the number of
PGs in past versions, but I believe that shouldn't be an issue here,
since the number of PGs auto-adjusts. How could I find out why this
happens?


On 11/15/21 11:02 AM, J-P Methot wrote:
Yes, I'm trying to add a RGW container on a second port on the same  
server. For example, I do :


ceph orch apply rgw test test --placement="ceph-monitor1:[10.50.47.3:]"

and this results in :

ceph orch ls

NAME           RUNNING  REFRESHED  AGE  PLACEMENT                    IMAGE NAME               IMAGE ID
rgw.test.test  0/1      2s ago     5s   ceph-monitor1:[10.50.47.3:]  docker.io/ceph/ceph:v15  <unknown>



the image and container ID being unknown is making me scratch my  
head. A look in the log files show this:


2021-11-15 10:50:12,253 INFO Deploy daemon  
rgw.test.test.ceph-monitor1.rtoiwh ...
2021-11-15 10:50:12,254 DEBUG Running command: /usr/bin/docker run  
--rm --ipc=host --net=host --entrypoint stat -e  
CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph-monitor1  
docker.io/ceph/ceph:v15 -c %u %g /var/lib/ceph

2021-11-15 10:50:12,452 DEBUG stat: stdout 167 167
2021-11-15 10:50:12,525 DEBUG Running command: install -d -m0770 -o  
167 -g 167 /var/run/ceph/04c5d4a4-8815-45fb-b97f-027252d1aea5

2021-11-15 10:50:12,534 DEBUG Running command: systemctl daemon-reload
2021-11-15 10:50:12,869 DEBUG Running command: systemctl stop  
ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.rtoiwh
2021-11-15 10:50:12,879 DEBUG Running command: systemctl  
reset-failed  
ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.rtoiwh
2021-11-15 10:50:12,884 DEBUG systemctl: stderr Failed to reset  
failed state of unit  
ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.rtoiwh.service: Unit ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.rtoiwh.service not  
loaded


journalctl -xe shows the service entered failed state, without any  
real useful information


Nov 15 10:50:24 ceph-monitor1 systemd[1]:  
ceph-04c5d4a4-8815-45fb-b97f-027252d1a...@rgw.test.test.ceph-monitor1.rtoiwh.service: Failed with result  
'exit-code'.

-- Subject: Unit failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support


What I understand from this is that I'm doing the right thing, it's  
just my cephadm that's breaking, somehow.


On 11/15/21 5:59 AM, Eugen Block wrote:

Hi,

it's not entirely clear how your setup looks like, are you trying  
to setup multiple RGW containers on the same host(s) to serve  
multiple realms or do you have multiple RGWs for that?
You can add a second realm with a spec file or via cli (which you  
already did). If you want to create multiple RGW containers per  
host you need to specify a different port for every RGW, see the  
docs [1] for some examples.


This worked just fine in my Octopus lab except for a little  
mistake in the "port" spec, apparently this


spec:
  port: 8000

doesn't work:

host1:~ # ceph orch apply -i rgw2.yaml
Error EINVAL: ServiceSpec: __init__() got an unexpected keyword  
argument 'port'


But this does:

spec:
  rgw_frontend_port: 8000

Now I have two RGW containers on each host, serving two different realms.


[1] https://docs.ceph.com/en/latest/cephadm/services/rgw/

Quoting J-P Methot:


Hi,

I'm testing out adding a second RGW realm to my single ceph  
cluster. This is not very well documented though, since obviously  
realms were designed for multi-site deployments.


Now, what I can't seem to figure is if I need to deploy a  
container with cephadm to act as a frontend for this second realm  
and, if so, how? I've set a frontend port and address when I  
created the second realm, but my attempts at creating a RGW  
container for that realm didn't work at all, with the container  
just not booting up.



--
Jean-Philippe Méthot
Senior Openstack system administrator
Administrateur système Openstack sénior
PlanetHoster inc.

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.