[ceph-users] Re: CRUSH rule for EC 6+2 on 6-node cluster

2021-06-04 Thread Fulvio Galeazzi

Hallo Dan,
I am using Nautilus, a slightly outdated version (14.2.16), and I 
don't remember ever playing with upmaps in the past.
Following your suggestion, I removed a bunch of upmaps (the "longer" 
lines) and after a while I verified that all PGs are properly mapped.
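
(For the archives: listing and removing an upmap exception is just a couple of
commands. This is only a sketch, and the PG ID is an example taken from further
down the thread.)

  # list the current upmap exceptions
  ceph osd dump | grep pg_upmap_items
  # drop the exception for a single PG, e.g. 116.453
  ceph osd rm-pg-upmap-items 116.453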


  Thanks!

Fulvio

On 5/27/2021 5:33 PM, Dan van der Ster wrote:

Hi Fulvio,

I suggest removing only the upmaps which are clearly incorrect, and
then seeing if the upmap balancer re-creates them.
Perhaps they were created when they were not incorrect, when you had a
different crush rule?
Or perhaps you're running an old version of ceph which had a buggy
balancer implementation?

Cheers, Dan



On Thu, May 27, 2021 at 5:16 PM Fulvio Galeazzi  wrote:


Hallo Dan, Nathan, thanks for your replies and apologies for my silence.

Sorry, I had made a typo... the rule is really 6+4. And to reply to
Nathan's message, the rule was built like this in anticipation of
getting additional servers, at which point I will relax the "2
chunks per host" part.

[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd pool get
default.rgw.buckets.data erasure_code_profile
erasure_code_profile: ec_6and4_big
[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd erasure-code-profile get
ec_6and4_big
crush-device-class=big
crush-failure-domain=osd
crush-root=default
jerasure-per-chunk-alignment=false
k=6
m=4
plugin=jerasure
technique=reed_sol_van
w=8

Indeed, Dan:

[cephmgr@cephAdmPA1.cephAdmPA1 ~]$ ceph osd dump | grep upmap | grep 116.453
pg_upmap_items 116.453 [76,49,129,108]

I don't think I ever set such an upmap myself. Do you think it would be
good to try and remove all upmaps, let the upmap balancer do its magic,
and check again?

Thanks!

 Fulvio


On 20/05/2021 18:59, Dan van der Ster wrote:

Hold on: 8+4 needs 12 osds but you only show 10 there. Shouldn't you
choose 6 type host and then chooseleaf 2 type osd?

.. Dan


On Thu, May 20, 2021, 1:30 PM Fulvio Galeazzi <fulvio.galea...@garr.it> wrote:

 Hallo Dan, Bryan,
   I have a rule similar to yours, for an 8+4 pool, with the only
 difference being that I replaced the second "choose" with "chooseleaf",
 which I understand should make no difference:

 rule default.rgw.buckets.data {
   id 6
   type erasure
   min_size 3
   max_size 10
   step set_chooseleaf_tries 5
   step set_choose_tries 100
   step take default class big
   step choose indep 5 type host
   step chooseleaf indep 2 type osd
   step emit
 }
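
 (A quick way to sanity-check what a rule like this actually maps, before
 trusting it with data, is to run it through crushtool. A sketch, assuming
 rule id 6 and 10 chunks as in the rule above:)

   ceph osd getcrushmap -o crushmap.bin
   # show a sample of PG -> OSD mappings produced by rule 6 with 10 chunks
   crushtool -i crushmap.bin --test --rule 6 --num-rep 10 --show-mappings | head
   # and check that no mapping ends up with fewer than 10 OSDs
   crushtool -i crushmap.bin --test --rule 6 --num-rep 10 --show-bad-mappings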

 I am on Nautilus 14.2.16 and, while performing maintenance the other
 day, I noticed 2 PGs were incomplete and caused trouble for some users.
 I then verified (thanks Bryan for the command) that:

 [cephmgr@cephAdmCT1.cephAdmCT1 clusterCT]$ for osd in $(ceph pg map
 116.453 -f json | jq -r '.up[]'); do ceph osd find $osd | jq -r '.host'
 ; done | sort | uniq -c | sort -n -k1
 2 r2srv07.ct1.box.garr
 2 r2srv10.ct1.box.garr
 2 r3srv07.ct1.box.garr
 4 r1srv02.ct1.box.garr

 You see that 4 chunks of that PG were put on r1srv02.
 Maybe this happened due to some temporary unavailability of the host at
 some point? As all my servers are now up and running, is there a way to
 force the placement rule to rerun?

 Thanks!

  Fulvio


 On 5/16/2021 11:40 PM, Dan van der Ster wrote:
  > Hi Bryan,
  >
  > I had to do something similar, and never found a rule to place
 "up to"
  > 2 chunks per host, so I stayed with the placement of *exactly* 2
  > chunks per host.
  >
  > But I did this slightly differently to what you wrote earlier: my
 rule
  > chooses exactly 4 hosts, then chooses exactly 2 osds on each:
  >
  >  type erasure
  >  min_size 3
  >  max_size 10
  >  step set_chooseleaf_tries 5
  >  step set_choose_tries 100
  >  step take default class hdd
  >  step choose indep 4 type host
  >  step choose indep 2 type osd
  >  step emit
  >
  > If you really need the "up to 2" approach then maybe you can split
  > each host into two "host" crush buckets, with half the OSDs in each.
  > Then a normal host-wise rule should work.
  >
  > Cheers, Dan
  >




--
Fulvio Galeazzi
GARR-CSD Department
skype: fgaleazzi70
tel.: +39-334-6533-250





[ceph-users] Re: SSD recommendations for RBD and VM's

2021-06-04 Thread mj

Hi,

On 5/30/21 8:45 PM, mhnx wrote:

Hello Samuel. Thanks for the answer.

Yes the Intel S4510 series is a good choice but it's expensive.
I have 21 server and data distribution is quite well.
At power loss I don't think I'll lose data. All the VM's using same
image and the rest is cookie.
In this case I'm not sure I should spend extra money on PLP.

Actually I like Samsung 870 EVO. It's cheap and I think 300TBW will be
enough for 5-10years.
Do you know any better ssd with the same price range as 870 EVO?

Samsung 870 evo (500GB) = 5 Years or 300 TBW - $64.99
Samsung 860 pro (512GB) = 5 Years or 600 TBW - $99



But do these not lack power-loss protection..?

We are running the Samsung PM883, as I was told that these would do much 
better as OSDs.


MJ


[ceph-users] Re: SSD recommendations for RBD and VM's

2021-06-04 Thread huxia...@horebdata.cn
One could use an enterprise NVMe SSD (with PLP) as DB/WAL for those consumer SSDs.



huxia...@horebdata.cn
 
From: mj
Date: 2021-06-04 11:23
To: ceph-users
Subject: [ceph-users] Re: SSD recommendations for RBD and VM's
Hi,
 
On 5/30/21 8:45 PM, mhnx wrote:
> Hello Samuel. Thanks for the answer.
> 
> Yes the Intel S4510 series is a good choice but it's expensive.
> I have 21 server and data distribution is quite well.
> At power loss I don't think I'll lose data. All the VM's using same
> image and the rest is cookie.
> In this case I'm not sure I should spend extra money on PLP.
> 
> Actually I like Samsung 870 EVO. It's cheap and I think 300TBW will be
> enough for 5-10years.
> Do you know any better ssd with the same price range as 870 EVO?
> 
> Samsung 870 evo (500GB) = 5 Years or 300 TBW - $64.99
> Samsung 860 pro (512GB) = 5 Years or 600 TBW - $99
> 
 
But do these not lack power-loss protection..?
 
We are running the Samsung PM883, as I was told that these would do much 
better as OSDs.
 
MJ


[ceph-users] Re: SSD recommendations for RBD and VM's

2021-06-04 Thread mhnx
My thinking is that when an OSD comes back after a power loss, all the data
gets scrubbed and there are 2 other copies anyway.
PLP is mostly important for block storage, and Ceph should easily recover
from that situation.
That's why I don't understand why I should pay more for PLP and other
protections.

In my use case 90% of the data is cookies and 10% is coldish metadata, and I
don't want to pay for features I don't need. That's it.
Using 870 EVOs with an NVMe WAL is a good idea, but in this case the price
still goes up because I'm using 2 OSDs per device, 60 OSDs total across
21 hosts.

The Samsung PM883 is good. We're using them for different projects and it is
at the top of the list.
These are all TLC NAND and, I think, the lowest price on the market. It would
be better to have MLC, but the price doubles.

Samsung PM883 480GB SATA 6Gb/s V4 TLC 2.5" 7mm (1.3 DWPD) 1 $102,48
Micron 5300 PRO 480GB, SATA, 2.5", 3D TLC, 1.5DWPD 1 $110,54
Micron 5300 MAX 480GB, SATA, 2.5", 3D TLC, 5DWPD 1 $132,41
Intel D3-S4610 480GB SATA 6Gb/s 3D TLC 2.5" 7mm 3DWPD Rev.2 1 $131,26


Samsung 860 PRO SATA 2.5" SSD 512GB V-NAND 2-bit MLC, 600 TBW.
The 860 PRO is MLC, which is good, but its TBW is half of the PM883's and it
has no PLP.

I think I'm going with the PM883 if there is no other advice from the community.

On Fri, Jun 4, 2021 at 12:24, mj wrote:
>
> Hi,
>
> On 5/30/21 8:45 PM, mhnx wrote:
> > Hello Samuel. Thanks for the answer.
> >
> > Yes the Intel S4510 series is a good choice but it's expensive.
> > I have 21 server and data distribution is quite well.
> > At power loss I don't think I'll lose data. All the VM's using same
> > image and the rest is cookie.
> > In this case I'm not sure I should spend extra money on PLP.
> >
> > Actually I like Samsung 870 EVO. It's cheap and I think 300TBW will be
> > enough for 5-10years.
> > Do you know any better ssd with the same price range as 870 EVO?
> >
> > Samsung 870 evo (500GB) = 5 Years or 300 TBW - $64.99
> > Samsung 860 pro (512GB) = 5 Years or 600 TBW - $99
> >
>
> But do these not lack power-loss protection..?
>
> We are running the Samsung PM883, as I was told that these would do much
> better as OSDs.
>
> MJ


[ceph-users] Re: SSD recommendations for RBD and VM's

2021-06-04 Thread mj

Hi,

On 6/4/21 12:57 PM, mhnx wrote:

I wonder that when a osd came back from power-lost, all the data
scrubbing and there are 2 other copies.
PLP is important on mostly Block Storage, Ceph should easily recover
from that situation.
That's why I don't understand why I should pay more for PLP and other
protections.


I'm no expert (or power user) at all, but my reasoning is: if something 
power-related can take down one of my servers, it can just as easily take 
down *all* my ceph servers at once.


And that could just as easily render all three copies inaccessible.

Plus, as I understand it, using SSDs with PLP also reduces latency, 
as the SSDs don't need to flush after each write.


(but again: this is my understanding, and I am no expert on the subject)


[ceph-users] Creating a role in another tenant seems to be possible

2021-06-04 Thread Daniel Iwan
Hi

It seems that with a command like this:

aws --profile=my-user-tenant1 --endpoint=$HOST_S3_API --region="" iam
create-role --role-name="tenant2\$TemporaryRole"
--assume-role-policy-document file://json/trust-policy-assume-role.json

I can create a role in another tenant.
The executing user has the roles:* capability, which I think is necessary to
be able to create roles, but at the same time it seems to be a global ability,
across all tenants.

Similarly, a federated user who assumes a role with iam:CreateRole
permission
can create an arbitrary role like below.

aws --endpoint=$HOST_S3_API --region="" iam create-role
--role-name="tenant2\$TemporaryRole" --assume-role-policy-document
file://json/trust-policy-assume-role.json

Example permission policy
{
"Statement":[
{"Effect":"Allow","Action":["iam:GetRole"]},
{"Effect":"Allow","Action":["iam:CreateRole"]}
]
}

The roles:* capability is not needed in this case, which I think is correct,
because only the permission policy of the assumed role is checked.

Getting information about a role from other tenants is possible with
iam:GetRole.
This is less controversial, but I would still expect it to be scoped to the
user's tenant unless an explicit tenant name is stated in the policy, like this:

{"Effect":"Allow","Action":["iam:GetRole"],"Resource":"arn:aws:iam::tenant2:*"}

Possibly I'm missing something.
Why is crossing tenants possible?

Regards
Daniel


[ceph-users] Re: Creating a role in another tenant seems to be possible

2021-06-04 Thread Pritha Srivastava
On Fri, Jun 4, 2021 at 5:06 PM Daniel Iwan  wrote:

> Hi
>
> It seems that with command like this
>
> aws --profile=my-user-tenant1 --endpoint=$HOST_S3_API --region="" iam
> create-role --role-name="tenant2\$TemporaryRole"
> --assume-role-policy-document file://json/trust-policy-assume-role.json
>
> I can create a role in another tenant.
> Executing user have roles:* capability which I think is necessary to be
> able to create roles, but at the same time it seems to be a global ability,
> for all tenants.
>
>
How did you check whether the role was created in tenant1 or tenant2?
It shouldn't be created in tenant2; if it is, then it's a bug, so please open
a tracker issue for it.
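
One way to check (a sketch, assuming radosgw-admin access on the RGW host;
the tenant names are the ones from the example above):

  radosgw-admin role list --tenant=tenant1
  radosgw-admin role list --tenant=tenant2
  # the role ARN in the output shows which tenant it actually belongs to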

Similarly, a federated user who assumes a role with iam:CreateRole
> permission
> can create an arbitrary role like below.
>
> aws --endpoint=$HOST_S3_API --region="" iam create-role
> --role-name="tenant2\$TemporaryRole" --assume-role-policy-document
> file://json/trust-policy-assume-role.json
>
> Example permission policy
> {
> "Statement":[
> {"Effect":"Allow","Action":["iam:GetRole"]},
> {"Effect":"Allow","Action":["iam:CreateRole"]}
> ]
> }
>
What entity is this permission policy attached to? The user making the
CreateRole call?

Capability roles:* is not needed in this case, which I think is correct,
> because only permission policy of the assumed role is checked.
>
> Getting information about a role from other tenants is possible with
> iam:GetRole.
> This is less controversial but I would still expect it to be scoped to the
> user's tenant unless explicit tenant name is stated in the policy like this
>
>
> {"Effect":"Allow","Action":["iam:GetRole"],"Resource":"arn:aws:iam::tenant2:*"}
>
> Possibly I'm missing something.
> Why is crossing tenants possible?
>
> Regards
> Daniel
>

Thanks,
Pritha



[ceph-users] Re: SSD recommendations for RBD and VM's

2021-06-04 Thread mhnx
"Plus how I understand it also: using SSDs with PLP also reduces latency,
as the SSDs don't need to flush after each write."

I didn't know that but it makes sense. I should dig into this.
Thanks.
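
(For anyone else digging into this: the usual way to see the PLP effect is a
single-job sync-write test against the raw device. This is only a sketch,
/dev/sdX is a placeholder, and the test will destroy data on that device.)

  fio --name=sync-write-test --filename=/dev/sdX --direct=1 --sync=1 \
      --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based

Drives with PLP usually report much lower latency here, since they can
acknowledge sync writes from their protected cache.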

On Fri, Jun 4, 2021 at 14:24, mj wrote:
>
> Hi,
>
> On 6/4/21 12:57 PM, mhnx wrote:
> > I wonder that when a osd came back from power-lost, all the data
> > scrubbing and there are 2 other copies.
> > PLP is important on mostly Block Storage, Ceph should easily recover
> > from that situation.
> > That's why I don't understand why I should pay more for PLP and other
> > protections.
>
> I'm no expert (or power user) al all, but my reasoning is: if something
> power-related can take down one of my servers it can just as easily take
> down *all* my ceph servers at once.
>
> And that could just as easily render all three copies inacessible.
>
> Plus how I understand it also: using SSDs with PLP also reduces latency,
> as the SSDs don't need to flush after each write.
>
> (but again: this i my understanding, and I am no expert on the subject)


[ceph-users] Turning on "compression_algorithm" old pool with 500TB usage

2021-06-04 Thread mhnx
Hello. I have an erasure-coded pool and I didn't turn on compression at the
beginning. Now I'm writing a new type of very small data and the overhead is
becoming an issue. I'm thinking of turning on compression on the pool, but in
most filesystems that would only affect newly written data. What is the
behavior in Ceph?
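
For context, the knobs I'm looking at are something like this (pool name and
algorithm are placeholders):

  ceph osd pool set <pool-name> compression_algorithm snappy
  ceph osd pool set <pool-name> compression_mode aggressive
  ceph df detail   # the compression columns show how much actually gets compressed

From what I've read, BlueStore compresses data as it is written, so existing
objects would only be compressed when they are rewritten, but I'd like to
confirm that.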


[ceph-users] Rolling upgrade model to new OS

2021-06-04 Thread Drew Weaver
Hello,

I need to upgrade the OS that our Ceph cluster is running on to support new 
versions of Ceph.

Has anyone devised a model for how you handle this?

Do you just:

Install some new nodes with the new OS
Install the old version of Ceph on the new nodes
Add those nodes/osds to the cluster
Remove the old nodes
Upgrade Ceph on the new nodes

Are there any specific OSes that Ceph has said will have longer future
version support? I would like to only touch the OS every 3-4 years if possible.

Thanks,
-Drew


[ceph-users] Re: Rolling upgrade model to new OS

2021-06-04 Thread Nico Schottelius


Hey Drew,

we have changed the OS multiple times in the lifetime of our ceph
cluster. In general, you can proceed the same way as a regular update,
starting with the mons/mgrs and then migrating the OSDs.

Cheers,

Nico

Drew Weaver  writes:

> Hello,
>
> I need to upgrade the OS that our Ceph cluster is running on to support new 
> versions of Ceph.
>
> Has anyone devised a model for how you handle this?
>
> Do you just:
>
> Install some new nodes with the new OS
> Install the old version of Ceph on the new nodes
> Add those nodes/osds to the cluster
> Remove the old nodes
> Upgrade Ceph on the new nodes
>
> Are there any specific OS that Ceph has said that will have longer future 
> version support? Would like to only touch the OS every 3-4 years if possible.
>
> Thanks,
> -Drew


--
Sustainable and modern Infrastructures by ungleich.ch


[ceph-users] Re: Rolling upgrade model to new OS

2021-06-04 Thread Vladimir Sigunov
Hi Drew,
I performed the upgrade from Nautilus (bare-metal deployment) -> Octopus
(podman containerization) and RHEL 7 -> RHEL 8.
Everything was done in place. My sequence was:
ceph osd set noout / norebalance (see the sketch after this list)
shutdown/disable running services
perform full OS upgrade
install necessary software like podman
start ceph monitor in podman
ensure it is joining the rest of cluster
ensure the monitor is stable
deploy osds with the same ids I had previously
ensure everything is stable
deploy additional daemons (if needed)
check cluster's overall health state
if everything is fine, proceed to the next node
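
A minimal sketch of the flag handling from the first step, for completeness
(nothing exotic, just the standard flags):

  # before taking the node down
  ceph osd set noout
  ceph osd set norebalance
  # ... upgrade the node, bring its OSDs back ...
  # once the node's OSDs are up and stable again
  ceph osd unset norebalance
  ceph osd unset noout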

There weren't any major issues related to the upgrade, however I hit some
minor ones. E.g. sometimes, for some reason, monitors didn't join the cluster,
or OSDs started flapping, but all the problems I saw were fixable within a
reasonable period of time.
My personal opinion: if you can use docker, then use docker, and test the
ceph images before you deploy them into production. I used our staging
environment for that.
Very likely you will need to make several changes in the systemd unit files
and/or in the docker/podman command. Don't forget to mount all the directories
ceph is using, like /var/lib/ceph, etc.
To troubleshoot the first start, I ran the container without -d, to
monitor the container's output in real time; then, once I had all the mistakes
fixed, I restarted it using the systemd unit.
You can grab unit files from the ceph-ansible GitHub.

Good luck!
Vladimir

On Fri, Jun 4, 2021 at 8:55 AM Drew Weaver  wrote:

> Hello,
>
> I need to upgrade the OS that our Ceph cluster is running on to support
> new versions of Ceph.
>
> Has anyone devised a model for how you handle this?
>
> Do you just:
>
> Install some new nodes with the new OS
> Install the old version of Ceph on the new nodes
> Add those nodes/osds to the cluster
> Remove the old nodes
> Upgrade Ceph on the new nodes
>
> Are there any specific OS that Ceph has said that will have longer future
> version support? Would like to only touch the OS every 3-4 years if
> possible.
>
> Thanks,
> -Drew


[ceph-users] Re: Rolling upgrade model to new OS

2021-06-04 Thread Martin Verges
Hello Drew,

Our whole deployment and management solution is built on just replacing the
OS whenever there is an update. We at croit.io even provide Debian- and
SUSE-based OS images, and you can switch between them per host at any time. No problem.

Just go and reinstall a node, install Ceph, and the services will come up
without a problem when you have all the configs in place.

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Fri, 4 Jun 2021 at 14:56, Drew Weaver  wrote:

> Hello,
>
> I need to upgrade the OS that our Ceph cluster is running on to support
> new versions of Ceph.
>
> Has anyone devised a model for how you handle this?
>
> Do you just:
>
> Install some new nodes with the new OS
> Install the old version of Ceph on the new nodes
> Add those nodes/osds to the cluster
> Remove the old nodes
> Upgrade Ceph on the new nodes
>
> Are there any specific OS that Ceph has said that will have longer future
> version support? Would like to only touch the OS every 3-4 years if
> possible.
>
> Thanks,
> -Drew


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-04 Thread Marc
Do you use rbd images in containers that reside on OSD nodes? Does this
cause any problems? I used to have kernel-mounted CephFS on an OSD node; after
a specific Luminous release this was giving me problems.


> -Original Message-
> From: Eneko Lacunza 
> Sent: Friday, 4 June 2021 15:49
> To: ceph-users@ceph.io
> Subject: *SPAM* [ceph-users] Re: Why you might want packages
> not containers for Ceph deployments
> 
> Hi,
> 
> We operate a few Ceph hyperconverged clusters with Proxmox, that
> provides a custom ceph package repository. They do a great work; and
> deployment is a brezee.
> 
> So, even as currently we would rely on Proxmox packages/distribution and
> not upstream, we have a number of other projects deployed with
> containers and we even distribute some of our own development in deb and
> container packages, so I will comment on our view:
> 
> El 2/6/21 a las 23:26, Oliver Freyermuth escribió:
> [...]
> >
> > If I operate services in containers built by developers, of course
> > this ensures the setup works, and dependencies are well tested, and
> > even upgrades work well — but it also means that,
> > at the end of the day, if I run 50 services in 50 different containers
> > from 50 different upstreams, I'll have up to 50 different versions of
> > OpenSSL floating around my production servers.
> > If a security issue is found in any of the packages used in all the
> > container images, I now need to trust the security teams of all the 50
> > developer groups building these containers
> > (and most FOSS projects won't have the ressources, understandably...),
> > instead of the one security team of the disto I use. And then, I also
> > have to re-pull all these containers, after finding out that a
> > security fix has become available.
> > Or I need to build all these containers myself, and effectively take
> > over the complete job, and have my own security team.
> >
> > This may scale somewhat well, if you have a team of 50 people, and
> > every person takes care of one service. Containers are often your
> > friend in this case[1],
> > since it allows to isolate the different responsibilities along with
> > the service.
> >
> > But this is rarely the case outside of industry, and especially not in
> > academics.
> > So the approach we chose for us is to have one common OS everywhere,
> > and automate all of our deployment and configuration management with
> > Puppet.
> > Of course, that puts is in one of the many corners out there, but it
> > scales extremely well to all services we operate,
> > and I can still trust the distro maintainers to keep the base OS safe
> > on all our servers, automate reboots etc.
> >
> > For Ceph, we've actually seen questions about security issues already
> > on the list[0] (never answered AFAICT).
> 
> These are the two main issues I find with containers really:
> 
> - Keeping hosts uptodate is more complex (apt-get update+apt-get
> dist-upgrade and also some kind of docker pull+docker
> restart/docker-compose up ...). Much of the time the second part is not
> standard (just deployed a Harbor service, upgrade is quite simple but I
> have to know how to do it as it's speciffic, maintenance would be much
> easier if it was packaged in Debian). I won't say it's more difficult,
> but it will be more diverse and complex.
> 
> - Container image quality and security support quality, that will vary
> from upstream to upstream. You have to research each of them to know
> were they stand. A distro (specially a good one like Debian, Ubuntu,
> RHEL or SUSE) has known, quality security support for the repositories.
> They will even fix issues not fixed by upstream (o backport them to
> distro's version...). This is more an upstream vs distro issue, really.
> 
> About debugging issues reported with Ceph containers, I think those are
> things waiting for a fix: why are logs writen in container image (or an
> ephemeral volume, I don't know really how is that done right now)
> instead of an external name volume o a local mapped dir in /var/log/ceph ?
> 
> All that said, I think that it makes sense for an upstream project like
> Ceph, to distribute container images, as it is the most generic way to
> distribute (you can deploy on any system/distro supporting container
> images) and eases development. But only distributing container images
> could make more users depend on third party distribution (global or
> specific distros), which would delay feeback/bugreport to upstream.
> 
> Cheers and thanks for the great work!
> 
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
> 
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> 
> https://www.youtube.com/user/CANALBINOVO
> https://www.linkedin.com/company/37269706/

[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-04 Thread Eneko Lacunza

Hi,

We operate a few Ceph hyperconverged clusters with Proxmox, which 
provides a custom ceph package repository. They do great work, and 
deployment is a breeze.


So, even though we currently rely on the Proxmox packages/distribution and 
not upstream, we have a number of other projects deployed with 
containers, and we even distribute some of our own development as deb and 
container packages, so I will comment with our view:


El 2/6/21 a las 23:26, Oliver Freyermuth escribió:
[...]


If I operate services in containers built by developers, of course 
this ensures the setup works, and dependencies are well tested, and 
even upgrades work well — but it also means that,
at the end of the day, if I run 50 services in 50 different containers 
from 50 different upstreams, I'll have up to 50 different versions of 
OpenSSL floating around my production servers.
If a security issue is found in any of the packages used in all the 
container images, I now need to trust the security teams of all the 50 
developer groups building these containers

(and most FOSS projects won't have the ressources, understandably...),
instead of the one security team of the disto I use. And then, I also 
have to re-pull all these containers, after finding out that a 
security fix has become available.
Or I need to build all these containers myself, and effectively take 
over the complete job, and have my own security team.


This may scale somewhat well, if you have a team of 50 people, and 
every person takes care of one service. Containers are often your 
friend in this case[1],
since it allows to isolate the different responsibilities along with 
the service.


But this is rarely the case outside of industry, and especially not in 
academics.
So the approach we chose for us is to have one common OS everywhere, 
and automate all of our deployment and configuration management with 
Puppet.
Of course, that puts is in one of the many corners out there, but it 
scales extremely well to all services we operate,
and I can still trust the distro maintainers to keep the base OS safe 
on all our servers, automate reboots etc.


For Ceph, we've actually seen questions about security issues already 
on the list[0] (never answered AFAICT).


These are the two main issues I find with containers, really:

- Keeping hosts up to date is more complex (apt-get update + apt-get 
dist-upgrade, and also some kind of docker pull + docker 
restart/docker-compose up ...). Much of the time the second part is not 
standard (I just deployed a Harbor service; the upgrade is quite simple, but I 
have to know how to do it as it's specific, and maintenance would be much 
easier if it were packaged in Debian). I won't say it's more difficult, 
but it will be more diverse and complex.


- Container image quality and security support quality will vary 
from upstream to upstream. You have to research each of them to know 
where they stand. A distro (especially a good one like Debian, Ubuntu, 
RHEL or SUSE) has known, quality security support for its repositories. 
They will even fix issues not fixed by upstream (or backport them to the 
distro's version...). This is more of an upstream vs. distro issue, really.


About the debugging issues reported with Ceph containers, I think those are 
things waiting for a fix: why are logs written inside the container image (or 
an ephemeral volume, I don't really know how that is done right now) 
instead of an external named volume or a locally mapped dir in /var/log/ceph?


All that said, I think it makes sense for an upstream project like 
Ceph to distribute container images, as it is the most generic way to 
distribute (you can deploy on any system/distro supporting container 
images) and it eases development. But distributing only container images 
could make more users depend on third-party distribution (global or 
specific distros), which would delay feedback/bug reports to upstream.


Cheers and thanks for the great work!

Eneko Lacunza
Zuzendari teknikoa | Director técnico
Binovo IT Human Project

Tel. +34 943 569 206 | https://www.binovo.es
Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun

https://www.youtube.com/user/CANALBINOVO
https://www.linkedin.com/company/37269706/


[ceph-users] Re: Why you might want packages not containers for Ceph deployments

2021-06-04 Thread 胡 玮文

> On Jun 4, 2021, at 21:51, Eneko Lacunza wrote:
> 
> Hi,
> 
> We operate a few Ceph hyperconverged clusters with Proxmox, that provides a 
> custom ceph package repository. They do a great work; and deployment is a 
> brezee.
> 
> So, even as currently we would rely on Proxmox packages/distribution and not 
> upstream, we have a number of other projects deployed with containers and we 
> even distribute some of our own development in deb and container packages, so 
> I will comment on our view:
> 
> El 2/6/21 a las 23:26, Oliver Freyermuth escribió:
> [...]
>> 
>> If I operate services in containers built by developers, of course this 
>> ensures the setup works, and dependencies are well tested, and even upgrades 
>> work well — but it also means that,
>> at the end of the day, if I run 50 services in 50 different containers from 
>> 50 different upstreams, I'll have up to 50 different versions of OpenSSL 
>> floating around my production servers.
>> If a security issue is found in any of the packages used in all the 
>> container images, I now need to trust the security teams of all the 50 
>> developer groups building these containers
>> (and most FOSS projects won't have the ressources, understandably...),
>> instead of the one security team of the disto I use. And then, I also have 
>> to re-pull all these containers, after finding out that a security fix has 
>> become available.
>> Or I need to build all these containers myself, and effectively take over 
>> the complete job, and have my own security team.
>> 
>> This may scale somewhat well, if you have a team of 50 people, and every 
>> person takes care of one service. Containers are often your friend in this 
>> case[1],
>> since it allows to isolate the different responsibilities along with the 
>> service.
>> 
>> But this is rarely the case outside of industry, and especially not in 
>> academics.
>> So the approach we chose for us is to have one common OS everywhere, and 
>> automate all of our deployment and configuration management with Puppet.
>> Of course, that puts is in one of the many corners out there, but it scales 
>> extremely well to all services we operate,
>> and I can still trust the distro maintainers to keep the base OS safe on all 
>> our servers, automate reboots etc.
>> 
>> For Ceph, we've actually seen questions about security issues already on the 
>> list[0] (never answered AFAICT).
> 
> These are the two main issues I find with containers really:
> 
> - Keeping hosts uptodate is more complex (apt-get update+apt-get dist-upgrade 
> and also some kind of docker pull+docker restart/docker-compose up ...). Much 
> of the time the second part is not standard (just deployed a Harbor service, 
> upgrade is quite simple but I have to know how to do it as it's speciffic, 
> maintenance would be much easier if it was packaged in Debian). I won't say 
> it's more difficult, but it will be more diverse and complex.
> 
> - Container image quality and security support quality, that will vary from 
> upstream to upstream. You have to research each of them to know were they 
> stand. A distro (specially a good one like Debian, Ubuntu, RHEL or SUSE) has 
> known, quality security support for the repositories. They will even fix 
> issues not fixed by upstream (o backport them to distro's version...). This 
> is more an upstream vs distro issue, really.
> 
> About debugging issues reported with Ceph containers, I think those are 
> things waiting for a fix: why are logs writen in container image (or an 
> ephemeral volume, I don't know really how is that done right now) instead of 
> an external name volume o a local mapped dir in /var/log/ceph ?

You could find the logs with “journalctl” outside of containers.
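
For example (a sketch; the fsid and daemon name are placeholders for your own
cluster):

  # cephadm-managed daemons run under systemd units named after the cluster fsid
  journalctl -u ceph-<fsid>@mgr.<hostname>.service -f
  # or let cephadm resolve the unit for you
  cephadm logs --name mgr.<hostname> -- -f
  # classic file logging under /var/log/ceph can also be re-enabled
  ceph config set global log_to_file true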

> All that said, I think that it makes sense for an upstream project like Ceph, 
> to distribute container images, as it is the most generic way to distribute 
> (you can deploy on any system/distro supporting container images) and eases 
> development. But only distributing container images could make more users 
> depend on third party distribution (global or specific distros), which would 
> delay feeback/bugreport to upstream.
> 
> Cheers and thanks for the great work!
> 
> Eneko Lacunza
> Zuzendari teknikoa | Director técnico
> Binovo IT Human Project
> 
> Tel. +34 943 569 206 | https://www.binovo.es
> Astigarragako Bidea, 2 - 2º izda. Oficina 10-11, 20180 Oiartzun
> 
> https://www.youtube.com/user/CANALBINOVO

[ceph-users] Zabbix sender issue

2021-06-04 Thread Bob Loi
Hi,

I managed to build a Ceph cluster with the help of the cephadm tool. It works 
like a charm.

I have a problem that I'm still not able to fix:

I know that the zabbix-sender executable is not integrated into the cephadm 
image of ceph-mgr pulled and started by podman, because of this choice:

https://github.com/ceph/ceph-container/issues/1651

I'm a total newbie with container technologies, but I managed to install 
zabbix-sender manually by executing this with podman:

podman ps -a -> looking for the container ID

podman exec -ti [id_container] /bin/bash -> to get a shell inside the container

Then I installed the repo via rpm and ran dnf install zabbix-sender.

Two considerations.

1) https://github.com/ceph/ceph-container/issues/1651 -> This answer still 
leaves me confused. It makes little sense to provide a Zabbix module if you 
then have to install the executable yourself, while the container is 
overwritten every time you reboot the physical host, with minimal info on how 
to avoid this behavior.

I know this is open source and you have to know the environment well before 
doing anything, and I know devs want as little trouble as possible, but this is 
about user experience. If you have to choose between the end user or yourself 
having an annoying problem, IMHO, I prefer to have a happy user. End of 
consideration. I think including the zabbix-sender executable in the container 
wouldn't kill anybody.

2) Since I'm a total noob with containers, podman, docker, etc., do you have 
any info about how to "fix" this behavior and avoid overwriting the mgr 
container every time I reboot the host? Please forgive me, I'm a total newbie 
with containers.
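
From what I've gathered so far, one option might be to build a derived image
that already contains zabbix_sender and point the orchestrator at it. Something
like this sketch (the registry name and base image tag are placeholders, and
the Zabbix repo still has to be added the same way I did it by hand), but I'm
not sure this is the intended approach:

  cat > Dockerfile <<'EOF'
  FROM docker.io/ceph/ceph:v15
  # add the Zabbix repo here (as done manually inside the running container), then:
  RUN dnf install -y zabbix-sender && dnf clean all
  EOF
  podman build -t registry.example.com/ceph-zabbix:v15 .
  podman push registry.example.com/ceph-zabbix:v15
  # make cephadm redeploy the daemons from the custom image
  ceph orch upgrade start --image registry.example.com/ceph-zabbix:v15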

Thank you in advance for the support.

Best,


Roberto
Kirecom.net


[ceph-users] Connect ceph to proxmox

2021-06-04 Thread Szabo, Istvan (Agoda)
Hi,

Is there a way to connect the pool that I created in my Nautilus Ceph setup 
to Proxmox? Or do I need a totally different Ceph install?
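
(For reference, Proxmox can use an external cluster through its RBD storage
type. A rough sketch, with storage ID, pool name and monitor addresses as
placeholders:)

  # on a Proxmox node
  pvesm add rbd ceph-ext --pool <pool> --monhost "10.0.0.1 10.0.0.2 10.0.0.3" \
      --content images --username admin
  # the matching keyring is expected at /etc/pve/priv/ceph/ceph-ext.keyring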

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---

