[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread Janne Johansson
On Thu, 23 Apr 2020 at 08:49, Darren Soothill <
darren.sooth...@suse.com> wrote:

> If you want the lowest cost per TB then you will be going with larger
> nodes in your cluster but it does mean you minimum cluster size is going to
> be many PB’s in size.
> Now the question is what is the tax that a particular chassis vendor is
> charging you. I know from the configs we do on a regular basis that a 60
> drive chassis will give you the lowest cost per TB. BUT it has
> implications. Your cluster size needs to be up in the order of 10PB
> minimum. 60 x 18TB gives you around 1PB per node.  Oh did you notice here
> we are going for the bigger disk drives. Why because the more data you can
> spread your fixed costs across the lower the overall cost per GB.
>

I don't know all models, but the computers I've looked at with 60 drive
slots will have a small and "crappy" motherboard, with few options, not
many buses/slots/network ports and low amounts of cores, DIMM sockets and
so on, counting on you to make almost a passive storage node on it. I have
a hard time thinking the 60*18TB OSD recovery requirements in cpu and ram
would be covered in any way by the kinds of 60-slot boxes I've seen. Not
that I focus on that area, but it seems like a common tradeoff, Heavy
Duty(tm) motherboards or tons of drives.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread Simon Sutter
Hello Khodayar


Of course I tried installing them with yum...

They are not available in the CentOS base or EPEL repos; here are the ones
which are available:


[root@node1 ~]# yum list | egrep "cherrypy|jwt|routes"
python-cherrypy.noarch     3.2.2-4.el7     @base
python-cherrypy2.noarch    2.3.0-19.el7    @epel
python-jwt.noarch          1.5.3-1.el7     @base
python-routes.noarch       1.13-2.el7      @epel
nodejs-jwt-simple.noarch   0.2.0-1.el7     epel
python36-jwt.noarch        1.6.4-2.el7     epel


How do I get either the right packages, or a workaround so that I can install
the dependencies with pip?


Regards,

Simon



From: Khodayar Doustar
Sent: Wednesday, 22 April 2020 20:02:04
To: Simon Sutter
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Upgrading to Octopus

Hi Simon,

Have you tried installing them with yum?




On Wed, Apr 22, 2020 at 6:16 PM Simon Sutter
<ssut...@hosttech.ch> wrote:
Hello everybody


In Octopus there are some interesting-looking features, so I tried upgrading
my CentOS 7 test nodes, according to:
https://docs.ceph.com/docs/master/releases/octopus/

Everything went fine and the cluster is healthy.


To test out the new dashboard functions, I tried to install it, but there are 
missing dependencies:

yum install ceph-mgr-dashboard.noarch

.

--> Finished Dependency Resolution
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-routes
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-jwt
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-cherrypy


Installing them with pip3 does of course make no difference, because those are 
yum dependencies.

Does anyone know a workaround?

Do I have to upgrade to Centos 8 for this to work?


Thanks in advance,

Simon
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: missing amqp-exchange on bucket-notification with AMQP endpoint

2020-04-23 Thread Yuval Lifshitz
On Thu, Apr 23, 2020 at 8:28 AM Andreas Unterkircher 
wrote:

> Dear Yuval!
>
> > The message format you tried to use is the standard one (the one being
> > emitted from boto3, or any other AWS SDK [1]).
> > It passes the arguments using 'x-www-form-urlencoded'. For example:
>
> Thank you for your clarification! I've previously tried it as a
> x-www-form-urlencoded-body as well, but I have failed. That it was then
> working using the non-standard-parameters has lead me down the wrong
> road...
> But I have to admit that I'm still failing to create a topic the S3-way.
>
> I've tried it with curl, but as well with Postman.
> Even if I use your example-body, Ceph keeps telling me (at least)
> method-not-allowed.
>
> Is this maybe because I'm using an AWS Sig v4 to authenticate?
>
Yes, this is probably the issue. In the radosgw we use the same signature
mechanism for S3 and for the other services (like topic creation).
see this example:
https://github.com/ceph/ceph/blob/master/examples/boto3/topic_with_endpoint.py#L31
(I guess we should also add that to the docs)

> This is the request I'm sending out:
>
> POST / HTTP/1.1
> Content-Type: application/x-www-form-urlencoded; charset=utf-8
> Accept-Encoding: identity
> Date: Tue, 23 Apr 2020 05:00:35 GMT
> X-Amz-Content-Sha256:
> e8d828552b412fde2cd686b0a984509bc485693a02e8c53ab84cf36d1dbb961a
> Host: s3.example.com
> X-Amz-Date: 20200423T050035Z
> Authorization: AWS4-HMAC-SHA256
> Credential=DNQXT3I8Z5MWDJ1A8YMP/20200423/de/s3/aws4_request,
> SignedHeaders=accept-encoding;content-type;date;host;x-amz-content-sha256;x-amz-date,
>
> Signature=fa65844ba997fe11e65be87a18f160afe1ea459892316d6060bbc663daf6eace
> User-Agent: PostmanRuntime/7.24.1
> Accept: */*
> Connection: keep-alive
>
> Content-Length: 303
>
> Name=ajmmvc-1_topic_1&
> Attributes.entry.2.key=amqp-exchange&
> Attributes.entry.1.key=amqp-ack-level&
> Attributes.entry.2.value=amqp.direct&
> Version=2010-03-31&
> Attributes.entry.3.value=amqp%3A%2F%2F127.0.0.1%3A7001&
> Attributes.entry.1.value=none&
> Action=CreateTopic&
> Attributes.entry.3.key=push-endpoint
>
>
> This is the response that comes back:
>
> HTTP/1.1 405 Method Not Allowed
> Content-Length: 200
> x-amz-request-id: tx1-005ea12159-6e47a-s3-datacenter
> Accept-Ranges: bytes
> Content-Type: application/xml
> Date: Thu, 23 Apr 2020 05:02:17 GMT
> 
> <?xml version="1.0" encoding="UTF-8"?><Error><Code>MethodNotAllowed</Code><RequestId>tx1-005ea12159-6e47a-s3-datacenter</RequestId><HostId>6e47a-s3-datacenter-de</HostId></Error>
>
>
> This is what radosgw is seeing at the same time
>
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 20 final domain/bucket
> subdomain= domain=s3.example.com in_hosted_domain=1
> in_hosted_domain_s3website=0 s->info.domain=s3.example.com
> s->info.request_uri=/
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 meta>>
> HTTP_X_AMZ_CONTENT_SHA256
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 meta>> HTTP_X_AMZ_DATE
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 x>>
>
> x-amz-content-sha256:e8d828552b412fde2cd686b0a984509bc485693a02e8c53ab84cf36d1dbb961a
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 x>>
> x-amz-date:20200423T050035Z
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 20 req 1 0s get_handler
> handler=26RGWHandler_REST_Service_S3
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10
> handler=26RGWHandler_REST_Service_S3
> 2020-04-23T07:02:17.745+0200 7f5aab2af700  2 req 1 0s getting op 4
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 Content of POST:
> Name=ajmmvc-1_topic_1&
> Attributes.entry.2.key=amqp-exchange&
> Attributes.entry.1.key=amqp-ack-level&
> Attributes.entry.2.value=amqp.direct&
> Version=2010-03-31&
> Attributes.entry.3.value=amqp%3A%2F%2F127.0.0.1%3A7001&
> Attributes.entry.1.value=none&
> Action=CreateTopic&
> Attributes.entry.3.key=push-endpoint
>
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 Content of POST:
> Name=ajmmvc-1_topic_1&
> Attributes.entry.2.key=amqp-exchange&
> Attributes.entry.1.key=amqp-ack-level&
> Attributes.entry.2.value=amqp.direct&
> Version=2010-03-31&
> Attributes.entry.3.value=amqp%3A%2F%2F127.0.0.1%3A7001&
> Attributes.entry.1.value=none&
> Action=CreateTopic&
> Attributes.entry.3.key=push-endpoint
>
> 2020-04-23T07:02:17.745+0200 7f5aab2af700 10 Content of POST:
> Name=ajmmvc-1_topic_1&
> Attributes.entry.2.key=amqp-exchange&
> Attributes.entry.1.key=amqp-ack-level&
> Attributes.entry.2.value=amqp.direct&
> Version=2010-03-31&
> Attributes.entry.3.value=amqp%3A%2F%2F127.0.0.1%3A7001&
> Attributes

[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread Darren Soothill
I can think of one vendor who has made some of the compromises that you talk of,
although memory and CPU are not among them; they are limited on slots and NVMe
capacity.

But there are plenty of other vendors out there who use the same model of 
motherboard across the whole chassis range so there isn’t a compromise in terms 
of slots and CPU.

The compromise may come with the size of the chassis, in that a lot of these
bigger chassis can also be deeper in order to get rid of those compromises.

The reality with an OSD node is you don't need that many slots or network ports.



From: Janne Johansson 
Date: Thursday, 23 April 2020 at 08:08
To: Darren Soothill 
Cc: ceph-users@ceph.io 
Subject: Re: [ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?
On Thu, 23 Apr 2020 at 08:49, Darren Soothill
<darren.sooth...@suse.com> wrote:
If you want the lowest cost per TB then you will be going with larger nodes in 
your cluster but it does mean you minimum cluster size is going to be many PB’s 
in size.
Now the question is what is the tax that a particular chassis vendor is 
charging you. I know from the configs we do on a regular basis that a 60 drive 
chassis will give you the lowest cost per TB. BUT it has implications. Your 
cluster size needs to be up in the order of 10PB minimum. 60 x 18TB gives you 
around 1PB per node.  Oh did you notice here we are going for the bigger disk 
drives. Why because the more data you can spread your fixed costs across the 
lower the overall cost per GB.

I don't know all models, but the computers I've looked at with 60 drive slots 
will have a small and "crappy" motherboard, with few options, not many 
buses/slots/network ports and low amounts of cores, DIMM sockets and so on, 
counting on you to make almost a passive storage node on it. I have a hard time 
thinking the 60*18TB OSD recovery requirements in cpu and ram would be covered 
in any way by the kinds of 60-slot boxes I've seen. Not that I focus on that 
area, but it seems like a common tradeoff, Heavy Duty(tm) motherboards or tons 
of drives.

--
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Healthy objects trapped in incomplete pgs

2020-04-23 Thread Jesper Lykkegaard Karlsen
Dear Cephers,


A few days ago disaster struck the Ceph cluster (erasure-coded) I am
administering, as the UPS power was pulled from the cluster, causing a power
outage.


After rebooting the system, 6 osds were lost (spread over 5 osd nodes) as they
could not be mounted anymore, and several others had damage. This was more than
the host-failure domain was set up to handle; auto-recovery failed and osds
started going down in a cascading manner.


When the dust settled, there were 8 pgs (of 2048) inactive and a bunch of osds 
down. I managed to recover 5 pgs, mainly by ceph-objectstore-tool 
export/import/repair commands, but now I am left with 3 pgs that are inactive 
and incomplete.


One of the pgs seems un-salvageable, as I cannot get it to become active at all
(repair/import/export/lowering min_size), but the other two I can get active
if I export/import one of the pg shards and restart the osd.


Rebuilding then starts but after a while one of the osds holding the pgs goes 
down, with a "FAILED ceph_assert(clone_size.count(clone))" message in the log.

If I set osds to noout nodown, then I can see that it is only rather few objects,
e.g. 161 of a pg of >10, that are failing to be remapped.


Since most of the objects in the two pgs seem intact, it would be sad to delete
the whole pg (force-create-pg) and lose all that data.


Is there a way to show and delete the failing objects?


I have thought of a recovery plan and want to share it with you, so you can
comment on whether it sounds doable or not.


  *   Stop osds from recovering:  ceph osd set norecover
  *   Bring back pgs active:      ceph-objectstore-tool export/import and restart osd
  *   Find files in pgs:          cephfs-data-scan pg_files
  *   Pull out as many as possible of those files to another location.
  *   Recreate pgs:               ceph osd force-create-pg
  *   Restart recovery:           ceph osd unset norecover
  *   Copy back in the recovered files.


Would that work or do you have a better suggestion?


Cheers,

Jesper


--
Jesper Lykkegaard Karlsen
Scientific Computing
Centre for Structural Biology
Department of Molecular Biology and Genetics
Aarhus University
Gustav Wieds Vej 10
8000 Aarhus C

E-mail: je...@mbg.au.dk
Tlf:+45 50906203

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread Martin Verges
Hello,

Simpler systems tend to be cheaper to buy per TB of storage, not on a
theoretical quote but on a practical one.

For example 1U Gigabyte 16bay D120-C21 systems with a density of 64 disks
per 4U are quite ok for most users. On 40 Nodes per rack + 2 switches you
have 10PB raw space for around 350k€.
They come with everything you need from dual 10G SFP+ to acceptable 8c/16t
45W TDP CPU. It comes with a M.2 slot if you want a db/wal or other
additional disk.
Such systems equipped with 16x16TB have a price point of below 8k€ or ~31 €
per TB RAW storage.

For me this is just an example of a quite cheap but capable HDD node. I
never saw a better offer for big fat systems on a price per TB and TCO.

Please remember, there is no best node for everyone, this node is not the
best or fastest out on the market and just an example ;)

--
Martin Verges
Managing director

Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges

croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263

Web: https://croit.io
YouTube: https://goo.gl/PGE1Bx


On Thu, 23 Apr 2020 at 11:21, Darren Soothill <
darren.sooth...@suse.com> wrote:

> I can think of 1 vendor who has made some of the compromises that you talk
> of although memory and CPU is not one of them they are limited on slots and
> NVME capacity.
>
> But there are plenty of other vendors out there who use the same model of
> motherboard across the whole chassis range so there isn’t a compromise in
> terms of slots and CPU.
>
> The compromise may come with the size of the chassis in that a lot of
> these bigger chassis can also be deeper to get rid of the compromises.
>
> The reality with an OSD node is you don't need that many slots or network
> ports.
>
>
>
> From: Janne Johansson 
> Date: Thursday, 23 April 2020 at 08:08
> To: Darren Soothill 
> Cc: ceph-users@ceph.io 
> Subject: Re: [ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?
> Den tors 23 apr. 2020 kl 08:49 skrev Darren Soothill <
> darren.sooth...@suse.com>:
> If you want the lowest cost per TB then you will be going with larger
> nodes in your cluster but it does mean you minimum cluster size is going to
> be many PB’s in size.
> Now the question is what is the tax that a particular chassis vendor is
> charging you. I know from the configs we do on a regular basis that a 60
> drive chassis will give you the lowest cost per TB. BUT it has
> implications. Your cluster size needs to be up in the order of 10PB
> minimum. 60 x 18TB gives you around 1PB per node.  Oh did you notice here
> we are going for the bigger disk drives. Why because the more data you can
> spread your fixed costs across the lower the overall cost per GB.
>
> I don't know all models, but the computers I've looked at with 60 drive
> slots will have a small and "crappy" motherboard, with few options, not
> many buses/slots/network ports and low amounts of cores, DIMM sockets and
> so on, counting on you to make almost a passive storage node on it. I have
> a hard time thinking the 60*18TB OSD recovery requirements in cpu and ram
> would be covered in any way by the kinds of 60-slot boxes I've seen. Not
> that I focus on that area, but it seems like a common tradeoff, Heavy
> Duty(tm) motherboards or tons of drives.
>
> --
> May the most significant bit of your life be positive.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread Richard Hesketh
On Thu, 2020-04-23 at 09:08 +0200, Janne Johansson wrote:
> Den tors 23 apr. 2020 kl 08:49 skrev Darren Soothill <
> darren.sooth...@suse.com>:
> 
> > If you want the lowest cost per TB then you will be going with
> > larger nodes in your cluster but it does mean you minimum cluster
> > size is going to be many PB’s in size.
> > Now the question is what is the tax that a particular chassis
> > vendor is charging you. I know from the configs we do on a regular
> > basis that a 60 drive chassis will give you the lowest cost per TB.
> > BUT it has implications. Your cluster size needs to be up in the
> > order of 10PB minimum. 60 x 18TB gives you around 1PB per node.  Oh
> > did you notice here we are going for the bigger disk drives. Why
> > because the more data you can spread your fixed costs across the
> > lower the overall cost per GB.
> > 
> 
> I don't know all models, but the computers I've looked at with 60
> drive slots will have a small and "crappy" motherboard, with few
> options, not many buses/slots/network ports and low amounts of cores,
> DIMM sockets and so on, counting on you to make almost a passive
> storage node on it. I have a hard time thinking the 60*18TB OSD
> recovery requirements in cpu and ram would be covered in any way by
> the kinds of 60-slot boxes I've seen. Not that I focus on that area,
> but it seems like a common tradeoff, Heavy Duty(tm) motherboards or
> tons of drives.

I would imagine that this describes the use of separate SAS-attached
(or whatever) JBOD boxes rather than everything in a single chassis. My
clusters use 1U servers with decent CPU/memory and SAS adapter cards
hooking up larger JBODs to actually house the disks (for the spinning
rust OSDs, at least).


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread lin yunfan
Hi Martin,
How is the performance of the D120-C21 hdd cluster? Can it utilize the
full performance of the 16 hdds?


linyunfan

Martin Verges  wrote on Thu, 23 Apr 2020 at 18:12:
>
> Hello,
>
> simpler systems tend to be cheaper to buy per TB storage, not on a
> theoretical but practical quote.
>
> For example 1U Gigabyte 16bay D120-C21 systems with a density of 64 disks
> per 4U are quite ok for most users. On 40 Nodes per rack + 2 switches you
> have 10PB raw space for around 350k€.
> They come with everything you need from dual 10G SFP+ to acceptable 8c/16t
> 45W TDP CPU. It comes with a M.2 slot if you want a db/wal or other
> additional disk.
> Such systems equipped with 16x16TB have a price point of below 8k€ or ~31 €
> per TB RAW storage.
>
> For me this is just an example of a quite cheap but capable HDD node. I
> never saw a better offer for big fat systems on a price per TB and TCO.
>
> Please remember, there is no best node for everyone, this node is not the
> best or fastest out on the market and just an example ;)
>
> --
> Martin Verges
> Managing director
>
> Mobile: +49 174 9335695
> E-Mail: martin.ver...@croit.io
> Chat: https://t.me/MartinVerges
>
> croit GmbH, Freseniusstr. 31h, 81247 Munich
> CEO: Martin Verges - VAT-ID: DE310638492
> Com. register: Amtsgericht Munich HRB 231263
>
> Web: https://croit.io
> YouTube: https://goo.gl/PGE1Bx
>
>
> Am Do., 23. Apr. 2020 um 11:21 Uhr schrieb Darren Soothill <
> darren.sooth...@suse.com>:
>
> > I can think of 1 vendor who has made some of the compromises that you talk
> > of although memory and CPU is not one of them they are limited on slots and
> > NVME capacity.
> >
> > But there are plenty of other vendors out there who use the same model of
> > motherboard across the whole chassis range so there isn’t a compromise in
> > terms of slots and CPU.
> >
> > The compromise may come with the size of the chassis in that a lot of
> > these bigger chassis can also be deeper to get rid of the compromises.
> >
> > The reality with an OSD node is you don't need that many slots or network
> > ports.
> >
> >
> >
> > From: Janne Johansson 
> > Date: Thursday, 23 April 2020 at 08:08
> > To: Darren Soothill 
> > Cc: ceph-users@ceph.io 
> > Subject: Re: [ceph-users] Re: Dear Abby: Why Is Architecting CEPH So Hard?
> > Den tors 23 apr. 2020 kl 08:49 skrev Darren Soothill <
> > darren.sooth...@suse.com>:
> > If you want the lowest cost per TB then you will be going with larger
> > nodes in your cluster but it does mean you minimum cluster size is going to
> > be many PB’s in size.
> > Now the question is what is the tax that a particular chassis vendor is
> > charging you. I know from the configs we do on a regular basis that a 60
> > drive chassis will give you the lowest cost per TB. BUT it has
> > implications. Your cluster size needs to be up in the order of 10PB
> > minimum. 60 x 18TB gives you around 1PB per node.  Oh did you notice here
> > we are going for the bigger disk drives. Why because the more data you can
> > spread your fixed costs across the lower the overall cost per GB.
> >
> > I don't know all models, but the computers I've looked at with 60 drive
> > slots will have a small and "crappy" motherboard, with few options, not
> > many buses/slots/network ports and low amounts of cores, DIMM sockets and
> > so on, counting on you to make almost a passive storage node on it. I have
> > a hard time thinking the 60*18TB OSD recovery requirements in cpu and ram
> > would be covered in any way by the kinds of 60-slot boxes I've seen. Not
> > that I focus on that area, but it seems like a common tradeoff, Heavy
> > Duty(tm) motherboards or tons of drives.
> >
> > --
> > May the most significant bit of your life be positive.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding block.db to OSD

2020-04-23 Thread Igor Fedotov
I don't recall any additional tuning to be applied to the new DB volume. And
I assume the hardware is pretty much the same...


Do you still have any significant amount of data spilled over for these
updated OSDs? If not, I don't have any valid explanation for the phenomenon.



You might want to try "ceph osd bench" to compare OSDs under pretty much the
same load. Any difference observed?



On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:

Hello,

is there anything else needed beside running:
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD} 
bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1


I did so some weeks ago and currently I'm seeing that all osds
originally deployed with --block-db show 10-20% I/O waits, while all
those that got converted using ceph-bluestore-tool show 80-100% I/O waits.


Also is there some tuning available to use more of the SSD? The SSD 
(block-db) is only saturated at 0-2%.


Greets,
Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding block.db to OSD

2020-04-23 Thread Stefan Priebe - Profihost AG

Hi,
On 23.04.20 at 14:06, Igor Fedotov wrote:
I don't recall any additional tuning to be applied to new DB volume. And 
assume the hardware is pretty the same...


Do you still have any significant amount of data spilled over for these 
updated OSDs? If not I don't have any valid explanation for the phenomena.


just the 64k from here:
https://tracker.ceph.com/issues/44509

You might want to try "ceph osd bench" to compare OSDs under pretty the 
same load. Any difference observed


Servers are the same HW. OSD Bench is:
# ceph tell osd.0 bench
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 16.09141478101,
"bytes_per_sec": 66727620.822242722,
"iops": 15.909104543266945
}

# ceph tell osd.36 bench
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 10.023828538,
"bytes_per_sec": 107118933.6419194,
"iops": 25.539143953780986
}


OSD 0 is a Toshiba MG07SCA12TA SAS 12G
OSD 36 is a Seagate ST12000NM0008-2H SATA 6G

SSDs are all the same, like the rest of the HW. But both drives should
give the same performance from their specs. The only other difference is
that OSD 36 was directly created with the block.db device (Nautilus
14.2.7) and OSD 0 (14.2.8) was not.


Stefan



On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:

Hello,

is there anything else needed beside running:
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD} 
bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1


I did so some weeks ago and currently i'm seeing that all osds 
originally deployed with --block-db show 10-20% I/O waits while all 
those got converted using ceph-bluestore-tool show 80-100% I/O waits.


Also is there some tuning available to use more of the SSD? The SSD 
(block-db) is only saturated at 0-2%.


Greets,
Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread Khodayar Doustar
Simon,

You can try to search for the exact package name, you can try these repos
as well:

yum -y install epel-release centos-release-ceph-nautilus centos-release-openstack-stein


On Thu, Apr 23, 2020 at 11:57 AM Simon Sutter  wrote:

> Hello Khodayar
>
>
> Of cours I tried installing them with yum...
>
> They are not available in the centos base or epel repos, here are the
> ones, which are available:
>
>
> [root@node1 ~]# yum list | egrep "cherrypy|jwt|routes"
> python-cherrypy.noarch 3.2.2-4.el7@base
> python-cherrypy2.noarch2.3.0-19.el7   @epel
> python-jwt.noarch  1.5.3-1.el7@base
> python-routes.noarch   1.13-2.el7 @epel
> nodejs-jwt-simple.noarch   0.2.0-1.el7epel
> python36-jwt.noarch1.6.4-2.el7epel
>
>
> How do I get either: The right packages or a workaround because i can
> install the dependencies with pip?
>
>
> Regards,
>
> Simon
>
>
> 
> Von: Khodayar Doustar 
> Gesendet: Mittwoch, 22. April 2020 20:02:04
> An: Simon Sutter
> Cc: ceph-users@ceph.io
> Betreff: Re: [ceph-users] Upgrading to Octopus
>
> Hi Simon,
>
> Have you tried installing them with yum?
>
>
>
>
> On Wed, Apr 22, 2020 at 6:16 PM Simon Sutter  ssut...@hosttech.ch>> wrote:
> Hello everybody
>
>
> In octopus there are some interesting looking features, so I tried to
> upgrading my Centos 7 test nodes, according to:
> https://docs.ceph.com/docs/master/releases/octopus/
>
> Everything went fine and the cluster is healthy.
>
>
> To test out the new dashboard functions, I tried to install it, but there
> are missing dependencies:
>
> yum install ceph-mgr-dashboard.noarch
>
> .
>
> --> Finished Dependency Resolution
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-routes
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-jwt
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-cherrypy
>
>
> Installing them with pip3 does of course make no difference, because those
> are yum dependencies.
>
> Does anyone know a workaround?
>
> Do I have to upgrade to Centos 8 for this to work?
>
>
> Thanks in advance,
>
> Simon
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io ceph-users-le...@ceph.io>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Increase number of read and writes

2020-04-23 Thread Bobby
Hi,

I am using Ceph in developer mode. Currently I am implementing Librados
examples which are also available in Introduction to Librados section
https://docs.ceph.com/docs/master/rados/api/librados-intro/#step-3-creating-an-i-o-context.
It says once your app has a cluster handle and a connection to a Ceph
Storage Cluster, you may create an I/O Context and begin reading and
writing data.  For example,









err = rados_write(io, "hw", "Hello World!", 12, 0);
if (err < 0) {
        fprintf(stderr, "%s: Cannot write object \"neo-obj\" to pool %s: %s\n",
                argv[0], poolname, strerror(-err));
        rados_ioctx_destroy(io);
        rados_shutdown(cluster);
        exit(1);
} else {
        printf("\nWrote \"Hello World\" to object \"neo-obj\".\n");
}

My question: is "12" the number of writes? Because I want to test
with a high number of reads and writes.

Looking for help !
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Increase number of read and writes

2020-04-23 Thread Janne Johansson
On Thu, 23 Apr 2020 at 16:07, Bobby wrote:

> Hi,
>
> I am using Ceph in developer mode. Currently I am implementing Librados
> examples which are also available in Introduction to Librados section
>
> https://docs.ceph.com/docs/master/rados/api/librados-intro/#step-3-creating-an-i-o-context
> .
> It says once your app has a cluster handle and a connection to a Ceph
> Storage Cluster, you may create an I/O Context and begin reading and
> writing data.  For example,
>
> err = rados_write(io, "hw", "Hello World!", 12, 0);
>


>
> My question, Is "12" is the number of writes? Because I want to test the
> with high number of read and writes.
>
> Looking for help !
>

Just check what parameters the function takes:
CEPH_RADOS_API int rados_write(rados_ioctx_t io, const char *oid, const char *buf, size_t len, uint64_t off)

Write len bytes from buf into the oid object, starting at offset off.
The value of len must be <= UINT_MAX/2.


The 12 seems to be the length of "Hello World!" in bytes, which matches
what a normal write() call would need.
In order to test a high number of writes, you need to send lots of write
calls in parallel.
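
For example, a rough, untested sketch (object names, buffer size and count are
just placeholders) that keeps many writes in flight with the librados AIO calls
could look something like this:

/* Hypothetical sketch: issue N_WRITES writes in parallel via librados AIO.
 * Assumes "io" is an already opened rados_ioctx_t, as in the intro example. */
#include <stdio.h>
#include <string.h>
#include <rados/librados.h>

#define N_WRITES 1024

static int write_many(rados_ioctx_t io)
{
        rados_completion_t comps[N_WRITES];
        char buf[4096];
        char oid[64];
        int i, err;

        memset(buf, 'x', sizeof(buf));

        for (i = 0; i < N_WRITES; i++) {
                snprintf(oid, sizeof(oid), "bench-obj-%d", i);
                /* no callbacks needed, we just wait for completion below */
                err = rados_aio_create_completion(NULL, NULL, NULL, &comps[i]);
                if (err < 0)
                        return err;
                err = rados_aio_write(io, oid, comps[i], buf, sizeof(buf), 0);
                if (err < 0)
                        return err;
        }

        /* wait for all writes to be acknowledged, then free the completions */
        for (i = 0; i < N_WRITES; i++) {
                rados_aio_wait_for_complete(comps[i]);
                rados_aio_release(comps[i]);
        }
        return 0;
}

Timing that loop (or running several of them from different threads) should give
a much better picture than a single 12-byte write.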

(Or just get fio with rbd support compiled in; how to benchmark Ceph at a low
level is already a solved problem.)

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread Simon Sutter
Khodayar,


I added all those repos, but still, those packages are missing.

I can of course search for the exact package name like this:


[root@node1 ~]# yum search python3-cherrypy
Loaded plugins: fastestmirror, langpacks, priorities
Loading mirror speeds from cached hostfile
 * base: pkg.adfinis-sygroup.ch
 * centos-ceph-nautilus: pkg.adfinis-sygroup.ch
 * centos-nfs-ganesha28: pkg.adfinis-sygroup.ch
 * centos-openstack-stein: pkg.adfinis-sygroup.ch
 * centos-qemu-ev: pkg.adfinis-sygroup.ch
 * centos-sclo-rh: pkg.adfinis-sygroup.ch
 * centos-sclo-sclo: pkg.adfinis-sygroup.ch
 * epel: pkg.adfinis-sygroup.ch
 * extras: pkg.adfinis-sygroup.ch
 * updates: pkg.adfinis-sygroup.ch
Warning: No matches found for: python3-cherrypy
No matches found


But as you can see, it cannot find it.

Anything else I can try?


Regards,

Simon


From: Khodayar Doustar
Sent: Thursday, 23 April 2020 14:41:38
To: Simon Sutter
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: Upgrading to Octopus

Simon,

You can try to search for the exact package name, you can try these repos as 
well:

yum -y install 
epel-release centos-release-ceph-nautilus centos-release-openstack-stein


On Thu, Apr 23, 2020 at 11:57 AM Simon Sutter
<ssut...@hosttech.ch> wrote:
Hello Khodayar


Of cours I tried installing them with yum...

They are not available in the centos base or epel repos, here are the ones, 
which are available:


[root@node1 ~]# yum list | egrep "cherrypy|jwt|routes"
python-cherrypy.noarch 3.2.2-4.el7@base
python-cherrypy2.noarch2.3.0-19.el7   @epel
python-jwt.noarch  1.5.3-1.el7@base
python-routes.noarch   1.13-2.el7 @epel
nodejs-jwt-simple.noarch   0.2.0-1.el7epel
python36-jwt.noarch1.6.4-2.el7epel


How do I get either: The right packages or a workaround because i can install 
the dependencies with pip?


Regards,

Simon



From: Khodayar Doustar <dous...@rayanexon.ir>
Sent: Wednesday, 22 April 2020 20:02:04
To: Simon Sutter
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Upgrading to Octopus

Hi Simon,

Have you tried installing them with yum?




On Wed, Apr 22, 2020 at 6:16 PM Simon Sutter
<ssut...@hosttech.ch> wrote:
Hello everybody


In octopus there are some interesting looking features, so I tried to upgrading 
my Centos 7 test nodes, according to:
https://docs.ceph.com/docs/master/releases/octopus/

Everything went fine and the cluster is healthy.


To test out the new dashboard functions, I tried to install it, but there are 
missing dependencies:

yum install ceph-mgr-dashboard.noarch

.

--> Finished Dependency Resolution
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-routes
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-jwt
Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
   Requires: python3-cherrypy


Installing them with pip3 does of course make no difference, because those are 
yum dependencies.

Does anyone know a workaround?

Do I have to upgrade to Centos 8 for this to work?


Thanks in advance,

Simon
___
ceph-users mailing list -- 
ceph-users@ceph.io>
To unsubscribe send an email to 
ceph-users-le...@ceph.io>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread Adam Tygart
The release notes [1] specify only partial support for CentOS 7.

"Note that the dashboard, prometheus, and restful manager modules will
not work on the CentOS 7 build due to Python 3 module dependencies
that are missing in CentOS 7."

You will need to move to CentOS 8, or potentially containerize [2](?)
your managers to get the full functionality.

[1] https://docs.ceph.com/docs/master/releases/octopus/
[2] https://docs.ceph.com/docs/master/cephadm/#cephadm

--
Adam

On Thu, Apr 23, 2020 at 9:39 AM Simon Sutter  wrote:
>
> Khodayar,
>
>
> I added all those repos, but sitll, those packages are missing.
>
> I can of course search for the exact package name like this:
>
>
> [root@node1 ~]# yum search python3-cherrypy
> Loaded plugins: fastestmirror, langpacks, priorities
> Loading mirror speeds from cached hostfile
>  * base: pkg.adfinis-sygroup.ch
>  * centos-ceph-nautilus: pkg.adfinis-sygroup.ch
>  * centos-nfs-ganesha28: pkg.adfinis-sygroup.ch
>  * centos-openstack-stein: pkg.adfinis-sygroup.ch
>  * centos-qemu-ev: pkg.adfinis-sygroup.ch
>  * centos-sclo-rh: pkg.adfinis-sygroup.ch
>  * centos-sclo-sclo: pkg.adfinis-sygroup.ch
>  * epel: pkg.adfinis-sygroup.ch
>  * extras: pkg.adfinis-sygroup.ch
>  * updates: pkg.adfinis-sygroup.ch
> Warning: No matches found for: python3-cherrypy
> No matches found
>
>
> But as you can see, it cannot find it.
>
> Anything else I can try?
>
>
> Regards,
>
> Simon
>
> 
> Von: Khodayar Doustar 
> Gesendet: Donnerstag, 23. April 2020 14:41:38
> An: Simon Sutter
> Cc: ceph-users@ceph.io
> Betreff: Re: [ceph-users] Re: Upgrading to Octopus
>
> Simon,
>
> You can try to search for the exact package name, you can try these repos as 
> well:
>
> yum -y install 
> epel-release centos-release-ceph-nautilus centos-release-openstack-stein
>
>
> On Thu, Apr 23, 2020 at 11:57 AM Simon Sutter 
> mailto:ssut...@hosttech.ch>> wrote:
> Hello Khodayar
>
>
> Of cours I tried installing them with yum...
>
> They are not available in the centos base or epel repos, here are the ones, 
> which are available:
>
>
> [root@node1 ~]# yum list | egrep "cherrypy|jwt|routes"
> python-cherrypy.noarch 3.2.2-4.el7@base
> python-cherrypy2.noarch2.3.0-19.el7   @epel
> python-jwt.noarch  1.5.3-1.el7@base
> python-routes.noarch   1.13-2.el7 @epel
> nodejs-jwt-simple.noarch   0.2.0-1.el7epel
> python36-jwt.noarch1.6.4-2.el7epel
>
>
> How do I get either: The right packages or a workaround because i can install 
> the dependencies with pip?
>
>
> Regards,
>
> Simon
>
>
> 
> Von: Khodayar Doustar mailto:dous...@rayanexon.ir>>
> Gesendet: Mittwoch, 22. April 2020 20:02:04
> An: Simon Sutter
> Cc: ceph-users@ceph.io
> Betreff: Re: [ceph-users] Upgrading to Octopus
>
> Hi Simon,
>
> Have you tried installing them with yum?
>
>
>
>
> On Wed, Apr 22, 2020 at 6:16 PM Simon Sutter 
> mailto:ssut...@hosttech.ch>>>
>  wrote:
> Hello everybody
>
>
> In octopus there are some interesting looking features, so I tried to 
> upgrading my Centos 7 test nodes, according to:
> https://docs.ceph.com/docs/master/releases/octopus/
>
> Everything went fine and the cluster is healthy.
>
>
> To test out the new dashboard functions, I tried to install it, but there are 
> missing dependencies:
>
> yum install ceph-mgr-dashboard.noarch
>
> .
>
> --> Finished Dependency Resolution
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-routes
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-jwt
> Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
>Requires: python3-cherrypy
>
>
> Installing them with pip3 does of course make no difference, because those 
> are yum dependencies.
>
> Does anyone know a workaround?
>
> Do I have to upgrade to Centos 8 for this to work?
>
>
> Thanks in advance,
>
> Simon
> ___
> ceph-users mailing list -- 
> ceph-users@ceph.io>
> To unsubscribe send an email to 
> ceph-users-le...@ceph.io>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to 
> ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
_

[ceph-users] Dear Abby: Why Is Architecting CEPH So Hard?

2020-04-23 Thread Linus VanWeil
Hello,

Looks like the original chain got deleted, but thank you to everyone who 
responded. Just to keep any newcomers in the loop, I have pasted the original
posting below. To all the original contributors to this chain, I feel much
more confident in my design theory for the storage nodes. However, I wanted to 
narrow the focus and see if I can get any elaborated comments on the two below 
topics.

Does anyone have any real-world data on metrics I can use to size MONs?
When are they active?
When do they utilize CPU, RAM, Storage (ie. larger storage pools require more 
resources, resources are used during recovery, etc.)?

For anyone that commented or has opinions on Storage node sizing:
How does choosing EC vs 3X replication affect your sizing of CPU / RAM?
Is there some kind of overhead generalization I can use if assuming EC (i.e. add
an extra core per OSD)? I understand that recoveries are where this is most 
important, so I am looking for sizing metrics based on living through worst 
case scenarios.



---
ORIGINAL POSTING:

Hey Folks,

This is my first ever post here in the CEPH user group and I will preface with 
the fact
that I know this is a lot of what many people ask frequently. Unlike what I 
assume to be a
large majority of CEPH “users” in this forum, I am more of a CEPH 
“distributor.” My
interests lie in how to build a CEPH environment to best fill an organization’s 
needs. I am
here for the real-world experience and expertise so that I can learn to build 
CEPH
“right.” I have spent the last couple years collecting data on general “best 
practices”
through forum posts, CEPH documentation, CEPHLACON, etc. I wanted to post my 
findings to
the forum to see where I can harden my stance.

Below are two example designs that I might use when architecting a solution 
currently. I
have specific questions around design elements in each that I would like you to 
approve
for holding water or not. I want to focus on the hardware, so I am asking for
generalizations where possible. Let’s assume in all scenarios that we are using 
Luminous
and that the data type is mixed use.
I am not expecting anyone to run through every question, so please feel free to 
comment on
any piece you can. Tell me what is overkill and what is lacking!

Example 1:
8x 60-Bay (8TB) Storage nodes (480x 8TB SAS Drives)
Storage Node Spec:
2x 32C 2.9GHz AMD EPYC
- Documentation mentions .5 cores per OSD for throughput optimized. Are they 
talking
about .5 Physical cores or .5 Logical cores?
- Is it better to pick my processors based on a total GHz measurement like 2GHz 
per
OSD?
- Would a theoretical 8C at 2GHz serve the same number of OSDs as a 16C at 
1GHz? Would
Threads be included in this calculation?
512GB Memory
- I know this is the hot topic because of its role in recoveries. Basically, I 
am
looking for the most generalized practice I can use as a safe number and a 
metric I can
use as a nice to have.
- Is it 1GB of RAM per TB of RAW OSD?
2x 3.2TB NVMe WAL/DB / Log Drive
- Another hot topic that I am sure will bring many “it depends.” All I am 
looking for
is experience on this. I know people have mentioned having at least 70GB of 
Flash for
WAL/DB / Logs.
- Can I use 70GB as a flat calculation per OSD or does it depend on the size of
the OSD?
- I know more is better, but what is a number I can use to get started with 
minimal
issues?
2x 56Gbit Links
- I think this should be enough given the rule of thumb of 10Gbit for every 12 
OSDs.
3x MON Node
MON Node Spec:
1x 8C 3.2GHz AMD EPYC
- I can’t really find good practices around when to increase your core count. 
Any
suggestions?
128GB Memory
- What do I need memory for in a MON?
- When do I need to expand?
2x 480GB Boot SSDs
- Any reason to look more closely into the sizing of these drives?
2x 25Gbit Uplinks
- Should these match the output of the storage nodes for any reason?


Example 2:
8x 12-Bay NVMe Storage nodes (96x 1.6TB NVMe Drives)
Storage Node Spec:
2x 32C 2.9GHz AMD EPYC
- I have read that each NVMe OSD should have 10 cores. I am not splitting
Physical
drives into multiple OSDs so let’s assume I have 12 OSD per Node.
- Would threads count toward my 10 core quota or just physical cores?
- Can I do a similar calculation as I mentioned before and just use 20GHz per 
OSD
instead of focusing on cores specifically?
512GB Memory
- I assume there is some reason I can’t use the same methodology of 1GB per TB 
of OSD
since this is NVMe storage
2x 100Gbit Links
- This is assuming about 1Gigabyte per second of real-world speed per disk

3x MON Node – What differences should MONs serving NVMe have compared to large 
NLSAS
pools?
MON Node Spec:
1x 8C 3.2GHz AMD Epyc
128GB Memory
2x 480GB Boot SSDs
2x 25Gbit Uplinks
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Increase number of read and writes

2020-04-23 Thread Bobby
Hi Janne,

Thanks a lot ! I should have checked it earlier..I got it :-)

Basically I would like to compile the client read and write C/C++ code and
then later profile the executables with valgrind and other profiling
tools, the reason being that I want to see the function calls, execution time,
etc. This is very easy with the given librados example. I am already doing
the profiling of the executables.

What you have pointed out regarding fio is exactly my next goal (you
read my mind).

Given where I am at the moment (a Ceph deployment cluster) and given what I
want to achieve (profiling the executables of read/write test code with a
high number of reads and writes), how can I bring fio into it? Maybe there
are already some Ceph test codes with a high number of write and read calls
in parallel?

I have come across this example librbd test code in the Ceph repository:
( https://github.com/ceph/ceph/blob/master/examples/librbd/hello_world.cc )

..



On Thu, Apr 23, 2020 at 4:16 PM Janne Johansson  wrote:

>
>
> Den tors 23 apr. 2020 kl 16:07 skrev Bobby :
>
>> Hi,
>>
>> I am using Ceph in developer mode. Currently I am implementing Librados
>> examples which are also available in Introduction to Librados section
>>
>> https://docs.ceph.com/docs/master/rados/api/librados-intro/#step-3-creating-an-i-o-context
>> .
>> It says once your app has a cluster handle and a connection to a Ceph
>> Storage Cluster, you may create an I/O Context and begin reading and
>> writing data.  For example,
>>
>> err = rados_write(io, "hw", "Hello World!", 12, 0);
>>
>
>
>>
>> My question, Is "12" is the number of writes? Because I want to test the
>> with high number of read and writes.
>>
>> Looking for help !
>>
>
> Just check what parameters the function takes:
> CEPH_RADOS_API int rados_write(rados_ioctx_t io, const char *oid, const char *buf, size_t len, uint64_t off)
>
> Write len bytes from buf into the oid object, starting at offset off.
> The value of len must be <= UINT_MAX/2.
>
>
> The 12 seems to be the length of "Hello World!" in bytes, which matches
> what a normal write() call would need.
> In order to test high number of writes, you need to send lots of write
> calls in parallel.
>
> (Or just get fio with rbd support compiled in, this is a solved problem
> already how to benchmark ceph at a low level)
>
> --
> May the most significant bit of your life be positive.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread Khodayar Doustar
Yes! That was what I was going to paste here

On Thu, Apr 23, 2020 at 7:18 PM Adam Tygart  wrote:

> The release notes [1] specify only partial support for CentOS 7.
>
> "Note that the dashboard, prometheus, and restful manager modules will
> not work on the CentOS 7 build due to Python 3 module dependencies
> that are missing in CentOS 7."
>
> You will need to move to CentOS 8, or potentially containerize [2](?)
> your managers to get the full functionality.
>
> [1] https://docs.ceph.com/docs/master/releases/octopus/
> [2] https://docs.ceph.com/docs/master/cephadm/#cephadm
>
> --
> Adam
>
> On Thu, Apr 23, 2020 at 9:39 AM Simon Sutter  wrote:
> >
> > Khodayar,
> >
> >
> > I added all those repos, but sitll, those packages are missing.
> >
> > I can of course search for the exact package name like this:
> >
> >
> > [root@node1 ~]# yum search python3-cherrypy
> > Loaded plugins: fastestmirror, langpacks, priorities
> > Loading mirror speeds from cached hostfile
> >  * base: pkg.adfinis-sygroup.ch
> >  * centos-ceph-nautilus: pkg.adfinis-sygroup.ch
> >  * centos-nfs-ganesha28: pkg.adfinis-sygroup.ch
> >  * centos-openstack-stein: pkg.adfinis-sygroup.ch
> >  * centos-qemu-ev: pkg.adfinis-sygroup.ch
> >  * centos-sclo-rh: pkg.adfinis-sygroup.ch
> >  * centos-sclo-sclo: pkg.adfinis-sygroup.ch
> >  * epel: pkg.adfinis-sygroup.ch
> >  * extras: pkg.adfinis-sygroup.ch
> >  * updates: pkg.adfinis-sygroup.ch
> > Warning: No matches found for: python3-cherrypy
> > No matches found
> >
> >
> > But as you can see, it cannot find it.
> >
> > Anything else I can try?
> >
> >
> > Regards,
> >
> > Simon
> >
> > 
> > Von: Khodayar Doustar 
> > Gesendet: Donnerstag, 23. April 2020 14:41:38
> > An: Simon Sutter
> > Cc: ceph-users@ceph.io
> > Betreff: Re: [ceph-users] Re: Upgrading to Octopus
> >
> > Simon,
> >
> > You can try to search for the exact package name, you can try these
> repos as well:
> >
> > yum -y install
> epel-release centos-release-ceph-nautilus centos-release-openstack-stein
> >
> >
> > On Thu, Apr 23, 2020 at 11:57 AM Simon Sutter  > wrote:
> > Hello Khodayar
> >
> >
> > Of cours I tried installing them with yum...
> >
> > They are not available in the centos base or epel repos, here are the
> ones, which are available:
> >
> >
> > [root@node1 ~]# yum list | egrep "cherrypy|jwt|routes"
> > python-cherrypy.noarch 3.2.2-4.el7@base
> > python-cherrypy2.noarch2.3.0-19.el7   @epel
> > python-jwt.noarch  1.5.3-1.el7@base
> > python-routes.noarch   1.13-2.el7 @epel
> > nodejs-jwt-simple.noarch   0.2.0-1.el7epel
> > python36-jwt.noarch1.6.4-2.el7epel
> >
> >
> > How do I get either: The right packages or a workaround because i can
> install the dependencies with pip?
> >
> >
> > Regards,
> >
> > Simon
> >
> >
> > 
> > Von: Khodayar Doustar mailto:dous...@rayanexon.ir
> >>
> > Gesendet: Mittwoch, 22. April 2020 20:02:04
> > An: Simon Sutter
> > Cc: ceph-users@ceph.io
> > Betreff: Re: [ceph-users] Upgrading to Octopus
> >
> > Hi Simon,
> >
> > Have you tried installing them with yum?
> >
> >
> >
> >
> > On Wed, Apr 22, 2020 at 6:16 PM Simon Sutter  >> wrote:
> > Hello everybody
> >
> >
> > In octopus there are some interesting looking features, so I tried to
> upgrading my Centos 7 test nodes, according to:
> > https://docs.ceph.com/docs/master/releases/octopus/
> >
> > Everything went fine and the cluster is healthy.
> >
> >
> > To test out the new dashboard functions, I tried to install it, but
> there are missing dependencies:
> >
> > yum install ceph-mgr-dashboard.noarch
> >
> > .
> >
> > --> Finished Dependency Resolution
> > Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
> >Requires: python3-routes
> > Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
> >Requires: python3-jwt
> > Error: Package: 2:ceph-mgr-dashboard-15.2.1-0.el7.noarch (Ceph-noarch)
> >Requires: python3-cherrypy
> >
> >
> > Installing them with pip3 does of course make no difference, because
> those are yum dependencies.
> >
> > Does anyone know a workaround?
> >
> > Do I have to upgrade to Centos 8 for this to work?
> >
> >
> > Thanks in advance,
> >
> > Simon
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io >>
> > To unsubscribe send an email to ceph-users-le...@ceph.io ceph-users-le...@ceph.io>>
> > ___

[ceph-users] Re: How to debug ssh: ceph orch host add ceph01 10.10.1.1

2020-04-23 Thread Ml Ml
Can anyone help me here? :-/

On Wed, Apr 22, 2020 at 10:36 PM Ml Ml  wrote:
>
> Hello List,
>
> i did:
> root@ceph01:~# ceph cephadm set-ssh-config -i /tmp/ssh_conf
>
> root@ceph01:~# cat /tmp/ssh_conf
> Host *
> User root
> StrictHostKeyChecking no
> UserKnownHostsFile /dev/null
>
> root@ceph01:~# ceph config-key set mgr/cephadm/ssh_identity_key -i
> /root/.ssh/id_rsa
> set mgr/cephadm/ssh_identity_key
> root@ceph01:~# ceph config-key set mgr/cephadm/ssh_identity_pub -i
> /root/.ssh/id_rsa.pub
> set mgr/cephadm/ssh_identity_pub
>
> But i get:
> root@ceph01:~# ceph orch host add ceph01 10.10.1.1
> Error ENOENT: Failed to connect to ceph01 (10.10.1.1).  Check that the
> host is reachable and accepts connections using the cephadm SSH key
>
> root@ceph01:~# ceph config-key get mgr/cephadm/ssh_identity_key =>
> this shows my private key
>
> How can i debug this?
>
> root@ceph01:~# ssh 10.10.1.1
>   or
> root@ceph01:~# ssh ceph01
>
> work without a prompt or key error.
>
> I am using 15.2.0.
>
> Thanks,
> Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: adding block.db to OSD

2020-04-23 Thread Stefan Priebe - Profihost AG

Hi,

if the OSDs are idle the difference is even worse:

# ceph tell osd.0 bench
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 15.39670787501,
"bytes_per_sec": 69738403.346825853,
"iops": 16.626931034761871
}

# ceph tell osd.38 bench
{
"bytes_written": 1073741824,
"blocksize": 4194304,
"elapsed_sec": 6.890398517004,
"bytes_per_sec": 155831599.77624846,
"iops": 37.153148597776521
}

Stefan

On 23.04.20 at 14:39, Stefan Priebe - Profihost AG wrote:

Hi,
On 23.04.20 at 14:06, Igor Fedotov wrote:
I don't recall any additional tuning to be applied to new DB volume. 
And assume the hardware is pretty the same...


Do you still have any significant amount of data spilled over for 
these updated OSDs? If not I don't have any valid explanation for the 
phenomena.


just the 64k from here:
https://tracker.ceph.com/issues/44509

You might want to try "ceph osd bench" to compare OSDs under pretty 
the same load. Any difference observed


Servers are the same HW. OSD Bench is:
# ceph tell osd.0 bench
{
     "bytes_written": 1073741824,
     "blocksize": 4194304,
     "elapsed_sec": 16.09141478101,
     "bytes_per_sec": 66727620.822242722,
     "iops": 15.909104543266945
}

# ceph tell osd.36 bench
{
     "bytes_written": 1073741824,
     "blocksize": 4194304,
     "elapsed_sec": 10.023828538,
     "bytes_per_sec": 107118933.6419194,
     "iops": 25.539143953780986
}


OSD 0 is a Toshiba MG07SCA12TA SAS 12G
OSD 36 is a Seagate ST12000NM0008-2H SATA 6G

SSDs are all the same like the rest of the HW. But both drives should 
give the same performance from their specs. The only other difference is 
that OSD 36 was directly created with the block.db device (Nautilus 
14.2.7) and OSD 0 (14.2.8) does not.


Stefan



On 4/23/2020 8:35 AM, Stefan Priebe - Profihost AG wrote:

Hello,

is there anything else needed beside running:
ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-${OSD} 
bluefs-bdev-new-db --dev-target /dev/vgroup/lvdb-1


I did so some weeks ago and currently i'm seeing that all osds 
originally deployed with --block-db show 10-20% I/O waits while all 
those got converted using ceph-bluestore-tool show 80-100% I/O waits.


Also is there some tuning available to use more of the SSD? The SSD 
(block-db) is only saturated at 0-2%.


Greets,
Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Upgrading to Octopus

2020-04-23 Thread gert . wieberdink
Hello Simon,
I think that Khodayar is right. I managed to install a new Ceph cluster on
CentOS 8.1. You will need the ceph-el8.repo for the time being. For some
reason, "they" left the py3 packages you mentioned out of EPEL (as with
leveldb, but that package luckily appeared in EPEL last week).
Please find below the ceph-el8.repo file, which you have to create in 
/etc/yum.repos.d/

[copr:copr.fedorainfracloud.org:ktdreyer:ceph-el8]
name=Copr repo for ceph-el8 owned by ktdreyer
baseurl=https://download.copr.fedorainfracloud.org/results/ktdreyer/ceph-el8/epel-8-$basearch/
type=rpm-md
skip_if_unavailable=True
gpgcheck=1
gpgkey=https://download.copr.fedorainfracloud.org/results/ktdreyer/ceph-el8/pubkey.gpg
repo_gpgcheck=0
enabled=1
enabled_metadata=1

This repository - and CentOS 8.x - should be sufficient to bring up a
fresh Ceph cluster.
Please let me know if you still have problems in configuring your Ceph cluster.
rgds,
-gw
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v13.2.10 Mimic released

2020-04-23 Thread Abhishek Lekshmanan

We're happy to announce the availability of the tenth bugfix release of
Ceph Mimic, this release fixes a RGW vulnerability affecting mimic, and
we recommend that all mimic users upgrade.

Notable Changes
---
* CVE-2020-12059: Fix an issue with Post Object Requests with Tagging
  (#44967, Lei Cao, Abhishek Lekshmanan)


Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-13.2.10.tar.gz
* For packages: http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 564bdc4ae87418a232fc901524470e1a0f76d641
* Release blog: https://ceph.io/releases/v13-2-10-mimic-released

--
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] active+remapped+backfilling keeps going .. and going

2020-04-23 Thread Kyriazis, George
Hello,

I have a Proxmox ceph cluster with 5 nodes and 3 OSDs each (total 15 OSDs), on 
a 10G network.

The cluster started small, and I’ve progressively added OSDs over time.  
Problem is…. The cluster never rebalances completely.  There is always progress 
on backfilling, but PGs that used to be in active+clean state jump back into 
the active+remapped+backfilling (or active+remapped+backfill_wait) state, to be 
moved to different OSDs.

Initially I had a 1G network (recently upgraded to 10G), and I was holding back 
on the backfill settings (osd_max_backfills and osd_recovery_sleep_hdd).  I just 
recently (in the last few weeks) upgraded to 10G, with osd_max_backfills = 50 and 
osd_recovery_sleep_hdd = 0 (only HDDs, no SSDs).  The cluster has been backfilling 
for months now with no end in sight.

Is this normal behavior?  Is there any setting I can look at that will 
give me an idea as to why PGs are jumping back into remapped from clean?
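
For reference, a sketch of what I look at to see the effective settings and 
which PGs are on the move (osd.0 is just an example; config show only lists 
values that differ from the defaults):

# effective non-default settings on one of the OSDs
ceph config show osd.0 | grep -E 'osd_max_backfills|osd_recovery_sleep'

# which PGs are currently remapped or backfilling, and their up/acting sets
ceph pg dump pgs_brief | grep -E 'remapped|backfill'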

Below is output of “ceph osd tree” and “ceph osd df”:

# ceph osd tree
ID  CLASS WEIGHT    TYPE NAME       STATUS REWEIGHT PRI-AFF
 -1       203.72472 root default
 -9        40.01666     host vis-hsw-01
  3   hdd  10.91309         osd.3       up  1.0      1.0
  6   hdd  14.55179         osd.6       up  1.0      1.0
 10   hdd  14.55179         osd.10      up  1.0      1.0
-13        40.01666     host vis-hsw-02
  0   hdd  10.91309         osd.0       up  1.0      1.0
  7   hdd  14.55179         osd.7       up  1.0      1.0
 11   hdd  14.55179         osd.11      up  1.0      1.0
-11        40.01666     host vis-hsw-03
  4   hdd  10.91309         osd.4       up  1.0      1.0
  8   hdd  14.55179         osd.8       up  1.0      1.0
 12   hdd  14.55179         osd.12      up  1.0      1.0
 -3        40.01666     host vis-hsw-04
  5   hdd  10.91309         osd.5       up  1.0      1.0
  9   hdd  14.55179         osd.9       up  1.0      1.0
 13   hdd  14.55179         osd.13      up  1.0      1.0
-15        43.65807     host vis-hsw-05
  1   hdd  14.55269         osd.1       up  1.0      1.0
  2   hdd  14.55269         osd.2       up  1.0      1.0
 14   hdd  14.55269         osd.14      up  1.0      1.0
 -5         0            host vis-ivb-07
 -7         0            host vis-ivb-10
#

# ceph osd df
ID CLASS WEIGHT   REWEIGHT SIZE    RAW USE DATA    OMAP    META   AVAIL   %USE  VAR  PGS STATUS
 3   hdd 10.91309  1.0   11 TiB 8.2 TiB 8.2 TiB 552 MiB 25 GiB 2.7 TiB 75.08 1.19 131     up
 6   hdd 14.55179  1.0   15 TiB 9.1 TiB 9.1 TiB 1.2 GiB 30 GiB 5.5 TiB 62.47 0.99 148     up
10   hdd 14.55179  1.0   15 TiB 8.1 TiB 8.1 TiB 1.5 GiB 20 GiB 6.4 TiB 55.98 0.89 142     up
 0   hdd 10.91309  1.0   11 TiB 7.5 TiB 7.4 TiB 504 MiB 24 GiB 3.5 TiB 68.34 1.09 120     up
 7   hdd 14.55179  1.0   15 TiB 8.7 TiB 8.7 TiB 1.0 GiB 31 GiB 5.8 TiB 60.07 0.95 144     up
11   hdd 14.55179  1.0   15 TiB 9.4 TiB 9.3 TiB 819 MiB 20 GiB 5.2 TiB 64.31 1.02 147     up
 4   hdd 10.91309  1.0   11 TiB 7.0 TiB 7.0 TiB 284 MiB 25 GiB 3.9 TiB 64.35 1.02 112     up
 8   hdd 14.55179  1.0   15 TiB 9.3 TiB 9.2 TiB 1.8 GiB 29 GiB 5.3 TiB 63.65 1.01 157     up
12   hdd 14.55179  1.0   15 TiB 8.6 TiB 8.6 TiB 623 MiB 19 GiB 5.9 TiB 59.14 0.94 136     up
 5   hdd 10.91309  1.0   11 TiB 8.6 TiB 8.6 TiB 542 MiB 29 GiB 2.3 TiB 79.01 1.26 134     up
 9   hdd 14.55179  1.0   15 TiB 8.2 TiB 8.2 TiB 707 MiB 27 GiB 6.3 TiB 56.56 0.90 138     up
13   hdd 14.55179  1.0   15 TiB 8.7 TiB 8.7 TiB 741 MiB 18 GiB 5.8 TiB 59.85 0.95 134     up
 1   hdd 14.55269  1.0   15 TiB 9.8 TiB 9.8 TiB 1.3 GiB 20 GiB 4.8 TiB 67.18 1.07 158     up
 2   hdd 14.55269  1.0   15 TiB 8.7 TiB 8.7 TiB 936 MiB 18 GiB 5.8 TiB 60.04 0.95 148     up
14   hdd 14.55269  1.0   15 TiB 8.3 TiB 8.3 TiB 673 MiB 18 GiB 6.3 TiB 56.97 0.90 131     up
                    TOTAL 204 TiB 128 TiB 128 TiB  13 GiB 350 GiB 75 TiB 62.95
MIN/MAX VAR: 0.89/1.26  STDDEV: 6.44
#


Thank you!

George

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Recovery throughput inversely linked with rbd_cache_xyz?

2020-04-23 Thread Harry G. Coin

Hello,

A couple of days ago I increased the rbd cache size from the default to 
256 MB per OSD on a small 4-node, 6-OSDs-per-node setup in a test/lab 
setting.  The rbd volumes are all VM images with writeback cache 
parameters and a steady, if small, write load of a few MB/s - mostly 
logging.  I noticed that recovery throughput went down 10x-50x.  Using 
Ceph Nautilus.  Am I seeing a coincidence, or should recovery throughput 
tank when rbd cache sizes go up?  The underlying pools are mirrored on 
three disks, each on a different node.
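
For context, a sketch of how such a cache change is typically applied and 
what I would compare on the recovery side (option names from the Nautilus 
docs; the value is just the 256 MB mentioned above; config show only lists 
non-default values):

# raise the librbd cache for clients via the central config db
ceph config set client rbd_cache_size 268435456

# recovery speed itself is governed by OSD-side options, not the rbd cache
ceph config show osd.0 | grep -E 'osd_recovery_sleep|osd_max_backfills'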


Thanks!

Harry Coin

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: active+remapped+backfilling keeps going .. and going

2020-04-23 Thread Eugen Block

Hi,
the balancer is probably running - which mode? I changed the mode to
none in our own cluster because it also never finished rebalancing and
we didn’t have a bad PG distribution. Maybe it’s supposed to be like
that, I don’t know.
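
A short sketch of the commands involved (standard balancer commands, 
nothing cluster-specific):

# show whether the balancer is active, its mode and any running plan
ceph balancer status

# either disable it entirely ...
ceph balancer off

# ... or keep it enabled but set the mode to none
ceph balancer mode none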


Regards
Eugen


Zitat von "Kyriazis, George" :


[quoted message snipped]



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] HBase/HDFS on Ceph/CephFS

2020-04-23 Thread jesper
Hi

We have a 3-year-old Hadoop cluster - up for refresh - so it is time
to evaluate options. The "only" use case is running an HBase installation,
which is important for us, and migrating out of HBase would be a hassle.

Our Ceph usage has expanded and in general - we really like what we see.

Thus - can this be "sanely" consolidated somehow? I have seen this:
https://docs.ceph.com/docs/jewel/cephfs/hadoop/
but it seems really bogus to me.

It recommends that you set:
pool 3 'hadoop1' rep size 1 min_size 1

Which would - if I understand correctly - be disastrous. The Hadoop end would
be replicated 3x across the cluster, but within Ceph the replication would be 1.
With replication 1 in Ceph, pulling an OSD node would "guarantee" that its PGs
go inactive - which could be OK - but there is nothing guaranteeing that the
other Hadoop replicas are not served out of the same OSD node/PG. In that
case, rebooting an OSD node would make the Hadoop cluster unavailable.

Is anyone serving HBase out of Ceph - how does the stack and
configuration look? If I went for 3x replication in both Ceph and HDFS
then it would definitely work, but 9x copies of the dataset is a bit more
than what looks feasible at the moment.
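
For illustration, a minimal sketch of the pool side I would expect with the
CephFS Hadoop bindings - a normally replicated data pool instead of the
"rep size 1" pool from the jewel docs, so that Ceph alone provides the
redundancy (pool name, PG count and filesystem name are made up):

# dedicated 3x-replicated data pool for the Hadoop/HBase data
ceph osd pool create hadoop1 128 128 replicated
ceph osd pool set hadoop1 size 3
ceph osd pool set hadoop1 min_size 2

# attach it as an additional data pool of the CephFS filesystem "cephfs"
ceph fs add_data_pool cephfs hadoop1

If I understand the bindings correctly there is no HDFS layer underneath, so
the Hadoop-side replication factor can stay at 1 and the total copy count is
3, not 9 - but I would be happy to be corrected.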

Thanks for your reflections/input.

Jesper
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: active+remapped+backfilling keeps going .. and going

2020-04-23 Thread Lomayani S. Laizer
I had a similar problem when I upgraded to Octopus, and the solution was to
turn off autobalancing.

You can try turning it off if it is enabled:

ceph balancer off



On Fri, Apr 24, 2020 at 8:51 AM Eugen Block  wrote:

> [quoted message snipped]