[ceph-users] Re: Cephfs IO halt on Node failure

2020-05-18 Thread Eugen Block
If your pool has min_size 2 and size 2 (always a bad idea) it will
pause IO in case of a failure until the recovery has finished, so the
described behaviour is expected.



Zitat von Amudhan P :


Hi,

Crush rule is "replicated" and min_size 2 actually. I am trying to test
multiple volume configs in a single filesystem
using file layouts.

I have created the metadata pool with rep 3 (min_size 2 and a replicated crush
rule) and a data pool with rep 3 (min_size 2 and a replicated crush rule), and
I have also created multiple pools (replica 2, ec2-1 & ec4-2) and added them to
the filesystem.

Using file layouts I have assigned a different data pool to each folder, so
I can test different configs in the same filesystem. All data pools have
min_size set to handle a single node failure.
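
(For reference, assigning a directory to one of the extra data pools via file
layouts looks roughly like this; the pool and path names below are just
placeholders:)

ceph fs add_data_pool cephfs cephfs_data_ec21      # register the pool with the filesystem
setfattr -n ceph.dir.layout.pool -v cephfs_data_ec21 /mnt/cephfs/ec21   # pin a folder to that pool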

A single node failure is handled properly when the filesystem has only the
metadata pool and one data pool (rep 3).

After adding the additional data pools to the fs, the single node failure
scenario no longer works.

regards
Amudhan P

On Sun, May 17, 2020 at 1:29 AM Eugen Block  wrote:


What’s your pool configuration wrt min_size and crush rules?


Zitat von Amudhan P :

> Hi,
>
> I am using ceph Nautilus cluster with below configuration.
>
> 3 node's (Ubuntu 18.04) each has 12 OSD's, and mds, mon and mgr are
running
> in shared mode.
>
> The client mounted through ceph kernel client.
>
> I was trying to emulate a node failure when a write and read were going
on
> (replica2) pool.
>
> I was expecting read and write continue after a small pause due to a Node
> failure but it halts and never resumes until the failed node is up.
>
> I remember I tested the same scenario before in ceph mimic where it
> continued IO after a small pause.
>
> regards
> Amudhan P
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io







___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph as a Fileserver for 3D Content Production

2020-05-18 Thread Janne Johansson
Den sön 17 maj 2020 kl 14:33 skrev Marc Roos :

> outs, and you are more likely to shoot yourself in the foot. At least
> ask first. Eg this bcache, I am not 100% sure what it is, but if it is
> sitting between the osd process and the disk, it could be getting nasty
> with a reset/power outage, when ceph thinks data is written to disk,
> while it is not.
>

Bcache is a neat way in Linux to stack devices, working basically like the
Ceph layered pools, but on a per-host basis of course.

The main advantage, and why some Ceph admins use it instead of putting the
WAL/DB on the fast device, is that a single bcache cache device can be placed
in front of several other devices, and the kernel decides which writes go to
the write cache. In effect you share the full size and speed of your fast
device across all OSDs, instead of having to split your NVMe into pieces or
preset a journal size. If you expect to add more drives to a box at any
point, this becomes even more interesting, since knowing beforehand how many
preallocations to leave room for is very hard, and you miss out on potential
gains until you get it right.
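
As a very rough sketch of such a setup (device names are placeholders, and
whether writeback caching is safe on your hardware is exactly the question
raised elsewhere in this thread):

make-bcache -C /dev/nvme0n1        # format the NVMe as the cache device
make-bcache -B /dev/sdb /dev/sdc   # format the HDDs as backing devices
echo <cache-set-uuid> > /sys/block/bcache0/bcache/attach   # attach a backing device to the cache set
echo writeback > /sys/block/bcache0/bcache/cache_mode      # enable write caching

The OSDs are then created on /dev/bcache0, /dev/bcache1, ... rather than on
the raw HDDs.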

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Ceph as a Fileserver for 3D Content Production

2020-05-18 Thread Moritz Wilhelm
Is there any experience with using bcache in writeback mode on Ceph OSDs?
Especially considering stability after a power outage?

Get Outlook for Android
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Luminous to Nautilus mon upgrade oddity - failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer

2020-05-18 Thread Dan van der Ster
Hi Tom,

Did you get past this? It looks like the mon is confused about how to decode
because of your non-standard release.
(So I imagine that running all 14.2.9 mons would get past it, but if
you're being cautious this should be reproducible on your test
cluster.)

-- Dan


On Wed, May 13, 2020 at 12:07 PM Thomas Byrne - UKRI STFC
 wrote:
>
> Hi all,
>
> We're upgrading a cluster from luminous to nautilus. The monitors and 
> managers are running a non-release version of luminous  
> (12.2.12-642-g5ff3e8e) and we're upgrading them to 14.2.9.
>
> We've upgraded one monitor and it's happily in quorum as a peon. However, 
> when a ceph status hits the nautilus mon it has trouble talking to the 
> manager apparently, and it returns a status output with no pg stats and 
> garbage usage numbers. From the mon log:
>
> 2020-05-13 10:41:43.121 7fa1e6fdf700  0 mon.ceph-mon5@4(peon) e25 
> handle_command mon_command({"prefix": "status"} v 0) v1
> 2020-05-13 10:41:43.121 7fa1e6fdf700  0 log_channel(audit) log [DBG] : 
> from='client.? v1:130.246.x.x:0/3261311028' entity='client.admin' 
> cmd=[{"prefix": "status"}]: dispatch
> 2020-05-13 10:41:43.443 7fa1e6fdf700 -1 mon.ceph-mon5@4(peon).mgrstat failed 
> to decode mgrstat state; luminous dev version? buffer::end_of_buffer
> 2020-05-13 10:41:44.397 7fa1e6fdf700  1 mon.ceph-mon5@4(peon) e25 dropping 
> unexpected mon_health( e 0 r 0 ) v1
>
> Is this expected for a luminous to nautilus upgrade or could this be due to 
> the odd luminous version we are running, or something else entirely?
>
> Cheers,
> Tom
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Luminous to Nautilus mon upgrade oddity - failed to decode mgrstat state; luminous dev version? buffer::end_of_buffer

2020-05-18 Thread Thomas Byrne - UKRI STFC
Hi Dan,

We ended up upgrading all mons+mgrs to 14.2.9 and the message stopped and the 
PG stats reappeared, as expected. Marcello started the OSD restarts this 
morning.

I think it would have been much less stressful to get the cluster onto 12.2.13 
before the nautilus upgrade, and much easier to get help should anything go 
wrong. Something to bear in mind for the future.
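
(In case it helps anyone doing a similar mixed-version upgrade: the running
versions can be checked at any point with the commands below; the mon name is
just the one from the log earlier in this thread.)

ceph versions                    # summary of running versions per daemon type
ceph tell mon.ceph-mon5 version  # ask one specific mon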

Cheers,
Tom

> -Original Message-
> From: Dan van der Ster 
> Sent: 18 May 2020 10:18
> To: Byrne, Thomas (STFC,RAL,SC) 
> Cc: ceph-users@ceph.io; Armand Pilon, Marcello (STFC,RAL,SC)
> 
> Subject: Re: [ceph-users] Luminous to Nautilus mon upgrade oddity - failed to
> decode mgrstat state; luminous dev version? buffer::end_of_buffer
> 
> Hi Tom,
> 
> Did you get past this? It looks like the mon is confused how to decode because
> of your non-standard release.
> (So I imaging that running all 14.2.9 mons would get past it, but if you're 
> being
> cautious this should be reproduceable on your test cluster).
> 
> -- Dan
> 
> 
> On Wed, May 13, 2020 at 12:07 PM Thomas Byrne - UKRI STFC
>  wrote:
> >
> > Hi all,
> >
> > We're upgrading a cluster from luminous to nautilus. The monitors and
> managers are running a non-release version of luminous  (12.2.12-642-
> g5ff3e8e) and we're upgrading them to 14.2.9.
> >
> > We've upgraded one monitor and it's happily in quorum as a peon.
> However, when a ceph status hits the nautilus mon it has trouble talking to 
> the
> manager apparently, and it returns a status output with no pg stats and
> garbage usage numbers. From the mon log:
> >
> > 2020-05-13 10:41:43.121 7fa1e6fdf700  0 mon.ceph-mon5@4(peon) e25
> > handle_command mon_command({"prefix": "status"} v 0) v1
> > 2020-05-13 10:41:43.121 7fa1e6fdf700  0 log_channel(audit) log [DBG] :
> > from='client.? v1:130.246.x.x:0/3261311028' entity='client.admin'
> > cmd=[{"prefix": "status"}]: dispatch
> > 2020-05-13 10:41:43.443 7fa1e6fdf700 -1 mon.ceph-
> mon5@4(peon).mgrstat
> > failed to decode mgrstat state; luminous dev version?
> > buffer::end_of_buffer
> > 2020-05-13 10:41:44.397 7fa1e6fdf700  1 mon.ceph-mon5@4(peon) e25
> > dropping unexpected mon_health( e 0 r 0 ) v1
> >
> > Is this expected for a luminous to nautilus upgrade or could this be due to
> the odd luminous version we are running, or something else entirely?
> >
> > Cheers,
> > Tom
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephadm and rados gateways

2020-05-18 Thread Sebastian Wagner
This will be fixed in 15.2.2 

https://tracker.ceph.com/issues/45215
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Zhenshi Zhou
Hi Wido,

I did some research on the nfs files. I found that they contain many pictures
of about 50KB, and many video files of around 30MB. The total number of files
is more than 1 million. Maybe I can find a way to separate the files into more
buckets so that there are no more than 1M objects in each bucket. But what
about the small files of around 50KB? Does rgw serve small files well?

Wido den Hollander  wrote on Tue, May 12, 2020 at 2:41 PM:

>
>
> On 5/12/20 4:22 AM, Zhenshi Zhou wrote:
> > Hi all,
> >
> > We have several nfs servers providing file storage. There is a nginx in
> > front of
> > nfs servers in order to serve the clients. The files are mostly small
> files
> > and
> > nearly about 30TB in total.
> >
>
> What is small? How many objects/files are you talking about?
>
> > I'm gonna use ceph rgw as the storage. I wanna know if it's appropriate
> to
> > do so.
> > The data migrating from nfs to rgw is a huge job. Besides I'm not sure
> > whether
> > ceph rgw is suitable in this scenario or not.
> >
>
> Yes, it is. But make sure you don't put millions of objects into a
> single bucket. Make sure that you spread them out so that you have let's
> say 1M of objects per bucket at max.
>
> Wido
>
> > Thanks
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW issue with containerized ceph

2020-05-18 Thread Szabo, Istvan (Agoda)
Hi Gents,

We are having a strange problem with the demo ceph container.
Any write operation we do from the RGW fails with a NameResolutionFailure.
Read operations work fine.
If we use the ceph UI we can create objects fine, though.
These are the env variables we are using to configure it:

This is the full compose:
version: '3.3'
services:
  ceph:
    ports:
      - 8080:8080 # S3
      - 5000:5000 # Web UI
      - 8003:8003
    environment:
      - NETWORK_AUTO_DETECT=4
      - CEPH_DAEMON=demo
      - RGW_NAME=ceph
      - CEPH_DEMO_UID=owncloud
      - CEPH_DEMO_ACCESS_KEY=G1EZ5R4K6IJ7XUQKMAED
      - CEPH_DEMO_SECRET_KEY=cNmUrqpBKjCMzcfqG8fg4Qk07Xkoyau52OmvnSsz
      - DEBUG=verbose
    image: ceph/daemon

Does anybody have an idea what is going on?

Istvan Szabo
Senior Infrastructure Engineer
---
Agoda Services Co., Ltd.
e: istvan.sz...@agoda.com
---



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to restart daemons on 15.2 on Debian 10

2020-05-18 Thread Sean Johnson
Use the same pattern ….

systemctl restart ceph-{fsid}@osd.{id}.service
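
For example, with the fsid used earlier in this thread and an OSD id of 10
(assuming that OSD actually runs on the host in question):

systemctl restart ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@osd.10.service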

~Sean

> On May 18, 2020, at 7:16 AM, Ml Ml  wrote:
> 
> Thanks,
> 
> The following seems to work for me on Debian 10 and 15.2.1:
> 
> systemctl restart ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@mon.ceph01.service
> 
> How can i restart a single OSD?
> 
> Cheers,
> Michael
> 
> On Sun, May 17, 2020 at 5:10 PM Sean Johnson  wrote:
>> 
>> I have OSD’s on the brain … that line should have read:
>> 
>> systemctl restart ceph-{fsid}@mon.{host}.service
>> 
>> On May 17, 2020, at 10:08 AM, Sean Johnson  wrote:
>> 
>> In case that doesn’t work, there’s also a systemd service that contains the 
>> fsid of the cluster.
>> 
>> So, in the case of a mon service you can also run:
>> 
>> systemctl restart ceph-{fsid}@osd.{host}.service
>> 
>> Logs are correspondingly available via journalctl:
>> 
>> journalctl -u ceph-{fsid}@mon.{host}.service
>> 
>> 
>> ~Sean
>> 
>> On May 15, 2020, at 9:31 AM, Simon Sutter  wrote:
>> 
>> Hello Michael,
>> 
>> 
>> I had the same problems. It's very unfamiliar, if you never worked with the 
>> cephadm tool.
>> 
>> The Way I'm doing it is to go into the cephadm container:
>> # cephadm shell
>> 
>> Here you can list all containers (for each service, one container) with the 
>> orchestration tool:
>> 
>> # ceph orch ps
>> 
>> and then restart it with the orchestration tool:
>> 
>> # ceph orch restart {name from ceph orch ps}
>> 
>> 
>> Hope it helps.
>> 
>> 
>> Ceers,
>> 
>> Simon
>> 
>> 
>> Von: Ml Ml 
>> Gesendet: Freitag, 15. Mai 2020 12:27:09
>> An: ceph-users
>> Betreff: [ceph-users] how to restart daemons on 15.2 on Debian 10
>> 
>> Hello List,
>> 
>> how do you restart daemons (mgr, mon, osd) on 15.2.1?
>> 
>> It used to be something like:
>> systemctl stop ceph-osd@10
>> 
>> Or:
>> systemctl start ceph-mon@ceph03
>> 
>> however, those command do nothing on my setup.
>> 
>> Is this because i use cephadm and that docker stuff?
>> 
>> The Logs also seem to be missing.
>> /var/log/ceph/5436dd5d-83d4-4dc8-a93b-60ab5db145df is pretty empty.
>> 
>> I feel like i am missing a lot of documentation here? Can anyone point
>> me to my missing parts?
>> 
>> Thanks a lot.
>> 
>> Cheers,
>> Michael
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>> 
>> 
>> 
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io



signature.asc
Description: Message signed with OpenPGP
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Wido den Hollander


On 5/18/20 1:51 PM, Zhenshi Zhou wrote:
> Hi Wido,
> 
> I did a research on the nfs files. I found that it contains much
> pictures about 
> 50KB, and much video files around 30MB. The amount of the files is more than
> 1 million. Maybe I can find a way to seperate the files in more buckets
> so that 
> there is no more than 1M objects in each bucket. But how about the small
> files 
> around 50KB. Does rgw serve well on small files?

I would recommend using different buckets. What I've done in such cases
is use the year+month for sharding.

For example: video-2020-05

RGW can serve 50 kB objects just fine, but there is overhead involved:
storing a lot of such small objects comes at a price.
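
As a rough illustration of that naming scheme against an RGW S3 endpoint (the
endpoint, bucket and object names below are made up):

aws --endpoint-url http://rgw.example.com:8080 s3api create-bucket --bucket video-2020-05
aws --endpoint-url http://rgw.example.com:8080 s3 cp ./clip.mp4 s3://video-2020-05/clip.mp4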

Wido

> 
> Wido den Hollander  wrote on Tue, May 12, 2020 at 2:41 PM:
> 
> 
> 
> On 5/12/20 4:22 AM, Zhenshi Zhou wrote:
> > Hi all,
> >
> > We have several nfs servers providing file storage. There is a
> nginx in
> > front of
> > nfs servers in order to serve the clients. The files are mostly
> small files
> > and
> > nearly about 30TB in total.
> >
> 
> What is small? How many objects/files are you talking about?
> 
> > I'm gonna use ceph rgw as the storage. I wanna know if it's
> appropriate to
> > do so.
> > The data migrating from nfs to rgw is a huge job. Besides I'm not sure
> > whether
> > ceph rgw is suitable in this scenario or not.
> >
> 
> Yes, it is. But make sure you don't put millions of objects into a
> single bucket. Make sure that you spread them out so that you have let's
> say 1M of objects per bucket at max.
> 
> Wido
> 
> > Thanks
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> 
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> >
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Paul Emmerich
On Mon, May 18, 2020 at 1:52 PM Zhenshi Zhou  wrote:

>
> 50KB, and much video files around 30MB. The amount of the files is more
> than
> 1 million. Maybe I can find a way to seperate the files in more buckets so
> that
> there is no more than 1M objects in each bucket. But how about the small
> files
> around 50KB. Does rgw serve well on small files?
>

1 million files is usually the point where you first need to start thinking
about some optimizations, but that's mostly just making sure that the index
is on SSD and it'll happily work up to ~10 million files.
Then you might need to start thinking about the index being on *good* SSDs
(and/or on many SSDs/DB devices).

It starts to get interesting if you need to go beyond 100 million files;
that's the point where you need to start tuning shard sizes and the types
of index queries that you send...

I've found that a few hundred million objects per bucket are no problem if
you run with large shard sizes (500k - 1 million); however, there are some
index-queries that can be really expensive like filtering on prefixes in
some pathological cases...

Small files: sure, they work well, but they can be challenging for erasure coding
on HDDs; that's unrelated to rgw though, you'd have the same problem with CephFS
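
A couple of commands that become useful once you are in that territory (the
bucket name is a placeholder, and reshard behaviour depends on your release):

radosgw-admin bucket limit check                                  # objects per shard and fill status
radosgw-admin bucket reshard --bucket=mybucket --num-shards=101   # grow the index shard count of a bucket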

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


>
> Wido den Hollander  wrote on Tue, May 12, 2020 at 2:41 PM:
>
> >
> >
> > On 5/12/20 4:22 AM, Zhenshi Zhou wrote:
> > > Hi all,
> > >
> > > We have several nfs servers providing file storage. There is a nginx in
> > > front of
> > > nfs servers in order to serve the clients. The files are mostly small
> > files
> > > and
> > > nearly about 30TB in total.
> > >
> >
> > What is small? How many objects/files are you talking about?
> >
> > > I'm gonna use ceph rgw as the storage. I wanna know if it's appropriate
> > to
> > > do so.
> > > The data migrating from nfs to rgw is a huge job. Besides I'm not sure
> > > whether
> > > ceph rgw is suitable in this scenario or not.
> > >
> >
> > Yes, it is. But make sure you don't put millions of objects into a
> > single bucket. Make sure that you spread them out so that you have let's
> > say 1M of objects per bucket at max.
> >
> > Wido
> >
> > > Thanks
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
> > >
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Dealing with non existing crush-root= after reclassify on ec pools

2020-05-18 Thread Dan
I have reclassified a CRUSH map to a class-based ruleset using crushtool.
I still have an ec pool with an older ec profile that references a now
non-existing crush-root=hdd.

I already switched the pool’s ruleset over to a newer rule with a newer 
ec-profile with a correct crush-root
But pool ls detail still shows:


pool 9 'data' erasure profile jerasure-3-1 size 4 min_size 3 …..

Jerasure-3-1 being the old profile with the non-existing crush-root.

So what do I do now? Switching over the pool's ruleset does not change the
ec-profile; can I switch the ec-profile over?
What can I expect from having a pool whose ec-profile points at a non-existing
crush-root key?

Please advise.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] feature mask: why not use HAVE_FEATURE macro in Connection::has_feature()?

2020-05-18 Thread Xinying Song
Hi, everyone:
Why don't we use HAVE_FEATURE macro in Connection::has_feature()? Do
the features in a Connection not need to care about incarnation
things? Missing the macro in Connection is really confusing. Would
anyone like to explain this?
Thanks!
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to restart daemons on 15.2 on Debian 10

2020-05-18 Thread Ml Ml
Thanks,

The following seems to work for me on Debian 10 and 15.2.1:

systemctl restart ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@mon.ceph01.service

How can I restart a single OSD?

Cheers,
Michael

On Sun, May 17, 2020 at 5:10 PM Sean Johnson  wrote:
>
> I have OSD’s on the brain … that line should have read:
>
> systemctl restart ceph-{fsid}@mon.{host}.service
>
> On May 17, 2020, at 10:08 AM, Sean Johnson  wrote:
>
> In case that doesn’t work, there’s also a systemd service that contains the 
> fsid of the cluster.
>
> So, in the case of a mon service you can also run:
>
> systemctl restart ceph-{fsid}@osd.{host}.service
>
> Logs are correspondingly available via journalctl:
>
> journalctl -u ceph-{fsid}@mon.{host}.service
>
>
> ~Sean
>
> On May 15, 2020, at 9:31 AM, Simon Sutter  wrote:
>
> Hello Michael,
>
>
> I had the same problems. It's very unfamiliar, if you never worked with the 
> cephadm tool.
>
> The Way I'm doing it is to go into the cephadm container:
> # cephadm shell
>
> Here you can list all containers (for each service, one container) with the 
> orchestration tool:
>
> # ceph orch ps
>
> and then restart it with the orchestration tool:
>
> # ceph orch restart {name from ceph orch ps}
>
>
> Hope it helps.
>
>
> Ceers,
>
> Simon
>
> 
> Von: Ml Ml 
> Gesendet: Freitag, 15. Mai 2020 12:27:09
> An: ceph-users
> Betreff: [ceph-users] how to restart daemons on 15.2 on Debian 10
>
> Hello List,
>
> how do you restart daemons (mgr, mon, osd) on 15.2.1?
>
> It used to be something like:
>  systemctl stop ceph-osd@10
>
> Or:
>  systemctl start ceph-mon@ceph03
>
> however, those command do nothing on my setup.
>
> Is this because i use cephadm and that docker stuff?
>
> The Logs also seem to be missing.
> /var/log/ceph/5436dd5d-83d4-4dc8-a93b-60ab5db145df is pretty empty.
>
> I feel like i am missing a lot of documentation here? Can anyone point
> me to my missing parts?
>
> Thanks a lot.
>
> Cheers,
> Michael
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: how to restart daemons on 15.2 on Debian 10

2020-05-18 Thread Ml Ml
Is there no official link/docs on how to manage the services on Debian 10
and 15.2.1?

I have searched and tried:
systemctl restart ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@mon.ceph01.service
journalctl -u ceph-5436dd5d-83d4-4dc8-a93b-60ab5db145df@mon.ceph01.service
docker logs 

Back in the day we had simple restart commands, logs in
/var/log/ceph/*, and documentation like:
https://docs.ceph.com/docs/master/rados/operations/operating/

It's really frustrating to struggle like this with basic commands. Did
I miss some docs or did it just get this complicated?
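
For what it's worth, the cephadm wrapper can take some of the pain out of
finding logs (daemon names will differ on your hosts):

cephadm ls                          # list the daemons cephadm manages on this host
cephadm logs --name osd.10          # wraps journalctl for a single daemon
cephadm logs --name osd.10 -- -f    # extra arguments are passed through to journalctl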


Cheers,
Michael

On Fri, May 15, 2020 at 12:27 PM Ml Ml  wrote:
>
> Hello List,
>
> how do you restart daemons (mgr, mon, osd) on 15.2.1?
>
> It used to be something like:
>   systemctl stop ceph-osd@10
>
> Or:
>   systemctl start ceph-mon@ceph03
>
> however, those command do nothing on my setup.
>
> Is this because i use cephadm and that docker stuff?
>
> The Logs also seem to be missing.
> /var/log/ceph/5436dd5d-83d4-4dc8-a93b-60ab5db145df is pretty empty.
>
> I feel like i am missing a lot of documentation here? Can anyone point
> me to my missing parts?
>
> Thanks a lot.
>
> Cheers,
> Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dealing with non existing crush-root= after reclassify on ec pools

2020-05-18 Thread Dan
I think I did a bad job explaining my issue:

I have a fairly old cluster which had a crush map with two trees, one for
hdds and one for ssds, like root hdd {..} and root ssd {...}. Now, with the
newer class-based rules, I used crushtool --reclassify to merge those two
trees into root default {...}. So I already downloaded, edited and
re-uploaded the crush map, which resulted in a very minor data movement,
as crushtool --compare predicted. One of my pools is an ec pool with an
ec profile with crush-root=hdd. I cannot, I think, change the ec-profile
of an existing pool. But since the pool runs on that profile, with the now
non-existing crush-root=hdd, I am wondering if I can expect to run into
trouble down the line, or does the cluster use some internal id so that the
string displayed only matters at creation? Basically, am I safe or am I
hosed?


On Mon 18. May 2020 at 19:05, Eric Smith  wrote:

> You'll probably have to decompile, hand edit, recompile, and reset the
> crush map pointing at the expected root. The EC profile is only used during
> pool creation and will not change the crush map if you change the EC
> profile. I think you can expect some data movement if you change the root
> but either way I would test it in a lab if you have one available.
>
> -Original Message-
> From: Dan  On Behalf Of Dan
> Sent: Monday, May 18, 2020 9:14 AM
> To: ceph-users@ceph.io
> Subject: [ceph-users] Dealing with non existing crush-root= after
> reclassify on ec pools
>
> I have reclassified a CRUSH map, using the crushtool to a class based
> ruleset.
> I still have an ec pool with an older ec profile with a new non existing
> crush-root=hdd
>
> I already switched the pool’s ruleset over to a newer rule with a newer
> ec-profile with a correct crush-root But pool ls detail still shows:
>
>
> pool 9 'data' erasure profile jerasure-3-1 size 4 min_size 3 …..
>
> Jerasure-3-1 being the old profile with non existing crush-root
>
> So what do I do now? Switching over the pool ruleset does not change the
> ec-profile, can I switch the ec-profile over?
> What can I expect having a pool with a ec-profile with a non existing
> crush-root key?
>
> Please advise.
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dealing with non existing crush-root= after reclassify on ec pools

2020-05-18 Thread Paul Emmerich
That part of an erasure profile is only used when a crush rule is created,
i.e. when creating a pool without explicitly specifying a crush rule.
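
A quick way to convince yourself of that on a live pool (using the pool and
profile names from this thread):

ceph osd pool get data crush_rule               # the rule actually used for placement
ceph osd erasure-code-profile get jerasure-3-1  # the stored profile, only consulted at pool creation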



Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90


On Mon, May 18, 2020 at 9:09 PM Dan  wrote:

> I think I did a bad job explaining my issue:
>
> I have a fairly old cluster which had a crush map with two trees, one for
> hdds and one for ssd, like root hdd {..} and root ssd {...}  now with the
> newer class based rules I used crushtool —reclassify to merge those two
> trees into root default {...} So I already downloaded, edited and
> Reuploaded the crush map, which resulted in a very minor data movement,
> which crushtool —compare predicted.  One of my pools is an ec pool with an
> ec profile with crush-root=hdd. I can not, I think, change the ec-profile
> of an existing pool. But since the pool runs on that profile, with the  now
> non existing crush-root=hdd, I am wondering if I can expect to run into
> trouble down the line or does the cluster use some internal id, and the
> string displayed only matters on creation. Basically am I safe or am I
> hosed?
>
>
> On Mon 18. May 2020 at 19:05, Eric Smith  wrote:
>
> > You'll probably have to decompile, hand edit, recompile, and reset the
> > crush map pointing at the expected root. The EC profile is only used
> during
> > pool creation and will not change the crush map if you change the EC
> > profile. I think you can expect some data movement if you change the root
> > but either way I would test it in a lab if you have one available.
> >
> > -Original Message-
> > From: Dan  On Behalf Of Dan
> > Sent: Monday, May 18, 2020 9:14 AM
> > To: ceph-users@ceph.io
> > Subject: [ceph-users] Dealing with non existing crush-root= after
> > reclassify on ec pools
> >
> > I have reclassified a CRUSH map, using the crushtool to a class based
> > ruleset.
> > I still have an ec pool with an older ec profile with a new non existing
> > crush-root=hdd
> >
> > I already switched the pool’s ruleset over to a newer rule with a newer
> > ec-profile with a correct crush-root But pool ls detail still shows:
> >
> >
> > pool 9 'data' erasure profile jerasure-3-1 size 4 min_size 3 …..
> >
> > Jerasure-3-1 being the old profile with non existing crush-root
> >
> > So what do I do now? Switching over the pool ruleset does not change the
> > ec-profile, can I switch the ec-profile over?
> > What can I expect having a pool with a ec-profile with a non existing
> > crush-root key?
> >
> > Please advise.
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> >
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] v15.2.2 Octopus released

2020-05-18 Thread Abhishek Lekshmanan

We're happy to announce the second bugfix release of Ceph Octopus stable
release series, we recommend that all Octopus users upgrade. This
release has a range of fixes across all components and a security fix.

Notable Changes
---
* CVE-2020-10736: Fixed an authorization bypass in mons & mgrs (Olle
SegerDahl, Josh Durgin)

For the complete changelog please refer to the full release blog at
https://ceph.io/releases/v15-2-2-octopus-released/

Getting Ceph

* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-15.2.2.tar.gz
* For packages, see
http://docs.ceph.com/docs/master/install/get-packages/
* Release git sha1: 0c857e985a29d90501a285f242ea9c008df49eb8
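
For cephadm-managed clusters, one common way to roll this out is via the
orchestrator, for example:

ceph orch upgrade start --ceph-version 15.2.2
ceph orch upgrade status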

-- 
Abhishek Lekshmanan
SUSE Software Solutions Germany GmbH
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Reweighting OSD while down results in undersized+degraded PGs

2020-05-18 Thread Andras Pataki
In a recent cluster reorganization, we ended up with a lot of
undersized/degraded PGs and a day of recovery from them, when all we
expected was moving some data around. After retracing my steps, I found
something odd. If I crush reweight an OSD to 0 while it is down, the PGs
of that OSD end up degraded even after the OSD is restarted. If I do the
same reweighting while the OSD is up, data gets moved without any
degraded/undersized states. I would not expect this, so I wonder whether
this is a bug or somehow intended. This is on Ceph Nautilus 14.2.8.
Below are the details.


Andras


First the case that works as I would expect:

# Healthy cluster ...
[root@xorphosd00 ~]# ceph -s
  cluster:
    id:     86d8a1b9-761b-4099-a960-6a303b951236
    health: HEALTH_WARN
            noout,nobackfill,noscrub,nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum xorphmon00,xorphmon01,xorphmon02 (age 11d)
    mgr: xorphmon01(active, since 6w), standbys: xorphmon02, xorphmon00
    mds: cephfs:1 {0=xorphmon02=up:active} 1 up:standby
    osd: 270 osds: 270 up (since 2m), 270 in (since 4h)
         flags noout,nobackfill,noscrub,nodeep-scrub

  data:
    pools:   4 pools, 5312 pgs
    objects: 75.87M objects, 287 TiB
    usage:   864 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     5312 active+clean

# Reweight an OSD to 0
[root@xorphosd00 ~]# ceph osd crush reweight osd.0 0.0
reweighted item id 0 name 'osd.0' to 0 in crush map

# Crush map changes - data movement is set up, no degraded PGs:
[root@xorphosd00 ~]# ceph -s
  cluster:
    id:     86d8a1b9-761b-4099-a960-6a303b951236
    health: HEALTH_WARN
            noout,nobackfill,noscrub,nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum xorphmon00,xorphmon01,xorphmon02 (age 11d)
    mgr: xorphmon01(active, since 6w), standbys: xorphmon02, xorphmon00
    mds: cephfs:1 {0=xorphmon02=up:active} 1 up:standby
    osd: 270 osds: 270 up (since 10m), 270 in (since 5h); 175 remapped pgs
         flags noout,nobackfill,noscrub,nodeep-scrub

  data:
    pools:   4 pools, 5312 pgs
    objects: 75.87M objects, 287 TiB
    usage:   864 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     2562045/232996662 objects misplaced (1.100%)
             5137 active+clean
             172  active+remapped+backfilling
             3    active+remapped+backfill_wait

# Reweight it back to the original weight
[root@xorphosd00 ~]# ceph osd crush reweight osd.0 8.0

# Cluster goes back to clean
reweighted item id 0 name 'osd.0' to 8 in crush map
[root@xorphosd00 ~]# ceph -s
  cluster:
    id:     86d8a1b9-761b-4099-a960-6a303b951236
    health: HEALTH_WARN
            noout,nobackfill,noscrub,nodeep-scrub flag(s) set

  services:
    mon: 3 daemons, quorum xorphmon00,xorphmon01,xorphmon02 (age 11d)
    mgr: xorphmon01(active, since 6w), standbys: xorphmon02, xorphmon00
    mds: cephfs:1 {0=xorphmon02=up:active} 1 up:standby
    osd: 270 osds: 270 up (since 11m), 270 in (since 5h)
         flags noout,nobackfill,noscrub,nodeep-scrub

  data:
    pools:   4 pools, 5312 pgs
    objects: 75.87M objects, 287 TiB
    usage:   864 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     5312 active+clean




#
# Now the problematic case
#

# Stop an OSD
[root@xorphosd00 ~]# systemctl stop ceph-osd@0

# We get degraded PGs - as expected
[root@xorphosd00 ~]# ceph -s
  cluster:
    id:     86d8a1b9-761b-4099-a960-6a303b951236
    health: HEALTH_WARN
            noout,nobackfill,noscrub,nodeep-scrub flag(s) set
            1 osds down
            Degraded data redundancy: 873964/232996662 objects degraded (0.375%), 82 pgs degraded

  services:
    mon: 3 daemons, quorum xorphmon00,xorphmon01,xorphmon02 (age 11d)
    mgr: xorphmon01(active, since 6w), standbys: xorphmon02, xorphmon00
    mds: cephfs:1 {0=xorphmon02=up:active} 1 up:standby
    osd: 270 osds: 269 up (since 16s), 270 in (since 5h)
         flags noout,nobackfill,noscrub,nodeep-scrub

  data:
    pools:   4 pools, 5312 pgs
    objects: 75.87M objects, 287 TiB
    usage:   864 TiB used, 1.1 PiB / 1.9 PiB avail
    pgs:     873964/232996662 objects degraded (0.375%)
             5230 active+clean
             82   active+undersized+degraded

# Reweight the OSD to 0:
[root@xorphosd00 ~]# ceph osd crush reweight osd.0 0.0

# Still degraded - as expected
reweighted item id 0 name 'osd.0' to 0 in crush map
[root@xorphosd00 ~]# ceph -s
  cluster:
    id:     86d8a1b9-761b-4099-a960-6a303b951236
    health: HEALTH_WARN
            noout,nobackfill,noscrub,nodeep-scrub flag(s) set
            1 osds down
            Degraded data redundancy: 873964/232996662 objects degraded (0.375%), 82 pgs degraded

  services:
    mon: 3 daemons, quorum xorphmon00,xorphmon01,xorphmon02 (age 11d)
    mgr: xorphmon01(active, since 6w), standbys: xorphmon02, xorphmon00
    mds: cephfs:1 {0=xorphmon02=up:active} 1 up:standby
    osd: 270 osds: 269 up (since 59s), 270 in (since 5h); 175 remapped pgs
         flags noout,nobackfill,nos

[ceph-users] Mismatched object counts between "rados df" and "rados ls" after rbd images removal

2020-05-18 Thread icy chan
Hi,

The object counts from "rados df" and "rados ls" are different
in my testing environment. I think it may be some zero-byte or unclean
objects, since I removed all rbd images on top of it a few days ago.
How can I make it right / find out where those ghost objects are? Or
should I ignore it, since the numbers are not that high?

$ rados -p rbd df
POOL_NAME USED   OBJECTS CLONES COPIES  MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS    RD      WR_OPS   WR     USED COMPR UNDER COMPR
rbd       18 MiB 430107  0      1290321 0                  0       0        141243877 6.9 TiB 42395431 11 TiB 0 B        0 B

$ rados -p rbd ls | wc -l
4

$ rados -p rbd ls
gateway.conf
rbd_directory
rbd_info
rbd_trash

Regs,
Icy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nfs migrate to rgw

2020-05-18 Thread Zhenshi Zhou
Awesome, thanks a lot !
I'll try it.

Paul Emmerich  wrote on Mon, May 18, 2020 at 8:53 PM:

>
> On Mon, May 18, 2020 at 1:52 PM Zhenshi Zhou  wrote:
>
>>
>> 50KB, and much video files around 30MB. The amount of the files is more
>> than
>> 1 million. Maybe I can find a way to seperate the files in more buckets so
>> that
>> there is no more than 1M objects in each bucket. But how about the small
>> files
>> around 50KB. Does rgw serve well on small files?
>>
>
> 1 million files is usually the point where you first need to start
> thinking about some optimizations, but that's mostly just making sure that
> the index is on SSD and it'll happily work up to ~10 million files.
> Then you might need to start thinking about the index being on *good* SSDs
> (and/or on many SSDs/DB devices).
>
> It starts the get interesting if you need to go beyond 100 million files,
> that's the point where you need to start tuning shard sizes and the types
> of index queries that you send...
>
> I've found that a few hundred million objects per bucket are no problem if
> you run with large shard sizes (500k - 1 million); however, there are some
> index-queries that can be really expensive like filtering on prefixes in
> some pathological cases...
>
> Small files: sure, works well, but can be challenging for erasure coding
> on HDDs, but that's unrelated to rgw/you'd have the same problem with CephFS
>
> Paul
>
> --
> Paul Emmerich
>
> Looking for help with your Ceph cluster? Contact us at https://croit.io
>
> croit GmbH
> Freseniusstr. 31h
> 81247 München
> www.croit.io
> Tel: +49 89 1896585 90
>
>
>>
>> Wido den Hollander  wrote on Tue, May 12, 2020 at 2:41 PM:
>>
>> >
>> >
>> > On 5/12/20 4:22 AM, Zhenshi Zhou wrote:
>> > > Hi all,
>> > >
>> > > We have several nfs servers providing file storage. There is a nginx
>> in
>> > > front of
>> > > nfs servers in order to serve the clients. The files are mostly small
>> > files
>> > > and
>> > > nearly about 30TB in total.
>> > >
>> >
>> > What is small? How many objects/files are you talking about?
>> >
>> > > I'm gonna use ceph rgw as the storage. I wanna know if it's
>> appropriate
>> > to
>> > > do so.
>> > > The data migrating from nfs to rgw is a huge job. Besides I'm not sure
>> > > whether
>> > > ceph rgw is suitable in this scenario or not.
>> > >
>> >
>> > Yes, it is. But make sure you don't put millions of objects into a
>> > single bucket. Make sure that you spread them out so that you have let's
>> > say 1M of objects per bucket at max.
>> >
>> > Wido
>> >
>> > > Thanks
>> > > ___
>> > > ceph-users mailing list -- ceph-users@ceph.io
>> > > To unsubscribe send an email to ceph-users-le...@ceph.io
>> > >
>> >
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v15.2.2 Octopus released

2020-05-18 Thread Ashley Merrick
I am getting the following error when trying to upgrade via cephadm



ceph orch upgrade status

{

    "target_image": "docker.io/ceph/ceph:v15.2.2",

    "in_progress": true,

    "services_complete": [],

    "message": "Error: UPGRADE_FAILED_PULL: Upgrade: failed to pull target 
image"

}


Are the packages not yet built?

Thanks



 On Tue, 19 May 2020 04:27:21 +0800 Abhishek Lekshmanan  
wrote 



We're happy to announce the second bugfix release of Ceph Octopus stable 
release series, we recommend that all Octopus users upgrade. This 
release has a range of fixes across all components and a security fix. 
 
Notable Changes 
--- 
* CVE-2020-10736: Fixed an authorization bypass in mons & mgrs (Olle 
SegerDahl, Josh Durgin) 
 
For the complete changelog please refer to the full release blog at 
https://ceph.io/releases/v15-2-2-octopus-released/ 
 
Getting Ceph 
 
* Git at git://github.com/ceph/ceph.git 
* Tarball at http://download.ceph.com/tarballs/ceph-15.2.2.tar.gz 
* For packages, see 
http://docs.ceph.com/docs/master/install/get-packages/ 
* Release git sha1: 0c857e985a29d90501a285f242ea9c008df49eb8 
 
-- 
Abhishek Lekshmanan 
SUSE Software Solutions Germany GmbH 
GF: Felix Imendörffer, HRB 36809 (AG Nürnberg)
___
ceph-users mailing list -- mailto:ceph-users@ceph.io
To unsubscribe send an email to mailto:ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephfs IO halt on Node failure

2020-05-18 Thread Amudhan P
Behaviour is the same even after setting min_size 2.

On Mon 18 May, 2020, 12:34 PM Eugen Block,  wrote:

> If your pool has a min_size 2 and size 2 (always a bad idea) it will
> pause IO in case of a failure until the recovery has finished. So the
> described behaviour is expected.
>
>
> Zitat von Amudhan P :
>
> > Hi,
> >
> > Crush rule is "replicated" and min_size 2 actually. I am trying to test
> > multiple volume configs in a single filesystem
> > using file layout.
> >
> > I have created metadata pool with rep 3 (min_size2 and replicated crush
> > rule) and data pool with rep 3  (min_size2 and replicated crush rule).
> and
> > also  I have created multiple (replica 2, ec2-1 & ec4-2) pools and added
> to
> > the filesystem.
> >
> > Using file layout I have set different data pool to a different folders.
> so
> > I can test different configs in the same filesystem. all data pools
> > min_size set to handle single node failure.
> >
> > Single node failure is handled properly when only having metadata pool
> and
> > one data pool (rep3).
> >
> > After adding additional data pool to fs, single node failure scenario is
> > not working.
> >
> > regards
> > Amudhan P
> >
> > On Sun, May 17, 2020 at 1:29 AM Eugen Block  wrote:
> >
> >> What’s your pool configuration wrt min_size and crush rules?
> >>
> >>
> >> Zitat von Amudhan P :
> >>
> >> > Hi,
> >> >
> >> > I am using ceph Nautilus cluster with below configuration.
> >> >
> >> > 3 node's (Ubuntu 18.04) each has 12 OSD's, and mds, mon and mgr are
> >> running
> >> > in shared mode.
> >> >
> >> > The client mounted through ceph kernel client.
> >> >
> >> > I was trying to emulate a node failure when a write and read were
> going
> >> on
> >> > (replica2) pool.
> >> >
> >> > I was expecting read and write continue after a small pause due to a
> Node
> >> > failure but it halts and never resumes until the failed node is up.
> >> >
> >> > I remember I tested the same scenario before in ceph mimic where it
> >> > continued IO after a small pause.
> >> >
> >> > regards
> >> > Amudhan P
> >> > ___
> >> > ceph-users mailing list -- ceph-users@ceph.io
> >> > To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >>
> >>
> >>
>
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cephfs IO halt on Node failure

2020-05-18 Thread Eugen Block
Was that a typo and you mean you changed min_size to 1? An I/O pause with
min_size 1 and size 2 is unexpected; can you share more details like
your crushmap and your osd tree?
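
(For reference, the usual way to collect that information:)

ceph osd tree
ceph osd pool ls detail
ceph osd getcrushmap -o crushmap.bin && crushtool -d crushmap.bin -o crushmap.txt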



Zitat von Amudhan P :


Behaviour is same even after setting min_size 2.

On Mon 18 May, 2020, 12:34 PM Eugen Block,  wrote:


If your pool has a min_size 2 and size 2 (always a bad idea) it will
pause IO in case of a failure until the recovery has finished. So the
described behaviour is expected.


Zitat von Amudhan P :

> Hi,
>
> Crush rule is "replicated" and min_size 2 actually. I am trying to test
> multiple volume configs in a single filesystem
> using file layout.
>
> I have created metadata pool with rep 3 (min_size2 and replicated crush
> rule) and data pool with rep 3  (min_size2 and replicated crush rule).
and
> also  I have created multiple (replica 2, ec2-1 & ec4-2) pools and added
to
> the filesystem.
>
> Using file layout I have set different data pool to a different folders.
so
> I can test different configs in the same filesystem. all data pools
> min_size set to handle single node failure.
>
> Single node failure is handled properly when only having metadata pool
and
> one data pool (rep3).
>
> After adding additional data pool to fs, single node failure scenario is
> not working.
>
> regards
> Amudhan P
>
> On Sun, May 17, 2020 at 1:29 AM Eugen Block  wrote:
>
>> What’s your pool configuration wrt min_size and crush rules?
>>
>>
>> Zitat von Amudhan P :
>>
>> > Hi,
>> >
>> > I am using ceph Nautilus cluster with below configuration.
>> >
>> > 3 node's (Ubuntu 18.04) each has 12 OSD's, and mds, mon and mgr are
>> running
>> > in shared mode.
>> >
>> > The client mounted through ceph kernel client.
>> >
>> > I was trying to emulate a node failure when a write and read were
going
>> on
>> > (replica2) pool.
>> >
>> > I was expecting read and write continue after a small pause due to a
Node
>> > failure but it halts and never resumes until the failed node is up.
>> >
>> > I remember I tested the same scenario before in ceph mimic where it
>> > continued IO after a small pause.
>> >
>> > regards
>> > Amudhan P
>> > ___
>> > ceph-users mailing list -- ceph-users@ceph.io
>> > To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>>
>>
>>







___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Mismatched object counts between "rados df" and "rados ls" after rbd images removal

2020-05-18 Thread Eugen Block
That's not wrong, those are expected objects that contain information  
about your rbd images. If you take a look into the rbd_directory  
(while you have images in there) you'll find something like this:


host:~ $ rados -p pool listomapkeys rbd_directory

id_fe976bcfb968bf
id_ffc37728edbdab
name_01673d5d-4b12-4a44-8793-403581f7d808_disk
name_01673d5d-4b12-4a44-8793-403581f7d808_disk.config
name_volume-8a1a0825-1163-44bc-abe2-1a711daea07b


The rbd ls command reads from the rbd_directory object; an
excerpt from the rbd man page:


---snip---
ls [-l | –long] [pool-name]

Will list all rbd images listed in the rbd_directory object.
---snip---


The gateway.conf is your iSCSI gateway configuration stored in the cluster.
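
A couple of additional checks that can be worth running when the counts don't
match (suggestions only, not a diagnosis of this particular mismatch):

rados -p rbd ls --all | wc -l   # include objects in all namespaces
rbd trash ls -p rbd             # deferred-delete images still hold data objects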


Zitat von icy chan :


Hi,

The numbers of object counts from "rados df" and "rados ls" are different
in my testing environment. I think it maybe some zero bytes or unclean
objects since I removed all rbd images on top of it few days ago.
How can I make it right / found out where are those ghost objects? Or i
should ignore it since the numbers was not that high.

$ rados -p rbd df
POOL_NAME   USED OBJECTS CLONES  COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED
   RD_OPS  RD   WR_OPS WR USED COMPR UNDER COMPR
rbd   18 MiB  430107  0 1290321  0   00
141243877 6.9 TiB 42395431 11 TiB0 B 0 B

$ rados -p rbd ls | wc -l
4

$ rados -p rbd ls
gateway.conf
rbd_directory
rbd_info
rbd_trash

Regs,
Icy
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io