[ceph-users] Re: Are ceph commands backward compatible?

2024-06-17 Thread Ilya Dryomov
On Mon, Jun 17, 2024 at 12:18 AM Satoru Takeuchi
 wrote:
>
> On Fri, Jun 14, 2024 at 23:24 Anthony D'Atri  wrote:
>
> > Usually.  There is a high bar for changing command structure or output.
> > Newer versions are more likely to *add* commands and options than to change
> > or remove them.
> >
> > That said, probably some things now won't take a --cluster argument since
> > vanity names are deprecated, similarly with filestore.
> >
> > One exception that still pisses me off is when the mon time skew was
> > factored out of ceph -s into a separate command
> >
>
>
> Thank you for your answer. I understood. Your comment matches what I had
> inferred from reviewing changelogs so far.

Hi Satoru,

For rbd in particular, we try to be as compatible as possible -- we
have actually rejected some improvements to structured output (--format
json or --format xml) in the past to stay compatible.  As long as you
stick to structured output instead of parsing human-readable output,
I would expect the same shell script to work for quite a number of
releases, but it's not really tested.
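
For illustration, a minimal sketch of the structured approach (pool and
image names here are just placeholders):

  $ rbd ls --format json mypool | jq -r '.[]'
  $ rbd info --format json mypool/myimage | jq -r '.size'

The JSON keys are far more likely to stay stable across releases than
the column layout of the human-readable output.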

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [rbd mirror] integrity of journal-based image mirror

2024-06-23 Thread Ilya Dryomov
On Tue, May 28, 2024 at 4:53 AM Tony Liu  wrote:
>
> Hi,
>
> Say, the source image is being updated and data is mirrored to destination 
> continuously.
> Suddenly, networking of source is down and destination will be promoted and 
> used to
> restore the VM. Is that going to cause any FS issue and, for example, fsck 
> needs to be
> invoked to check and repair FS? Is there any integrity check during 
> journal-based mirror
> to avoid "partial" update caused by networking issue?

Hi Tony,

(Apologies for a late reply, this was marked as spam for some reason.)

rbd-mirror daemon on the destination replays the journal in
a point-in-time, crash-consistent fashion.  If the destination is
promoted while the source and/or the network is down, the promotion
would have to be forced with "rbd mirror image promote --force"
command.

As such, fsck may have some work to do because the filesystem is not
cleanly unmounted and the source is not cleanly demoted.  In general,
there is no way to avoid "partial" on-disk state from the filesystem
perspective here because crash consistency is weaker than filesystem
consistency.
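
For reference, a rough sketch of the recovery path on the destination
cluster (pool/image names and the /dev/rbd0 device are placeholders):

  $ rbd mirror image promote --force mypool/myimage
  $ rbd device map mypool/myimage
  $ fsck -y /dev/rbd0        # or xfs_repair, depending on the filesystem
  $ mount /dev/rbd0 /mnt/restored-vm-disk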

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-01 Thread Ilya Dryomov
On Mon, Jul 1, 2024 at 4:24 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/66756#note-1
>
> Release Notes - TBD
> LRC upgrade - TBD
>
> (Reruns were not done yet.)
>
> Seeking approvals/reviews for:
>
> smoke
> rados - Radek, Laura
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd, krbd - Ilya

Hi Yuri,

Need reruns for rbd and krbd.

After infrastructure failures are cleared in reruns, I'm prepared to
approve as is, but here is a list of no-brainer PRs that would fix some
of the failing jobs in case you end up rebuilding the branch:

https://github.com/ceph/ceph/pull/57031 (qa-only)
https://github.com/ceph/ceph/pull/57465 (qa-only)
https://github.com/ceph/ceph/pull/57556 (qa-only)
https://github.com/ceph/ceph/pull/57571

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-02 Thread Ilya Dryomov
On Mon, Jul 1, 2024 at 8:41 PM Ilya Dryomov  wrote:
>
> On Mon, Jul 1, 2024 at 4:24 PM Yuri Weinstein  wrote:
> >
> > Details of this release are summarized here:
> >
> > https://tracker.ceph.com/issues/66756#note-1
> >
> > Release Notes - TBD
> > LRC upgrade - TBD
> >
> > (Reruns were not done yet.)
> >
> > Seeking approvals/reviews for:
> >
> > smoke
> > rados - Radek, Laura
> > rgw - Casey
> > fs - Venky
> > orch - Adam King
> > rbd, krbd - Ilya
>
> Hi Yuri,
>
> Need reruns for rbd and krbd.
>
> After infrastructure failures are cleared in reruns, I'm prepared to
> approve as is, but here is a list of no-brainer PRs that would fix some

rbd approved.

Please do another rerun for krbd.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-02 Thread Ilya Dryomov
On Tue, Jul 2, 2024 at 9:13 PM Laura Flores  wrote:

> The rados suite, upgrade suite, and powercycle are approved by RADOS.
>
> Failures are summarized here:
> https://tracker.ceph.com/projects/rados/wiki/SQUID#Squid-1910
>
> @Ilya Dryomov , please see the upgrade/reef-x suite,
> which had this RBD failure:
>
>- https://tracker.ceph.com/issues/63131 - TestMigration.Stress2:
>snap3, block 171966464~4194304 differs after migration - RBD
>
>
This is known, it won't be a blocker.

Thanks,

Ilya


> @Venky Shankar , please see the powercycle suite,
> which had this CephFS failure:
>
>- https://tracker.ceph.com/issues/64572 - workunits/fsx.sh failure -
>CephFS
>
>
> On Tue, Jul 2, 2024 at 1:17 PM Ilya Dryomov  wrote:
>
>> On Mon, Jul 1, 2024 at 8:41 PM Ilya Dryomov  wrote:
>> >
>> > On Mon, Jul 1, 2024 at 4:24 PM Yuri Weinstein 
>> wrote:
>> > >
>> > > Details of this release are summarized here:
>> > >
>> > > https://tracker.ceph.com/issues/66756#note-1
>> > >
>> > > Release Notes - TBD
>> > > LRC upgrade - TBD
>> > >
>> > > (Reruns were not done yet.)
>> > >
>> > > Seeking approvals/reviews for:
>> > >
>> > > smoke
>> > > rados - Radek, Laura
>> > > rgw - Casey
>> > > fs - Venky
>> > > orch - Adam King
>> > > rbd, krbd - Ilya
>> >
>> > Hi Yuri,
>> >
>> > Need reruns for rbd and krbd.
>> >
>> > After infrastructure failures are cleared in reruns, I'm prepared to
>> > approve as is, but here is a list of no-brainer PRs that would fix some
>>
>> rbd approved.
>>
>> Please do another rerun for krbd.
>>
>> Thanks,
>>
>> Ilya
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
>
> --
>
> Laura Flores
>
> She/Her/Hers
>
> Software Engineer, Ceph Storage <https://ceph.io>
>
> Chicago, IL
>
> lflo...@ibm.com | lflo...@redhat.com 
> M: +17087388804
>
>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: squid 19.1.0 RC QE validation status

2024-07-03 Thread Ilya Dryomov
On Tue, Jul 2, 2024 at 8:16 PM Ilya Dryomov  wrote:
>
> On Mon, Jul 1, 2024 at 8:41 PM Ilya Dryomov  wrote:
> >
> > On Mon, Jul 1, 2024 at 4:24 PM Yuri Weinstein  wrote:
> > >
> > > Details of this release are summarized here:
> > >
> > > https://tracker.ceph.com/issues/66756#note-1
> > >
> > > Release Notes - TBD
> > > LRC upgrade - TBD
> > >
> > > (Reruns were not done yet.)
> > >
> > > Seeking approvals/reviews for:
> > >
> > > smoke
> > > rados - Radek, Laura
> > > rgw - Casey
> > > fs - Venky
> > > orch - Adam King
> > > rbd, krbd - Ilya
> >
> > Hi Yuri,
> >
> > Need reruns for rbd and krbd.
> >
> > After infrastructure failures are cleared in reruns, I'm prepared to
> > approve as is, but here is a list of no-brainer PRs that would fix some
>
> rbd approved.
>
> Please do another rerun for krbd.

krbd approved based on a rerun that I did on Ubuntu:

https://pulpito.ceph.com/dis-2024-07-03_14:41:39-krbd-squid-release-testing-default-smithi/

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Stuck Watcher

2024-07-25 Thread Ilya Dryomov
On Wed, Jul 3, 2024 at 5:45 PM Reid Guyett  wrote:
>
> Hi,
>
> I have a small script in a Docker container we use for a type of CRUD test
> to monitor availability. The script uses Python librbd/librados and is
> launched by Telegraf input.exec. It does the following:
>
>1. Creates an rbd image
>2. Writes a small amount of data to the rbd
>3. Reads the data from the rbd
>4. Deletes the rbd
>5. Closes connections
>
> It works great for 99% of the time but there is a small chance that
> something happens and the script takes too long (1 min) to complete and it
> is killed. I don't have logging to know which step it happens at yet but
> will be adding some. Regardless when the script is killed, sometimes the
> watcher on the rbd isn't going away. I use the same RBD name for each test
> and try to clean up the rbd if it exists prior to starting the next test
> but when the watcher is stuck, it can't.
>
> The only way to cleanup the watcher is to restart the primary osd for the
> rbd_header. Blocklist and restarting the container free the watcher.
>
> When I look at the status of the image I can see the watcher.
> # rbd -p pool status crud-image
> Watchers:
> watcher=:0/3587274006 client.1053762394 cookie=140375838755648
>
> Looking up the primary OSD
> # rbd -p pool info crud-image | grep id
> id: cf235ae95099cb
> # ceph osd map pool rbd_header.cf235ae95099cb
> osdmap e332984 pool 'pool' (1) object 'rbd_header.cf235ae95099cb' -> pg
> 1.a76f353e (1.53e) -> up ([7,66,176], p7) acting ([7,66,176], p7)
>
> Checking watchers on primary OSD does NOT list rbd_header.cf235ae95099cb
> # ceph tell osd.7 dump_watchers
> [
> {
> "namespace": "",
> "object": "rbd_header.70fa4f9b5c2cf8",
> "entity_name": {
> "type": "client",
> "num": 998139266
> },
> "cookie": 140354859197312,
> "timeout": 30,
> "entity_addr_t": {
> "type": "v1",
> "addr": ":0",
> "nonce": 2665188958
> }
> }
> ]
>
> Is this a bug somewhere? I expect that if my script is killed it's watcher
> should die out within a minute. New runs of the script would result in new
> watcher/client/cookie ids.

Hi Reid,

You might be hitting https://tracker.ceph.com/issues/58120.  It looks
like the ticket wasn't moved to the appropriate state when the fix got
merged, so unfortunately the fix isn't available in any of the stable
releases -- only in 19.1.0 (release candidate for squid).  I have just
tweaked the ticket and will stage backport PRs shortly.
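
Until the backports land, a rough interim workaround is to evict the
stale client by hand (pool name and address are placeholders; older
releases spell the command "ceph osd blacklist add"):

  $ rbd status mypool/crud-image          # note the watcher=ADDR field
  $ ceph osd blocklist add ADDR           # evict the stale watcher
  $ rbd rm mypool/crud-image              # cleanup should now succeed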

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dashboard error on 18.2.4 when listing block images

2024-07-25 Thread Ilya Dryomov
Hi Dan,

What is the output of

$ rbd info images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b

Can you confirm that the problem lies with that image by running

$ rbd diff --whole-object images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dashboard error on 18.2.4 when listing block images

2024-07-25 Thread Ilya Dryomov
On Thu, Jul 25, 2024 at 10:10 PM Dan O'Brien  wrote:
>
> Ilya -
>
> I don't think images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b is the 
> problem; it was just the last RBD image listed in the log before the crash. 
> The commands you suggested work fine when using that image:
>
> [root@os-storage ~]# rbd info 
> images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b
> rbd image '144ebab3-b2ee-4331-9d41-8505bcc4e19b':
> size 0 B in 0 objects
> order 23 (8 MiB objects)
> snapshot_count: 1
> id: f01052f76969e7
> block_name_prefix: rbd_data.f01052f76969e7
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten
> op_features:
> flags:
> create_timestamp: Mon Feb 12 17:50:54 2024
> access_timestamp: Mon Feb 12 17:50:54 2024
> modify_timestamp: Mon Feb 12 17:50:54 2024
> [root@os-storage ~]# rbd diff --whole-object 
> images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b

I'm sorry, I meant "rbd du images-pubos/144ebab3-b2ee-4331-9d41-8505bcc4e19b".

I suspect that you are hitting [1].  One workaround would be to go
through all images in all RBD pools that you have and remove any of
them that are 0-sized, meaning that "rbd info" reports "size 0 B in
0 objects".

>
> The other 2 images, related to 2 OpenStack volumes stuck in "error_deleting" 
> state, appear to be the cause of the problem:
>
> [root@os-storage ~]# rbd info 
> volumes-gpu/volume-28bbca8c-fec5-4a33-bbe2-30408f1ea37f
> rbd: error opening image volume-28bbca8c-fec5-4a33-bbe2-30408f1ea37f: (2) No 
> such file or directory
>
> [root@os-storage ~]# rbd diff --whole-object 
> volumes-gpu/volume-28bbca8c-fec5-4a33-bbe2-30408f1ea37f
> rbd: error opening image volume-28bbca8c-fec5-4a33-bbe2-30408f1ea37f: (2) No 
> such file or directory

I don't think these ENOENT errors are related -- the image isn't there
so I don't see a way for the assert to be reached.

[1] https://tracker.ceph.com/issues/66418

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Dashboard error on 18.2.4 when listing block images

2024-07-26 Thread Ilya Dryomov
On Fri, Jul 26, 2024 at 12:17 PM Dan O'Brien  wrote:
>
> I'll try that today.
>
> Looking at the tracker issue you flagged, it seems like it should be fixed in 
> v18.2.4, which is what I'm running.

Hi Dan,

The reef backport [1] has "Target version: Ceph - v18.2.5".  It was
originally targeted for 18.2.4, but when 18.2.3 was renumbered to
18.2.4, the release that would have been 18.2.4 became 18.2.5.

> Did that commit make it into the 18.2.4 build that was released?

No.

[1] https://tracker.ceph.com/issues/66583

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: squid 19.1.1 RC QE validation status

2024-08-05 Thread Ilya Dryomov
On Mon, Aug 5, 2024 at 10:32 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/67340#note-1
>
> Release Notes - N/A
> LRC upgrade - N/A
> Gibba upgrade - TBD
>
> Seeking approvals/reviews for:
>
> rados - Radek, Laura (https://github.com/ceph/ceph/pull/59020 is being
> tested and will be cherry-picked when ready)
>
> rgw - Eric, Adam E
> fs - Venky
> orch - Adam King
> rbd, krbd - Ilya

rbd and krbd approved.

(For krbd, https://tracker.ceph.com/issues/67353 popped up, but it's
not a blocker.)

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Resize RBD - New size not compatible with object map

2024-08-06 Thread Ilya Dryomov
On Tue, Aug 6, 2024 at 11:55 AM Torkil Svensgaard  wrote:
>
> Hi
>
> [ceph: root@ceph-flash1 /]# rbd info rbd_ec/projects
> rbd image 'projects':
>  size 750 TiB in 196608000 objects
>  order 22 (4 MiB objects)
>  snapshot_count: 0
>  id: 15a979db61dda7
>  data_pool: rbd_ec_data
>  block_name_prefix: rbd_data.10.15a979db61dda7
>  format: 2
>  features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten, data-pool
>  op_features:
>  flags:
>  create_timestamp: Thu Jul  7 10:57:13 2022
>  access_timestamp: Thu Jul  7 10:57:13 2022
>  modify_timestamp: Thu Jul  7 10:57:13 2022
>
> We wanted to resize it to 1PB but that failed:
>
> [ceph: root@ceph-flash1 /]# rbd resize rbd_ec/projects --size 1024T
> Resizing image: 0% complete...failed.
> rbd: shrinking an image is only allowed with the --allow-shrink flag
> 2024-08-06T08:42:01.053+ 7fc996492580 -1 librbd::Operations: New
> size not compatible with object map
>
> We can do 800T though:
>
> [ceph: root@ceph-flash1 /]# rbd resize rbd_ec/projects --size 800T
> Resizing image: 100% complete...done.
>
> A problem with the --1024T notation? Or we hitting some sort of size
> limit for RBD?

Hi Torkil,

The latter -- the object-map feature is limited to 256000000 objects,
which with 4 MiB objects works out to ~976T.  For anything larger, the
object-map feature can be disabled with the "rbd feature disable"
command (you would need to temporarily unmap the image if it's mapped).
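
A minimal sketch of that path (unmap first if the image is mapped;
fast-diff depends on object-map, so it has to be disabled as well):

  $ rbd device unmap rbd_ec/projects          # only if currently mapped
  $ rbd feature disable rbd_ec/projects fast-diff
  $ rbd feature disable rbd_ec/projects object-map
  $ rbd resize rbd_ec/projects --size 1024T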

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Ilya Dryomov
On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth
 wrote:
>
> Dear Cephalopodians,
>
> we've successfully operated a "good old" Mimic cluster with primary RBD 
> images, replicated via journaling to a "backup cluster" with Octopus, for the 
> past years (i.e. one-way replication).
> We've now finally gotten around upgrading the cluster with the primary images 
> to Octopus (and plan to upgrade further in the near future).
>
> After the upgrade, all MON+MGR-OSD+rbd_mirror daemons are running 15.2.17.
>
> We run three rbd-mirror daemons which all share the following client with 
> auth in the "backup" cluster, to which they write:
>
>client.rbd_mirror_backup
>  caps: [mon] profile rbd-mirror
>  caps: [osd] profile rbd
>
> and the following shared client with auth in the "primary cluster" from which 
> they are reading:
>
>client.rbd_mirror
>  caps: [mon] profile rbd
>  caps: [osd] profile rbd
>
> i.e. the same auth as described in the docs[0].
>
> Checking on the primary cluster, we get:
>
> # rbd mirror pool status
>health: UNKNOWN
>daemon health: UNKNOWN
>image health: OK
>images: 288 total
>288 replaying
>
> For some reason, some values are "unknown" here. But mirroring seems to work, 
> as checking on the backup cluster reveals, see for example:
>
># rbd mirror image status zabbix-test.example.com-disk2
>  zabbix-test.example.com-disk2:
>  global_id:   1bdcb981-c1c5-4172-9583-be6a6cd996ec
>  state:   up+replaying
>  description: replaying, 
> {"bytes_per_second":8540.27,"entries_behind_primary":0,"entries_per_second":1.8,"non_primary_position":{"entry_tid":869176,"object_number":504,"tag_tid":1},"primary_position":{"entry_tid":11143,"object_number":7,"tag_tid":1}}
>  service: rbd_mirror_backup on rbd-mirror002.example.com
>  last_update: 2024-08-12 09:53:17
>
> However, we do in some seemingly random cases see that journals are never 
> advanced on the primary cluster — staying with the example above, on the 
> primary cluster I find the following:
>
># rbd journal status --image zabbix-test.physik.uni-bonn.de-disk2
>minimum_set: 1
>active_set: 126
>  registered clients:
>[id=, commit_position=[positions=[[object_number=7, tag_tid=1, 
> entry_tid=11143], [object_number=6, tag_tid=1, entry_tid=11142], 
> [object_number=5, tag_tid=1, entry_tid=11141], [object_number=4, tag_tid=1, 
> entry_tid=11140]]], state=connected]
>[id=52b80bb0-a090-4f7d-9950-c8691ed8fee9, 
> commit_position=[positions=[[object_number=505, tag_tid=1, entry_tid=869181], 
> [object_number=504, tag_tid=1, entry_tid=869180], [object_number=507, 
> tag_tid=1, entry_tid=869179], [object_number=506, tag_tid=1, 
> entry_tid=869178]]], state=connected]
>
> As you can see, the minimum_set was not advanced. As can be seen in "mirror 
> image status", it shows the strange output that non_primary_position seems 
> much more advanced than primary_position. This seems to happen "at random" 
> for only a few volumes...
> There are no other active clients apart from the actual VM (libvirt+qemu).

Hi Oliver,

Were the VM clients (i.e. librbd on the hypervisor nodes) upgraded as well?

>
> As a quick fix, to purge journals piling up over and over, we've only found 
> the "solution" to temporarily disable and then re-enable journaling for 
> affected VM disks, which can be identified by:
>   for A in $(rbd ls); do echo -n "$A: "; rbd --format=json journal status 
> --image $A | jq '.active_set - .minimum_set'; done
>
>
> Any idea what is going wrong here?
> This did not happen with the primary cluster running Mimic and the backup 
> cluster running Octopus before, and also did not happen when both were 
> running Mimic.

You might be hitting https://tracker.ceph.com/issues/57396.
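
For reference, a minimal sketch of the disable/re-enable workaround
described above (image spec is a placeholder; note that this throws
away the existing journal for that image):

  $ rbd feature disable mypool/myimage journaling
  $ rbd feature enable mypool/myimage journaling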

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-12 Thread Ilya Dryomov
On Mon, Aug 12, 2024 at 11:28 AM Oliver Freyermuth
 wrote:
>
> Am 12.08.24 um 11:09 schrieb Ilya Dryomov:
> > On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth
> >  wrote:
> >>
> >> Dear Cephalopodians,
> >>
> >> we've successfully operated a "good old" Mimic cluster with primary RBD 
> >> images, replicated via journaling to a "backup cluster" with Octopus, for 
> >> the past years (i.e. one-way replication).
> >> We've now finally gotten around upgrading the cluster with the primary 
> >> images to Octopus (and plan to upgrade further in the near future).
> >>
> >> After the upgrade, all MON+MGR-OSD+rbd_mirror daemons are running 15.2.17.
> >>
> >> We run three rbd-mirror daemons which all share the following client with 
> >> auth in the "backup" cluster, to which they write:
> >>
> >> client.rbd_mirror_backup
> >>   caps: [mon] profile rbd-mirror
> >>   caps: [osd] profile rbd
> >>
> >> and the following shared client with auth in the "primary cluster" from 
> >> which they are reading:
> >>
> >> client.rbd_mirror
> >>   caps: [mon] profile rbd
> >>   caps: [osd] profile rbd
> >>
> >> i.e. the same auth as described in the docs[0].
> >>
> >> Checking on the primary cluster, we get:
> >>
> >> # rbd mirror pool status
> >> health: UNKNOWN
> >> daemon health: UNKNOWN
> >> image health: OK
> >> images: 288 total
> >> 288 replaying
> >>
> >> For some reason, some values are "unknown" here. But mirroring seems to 
> >> work, as checking on the backup cluster reveals, see for example:
> >>
> >> # rbd mirror image status zabbix-test.example.com-disk2
> >>   zabbix-test.example.com-disk2:
> >>   global_id:   1bdcb981-c1c5-4172-9583-be6a6cd996ec
> >>   state:   up+replaying
> >>   description: replaying, 
> >> {"bytes_per_second":8540.27,"entries_behind_primary":0,"entries_per_second":1.8,"non_primary_position":{"entry_tid":869176,"object_number":504,"tag_tid":1},"primary_position":{"entry_tid":11143,"object_number":7,"tag_tid":1}}
> >>   service: rbd_mirror_backup on rbd-mirror002.example.com
> >>   last_update: 2024-08-12 09:53:17
> >>
> >> However, we do in some seemingly random cases see that journals are never 
> >> advanced on the primary cluster — staying with the example above, on the 
> >> primary cluster I find the following:
> >>
> >> # rbd journal status --image zabbix-test.physik.uni-bonn.de-disk2
> >> minimum_set: 1
> >> active_set: 126
> >>   registered clients:
> >> [id=, commit_position=[positions=[[object_number=7, tag_tid=1, 
> >> entry_tid=11143], [object_number=6, tag_tid=1, entry_tid=11142], 
> >> [object_number=5, tag_tid=1, entry_tid=11141], [object_number=4, 
> >> tag_tid=1, entry_tid=11140]]], state=connected]
> >> [id=52b80bb0-a090-4f7d-9950-c8691ed8fee9, 
> >> commit_position=[positions=[[object_number=505, tag_tid=1, 
> >> entry_tid=869181], [object_number=504, tag_tid=1, entry_tid=869180], 
> >> [object_number=507, tag_tid=1, entry_tid=869179], [object_number=506, 
> >> tag_tid=1, entry_tid=869178]]], state=connected]
> >>
> >> As you can see, the minimum_set was not advanced. As can be seen in 
> >> "mirror image status", it shows the strange output that 
> >> non_primary_position seems much more advanced than primary_position. This 
> >> seems to happen "at random" for only a few volumes...
> >> There are no other active clients apart from the actual VM (libvirt+qemu).
> >
> > Hi Oliver,
> >
> > Were the VM clients (i.e. librbd on the hypervisor nodes) upgraded as well?
>
> Hi Ilya,
>
> "some of them" — as a matter of fact, we wanted to stress-test VM restarting 
> and live migration first, and in some cases saw VMs stuck for a long time, 
> which is now understandable...
>
> >>
> >> As a quick fix, to purge journals piling up over and over, we've only 
> >> found the "solution" to temporarily disable and then re-enable journaling 
> >> for affected VM disks, which can be identified by:
> >>for

[ceph-users] Re: RBD Journaling seemingly getting stuck for some VMs after upgrade to Octopus

2024-08-13 Thread Ilya Dryomov
On Mon, Aug 12, 2024 at 1:17 PM Oliver Freyermuth
 wrote:
>
> Am 12.08.24 um 12:16 schrieb Ilya Dryomov:
> > On Mon, Aug 12, 2024 at 11:28 AM Oliver Freyermuth
> >  wrote:
> >>
> >> Am 12.08.24 um 11:09 schrieb Ilya Dryomov:
> >>> On Mon, Aug 12, 2024 at 10:20 AM Oliver Freyermuth
> >>>  wrote:
> >>>>
> >>>> Dear Cephalopodians,
> >>>>
> >>>> we've successfully operated a "good old" Mimic cluster with primary RBD 
> >>>> images, replicated via journaling to a "backup cluster" with Octopus, 
> >>>> for the past years (i.e. one-way replication).
> >>>> We've now finally gotten around upgrading the cluster with the primary 
> >>>> images to Octopus (and plan to upgrade further in the near future).
> >>>>
> >>>> After the upgrade, all MON+MGR-OSD+rbd_mirror daemons are running 
> >>>> 15.2.17.
> >>>>
> >>>> We run three rbd-mirror daemons which all share the following client 
> >>>> with auth in the "backup" cluster, to which they write:
> >>>>
> >>>>  client.rbd_mirror_backup
> >>>>caps: [mon] profile rbd-mirror
> >>>>caps: [osd] profile rbd
> >>>>
> >>>> and the following shared client with auth in the "primary cluster" from 
> >>>> which they are reading:
> >>>>
> >>>>  client.rbd_mirror
> >>>>caps: [mon] profile rbd
> >>>>caps: [osd] profile rbd
> >>>>
> >>>> i.e. the same auth as described in the docs[0].
> >>>>
> >>>> Checking on the primary cluster, we get:
> >>>>
> >>>> # rbd mirror pool status
> >>>>  health: UNKNOWN
> >>>>  daemon health: UNKNOWN
> >>>>  image health: OK
> >>>>  images: 288 total
> >>>>  288 replaying
> >>>>
> >>>> For some reason, some values are "unknown" here. But mirroring seems to 
> >>>> work, as checking on the backup cluster reveals, see for example:
> >>>>
> >>>>  # rbd mirror image status zabbix-test.example.com-disk2
> >>>>zabbix-test.example.com-disk2:
> >>>>global_id:   1bdcb981-c1c5-4172-9583-be6a6cd996ec
> >>>>state:   up+replaying
> >>>>description: replaying, 
> >>>> {"bytes_per_second":8540.27,"entries_behind_primary":0,"entries_per_second":1.8,"non_primary_position":{"entry_tid":869176,"object_number":504,"tag_tid":1},"primary_position":{"entry_tid":11143,"object_number":7,"tag_tid":1}}
> >>>>service: rbd_mirror_backup on rbd-mirror002.example.com
> >>>>last_update: 2024-08-12 09:53:17
> >>>>
> >>>> However, we do in some seemingly random cases see that journals are 
> >>>> never advanced on the primary cluster — staying with the example above, 
> >>>> on the primary cluster I find the following:
> >>>>
> >>>>  # rbd journal status --image zabbix-test.physik.uni-bonn.de-disk2
> >>>>  minimum_set: 1
> >>>>  active_set: 126
> >>>>registered clients:
> >>>>  [id=, commit_position=[positions=[[object_number=7, 
> >>>> tag_tid=1, entry_tid=11143], [object_number=6, tag_tid=1, 
> >>>> entry_tid=11142], [object_number=5, tag_tid=1, entry_tid=11141], 
> >>>> [object_number=4, tag_tid=1, entry_tid=11140]]], state=connected]
> >>>>  [id=52b80bb0-a090-4f7d-9950-c8691ed8fee9, 
> >>>> commit_position=[positions=[[object_number=505, tag_tid=1, 
> >>>> entry_tid=869181], [object_number=504, tag_tid=1, entry_tid=869180], 
> >>>> [object_number=507, tag_tid=1, entry_tid=869179], [object_number=506, 
> >>>> tag_tid=1, entry_tid=869178]]], state=connected]
> >>>>
> >>>> As you can see, the minimum_set was not advanced. As can be seen in 
> >>>> "mirror image status", it shows the strange output that 
> >>>> non_primary_position seems much more advanced than primary_position. 
> >>>> This seems to happen "at rando

[ceph-users] Re: CRC Bad Signature when using KRBD

2024-09-06 Thread Ilya Dryomov
On Fri, Sep 6, 2024 at 3:54 AM  wrote:
>
> Hello Ceph Users,
>
> * Problem: we get the following errors when using krbd, we are using rbd
> for vms.
> * Workaround: by switching to librbd the errors disappear.
>
> * Software:
> ** Kernel: 6.8.8-2 (parameters: intel_iommu=on iommu=pt
> pcie_aspm.policy=performance)
> ** Ceph: 18.2.2
>
> Description/Details: Errors from using krbd with ceph. Side-effects:
>
> [Wed Aug 21 03:04:17 2024] libceph: read_partial_message
> 15af2284 data crc 1221767919 != exp. 282251377
> [Wed Aug 21 03:04:17 2024] libceph: read_partial_message
> 66b200ab data crc 3817026135 != exp. 3925662391
> [Wed Aug 21 03:04:17 2024] libceph: osd15 (1)10.1.4.13:6836 bad
> crc/signature
> [Wed Aug 21 03:04:17 2024] libceph: osd13 (1)10.1.4.13:6809 bad
> crc/signature

Hi Jonas,

Are these VMs running Windows?  If so, are you using rxbounce mapping
option ("rbd device map -o rxbounce ...")?  It's more or less required
in case there is a Windows kernel on the I/O path:

rxbounce - Use a bounce buffer when receiving data (since 5.17).
The default behaviour is to read directly into the destination
buffer.  A bounce buffer is needed if the destination buffer isn’t
guaranteed to be stable (i.e. remain unchanged while it is being
read to). In particular this is the case for Windows where
a system-wide “dummy” (throwaway) page may be mapped into the
destination buffer in order to generate a single large I/O.
Otherwise, “libceph: … bad crc/signature” or “libceph: … integrity
error, bad crc” errors and associated performance degradation are
expected.
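
If that's the case, a minimal sketch (image spec is a placeholder):

  $ rbd device map -o rxbounce mypool/windows-vm-disk

or, to make it the default for all maps done by a given client,
something like the following in ceph.conf should work:

  [client]
      rbd_default_map_options = rxbounce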

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RFC: cephfs fallocate

2024-09-10 Thread Ilya Dryomov
On Tue, Sep 10, 2024 at 1:23 PM Milind Changire  wrote:
>
> Problem:
> CephFS fallocate implementation does not actually reserve data blocks
> when mode is 0.
> It only truncates the file to the given size by setting the file size
> in the inode.
> So, there is no guarantee that writes to the file will succeed
>
> Solution:
> Since an immediate remediation of this problem is not possible, I
> request users to vote on the most suitable approach to avoid breaking
> dependent software:
> 1. report EOPNOTSUPP (operation not supported) when mode is 0
> 2. continue with the existing implementation of file size truncation when mode is 0

Hi Milind,

The kernel client went with (1) in 2018:


https://github.com/ceph/ceph-client/commit/bddff633ab7bc60a18a86ac8b322695b6f8594d0
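
Roughly, what that looks like from userspace on a kernel-mounted CephFS
(the path is a placeholder):

  $ fallocate -l 1G /mnt/cephfs/prealloc.img
  fallocate: fallocate failed: Operation not supported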

And according to Patrick the intent was to do the same for ceph-fuse
(libcephfs) as well:

https://tracker.ceph.com/issues/36317#note-3

Did that part fall through the cracks?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v15.2.6 Octopus released

2020-11-19 Thread Ilya Dryomov
On Thu, Nov 19, 2020 at 3:39 AM David Galloway  wrote:
>
> This is the 6th backport release in the Octopus series. This releases
> fixes a security flaw affecting Messenger V2 for Octopus & Nautilus. We
> recommend users to update to this release.
>
> Notable Changes
> ---
> * CVE 2020-25660: Fix a regression in Messenger V2 replay attacks

Correction: In Octopus, both Messenger v1 and Messenger v2 are
affected.  The release note will be fixed in [1].

[1] https://github.com/ceph/ceph/pull/38142

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Image can't be formatted - blk_error

2021-01-08 Thread Ilya Dryomov
On Fri, Jan 8, 2021 at 2:19 PM Gaël THEROND  wrote:
>
> Hi everyone!
>
> I'm facing a weird issue with one of my CEPH clusters:
>
> OS: CentOS - 8.2.2004 (Core)
> CEPH: Nautilus 14.2.11 - stable
> RBD using erasure code profile (K=3; m=2)
>
> When I want to format one of my RBD image (client side) I've got the
> following kernel messages multiple time with different sector IDs:
>
>
> *[2417011.790154] blk_update_request: I/O error, dev rbd23, sector
> 164743869184 op 0x3:(DISCARD) flags 0x4000 phys_seg 1 prio class
> 0[2417011.791404] rbd: rbd23: discard at objno 20110336 2490368~1703936
> result -1  *
>
> At first I thought about a faulty disk BUT the monitoring system is not
> showing anything faulty so I decided to run manual tests on all my OSDs to
> look at disk health using smartctl etc.
>
> None of them is marked as not healthy and actually they don't get any
> counter with faulty sectors/read or writes and the Wear Level is 99%
>
> So, the only particularity of this image is it is a 80Tb image, but it
> shouldn't be an issue as we already have that kind of image size used on
> another pool.
>
> If anyone have a clue at how I could sort this out, I'll be more than happy

Hi Gaël,

What command are you running to format the image?

Is it persistent?  After the first formatting attempt fails, do the
following attempts fail too?

Is it always the same set of sectors?

Could you please attach the output of "rbd info" for that image and the
entire kernel log from the time that image is mapped?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Image can't be formatted - blk_error

2021-01-11 Thread Ilya Dryomov
On Mon, Jan 11, 2021 at 10:09 AM Gaël THEROND  wrote:
>
> Hi Ilya,
>
> Here is additional information:
> My cluster is a three OSD Nodes cluster with each node having 24 4TB SSD 
> disks.
>
> The mkfs.xfs command fail with the following error: 
> https://pastebin.com/yTmMUtQs
>
> I'm using the following command to format the image: mkfs.xfs 
> /dev/rbd//
> I'm facing the same problem (and same sectors) if I'm directly targeting the 
> device with mkfs.xfs /dev/rbb
>
> The client authentication caps are as follows: https://pastebin.com/UuAHRycF
>
> Regarding your questions, yes, it is a persistent issue as soon as I try to 
> create a large image from a newly created pool.
> Yes, after the first attempt, all new attempts fail too.
> Yes, it is always the same set of sectors that fails.

Have you tried writing to sector 0, just to take mkfs.xfs out of the
picture?  E.g. "dd if=/dev/zero of=/dev/rbd17 bs=512 count=1 oflag=direct"?

>
> Strange thing is, if I use an already existing pool, and create this 80Tb 
> image within this pool, it formats it correctly.

What do you mean by a newly created pool?  A metadata pool, a data pool
or both?

Are you deleting and re-creating pools (whether metadata or data) with
the same name?  It would help if you paste all commands, starting with
how you create pools all the way to a failing write.

Have you tried mapping using the admin user ("rbd map --id admin ...")?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: krbd crc

2021-02-11 Thread Ilya Dryomov
On Thu, Feb 11, 2021 at 1:34 AM Seena Fallah  wrote:
>
> Hi,
> I have a few questions about krbd on kernel 4.15
>
> 1. Does it support msgr v2? (If not which kernel supports msgr v2?)

No.  Support for msgr2 has been merged into kernel 5.11, due to be
released this weekend.

Note that the kernel client will only support revision 1 of the msgr2
protocol (also referred to as msgr2.1).  The original msgr2 protocol has
security, integrity and some general robustness issues that made it not
conducive to bringing into the kernel.

msgr2.1 protocol was implemented in nautilus 14.2.11 and octopus
15.2.5, so if you want e.g. in-transit encryption with krbd, you will
need at least those versions on the server side.

The original msgr2 protocol is considered deprecated.
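
For example, a minimal sketch of mapping over msgr2.1 from a 5.11+
kernel (image spec is a placeholder):

  $ rbd device map -o ms_mode=prefer-crc mypool/myimage   # msgr2.1 if available, else v1
  $ rbd device map -o ms_mode=secure mypool/myimage       # msgr2.1 with in-transit encryption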

> 2. If krbd is using msgr v1, does it checksum (CRC) the messages that it
> sends to see for example if the write is correct or not? and if it does
> checksums, If there were a problem in write how does it react to that? For
> example, does it raise I/O Error or retry or...?

Yes, it does.  In case of a crc mismatch, the messenger will reset the
session and the write will be retried automatically.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: krbd crc

2021-02-11 Thread Ilya Dryomov
On Thu, Feb 11, 2021 at 12:44 PM Seena Fallah  wrote:
>
> Many thanks for your response.
> One more question, In the case of a CRC mismatch how many times does it retry 
> and does it raise any error logs in the kernel to see if it had a CRC 
> mismatch or not?

You will see bad "crc/signature" errors in dmesg.

When the session is reset all its state is discarded, so it will retry
indefinitely.

>
> On Thu, Feb 11, 2021 at 3:05 PM Ilya Dryomov  wrote:
>>
>> On Thu, Feb 11, 2021 at 1:34 AM Seena Fallah  wrote:
>> >
>> > Hi,
>> > I have a few questions about krbd on kernel 4.15
>> >
>> > 1. Does it support msgr v2? (If not which kernel supports msgr v2?)
>>
>> No.  Support for msgr2 has been merged into kernel 5.11, due to be
>> released this weekend.
>>
>> Note that the kernel client will only support revision 1 of the msgr2
>> protocol (also referred to as msgr2.1).  The original msgr2 protocol has
>> security, integrity and some general robustness issues that made it not
>> conducive to bringing into the kernel.
>>
>> msgr2.1 protocol was implemented in nautilus 14.2.11 and octopus
>> 15.2.5, so if you want e.g. in-transit encryption with krbd, you will
>> need at least those versions on the server side.
>>
>> The original msgr2 protocol is considered deprecated.
>>
>> > 2. If krbd is using msgr v1, does it checksum (CRC) the messages that it
>> > sends to see for example if the write is correct or not? and if it does
>> > checksums, If there were a problem in write how does it react to that? For
>> > example, does it raise I/O Error or retry or...?
>>
>> Yes, it does.  In case of a crc mismatch, the messenger will reset the
>> session and the write will be retried automatically.
>>
>> Thanks,
>>
>> Ilya

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RBD Image can't be formatted - blk_error

2021-02-21 Thread Ilya Dryomov
On Sun, Feb 21, 2021 at 1:04 PM Gaël THEROND  wrote:
>
> Hi Ilya,
>
> Sorry for the late reply, I've been sick all week long :-/ and then really 
> busy at work once I'll get back.
>
> I've tried to wipe out the image by zeroing it (Even tried to fully wipe it), 
> I can see the same error message.
> The thing is, isn't the image created supposed to be empty?

The image is empty, but mkfs.xfs doesn't know that and attempts to
discard it anyway.

>
> Regarding the pool creation, both, I created a new metadata pool (archives) 
> and a new data pool (archives-data) as my pool is used for EC based RBD 
> images.
> Both too, I've tried to delete and re-create pools with a different name and 
> the same name, we always hit the issue.
>
> Here are the commands I used to create those pools and volumes:
> POOLS CREATION:
> ceph osd pool create archives 1024 1024 replicated
> ceph osd pool create archives-data 1024 1024 erasure standard-ec
> ceph osd pool set archives-data allow_ec_overwrites true
>
> VOLUME CREATION:
> rbd create --size 80T --data-pool archives-data archives/mirror
>
> just for complementary information, we use the following EC profile:
>
> k=3
> m=2
> plugin=jerasure
> crush-failure-domain=host
> crush-device-class=ssd
> technique=reed_sol_van
>
> This cluster is composed of 10 OSDs nodes filled with 24 8Tb SSD disks so if 
> I'm not wrong with my maths, our profile is OK so it shouldn't be a 
> profile/crushmap issue.
>
> I didn't try to map the volume using the admin user tho, you're right I 
> should in order to eliminate any auth issue, but I doubt it is related as a 
> smaller image works just fine with this client key using the same pool name.

Please do, it really looks like an authentication issue to me.
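
A quick sketch of that test (the /dev/rbd0 device is whatever the map
command reports):

  $ rbd map --id admin archives/mirror
  $ dd if=/dev/zero of=/dev/rbd0 bs=512 count=1 oflag=direct
  $ blkdiscard --offset 0 --length 1M /dev/rbd0   # exercises the discard path mkfs.xfs uses

If those succeed with the admin user but not with the restricted one,
that would point at the caps.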

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: kernel: ceph: mdsmap_decode got incorrect state(up:standby-replay)

2021-02-24 Thread Ilya Dryomov
On Wed, Feb 24, 2021 at 4:09 PM Frank Schilder  wrote:
>
> Hi all,
>
> I get these log messages all the time, sometimes also directly to the 
> terminal:
>
> kernel: ceph: mdsmap_decode got incorrect state(up:standby-replay)
>
> The cluster is healthy and the MDS complaining is actually both, configured 
> and running as a standby-replay daemon. These messages show up at least every 
> hour, but sometimes with much higher frequency. The cluster seems healthy 
> though.
>
> A google search did not bring up anything useful.
>
> Can anyone shed some light on what this message means?

Hi Frank,

It should be harmless.  This warning was removed in kernel 5.11.

Jeff, we should probably push that patch to stable kernels.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.2.16

2021-03-02 Thread Ilya Dryomov
On Tue, Mar 2, 2021 at 9:26 AM Stefan Kooman  wrote:
>
> Hi,
>
> On a CentOS 7 VM with mainline kernel (5.11.2-1.el7.elrepo.x86_64 #1 SMP
> Fri Feb 26 11:54:18 EST 2021 x86_64 x86_64 x86_64 GNU/Linux) and with
> Ceph Octopus 15.2.9 packages installed. The MDS server is running
> Nautilus 14.2.16. Messenger v2 has been enabled. Poort 3300 of the
> monitors is reachable from the client. At mount time we get the following:
>
> > Mar  2 09:01:14  kernel: Key type ceph registered
> > Mar  2 09:01:14  kernel: libceph: loaded (mon/osd proto 15/24)
> > Mar  2 09:01:14  kernel: FS-Cache: Netfs 'ceph' registered for caching
> > Mar  2 09:01:14  kernel: ceph: loaded (mds proto 32)
> > Mar  2 09:01:14  kernel: libceph: mon4 (1)[mond addr]:6789 session 
> > established
> > Mar  2 09:01:14  kernel: libceph: another match of type 1 in addrvec
> > Mar  2 09:01:14  kernel: ceph: corrupt mdsmap
> > Mar  2 09:01:14  kernel: ceph: error decoding mdsmap -22
> > Mar  2 09:01:14  kernel: libceph: another match of type 1 in addrvec
> > Mar  2 09:01:14  kernel: libceph: corrupt full osdmap (-22) epoch 98764 off 
> > 6357 (27a57a75 of d3075952-e307797f)
> > Mar  2 09:02:15  kernel: ceph: No mds server is up or the cluster is laggy
>
> The /etc/ceph/ceph.conf has been adjusted to reflect the messenger v2
> changes: ms_bind_ipv6=true, ms_bind_ipv4=false. The kernel client still
> seems to be use the v1 port though (although since 5.11 v2 should be
> supported).
>
> Has anyone seen this before? Just guessing here, but could it that the
> client tries to speak v2 protocol on v1 port?

Hi Stefan,

Those "another match of type 1" errors suggest that you have two
different v1 addresses for some of or all OSDs and MDSes in osdmap
and mdsmap respectively.

What is the output of "ceph osd dump" and "ceph fs dump"?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.2.16

2021-03-02 Thread Ilya Dryomov
On Tue, Mar 2, 2021 at 6:02 PM Stefan Kooman  wrote:
>
> On 3/2/21 5:42 PM, Ilya Dryomov wrote:
> > On Tue, Mar 2, 2021 at 9:26 AM Stefan Kooman  wrote:
> >>
> >> Hi,
> >>
> >> On a CentOS 7 VM with mainline kernel (5.11.2-1.el7.elrepo.x86_64 #1 SMP
> >> Fri Feb 26 11:54:18 EST 2021 x86_64 x86_64 x86_64 GNU/Linux) and with
> >> Ceph Octopus 15.2.9 packages installed. The MDS server is running
> >> Nautilus 14.2.16. Messenger v2 has been enabled. Poort 3300 of the
> >> monitors is reachable from the client. At mount time we get the following:
> >>
> >>> Mar  2 09:01:14  kernel: Key type ceph registered
> >>> Mar  2 09:01:14  kernel: libceph: loaded (mon/osd proto 15/24)
> >>> Mar  2 09:01:14  kernel: FS-Cache: Netfs 'ceph' registered for caching
> >>> Mar  2 09:01:14  kernel: ceph: loaded (mds proto 32)
> >>> Mar  2 09:01:14  kernel: libceph: mon4 (1)[mond addr]:6789 session 
> >>> established
> >>> Mar  2 09:01:14  kernel: libceph: another match of type 1 in addrvec
> >>> Mar  2 09:01:14  kernel: ceph: corrupt mdsmap
> >>> Mar  2 09:01:14  kernel: ceph: error decoding mdsmap -22
> >>> Mar  2 09:01:14  kernel: libceph: another match of type 1 in addrvec
> >>> Mar  2 09:01:14  kernel: libceph: corrupt full osdmap (-22) epoch 98764 
> >>> off 6357 (27a57a75 of d3075952-e307797f)
> >>> Mar  2 09:02:15  kernel: ceph: No mds server is up or the cluster is laggy
> >>
> >> The /etc/ceph/ceph.conf has been adjusted to reflect the messenger v2
> >> changes: ms_bind_ipv6=true, ms_bind_ipv4=false. The kernel client still
> >> seems to be use the v1 port though (although since 5.11 v2 should be
> >> supported).
> >>
> >> Has anyone seen this before? Just guessing here, but could it that the
> >> client tries to speak v2 protocol on v1 port?
> >
> > Hi Stefan,
> >
> > Those "another match of type 1" errors suggest that you have two
> > different v1 addresses for some of or all OSDs and MDSes in osdmap
> > and mdsmap respectively.
> >
> > What is the output of "ceph osd dump" and "ceph fs dump"?
>
> That's a lot of output, so I trimmed it:
>
> --- snip ---
> osd.0 up   in  weight 1 up_from 98071 up_thru 98719 down_at 98068
> last_clean_interval [96047,98067)
> [v2:[2001:7b8:80:1:0:1:2:1]:6848/505534,v1:[2001:7b8:80:1:0:1:2:1]:6854/505534,v2:0.0.0.0:6860/505534,v1:0.0.0.0:6866/505534]

Where did "v2:0.0.0.0:6860/505534,v1:0.0.0.0:6866/505534" come from?
This is what confuses the kernel client: it sees two addresses of
the same type and doesn't know which one to pick.  Instead of blindly
picking the first one (or some other dubious heuristic) it just denies
the osdmap.

> [mds.mds1{0:229930080} state up:active seq 144042 addr
> [v2:[2001:7b8:80:1:0:1:3:1]:6800/2234186180,v1:[2001:7b8:80:1:0:1:3:1]:6801/2234186180,v2:0.0.0.0:6802/2234186180,v1:0.0.0.0:6803/2234186180]]

Same for the mdsmap.

Were you using ipv6 with the kernel client before upgrading to 5.11?

What is output of "ceph daemon osd.0 config get ms_bind_ipv4" on the
osd0 node?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs: unable to mount share with 5.11 mainline, ceph 15.2.9, MDS 14.2.16

2021-03-03 Thread Ilya Dryomov
On Wed, Mar 3, 2021 at 11:15 AM Stefan Kooman  wrote:
>
> On 3/2/21 6:00 PM, Jeff Layton wrote:
>
> >>
> >>>
> >>> v2 support in the kernel is keyed on the ms_mode= mount option, so that
> >>> has to be passed in if you're connecting to a v2 port. Until the mount
> >>> helpers get support for that option you'll need to specify the address
> >>> and port manually if you want to use v2.
> >>
> >> I've tried feeding it ms_mode=v2 but I get a "mount error 22 = Invalid
> >> argument", the ms_mode=legacy does work, but fails with the same errors.
> >>
> >
> > That needs different values. See:
> >
> >  
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=00498b994113a871a556f7ff24a4cf8a00611700
> >
> > You can try passing in a specific mon address and port, like:
> >
> >  192.168.21.22:3300:/cephfs/dir/
> >
> > ...and then pass in ms_mode=crc or something similar.
> >
> > That said, what you're doing should be working, so this sounds like a
> > regression. I presume you're able to mount with earlier kernels? What's
> > the latest kernel version that you have that works?
>
> 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26 Feb 2021 18:26:41
> + x86_64 GNU/Linux) with a cluster that has ms_bind_ipv4=false
> works. Port 3300 ms_mode=prefer-crc and ms_mode=crc work.
>
> I have tested with 5.11 kernel (5.11.2-arch1-1 #1 SMP PREEMPT Fri, 26
> Feb 2021 18:26:41 + x86_64 GNU/Linux) port 3300 and ms_mode=crc as
> well as ms_mode=prefer-crc and that works when cluster is running with
> ms_bind_ipv4=false. So the "fix" is to have this config option set: ceph
> config set global ms_bind_ipv4 false

Right.  According to your original post that was already the case:
"ms_bind_ipv6=trie, ms_bind_ipv4=false".

>
> 5.10 kernel (5.10.19-1-lts Arch Linux) works with a cluster that is IPv6
> only but has ms_bind_ipv4=true. So it's "broken" since 5.11.
>
> So, we have done more reading / researching on the ms_bind_ipv{4,6} options:
>
> -
> https://pve.proxmox.com/wiki/Ceph_Luminous_to_Nautilus#Restart_the_OSD_daemon_on_all_nodes
>
> - https://github.com/rook/rook/issues/6266
>
> ^^ Describe that you have to disable bind to IPv4.
>
> - https://github.com/ceph/ceph/pull/13317
>
> ^^ this PR is not completely correct:
>
> **Note:** You may use IPv6 addresses instead of IPv4 addresses, but
> you must set ``ms bind ipv6`` to ``true``.
>
> ^^ That is not enough as we have learned, and starts to give trouble
> with 5.11 linux cephfs client.
>
> And from this documentation:
> https://docs.ceph.com/en/latest/rados/configuration/network-config-ref/#ipv4-ipv6-dual-stack-mode
> we learned that dual stack is not possible for any current stable
> release, but might be possible with latest code. So the takeaway is that
> the linux kernel client needs fixing to be able to support dual stack
> clusters in the future (multiple v1 / v2 address families), and, that
> until then you should run with ms_bind_ipv4=false for IPv6 only clusters.

I don't think we do any dual stack testing, whether in userspace or
(certainly!) with the kernel client.

>
> I'll make a PR to clear up the documenation. Do you want me to create a
> tracker for the kernel client? I will happily test your changes.

Sure.  You are correct that the kernel client needs a bit of work, as
we haven't considered dual-stack configurations there at all.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: New Issue - Mapping Block Devices

2021-03-22 Thread Ilya Dryomov
On Tue, Mar 23, 2021 at 6:13 AM duluxoz  wrote:
>
> Hi All,
>
> I've got a new issue (hopefully this one will be the last).
>
> I have a working Ceph (Octopus) cluster with a replicated pool
> (my-pool), an erasure-coded pool (my-pool-data), and an image (my-image)
> created - all *seems* to be working correctly. I also have the correct
> Keyring specified (ceph.client.my-id.keyring).
>
> ceph -s is reporting all healthy.
>
> The ec profile (my-ec-profile) was created with: ceph osd
> erasure-code-profile set my-ec-profile k=4 m=2 crush-failure-domain=host
>
> The replicated pool was created with: ceph osd pool create my-pool 100
> 100 replicated
>
> Followed by: rbd pool init my-pool
>
> The ec pool was created with: ceph osd pool create my-pool-data 100 100
> erasure my-ec-profile --autoscale-mode=on
>
> Followed by: rbd pool init my-pool-data
>
> The image was created with: rbd create -s 1T --data-pool my-pool-data
> my-pool/my-image
>
> The Keyring was created with: ceph auth get-or-create client.my-id mon
> 'profile rbd' osd 'profile rbd pool=my-pool' mgr 'profile rbd
> pool=my-pool' -o /etc/ceph/ceph.client.my-id.keyring

Hi Matthew,

If you are using a separate data pool, you need to give "my-id" access
to it:

  osd 'profile rbd pool=my-pool, profile rbd pool=my-pool-data'
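
For example, applying it with "ceph auth caps" (a sketch; adjust names
as needed):

  $ ceph auth caps client.my-id \
        mon 'profile rbd' \
        osd 'profile rbd pool=my-pool, profile rbd pool=my-pool-data' \
        mgr 'profile rbd pool=my-pool'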

>
> On a centos8 client machine I have installed ceph-common, placed the
> Keyring file into /etc/ceph/, and run the command: rbd device map
> my-pool/my-image --id my-id

Does "rbd device map" actually succeed?  Can you attach dmesg from that
client machine from when you (attempted to) map, ran fdisk, etc?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-maintainers] v14.2.20 Nautilus released

2021-04-20 Thread Ilya Dryomov
On Tue, Apr 20, 2021 at 2:01 AM David Galloway  wrote:
>
> This is the 20th bugfix release in the Nautilus stable series.  It
> addresses a security vulnerability in the Ceph authentication framework.
> We recommend users to update to this release. For a detailed release
> notes with links & changelog please refer to the official blog entry at
> https://ceph.io/releases/v14-2-20-nautilus-released
>
> Security Fixes
> --
>
> * This release includes a security fix that ensures the global_id value
> (a numeric value that should be unique for every authenticated client or
> daemon in the cluster) is reclaimed after a network disconnect or ticket
> renewal in a secure fashion.  Two new health alerts may appear during
> the upgrade indicating that there are clients or daemons that are not
> yet patched with the appropriate fix.

The link in the blog entry should point at

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

Please refer there for details and recommendations.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-maintainers] v15.2.11 Octopus released

2021-04-20 Thread Ilya Dryomov
On Tue, Apr 20, 2021 at 1:56 AM David Galloway  wrote:
>
> This is the 11th bugfix release in the Octopus stable series.  It
> addresses a security vulnerability in the Ceph authentication framework.
> We recommend users to update to this release. For a detailed release
> notes with links & changelog please refer to the official blog entry at
> https://ceph.io/releases/v15-2-11-octopus-released
>
> Security Fixes
> --
>
> * This release includes a security fix that ensures the global_id value
> (a numeric value that should be unique for every authenticated client or
> daemon in the cluster) is reclaimed after a network disconnect or ticket
> renewal in a secure fashion. Two new health alerts may appear during the
> upgrade indicating that there are clients or daemons that are not yet
> patched with the appropriate fix.

The link in the blog entry should point at

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

Please refer there for details and recommendations.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-maintainers] v16.2.1 Pacific released

2021-04-20 Thread Ilya Dryomov
On Tue, Apr 20, 2021 at 2:02 AM David Galloway  wrote:
>
> This is the first bugfix release in the Pacific stable series. It
> addresses a security vulnerability in the Ceph authentication framework.
>  We recommend users to update to this release. For a detailed release
> notes with links & changelog please refer to the official blog entry at
> https://ceph.io/releases/v16-2-1-pacific-released
>
> Security Fixes
> --
>
> * This release includes a security fix that ensures the global_id value
> (a numeric value that should be unique for every authenticated client or
> daemon in the cluster) is reclaimed after a network disconnect or ticket
> renewal in a secure fashion.  Two new health alerts may appear during
> the upgrade indicating that there are clients or daemons that are not
> yet patched with the appropriate fix.

The link in the blog entry should point at

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

Please refer there for details and recommendations.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [Ceph-maintainers] v14.2.20 Nautilus released

2021-04-20 Thread Ilya Dryomov
On Tue, Apr 20, 2021 at 11:30 AM Dan van der Ster  wrote:
>
> On Tue, Apr 20, 2021 at 11:26 AM Ilya Dryomov  wrote:
> >
> > On Tue, Apr 20, 2021 at 2:01 AM David Galloway  wrote:
> > >
> > > This is the 20th bugfix release in the Nautilus stable series.  It
> > > addresses a security vulnerability in the Ceph authentication framework.
> > > We recommend users to update to this release. For a detailed release
> > > notes with links & changelog please refer to the official blog entry at
> > > https://ceph.io/releases/v14-2-20-nautilus-released
> > >
> > > Security Fixes
> > > --
> > >
> > > * This release includes a security fix that ensures the global_id value
> > > (a numeric value that should be unique for every authenticated client or
> > > daemon in the cluster) is reclaimed after a network disconnect or ticket
> > > renewal in a secure fashion.  Two new health alerts may appear during
> > > the upgrade indicating that there are clients or daemons that are not
> > > yet patched with the appropriate fix.
> >
> > The link in the blog entry should point at
> >
> > https://docs.ceph.com/en/latest/security/CVE-2021-20288/
> >
> > Please refer there for details and recommendations.
>
> Thanks Ilya.
>
> Is there any potential issue if clients upgrade before the cluster daemons?
> (Our clients will likely get 14.2.20 before all the clusters have been
> upgraded).

No issue.  Userspace clients would just start doing what is expected
by the protocol, same as kernel clients.

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 3:24 PM Cem Zafer  wrote:
>
> Hi,
> I have recently added a new host to ceph and copied the /etc/ceph directory to
> the new host. When I execute a simple ceph command such as "ceph -s", I get the
> following error.
>
> 021-04-22T14:50:46.226+0300 7ff541141700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-04-22T14:50:46.226+0300 7ff540940700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-04-22T14:50:46.226+0300 7ff533fff700 -1 monclient(hunting):
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> [errno 13] RADOS permission denied (error connecting to the cluster)
>
> When I looked at the syslog on the ceph cluster node, I saw that message
> too.
>
> Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+
> 7fe4d28cb700  0 cephx server client.admin:  attempt to reclaim global_id
> 264198 without presenting ticket
> Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+
> 7fe4d28cb700  0 cephx server client.admin:  could not verify old ticket
>
> Can anyone help me out or point me in the right direction or to a relevant link?

Hi Cem,

I take it that you upgraded to one of 14.2.20, 15.2.11 or 16.2.1
releases and then set auth_allow_insecure_global_id_reclaim to false?

What version of ceph-common package is installed on that host?  What is
the output of "ceph -v"?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens  wrote:
>
> Hi,
>
> I have a customer VM that is running fine, but I can not make snapshots
> anymore.
> rbd snap create rbd/IMAGE@test-bb-1
> just hangs forever.

Hi Boris,

Run

$ rbd snap create rbd/IMAGE@test-bb-1 --debug-ms=1 --debug-rbd=20

let it hang for a few minutes and attach the output.

>
> When I checked the status with
> rbd status rbd/IMAGE
> it shows one watcher, the cpu node where the VM is running.
>
> What can I do to investigate further, without restarting the VM.
> This is the only affected VM and it stopped working three days ago.

Can you think of any event related to the cluster, that VM or the
VM fleet in general that occurred three days ago?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 5:04 PM Cem Zafer  wrote:
>
> Hi Ilya,
> Yes you are correct, I have set auth_allow_insecure_global_id_reclaim to 
> false.
> Host ceph-common package version is 16.2.0 and the cluster ceph -v output is 
> as follows.
>
> root@ceph100:~# ceph -v
> ceph version 16.2.1 (afb9061ab4117f798c858c741efa6390e48ccf10) pacific 
> (stable)
> Regards.

Right, so because you set auth_allow_insecure_global_id_reclaim to false,
older userspace clients, in this case 16.2.0, are not allowed to connect
because they won't reclaim their global_id in a secure fashion.  See

https://docs.ceph.com/en/latest/security/CVE-2021-20288/

for details.

Thanks,

Ilya

>
> On Thu, Apr 22, 2021 at 4:49 PM Ilya Dryomov  wrote:
>>
>> On Thu, Apr 22, 2021 at 3:24 PM Cem Zafer  wrote:
>> >
>> > Hi,
>> > I have recently add a new host to ceph and copied /etc/ceph directory to
>> > the new host. When I execute the simple ceph command as "ceph -s", get the
>> > following error.
>> >
>> > 021-04-22T14:50:46.226+0300 7ff541141700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > 2021-04-22T14:50:46.226+0300 7ff540940700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > 2021-04-22T14:50:46.226+0300 7ff533fff700 -1 monclient(hunting):
>> > handle_auth_bad_method server allowed_methods [2] but i only support [2]
>> > [errno 13] RADOS permission denied (error connecting to the cluster)
>> >
>> > When I looked at the syslog on the ceph cluster node, I saw that message
>> > too.
>> >
>> > Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+
>> > 7fe4d28cb700  0 cephx server client.admin:  attempt to reclaim global_id
>> > 264198 without presenting ticket
>> > Apr 22 14:51:40 ceph100 bash[27979]: debug 2021-04-22T11:51:40.684+
>> > 7fe4d28cb700  0 cephx server client.admin:  could not verify old ticket
>> >
>> > Anyone can help me out or assist to the right direction or link?
>>
>> Hi Cem,
>>
>> I take it that you upgraded to one of 14.2.20, 15.2.11 or 16.2.1
>> releases and then set auth_allow_insecure_global_id_reclaim to false?
>>
>> What version of ceph-common package is installed on that host?  What is
>> the output of "ceph -v"?
>>
>> Thanks,
>>
>> Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 5:08 PM Boris Behrens  wrote:
>
>
>
> Am Do., 22. Apr. 2021 um 16:43 Uhr schrieb Ilya Dryomov :
>>
>> On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens  wrote:
>> >
>> > Hi,
>> >
>> > I have a customer VM that is running fine, but I can not make snapshots
>> > anymore.
>> > rbd snap create rbd/IMAGE@test-bb-1
>> > just hangs forever.
>>
>> Hi Boris,
>>
>> Run
>>
>> $ rbd snap create rbd/IMAGE@test-bb-1 --debug-ms=1 --debug-rbd=20
>>
>> let it hang for a few minutes and attach the output.
>
>
> I just pasted a short snip here: https://pastebin.com/B3Xgpbzd
> If you need more I can give it to you, but the output is very large.

Paste the first couple thousand lines (i.e. from the very beginning),
that should be enough.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 6:01 PM Cem Zafer  wrote:
>
> Thanks Ilya, pointing me out to the right direction.
> So if I change the auth_allow_insecure_global_id_reclaim to true means older 
> userspace clients allowed to the cluster?

Yes, but upgrading all clients and setting it to false is recommended.
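
Roughly (a sketch; flip it back to false only once the health alert
about unpatched clients has cleared):

  $ ceph config set mon auth_allow_insecure_global_id_reclaim true   # let old clients in for now
  $ ceph config set mon auth_allow_insecure_global_id_reclaim false  # enforce once everything is upgraded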

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 6:00 PM Boris Behrens  wrote:
>
>
>
> Am Do., 22. Apr. 2021 um 17:27 Uhr schrieb Ilya Dryomov :
>>
>> On Thu, Apr 22, 2021 at 5:08 PM Boris Behrens  wrote:
>> >
>> >
>> >
>> > Am Do., 22. Apr. 2021 um 16:43 Uhr schrieb Ilya Dryomov 
>> > :
>> >>
>> >> On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens  wrote:
>> >> >
>> >> > Hi,
>> >> >
>> >> > I have a customer VM that is running fine, but I can not make snapshots
>> >> > anymore.
>> >> > rbd snap create rbd/IMAGE@test-bb-1
>> >> > just hangs forever.
>> >>
>> >> Hi Boris,
>> >>
>> >> Run
>> >>
>> >> $ rbd snap create rbd/IMAGE@test-bb-1 --debug-ms=1 --debug-rbd=20
>> >>
>> >> let it hang for a few minutes and attach the output.
>> >
>> >
>> > I just pasted a short snip here: https://pastebin.com/B3Xgpbzd
>> > If you need more I can give it to you, but the output is very large.
>>
>> Paste the first couple thousand lines (i.e. from the very beginning),
>> that should be enough.
>>
> sure: https://pastebin.com/GsKpLbqG
>
> good luck :)

What is the output of "rbd status"?  I know you said it shows one
watcher, but I need to see it.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 9:24 PM Cem Zafer  wrote:
>
> Sorry to disturb you again but changing the value to yes doesn't affect
> anything. Executing a simple ceph command from the client returns the following
> error again. I'm not so sure it is related to that parameter.
> Have you any idea what could cause the problem?
>
> indiana@mars:~$ ceph -s
> 2021-04-22T22:19:51.305+0300 7f20ea249700 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [1]
> [errno 13] RADOS permission denied (error connecting to the cluster)

This looks like a different host/client.  What version of ceph-common
is installed on mars (or just run "ceph -v" instead of "ceph -s")?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-22 Thread Ilya Dryomov
On Thu, Apr 22, 2021 at 10:16 PM Cem Zafer  wrote:
>
> This client ceph-common version is 16.2.0, here are the outputs.
>
> indiana@mars:~$ ceph -v
> ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) pacific 
> (stable)
>
> indiana@mars:~$ dpkg -l | grep -i ceph-common
> ii  ceph-common16.2.0-1focal  
>amd64common utilities to mount and interact with a ceph 
> storage cluster
> ii  python3-ceph-common16.2.0-1focal  
>all  Python 3 utility libraries for Ceph

Is the problem local to "mars"?  Are you able to run "ceph -s" with the
16.2.0 client from other hosts?

I suspect there is something wrong with ceph.conf on mars.  The error
message is different:

handle_auth_bad_method server allowed_methods [2] but i only support [1]

instead of

handle_auth_bad_method server allowed_methods [2] but i only support [2]

Please run "ceph -s --debug_ms=1 --debug_monc=25 --debug_auth=25" and
attach ceph.conf from mars.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [2]

2021-04-23 Thread Ilya Dryomov
On Fri, Apr 23, 2021 at 6:57 AM Cem Zafer  wrote:
>
> Hi Ilya,
> Sorry, totally my mistake. I just saw that the configuration on mars looks
> like this:
>
> auth_cluster_required = none
> auth_service_required = none
> auth_client_required = none
>
> So I changed none to cephx, solved the problem.
> Thanks for your patience and support.

Yup, that would do it ;)
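
For the record, the client's ceph.conf should carry the usual cephx
settings, e.g. something like:

  [global]
  auth_cluster_required = cephx
  auth_service_required = cephx
  auth_client_required = cephx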

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Ilya Dryomov
On Fri, Apr 23, 2021 at 9:16 AM Boris Behrens  wrote:
>
>
>
> Am Do., 22. Apr. 2021 um 20:59 Uhr schrieb Ilya Dryomov :
>>
>> On Thu, Apr 22, 2021 at 7:33 PM Boris Behrens  wrote:
>> >
>> >
>> >
>> > Am Do., 22. Apr. 2021 um 18:30 Uhr schrieb Ilya Dryomov 
>> > :
>> >>
>> >> On Thu, Apr 22, 2021 at 6:00 PM Boris Behrens  wrote:
>> >> >
>> >> >
>> >> >
>> >> > Am Do., 22. Apr. 2021 um 17:27 Uhr schrieb Ilya Dryomov 
>> >> > :
>> >> >>
>> >> >> On Thu, Apr 22, 2021 at 5:08 PM Boris Behrens  wrote:
>> >> >> >
>> >> >> >
>> >> >> >
>> >> >> > Am Do., 22. Apr. 2021 um 16:43 Uhr schrieb Ilya Dryomov 
>> >> >> > :
>> >> >> >>
>> >> >> >> On Thu, Apr 22, 2021 at 4:20 PM Boris Behrens  
>> >> >> >> wrote:
>> >> >> >> >
>> >> >> >> > Hi,
>> >> >> >> >
>> >> >> >> > I have a customer VM that is running fine, but I can not make 
>> >> >> >> > snapshots
>> >> >> >> > anymore.
>> >> >> >> > rbd snap create rbd/IMAGE@test-bb-1
>> >> >> >> > just hangs forever.
>> >> >> >>
>> >> >> >> Hi Boris,
>> >> >> >>
>> >> >> >> Run
>> >> >> >>
>> >> >> >> $ rbd snap create rbd/IMAGE@test-bb-1 --debug-ms=1 --debug-rbd=20
>> >> >> >>
>> >> >> >> let it hang for a few minutes and attach the output.
>> >> >> >
>> >> >> >
>> >> >> > I just pasted a short snip here: https://pastebin.com/B3Xgpbzd
>> >> >> > If you need more I can give it to you, but the output is very large.
>> >> >>
>> >> >> Paste the first couple thousand lines (i.e. from the very beginning),
>> >> >> that should be enough.
>> >> >>
>> >> > sure: https://pastebin.com/GsKpLbqG
>> >> >
>> >> > good luck :)
>> >>
>> >> What is the output of "rbd status"?  I know you said it shows one
>> >> watcher, but I need to see it.
>> >>
>> >>
>> > sure
>> > # rbd status rbd/IMAGE
>> > Watchers:
>> > watcher=[fd00:2380:2:43::11]:0/3919389201 client.136378749 
>> > cookie=139968010125312
>>
>
> Hi Ilya,
> thank you a lot for your support.
>
> This might be another hanging snapshot scheduler that got removed afterwards.
> Sorry for that.
>
> https://pastebin.com/TBZs7Mvb
>
> I just created a new paste and added status and lock ls at the top and at the 
> bottom.
> The 2nd watcher disappears after a minute or so.
> All commands are done within one minute.

This snippet confirms my suspicion.  Unfortunately without a verbose
log from that VM from three days ago (i.e. when it got into this state)
it's hard to tell what exactly went wrong.

The problem is that the VM doesn't consider itself to be the rightful
owner of the lock and so when "rbd snap create" requests the lock from
it in order to make a snapshot, the VM just ignores the request because
even though it owns the lock, its record appears to be out of sync.

I'd suggest to kick it by restarting osd36.  If the VM is active, it
should reacquire the lock and hopefully update its internal record as
expected.  If "rbd snap create" still hangs after that, it would mean
that we have a reproducer and can gather logs on the VM side.
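
Something along these lines, assuming a plain systemd deployment
(adjust if your OSDs run in containers):

  $ systemctl restart ceph-osd@36        # on the node hosting osd.36
  $ rbd lock ls rbd/IMAGE                # the lock should be re-acquired by the VM
  $ rbd snap create rbd/IMAGE@test-bb-1  # then retry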

What version of qemu/librbd and ceph is in use (both on the VM side and
on the side you are running "rbd snap create"?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Ilya Dryomov
On Fri, Apr 23, 2021 at 12:03 PM Boris Behrens  wrote:
>
>
>
> Am Fr., 23. Apr. 2021 um 11:52 Uhr schrieb Ilya Dryomov :
>>
>>
>> This snippet confirms my suspicion.  Unfortunately without a verbose
>> log from that VM from three days ago (i.e. when it got into this state)
>> it's hard to tell what exactly went wrong.
>>
>> The problem is that the VM doesn't consider itself to be the rightful
>> owner of the lock and so when "rbd snap create" requests the lock from
>> it in order to make a snapshot, the VM just ignores the request because
>> even though it owns the lock, its record appears to be of sync.
>>
>> I'd suggest to kick it by restarting osd36.  If the VM is active, it
>> should reacquire the lock and hopefully update its internal record as
>> expected.  If "rbd snap create" still hangs after that, it would mean
>> that we have a reproducer and can gather logs on the VM side.
>>
>> What version of qemu/librbd and ceph is in use (both on the VM side and
>> on the side you are running "rbd snap create"?
>>
> I just stopped the OSD, waited some seconds and started it again.
> I still can't create snapshots.
>
> Ceph version is 14.2.18 accross the board
> qemu is 4.1.0-1
> as we use krbd, the kernel version is 5.2.9-arch1-1-ARCH
>
> How can I gather more logs to debug it?

Are you saying that this image is mapped and the lock is held by the
kernel client?  It doesn't look that way from the logs you shared.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Ilya Dryomov
On Fri, Apr 23, 2021 at 12:46 PM Boris Behrens  wrote:
>
>
>
> Am Fr., 23. Apr. 2021 um 12:16 Uhr schrieb Ilya Dryomov :
>>
>> On Fri, Apr 23, 2021 at 12:03 PM Boris Behrens  wrote:
>> >
>> >
>> >
>> > Am Fr., 23. Apr. 2021 um 11:52 Uhr schrieb Ilya Dryomov 
>> > :
>> >>
>> >>
>> >> This snippet confirms my suspicion.  Unfortunately without a verbose
>> >> log from that VM from three days ago (i.e. when it got into this state)
>> >> it's hard to tell what exactly went wrong.
>> >>
>> >> The problem is that the VM doesn't consider itself to be the rightful
>> >> owner of the lock and so when "rbd snap create" requests the lock from
>> >> it in order to make a snapshot, the VM just ignores the request because
>> >> even though it owns the lock, its record appears to be of sync.
>> >>
>> >> I'd suggest to kick it by restarting osd36.  If the VM is active, it
>> >> should reacquire the lock and hopefully update its internal record as
>> >> expected.  If "rbd snap create" still hangs after that, it would mean
>> >> that we have a reproducer and can gather logs on the VM side.
>> >>
>> >> What version of qemu/librbd and ceph is in use (both on the VM side and
>> >> on the side you are running "rbd snap create"?
>> >>
>> > I just stopped the OSD, waited some seconds and started it again.
>> > I still can't create snapshots.
>> >
>> > Ceph version is 14.2.18 accross the board
>> > qemu is 4.1.0-1
>> > as we use krbd, the kernel version is 5.2.9-arch1-1-ARCH
>> >
>> > How can I gather more logs to debug it?
>>
>> Are you saying that this image is mapped and the lock is held by the
>> kernel client?  It doesn't look that way from the logs you shared.
>
> We use krbd instead of librbd (at least this is what I think I know), but 
> qemu is doing the kvm/rbd stuff.

I'm going to assume that by "qemu is doing the kvm/rbd stuff", you
mean that you are using the librbd driver inside qemu and that this
image is opened by qemu (i.e. that driver).  If you don't know what
access method is being used, debugging this might be challenging ;)

Let's start with the same output: "rbd lock ls", "rbd status" and "rbd
snap create --debug-ms=1 --debug-rbd=20".  It should be different after
osd36 was restarted.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd snap create not working and just hangs forever

2021-04-23 Thread Ilya Dryomov
On Fri, Apr 23, 2021 at 1:12 PM Boris Behrens  wrote:
>
>
>
> Am Fr., 23. Apr. 2021 um 13:00 Uhr schrieb Ilya Dryomov :
>>
>> On Fri, Apr 23, 2021 at 12:46 PM Boris Behrens  wrote:
>> >
>> >
>> >
>> > Am Fr., 23. Apr. 2021 um 12:16 Uhr schrieb Ilya Dryomov 
>> > :
>> >>
>> >> On Fri, Apr 23, 2021 at 12:03 PM Boris Behrens  wrote:
>> >> >
>> >> >
>> >> >
>> >> > Am Fr., 23. Apr. 2021 um 11:52 Uhr schrieb Ilya Dryomov 
>> >> > :
>> >> >>
>> >> >>
>> >> >> This snippet confirms my suspicion.  Unfortunately without a verbose
>> >> >> log from that VM from three days ago (i.e. when it got into this state)
>> >> >> it's hard to tell what exactly went wrong.
>> >> >>
>> >> >> The problem is that the VM doesn't consider itself to be the rightful
>> >> >> owner of the lock and so when "rbd snap create" requests the lock from
>> >> >> it in order to make a snapshot, the VM just ignores the request because
>> >> >> even though it owns the lock, its record appears to be of sync.
>> >> >>
>> >> >> I'd suggest to kick it by restarting osd36.  If the VM is active, it
>> >> >> should reacquire the lock and hopefully update its internal record as
>> >> >> expected.  If "rbd snap create" still hangs after that, it would mean
>> >> >> that we have a reproducer and can gather logs on the VM side.
>> >> >>
>> >> >> What version of qemu/librbd and ceph is in use (both on the VM side and
>> >> >> on the side you are running "rbd snap create"?
>> >> >>
>> >> > I just stopped the OSD, waited some seconds and started it again.
>> >> > I still can't create snapshots.
>> >> >
>> >> > Ceph version is 14.2.18 accross the board
>> >> > qemu is 4.1.0-1
>> >> > as we use krbd, the kernel version is 5.2.9-arch1-1-ARCH
>> >> >
>> >> > How can I gather more logs to debug it?
>> >>
>> >> Are you saying that this image is mapped and the lock is held by the
>> >> kernel client?  It doesn't look that way from the logs you shared.
>> >
>> > We use krbd instead of librbd (at least this is what I think I know), but 
>> > qemu is doing the kvm/rbd stuff.
>>
>> I'm going to assume that by "qemu is doing the kvm/rbd stuff", you
>> mean that you are using the librbd driver inside qemu and that this
>> image is opened by qemu (i.e. that driver).  If you don't know what
>> access method is being used, debugging this might be challenging ;)
>>
>> Let's start with the same output: "rbd lock ls", "rbd status" and "rbd
>> snap create --debug-ms=1 --debug-rbd=20".  It should be different after
>> osd36 was restarted.
>
> Here is the new one: https://pastebin.com/6qTsJK6W
> Ah ok, this CPU node still got the old thing and uses librbd to work with rbd 
> instead of krbd.

Sorry, I forgot that simply restarting the OSD doesn't trigger the
code path that I'm hoping would cause librbd inside the VM to update
its state.  I took a look at the code and I think there are a couple
of ways to do it (listed in the order of preference):

- cut the network between the VM and the cluster for more than 30
  seconds; it should be done externally so that to the VM it looks
  like a long network blip

- stop the VM process for more than 30 seconds

  $ PID=
  $ kill -STOP $PID && sleep 40 && kill -CONT $PID

- stop the osd36 process for more than 30 seconds with "nodown" flag
  set

  $ ceph osd set nodown
  $ PID=
  $ kill -STOP $PID && sleep 40 && kill -CONT $PID
  $ ceph osd unset nodown

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-04-25 Thread Ilya Dryomov
On Sun, Apr 25, 2021 at 12:37 AM Markus Kienast  wrote:
>
> I am seeing these messages when booting from RBD and booting hangs there.
>
> libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated
> 131072, skipping
>
> However, Ceph Health is OK, so I have no idea what is going on. I
> reboot my 3 node cluster and it works again for about two weeks.
>
> How can I find out more about this issue, how can I dig deeper? Also
> there has been at least one report about this issue before on this
> mailing list - "[ceph-users] Strange Data Issue - Unexpected client
> hang on OSD I/O Error" - but no solution has been presented.
>
> This report was from 2018, so no idea if this is still an issue for
> Dyweni the original reporter. If you read this, I would be happy to
> hear how you solved the problem.

Hi Markus,

What versions of ceph and the kernel are in use?

Are you also seeing I/O errors and "missing primary copy of ..., will
try copies on ..." messages in the OSD logs (in this case osd2)?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-04-28 Thread Ilya Dryomov
On Sun, Apr 25, 2021 at 11:42 AM Ilya Dryomov  wrote:
>
> On Sun, Apr 25, 2021 at 12:37 AM Markus Kienast  wrote:
> >
> > I am seeing these messages when booting from RBD and booting hangs there.
> >
> > libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated
> > 131072, skipping
> >
> > However, Ceph Health is OK, so I have no idea what is going on. I
> > reboot my 3 node cluster and it works again for about two weeks.
> >
> > How can I find out more about this issue, how can I dig deeper? Also
> > there has been at least one report about this issue before on this
> > mailing list - "[ceph-users] Strange Data Issue - Unexpected client
> > hang on OSD I/O Error" - but no solution has been presented.
> >
> > This report was from 2018, so no idea if this is still an issue for
> > Dyweni the original reporter. If you read this, I would be happy to
> > hear how you solved the problem.
>
> Hi Markus,
>
> What versions of ceph and the kernel are in use?
>
> Are you also seeing I/O errors and "missing primary copy of ..., will
> try copies on ..." messages in the OSD logs (in this case osd2)?

For the sake of the archives, the "[ceph-users] Strange Data Issue
- Unexpected client hang on OSD I/O Error" instance has been fixed
in 12.2.12, 13.2.5 and 14.2.0:

https://tracker.ceph.com/issues/37680

I also tried to reply to that thread but it didn't go through because
the old ceph-us...@lists.ceph.com mailing list is decommissioned.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-02 Thread Ilya Dryomov
On Sun, May 2, 2021 at 11:15 PM Magnus Harlander  wrote:
>
> Hi,
>
> I know there is a thread about problems with mounting cephfs with 5.11 
> kernels.
> I tried everything that's mentioned there, but I still can not mount a cephfs
> from an octopus node.
>
> I verified:
>
> - I can not mount with 5.11 client kernels (fedora 33 and ubuntu 21.04)
> - I can mount with 5.10 client kernels
> - It is not due to ipv4/ipv6. I'm not using ipv6
> - I'm using a cluster network on a private network segment. Because this was 
> mentioned as a possible cause for the problems (next to ipv6)
>   I removed the cluster network and now I'm using the same network for osd 
> syncs and client connections. It did not help.
> - mount returns with a timeout and error after about 1 minute
> - I tried the ms_mode=legacy (and others) mount options. Nothing helped
> - I tried to use IP:PORT:/fs to mount to exclude DNS as the cause. Didn't 
> help.
> - I did setup a similar test cluster on a few VMs and did not have a problem 
> with mouting.
>   Even used cluster networks, which also worked fine.
>
> I'm running out of ideas? Any help would be appreciated.
>
> \Magnus
>
> My Setup:
>
> SERVER OS:
> ==
> [root@s1 ~]# hostnamectl
>Static hostname: s1.harlan.de
>  Icon name: computer-desktop
>Chassis: desktop
> Machine ID: 3a0a6308630842ffad6b9bb8be4c7547
>Boot ID: ffb2948d3934419dafceb0990316d9fd
>   Operating System: CentOS Linux 8
>CPE OS Name: cpe:/o:centos:centos:8
> Kernel: Linux 4.18.0-240.22.1.el8_3.x86_64
>   Architecture: x86-64
>
> CEPH VERSION:
> =
> ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus 
> (stable)
>
> CLIENT OS:
> ==
> [root@islay ~]# hostnamectl
>Static hostname: islay
>  Icon name: computer-laptop
>Chassis: laptop
> Machine ID: 6de7b27dfd864e9ea52b8b0cff47cdfc
>Boot ID: 6d8d8bb36f274458b2b761b0a046c8ad
>   Operating System: Fedora 33 (Workstation Edition)
>CPE OS Name: cpe:/o:fedoraproject:fedora:33
> Kernel: Linux 5.11.16-200.fc33.x86_64
>   Architecture: x86-64
>
> CEPH VERSION:
> =
> [root@islay harlan]# ceph version
> ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus 
> (stable)
>
> [root@s1 ~]# ceph version
> ceph version 15.2.11 (e3523634d9c2227df9af89a4eac33d16738c49cb) octopus 
> (stable)
>
> FSTAB ENTRY:
> 
> cfs0,cfs1:/fs  /data/fs ceph 
> rw,_netdev,name=admin,secretfile=/etc/ceph/fs.secret 0 0
>
> IP CONFIG MON/OSD NODE (s1)
> ===
> [root@s1 ~]# ip a
> 1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group 
> default qlen 1000
> link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
> inet 127.0.0.1/8 scope host lo
>valid_lft forever preferred_lft forever
> inet6 ::1/128 scope host
>valid_lft forever preferred_lft forever
> 2: enp4s0:  mtu 1500 qdisc fq_codel 
> master bond0 state UP group default qlen 1000
> link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
> 3: enp5s0:  mtu 1500 qdisc fq_codel 
> master bond0 state UP group default qlen 1000
> link/ether a8:a1:59:18:e7:ea brd ff:ff:ff:ff:ff:ff
> 4: vmbr:  mtu 1500 qdisc noqueue state UP 
> group default qlen 1000
> link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
> inet 192.168.200.111/24 brd 192.168.200.255 scope global noprefixroute 
> vmbr
>valid_lft forever preferred_lft forever
> inet 192.168.200.141/24 brd 192.168.200.255 scope global secondary 
> noprefixroute vmbr
>valid_lft forever preferred_lft forever
> inet 192.168.200.101/24 brd 192.168.200.255 scope global secondary vmbr
>valid_lft forever preferred_lft forever
> inet6 fe80::be55:705d:7c9e:eaa4/64 scope link noprefixroute
>valid_lft forever preferred_lft forever
> 5: bond0:  mtu 1500 qdisc noqueue 
> master vmbr state UP group default qlen 1000
> link/ether 98:de:d0:04:26:86 brd ff:ff:ff:ff:ff:ff
> 6: virbr0:  mtu 1500 qdisc noqueue state 
> DOWN group default qlen 1000
> link/ether 52:54:00:32:ea:2f brd ff:ff:ff:ff:ff:ff
> inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
>valid_lft forever preferred_lft forever
> 7: virbr0-nic:  mtu 1500 qdisc fq_codel master virbr0 
> state DOWN group default qlen 1000
> link/ether 52:54:00:32:ea:2f brd ff:ff:ff:ff:ff:ff
> 8: vnet0:  mtu 1500 qdisc fq_codel master 
> vmbr state UNKNOWN group default qlen 1000
> link/ether fe:54:00:67:4d:15 brd ff:ff:ff:ff:ff:ff
> inet6 fe80::fc54:ff:fe67:4d15/64 scope link
>valid_lft forever preferred_lft forever
>
> CEPH STATUS:
> 
> [root@s1 ~]# ceph -s
>   cluster:
> id: 86bbd6c5-ae96-4c78-8a5e-50623f0ae524
> health: HEALTH_OK
>
>   services:
> mon: 4 daemons, quorum s0,mbox,s1,r1 (age 6h)
> mgr: s1(active, since 6h), standbys: s0
> mds: fs:1 {0=s1=up:active} 1 up:standby
> 

[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 9:20 AM Magnus Harlander  wrote:
>
> Am 03.05.21 um 00:44 schrieb Ilya Dryomov:
>
> On Sun, May 2, 2021 at 11:15 PM Magnus Harlander  wrote:
>
> Hi,
>
> I know there is a thread about problems with mounting cephfs with 5.11 
> kernels.
>
> ...
>
> Hi Magnus,
>
> What is the output of "ceph config dump"?
>
> Instead of providing those lines, can you run "ceph osd getmap 64281 -o
> osdmap.64281" and attach osdmap.64281 file?
>
> Thanks,
>
> Ilya
>
> Hi Ilya,
>
> [root@s1 ~]# ceph config dump
> WHO MASK  LEVEL OPTION VALUE RO
> globalbasic device_failure_prediction_mode local
> globaladvanced  ms_bind_ipv4   false
>   mon advanced  auth_allow_insecure_global_id_reclaim  false
>   mon advanced  mon_lease  8.00
>   mgr advanced  mgr/devicehealth/enable_monitoring true
>
> getmap output is attached,

I see the problem, but I don't understand the root cause yet.  It is
related to the two missing OSDs:

> May 02 22:54:05 islay kernel: libceph: no match of type 1 in addrvec
> May 02 22:54:05 islay kernel: libceph: corrupt full osdmap (-2) epoch 64281 
> off 3154 (a90fe1d7 of 0083f4bd-c03bdc9b)

> max_osd 12

> osd.0 up   in  ... 
> [v2:192.168.200.141:6804/3027,v1:192.168.200.141:6805/3027] ... exists,up 
> 631bc170-45fd-4948-9a5e-4c278569c0bc
> osd.1 up   in  ... 
> [v2:192.168.200.140:6811/3066,v1:192.168.200.140:6813/3066] ... exists,up 
> 660a762c-001d-4160-a9ee-d0acd078e776
> osd.2 up   in  ... 
> [v2:192.168.200.141:6815/3008,v1:192.168.200.141:6816/3008] ... exists,up 
> e4d94d3a-ec58-46a1-b61c-c47dd39012ed
> osd.3 up   in  ... 
> [v2:192.168.200.140:6800/3067,v1:192.168.200.140:6801/3067] ... exists,up 
> 26d25060-fd99-4d15-a1b2-ebb77646671e
> osd.4 up   in  ... 
> [v2:192.168.200.140:6804/3049,v1:192.168.200.140:6806/3049] ... exists,up 
> 238f197d-ecbc-4588-8a99-6a63c9bb1a17
> osd.5 up   in  ... 
> [v2:192.168.200.140:6816/3073,v1:192.168.200.140:6817/3073] ... exists,up 
> a9dcb26f-0f1c-4067-a26b-a29939285e0b
> osd.6 up   in  ... 
> [v2:192.168.200.141:6808/3020,v1:192.168.200.141:6809/3020] ... exists,up 
> f399b47d-063f-4b2f-bd93-289377dc9945
> osd.7 up   in  ... 
> [v2:192.168.200.141:6800/3023,v1:192.168.200.141:6801/3023] ... exists,up 
> 3557ceca-7bd8-401e-abd3-59bee168e8f6
> osd.8 up   in  ... 
> [v2:192.168.200.141:6812/3017,v1:192.168.200.141:6813/3017] ... exists,up 
> 7f9cad3f-163d-4bb7-85b2-fffd46982fff
> osd.9 up   in  ... 
> [v2:192.168.200.140:6805/3053,v1:192.168.200.140:6807/3053] ... exists,up 
> c543b12a-f9bf-4b83-af16-f6b8a3926e69

The kernel client is failing to parse addrvec entries for non-existent
osd10 and osd11.  It is probably being too stringent, but before fixing
it I'd like to understand what happened to those OSDs.  It looks like
they were removed but not completely.

What led to their removal?  What commands were used?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:00 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 11:22 schrieb Ilya Dryomov:
>
> max_osd 12
>
> I never had more than 10 osds on the two osd nodes of this cluster.
>
> I was running a 3 osd-node cluster earlier with more than 10
> osds, but the current cluster has been setup from scratch and
> I definitely don't remember having ever more than 10 osds!
> Very strange!
>
> I had to replace 2 disks because of DOA-Problems, but for that
> I removed 2 osds and created new ones.
>
> I used ceph-deploy do create new osds.
>
> To delete osd.8 I used:
>
> # take it out
> ceph osd out 8
>
> # wait for rebalancing to finish
>
> systemctl stop ceph-osd@8
>
> # wait for a healthy cluster
>
> ceph osd purge 8 --yes-i-really-mean-it
>
> # edit ceph.conf and remove osd.8
>
> ceph-deploy --overwrite-conf admin s0 s1
>
> # Add the new disk and:
> ceph-deploy osd create --data /dev/sdc s0
> ...
>
> it gets created with the next free osd num (8) because purge releases 8 for 
> reuse

It would be nice to track it down, but for the immediate issue of
kernel 5.11 not working, "ceph osd setmaxosd 10" should fix it.
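
Roughly:

  $ ceph osd setmaxosd 10
  $ ceph osd getmaxosd      # should now report max_osd = 10, matching your 10 OSDs
  # then retry the mount from the 5.11 client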

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:27 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 12:25 schrieb Ilya Dryomov:
>
> ceph osd setmaxosd 10
>
> Bingo! Mount works again.
>
> Very strange things are going on here (-:
>
> Thanx a lot for now!! If I can help to track it down, please let me know.

Good to know it helped!  I'll think about this some more and probably
plan to patch the kernel client to be less stringent and not choke on
this sort of misconfiguration.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-03 Thread Ilya Dryomov
On Mon, May 3, 2021 at 12:24 PM Magnus Harlander  wrote:
>
> Am 03.05.21 um 11:22 schrieb Ilya Dryomov:
>
> There is a 6th osd directory on both machines, but it's empty
>
> [root@s0 osd]# ll
> total 0
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-1
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-3
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-4
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-5
> drwxr-xr-x. 2 ceph ceph   6  3. Apr 19:50 ceph-8 <===
> drwxrwxrwt. 2 ceph ceph 200  2. Mai 16:31 ceph-9
> [root@s0 osd]# pwd
> /var/lib/ceph/osd
>
> [root@s1 osd]# ll
> total 0
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-0
> drwxr-xr-x. 2 ceph ceph   6 Mar 13 17:54 ceph-1 <===
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-2
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-6
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-7
> drwxrwxrwt  2 ceph ceph 200 May  2 15:39 ceph-8
> [root@s1 osd]# pwd
> /var/lib/ceph/osd
>
> The bogus directories are empty and they are
> used on the other machine for a real osd!
>
> How is that?
>
> Should I remove them and restart ceph.target?

I don't think empty directories matter at this point.  You may not have
had 12 OSDs at any point in time, but the max_osd value appears to have
gotten bumped when you were replacing those disks.

Note that max_osd being greater than the number of OSDs is not a big
problem by itself.  The osdmap is going to be larger and require more
memory but that's it.  You can test by setting it back to 12 and trying
to mount -- it should work.  The issue is specific to how those OSDs
were replaced -- something went wrong and the osdmap somehow ended up
with rather bogus addrvec entries.  Not sure if it's ceph-deploy's
fault, something weird in ceph.conf (back then) or an actual ceph
bug.
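
For the quick test mentioned above, roughly (mount options taken from
your fstab entry):

  $ ceph osd setmaxosd 12
  $ mount -t ceph cfs0,cfs1:/fs /data/fs -o name=admin,secretfile=/etc/ceph/fs.secret
  $ umount /data/fs
  $ ceph osd setmaxosd 10   # back to the value that matches your OSDs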

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs mount problems with 5.11 kernel - not a ipv6 problem

2021-05-11 Thread Ilya Dryomov
On Tue, May 11, 2021 at 10:50 AM Konstantin Shalygin  wrote:
>
> Hi Ilya,
>
> On 3 May 2021, at 14:15, Ilya Dryomov  wrote:
>
> I don't think empty directories matter at this point.  You may not have
> had 12 OSDs at any point in time, but the max_osd value appears to have
> gotten bumped when you were replacing those disks.
>
> Note that max_osd being greater than the number of OSDs is not a big
> problem by itself.  The osdmap is going to be larger and require more
> memory but that's it.  You can test by setting it back to 12 and trying
> to mount -- it should work.  The issue is specific to how to those OSDs
> were replaced -- something went wrong and the osdmap somehow ended up
> with rather bogus addrvec entries.  Not sure if it's ceph-deploy's
> fault, something weird in ceph.conf (back then) or a an actual ceph
> bug.
>
>
> What actually is the bug? When max_osds > total_osds_in?

No, as mentioned above max_osds being greater is not a problem per se.
Having max_osds set to something huge when you only have a few dozen is going to
waste a lot of memory and network bandwidth, but if it is just slightly
bigger it's not something to worry about.  Normally these "spare" slots
are ignored, but in Magnus' case they looked rather weird and the kernel
refused the osdmap.  See

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=3f1c6f2122fc780560f09735b6d1dbf39b44eb0f

for details.

> Which kernels were affected?

5.11 and 5.12, backports are on the way.

>
> For example, max_osds is 132, total_osds_in is 126, max osd number is 131 --
> is it affected?

No, max_osds alone is not enough to trigger it.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: v14.2.21 Nautilus released

2021-05-14 Thread Ilya Dryomov
On Fri, May 14, 2021 at 8:20 AM Rainer Krienke  wrote:
>
> Hello,
>
> has the "negative progress bug" also been fixed in 14.2.21? I cannot
> find any info about this in the changelog?

Unfortunately not -- this was a hotfix release driven by rgw and
dashboard CVEs.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 12:54 PM Markus Kienast  wrote:
>
> Hi Ilya,
>
> unfortunately I can not find any "missing primary copy of ..." error in the 
> logs of my 3 OSDs.
> The NVME disks are also brand new and there is not much traffic on them.
>
> The only error keyword I find are those two messages in osd.0 and osd.1 logs 
> shown below.
>
> BTW the error posted before actually concerns osd1. The one I posted was 
> copied from somebody elses bug report, which had similar errors. Here are my 
> original error messages on LTSP boot:

Hi Markus,

Please don't ever paste log messages from other bug reports again.
Your email said "I am seeing these messages ..." and I spent a fair
amount of time staring at the code trying to understand how an issue
that was fixed several releases ago could resurface.

The numbers in the log message mean specific things.  For example it
is immediately obvious that

  get_reply osd1 tid 11 data 4164 > preallocated 4096, skipping

is not related to

  get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

even though they probably look the same to you.

> [10.331119] libceph: mon1 (1)10.101.0.27:6789 session established
> [10.331799] libceph: client175444 fsid 
> b0f4a188-bd81-11ea-8849-97abe2843f29
> [10.336866] libceph: mon0 (1)10.101.0.25:6789 session established
> [10.337598] libceph: client175444 fsid 
> b0f4a188-bd81-11ea-8849-97abe2843f29
> [10.349380] libceph: get_reply osd1 tid 11 data 4164 > preallocated
> 4096, skipping

Please paste the entire boot log and "rbd info" output for the affected
image.

>
> elias@maas:~$ juju ssh ceph-osd/2 sudo zgrep -i error 
> /var/log/ceph/ceph-osd.0.log
> 2021-05-16T08:52:56.872+ 7f0b262c2d80  4 rocksdb: 
> Options.error_if_exists: 0
> 2021-05-16T08:52:59.872+ 7f0b262c2d80  4 rocksdb: 
> Options.error_if_exists: 0
> 2021-05-16T08:53:00.884+ 7f0b262c2d80  1 osd.0 8599 warning: got an error 
> loading one or more classes: (1) Operation not permitted
>
> elias@maas:~$ juju ssh ceph-osd/0 sudo zgrep -i error 
> /var/log/ceph/ceph-osd.1.log
> 2021-05-16T08:49:52.971+ 7fb6aa68ed80  4 rocksdb: 
> Options.error_if_exists: 0
> 2021-05-16T08:49:55.979+ 7fb6aa68ed80  4 rocksdb: 
> Options.error_if_exists: 0
> 2021-05-16T08:49:56.828+ 7fb6aa68ed80  1 osd.1 8589 warning: got an error 
> loading one or more classes: (1) Operation not permitted
>
> How can I find out more about this bug? It keeps coming back every two weeks
> and I need to restart all OSDs to make it go away for another two weeks. Can
> I check "tid 11 data 4164" somehow? I can find no documentation on what a tid
> actually is or how I could perform a read test on it.

So *just* restarting the three OSDs you have makes it go away?

What is meant by restarting?  Rebooting the node or simply restarting
the OSD process?

>
> Another interesting detail is that the problem only seems to affect booting
> up from this RBD but not operation per se. The thin clients already booted
> from this RBD continue working.

I take it that the affected image is mapped on multiple nodes?  If so,
on how many?

>
> All systems run:
> Ubuntu 20.04.2 LTS
> Kernel 5.8.0-53-generic
> ceph version 15.2.8 (bdf3eebcd22d7d0b3dd4d5501bee5bac354d5b55) octopus 
> (stable)
>
> The cluster has been setup with Ubuntu MAAS/juju, consists of
> * 1 MAAS server
> * with 1 virtual LXD juju controller
> * 3 OSD servers with one 2 TB Nvme SSD each for ceph and a 256 SATA SSD for 
> the operating system.
> * each OSD contains a virtualized LXD MON and an LXD FS server (setup through 
> juju, see juju yaml file attached).

Can you describe the client side a bit more?  How many clients do you
have?  How many of them are active at the same time?

What exactly is meant by "booting from RBD"?  Does the affected image
serve as a golden image?  Or is some other image snapshotted and cloned
from it before booting?

What do clients do after they boot?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 4:18 PM Markus Kienast  wrote:
>
> Am So., 16. Mai 2021 um 15:36 Uhr schrieb Ilya Dryomov :
>>
>> On Sun, May 16, 2021 at 12:54 PM Markus Kienast  wrote:
>> >
>> > Hi Ilya,
>> >
>> > unfortunately I can not find any "missing primary copy of ..." error in 
>> > the logs of my 3 OSDs.
>> > The NVME disks are also brand new and there is not much traffic on them.
>> >
>> > The only error keyword I find are those two messages in osd.0 and osd.1 
>> > logs shown below.
>> >
>> > BTW the error posted before actually concerns osd1. The one I posted was 
>> > copied from somebody elses bug report, which had similar errors. Here are 
>> > my original error messages on LTSP boot:
>>
>> Hi Markus,
>>
>> Please don't ever paste log messages from other bug reports again.
>> Your email said "I am seeing these messages ..." and I spent a fair
>> amount of time staring at the code trying to understand how an issue
>> that was fixed several releases ago could resurface.
>>
>> The numbers in the log message mean specific things.  For example it
>> is immediately obvious that
>>
>>   get_reply osd1 tid 11 data 4164 > preallocated 4096, skipping
>>
>> is not related to
>>
>>   get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping
>>
>> even though they probably look the same to you.
>
>
> Sorry, I was not aware of that.
>
>>
>> > [10.331119] libceph: mon1 (1)10.101.0.27:6789 session established
>> > [10.331799] libceph: client175444 fsid 
>> > b0f4a188-bd81-11ea-8849-97abe2843f29
>> > [10.336866] libceph: mon0 (1)10.101.0.25:6789 session established
>> > [10.337598] libceph: client175444 fsid 
>> > b0f4a188-bd81-11ea-8849-97abe2843f29
>> > [10.349380] libceph: get_reply osd1 tid 11 data 4164 > preallocated
>> > 4096, skipping
>>
>> Please paste the entire boot log and "rbd info" output for the affected
>> image.
>
>
> elias@maas:~$ rbd info squashfs/ltsp-01
> rbd image 'ltsp-01':
> size 3.5 GiB in 896 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 23faade1714
> block_name_prefix: rbd_data.23faade1714
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
> op_features:
> flags:
> create_timestamp: Mon Jan 11 12:09:22 2021
> access_timestamp: Wed Feb 24 10:55:17 2021
> modify_timestamp: Mon Jan 11 12:09:22 2021
>
> I don't have the boot log available right now, but you can watch a video of 
> the boot process right here: https://photos.app.goo.gl/S8PssYu2VAr4CSeg7
>
> It seems to be consistently "tid 11", while in this video it was 
> "data 4288" not "data 4164" as above. But the image has been modified in the 
> meantime, as far as I can recall, so that might be due to that reason.
>>
>>
>> >
>> > elias@maas:~$ juju ssh ceph-osd/2 sudo zgrep -i error 
>> > /var/log/ceph/ceph-osd.0.log
>> > 2021-05-16T08:52:56.872+ 7f0b262c2d80  4 rocksdb:  
>> >Options.error_if_exists: 0
>> > 2021-05-16T08:52:59.872+ 7f0b262c2d80  4 rocksdb:  
>> >Options.error_if_exists: 0
>> > 2021-05-16T08:53:00.884+ 7f0b262c2d80  1 osd.0 8599 warning: got an 
>> > error loading one or more classes: (1) Operation not permitted
>> >
>> > elias@maas:~$ juju ssh ceph-osd/0 sudo zgrep -i error 
>> > /var/log/ceph/ceph-osd.1.log
>> > 2021-05-16T08:49:52.971+ 7fb6aa68ed80  4 rocksdb:  
>> >Options.error_if_exists: 0
>> > 2021-05-16T08:49:55.979+ 7fb6aa68ed80  4 rocksdb:  
>> >Options.error_if_exists: 0
>> > 2021-05-16T08:49:56.828+ 7fb6aa68ed80  1 osd.1 8589 warning: got an 
>> > error loading one or more classes: (1) Operation not permitted
>> >
>> > How can I find our more about this bug? It keeps coming back every two 
>> > weeks and I need to restart all OSDs to make it go away for another two 
>> > weeks. Can I check "tid 11 data 4164" somehow. I find no documentation, 
>> > what a tid actually is and how I could perform a read test on it.
>>
>> So *just* restarting the three OSDs you have makes it go away?
>>
>> What is meant by restarting?  Rebooting the node or simply restarting
>> the OSD process?
>
>
> I did reboot all OSD nodes and since the MON and FS 

[ceph-users] Re: libceph: get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping

2021-05-16 Thread Ilya Dryomov
On Sun, May 16, 2021 at 8:06 PM Markus Kienast  wrote:
>
> Am So., 16. Mai 2021 um 19:38 Uhr schrieb Ilya Dryomov :
>>
>> On Sun, May 16, 2021 at 4:18 PM Markus Kienast  wrote:
>> >
>> > Am So., 16. Mai 2021 um 15:36 Uhr schrieb Ilya Dryomov 
>> > :
>> >>
>> >> On Sun, May 16, 2021 at 12:54 PM Markus Kienast  
>> >> wrote:
>> >> >
>> >> > Hi Ilya,
>> >> >
>> >> > unfortunately I can not find any "missing primary copy of ..." error in 
>> >> > the logs of my 3 OSDs.
>> >> > The NVME disks are also brand new and there is not much traffic on them.
>> >> >
>> >> > The only error keyword I find are those two messages in osd.0 and osd.1 
>> >> > logs shown below.
>> >> >
>> >> > BTW the error posted before actually concerns osd1. The one I posted 
>> >> > was copied from somebody elses bug report, which had similar errors. 
>> >> > Here are my original error messages on LTSP boot:
>> >>
>> >> Hi Markus,
>> >>
>> >> Please don't ever paste log messages from other bug reports again.
>> >> Your email said "I am seeing these messages ..." and I spent a fair
>> >> amount of time staring at the code trying to understand how an issue
>> >> that was fixed several releases ago could resurface.
>> >>
>> >> The numbers in the log message mean specific things.  For example it
>> >> is immediately obvious that
>> >>
>> >>   get_reply osd1 tid 11 data 4164 > preallocated 4096, skipping
>> >>
>> >> is not related to
>> >>
>> >>   get_reply osd2 tid 1459933 data 3248128 > preallocated 131072, skipping
>> >>
>> >> even though they probably look the same to you.
>> >
>> >
>> > Sorry, I was not aware of that.
>> >
>> >>
>> >> > [10.331119] libceph: mon1 (1)10.101.0.27:6789 session established
>> >> > [10.331799] libceph: client175444 fsid 
>> >> > b0f4a188-bd81-11ea-8849-97abe2843f29
>> >> > [10.336866] libceph: mon0 (1)10.101.0.25:6789 session established
>> >> > [10.337598] libceph: client175444 fsid 
>> >> > b0f4a188-bd81-11ea-8849-97abe2843f29
>> >> > [10.349380] libceph: get_reply osd1 tid 11 data 4164 > preallocated
>> >> > 4096, skipping
>> >>
>> >> Please paste the entire boot log and "rbd info" output for the affected
>> >> image.
>> >
>> >
>> > elias@maas:~$ rbd info squashfs/ltsp-01
>> > rbd image 'ltsp-01':
>> > size 3.5 GiB in 896 objects
>> > order 22 (4 MiB objects)
>> > snapshot_count: 0
>> > id: 23faade1714
>> > block_name_prefix: rbd_data.23faade1714
>> > format: 2
>> > features: layering, exclusive-lock, object-map, fast-diff, deep-flatten
>> > op_features:
>> > flags:
>> > create_timestamp: Mon Jan 11 12:09:22 2021
>> > access_timestamp: Wed Feb 24 10:55:17 2021
>> > modify_timestamp: Mon Jan 11 12:09:22 2021
>> >
>> > I don't have the boot log available right now, but you can watch a video 
>> > of the boot process right here: https://photos.app.goo.gl/S8PssYu2VAr4CSeg7
>> >
>> > It seems to be consistently "tid 11" consistently, while in this video it 
>> > was "data 4288" not "data 4164" as above. But the image has been modified 
>> > in the meantime, as far as I can recall, so that might be due to that 
>> > reason.
>> >>
>> >>
>> >> >
>> >> > elias@maas:~$ juju ssh ceph-osd/2 sudo zgrep -i error 
>> >> > /var/log/ceph/ceph-osd.0.log
>> >> > 2021-05-16T08:52:56.872+ 7f0b262c2d80  4 rocksdb:   
>> >> >   Options.error_if_exists: 0
>> >> > 2021-05-16T08:52:59.872+ 7f0b262c2d80  4 rocksdb:   
>> >> >   Options.error_if_exists: 0
>> >> > 2021-05-16T08:53:00.884+ 7f0b262c2d80  1 osd.0 8599 warning: got an 
>> >> > error loading one or more classes: (1) Operation not permitted
>> >> >
>> >> > elias@maas:~$ juju ssh ceph-osd/0 sudo zgrep -i error 
>> >> > /var/log/ceph/ceph-osd.1.log
>> >> > 2021-05-16T08:49:52.971+ 7fb6aa68ed80  4 

[ceph-users] Re: Mon crash when client mounts CephFS

2021-06-08 Thread Ilya Dryomov
On Tue, Jun 8, 2021 at 9:20 PM Phil Merricks  wrote:
>
> Hey folks,
>
> I have deployed a 3 node dev cluster using cephadm.  Deployment went
> smoothly and all seems well.
>
> If I try to mount a CephFS from a client node, 2/3 mons crash however.
> I've begun picking through the logs to see what I can see, but so far
> other than seeing the crash in the log itself, it's unclear what the cause
> of the crash is.
>
> Here's a log.  You can see where the crash is
> occurring around the line that begins with "Jun 08 18:56:04 okcomputer
> podman[790987]:"

Hi Phil,

I assume you are mounting with the kernel client, not ceph-fuse?  If so,
what is the kernel version on the client node?

ceph version 16.2.4 (3cbe25cde3cfa028984618ad32de9edc4c1eaed0)
pacific (stable)
1: /lib64/libpthread.so.0(+0x12b20) [0x7fc36de86b20]
2: gsignal()
3: abort()
4: /lib64/libstdc++.so.6(+0x9009b) [0x7fc36d4a409b]
5: /lib64/libstdc++.so.6(+0x9653c) [0x7fc36d4aa53c]
6: /lib64/libstdc++.so.6(+0x96597) [0x7fc36d4aa597]
7: /lib64/libstdc++.so.6(+0x967f8) [0x7fc36d4aa7f8]
8: /lib64/libstdc++.so.6(+0x92045) [0x7fc36d4a6045]
9: /usr/bin/ceph-mon(+0x4d8da6) [0x563c51ad8da6]
10: (MDSMonitor::check_sub(Subscription*)+0x819) [0x563c51acf329]
11: (Monitor::handle_subscribe(boost::intrusive_ptr)+0xcd8)
[0x563c518c1258]
12: (Monitor::dispatch_op(boost::intrusive_ptr)+0x78d)
[0x563c518e72ed]
13: (Monitor::_ms_dispatch(Message*)+0x670) [0x563c518e8910]
14: (Dispatcher::ms_dispatch2(boost::intrusive_ptr
const&)+0x5c) [0x563c51916fdc]
15: (DispatchQueue::entry()+0x126a) [0x7fc3705c6b1a]
16: (DispatchQueue::DispatchThread::entry()+0x11) [0x7fc370676b71]
17: /lib64/libpthread.so.0(+0x814a) [0x7fc36de7c14a]
18: clone()

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nautilus: rbd ls returns ENOENT for some images

2021-06-09 Thread Ilya Dryomov
On Wed, Jun 9, 2021 at 11:24 AM Peter Lieven  wrote:
>
> Hi,
>
>
> we currently run into an issue where a rbd ls for a namespace returns ENOENT 
> for some of the images in that namespace.
>
>
> /usr/bin/rbd --conf=XXX --id XXX ls 
> 'mypool/28ef9470-76eb-4f77-bc1b-99077764ff7c' -l --format=json
> 2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55ca2390 fail: (2) No such file or directory
> 2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55caccd2b920 fail: (2) No such file or directory
> 2021-06-09 11:03:34.920 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55caccd9b4e0 fail: (2) No such file or directory
> rbd: error opening 34810ac2-3112-4fef-938c-b76338b0eeaf.raw: (2) No such file 
> or directory
> rbd: error opening c9882583-6dd5-4eca-bb82-3e81f7d63fa9.raw: (2) No such file 
> or directory
> rbd: error opening 5d5251d1-f017-4382-845c-65e504683742.raw: (2) No such file 
> or directory
> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55cacce07b00 fail: (2) No such file or directory
> rbd: error opening c625b898-ec34-4446-9455-d2b70d9e378f.raw: (2) No such file 
> or directory
> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55caccd7cce0 fail: (2) No such file or directory
> rbd: error opening 990c4bbe-6a7b-4adf-aab8-432e18d79e58.raw: (2) No such file 
> or directory
> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> 0x55cacce336f0 fail: (2) No such file or directory
> rbd: error opening 7382eb5b-a3eb-41e2-89b6-512f7b1d86c0.raw: (2) No such file 
> or directory
> [{"image":"108600c6-2312-4d61-9f5b-35b351112512.raw","size":3145728,"format":2,"lock_type":"exclusive"},{"image":"1292ef0c-2333-44f1-be30-39105f7d176e.raw","size":262149242880,"format":2,"lock_type":"exclusive"},{"image":"8cda5c3f-cdbd-42f4-918f-1480354e7965.raw","size":262149242880,"format":2,"lock_type":"exclusive"}]
> rbd: listing images failed: (2) No such file or directory
>
>
> The way to trigger this state was that the images which show "No such file or 
> directory" were deleted with rbd rm, but the operation was interrupted (rbd 
> process was killed) due to a timeout.
>
> What is the best way to recover from this and how to properly clean up?
>
>
> Release is nautilus 14.2.20

Hi Peter,

Does "rbd ls" without "-l" succeed?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nautilus: rbd ls returns ENOENT for some images

2021-06-09 Thread Ilya Dryomov
On Wed, Jun 9, 2021 at 1:36 PM Peter Lieven  wrote:
>
> Am 09.06.21 um 13:28 schrieb Ilya Dryomov:
> > On Wed, Jun 9, 2021 at 11:24 AM Peter Lieven  wrote:
> >> Hi,
> >>
> >>
> >> we currently run into an issue where a rbd ls for a namespace returns 
> >> ENOENT for some of the images in that namespace.
> >>
> >>
> >> /usr/bin/rbd --conf=XXX --id XXX ls 
> >> 'mypool/28ef9470-76eb-4f77-bc1b-99077764ff7c' -l --format=json
> >> 2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55ca2390 fail: (2) No such file or directory
> >> 2021-06-09 11:03:34.916 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55caccd2b920 fail: (2) No such file or directory
> >> 2021-06-09 11:03:34.920 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55caccd9b4e0 fail: (2) No such file or directory
> >> rbd: error opening 34810ac2-3112-4fef-938c-b76338b0eeaf.raw: (2) No such 
> >> file or directory
> >> rbd: error opening c9882583-6dd5-4eca-bb82-3e81f7d63fa9.raw: (2) No such 
> >> file or directory
> >> rbd: error opening 5d5251d1-f017-4382-845c-65e504683742.raw: (2) No such 
> >> file or directory
> >> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55cacce07b00 fail: (2) No such file or directory
> >> rbd: error opening c625b898-ec34-4446-9455-d2b70d9e378f.raw: (2) No such 
> >> file or directory
> >> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55caccd7cce0 fail: (2) No such file or directory
> >> rbd: error opening 990c4bbe-6a7b-4adf-aab8-432e18d79e58.raw: (2) No such 
> >> file or directory
> >> 2021-06-09 11:03:34.924 7f2225ffb700 -1 librbd::io::AioCompletion: 
> >> 0x55cacce336f0 fail: (2) No such file or directory
> >> rbd: error opening 7382eb5b-a3eb-41e2-89b6-512f7b1d86c0.raw: (2) No such 
> >> file or directory
> >> [{"image":"108600c6-2312-4d61-9f5b-35b351112512.raw","size":3145728,"format":2,"lock_type":"exclusive"},{"image":"1292ef0c-2333-44f1-be30-39105f7d176e.raw","size":262149242880,"format":2,"lock_type":"exclusive"},{"image":"8cda5c3f-cdbd-42f4-918f-1480354e7965.raw","size":262149242880,"format":2,"lock_type":"exclusive"}]
> >> rbd: listing images failed: (2) No such file or directory
> >>
> >>
> >> The way to trigger this state was that the images which show "No such file 
> >> or directory" were deleted with rbd rm, but the operation was interrupted 
> >> (rbd process was killed) due to a timeout.
> >>
> >> What is the best way to recover from this and how to properly clean up?
> >>
> >>
> >> Release is nautilus 14.2.20
> > Hi Peter,
> >
> > Does "rbd ls" without "-l" succeed?
>
>
> Yes, it does:
>
>
> /usr/bin/rbd --conf=XXX --id XXX ls 
> 'mypool/28ef9470-76eb-4f77-bc1b-99077764ff7c' --format=json
>
>  
> ["108600c6-2312-4d61-9f5b-35b351112512.raw","1292ef0c-2333-44f1-be30-39105f7d176e.raw","8cda5c3f-cdbd-42f4-918f-1480354e7965.raw","34810ac2-3112-4fef-938c-b76338b0eeaf.raw","c9882583-6dd5-4eca-bb82-3e81f7d63fa9.raw","5d5251d1-f017-4382-845c-65e504683742.raw","c625b898-ec34-4446-9455-d2b70d9e378f.raw","990c4bbe-6a7b-4adf-aab8-432e18d79e58.raw","7382eb5b-a3eb-41e2-89b6-512f7b1d86c0.raw"]

I think simply re-running interrupted "rbd rm" commands would work and
clean up properly.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Performance (RBD) regression after upgrading beyond v15.2.8

2021-06-09 Thread Ilya Dryomov
On Wed, Jun 9, 2021 at 1:38 PM Wido den Hollander  wrote:
>
> Hi,
>
> While doing some benchmarks I have two identical Ceph clusters:
>
> 3x SuperMicro 1U
> AMD Epyc 7302P 16C
> 256GB DDR
> 4x Samsung PM983 1,92TB
> 100Gbit networking
>
> I tested on such a setup with v16.2.4 with fio:
>
> bs=4k
> qd=1
>
> IOps: 695
>
> That was very low as I was expecting at least >1000 IOps.
>
> I checked with the second Ceph cluster which was still running v15.2.8,
> the result: 1364 IOps.
>
> I then upgraded from 15.2.8 to 15.2.13: 725 IOps
>
> Looking at the differences between v15.2.8 and v15.2.8 of options.cc I
> saw these options:
>
> bluefs_buffered_io: false -> true
> bluestore_cache_trim_max_skip_pinned: 1000 -> 64
>
> The main difference seems to be 'bluefs_buffered_io', but in both cases
> this was already explicitly set to 'true'.
>
> So anything beyond 15.2.8 is right now giving me a much lower I/O
> performance with Queue Depth = 1 and Block Size = 4k.
>
> 15.2.8: 1364 IOps
> 15.2.13: 725 IOps
> 16.2.4: 695 IOps
>
> Has anybody else seen this as well? I'm trying to figure out where this
> is going wrong.

Hi Wido,

Going by the subject, I assume these are rbd numbers?  If so, did you
run any RADOS-level benchmarks?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can not mount rbd device anymore

2021-06-23 Thread Ilya Dryomov
On Wed, Jun 23, 2021 at 9:59 AM Matthias Ferdinand  wrote:
>
> On Tue, Jun 22, 2021 at 02:36:00PM +0200, Ml Ml wrote:
> > Hello List,
> >
> > oversudden i can not mount a specific rbd device anymore:
> >
> > root@proxmox-backup:~# rbd map backup-proxmox/cluster5 -k
> > /etc/ceph/ceph.client.admin.keyring
> > /dev/rbd0
> >
> > root@proxmox-backup:~# mount /dev/rbd0 /mnt/backup-cluster5/
> >  (just never times out)
>
>
> Hi,
>
> there used to be some kernel lock issues when the kernel rbd client
> tried to access an OSD on the same machine. Not sure if these issues
> still exist (but I would guess so) and if you use your proxmox cluster
> in a hyperconverged manner (nodes providing VMs and storage service at
> the same time) you may just have been lucky that it had worked before.
>
> Instead of the kernel client mount you can try to export the volume as
> an NBD device (https://docs.ceph.com/en/latest/man/8/rbd-nbd/) and
> mounting that. rbd-nbd runs in userspace and should not have that
> locking problem.

rbd-nbd is also susceptible to locking up in such setups, likely more
so than krbd.  Don't forget that it also has a kernel component and
there are actually more opportunities for things to go sideways/lock up
because there is an extra daemon involved allocating some additional
memory for each I/O request.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: when is krbd on osd nodes starting to get problematic?

2021-06-23 Thread Ilya Dryomov
On Wed, Jun 23, 2021 at 3:36 PM Marc  wrote:
>
> From what kernel / ceph version is krbd usage on a osd node problematic?
>
> Currently I am running Nautilus 14.2.11 and el7 3.10 kernel without any 
> issues.
>
> I can remember using a cephfs mount without any issues as well, until some 
> specific luminous update surprised me. So maybe nice to know when to expect 
> this.

It has always been the case.  This is a rather fundamental issue and
it is not specific to Ceph.  I don't think there is a particular Ceph
release or kernel version to name other than it became much harder to
hit with modern kernels.

I would be cautious about attributing random stalls or hangs that may
be experienced for a wide variety of reasons to this co-location issue,
even if moving the mount to another machine happened to help.  Usually
such reports lack the necessary evidence, the last one that I could
confirm to be the co-location related lockup was at least a couple of
years ago.

Thanks,

Ilya

>
>
>
> > -Original Message-
> > Sent: Wednesday, 23 June 2021 11:25
> > Subject: *SPAM* [ceph-users] Re: Can not mount rbd device
> > anymore
> >
> > On Wed, Jun 23, 2021 at 9:59 AM Matthias Ferdinand
> >  wrote:
> > >
> > > On Tue, Jun 22, 2021 at 02:36:00PM +0200, Ml Ml wrote:
> > > > Hello List,
> > > >
> > > > oversudden i can not mount a specific rbd device anymore:
> > > >
> > > > root@proxmox-backup:~# rbd map backup-proxmox/cluster5 -k
> > > > /etc/ceph/ceph.client.admin.keyring
> > > > /dev/rbd0
> > > >
> > > > root@proxmox-backup:~# mount /dev/rbd0 /mnt/backup-cluster5/
> > > >  (just never times out)
> > >
> > >
> > > Hi,
> > >
> > > there used to be some kernel lock issues when the kernel rbd client
> > > tried to access an OSD on the same machine. Not sure if these issues
> > > still exist (but I would guess so) and if you use your proxmox cluster
> > > in a hyperconverged manner (nodes providing VMs and storage service at
> > > the same time) you may just have been lucky that it had worked before.
> > >
> > > Instead of the kernel client mount you can try to export the volume as
> > > an NBD device (https://docs.ceph.com/en/latest/man/8/rbd-nbd/) and
> > > mounting that. rbd-nbd runs in userspace and should not have that
> > > locking problem.
> >
> > rbd-nbd is also susceptible to locking up in such setups, likely more
> > so than krbd.  Don't forget that it also has a kernel component and
> > there are actually more opportunities for things to go sideways/lock up
> > because there is an extra daemon involved allocating some additional
> > memory for each I/O request.
> >
> > Thanks,
> >
> > Ilya
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Can not mount rbd device anymore

2021-06-25 Thread Ilya Dryomov
On Fri, Jun 25, 2021 at 11:25 AM Ml Ml  wrote:
>
> The rbd Client is not on one of the OSD Nodes.
>
> I now added a "backup-proxmox/cluster5a" to it and it works perfectly.
> Just that one rbd image sucks. The last thing i remember was to resize
> the Image from 6TB to 8TB and i then did a xfs_grow on it.
>
> Does that ring a bell?

It does seem like a filesystem problem so far but you haven't posted
dmesg or other details.  "mount" will not time out, if it's not returning
due to hanging somewhere you would likely get "task ... blocked for ..."
splats in dmesg.

Thanks,

        Ilya

>
>
> On Wed, Jun 23, 2021 at 11:25 AM Ilya Dryomov  wrote:
> >
> > On Wed, Jun 23, 2021 at 9:59 AM Matthias Ferdinand  
> > wrote:
> > >
> > > On Tue, Jun 22, 2021 at 02:36:00PM +0200, Ml Ml wrote:
> > > > Hello List,
> > > >
> > > > oversudden i can not mount a specific rbd device anymore:
> > > >
> > > > root@proxmox-backup:~# rbd map backup-proxmox/cluster5 -k
> > > > /etc/ceph/ceph.client.admin.keyring
> > > > /dev/rbd0
> > > >
> > > > root@proxmox-backup:~# mount /dev/rbd0 /mnt/backup-cluster5/
> > > >  (just never times out)
> > >
> > >
> > > Hi,
> > >
> > > there used to be some kernel lock issues when the kernel rbd client
> > > tried to access an OSD on the same machine. Not sure if these issues
> > > still exist (but I would guess so) and if you use your proxmox cluster
> > > in a hyperconverged manner (nodes providing VMs and storage service at
> > > the same time) you may just have been lucky that it had worked before.
> > >
> > > Instead of the kernel client mount you can try to export the volume as
> > > an NBD device (https://docs.ceph.com/en/latest/man/8/rbd-nbd/) and
> > > mounting that. rbd-nbd runs in userspace and should not have that
> > > locking problem.
> >
> > rbd-nbd is also susceptible to locking up in such setups, likely more
> > so than krbd.  Don't forget that it also has a kernel component and
> > there are actually more opportunities for things to go sideways/lock up
> > because there is an extra daemon involved allocating some additional
> > memory for each I/O request.
> >
> > Thanks,
> >
> > Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unprotect snapshot: device or resource busy

2021-07-01 Thread Ilya Dryomov
On Thu, Jul 1, 2021 at 8:37 AM Jan Kasprzak  wrote:
>
> Hello, Ceph users,
>
> How can I figure out why it is not possible to unprotect a snapshot
> in a RBD image? I use this RBD pool for OpenNebula, and somehow there
> is a snapshot in one image, which OpenNebula does not see. So I wanted
> to delete the snapshot:
>
> # rbd info one/one-1312
> rbd image 'one-1312':
> size 8 GiB in 2048 objects
> order 22 (4 MiB objects)
> snapshot_count: 1
> id: 6732dccd50fa75
> block_name_prefix: rbd_data.6732dccd50fa75
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten
> op_features:
> flags:
> create_timestamp: Wed Jun 30 10:41:59 2021
> access_timestamp: Wed Jun 30 16:48:30 2021
> modify_timestamp: Wed Jun 30 15:52:18 2021
>
> # rbd snap ls one/one-1312
> SNAPID NAME SIZE  PROTECTED TIMESTAMP
>   1727 snap 8 GiB yes   Wed Jun 30 16:11:39 2021
>
> # rbd snap rm one/one-1312@snap
> Removing snap: 0% complete...failed.
> rbd: snapshot 'snap' is protected from removal.
> 2021-07-01 08:33:41.489 7f79c6ffd700 -1 librbd::Operations: snapshot is 
> protected
>
> # rbd snap unprotect one/one-1312@snap
> 2021-07-01 08:28:40.747 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> cannot unprotect: at least 1 child(ren) [68ba8e7bace188] in pool 'one'
> 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> encountered error: (16) Device or resource busy
> 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> 0x56522f10e830 should_complete_error: ret_val=-16
> rbd: unprotecting snap failed: 2021-07-01 08:28:40.751 7f3cb6ffd700 -1 
> librbd::SnapshotUnprotectRequest: 0x56522f10e830 should_complete_error: 
> ret_val=-16
>
> As far as I can see neither the snapshot nor the RBD image itself is
> used by a running qemu in my cluster. How can I delete the snapshot
> or debug the problem further?

Hi Jan,

There seems to be a clone image that is based on that snapshot.
"rbd children one/one-1312@snap" should give you its name.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unprotect snapshot: device or resource busy

2021-07-01 Thread Ilya Dryomov
On Thu, Jul 1, 2021 at 9:48 AM Jan Kasprzak  wrote:
>
> Ilya Dryomov wrote:
> : On Thu, Jul 1, 2021 at 8:37 AM Jan Kasprzak  wrote:
> : >
> : > Hello, Ceph users,
> : >
> : > How can I figure out why it is not possible to unprotect a snapshot
> : > in a RBD image? I use this RBD pool for OpenNebula, and somehow there
> : > is a snapshot in one image, which OpenNebula does not see. So I wanted
> : > to delete the snapshot:
> : >
> : > # rbd info one/one-1312
> : > rbd image 'one-1312':
> : > size 8 GiB in 2048 objects
> : > order 22 (4 MiB objects)
> : > snapshot_count: 1
> : > id: 6732dccd50fa75
> : > block_name_prefix: rbd_data.6732dccd50fa75
> : > format: 2
> : > features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten
> : > op_features:
> : > flags:
> : > create_timestamp: Wed Jun 30 10:41:59 2021
> : > access_timestamp: Wed Jun 30 16:48:30 2021
> : > modify_timestamp: Wed Jun 30 15:52:18 2021
> : >
> : > # rbd snap ls one/one-1312
> : > SNAPID NAME SIZE  PROTECTED TIMESTAMP
> : >   1727 snap 8 GiB yes   Wed Jun 30 16:11:39 2021
> : >
> : > # rbd snap rm one/one-1312@snap
> : > Removing snap: 0% complete...failed.
> : > rbd: snapshot 'snap' is protected from removal.
> : > 2021-07-01 08:33:41.489 7f79c6ffd700 -1 librbd::Operations: snapshot is 
> protected
> : >
> : > # rbd snap unprotect one/one-1312@snap
> : > 2021-07-01 08:28:40.747 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> cannot unprotect: at least 1 child(ren) [68ba8e7bace188] in pool 'one'
> : > 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> encountered error: (16) Device or resource busy
> : > 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> 0x56522f10e830 should_complete_error: ret_val=-16
> : > rbd: unprotecting snap failed: 2021-07-01 08:28:40.751 7f3cb6ffd700 -1 
> librbd::SnapshotUnprotectRequest: 0x56522f10e830 should_complete_error: 
> ret_val=-16
> : >
> : > As far as I can see neither the snapshot nor the RBD image itself is
> : > used by a running qemu in my cluster. How can I delete the snapshot
> : > or debug the problem further?
> :
> : Hi Jan,
> :
> : There seems to be a clone image that is based on that snapshot.
> : "rbd children one/one-1312@snap" should give you its name.
>
> Hi Ilya,
>
> thanks for the tip. But apparently there seem to be no children
> of the snap or the base image:
>
> # rbd children one/one-1312@snap
> # rbd children one/one-1312

It is probably in the trash.  Try "rbd children -a one/one-1312@snap"
or "rbd trash ls -a one".

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [solved] Unprotect snapshot: device or resource busy

2021-07-01 Thread Ilya Dryomov
On Thu, Jul 1, 2021 at 10:50 AM Jan Kasprzak  wrote:
>
> Ilya Dryomov wrote:
> : On Thu, Jul 1, 2021 at 8:37 AM Jan Kasprzak  wrote:
> : >
> : > # rbd snap unprotect one/one-1312@snap
> : > 2021-07-01 08:28:40.747 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> cannot unprotect: at least 1 child(ren) [68ba8e7bace188] in pool 'one'
> : > 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> encountered error: (16) Device or resource busy
> : > 2021-07-01 08:28:40.749 7f3cb6ffd700 -1 librbd::SnapshotUnprotectRequest: 
> 0x56522f10e830 should_complete_error: ret_val=-16
> : > rbd: unprotecting snap failed: 2021-07-01 08:28:40.751 7f3cb6ffd700 -1 
> librbd::SnapshotUnprotectRequest: 0x56522f10e830 should_complete_error: 
> ret_val=-16
> : >
> : > As far as I can see neither the snapshot nor the RBD image itself is
> : > used by a running qemu in my cluster. How can I delete the snapshot
> : > or debug the problem further?
> :
> : There seems to be a clone image that is based on that snapshot.
> : "rbd children one/one-1312@snap" should give you its name.
>
> OK, there was a child which did not show up in "rbd children" - the
> previously deleted clone. We use the trash feature, so it did not
> get deleted altogether, but moved to trash instead. I guess "rbd children"
> should be modified to show also images fro trash.
>
> after rbd restore --pool one one-1312-4742-0 I was able to delete
> the image using "rbd rm one/one-1312-4742-0", unprotect the snapshot
> with "rbd snap unprotect one/one-1312@snap", and finally delete it
> with "rbd snap rm one/one-1312@snap".
>
> So my immediate problem is fixed, but it would be nice if "rbd children"
> can also display images from trash.

"rbd children -a" does that as noted in my previous reply.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd: map failed: rbd: sysfs write failed -- (108) Cannot send after transport endpoint shutdown

2021-07-01 Thread Ilya Dryomov
On Thu, Jul 1, 2021 at 10:36 AM Oliver Dzombic  wrote:
>
>
>
> Hi,
>
> mapping of rbd volumes fails clusterwide.

Hi Oliver,

Clusterwide -- meaning on more than one client node?

>
> The volumes that are mapped, are ok, but new volumes wont map.
>
> Receiving errors liks:
>
> (108) Cannot send after transport endpoint shutdown
>
> or executing:
>
> #rbd -p CEPH map vm-120-disk-0
>
> will show:
>
> rbd: sysfs write failed
> In some cases useful info is found in syslog - try "dmesg | tail".
> rbd: map failed: (108) Cannot send after transport endpoint shutdown

Did you look in dmesg as the error suggests?  Please attach the entire
dmesg -- it seems doubtful that the existing mappings are OK because it
appears that the client instance got blocklisted.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Windows Client on 16.2.+

2021-07-19 Thread Ilya Dryomov
On Thu, Jul 15, 2021 at 11:55 PM Robert W. Eckert  wrote:
>
> I would like to directly mount cephfs from the windows client, and keep 
> getting the error below.
>
>
> PS C:\Program Files\Ceph\bin> .\ceph-dokan.exe -l x
> 2021-07-15T17:41:30.365Eastern Daylight Time 4 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-07-15T17:41:30.365Eastern Daylight Time 5 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-07-15T17:41:30.365Eastern Daylight Time 6 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> failed to fetch mon config (--no-mon-config to skip)
>
> My Ceph.conf on the windows client looks like
>
> # minimal ceph.conf for fe3a7cb0-69ca-11eb-8d45-c86000d08867
> [global]
> fsid = fe3a7cb0-69ca-11eb-8d45-c86000d08867
> mon_host = [v2:192.168.2.142:3300/0,v1:192.168.2.142:6789/0] 
> [v2:192.168.2.141:3300/0,v1:192.168.2.141:6789/0] 
> [v2:192.168.2.199:3300/0,v1:192.168.2.199:6789/0]
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> [client.admin]
> keyring = c:/programdata/ceph/ceph.client.admin.keyring
>
> With the same global settings, I can mount ceph on a Ubuntu WSL2 VM, and even 
> access it from windows, but would rather have the direct connection.
>
> This appears to be an issue with the ceph client for windows found from 
> https://docs.ceph.com/en/latest/install/windows-install/  because the RBD 
> command gives a similar error
>
> PS C:\Program Files\Ceph\bin> rbd ls
> 2021-07-15T17:53:51.244Eastern Daylight Time 5 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-07-15T17:53:51.244Eastern Daylight Time 3 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> 2021-07-15T17:53:51.245Eastern Daylight Time 4 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]

Hi Robert,

I suspect this has to do with the fix for [1].  What is
auth_allow_insecure_global_id_reclaim set to on the monitors?
The Windows build appears to be missing that fix and therefore
gets treated as an insecure client.

[1] https://docs.ceph.com/en/latest/security/CVE-2021-20288/

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-Dokan on windows 10 not working after upgrade to pacific

2021-07-19 Thread Ilya Dryomov
On Tue, Jun 29, 2021 at 4:03 PM Lucian Petrut
 wrote:
>
> Hi,
>
> It’s a compatibility issue, we’ll have to update the Windows Pacific build.

Hi Lucian,

Did you get a chance to update the build?

I assume that means the MSI installer at [1]?  I see [2] but the MSI
bundle still seems to contain the old build based off of [3].

[1] https://cloudba.se/ceph-win-latest
[2] 
https://github.com/cloudbase/ceph-windows-installer/commit/78eabd08996c2992621c1d9261ac920915e6ccc9
[3] 
https://github.com/petrutlucian94/ceph/commit/5656003758614f8fd2a8c49c2e7d4f5cd637b0ea

Thanks,

Ilya

>
> Sorry for the delayed reply, hundreds of Ceph ML mails ended up in my spam 
> box. Ironically, I’ll have to thank Office 365 for that :).
>
> Regards,
> Lucian Petrut
>
> From: Robert W. Eckert
> Sent: Friday, May 14, 2021 7:26 PM
> To: ceph-users@ceph.io
> Subject: [ceph-users] ceph-Dokan on windows 10 not working after upgrade to 
> pacific
>
> Hi- I recently upgraded to pacific, and I am now getting an error connecting 
> on my windows 10 machine:
> The error is the handle_auth_bad_method,  I tried a few combinations of 
> cephx,none on the monitors, but I keep getting the same error.
>
> The same config(With paths updated) and key ring works on my WSL instance 
> running an old luminous client (I can't seem to get it to install a newer 
> client )
>
> Do you have any suggestions on where to look?
> Thanks,
> Rob.
> -
>
> PS C:\Program Files\Ceph\bin> .\ceph-dokan.exe --id rob -l Q
> 2021-05-14T12:19:58.172Eastern Daylight Time 5 -1 monclient(hunting): 
> handle_auth_bad_method server allowed_methods [2] but i only support [2]
> failed to fetch mon config (--no-mon-config to skip)
>
> PS C:\Program Files\Ceph\bin> cat  c:/ProgramData/ceph/ceph.client.rob.keyring
> [client.rob]
> key = 
> caps mon = "allow rwx"
> caps osd = "allow rwx"
>
> PS C:\Program Files\Ceph\bin> cat C:\ProgramData\Ceph\ceph.conf
> # minimal ceph.conf
> [global]
> log to stderr = true
> ; Uncomment the following in order to use the Windows Event Log
> log to syslog = true
>
> run dir = C:/ProgramData/ceph/out
> crash dir = C:/ProgramData/ceph/out
>
> ; Use the following to change the cephfs client log level
> debug client = 2
> [global]
> fsid = 
> mon_host = []
> auth_cluster_required = cephx
> auth_service_required = cephx
> auth_client_required = cephx
> [client]
> keyring = c:/ProgramData/ceph/ceph.client.rob.keyring
> log file = C:/ProgramData/ceph/out/$name.$pid.log
> admin socket = C:/ProgramData/ceph/out/$name.$pid.asok
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph-Dokan on windows 10 not working after upgrade to pacific

2021-07-21 Thread Ilya Dryomov
On Tue, Jul 20, 2021 at 11:49 PM Robert W. Eckert  wrote:
>
> The link in the ceph documentation 
> (https://docs.ceph.com/en/latest/install/windows-install/) is  
> https://cloudbase.it/ceph-for-windows/ is  https://cloudba.se/ceph-win-latest 
> the same?

Yes.  https://cloudba.se/ceph-win-latest is where "Ceph 16.0.0 for
Windows x64 - Latest Build" button points to.

Thanks,

Ilya

>
> Thanks,
> Rob
>
>
> -Original Message-
> From: Ilya Dryomov 
> Sent: Monday, July 19, 2021 8:04 AM
> To: Lucian Petrut 
> Cc: Robert W. Eckert ; ceph-users@ceph.io
> Subject: Re: [ceph-users] Re: ceph-Dokan on windows 10 not working after 
> upgrade to pacific
>
> On Tue, Jun 29, 2021 at 4:03 PM Lucian Petrut 
>  wrote:
> >
> > Hi,
> >
> > It’s a compatibility issue, we’ll have to update the Windows Pacific build.
>
> Hi Lucian,
>
> Did you get a chance to update the build?
>
> I assume that means the MSI installer at [1]?  I see [2] but the MSI bundle 
> still seems to contain the old build based off of [3].
>
> [1] https://cloudba.se/ceph-win-latest
> [2] 
> https://github.com/cloudbase/ceph-windows-installer/commit/78eabd08996c2992621c1d9261ac920915e6ccc9
> [3] 
> https://github.com/petrutlucian94/ceph/commit/5656003758614f8fd2a8c49c2e7d4f5cd637b0ea
>
> Thanks,
>
> Ilya
>
> >
> > Sorry for the delayed reply, hundreds of Ceph ML mails ended up in my spam 
> > box. Ironically, I’ll have to thank Office 365 for that :).
> >
> > Regards,
> > Lucian Petrut
> >
> > From: Robert W. Eckert<mailto:r...@rob.eckert.name>
> > Sent: Friday, May 14, 2021 7:26 PM
> > To: ceph-users@ceph.io<mailto:ceph-users@ceph.io>
> > Subject: [ceph-users] ceph-Dokan on windows 10 not working after
> > upgrade to pacific
> >
> > Hi- I recently upgraded to pacific, and I am now getting an error 
> > connecting on my windows 10 machine:
> > The error is the handle_auth_bad_method,  I tried a few combinations of 
> > cephx,none on the monitors, but I keep getting the same error.
> >
> > The same config(With paths updated) and key ring works on my WSL
> > instance running an old luminous client (I can't seem to get it to
> > install a newer client )
> >
> > Do you have any suggestions on where to look?
> > Thanks,
> > Rob.
> > -
> >
> > PS C:\Program Files\Ceph\bin> .\ceph-dokan.exe --id rob -l Q
> > 2021-05-14T12:19:58.172Eastern Daylight Time 5 -1 monclient(hunting):
> > handle_auth_bad_method server allowed_methods [2] but i only support
> > [2] failed to fetch mon config (--no-mon-config to skip)
> >
> > PS C:\Program Files\Ceph\bin> cat
> > c:/ProgramData/ceph/ceph.client.rob.keyring
> > [client.rob]
> > key = 
> > caps mon = "allow rwx"
> > caps osd = "allow rwx"
> >
> > PS C:\Program Files\Ceph\bin> cat C:\ProgramData\Ceph\ceph.conf #
> > minimal ceph.conf [global]
> > log to stderr = true
> > ; Uncomment the following in order to use the Windows Event Log
> > log to syslog = true
> >
> > run dir = C:/ProgramData/ceph/out
> > crash dir = C:/ProgramData/ceph/out
> >
> > ; Use the following to change the cephfs client log level
> > debug client = 2
> > [global]
> > fsid = 
> > mon_host = []
> > auth_cluster_required = cephx
> > auth_service_required = cephx
> > auth_client_required = cephx
> > [client]
> > keyring = c:/ProgramData/ceph/ceph.client.rob.keyring
> > log file = C:/ProgramData/ceph/out/$name.$pid.log
> > admin socket = C:/ProgramData/ceph/out/$name.$pid.asok
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> >
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: nobody in control of ceph csi development?

2021-07-22 Thread Ilya Dryomov
On Wed, Jul 21, 2021 at 4:30 PM Marc  wrote:
>
> Crappy code continues to live on?
>
> This issue has been automatically marked as stale because it has not had 
> recent activity. It will be closed in a week if no further activity occurs. 
> Thank you for your contributions.

Hi Marc,

Which issue are you referring to?

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unable to map device with krbd on el7 with ceph nautilus

2021-07-26 Thread Ilya Dryomov
On Fri, Jul 23, 2021 at 11:58 PM  wrote:
>
> Hi.
>
> I've followed the installation guide and got nautilus 14.2.22 running on el7 
> via https://download.ceph.com/rpm-nautilus/el7/x86_64/ yum repo.
> I'm now trying to map a device on an el7 and getting extremely weird errors:
>
> # rbd info test1/blk1 --name client.testing-rw
> rbd image 'blk1':
> size 50 GiB in 12800 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 2e0929313a08e
> block_name_prefix: rbd_data.2e0929313a08e
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten
> op_features:
> flags:
> create_timestamp: Fri Jul 23 15:59:12 2021
> access_timestamp: Fri Jul 23 15:59:12 2021
> modify_timestamp: Fri Jul 23 15:59:12 2021
>
> # rbd device map test1/blk1 --name client.testing-rw
> rbd: sysfs write failed
> In some cases useful info is found in syslog - try "dmesg | tail".
> rbd: map failed: (3) No such process
>
> # dmesg | tail
> [91885.624859] libceph: resolve 'name=testing-rw' (ret=-3): failed
> [91885.624863] libceph: parse_ips bad ip 
> 'name=testing-rw,key=client.testing-rw'

Hi,

I think it should be "rbd device map test1/blk1 --id testing-rw".

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unable to map device with krbd on el7 with ceph nautilus

2021-07-26 Thread Ilya Dryomov
On Mon, Jul 26, 2021 at 12:39 PM  wrote:
>
> Although I appreciate the responses, they have provided zero help solving 
> this issue thus far.
> It seems like the kernel module doesn't even get to the stage where it reads 
> the attributes/features of the device. It doesn't know where to connect and, 
> presumably, is confused by the options passed by userspace.

Sorry, your "rbd info" output didn't register with me for some reason.
--id and --user should be equivalent and --name with "client." prefix
should be fine too so my suggestion was useless.

As Marc mentioned, you would need to disable unsupported features but
you are right that the kernel doesn't make it to that point.

>
> Obviously, I have already tried "--user", "--name" and so on with similar 
> messages in dmesg. I don't feel like downgrading to 10. Any way to make el7 
> work with 14 userspace utils?

It is expected to work.  Could you please strace "rbd device map"
with something like "strace -f -e write -s 500 rbd device map ..."
and attach the output?

Thanks,

Ilya

>
> I have also just realized there was 1 message missing from provided logs, 
> which somehow was logged by userspace(?) and not kernel. Here's the full log:
>
> # rbd device map test1/blk1 --user testing-rw
>
> Jul 26 05:33:53 xx key.dns_resolver[9147]: name=testing-rw: No address 
> associated with name
> Jul 26 05:33:53 xx kernel: libceph: resolve 'name=testing-rw' (ret=-3): failed
> Jul 26 05:33:53 xx kernel: libceph: parse_ips bad ip 
> 'name=testing-rw,key=client.testing-rw'
>
> # rbd info test1/blk1 --user testing-rw
> works perfectly
>
> Thanks.
>
>
> On 7/24/21 12:47 PM, Marc wrote:
> >
> > If you have the default kernel you can not use all these features. I think 
> > even dmesg shows you something about that when mapping.
> >
> >
> >> -Original Message-
> >> From: cek+c...@deepunix.net
> >> Sent: Friday, 23 July 2021 23:58
> >> To: ceph-users@ceph.io
> >> Subject: *SPAM* [ceph-users] unable to map device with krbd on
> >> el7 with ceph nautilus
> >>
> >> Hi.
> >>
> >> I've followed the installation guide and got nautilus 14.2.22 running on
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: unable to map device with krbd on el7 with ceph nautilus

2021-07-26 Thread Ilya Dryomov
On Mon, Jul 26, 2021 at 5:25 PM  wrote:
>
> Have found the problem. All this was caused by missing mon_host directive in 
> ceph.conf. I have expected userspace to catch this, but it looks like it 
> didn't care.

We should probably add an explicit check for that so that the error
message is explicit.

> We use DNS SRV in this cluster.
>
> With mon_host directive reinstated, it was able to connect:
> Jul 26 09:51:40 xx kernel: libceph: mon0 10.xx:6789 session established
> Jul 26 09:51:40 xx kernel: libceph: client188721 fsid 
> 548a0823-815a-4ac5-a2e5-42cc7e8206ab
> Jul 26 09:51:40 xx kernel: rbd: image blk1: image uses unsupported features: 
> 0x38

Now you just need to disable object-map, fast-diff and deep-flatten
with "rbd feature disable" as mentioned by Marc and Dimitri.

>
> I'm wondering what happens in case this mon1 host goes down, will the kernel 
> module go through the remaining mon directive addresses?

Yes, these addreses are used to put together the initial list of
monitors (initial monmap).  The kernel picks one at random and keeps
trying until the session with either of them gets established.  After
that the real monmap received from the cluster replaces the initial
list.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim

2021-08-09 Thread Ilya Dryomov
On Mon, Aug 9, 2021 at 5:14 PM Robert W. Eckert  wrote:
>
> I have had the same issue with the windows client.
> I had to issue
> ceph config set mon auth_expose_insecure_global_id_reclaim false
> Which allows the other clients to connect.
> I think you need to restart the monitors as well, because the first few times 
> I tried this, I still couldn't connect.

For archive's sake, I'd like to mention that disabling
auth_expose_insecure_global_id_reclaim isn't right and it wasn't
intended for this.  Enabling auth_allow_insecure_global_id_reclaim
should be enough to allow all (however old) clients to connect.
The fact that it wasn't enough for the available Windows build
suggests that there is some subtle breakage in it because all "expose"
does is it forces the client to connect twice instead of just once.
It doesn't actually refuse old unpatched clients.

(The breakage isn't surprising given that the available build is
more or less a random development snapshot with some pending at the
time Windows-specific patches applied.  I'll try to escalate issue
and get the linked MSI bundle updated.)

Thanks,

Ilya

>
> -Original Message-
> From: Richard Bade 
> Sent: Sunday, August 8, 2021 8:27 PM
> To: Daniel Persson 
> Cc: Ceph Users 
> Subject: [ceph-users] Re: BUG #51821 - client is using insecure global_id 
> reclaim
>
> Hi Daniel,
> I had a similar issue last week after upgrading my test cluster from
> 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in .20. My 
> issue was a rados gw that I was re-deploying on the latest version. The 
> problem seemed to be related with cephx authentication.
> It kept displaying the error message you have and the service wouldn't start.
> I ended up stopping and removing the old rgw service, deleting all the keys 
> in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and re-deploying the 
> radosgw. This used the new rgw bootstrap keys and new key for this radosgw.
> So, I would suggest you double and triple check which keys your clients are 
> using and that cephx is enabled correctly on your cluster.
> Check your admin key in /etc/ceph as well, as that's what's being used for 
> ceph status.
>
> Regards,
> Rich
>
> On Sun, 8 Aug 2021 at 05:01, Daniel Persson  wrote:
> >
> > Hi everyone.
> >
> > I suggested asking for help here instead of in the bug tracker so that
> > I will try it.
> >
> > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_i
> > d=51824
> >
> > I have a problem that I can't seem to figure out how to resolve the issue.
> >
> > AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id
> > reclaim
> > AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure
> > global_id reclaim
> >
> >
> > Both of these have to do with reclaiming ID and securing that no
> > client could steal or reuse another client's ID. I understand the
> > reason for this and want to resolve the issue.
> >
> > Currently, I have three different clients.
> >
> > * One Windows client using the latest Ceph-Dokan build. (ceph version
> > 15.0.0-22274-g5656003758 (5656003758614f8fd2a8c49c2e7d4f5cd637b0ea)
> > pacific
> > (rc))
> > * One Linux Debian build using the built packages for that kernel. (
> > 4.19.0-17-amd64)
> > * And one client that I've built from source for a raspberry PI as
> > there is no arm build for the Pacific release. (5.11.0-1015-raspi)
> >
> > If I switch over to not allow global id reclaim, none of these clients
> > could connect, and using the command "ceph status" on one of my nodes
> > will also fail.
> >
> > All of them giving the same error message:
> >
> > monclient(hunting): handle_auth_bad_method server allowed_methods [2]
> > but i only support [2]
> >
> >
> > Has anyone encountered this problem and have any suggestions?
> >
> > PS. The reason I have 3 different hosts is that this is a test
> > environment where I try to resolve and look at issues before we
> > upgrade our production environment to pacific. DS.
> >
> > Best regards
> > Daniel
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an
> > email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to 
> ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Discard / Trim does not shrink rbd image size when disk is partitioned

2021-08-12 Thread Ilya Dryomov
On Thu, Aug 12, 2021 at 5:03 PM Boris Behrens  wrote:
>
> Hi everybody,
>
> we just stumbled over a problem where the rbd image does not shrink, when
> files are removed.
> This only happenes when the rbd image is partitioned.
>
> * We tested it with centos8/ubuntu20.04 with ext4 and a gpt partition table
> (/boot and /)
> * the kvm device is virtio-scsi-pci with krbd
> * Mount option discard is set
> * command to create large file: dd if=/dev/zero of=testfile bs=64M
> count=1000
> * the image grows in the size we expect
> * when we remove the testfile the rbd image stays at the size
> * we wen recreate the deleted file with the command the rbd image grows
> further
> * using fstrim does not work
> * adding a new disk and initialize the ext4 directly on the disk (wihtout
> partitioning) the trim does work and the rbd image shrinks back to a couple
> GB
> * we use ceph 14.2.21
>
> Does anybody experienced the same issue and maybe know how to solve the
> problem?

Hi Boris,

For discard to work the same way as without partitioning, you need to
make sure that your partitions start at rbd object size boundaries (see
"order" in "rbd info" output).  The default object size is 4M (8192
512-byte sectors).  Here is an fdisk output for a 512M /boot partition
and the remainder used for /:

Command (m for help): p
Disk /dev/rbd0: 100 GiB, 107374182400 bytes, 209715200 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 65536 bytes / 65536 bytes
Disklabel type: gpt
Disk identifier: 77553519-6CA1-DF48-8242-97F45B9E10C9

DeviceStart   End   Sectors  Size Type
/dev/rbd0p18192   1056767   1048576  512M Linux filesystem
/dev/rbd0p2 1056768 209715166 208658399 99.5G Linux filesystem

Note that "Start" values are wholly divisible by 8192.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Discard / Trim does not shrink rbd image size when disk is partitioned

2021-08-16 Thread Ilya Dryomov
On Fri, Aug 13, 2021 at 9:45 AM Boris Behrens  wrote:
>
> Hi Janne,
> thanks for the hint. I was aware of that, but it is goot to add that
> knowledge to the question for further googlesearcher.
>
> Hi Ilya,
> that fixed it. Do we know why the discard does not work when the partition
> table is not aligned? We provide OS templates to our customer, but they can
> also create and attach an empty block device, and they will certainly not
> check if the partitions are aligned correctly.

Hi Boris,

Not the partition table but the partitions themselves.

If the partition isn't aligned to rbd object size (but still aligned to
something reasonable such as 1M), discard would work but *appear* to be
ineffective because for large discard requests instead of just removing
whole objects which is immediately visible in "rbd du" output it would
have to resort to truncating and punching holes in those objects.  To
see some of the effect one would need to run "rbd du --exact" which is
slow.

Some of because "rbd du --exact" would basically sum up the remaining
truncated object sizes but since holes don't consume space that figure
would still be inaccurate (bigger).  To see the real effect, one would
need to look at "ceph osd df" DATA column before running discard (here
removing the file) and after.

And what Janne mentioned, of course.  "-o discard" isn't guaranteed
to actually free up space in all scenarios.  This is why I said "for
discard to work the same way as without partitioning" in my previous
email -- because irrespective of partitioning online discards wouldn't
always have the desired behaviour (on any block device, not just rbd).

Thanks,

Ilya

>
> Cheers
>  Boris
>
>
> Am Fr., 13. Aug. 2021 um 08:44 Uhr schrieb Janne Johansson <
> icepic...@gmail.com>:
>
> > Den tors 12 aug. 2021 kl 17:04 skrev Boris Behrens :
> > > Hi everybody,
> > > we just stumbled over a problem where the rbd image does not shrink, when
> > > files are removed.
> > > This only happenes when the rbd image is partitioned.
> > >
> > > * We tested it with centos8/ubuntu20.04 with ext4 and a gpt partition
> > table
> > > (/boot and /)
> > > * the kvm device is virtio-scsi-pci with krbd
> > > * Mount option discard is set
> > > * command to create large file: dd if=/dev/zero of=testfile bs=64M
> > > count=1000
> > > * the image grows in the size we expect
> > > * when we remove the testfile the rbd image stays at the size
> > > * we wen recreate the deleted file with the command the rbd image grows
> > > further
> >
> > Just a small nit on this single point, regardless of if trim/discard
> > works or not:
> > There is no guarantee that writing a file, removing it and then
> > re-writing a file
> > will ever end up in the same spot again. In fact, most modern filesystems
> > will
> > probably make sure to NOT place things at the same spot again.
> > Since the second write ends up in a different place, it will once again
> > expand
> > your sparse/thin image by the amount of written bytes, this is very much
> > to be expected.
> >
> > I'm sorry if you already knew this and I am just stating the obvious to
> > you, but
> > your text came over as if you expected the second write to not increase the
> > image since that "space" was already blown up on the first write.
> >
> > Trim/discard should still be investigated so you can make it shrink back
> > again somehow, just wanted to point this out for the records.
> >
> >
> > --
> > May the most significant bit of your life be positive.
> >
>
>
> --
> Die Selbsthilfegruppe "UTF-8-Probleme" trifft sich diesmal abweichend im
> groüen Saal.
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: EC and rbd-mirroring

2021-08-18 Thread Ilya Dryomov
On Wed, Aug 18, 2021 at 12:40 PM Torkil Svensgaard  wrote:
>
> Hi
>
> I am looking at one way mirroring from cluster A to B cluster B.
>
> As pr [1] I have configured two pools for RBD on cluster B:
>
> 1) Pool rbd_data using default EC 2+2
> 2) Pool rbd using replica 2
>
> I have a peer relationship set up so when I enable mirroring on an image
> in cluster A it will be replicated to cluster B but it will put both
> data and metadata in the rbd pool.
>
> How do I get the rbd-mirror daemon to use rbd_data for data and rbd for
> metadata only?
>
> Thanks,
>
> Torkil
>
> [1]
> https://docs.ceph.com/en/latest/rados/operations/erasure-code/#erasure-coding-with-overwrites

Hi Torkil,

This is covered here:

https://docs.ceph.com/en/latest/rbd/rbd-mirroring/#data-pools

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: BUG #51821 - client is using insecure global_id reclaim

2021-08-18 Thread Ilya Dryomov
On Tue, Aug 17, 2021 at 9:56 AM Daniel Persson  wrote:
>
> Hi again.
>
> I've now solved my issue with help from people in this group. Thank you for
> helping out.
> I thought the process was a bit complicated so I created a short video
> describing the process.
>
> https://youtu.be/Ds4Wvvo79-M
>
> I hope this helps someone else, and again thank you.

For those who aren't up for building the Windows bits themselves
(and/or need rbd-wnbd which isn't quite covered in the video), the
MSI bundle at [1] has been updated to 16.2.5.

[1] https://cloudbase.it/ceph-for-windows

Thanks,

Ilya

>
> Best regards
> Daniel
>
>
> On Mon, Aug 9, 2021 at 5:43 PM Ilya Dryomov  wrote:
>
> > On Mon, Aug 9, 2021 at 5:14 PM Robert W. Eckert 
> > wrote:
> > >
> > > I have had the same issue with the windows client.
> > > I had to issue
> > > ceph config set mon auth_expose_insecure_global_id_reclaim false
> > > Which allows the other clients to connect.
> > > I think you need to restart the monitors as well, because the first few
> > times I tried this, I still couldn't connect.
> >
> > For archive's sake, I'd like to mention that disabling
> > auth_expose_insecure_global_id_reclaim isn't right and it wasn't
> > intended for this.  Enabling auth_allow_insecure_global_id_reclaim
> > should be enough to allow all (however old) clients to connect.
> > The fact that it wasn't enough for the available Windows build
> > suggests that there is some subtle breakage in it because all "expose"
> > does is it forces the client to connect twice instead of just once.
> > It doesn't actually refuse old unpatched clients.
> >
> > (The breakage isn't surprising given that the available build is
> > more or less a random development snapshot with some pending at the
> > time Windows-specific patches applied.  I'll try to escalate issue
> > and get the linked MSI bundle updated.)
> >
> > Thanks,
> >
> > Ilya
> >
> > >
> > > -Original Message-
> > > From: Richard Bade 
> > > Sent: Sunday, August 8, 2021 8:27 PM
> > > To: Daniel Persson 
> > > Cc: Ceph Users 
> > > Subject: [ceph-users] Re: BUG #51821 - client is using insecure
> > global_id reclaim
> > >
> > > Hi Daniel,
> > > I had a similar issue last week after upgrading my test cluster from
> > > 14.2.13 to 14.2.22 which included this fix for Global ID reclaim in .20.
> > My issue was a rados gw that I was re-deploying on the latest version. The
> > problem seemed to be related with cephx authentication.
> > > It kept displaying the error message you have and the service wouldn't
> > start.
> > > I ended up stopping and removing the old rgw service, deleting all the
> > keys in /etc/ceph/ and all data in /var/lib/ceph/radosgw/ and re-deploying
> > the radosgw. This used the new rgw bootstrap keys and new key for this
> > radosgw.
> > > So, I would suggest you double and triple check which keys your clients
> > are using and that cephx is enabled correctly on your cluster.
> > > Check your admin key in /etc/ceph as well, as that's what's being used
> > for ceph status.
> > >
> > > Regards,
> > > Rich
> > >
> > > On Sun, 8 Aug 2021 at 05:01, Daniel Persson 
> > wrote:
> > > >
> > > > Hi everyone.
> > > >
> > > > I suggested asking for help here instead of in the bug tracker so that
> > > > I will try it.
> > > >
> > > > https://tracker.ceph.com/issues/51821?next_issue_id=51820&prev_issue_i
> > > > d=51824
> > > >
> > > > I have a problem that I can't seem to figure out how to resolve the
> > issue.
> > > >
> > > > AUTH_INSECURE_GLOBAL_ID_RECLAIM: client is using insecure global_id
> > > > reclaim
> > > > AUTH_INSECURE_GLOBAL_ID_RECLAIM_ALLOWED: mons are allowing insecure
> > > > global_id reclaim
> > > >
> > > >
> > > > Both of these have to do with reclaiming ID and securing that no
> > > > client could steal or reuse another client's ID. I understand the
> > > > reason for this and want to resolve the issue.
> > > >
> > > > Currently, I have three different clients.
> > > >
> > > > * One Windows client using the latest Ceph-Dokan build. (ceph version
> > > > 15.0.0-22274-g

[ceph-users] Re: tcmu-runner crashing on 16.2.5

2021-08-25 Thread Ilya Dryomov
On Wed, Aug 25, 2021 at 7:02 AM Paul Giralt (pgiralt)  wrote:
>
> I upgraded to Pacific 16.2.5 about a month ago and everything was working 
> fine. Suddenly for the past few days I’ve started having the tcmu-runner 
> container on my iSCSI gateways just disappear. I’m assuming this is because 
> they have crashed. I deployed the services using cephadm / ceph orch in 
> Docker containers.
>
> It appears that when the service crashes, the container just disappears and 
> it doesn’t look like tcmu-runner is exporting logs anywhere, so I can’t 
> figure out any way to determine the root cause of these failures. When this 
> happens, it appears to cause issues where I can’t reboot the machine (Running 
> CentOS 8) and I need to power-cycle the server to recover.
>
> I’m really not sure where to look to figure out why it’s suddenly failing. 
> The failure is happening randomly on all 4 of the iSCSI gateways. Any 
> pointers would be greatly appreciated.

Hi Paul,

Does the node hang while shutting down or does it lock up so that you
can't even issue the reboot command?

The first place to look at is dmesg and "systemctl status".  cephadm
wraps the services into systemd units so there should be a record of
it terminating there.  If tcmu-runner is indeed crashing, Xiubo (CCed)
might be able to help with debugging.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-nbd crashes Error: failed to read nbd request header: (33) Numerical argument out of domain

2021-08-30 Thread Ilya Dryomov
On Tue, Aug 24, 2021 at 11:43 AM Yanhu Cao  wrote:
>
> Any progress on this? We have encountered the same problem, use the
> rbd-nbd option timeout=120.
> ceph version: 14.2.13
> kernel version: 4.19.118-2+deb10u1

Hi Yanhu,

No, we still don't know what is causing this.

If rbd-nbd is being too slow, perhaps disabling the timeout would help?
Starting with kernel 5.4, "--io-timeout 0" should do it.

In general, the nbd driver is pretty unstable in older kernels.
Timeout handling is just one example so I would advise upgrading
to a recent kernel, e.g. 5.10 LTS.

Thanks,

Ilya

>
> On Wed, May 19, 2021 at 10:55 PM Mykola Golub  wrote:
> >
> > On Wed, May 19, 2021 at 11:32:04AM +0800, Zhi Zhang wrote:
> > > On Wed, May 19, 2021 at 11:19 AM Zhi Zhang 
> > > wrote:
> > >
> > > >
> > > > On Tue, May 18, 2021 at 10:58 PM Mykola Golub 
> > > > wrote:
> > > > >
> > > > > Could you please provide the full rbd-nbd log? If it is too large for
> > > > > the attachment then may be via some public url?
> > > >
> > > >  ceph.rbd-client.log.bz2
> > > > 
> > > >
> > > > I uploaded it to google driver. Pls check it out.
> > >
> > > We found the reader_entry thread got zero byte when trying to read the nbd
> > > request header, then rbd-nbd exited and closed the socket. But we haven't
> > > figured out why read zero byte?
> >
> > Ok. I was hoping to find some hint in the log, why the read from the
> > kernel could return without data, but I don't see it.
> >
> > From experience it could happen when the rbd-nbd got stack or was too
> > slow so the kernel failed after timeout, but it looked different in
> > the logs AFAIR. Anyway you can try increasing the timeout using
> > rbd-nbd --timeout (--io-timeout in newer versions) option. The default
> > is 30 sec.
> >
> > If it does not help, probably you will find a clue increasing the
> > kernel debug level for nbd (it seems it is possible to do).
> >
> > --
> > Mykola Golub
> > ___
> > Dev mailing list -- d...@ceph.io
> > To unsubscribe send an email to dev-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd-nbd crashes Error: failed to read nbd request header: (33) Numerical argument out of domain

2021-08-30 Thread Ilya Dryomov
On Mon, Aug 30, 2021 at 1:06 PM Yanhu Cao  wrote:
>
> Hi Ilya,
>
> Recently, we found these patches(v2)
> http://archive.lwn.net:8080/linux-kernel/YRHa%2FkeJ4pHP3hnL@T590/T/.
> Maybe related?
>
> v3: 
> https://lore.kernel.org/linux-block/20210824141227.808340-2-yuku...@huawei.com/

It doesn't look related at first sight, but who knows...

This is exactly my point about 4.19 being too old -- it is hard to
justify spending time on debugging an issue that reproduces once in
a while on old kernels because it could have been fixed by something
that would appear to be unrelated.

Thanks,

Ilya

>
> On Mon, Aug 30, 2021 at 6:34 PM Ilya Dryomov  wrote:
> >
> > On Tue, Aug 24, 2021 at 11:43 AM Yanhu Cao  wrote:
> > >
> > > Any progress on this? We have encountered the same problem, use the
> > > rbd-nbd option timeout=120.
> > > ceph version: 14.2.13
> > > kernel version: 4.19.118-2+deb10u1
> >
> > Hi Yanhu,
> >
> > No, we still don't know what is causing this.
> >
> > If rbd-nbd is being too slow, perhaps disabling the timeout would help?
> > Starting with kernel 5.4, "--io-timeout 0" should do it.
> >
> > In general, the nbd driver is pretty unstable in older kernels.
> > Timeout handling is just one example so I would advise upgrading
> > to a recent kernel, e.g. 5.10 LTS.
> >
> > Thanks,
> >
> > Ilya
> >
> > >
> > > On Wed, May 19, 2021 at 10:55 PM Mykola Golub  
> > > wrote:
> > > >
> > > > On Wed, May 19, 2021 at 11:32:04AM +0800, Zhi Zhang wrote:
> > > > > On Wed, May 19, 2021 at 11:19 AM Zhi Zhang 
> > > > > wrote:
> > > > >
> > > > > >
> > > > > > On Tue, May 18, 2021 at 10:58 PM Mykola Golub 
> > > > > > 
> > > > > > wrote:
> > > > > > >
> > > > > > > Could you please provide the full rbd-nbd log? If it is too large 
> > > > > > > for
> > > > > > > the attachment then may be via some public url?
> > > > > >
> > > > > >  ceph.rbd-client.log.bz2
> > > > > > <https://drive.google.com/file/d/1TuiGOrVAgKIJ3BUmiokG0cU12fnlQ3GR/view?usp=drive_web>
> > > > > >
> > > > > > I uploaded it to google driver. Pls check it out.
> > > > >
> > > > > We found the reader_entry thread got zero byte when trying to read 
> > > > > the nbd
> > > > > request header, then rbd-nbd exited and closed the socket. But we 
> > > > > haven't
> > > > > figured out why read zero byte?
> > > >
> > > > Ok. I was hoping to find some hint in the log, why the read from the
> > > > kernel could return without data, but I don't see it.
> > > >
> > > > From experience it could happen when the rbd-nbd got stack or was too
> > > > slow so the kernel failed after timeout, but it looked different in
> > > > the logs AFAIR. Anyway you can try increasing the timeout using
> > > > rbd-nbd --timeout (--io-timeout in newer versions) option. The default
> > > > is 30 sec.
> > > >
> > > > If it does not help, probably you will find a clue increasing the
> > > > kernel debug level for nbd (it seems it is possible to do).
> > > >
> > > > --
> > > > Mykola Golub
> > > > ___
> > > > Dev mailing list -- d...@ceph.io
> > > > To unsubscribe send an email to dev-le...@ceph.io
> > > ___
> > > ceph-users mailing list -- ceph-users@ceph.io
> > > To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: quincy v17.2.7 QE Validation status

2023-10-25 Thread Ilya Dryomov
On Mon, Oct 23, 2023 at 5:15 PM Yuri Weinstein  wrote:
>
> If no one has anything else left, we have all issues resolved and
> ready for the 17.2.7 release

A last-minute issue with the exporter daemon [1][2] necessitated a revert
[3].  The 17.2.7 builds will need to be respun; since the tag created
by Jenkins hasn't been merged and packages haven't been pushed, there is
no further impact.

The lack of test coverage in this area was brought up in the CLT call
earlier today.  I have bumped [4] by summarizing the history there.

[1] https://github.com/ceph/ceph/pull/54153#discussion_r1369834098
[2] https://github.com/ceph/ceph/pull/50749#pullrequestreview-1694336396
[3] https://github.com/ceph/ceph/pull/54169
[4] https://tracker.ceph.com/issues/59561

Thanks,

Ilya

>
> On Mon, Oct 23, 2023 at 8:12 AM Laura Flores  wrote:
> >
> > Regarding the crash in quincy-p2p (tracked in
> > https://tracker.ceph.com/issues/63257), @Prashant Dhange
> >  and I evaluated it, and we've concluded it isn't a
> > blocker for 17.2.7.
> >
> > So, quincy-p2p is approved.
> >
> > Thanks,
> > Laura
> >
> >
> >
> > On Sat, Oct 21, 2023 at 12:27 AM Venky Shankar  wrote:
> >
> > > Hi Yuri,
> > >
> > > On Fri, Oct 20, 2023 at 9:44 AM Venky Shankar  wrote:
> > > >
> > > > Hi Yuri,
> > > >
> > > > On Thu, Oct 19, 2023 at 10:48 PM Venky Shankar 
> > > wrote:
> > > > >
> > > > > Hi Yuri,
> > > > >
> > > > > On Thu, Oct 19, 2023 at 9:32 PM Yuri Weinstein 
> > > wrote:
> > > > > >
> > > > > > We are still finishing off:
> > > > > >
> > > > > > - revert PR https://github.com/ceph/ceph/pull/54085, needs smoke
> > > suite rerun
> > > > > > - removed s3tests https://github.com/ceph/ceph/pull/54078 merged
> > > > > >
> > > > > > Venky, Casey FYI
> > > > >
> > > > > https://github.com/ceph/ceph/pull/53139 is causing a smoke test
> > > > > failure. Details:
> > > > > https://github.com/ceph/ceph/pull/53139#issuecomment-1771388202
> > > > >
> > > > > I've sent a revert for that change -
> > > > > https://github.com/ceph/ceph/pull/54108 - will let you know when it's
> > > > > ready for testing.
> > > >
> > > > smoke passes with this revert
> > > >
> > > >
> > > https://pulpito.ceph.com/vshankar-2023-10-19_20:24:36-smoke-wip-vshankar-testing-quincy-20231019.172112-testing-default-smithi/
> > > >
> > > > fs suite running now...
> > >
> > > Test results are here -
> > > https://tracker.ceph.com/projects/cephfs/wiki/Quincy#2023-October-19
> > >
> > > Yuri, please merge change - https://github.com/ceph/ceph/pull/54108
> > >
> > > and consider this as "fs approved".
> > >
> > > >
> > > > >
> > > > > >
> > > > > > On Wed, Oct 18, 2023 at 9:07 PM Venky Shankar 
> > > wrote:
> > > > > > >
> > > > > > > On Tue, Oct 17, 2023 at 12:23 AM Yuri Weinstein <
> > > ywein...@redhat.com> wrote:
> > > > > > > >
> > > > > > > > Details of this release are summarized here:
> > > > > > > >
> > > > > > > > https://tracker.ceph.com/issues/63219#note-2
> > > > > > > > Release Notes - TBD
> > > > > > > >
> > > > > > > > Issue https://tracker.ceph.com/issues/63192 appears to be
> > > failing several runs.
> > > > > > > > Should it be fixed for this release?
> > > > > > > >
> > > > > > > > Seeking approvals/reviews for:
> > > > > > > >
> > > > > > > > smoke - Laura
> > > > > > >
> > > > > > > There's one failure in the smoke tests
> > > > > > >
> > > > > > >
> > > https://pulpito.ceph.com/yuriw-2023-10-18_14:58:31-smoke-quincy-release-distro-default-smithi/
> > > > > > >
> > > > > > > caused by
> > > > > > >
> > > > > > > https://github.com/ceph/ceph/pull/53647
> > > > > > >
> > > > > > > (which was marked DNM but got merged). However, it's a test case
> > > thing
> > > > > > > and we can live with it.
> > > > > > >
> > > > > > > Yuri mention in slack that he might do another round of
> > > build/tests,
> > > > > > > so, Yuri, here's the reverted change:
> > > > > > >
> > > > > > >https://github.com/ceph/ceph/pull/54085
> > > > > > >
> > > > > > > > rados - Laura, Radek, Travis, Ernesto, Adam King
> > > > > > > >
> > > > > > > > rgw - Casey
> > > > > > > > fs - Venky
> > > > > > > > orch - Adam King
> > > > > > > >
> > > > > > > > rbd - Ilya
> > > > > > > > krbd - Ilya
> > > > > > > >
> > > > > > > > upgrade/quincy-p2p - Known issue IIRC, Casey pls confirm/approve
> > > > > > > >
> > > > > > > > client-upgrade-quincy-reef - Laura
> > > > > > > >
> > > > > > > > powercycle - Brad pls confirm
> > > > > > > >
> > > > > > > > ceph-volume - Guillaume pls take a look
> > > > > > > >
> > > > > > > > Please reply to this email with approval and/or trackers of 
> > > > > > > > known
> > > > > > > > issues/PRs to address them.
> > > > > > > >
> > > > > > > > Josh, Neha - gibba and LRC upgrades -- N/A for quincy now after
> > > reef release.
> > > > > > > >
> > > > > > > > Thx
> > > > > > > > YuriW
> > > > > > > > ___
> > > > > > > > ceph-users mailing list -- ceph-users@ceph.io
> > > > > > > > To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Re: reef 18.2.1 QE Validation status

2023-11-06 Thread Ilya Dryomov
On Mon, Nov 6, 2023 at 10:31 PM Yuri Weinstein  wrote:
>
> Details of this release are summarized here:
>
> https://tracker.ceph.com/issues/63443#note-1
>
> Seeking approvals/reviews for:
>
> smoke - Laura, Radek, Prashant, Venky (POOL_APP_NOT_ENABLE failures)
> rados - Neha, Radek, Travis, Ernesto, Adam King
> rgw - Casey
> fs - Venky
> orch - Adam King
> rbd - Ilya
> krbd - Ilya

rbd and krbd approved (there are some dead jobs caused by infra
issue(s) even in the reruns, but overall nothing suspicious).

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: per-rbd snapshot limitation

2023-11-15 Thread Ilya Dryomov
On Wed, Nov 15, 2023 at 5:57 PM Wesley Dillingham  
wrote:
>
> looking into how to limit snapshots at the ceph level for RBD snapshots.
> Ideally ceph would enforce an arbitrary number of snapshots allowable per
> rbd.
>
> Reading the man page for rbd command I see this option:
> https://docs.ceph.com/en/quincy/man/8/rbd/#cmdoption-rbd-limit
>
> --limit
>
> Specifies the limit for the number of snapshots permitted.
>
> Seems perfect. But on attempting to use it as such I get an error:
>
> admin@rbdtest:~$ rbd create testpool/test3 --size=100M --limit=3
> rbd: unrecognised option '--limit=3'
>
> Where am I going wrong here? Is there another way to enforce a limit of
> snapshots for RBD? Thanks.

Hi Wes,

I think you want "rbd snap limit set --limit 3 testpool/test3".
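
Something along these lines (snapshot names are made up, and the exact
error text may differ between releases):

$ rbd snap limit set --limit 3 testpool/test3
$ rbd snap create testpool/test3@s1      # s2 and s3 work the same way
$ rbd snap create testpool/test3@s4      # expected to fail with EDQUOT once 3 snapshots exist
$ rbd snap limit clear testpool/test3    # removes the limit again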

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?

2023-11-16 Thread Ilya Dryomov
On Thu, Nov 16, 2023 at 3:21 AM Xiubo Li  wrote:
>
> Hi Matt,
>
> On 11/15/23 02:40, Matt Larson wrote:
> > On CentOS 7 systems with the CephFS kernel client, if the data pool has a
> > `nearfull` status there is a slight reduction in write speeds (possibly
> > 20-50% fewer IOPS).
> >
> > On a similar Rocky 8 system with the CephFS kernel client, if the data pool
> > has `nearfull` status, a similar test shows write speeds at different block
> > sizes shows the IOPS < 150 bottlenecked vs the typical write
> > performance that might be with 2-3 IOPS at a particular block size.
> >
> > Is there any way to avoid the extremely bottlenecked IOPS seen on the Rocky
> > 8 system CephFS kernel clients during the `nearfull` condition or to have
> > behavior more similar to the CentOS 7 CephFS clients?
> >
> > Do different OS or Linux kernels have greatly different ways they respond
> > or limit on the IOPS? Are there any options to adjust how they limit on
> > IOPS?
>
> Just to be clear that the kernel on CentOS 7 is lower than the kernel on
> Rocky 8, they may behave differently someway. BTW, are the ceph versions
> the same for your test between CentOS 7 and Rocky 8 ?
>
> I saw in libceph.ko there has some code will handle the OSD FULL case,
> but I didn't find the near full case, let's get help from Ilya about this.
>
> @Ilya,
>
> Do you know will the osdc will behave differently when it detects the
> pool is near full ?

Hi Xiubo,

It's not libceph or osdc, but CephFS itself.  I think Matt is running
up against this fix:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=7614209736fbc4927584d4387faade4f31444fce

It was previously discussed in detail here:

https://lore.kernel.org/ceph-devel/caoi1vp_k2ybx9+jffmuhcuxsyngftqjyh+frusyy4ureprk...@mail.gmail.com/

The solution is to add additional capacity or bump the nearfull
threshold:

https://lore.kernel.org/ceph-devel/23f46ca6dd1f45a78beede92fc91d...@mpinat.mpg.de/
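
If bumping the threshold is the way you go, that is a cluster-wide
setting (0.87 here is just an example -- it has to stay below the
backfillfull and full ratios):

$ ceph osd set-nearfull-ratio 0.87
$ ceph osd dump | grep ratio    # shows full_ratio, backfillfull_ratio, nearfull_ratio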

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Different behaviors for ceph kernel client in limiting IOPS when data pool enters `nearfull`?

2023-11-16 Thread Ilya Dryomov
On Thu, Nov 16, 2023 at 5:26 PM Matt Larson  wrote:
>
> Ilya,
>
>  Thank you for providing these discussion threads on the Kernel fixes for 
> where there was a change and details on this affects the clients.
>
>  What is the expected behavior in CephFS client when there are multiple data 
> pools in the CephFS? Does having 'nearfull' in any data pool in the CephFS 
> then trigger the synchronous writes for clients even if they would be writing 
> to a CephFS location mapped to a non-nearfull data pool? I.e. is 'nearfull' / 
> sync behavior global across the same CephFS filesystem?

I would expect it to apply only to the pool in question (i.e. not be
global), but let's get Xiubo or someone else working on CephFS to
confirm.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: understand "extent"

2023-11-27 Thread Ilya Dryomov
On Sat, Nov 25, 2023 at 4:19 AM Tony Liu  wrote:
>
> Hi,
>
> The context is RBD on bluestore. I did check extent on Wiki.
> I see "extent" when talking about snapshot and export/import.
> For example, when create a snapshot, we mark extents. When
> there is write to marked extents, we will make a copy.
> I also know that user data on block device maps to objects.

Hi Tony,

An extent is simply a byte range, often defined by the offset where it
starts and a length (denoted as offset~length).

> How "extent" and "object" are related?

In the context of RBD there are several types of extents, but the two
main ones are image extents and object extents.  An image extent is at
an offset into the image, while an object extent is at an offset into
a particular object.  Any range in the image can be referred to using
either type of extent.  For example, assuming default striping:

  image extent 9437184~4096

and

  object 2 extent 1048576~4096

refer to the same 4096-byte block (objects are 0-indexed).  Similarly:

  image extent 12578816~8192

and

  object 2 extent 4190208~4096
  object 3 extent 0~4096

refer to the same 8192-byte block, but in this case the image extent
corresponds to two object extents.
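
For a quick sanity check of that mapping (assuming the default 4 MiB,
i.e. 4194304-byte, object size), the object number and the offset into
that object are just the quotient and remainder of the image offset:

$ echo $((9437184 / 4194304)) $((9437184 % 4194304))
2 1048576
$ echo $((12578816 / 4194304)) $((12578816 % 4194304))
2 4190208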

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space reclaim doesn't happening in nautilus RBD pool

2023-11-30 Thread Ilya Dryomov
On Thu, Nov 30, 2023 at 8:25 AM Szabo, Istvan (Agoda)
 wrote:
>
> Hi,
>
> Is there any config on Ceph that block/not perform space reclaim?
> I test on one pool which has only one image 1.8 TiB in used.
>
>
> rbd $p du im/root
> warning: fast-diff map is not enabled for root. operation may be slow.
> NAMEPROVISIONED USED
> root 2.2 TiB 1.8 TiB
>
>
>
> I already removed all snaphots and now pool has only one image alone.
> I run both fstrim  over the filesystem (XFS) and try rbd sparsify im/root  
> (don't know what it is exactly but it mentions to reclaim something)
> It still shows the pool used 6.9 TiB which totally not make sense right? It 
> should be up to 3.6 (1.8 * 2) according to its replica?

Hi Istvan,

Have you checked RBD trash?

$ rbd trash ls -p im

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Space reclaim doesn't happening in nautilus RBD pool

2023-12-04 Thread Ilya Dryomov
Hi Istvan,

The number of objects in "im" pool (918.34k) doesn't line up with
"rbd du" output which says that only 2.2T are provisioned (that would
take roughly ~576k objects).  This usually occurs when there are object
clones caused by previous snapshots -- keep in mind that trimming
object clones after a snapshot is removed is an asynchronous process
and it can take a while.
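
(For reference, that estimate is just the provisioned size divided by
the default 4 MiB object size: 2.2 TiB / 4 MiB = 2.2 * 2^40 / 2^22
= 2.2 * 262144 ~= 576,717 objects.)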

Just to confirm, what is the output of "rbd info im/root",
"rbd snap ls --all im/root", "ceph df" (please recapture) and
"ceph osd pool ls detail" (only "im" pool is of interest)?

Thanks,

Ilya

On Fri, Dec 1, 2023 at 5:31 AM Szabo, Istvan (Agoda)
 wrote:
>
> Thrash empty.
>
> Istvan Szabo
> Staff Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> -------
>
>
>
> 
> From: Ilya Dryomov 
> Sent: Thursday, November 30, 2023 6:27 PM
> To: Szabo, Istvan (Agoda) 
> Cc: Ceph Users 
> Subject: Re: [ceph-users] Space reclaim doesn't happening in nautilus RBD pool
>
> Email received from the internet. If in doubt, don't click any link nor open 
> any attachment !
> 
>
> On Thu, Nov 30, 2023 at 8:25 AM Szabo, Istvan (Agoda)
>  wrote:
> >
> > Hi,
> >
> > Is there any config on Ceph that block/not perform space reclaim?
> > I test on one pool which has only one image 1.8 TiB in used.
> >
> >
> > rbd $p du im/root
> > warning: fast-diff map is not enabled for root. operation may be slow.
> > NAMEPROVISIONED USED
> > root 2.2 TiB 1.8 TiB
> >
> >
> >
> > I already removed all snaphots and now pool has only one image alone.
> > I run both fstrim  over the filesystem (XFS) and try rbd sparsify im/root  
> > (don't know what it is exactly but it mentions to reclaim something)
> > It still shows the pool used 6.9 TiB which totally not make sense right? It 
> > should be up to 3.6 (1.8 * 2) according to its replica?
>
> Hi Istvan,
>
> Have you checked RBD trash?
>
> $ rbd trash ls -p im
>
> Thanks,
>
> Ilya
>
> 
>
> This message is confidential and is for the sole use of the intended 
> recipient(s). It may also be privileged or otherwise protected by copyright 
> or other legal rules. If you have received it by mistake please let us know 
> by reply email and delete it from your system. It is prohibited to copy this 
> message or disclose its content to anyone. Any confidentiality or privilege 
> is not waived or lost by any mistaken delivery or unauthorized disclosure of 
> the message. All messages sent to and from Agoda may be monitored to ensure 
> compliance with company policies, to protect the company's interests and to 
> remove potential malware. Electronic messages may be intercepted, amended, 
> lost or deleted, or contain viruses.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: the image used size becomes 0 after export/import with snapshot

2023-12-04 Thread Ilya Dryomov
On Tue, Nov 28, 2023 at 8:18 AM Tony Liu  wrote:
>
> Hi,
>
> I have an image with a snapshot and some changes after snapshot.
> ```
> $ rbd du backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26
> NAME  
>   PROVISIONED  USED
> f0408e1e-06b6-437b-a2b5-70e3751d0a26@snapshot-eb085877-7557-4620-9c01-c5587b857029
>10 GiB  2.4 GiB
> f0408e1e-06b6-437b-a2b5-70e3751d0a26  
>10 GiB  2.4 GiB
>
>10 GiB  4.8 GiB
> ```
> If there is no changes after snapshot, the image line will show 0 used.
>
> I did export and import.
> ```
> $ rbd export --export-format 2 backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - 
> | rbd import --export-format 2 - backup/test
> Exporting image: 100% complete...done.
> Importing image: 100% complete...done.
> ```
>
> When check the imported image, the image line shows 0 used.
> ```
> $ rbd du backup/test
> NAMEPROVISIONED  USED
> test@snapshot-eb085877-7557-4620-9c01-c5587b857029   10 GiB  2.4 GiB
> test 10 GiB  0 B
>   10 GiB  2.4 GiB
> ```
> Any clues how that happened? I'd expect the same du as the source.

Hi Tony,

"rbd import" command does zero detection at 4k granularity by default.
If the "after snapshot" changes just zeroed everything in the snapshot,
such a discrepancy in "rbd du" USED column is expected.
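
(If I remember correctly, the detection granularity can be tuned with
the --sparse-size option of "rbd import".)  If you want to convince
yourself that no data was lost, comparing full exports is a crude but
reliable check -- it can take a while on large images:

$ rbd export backup/f0408e1e-06b6-437b-a2b5-70e3751d0a26 - | md5sum
$ rbd export backup/test - | md5sum      # should print the same checksum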

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is there any way to merge an rbd image's full backup and a diff?

2023-12-12 Thread Ilya Dryomov
On Tue, Dec 12, 2023 at 1:03 AM Satoru Takeuchi
 wrote:
>
> Hi,
>
> I'm developing RBD images' backup system. In my case, a backup data
> must be stored at least two weeks. To meet this requirement, I'd like
> to take backups as follows:
>
> 1. Take a full backup by rbd export first.
> 2. Take a differencial backups everyday.
> 3. Merge the full backup and the oldest (taken two weeks ago) diff.
>
> As a result of evaluation, I confirmed there is no problem in step 1
> and 2. However,
> I found that step 3 couldn't be accomplished by `rbd merge-diff  backup> `
> because `rbd merge-diff` only accepts a diff as a first parameter. Is
> there any way
> to merge a full backup and a diff?

Hi Satoru,

Not at the moment.  Mykola has an old work-in-progress PR which extends
the "rbd import-diff" command to make this possible [1].  Since you as
a user expected "rbd merge-diff" to be able to do this, I wonder if this
functionality might be better placed under "rbd merge-diff"?  That way
the operations on files would be separated from the operations on RBD
images.

For your backup system, couldn't you just merge the two oldest
differentials with "rbd merge-diff" instead?  Instead of advancing the
full export, you would be advancing the first differential -- it would
represent a diff between the initial export and the "2 weeks" backup
over time.

[1] https://github.com/ceph/ceph/pull/41375

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Is there any way to merge an rbd image's full backup and a diff?

2023-12-13 Thread Ilya Dryomov
On Wed, Dec 13, 2023 at 12:48 AM Satoru Takeuchi
 wrote:
>
> Hi Ilya,
>
> 2023年12月12日(火) 21:23 Ilya Dryomov :
> > Not at the moment.  Mykola has an old work-in-progress PR which extends
> > "rbd import-diff" command to make this possible [1].
>
> I didn't know this PR. Thank you very much.I'll evaluate this PR later.
>
> > Since you as
> > a user expected "rbd merge-diff" to be able to this, I wonder if this
> > functionality might be better placed under "rbd merge-diff"?  That way
> > the operations on files would be separated from the operations on RBD
> > images.
>
> Yes, I think so too.
>
> >
> > For your backup system, couldn't you just merge the two oldest
> > differentials with "rbd merge-diff" instead?  Instead of advancing the
> > full export, you would be advancing the first differential -- it would
> > represent a diff between the initial export and the "2 weeks" backup
> > over time.
>
> Yes, it's possible. It's one of a workaround I thought. Then the
> backup data are as follows:
>
> a. The full backup taken at least 14 days ago.
> b. The latest 14 days backup data

I think it would be:

a. A full backup (taken potentially months ago, exact age doesn't
   really matter)
b. Differential #1 - diff from the full backup to the 14 days old version
c. Differential #2 - diff from the 14 days old to the 13 days old version
d. Differential #3 - diff from the 13 days old to the 12 days old version
...

Every day after a new differential is taken, (b) and (c) would be
merged, keeping the number of differentials constant.
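
A rough sketch of the daily roll-up, with made-up snapshot and file
names:

$ rbd export-diff --from-snap day-1 backup/img@day-0 day-1_to_day-0.diff   # today's differential
$ rbd merge-diff base_to_14d.diff 14d_to_13d.diff base_to_13d.diff         # fold (b) and (c) together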

>
> In this case, I concern the effect if (a) becomes too old. However, it
> might be a groundless fear.

I would suggest re-capturing the full backup from time to time, just as
a precaution against something going wrong with a backup based on too
long a series of (merged) differentials.  It might be a groundless
concern, but you can't be too careful when it comes to backups.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd trash: snapshot id is protected from removal

2023-12-15 Thread Ilya Dryomov
On Fri, Dec 15, 2023 at 12:52 PM Eugen Block  wrote:
>
> Hi,
>
> I've been searching and trying things but to no avail yet.
> This is uncritical because it's a test cluster only, but I'd still
> like to have a solution in case this somehow will make it into our
> production clusters.
> It's an Openstack Victoria Cloud with Ceph backend. If one tries to
> remove a glance image (openstack image delete {UUID}' which usually
> has a protected snapshot it will fail to do so, but apparently the
> snapshot is actually moved to the trash namespace. And since it is
> protected, I can't remove it:
>
> storage01:~ # rbd -p images snap ls 278ffe2b-67a7-40d0-87b7-903f2fc9c3b4 --all
> SNAPID  NAME  SIZEPROTECTED
> TIMESTAMP NAMESPACE
> 159  1a97db13-307e-4820-8dc2-8549e9ba1ad7  39 MiB Thu
> Dec 14 08:29:56 2023  trash (snap)
>
> storage01:~ # rbd snap rm --snap-id 159
> images/278ffe2b-67a7-40d0-87b7-903f2fc9c3b4
> rbd: snapshot id 159 is protected from removal.
>
> storage01:~ # rbd snap ls images/278ffe2b-67a7-40d0-87b7-903f2fc9c3b4
> storage01:~ #
>
> This is a small image and only a test environment, but these orphans
> could potentially fill up lots of space. In a newer openstack version
> (I tried with Antelope) this doesn't seem to work like that anymore,
> so that's good. But how would I get rid of that trash snapshot in this
> cluster?

Hi Eugen,

This means that there is at least one clone based off of that snapshot.
You should be able to identify it with:

$ rbd children --all --snap-id 159 images/278ffe2b-67a7-40d0-87b7-903f2fc9c3b4

Get rid of the clone(s) and the snapshot should get removed
automatically.
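
Depending on whether you need to keep the clone's data, either flatten
it (which detaches it from the snapshot) or just remove it -- with
<clone> being whatever "rbd children" reports:

$ rbd flatten images/<clone>    # keep the clone as a standalone image
$ rbd rm images/<clone>         # or drop it entirely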

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd persistent cache configuration

2024-01-05 Thread Ilya Dryomov
On Thu, Jan 4, 2024 at 4:41 PM Peter  wrote:
>
> I follow below document to setup image level rbd persistent cache,
> however I get error output while i using the command provide by the document.
> I have put my commands and descriptions below.
> Can anyone give some instructions? thanks in advance.
>
> https://docs.ceph.com/en/pacific/rbd/rbd-persistent-write-back-cache/
>
> https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/5/html/block_device_guide/ceph-block-devices#enabling-persistent-write-log-cache_block
> I tried use host level client command, i got no error, however I won't be 
> able to get cache usage output.
> "ceph config set client rbd_persistent_cache_mode ssd
> ceph config set client rbd_plugins pwl_cache"
>
>
>
> [root@master-node1 ceph]# rbd info sas-pool/testdrive
> rbd image 'testdrive':
> size 40 GiB in 10240 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 3de76a7e7c519
> block_name_prefix: rbd_data.3de76a7e7c519
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff, 
> deep-flatten
> op_features:
> flags:
> create_timestamp: Thu Jun 29 02:03:41 2023
> access_timestamp: Thu Jun 29 07:19:40 2023
> modify_timestamp: Thu Jun 29 07:18:00 2023
>
> I check feature  exclusive-lock has been already enabled
> and I run following command get fault output.
> [root@master-node1 ceph]# rbd config image set sas-pool/testdrive 
> rbd_persistent_cache_mode ssd
> rbd: invalid config key: rbd_persistent_cache_mode
>
> [root@master-node1 ceph]# rbd config image set sas-pool/testdrive rbd_plugins 
> pwl_cache
> rbd: invalid config key: rbd_plugins

Hi Peter,

What is the output of "rbd --version" on this node?

Were "ceph config set client rbd_persistent_cache_mode ssd" and "ceph
config set client rbd_plugins pwl_cache" above ran on a different node?

>
> root@node1:~# rbd status sas-pool/testdrive
> Watchers:
> watcher=10.1.254.51:0/1544956346 client.39553300 
> cookie=140244238214096
>
>
> I hope to see the output include the persistent cache state like below:
>
> $ rbd status rbd/foo
> Watchers:
> watcher=10.10.0.102:0/1061883624 client.25496 cookie=140338056493088
> Persistent cache state:
> host: sceph9
> path: /mnt/nvme0/rbd-pwl.rbd.101e5824ad9a.pool
> size: 1 GiB
> mode: ssd
> stats_timestamp: Sun Apr 10 13:26:32 2022
> present: true   empty: falseclean: false
> allocated: 509 MiB
> cached: 501 MiB
> dirty: 338 MiB
> free: 515 MiB
> hits_full: 1450 / 61%
> hits_partial: 0 / 0%
> misses: 924
> hit_bytes: 192 MiB / 66%
> miss_bytes: 97 MiB

Normally, the output that you are expecting would be there only while
the image is open.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd persistent cache configuration

2024-01-06 Thread Ilya Dryomov
On Sat, Jan 6, 2024 at 12:02 AM Peter  wrote:
>
> Thanks for ressponse! Yes, it is in use
>
> "watcher=10.1.254.51:0/1544956346 client.39553300 cookie=140244238214096" 
> this is indicating the client is connect the image.
> I am using fio perform write task on it.
>
> I guess it is the feature not enable correctly or setting somewhere 
> incorrect. Should I restart any process after modifying Ceph config?
>
> Any thought?

You haven't answered my question on versions.  Given "rbd: invalid
config key: ..." errors, it's very plausible that the cache is not
enabled correctly or even not supported at all.

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd persistent cache configuration

2024-01-08 Thread Ilya Dryomov
On Mon, Jan 8, 2024 at 10:43 PM Peter  wrote:
>
>  rbd --version
> ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus 
> (stable)

Hi Peter,

The PWL cache was introduced in Pacific (16.2.z).
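
For reference, on a Pacific or later client the settings you tried
should be accepted.  A minimal sketch (the cache path is just an
example and the directory has to exist on the client host):

$ ceph config set client rbd_plugins pwl_cache
$ ceph config set client rbd_persistent_cache_mode ssd
$ ceph config set client rbd_persistent_cache_path /mnt/nvme0/rbd-pwl
$ rbd config image set sas-pool/testdrive rbd_persistent_cache_mode ssd   # per-image override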

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: rbd map snapshot, mount lv, node crash

2024-01-19 Thread Ilya Dryomov
On Fri, Jan 19, 2024 at 2:38 PM Marc  wrote:
>
> Am I doing something weird when I do on a ceph node (nautilus, el7):
>
> rbd snap ls vps-test -p rbd
> rbd map vps-test@vps-test.snap1 -p rbd
>
> mount -o ro /dev/mapper/VGnew-LVnew /mnt/disk <--- reset/reboot ceph node

Hi Marc,

It's not clear where /dev/mapper/VGnew-LVnew points to, but in
principle this should work.  There is nothing weird about mapping
a snapshot, so I'd suggest capturing the kernel panic splat for
debugging.
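
If the node comes back up on its own, the previous boot's kernel log
may already have the splat (assuming persistent journald); otherwise
netconsole or a serial console is the usual way to catch it:

$ journalctl -k -b -1 | tail -n 100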

Thanks,

Ilya
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [DOC] Openstack with RBD DOC update?

2024-01-24 Thread Ilya Dryomov
On Wed, Jan 24, 2024 at 7:31 PM Eugen Block  wrote:
>
> We do like the separation of nova pools as well, and we also heavily
> use ephemeral disks instead of boot-from-volume instances. One of the
> reasons being that you can't detach a root volume from an instances.
> It helps in specific maintenance cases, so +1 for keeping it in the
> docs.

So it seems that instead of dropping mentions of the vms pool, we should
expand the "Configuring Nova" section where it says

In order to boot virtual machines directly from Ceph volumes, you
must configure the ephemeral backend for Nova.

with appropriate steps and /etc/nova/nova.conf snippet.  I'm guessing

images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf

at a minimum?
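
For reference, a fuller snippet as it typically appears in a working
deployment (the rbd_user and secret UUID obviously depend on the local
setup):

[libvirt]
images_type = rbd
images_rbd_pool = vms
images_rbd_ceph_conf = /etc/ceph/ceph.conf
rbd_user = cinder
rbd_secret_uuid = <uuid of the libvirt secret>
disk_cachemodes = "network=writeback"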

Erik or Eugen, do you want to suggest a precise edit based on your
working configuration for Zac to incorporate, or perhaps even open a PR
directly?

Thanks,

Ilya

>
> Zitat von Erik McCormick :
>
> > On Wed, Jan 24, 2024 at 10:02 AM Murilo Morais 
> > wrote:
> >
> >>  Good afternoon everybody!
> >>
> >> I have a question regarding the documentation... I was reviewing it and
> >> realized that the "vms" pool is not being used anywhere in the configs.
> >>
> >> The first mention of this pool was in commit 2eab1c1 and, in e9b13fa, the
> >> configuration section of nova.conf was removed, but the pool configuration
> >> remained there.
> >>
> >> Would it be correct to ignore all mentions of this pool (I don't see any
> >> use for it)? If so, it would be interesting to update the documentation.
> >>
> >> https://docs.ceph.com/en/latest/rbd/rbd-openstack/#create-a-pool
> >
> >
> > The use of that "vms" pool is for Nova to directly store "ephemeral" disks
> > in ceph instead of on local disk. It used to be described in the Ceph doc,
> > but seems to no longer be there. It's still in the Redhat version [1]
> > however. Wouldn't it be better to put that back instead of removing the
> > creation of the vms pool from the docs? Maybe there's a good reason we only
> > want to boot instances into volumes now, but I'm not aware of it.
> >
> > [1] - Section 3.4.3 of
> > https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/3/html-single/ceph_block_device_to_openstack_guide/index
> >
> > -Erik
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
>
>
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

