Re: [ceph-users] CRUSH puzzle: step weighted-take

2018-09-28 Thread Dan van der Ster
On Thu, Sep 27, 2018 at 6:34 PM Luis Periquito  wrote:
>
> I think your objective is to move the data without anyone else
> noticing. What I usually do is reduce the priority of the recovery
> process as much as possible. Do note this will make the recovery take
> a looong time, and will also make recovery from failures slow...
> ceph tell osd.* injectargs '--osd_recovery_sleep 0.9'
> ceph tell osd.* injectargs '--osd-max-backfills 1'
> ceph tell osd.* injectargs '--osd-recovery-op-priority 1'
> ceph tell osd.* injectargs '--osd-client-op-priority 63'
> ceph tell osd.* injectargs '--osd-recovery-max-active 1'
> ceph tell osd.* injectargs '--osd_recovery_max_chunk 524288'
>
> I would also assume you have set osd_scrub_during_recovery to false.
>

Thanks Luis -- that will definitely be how we backfill if we go that
route. However I would prefer to avoid one big massive change that
takes a long time to complete.
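
(For the record, roughly how I'd apply and double-check those throttles
before kicking off a backfill -- a sketch only, assuming the admin
sockets are reachable on the OSD hosts and using osd.0 just as an example:)

  # confirm the injected values actually took effect on a running OSD
  ceph daemon osd.0 config get osd_recovery_sleep
  ceph daemon osd.0 config get osd_max_backfills
  ceph daemon osd.0 config show | grep -E 'recovery|backfill'

  # and make sure scrubbing won't compete with the backfill
  ceph tell osd.* injectargs '--osd_scrub_during_recovery false'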

- dan

>
>
> On Thu, Sep 27, 2018 at 4:19 PM Dan van der Ster  wrote:
> >
> > Dear Ceph friends,
> >
> > I have a CRUSH data migration puzzle and wondered if someone could
> > think of a clever solution.
> >
> > Consider an osd tree like this:
> >
> >   -2   4428.02979     room 0513-R-0050
> >  -72    911.81897         rack RA01
> >   -4    917.27899         rack RA05
> >   -6    917.25500         rack RA09
> >   -9    786.23901         rack RA13
> >  -14    895.43903         rack RA17
> >  -65   1161.16003     room 0513-R-0060
> >  -71    578.76001         ipservice S513-A-IP38
> >  -70    287.56000             rack BA09
> >  -80    291.20001             rack BA10
> >  -76    582.40002         ipservice S513-A-IP63
> >  -75    291.20001             rack BA11
> >  -78    291.20001             rack BA12
> >
> > In the beginning, for reasons that are not important, we created two pools:
> >   * poolA chooses room=0513-R-0050 then replicates 3x across the racks.
> >   * poolB chooses room=0513-R-0060, replicates 2x across the
> > ipservices, then puts a 3rd replica in room 0513-R-0050.
> >
> > For clarity, here is the crush rule for poolB:
> > type replicated
> > min_size 1
> > max_size 10
> > step take 0513-R-0060
> > step chooseleaf firstn 2 type ipservice
> > step emit
> > step take 0513-R-0050
> > step chooseleaf firstn -2 type rack
> > step emit
> >
> > Now to the puzzle.
> > For reasons that are not important, we now want to change the rule for
> > poolB to put all three 3 replicas in room 0513-R-0060.
> > And we need to do this in a way which is totally non-disruptive
> > (latency-wise) to the users of either pools. (These are both *very*
> > active RBD pools).
> >
> > I see two obvious ways to proceed:
> >   (1) simply change the rule for poolB to put a third replica on any
> > osd in room 0513-R-0060. I'm afraid though that this would involve way
> > too many concurrent backfills, cluster-wide, even with
> > osd_max_backfills=1.
> >   (2) change poolB size to 2, then change the crush rule to that from
> > (1), then reset poolB size to 3. This would risk data availability
> > during the time that the pool is size=2, and also risks that every osd
> > in room 0513-R-0050 would be too busy deleting for some indeterminate
> > time period (10s of minutes, I expect).
> >
> > So I would probably exclude those two approaches.
> >
> > Conceptually what I'd like to be able to do is a gradual migration,
> > which if I may invent some syntax on the fly...
> >
> > Instead of
> >step take 0513-R-0050
> > do
> >step weighted-take 99 0513-R-0050 1 0513-R-0060
> >
> > That is, 99% of the time take room 0513-R-0050 for the 3rd copies, 1%
> > of the time take room 0513-R-0060.
> > With a mechanism like that, we could gradually adjust those "step
> > weighted-take" lines until 100% of the 3rd copies were in 0513-R-0060.
> >
> > I have a feeling that something equivalent to that is already possible
> > with weight-sets or some other clever crush trickery.
> > Any ideas?
> >
> > Best Regards,
> >
> > Dan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CRUSH puzzle: step weighted-take

2018-09-28 Thread Dan van der Ster
On Thu, Sep 27, 2018 at 9:57 PM Maged Mokhtar  wrote:
>
>
>
> On 27/09/18 17:18, Dan van der Ster wrote:
> > Dear Ceph friends,
> >
> > I have a CRUSH data migration puzzle and wondered if someone could
> > think of a clever solution.
> >
> > Consider an osd tree like this:
> >
> >    -2   4428.02979     room 0513-R-0050
> >   -72    911.81897         rack RA01
> >    -4    917.27899         rack RA05
> >    -6    917.25500         rack RA09
> >    -9    786.23901         rack RA13
> >   -14    895.43903         rack RA17
> >   -65   1161.16003     room 0513-R-0060
> >   -71    578.76001         ipservice S513-A-IP38
> >   -70    287.56000             rack BA09
> >   -80    291.20001             rack BA10
> >   -76    582.40002         ipservice S513-A-IP63
> >   -75    291.20001             rack BA11
> >   -78    291.20001             rack BA12
> >
> > In the beginning, for reasons that are not important, we created two pools:
> >* poolA chooses room=0513-R-0050 then replicates 3x across the racks.
> >* poolB chooses room=0513-R-0060, replicates 2x across the
> > ipservices, then puts a 3rd replica in room 0513-R-0050.
> >
> > For clarity, here is the crush rule for poolB:
> >  type replicated
> >  min_size 1
> >  max_size 10
> >  step take 0513-R-0060
> >  step chooseleaf firstn 2 type ipservice
> >  step emit
> >  step take 0513-R-0050
> >  step chooseleaf firstn -2 type rack
> >  step emit
> >
> > Now to the puzzle.
> > For reasons that are not important, we now want to change the rule for
> > poolB to put all three 3 replicas in room 0513-R-0060.
> > And we need to do this in a way which is totally non-disruptive
> > (latency-wise) to the users of either pools. (These are both *very*
> > active RBD pools).
> >
> > I see two obvious ways to proceed:
> >(1) simply change the rule for poolB to put a third replica on any
> > osd in room 0513-R-0060. I'm afraid though that this would involve way
> > too many concurrent backfills, cluster-wide, even with
> > osd_max_backfills=1.
> >(2) change poolB size to 2, then change the crush rule to that from
> > (1), then reset poolB size to 3. This would risk data availability
> > during the time that the pool is size=2, and also risks that every osd
> > in room 0513-R-0050 would be too busy deleting for some indeterminate
> > time period (10s of minutes, I expect).
> >
> > So I would probably exclude those two approaches.
> >
> > Conceptually what I'd like to be able to do is a gradual migration,
> > which if I may invent some syntax on the fly...
> >
> > Instead of
> > step take 0513-R-0050
> > do
> > step weighted-take 99 0513-R-0050 1 0513-R-0060
> >
> > That is, 99% of the time take room 0513-R-0050 for the 3rd copies, 1%
> > of the time take room 0513-R-0060.
> > With a mechanism like that, we could gradually adjust those "step
> > weighted-take" lines until 100% of the 3rd copies were in 0513-R-0060.
> >
> > I have a feeling that something equivalent to that is already possible
> > with weight-sets or some other clever crush trickery.
> > Any ideas?
> >
> > Best Regards,
> >
> > Dan
> would it be possible in your case to create a parent datacenter bucket
> to hold both rooms and assign their relative weights there, then for the
> third replica do a step take to this parent bucket ? its not elegant but
> may do the trick.

Hey, that might work! Both rooms are already in the default root:

  -1   5589.18994 root default
  -2   4428.02979     room 0513-R-0050
 -65   1161.16003     room 0513-R-0060
 -71    578.76001         ipservice S513-A-IP38
 -76    582.40002         ipservice S513-A-IP63

so I'll play with a test pool and weighting down room 0513-R-0060 to
see if this can work.
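
(Roughly what I plan to try offline before touching the live cluster --
just a sketch; the rule id below is a placeholder:)

  # grab and decompile the current CRUSH map
  ceph osd getcrushmap -o crushmap.bin
  crushtool -d crushmap.bin -o crushmap.txt

  # edit crushmap.txt: point the 3rd-copy "step take" at the common
  # parent bucket instead of room 0513-R-0050, then recompile and
  # check the resulting mappings before injecting anything
  crushtool -c crushmap.txt -o crushmap.new
  crushtool -i crushmap.new --test --rule 1 --num-rep 3 --show-mappings | head

  # only then:
  ceph osd setcrushmap -i crushmap.new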

Thanks!

-- dan

> The suggested step weighted-take would be more flexible as it can be
> changed on a replica level, but I do not know if you can do this with
> existing code.
>
> Maged
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] CRUSH puzzle: step weighted-take

2018-09-28 Thread Dan van der Ster
On Fri, Sep 28, 2018 at 12:51 AM Goncalo Borges
 wrote:
>
> Hi Dan
>
> Hope to find you ok.
>
> Here goes a suggestion from someone who has been sitting on the sidelines for
> the last 2 years but following stuff as much as possible.
>
> Will weight set per pool help?
>
> This is only possible in luminous but according to the docs there is the 
> possibility to adjust positional weights for devices hosting replicas of 
> objects for a given bucket.

We're running luminous, so weight-sets are indeed in the game.
I need to read the docs in detail to see if it could help...
combining with Maged's idea might be the solution.
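
(From a quick look at the docs, the per-pool weight-set plumbing looks
roughly like this -- untested on my side, and the pool/item names are
just placeholders:)

  # create a positional weight set for the pool (luminous+)
  ceph osd crush weight-set create poolB positional
  ceph osd crush weight-set ls
  ceph osd crush weight-set dump

  # override the placement weight of an item for this pool only
  # (positional sets take one weight per replica position)
  ceph osd crush weight-set reweight poolB osd.12 1.0 1.0 0.1

  # and remove the weight set again if the experiment misbehaves
  ceph osd crush weight-set rm poolB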

Thanks!

dan


> Cheers
> Goncalo
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mimic cluster is offline and not healing

2018-09-28 Thread Stefan Kooman
Quoting by morphin (morphinwith...@gmail.com):
> Good news... :)
> 
> After I tried everything, I decided to re-create my MONs from the OSDs and
> I used the script:
> https://paste.ubuntu.com/p/rNMPdMPhT5/
> 
> And it worked!!!

Congrats!

> I think when 2 servers crashed and came back at the same time, somehow the MONs
> got confused and the maps just got corrupted.
> After re-creation all the MONs had the same map, so it worked.
> But still I don't know how the hell the mons can cause endless 95% I/O ???
> This is a bug anyway, and if you don't want to be left with the problem then do
> not "enable" your mons. Just start them manually! Another tough lesson.

The only time we needed to manually start the mons was at "bootstrap"
time. After a reboot they are brought up by systemd ... and it keeps on
working. Have you rebooted your mon(s) after the manual start?
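
(For reference, "enabled" here just means the usual systemd units -- a
sketch, assuming the standard packaging and that the mon id matches the
short hostname:)

  systemctl enable ceph-mon@$(hostname -s)
  systemctl start ceph-mon@$(hostname -s)
  # or enable the umbrella target covering all local ceph daemons
  systemctl enable ceph.target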

> 
> ceph -s: https://paste.ubuntu.com/p/m3hFF22jM9/
> 
> As you can see below, some of the OSDs are still down. And when I start
> them they don't start.
> Check start log: https://paste.ubuntu.com/p/ZJQG4khdbx/
> Debug log: https://paste.ubuntu.com/p/J3JyGShHym/
> 
> What we can do for the problem?
Apply PR https://github.com/ceph/ceph/pull/24064

I see that you are running Mimic 13.2.1 ... 13.2.2 was released a few
days ago. Not sure if this fix has made it into 13.2.2.

> What is the cause of the problem?

Somehow it looks like you hit this issue:
https://tracker.ceph.com/issues/24866

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/    Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Bluestore DB showing as ssd

2018-09-28 Thread Igor Fedotov

Hi Brett,

Most probably your device is reported as hdd by the kernel; please check 
by running the following:


 cat /sys/block/sdf/queue/rotational

It should be 0 for SSD.


But as far as I know BlueFS (i.e. the DB+WAL stuff) doesn't have any 
specific behavior which depends on this flag, so most probably you 
shouldn't worry.


That does not hold for the 'slow' device and the BlueStore core, though.
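
(If the controller really is hiding an SSD, the flag can be overridden
at the OS level -- a sketch only, using sdf from your metadata; I
believe the OSD re-reports its metadata on restart, so do this before
starting it:)

  cat /sys/block/sdf/queue/rotational       # 1 = rotational, 0 = ssd
  echo 0 > /sys/block/sdf/queue/rotational  # non-persistent override

  # to make it persistent, e.g. a udev rule:
  #   ACTION=="add|change", KERNEL=="sdf", ATTR{queue/rotational}="0"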


Thanks,

Igor



On 9/22/2018 2:24 AM, Brett Chancellor wrote:
Hi all. Quick question about osd metadata information. I have several 
OSDs set up with the data dir on HDD and the db going to a partition on 
ssd. But when I look at the metadata for all the OSDs, it's showing 
the db as "hdd". Does this affect anything? And is there any way to 
change it?


$ sudo ceph osd metadata 1
{
    "id": 1,
    "arch": "x86_64",
    "back_addr": ":6805/2053608",
    "back_iface": "eth0",
    "bluefs": "1",
    "bluefs_db_access_mode": "blk",
    "bluefs_db_block_size": "4096",
    "bluefs_db_dev": "8:80",
    "bluefs_db_dev_node": "sdf",
    "bluefs_db_driver": "KernelDevice",
    "bluefs_db_model": "PERC H730 Mini  ",
    "bluefs_db_partition_path": "/dev/sdf2",
    "bluefs_db_rotational": "1",
    "bluefs_db_size": "266287972352",
*    "bluefs_db_type": "hdd",*
    "bluefs_single_shared_device": "0",
    "bluefs_slow_access_mode": "blk",
    "bluefs_slow_block_size": "4096",
    "bluefs_slow_dev": "253:1",
    "bluefs_slow_dev_node": "dm-1",
    "bluefs_slow_driver": "KernelDevice",
    "bluefs_slow_model": "",
    "bluefs_slow_partition_path": "/dev/dm-1",
    "bluefs_slow_rotational": "1",
    "bluefs_slow_size": "6000601989120",
    "bluefs_slow_type": "hdd",
    "bluestore_bdev_access_mode": "blk",
    "bluestore_bdev_block_size": "4096",
    "bluestore_bdev_dev": "253:1",
    "bluestore_bdev_dev_node": "dm-1",
    "bluestore_bdev_driver": "KernelDevice",
    "bluestore_bdev_model": "",
    "bluestore_bdev_partition_path": "/dev/dm-1",
    "bluestore_bdev_rotational": "1",
    "bluestore_bdev_size": "6000601989120",
    "bluestore_bdev_type": "hdd",
    "ceph_version": "ceph version 12.2.4 
(52085d5249a80c5f5121a76d6288429f35e4e77b) luminous (stable)",

    "cpu": "Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz",
    "default_device_class": "hdd",
    "distro": "centos",
    "distro_description": "CentOS Linux 7 (Core)",
    "distro_version": "7",
    "front_addr": ":6804/2053608",
    "front_iface": "eth0",
    "hb_back_addr": ".78:6806/2053608",
    "hb_front_addr": ".78:6807/2053608",
    "hostname": "ceph0rdi-osd2-1-xrd.eng.sfdc.net 
",

    "journal_rotational": "1",
    "kernel_description": "#1 SMP Tue Jun 26 16:32:21 UTC 2018",
    "kernel_version": "3.10.0-862.6.3.el7.x86_64",
    "mem_swap_kb": "0",
    "mem_total_kb": "131743604",
    "os": "Linux",
    "osd_data": "/var/lib/ceph/osd/ceph-1",
    "osd_objectstore": "bluestore",
    "rotational": "1"
}





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-09-28 Thread Marc Roos


It looks like if I move files between different data pools of the 
cephfs, something is still referring to the 'old location' and gives an 
Input/output error. I assume this is because I am using different client 
ids for authentication. 

With the same user as configured in ganesha, and the erasure-coded 
cephfs folder m mounted via the kernel client, I
can create file out4

At nfs4 client, same location m
I can read out4
I can create out5
I can read out5

With the root of cephfs mounted, in folder t (test pool, replica 1):
I can create out6
I can move out6 to the folder m (erasure coded)
I can read out6

At nfs4 client, m location
[@m]# cat out6
cat: out6: Input/output error
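
(In case it is relevant: the data pool a file was created in can be
checked via the layout vxattrs on the kernel mount -- e.g.:)

  # file layouts are fixed at creation time, so this shows where the
  # file's objects actually live
  getfattr -n ceph.file.layout.pool out6
  getfattr -n ceph.file.layout.pool out4
  # and the layout new files inherit here (if one was set on the directory)
  getfattr -n ceph.dir.layout.pool .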




[client.cephfs.t]
 key = xxx==
 caps mds = "allow rw path=/t"
 caps mgr = "allow r"
 caps mon = "allow r"
 caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data,  allow 
rwx pool=fs_data.r1"

[client.cephfs.m]
 key = xxx==
 caps mds = "allow rw path=/m"
 caps mgr = "allow r"
 caps mon = "allow r"
 caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data.ec"


[@ test]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

[@ test]# rpm -qa | grep ceph | sort
ceph-12.2.8-0.el7.x86_64
ceph-base-12.2.8-0.el7.x86_64
ceph-common-12.2.8-0.el7.x86_64
ceph-fuse-12.2.8-0.el7.x86_64
ceph-mds-12.2.8-0.el7.x86_64
ceph-mgr-12.2.8-0.el7.x86_64
ceph-mon-12.2.8-0.el7.x86_64
ceph-osd-12.2.8-0.el7.x86_64
ceph-radosgw-12.2.8-0.el7.x86_64
ceph-selinux-12.2.8-0.el7.x86_64
collectd-ceph-5.8.0-2.el7.x86_64
libcephfs2-12.2.8-0.el7.x86_64
python-cephfs-12.2.8-0.el7.x86_64

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rados rm objects, still appear in rados ls

2018-09-28 Thread Frank (lists)

Hi,

On my cluster I tried to clear all objects from a pool. I used the 
command "rados -p bench ls | xargs rados -p bench rm". (rados -p bench 
cleanup doesn't clean everything, because there was a lot of other 
testing going on here).


Now 'rados -p bench ls' returns a list of objects, which don't exist: 
[root@ceph01 yum.repos.d]# rados -p bench stat 
benchmark_data_ceph01.example.com_1805226_object32453
 error stat-ing 
bench/benchmark_data_ceph01.example.com_1805226_object32453: (2) No such 
file or directory


I've tried a scrub and a deep-scrub of the pg the object is in, but the problem 
persists. What causes this?


I use Centos 7.5 with mimic 13.2.2


regards,

Frank de Bot

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rados rm objects, still appear in rados ls

2018-09-28 Thread John Spray
On Fri, Sep 28, 2018 at 2:25 PM Frank (lists)  wrote:
>
> Hi,
>
> On my cluster I tried to clear all objects from a pool. I used the
> command "rados -p bench ls | xargs rados -p bench rm". (rados -p bench
> cleanup doesn't clean everything, because there was a lot of other
> testing going on here).
>
> Now 'rados -p bench ls' returns a list of objects, which don't exist:
> [root@ceph01 yum.repos.d]# rados -p bench stat
> benchmark_data_ceph01.example.com_1805226_object32453
>   error stat-ing
> bench/benchmark_data_ceph01.example.com_1805226_object32453: (2) No such
> file or directory
>
> I've tried scrub and deepscrub the pg the object is in, but the problem
> persists. What causes this?

Are you perhaps using a cache tier pool?

John

>
> I use Centos 7.5 with mimic 13.2.2
>
>
> regards,
>
> Frank de Bot
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-09-28 Thread John Spray
On Fri, Sep 28, 2018 at 2:28 PM Marc Roos  wrote:
>
>
> Looks like that if I move files between different data pools of the
> cephfs, something is still referring to the 'old location' and gives an
> Input/output error. I assume this, because I am using different client
> ids for authentication.
>
> With the same user as configured in ganesha, mounting (kernel) erasure
> code cephfs m
> can create file out4
>
> At nfs4 client, same location m
> I can read out4
> I can create out5
> I can read out5
>
> Mounted root cephfs create file in folder t (test replicated 1)
> I can create out6
> I can move out6 to the folder m (erasure coded)
> I can read out6
>
> At nfs4 client, m location
> [@m]# cat out6
> cat: out6: Input/output error

If it was due to permissions, I would expect to see EPERM rather than
EIO.  EIO suggests something more fundamentally broken, like a client
version that doesn't understand the latest layout format.

Assuming you're using the CephFS FSAL in Ganesha (rather than
re-exporting a local mount of CephFS), it should be possible to create
an /etc/ceph/ceph.conf file with a "[client]" section that enables
debug logging (debug client = 10 or similar), and sets an output
location ("log file = /tmp/client.log") -- that might give a bit more
information about the nature of the error.
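
Something along these lines, for instance (paths are just examples):

  [client]
      debug client = 10
      log file = /tmp/client.log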

John

>
>
>
> [client.cephfs.t]
>  key = xxx==
>  caps mds = "allow rw path=/t"
>  caps mgr = "allow r"
>  caps mon = "allow r"
>  caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data,  allow
> rwx pool=fs_data.r1"
>
> [client.cephfs.m]
>  key = xxx==
>  caps mds = "allow rw path=/m"
>  caps mgr = "allow r"
>  caps mon = "allow r"
>  caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data.ec"
>
>
> [@ test]# cat /etc/redhat-release
> CentOS Linux release 7.5.1804 (Core)
>
> [@ test]# rpm -qa | grep ceph | sort
> ceph-12.2.8-0.el7.x86_64
> ceph-base-12.2.8-0.el7.x86_64
> ceph-common-12.2.8-0.el7.x86_64
> ceph-fuse-12.2.8-0.el7.x86_64
> ceph-mds-12.2.8-0.el7.x86_64
> ceph-mgr-12.2.8-0.el7.x86_64
> ceph-mon-12.2.8-0.el7.x86_64
> ceph-osd-12.2.8-0.el7.x86_64
> ceph-radosgw-12.2.8-0.el7.x86_64
> ceph-selinux-12.2.8-0.el7.x86_64
> collectd-ceph-5.8.0-2.el7.x86_64
> libcephfs2-12.2.8-0.el7.x86_64
> python-cephfs-12.2.8-0.el7.x86_64
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-09-28 Thread Marc Roos
 

If I copy the file out6 to out7 in the same location, I can read the 
out7 file on the nfs client.


-Original Message-
To: ceph-users
Subject: [ceph-users] cephfs issue with moving files between data pools 
gives Input/output error


Looks like that if I move files between different data pools of the 
cephfs, something is still referring to the 'old location' and gives an 
Input/output error. I assume this, because I am using different client 
ids for authentication. 

With the same user as configured in ganesha, mounting (kernel) erasure 
code cephfs m can create file out4

At nfs4 client, same location m
I can read out4
I can create out5
I can read out5

Mounted root cephfs create file in folder t (test replicated 1) I can 
create out6 I can move out6 to the folder m (erasure coded) I can read 
out6

At nfs4 client, m location
[@m]# cat out6
cat: out6: Input/output error




[client.cephfs.t]
 key = xxx==
 caps mds = "allow rw path=/t"
 caps mgr = "allow r"
 caps mon = "allow r"
 caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data,  allow 
rwx pool=fs_data.r1"

[client.cephfs.m]
 key = xxx==
 caps mds = "allow rw path=/m"
 caps mgr = "allow r"
 caps mon = "allow r"
 caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data.ec"


[@ test]# cat /etc/redhat-release
CentOS Linux release 7.5.1804 (Core)

[@ test]# rpm -qa | grep ceph | sort
ceph-12.2.8-0.el7.x86_64
ceph-base-12.2.8-0.el7.x86_64
ceph-common-12.2.8-0.el7.x86_64
ceph-fuse-12.2.8-0.el7.x86_64
ceph-mds-12.2.8-0.el7.x86_64
ceph-mgr-12.2.8-0.el7.x86_64
ceph-mon-12.2.8-0.el7.x86_64
ceph-osd-12.2.8-0.el7.x86_64
ceph-radosgw-12.2.8-0.el7.x86_64
ceph-selinux-12.2.8-0.el7.x86_64
collectd-ceph-5.8.0-2.el7.x86_64
libcephfs2-12.2.8-0.el7.x86_64
python-cephfs-12.2.8-0.el7.x86_64



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs issue with moving files between data pools gives Input/output error

2018-09-28 Thread Marc Roos


Is this useful? I think this is the section of the client log from when I ran: 

[@test2 m]$ cat out6
cat: out6: Input/output error 

2018-09-28 16:03:39.082200 7f1ad01f1700 10 client.3246756 fill_statx on 
0x100010943bc snap/devhead mode 040557 mtime 2018-09-28 14:49:35.349370 
ctime 2018-09-28 14:49:35.349370
2018-09-28 16:03:39.082223 7f1ad01f1700  3 client.3246756 ll_getattrx 
0x100010943bc.head = 0
2018-09-28 16:03:39.082727 7f1ae813f700 10 client.3246756 fill_statx on 
0x10001698ac5 snap/devhead mode 0100644 mtime 2018-09-28 14:45:50.323273 
ctime 2018-09-28 14:47:47.028679
2018-09-28 16:03:39.082737 7f1ae813f700  3 client.3246756 ll_getattrx 
0x10001698ac5.head = 0
2018-09-28 16:03:39.083149 7f1ac07f8700  3 client.3246756 ll_open 
0x10001698ac5.head 0
2018-09-28 16:03:39.083160 7f1ac07f8700 10 client.3246756 _getattr mask 
As issued=1
2018-09-28 16:03:39.083165 7f1ac07f8700  3 client.3246756 may_open 
0x7f1a7810ad00 = 0
2018-09-28 16:03:39.083169 7f1ac07f8700 10 break_deleg: breaking delegs 
on 0x10001698ac5.head(faked_ino=0 ref=2 ll_ref=1 cap_refs={} open={1=1} 
mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273 
caps=pAsLsXsFs(0=pAsLsXsFs) objectset[0x10001698ac5 ts 0/0 objects 0 
dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.083183 7f1ac07f8700 10 delegations_broken: 
delegations empty on 0x10001698ac5.head(faked_ino=0 ref=2 ll_ref=1 
cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1 mtime=2018-09-28 
14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs) objectset[0x10001698ac5 ts 
0/0 objects 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.083198 7f1ac07f8700 10 client.3246756 
choose_target_mds from caps on inode 0x10001698ac5.head(faked_ino=0 
ref=3 ll_ref=1 cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1 
mtime=2018-09-28 14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs) 
objectset[0x10001698ac5 ts 0/0 objects 0 dirty_or_tx 0] 
parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.083209 7f1ac07f8700 10 client.3246756 send_request 
rebuilding request 1911 for mds.0
2018-09-28 16:03:39.083218 7f1ac07f8700 10 client.3246756 send_request 
client_request(unknown.0:1911 open #0x10001698ac5 2018-09-28 
16:03:39.083194 caller_uid=501, caller_gid=501{501,}) v4 to mds.0
2018-09-28 16:03:39.084088 7f1a82ffd700  5 client.3246756 
set_cap_epoch_barrier epoch = 24093
2018-09-28 16:03:39.084097 7f1a82ffd700 10 client.3246756  mds.0 seq now 
1
2018-09-28 16:03:39.084108 7f1a82ffd700  5 client.3246756 
handle_cap_grant on in 0x10001698ac5 mds.0 seq 7 caps now pAsLsXsFscr 
was pAsLsXsFs
2018-09-28 16:03:39.084118 7f1a82ffd700 10 client.3246756 
update_inode_file_time 0x10001698ac5.head(faked_ino=0 ref=3 ll_ref=1 
cap_refs={} open={1=1} mode=100644 size=17/0 nlink=1 mtime=2018-09-28 
14:45:50.323273 caps=pAsLsXsFs(0=pAsLsXsFs) objectset[0x10001698ac5 ts 
0/0 objects 0 dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00) 
pAsLsXsFs ctime 2018-09-28 14:47:47.028679 mtime 2018-09-28 
14:45:50.323273
2018-09-28 16:03:39.084133 7f1a82ffd700 10 client.3246756   grant, new 
caps are Fcr
2018-09-28 16:03:39.084143 7f1a82ffd700 10 client.3246756 insert_trace 
from 2018-09-28 16:03:39.083217 mds.0 is_target=1 is_dentry=0
2018-09-28 16:03:39.084147 7f1a82ffd700 10 client.3246756  features 
0x3ffddff8eea4fffb
2018-09-28 16:03:39.084148 7f1a82ffd700 10 client.3246756 
update_snap_trace len 48
2018-09-28 16:03:39.084181 7f1a82ffd700 10 client.3246756 
update_snap_trace snaprealm(0x1 nref=755 c=0 seq=1 parent=0x0 
my_snaps=[] cached_snapc=1=[]) seq 1 <= 1 and same parent, SKIPPING
2018-09-28 16:03:39.084186 7f1a82ffd700 10 client.3246756  hrm  
is_target=1 is_dentry=0
2018-09-28 16:03:39.084195 7f1a82ffd700 10 client.3246756 add_update_cap 
issued pAsLsXsFscr -> pAsLsXsFscr from mds.0 on 
0x10001698ac5.head(faked_ino=0 ref=3 ll_ref=1 cap_refs={} open={1=1} 
mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273 
caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0 objects 0 
dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.084268 7f1ac07f8700 10 client.3246756 _create_fh 
0x10001698ac5 mode 1
2018-09-28 16:03:39.084280 7f1ac07f8700  3 client.3246756 ll_open 
0x10001698ac5.head 0 = 0 (0x7f1a24028e10)
2018-09-28 16:03:39.084373 7f1a82ffd700 10 client.3246756 put_inode on 
0x10001698ac5.head(faked_ino=0 ref=5 ll_ref=1 cap_refs={} open={1=1} 
mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273 
caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0 objects 0 
dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.084392 7f1a82ffd700 10 client.3246756 put_inode on 
0x10001698ac5.head(faked_ino=0 ref=4 ll_ref=1 cap_refs={} open={1=1} 
mode=100644 size=17/0 nlink=1 mtime=2018-09-28 14:45:50.323273 
caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[0x10001698ac5 ts 0/0 objects 0 
dirty_or_tx 0] parents=0x7f1a780f1dd0 0x7f1a7810ad00)
2018-09-28 16:03:39.084899 7f1af0161700  3 client.3246756 ll_read 
0x7f1a2

[ceph-users] Problems after increasing number of PGs in a pool

2018-09-28 Thread Vladimir Brik
Hello

I've attempted to increase the number of placement groups of the pools
in our test cluster and now ceph status (below) is reporting problems. I
am not sure what is going on or how to fix this. Troubleshooting
scenarios in the docs don't seem to quite match what I am seeing.

I have no idea how to begin to debug this. I see OSDs listed in
"blocked_by" of pg dump, but don't know how to interpret that. Could
somebody assist please?

I attached output of "ceph pg dump_stuck -f json-pretty" just in case.

The cluster consists of 5 hosts, each with 16 HDDs and 4 SSDs. I am
running 13.2.2.

This is the affected pool:
pool 6 'fs-data-ec-ssd' erasure size 5 min_size 4 crush_rule 6
object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 2493 lfor
0/2491 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs


Thanks,

Vlad


ceph health

  cluster:
id: 47caa1df-42be-444d-b603-02cad2a7fdd3
health: HEALTH_WARN
Reduced data availability: 155 pgs inactive, 47 pgs peering,
64 pgs stale
Degraded data redundancy: 321039/114913606 objects degraded
(0.279%), 108 pgs degraded, 108 pgs undersized

  services:
mon: 5 daemons, quorum ceph-1,ceph-2,ceph-3,ceph-4,ceph-5
mgr: ceph-3(active), standbys: ceph-2, ceph-5, ceph-1, ceph-4
mds: cephfs-1/1/1 up  {0=ceph-5=up:active}, 4 up:standby
osd: 100 osds: 100 up, 100 in; 165 remapped pgs

  data:
pools:   6 pools, 5120 pgs
objects: 22.98 M objects, 88 TiB
usage:   154 TiB used, 574 TiB / 727 TiB avail
pgs: 3.027% pgs not active
 321039/114913606 objects degraded (0.279%)
 4903 active+clean
 105  activating+undersized+degraded+remapped
 61   stale+active+clean
 47   remapped+peering
 3stale+activating+undersized+degraded+remapped
 1active+clean+scrubbing+deep


stuck.json.gz
Description: application/gzip
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSDs crashing

2018-09-28 Thread Josh Haft
Created: https://tracker.ceph.com/issues/36250

On Tue, Sep 25, 2018 at 9:08 PM Brad Hubbard  wrote:
>
> On Tue, Sep 25, 2018 at 11:31 PM Josh Haft  wrote:
> >
> > Hi cephers,
> >
> > I have a cluster of 7 storage nodes with 12 drives each and the OSD
> > processes are regularly crashing. All 84 have crashed at least once in
> > the past two days. Cluster is Luminous 12.2.2 on CentOS 7.4.1708,
> > kernel version 3.10.0-693.el7.x86_64. I rebooted one of the OSD nodes
> > to see if that cleared up the issue, but it did not. This problem has
> > been going on for about a month now, but it was much less frequent
> > initially - I'd see a crash once every few days or so. I took a look
> > through the mailing list and bug reports, but wasn't able to find
> > anything resembling this problem.
> >
> > I am running a second cluster - also 12.2.2, CentOS 7.4.1708, and
> > kernel version 3.10.0-693.el7.x86_64 - but I do not see the issue
> > there.
> >
> > Log messages always look similar to the following, and I've pulled out
> > the back trace from a core dump as well. The aborting thread always
> > looks to be msgr-worker.
> >
>
> 
>
> > #7  0x7f9e731a3a36 in __cxxabiv1::__terminate (handler= > out>) at ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:38
> > #8  0x7f9e731a3a63 in std::terminate () at
> > ../../../../libstdc++-v3/libsupc++/eh_terminate.cc:48
> > #9  0x7f9e731fa345 in std::(anonymous
> > namespace)::execute_native_thread_routine (__p=) at
> > ../../../../../libstdc++-v3/src/c++11/thread.cc:92
>
> That is this code executing.
>
> https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libstdc%2B%2B-v3/src/c%2B%2B11/thread.cc;h=0351f19e042b0701ba3c2597ecec87144fd631d5;hb=cf82a597b0d189857acb34a08725762c4f5afb50#l76
>
> So the problem is we are generating an exception when our thread gets
> run, we should probably catch that before it gets to here but that's
> another story...
>
> The exception is "buffer::malformed_input: entity_addr_t marker != 1"
> and there is some precedent for this
> (https://tracker.ceph.com/issues/21660,
> https://tracker.ceph.com/issues/24819) but I don't think they are your
> issue.
>
> We generated that exception because we encountered an ill-formed
> entity_addr_t whilst decoding a message.
>
> Could you open a tracker for this issue and upload the entire log from
> a crash, preferably with "debug ms >= 5" but be careful as this will
> create very large log files. You can use ceph-post-file to upload
> large compressed files.
>
> Let me know the tracker ID here once you've created it.
>
> P.S. This is likely fixed in a later version of Luminous since you
> seem to be the only one hitting it. Either that or there is something
> unusual about your environment.
>
> >
> > Has anyone else seen this? Any suggestions on how to proceed? I do
> > intend to upgrade to Mimic but would prefer to do it when the cluster
> > is stable.
> >
> > Thanks for your help.
> > Josh
>
>
>
> --
> Cheers,
> Brad
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Manually deleting an RGW bucket

2018-09-28 Thread Sean Purdy


Hi,


How do I delete an RGW/S3 bucket and its contents if the usual S3 API commands 
don't work?

The bucket has S3 delete markers that S3 API commands are not able to remove, 
and I'd like to reuse the bucket name.  It was set up for versioning and 
lifecycles under ceph 12.2.5 which broke the bucket when a reshard happened.  
12.2.7 allowed me to remove the regular files but not the delete markers.

There must be a way of removing index files and so forth through rados commands.


Thanks,

Sean
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] swift staticsite api

2018-09-28 Thread junk required

Hi there, I'm trying to enable the swift static site ability in my rgw.

It appears to be supported http://docs.ceph.com/docs/master/radosgw/swift/ but 
I can't find any documentation on it.

All I can find is for s3
https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html/object_gateway_guide_for_red_hat_enterprise_linux/configuration#creating_a_site
&
https://gist.github.com/robbat2/ec0a66eed28e5f0e1ef7018e9c77910c

I've tried using the following document to use the api calls 
http://docs.ceph.com/docs/master/radosgw/swift/

But it's not working.

Is there an undocumented rgw config setting I need to turn on? Or an equivalent 
to `rgw_enable_apis = s3website` for swift?

If I try and go to my swift endpoint:
http://therobinsonfamily.net/swift/staticsite/

I get

[an XML bucket listing: name "staticsite", max-keys 1000, truncated false, containing index.html (last modified 2018-09-28T16:10:16.334Z, ETag "984256f54df93400961cbfac92b1377f", size 12, STANDARD, owner baggypants/Baggypants)]

instead of the content of index.html
Leon.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems after increasing number of PGs in a pool

2018-09-28 Thread Burkhard Linke

Hi,


On 28.09.2018 18:04, Vladimir Brik wrote:

Hello

I've attempted to increase the number of placement groups of the pools
in our test cluster and now ceph status (below) is reporting problems. I
am not sure what is going on or how to fix this. Troubleshooting
scenarios in the docs don't seem to quite match what I am seeing.

I have no idea how to begin to debug this. I see OSDs listed in
"blocked_by" of pg dump, but don't know how to interpret that. Could
somebody assist please?

I attached output of "ceph pg dump_stuck -f json-pretty" just in case.

The cluster consists of 5 hosts, each with 16 HDDs and 4 SSDs. I am
running 13.2.2.

This is the affected pool:
pool 6 'fs-data-ec-ssd' erasure size 5 min_size 4 crush_rule 6
object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 2493 lfor
0/2491 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs


Just a guess: are you running into the PGs-per-OSD limit? In luminous an 
OSD will stop accepting new PGs if a certain limit (default afaik 200) 
of PGs on that OSD is reached. The PGs stay in the activating state, 
similar to your output.


The mentioned pool has 2048 PGs, size=5 -> ~10,000 PG instances; with 100 OSDs 
that is ~100 PGs per OSD from that pool alone. The output mentions an 
overall PG number of 5120, so there are probably other pools, too.


You can check this by running 'ceph osd df'; the last column is the 
number of PGs on the OSD. If this number is >= 200, the OSD will not 
accept new PGs.


You can adjust the limits with the mon_max_pg_per_osd and 
osd_max_pg_per_osd_hard_ratio settings. See the ceph documentation for more 
details about this.
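
For instance (a sketch -- pick values appropriate for the PG count you
actually need):

  # the last column here is the current number of PGs per OSD
  ceph osd df tree

  # then raise the limits, e.g. in ceph.conf ([global]) on mons and osds:
  #   mon_max_pg_per_osd = 400
  #   osd_max_pg_per_osd_hard_ratio = 5
  # injecting at runtime may also work, but restarting is the safe way:
  ceph tell mon.* injectargs '--mon_max_pg_per_osd 400'
  ceph tell osd.* injectargs '--osd_max_pg_per_osd_hard_ratio 5'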


Regards,
Burkhard
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Problems after increasing number of PGs in a pool

2018-09-28 Thread Paul Emmerich
Judging from the name, I guess the pool is mapped to SSDs only, and you only have 20 SSDs.
So you should have about ~2000 effective PGs, taking replication into account.

Your pool has ~10k effective PGs with k+m=5, and you seem to have 5
more pools.

Check "ceph osd df tree" to see how many PGs per OSD you got.

Try increasing these two options to "fix" it.

mon max pg per osd
osd max pg per osd hard ratio


Paul
Am Fr., 28. Sep. 2018 um 18:05 Uhr schrieb Vladimir Brik
:
>
> Hello
>
> I've attempted to increase the number of placement groups of the pools
> in our test cluster and now ceph status (below) is reporting problems. I
> am not sure what is going on or how to fix this. Troubleshooting
> scenarios in the docs don't seem to quite match what I am seeing.
>
> I have no idea how to begin to debug this. I see OSDs listed in
> "blocked_by" of pg dump, but don't know how to interpret that. Could
> somebody assist please?
>
> I attached output of "ceph pg dump_stuck -f json-pretty" just in case.
>
> The cluster consists of 5 hosts, each with 16 HDDs and 4 SSDs. I am
> running 13.2.2.
>
> This is the affected pool:
> pool 6 'fs-data-ec-ssd' erasure size 5 min_size 4 crush_rule 6
> object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 2493 lfor
> 0/2491 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs
>
>
> Thanks,
>
> Vlad
>
>
> ceph health
>
>   cluster:
> id: 47caa1df-42be-444d-b603-02cad2a7fdd3
> health: HEALTH_WARN
> Reduced data availability: 155 pgs inactive, 47 pgs peering,
> 64 pgs stale
> Degraded data redundancy: 321039/114913606 objects degraded
> (0.279%), 108 pgs degraded, 108 pgs undersized
>
>   services:
> mon: 5 daemons, quorum ceph-1,ceph-2,ceph-3,ceph-4,ceph-5
> mgr: ceph-3(active), standbys: ceph-2, ceph-5, ceph-1, ceph-4
> mds: cephfs-1/1/1 up  {0=ceph-5=up:active}, 4 up:standby
> osd: 100 osds: 100 up, 100 in; 165 remapped pgs
>
>   data:
> pools:   6 pools, 5120 pgs
> objects: 22.98 M objects, 88 TiB
> usage:   154 TiB used, 574 TiB / 727 TiB avail
> pgs: 3.027% pgs not active
>  321039/114913606 objects degraded (0.279%)
>  4903 active+clean
>  105  activating+undersized+degraded+remapped
>  61   stale+active+clean
>  47   remapped+peering
>  3stale+activating+undersized+degraded+remapped
>  1active+clean+scrubbing+deep



-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] QEMU/Libvirt + librbd issue using Luminous 12.2.7

2018-09-28 Thread Andre Goree

On 2018/08/21 1:24 pm, Jason Dillaman wrote:

Can you collect any librados / librbd debug logs and provide them via
pastebin? Just add / tweak the following in your "/etc/ceph/ceph.conf"
file's "[client]" section and re-run to gather the logs.

[client]
log file = /path/to/a/log/file
debug ms = 1
debug monc = 20
debug objecter = 20
debug rados = 20
debug rbd = 20

...




--
Jason



Returning to this as I've finally had time, lol.

I've tried adding the above [client] lines to both the machine on which 
the VMs run (the one running libvirt and the VMs) as well as the ceph 
node running the MON and MGR, but nothing happens -- i.e., nothing is 
printed to the logfile that I define.


FWIW, I'm still having this issue in 12.2.8 as well :/


--
Andre Goree
-=-=-=-=-=-
Email - andre at drenet.net
Website   - http://blog.drenet.net
PGP key   - http://www.drenet.net/pubkey.html
-=-=-=-=-=-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rados rm objects, still appear in rados ls

2018-09-28 Thread Frank de Bot (lists)
John Spray wrote:
> On Fri, Sep 28, 2018 at 2:25 PM Frank (lists)  wrote:
>>
>> Hi,
>>
>> On my cluster I tried to clear all objects from a pool. I used the
>> command "rados -p bench ls | xargs rados -p bench rm". (rados -p bench
>> cleanup doesn't clean everything, because there was a lot of other
>> testing going on here).
>>
>> Now 'rados -p bench ls' returns a list of objects, which don't exist:
>> [root@ceph01 yum.repos.d]# rados -p bench stat
>> benchmark_data_ceph01.example.com_1805226_object32453
>>   error stat-ing
>> bench/benchmark_data_ceph01.example.com_1805226_object32453: (2) No such
>> file or directory
>>
>> I've tried scrub and deepscrub the pg the object is in, but the problem
>> persists. What causes this?
> 
> Are you perhaps using a cache tier pool?

The pool had 2 snaps. After removing those, the ls command returned no
'non-existing' objects. I expected that ls would only return objects of
the current contents; I did not specify -s for working with snaps of the
pool.
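
(For anyone hitting the same thing, this is roughly how the snaps can be
listed and removed -- the snap name is a placeholder:)

  rados -p bench lssnap
  rados -p bench rmsnap mysnap
  # after that, a plain "rados -p bench ls" only lists head objects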

> 
> John
> 
>>
>> I use Centos 7.5 with mimic 13.2.2
>>
>>
>> regards,
>>
>> Frank de Bot
>>

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] QEMU/Libvirt + librbd issue using Luminous 12.2.7

2018-09-28 Thread Andre Goree

On 2018/09/28 2:26 pm, Andre Goree wrote:

On 2018/08/21 1:24 pm, Jason Dillaman wrote:

Can you collect any librados / librbd debug logs and provide them via
pastebin? Just add / tweak the following in your "/etc/ceph/ceph.conf"
file's "[client]" section and re-run to gather the logs.

[client]
log file = /path/to/a/log/file
debug ms = 1
debug monc = 20
debug objecter = 20
debug rados = 20
debug rbd = 20

...




--
Jason



Returning to this as I've finally had time, lol.

I've tried adding the above [client] lines to both the machine on
which the VMs run (the one running libvirt and the VMs) as well as the
ceph node running the MON and MGR, but nothing happens -- i.e.,
nothing is printed to the logfile that I define.

FWIW, I'm still having this issue in 12.2.8 as well :/




I actually got the logging working, here's the log from a failed attach: 
 https://pastebin.com/jCiD4E2p


Thanks!


--
Andre Goree
-=-=-=-=-=-
Email - andre at drenet.net
Website   - http://blog.drenet.net
PGP key   - http://www.drenet.net/pubkey.html
-=-=-=-=-=-
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Manually deleting an RGW bucket

2018-09-28 Thread Konstantin Shalygin

How do I delete an RGW/S3 bucket and its contents if the usual S3 API commands 
don't work?

The bucket has S3 delete markers that S3 API commands are not able to remove, 
and I'd like to reuse the bucket name.  It was set up for versioning and 
lifecycles under ceph 12.2.5 which broke the bucket when a reshard happened.  
12.2.7 allowed me to remove the regular files but not the delete markers.

There must be a way of removing index files and so forth through rados commands.



What is the actual error?

To delete a bucket you should first delete all of the bucket's objects ("s3cmd rm -rf 
s3://bucket/") and any multipart uploads.




k

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com