[ceph-users] Re: Fwd: HeartbeatMap FAILED assert(0 == "hit suicide timeout")

2019-10-09 Thread huang jun
If you have a coredump file, you should check why the thread took so long to finish its job. 潘东元 wrote on Thu, Oct 10, 2019 at 10:51 AM: > > hi all, > my osd hit suicide timeout. > some log: > 2019-10-10 03:53:13.017760 7f1ab886e700 0 -- 192.168.1.5:6810/1028846 > >> 192.168.1.25:6802/24020795 p
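A minimal sketch of how such a coredump could be inspected with gdb, assuming matching debug symbols are installed; the core file path below is a placeholder:

  $ gdb /usr/bin/ceph-osd /var/crash/core.ceph-osd.12345   # placeholder core path
  (gdb) info threads            # list all threads in the dump
  (gdb) thread apply all bt     # back-trace every thread to see where the stuck one was waiting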

[ceph-users] Fwd: HeartbeatMap FAILED assert(0 == "hit suicide timeout")

2019-10-09 Thread 潘东元
hi all, my osd hit suicide timeout. some log: 2019-10-10 03:53:13.017760 7f1ab886e700 0 -- 192.168.1.5:6810/1028846 >> 192.168.1.25:6802/24020795 pipe(0x257eb80 sd=69 :47977 s=2 pgs=287284 cs=41 l=0 c=0x21431760).fault, initiating reconnect 2019-10-10 03:53:13.017799 7f1ab967c700 0 -- 192

[ceph-users] Re: Unexpected increase in the memory usage of OSDs

2019-10-09 Thread Anthony D'Atri
>>> Do you have statistics on the size of the OSDMaps or count of them >>> which were being maintained by the OSDs? >> No, I don't think so. How can I find this information? > > Hmm I don't know if we directly expose the size of maps. There are > perfcounters which expose the range of maps being k
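A minimal sketch for checking the range of osdmaps a single OSD holds, via its admin socket (osd.0 is a placeholder; run on the host where the daemon lives):

  $ ceph daemon osd.0 status                       # includes "oldest_map" and "newest_map" epochs
  $ ceph daemon osd.0 perf dump | grep -i osdmap   # any osdmap-related perf counters the daemon exposes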

[ceph-users] Re: 14.2.4 Deduplication

2019-10-09 Thread Alex Gorbachev
On Wed, Oct 9, 2019 at 12:56 PM Gregory Farnum wrote: > So since nobody who's actually working on it has chimed in: > While there is some deduplication functionality built into the system, > AFAIK it's not something considered for users at this point. It's > under ongoing development, doesn't hav

[ceph-users] Re: 14.2.4 Deduplication

2019-10-09 Thread Gregory Farnum
So since nobody who's actually working on it has chimed in: While there is some deduplication functionality built into the system, AFAIK it's not something considered for users at this point. It's under ongoing development, doesn't have performance data, and isn't plumbed through into a lot of the

[ceph-users] Can't Modify Zone

2019-10-09 Thread Mac Wynkoop
When trying to modify a zone in one of my clusters to promote it to the master zone, I get this error: ~ $ radosgw-admin zone modify --rgw-zone atl --master failed to update zonegroup: 2019-10-09 15:41:53.409 7f9ecae26840 0 ERROR: found existing zone name atl (94d26f94-d64c-40d1-9a33-56afa948d86a
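A hedged sketch of commands that can help inspect the conflicting zone entries; these are diagnostic steps only, not a confirmed fix (the zone name is taken from the error above):

  $ radosgw-admin zone list                 # all zones the gateway knows about
  $ radosgw-admin zone get --rgw-zone=atl   # the zone's id, to compare with the id in the error
  $ radosgw-admin zonegroup get             # which zone id the zonegroup currently references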

[ceph-users] Re: Sick Nautilus cluster, OOM killing OSDs, lots of osdmaps

2019-10-09 Thread Sage Weil
[adding dev] On Wed, 9 Oct 2019, Aaron Johnson wrote: > Hi all > > I have a smallish test cluster (14 servers, 84 OSDs) running 14.2.4. > Monthly OS patching and reboots that go along with it have resulted in > the cluster getting very unwell. > > Many of the servers in the cluster are OOM-ki

[ceph-users] Re: ceph-mgr Module "zabbix" cannot send Data

2019-10-09 Thread Wido den Hollander
On 10/9/19 5:20 PM, i.schm...@langeoog.de wrote: > Thank you very much! This helps a lot! > > I'm wondering whether it is a good idea at all to tie Ceph data input to a > specific host of that cluster in Zabbix. I could try to set up a new host in > Zabbix called "Ceph", representing the cluster

[ceph-users] Re: ceph-mgr Module "zabbix" cannot send Data

2019-10-09 Thread i . schmidt
Thank you very much! This helps a lot! I'm wondering whether it is a good idea at all to tie Ceph data input to a specific host of that cluster in Zabbix. I could try to set up a new host in Zabbix called "Ceph", representing the cluster as a whole, just for monitoring Ceph statuses, since c
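For reference, the mgr zabbix module has an "identifier" setting that controls the host name items are reported under, which would fit a cluster-wide "Ceph" host; a minimal sketch (the host name is a site choice):

  $ ceph zabbix config-set identifier Ceph   # report items under the Zabbix host "Ceph"
  $ ceph zabbix config-show                  # verify the module configuration
  $ ceph zabbix send                         # push data once to test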

[ceph-users] Sick Nautilus cluster, OOM killing OSDs, lots of osdmaps

2019-10-09 Thread Aaron Johnson
Hi all I have a smallish test cluster (14 servers, 84 OSDs) running 14.2.4. Monthly OS patching and reboots that go along with it have resulted in the cluster getting very unwell. Many of the servers in the cluster are OOM-killing the ceph-osd processes when they try to start. (6 OSDs per se
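A minimal sketch for checking where an OSD's memory goes once it is running, plus an example of capping its memory target (osd.0 and the 2 GiB value are placeholders):

  $ ceph daemon osd.0 dump_mempools                    # per-component memory, including osdmap structures
  $ ceph config set osd osd_memory_target 2147483648   # example: target roughly 2 GiB per OSD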

[ceph-users] Re: ceph-mgr Module "zabbix" cannot send Data

2019-10-09 Thread Wido den Hollander
On 10/7/19 9:15 AM, i.schm...@langeoog.de wrote: > Hi Folks > > We are using Ceph as our storage backend on our 6 Node Proxmox VM Cluster. To > Monitor our systems we use Zabbix and i would like to get some Ceph Data into > our Zabbix to get some alarms when something goes wrong. > > Ceph mg

[ceph-users] Re: Is it possible to have a 2nd cephfs_data volume? [Openstack]

2019-10-09 Thread Paul Emmerich
On Wed, Oct 9, 2019 at 10:45 AM Jeremi Avenant wrote: > Good morning > > Q: Is it possible to have a 2nd cephfs_data volume and expose it to the > same openstack environment? > yes, see documentation for cephfs layouts: https://docs.ceph.com/docs/master/cephfs/file-layouts/ > > Reason being:
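A minimal sketch of adding a second data pool and pointing a directory at it, assuming a filesystem named "cephfs", a hypothetical pool "cephfs_data2" and a mount at /mnt/cephfs:

  $ ceph osd pool create cephfs_data2 128                                 # hypothetical pool name and PG count
  $ ceph fs add_data_pool cephfs cephfs_data2                             # attach it as an additional data pool
  $ setfattr -n ceph.dir.layout.pool -v cephfs_data2 /mnt/cephfs/newdir   # new files under this dir land in the new pool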

[ceph-users] Re: CephFS no permissions for subdir

2019-10-09 Thread Eugen Block
then the client has read permissions to path=/ABC. This is because of "mds 'allow r, allow rws ...". Remove the 'allow r' cap from the mds section and the client only gets permissions to the specified paths. It is for being able to make snapshots in cephfs (not rgw or rbd). I get

[ceph-users] Re: CephFS no permissions for subdir

2019-10-09 Thread Lars Täuber
Hi Eugen, Wed, 09 Oct 2019 08:44:28 + Eugen Block ==> ceph-users@ceph.io : > Hi, > > > I tried to do this: > > ceph auth caps client.XYZ mon 'allow r' mds 'allow r, allow rws > > path=/XYZ, allow path=/ABC' osd 'allow rw pool=cephfs_data' > > do you want to remove all permissions fr

[ceph-users] Re: CephFS no permissions for subdir

2019-10-09 Thread Eugen Block
Hi, I tried to do this: ceph auth caps client.XYZ mon 'allow r' mds 'allow r, allow rws path=/XYZ, allow path=/ABC' osd 'allow rw pool=cephfs_data' Do you want to remove all permissions from path "/ABC"? If so, you should simply remove that from the command: ceph auth caps client.XYZ
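Following that advice, a hedged sketch of the corrected caps (client name, path and pool copied from the thread; dropping both the bare 'allow r' and the /ABC clause leaves the client with access to /XYZ only):

  $ ceph auth caps client.XYZ mon 'allow r' mds 'allow rws path=/XYZ' osd 'allow rw pool=cephfs_data'
  $ ceph auth get client.XYZ   # verify the resulting caps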

[ceph-users] Is it possible to have a 2nd cephfs_data volume? [Openstack]

2019-10-09 Thread Jeremi Avenant
Good morning Q: Is it possible to have a 2nd cephfs_data volume and expose it to the same openstack environment? Reason being: Our current profile is configured with an erasure code value of k=3,m=1 (rack level), but we are looking to buy another +- 6PB of storage w/ controllers and were thinking of mo

[ceph-users] Re: ceph-mgr Module "zabbix" cannot send Data

2019-10-09 Thread i . schmidt
Sorry, somehow my reply created a new thread. This message originally belongs here: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/Z3DQN4RVZDP7ZEQTKXFQB6DTQZMJ5ONV/

[ceph-users] Re: ceph-mgr Module "zabbix" cannot send Data

2019-10-09 Thread Ingo Schmidt
Thanks for the hint. I fiddled around with the configuration and found this: > root@vm-2:~# ceph zabbix send > Failed to send data to Zabbix while > root@vm-2:~# zabbix_sender -vv -z 192.168.15.253 -p 10051 -s vm-2 -k > ceph.num_osd -o 32 > zabbix_sender [1724513]: DEBUG: answer > [{"response":
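When zabbix_sender works from the shell but 'ceph zabbix send' does not, comparing the module's settings with the values that worked on the command line is a reasonable first check; a minimal sketch using the address and port from the output above:

  $ ceph zabbix config-show                             # current zabbix_host, zabbix_port, identifier, interval
  $ ceph zabbix config-set zabbix_host 192.168.15.253   # match the -z value that worked
  $ ceph zabbix config-set zabbix_port 10051            # match the -p value that worked
  $ ceph zabbix send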

[ceph-users] CephFS no permissions for subdir

2019-10-09 Thread Lars Täuber
Hi! Is it possible, and if yes how, to remove any permission to a subdir for a user? I tried to do this: ceph auth caps client.XYZ mon 'allow r' mds 'allow r, allow rws path=/XYZ, allow path=/ABC' osd 'allow rw pool=cephfs_data' but got: Error EINVAL: mds capability parse failed, stopped at '

[ceph-users] Re: Large omap objects in radosgw .usage pool: is there a way to reshard the rgw usage log?

2019-10-09 Thread Florian Haas
On 09/10/2019 09:07, Florian Haas wrote: > Also, is anyone aware of any adverse side effects of increasing these > thresholds, and/or changing the usage log sharding settings, that I > should keep in mind here? Sorry, I should have checked the latest in the list archives; Paul Emmerich has just re

[ceph-users] Large omap objects in radosgw .usage pool: is there a way to reshard the rgw usage log?

2019-10-09 Thread Florian Haas
Hi, I am currently dealing with a cluster that has been in use for 5 years and, during that time, has never had its radosgw usage log trimmed. Now that the cluster has been upgraded to Nautilus (and has completed a full deep-scrub), it is in a permanent state of HEALTH_WARN because of one large omap
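A hedged sketch of trimming old usage-log records, assuming they are no longer needed for billing (the dates below are placeholders); as far as I know the shard count is governed by rgw_usage_max_shards and rgw_usage_max_user_shards, which would only affect newly written records:

  $ radosgw-admin usage show --show-log-entries=false                         # summary of what is currently recorded
  $ radosgw-admin usage trim --start-date=2014-01-01 --end-date=2018-12-31    # remove records in this window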