Thank you for your suggestion, tried it. It really seems like the other
OSDs think the OSD is dead (if I understand this right), however the
networking seems absolutely fine between the nodes (no issues in the graphs
etc.).
-13> 2018-08-08 09:13:58.466119 7fe053d41700 1 --
10.12.3.17:0/706864 <==
The formula seems correct for a 100 pg/OSD target.
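(The usual rule of thumb behind that, stated here as an assumption since the
original question isn't quoted: total PGs ≈ (number of OSDs × target PGs per
OSD) / replica count, rounded to the nearest power of two. For example, with
20 OSDs, a 100 pg/OSD target and size 3: 20 × 100 / 3 ≈ 667, so 512 or 1024
PGs spread across all pools.)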
> On 8 August 2018 at 04:21, Satish Patel wrote:
>
> Thanks!
>
> Do you have any comments on Question: 1 ?
>
> On Tue, Aug 7, 2018 at 10:59 AM, Sébastien VIGNERON
> wrote:
>> Question 2:
>>
>> ceph osd pool set-quota max_objects|max_bytes
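For completeness, the full form of that command takes a pool name and a value;
'mypool' and the numbers below are only placeholders:

# ceph osd pool set-quota mypool max_objects 1000000
# ceph osd pool set-quota mypool max_bytes 107374182400
# ceph osd pool get-quota mypool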
Hi, I find an old server which mounted cephfs and has the debug files.
# cat osdc
REQUESTS 0 homeless 0
LINGER REQUESTS
BACKOFFS
# cat monc
have monmap 2 want 3+
have osdmap 3507
have fsmap.user 0
have mdsmap 55 want 56+
fs_cluster_id -1
# cat mdsc
194 mds0 getattr #1036ae3
What does i
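For anyone else hunting for those files: with the kernel client they live under
debugfs, one directory per mounted client (paths below assume a standard setup):

# mount -t debugfs none /sys/kernel/debug   (only if not already mounted)
# ls /sys/kernel/debug/ceph/
# cat /sys/kernel/debug/ceph/*/osdc
# cat /sys/kernel/debug/ceph/*/mdsc
# cat /sys/kernel/debug/ceph/*/monc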
What is the load like on the osd host at the time and what does the
disk utilization look like?
Also, what does the transaction look like from one of the osds that
sends the "you died" message with debugging osd 20 and ms 1 enabled?
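If it helps, one way to raise those levels temporarily on a single OSD (osd.5
here is just an example id):

# ceph tell osd.5 injectargs '--debug_osd 20 --debug_ms 1'
(reproduce the flapping, grab /var/log/ceph/ceph-osd.5.log, then dial it back)
# ceph tell osd.5 injectargs '--debug_osd 1/5 --debug_ms 0/5'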
On Wed, Aug 8, 2018 at 5:34 PM, Josef Zelenka
wrote:
> Thank yo
Do you see "internal heartbeat not healthy" messages in the log of the
osd that suicides?
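A quick way to check (log path assumes the default naming scheme):

# grep -c 'internal heartbeat not healthy' /var/log/ceph/ceph-osd.*.log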
On Wed, Aug 8, 2018 at 5:45 PM, Brad Hubbard wrote:
> What is the load like on the osd host at the time and what does the
> disk utilization look like?
>
> Also, what does the transaction look like from one
Checked the system load on the host with the OSD that is suiciding
currently and it's fine, however I can see a noticeably higher IO
(around 700), though to me that seems rather like a symptom of the constant
flapping/attempting to come up (it's an SSD-based Ceph, so this
shouldn't cause much harm).
Hi All, exactly the same story today: same 8 OSDs and a lot of garbage
collection objects to process.
Below is the number of "cls_rgw.cc:3284: gc_iterate_entries end_key="
entries per OSD log file:
hostA:
/var/log/ceph/ceph-osd.58.log
1826467
hostB:
/var/log/ceph/ceph-osd.88.log
2924241
host
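For anyone wanting to reproduce the per-OSD counts, a grep along these lines
should do it (path assumes the default log naming):

# grep -c 'cls_rgw.cc:3284: gc_iterate_entries end_key=' /var/log/ceph/ceph-osd.*.log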
On Tue, Aug 7, 2018 at 11:41 PM Scott Petersen
wrote:
> We are using kernel 4.15.17 and we keep receiving this error
> mount.ceph: unrecognized mount option "mds_namespace", passing to kernel.
>
That message is harmless -- it just means that the userspace mount.ceph
utility doesn't do anything w
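In other words the option is simply passed through and handled by the kernel; a
typical mount line would look roughly like this (monitor address, secret file
and filesystem name are placeholders):

# mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,mds_namespace=cephfs2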
Thx for the command line. I did take a look at it, but I don't really know
what to search for, my bad….
All this flapping is due to deep-scrub: when it starts on an OSD, things start to
go bad.
I set out all the OSDs that were flapping the most (1 by 1 after rebalancing)
and it looks better even
Hi,
We are still blocked by this problem on our end. Glen, did you or someone else
figure out something for this?
Regards
Jocelyn Thode
From: Glen Baars [mailto:g...@onsitecomputers.com.au]
Sent: Thursday, 2 August 2018 05:43
To: Erik McCormick
Cc: Thode Jocelyn ; Vasu Kulkarni ;
ceph-users@lists
So your OSDs are really too busy to respond to heartbeats.
You'll be facing this for some time, until the cluster load gets lower.
I would run `ceph osd set nodeep-scrub` until the heavy disk IO stops.
Maybe you can schedule it so deep scrub is enabled during the night and
disabled in the morning.
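A minimal sketch of that schedule as an /etc/cron.d entry (the file name is
hypothetical; it assumes the admin keyring is available to root on that node):

# /etc/cron.d/ceph-nodeep-scrub
# allow deep scrubs overnight, block them during the day
0 22 * * * root ceph osd unset nodeep-scrub
0 7  * * * root ceph osd set nodeep-scrub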
Regards,
Webert Lim
You could also see open sessions at the MDS server by issuing `ceph daemon
mds.XX session ls`
Regards,
Webert Lima
DevOps Engineer at MAV Tecnologia
*Belo Horizonte - Brasil*
*IRC NICK - WebertRLZ*
On Wed, Aug 8, 2018 at 5:08 AM Zhenshi Zhou wrote:
> Hi, I find an old server which mounted ce
Hi again Frederic,
It may be worth looking at a recovery sleep.
osd recovery sleep
Description:
Time in seconds to sleep before next recovery or backfill op. Increasing this
value will slow down recovery operation while client operations will be less
impacted.
Type:
Float
Default:
0
osd re
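If you want to try it without restarting the OSDs, it can be injected at
runtime (0.1 s is just an example value to start from):

# ceph tell osd.* injectargs '--osd_recovery_sleep 0.1'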
Hi Webert,
That command shows the current sessions, whereas the server from which I got the
files (osdc, mdsc, monc) has been disconnected for a long time.
So I cannot get useful information from the command you provided.
Thanks
Webert de Souza Lima wrote on Wednesday, August 8, 2018 at 10:10 PM:
> You could also see open sessions at the
Hi Zhenshi,
if you still have the client mount hanging but no session is connected, you
probably have some PID waiting with blocked IO from cephfs mount.
I face that now and then and the only solution is to reboot the server, as
you won't be able to kill a process with pending IO.
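To confirm that, you can usually spot the stuck processes in uninterruptible
sleep (state D), e.g.:

# ps -eo pid,stat,wchan:32,cmd | awk '$2 ~ /D/'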
Regards,
Webert
I'm not using this feature, so maybe I'm missing something, but from
the way I understand cluster naming to work...
I still don't understand why this is blocking for you. Unless you are
attempting to mirror between two clusters running on the same hosts
(why would you do this?) then systemd doesn'
Hi,
Is there any other way except rebooting the server when the client hangs?
If the server is in a production environment, I can't restart it every time.
Webert de Souza Lima wrote on Wednesday, August 8, 2018 at 10:33 PM:
> Hi Zhenshi,
>
> if you still have the client mount hanging but no session is connected,
> you pro
You can only try to remount the cephfs dir. It will probably not work,
giving you I/O errors, so the fallback would be to use a fuse mount.
If I recall correctly you could do a lazy umount on the current dir (umount
-fl /mountdir) and remount it using the FUSE client.
It will work for new sessions.
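Roughly like this (a sketch assuming the default /etc/ceph/ceph.conf and admin
keyring are present on the client):

# umount -fl /mnt/cephfs
# ceph-fuse /mnt/cephfs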
Hi John,
With regard to memory pressure: does the cephfs fuse client also cause a
deadlock, or is this just the kernel client?
We run the fuse client on ten OSD nodes, and use parsync (parallel
rsync) to back up two BeeGFS systems (~1 PB).
Ordinarily fuse works OK, but any OSD problems can cause
On Wed, Aug 8, 2018 at 4:46 PM Jake Grimmett wrote:
>
> Hi John,
>
> With regard to memory pressure: does the cephfs fuse client also cause a
> deadlock, or is this just the kernel client?
TBH, I'm not expert enough on the kernel-side implementation of fuse
to say. Ceph does have the fuse_disab
On Tue, Aug 7, 2018 at 6:27 PM Raju Rangoju wrote:
> Hi,
>
>
>
> I have been running into some connection issues with the latest ceph-14
> version, so we thought the feasible solution would be to roll back the
> cluster to previous version (ceph-13.0.1) where things are known to work
> properly.
There is an undocumented part of the cephx authentication framework called
the 'auid' (auth uid) that assigns an integer identifier to cephx users
and to rados pools and allows you to craft cephx capabilities that apply
to those pools. This is leftover infrastructure from an ancient time in
wh
I looked at this a bit and it turns out anybody who's already in the slack
group can invite people with unrestricted domains. I think it's just part
of Slack that you need to specify which domains are allowed in by default?
Patrick set things up a couple years ago so I suppose our next community
ma
Hi,
I upgraded to 12.2.7 two weeks ago,
and I don't see any more memory increase! (Can't confirm that it was related
to your patch.)
Thanks again for helping!
Regards,
Alexandre Derumier
- Original message -
From: "Zheng Yan"
To: "aderumier"
Cc: "ceph-users"
Sent: Tuesday, 29 May
If, in the above case, osd 13 was not too busy to respond (resource
shortage) then you need to find out why else osd 5, etc. could not
contact it.
On Wed, Aug 8, 2018 at 6:47 PM, Josef Zelenka
wrote:
> Checked the system load on the host with the OSD that is suiciding currently
> and it's fine, h
Thanks Greg.
I think I have to re-install ceph v13 from scratch then.
-Raju
From: Gregory Farnum
Sent: 09 August 2018 01:54
To: Raju Rangoju
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] permission errors rolling back ceph cluster to v13
On Tue, Aug 7, 2018 at 6:27 PM Raju Rangoju
m
Hi Erik,
The thing is that the rbd-mirror service uses the /etc/sysconfig/ceph file to
determine which configuration file to use (from CLUSTER_NAME). So you need to
set this to the name you chose for rbd-mirror to work. However setting this
CLUSTER_NAME variable in /etc/sysconfig/ceph makes it
You could try flushing out the FileStore journals off the SSD and creating
new ones elsewhere (eg, colocated). This will obviously have a substantial
impact on performance but perhaps that’s acceptable during your upgrade
window?
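The usual FileStore sequence for that, sketched for a single OSD (id 12 is a
placeholder; stop the daemon first and adjust the journal path/symlink for
your deployment):

# systemctl stop ceph-osd@12
# ceph-osd -i 12 --flush-journal
(repoint the journal symlink or osd_journal setting at the new device/file)
# ceph-osd -i 12 --mkjournal
# systemctl start ceph-osd@12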
On Mon, Aug 6, 2018 at 12:32 PM Robert Stanford
wrote:
>
> Eugen: