[ceph-users] Power failure recovery woes

2015-02-17 Thread Jeff
st_e.version.version < e.version.version) common/HeartbeatMap.cc: 79: FAILED assert(0 == "hit suicide timeout") Does anyone have any suggestions on how to recover our cluster? Thanks! Jeff ___ ceph-users mailing list ceph-users@lis

Re: [ceph-users] Power failure recovery woes

2015-02-17 Thread Jeff
Udo, Yes, the osd is mounted: /dev/sda4 963605972 260295676 703310296 28% /var/lib/ceph/osd/ceph-2 Thanks, Jeff Original Message Subject: Re: [ceph-users] Power failure recovery woes Date: 2015-02-17 04:23 From: Udo Lembke To: Jeff , ceph-users

Re: [ceph-users] Power failure recovery woes

2015-02-17 Thread Jeff
er=ceph -i 0 -f Is there any way to get the cluster to recognize them as being up? osd-1 has the "FAILED assert(last_e.version.version < e.version.version)" errors. Thanks, Jeff # id weight type name up/down reweight -1 10.22 root default

Re: [ceph-users] Power failure recovery woes (fwd)

2015-02-20 Thread Jeff
Should I infer from the silence that there is no way to recover from the "FAILED assert(last_e.version.version < e.version.version)" errors? Thanks, Jeff - Forwarded message from Jeff - Date: Tue, 17 Feb 2015 09:16:33 -0500 From: Jeff To: ceph-users@l

[ceph-users] slow requests/blocked

2014-11-20 Thread Jeff
ocked messages. Any idea(s) on what's wrong/where to look? Thanks! Jeff -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] slow requests/blocked

2014-11-20 Thread Jeff
: /var/log/ceph/ceph-osd.11.log 81 ceph5: /var/log/ceph/ceph-osd.12.log 393 I'll try to catch them while they're happening and see what I can learn. Thanks again!! Jeff On Thu, Nov 20, 2014 at 06:40:57AM -0800, Jean-Charles LOPEZ wrote: > Hi Jeff, > > it would pro

[ceph-users] mon problem after power failure

2015-01-09 Thread Jeff
sing files; e.g.: /var/lib/ceph/mon/ceph-ceph4/store.db/4011258.ldb 2015-01-09 11:30:32.024445 b6ea1740 -1 failed to create new leveldb store Does anyone have any suggestions for how to get these two monitors running again? Thanks! Jeff ___

Re: [ceph-users] mon problem after power failure

2015-01-10 Thread Jeff
Thanks - ceph health is now reporting HEALTH_OK :-) On Sat, Jan 10, 2015 at 02:55:01AM +, Joao Eduardo Luis wrote: > On 01/09/2015 04:31 PM, Jeff wrote: > >We had a power failure last night and our five node cluster has > >two nodes with mon's that fail to start.

[ceph-users] Qs on caches, and cephfs

2017-10-22 Thread Jeff
o do everything manually right now to get a better understanding of it all. The ceph docs seem to be version controlled but I can't seem to find the repo to update, if you can point me to it I'd be happy to submit patches to it. Thnx in advance! Jeff. _

[ceph-users] long blocking with writes on rbds

2015-04-08 Thread Jeff Epstein
it } rule rule-district-2 { ruleset 1 type replicated min_size 2 max_size 3 step take district-2 step chooseleaf firstn 0 type osd step emit } # end crush map Does anyone have any insight into diagnosing this problem? Jeff __
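
The CRUSH rule quoted above is flattened by the archive preview; reconstructed in crushmap syntax (the bucket name "district-2" comes straight from the message, the rest of the map is omitted) it reads roughly:

    rule rule-district-2 {
        ruleset 1
        type replicated
        min_size 2
        max_size 3
        step take district-2
        step chooseleaf firstn 0 type osd
        step emit
    }
    # end crush map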

Re: [ceph-users] long blocking with writes on rbds

2015-04-08 Thread Jeff Epstein
of 3 this is 2200 pgs / OSD, which might be too much and unnecessarily increase the load on your OSDs. Best regards, Lionel Bouton Our workload involves creating and destroying a lot of pools. Each pool has 100 pgs, so it adds up. Could this be causing the problem? What would you suggest inste

Re: [ceph-users] long blocking with writes on rbds

2015-04-09 Thread Jeff Epstein
d various maps updated cluster wide. Rince and repeat until all objects have been dealt with. Quite a bit more involved, but that's the price you have to pay when you have a DISTRIBUTED storage architecture that doesn't rely on a single item (like an inode) to reflect things for the w

Re: [ceph-users] long blocking with writes on rbds

2015-04-08 Thread Jeff Epstein
a region as available and allowing it to be overwritten, as would a traditional file system? Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] 100% IO Wait with CEPH RBD and RSYNC

2015-04-22 Thread Jeff Epstein
Hi Christian This sounds like the same problem we are having. We get long wait times on ceph nodes, with certain commands (in our case, mainly mkfs) blocking for long periods of time, stuck in a wait (and not read or write) state. We get the same warning messages in syslog, as well. Jeff

Re: [ceph-users] long blocking with writes on rbds

2015-04-22 Thread Jeff Epstein
On 04/10/2015 10:10 AM, Lionel Bouton wrote: On 04/10/15 15:41, Jeff Epstein wrote: [...] This seems highly unlikely. We get very good performance without ceph. Requisitioning and manupulating block devices through LVM happens instantaneously. We expect that ceph will be a bit slower by

Re: [ceph-users] long blocking with writes on rbds

2015-04-22 Thread Jeff Epstein
, an outdated kernel driver isn't out of the question; if anyone has any concrete information, I'd be grateful. Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] long blocking with writes on rbds

2015-04-23 Thread Jeff Epstein
192.168.128.4:6800 socket closed (con state OPEN) Jeff On 04/23/2015 12:26 AM, Jeff Epstein wrote: Do you have some idea how I can diagnose this problem? I'll look at ceph -s output while you get these stuck process to see if there's any unusual activity (scrub/deep scrub/recovery/bacfill

Re: [ceph-users] long blocking with writes on rbds

2015-04-24 Thread Jeff Epstein
's a pastebin from an OSD experiencing the problem I described. I set debug_osd to 5/5. If you can provide any insight, I'd be grateful. http://pastebin.com/kLSwbVRb Also, if you have any more suggestions on how I can collect potentially interesting debug info, please let me know. Tha

Re: [ceph-users] long blocking with writes on rbds

2015-05-06 Thread Jeff Epstein
s now normal. Odd that no one here suggested this fix, and all the messing about with various topologies, placement groups, and so on, was for naught. Jeff On 04/09/2015 11:25 PM, Jeff Epstein wrote: As a follow-up to this issue, I'd like to point out some other things I've notice

[ceph-users] Did I permanently break it?

2013-07-29 Thread Jeff Moskow
ceph bits are up to date as of yesterday (ceph version 0.61.7 (8f010aff684e820ecc837c25ac77c7a05d7191ff). Thanks for any help/suggestions!! Jeff -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
This is the same issue as yesterday, but I'm still searching for a solution. We have a lot of data on the cluster that we need and can't get to it reasonably (It took over 12 hours to export a 2GB image). The only thing that status reports as wrong is: health HEALTH_WARN 1 pgs incomplete;

Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
7-30 06:08:18.883179 11127'11658123 12914'1506 [11,9] [11,9] 10321'11641837 2013-07-28 00:59:09.552640 10321'11641837 Thanks again! Jeff On Tue, Jul 30, 2013 at 11:44:58AM +0200, Jens Kristian S?gaard wrote: > Hi, > >> This is

Re: [ceph-users] "rbd ls -l" hangs

2013-07-30 Thread Jeff Moskow
OK - so while things are definitely better, we still are not where we were and "rbd ls -l" still hangs. Any suggestions? -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] "rbd ls -l" hangs

2013-08-01 Thread Jeff Moskow
MB in blocks of 4096 KB in 240.974360 sec at 4351 KB/sec 2013-08-01 12:43:39.320462 osd.12 172.16.170.5:6801/1700 1348 : [INF] bench: wrote 1024 MB in blocks of 4096 KB in 259.023646 sec at 4048 KB/sec Jeff -- ___ ceph-users mailing list ceph-users
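
Those "[INF] bench: wrote 1024 MB in blocks of 4096 KB ..." lines look like output from the OSD bench command; for reference, a run against the same OSD would be started with (osd.12 is taken from the quoted log line; the defaults noted in the comment are the usual ones, not stated in the message):

    $ ceph tell osd.12 bench
    # by default this writes 1 GB in 4 MB objects and logs the resulting
    # throughput, which is where the lines quoted above come from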

[ceph-users] re-initializing a ceph cluster

2013-08-05 Thread Jeff Moskow
TH_WARN 32 pgs degraded; 86 pgs stuck unclean Thanks! Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] pgs stuck unclean -- how to fix? (fwd)

2013-08-09 Thread Jeff Moskow
aded 2013-08-06 12:00:47.758742 21920'85238 21920'206648 [4,6] [4,6] 0'0 2013-08-05 06:58:36.681726 0'0 2013-08-05 06:58:36.681726 0.4e0 0 0 0 0 0 0 active+remapped 2013-08-06 12:00:47.765391

Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)

2013-08-09 Thread Jeff Moskow
Thanks for the suggestion. I had tried stopping each OSD for 30 seconds, then restarting it, waiting 2 minutes and then doing the next one (all OSD's eventually restarted). I tried this twice. -- ___ ceph-users mailing list ceph-users@lists.ceph.co

[ceph-users] ceph rbd io tracking (rbdtop?)

2013-08-12 Thread Jeff Moskow
Hi, The activity on our ceph cluster has gone up a lot. We are using exclusively RBD storage right now. Is there a tool/technique that could be used to find out which rbd images are receiving the most activity (something like "rbdtop")? Thanks,

Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)

2013-08-12 Thread Jeff Moskow
Sam, I've attached both files. Thanks! Jeff On Mon, Aug 12, 2013 at 01:46:57PM -0700, Samuel Just wrote: > Can you attach the output of ceph osd tree? > > Also, can you run > > ceph osd getmap -o /tmp/osdmap > > and attach /tmp/osdmap? > -Sam >
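
For reference, the diagnostics Sam asked for can be gathered like this (the /tmp/osd-tree.txt name is only an example; /tmp/osdmap is the path from his message):

    $ ceph osd tree > /tmp/osd-tree.txt
    $ ceph osd getmap -o /tmp/osdmap
    # attach both files to the reply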

Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)

2013-08-12 Thread Jeff Moskow
Sam, 3, 14 and 16 have been down for a while and I'll eventually replace those drives (I could do it now) but didn't want to introduce more variables. We are using RBD with Proxmox, so I think the answer about kernel clients is yes Jeff On Mon, Aug 12, 2013 at

Re: [ceph-users] pgs stuck unclean -- how to fix? (fwd)

2013-08-13 Thread Jeff Moskow
Sam, Thanks that did it :-) health HEALTH_OK monmap e17: 5 mons at {a=172.16.170.1:6789/0,b=172.16.170.2:6789/0,c=172.16.170.3:6789/0,d=172.16.170.4:6789/0,e=172.16.170.5:6789/0}, election epoch 9794, quorum 0,1,2,3,4 a,b,c,d,e osdmap e23445: 14 osds: 13 up, 13 in pgmap v1355

Re: [ceph-users] Wheezy machine died with problems on osdmap

2013-08-15 Thread Jeff Williams
Giuseppe, You could install the kernel from wheezy backports - it is currently at 3.9. http://backports.debian.org/Instructions/ http://packages.debian.org/source/stable-backports/linux Regards, Jeff On 14 August 2013 10:08, Giuseppe 'Gippa' Paternò wrote: > Hi Sage, >

Re: [ceph-users] "rbd ls -l" hangs

2013-08-15 Thread Jeff Moskow
ncing everything is working fine :-) (ceph auth del osd.x ; ceph osd crush rm osd.x ; ceph osd rm osd.x). Jeff On Wed, Aug 14, 2013 at 01:54:16PM -0700, Gregory Farnum wrote: > On Thu, Aug 1, 2013 at 9:57 AM, Jeff Moskow wrote: > > Greg, > > > > Thanks for the hints.
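
Spelled out, the per-OSD cleanup quoted in parentheses above is simply (run once for each dead OSD, with x replaced by its id):

    $ ceph auth del osd.x
    $ ceph osd crush rm osd.x
    $ ceph osd rm osd.x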

[ceph-users] performance questions

2013-08-17 Thread Jeff Moskow
t are the recommended ways of seeing who/what is consuming the largest amount of disk/network bandwidth? Thanks! Jeff -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] performance questions

2013-08-20 Thread Jeff Moskow
Hi, I am now occasionally seeing a ceph statuses like this: health HEALTH_WARN 2 requests are blocked > 32 sec They aren't always present even though the cluster is still slow, but they may be a clue.... Jeff On Sat, Aug 17, 2013 at 02:32:47PM -07

Re: [ceph-users] performance questions

2013-08-20 Thread Jeff Moskow
Hi, More information. If I look in /var/log/ceph/ceph.log, I see 7893 slow requests in the last 3 hours of which 7890 are from osd.4. Should I assume a bad drive? I SMART says the drive is healthy? Bad osd? Thanks, Jeff
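
A quick way to get a per-OSD breakdown like the one described is to count the slow-request lines in the cluster log; a rough sketch, assuming the default log location and that each line names the reporting OSD:

    $ grep 'slow request' /var/log/ceph/ceph.log \
        | grep -oE 'osd\.[0-9]+' | sort | uniq -c | sort -rn
    # a single OSD dominating the count (osd.4 here) usually points at its disk or host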

Re: [ceph-users] performance questions

2013-08-20 Thread Jeff Moskow
Martin, Thanks for the confirmation about 3-replica performance. dmesg | fgrep /dev/sdb # returns no matches Jeff -- ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] do not upgrade bobtail -> dumpling directly until 0.67.2

2013-08-21 Thread Jeff Bachtel
Is there an issue ID associated with this? For those of us who made the long jump and want to avoid any unseen problems. Thanks, Jeff On Tue, Aug 20, 2013 at 7:57 PM, Sage Weil wrote: > We've identified a problem when upgrading directly from bobtail to > dumpling; please wait u

Re: [ceph-users] ocfs2 for OSDs?

2013-09-12 Thread Jeff Bachtel
Previous experience with OCFS2 was that its actual performance was pretty lackluster/awful. The bits Oracle threw on top of (I think) ext3 to make it work as a multi-writer filesystem with all of the signalling that implies brought the overall performance down. Jeff On Wed, Sep 11, 2013 at 9:58

[ceph-users] Continually crashing osds

2013-10-21 Thread Jeff Williams
/csHHjC2h I have run the osds with the debug statements per the email, but I'm unsure where to post them, they are 108M each without compression. Should I create a bug on the tracker? Thanks, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com

Re: [ceph-users] Continually crashing osds

2013-10-21 Thread Jeff Williams
We're running xfs on a 3.8.0-31-generic kernel Thanks, Jeff On 10/21/13 1:54 PM, "Samuel Just" wrote: >It looks like an xattr vanished from one of your objects on osd.3. >What fs are you running? > >On Mon, Oct 21, 2013 at 9:58 AM, Jeff Williams >wrote: >&

Re: [ceph-users] Continually crashing osds

2013-10-21 Thread Jeff Williams
What is the best way to do that? I tried ceph pg repair, but it only did so much. On 10/21/13 3:54 PM, "Samuel Just" wrote: >Can you get the pg to recover without osd.3? >-Sam > >On Mon, Oct 21, 2013 at 1:59 PM, Jeff Williams >wrote: >> We're runn
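
One generic way to let a PG recover without a particular OSD, sketched here on the assumption that the surviving replicas are intact, is to mark that OSD out so its data backfills elsewhere:

    $ ceph osd out 3            # osd.3, per the thread above
    $ ceph health detail        # identify the affected PG(s) and watch them backfill
    $ ceph pg repair <pgid>     # what was already tried, per the message above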

Re: [ceph-users] Continually crashing osds

2013-10-21 Thread Jeff Williams
I apologize, I should have mentioned that both osd.3 and osd.11 crash immediately and if I do not 'set noout', the crash cascades to the rest of the cluster. Thanks, Jeff Sent from my Samsung Galaxy Note™, an AT&T LTE smartphone Original message From: Sam

Re: [ceph-users] the state of cephfs in giant

2014-10-13 Thread Jeff Bailey
On 10/13/2014 4:56 PM, Sage Weil wrote: On Mon, 13 Oct 2014, Eric Eastman wrote: I would be interested in testing the Samba VFS and Ganesha NFS integration with CephFS. Are there any notes on how to configure these two interfaces with CephFS? For ganesha I'm doing something like: FSAL { CE
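
The quoted config is cut off at "FSAL { CE"; as a minimal sketch, a CephFS export for ganesha generally looks like the block below (the export id, Path and Pseudo values are placeholders, not taken from the message):

    EXPORT {
        Export_Id = 100;
        Path = "/";
        Pseudo = "/cephfs";
        Access_Type = RW;
        FSAL {
            Name = CEPH;
        }
    }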

Re: [ceph-users] ulimit max user processes (-u) and non-root ceph clients

2013-12-16 Thread Jeff Bailey
On 12/16/2013 2:36 PM, Dan Van Der Ster wrote: On Dec 16, 2013 8:26 PM, Gregory Farnum wrote: On Mon, Dec 16, 2013 at 11:08 AM, Dan van der Ster wrote: Hi, Sorry to revive this old thread, but I wanted to update you on the current pains we're going through related to clients' nproc (and now

[ceph-users] Current state of OpenStack/Ceph rbd live migration?

2014-01-06 Thread Jeff Bachtel
just curious if this situation is rectified? Thanks, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] mon not binding to public interface

2014-01-13 Thread Jeff Bachtel
n this host) and osd (on other hosts) bind to 0.0.0.0 and a public IP, respectively. At this point public/cluster addr/network are WAY overspecified in ceph.conf, but the problem appeared with far less specification. Any ideas? Thanks, Jeff ___ cep

Re: [ceph-users] mon not binding to public interface

2014-01-15 Thread Jeff Bachtel
If I understand correctly then, I should either not specify mon addr or set it to an external IP? Thanks for the clarification, Jeff On 01/15/2014 03:58 PM, John Wilkins wrote: Jeff, First, if you've specified the public and cluster networks in [global], you don't need to

[ceph-users] Possible repo packaging regression

2014-04-30 Thread Jeff Bachtel
ead 7fa7524f67a0 The SRPM for what ended up on ceph-extras wasn't uploaded to the repo, so I didn't check to see if it was the Basho patch being applied again or something else. Downgrading back to leveldb 1.7.0-2 resolved my problem. Is

[ceph-users] Manually mucked up pg, need help fixing

2014-05-03 Thread Jeff Bachtel
the pg, but I'd prefer to learn enough of the innards to understand what is going on, and possible means of fixing it. Thanks for any help, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
question with the hope that the cluster would roll back epochs for 0.2f, but all it does is recreate the pg directory (empty) on osd.4. Jeff On 05/05/2014 04:33 PM, Gregory Farnum wrote: What's your cluster look like? I wonder if you can just remove the bad PG from osd.4 and let it r

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
Thanks. That is a cool utility, unfortunately I'm pretty sure the pg in question had a cephfs object instead of rbd images (because mounting cephfs is the only noticeable brokenness). Jeff On 05/05/2014 06:43 PM, Jake Young wrote: I was in a similar situation where I could see the PGs da

Re: [ceph-users] Manually mucked up pg, need help fixing

2014-05-05 Thread Jeff Bachtel
fter object recovery is as complete as it's going to get. At this point though I'm shrugging and accepting the data loss, but ideas on how to create a new pg to replace the incomplete 0.2f would be deeply useful. I'm supposing ceph pg force_create_pg 0.2f would suffice. Jeff

[ceph-users] pgs not mapped to osds, tearing hair out

2014-05-09 Thread Jeff Bachtel
't have any examples of how. Thanks for any help, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] pgs not mapped to osds, tearing hair out

2014-05-09 Thread Jeff Bachtel
Wow I'm an idiot for getting the wrong reweight command. Thanks so much, Jeff On May 9, 2014 11:06 AM, "Sage Weil" wrote: > On Fri, 9 May 2014, Jeff Bachtel wrote: > > I'm working on http://tracker.ceph.com/issues/8310 , basically by > bringing > > osds

Re: [ceph-users] v0.80.1 Firefly released

2014-05-12 Thread Jeff Bachtel
I see the EL6 build on http://ceph.com/rpm-firefly/el6/x86_64/ but not on gitbuilder (last build 07MAY). Is 0.80.1 considered a different branch ref for purposes of gitbuilder? Jeff On 05/12/2014 05:31 PM, Sage Weil wrote: This first Firefly point release fixes a few bugs, the most visible

[ceph-users] Problem with ceph_filestore_dump, possibly stuck in a loop

2014-05-16 Thread Jeff Bachtel
basic premise even trying to do that, please let me know so I can wave off (in which case, I believe I'd use ceph_filestore_dump to delete all copies of this pg in the cluster so I can force create it, which is failing at this time). Thanks, Jeff ___

[ceph-users] maximum number of mapped rbds?

2015-09-03 Thread Jeff Epstein
host. Can this be the source of the problem? If so, is there a workaround? $ rbd -p platform showmapped|wc -l 248 Thanks. Best, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

[ceph-users] occasional failure to unmap rbd

2015-09-25 Thread Jeff Epstein
cking the unmap? Is there a way to force unmap? Best, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] occasional failure to unmap rbd

2015-09-25 Thread Jeff Epstein
On 09/25/2015 12:38 PM, Ilya Dryomov wrote: On Fri, Sep 25, 2015 at 7:17 PM, Jeff Epstein wrote: We occasionally have a situation where we are unable to unmap an rbd. This occurs intermittently, with no obvious cause. For the most part, rbds can be unmapped fine, but sometimes we get this

Re: [ceph-users] occasional failure to unmap rbd

2015-09-25 Thread Jeff Epstein
refcount, lsof wouldn't help. Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] occasional failure to unmap rbd

2015-09-25 Thread Jeff Epstein
On 09/25/2015 02:28 PM, Jan Schermer wrote: What about /sys/block/krbdX/holders? Nothing in there? There is no /sys/block/krbd450, but there is /sys/block/rbd450. In our case, /sys/block/rbd450/holders is empty. Jeff ___ ceph-users mailing list
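
For anyone hitting the same stuck unmap, a few checks worth running before giving up, assuming the device is /dev/rbd450 as in the message (the force option only exists on reasonably recent rbd/krbd versions):

    $ ls /sys/block/rbd450/holders      # empty here, per the message
    $ grep rbd450 /proc/mounts          # still mounted in this namespace?
    $ grep rbd450 /proc/*/mountinfo     # mounted in another (container) namespace?
    $ rbd unmap -o force /dev/rbd450    # last resort on newer clients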

Re: [ceph-users] Fast Ceph a Cluster with PB storage

2016-08-09 Thread Jeff Bailey
On 8/9/2016 10:43 AM, Wido den Hollander wrote: On 9 August 2016 at 16:36, Александр Пивушков wrote: > >> Hello dear community! I'm new to Ceph and only recently took up the topic of building clusters, so your opinion is very important to me. It is necessary to create a clus

[ceph-users] Unable to get RadosGW working on CentOS 6

2013-05-13 Thread Jeff Bachtel
c -l 237 VirtualHost servername matches fqdn. ceph.conf uses short hostname (both are in /etc/hosts pointing to same IP). Any ideas what might be causing the FastCGI errors? I saw the similar problems originally with fcgid, which was what led me to install mod_fastcgi. Thanks, Jeff _

Re: [ceph-users] Unable to get RadosGW working on CentOS 6

2013-05-14 Thread Jeff Bachtel
That configuration option is set, the results are the same. To clarify: do I need to start radosgw from the command line if it is being spawned by fastcgi? I've tried it both ways with the same result. Thanks, Jeff On Tue, May 14, 2013 at 12:56 AM, Yehuda Sadeh wrote: > On Mon, May

Re: [ceph-users] Unable to get RadosGW working on CentOS 6

2013-05-14 Thread Jeff Bachtel
next branch, things seem to be working (s3test.py is successful). Thanks for the help, Jeff On Tue, May 14, 2013 at 6:35 AM, Jeff Bachtel < jbach...@bericotechnologies.com> wrote: > That configuration option is set, the results are the same. To clarify: do > I need to start radosgw

Re: [ceph-users] CentOS + qemu-kvm rbd support update

2013-06-04 Thread Jeff Bachtel
Hijacking (because it's related): a couple weeks ago on IRC it was indicated a repo with these (or updated) qemu builds for CentOS should be coming soon from Ceph/Inktank. Did that ever happen? Thanks, Jeff On Mon, Jun 3, 2013 at 10:25 PM, YIP Wai Peng wrote: > Hi Andrel, > >

Re: [ceph-users] Issues with a fresh cluster and HEALTH_WARN

2013-06-06 Thread Jeff Bailey
You need to fix your clocks (usually with ntp). According to the log message the clocks are only allowed to be off by 50ms, and yours seem to be about 85ms off. On 6/6/2013 8:40 PM, Joshua Mesilane wrote: > Hi, > > I'm currently evaluating ceph as a solution to some HA storage that > we're looking at. To test I have
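
The 50ms figure matches the monitors' default drift allowance (mon_clock_drift_allowed, 0.05s). Two quick checks, run on each monitor host and assuming ntpd is the time daemon in use as suggested above:

    $ ceph health detail | grep -i clock    # names the skewed mon and the measured offset
    $ ntpq -p                               # is ntpd actually peered and synced?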

Re: [ceph-users] Changing Replication count

2016-09-06 Thread Jeff Bailey
On 9/6/2016 8:41 PM, Vlad Blando wrote: Hi, My replication count now is this [root@controller-node ~]# ceph osd lspools 4 images,5 volumes, Those aren't replica counts they're pool ids. [root@controller-node ~]# and I made adjustment and made it to 3 for images and 2 to volumes to 3, it
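
For completeness, per-pool replica counts are read and changed like this (the pool names follow the quoted lspools output; size 3 mirrors what the poster was aiming for):

    $ ceph osd dump | grep 'replicated size'   # current size/min_size for every pool
    $ ceph osd pool set images size 3
    $ ceph osd pool set volumes size 3
    $ ceph osd pool get images size            # confirm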

[ceph-users] maintenance questions

2016-10-07 Thread Jeff Applewhite
- new to the list.​ ​Thanks in advance!​ -- Jeff Applewhite Principal Product Manager ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Jeff Bailey
I haven't had any problems using 375GB P4800X's in R730 and R740xd machines for DB+WAL. The iDRAC whines a bit on the R740 but everything works fine. On 9/6/2018 3:09 PM, Steven Vacaroaia wrote: Hi , Just to add to this question, is anyone using Intel Optane DC P4800X on DELL R630 ...or any

[ceph-users] interpreting ceph mds stat

2018-10-03 Thread Jeff Smith
? What is the 0=a=up:active? Is that saying rank 0 of file system a is up:active? Jeff Smith ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
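
Roughly, yes; the braces in the mds stat output decompose as below (the middle token is the name of the MDS daemon holding the rank, while the filesystem name appears separately, earlier in the line):

    {0=a=up:active}
     | | '-- daemon state: up, and in the "active" MDS state
     | '---- name of the MDS daemon holding the rank (mds.a here)
     '------ rank 0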

Re: [ceph-users] list admin issues

2018-10-06 Thread Jeff Smith
I have been removed twice. On Sat, Oct 6, 2018 at 7:07 AM Elias Abacioglu wrote: > > Hi, > > I'm bumping this old thread cause it's getting annoying. My membership get > disabled twice a month. > Between my two Gmail accounts I'm in more than 25 mailing lists and I see > this behavior only here.

[ceph-users] mds will not activate

2018-10-06 Thread Jeff Smith
I had to reboot my mds. The hot spare did not kick in and now I am showing the filesystem is degraded and offline. Both mds are showing as up:standby. I am not sure how to proceed. cluster: id: 188c7fba-288f-45e9-bca1-cc5fceccd2a1 health: HEALTH_ERR 1 filesystem is deg

Re: [ceph-users] list admin issues

2018-10-08 Thread Jeff Smith
> > > On Sun, Oct 7, 2018 at 5:38 AM Svante Karlsson >> > > wrote: >> > >> >> > >> I'm also getting removed but not only from ceph. I subscribe >> > >> d...@kafka.apache.org list and the same thing happens there. >>

Re: [ceph-users] Intel P3700 PCI-e as journal drives?

2016-01-12 Thread Jeff Bailey
On 1/12/2016 4:51 AM, Burkhard Linke wrote: Hi, On 01/08/2016 03:02 PM, Paweł Sadowski wrote: Hi, Quick results for 1/5/10 jobs: *snipsnap* Run status group 0 (all jobs): WRITE: io=21116MB, aggrb=360372KB/s, minb=360372KB/s, maxb=360372KB/s, mint=6msec, maxt=6msec *snipsnap*

[ceph-users] OSDs are down, don't know why

2016-01-15 Thread Jeff Epstein
- Is there any need to open ports other than TCP 6789 and 6800-6803? - Any other suggestions? ceph 0.94 on Debian Jessie Best, Jeff ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
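
On the ports question: OSD daemons bind to ports from a configurable range rather than a fixed set, so 6800-6803 may be too narrow. A hedged iptables sketch (the range reflects the usual ms_bind_port_min/ms_bind_port_max defaults of 6800-7300):

    # monitor
    $ iptables -A INPUT -p tcp --dport 6789 -j ACCEPT
    # OSD/MDS daemons pick ports from this range
    $ iptables -A INPUT -p tcp --dport 6800:7300 -j ACCEPT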

Re: [ceph-users] OSDs are down, don't know why

2016-01-18 Thread Jeff Epstein
Hi Steve Thanks for your answer. I don't have a private network defined. Furthermore, in my current testing configuration, there is only one OSD, so communication between OSDs should be a non-issue. Do you know how OSD up/down state is determined when there is only one OSD? Best, Jeff

Re: [ceph-users] OSDs are down, don't know why

2016-01-18 Thread Jeff Epstein
age, be advised that any dissemination or copying of this message is prohibited. If you received this message erroneously, please notify the sender and delete it, together with any attachments. -Original Message- From: Jeff Epstein [mailto:jeff.epst...@commerceguys.com] Sent: Monday, Jan

Re: [ceph-users] Tips for faster openstack instance boot

2016-02-08 Thread Jeff Bailey
Your glance images need to be raw, also. A QCOW image will be copied/converted. On 2/8/2016 3:33 PM, Jason Dillaman wrote: If Nova and Glance are properly configured, it should only require a quick clone of the Glance image to create your Nova ephemeral image. Have you double-checked your c
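
A hedged example of checking and converting an image before uploading it to Glance (the file and image names are placeholders):

    $ qemu-img info cirros.qcow2          # reports "file format: qcow2"
    $ qemu-img convert -f qcow2 -O raw cirros.qcow2 cirros.raw
    $ openstack image create --disk-format raw --container-format bare \
          --file cirros.raw cirros-raw

With raw images, the ephemeral disk can be a quick copy-on-write clone of the Glance image instead of a full copy/convert, as described above.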

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-13 Thread Jeff Layton
g missing at this point is delegations in an active/active configuration, but that's mainly because of the synchronous nature of libcephfs. We have a potential fix for that problem but it requires work in libcephfs that is not yet done. Cheers, -- Jeff Layton ___

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-14 Thread Jeff Layton
On Thu, 2019-02-14 at 10:35 +0800, Marvin Zhang wrote: > On Thu, Feb 14, 2019 at 8:09 AM Jeff Layton wrote: > > > Hi, > > > As http://docs.ceph.com/docs/master/cephfs/nfs/ says, it's OK to > > > config active/passive NFS-Ganesha to use CephFs. My question is if

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-14 Thread Jeff Layton
On Thu, 2019-02-14 at 19:49 +0800, Marvin Zhang wrote: > Hi Jeff, > Another question is about Client Caching when disabling delegation. > I set breakpoint on nfs4_op_read, which is OP_READ process function in > nfs-ganesha. Then I read a file, I found that it will hit only once on >

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-14 Thread Jeff Layton
n the v4 client does revalidate the cache, it relies heavily on NFSv4 change attribute. Cephfs's change attribute is cluster-coherent too, so if the client does revalidate it should see changes made on other servers. > On Thu, Feb 14, 2019 at 8:29 PM Jeff Layton wrote: > > On Thu, 201

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-15 Thread Jeff Layton
On Fri, 2019-02-15 at 15:34 +0800, Marvin Zhang wrote: > Thanks Jeff. > If I set Attr_Expiration_Time as zero in the conf, does it mean the timeout > is zero? If so, every client will see the change immediately. Will it > hurt performance badly? > It seems that GlusterFS FSAL

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-18 Thread Jeff Layton
when running fio on a single file from a > single client. > > NFS iops? I'd guess more READ ops in particular? Is that with a FSAL_CEPH backend? > > > > > > On Thu, Feb 14, 2019 at 9:04 PM Jeff Layton > > > wrote: > > > > On Thu, 2019-02-14

Re: [ceph-users] Fwd: NAS solution for CephFS

2019-02-18 Thread Jeff Layton
vides any performance gain when the attributes are already cached in the libcephfs layer. If we did want to start using the mdcache, then we'd almost certainly want to invalidate that cache when libcephfs gives up caps. I just don't see how the extra layer of caching provides mu

Re: [ceph-users] Intel P4600 3.2TB U.2 form factor NVMe firmware problems causing dead disks

2019-02-26 Thread Jeff Smith
We had several postgresql servers running these disks from Dell. Numerous failures, including one server that had 3 die at once. Dell claims it is a firmware issue instructed us to upgrade to QDV1DP15 from QDV1DP12 (I am not sure how these line up to the Intel firmwares). We lost several more

[ceph-users] Deploying a Ceph+NFS Server Cluster with Rook

2019-03-06 Thread Jeff Layton
/deploying-a-cephnfs-server-cluster-with-rook/ I don't think that site has a way to post comments, but I'm happy to answer questions about it via email. -- Jeff Layton signature.asc Description: This is a digitally signed message part ___ ceph-use

Re: [ceph-users] NFS-Ganesha CEPH_FSAL | potential locking issue

2019-04-16 Thread Jeff Layton
like ganesha is probably just too swamped with write requests to do much else, but you'll probably want to do the legwork starting with the hanging application, and figure out what it's doing that takes so long. Is it some syscall? Which one? From there you can start looking at statisti

Re: [ceph-users] Nfs-ganesha with rados_kv backend

2019-05-29 Thread Jeff Layton
. With that, you can also use the rados_ng recovery backend, which is more resilient in the face of multiple crashes. In that configuration you would want to have the same config file on both nodes, including the same nodeid so that you can potentially take advantage of the RECLAIM_RESET interface to kill off the old session quickly after the server restarts. You also need a much longer grace period. Cheers, -- Jeff Layton ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
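
A minimal sketch of the ganesha settings described above, assuming a build with the RADOS recovery backends available (the pool, namespace and nodeid values are placeholders, and the grace period is just an example of "much longer"):

    NFSv4 {
        RecoveryBackend = rados_ng;
        Grace_Period = 90;
    }
    RADOS_KV {
        # ceph_conf = "/etc/ceph/ceph.conf";
        pool = "nfs-ganesha";
        namespace = "grace";
        nodeid = "nfs-ha";   # same value on both nodes, as discussed above
    }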

Re: [ceph-users] CephFS : Kernel/Fuse technical differences

2019-06-24 Thread Jeff Layton
ack down to the kernel which then wakes up the original task so it can get the result. FUSE is a wonderful thing, but it's not really built for speed. -- Jeff Layton ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-17 Thread Jeff Layton
This is almost certainly the same bug that is fixed here: https://github.com/ceph/ceph/pull/28324 It should get backported soon-ish but I'm not sure which luminous release it'll show up in. Cheers, Jeff On Wed, 2019-07-17 at 10:36 +0100, David C wrote: > Thanks for taking a

Re: [ceph-users] [Nfs-ganesha-devel] 2.7.3 with CEPH_FSAL Crashing

2019-07-17 Thread Jeff Layton
Ahh, I just noticed you were running nautilus on the client side. This patch went into v14.2.2, so once you update to that you should be good to go. -- Jeff On Wed, 2019-07-17 at 17:10 -0400, Jeff Layton wrote: > This is almost certainly the same bug that is fixed here: > > https://g

Re: [ceph-users] Ceph nfs ganesha exports

2019-08-01 Thread Jeff Layton
or viruses; you must scan for > these. Please note that e-mails sent to and from blocz IO Limited are > routinely monitored for record keeping, quality control and training > purposes, to ensure regulatory compliance and to prevent viruses and > unauthorised use of our computer syste

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-14 Thread Jeff Layton
el log from one of the hosts (the other two were similar): > > https://mrcn.st/p/ezrhr1qR > > > > After playing some service failover games and hard rebooting the three > > affected client boxes everything seems to be fine. The remaining FS > > client box had no kernel err

Re: [ceph-users] CephFS meltdown fallout: mds assert failure, kernel oopses

2019-08-15 Thread Jeff Layton
On Thu, 2019-08-15 at 16:45 +0900, Hector Martin wrote: > On 15/08/2019 03.40, Jeff Layton wrote: > > On Wed, 2019-08-14 at 19:29 +0200, Ilya Dryomov wrote: > > > Jeff, the oops seems to be a NULL dereference in ceph_lock_message(). > > > Please take a look. > >

Re: [ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-17 Thread Jeff Layton
Actually, scratch that. I went ahead and opened this: https://tracker.ceph.com/issues/43649 Feel free to watch that one for updates. On Fri, 2020-01-17 at 07:43 -0500, Jeff Layton wrote: > No problem. Can you let me know the tracker bug number once you've > opened it? > &g

Re: [ceph-users] Weird mount issue (Ubuntu 18.04, Ceph 14.2.5 & 14.2.6)

2020-01-17 Thread Jeff Layton
On Fri, 2020-01-17 at 17:10 +0100, Ilya Dryomov wrote: > On Fri, Jan 17, 2020 at 2:21 AM Aaron wrote: > > No worries, can definitely do that. > > > > Cheers > > Aaron > > > > On Thu, Jan 16, 2020 at 8:08 PM Jeff Layton wrote: > > > On T

Re: [ceph-users] pgs stuck unclean after growing my ceph-cluster

2013-03-13 Thread Jeff Anderson-Lee
u might also just be able to enable the CRUSH tunables (http://ceph.com/docs/master/rados/operations/crush-map/#tunables). I experienced this (stuck active+remapped) frequently with the stock 0.41 apt-get/Ubuntu version of ceph. Less so with Bobtail. Jeff Anderson-Lee John, this is becoming a more
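
For reference, on later releases the tunables can be switched with a single command, while bobtail-era tools require editing the decompiled CRUSH map by hand (the file names below are examples):

    $ ceph osd crush tunables optimal      # where this subcommand exists
    # otherwise: extract, edit the "tunable ..." lines per the doc link above, re-inject
    $ ceph osd getcrushmap -o crush.bin
    $ crushtool -d crush.bin -o crush.txt
    $ crushtool -c crush.txt -o crush.new
    $ ceph osd setcrushmap -i crush.new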
