Re: [ceph-users] Looking for help with debugging cephfs snapshots

2017-10-22 Thread Eric Eastman
On Sun, Oct 22, 2017 at 8:05 PM, Yan, Zheng wrote: > On Mon, Oct 23, 2017 at 9:35 AM, Eric Eastman > wrote: > > With help from the list we recently recovered one of our Jewel based > > clusters that started failing when we got to about 4800 cephfs snapshots. > > W

[ceph-users] Looking for help with debugging cephfs snapshots

2017-10-22 Thread Eric Eastman
With help from the list we recently recovered one of our Jewel based clusters that started failing when we got to about 4800 cephfs snapshots. We understand that cephfs snapshots are still marked experimental. We are running a single active MDS with 2 standby MDS. We only have a single file syst
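For reference, a minimal sketch of inspecting snapshots from a client, assuming the file system is kernel mounted at /cephfs and the snapshots were taken at the top of the tree:
    ls /cephfs/.snap                  # list snapshots taken at this directory
    ls /cephfs/.snap | wc -l          # rough snapshot count
    rmdir /cephfs/.snap/<snap-name>   # snapshots are removed with rmdir, not rm -rf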

Re: [ceph-users] Is the StupidAllocator supported in Luminous?

2017-09-09 Thread Eric Eastman
Opened: http://tracker.ceph.com/issues/21332 On Sat, Sep 9, 2017 at 10:03 PM, Gregory Farnum wrote: > Yes. Please open a ticket! > > >> ___ >> ceph-users mailing list >> ceph-users@lists.ceph.com >> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.co

[ceph-users] Is the StupidAllocator supported in Luminous?

2017-09-09 Thread Eric Eastman
I am seeing OOM issues with some of my OSD nodes that I am testing with Bluestore on 12.2.0, so I decided to try the StupidAllocator to see if it has a smaller memory footprint, by setting the following in my ceph.conf: bluefs_allocator = stupid bluestore_cache_size_hdd = 1073741824 bluestore_cach
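For reference, a sketch of the kind of ceph.conf stanza being described; the preview is truncated, so the last option is an assumption and the values are simply the ones quoted above:
    [osd]
    bluefs_allocator = stupid
    bluestore_cache_size_hdd = 1073741824   # 1 GiB
    bluestore_cache_size_ssd = 1073741824   # assumption: the truncated line likely set the SSD cache size as well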

Re: [ceph-users] Ceph release cadence

2017-09-06 Thread Eric Eastman
I have been working with Ceph for the last several years and I help support multiple Ceph clusters. I would like to have the team drop the Even/Odd release schedule, and go to an all production release schedule. I would like releases on no more than a 9 month schedule, with smaller incremental cha

Re: [ceph-users] Ceph file system hang

2017-06-16 Thread Eric Eastman
I have created a ticket on this issue: http://tracker.ceph.com/issues/20329 On Thu, Jun 15, 2017 at 12:14 PM, Eric Eastman wrote: > On Thu, Jun 15, 2017 at 11:45 AM, David Turner wrote: >> Have you compared performance to mounting cephfs using ceph-fuse instead of >> the kernel

Re: [ceph-users] Ceph file system hang

2017-06-15 Thread Eric Eastman
file system should not hang. Thanks, Eric > > On Thu, Jun 15, 2017 at 12:39 PM Eric Eastman > wrote: >> >> We are running Ceph 10.2.7 and after adding a new multi-threaded >> writer application we are seeing hangs accessing metadata from ceph >> file system kernel mount

[ceph-users] Ceph file system hang

2017-06-15 Thread Eric Eastman
We are running Ceph 10.2.7 and after adding a new multi-threaded writer application we are seeing hangs accessing metadata from ceph file system kernel mounted clients. I have a "du -ah /cephfs" process that been stuck for over 12 hours on one cephfs client system. We started seeing hung "du -ah
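For reference, a few commonly used checks for a hung CephFS kernel client; the mds id and paths are placeholders:
    cat /sys/kernel/debug/ceph/*/mdsc        # in-flight MDS requests on the client (needs debugfs mounted)
    cat /sys/kernel/debug/ceph/*/osdc        # in-flight OSD requests on the client
    ceph daemon mds.<id> dump_ops_in_flight  # on the active MDS host
    ceph daemon mds.<id> session ls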

Re: [ceph-users] cephfs (rbd) read performance low - where is the bottleneck?

2016-11-20 Thread Eric Eastman
Have you looked at your file layout? On a test cluster running 10.2.3 I created a 5GB file and then looked at the layout: # ls -l test.dat -rw-r--r-- 1 root root 524288 Nov 20 23:09 test.dat # getfattr -n ceph.file.layout test.dat # file: test.dat ceph.file.layout="stripe_unit=4194304 s
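For reference, the layout is read and changed through the same virtual xattrs; a sketch, assuming a directory whose new files should get a different stripe (the values are examples only):
    getfattr -n ceph.file.layout test.dat
    setfattr -n ceph.dir.layout.stripe_unit  -v 1048576 /cephfs/testdir
    setfattr -n ceph.dir.layout.stripe_count -v 8       /cephfs/testdir
    setfattr -n ceph.dir.layout.object_size  -v 4194304 /cephfs/testdir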

Re: [ceph-users] Recovering full OSD

2016-08-08 Thread Eric Eastman
Under Jewel 10.2.2 I have also had to delete PG directories to get very full OSDs to restart. I first use "du -sh *" under the "current" directory to find which OSD directories are the fullest on the full OSD disk, and pick 1 of the fullest. I then look at the PG map and verify the PG is replicate
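For reference, a sketch of the checks described above, assuming the default OSD data path; always confirm the PG is healthy on its other copies before removing anything:
    du -sh /var/lib/ceph/osd/ceph-<id>/current/* | sort -h | tail
    ceph pg map <pgid>      # which OSDs hold this PG
    ceph pg <pgid> query    # confirm it is active+clean elsewhere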

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-11 Thread Eric Eastman
On Wed, May 11, 2016 at 2:04 AM, Nick Fisk wrote: >> -Original Message- >> From: Eric Eastman [mailto:eric.east...@keepertech.com] >> Sent: 10 May 2016 18:29 >> To: Nick Fisk >> Cc: Ceph Users >> Subject: Re: [ceph-users] CephFS + CTDB/Samba - MDS ses

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-10 Thread Eric Eastman
On Tue, May 10, 2016 at 6:48 AM, Nick Fisk wrote: > > >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Nick Fisk >> Sent: 10 May 2016 13:30 >> To: 'Eric Eastman' >> Cc: 'Ceph

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-09 Thread Eric Eastman
On Mon, May 9, 2016 at 8:08 PM, Yan, Zheng wrote: > On Tue, May 10, 2016 at 2:10 AM, Eric Eastman > wrote: >> On Mon, May 9, 2016 at 10:36 AM, Gregory Farnum wrote: >>> On Sat, May 7, 2016 at 9:53 PM, Eric Eastman >>> wrote: >>>> On Fri, May 6,

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-09 Thread Eric Eastman
On Mon, May 9, 2016 at 3:28 PM, Nick Fisk wrote: > Hi Eric, > >> >> I am trying to do some similar testing with SAMBA and CTDB with the Ceph >> file system. Are you using the vfs_ceph SAMBA module or are you kernel >> mounting the Ceph file system? > > I'm using the kernel client. I couldn't find

Re: [ceph-users] CephFS + CTDB/Samba - MDS session timeout on lockfile

2016-05-09 Thread Eric Eastman
I am trying to do some similar testing with SAMBA and CTDB with the Ceph file system. Are you using the vfs_ceph SAMBA module or are you kernel mounting the Ceph file system? Thanks Eric On Mon, May 9, 2016 at 9:31 AM, Nick Fisk wrote: > Hi All, > > I've been testing an active/active Samba clus
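For reference, a minimal vfs_ceph share sketch; the share name, path, and cephx user are assumptions:
    [cephfs]
        path = /
        vfs objects = ceph
        ceph:config_file = /etc/ceph/ceph.conf
        ceph:user_id = samba
        read only = no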

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-09 Thread Eric Eastman
On Mon, May 9, 2016 at 10:36 AM, Gregory Farnum wrote: > On Sat, May 7, 2016 at 9:53 PM, Eric Eastman > wrote: >> On Fri, May 6, 2016 at 2:14 PM, Eric Eastman >> wrote: >> >> >> A simple test of setting an ACL from the command line to a fuse >> mount

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-07 Thread Eric Eastman
On Fri, May 6, 2016 at 2:14 PM, Eric Eastman wrote: > As it should be working, I will increase the logging level in my > smb.conf file and see what info I can get out of the logs, and report back. Setting the log level = 20 in my smb.conf file, and trying to add an additional use

Re: [ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-06 Thread Eric Eastman
Samba vfs_ceph.c ACL patches listed here: https://lists.samba.org/archive/samba-technical/2016-March/113063.html Are not in the released 4.4.2 Samba source code or in the master git branch of Samba. -Eric On Fri, May 6, 2016 at 12:53 PM, Gregory Farnum wrote: > On Fri, May 6, 2016 at 9:53 AM, Eric

[ceph-users] ACL support in Jewel using fuse and SAMBA

2016-05-06 Thread Eric Eastman
I was doing some SAMBA testing and noticed that a kernel mounted share acted differently than a fuse mounted share with Windows security on my windows client. I cut my test down to as simple as possible, and I am seeing the kernel mounted Ceph file system working as expected with SAMBA and the fuse
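For reference, a simple ACL round-trip that can be run once on a kernel mount and once on a ceph-fuse mount to compare behaviour; the path is an example, and note that ceph-fuse may additionally need ACL support enabled in ceph.conf (for example client acl type = posix_acl and fuse default permissions = false; treating those options as applicable here is an assumption):
    touch /cephfs/acltest
    setfacl -m u:nobody:rw /cephfs/acltest
    getfacl /cephfs/acltest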

Re: [ceph-users] Multiple MDSes

2016-04-22 Thread Eric Eastman
On Fri, Apr 22, 2016 at 9:59 PM, Andrus, Brian Contractor wrote: > All, > > Ok, I understand Jewel is considered stable for CephFS with a single active > MDS. > > But, how do I add a standby MDS? What documentation I find is a bit > confusing. > > I ran > > ceph-deploy create mds systemA > ceph-de
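For reference, a sketch of adding standby MDS daemons; any MDS beyond the single active one simply registers as a standby (host names are placeholders, and the argument order is "mds create"):
    ceph-deploy mds create systemB
    ceph-deploy mds create systemC
    ceph mds stat    # should report 1 up:active plus the standbys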

Re: [ceph-users] v10.0.4 released

2016-03-19 Thread Eric Eastman
Thank you for doing this. It will make testing 10.0.x easier for all of us in the field, and will make it easier to report bugs, as we will know that the problems we find were not caused by our build process. Eric On Wed, Mar 16, 2016 at 7:14 AM, Loic Dachary wrote: > Hi, > > Because of a tiny

Re: [ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
On Wed, Nov 11, 2015 at 4:19 PM, John Spray wrote: > > Eric: for the ticket, can you also gather an MDS log (with debug mds = > 20) from the point where the MDS starts up until the point where it > has been active for a few seconds? The strays are evaluated for > purging during startup, so there
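For reference, a sketch of gathering that log; since the interesting part is MDS startup, the persistent ceph.conf route is the relevant one (the daemon id is a placeholder):
    # in ceph.conf on the MDS host, under [mds]:
    #   debug mds = 20
    # then restart the MDS and collect /var/log/ceph/ceph-mds.<id>.log
    # for an already running daemon, the level can also be raised on the fly:
    ceph daemon mds.<id> config set debug_mds 20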

Re: [ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
On Wed, Nov 11, 2015 at 11:09 AM, John Spray wrote: > On Wed, Nov 11, 2015 at 5:39 PM, Eric Eastman > wrote: >> I am trying to figure out why my Ceph file system is not freeing >> space. Using Ceph 9.1.0 I created a file system with snapshots >> enabled, filled up t

[ceph-users] Ceph file system is not freeing space

2015-11-11 Thread Eric Eastman
I am trying to figure out why my Ceph file system is not freeing space. Using Ceph 9.1.0 I created a file system with snapshots enabled, filled up the file system over days while taking snapshots hourly. I then deleted all files and all snapshots, but Ceph is not returning the space. I left the c
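For reference, two hedged checks for this situation: whether deleted files are still being held as MDS strays, and whether the pools themselves still hold the data (perf counter names vary by release):
    ceph daemon mds.<id> perf dump | grep -i stray
    ceph df detail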

Re: [ceph-users] Read-out much slower than write-in on my ceph cluster

2015-10-28 Thread Eric Eastman
On the RBD performance issue, you may want to look at: http://tracker.ceph.com/issues/9192 Eric On Tue, Oct 27, 2015 at 8:59 PM, FaHui Lin wrote: > Dear Ceph experts, > > I found something strange about the performance of my Ceph cluster: Read-out > much slower than write-in. > > I have 3 machin
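For reference, one way to narrow down whether the read/write asymmetry is at the RADOS layer or in the RBD client path is a raw bench against a test pool (the pool name is an example):
    rados bench -p rbd 30 write --no-cleanup
    rados bench -p rbd 30 seq
    rados -p rbd cleanup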

[ceph-users] Seeing huge number of open pipes per OSD process

2015-10-05 Thread Eric Eastman
I am testing a Ceph cluster running Ceph v9.0.3 on Trusty using the 4.3rc4 kernel and I am seeing a huge number of open pipes on my OSD processes as I run a sequential load on the system using a single Ceph file system client. A "lsof -n > file.txt" on one of the OSD servers produced a 9GB file wi
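For reference, a lighter-weight way to track the pipe count per OSD than a full lsof dump; a sketch assuming the OSD processes are named ceph-osd:
    for p in $(pidof ceph-osd); do
        echo -n "pid $p: "; ls -l /proc/$p/fd 2>/dev/null | grep -c pipe
    done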

Re: [ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
On Sun, Aug 16, 2015 at 9:12 PM, Yan, Zheng wrote: > On Mon, Aug 17, 2015 at 9:38 AM, Eric Eastman > wrote: >> Hi, >> >> I need to verify in Ceph v9.0.2 if the kernel version of Ceph file >> system supports ACLs and the libcephfs file system interface does not

[ceph-users] Ceph File System ACL Support

2015-08-16 Thread Eric Eastman
Hi, I need to verify in Ceph v9.0.2 if the kernel version of Ceph file system supports ACLs and the libcephfs file system interface does not. I am trying to have SAMBA, version 4.3.0rc1, support Windows ACLs using "vfs objects = acl_xattr" with the SAMBA VFS Ceph file system interface "vfs objects

Re: [ceph-users] rbd-fuse Transport endpoint is not connected

2015-07-30 Thread Eric Eastman
It is great having access to features that are not fully production ready, but it would be nice to know which Ceph features are ready and which are not. Just like Ceph File System is well marked that it is not yet fully ready for production, it would be nice if rbd-fuse could be marked as not read

Re: [ceph-users] State of nfs-ganesha CEPH fsal

2015-07-27 Thread Eric Eastman
We are looking at using Ganesha NFS with the Ceph file system. Currently I am testing the FSAL interface on Ganesha NFS Release = V2.2.0-2 running on Ceph 9.0.2. This is all early work, as Ceph FS is still not considered production ready, and Ceph 9.0.2 is a development release. Currently I am on
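For reference, a minimal FSAL_CEPH export sketch for Ganesha V2.2; the pseudo path is an assumption and option names should be checked against the sample configs shipped with Ganesha:
    EXPORT {
        Export_Id = 1;
        Path = "/";
        Pseudo = "/cephfs";
        Access_Type = RW;
        FSAL {
            Name = CEPH;
        }
    }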

Re: [ceph-users] Weird behaviour of cephfs with samba

2015-07-27 Thread Eric Eastman
I don't have any answers but I am also seeing some strange results exporting a Ceph file system using the Samba VFS interface on Ceph version 9.0.2. If I mount a Linux client with vers=1, I see the file system the same as I see it on a ceph file system mount. If I use vers=2.0 or vers=3.0 on the

Re: [ceph-users] Enclosure power failure pausing client IO till all connected hosts up

2015-07-23 Thread Eric Eastman
You may want to check your min_size value for your pools. If it is set to the pool size value, then the cluster will not do I/O if you lose a chassis. On Sun, Jul 5, 2015 at 11:04 PM, Mallikarjun Biradar wrote: > Hi all, > > Setup details: > Two storage enclosures each connected to 4 OSD nodes
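For reference, the check and change being suggested; the pool name is a placeholder, and running with min_size 1 trades availability for a higher risk of data loss on a further failure:
    ceph osd pool get <pool> size
    ceph osd pool get <pool> min_size
    ceph osd pool set <pool> min_size 1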

Re: [ceph-users] Ceph experiences

2015-07-18 Thread Eric Eastman
Congratulations on getting your cluster up and running. Many of us have seen the distribution issue on smaller clusters. More PGs and more OSDs help. A 100 OSD configuration balances better than a 12 OSD system. Ceph tries to protect your data, so a single full OSD shuts off writes. Ceph CRUSH

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-14 Thread Eric Eastman
On Mon, Jul 13, 2015 at 9:40 AM, Eric Eastman wrote: > Thanks John. I will back the test down to the simple case of 1 client > without the kernel driver and only running NFS Ganesha, and work forward > till I trip the problem and report my findings. > > Eric > > On Mon, Jul 13,

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-13 Thread Eric Eastman
Thanks John. I will back the test down to the simple case of 1 client without the kernel driver and only running NFS Ganesha, and work forward till I trip the problem and report my findings. Eric On Mon, Jul 13, 2015 at 2:18 AM, John Spray wrote: > > > On 13/07/2015 04:02, Eric East

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-12 Thread Eric Eastman
In the last email, I stated the clients were not mounted using the ceph file system kernel driver. Re-checking the client systems, the file systems are mounted, but all the IO is going through Ganesha NFS using the ceph file system library interface. On Sun, Jul 12, 2015 at 9:02 PM, Eric Eastman

Re: [ceph-users] mds0: Client failing to respond to cache pressure

2015-07-12 Thread Eric Eastman
Hi John, I am seeing this problem with Ceph v9.0.1 with the v4.1 kernel on all nodes. This system is using 4 Ceph FS client systems. They all have the kernel driver version of CephFS loaded, but none are mounting the file system. All 4 clients are using the libcephfs VFS interface to Ganesha NFS

Re: [ceph-users] Cannot delete ceph file system snapshots

2015-07-08 Thread Eric Eastman
Thank you! That was the solution. Eric On Wed, Jul 8, 2015 at 12:02 PM, Jan Schermer wrote: > Have you tried "rmdir" instead of "rm -rf"? > > Jan > > > On 08 Jul 2015, at 19:17, Eric Eastman > wrote: > > > > Hi, > > > > I hav

[ceph-users] Cannot delete ceph file system snapshots

2015-07-08 Thread Eric Eastman
Hi, I have created a ceph file system on a cluster running ceph v9.0.1 and have enable snapshots with the command: ceph mds set allow_new_snaps true --yes-i-really-mean-it On the top level of the ceph file system, I can cd into the hidden .snap directory and I can create new directories with the
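For reference (and matching the resolution in the reply above), snapshots are plain directories under the virtual .snap directory and are removed with rmdir; a sketch assuming a mount at /cephfs:
    mkdir /cephfs/.snap/snap1    # create a snapshot
    rmdir /cephfs/.snap/snap1    # delete it; rm -rf does not work on .snap entries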

Re: [ceph-users] umount stuck on NFS gateways switch over by using Pacemaker

2015-05-28 Thread Eric Eastman
On Thu, May 28, 2015 at 1:33 AM, wrote: > > Hello, > > I am testing NFS over RBD recently. I am trying to build the NFS HA > environment under Ubuntu 14.04 for testing, and the packages version > information as follows: > - Ubuntu 14.04 : 3.13.0-32-generic(Ubuntu 14.04.2 LTS) > - ceph : 0.80.9

Re: [ceph-users] Understanding High Availability - iSCSI/CIFS/NFS

2015-04-04 Thread Eric Eastman
You may want to look at the Clustered SCSI Target Using RBD Status Blueprint, Etherpad and video at: https://wiki.ceph.com/Planning/Blueprints/Hammer/Clustered_SCSI_target_using_RBD http://pad.ceph.com/p/I-scsi https://www.youtube.com/watch?v=quLqLnWF6A8&index=7&list=PLrBUGiINAakNGDE42uLyU2S1s_9HV

Re: [ceph-users] Problem mapping RBD images with v0.92

2015-02-07 Thread Eric Eastman
shared* > > > > Without the "--image-shared" option, rbd CLI creates the image with > RBD_FEATURE_EXCLUSIVE_LOCK, which is not supported by the linux kernel RBD. > > > > Thanks, > > Raju > > > > > > *From:* ceph-users [mailto:ceph-users-boun..

[ceph-users] Problem mapping RBD images with v0.92

2015-02-07 Thread Eric Eastman
Has anything changed in v0.92 that would keep a 3.18 Kernel from mapping a RBD image? I have been using a test script to create RBD images and map them since FireFly and the script has worked fine through Ceph v0.91. It is not working with v0.92, so I minimized it to the following 3 commands whic

Re: [ceph-users] chattr +i not working with cephfs

2015-01-28 Thread Eric Eastman
On Wed, Jan 28, 2015 at 11:43 AM, Gregory Farnum wrote: > > On Wed, Jan 28, 2015 at 10:06 AM, Sage Weil wrote: > > On Wed, 28 Jan 2015, John Spray wrote: > >> On Wed, Jan 28, 2015 at 5:23 PM, Gregory Farnum wrote: > >> > My concern is whether we as the FS are responsible for doing anything > >>

Re: [ceph-users] chattr +i not working with cephfs

2015-01-28 Thread Eric Eastman
e if Sage/Greg can offer any more background. > > John > > On Wed, Jan 28, 2015 at 1:24 AM, Eric Eastman > wrote: >> Should chattr +i work with cephfs? >> >> Using ceph v0.91 and a 3.18 kernel on the CephFS client, I tried this: >> >> # mount | grep ceph

[ceph-users] chattr +i not working with cephfs

2015-01-27 Thread Eric Eastman
Should chattr +i work with cephfs? Using ceph v0.91 and a 3.18 kernel on the CephFS client, I tried this: # mount | grep ceph 172.16.30.10:/ on /cephfs/test01 type ceph (name=cephfs,key=client.cephfs) # echo 1 > /cephfs/test01/test.1 # ls -l /cephfs/test01/test.1 -rw-r--r-- 1 root root 2 Jan 27

Re: [ceph-users] Cache tiers flushing logic

2014-12-30 Thread Eric Eastman
On Tue, Dec 30, 2014 at 12:38 PM, Erik Logtenberg wrote: >> >> Hi Erik, >> >> I have tiering working on a couple test clusters. It seems to be >> working with Ceph v0.90 when I set: >> >> ceph osd pool set POOL hit_set_type bloom >> ceph osd pool set POOL hit_set_count 1 >> ceph osd pool set PO
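For reference, the remaining knobs that drive when the cache tier flushes and evicts look roughly like this; the values are examples only and the pool name is a placeholder:
    ceph osd pool set POOL hit_set_period 3600
    ceph osd pool set POOL target_max_bytes 100000000000
    ceph osd pool set POOL cache_target_dirty_ratio 0.4
    ceph osd pool set POOL cache_target_full_ratio 0.8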

Re: [ceph-users] Cache tiers flushing logic

2014-12-30 Thread Eric Eastman
On Tue, Dec 30, 2014 at 7:56 AM, Erik Logtenberg wrote: > > Hi, > > I use a cache tier on SSD's in front of the data pool on HDD's. > > I don't understand the logic behind the flushing of the cache however. > If I start writing data to the pool, it all ends up in the cache pool at > first. So far

Re: [ceph-users] Create OSD on ZFS Mount (firefly)

2014-11-26 Thread Eric Eastman
> - Have created ZFS mount: > “/var/lib/ceph/osd/ceph-0” > - followed the instructions at: > http://ceph.com/docs/firefly/rados/operations/add-or-rm-osds/ > failing on the step 4. Initialize the OSD data directory. > ceph-osd -i 0 --mkfs --mkkey > 2014-11-25 22:12:26.563666 7ff12b466780 -1 > fil

Re: [ceph-users] the state of cephfs in giant

2014-10-13 Thread Eric Eastman
I would be interested in testing the Samba VFS and Ganesha NFS integration with CephFS. Are there any notes on how to configure these two interfaces with CephFS? Eric We've been doing a lot of work on CephFS over the past few months. This is an update on the current state of things as of G

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread Eric Eastman
Do not go to a 3.15 or later Ubuntu kernel at this time if you are using krbd. See bug 8818. The Ubuntu 3.14.x kernels seem to work fine with krbd on Trusty. The mainline packages from Ubuntu should be helpful in testing. Info: https://wiki.ubuntu.com/Kernel/MainlineBuilds Package

Re: [ceph-users] HEALTH_WARN pool has too few pgs

2014-06-12 Thread Eric Eastman
Hi JC, The cluster already has 1024 PGs on only 15 OSD, which is above the formula of (100 x #OSDs)/size. How large should I make it? # ceph osd dump | grep Ray pool 17 'Ray' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 7785 owner 0 f
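For reference, this particular warning is driven by object-count skew rather than by the (100 x #OSDs)/size formula; the two usual ways to address it are raising the skew threshold on the monitors (the mon pg warn max object skew option) or adding PGs to the busy pool, for example:
    ceph osd pool set Ray pg_num 2048
    ceph osd pool set Ray pgp_num 2048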

[ceph-users] HEALTH_WARN pool has too few pgs

2014-06-11 Thread Eric Eastman
Hi, I am seeing the following warning on one of my test clusters: # ceph health detail HEALTH_WARN pool Ray has too few pgs pool Ray objects per pg (24) is more than 12 times cluster average (2) This is a reported issue and is set to "Won't Fix" at: http://tracker.ceph.com/issues/8103 My test

Re: [ceph-users] Installing ceph without access to the internet

2014-04-26 Thread Eric Eastman
I have done it. It can be built on an isolated system if you create your own OS and Ceph repositories. You can also build Ceph from a tar ball on an isolated system. Eric hi, I'm on a site with no access to the internet and I'm trying to install ceph. During the installation it tries to downlo

Re: [ceph-users] Mounting with dmcrypt still fails

2014-03-19 Thread Eric Eastman
On Ubuntu 1310 with Ceph 0.72, after manually putting in the patch from http://tracker.ceph.com/issues/6966 I was able to create my dmcrypt OSD with: ceph-deploy disk zap tca14:/dev/cciss/c0d1 ceph-deploy --verbose osd create --dmcrypt tca14:/dev/cciss/c0d1 Looking at the mount points with df

Re: [ceph-users] Recommended node size for Ceph

2014-03-10 Thread Eric Eastman
Why the limit of 6 OSDs per SSD? Where does Ceph tail off in performance when having too many OSDs in servers? When your Journal isn't able to keep up. If you use SSDs for journaling, use 6 OSDs per SSD at max. I am doing testing with a PCI-e based SSD, and showing that even with 15 OSD di

Re: [ceph-users] clock skew

2014-01-30 Thread Eric Eastman
I have this problem on some of my Ceph clusters, and I think it is because the older hardware I am using does not have the best clocks. To fix the problem, I set up one server in my lab to be my local NTP time server, and then on each of my Ceph monitors, in the /etc/ntp.conf file, I put in
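For reference, a minimal sketch of the setup described, assuming the local time server sits at 192.168.1.10:
    # /etc/ntp.conf on each monitor node:
    server 192.168.1.10 iburst
    # then verify the offset:
    ntpq -p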

Re: [ceph-users] Ceph incomplete pg

2013-12-16 Thread Eric Eastman
I am currently trying to figure out how to debug pgs issues myself and the debugging documentation I have found has not been that helpful. In my case the underlying problem is probably ZFS which I am using for my OSDs, but it would be nice to be able to recover what I can. My health output is

[ceph-users] Issues using ZFS in 0.72.1

2013-12-16 Thread Eric Eastman
Hi, I have started stress testing the ZFS OSD backend on ceph 0.72.1 that I built with ZFS support. Below is one of the issues I have been looking at this morning. My main question is should I just open "new issues" at http://tracker.ceph.com/ as I find these problems in my testing? 2013-1

[ceph-users] Questions/comments on using ZFS for OSDs

2013-11-12 Thread Eric Eastman
I built Ceph version 0.72 with --with-libzfs on Ubuntu 1304 after installing ZFS from the ppa:zfs-native/stable repository. The ZFS version is v0.6.2-1. I do have a few questions and comments on Ceph using ZFS-backed OSDs. As ceph-deploy does not show support for ZFS, I used the instructions at:

Re: [ceph-users] Balance data on near full osd warning or error

2013-10-22 Thread Eric Eastman
Hello, What I have used to rebalance my cluster is: ceph osd reweight-by-utilization we're using a small Ceph cluster with 8 nodes, each 4 osds. People are using it through instances and volumes in a Openstack platform. We're facing a HEALTH_ERR with full or near full osds : cluster 5942e1

Re: [ceph-users] Multiple kernel RBD clients failures

2013-10-01 Thread Eric Eastman
Hi Travis, Both you and Yan saw the same thing, in that the drives in my test system go from 300GB to 4TB. I used ceph-deploy to create all the OSDs, which I assume picked the weights of 0.26 for my 300GB drives, and 3.63 for my 4TB drives. All the OSDs that are reporting nearly full are the

Re: [ceph-users] Multiple kernel RBD clients failures

2013-09-30 Thread Eric Eastman
Thank you for the reply -28 == -ENOSPC (No space left on device). I think it is due to the fact that some osds are near full. Yan, Zheng I thought that may be the case, but I would expect that ceph health would tell me I had full OSDs, but it is only saying they are near full: # c

[ceph-users] Multiple kernel RBD clients failures

2013-09-30 Thread Eric Eastman
I have 5 RBD kernel based clients, all using kernel 3.11.1, running Ubuntu 1304, that all failed with a write error at the same time and I need help to figure out what caused the failure. The 5 clients were all using the same pool, and each had its own image, with an 18TB XFS file system on e

Re: [ceph-users] ceph-deploy mon create / gatherkeys problems

2013-08-18 Thread Eric Eastman
It may be overkill, but my notes on starting over with loading ceph on my test clusters are: From ceph00 (my admin node) ceph-deploy -v purge ceph00 ceph01 ceph02 ceph03 ceph04 ... Then on each system rm -rf /var/lib/ceph /etc/ceph Then reinstall the ceph software and configure it with cep

Re: [ceph-users] ceph-deploy progress and CDS session

2013-08-02 Thread Eric Eastman
Hi, First I would like to state that with all its limitations, I have managed to build multiple clusters with ceph-deploy and without it, I would have been totally lost. Things that I feel would improve it include: A debug mode where it lists everything it is doing. This will be helpful in

Re: [ceph-users] Defective ceph startup script

2013-07-31 Thread Eric Eastman
I just checked my cluster with 3 monitor nodes and 2 osd nodes, and none of them had sockets in /var/run/ceph. I verified # ceph health HEALTH_OK So I rebooted one of my monitor nodes, and when it came back up, the socket was there. # ls -l /var/run/ceph/ total 0 srwxr-xr-x 1 root root 0 Ju

Re: [ceph-users] Defective ceph startup script

2013-07-31 Thread Eric Eastman
Hi Greg, I saw about the same thing on Ubuntu 13.04 as you did. I used apt-get -y update apt-get -y upgrade On all my cluster nodes to upgrade from 0.61.5 to 0.61.7 and then noticed that some of my systems did not restart all the daemons. I tried: stop ceph-all start ceph-all On those node

Re: [ceph-users] ceph-deploy

2013-07-24 Thread Eric Eastman
Hi Sage, I tested the HP cciss devices as OSD disks on the --dev=wip-cuttlefish-ceph-disk build tonight and it worked, but not exactly as expected. I first tried: # ceph-deploy -v osd create testsrv16:c0d1 which failed with: ceph-disk: Error: data path does not exist: /dev/c0d1 so I went t

Re: [ceph-users] ceph-deploy

2013-07-24 Thread Eric Eastman
Later today I will try both the HP testing using multiple cciss devices for my OSDs and separately testing manually specifying the dm devices on my external FC and iSCSI storage and will let you know how both tests turn out. Thanks again, Eric Tomorrow I will bring up a HP system that has mul

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
210018 -> ../../dm-3 lrwxrwxrwx 1 root root 10 Jul 23 23:54 /dev/disk/by-id/dm-name-1IET\x20\x20\x20\x20\x210019 -> ../../dm-0 Thank you for all the help. Eric On Wed, 24 Jul 2013, Eric Eastman wrote: Still looks the same. I even tried doing a purge to make sure tried a purge

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
Still looks the same. I even tried doing a purge to make sure everything was clean root@ceph00:~# ceph-deploy -v purge ceph11 Purging from cluster ceph hosts ceph11 Detecting platform for host ceph11 ... Distro Ubuntu codename raring Purging host ceph11 ... root@ceph0

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
0 Jul 23 20:25 c0d0 brw-rw---- 1 root disk 104, 1 Jul 23 20:25 c0d0p1 brw-rw---- 1 root disk 104, 2 Jul 23 20:25 c0d0p2 brw-rw---- 1 root disk 104, 5 Jul 23 20:25 c0d0p5 On Tue, 23 Jul 2013, Eric Eastman wrote: I tried running: ceph-deploy install --dev=wip-cuttlefish-ceph-disk HOST To a c

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
nection.py", line 323, in send_request return self.__handle(m) File "/usr/lib/python2.7/dist-packages/pushy/protocol/baseconnection.py", line 639, in __handle raise e pushy.protocol.proxy.ExceptionProxy: Command '['apt-get', '-q', 'update'

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
I will test it out and let you know. Eric On Tue, 23 Jul 2013, Eric Eastman wrote: I am seeing issues with ceph-deploy and ceph-disk, which it calls, if the storage devices are not generic sdx devices. On my older HP systems, ceph-deploy fails on the cciss devices, and I tried to use it

Re: [ceph-users] ceph-deploy

2013-07-23 Thread Eric Eastman
I am seeing issues with ceph-deploy and ceph-disk, which it calls, if the storage devices are not generic sdx devices. On my older HP systems, ceph-deploy fails on the cciss devices, and I tried to use it with multipath dm devices, and that did not work at all. Logging is not verbose enough t