Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread James Harper
> Hi! > > I'm trying to install ceph on Debian wheezy (from deb > http://ceph.com/debian/ wheezy main) and getting following error: > > # apt-get update && apt-get dist-upgrade -y && apt-get install -y ceph > > ... > > The following packages have unmet dependencies: > ceph : Depends: ceph-comm

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread James Harper
Can you offer some comments on what the impact is likely to be to the data in an affected cluster? Should all data now be treated with suspicion and restored back to before the firefly upgrade? James > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On B

Re: [ceph-users] Using large SSD cache tier instead of SSD journals?

2014-07-08 Thread James Harper
> Hi James, > > Yes, I've checked bcache, but as far as I can tell you need to manually > configure and register the backing devices and attach them to the cache > device, which is not really suitable for a dynamic environment (like RBD devices > for cloud VMs). > You would use bcache for the osd n

Re: [ceph-users] Using large SSD cache tier instead of SSD journals?

2014-07-08 Thread James Harper
> I'm thinking of whether it makes sense to use the available SSDs in the > cluster nodes (1 SSD for 4 HDDs) as part of a writeback cache pool in front of > the IO intensive pool, instead of using them as journal SSDs? With this > method, the OSD journals would be co-located on the HDDs or the SSD:
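For reference, a rough sketch of how a writeback cache tier is put in front of a slower pool on a firefly-era cluster; "rbd" and "ssd-cache" are placeholder pool names, and the SSD-backed pool is assumed to exist already:
  ceph osd tier add rbd ssd-cache
  ceph osd tier cache-mode ssd-cache writeback
  ceph osd tier set-overlay rbd ssd-cache
  ceph osd pool set ssd-cache hit_set_type bloom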

Re: [ceph-users] inconsistent pgs

2014-07-07 Thread James Harper
> > You can look at which OSDs the PGs map to. If the PGs have > insufficient replica counts they'll report as degraded in "ceph -s" or > "ceph -w". I meant in a general sense. If I have a pg that I suspect might be insufficiently redundant I can look that up, but I'd like to know in advance an
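Looking up where a specific PG or object lands can be done per-item or in bulk; a rough example, with the PG id and object name as placeholders:
  ceph pg map 2.1f                 # shows the up and acting OSD sets for that PG
  ceph osd map rbd some-object     # shows where a named object would be placed
  ceph pg dump | grep degraded     # bulk view of PGs with too few replicas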

Re: [ceph-users] inconsistent pgs

2014-07-07 Thread James Harper
> > It sounds like maybe you've got a bad CRUSH map if you're seeing that. > One of the things the tunables do is make the algorithm handle a > variety of maps better, but if PGs are only mapping to one OSD you > need to fix that. > How can I tell that this is definitely the case (all copies of

Re: [ceph-users] inconsistent pgs

2014-07-07 Thread James Harper
> > Okay. Based on your description I think the reason for the tunables > crashes is that either the "out" OSDs, or possibly one of the > monitors, never got restarted. You should be able to update the > tunables now, if you want to. (Or there's also a config option that > will disable the warning
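For reference, the tunables update and the warning-suppression option being referred to look roughly like this (option name as of firefly):
  ceph osd crush tunables bobtail        # or "optimal" if all clients are new enough
  # to just silence the HEALTH_WARN instead, in ceph.conf on the monitors:
  [mon]
      mon warn on legacy crush tunables = false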

Re: [ceph-users] inconsistent pgs

2014-07-07 Thread James Harper
I'm stuck with the Debian repo versions anyway... thanks James > — were you rebalancing when you > did the upgrade? Did the marked out OSDs get upgraded? > Did you restart all the monitors prior to changing the tunables? (Are > you *sure*?) > -Greg > Software Engineer #42 @ ht

Re: [ceph-users] Release notes for firefly not very clear wrt the tunables

2014-07-07 Thread James Harper
> > Hi Sage, > > > Thanks for pointing this out. Is this clearer? > > Yes. Although it would probably be useful to say that using 'ceph osd > crush tunables bobtail' will be enough to get rid of the warning and > will not break compatibility too much (3.15 isn't that common, there > is not even

Re: [ceph-users] inconsistent pgs

2014-07-05 Thread James Harper
> > I have 4 physical boxes each running 2 OSD's. I needed to retire one so I set > the 2 OSD's on it to 'out' and everything went as expected. Then I noticed > that 'ceph health' was reporting that my crush map had legacy tunables. The > release notes told me I needed to do 'ceph osd crush tunabl

[ceph-users] inconsistent pgs

2014-07-05 Thread James Harper
I was having some problems with my mds getting stuck in 'rejoin' state on a dumpling install so I did a dist-upgrade on my installation, thinking it would install a later dumpling, and got landed with firefly, which is now in Debian Jessie. That resolved the mds problem but introduced a problem o

Re: [ceph-users] windows client

2014-03-13 Thread James Harper
> > Is it possible that ceph support windows client? Now I can only use RESTful > API(Swift-compatible) through ceph object gateway, > > but the languages that can be used are java, python and ruby, not C# or C++. > Is there any good wrapper for C# or C++,thanks. > I have a kind-of-working port

Re: [ceph-users] Running a mon on a USB stick

2014-03-09 Thread James Harper
> > I agree, I have tried it before and even with tmpfs and removing the logs, usb > sticks will last only a few months (3-4 at most) > Not all USB sticks are the same. Some will last much much longer than others before they become write-only. Is anyone making USB memory sticks built on the sa

[ceph-users] public network

2014-01-05 Thread James Harper
Is there any requirement that the monitors have to be on the same subnet as each other, and/or the osd public network? It's going to simplify things greatly if I can move them progressively. Thanks James

[ceph-users] snapshot atomicity

2014-01-02 Thread James Harper
I've not used ceph snapshots before. The documentation says that the rbd device should not be in use before creating a snapshot. Does this mean that creating a snapshot is not an atomic operation? I'm happy with a crash consistent filesystem if that's all the warning is about. If it is atomic,
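A rough sketch of the two cases: a snapshot taken on a live image (crash-consistent, like pulling the power) versus quiescing the filesystem inside the guest first; image and mountpoint names are placeholders:
  rbd snap create rbd/vm-disk@snap1        # crash-consistent snapshot
  # for a clean snapshot, freeze the filesystem in the guest first:
  fsfreeze -f /mnt/data
  rbd snap create rbd/vm-disk@snap2
  fsfreeze -u /mnt/data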

Re: [ceph-users] vm fs corrupt after pgs stuck

2014-01-02 Thread James Harper
> > I just had to restore an ms exchange database after a ceph hiccup (no actual > data lost - Exchange is very good like that with its no loss restore!). The > order > of events went something like: > > . Loss of connection on osd to the cluster network (public network was okay) > . pgs report

[ceph-users] vm fs corrupt after pgs stuck

2014-01-02 Thread James Harper
I just had to restore an ms exchange database after a ceph hiccup (no actual data lost - Exchange is very good like that with its no loss restore!). The order of events went something like: . Loss of connection on osd to the cluster network (public network was okay) . pgs reported stuck . stopp

Re: [ceph-users] shutting down for maintenance

2013-12-31 Thread James Harper
> > Most production clusters are large enough that you don't have to bring down > the entire cluster to do maintenance on particular machines. If you're > reconfiguring the entire network, that's a bit more involved. I'm not sure > what your cluster looks like, so I can't advise. However, you menti

[ceph-users] shutting down for maintenance

2013-12-31 Thread James Harper
I need to shut down ceph for maintenance to make some hardware changes. Is it sufficient to just stop all services on all nodes, or is there a way to put the whole cluster into standby or something first? And when things come back up, IP addresses on the cluster network will be different (publi
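A common approach (a sketch, not the only way) is to stop CRUSH from rebalancing while the daemons are down, then release it afterwards:
  ceph osd set noout        # prevent down OSDs from being marked out
  service ceph -a stop      # or stop the mon/osd/mds daemons per node
  # ... do the hardware / network changes ...
  service ceph -a start
  ceph osd unset noout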

[ceph-users] valid characters for pool and rbd name

2013-12-14 Thread James Harper
What characters are valid for a pool and an rbd name? Thanks James

Re: [ceph-users] Mounting Ceph on Linux/Windows

2013-12-06 Thread James Harper
Out of curiosity I tried the 'ceph' command from windows too. I had to rename librados.dll to librados.so.2, install a readline replacement (https://pypi.python.org/pypi/pyreadline/2.0), and even then it completely ignored anything I put on the command line, but from the ceph shell I could do t

Re: [ceph-users] Mounting Ceph on Linux/Windows

2013-12-05 Thread James Harper
> > On Thu, 5 Dec 2013, James Harper wrote: > > > > > > Can someone point me to directions on how to mount a Ceph storage > > > volume on Linux as well as Windows? > > > > > > > Do you mean cephfs filesystem, or rbd block device? > >

Re: [ceph-users] Mounting Ceph on Linux/Windows

2013-12-05 Thread James Harper
> > Can someone point me to directions on how to mount a Ceph storage > volume on Linux as well as Windows? > Do you mean cephfs filesystem, or rbd block device? I have ported librbd to windows in a very "alpha" sense - it compiles and I can do things like 'rbd ls' and 'rbd import', but haven'

Re: [ceph-users] btrfs constant background write activity even at idle

2013-12-04 Thread James Harper
> > Can you generate an OSD log with 'debug filestore = 20' for an idle period? > Any more tests you would like me to run? I'm going to recreate that osd as xfs soon. James

Re: [ceph-users] optimal setup with 4 x ethernet ports

2013-12-04 Thread James Harper
> > Ceph.conf cluster network (supernet): 10.0.0.0/8 > > > > Cluster network #1: 10.1.1.0/24 > > Cluster network #2: 10.1.2.0/24 > > > > With that configuration OSD address autodection *should* just work. > > It should work but thinking more about it the OSDs will likely be > assigned IPs on a si
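For reference, the kind of ceph.conf layout being discussed (addresses are placeholders); each OSD picks its cluster address by matching a local interface against these ranges:
  [global]
      public network  = 192.168.200.0/24
      cluster network = 10.0.0.0/8     # supernet covering 10.1.1.0/24 and 10.1.2.0/24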

Re: [ceph-users] btrfs constant background write activity even at idle

2013-12-04 Thread James Harper
> > Can you try > > ceph tell osd.71 injectargs '--filestore-max-sync-interval 20' > > (or some other btrfs osd) and see if it changes? > It's now 9:15am here so the network is getting less idle and any measurements I'm taking will be more noisy, but anyway... iostat -x 10 now alternates be

Re: [ceph-users] btrfs constant background write activity even at idle

2013-12-04 Thread James Harper
(reposted to the list without the attachment as the list blocks it. If anyone else wants it I can send it direct) > Hi James, > > Can you generate an OSD log with 'debug filestore = 20' for an idle period? > The best I can do right now is 'relatively idle'. Ceph -w says that the cluster is av

[ceph-users] btrfs constant background write activity even at idle

2013-12-03 Thread James Harper
I have been testing osd on btrfs, and the first thing I notice is that there is constant write activity when idle. The write activity hovers between 5Mbytes/second and 30Mbytes/second, and averages around 9Mbytes/second (as determined by iostat -x 30). On average, iostat is showing around 90 w/

Re: [ceph-users] Can't stop osd (state D)

2013-12-03 Thread James Harper
> Dear ceph users, > > I have a ceph cluster running 0.67.4. Two osd are down in "ceph -s". > They are stil there in ps and I can't stop them (service ceph stop osd.x > or kill or even kill -9)! > > Any idea? > Do you know why the OSDs went down? If their state is D in 'ps -ax' then check dmes

Re: [ceph-users] odd performance graph

2013-12-02 Thread James Harper
> > I also noticed a graph like this once i benchmarked w2k8 guest on ceph with > rbd. > To me it looked like when the space on the drive is used, the throughput is > lower, when the space read by rbd on the drive is unused, the reads are > superfast. > > I don't know how rbd works inside, but i

[ceph-users] optimal setup with 4 x ethernet ports

2013-12-01 Thread James Harper
My OSD servers have 4 network ports currently, configured as: eth0 - lan/osd public eth1 - unused eth2 - osd cluster network #1 eth3 - osd cluster network #2 each server has two OSD's, one is configured on osd cluster network #1, the other on osd cluster network #2. This avoids any messing around

Re: [ceph-users] odd performance graph

2013-12-01 Thread James Harper
> Hi, > > > > The low points are all ~35Mbytes/sec and the high points are all > > > ~60Mbytes/sec. This is very reproducible. > > > > It occurred to me that just stopping the OSD's selectively would allow me to > > see if there was a change when one > > was ejected, but at no time was there a cha

Re: [ceph-users] ceph-deploy Platform is not supported: debian

2013-12-01 Thread James Harper
> > On Sun, Dec 1, 2013 at 6:47 PM, James Harper > wrote: > >> > >> ceph-deploy uses Python to detect information for a given platform, > >> can you share what this command gives > >> as output? > >> > >> python -c "import plat

Re: [ceph-users] ceph-deploy Platform is not supported: debian

2013-12-01 Thread James Harper
> > ceph-deploy uses Python to detect information for a given platform, > can you share what this command gives > as output? > > python -c "import platform; print platform.linux_distribution()" > Servers that 'gatherkeys' does work on: ('debian', '7.1', '') ('debian', '7.2', '') Servers that '

Re: [ceph-users] ceph-deploy Platform is not supported: debian

2013-12-01 Thread James Harper
> > On Fri, Nov 29, 2013 at 3:19 AM, James Harper > wrote: > > When I do gatherkeys, ceph-deploy tells me: > > > > UnsupportedPlatform: Platform is not supported: debian > > > > Given that I downloaded ceph-deploy from the ceph.com debian > repositor

Re: [ceph-users] odd performance graph

2013-11-30 Thread James Harper
> > I ran HDTach on one of my VM's and got a graph that looks like this: > > ___-- > > The low points are all ~35Mbytes/sec and the high points are all > ~60Mbytes/sec. This is very reproducible. > > HDTach does sample reads across the whole disk, so would I be right in > thinking that

[ceph-users] odd performance graph

2013-11-30 Thread James Harper
I ran HDTach on one of my VM's and got a graph that looks like this: ___-- The low points are all ~35Mbytes/sec and the high points are all ~60Mbytes/sec. This is very reproducible. HDTach does sample reads across the whole disk, so would I be right in thinking that the variation is du

[ceph-users] ceph-deploy and config file

2013-11-29 Thread James Harper
Aside from the messages about ceph-deploy saying debian is not supported on two of my nodes, I'm having some other problems moving to ceph-deploy. I'm running with 2 OSD's on each node, and I'm using a numbering sequence of osd.<node><disk>, so node 7 has osd.70 and osd.71. This way it's immediately obviou

Re: [ceph-users] ceph-deploy Platform is not supported: debian

2013-11-29 Thread James Harper
> > When I do gatherkeys, ceph-deploy tells me: > > UnsupportedPlatform: Platform is not supported: debian > > Given that I downloaded ceph-deploy from the ceph.com debian repository, > I'm hoping that Debian is supported and that I have something screwy > somewhere. > > Any suggestions? > I

[ceph-users] ceph-deploy Platform is not supported: debian

2013-11-29 Thread James Harper
When I do gatherkeys, ceph-deploy tells me: UnsupportedPlatform: Platform is not supported: debian Given that I downloaded ceph-deploy from the ceph.com debian repository, I'm hoping that Debian is supported and that I have something screwy somewhere. Any suggestions? Thanks James

[ceph-users] change isize on existing xfs

2013-11-28 Thread James Harper
I just noticed that one of my OSD's has its xfs filesystem created with isize=256, instead of the 2048 it should have been created with. Is this going to be hurting performance enough to warrant burning the OSD and recreating it? And is there a way to change it on the fly (I expect not, but may
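The inode size is fixed when the filesystem is created, so it can't be changed in place; checking and recreating look roughly like this (device and OSD paths are placeholders):
  xfs_info /var/lib/ceph/osd/ceph-70 | grep isize    # shows isize=256 vs isize=2048
  # recreating the OSD's filesystem with the larger inodes:
  mkfs.xfs -f -i size=2048 /dev/sdb1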

Re: [ceph-users] [Big Problem?] Why not using Device'UUID in ceph.conf

2013-11-26 Thread James Harper
> Hi all > > I have 3 OSDs, named sdb, sdc, sdd. > Suppose one OSD's device, /dev/sdc, dies => my server has only sdb and sdc > at the moment, > because the old /dev/sdd now shows up as /dev/sdc. Can you just use one of the /dev/disk/by-*/ symlinks? Eg /dev/disk/by-uuid/153cf32b-e46b-4d31-95ef-749db3a88
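A sketch of the idea: mount the OSD data partitions by filesystem UUID so a shifted /dev/sdX name doesn't matter (UUID and mountpoint are placeholders):
  ls -l /dev/disk/by-uuid/                    # find the stable UUID for each partition
  # /etc/fstab entry instead of /dev/sdc1:
  UUID=<filesystem-uuid>  /var/lib/ceph/osd/ceph-2  xfs  noatime  0  0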

Re: [ceph-users] HEALTH_WARN # requests are blocked > 32 sec

2013-11-25 Thread James Harper
> > Writes seem to be happening during the block but this is now getting more > frequent and seems to be for longer periods. > Looking at the osd logs for 3 and 8 there's nothing of relevance in there. > > Any ideas on the next step? > Look for iowait and other disk metrics: iostat -x 1 high i
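Two of the usual checks, sketched with placeholder OSD numbers: per-disk latency on the OSD hosts, and the in-flight ops on the suspect OSDs via the admin socket:
  iostat -x 1        # watch %util and await on the OSD data disks
  ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok dump_ops_in_flight
  ceph --admin-daemon /var/run/ceph/ceph-osd.8.asok dump_ops_in_flight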

Re: [ceph-users] installing OS on software RAID

2013-11-25 Thread James Harper
> > We need to install the OS on the 3TB harddisks that come with our Dell > servers. (After many attempts, I've discovered that Dell servers won't allow > attaching an external harddisk via the PCIe slot. (I've tried everything). ) > > But, must I therefore sacrifice two hard disks (RAID-1) for

Re: [ceph-users] Multicast

2013-11-02 Thread James Harper
> > Hi All > > I was wondering whether multicast could be used for the replication > traffic? It just seemed that the outbound network bandwidth from the > source could be halved. > Right now I think ceph traffic is all TCP, which doesn't do multicast. You'd either need to make ceph use UDP a

[ceph-users] migrating to ceph-deploy

2013-11-01 Thread James Harper
I have a cluster already set up, and I'd like to start using ceph-deploy to add my next OSD. The cluster currently doesn't have any authentication or anything. Should I start using ceph-deploy now, or just add the OSD manually? If the former, is there anything I need to do to make sure ceph-depl

Re: [ceph-users] Ceph + Xen - RBD io hang

2013-10-28 Thread James Harper
Maybe nothing to do with your issue, but I was having problems using librbd with blktap, and ended up adding: [client] ms rwthread stack bytes = 8388608 to my config. This is a workaround, not a fix though (IMHO) as there is nothing to indicate that librbd is running out of stack space, rathe

Re: [ceph-users] RBD & Windows Failover Clustering

2013-10-23 Thread James Harper
> Hello, > So, if your cluster nodes are running virtualized with Qemu/KVM, you can > present them a virtual SCSI drive, from the same RBD image. > It will be like a shared FC SCSI SAN LUN. > You would want to be absolutely sure that neither qemu or rbd was doing any sort of caching though for t

[ceph-users] import VHD

2013-10-21 Thread James Harper
Can anyone suggest a straightforward way to import a VHD to a ceph RBD? The easier the better! Thanks James
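One way that should work (a sketch; pool and image names are placeholders) is to convert the VHD to raw with qemu-img and then import it:
  qemu-img convert -f vpc -O raw guest.vhd guest.raw
  rbd import guest.raw rbd/guest
  # or, if qemu-img was built with rbd support, convert straight into the pool:
  qemu-img convert -f vpc -O raw guest.vhd rbd:rbd/guest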

Re: [ceph-users] Ceph and RAID

2013-10-03 Thread James Harper
> If following the "raid mirror" approach, would you then skip redundancy > at the ceph layer to keep your total overhead the same? It seems that > would be risky in the event you lose your storage server with the > raid-1'd drives. No Ceph level redundancy would then be fatal. But if > you do ra

Re: [ceph-users] How to force lost PGs

2013-09-03 Thread James Harper
> > > > Now I'm trying to clear the stale PGs. I've tried removing the OSD from the > > crush maps, the OSD lists etc, without any luck. > > Note that this means that you destroyed all copies of those 3 PGs, which > means this experiment lost data. > > You can make ceph recreate the PGs (empty!)
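For reference, the commands being alluded to (both destructive; the OSD id and PG id are placeholders): marking the dead OSD as lost and recreating the stale PGs empty:
  ceph osd lost 12 --yes-i-really-mean-it    # tell the cluster that OSD's data is gone
  ceph pg force_create_pg 2.1f               # recreate that PG, empty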

Re: [ceph-users] The whole cluster hangs when changing MTU to 9216

2013-08-26 Thread James Harper
> > Centos 6.4 > Ceph Cuttlefish 0.61.7, or 0.61.8. > > I changed the MTU to 9216 (or 9000), then restarted all the cluster nodes. > The whole cluster hung, with messages in the mon log as below: Does tcpdump report any tcp or ip checksum errors? (tcpdump -v -s0 -i

Re: [ceph-users] Multiple CephFS filesystems per cluster

2013-08-21 Thread James Harper
> Hi, > > Is it possible to have more than one CephFS filesystem per Ceph cluster? > > In the default configuration, a ceph cluster has got only one filesystem, and > you can mount that or nothing. Is it possible somehow to have several > distinct > filesystems per cluster, preferably with access

Re: [ceph-users] v0.61.8 Cuttlefish released

2013-08-19 Thread James Harper
> On Mon, 19 Aug 2013, James Harper wrote: > > > > > > We've made another point release for Cuttlefish. This release contains a > > > number of fixes that are generally not individually critical, but do trip > > > up users from time to time, are non-int

Re: [ceph-users] v0.61.8 Cuttlefish released

2013-08-19 Thread James Harper
> > We've made another point release for Cuttlefish. This release contains a > number of fixes that are generally not individually critical, but do trip > up users from time to time, are non-intrusive, and have held up under > testing. > > Notable changes include: > > * librados: fix async aio

Re: [ceph-users] v0.67 Dumpling released

2013-08-14 Thread James Harper
> Hi, > is it ok to upgrade from 0.66 to 0.67 by just running 'apt-get upgrade' > and rebooting the nodes one by one ? Is a full reboot required? James

Re: [ceph-users] qemu-1.4.0 and onwards, linux kernel 3.2.x, ceph-RBD, heavy I/O leads to kernel_hung_tasks_timout_secs message and unresponsive qemu-process, [Qemu-devel] [Bug 1207686]

2013-08-13 Thread James Harper
> > This looks like a different issue than Oliver's. I see one anomaly in the > log, where a rbd io completion is triggered a second time for no apparent > reason. I opened a separate bug > > http://tracker.ceph.com/issues/5955 > > and pushed wip-5955 that will hopefully shine some light

Re: [ceph-users] [list admin] - membership disabled due to bounces

2013-08-11 Thread James Harper
This list actually does get a bit of spam, unlike most lists I'm subscribed to. I'm surprised more reputation filters haven't blocked it. Rejecting spam is the only right way to do it (junk mail folders are dumb), but obviously the ceph-users list is taking the bounces as indicating a problem wi

[ceph-users] fuse or kernel fs?

2013-08-06 Thread James Harper
Are the fuse and kernel filesystem drivers about the same or is one definitely better than the other? Thanks James
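Both clients mount the same filesystem; the kernel client needs a recent enough kernel while ceph-fuse runs in userspace. Roughly (monitor address, secret file and mountpoint are placeholders):
  # kernel client
  mount -t ceph 192.168.200.190:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
  # fuse client
  ceph-fuse -m 192.168.200.190:6789 /mnt/cephfs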

Re: [ceph-users] Large storage nodes - best practices

2013-08-05 Thread James Harper
> > In the previous email, you are forgetting Raid1 has a write penalty of 2 > since it > is mirroring and now we are talking about different types of raid and nothing > really to do about Ceph. One of the main advantages of Ceph is to have data > replicated so you don't have to do Raid to that d

Re: [ceph-users] Large storage nodes - best practices

2013-08-05 Thread James Harper
> I am looking at evaluating ceph for use with large storage nodes (24-36 SATA > disks per node, 3 or 4TB per disk, HBAs, 10G ethernet). > > What would be the best practice for deploying this? I can see two main > options. > > (1) Run 24-36 osds per node. Configure ceph to replicate data to one o

Re: [ceph-users] Block device storage

2013-08-05 Thread James Harper
> > Hi there > > I have a few questions regarding the block device storage and the > ceph-filesystem. > > We want to cluster a database (Progress) on a clustered filesystem , but > the database requires the > operating system to see the clustered storage area as a block device , > and not a netw
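If the database just needs a block device rather than a shared filesystem, an RBD image mapped on the host looks roughly like this (pool, image name and size are placeholders):
  rbd create dbpool/progress-data --size 102400    # size in MB
  rbd map dbpool/progress-data                     # appears as /dev/rbd0 (or similar)
  mkfs.xfs /dev/rbd0
  mount /dev/rbd0 /srv/db
Note that without a cluster filesystem on top, the image should only be mounted on one node at a time.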

Re: [ceph-users] kernel BUG at net/ceph/osd_client.c:2103

2013-08-05 Thread James Harper
> > It's Xen yes, but no I didn't tried the RBD tab client, for two > reasons : > - too young to enable it in production > - Debian packages don't have the TAP driver > It works under Wheezy. blktap is available via dkms package, then just replace the tapdisk with the rbd version and follow the

Re: [ceph-users] kernel BUG at net/ceph/osd_client.c:2103

2013-08-04 Thread James Harper
What VM? If Xen, have you tried the rbd tap client? James > -Original Message- > From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users- > boun...@lists.ceph.com] On Behalf Of Olivier Bonvalet > Sent: Monday, 5 August 2013 11:07 AM > To: ceph-users@lists.ceph.com > Subject: [ceph-user

[ceph-users] network layout

2013-07-29 Thread James Harper
My servers all have 4 x 1gb network adapters, and I'm presently using DRBD over a bonded rr link. Moving to ceph, I'm thinking for each server: eth0 - LAN traffic for server and VM's eth1 - "public" ceph traffic eth2+eth3 - LACP bonded for "cluster" ceph traffic I'm thinking LACP should work ok

[ceph-users] 1 x raid0 or 2 x disk

2013-07-21 Thread James Harper
I have a server with 2 x 2TB disks. For performance, is it better to combine them as a single OSD backed by RAID0 or have 2 OSD's, each backed by a single disk? (log will be on SSD in either case). My need in performance is more IOPS than overall throughput (maybe that's a universal thing? :) Thanks

Re: [ceph-users] HEALTH_WARN low disk space

2013-07-14 Thread James Harper
> > On 07/14/2013 04:27 AM, James Harper wrote: > > My cluster is in HEALTH_WARN state because one of my monitors has low > disk space on /var/lib/ceph. Looking into this in more detail, there are a > bunch of .sst files dating back to Jul 7, and then a lot more at Jun 30 and &

[ceph-users] HEALTH_WARN low disk space

2013-07-13 Thread James Harper
My cluster is in HEALTH_WARN state because one of my monitors has low disk space on /var/lib/ceph. Looking into this in more detail, there are a bunch of .sst files dating back to Jul 7, and then a lot more at Jun 30 and older. I'm thinking that these older files are just ones that mon has faile
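If the old .sst files are leveldb leftovers, the usual cuttlefish-era remedies look roughly like this ("mon.7" is a placeholder monitor name):
  ceph tell mon.7 compact        # ask the monitor to compact its leveldb store
  # or in ceph.conf, compact automatically whenever the mon starts:
  [mon]
      mon compact on start = true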

Re: [ceph-users] consumer nas as osd

2013-07-04 Thread James Harper
> On Wed, Jul 3, 2013 at 3:42 AM, James Harper > wrote: > > Has anyone used a consumer grade NAS (netgear, qnap, dlink, etc) as an > OSD before? > > > > Qnap TS-421 has a Marvell 2Ghz CPU, 1Gbyte memory, dual gigabit > Ethernet, and 4 hotswap disk bays. Is there

[ceph-users] consumer nas as osd

2013-07-03 Thread James Harper
Has anyone used a consumer grade NAS (netgear, qnap, dlink, etc) as an OSD before? Qnap TS-421 has a Marvell 2Ghz CPU, 1Gbyte memory, dual gigabit Ethernet, and 4 hotswap disk bays. Is there anything about the Marvell CPU that would make OSD run badly? What about mon? Thanks James

Re: [ceph-users] Possible to bind one osd with a specific network adapter?

2013-06-21 Thread James Harper
> > Hi List, > Each of my osd nodes has 5 network Gb adapters, and has many osds, one > disk one osd. They are all connected with a Gb switch. > Currently I can get an average 100MB/s of read/write speed. To improve the > throughput further, the network bandwidth will be the bottleneck, right? Do
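A rough sketch of pinning individual OSDs to particular interfaces via per-daemon addresses in ceph.conf (all addresses are placeholders):
  [osd.0]
      public addr  = 192.168.200.10
      cluster addr = 10.1.1.10
  [osd.1]
      public addr  = 192.168.200.10
      cluster addr = 10.1.2.10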

Re: [ceph-users] Desktop or Enterprise SATA Drives?

2013-06-20 Thread James Harper
> Hi all > > I'm building a small ceph cluster with 3 nodes (my first ceph cluster). > Each Node with one System Disk, one Journal SSD Disk and one SATA OSD > Disk. > > My question is now should I use Desktop or Enterprise SATA Drives? > Enterprise Drives have a higher MTBF but the Firmware is ac

[ceph-users] why so many ceph-create-keys processes?

2013-06-19 Thread James Harper
Why are there so many ceph-create-keys processes? Under Debian, every time I start the mons another ceph-create-keys process starts up. Thanks James

[ceph-users] Why does ceph need a filesystem (was Simulating Disk Failure)

2013-06-14 Thread James Harper
> > Yeah. You've picked up on some warty bits of Ceph's error handling here for > sure, but it's exacerbated by the fact that you're not simulating what you > think. In a real disk error situation the filesystem would be returning EIO or > something, but here it's returning ENOENT. Since the OSD i

Re: [ceph-users] mons not starting

2013-05-31 Thread James Harper
> > My monitors are suddenly not starting up properly, or at all. Using latest > Debian release from ceph.com/debian-cuttlefish wheezy > > One (mon.7 ip ending in .190) starts but says things like this in the logs: > 1 mon.7@0(probing) e3 discarding message > mon_subscribe({monmap=0+,osdmap=796})

[ceph-users] mons not starting

2013-05-31 Thread James Harper
My monitors are suddenly not starting up properly, or at all. Using latest Debian release from ceph.com/debian-cuttlefish wheezy One (mon.7 ip ending in .190) starts but says things like this in the logs: 1 mon.7@0(probing) e3 discarding message mon_subscribe({monmap=0+,osdmap=796}) and sending
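A useful check when a monitor is stuck probing is its own view of the quorum via the admin socket (socket path is a placeholder):
  ceph --admin-daemon /var/run/ceph/ceph-mon.7.asok mon_status
  # look at "state" (probing/electing/leader/peon) and which peers are in "quorum"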

Re: [ceph-users] HEALTH_ERR 14 pgs inconsistent; 18 scrub errors

2013-05-13 Thread James Harper
> > Am 14.05.2013 02:11, schrieb James Harper: > >> > >> Am 14.05.2013 01:46, schrieb James Harper: > >>> After replacing a failed harddisk, ceph health reports "HEALTH_ERR 14 > pgs > >> inconsistent; 18 scrub errors" > >>&

[ceph-users] HEALTH_ERR 14 pgs inconsistent; 18 scrub errors

2013-05-13 Thread James Harper
After replacing a failed harddisk, ceph health reports "HEALTH_ERR 14 pgs inconsistent; 18 scrub errors" The disk was a total loss so I replaced it, ran mkfs etc and rebuilt the osd and while it has resynchronised everything the above still remains. What should I do to resolve this? Thanks Ja
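The usual sequence (a sketch; the PG id is a placeholder) is to find the inconsistent PGs and ask the primary to repair them from the surviving copies:
  ceph health detail | grep inconsistent    # lists e.g. "pg 2.6 is active+clean+inconsistent"
  ceph pg repair 2.6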

Re: [ceph-users] HEALTH_WARN after upgrade to cuttlefish

2013-05-08 Thread James Harper
> On 05/08/2013 08:44 AM, David Zafman wrote: > > > > According to "osdmap e504: 4 osds: 2 up, 2 in" you have 2 of 4 osds that are > down and out. That may be the issue. > > Also, running 'ceph health detail' will give you specifics on what is > causing the HEALTH_WARN. > # ceph health detail H

[ceph-users] HEALTH_WARN after upgrade to cuttlefish

2013-05-08 Thread James Harper
I've just upgraded my ceph install to cuttlefish (was 0.60) from Debian. My mon's don't regularly die anymore, or at least haven't so far, but health is always HEALTH_WARN even though I can't see any indication of why: # ceph status health HEALTH_WARN monmap e1: 3 mons at {4=192.168.200.1

Re: [ceph-users] Debian Squeeze - Ceph and RBD Kernel Modules Missing

2013-05-04 Thread James Harper
> > Squeeze is running 2.6.32, and the Ceph filesystem client was first > merged in 2.6.33 (rbd in 2.6.37 I think). We don't have any backports > to that far, sorry. The link I gave for squeeze backports (eg backported by debian from wheezy for squeeze) definitely includes the 3.2.x kernel which

Re: [ceph-users] Debian Squeeze - Ceph and RBD Kernel Modules Missing

2013-05-04 Thread James Harper
> > Yes, I had the same issue and was not able to resolve it for squeeze in the > short time I had. I ended up upgrading to wheezy and everything worked as > it should. Mine is a test cluster so I didn't mind upgrading but I need to > resolve the issue for prod deployment. > FWIW, there's a 3.2

Re: [ceph-users] Backporting the kernel client

2013-04-29 Thread James Harper
> > I'm probably not the only one who would like to run a > distribution-provided kernel (which for Debian Wheezy/Ubuntu Precise is > 3.2) and still have a recent-enough Ceph kernel client. So I'm wondering > whether it's feasible to backport the kernel client to an earlier kernel. You can grab t

[ceph-users] journal on ramdisk for testing

2013-04-25 Thread James Harper
I'm doing some testing and wanted to see the effect of increasing journal speed, and the fastest way to do this seemed to be to put it on a ramdisk where latency should drop to near zero and I can see what other inefficiencies exist. I created a tmpfs of sufficient size, copied journal on to tha
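For reference, a sketch of moving an OSD journal onto a tmpfs for testing (OSD id and paths are placeholders; obviously unsafe for anything but throwaway data):
  service ceph stop osd.70
  ceph-osd -i 70 --flush-journal       # write out anything still in the journal
  mount -t tmpfs -o size=2G tmpfs /mnt/ramjournal
  # point "osd journal" at /mnt/ramjournal/journal in ceph.conf (or symlink), then:
  ceph-osd -i 70 --mkjournal
  service ceph start osd.70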

[ceph-users] bad crc message in error logs

2013-04-24 Thread James Harper
I'm seeing a few messages like this on my OSD logfiles: 2013-04-25 00:00:08.174869 e3ca2b70 0 bad crc in data 1652929673 != exp 2156854821 2013-04-25 00:00:08.179749 e3ca2b70 0 -- 192.168.200.191:6882/30908 >> 192.168.200.197:0/3338580093 pipe(0xc70e1c0 sd=24 :6882 s=0 pgs=0 cs=0 l=0).accept

Re: [ceph-users] clean shutdown and failover of osd

2013-04-20 Thread James Harper
> > [ This is a good query for ceph-users. ] > Well... this is embarrassing. In reading the docs at http://ceph.com/docs/master/start/get-involved/ there was no mention of a users list so I just assumed there wasn't one. Looking again I see that if I go to the link from the main page http://c