Re: [ceph-users] strange cache tier behaviour with cephfs

2016-06-14 Thread Christian Balzer
Hello, On Tue, 14 Jun 2016 06:47:03 +0100 Nick Fisk wrote: > osd_tier_promote_max_objects_sec > and > osd_tier_promote_max_bytes_sec > Right, I remember those from February and May. And I'm not asking for this feature, but personally I would have split that into read and write promotes. As in,
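For reference, those throttles can be inspected at runtime through the OSD admin socket; a minimal sketch, assuming osd.0 runs on the local host with the default socket path:

  ceph daemon osd.0 config get osd_tier_promote_max_bytes_sec
  ceph daemon osd.0 config get osd_tier_promote_max_objects_sec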

Re: [ceph-users] strange unfounding of PGs

2016-06-14 Thread Csaba Tóth
Hi Nick! Yes I did. :( Do you know how I can fix it? Nick Fisk wrote (on 14 Jun 2016, Tue, 7:52): > Did you enable the sortbitwise flag as per the upgrade instructions, as > there is a known bug with it? I don't know why these instructions haven't > been amended in light of this bug.

Re: [ceph-users] strange unfounding of PGs

2016-06-14 Thread Christian Balzer
On Tue, 14 Jun 2016 07:09:45 + Csaba Tóth wrote: > Hi Nick! > Yes I did. :( > Do you know how I can fix it? > > Supposedly just by un-setting it: https://www.mail-archive.com/ceph-users@lists.ceph.com/msg29651.html Christian > Nick Fisk wrote (on 14 Jun 2016, Tue, 7:52): > > >
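For reference, the flag discussed in the linked post can be cleared (and later re-set) cluster-wide; a minimal sketch:

  ceph osd unset sortbitwise
  ceph -s                      # watch recovery / unfound object counts
  ceph osd set sortbitwise     # only once the known bug no longer applies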

Re: [ceph-users] strange unfounding of PGs

2016-06-14 Thread Csaba Tóth
Yes! After I read the mail I unset it immediately, and the recovery process started to continue. After I switched my kept-off OSD back on, Ceph found the unfound objects, and now the recovery process runs. Thanks Nick and Christian, you saved me! :) Christian Balzer wrote (on:

Re: [ceph-users] UnboundLocalError: local variable 'region_name' referenced before assignment

2016-06-14 Thread Parveen Sharma
Any help for me as well, please. :) - Parveen On Tue, Jun 14, 2016 at 11:55 AM, Parveen Sharma wrote: > Hi, > > I'm getting "UnboundLocalError: local variable 'region_name' referenced > before assignment" error while placing an object in my earlier created > bucket using my RADOSGW with boto

[ceph-users] Ubuntu Trusty: kernel 3.13 vs kernel 4.2

2016-06-14 Thread magicb...@hotmail.com
Hi list, is there any opinion/recommendation regarding the kernels available for Ubuntu Trusty and Ceph (hammer, xfs)? Is kernel 4.2 worth installing from a Ceph (hammer, xfs) perspective? Thanks :)

[ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Ansgar Jazdzewski
Hi, we are using ceph and radosGW to store images (~300kb each) in S3; when it comes to deep-scrubbing we are facing task timeouts (> 30s ...). My question is: with that number of objects/files, is it better to calculate the PGs on an object basis instead of the volume size? And how should it be
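For context, PG counts are normally sized from the number of OSDs and the pool's replica count rather than from object counts; a rough sketch of the usual guideline (numbers are examples, not a recommendation for this cluster):

  # target PGs per pool ~= (num_osds * 100) / pool_size, rounded up to a power of two
  # e.g. 40 OSDs, size=3:  (40 * 100) / 3 = ~1333  ->  2048 PGs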

Re: [ceph-users] Ubuntu Trusty: kernel 3.13 vs kernel 4.2

2016-06-14 Thread Jan Schermer
One storage setup has exhibited extremely poor performance in my lab on the 4.2 kernel (mdraid1+lvm+nfs); others run fine. No problems with xenial so far. If I had to choose an LTS kernel for trusty I'd choose the xenial one. (Btw I think the newest trusty point release already has the 4.2 HWE stack by

[ceph-users] tier pool 'ssdpool' has snapshot state; it cannot be added as a tier without breaking the pool.

2016-06-14 Thread ????
Hi all, I have made a SAS pool and an SSD pool, then ran "ceph osd tier add ssdpool saspool", and it says: tier pool 'ssdpool' has snapshot state; it cannot be added as a tier without breaking the pool. Has anyone hit this case? What can I do? Also, "ceph osd pool" has "mksnap"
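For reference, pool-level snapshots that trigger this check can be listed and, if no longer needed, removed; a minimal sketch (the snapshot name is hypothetical):

  rados -p ssdpool lssnap
  ceph osd pool rmsnap ssdpool mysnap

Whether the tier can be added afterwards depends on how the snapshot state was created; a pool that has used self-managed (e.g. RBD) snapshots may keep that state.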

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Wido den Hollander
> On 14 June 2016 at 10:10, Ansgar Jazdzewski > wrote: > > > Hi, > > we are using ceph and radosGW to store images (~300kb each) in S3, > when it comes to deep-scrubbing we are facing task timeouts (> 30s ...) > > my question is: > > in case of that amount of objects/files is it better to cal

Re: [ceph-users] strange cache tier behaviour with cephfs

2016-06-14 Thread Oliver Dzombic
Hi, ok let's take it step by step: before `dd if=file of=/dev/zero` [root@cephmon1 ~]# rados -p ssd_cache cache-flush-evict-all -> Moving all away [root@cephmon1 ~]# rados -p ssd_cache ls [root@cephmon1 ~]# -> empty cache osds at that point: /dev/sde1 234315556 84368 234231188 1%

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Василий Ангапов
Is it a good idea to disable scrub and deep-scrub for the bucket.index pool? What negative consequences might it cause? 2016-06-14 11:51 GMT+03:00 Wido den Hollander : > >> On 14 June 2016 at 10:10, Ansgar Jazdzewski >> wrote: >> >> >> Hi, >> >> we are using ceph and radosGW to store images (~300kb e
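For reference, and as the replies below advise against, the simplest lever for pausing scrubbing is the cluster-wide flags; a minimal sketch:

  ceph osd set noscrub
  ceph osd set nodeep-scrub
  # and to re-enable:
  ceph osd unset noscrub
  ceph osd unset nodeep-scrub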

[ceph-users] local variable 'region_name' referenced before assignment

2016-06-14 Thread Parveen Sharma
Hi, I'm getting "UnboundLocalError: local variable 'region_name' referenced before assignment" error while placing an object in my earlier created bucket using my RADOSGW with boto. My package details: $ sudo rpm -qa | grep rados librados2-10.2.1-0.el7.x86_64 libradosstriper1-10.2.1-0.el7.x86_

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Wido den Hollander
> On 14 June 2016 at 11:00, Василий Ангапов > wrote: > > > Is it a good idea to disable scrub and deep-scrub for the bucket.index > pool? What negative consequences might it cause? > No, I would not do that. Scrubbing is essential to detect (silent) data corruption. You should really scrub all you

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Василий Ангапов
Wido, can you please give more details about that? What sort of corruption may occur? What does scrubbing actually do, especially for the bucket index pool? 2016-06-14 12:05 GMT+03:00 Wido den Hollander : > >> On 14 June 2016 at 11:00, Василий Ангапов wrote: >> >> >> Is it a good idea to disable scrub a

Re: [ceph-users] Issue installing ceph with ceph-deploy

2016-06-14 Thread Fran Barrera
Hi, Thanks to both of you; finally the problem was fixed by deleting everything, including the ceph user, and installing again as George suggested. Best Regards, Fran. 2016-06-13 17:41 GMT+02:00 Tu Holmes : > I have seen this. > > Just stop ceph and kill any ssh processes related to it. > > I had the same

[ceph-users] local variable 'region_name' referenced before assignment

2016-06-14 Thread Parveen Sharma
Hi, I'm getting "UnboundLocalError: local variable 'region_name' referenced before assignment" error while placing an object in my earlier created bucket using my RADOSGW with boto. My package details: $ sudo rpm -qa | grep rados librados2-10.2.1-0.el7.x86_64 libradosstriper1-10.2.1-0.el7.x86_

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Nmz
- Original Message - From: Wido den Hollander To: Василий Ангапов Date: Tuesday, June 14, 2016, 12:05:51 PM Subject: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs >> On 14 June 2016 at 11:00, Василий Ангапов wrote: >> >> >> Is it a good idea to disable scrub and

Re: [ceph-users] strange cache tier behaviour with cephfs

2016-06-14 Thread Oliver Dzombic
Hi, wow. After setting this in ceph.conf and restarting the whole cluster: osd tier promote max bytes sec = 1610612736 osd tier promote max objects sec = 2 and repeating the test, the cache pool got the full 11 GB of the test file, with 2560 objects copied from the cold pool. Aaand, repea
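For reference, the same settings can usually be applied at runtime without a full cluster restart; a minimal sketch (some options may still report that a restart is required):

  ceph tell osd.* injectargs '--osd_tier_promote_max_bytes_sec 1610612736 --osd_tier_promote_max_objects_sec 2'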

Re: [ceph-users] local variable 'region_name' referenced before assignment

2016-06-14 Thread Parveen Sharma
I'm sending from my personal ID as my posts to ceph-users@lists.ceph.com are not reaching the mailing list, though I've subscribed. On Tue, Jun 14, 2016 at 2:49 PM, Parveen Sharma wrote: > > Hi, > > I'm getting "UnboundLocalError: local variable 'region_name' referenced > before assignment" e

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Christian Balzer
Hello, On Tue, 14 Jun 2016 12:20:44 +0300 Nmz wrote: > > > > - Original Message - > From: Wido den Hollander > To: Василий Ангапов > Date: Tuesday, June 14, 2016, 12:05:51 PM > Subject: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs > > > >> On 14 June 2016 at 11:

Re: [ceph-users] strange cache tier behaviour with cephfs

2016-06-14 Thread Oliver Dzombic
Hi, ok the write test also now shows a more expected behaviour. As it seems to me, if there is more writing than osd_tier_promote_max_bytes_sec, the writes go directly to the cold pool (which is really good behaviour, seriously). But that should definitely be added to the doc

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Ansgar Jazdzewski
Hi, yes we have index sharding enabled; we have only two big buckets at the moment, with 15Mil objects each, and some smaller ones cheers, Ansgar 2016-06-14 10:51 GMT+02:00 Wido den Hollander : > >> On 14 June 2016 at 10:10, Ansgar Jazdzewski >> wrote: >> >> >> Hi, >> >> we are using ceph and r

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Ansgar Jazdzewski
Hi, your cluster will be in warning state if you disable scrubbing, and you really need it in case of some data loss cheers, Ansgar 2016-06-14 11:05 GMT+02:00 Wido den Hollander : > >> On 14 June 2016 at 11:00, Василий Ангапов >> wrote: >> >> >> Is it a good idea to disable scrub and deep-scrub fo

[ceph-users] How to select particular OSD to act as primary OSD.

2016-06-14 Thread Kanchana. P
Hi, how do I select a particular OSD to act as the primary OSD? I modified the ceph.conf file and added [mon] ... mon osd allow primary affinity = true Restarted the ceph target; now primary affinity is set to true on all monitor nodes. Using the commands below I set some weights on the OSDs. $ ceph osd primar
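For reference, a minimal sketch of the affinity commands referred to above (OSD ids are examples; primary affinity only biases which OSD in a PG's acting set becomes primary, it does not pin all data to one OSD):

  ceph tell mon.* injectargs '--mon_osd_allow_primary_affinity=true'   # or set it in ceph.conf as above
  ceph osd primary-affinity osd.2 1.0
  ceph osd primary-affinity osd.0 0
  ceph osd primary-affinity osd.1 0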

Re: [ceph-users] How to select particular OSD to act as primary OSD.

2016-06-14 Thread shylesh kumar
Hi, I think you can edit the crush rule something like below rule another_replicated_ruleset { ruleset 1 type replicated min_size 1 max_size 10 step take default step take osd1 step choose firstn 1 type osd step emit step tak
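Re-formatted, the (truncated) rule above would look roughly like this; bucket names such as "osd1" and the tail of the rule are reconstructions, so treat it as a sketch only:

  rule another_replicated_ruleset {
          ruleset 1
          type replicated
          min_size 1
          max_size 10
          step take osd1                      # bucket containing the intended primary
          step choose firstn 1 type osd
          step emit
          step take default                   # remaining replicas from the default root
          step chooseleaf firstn -1 type host
          step emit
  }

It would then be assigned to a pool with something like "ceph osd pool set <pool> crush_ruleset 1".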

Re: [ceph-users] strange cache tier behaviour with cephfs

2016-06-14 Thread Nick Fisk
The basic logic is that if an IO is not in the cache tier, then proxy it, which means do the IO directly on the base tier. The throttle is designed to minimise the latency impact of promotions and flushes. So yes, during testing it will not promote everything, but during normal workloads it makes

[ceph-users] Unable to mount the CephFS file system from client node with "mount error 5 = Input/output error"

2016-06-14 Thread Rakesh Parkiti
Hello, Unable to mount the CephFS file system from client node with "mount error 5 = Input/output error" MDS was installed on a separate node. Ceph Cluster health is OK and mds services are running. firewall was disabled across all the nodes in a cluster. -- Ceph Cluster Nodes (RHEL 7.2 vers

[ceph-users] "mount error 5 = Input/output error" with the CephFS file system from client node

2016-06-14 Thread Rakesh Parkiti
Hello, Unable to mount the CephFS file system from client node with "mount error 5 = Input/output error" MDS was installed on a separate node. Ceph Cluster health is OK and mds services are running. firewall was disabled across all the nodes in a cluster. -- Ceph Cluster Nodes (RHEL 7.2 version

Re: [ceph-users] Unable to mount the CephFS file system fromclientnode with "mount error 5 = Input/output error"

2016-06-14 Thread Burkhard Linke
Hi, On 06/14/2016 01:21 PM, Rakesh Parkiti wrote: Hello, Unable to mount the CephFS file system from client node with *"mount error 5 = Input/output error"* MDS was installed on a separate node. Ceph Cluster health is OK and mds services are running. firewall was disabled across all the nodes
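For reference, the usual first checks for "mount error 5" and a minimal kernel-client mount sketch (the monitor address, mount point and secret file path are examples):

  ceph mds stat        # is an MDS actually active?
  ceph fs ls           # does a filesystem exist at all?
  mount -t ceph 192.168.1.10:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret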

Re: [ceph-users] local variable 'region_name' referenced before assignment

2016-06-14 Thread Shilpa Manjarabad Jagannath
- Original Message - > From: "Parveen Sharma" > To: ceph-users@lists.ceph.com > Sent: Tuesday, June 14, 2016 2:34:27 PM > Subject: [ceph-users] local variable 'region_name' referenced before > assignment > > Hi, > > I'm getting "UnboundLocalError: local variable 'region_name' referen

[ceph-users] Ceph and Openstack

2016-06-14 Thread Fran Barrera
Hi all, I have a problem integrating Glance with Ceph. Openstack Mitaka Ceph Jewel I've followed the Ceph doc ( http://docs.ceph.com/docs/jewel/rbd/rbd-openstack/) but when I try to list or create images, I have an error "Unable to establish connection to http://IP:9292/v2/images", and in the
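For reference, the [glance_store] section that the linked document describes looks roughly like this (pool and user names follow the doc's examples and may differ in your deployment); a connection error on port 9292 usually means the glance-api service itself is not running or not reachable:

  [glance_store]
  stores = rbd
  default_store = rbd
  rbd_store_pool = images
  rbd_store_user = glance
  rbd_store_ceph_conf = /etc/ceph/ceph.conf
  rbd_store_chunk_size = 8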

Re: [ceph-users] Disk failures

2016-06-14 Thread Jan Schermer
Hi, bit rot is not "bit rot" per se - nothing is rotting on the drive platter. It occurs during reads (mostly, anyway), and it's random. You can happily read a block and get the correct data, then read it again and get garbage, then get correct data again. This could be caused by a worn out cell

Re: [ceph-users] How to select particular OSD to act as primary OSD.

2016-06-14 Thread Kanchana. P
Thanks for the reply shylesh, but the procedure is not working. On ceph.com it is mentioned that we can make a particular OSD the primary OSD by setting the primary affinity weight between 0 and 1. But it is not working. On 14 Jun 2016 16:15, "shylesh kumar" wrote: > Hi, > > I think you can edit the cr

Re: [ceph-users] Ceph and Openstack

2016-06-14 Thread Jason Dillaman
On Tue, Jun 14, 2016 at 8:15 AM, Fran Barrera wrote: > 2016-06-14 14:02:54.634 2256 DEBUG glance_store.capabilities [-] Store > glance_store._drivers.rbd.Store doesn't support updating dynamic storage > capabilities. Please overwrite 'update_capabilities' method of the store to > implement updatin

Re: [ceph-users] "mount error 5 = Input/output error" with the CephFS file system from client node

2016-06-14 Thread Gregory Farnum
On Tue, Jun 14, 2016 at 4:29 AM, Rakesh Parkiti wrote: > Hello, > > Unable to mount the CephFS file system from client node with "mount error 5 > = Input/output error" > MDS was installed on a separate node. Ceph Cluster health is OK and mds > services are running. firewall was disabled across all

Re: [ceph-users] RadosGW - Problems running the S3 and SWIFT API at the same time

2016-06-14 Thread Saverio Proto
I am at the Ceph Day at CERN; I asked Sage whether it is supported to enable both the S3 and Swift API at the same time. The answer is yes, so it is meant to be supported, and what we see here is probably a bug. I opened a bug report: http://tracker.ceph.com/issues/16293 If anyone has a chance to

[ceph-users] RGW: ERROR: failed to distribute cache

2016-06-14 Thread Василий Ангапов
Hello, I have Ceph 10.2.1 and when creating a user in RGW I get the following error: $ radosgw-admin user create --uid=test --display-name="test" 2016-06-14 14:07:32.332288 7f00a4487a40 0 ERROR: failed to distribute cache for ed-1.rgw.meta:.meta:user:test:_dW3fzQ3UX222SWQvr3qeHYR:1 2016-06-14 14:0

Re: [ceph-users] RGW: ERROR: failed to distribute cache

2016-06-14 Thread Василий Ангапов
I also get the following: $ radosgw-admin period update --commit 2016-06-14 14:32:28.982847 7fed392baa40 0 ERROR: failed to distribute cache for .rgw.root:periods.87abf44e-cab3-48c4-b012-0a9247519a5b:staging 2016-06-14 14:32:38.991846 7fed392baa40 0 ERROR: failed to distribute cache for .rgw.ro

Re: [ceph-users] librados and multithreading

2016-06-14 Thread Юрий Соколов
Come on, friends, no one knows an answer? On 12 June 2016 at 16:21, "Юрий Соколов" wrote: > I don't know. That is why I'm asking here. > > 2016-06-12 6:36 GMT+03:00 Ken Peng : > > Hi, > > > > We had experienced a similar error when writing to an RBD block with > > multiple threads using fi

Re: [ceph-users] RGW: ERROR: failed to distribute cache

2016-06-14 Thread Василий Ангапов
BTW, I have 10 RGW load balanced through Apache. When restarting one of them I get the following messages in log: 2016-06-14 14:44:15.919801 7fd4728dea40 2 all 8 watchers are set, enabling cache 2016-06-14 14:44:15.919879 7fce370f7700 2 garbage collection: start 2016-06-14 14:44:15.919990 7fce36

Re: [ceph-users] Ceph and Openstack

2016-06-14 Thread Jonathan D. Proulx
On Tue, Jun 14, 2016 at 02:15:45PM +0200, Fran Barrera wrote: :Hi all, : :I have a problem integration Glance with Ceph. : :Openstack Mitaka :Ceph Jewel : :I've following the Ceph doc ( :http://docs.ceph.com/docs/jewel/rbd/rbd-openstack/) but when I try to list :or create images, I have an error "U

Re: [ceph-users] librados and multithreading

2016-06-14 Thread Jason Dillaman
On Fri, Jun 10, 2016 at 12:37 PM, Юрий Соколов wrote: > Good day, all. > > I found this issue: https://github.com/ceph/ceph/pull/5991 > > Did this issue affected librados ? No -- this affected the start-up and shut-down of librbd as described in the associated tracker ticket. > Were it safe to u

Re: [ceph-users] Ceph and Openstack

2016-06-14 Thread Iban Cabrillo
Hi Jon, Which hypervisor is used for your Openstack deployment? We had lots of trouble with Xen until the latest libvirt (in the libvirt < 1.3.2 package, the RBD driver was not supported) Regards, I 2016-06-14 17:38 GMT+02:00 Jonathan D. Proulx : > On Tue, Jun 14, 2016 at 02:15:45PM +0200, Fran B

Re: [ceph-users] librados and multithreading

2016-06-14 Thread Юрий Соколов
Thank you, Jason. 2016-06-14 18:43 GMT+03:00 Jason Dillaman : > On Fri, Jun 10, 2016 at 12:37 PM, Юрий Соколов wrote: >> Good day, all. >> >> I found this issue: https://github.com/ceph/ceph/pull/5991 >> >> Did this issue affected librados ? > > No -- this affected the start-up and shut-down of l

[ceph-users] cephfs reporting 2x data available

2016-06-14 Thread Daniel Davidson
I have just deployed a cluster and started messing with it, which I think has two replicas. However when I have a metadata server and mount via fuse, it is reporting its full size. With two replicas, I thought it would be only reporting half of that. Did I make a mistake, or is there something I

Re: [ceph-users] which CentOS 7 kernel is compatible with jewel?

2016-06-14 Thread Ilya Dryomov
On Mon, Jun 13, 2016 at 8:37 PM, Michael Kuriger wrote: > I just realized that this issue is probably because I’m running jewel 10.2.1 > on the servers side, but accessing from a client running hammer 0.94.7 or > infernalis 9.2.1 > > Here is what happens if I run rbd ls from a client on infernal

Re: [ceph-users] Ceph and Openstack

2016-06-14 Thread Jonathan D. Proulx
On Tue, Jun 14, 2016 at 05:48:11PM +0200, Iban Cabrillo wrote: :Hi Jon, : Which is the hypervisor used for your Openstack deployment? We have lots :of troubles with xen until latest libvirt ( in libvirt < 1.3.2 package, RDB :driver was not supported ) we're using kvm (Ubuntu 14.04, libvirt 1.2.1

Re: [ceph-users] Clearing Incomplete Clones State

2016-06-14 Thread Lazuardi Nasution
Hi, Additional information. It seems that the snapshot state is wrong. Any idea on my case? How can I manually edit the pool flags to remove the "incomplete_clones" flag? [root@management-b ~]# rados -p rbd ls rbd_directory [root@management-b ~]# rados -p rbd_cache ls rbd_directory [root@management-b ~]# rado
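For reference, the documented order for tearing down a writeback cache tier (pool names taken from the listing above; newer releases may require --yes-i-really-mean-it for the mode change) is roughly:

  ceph osd tier cache-mode rbd_cache forward
  rados -p rbd_cache cache-flush-evict-all
  ceph osd tier remove-overlay rbd
  ceph osd tier remove rbd rbd_cache

The incomplete_clones flag itself is set and cleared by this tiering machinery; there does not appear to be a supported command to edit it directly.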

Re: [ceph-users] cephfs reporting 2x data available

2016-06-14 Thread John Spray
On Tue, Jun 14, 2016 at 7:45 PM, Daniel Davidson wrote: > I have just deployed a cluster and started messing with it, which I think > two replicas. However when I have a metadata server and mount via fuse, it > is reporting its full size. With two replicas, I thought it would be only > reporting

[ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Noah Watkins
Installing Jewel with ceph-deploy has been working for weeks. Today I started to get some dependency issues: [b61808c8624c][DEBUG ] The following packages have unmet dependencies: [b61808c8624c][DEBUG ] ceph : Depends: ceph-mon (= 10.2.1-1trusty) but it is not going to be installed [b61808c8624c]

[ceph-users] Protecting rbd from multiple simultaneous mapping.

2016-06-14 Thread Puneet Zaroo
The email thread here : http://www.spinics.net/lists/ceph-devel/msg12226.html discusses a way of preventing multiple simultaneous clients from mapping an rbd via the legacy advisory locking scheme, along with osd blacklisting. Is it now advisable to use the exclusive lock feature, discussed here :
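For reference, a minimal sketch of the exclusive-lock feature on a Jewel image (pool/image names are examples); note that exclusive-lock by itself hands the lock over to whichever client writes next, so it serializes access rather than forbidding a second mapping:

  rbd feature enable rbd/myimage exclusive-lock
  rbd lock list rbd/myimage    # shows the current (managed) lock holder
  rbd status rbd/myimage       # shows watchers, i.e. clients with the image open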

Re: [ceph-users] 40Mil objects in S3 rados pool / how calculate PGs

2016-06-14 Thread Василий Ангапов
But is there any way to recreate the bucket index for an existing bucket? Is it possible to change the bucket's index pool to some new pool in its metadata and then tell RadosGW to rebuild (--check --fix) the index? Sounds really crazy, but will it work? Will the new index become sharded? 2016-06-14 13:18 GMT+03:

Re: [ceph-users] cephfs reporting 2x data available

2016-06-14 Thread Daniel Davidson
Thanks John, I just wanted to make sure I wasn't doing anything wrong; that should work fine. Dan On 06/14/2016 03:24 PM, John Spray wrote: On Tue, Jun 14, 2016 at 7:45 PM, Daniel Davidson wrote: I have just deployed a cluster and started messing with it, which I think two replicas. Howeve

Re: [ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Alfredo Deza
Is it possible you tried to install just when I was syncing 10.2.2 ? :) Would you mind trying this again and see if you are good? On Tue, Jun 14, 2016 at 5:31 PM, Noah Watkins wrote: > Installing Jewel with ceph-deploy has been working for weeks. Today I > started to get some dependency issues:

Re: [ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Alfredo Deza
On Tue, Jun 14, 2016 at 5:52 PM, Alfredo Deza wrote: > Is it possible you tried to install just when I was syncing 10.2.2 ? > > :) > > Would you mind trying this again and see if you are good? > > On Tue, Jun 14, 2016 at 5:31 PM, Noah Watkins wrote: >> Installing Jewel with ceph-deploy has been w

Re: [ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Noah Watkins
Yeah, I'm still seeing the problem too. Thanks. On Tue, Jun 14, 2016 at 2:55 PM Alfredo Deza wrote: > On Tue, Jun 14, 2016 at 5:52 PM, Alfredo Deza wrote: > > Is it possible you tried to install just when I was syncing 10.2.2 ? > > > > :) > > > > Would you mind trying this again and see if you a

[ceph-users] Spreading deep-scrubbing load

2016-06-14 Thread Jared Curtis
I’ve just started looking into one of our ceph clusters because a weekly deep scrub had a major IO impact on the cluster which caused multiple VMs to grind to a halt. So far I’ve discovered that this particular cluster is configured incorrectly for the number of PGs per OSD. Currently that sett

[ceph-users] striping for a small cluster

2016-06-14 Thread pixelfairy
We have a small cluster: 3 mons, each of which also has 6 4TB OSDs, and a 20gig link to the cluster (2x10gig LACP to a stacked pair of switches). We'll have at least one replicated pool (size=3) and one erasure-coded pool. The current plan is to have journals coexist with OSDs as that seems to be the safest and m
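If the question is about RBD fancy striping specifically, a minimal sketch of creating a striped format-2 image (pool/image names and values are examples; stripe-unit is in bytes):

  rbd create mypool/myimage --size 102400 --image-format 2 --stripe-unit 65536 --stripe-count 16
  rbd info mypool/myimage    # shows the stripe unit/count actually applied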

Re: [ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Alfredo Deza
We are now good to go. Sorry for all the trouble; some packages were missing from the metadata, so I had to resync+re-sign them to get everything in order. Just tested it out and it works as expected. Let me know if you have any issues. On Tue, Jun 14, 2016 at 5:57 PM, Noah Watkins wrote: > Yeh, I'm s

Re: [ceph-users] Spreading deep-scrubbing load

2016-06-14 Thread Christian Balzer
Hello, On Wed, 15 Jun 2016 00:01:42 + Jared Curtis wrote: > I’ve just started looking into one of our ceph clusters because a weekly > deep scrub had a major IO impact on the cluster which caused multiple > VMs to grind to a halt. > A story you will find aplenty in the ML archives. > So fa
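For reference, the knobs usually tuned to spread scrub load look roughly like this in ceph.conf (values are illustrative examples, not recommendations):

  [osd]
  osd scrub begin hour = 1          # only start scrubs between 01:00
  osd scrub end hour = 7            # ... and 07:00
  osd scrub sleep = 0.1             # throttle chunk-to-chunk scrub work
  osd scrub load threshold = 0.5    # skip scrubs when system load is high
  osd deep scrub interval = 2419200 # 4 weeks, spreads deep scrubs further apart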

Re: [ceph-users] ceph-deploy jewel install dependencies

2016-06-14 Thread Noah Watkins
Working for me now. Thanks for taking care of this. - Noah On Tue, Jun 14, 2016 at 5:42 PM, Alfredo Deza wrote: > We are now good to go. > > Sorry for all the troubles, some packages were missed in the metadata, > had to resync+re-sign them to get everything in order. > > Just tested it out and

Re: [ceph-users] Disk failures

2016-06-14 Thread Christian Balzer
Hello, On Tue, 14 Jun 2016 14:26:41 +0200 Jan Schermer wrote: > Hi, > bit rot is not "bit rot" per se - nothing is rotting on the drive > platter. Never mind that I used the wrong terminology (according to Wiki) and that my long experience with "laser-rot" probably caused me to choose that ter

Re: [ceph-users] striping for a small cluster

2016-06-14 Thread Christian Balzer
Hello, On Wed, 15 Jun 2016 00:22:51 + pixelfairy wrote: > We have a small cluster, 3mons, each which also have 6 4tb osds, and a > 20gig link to the cluster (2x10gig lacp to a stacked pair of switches). > well have at least replica pool (size=3) and one erasure coded pool. I'm neither parti

Re: [ceph-users] striping for a small cluster

2016-06-14 Thread pixelfairy
looks like we'll rebuild the cluster when bluestore is released anyway. Thanks! On Tue, Jun 14, 2016 at 7:02 PM Christian Balzer wrote: > > Hello, > > On Wed, 15 Jun 2016 00:22:51 + pixelfairy wrote: > > > We have a small cluster, 3mons, each which also have 6 4tb osds, and a > > 20gig link t

Re: [ceph-users] Disk failures

2016-06-14 Thread Bill Sharer
This is why I use btrfs mirror sets underneath ceph, and hopefully more than make up for the space loss by going with 2 replicas instead of 3 plus on-the-fly LZO compression. The ceph deep scrubs replace any need for btrfs scrubs, but I still get the benefit of self healing when btrfs finds bit
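For reference, a minimal sketch of the kind of btrfs setup described (device names and mount point are examples):

  mkfs.btrfs -d raid1 -m raid1 /dev/sdb /dev/sdc
  mount -o compress=lzo,noatime /dev/sdb /var/lib/ceph/osd/ceph-0
  btrfs scrub start /var/lib/ceph/osd/ceph-0   # optional; the post relies on ceph deep scrubs instead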

Re: [ceph-users] Disk failures

2016-06-14 Thread Gandalf Corvotempesta
On 15 Jun 2016 03:27, "Christian Balzer" wrote: > And that makes deep-scrubbing something of quite limited value. This is not true. If you checksum *before* writing to disk (so when data is still in RAM), then when reading back from disk you could do the checksum verification, and if it doesn't m