Re: [ceph-users] SSD journals killed by VMs generating 500 IOPs (4kB) non-stop for a month, seemingly because of a syslog-ng bug

2015-11-23 Thread Mart van Santen
Hello, On 11/22/2015 10:01 PM, Robert LeBlanc wrote: > There have been numerous reports on the mailing list of the Samsung EVO and > Pros failing far before their expected wear. This is most likely due > to the 'uncommon' workload of Ceph and the controllers of those drives > are not really designed to

[ceph-users] Cluster always scrubbing.

2015-11-23 Thread Mika c
Hi cephers, We are facing a scrub issue. Our CEPH cluster is using Trusty / Hammer 0.94.1 and has almost 320 OSD disks on 10 nodes, and there are more than 30,000 PGs in the cluster. The cluster worked fine until last week. We found the cluster health status started displaying "active+clean+scrubbing+d

Re: [ceph-users] SSD journals killed by VMs generating 500 IOPs (4kB) non-stop for a month, seemingly because of a syslog-ng bug

2015-11-23 Thread Eneko Lacunza
Hi Mart, On 23/11/15 at 10:29, Mart van Santen wrote: On 11/22/2015 10:01 PM, Robert LeBlanc wrote: There have been numerous reports on the mailing list of the Samsung EVO and Pros failing far before their expected wear. This is most likely due to the 'uncommon' workload of Ceph and the control

Re: [ceph-users] SSD journals killed by VMs generating 500 IOPs (4kB) non-stop for a month, seemingly because of a syslog-ng bug

2015-11-23 Thread Sean Redmond
Hi Mart, I agree with Eneko, I had 72 of the Samsung EVO drives in service for journals (4:1) and ended up replacing them all within 9 months with Intel DC 3700's due to a high number of failures and very poor performance resulting in frequent blocked ops. Just stick with the Intel Data Center Grad

Re: [ceph-users] Cluster always scrubbing.

2015-11-23 Thread Sean Redmond
Hi Mika, Have the scrubs been running for a long time? Can you see what pool they are running on? You can check using `ceph pg dump | grep scrub` Thanks On Mon, Nov 23, 2015 at 9:32 AM, Mika c wrote: > Hi cephers, > We are facing a scrub issue. Our CEPH cluster is using Trusty / Hammer > 0.
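
To expand on the command Sean mentions, something along these lines shows the scrubbing PGs and which pool they belong to (the pool id is the part of the PG id before the dot):

    ceph pg dump | grep -i scrub     # PGs currently scrubbing / deep-scrubbing
    ceph osd lspools                 # map pool ids to pool names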

Re: [ceph-users] SSD journals killed by VMs generating 500 IOPs (4kB) non-stop for a month, seemingly because of a syslog-ng bug

2015-11-23 Thread Mart van Santen
On 11/23/2015 10:42 AM, Eneko Lacunza wrote: > Hi Mart, > > On 23/11/15 at 10:29, Mart van Santen wrote: >> On 11/22/2015 10:01 PM, Robert LeBlanc wrote: >>> There have been numerous reports on the mailing list of the Samsung EVO and >>> Pros failing far before their expected wear. This is mo

[ceph-users] op sequence

2015-11-23 Thread louis
Hi, if I submit read or write IO in a sequence from a ceph client, will this sequence be kept on the OSD side? Thanks. Sent from NetEase Mail Master

Re: [ceph-users] Fixing inconsistency

2015-11-23 Thread Gregory Farnum
On Wed, Nov 18, 2015 at 4:34 AM, Межов Игорь Александрович wrote: > Hi! > > As for my previous message, digging mailing list gave me only one method to > fix > inconsistency - truncate object files in a filesystem to a size, that they > have > in ceph metadata: > > http://www.spinics.net/lists/c
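
For reference, the usual starting point before any manual surgery is to locate the inconsistent PG and, if the primary copy is known good, let Ceph repair it; a sketch only (the PG id below is hypothetical):

    ceph health detail | grep inconsistent   # lists the inconsistent PG ids
    ceph pg repair 3.1f                      # hypothetical PG id; repair copies from the primary, so verify which replica is correct first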

Re: [ceph-users] Objects per PG skew warning

2015-11-23 Thread Gregory Farnum
On Thu, Nov 19, 2015 at 8:56 PM, Richard Gray wrote: > Hi, > > Running 'health detail' on our Ceph cluster this morning, I notice a warning > about one of the pools having significantly more objects per placement group > than the cluster average. > > ceph> health detail > HEALTH_WARN pool cas_back
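
A rough way to see where the skew comes from and, if the pool is legitimately object-heavy, to relax the warning threshold; the mon_pg_warn_max_object_skew option name and the injectargs form are assumptions, so verify them against your release:

    ceph df detail                    # object counts per pool
    ceph osd pool get <pool> pg_num   # PG count of the flagged pool
    ceph tell mon.* injectargs '--mon_pg_warn_max_object_skew 20'   # hypothetical: raise the 10x default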

[ceph-users] Cannot Issue Ceph Command

2015-11-23 Thread James Gallagher
Hi there, I have managed to complete the storage cluster Quick Start Guide and have had no issues so far. However, whenever I try to use a [ceph] command rather than a [ceph-deploy] command, I get an error message which basically states that the command isn't installed. bash: ceph: command not foun

[ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread German Anders
Hi all, I want to know if there's any improvement or update regarding ceph 0.94.5 with accelio. I have an already configured cluster (with no data on it) and I would like to know if there's a way to 'modify' the cluster in order to use accelio. Any info would be really appreciated. Cheers, *German

Re: [ceph-users] Cannot Issue Ceph Command

2015-11-23 Thread Mykola
Please run ceph-deploy on your host machine as well. Sent from Outlook Mail for Windows 10 phone From: James Gallagher Sent: Monday, November 23, 2015 5:03 PM To: ceph-users@lists.ceph.com Subject: [ceph-users] Cannot Issue Ceph Command Hi there, I have managed to complete the storage cluste

Re: [ceph-users] Cannot Issue Ceph Command

2015-11-23 Thread Mart van Santen
ceph-deploy and the ceph command itself are separate packages. It is possible to install *only* the ceph-deploy package without the ceph package. Normally it is as simple as "apt-get install ceph" (depending on your OS). Regards, Mart van Santen On 11/23/2015 05:03 PM, James Gallagher wrote: >
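
A minimal sketch of getting a working ceph CLI onto a node with ceph-deploy (run from the admin node; <node> is a placeholder):

    ceph-deploy install <node>    # installs the ceph packages, including the ceph CLI
    ceph-deploy admin <node>      # pushes ceph.conf and the admin keyring so ceph commands can reach the cluster
    # or install the package directly on Debian/Ubuntu:
    sudo apt-get install ceph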

[ceph-users] v10.0.0 released

2015-11-23 Thread Sage Weil
This is the first development release for the Jewel cycle. We are off to a good start, with lots of performance improvements flowing into the tree. We are targeting sometime in Q1 2016 for the final Jewel. Notable Changes --- * build: cmake tweaks (`pr#6254

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread Gregory Farnum
On Mon, Nov 23, 2015 at 10:05 AM, German Anders wrote: > Hi all, > > I want to know if there's any improvement or update regarding ceph 0.94.5 > with accelio, I've an already configured cluster (with no data on it) and I > would like to know if there's a way to 'modify' the cluster in order to use

Re: [ceph-users] op sequence

2015-11-23 Thread David Riedl
As far as I understand the structure of CEPH the answer is No. The CRUSH Algorithm decides when and how data gets written. Original research paper about CRUSH: http://ceph.com/papers/weil-crush-sc06.pdf High level description of the CRUSH map inside CEPH http://docs.ceph.com/docs/master/rados/ope

Re: [ceph-users] op sequence

2015-11-23 Thread Gregory Farnum
On Mon, Nov 23, 2015 at 8:44 AM, louis wrote: > Hi, if I submit read or write IO in a sequence from a ceph client, will > this sequence be kept on the OSD side? Thanks Any writes from the same client, to the same object, will be ordered with respect to one another. But there are no other guara

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread German Anders
Thanks a lot for the quick update Greg. This leads me to ask if there's anything out there to improve performance in an Infiniband environment with Ceph. In the cluster that I mentioned earlier, I've set up 4 OSD server nodes, each with 8 OSD daemons running with 800x Intel SSD DC S3710 disks (7

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread Mark Nelson
Hi German, I don't have exactly the same setup, but on the ceph community cluster I have tests with: 4 nodes, each of which are configured in some tests with: 2 x Intel Xeon E5-2650 1 x Intel XL710 40GbE (currently limited to about 2.5GB/s each) 1 x Intel P3700 800GB (4 OSDs per card using 4

Re: [ceph-users] SSD Caching Mode Question

2015-11-23 Thread Samuel Just
My read of that doc is that you still need to either set the configs to force all objects to be flushed or use the rados command to flush/evict all objects. -Sam On Wed, Nov 18, 2015 at 2:38 AM, Nick Fisk wrote: > Hi Robert, > >> -Original Message- >> From: ceph-users [mailto:ceph-users-b

Re: [ceph-users] SSD Caching Mode Question

2015-11-23 Thread Robert LeBlanc
Hmmm. It sounds like some objects should be flushed automatically but maybe not all of them. However, I'm not seeing any objects being evicted at all and I know that objects in the tier are being modified. 1. Change the cache mode to forward so that
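
A sketch of the sequence being discussed here, with <cachepool> as a placeholder for the cache tier pool:

    ceph osd tier cache-mode <cachepool> forward   # stop admitting new objects into the tier
    rados -p <cachepool> cache-flush-evict-all     # flush dirty objects and evict clean ones
    ceph df                                        # check whether the cache pool has drained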

[ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
Hi guys ... Is there any advantage in running CEPH over a Linux SW-RAID to avoid data corruption due to disk bad blocks? Can we just rely on the scrubbing feature of CEPH? Can we live without an underlying layer that prevents hardware problems from being passed to CEPH? I have a setup where I put one O

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jan Schermer
SW-RAID doesn't help with bit-rot if that's what you're afraid of. If you are afraid of bit-rot you need to use a fully checksumming filesystem like ZFS. Ceph doesn't help there either when using replicas - not sure how strong error detection+correction is in EC-type pools. The only thing I can sug

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Robert LeBlanc
Most people run their clusters with no RAID for the data disks (some will run RAID for the journals, but we don't). We use the scrub mechanism to find data inconsistency and we use three copies to do RAID over host/racks, etc. Unless you have a speci
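
For completeness, the scrub mechanism Robert refers to can also be driven by hand; deep scrubs are the ones that read and compare object data across replicas (ids are placeholders):

    ceph osd deep-scrub <osd-id>   # deep-scrub all PGs whose primary is this OSD
    ceph pg deep-scrub <pgid>      # deep-scrub a single PG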

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Lionel Bouton
On 23/11/2015 18:17, Jan Schermer wrote: > SW-RAID doesn't help with bit-rot if that's what you're afraid of. > If you are afraid of bit-rot you need to use a fully checksumming filesystem > like ZFS. > Ceph doesn't help there either when using replicas - not sure how strong > error detection+cor

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
Yes, but with SW-RAID, when we have a block that was read and does not match its checksum, the device falls out of the array, and the data is read again from the other devices in the array. The problem is that in SW-RAID1 we don't have the bad blocks isolated. The disks can be synchronized again as t

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread German Anders
Hi Mark, Thanks a lot for the quick response. Regarding the numbers that you sent me, they look REALLY nice. I have the following setup, 4 OSD nodes: 2 x Intel Xeon E5-2650v2 @2.60Ghz 1 x Network controller: Mellanox Technologies MT27500 Family [ConnectX-3] Dual-Port (1 for PUB and 1 for CLUS) 1 x

[ceph-users] ceph-mon cpu 100%

2015-11-23 Thread Yujian Peng
The mons in my production cluster have very high CPU usage (100%). I think it may be caused by leveldb compression. How do you disable leveldb compression? Just add leveldb_compression = false to ceph.conf and restart the mons? Thanks a lot!
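
Along the lines of the question, the static change would look roughly like this in ceph.conf, followed by a monitor restart (the restart command depends on the init system; the one below is the Ubuntu upstart form of that era):

    # /etc/ceph/ceph.conf
    [mon]
    leveldb_compression = false

    # then restart each monitor, e.g.:
    sudo restart ceph-mon id=<mon-id>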

Re: [ceph-users] SSD Caching Mode Question

2015-11-23 Thread Nick Fisk
> -Original Message- > From: Robert LeBlanc [mailto:rob...@leblancnet.us] > Sent: 23 November 2015 17:16 > To: Samuel Just > Cc: Nick Fisk ; Ceph-User > Subject: Re: [ceph-users] SSD Caching Mode Question > > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Hmmm. It sounds like som

Re: [ceph-users] ceph-mon cpu 100%

2015-11-23 Thread Gregory Farnum
Yep. I think you can inject it into the running mons without restarting as well (injectargs). -Greg On Mon, Nov 23, 2015 at 11:46 AM, Yujian Peng wrote: > The mons in my production cluster have a very high cpu usage 100%. > I think it may be caused by the leveldb compression. > How yo disable lev
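
Greg's injectargs suggestion would look roughly like this (mon.a is a hypothetical monitor id; whether leveldb actually picks the change up without a restart is worth verifying):

    ceph tell mon.a injectargs '--leveldb_compression=false'
    # or, if your release accepts the wildcard form:
    ceph tell mon.* injectargs '--leveldb_compression=false'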

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread Robert LeBlanc
Are you using unconnected mode or connected mode? With connected mode you can up your MTU to 64K which may help on the network side. - Robert LeBlanc On Mon, Nov 23
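
As a quick check, the current IPoIB mode is exposed in sysfs (interface name taken from later in the thread):

    cat /sys/class/net/ib0/mode   # prints 'datagram' or 'connected'
    ip link show ib0              # shows the current MTU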

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jan Schermer
So I assume we _are_ talking about bit-rot? > On 23 Nov 2015, at 18:37, Jose Tavares wrote: > > Yes, but with SW-RAID, when we have a block that was read and does not > match its checksum, the device falls out of the array, and the data is read > again from the other devices in the array. That'

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Lionel Bouton
Hi, On 23/11/2015 18:37, Jose Tavares wrote: > Yes, but with SW-RAID, when we have a block that was read and does not match > its checksum, the device falls out of the array I don't think so. Under normal circumstances a device only falls out of an md array if it doesn't answer IO queries afte
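
For reference, md does expose its own consistency check, which is the closest SW-RAID equivalent to a scrub; a sketch with md0 as a placeholder device:

    echo check > /sys/block/md0/md/sync_action   # read and compare all members
    cat /sys/block/md0/md/mismatch_cnt           # sectors that did not match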

Re: [ceph-users] High load during recovery (after disk placement)

2015-11-23 Thread Gregory Farnum
On Fri, Nov 20, 2015 at 11:33 AM, Simon Engelsman wrote: > Hi, > > We've experienced a very weird problem last week with our Ceph > cluster. We would like to ask your opinion(s) and advice > > Our dedicated Ceph OSD nodes run with: > > Total platform > - IO Average: 2500 wrps, ~ 600 rps > - Replic

Re: [ceph-users] CACHEMODE_READFORWARD doesn't try proxy write?

2015-11-23 Thread Gregory Farnum
Yeah, the write proxying is pretty new and the fact that it's missing from an oddball like READFORWARD isn't surprising. (Not good, exactly, but not surprising.) What are you doing with this caching mode? On Thu, Nov 19, 2015 at 10:34 AM, Nick Fisk wrote: > Don’t know why that URL got changed, i

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread German Anders
Hi Robert, Thanks for the response. It was configured as 'datagram', so I tried to change it in the /etc/network/interfaces file and add the following: ## IB0 PUBLIC_CEPH auto ib0 iface ib0 inet static address 172.23.17.8 netmask 255.255.240.0 network 172.23.16.0 post-up

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
On Mon, Nov 23, 2015 at 4:07 PM, Jan Schermer wrote: > So I assume we _are_ talking about bit-rot? > > > On 23 Nov 2015, at 18:37, Jose Tavares wrote: > > > > Yes, but with SW-RAID, when we have a block that was read and does not > > match its checksum, the device falls out of the array, and the

Re: [ceph-users] Ceph 0.94.5 with accelio

2015-11-23 Thread German Anders
Got it Robert, It was my mistake, I put post-up instead of pre-up, now it changed ok, I'll do new tests with this config and let you know. Regards, *German* 2015-11-23 15:36 GMT-03:00 German Anders : > Hi Robert, > > Thanks for the response. I was configured as 'datagram', so I try to > change
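
For reference, a /etc/network/interfaces fragment along the lines described in this sub-thread (addresses taken from the earlier message; 65520 is the usual IPoIB connected-mode maximum MTU; treat the exact stanza as a sketch, not a verified config):

    auto ib0
    iface ib0 inet static
        address 172.23.17.8
        netmask 255.255.240.0
        pre-up echo connected > /sys/class/net/ib0/mode
        mtu 65520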

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
On Mon, Nov 23, 2015 at 4:15 PM, Lionel Bouton < lionel-subscript...@bouton.name> wrote: > Hi, > > On 23/11/2015 18:37, Jose Tavares wrote: > > Yes, but with SW-RAID, when we have a block that was read and does not > match its checksum, the device falls out of the array > > I don't think so. Un

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Lionel Bouton
On 23/11/2015 19:58, Jose Tavares wrote: > On Mon, Nov 23, 2015 at 4:15 PM, Lionel Bouton wrote: > > Hi, > > On 23/11/2015 18:37, Jose Tavares wrote: > > Yes, but with SW-RAID, when we have a block that was read and does not > matc

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
On Mon, Nov 23, 2015 at 5:26 PM, Lionel Bouton < lionel-subscript...@bouton.name> wrote: > On 23/11/2015 19:58, Jose Tavares wrote: > > On Mon, Nov 23, 2015 at 4:15 PM, Lionel Bouton < > lionel-subscript...@bouton.name> wrote: > >> Hi, >> >> On 23/11/2015 18:37, Jose Tavares wrote: >> >

Re: [ceph-users] librbd - threads grow with each Image object

2015-11-23 Thread Jason Dillaman
Couldn't hurt to open a feature request for this on the tracker. -- Jason Dillaman - Original Message - > From: "Haomai Wang" > To: "Allen Liao" > Cc: ceph-users@lists.ceph.com > Sent: Saturday, November 21, 2015 11:57:11 AM > Subject: Re: [ceph-users] librbd - threads grow with each

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Lionel Bouton
On 23/11/2015 21:01, Jose Tavares wrote: > My new question regarding Ceph is if it isolates these bad sectors where > it found bad data when scrubbing? or will there always be a replica of > something over a known bad block..? > Ceph OSDs don't know about bad sectors, they deleg

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Jose Tavares
On Mon, Nov 23, 2015 at 6:40 PM, Lionel Bouton < lionel-subscript...@bouton.name> wrote: > On 23/11/2015 21:01, Jose Tavares wrote: >> My new question regarding Ceph is if it isolates these bad sectors where >> it found bad data when scrubbing? or will there always be a replica of >>

Re: [ceph-users] CEPH over SW-RAID

2015-11-23 Thread Lionel Bouton
On 23/11/2015 21:58, Jose Tavares wrote: > AFAIK, people are complaining about lots of bad blocks in the new big > disks. The hardware list seems to be small and unable to replace > these blocks. Note that if by big disks you mean SMR-based disks, they can exhibit what looks like bad blocks

[ceph-users] Re: Re: can not create rbd image

2015-11-23 Thread louis
Interesting thing is, I also cannot remove the image. The only thing I can do is remove the pool; after that, I can create images again. But you know, this behavior is not reasonable, we cannot be expected to lose all images, even when nearly full. My file system is ext3, on Ubuntu. One thing that should be noted

Re: [ceph-users] Cluster always scrubbing.

2015-11-23 Thread Mika c
Hi Sean, Yes, the cluster has been in scrubbing status (scrub + deep scrub) for almost two weeks. And the result of executing `ceph pg dump | grep scrub` is empty. But the command "ceph health" shows there are "16 pgs active+clean+scrubbing+deep, 2 pgs active+clean+scrubbing". I have 2 OSDs with slow
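
One way to tell whether those scrubs are stuck rather than merely continuous is to stop scheduling new ones and watch whether the counters drain; a sketch:

    ceph osd set noscrub
    ceph osd set nodeep-scrub
    ceph -s                       # watch whether the scrubbing PG count drops over time
    ceph osd unset noscrub        # re-enable afterwards
    ceph osd unset nodeep-scrub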

[ceph-users] Performance question

2015-11-23 Thread Marek Dohojda
I have a Hammer Ceph cluster on 7 nodes with a total of 14 OSDs, 7 of which are SSD and 7 of which are SAS 10K drives. I typically get about 100MB IO rates on this cluster. I have a simple question. Is 100MB within my configuration what I should expect, or should it be higher? I am not sure if I sho

Re: [ceph-users] Performance question

2015-11-23 Thread Haomai Wang
On Tue, Nov 24, 2015 at 10:23 AM, Marek Dohojda wrote: > I have a Hammer Ceph cluster on 7 nodes with total 14 OSDs. 7 of which are > SSD and 7 of which are SAS 10K drives. I get typically about 100MB IO rates > on this cluster. You mixed up sas and ssd in one pool? > > I have a simple questio

Re: [ceph-users] Performance question

2015-11-23 Thread Marek Dohojda
No, SSD and SAS are in two separate pools. On Mon, Nov 23, 2015 at 7:30 PM, Haomai Wang wrote: > On Tue, Nov 24, 2015 at 10:23 AM, Marek Dohojda > wrote: > > I have a Hammer Ceph cluster on 7 nodes with total 14 OSDs. 7 of which > are > > SSD and 7 of which are SAS 10K drives. I get typically

Re: [ceph-users] Performance question

2015-11-23 Thread Haomai Wang
On Tue, Nov 24, 2015 at 10:35 AM, Marek Dohojda wrote: > No SSD and SAS are in two separate pools. > > On Mon, Nov 23, 2015 at 7:30 PM, Haomai Wang wrote: >> >> On Tue, Nov 24, 2015 at 10:23 AM, Marek Dohojda >> wrote: >> > I have a Hammer Ceph cluster on 7 nodes with total 14 OSDs. 7 of which

Re: [ceph-users] Performance question

2015-11-23 Thread Marek Dohojda
Sorry, I should have specified that SAS is the 100 MB one :), but to be honest SSD isn't much faster. On Mon, Nov 23, 2015 at 7:38 PM, Haomai Wang wrote: > On Tue, Nov 24, 2015 at 10:35 AM, Marek Dohojda > wrote: > > No SSD and SAS are in two separate pools. > > > > On Mon, Nov 23, 2015 at 7:30 PM, Hao
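
To separate raw cluster throughput from the client/VM path, rados bench is the usual first measurement; a sketch with a placeholder pool name and a 30-second run:

    rados bench -p <pool> 30 write --no-cleanup
    rados bench -p <pool> 30 seq
    rados -p <pool> cleanup       # remove the benchmark objects afterwards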

[ceph-users] New added osd always down

2015-11-23 Thread hzwulibin
Hi cephers, My cluster has a big problem. ceph version: 0.80.10 1. OSDs are full, I can't delete volumes, and IO seems blocked. When I rm an image, here is the error message: sudo rbd rm ff3a6870-24cb-427a-979b-6b9b257032c3 -p vol_ssd 2015-11-24 14:14:26.418016 7f9b900a5780 -1 librbd::ImageCtx: error
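
A commonly used (and risky) escape hatch on 0.80.x when full OSDs block deletes is to raise the full ratio just long enough to remove data; a sketch only, and the ratio should go back to the default immediately afterwards:

    ceph health detail | grep -i full     # which OSDs are full / near full
    ceph pg set_full_ratio 0.98           # 0.80.x-era syntax; temporarily raise the full threshold
    rbd rm ff3a6870-24cb-427a-979b-6b9b257032c3 -p vol_ssd
    ceph pg set_full_ratio 0.95           # restore the default once space is freed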