[ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Hi, I'm encountering a data disaster. I have a ceph cluster with 145 OSDs. The data center had a power problem yesterday, and all of the ceph nodes went down. But now I find that 6 disks (xfs) in 4 nodes have data corruption. Some disks are unable to mount, and some disks have IO errors in syslog.
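For readers in the same situation, a cautious XFS triage sequence might look like the following (device paths are placeholders, and `xfs_repair -L` is a last resort that discards the journal and can lose recent writes):

```shell
# Dry-run check first: -n reports problems without modifying the disk
xfs_repair -n /dev/sdb1

# If only the log needs replaying, a mount/umount cycle often suffices
mount /dev/sdb1 /mnt/osd-probe && umount /mnt/osd-probe

# Last resort: zero the log and repair (recent writes may be lost)
xfs_repair -L /dev/sdb1
```

After a repair, it is safer to let Ceph backfill the affected PGs from surviving replicas than to trust possibly corrupted objects on the repaired disk.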

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:46 AM, Florent B wrote: > Hi, > > I would like to know which kernel version is needed to mount CephFS on a > Hammer cluster ? > > And if we use 3.16 kernel of Debian Jessie, can we hope using CephFS for > a few next release without problem ? I would advise to run the lat

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Alexandre DERUMIER
maybe this could help to repair pgs ? http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ (6 disk at the same time seem pretty strange. do you have some kind of writeback cache enable of theses disks ?) - Mail original - De: "Yujian Peng" À: "ceph-users" Envo

Re: [ceph-users] A pesky unfound object

2015-05-04 Thread Eino Tuominen
Hi everybody, Does anybody have any clue on this? I've run a deep scrub on the pg and the status is still showing one unfound object. This is a test cluster only, but I'd like to learn what has happened and why... Thanks, -- Eino Tuominen -Original Message- From: ceph-users [mailto:

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Alexandre DERUMIER writes: > > > maybe this could help to repair pgs ? > > http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ > > (6 disk at the same time seem pretty strange. do you have some kind of writeback cache enable of theses disks ?) The only writeback cache i

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Steffen W Sørensen
> On 04/05/2015, at 15.01, Yujian Peng wrote: > > Alexandre DERUMIER writes: > >> >> >> maybe this could help to repair pgs ? >> >> http://www.sebastien-han.fr/blog/2015/04/27/ceph-manually-repair-object/ >> >> (6 disk at the same time seem pretty strange. do you have some kind of >> writ

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Christopher Kunz
Am 04.05.15 um 09:00 schrieb Yujian Peng: > Hi, > I'm encountering a data disaster. I have a ceph cluster with 145 osd. The > data center had a power problem yesterday, and all of the ceph nodes were > down. > But now I find that 6 disks(xfs) in 4 nodes have data corruption. Some disks > are unabl

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Emmanuel Florac
Le Mon, 4 May 2015 07:00:32 + (UTC) Yujian Peng écrivait: > I'm encountering a data disaster. I have a ceph cluster with 145 osd. > The data center had a power problem yesterday, and all of the ceph > nodes were down. But now I find that 6 disks(xfs) in 4 nodes have > data corruption. Some di

[ceph-users] Rados Object gateway installation

2015-05-04 Thread MOSTAFA Ali (INTERN)
Hi Ceph users, I am new to ceph, I installed a small cluster (3 Monitors with 5 OSDs). Now I am trying to install an Object Gateway server. I followed the steps in the documentation but I am not able to launch the service using /etc/init.d/radosgw start; instead I am using sudo -E /usr/bin/radosg

[ceph-users] how to display client io in hammer

2015-05-04 Thread Chad William Seys
Hi all, Looks like in Hammer 'ceph -s' no longer displays client IO and ops. How does one display that these days? Thanks, C. ___ ceph-users mailing list ceph-users@lists.ceph.com http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
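For reference, these are the commands that normally show client traffic on a Hammer cluster; the rates only appear while there is actual I/O in flight:

```shell
# One-shot status; the "client io" line only shows up while traffic exists
ceph -s

# Watch status and io rates continuously
ceph -w

# Per-pool read/write throughput and op/s
ceph osd pool stats
```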

[ceph-users] The first infernalis dev release will be v9.0.0

2015-05-04 Thread Sage Weil
The first Ceph release back in Jan of 2008 was 0.1. That made sense at the time. We haven't revised the versioning scheme since then, however, and are now at 0.94.1 (first Hammer point release). To avoid reaching 0.99 (and 0.100 or 1.00?) we have a new strategy. This was discussed a bit on

Re: [ceph-users] how to display client io in hammer

2015-05-04 Thread Chad William Seys
Ooops! Turns out I forgot to mount the ceph rbd, so no client IO displayed! C.

Re: [ceph-users] Kicking 'Remapped' PGs

2015-05-04 Thread Gregory Farnum
On Sun, May 3, 2015 at 5:18 AM, Paul Evans wrote: > Thanks, Greg. Following your lead, we discovered the proper > 'set_choose_tries xxx’ value had not been applied to *this* pool’s rule, and > we updated the cluster accordingly. We then moved a random OSD out and back > in to ‘kick’ things, but n
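For anyone following along, the usual way to apply `set_choose_tries` to a single pool's rule is to round-trip the CRUSH map through `crushtool` (the value 100 below is only an example):

```shell
# Extract and decompile the current CRUSH map
ceph osd getcrushmap -o crushmap.bin
crushtool -d crushmap.bin -o crushmap.txt

# Edit crushmap.txt: inside the pool's rule, before the choose steps, add:
#   step set_choose_tries 100

# Recompile and inject the modified map
crushtool -c crushmap.txt -o crushmap.new
ceph osd setcrushmap -i crushmap.new
```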

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Tuomas Juntunen
Hi, below is the mds dump: dumped mdsmap epoch 1799 epoch 1799 flags 0 created 2014-12-10 12:44:34.188118 modified 2015-05-04 07:16:37.205350 tableserver 0 root 0 session_timeout 60 session_autoclose 300 max_file_size 1099511627776 last_failure 1794 last_failure_osd_epoc

Re: [ceph-users] The first infernalis dev release will be v9.0.0

2015-05-04 Thread Loic Dachary
+1 ;-) On 04/05/2015 18:09, Sage Weil wrote: > The first Ceph release back in Jan of 2008 was 0.1. That made sense at > the time. We haven't revised the versioning scheme since then, however, > and are now at 0.94.1 (first Hammer point release). To avoid reaching > 0.99 (and 0.100 or 1.00?)

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Sage Weil
On Mon, 4 May 2015, Tuomas Juntunen wrote: > 5827504:10.20.0.11:6800/3382530 'ceph1' mds.0.262 up:rejoin seq 33159 This is why it is 'degraded'... stuck in up:rejoin state. > The active+clean+replay has been there for a day now, so there must be > something that is not ok, if it should've

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread Tuomas Juntunen
Hi, Ok, restarting the osd's did it. I thought I restarted the daemons after it was almost clean, but it seems I didn't. Now everything is running fine. Thanks again! Br, Tuomas -Original Message- From: Sage Weil [mailto:s...@newdream.net] Sent: 4 May 2015 20:21 To: Tuomas Juntun

[ceph-users] I have trouble using the teuthology ceph test tool

2015-05-04 Thread 박근영
Hi, Cepher! I need help using teuthology, the Ceph integration test framework. There are three nodes like below: 1. paddles, pulpito server / OS: Ubuntu 14.04, IP: 11.0.0.100, user: teuth 2. target server / OS: Ubuntu 14.04, IP: 11.0.0.10, user: ubuntu 3. teuthology server / OS: Ubuntu 14

Re: [ceph-users] ERROR: missing keyring, cannot use cephx for authentication

2015-05-04 Thread Jesus Chavez (jeschave)
You have saved my day! Thank you so much :) Now it seems to be working, not sure why it had that behavior... Thank you so much! Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Apr 14, 2015, at 12:3

Re: [ceph-users] Firefly - Giant : CentOS 7 : install failed ceph-deploy

2015-05-04 Thread Michael Kidd
For Firefly / Giant installs, I've had success with the following: yum install ceph ceph-common --disablerepo=base --disablerepo=epel Let us know if this works for you as well. Thanks, Michael J. Kidd Sr. Storage Consultant Inktank Professional Services - by Red Hat On Wed, Apr 8, 2015 at 8:5

[ceph-users] Ceph migration to AWS

2015-05-04 Thread Mike Travis
To those interested in a tricky problem, We have a Ceph cluster running at one of our data centers. One of our client's requirements is to have them hosted at AWS. My question is: How do we effectively migrate our data on our internal Ceph cluster to an AWS Ceph cluster? Ideas currently on the ta

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Thanks Mark. I switched to a completely different machine and started from scratch; things were much smoother this time. Cluster was up in 30 mins. I guess purgedata, droplets and purge is not enough to bring the machine back clean? That is what I was trying on the old machine, to reset it. Thanks JV O

Re: [ceph-users] strange benchmark problem : restarting osd daemon improve performance from 100k iops to 300k iops

2015-05-04 Thread Venkateswara Rao Jujjuri
If I want to use the librados API for performance testing, are there any existing benchmark tools which directly access librados (not through rbd or the gateway)? Thanks in advance, JV On Sun, Apr 26, 2015 at 10:46 PM, Alexandre DERUMIER wrote: >>>I'll retest tcmalloc, because I was prety sure to have
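One stock answer (not from this thread, so treat it as a suggestion): `rados bench` ships with Ceph and talks to librados directly, bypassing rbd and the gateway. The pool name and durations below are examples:

```shell
# 30-second write benchmark, 16 concurrent ops; keep objects for read tests
rados bench -p testpool 30 write --no-cleanup

# Sequential and random read phases reuse the objects written above
rados bench -p testpool 30 seq
rados bench -p testpool 30 rand

# Remove the benchmark objects afterwards
rados -p testpool cleanup
```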

[ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Sanjoy Dasgupta
Hi! This is an often discussed and clarified topic, but the reason I am asking is: if we use a RAID controller with a lot of cache (FBWC) and configure each drive as a single-drive RAID0, then writes to the disks will benefit from the FBWC and I/O performance will be accelerated. Is this a correct assumption

Re: [ceph-users] ERROR: missing keyring, cannot use cephx for authentication

2015-05-04 Thread Jesus Chavez (jeschave)
How did you get the UUIDs without mounting the osds? Thanks Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Apr 10, 2015, at 11:47 PM, "oyym...@gmail.com" mailto:oyym

[ceph-users] How to add a slave to rgw

2015-05-04 Thread 周炳华
Hi, geeks: I have a ceph cluster for rgw service in production, which was set up according to the simple configuration tutorial, with only one default region and one default zone. Even worse, I didn't enable the meta logging or the data logging. Now I want to add a slave zone to the rgw fo

[ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-05-04 Thread tuomas . juntunen
I upgraded Ceph from 0.87 Giant to 0.94.1 Hammer Then created new pools and deleted some old ones. Also I created one pool for tier to be able to move data without outage. After these operations all but 10 OSD's are down and creating this kind of messages to logs, I get more than 100gb of these i

[ceph-users] OSDs remain down

2015-05-04 Thread Jesus Chavez (jeschave)
Hi all, I still have a lot of problems when building power fails… There is only one simple node where OSDs remain down after reboot, and there is something weird: 1 of 12 OSDs comes up after reboot, but just one… Here is an example: Filesystem Size Used Avail Use% Mounted on /dev/mapp

[ceph-users] How to Stop/start a specific OSD

2015-05-04 Thread MOSTAFA Ali (INTERN)
Hello all, I am a new user of Ceph. I am trying to stop an OSD; I can stop it (and all other OSDs on the node) using the command: sudo stop ceph-osd-all. But the command: sudo stop ceph-osd id=0 returns an error. user@node2:~$ ceph osd tree # id  weight  type name  up/down  reweight -1
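On Ubuntu with upstart, the per-daemon job usually needs the id (and sometimes the cluster name) spelled out; on sysvinit nodes the daemon is addressed as osd.N instead. A sketch, assuming the default cluster name "ceph":

```shell
# Upstart (Ubuntu): stop/start a single OSD daemon
sudo stop ceph-osd cluster=ceph id=0
sudo start ceph-osd cluster=ceph id=0

# sysvinit (e.g. RHEL/CentOS, or Debian without upstart jobs)
sudo /etc/init.d/ceph stop osd.0
sudo /etc/init.d/ceph start osd.0
```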

[ceph-users] OSD failing to start [fclose error: (61) No data available]

2015-05-04 Thread Sourabh saryal
Hi, Trying to start an OSD, which is failing to restart. /etc/init.d/ceph start osd.140 === osd.140 === create-or-move updated item name 'osd.140' weight 3.63 at location {host=XXX,root=default} to crush map Starting Ceph osd.140 on ... starting osd.140 at :/0 osd_data /var/lib/ceph/osd/ceph-140

[ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Started to install basic cluster from scratch, but running into keyring issues. Basically /etc/ceph/ceph.client.admin.keyring is not getting generated on monitor node. When I tried to create it on the monitor node, it fails with: $ ceph auth get-or-create client.admin mon 'allow *' mds 'allow *' o
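For context, `ceph.client.admin.keyring` is normally generated by ceph-deploy once the initial monitors reach quorum, rather than by running `ceph auth get-or-create` by hand. A typical sequence (hostnames are examples):

```shell
ceph-deploy new mon1 mon2 mon3     # write ceph.conf with the initial monitors
ceph-deploy mon create-initial     # deploy mons, wait for quorum, gather keys
ceph-deploy gatherkeys mon1        # re-fetch the keyrings if needed
ceph-deploy admin node1 node2      # push ceph.client.admin.keyring to nodes
```

If `mon create-initial` cannot form quorum, the keyrings are never created, which matches the symptom described here.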

[ceph-users] Rack awareness with different hardware layouts

2015-05-04 Thread Rogier Dikkes
Hello all, at this moment we have a scenario on which I would like your opinion. Scenario: currently we have a ceph environment with 1 rack of hardware; this rack contains a couple of OSD nodes with 4T disks. In a few weeks time we will deploy 2 more racks with OSD nodes, these nodes have 6T
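To make CRUSH rack-aware for a layout like this, the rack buckets and host placement can be declared at runtime (names below are examples), after which a rule using `step chooseleaf firstn 0 type rack` spreads replicas across racks:

```shell
# Create rack buckets and hang them under the default root
ceph osd crush add-bucket rack1 rack
ceph osd crush add-bucket rack2 rack
ceph osd crush move rack1 root=default
ceph osd crush move rack2 root=default

# Move existing hosts under their racks (host names are examples)
ceph osd crush move node1 rack=rack1
ceph osd crush move node2 rack=rack2
```

Note that with mixed 4T and 6T disks the per-rack weights will differ, so capacity may not be used evenly unless the weights are adjusted.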

Re: [ceph-users] Ceph migration to AWS

2015-05-04 Thread Kyle Bader
> To those interested in a tricky problem, > > We have a Ceph cluster running at one of our data centers. One of our > client's requirements is to have them hosted at AWS. My question is: How do > we effectively migrate our data on our internal Ceph cluster to an AWS Ceph > cluster? > > Ideas curre

[ceph-users] NVMe Journal and Mixing IO

2015-05-04 Thread Atze de Vries
Hi, We are designing a new Ceph cluster. Some of the cluster will be used to run vms and most of it will be used for file storage and object storage. We want to separate the workload for vms (high IO / small block) from the bulk storage (big block, lots of latency) since mixing IO seems to be a bad i

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Venkateswara Rao Jujjuri
Here is the output. I am still stuck at this step. :( (I have tried multiple times by purging and restarting from scratch) vjujjuri@rgulistan-wsl10:~/ceph-cluster$ ceph-deploy mon create-initial [ceph_deploy.conf][DEBUG ] found configuration file at: /home/vjujjuri/.cephdeploy.conf [ceph_deploy.cli][I

[ceph-users] about rgw region and zone

2015-05-04 Thread TERRY
Hi all: when configuring Federated Gateways, I got the error below: sudo radosgw-agent -c /etc/ceph/ceph-data-sync.conf ERROR:root:Could not retrieve region map from destination Traceback (most recent call last): File "/usr/lib/python2.6/site-packages/radosgw_agent/cli.py", lin

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Chad William Seys
Hi Florent, Most likely Debian will release "backported" kernels for Jessie, as they have for Wheezy. E.g. Wheezy has had kernel 3.16 backported to it: https://packages.debian.org/search?suite=wheezy-backports&searchon=names&keywords=linux-image-amd64 C.
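Once a backported kernel is published, installing it follows the usual backports pattern (the jessie-backports suite name below is the expected one, not yet confirmed at the time of this thread):

```shell
# Enable jessie-backports and install the newest backported kernel
echo 'deb http://httpredir.debian.org/debian jessie-backports main' | \
  sudo tee /etc/apt/sources.list.d/backports.list
sudo apt-get update
sudo apt-get -t jessie-backports install linux-image-amd64
```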

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Vasu Kulkarni
What are your initial monitor nodes? i,e what nodes did you specify in the first step: ceph-deploy new {initial-monitor-node(s)} Did you specify rgulistan-wsl11 as your monitor node in that step? - Original Message - From: "Venkateswara Rao Jujjuri" To: "ceph-devel" , "ceph-users" Sen

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 9:40 PM, Chad William Seys wrote: > Hi Florent, > Most likely Debian will release "backported" kernels for Jessie, as they > have for Wheezy. > E.g. Wheezy has had kernel 3.16 backported to it: > > https://packages.debian.org/search?suite=wheezy-backports&searchon=names&

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread cwseys
Hi Ilya, "Any new features, development work and most of the enhancements are not backported. Only a selected bunch of bug fixes is." Not sure what you are trying to say. Wheezy was released with kernel 3.2 and bugfixes are applied to 3.2 by Debian throughout Wheezy's support cycle. But by

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread Ilya Dryomov
On Mon, May 4, 2015 at 11:25 PM, cwseys wrote: > HI Illya, > >> Any new features, development work and most of the enhancements are not >> backported. Only a selected bunch of bug fixes is. > > > Not sure what you are trying to say. > > Wheezy was released with kernel 3.2 and bugfixes are applied

Re: [ceph-users] Kernel version for CephFS client ?

2015-05-04 Thread ceph
Linux 4.0 lives in Debian: [jack:~] apt-cache policy linux-image-4.0.0-trunk-amd64 linux-image-4.0.0-trunk-amd64: Installed: (none) Candidate: 4.0-1~exp1 Version table: 4.0-1~exp1 0 1 http://ftp.fr.debian.org/debian/ experimental/main amd64 Packages On 04/05/2015 22:37

Re: [ceph-users] Help with CEPH deployment

2015-05-04 Thread Mark Kirkwood
On 05/05/15 04:16, Venkateswara Rao Jujjuri wrote: Thanks Mark. I switched to completely different machine and started from scratch, things were much smoother this time. Cluster was up in 30 mins. I guess purgedata , droplets and and purge is Not enough to bring the machine back clean? What I was

Re: [ceph-users] Shadow Files

2015-05-04 Thread Yehuda Sadeh-Weinraub
I've been working on a new tool that detects leaked rados objects. It will take some time for it to be merged into an official release, or even into the master branch, but if anyone would like to play with it, it is in the wip-rgw-orphans branch. At the moment I recommend not removing any obj

Re: [ceph-users] Ceph migration to AWS

2015-05-04 Thread Christian Balzer
On Mon, 4 May 2015 11:21:12 -0700 Kyle Bader wrote: > > To those interested in a tricky problem, > > > > We have a Ceph cluster running at one of our data centers. One of our > > client's requirements is to have them hosted at AWS. My question is: > > How do we effectively migrate our data on our

Re: [ceph-users] Btrfs defragmentation

2015-05-04 Thread Lionel Bouton
On 05/04/15 01:34, Sage Weil wrote: > On Mon, 4 May 2015, Lionel Bouton wrote: >> Hi, >> >> we began testing one Btrfs OSD volume last week and for this first test >> we disabled autodefrag and began to launch manual btrfs fi defrag. >> [...] > Cool.. let us know how things look after it ages! We

Re: [ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Christian Balzer
On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote: > Hi! > > This is an often discussed and clarified topic, but Reason why I am > asking is because > > If We use a RAID controller with Lot of Cache (FBWC) and Configure each > Drive as Single Drive RAID0, then Write to disks will benefit

Re: [ceph-users] NVMe Journal and Mixing IO

2015-05-04 Thread Christian Balzer
Hello, On Wed, 15 Apr 2015 10:26:37 +0200 Atze de Vries wrote: > Hi, > > We are designing a new Ceph cluster. Some of the cluster wil be used to > run vms and most of it wil be used for file storage and object storage. > We want to separate the workload for vms (high IO / small block) from the

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Yujian Peng
Emmanuel Florac writes: > > Le Mon, 4 May 2015 07:00:32 + (UTC) > Yujian Peng 126.com> écrivait: > > > I'm encountering a data disaster. I have a ceph cluster with 145 osd. > > The data center had a power problem yesterday, and all of the ceph > > nodes were down. But now I find that 6 dis

Re: [ceph-users] Using RAID Controller for OSD and JNL disks in Ceph Nodes

2015-05-04 Thread Jake Young
On Monday, May 4, 2015, Christian Balzer wrote: > On Mon, 13 Apr 2015 10:39:57 +0530 Sanjoy Dasgupta wrote: > > > Hi! > > > > This is an often discussed and clarified topic, but Reason why I am > > asking is because > > > > If We use a RAID controller with Lot of Cache (FBWC) and Configure each >

Re: [ceph-users] Btrfs defragmentation

2015-05-04 Thread Timofey Titovets
Hi list, excuse me, what I'm saying is off topic. @Lionel, if you use btrfs, did you already try btrfs compression for the OSD? If yes, can you share your experience? 2015-05-05 3:24 GMT+03:00 Lionel Bouton : > On 05/04/15 01:34, Sage Weil wrote: >> On Mon, 4 May 2015, Lionel Bouton wrote:

Re: [ceph-users] Motherboard recommendation?

2015-05-04 Thread Mohamed Pakkeer
Hi Mark, thanks for your reply and your CPU test report. It really helps us to identify appropriate hardware for an EC based Ceph cluster. Currently we are using the Intel Xeon 2630 V3 (16 cores * 2.4 GHz = 38.4 GHz) processor. I think you have tested with the Intel Xeon 2630L V2 (12 * 2.4 GHz = 28.8 GHz)

Re: [ceph-users] xfs corruption, data disaster!

2015-05-04 Thread Nick Fisk
This is probably similar to what you want to try and do, but also mark those failed OSDs as lost, as I don't think you will have much luck getting them back up and running. http://ceph.com/community/incomplete-pgs-oh-my/#more-6845 The only other option would be if anyone knows a way to rebuild
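The command sequence for writing off a dead OSD, as hinted above, is roughly the following (OSD id 140 is an example; `ceph osd lost` tells recovery to proceed without that OSD's data, so use it only when the disk is truly unrecoverable):

```shell
# Declare the OSD's data permanently gone so PG recovery can proceed
ceph osd lost 140 --yes-i-really-mean-it

# Then remove the OSD from the CRUSH map, auth database and OSD map
ceph osd crush remove osd.140
ceph auth del osd.140
ceph osd rm 140
```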