Re: [ceph-users] old osds take much longer to start than newer osd

2015-03-02 Thread Stephan Hohn
Try and check the xfs fragmentation factor on your „old“ osds. $ xfs_db -c frag -r /dev/sdX and see if it’s incredibly high. > On 27 Feb 2015, at 14:02, Corin Langosch wrote: > > Hi guys, > > I'm using ceph for a long time now, since bobtail. I always upgraded every > few weeks/months to th
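
For reference, a minimal sketch of checking and then defragmenting an OSD's XFS filesystem with the standard xfsprogs tools (the OSD mount point and device names below are illustrative; xfs_fsr works on mounted filesystems):

    # read-only fragmentation report, as suggested above
    xfs_db -c frag -r /dev/sdX
    # defragment the mounted OSD filesystem; -t caps the runtime in seconds
    xfs_fsr -v -t 3600 /var/lib/ceph/osd/ceph-N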

Re: [ceph-users] question about rgw create bucket

2015-03-02 Thread ghislain.chevalier
Hi all, I think this question can maybe be linked to the mail I sent (Feb 25) related to "unconsistency between bucket and bucket.instance". Best regards From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On behalf of baijia...@126.com Sent: Monday, 2 March 2015 08:00 To: ceph-users; Cep

Re: [ceph-users] old osds take much longer to start than newer osd

2015-03-02 Thread Corin Langosch
It's a little worse, but not much: root@r-ch106:~# xfs_db -c frag -r /dev/sda1 actual 397955, ideal 324744, fragmentation factor 18.40% root@r-ch106:~# xfs_db -c frag -r /dev/sdb2 actual 378729, ideal 324349, fragmentation factor 14.36% root@r-ch105:~# xfs_db -c frag -r /dev/sdb2 actual 382831, i

[ceph-users] Permanente Mount RBD blocs device RHEL7

2015-03-02 Thread Jesus Chavez (jeschave)
Hi all! I have been trying to make the filesystem on my mapped rbd device permanent on RHEL7 by modifying /etc/fstab, but every time I reboot the server I lose the mapping to the pool, so the server gets stuck since it can't find the /dev/rbd0 device. Does anybody know if there is any procedure to not lose

Re: [ceph-users] Permanente Mount RBD blocs device RHEL7

2015-03-02 Thread Alexandre DERUMIER
Hi, maybe this can help you: http://www.sebastien-han.fr/blog/2013/11/22/map-slash-unmap-rbd-device-on-boot-slash-shutdown/ Regards, Alexandre - Original message - From: "Jesus Chavez (jeschave)" To: "ceph-users" Sent: Monday, 2 March 2015 11:14:49 Subject: [ceph-users] Permanente Mount RBD
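
The approach in that post boils down to mapping the image before anything tries to mount it. A minimal sketch, assuming the rbdmap helper shipped with ceph-common and an image named rbd/myimage (names and paths are illustrative, not from the original thread):

    # /etc/ceph/rbdmap -- images to map at boot
    rbd/myimage id=admin,keyring=/etc/ceph/ceph.client.admin.keyring

    # /etc/fstab -- noauto,_netdev keeps boot from hanging if the map is missing
    /dev/rbd/rbd/myimage /mnt/rbd xfs defaults,noauto,_netdev 0 0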

Re: [ceph-users] Permanente Mount RBD blocs device RHEL7

2015-03-02 Thread Jesus Chavez (jeschave)
Thank you so much Alexandre! :) Jesus Chavez SYSTEMS ENGINEER-C.SALES jesch...@cisco.com Phone: +52 55 5267 3146 Mobile: +51 1 5538883255 CCIE - 44433 On Mar 2, 2015, at 4:26 AM, Alexandre DERUMIER <aderum...@odiso.com> wrote: Hi, maybe this can help you:

Re: [ceph-users] SSD selection

2015-03-02 Thread Tony Harris
On Sun, Mar 1, 2015 at 11:19 PM, Christian Balzer wrote: > > > > > > I'll be honest, the pricing on Intel's website is far from reality. I > > haven't been able to find any OEMs, and retail pricing on the 200GB 3610 > > is ~231 (the $300 must have been a different model in the line). > > Althoug

[ceph-users] qemu-kvm and cloned rbd image

2015-03-02 Thread koukou73gr
Hello, Today I thought I'd experiment with snapshots and cloning. So I did: rbd import --image-format=2 vm-proto.raw rbd/vm-proto rbd snap create rbd/vm-proto@s1 rbd snap protect rbd/vm-proto@s1 rbd clone rbd/vm-proto@s1 rbd/server And then proceeded to create a qemu-kvm guest with rbd/server
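
For what it's worth, a hedged example of booting a guest straight from the cloned image over librbd (the pool/image names come from the commands above; the memory size and auth id are assumptions):

    qemu-system-x86_64 -enable-kvm -m 2048 \
      -drive format=raw,file=rbd:rbd/server:id=admin,if=virtio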

Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results

2015-03-02 Thread Mark Nelson
Hi Alex, I see I even responded in the same thread! This would be a good thing to bring up in the meeting on Wednesday. Those are far faster single OSD results than I've been able to muster with simplemessenger. I wonder how much effect flow-control and header/data crc had. He did have qu

[ceph-users] [URGENT-HELP] - Ceph rebalancing again after taking OSD out of CRUSH map

2015-03-02 Thread Andrija Panic
Hi people, I had one OSD crash, so the rebalancing happened - all fine (some 3% of the data has been moved around, and rebalanced) and my previous recovery/backfill throttling was applied fine and we didn't have an unusable cluster. Now I used the procedure to remove this crashed OSD completely from

Re: [ceph-users] [URGENT-HELP] - Ceph rebalancing again after taking OSD out of CRUSH map

2015-03-02 Thread Wido den Hollander
On 03/02/2015 03:56 PM, Andrija Panic wrote: > Hi people, > > I had one OSD crash, so the rebalancing happened - all fine (some 3% of the > data has been moved around, and rebalanced) and my previous > recovery/backfill throttling was applied fine and we didn't have an unusable > cluster. > > Now I

[ceph-users] ceph breizh meetup

2015-03-02 Thread eric mourgaya
Hi cephers, The next ceph breizhcamp will be held on 12 March 2015 in Nantes, more precisely at Suravenir Assurance, 2 rue Vasco de Gama, Saint-Herblain, France. It will begin at 10:00 AM. Join us and fill in the http://doodle.com/hvb99f2am7qucd5q -- Eric Mourgaya, Respectons la planete! Lut

[ceph-users] XFS recovery on boot : rogue mounts ?

2015-03-02 Thread SCHAER Frederic
Hi, I rebooted a failed server, which is now showing a rogue filesystem mount. Actually, there were also several disks missing in the node, all reported as "prepared" by ceph-disk, but not activated. [root@ceph2 ~]# grep /var/lib/ceph/tmp /etc/mtab /dev/sdo1 /var/lib/ceph/tmp/mnt.usVRe8 xfs rw,n
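
If it helps, disks left in the "prepared" state can usually be inspected and activated by hand with ceph-disk; a minimal sketch (the device name is taken from the mtab line above):

    # show how ceph-disk classifies each device/partition
    ceph-disk list
    # activate a prepared data partition manually
    ceph-disk activate /dev/sdo1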

Re: [ceph-users] Ceph Hammer OSD Shard Tuning Test Results

2015-03-02 Thread Alexandre DERUMIER
>> This would be a good thing to bring up in the meeting on Wednesday. yes ! >>I wonder how much effect flow-control and header/data crc had. yes. I know that Somnath also disabled crc for his bench >>Were the simplemessenger tests on IPoIB or native? I think it's native, as the Vu Pham bench

[ceph-users] Some long running ops may lock osd

2015-03-02 Thread Erdem Agaoglu
Hi all, especially devs, We have recently pinpointed one of the causes of slow requests in our cluster. It seems deep-scrubs on pg's that contain the index file for a large radosgw bucket lock the osds. Increasing op threads and/or disk threads helps a little bit, but we need to increase them beyon
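
One stopgap while the underlying issue is addressed is to throttle scrubbing itself; a hedged sketch using injectargs (whether these options exist depends on your release, so treat them as assumptions to verify):

    # sleep between scrub chunks and shrink the chunk size
    ceph tell osd.* injectargs '--osd_scrub_sleep 0.1'
    ceph tell osd.* injectargs '--osd_scrub_chunk_max 5'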

Re: [ceph-users] [URGENT-HELP] - Ceph rebalancing again after taking OSD out of CRUSH map

2015-03-02 Thread Andrija Panic
OK thx Wido. Then can we at least update the documentation so it says that MAJOR data rebalancing will happen AGAIN, and not 3%, but 37% in my case. Because, I would never run this during work hours, while clients are hammering VMs... This reminds me of those tunable changes a couple of months ago,
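
A commonly suggested way to avoid the second rebalance is to drain the OSD's CRUSH weight before removing it, so that all data movement happens in one pass; a hedged sketch (the osd id is illustrative):

    # move data off the OSD while it is still in the CRUSH map
    ceph osd crush reweight osd.12 0
    # after backfill completes, the removal itself should shift little or no data
    ceph osd out 12
    ceph osd crush remove osd.12
    ceph auth del osd.12
    ceph osd rm 12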

Re: [ceph-users] Shutting down a cluster fully and powering it back up

2015-03-02 Thread Daniel Schneller
On 2015-02-28 20:46:15 +, Gregory Farnum said: Sounds good! -Greg On Sat, Feb 28, 2015 at 10:55 AM David wrote: Hi! We did that a few weeks ago and it mostly worked fine. However, on startup of one of the 4 machines, it got stuck while starting OSDs (at least that's what the console ou

[ceph-users] RadosGW Log Rotation (firefly)

2015-03-02 Thread Daniel Schneller
On our Ubuntu 14.04/Firefly 0.80.8 cluster we are seeing a problem with log file rotation for the rados gateway. The /etc/logrotate.d/radosgw script gets called, but it does not work correctly. It spits out this message, coming from the postrotate portion: /etc/cron.daily/logrotate: reload:

Re: [ceph-users] old osds take much longer to start than newer osd

2015-03-02 Thread Gregory Farnum
This is probably LevelDB being slow. The monitor has some options to "compact" the store on startup and I thought the osd handled it automatically, but you could try looking for something like that and see if it helps. -Greg On Fri, Feb 27, 2015 at 5:02 AM Corin Langosch wrote: > Hi guys, > > I'm

Re: [ceph-users] What does the parameter journal_align_min_size mean?

2015-03-02 Thread Gregory Farnum
On Fri, Feb 27, 2015 at 5:03 AM, Mark Wu wrote: > > I am wondering how the value of journal_align_min_size impacts > journal padding. Is there any document describing the disk layout of the > journal? Not much, unfortunately. Just looking at the code, the journal will align any writes which a

Re: [ceph-users] More than 50% osds down, CPUs still busy; will the cluster recover without help?

2015-03-02 Thread Gregory Farnum
You can turn the filestore up to 20 instead of 1. ;) You might also explore what information you can get out of the admin socket. You are correct that those numbers are the OSD epochs, although note that when the system is running you'll get output both for the OSD as a whole and for individual PG
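
For the admin socket suggestion, a couple of queries that tend to be useful here (the osd id and the default socket path are illustrative):

    # ask the daemon directly; 'status' includes the OSD's oldest/newest map epochs
    ceph daemon osd.3 status
    ceph daemon osd.3 dump_historic_ops
    # long form if the 'ceph daemon' shorthand is not available
    ceph --admin-daemon /var/run/ceph/ceph-osd.3.asok status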

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread Gregory Farnum
On Mon, Mar 2, 2015 at 7:56 AM, Erdem Agaoglu wrote: > Hi all, especially devs, > > We have recently pinpointed one of the causes of slow requests in our > cluster. It seems deep-scrubs on pg's that contain the index file for a > large radosgw bucket lock the osds. Increasing op threads and/or disk

[ceph-users] Fresh install of GIANT failing?

2015-03-02 Thread Don Doerner
All, Using ceph-deploy, I see a failure to install ceph on a node. At the beginning of the ceph-deploy output, it says it is installing "stable version giant". The last few lines are... [192.168.167.192][DEBUG ] --> Finished Dependency Resolution [192.168.167.192][WARNIN] Error: Package: 1:pytho

Re: [ceph-users] RadosGW Log Rotation (firefly)

2015-03-02 Thread Gregory Farnum
On Mon, Mar 2, 2015 at 8:44 AM, Daniel Schneller wrote: > On our Ubuntu 14.04/Firefly 0.80.8 cluster we are seeing > problem with log file rotation for the rados gateway. > > The /etc/logrotate.d/radosgw script gets called, but > it does not work correctly. It spits out this message, > coming from

Re: [ceph-users] Fresh install of GIANT failing?

2015-03-02 Thread Don Doerner
Oops, typo... should say "Using ceph-deploy, I see a failure to install ceph on a RHEL7 node"... -don- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Don Doerner Sent: 02 March, 2015 10:17 To: ceph-users@lists.ceph.com Subject: [ceph-users] Fresh install of GIANT fail

Re: [ceph-users] RadosGW Log Rotation (firefly)

2015-03-02 Thread Daniel Schneller
On 2015-03-02 18:17:00 +, Gregory Farnum said: I'm not very (well, at all, for rgw) familiar with these scripts, but how are you starting up your RGW daemon? There's some way to have Apache handle the process instead of Upstart, but Yehuda says "you don't want to do it". -Greg Well, we in

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread Erdem Agaoglu
Hi Gregory, We are not using listomapkeys that way or in any way, to be precise. I used it here just to reproduce the behavior/issue. What I am really interested in is whether deep-scrubbing actually mitigates the problem and/or if there is something that can be further improved. Or I guess we should go

[ceph-users] ceph binary missing from ceph-0.87.1-0.el6.x86_64

2015-03-02 Thread Michael Kuriger
Hi all, When doing a fresh install on a new cluster, and using the latest rpm (0.87.1) ceph-deploy fails right away. I checked the files inside the rpm, and /usr/bin/ceph is not there. Upgrading from the previous rpm seems to work, but ceph-deploy is pulling the latest rpm automatically. [c

Re: [ceph-users] Fresh install of GIANT failing?

2015-03-02 Thread Don Doerner
Problem solved, I've been pointed at a repository problem and an existing Ceph issue (http://tracker.ceph.com/issues/10476) by a couple of helpful folks. Thanks, -don- From: Don Doerner Sent: 02 March, 2015 10:20 To: Don Doerner; ceph-users@lists.ceph.com Subject: RE: Fresh install of GIANT failin

[ceph-users] Calamari Reconfiguration

2015-03-02 Thread Garg, Pankaj
Hi, I had a cluster that was working correctly with Calamari and I was able to see and manage from the Dashboard. I had to reinstall the cluster and change IP Addresses etc. so I built my cluster back up, with same name, but mainly network changes. When I went to calamari, it shows some stale inf

[ceph-users] Tues/Wed CDS Schedule Posted

2015-03-02 Thread Patrick McGarry
Hey cephers, The basic schedule has been posted for CDS tomorrow and Wednesday. If you are a blueprint owner and are unable to make the slot you have been assigned please let me know. We're working with some pretty tight time constraints to make this all work, but there is a little wiggle room if

Re: [ceph-users] RadosGW Log Rotation (firefly)

2015-03-02 Thread Georgios Dimitrakakis
Daniel, on CentOS the logrotate script was not working correctly because the service was referenced everywhere as "radosgw": e.g. service radosgw reload >/dev/null or initctl reload radosgw cluster="$cluster" id="$id" 2>/dev/null || : but there isn't any radosgw service! I had to change it to "ceph"
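
For comparison, the postrotate stanza that results from the change described above would look roughly like this (a sketch, not the exact script shipped in the package):

    /var/log/ceph/radosgw.log {
        daily
        rotate 7
        compress
        sharedscripts
        postrotate
            service ceph reload >/dev/null 2>&1 || :
        endscript
    }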

[ceph-users] New SSD Question

2015-03-02 Thread Tony Harris
Hi all, After the previous thread, I'm doing my SSD shopping and I came across an SSD called an Edge Boost Pro w/ Power Fail, it seems to have some impressive specs - in most places decent user reviews, in one place a poor one - I was wondering if anyone has had any experience with these dri

Re: [ceph-users] ceph binary missing from ceph-0.87.1-0.el6.x86_64

2015-03-02 Thread Gregory Farnum
The ceph tool got moved into ceph-common at some point, so it shouldn't be in the ceph rpm. I'm not sure what step in the installation process should have handled that, but I imagine it's your problem. -Greg On Mon, Mar 2, 2015 at 11:24 AM, Michael Kuriger wrote: > Hi all, > When doing a fresh in
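
A quick, hedged way to confirm that on an affected node (the yum/rpm invocations below are the obvious ones for el6):

    # the cli now lives in ceph-common rather than the ceph rpm
    rpm -ql ceph-common | grep -w /usr/bin/ceph
    # installing ceph-common explicitly should bring the binary back
    yum install -y ceph-common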

[ceph-users] RadosGW do not populate "log file"

2015-03-02 Thread Italo Santos
Hello everyone, I have a radosgw configured with the ceph.conf file below, but this instance isn't generating any log entries at the "log file" path; the log is always empty. However, if I take a look at the apache access.log there are a lot of entries. Does anyone know why? Regards. Italo Santos http://ita

[ceph-users] Inter-zone replication and High Availability

2015-03-02 Thread Brian Button
Hi, I'm trying to understand object storage replication between two ceph clusters in two zones in a single region. Setting up replication itself isn't the issue, it's how to ensure high availability and data safety between the clusters when failing over. The simplest case is flipping the pri

[ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
We have a file system running CephFS and for a while we had this issue when doing an ls -la we get question marks in the response. -rw-r--r-- 1 wwwrun root14761 Feb 9 16:06 data.2015-02-08_00-00-00.csv.bz2 -? ? ? ? ?? data.2015-02-09_00-00-00.csv.bz2 If we

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Gregory Farnum
On Mon, Mar 2, 2015 at 3:39 PM, Scottix wrote: > We have a file system running CephFS and for a while we had this issue when > doing an ls -la we get question marks in the response. > > -rw-r--r-- 1 wwwrun root14761 Feb 9 16:06 > data.2015-02-08_00-00-00.csv.bz2 > -? ? ? ?

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Bill Sanders
Forgive me if this is unhelpful, but could it be something to do with permissions of the directory and not Ceph at all? http://superuser.com/a/528467 Bill On Mon, Mar 2, 2015 at 3:47 PM, Gregory Farnum wrote: > On Mon, Mar 2, 2015 at 3:39 PM, Scottix wrote: > > We have a file system running C

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
3 Ceph servers on Ubuntu 12.04.5 - kernel 3.13.0-29-generic We have an old server that we compiled the ceph-fuse client on Suse11.4 - kernel 2.6.37.6-0.11 This is the only mount we have right now. We don't have any problems reading the files and the directory shows full 775 permissions and doing

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Gregory Farnum
I bet it's that permission issue combined with a minor bug in FUSE on that kernel, or maybe in the ceph-fuse code (but I've not seen it reported before, so I kind of doubt it). If you run ceph-fuse with "debug client = 20" it will output (a whole lot of) logging to the client's log file and you cou
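
The suggested client-side debugging, sketched as a ceph.conf fragment plus a remount (the log path, mount point and monitor address are illustrative):

    # /etc/ceph/ceph.conf on the client
    [client]
        debug client = 20
        log file = /var/log/ceph/ceph-client.$pid.log

    # remount with ceph-fuse so the options take effect
    ceph-fuse -m mon-host:6789 /mnt/cephfs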

Re: [ceph-users] CephFS Attributes Question Marks

2015-03-02 Thread Scottix
I'll try the following things and report back to you. 1. I can get a new kernel on another machine and mount to the CephFS and see if I get the following errors. 2. I'll run the debug and see if anything comes up. I'll report back to you when I can do these things. Thanks, Scottie On Mon, Mar 2

[ceph-users] EC configuration questions...

2015-03-02 Thread Don Doerner
Hello, I am trying to set up to measure erasure coding performance and overhead. My Ceph "cluster-of-one" has 27 disks, hence 27 OSDs, all empty. I have lots of memory, and I am using "osd crush chooseleaf type = 0" in my config file, so my OSDs should be able to peer with others on the same h

Re: [ceph-users] EC configuration questions...

2015-03-02 Thread Don Doerner
Update: the attempt to define a traditional replicated pool was successful; it's online and ready to go. So the cluster basics appear sound... -don- From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Don Doerner Sent: 02 March, 2015 16:18 To: ceph-users@lists.ceph.com Su

Re: [ceph-users] EC configuration questions...

2015-03-02 Thread Loic Dachary
Hi Don, On 03/03/2015 01:18, Don Doerner wrote: > Hello, > > > > I am trying to set up to measure erasure coding performance and overhead. My > Ceph “cluster-of-one” has 27 disks, hence 27 OSDs, all empty. I have lots of > memory, and I am using “osd crush chooseleaf type = 0” in my config f
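
For completeness, the usual sequence for this kind of single-host EC experiment, hedged (the profile parameters are illustrative; with 27 OSDs and an osd failure domain, k+m has to stay at or below 27):

    # profile that places chunks across OSDs rather than hosts
    ceph osd erasure-code-profile set ec-test k=20 m=7 ruleset-failure-domain=osd
    # pool backed by that profile
    ceph osd pool create ecpool 256 256 erasure ec-test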

Re: [ceph-users] New SSD Question

2015-03-02 Thread Christian Balzer
On Mon, 2 Mar 2015 16:12:59 -0600 Tony Harris wrote: > Hi all, > > After the previous thread, I'm doing my SSD shopping and I came > across an SSD called an Edge Boost Pro w/ Power Fail, it seems to have > some impressive specs - in most places decent user reviews, in one > place a poor one

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread Ben Hines
We're seeing a lot of this as well (as I mentioned to Sage at SCALE..). Is there a rule of thumb at all for how big it is safe to let an RGW bucket get? Also, is this theoretically resolved by the new bucket-sharding feature in the latest dev release? -Ben On Mon, Mar 2, 2015 at 11:08 AM, Erdem Agao

Re: [ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-03-02 Thread Nathan O'Sullivan
On 11/02/2015 1:46 PM, 杨万元 wrote: Hello! We use Ceph+Openstack in our private cloud. Recently we upgraded our centos6.5 based cluster from Ceph Emperor to Ceph Firefly. At first, we used the redhat yum repo epel to upgrade; this Ceph version is 0.80.5. First upgrade the monitor, then the osd, last cl

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread GuangYang
We have had good experience so far keeping each bucket below 0.5 million objects, by client-side sharding. But I think it would be nice if you can test at your scale, with your hardware configuration, as well as your expectation of the tail latency. Generally the bucket sharding should help,

Re: [ceph-users] RadosGW do not populate "log file"

2015-03-02 Thread zhangdongmao
I have met this before. Because I use apache with rgw, radosgw is executed by the user 'apache', so you have to make sure the apache user has permission to write to the log file. On 2015-03-03 07:06, Italo Santos wrote: Hello everyone, I have a radosgw configured with the ceph.conf file below,
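
A hedged sketch of checking that permission angle (the log path below is whatever your "log file" setting points at; 'apache' is the user mentioned above):

    # confirm which user runs radosgw and whether it can write the log
    ps -o user= -C radosgw
    ls -l /var/log/ceph/client.radosgw.gateway.log
    # hand the log file to that user
    chown apache:apache /var/log/ceph/client.radosgw.gateway.log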

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread Ben Hines
Blind-bucket would be perfect for us, as we don't need to list the objects. We only need to list the bucket when doing a bucket deletion. If we could clean out/delete all objects in a bucket (without iterating/listing them) that would be ideal.. On Mon, Mar 2, 2015 at 7:34 PM, GuangYang wrote: >

Re: [ceph-users] v0.93 Hammer release candidate released

2015-03-02 Thread Sage Weil
I forgot to mention a very important note for those running the v0.92 development release and upgrading: On Fri, 27 Feb 2015, Sage Weil wrote: > Upgrading > - * If you are upgrading from v0.92, you must stop all OSD daemons and flush their journals (``ceph-osd -i NNN --flush-journal'')

Re: [ceph-users] v0.93 Hammer release candidate released

2015-03-02 Thread Sage Weil
On Mon, 2 Mar 2015, Sage Weil wrote: > I forgot to mention a very important note for those running the v0.92 > development release and upgrading: > > On Fri, 27 Feb 2015, Sage Weil wrote: > > Upgrading > > - > > * If you are upgrading from v0.92, you must stop all OSD daemons and flush
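
Spelled out per OSD, the note above amounts to something like the following (the service invocation varies by distro and init system, so treat it as an assumption):

    # stop the daemon and flush its journal before installing v0.93
    service ceph stop osd.12
    ceph-osd -i 12 --flush-journal
    # upgrade the packages, then bring the OSD back
    service ceph start osd.12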

Re: [ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-03-02 Thread Gregory Farnum
On Mon, Mar 2, 2015 at 7:15 PM, Nathan O'Sullivan wrote: > > On 11/02/2015 1:46 PM, 杨万元 wrote: > > Hello! > We use Ceph+Openstack in our private cloud. Recently we upgraded our > centos6.5 based cluster from Ceph Emperor to Ceph Firefly. > At first, we used the redhat yum repo epel to upgrade, th

Re: [ceph-users] Update 0.80.5 to 0.80.8 --the VM's read request become too slow

2015-03-02 Thread Alexandre DERUMIER
I think this will be fixed in the next firefly point release. Tracker for the firefly 0.80.8 speed decrease: http://tracker.ceph.com/issues/10956 Jason Dillaman linked it to the famous object_cacher bug: http://tracker.ceph.com/issues/9854 - Original message - From: "Gregory Farnum" To: "Nathan O'Sulliv

Re: [ceph-users] Some long running ops may lock osd

2015-03-02 Thread Erdem Agaoglu
Thank you folks for bringing that up. I had some questions about sharding. We'd like blind buckets too, at least it's on the roadmap. For the current sharded implementation, what are the final details? Is number of shards defined per bucket or globally? Is there a way to split current indexes into
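
On the "per bucket or globally" question: as far as I can tell the hammer-era shard count is picked up from configuration at bucket creation time rather than set per existing bucket; a hedged ceph.conf sketch (verify the option name, rgw_override_bucket_index_max_shards, against your release):

    # /etc/ceph/ceph.conf on the rgw node(s)
    [client.radosgw.gateway]
        rgw override bucket index max shards = 8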