Re: [ceph-users] about rgw region and zone

2015-04-28 Thread Karan Singh
Hi > On 28 Apr 2015, at 07:12, TERRY <316828...@qq.com> wrote: > > Hi: all > > when configuring Federated Gateways, I got the error below: > > sudo radosgw-agent -c /etc/ceph/ceph-data-sync.conf > ERROR:root:Could not retrieve region map from destination You should check that t
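
For what it's worth, a quick sanity check on the destination side before re-running the agent could look like the sketch below (the client name and endpoint are placeholders, not taken from this thread):

    # Does the destination cluster actually have a region map?
    radosgw-admin region-map get --name client.radosgw.us-east-1
    # The agent fetches /admin/config over HTTP with the system user's keys;
    # an unsigned curl only proves the endpoint is reachable (expect 403).
    curl -sv http://rgw-dest.example.com/admin/config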

Re: [ceph-users] cephfs: recovering from transport endpoint not connected?

2015-04-28 Thread Burkhard Linke
Hi, On 04/27/2015 02:31 PM, Yan, Zheng wrote: On Mon, Apr 27, 2015 at 3:42 PM, Burkhard Linke wrote: Hi, I've deployed ceph on a number of nodes in our compute cluster (Ubuntu 14.04, Ceph Firefly 0.80.9). /ceph is mounted via ceph-fuse. From time to time some nodes lose their access to ceph
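
For reference, the usual recovery for a client stuck in that state is a lazy unmount followed by a remount; the mount point and monitor address below are placeholders:

    # Detach the dead FUSE mount, then mount again
    fusermount -uz /ceph
    ceph-fuse -m 10.0.0.1:6789 /ceph
    # The client log usually says why the session was dropped
    tail -n 100 /var/log/ceph/ceph-client.admin.log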

[ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
Hi ceph-users, I am currently planning a cluster and would like some input specifically about the storage nodes. The non-osd systems will be running on more powerful systems. Interconnect as currently planned: 4 x 1Gbit LACP bonds over a pair of MLAG-capable switches (planned: EX3300). So far I
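
For illustration, a four-port 802.3ad bond on Ubuntu 14.04 would look roughly like this sketch (interface names, address and the layer3+4 hash policy are my assumptions, not part of the plan above):

    # /etc/network/interfaces fragment (ifenslave), illustrative only
    auto bond0
    iface bond0 inet static
        address 10.20.0.21
        netmask 255.255.255.0
        bond-slaves eth0 eth1 eth2 eth3
        bond-mode 802.3ad
        bond-miimon 100
        bond-xmit-hash-policy layer3+4  # spread flows by IP+port on the host side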

Re: [ceph-users] [cephfs][ceph-fuse] cache size or memory leak?

2015-04-28 Thread John Spray
On 28/04/2015 06:55, Dexter Xiong wrote: Hi, I've deployed a small hammer cluster 0.94.1. And I mount it via ceph-fuse on Ubuntu 14.04. After several hours I found that the ceph-fuse process crashed. At the end is the crash log from /var/log/ceph/ceph-client.admin.log. The memory cost of ce

Re: [ceph-users] Another OSD Crush question.

2015-04-28 Thread Rogier Dikkes
Hi Robert, Relocating the older hardware to the new racks is also an interesting option. Thanks for the suggestion! Rogier Dikkes Systeem Programmeur Hadoop & HPC Cloud SURFsara | Science Park 140 | 1098 XG Amsterdam > On Apr 23, 2015, at 5:50 PM, Robert LeBlanc wrote: > > If you force CRUS

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Jake Young
On Tuesday, April 28, 2015, Dominik Hannen wrote: > Hi ceph-users, > > I am currently planning a cluster and would like some input specifically > about the storage-nodes. > > The non-osd systems will be running on more powerful system. > > Interconnect as currently planned: > 4 x 1Gbit LACP Bonds

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Nick Fisk
Hi Dominik, Answers in line > -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Dominik Hannen > Sent: 28 April 2015 10:35 > To: ceph-users@lists.ceph.com > Subject: [ceph-users] Cost- and Powerefficient OSD-Nodes > > Hi ceph-users, > > I am

[ceph-users] Ceph is Full

2015-04-28 Thread Ray Sun
Emergency help! One of our ceph clusters is full, and ceph -s returns: [root@controller ~]# ceph -s cluster 059f27e8-a23f-4587-9033-3e3679d03b31 health HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 ne
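
As a first step in this situation it usually helps to see exactly which OSDs and pools tripped the thresholds; a minimal sketch (exact output and command availability vary by release):

    ceph health detail | grep -i full   # which OSDs are full / near full
    ceph df                             # per-pool usage vs. raw capacity
    ceph osd df                         # per-OSD utilisation, if the release has it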

Re: [ceph-users] Ceph is Full

2015-04-28 Thread Ray Sun
More detail about ceph health detail [root@controller ~]# ceph health detail HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; recovery 7482/129081 objects degraded (5.796%); 2 full osd(s); 1 near full osd(s) pg 3.8 is stuck unclean for 7067109.597691, current state active+

[ceph-users] Use object-map Feature on existing rbd images ?

2015-04-28 Thread Christoph Adomeit
Hi there, we are using ceph hammer and we have some fully provisioned images with only a little data. rbd export of a 500 GB rbd image takes a long time although there are only 15 GB of used data, even if the rbd image is trimmed. Do you think it is a good idea to enable the object-map feature on al
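
For what it's worth, on releases where features can be toggled on existing images the sequence is roughly the one below; whether these subcommands are present in Hammer 0.94 should be verified first, so treat it purely as a sketch with a placeholder pool/image name:

    # object-map depends on exclusive-lock, so enable that first
    rbd feature enable rbd/myimage exclusive-lock
    rbd feature enable rbd/myimage object-map
    # rebuild the map so it reflects which objects already exist
    rbd object-map rebuild rbd/myimage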

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
>> Interconnect as currently planned: >> 4 x 1Gbit LACP Bonds over a pair of MLAG-capable switches (planned: EX3300) > One problem with LACP is that it will only allow you to have 1Gbps between > any two IPs or MACs (depending on your switch config). This will most > likely limit the throughput of

Re: [ceph-users] Ceph is Full

2015-04-28 Thread Sebastien Han
You can try to push the full ratio a bit further and then delete some objects. > On 28 Apr 2015, at 15:51, Ray Sun wrote: > > More detail about ceph health detail > [root@controller ~]# ceph health detail > HEALTH_ERR 20 pgs backfill_toofull; 20 pgs degraded; 20 pgs stuck unclean; > recovery 74

Re: [ceph-users] Ceph is Full

2015-04-28 Thread Ray Sun
Sébastien, thanks for your answer, I am a fan of your blog, it has really helped me a lot. I found there are two ways to do that: the first one is to use the command line, but after I tried ceph pg set_full_ratio 0.98 it did not seem to work. Then I tried to modify the ceph.conf and add mon osd full ratio =
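
A hedged sketch of the sequence that normally works on pre-Luminous releases (the 'mon osd full ratio' setting in ceph.conf only takes effect when the PG map is first created, which is probably why editing the file changes nothing); the values are illustrative:

    # temporarily raise the cluster-wide full threshold
    ceph pg set_full_ratio 0.98
    ceph health detail        # the "full osd(s)" flag should clear shortly
    # delete or migrate data, then restore the guard rail
    ceph pg set_full_ratio 0.95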

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Dominik Hannen > Sent: 28 April 2015 15:30 > To: Jake Young > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Cost- and Powerefficient OSD-Nodes > > >> Interconnect as currently

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread David Burley
We tested the m500 960GB for journaling and found at most it could journal 3 spinner OSDs. I'd strongly recommend you avoid the Crucial consumer drives based on our testing/usage. We ended up journaling those to the spinner itself and getting better performance. Also, I wouldn't trust their power l

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Jake Young
On Tuesday, April 28, 2015, Nick Fisk wrote: > > > > > > -Original Message- > > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com > ] On Behalf Of > > Dominik Hannen > > Sent: 28 April 2015 15:30 > > To: Jake Young > > Cc: ceph-users@lists.ceph.com > > Subject: Re: [ceph-users]

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
> > 2 x (2 x 1Gbit) was on my mind with cluster/public separated, if the > > performance of 4 x 1Gbit LACP would not deliver. > > Regarding source-IP/dest-IP hashing with LACP: wouldn't it be sufficient to > > give each osd-process its own IP for cluster/public then? > I'm not sure this is supp

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
>> Interconnect as currently planned: >> 4 x 1Gbit LACP Bonds over a pair of MLAG-capable switches (planned: >> EX3300) > If you can do 10GB networking it's really worth it. I found that with 1G, > latency affects your performance before you max out the bandwidth. We got > some Supermicro servers w

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Nick Fisk
> -Original Message- > From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of > Dominik Hannen > Sent: 28 April 2015 17:08 > To: Nick Fisk > Cc: ceph-users@lists.ceph.com > Subject: Re: [ceph-users] Cost- and Powerefficient OSD-Nodes > > >> Interconnect as currently p

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Sage Weil
On Tue, 28 Apr 2015, Tuomas Juntunen wrote: > Just to add some more interesting behavior to my problem: the monitors > are not updating the status of the OSDs. Yeah, this is very strange. I see 2015-04-27 22:25:26.142245 7f78d793a900 10 osd.15 17882 _get_pool 4 cached_removed_snaps [1~1,4~a,f~

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Scott Laird
FYI, most Juniper switches hash LAGs on IP+port, so you'd get somewhat better performance than you would with simple MAC or IP hashing. 10G is better if you can afford it, though. On Tue, Apr 28, 2015 at 9:55 AM Nick Fisk wrote: > > > > > > -Original Message- > > From: ceph-users [mailt

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Tuomas Juntunen
Hi Here's the file. The cluster was working totally fine for months before this, no problems at all. 'ceph osd tree' still shows 11 OSDs up, even though all of them are down. If I add a new one, that will get added to the list and it goes up, but when I stop it, the status will still show it as

Re: [ceph-users] about rgw region and zone

2015-04-28 Thread Karan Singh
You should try to create a new user without the --system option, so basically create a normal user, then create some bucket and object and finally try to resync the cluster. Karan Singh Systems Specialist, Storage Platforms CSC - IT Cent
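
Roughly what that test could look like (uid, display name, bucket name and the key placeholders are all made up):

    # a plain (non-system) user
    radosgw-admin user create --uid=synctest --display-name="Sync Test"
    # create a bucket and an object with that user's keys, then re-run
    # radosgw-agent and watch whether the object shows up on the other side
    s3cmd --access_key=<KEY> --secret_key=<SECRET> mb s3://synctest-bucket
    s3cmd --access_key=<KEY> --secret_key=<SECRET> put /etc/hosts s3://synctest-bucket/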

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Sage Weil
On Tue, 28 Apr 2015, Tuomas Juntunen wrote: > Hi > > Here's the file. Hrm. Can you also attach ceph osd dump -f json-pretty 17846 > The cluster was working totally fine for months before this, no problems at > all. > > 'ceph osd tree' still shows 11 osd's up, even though all of them are down

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Tuomas Juntunen
Here it is Br, Tuomas -Original Message- From: Sage Weil [mailto:sw...@redhat.com] Sent: 28. huhtikuuta 2015 21:57 To: Tuomas Juntunen Cc: ceph-users@lists.ceph.com Subject: RE: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down On Tue

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Tuomas Juntunen
Hi Missed that last one; no, there's nothing special there really. Here's the conf, very simple. [global] fsid = a2974742-3805-4cd3-bc79-765f2bddaefe mon_initial_members = ceph1, ceph2, ceph3 mon_host = 10.20.0.11,10.20.0.12,10.20.0.13 auth_cluster_required = cephx auth_service_required = cephx aut

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Sage Weil
[adding ceph-devel] Okay, I see the problem. This seems to be unrelated to the giant -> hammer move... it's a result of the tiering changes you made: > > > > > > The following: > > > > > > > > > > > > ceph osd tier add img images --force-nonempty > > > > > > ceph osd tier cache-mode images for

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
> FYI, most Juniper switches hash LAGs on IP+port, so you'd get somewhat > better performance than you would with simple MAC or IP hashing. 10G is > better if you can afford it, though. Interesting, I just read up about the topic; those Juniper switches seem to be a nice pick then.

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Sage Weil
Hi Tuomas, I've pushed an updated wip-hammer-snaps branch. Can you please try it? The build will appear here http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/sha1/08bf531331afd5e2eb514067f72afda11bcde286 (or a similar url; adjust for your distro). Thanks! sage On Tue, 28 Ap

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Patrick Hahn
I haven't used them myself but switching silicon is getting pretty cheap nowadays: http://whiteboxswitch.com/products/edge-core-as5610-52x There's similar products (basically the same Broadcom ASIC) from Quanta and I think Supermicro announced one recently as well. They're not as plug and play s

Re: [ceph-users] Ceph Radosgw multi zone data replication failure

2015-04-28 Thread Vickey Singh
Hello Geeks Need your help and advice with this problem. - VS - On Tue, Apr 28, 2015 at 12:48 AM, Vickey Singh wrote: > Hello Alfredo / Craig > > First of all, thank you so much for replying and giving your precious time > to this problem. > > @Alfredo : I tried radosgw-agent version 1.2.

[ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-04-28 Thread Sean
Hey y'all! I have a weird issue and I am not sure where to look, so any help would be appreciated. I have a large ceph giant cluster that has been stable and healthy almost entirely since its inception. We have stored over 1.5PB in the cluster so far through RGW and everything seems to be

Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-04-28 Thread Yehuda Sadeh-Weinraub
- Original Message - > From: "Sean" > To: ceph-users@lists.ceph.com > Sent: Tuesday, April 28, 2015 2:52:35 PM > Subject: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb > logs stop after rotation > > Hey yall! > > I have a weird issue and I am not sure where to lo

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
> We tested the m500 960GB for journaling and found at most it could journal > 3 spinner OSDs. I'd strongly recommend you avoid the Crucial consumer > drives based on our testing/usage. We ended up journaling those to the > spinner itself and getting better performance. Also, I wouldn't trust their

Re: [ceph-users] Civet RadosGW S3 not storing complete obects; civetweb logs stop after rotation

2015-04-28 Thread Sean Sullivan
Will do. The reason for the partial request is that the total size of the file is close to 1TB, so attempting a download would take quite some time on our 10Gb connection. What is odd is that if I request the range from the last byte received to the end of the file, we get a 406 (cannot be satisfied) response
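
One way to poke at the range handling without downloading the whole object is a ranged GET against the gateway; a sketch with placeholder endpoint and object (anonymous access is assumed here, otherwise the request has to be signed), where a satisfiable range should come back as 206:

    # first megabyte, then an open-ended tail range; compare the status codes
    curl -s -o /dev/null -D - -H "Range: bytes=0-1048575" \
        http://rgw.example.com/bucket/bigobject
    curl -s -o /dev/null -D - -H "Range: bytes=900000000000-" \
        http://rgw.example.com/bucket/bigobject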

Re: [ceph-users] Cost- and Powerefficient OSD-Nodes

2015-04-28 Thread Dominik Hannen
> It's all about the total latency per operation. Most IO sizes over 10GB > don't make much difference to the Round Trip Time. But comparatively even > 128KB IO's over 1GB take quite a while. For example ping a host with a > payload of 64k over 1GB and 10GB networks and look at the difference in >
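
The comparison is easy to reproduce with plain ping (the hostname is a placeholder; a 64k payload will be fragmented, which is fine for a rough RTT comparison):

    ping -c 20 -s 65000 storage-node-1   # ~64 KB payload
    ping -c 20 storage-node-1            # default 56-byte payload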

[ceph-users] Cannot remove cache pool used by CephFS

2015-04-28 Thread CY Chang
Hi, I set up a cache pool for the data pool used in CephFS. When I tried to remove the cache pool, I got this error: pool 'XXX' is in use by CephFS via its tier. So, my question is: why is it forbidden to remove tiers from a base pool in use by CephFS? How about a pool in use by RBD? CY
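
For context, the generic cache-tier teardown looks roughly like this (pool names are placeholders); once the tier relationship is detached, deleting the former cache pool should no longer hit that guard:

    # stop new writes going to the cache and flush everything out
    ceph osd tier cache-mode cachepool forward
    rados -p cachepool cache-flush-evict-all
    # detach the cache from the base (CephFS data) pool
    ceph osd tier remove-overlay datapool
    ceph osd tier remove datapool cachepool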

Re: [ceph-users] Ceph is Full

2015-04-28 Thread Ray Sun
After adding a new ceph-osd, the cluster seems to be back to normal. But there's still a warning message in ceph health detail. During the previous three months an OSD node restarted very often due to a power supply problem, so I guess maybe it is related to this. But I'm not quite sure how to fix it, Ple

Re: [ceph-users] Upgrade from Giant to Hammer and after some basic operations most of the OSD's went down

2015-04-28 Thread Tuomas Juntunen
Hi I updated that version and it seems that something did happen: the OSDs stayed up for a while and 'ceph status' got updated. But then in a couple of minutes they all went down in the same way. I have attached a new 'ceph osd dump -f json-pretty' and got a new log from one of the OSDs with osd debu

Re: [ceph-users] [cephfs][ceph-fuse] cache size or memory leak?

2015-04-28 Thread Dexter Xiong
I tried setting client cache size = 100, but it doesn't solve the problem. I tested ceph-fuse with kernel versions 3.13.0-24, 3.13.0-49 and 3.16.0-34. On Tue, Apr 28, 2015 at 7:39 PM John Spray wrote: > > > On 28/04/2015 06:55, Dexter Xiong wrote: > > Hi, > > I've deployed a small hammer cluster
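
If it helps, both the running value and the client's internal counters can be read through the admin socket (the socket name may carry a pid suffix; check /var/run/ceph/):

    # confirm the option was actually picked up by the running client
    ceph daemon /var/run/ceph/ceph-client.admin.asok config get client_cache_size
    # inode/cap/dentry counters help distinguish a leak from a large cache
    ceph daemon /var/run/ceph/ceph-client.admin.asok perf dump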