Re: [ceph-users] help with ceph radosgw configure

2014-03-15 Thread wsnote
Thanks for your reply.
I am sure that there is only one web server running on this CentOS host.
All my steps are as follows:

First of all, I have set up DNS so that I can resolve:
nslookup ceph65
nslookup a.ceph65
nslookup anyother.ceph65


Then
1. yum install httpd mod_fastcgi mod_ssl
rm /etc/httpd/conf.d/welcome.conf  
rm /var/www/error/noindex.html  
2. vi /etc/httpd/conf/httpd.conf
-
Listen 65080
ServerName ceph65
-
3. Make sure the following modules are loaded:
LoadModule rewrite_module modules/mod_rewrite.so
LoadModule fastcgi_module modules/mod_fastcgi.so
LoadModule ssl_module modules/mod_ssl.so
4. Generate an SSL certificate:
cd /etc/pki/tls/private/
openssl genrsa -des3 -out server.key 2048
openssl req -new -key server.key -out server.csr
cp server.key server.key.orig
openssl rsa -in server.key.orig -out server.key
openssl x509 -req -days 65535 -in server.csr -signkey server.key -out server.crt
rm server.key.orig server.csr
mv server.crt /etc/pki/tls/certs
5. vi /etc/httpd/conf.d/ssl.conf
-
Listen 65443

SSLCertificateFile /etc/pki/tls/certs/server.crt
SSLCertificateKeyFile /etc/pki/tls/private/server.key
-
6. Install ceph-radosgw and radosgw-agent:
yum install ceph-radosgw radosgw-agent
7. vi ceph.conf and copy it to the other Ceph servers
-
[client.radosgw.gateway]
host = ceph65
public_addr = 192.168.8.183
rgw_dns_name = 127.0.0.1
keyring = /etc/ceph/keyring.radosgw.gateway
rgw_socket_path = /tmp/radosgw.sock
log_file = /var/log/ceph/radosgw.log
-
8.
mkdir -p /var/lib/ceph/radosgw/ceph-radosgw.gateway
9.
ceph-authtool --create-keyring /etc/ceph/keyring.radosgw.gateway
chmod +r /etc/ceph/keyring.radosgw.gateway
ceph-authtool /etc/ceph/keyring.radosgw.gateway -n client.radosgw.gateway --gen-key
ceph-authtool -n client.radosgw.gateway --cap osd 'allow rwx' --cap mon 'allow rw' /etc/ceph/keyring.radosgw.gateway
ceph auth add client.radosgw.gateway -i /etc/ceph/keyring.radosgw.gateway
10. vi /etc/httpd/conf.d/fastcgi.conf
-
FastCgiWrapper Off
-
11. vi /etc/httpd/conf.d/rgw.conf
-
<VirtualHost *:65080>
ServerName ceph65
ServerAdmin ceph65
DocumentRoot /var/www/html

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
</IfModule>

<IfModule mod_fastcgi.c>
<Directory /var/www/html>
Options +ExecCGI
AllowOverride All
SetHandler fastcgi-script
Order allow,deny
Allow from all
AuthBasicAuthoritative Off
</Directory>
</IfModule>

AllowEncodedSlashes On
ErrorLog /var/log/httpd/rgw_error_log
CustomLog /var/log/httpd/rgw_access_log combined
ServerSignature Off
SetEnv SERVER_PORT_SECURE 65443
</VirtualHost>

<VirtualHost *:65443>
ServerName ceph65
ServerAdmin ceph65
DocumentRoot /var/www/html
#ErrorLog logs/ssl_error_log
#TransferLog logs/ssl_access_log
LogLevel warn
SSLEngine on
SSLProtocol all -SSLv2
SSLCipherSuite ALL:!ADH:!EXPORT:!SSLv2:RC4+RSA:+HIGH:+MEDIUM:+LOW
SSLCertificateFile /etc/pki/tls/certs/server.crt
SSLCertificateKeyFile /etc/pki/tls/private/server.key

<Files ~ "\.(cgi|shtml|phtml|php3?)$">
SSLOptions +StdEnvVars
</Files>
<Directory "/var/www/cgi-bin">
SSLOptions +StdEnvVars
</Directory>

SetEnvIf User-Agent ".*MSIE.*" nokeepalive ssl-unclean-shutdown downgrade-1.0 force-response-1.0
CustomLog logs/ssl_request_log "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteRule ^/([a-zA-Z0-9-_.]*)([/]?.*) /s3gw.fcgi?page=$1&params=$2&%{QUERY_STRING} [E=HTTP_AUTHORIZATION:%{HTTP:Authorization},L]
</IfModule>

<IfModule mod_fastcgi.c>
<Directory /var/www/html>
Options +ExecCGI
AllowOverride All
SetHandler fastcgi-script
Order allow,deny
Allow from all
AuthBasicAuthoritative Off
</Directory>
</IfModule>

AllowEncodedSlashes On
ErrorLog /var/log/httpd/rgw_error_log
CustomLog /var/log/httpd/rgw_access_log combined
ServerSignature Off
SetEnv SERVER_PORT_SECURE 65443
</VirtualHost>

FastCgiExternalServer /var/www/html/s3gw.fcgi -socket /tmp/radosgw.sock

-
12. vi /var/www/html/s3gw.fcgi and chmod +x  /var/www/html/s3gw.fcgi
-
#!/bin/sh
exec /usr/bin/radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gateway
-
13. rm -rf /tmp/radosgw.sock
14. start radosgw
chkconfig --add ceph-radosgw
chkconfig ceph-radosgw on
service ceph -a restart
service httpd restart
service ceph-radosgw start
servi
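
(A quick way to sanity-check a setup like the one above, once httpd and radosgw are running, is to create a test user and hit the gateway; this is only a sketch, and the user id, display name, and host/port below are placeholders taken from the steps above.)

# Create a test RGW user -- uid and display name are examples only.
radosgw-admin user create --uid=testuser --display-name="Test User"

# The gateway should answer the request itself (e.g. an XML bucket listing
# or a 403), rather than returning an Apache error page.
curl -v http://ceph65:65080/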

[ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Karol Kozubal
Hi Everyone,

I am just wondering if any of you are running a ceph cluster with an iSCSI 
target front end? I know this isn't available out of the box; unfortunately, in 
one particular use case we are looking at providing iSCSI access and it is a 
necessity. I like the idea of having rbd devices serving block-level storage 
to the iSCSI target servers while providing a unified backend for native 
rbd access by OpenStack and various application servers. On multiple levels 
this would reduce the complexity of our SAN environment and move us away from 
expensive proprietary solutions that don't scale out.

If any of you have deployed any HA iSCSI Targets backed by rbd I would really 
appreciate your feedback and any thoughts.

Karol
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Replication lag in block storage

2014-03-15 Thread Ирек Фасихов
Which model of hard drives do you have?


2014-03-14 21:59 GMT+04:00 Greg Poirier :

> We are stressing these boxes pretty spectacularly at the moment.
>
> On every box I have one OSD that is pegged for IO almost constantly.
>
> ceph-1:
> Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sdv        0.00    0.00  104.00  160.00   748.00  1000.00    13.24     1.15   4.36    9.46    1.05   3.70  97.60
>
> ceph-2:
> Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sdq        0.00   25.00  109.00  218.00   844.00  1773.50    16.01     1.37   4.20    9.03    1.78   3.01  98.40
>
> ceph-3:
> Device:  rrqm/s  wrqm/s     r/s     w/s    rkB/s    wkB/s avgrq-sz avgqu-sz  await r_await w_await  svctm  %util
> sdm        0.00    0.00  126.00   56.00   996.00   540.00    16.88     1.01   5.58    8.06    0.00   5.43  98.80
>
> These are all disks in my block storage pool.
>
>  osdmap e26698: 102 osds: 102 up, 102 in
>   pgmap v6752413: 4624 pgs, 3 pools, 14151 GB data, 21729 kobjects
> 28517 GB used, 65393 GB / 93911 GB avail
> 4624 active+clean
>   client io 1915 kB/s rd, 59690 kB/s wr, 1464 op/s
>
> I don't see any smart errors, but i'm slowly working my way through all of
> the disks on these machines with smartctl to see if anything stands out.
>
>
> On Fri, Mar 14, 2014 at 9:52 AM, Gregory Farnum  wrote:
>
>> On Fri, Mar 14, 2014 at 9:37 AM, Greg Poirier 
>> wrote:
>> > So, on the cluster that I _expect_ to be slow, it appears that we are
>> > waiting on journal commits. I want to make sure that I am reading this
>> > correctly:
>> >
>> >   "received_at": "2014-03-14 12:14:22.659170",
>> >
>> > { "time": "2014-03-14 12:14:22.660191",
>> >   "event": "write_thread_in_journal_buffer"},
>> >
>> > At this point we have received the write and are attempting to write the
>> > transaction to the OSD's journal, yes?
>> >
>> > Then:
>> >
>> > { "time": "2014-03-14 12:14:22.900779",
>> >   "event": "journaled_completion_queued"},
>> >
>> > 240ms later we have successfully written to the journal?
>>
>> Correct. That seems an awfully long time for a 16K write, although I
>> don't know how much data I have on co-located journals. (At least, I'm
>> assuming it's in the 16K range based on the others, although I'm just
>> now realizing that subops aren't providing that information...I've
>> created a ticket to include that diagnostic info in future.)
>> -Greg
>> Software Engineer #42 @ http://inktank.com | http://ceph.com
>>
>>
>> > I expect this particular slowness due to colocation of journal and data
>> on
>> > the same disk (and it's a spinning disk, not an SSD). I expect some of
>> this
>> > could be alleviated by migrating journals to SSDs, but I am looking to
>> > rebuild in the near future--so am willing to hobble in the meantime.
>> >
>> > I am surprised that our all SSD cluster is also underperforming. I am
>> trying
>> > colocating the journal on the same disk with all SSDs at the moment and
>> will
>> > see if the performance degradation is of the same nature.
>> >
>> >
>> >
>> > On Thu, Mar 13, 2014 at 6:25 PM, Gregory Farnum 
>> wrote:
>> >>
>> >> Right. So which is the interval that's taking all the time? Probably
>> >> it's waiting for the journal commit, but maybe there's something else
>> >> blocking progress. If it is the journal commit, check out how busy the
>> >> disk is (is it just saturated?) and what its normal performance
>> >> characteristics are (is it breaking?).
>> >> -Greg
>> >> Software Engineer #42 @ http://inktank.com | http://ceph.com
>> >>
>> >>
>> >> On Thu, Mar 13, 2014 at 5:48 PM, Greg Poirier > >
>> >> wrote:
>> >> > Many of the sub ops look like this, with significant lag between
>> >> > received_at
>> >> > and commit_sent:
>> >> >
>> >> > { "description": "osd_op(client.6869831.0:1192491
>> >> > rbd_data.67b14a2ae8944a.9105 [write 507904~3686400]
>> >> > 6.556a4db0
>> >> > e660)",
>> >> >   "received_at": "2014-03-13 20:42:05.811936",
>> >> >   "age": "46.088198",
>> >> >   "duration": "0.038328",
>> >> > 
>> >> > { "time": "2014-03-13 20:42:05.850215",
>> >> >   "event": "commit_sent"},
>> >> > { "time": "2014-03-13 20:42:05.850264",
>> >> >   "event": "done"}]]},
>> >> >
>> >> > In this case almost 39ms between received_at and commit_sent.
>> >> >
>> >> > A particularly egregious example of 80+ms lag between received_at and
>> >> > commit_sent:
>> >> >
>> >> >{ "description": "osd_op(client.6869831.0:1190526
>> >> > rbd_data.67b14a2ae8944a.8fac [write 3325952~868352]
>> >> > 6.5255f5fd
>> >> > e660)",
>> >> >   "received_at": "201

Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Wido den Hollander

On 03/15/2014 04:11 PM, Karol Kozubal wrote:

Hi Everyone,

I am just wondering if any of you are running a ceph cluster with an
iSCSI target front end? I know this isn’t available out of the box,
unfortunately in one particular use case we are looking at providing
iSCSI access and it's a necessity. I am liking the idea of having rbd
devices serving block level storage to the iSCSI Target servers while
providing a unified backed for native rbd access by openstack and
various application servers. On multiple levels this would reduce the
complexity of our SAN environment and move us away from expensive
proprietary solutions that don’t scale out.

If any of you have deployed any HA iSCSI Targets backed by rbd I would
really appreciate your feedback and any thoughts.



I haven't used it in production, but a couple of things which come to mind:

- Use TGT so you can run it all in userspace backed by librbd
- Do not use writeback caching on the targets

You could use multipathing if you don't use writeback caching. Using 
writeback would also cause data loss/corruption in the case of multiple targets.


It will probably just work with TGT, but I don't know anything about the 
performance.
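
(As a rough illustration of the TGT suggestion above: stgt built with Ceph support can expose an RBD image through its rbd backing-store type. The target name, pool and image below are placeholders, and this is only a sketch, not a tested recipe.)

# Create a target and attach an RBD image via tgt's rbd backing store
# (requires tgt built with librbd support); names are examples only.
tgtadm --lld iscsi --mode target --op new --tid 1 --targetname iqn.2014-03.com.example:rbd.vol01
tgtadm --lld iscsi --mode logicalunit --op new --tid 1 --lun 1 --bstype rbd --backing-store rbd/vol01
tgtadm --lld iscsi --mode target --op bind --tid 1 --initiator-address ALL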



Karol


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Karol Kozubal
Hi Wido,

I will have some new hardware for running tests in the next two weeks or
so and will report my findings once I get a chance to run some tests. I
will disable writeback on the target side, as I will be attempting to
configure an SSD caching pool of 24 SSDs with writeback in front of the main
pool of 360 disks, with a ratio of 5 spinner OSDs to 1 SSD journal. I will be
running everything through 10 Gb SFP+ Ethernet interfaces, with a dedicated
cluster network interface, a dedicated public Ceph interface, and a separate
iSCSI network, also on 10 Gb interfaces, for the target machines.

I am ideally looking for 20,000 to 60,000 IOPS from this system if I can
get the caching pool configuration right. The application has a 30 ms max
latency requirement for the storage.

In my current tests I have only spinners: SAS 10K disks with 4.2 ms write
latency, with separate journaling on SAS 15K disks with 3.3 ms write
latency. With 20 OSDs and 4 journals, I am only concerned with the overall
operation apply latency that I have been seeing (1-6 ms idle is normal, but
up to 60-170 ms for a moderate workload using rbd bench-write). However, I am
on a network where I am bound to a 1500 MTU, and I will get to test jumbo
frames with the next setup, in addition to the SSDs. I suspect the overall
performance will be good in the new test setup and I am curious to see what
my tests will yield.
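
(For reference, the rbd bench-write runs mentioned above take roughly the form below; the image name and I/O parameters are placeholders, not the exact values used in these tests.)

# Example bench-write invocation against a test image; all values are examples.
rbd bench-write rbd/test-image --io-size 4096 --io-threads 16 --io-total 1073741824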

Thanks for the response!

Karol



On 2014-03-15, 12:18 PM, "Wido den Hollander"  wrote:

>On 03/15/2014 04:11 PM, Karol Kozubal wrote:
>> Hi Everyone,
>>
>> I am just wondering if any of you are running a ceph cluster with an
>> iSCSI target front end? I know this isn¹t available out of the box,
>> unfortunately in one particular use case we are looking at providing
>> iSCSI access and it's a necessity. I am liking the idea of having rbd
>> devices serving block level storage to the iSCSI Target servers while
>> providing a unified backed for native rbd access by openstack and
>> various application servers. On multiple levels this would reduce the
>> complexity of our SAN environment and move us away from expensive
>> proprietary solutions that don¹t scale out.
>>
>> If any of you have deployed any HA iSCSI Targets backed by rbd I would
>> really appreciate your feedback and any thoughts.
>>
>
>I haven't used it in production, but a couple of things which come to
>mind:
>
>- Use TGT so you can run it all in userspace backed by librbd
>- Do not use writeback caching on the targets
>
>You could use multipathing if you don't use writeback caching. Use
>writeback would also cause data loss/corruption in case of multiple
>targets.
>
>It will probably just work with TGT, but I don't know anything about the
>performance.
>
>> Karol
>>
>>
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>-- 
>Wido den Hollander
>42on B.V.
>
>Phone: +31 (0)20 700 9902
>Skype: contact42on
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Wido den Hollander

On 03/15/2014 05:40 PM, Karol Kozubal wrote:

Hi Wido,

I will have some new hardware for running tests in the next two weeks or
so and will report my findings once I get a chance to run some tests. I
will disable writeback on the target side as I will be attempting to
configure an ssd caching pool of 24 ssd's with writeback for the main pool
with 360 disks with a 5 osd spinners to 1 ssd journal ratio. I will be


How are the SSDs going to be in writeback? Is that the new caching pool 
feature?



running everything through 10Gig SFP+ Ethernet interfaces with a dedicated
cluster network interface, dedicated public ceph interface and a separate
iscsi network also with 10 gig interfaces for the target machines.



That seems like a good network.


I am ideally looking for a 20,000 to 60,000 IOPS from this system if I can
get the caching pool configuration right. The application has a 30ms max
latency requirement for the storage.



20,000 to 60,000 is a big difference. But the only way you are going to 
achieve that is by doing a lot of parallel I/O. Ceph doesn't excel at 
single threads doing a lot of I/O.


So if you have multiple RBD devices on which you are doing the I/O it 
shouldn't be that much of a problem.


Just spread out the I/O. Scale horizontally instead of vertically.


In my current tests I have only spinners with SAS 10K disks, 4.2ms write
latency on the disks with separate journaling on SAS 15K disks with a
3.3ms write latency. With 20 OSDs and 4 Journals I am only concerned with
the overall operation apply latency that I have been seeing (1-6ms idle is
normal, but up to 60-170ms for a moderate workload using rbd bench-write)
however I am on a network where I am bound to 1500 mtu and I will get to
test jumbo frames with the next setup in addition to the ssd¹s. I suspect
the overall performance will be good in the new test setup and I am
curious to see what my tests will yield.

Thanks for the response!

Karol



On 2014-03-15, 12:18 PM, "Wido den Hollander"  wrote:


On 03/15/2014 04:11 PM, Karol Kozubal wrote:

Hi Everyone,

I am just wondering if any of you are running a ceph cluster with an
iSCSI target front end? I know this isn¹t available out of the box,
unfortunately in one particular use case we are looking at providing
iSCSI access and it's a necessity. I am liking the idea of having rbd
devices serving block level storage to the iSCSI Target servers while
providing a unified backed for native rbd access by openstack and
various application servers. On multiple levels this would reduce the
complexity of our SAN environment and move us away from expensive
proprietary solutions that don¹t scale out.

If any of you have deployed any HA iSCSI Targets backed by rbd I would
really appreciate your feedback and any thoughts.



I haven't used it in production, but a couple of things which come to
mind:

- Use TGT so you can run it all in userspace backed by librbd
- Do not use writeback caching on the targets

You could use multipathing if you don't use writeback caching. Use
writeback would also cause data loss/corruption in case of multiple
targets.

It will probably just work with TGT, but I don't know anything about the
performance.


Karol


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





--
Wido den Hollander
42on B.V.

Phone: +31 (0)20 700 9902
Skype: contact42on
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Karol Kozubal
How are the SSDs going to be in writeback? Is that the new caching pool
Feature?

I am not sure what version implemented this, but it is documented here
(https://ceph.com/docs/master/dev/cache-pool/).
I will be using the latest stable release for my next batch of testing,
right now I am on 0.67.4 and I will be moving towards the 0.72.x branch.

As for the IOPS, it would be a total cluster IO throughput estimate based
on an application that would be reading/writing to more than 60 rbd
volumes.





On 2014-03-15, 1:11 PM, "Wido den Hollander"  wrote:

>On 03/15/2014 05:40 PM, Karol Kozubal wrote:
>> Hi Wido,
>>
>> I will have some new hardware for running tests in the next two weeks or
>> so and will report my findings once I get a chance to run some tests. I
>> will disable writeback on the target side as I will be attempting to
>> configure an ssd caching pool of 24 ssd's with writeback for the main
>>pool
>> with 360 disks with a 5 osd spinners to 1 ssd journal ratio. I will be
>
>How are the SSDs going to be in writeback? Is that the new caching pool
>feature?
>
>> running everything through 10Gig SFP+ Ethernet interfaces with a
>>dedicated
>> cluster network interface, dedicated public ceph interface and a
>>separate
>> iscsi network also with 10 gig interfaces for the target machines.
>>
>
>That seems like a good network.
>
>> I am ideally looking for a 20,000 to 60,000 IOPS from this system if I
>>can
>> get the caching pool configuration right. The application has a 30ms max
>> latency requirement for the storage.
>>
>
>20.000 to 60.000 is a big difference. But the only way you are going to
>achieve that is by doing a lot of parellel I/O. Ceph doesn't excel in
>single threads doing a lot of I/O.
>
>So if you have multiple RBD devices on which you are doing the I/O it
>shouldn't be that much of a problem.
>
>Just spread out the I/O. Scale horizontal instead of vertical.
>
>> In my current tests I have only spinners with SAS 10K disks, 4.2ms write
>> latency on the disks with separate journaling on SAS 15K disks with a
>> 3.3ms write latency. With 20 OSDs and 4 Journals I am only concerned
>>with
>> the overall operation apply latency that I have been seeing (1-6ms idle
>>is
>> normal, but up to 60-170ms for a moderate workload using rbd
>>bench-write)
>> however I am on a network where I am bound to 1500 mtu and I will get to
>> test jumbo frames with the next setup in addition to the ssd¹s. I
>>suspect
>> the overall performance will be good in the new test setup and I am
>> curious to see what my tests will yield.
>>
>> Thanks for the response!
>>
>> Karol
>>
>>
>>
>> On 2014-03-15, 12:18 PM, "Wido den Hollander"  wrote:
>>
>>> On 03/15/2014 04:11 PM, Karol Kozubal wrote:
 Hi Everyone,

 I am just wondering if any of you are running a ceph cluster with an
 iSCSI target front end? I know this isn¹t available out of the box,
 unfortunately in one particular use case we are looking at providing
 iSCSI access and it's a necessity. I am liking the idea of having rbd
 devices serving block level storage to the iSCSI Target servers while
 providing a unified backed for native rbd access by openstack and
 various application servers. On multiple levels this would reduce the
 complexity of our SAN environment and move us away from expensive
 proprietary solutions that don¹t scale out.

 If any of you have deployed any HA iSCSI Targets backed by rbd I would
 really appreciate your feedback and any thoughts.

>>>
>>> I haven't used it in production, but a couple of things which come to
>>> mind:
>>>
>>> - Use TGT so you can run it all in userspace backed by librbd
>>> - Do not use writeback caching on the targets
>>>
>>> You could use multipathing if you don't use writeback caching. Use
>>> writeback would also cause data loss/corruption in case of multiple
>>> targets.
>>>
>>> It will probably just work with TGT, but I don't know anything about
>>>the
>>> performance.
>>>
 Karol


 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

>>>
>>>
>>> --
>>> Wido den Hollander
>>> 42on B.V.
>>>
>>> Phone: +31 (0)20 700 9902
>>> Skype: contact42on
>>> ___
>>> ceph-users mailing list
>>> ceph-users@lists.ceph.com
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
>
>-- 
>Wido den Hollander
>42on B.V.
>
>Phone: +31 (0)20 700 9902
>Skype: contact42on
>___
>ceph-users mailing list
>ceph-users@lists.ceph.com
>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Karol Kozubal
I just re-read the documentation… It looks like it's a proposed feature
that is still in development. I will have to adjust my tests accordingly in
that case.

Anyone out there have any idea when this will be implemented? Or what do
the plans look like as of right now?



On 2014-03-15, 1:17 PM, "Karol Kozubal"  wrote:

>How are the SSDs going to be in writeback? Is that the new caching pool
>Feature?
>
>I am not sure what version implemented this, but it is documented here
>(https://ceph.com/docs/master/dev/cache-pool/).
>I will be using the latest stable release for my next batch of testing,
>right now I am on 0.67.4 and I will be moving towards the 0.72.x branch.
>
>As for the IOPS, it would be a total cluster IO throughput estimate based
>on an application that would be reading/writing to more than 60 rbd
>volumes.
>
>
>
>
>
>On 2014-03-15, 1:11 PM, "Wido den Hollander"  wrote:
>
>>On 03/15/2014 05:40 PM, Karol Kozubal wrote:
>>> Hi Wido,
>>>
>>> I will have some new hardware for running tests in the next two weeks
>>>or
>>> so and will report my findings once I get a chance to run some tests. I
>>> will disable writeback on the target side as I will be attempting to
>>> configure an ssd caching pool of 24 ssd's with writeback for the main
>>>pool
>>> with 360 disks with a 5 osd spinners to 1 ssd journal ratio. I will be
>>
>>How are the SSDs going to be in writeback? Is that the new caching pool
>>feature?
>>
>>> running everything through 10Gig SFP+ Ethernet interfaces with a
>>>dedicated
>>> cluster network interface, dedicated public ceph interface and a
>>>separate
>>> iscsi network also with 10 gig interfaces for the target machines.
>>>
>>
>>That seems like a good network.
>>
>>> I am ideally looking for a 20,000 to 60,000 IOPS from this system if I
>>>can
>>> get the caching pool configuration right. The application has a 30ms
>>>max
>>> latency requirement for the storage.
>>>
>>
>>20.000 to 60.000 is a big difference. But the only way you are going to
>>achieve that is by doing a lot of parellel I/O. Ceph doesn't excel in
>>single threads doing a lot of I/O.
>>
>>So if you have multiple RBD devices on which you are doing the I/O it
>>shouldn't be that much of a problem.
>>
>>Just spread out the I/O. Scale horizontal instead of vertical.
>>
>>> In my current tests I have only spinners with SAS 10K disks, 4.2ms
>>>write
>>> latency on the disks with separate journaling on SAS 15K disks with a
>>> 3.3ms write latency. With 20 OSDs and 4 Journals I am only concerned
>>>with
>>> the overall operation apply latency that I have been seeing (1-6ms idle
>>>is
>>> normal, but up to 60-170ms for a moderate workload using rbd
>>>bench-write)
>>> however I am on a network where I am bound to 1500 mtu and I will get
>>>to
>>> test jumbo frames with the next setup in addition to the ssd¹s. I
>>>suspect
>>> the overall performance will be good in the new test setup and I am
>>> curious to see what my tests will yield.
>>>
>>> Thanks for the response!
>>>
>>> Karol
>>>
>>>
>>>
>>> On 2014-03-15, 12:18 PM, "Wido den Hollander"  wrote:
>>>
 On 03/15/2014 04:11 PM, Karol Kozubal wrote:
> Hi Everyone,
>
> I am just wondering if any of you are running a ceph cluster with an
> iSCSI target front end? I know this isn¹t available out of the box,
> unfortunately in one particular use case we are looking at providing
> iSCSI access and it's a necessity. I am liking the idea of having rbd
> devices serving block level storage to the iSCSI Target servers while
> providing a unified backed for native rbd access by openstack and
> various application servers. On multiple levels this would reduce the
> complexity of our SAN environment and move us away from expensive
> proprietary solutions that don¹t scale out.
>
> If any of you have deployed any HA iSCSI Targets backed by rbd I
>would
> really appreciate your feedback and any thoughts.
>

 I haven't used it in production, but a couple of things which come to
 mind:

 - Use TGT so you can run it all in userspace backed by librbd
 - Do not use writeback caching on the targets

 You could use multipathing if you don't use writeback caching. Use
 writeback would also cause data loss/corruption in case of multiple
 targets.

 It will probably just work with TGT, but I don't know anything about
the
 performance.

> Karol
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


 --
 Wido den Hollander
 42on B.V.

 Phone: +31 (0)20 700 9902
 Skype: contact42on
 ___
 ceph-users mailing list
 ceph-users@lists.ceph.com
 http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>>
>>
>>
>>-- 
>>Wido den Hollander
>>42on B.V.
>>
>>Phon

Re: [ceph-users] RBD as backend for iSCSI SAN Targets

2014-03-15 Thread Sage Weil
On Sat, 15 Mar 2014, Karol Kozubal wrote:
> I just re-read the documentation… It looks like its a proposed feature
> that is in development. I will have to adjust my test in consequence in
> that case.
> 
> Any one out there have any ideas when this will be implemented? Or what
> the plans look like as of right now?

This will appear in 0.78, which will be out in the next week.

sage
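
(For anyone following along: wiring up a cache tier is expected to look roughly like the commands below, per the dev documentation linked earlier in the thread; the pool names are placeholders and the exact syntax may still change before the release.)

# Attach an SSD pool as a writeback cache tier in front of an existing pool.
# Pool names are examples; the commands follow the development docs and may
# differ in the released version.
ceph osd tier add rbd ssd-cache
ceph osd tier cache-mode ssd-cache writeback
ceph osd tier set-overlay rbd ssd-cache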

> 
> 
> 
> On 2014-03-15, 1:17 PM, "Karol Kozubal"  wrote:
> 
> >How are the SSDs going to be in writeback? Is that the new caching pool
> >Feature?
> >
> >I am not sure what version implemented this, but it is documented here
> >(https://ceph.com/docs/master/dev/cache-pool/).
> >I will be using the latest stable release for my next batch of testing,
> >right now I am on 0.67.4 and I will be moving towards the 0.72.x branch.
> >
> >As for the IOPS, it would be a total cluster IO throughput estimate based
> >on an application that would be reading/writing to more than 60 rbd
> >volumes.
> >
> >
> >
> >
> >
> >On 2014-03-15, 1:11 PM, "Wido den Hollander"  wrote:
> >
> >>On 03/15/2014 05:40 PM, Karol Kozubal wrote:
> >>> Hi Wido,
> >>>
> >>> I will have some new hardware for running tests in the next two weeks
> >>>or
> >>> so and will report my findings once I get a chance to run some tests. I
> >>> will disable writeback on the target side as I will be attempting to
> >>> configure an ssd caching pool of 24 ssd's with writeback for the main
> >>>pool
> >>> with 360 disks with a 5 osd spinners to 1 ssd journal ratio. I will be
> >>
> >>How are the SSDs going to be in writeback? Is that the new caching pool
> >>feature?
> >>
> >>> running everything through 10Gig SFP+ Ethernet interfaces with a
> >>>dedicated
> >>> cluster network interface, dedicated public ceph interface and a
> >>>separate
> >>> iscsi network also with 10 gig interfaces for the target machines.
> >>>
> >>
> >>That seems like a good network.
> >>
> >>> I am ideally looking for a 20,000 to 60,000 IOPS from this system if I
> >>>can
> >>> get the caching pool configuration right. The application has a 30ms
> >>>max
> >>> latency requirement for the storage.
> >>>
> >>
> >>20.000 to 60.000 is a big difference. But the only way you are going to
> >>achieve that is by doing a lot of parellel I/O. Ceph doesn't excel in
> >>single threads doing a lot of I/O.
> >>
> >>So if you have multiple RBD devices on which you are doing the I/O it
> >>shouldn't be that much of a problem.
> >>
> >>Just spread out the I/O. Scale horizontal instead of vertical.
> >>
> >>> In my current tests I have only spinners with SAS 10K disks, 4.2ms
> >>>write
> >>> latency on the disks with separate journaling on SAS 15K disks with a
> >>> 3.3ms write latency. With 20 OSDs and 4 Journals I am only concerned
> >>>with
> >>> the overall operation apply latency that I have been seeing (1-6ms idle
> >>>is
> >>> normal, but up to 60-170ms for a moderate workload using rbd
> >>>bench-write)
> >>> however I am on a network where I am bound to 1500 mtu and I will get
> >>>to
> >>> test jumbo frames with the next setup in addition to the ssd¹s. I
> >>>suspect
> >>> the overall performance will be good in the new test setup and I am
> >>> curious to see what my tests will yield.
> >>>
> >>> Thanks for the response!
> >>>
> >>> Karol
> >>>
> >>>
> >>>
> >>> On 2014-03-15, 12:18 PM, "Wido den Hollander"  wrote:
> >>>
>  On 03/15/2014 04:11 PM, Karol Kozubal wrote:
> > Hi Everyone,
> >
> > I am just wondering if any of you are running a ceph cluster with an
> > iSCSI target front end? I know this isn¹t available out of the box,
> > unfortunately in one particular use case we are looking at providing
> > iSCSI access and it's a necessity. I am liking the idea of having rbd
> > devices serving block level storage to the iSCSI Target servers while
> > providing a unified backed for native rbd access by openstack and
> > various application servers. On multiple levels this would reduce the
> > complexity of our SAN environment and move us away from expensive
> > proprietary solutions that don¹t scale out.
> >
> > If any of you have deployed any HA iSCSI Targets backed by rbd I
> >would
> > really appreciate your feedback and any thoughts.
> >
> 
>  I haven't used it in production, but a couple of things which come to
>  mind:
> 
>  - Use TGT so you can run it all in userspace backed by librbd
>  - Do not use writeback caching on the targets
> 
>  You could use multipathing if you don't use writeback caching. Use
>  writeback would also cause data loss/corruption in case of multiple
>  targets.
> 
>  It will probably just work with TGT, but I don't know anything about
> the
>  performance.
> 
> > Karol
> >
> >
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/

Re: [ceph-users] No more Journals ?

2014-03-15 Thread Karan Singh
Hello Everyone

If you look at the Ceph Day presentation delivered by Sebastien (slide 23),
http://www.slideshare.net/Inktank_Ceph/ceph-performance

it looks like Firefly has dropped support for journals. How concrete is this
news?


-Karan-


On 14 Mar 2014, at 15:35, Jake Young  wrote:

> You should take a look at this blog post:
> 
> http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/
> 
> The test results shows that using a RAID card with a write-back cache without 
> journal disks can perform better or equivalent to using journal disks with 
> XFS. 
> 
> As to whether or not it’s better to buy expensive controllers and use all of 
> your drive bays for spinning disks or cheap controllers and use some portion 
> of your bays for SSDs/Journals, there are trade-offs.  If built right, 
> systems with SSD journals provide higher large block write throughput, while 
> putting journals on the data disks provides higher storage density.  Without 
> any tuning both solutions currently provide similar IOP throughput.
> 
> Jake
> 
> 
> On Friday, March 14, 2014, Markus Goldberg  wrote:
> Sorry,
> i should have asked a little bit clearer:
> Can ceph (or OSDs) be used without journals now ?
> The Journal-Parameter seems to be optional ( because of '[...]' )
> 
> Markus
> Am 14.03.2014 12:19, schrieb John Spray:
> Journals have not gone anywhere, and ceph-deploy still supports
> specifying them with exactly the same syntax as before.
> 
> The page you're looking at is the simplified "quick start", the detail
> on osd creation including journals is here:
> http://eu.ceph.com/docs/v0.77/rados/deployment/ceph-deploy-osd/
> 
> Cheers,
> John
> 
> On Fri, Mar 14, 2014 at 9:47 AM, Markus Goldberg
>  wrote:
> Hi,
> i'm a little bit surprised. I read through the new manuals of 0.77
> (http://eu.ceph.com/docs/v0.77/start/quick-ceph-deploy/)
> In the section of creating the osd the manual says:
> 
> Then, from your admin node, use ceph-deploy to prepare the OSDs.
> 
> ceph-deploy osd prepare {ceph-node}:/path/to/directory
> 
> For example:
> 
> ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd1
> 
> Finally, activate the OSDs.
> 
> ceph-deploy osd activate {ceph-node}:/path/to/directory
> 
> For example:
> 
> ceph-deploy osd activate node2:/var/local/osd0 node3:/var/local/osd1
> 
> 
> In former versions the osd was created like:
> 
> ceph-deploy -v --overwrite-conf osd --fs-type btrfs prepare
> bd-0:/dev/sdb:/dev/sda5
> 
> ^^ Journal
> As i remember defining and creating a journal for each osd was a must.
> 
> So the question is: Are Journals obsolet now ?
> 
> --
> MfG,
>Markus Goldberg
> 
> --
> Markus Goldberg   Universität Hildesheim
>Rechenzentrum
> Tel +49 5121 88392822 Marienburger Platz 22, D-31141 Hildesheim, Germany
> Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
> --
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
> -- 
> MfG,
>   Markus Goldberg
> 
> --
> Markus Goldberg   Universität Hildesheim
>   Rechenzentrum
> Tel +49 5121 88392822 Marienburger Platz 22, D-31141 Hildesheim, Germany
> Fax +49 5121 88392823 email goldb...@uni-hildesheim.de
> --
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] No more Journals ?

2014-03-15 Thread Alexandre DERUMIER
Hi,

This is the new multi-backend object store: instead of using a filesystem
(XFS, Btrfs) you can use LevelDB, RocksDB, etc., which don't need a journal
because operations are atomic.

I think it should be released with Firefly, if I remember correctly.


On that note, can somebody tell me whether writes are the same speed for an
XFS OSD on HDD with its journal on SSD vs. a LevelDB OSD on HDD?
(A journal on SSD should improve latencies.)
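
(If anyone wants to experiment once it lands: the backend is selected per OSD in ceph.conf. The fragment below is only a sketch; "keyvaluestore-dev" is the name used in the development tree, and the option value may differ in the released version, so check the release notes.)

[osd]
# Hypothetical: select the key/value object store instead of the default
# filestore; verify the exact value against the release you are running.
osd objectstore = keyvaluestore-dev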

  

- Mail original - 

De: "Karan Singh"  
À: "Jake Young" , ceph-users@lists.ceph.com, "Sebastien Han" 
, "Markus Goldberg"  
Envoyé: Samedi 15 Mars 2014 19:07:56 
Objet: Re: [ceph-users] No more Journals ? 


Hello Everyone 


If you see ceph day presentation delivered by Sebastien ( slide number 23 ) 
http://www.slideshare.net/Inktank_Ceph/ceph-performance 


It looks like Firefly has dropped support to Journals , How concrete is this 
news ??? 




-Karan- 




On 14 Mar 2014, at 15:35, Jake Young < jak3...@gmail.com > wrote: 


You should take a look at this blog post: 


http://ceph.com/community/ceph-performance-part-2-write-throughput-without-ssd-journals/
 


The test results shows that using a RAID card with a write-back cache without 
journal disks can perform better or equivalent to using journal disks with XFS. 


As to whether or not it’s better to buy expensive controllers and use all of 
your drive bays for spinning disks or cheap controllers and use some portion of 
your bays for SSDs/Journals, there are trade-offs. If built right, systems with 
SSD journals provide higher large block write throughput, while putting 
journals on the data disks provides higher storage density. Without any tuning 
both solutions currently provide similar IOP throughput . 

Jake 



On Friday, March 14, 2014, Markus Goldberg < goldb...@uni-hildesheim.de > 
wrote: 


Sorry, 
i should have asked a little bit clearer: 
Can ceph (or OSDs) be used without journals now ? 
The Journal-Parameter seems to be optional ( because of '[...]' ) 

Markus 
Am 14.03.2014 12:19, schrieb John Spray: 


Journals have not gone anywhere, and ceph-deploy still supports 
specifying them with exactly the same syntax as before. 

The page you're looking at is the simplified "quick start", the detail 
on osd creation including journals is here: 
http://eu.ceph.com/docs/v0.77/rados/deployment/ceph-deploy-osd/ 

Cheers, 
John 

On Fri, Mar 14, 2014 at 9:47 AM, Markus Goldberg 
< goldb...@uni-hildesheim.de > wrote: 


Hi, 
i'm a little bit surprised. I read through the new manuals of 0.77 
( http://eu.ceph.com/docs/v0.77/start/quick-ceph-deploy/ ) 
In the section of creating the osd the manual says: 

Then, from your admin node, use ceph-deploy to prepare the OSDs. 

ceph-deploy osd prepare {ceph-node}:/path/to/directory 

For example: 

ceph-deploy osd prepare node2:/var/local/osd0 node3:/var/local/osd1 

Finally, activate the OSDs. 

ceph-deploy osd activate {ceph-node}:/path/to/directory 

For example: 

ceph-deploy osd activate node2:/var/local/osd0 node3:/var/local/osd1 


In former versions the osd was created like: 

ceph-deploy -v --overwrite-conf osd --fs-type btrfs prepare 
bd-0:/dev/sdb:/dev/sda5 

^^ Journal 
As i remember defining and creating a journal for each osd was a must. 

So the question is: Are Journals obsolet now ? 

-- 
MfG, 
Markus Goldberg 

-- -- -- 
Markus Goldberg Universität Hildesheim 
Rechenzentrum 
Tel +49 5121 88392822 Marienburger Platz 22, D-31141 Hildesheim, Germany 
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de 
-- -- -- 


__ _ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/ listinfo.cgi/ceph-users-ceph. com 







-- 
MfG, 
Markus Goldberg 

-- -- -- 
Markus Goldberg Universität Hildesheim 
Rechenzentrum 
Tel +49 5121 88392822 Marienburger Platz 22, D-31141 Hildesheim, Germany 
Fax +49 5121 88392823 email goldb...@uni-hildesheim.de 
-- -- -- 


__ _ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/ listinfo.cgi/ceph-users-ceph. com 


___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 




___ 
ceph-users mailing list 
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] AWS SDK and multipart upload

2014-03-15 Thread Neil Soman
Just FYI for any who might be using the AWS Java SDK with rgw.

There is a bug in older versions of the AWS SDK in the
CompleteMultipartUpload call: the ETag that is sent in the manifest is not
formatted correctly, which causes rgw to return a 400.

e.g.
T 192.168.1.16:46532 -> 192.168.1.51:80 [AP]
POST
/ed19074e-ed9a-488e-96e0-0d29b3704717/5bc1961c-215a-449e-9df4-c80b50ecfe64-multi-1394917767?uploadId=huS2ydwnDmOJ0vqu_CDCRJIIxvQuXfe
HTTP/1.1.
Host: 192.168.1.51.
Authorization: AWS foo:bar.
Date: Sat, 15 Mar 2014 21:09:28 GMT.
User-Agent: aws-sdk-java/1.5.0 Linux/2.6.32-431.5.1.el6.x86_64
OpenJDK_64-Bit_Server_VM/24.45-b08/1.7.0_51.
Content-Type: text/plain.
Content-Length: 147.
Connection: Keep-Alive.
.
1"e;cd3573ccd5891f07fcd519881cc74738"e;


The ""e;" above should be """

I moved to version 1.7.1 of the AWS SDK and multipart worked just fine.

If you already knew about it, please ignore :)
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com