[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread sathvik vutukuri
Hi All,

Any update on this from anyone?

On Tue, Jul 28, 2020 at 4:00 PM sathvik vutukuri <7vik.sath...@gmail.com>
wrote:

> Hi All,
>
> radosgw-admin is configured via ceph-deploy, and I created a few buckets from
> the Ceph dashboard, but when I try to create a new bucket through the AWS Java
> S3 SDK I am facing the issue below:
>
> Exception in thread "main" com.amazonaws.SdkClientException: Unable to
> execute HTTP request: firstbucket.rgwhost
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062)
> at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
> at com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:394)
> at com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:5950)
> at com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1812)
> at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1772)
> at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1710)
> at org.S3.App.main(App.java:71)
> Caused by: java.net.UnknownHostException: firstbucket.rgwhost
> at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
> at java.net.InetAddress.getAllByName(InetAddress.java:1193)
> at java.net.InetAddress.getAllByName(InetAddress.java:1127)
> at com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
> at com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
> at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
> at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
> at com.amazonaws.http.conn.$Proxy3.connect(Unknown Source)
> at org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
> at org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
> at org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
> at org.apache.http.impl.client.InternalHttpClient.doExecute(InternalHttpClient.java:185)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:83)
> at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:56)
> at com.amazonaws.http.apache.client.impl.SdkHttpClient.execute(SdkHttpClient.java:72)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1330)
> at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145)
> ... 15 more
>
>
>
>
>
>
> --
> Thanks,
> Vutukuri Sathvik,
> 8197748291.
>


-- 
Thanks,
Vutukuri Sathvik,
8197748291.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Usable space vs. Overhead

2020-07-29 Thread Janne Johansson
Den ons 29 juli 2020 kl 03:17 skrev David Orman :

> That's what the formula on the ceph link arrives at, a 2/3 or 66.66%
> overhead. But if a 4 byte object is split into 4x1 byte chunks data (4
> bytes total) + 2x 1 byte chunks parity (2 bytes total), you arrive at 6
> bytes, which is 50% more than 4 bytes. So 50% overhead, vs. 33.33% overhead
> as the other formula arrives at. I'm curious what I'm missing.
>
>
Are you sure you are not just mixing up overhead with usable %?

50% overhead means you write 4 bytes and get 2 bytes "extra", for a total of 6.
In this case 4 out of 6 is 66.67% usable space, i.e. two thirds.

So if the formula says you will get 66% usable, it means you get two thirds of
your drives as usable space with EC 4+2. Equally, you can say that the data is
100% and the overhead is 50% of that; you just need to know which of the two
figures you want to calculate.

Either "how large is the growth of the data I put in",
OR "how much of the stored data is my original bytes and how much, in percent,
is the coding (parity) chunks".

For 4+2 the growth is 50%, since you add two chunks (50% of four) to the 4
original ones. For a six-drive setting, two drives go to coding chunks, so you
only get 66% usable if you fill that cluster up. The space allocated to coding
chunks (33%) is "50% of 66%", so the overhead is still 50% no matter how you
calculate it.
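
A quick way to see both figures at once (a small Python sketch; k and m are just
the EC profile values discussed above):

# EC space math for a k+m profile, here 4+2 as in this thread.
k, m = 4, 2

overhead_vs_data = m / k            # 0.50   -> growth of the data you put in
usable_fraction  = k / (k + m)      # 0.6667 -> usable share of raw capacity
parity_fraction  = m / (k + m)      # 0.3333 -> raw capacity holding coding chunks

print(f"overhead vs. data : {overhead_vs_data:.0%}")
print(f"usable of raw     : {usable_fraction:.2%}")
print(f"parity of raw     : {parity_fraction:.2%}")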

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread Zhenshi Zhou
It's maybe a dns issue, I guess.
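
The stack trace shows the SDK resolving "firstbucket.rgwhost", i.e. the bucket
name is being prepended to the endpoint host (virtual-hosted-style addressing).
A quick check from Python tells you whether that record exists (host names below
are taken from the trace; adjust them to your endpoint):

import socket

for host in ("rgwhost", "firstbucket.rgwhost"):
    try:
        print(host, "->", socket.gethostbyname(host))
    except socket.gaierror as err:
        # Python equivalent of the Java UnknownHostException; it usually means
        # the wildcard DNS record for *.rgwhost is missing.
        print(host, "-> unresolvable:", err)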

sathvik vutukuri <7vik.sath...@gmail.com> wrote on Wednesday, July 29, 2020 at 3:21 PM:

> Hi All,
>
> Any update in this from any one?
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread Chris Palmer
This works for me (the code switches between AWS and RGW according to
whether s3Endpoint is set). You need the pathStyleAccess call unless you have a
wildcard DNS record for the RGW host, so that bucket.hostname resolves.


// s3Region and s3Profile are defined elsewhere in my code; set s3Endpoint to
// null to talk to AWS instead of RGW.
// Imports: com.amazonaws.auth.profile.ProfileCredentialsProvider,
//          com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration,
//          com.amazonaws.services.s3.AmazonS3,
//          com.amazonaws.services.s3.AmazonS3ClientBuilder

String s3Endpoint = "http://my.host:80";

AmazonS3ClientBuilder s3b = AmazonS3ClientBuilder.standard();

if (s3Endpoint == null) {
    s3b.setRegion(s3Region);
} else {
    // Custom endpoint (RGW): use path-style access so the SDK does not
    // prepend the bucket name to the host name.
    s3b.setEndpointConfiguration(new EndpointConfiguration(s3Endpoint, s3Region));
    s3b.enablePathStyleAccess();
}

if (s3Profile != null) {
    s3b.setCredentials(new ProfileCredentialsProvider(s3Profile));
}

AmazonS3 s3 = s3b.build();
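
For reference, the same path-style fix can be sanity-checked outside the Java
app with Python/boto3 (a sketch only; endpoint, credentials, region and bucket
name below are placeholders):

import boto3
from botocore.client import Config

s3 = boto3.client(
    "s3",
    endpoint_url="http://rgwhost:80",             # your RGW endpoint
    aws_access_key_id="ACCESS_KEY",
    aws_secret_access_key="SECRET_KEY",
    region_name="default",                        # RGW mostly ignores this
    # Path-style addressing avoids the bucket.hostname DNS lookup that
    # caused the UnknownHostException in the trace above.
    config=Config(s3={"addressing_style": "path"}),
)

s3.create_bucket(Bucket="firstbucket")
print(s3.list_buckets()["Buckets"])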



On 29/07/2020 08:19, sathvik vutukuri wrote:

Hi All,

Any update in this from any one?






___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread sathvik vutukuri
Thanks, I'll check it out.

On Wed, 29 Jul 2020, 13:35 Chris Palmer, 
wrote:

> This works for me (the code switches between AWS and RGW according to
> whether s3Endpoint is set). You need the pathStyleAccess unless you have
> wildcard DNS names etc.
>
>             String s3Endpoint = "http://my.host:80";
>
> AmazonS3ClientBuilder s3b = AmazonS3ClientBuilder.standard ();
>
> if (s3Endpoint == null) {
>
> s3b.setRegion (s3Region);
>
> } else {
>
> s3b.setEndpointConfiguration (new EndpointConfiguration 
> (s3Endpoint, s3Region));
>
> s3b.enablePathStyleAccess ();
>
> }
>
> if (s3Profile != null) s3b.setCredentials (new 
> ProfileCredentialsProvider (s3Profile));
>
> AmazonS3 s3 = s3b.build ();
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

[ceph-users] Setting rbd_default_data_pool through the config store

2020-07-29 Thread Wido den Hollander

Hi,

I'm trying to have clients read the 'rbd_default_data_pool' config 
option from the config store when creating a RBD image.


This doesn't seem to work and I'm wondering if somebody knows why.

I tried:

$ ceph config set client rbd_default_data_pool rbd-data
$ ceph config set global rbd_default_data_pool rbd-data

They both show up under:

$ ceph config dump

However, newly created RBD images with the 'rbd' CLI tool do not use the 
data pool.


If I set this in ceph.conf it works:

[client]
rbd_default_data_pool = rbd-data

Somehow librbd isn't fetching these configuration options. Any hints on 
how to get this working?


The end result is that libvirt (which doesn't read ceph.conf) should 
also be able to create RBD images with a different data pool.


Wido
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Usable space vs. Overhead

2020-07-29 Thread Benoît Knecht
Aren't you just looking at the same thing from two different perspectives?

In one case you say: I have 100% of useful data, and I need to add 50% of 
parity for a total of 150% raw data.

In the other, you say: Out of 100% of raw data, 2/3 is useful data, 1/3 is 
parity, which gives you your 33.3% overhead.

But it's the exact same thing, it just depends on whether you consider your 
overhead as a percentage of total (raw) data, or as a percentage of useful data.

--
Ben

‐‐‐ Original Message ‐‐‐
On Tuesday, July 28, 2020 10:32 PM, David Orman  wrote:

> I'm having a hard time understanding the EC usable space vs. raw.
>
> https://ceph.io/geen-categorie/ceph-erasure-coding-overhead-in-a-nutshell/
> indicates "nOSD * k / (k+m) * OSD Size" is how you calculate usable space,
> but that's not lining up with what i'd expect just from k data chunks + m
> parity chunks.
>
> So, for example, k=4, m=2. you'd expect every 4 byte object written would
> consume 6 bytes, so 50% overhead. however, the prior formula in a 7 server
> cluster, using 4+2 encoding, would indicate 66.67% usable capacity vs. raw
> storage.
>
> What am I missing here?
>
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-29 Thread Igor Fedotov

Hi Frank,

you might want to proceed with perf counters' dump analysis in the 
following way:


For 2-3 arbitrary osds

- save current perf counter dump

- reset perf counters

- leave OSD under the regular load for a while.

- dump perf counters again

- share both saved and new dumps and/or check stats on 'big' writes vs. 'small' ones (a minimal comparison sketch follows below).
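
Something like this for the comparison step (a minimal sketch; it assumes you
saved the output of "ceph daemon osd.N perf dump" to JSON files before and
after, and the file names are placeholders):

#!/usr/bin/env python3
# Put two saved perf dumps side by side. If the counters were reset between
# the two dumps, the "new" column is the activity since the reset; for the
# gauges (bluestore_allocated / bluestore_stored) the difference column shows
# the growth.
import json
import sys

def bluestore(path):
    with open(path) as f:
        return json.load(f)["bluestore"]

old, new = bluestore(sys.argv[1]), bluestore(sys.argv[2])

counters = ["bluestore_write_big", "bluestore_write_big_bytes",
            "bluestore_write_small", "bluestore_write_small_bytes",
            "bluestore_allocated", "bluestore_stored"]

print(f"{'counter':32s} {'old':>16s} {'new':>16s} {'new-old':>16s}")
for c in counters:
    print(f"{c:32s} {old[c]:>16d} {new[c]:>16d} {new[c] - old[c]:>16d}")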



Thanks,

Igor

On 7/29/2020 2:49 PM, Frank Schilder wrote:


Dear Igor,

please find below data from "ceph osd df tree" and per-OSD bluestore stats 
pasted together with the script for extraction for reference. We have now:

df USED: 142 TB
bluestore_stored: 190.9TB (142*8/6 = 189, so matches)
bluestore_allocated: 275.2TB
osd df tree USE: 276.1 (so matches with bluestore_allocated as well)

The situation has gotten worse: the mismatch of raw used to stored is now 85TB.
Compression is almost irrelevant. This matches my earlier report with data taken
from "ceph osd df tree" alone. Compared with my previous report, what I seem to
see is that a sequential write of 22TB (user data) causes an excess of 16TB (raw).
This does not make sense and is not explained by the partial overwrite
amplification you referred me to.

The real question I still have is how can I find out how much of the excess 
usage is attributed to the issue you pointed me to, and how much might be due 
to something else. I would probably need a way to find objects that are 
affected by partial overwrite amplification and account for their total to see 
how much of the excess they explain. Ideally allowing me to identify the RBD 
images responsible.

I do *not* believe that *all* this extra usage is due to the partial overwrite 
amplification. We do not have the use case simulated with the subsequent dd 
commands in your post 
https://lists.ceph.io/hyperkitty/list/d...@ceph.io/thread/OHPO43J54TPBEUISYCK3SRV55SIZX2AT/,
 overwriting old data with an offset. On these images, we store very large 
files (15GB) that are written *only* *once* and not modified again. We 
currently do nothing else but sequential writes to a file system.

The only objects that might see a partial overwrite could be at the tail of 
such a file, when the beginning of a new file is written to an object that 
already holds a tail, and potentially objects holding file system meta data. 
With an RBD object size of 4M, this amounts to a comparably small number of 
objects that almost certainly cannot explain the observed 44% excess even 
assuming worst case amplification.

The data:

NAME                  ID  USED     %USED  MAX AVAIL  OBJECTS
sr-rbd-data-one-hdd   11  142 TiB  71.12  58 TiB     37415413

osd df tree            blue stats
  ID   SIZE    USE    alloc   store
  84    8.9    6.2      6.1     4.3
 145    8.9    5.6      5.5     3.7
 156    8.9    6.3      6.2     4.2
 168    8.9    6.1      6.0     4.1
 181    8.9    6.6      6.6     4.4
  74    8.9    5.2      5.2     3.7
 144    8.9    5.9      5.9     4.0
 157    8.9    6.6      6.5     4.5
 169    8.9    6.4      6.3     4.4
 180    8.9    6.6      6.6     4.5
  60    8.9    5.7      5.6     4.0
 146    8.9    5.9      5.8     4.0
 158    8.9    6.7      6.7     4.6
 170    8.9    6.5      6.5     4.4
 182    8.9    5.8      5.7     4.0
  63    8.9    5.8      5.8     4.1
 148    8.9    6.5      6.4     4.4
 159    8.9    4.9      4.9     3.3
 172    8.9    6.4      6.3     4.4
 183    8.9    6.5      6.4     4.4
 229    8.9    5.6      5.6     3.8
 232    8.9    6.3      6.2     4.3
 235    8.9    5.0      4.9     3.3
 238    8.9    6.6      6.5     4.4
 259     11    7.5      7.4     5.1
 231    8.9    6.2      6.1     4.2
 233    8.9    6.7      6.6     4.5
 236    8.9    6.3      6.2     4.2
 239    8.9    5.2      5.1     3.5
 263     11    6.5      6.5     4.4
 228    8.9    6.3      6.3     4.3
 230    8.9    6.0      5.9     4.0
 234    8.9    6.5      6.4     4.4
 237    8.9    6.0      5.9     4.1
 260     11    6.6      6.5     4.5
   0    8.9    6.3      6.3     4.3
   2    8.9    6.4      6.4     4.5
  72    8.9    5.4      5.4     3.7
  76    8.9    6.2      6.1     4.3
  86    8.9    5.6      5.5     3.9
   1    8.9    6.0      5.9     4.1
   3    8.9    5.7      5.7     4.0
  73    8.9    6.1      6.0     4.3
  85    8.9    6.8      6.7     4.6
  87    8.9    6.1      6.1     4.3
 SUM  406.8  276.1    275.2   190.9

The script:

#!/bin/bash

format_TB() {
    tmp=$(($1/1024))
    echo "${tmp}.$(( (10*($1-tmp*1024))/1024 ))"
}

blue_stats() {
    al_tot=0
    st_tot=0
    printf "%12s\n" "blue stats"
    printf "%5s  %5s\n" "alloc" "store"
    for o in "$@" ; do
        host_ip="$(ceph osd find "$o" | jq -r '.ip' | cut -d ":" -f1)"
        bs_data="$(ssh "$host_ip" ceph daemon "osd.$o" perf dump | jq '.bluestore')"
        bs_alloc=$(( $(echo "$bs_data" | jq '.bluestore_allocated') /1024/1024/1024 ))
        al_tot=$(( $al_tot+$bs_alloc ))
        bs_store=$(( $(echo "$bs_data" | jq '.bluestore_stored') /1024/1024/1024 ))
        st_tot=$(( $st_tot+$bs_store ))
        p

[ceph-users] High io wait when osd rocksdb is compacting

2020-07-29 Thread Raffael Bachmann

Hi All,

I'm kind of crossposting this from here: 
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/
But since I'm more and more sure that it's a ceph problem I'll try my 
luck here.


Since updating from Luminous to Nautilus I have a big problem.

I have a 3 node cluster. Each node has 2 NVMe SSDs and a 10GBASE-T network
for Ceph.
Every few minutes an OSD seems to compact its RocksDB. While doing this
it uses a lot of I/O and blocks.
This basically stalls the whole cluster and no VM/container can read
data for some seconds (or minutes).


While it happens "iostat -x" looks like this:

Device   r/s   w/s      rkB/s  wkB/s    rrqm/s  wrqm/s   %rrqm  %wrqm  r_await  w_await  aqu-sz  rareq-sz  wareq-sz  svctm  %util
nvme0n1  0.00  2.00     0.00   24.00    0.00    46.00    0.00   95.83  0.00     0.00     0.00    0.00      12.00     2.00   0.40
nvme1n1  0.00  1495.00  0.00   3924.00  0.00    6099.00  0.00   80.31  0.00     352.39   523.78  0.00      2.62      0.67   100.00

And iotop:

Total DISK READ:   0.00 B/s | Total DISK WRITE:   1573.47 K/s
Current DISK READ: 0.00 B/s | Current DISK WRITE: 3.43 M/s
 TID   PRIO  USER  DISK READ  DISK WRITE   SWAPIN  IO>      COMMAND
 2306  be/4  ceph  0.00 B/s   1533.22 K/s  0.00 %  99.99 %  ceph-osd -f --cluster ceph --id 3 --setuser ceph --setgroup ceph [rocksdb:low1]


In the ceph-osd log I see that rocksdb is compacting. 
https://gist.github.com/qwasli/3bd0c7d535ee462feff8aaee618f3e08


The pool and one OSD are nearfull. I'd planned to move some data away to
another Ceph pool, but now I'm not sure anymore if I should stay with Ceph.
I'll move some data away anyway today to see if that helps, but before the
upgrade there was the same amount of data and I didn't have this problem.


Any hints to solve this are appreciated.

Cheers
Raffael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Jason Dillaman
On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  wrote:
>
> Hi,
>
> I'm trying to have clients read the 'rbd_default_data_pool' config
> option from the config store when creating a RBD image.
>
> This doesn't seem to work and I'm wondering if somebody knows why.

It looks like all string-based config overrides for RBD are ignored:

2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
set rbd_default_data_pool = rbd-data: Configuration option
'rbd_default_data_pool' may not be modified at runtime

librbd always accesses the config options in a thread-safe manner, so
I'll open a tracker ticket to flag all the RBD string config options
as runtime updatable (primitive data type options are implicitly
runtime updatable).

> I tried:
>
> $ ceph config set client rbd_default_data_pool rbd-data
> $ ceph config set global rbd_default_data_pool rbd-data
>
> They both show up under:
>
> $ ceph config dump
>
> However, newly created RBD images with the 'rbd' CLI tool do not use the
> data pool.
>
> If I set this in ceph.conf it works:
>
> [client]
> rbd_default_data_pool = rbd-data
>
> Somehow librbd isn't fetching these configuration options. Any hints on
> how to get this working?
>
> The end result is that libvirt (which doesn't read ceph.conf) should
> also be able to create RBD images with a different data pool.
>
> Wido
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Wido den Hollander




On 29/07/2020 14:54, Jason Dillaman wrote:

On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  wrote:


Hi,

I'm trying to have clients read the 'rbd_default_data_pool' config
option from the config store when creating a RBD image.

This doesn't seem to work and I'm wondering if somebody knows why.


It looks like all string-based config overrides for RBD are ignored:

2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
set rbd_default_data_pool = rbd-data: Configuration option
'rbd_default_data_pool' may not be modified at runtime

librbd always accesses the config options in a thread-safe manner, so
I'll open a tracker ticket to flag all the RBD string config options
are runtime updatable (primitive data type options are implicitly
runtime updatable).


I wasn't updating it at runtime; I just wanted to make sure that I don't
have to set this in ceph.conf everywhere (and libvirt doesn't read
ceph.conf).


But it seems that Python works:

#!/usr/bin/python3

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')

rbd_inst = rbd.RBD()
size = 4 * 1024**3  # 4 GiB
rbd_inst.create(ioctx, 'myimage', size)

ioctx.close()
cluster.shutdown()


And then:

$ ceph config set client rbd_default_data_pool rbd-data

rbd image 'myimage':
size 4 GiB in 1024 objects
order 22 (4 MiB objects)
snapshot_count: 0
id: 1aa963a21028
data_pool: rbd-data
block_name_prefix: rbd_data.2.1aa963a21028
format: 2
	features: layering, exclusive-lock, object-map, fast-diff, deep-flatten, data-pool
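
For completeness, the option can also be forced from the client side without
relying on the MON config store at all; a small variation on the script above
(not tested beyond the example here: conf_set simply overrides the option on
this client's context, and the image name is just an example):

#!/usr/bin/python3

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
# Set the option locally before connecting, with the same effect as putting it
# in the [client] section of ceph.conf for this one process.
cluster.conf_set('rbd_default_data_pool', 'rbd-data')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')

rbd.RBD().create(ioctx, 'myimage2', 4 * 1024**3)

ioctx.close()
cluster.shutdown()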



I haven't tested this through libvirt yet. That's the next thing to test.

Wido




I tried:

$ ceph config set client rbd_default_data_pool rbd-data
$ ceph config set global rbd_default_data_pool rbd-data

They both show up under:

$ ceph config dump

However, newly created RBD images with the 'rbd' CLI tool do not use the
data pool.

If I set this in ceph.conf it works:

[client]
rbd_default_data_pool = rbd-data

Somehow librbd isn't fetching these configuration options. Any hints on
how to get this working?

The end result is that libvirt (which doesn't read ceph.conf) should
also be able to create RBD images with a different data pool.

Wido
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Wido den Hollander



On 29/07/2020 14:52, Raffael Bachmann wrote:

Hi All,

I'm kind of crossposting this from here: 
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/ 

But since I'm more and more sure that it's a ceph problem I'll try my 
luck here.


Since updating from Luminous to Nautilus I have a big problem.

I have a 3 node cluster. Each cluster has 2 nvme ssd and a 10GBASE-T net 
for ceph.
Every few minutes a osd seems to compact the rocksdb. While doing this 
it uses alot of I/O and blocks.
This basically blocks the whole cluster and no VM/Container can read 
data for some seconds (minutes).



In the ceph-osd log I see that rocksdb is compacting. 
https://gist.github.com/qwasli/3bd0c7d535ee462feff8aaee618f3e08


The pool and one OSD is nearfull. I'd planed to move some data away to 
another ceph pool. But now I'm not sure anymore if I should go with ceph.
I'l move some data away anyway today to see if that helps, but before 
the upgrade there was the same amount of data an I haven't had a problem.


Any hints to solve this are appreciated.


What model/type of NVMe is this?

And on a nearfull cluster these problems can arise; it's usually not a
good idea to have OSDs be nearfull.


What does 'ceph df' tell you?

Wido



Cheers
Raffael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Jason Dillaman
On Wed, Jul 29, 2020 at 9:03 AM Wido den Hollander  wrote:
>
>
>
> On 29/07/2020 14:54, Jason Dillaman wrote:
> > On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  wrote:
> >>
> >> Hi,
> >>
> >> I'm trying to have clients read the 'rbd_default_data_pool' config
> >> option from the config store when creating a RBD image.
> >>
> >> This doesn't seem to work and I'm wondering if somebody knows why.
> >
> > It looks like all string-based config overrides for RBD are ignored:
> >
> > 2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
> > set rbd_default_data_pool = rbd-data: Configuration option
> > 'rbd_default_data_pool' may not be modified at runtime
> >
> > librbd always accesses the config options in a thread-safe manner, so
> > I'll open a tracker ticket to flag all the RBD string config options
> > are runtime updatable (primitive data type options are implicitly
> > runtime updatable).
>
> I wasn't updating it at runtime, I just wanted to make sure that I don't
> have to set this in ceph.conf everywhere (and libvirt doesn't read
> ceph.conf)

You weren't updating it at runtime -- the MON's "MConfig" message back
to the client was attempting to set the config option after "rbd" had
already started. However, if it's working under python, perhaps there
is an easy tweak for "rbd" to have it delay flagging the application
as having started until after it has connected to the cluster. Right
now it manages its own CephContext lifetime which it re-uses when
creating a librados connection. It's that CephContext that is flagged
as "running" prior to librados actually connecting to the cluster.

> But it seems that Python works:
>
> #!/usr/bin/python3
>
> import rados
> import rbd
>
> cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
> cluster.connect()
> ioctx = cluster.open_ioctx('rbd')
>
> rbd_inst = rbd.RBD()
> size = 4 * 1024**3  # 4 GiB
> rbd_inst.create(ioctx, 'myimage', size)
>
> ioctx.close()
> cluster.shutdown()
>
>
> And then:
>
> $ ceph config set client rbd_default_data_pool rbd-data
>
> rbd image 'myimage':
> size 4 GiB in 1024 objects
> order 22 (4 MiB objects)
> snapshot_count: 0
> id: 1aa963a21028
> data_pool: rbd-data
> block_name_prefix: rbd_data.2.1aa963a21028
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten, data-pool
>
>
> I haven't tested this through libvirt yet. That's the next thing to test.
>
> Wido
>
> >
> >> I tried:
> >>
> >> $ ceph config set client rbd_default_data_pool rbd-data
> >> $ ceph config set global rbd_default_data_pool rbd-data
> >>
> >> They both show up under:
> >>
> >> $ ceph config dump
> >>
> >> However, newly created RBD images with the 'rbd' CLI tool do not use the
> >> data pool.
> >>
> >> If I set this in ceph.conf it works:
> >>
> >> [client]
> >> rbd_default_data_pool = rbd-data
> >>
> >> Somehow librbd isn't fetching these configuration options. Any hints on
> >> how to get this working?
> >>
> >> The end result is that libvirt (which doesn't read ceph.conf) should
> >> also be able to create RBD images with a different data pool.
> >>
> >> Wido
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
> >>
> >
> >
>


-- 
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Raffael Bachmann


Hi Wido

Thanks for the quick answer. They are all Intel p3520 
https://ark.intel.com/content/www/us/en/ark/products/88727/intel-ssd-dc-p3520-series-2-0tb-2-5in-pcie-3-0-x4-3d1-mlc.html

And this is ceph df
RAW STORAGE:
    CLASS  SIZE    AVAIL    USED     RAW USED  %RAW USED
    nvme   11 TiB  2.3 TiB  8.6 TiB  8.7 TiB   79.28
    TOTAL  11 TiB  2.3 TiB  8.6 TiB  8.7 TiB   79.28

POOLS:
    POOL  ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
    ceph   8  2.9 TiB  769.41k  8.6 TiB  89.15  359 GiB

Cheers
Raffael

On 29/07/2020 15:04, Wido den Hollander wrote:



What model/type of NVMe is this?

And on a nearfull cluster these problems can arise, it's usually not a 
good idea to have OSDs be nearfull.


What does 'ceph df' tell you?

Wido





___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Mark Nelson

Hi Raffael,


Adam made a PR this year that shards rocksdb data across different 
column families to help reduce compaction overhead.  The goal is to 
reduce write-amplification during compaction by storing multiple small 
LSM hierarchies rather than 1 big one.  We've seen evidence that this 
lowers compaction time and overhead, sometimes significantly.  That PR 
was merged to master on April 26th so I don't believe it's in any of the 
releases yet but you can test it if you have a non-production cluster 
available.  That PR is here:



https://github.com/ceph/ceph/pull/34006


Normally though you should have about 1GB of WAL to absorb writes during 
compaction and rocksdb automatically slows writes down if the buffers 
start filling up.  You should only see a write stall from compaction if 
you completely fill all of the buffers.  Also, you shouldn't see 
compaction at one level blocking IO to the entire database.  Something 
seems off to me here.


If you have OSD logs, you can see a history of the compaction events by 
running this script:


https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py


That can give you an idea of how long your compaction events are lasting 
and what they are doing.



Mark


On 7/29/20 7:52 AM, Raffael Bachmann wrote:

Hi All,

I'm kind of crossposting this from here: 
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/
But since I'm more and more sure that it's a ceph problem I'll try my 
luck here.


Since updating from Luminous to Nautilus I have a big problem.

I have a 3 node cluster. Each cluster has 2 nvme ssd and a 10GBASE-T 
net for ceph.
Every few minutes a osd seems to compact the rocksdb. While doing this 
it uses alot of I/O and blocks.
This basically blocks the whole cluster and no VM/Container can read 
data for some seconds (minutes).




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Raffael Bachmann

Hi Mark

Unfortunately it is the production cluster and I don't have another one :-(

This is the output of the log parser. I have nothing to compare it to;
stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset  compaction_time_seconds  output_level  num_output_files  total_output_size  num_input_records  num_output_records  output (MB/s)  input (r/s)    output (r/s)   output/input ratio
417.204       70.247058                1             5                 261853019          1476689            138                 3.55491754393  21021.3643396  19708.2132607  0.937532547476
546.271       128.652685               2             7                 473883973          1674393            1098908             3.51279861751  13014.8313655  8541.66393807  0.656302313734
5761.795      60.460736                1             4                 211033833          1041408            1013909             3.32873133441  17224.5339521  16769.7098494  0.973594402962
14912.985     64.958415                1             4                 231336608          1316575            1249120             3.3963233477   20267.9668215  19229.5332329  0.948764787422
15152.316     238.925764               2             14                944635417          2445094            1902084             3.77052068592  10233.6975262  7960.98322825  0.77791855855
24607.857     53.022134                1             4                 188414045          1029179            988116              3.38887973778  19410.36549    18635.915333   0.960101206884
31259.993     55.442826                1             4                 210856392          1296725            1221474             3.62694941814  23388.5083708  22031.2362865  0.941968420444
31574.193     313.736584               2             18                1213247010         2928742            2359960             3.68794259867  9335.03502416  7522.10650703  0.805793067467
37708.375     49.78089                 1             3                 171888381          974097             939847              3.29294101107  19567.6895291  18879.6745096  0.96483923059
43219.745     51.798215                1             4                 193360867          1246101            1172257             3.5600318014   24056.8328465  22631.2238752  0.940739956071
48041.751     56.559014                1             4                 208216413          1451105            1367052             3.5108576209   25656.4762604  24170.3647804  0.942076555453
48368.403     325.833185               2             19                1289359869         3196156            2489088             3.77380036251  9809.17889011  7639.1482347   0.778775504074
52693.952     45.057464                1             3                 164730093          943326             907000              3.48663339848  20936.0651101  20129.8501842  0.961491573433


cheers
Raffael


On 29/07/2020 15:19, Mark Nelson wrote:

Hi Raffael,


Adam made a PR this year that shards rocksdb data across different 
column families to help reduce compaction overhead.  The goal is to 
reduce write-amplification during compaction by storing multiple small 
LSM hierarchies rather than 1 big one.  We've seen evidence that this 
lowers compaction time and overhead, sometimes significantly.  That PR 
was merged to master on April 26th so I don't believe it's in any of 
the releases yet but you can test it if you have a non-production 
cluster available.  That PR is here:



https://github.com/ceph/ceph/pull/34006


Normally though you should have about 1GB of WAL to absorb writes 
during compaction and rocksdb automatically slows writes down if the 
buffers start filling up.  You should only see a write stall from 
compaction if you completely fill all of the buffers.  Also, you 
shouldn't see compaction at one level blocking IO to the entire 
database.  Something seems off to me here.


If you have OSD logs, you can see a history of the compaction events 
by running this script:


https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py


That can give you an idea of how long your compaction events are 
lasting and what they are doing.



Mark


On 7/29/20 7:52 AM, Raffael Bachmann wrote:

Hi All,

I'm kind of crossposting this from here: 
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/
But since I'm more and more sure that it's a ceph problem I'll try my 
luck here.


Since updating from Luminous to Nautilus I have a big problem.

I have a 3 node cluster. Each cluster has 2 nvme ssd and a 10GBASE-T 
net for ceph.
Every few minutes a osd seems to compact the rocksdb. While doing 
this it uses alot of I/O and blocks.
This basically blocks the whole cluster and no VM/Container can read 
data for some seconds (minutes).


While it happens "iostat -x" looks like this:

Device    r/s w/s rkB/s wkB/s   rrqm/s wrqm/s  
%rrqm  %wrqm r_await w_await aqu-sz rareq-sz wareq-sz svctm  %util
nvme0n1  0.00    2.00  0.00 24.00 0.00 46.00   
0.00  95.83    0



[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Jason Dillaman
On Wed, Jul 29, 2020 at 9:07 AM Jason Dillaman  wrote:
>
> On Wed, Jul 29, 2020 at 9:03 AM Wido den Hollander  wrote:
> >
> >
> >
> > On 29/07/2020 14:54, Jason Dillaman wrote:
> > > On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  wrote:
> > >>
> > >> Hi,
> > >>
> > >> I'm trying to have clients read the 'rbd_default_data_pool' config
> > >> option from the config store when creating a RBD image.
> > >>
> > >> This doesn't seem to work and I'm wondering if somebody knows why.
> > >
> > > It looks like all string-based config overrides for RBD are ignored:
> > >
> > > 2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
> > > set rbd_default_data_pool = rbd-data: Configuration option
> > > 'rbd_default_data_pool' may not be modified at runtime
> > >
> > > librbd always accesses the config options in a thread-safe manner, so
> > > I'll open a tracker ticket to flag all the RBD string config options
> > > are runtime updatable (primitive data type options are implicitly
> > > runtime updatable).
> >
> > I wasn't updating it at runtime, I just wanted to make sure that I don't
> > have to set this in ceph.conf everywhere (and libvirt doesn't read
> > ceph.conf)
>
> You weren't updating it at runtime -- the MON's "MConfig" message back
> to the client was attempting to set the config option after "rbd" had
> already started. However, if it's working under python, perhaps there
> is an easy tweak for "rbd" to have it delay flagging the application
> as having started until after it has connected to the cluster. Right
> now it manages its own CephContext lifetime which it re-uses when
> creating a librados connection. It's that CephContext that is flagged
> as "running" prior to librados actually connecting to the cluster.

It looks like this is caused by two issues:

-- In [1], this will prevent librados from applying any MON config
overrides (for strings). This line can just be trivially removed.

-- Fixing that, there is a race in librados / MonClient [2] where it
attempts to first pull the config from the MONs, but it uses a
separate thread to actually apply the received config values, which
can race w/ the completion of the bootstrap occurring in the main
thread. This means that the example below may work sometimes -- and
may fail other times.

> > But it seems that Python works:
> >
> > #!/usr/bin/python3
> >
> > import rados
> > import rbd
> >
> > cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
> > cluster.connect()
> > ioctx = cluster.open_ioctx('rbd')
> >
> > rbd_inst = rbd.RBD()
> > size = 4 * 1024**3  # 4 GiB
> > rbd_inst.create(ioctx, 'myimage', size)
> >
> > ioctx.close()
> > cluster.shutdown()
> >
> >
> > And then:
> >
> > $ ceph config set client rbd_default_data_pool rbd-data
> >
> > rbd image 'myimage':
> > size 4 GiB in 1024 objects
> > order 22 (4 MiB objects)
> > snapshot_count: 0
> > id: 1aa963a21028
> > data_pool: rbd-data
> > block_name_prefix: rbd_data.2.1aa963a21028
> > format: 2
> > features: layering, exclusive-lock, object-map, fast-diff,
> > deep-flatten, data-pool
> >
> >
> > I haven't tested this through libvirt yet. That's the next thing to test.
> >
> > Wido
> >
> > >
> > >> I tried:
> > >>
> > >> $ ceph config set client rbd_default_data_pool rbd-data
> > >> $ ceph config set global rbd_default_data_pool rbd-data
> > >>
> > >> They both show up under:
> > >>
> > >> $ ceph config dump
> > >>
> > >> However, newly created RBD images with the 'rbd' CLI tool do not use the
> > >> data pool.
> > >>
> > >> If I set this in ceph.conf it works:
> > >>
> > >> [client]
> > >> rbd_default_data_pool = rbd-data
> > >>
> > >> Somehow librbd isn't fetching these configuration options. Any hints on
> > >> how to get this working?
> > >>
> > >> The end result is that libvirt (which doesn't read ceph.conf) should
> > >> also be able to create RBD images with a different data pool.
> > >>
> > >> Wido
> > >> ___
> > >> ceph-users mailing list -- ceph-users@ceph.io
> > >> To unsubscribe send an email to ceph-users-le...@ceph.io
> > >>
> > >
> > >
> >
>
>
> --
> Jason

[1] https://github.com/ceph/ceph/blob/master/src/tools/rbd/Utils.cc#L680
[2] https://github.com/ceph/ceph/blob/master/src/mon/MonClient.cc#L445

--
Jason
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-29 Thread Igor Fedotov

Frank,

so you have a pretty high amount of small writes indeed. More than half of
the written volume (in bytes) is done via small writes.


And about six times more small requests than big ones.


This looks pretty odd for a sequential write pattern and is likely to be
the root cause of that space overhead.


I can see approx 1.4GB additionally lost per each of these 3 OSDs since 
perf dump reset  ( = allocated_new - stored_new - (allocated_old - 
stored_old))


Below are some speculations on what might be happening, but for sure I
could be wrong / missing something. So please do not consider this a
100% valid analysis.


The client does writes in 1MB chunks. Each write is split into 6 EC data 
chunks (+2 parity), which results in an approx 170K write block to the 
object store ( = 1MB / 6). That corresponds to 1x128K big write plus 1x42K 
small tailing write, resulting in 3x64K allocations.


The next adjacent client write results in another 128K blob, one more 
"small" tailing blob, and a heading blob which partially overlaps with the 
previous 42K tailing chunk. Overlapping chunks are expected to be merged, 
but presumably this doesn't happen due to that "partial EC overwrites" 
issue. So instead an additional 64K blob is allocated for the overlapped range.


I.e. 2x170K writes cause 2x128K blobs, 1x64K tailing blob and 2x64K 
blobs for the range where the two writes adjoin. 64K wasted!


And similarly +64K space overhead per each additional append to this object.
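
To make the arithmetic above concrete, here is a minimal sketch of the same
calculation (assuming a 6+2 profile and a 64K HDD min_alloc_size; the numbers
are illustrative, not taken from the OSDs):

import math

MiB = 1024 * 1024
KiB = 1024

alloc_unit = 64 * KiB          # assumed bluestore min_alloc_size for HDD
shard = MiB // 6               # ~170K handed to each data shard per 1MB client write
big = 128 * KiB                # written as one "big" 128K blob
tail = shard - big             # ~42K small tailing write

allocated = big + math.ceil(tail / alloc_unit) * alloc_unit
print(shard, tail, allocated)  # 174762, 43690, 196608 -> 3 x 64K allocation units

# If the tail of write N and the head of write N+1 are not merged (the
# "partial EC overwrites" limitation), the overlapped 64K unit is allocated
# twice, i.e. roughly one extra alloc_unit wasted per adjacent append, per shard.
print(alloc_unit)              # ~64K overhead per append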


Again, I'm not completely sure the above analysis is 100% valid, and it 
doesn't explain that large number of small requests. But you might want 
to check/tune/experiment with the client write size, e.g. increase it to 4M 
if it's less, or make it divisible by 6.


Hope this helps.

Thanks,

Igor

On 7/29/2020 4:06 PM, Frank Schilder wrote:


Hi Igor,

thanks! Here a sample extract for one OSD, time stamp (+%F-%H%M%S) in file 
name. For the second collection I let it run for about 10 minutes after reset:

perf_dump_2020-07-29-142739.osd181:"bluestore_write_big": 10216689,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_big_bytes": 
992602882048,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_big_blobs": 
10758603,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small": 63863813,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_bytes": 
1481631167388,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_unused": 
17279108,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_deferred": 
13629951,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_pre_read": 
13629951,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_new": 
32954754,
perf_dump_2020-07-29-142739.osd181:"compress_success_count": 1167212,
perf_dump_2020-07-29-142739.osd181:"compress_rejected_count": 1493508,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed": 149993487447,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed_allocated": 
206610432000,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed_original": 
362672914432,
perf_dump_2020-07-29-142739.osd181:"bluestore_extent_compress": 
24431903,

perf_dump_2020-07-29-143836.osd181:"bluestore_write_big": 10736,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_big_bytes": 
1363214336,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_big_blobs": 12291,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small": 67527,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_bytes": 
1591140352,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_unused": 
17528,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_deferred": 
13854,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_pre_read": 
13854,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_new": 36145,
perf_dump_2020-07-29-143836.osd181:"compress_success_count": 1641,
perf_dump_2020-07-29-143836.osd181:"compress_rejected_count": 2341,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed": 150044304023,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed_allocated": 
206654210048,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed_original": 
362729676800,
perf_dump_2020-07-29-143836.osd181:"bluestore_extent_compress": 24979,

If necessary, the full outputs for 3 OSDs can be found here:

Before reset:

https://pastebin.com/zNgRwuNv
https://pastebin.com/NDzdbhWc
https://pastebin.com/mpra6PAS

After reset:

https://pastebin.com/Ywrwscea
https://pastebin.com/sLjxK1Jw
https://pastebin.com/ik3n7Xtz

I do see an unreasonable number of small (re-)writes with average size of ca. 
20K, seems not to be due to compression. Unfortunately, I can't see anything 
about alignment of writes.

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


[ceph-users] Re: Usable space vs. Overhead

2020-07-29 Thread David Orman
Hi,

Thank you, everyone, for the help. I absolutely was mixing up the two,
which is why I was asking for guidance. The example made it clear. The
question I was trying to answer was: what would the capacity of the cluster
be, for actual data, based on the raw disk space + server/drive count +
erasure coding profile. It sounds like the 'usable' calculation (66% in
this case) is the accurate number, assuming I were to fill the cluster to
100%, which I realize is not ideal with Ceph.
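
For what it's worth, both figures fall out of the same k+m numbers; a tiny
sketch of the two calculations being discussed (nothing cluster-specific here):

def ec_numbers(k, m):
    usable = k / (k + m)   # fraction of raw capacity available for original data
    overhead = m / k       # growth of the data you write (parity on top of data)
    return usable, overhead

print(ec_numbers(4, 2))    # (0.666..., 0.5) -> ~66.7% usable, 50% overhead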

Respectfully,
David Orman

On Wed, Jul 29, 2020 at 2:27 AM Janne Johansson  wrote:

>
>
> Den ons 29 juli 2020 kl 03:17 skrev David Orman :
>
>> That's what the formula on the ceph link arrives at, a 2/3 or 66.66%
>> overhead. But if a 4 byte object is split into 4x1 byte chunks data (4
>> bytes total) + 2x 1 byte chunks parity (2 bytes total), you arrive at 6
>> bytes, which is 50% more than 4 bytes. So 50% overhead, vs. 33.33%
>> overhead
>> as the other formula arrives at. I'm curious what I'm missing.
>>
>>
> Are you sure you are not just mixing up overhead with usable %?
>
> 50% overhead means you write 4 bytes, get 2 bytes "extra" for a total of 6.
> In this case 4 out of 6 is 66.67% usable space, i.e. two thirds.
>
> So if the formula says you will get 66% usable it means you get two-thirds
> usable out of your drives with EC4+2, and it can also be said that the data
> is 100%, and the overhead is 50% of that, but you need to know which of
> the figures you want to calculate.
>
> Either "how large is the growth of the data I put in"
> OR "How much of the stored data is my original bytes and how much
> in percent is the checksums".
>
> For 4+2, the growth is 50%, since you add two (50% of four) to 4 original
> bytes,
> and for a six-drive setting, two drives go to checksums so you only get
> 66% usable
> if you fill that cluster up. The space allocated to checksums (33%)
> > is "50% of 66%" so the overhead is still 50% no matter how you calculate it.
>
> --
> May the most significant bit of your life be positive.
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Igor Fedotov

Hi Raffael,

wondering if all OSDs are suffering from slow compaction or just the one 
which is "near full"?


Do other OSDs have those "log_latency_fn slow operation observed for" lines?

Have you tried the "osd bench" command for your OSDs? Does it show similar 
numbers for every OSD?


You might want to try manual offline DB compaction using 
ceph-kvstore-tool. Any improvements after that?



Thanks,

Igor

On 7/29/2020 4:35 PM, Raffael Bachmann wrote:

Hi Mark

Unfortunately it is the production cluster and I don't have another 
one :-(


This is the output of the log parser. I have nothing to compare them 
to. Stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset  compaction_time_seconds  output_level  num_output_files  total_output_size  num_input_records  num_output_records  output (MB/s)  input (r/s)    output (r/s)   output/input ratio
417.204       70.247058    1   5   261853019   1476689   138       3.55491754393   21021.3643396   19708.2132607   0.937532547476
546.271       128.652685   2   7   473883973   1674393   1098908   3.51279861751   13014.8313655   8541.66393807   0.656302313734
5761.795      60.460736    1   4   211033833   1041408   1013909   3.32873133441   17224.5339521   16769.7098494   0.973594402962
14912.985     64.958415    1   4   231336608   1316575   1249120   3.3963233477    20267.9668215   19229.5332329   0.948764787422
15152.316     238.925764   2   14  944635417   2445094   1902084   3.77052068592   10233.6975262   7960.98322825   0.77791855855
24607.857     53.022134    1   4   188414045   1029179   988116    3.38887973778   19410.36549     18635.915333    0.960101206884
31259.993     55.442826    1   4   210856392   1296725   1221474   3.62694941814   23388.5083708   22031.2362865   0.941968420444
31574.193     313.736584   2   18  1213247010  2928742   2359960   3.68794259867   9335.03502416   7522.10650703   0.805793067467
37708.375     49.78089     1   3   171888381   974097    939847    3.29294101107   19567.6895291   18879.6745096   0.96483923059
43219.745     51.798215    1   4   193360867   1246101   1172257   3.5600318014    24056.8328465   22631.2238752   0.940739956071
48041.751     56.559014    1   4   208216413   1451105   1367052   3.5108576209    25656.4762604   24170.3647804   0.942076555453
48368.403     325.833185   2   19  1289359869  3196156   2489088   3.77380036251   9809.17889011   7639.1482347    0.778775504074
52693.952     45.057464    1   3   164730093   943326    907000    3.48663339848   20936.0651101   20129.8501842   0.961491573433


cheers
Raffael


On 29/07/2020 15:19, Mark Nelson wrote:

Hi Raffael,


Adam made a PR this year that shards rocksdb data across different 
column families to help reduce compaction overhead. The goal is to 
reduce write-amplification during compaction by storing multiple 
small LSM hierarchies rather than 1 big one. We've seen evidence that 
this lowers compaction time and overhead, sometimes significantly.  
That PR was merged to master on April 26th so I don't believe it's in 
any of the releases yet but you can test it if you have a 
non-production cluster available.  That PR is here:



https://github.com/ceph/ceph/pull/34006


Normally though you should have about 1GB of WAL to absorb writes 
during compaction and rocksdb automatically slows writes down if the 
buffers start filling up.  You should only see a write stall from 
compaction if you completely fill all of the buffers.  Also, you 
shouldn't see compaction at one level blocking IO to the entire 
database.  Something seems off to me here.


If you have OSD logs, you can see a history of the compaction events 
by running this script:


https://github.com/ceph/cbt/blob/master/tools/ceph_rocksdb_log_parser.py


That can give you an idea of how long your compaction events are 
lasting and what they are doing.



Mark


On 7/29/20 7:52 AM, Raffael Bachmann wrote:

Hi All,

I'm kind of crossposting this from here: 
https://forum.proxmox.com/threads/i-o-wait-after-upgrade-5-x-to-6-2-and-ceph-luminous-to-nautilus.73581/
But since I'm more and more sure that it's a ceph problem I'll try 
my luck here.


Since updating from Luminous to Nautilus I have a big problem.

I have a 3 node cluster. Each node has 2 NVMe SSDs and a 10GBASE-T 
net for Ceph.
Every few minutes an OSD seems

[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Wido den Hollander




On 29/07/2020 16:00, Jason Dillaman wrote:

On Wed, Jul 29, 2020 at 9:07 AM Jason Dillaman  wrote:


On Wed, Jul 29, 2020 at 9:03 AM Wido den Hollander  wrote:




On 29/07/2020 14:54, Jason Dillaman wrote:

On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  wrote:


Hi,

I'm trying to have clients read the 'rbd_default_data_pool' config
option from the config store when creating a RBD image.

This doesn't seem to work and I'm wondering if somebody knows why.


It looks like all string-based config overrides for RBD are ignored:

2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
set rbd_default_data_pool = rbd-data: Configuration option
'rbd_default_data_pool' may not be modified at runtime

librbd always accesses the config options in a thread-safe manner, so
I'll open a tracker ticket to flag all the RBD string config options
are runtime updatable (primitive data type options are implicitly
runtime updatable).


I wasn't updating it at runtime, I just wanted to make sure that I don't
have to set this in ceph.conf everywhere (and libvirt doesn't read
ceph.conf)


You weren't updating it at runtime -- the MON's "MConfig" message back
to the client was attempting to set the config option after "rbd" had
already started. However, if it's working under python, perhaps there
is an easy tweak for "rbd" to have it delay flagging the application
as having started until after it has connected to the cluster. Right
now it manages its own CephContext lifetime which it re-uses when
creating a librados connection. It's that CephContext that is flagged
as "running" prior to librados actually connecting to the cluster.


It looks like this is caused by two issues:

-- In [1], this will prevent librados from applying any MON config
overrides (for strings). This line can just be trivially removed.

-- Fixing that, there is a race in librados / MonClient [2] where it
attempts to first pull the config from the MONs, but it uses a
separate thread to actually apply the received config values, which
can race w/ the completion of the bootstrap occurring in the main
thread. This means that the example below may work sometimes -- and
may fail other times.


Interesting! In this case it will be libvirt which runs forever and 
talks to librbd/librados.


I'll need to see how that works out. I'll test and report back.

Wido




But it seems that Python works:

#!/usr/bin/python3

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')

rbd_inst = rbd.RBD()
size = 4 * 1024**3  # 4 GiB
rbd_inst.create(ioctx, 'myimage', size)

ioctx.close()
cluster.shutdown()


And then:

$ ceph config set client rbd_default_data_pool rbd-data

rbd image 'myimage':
 size 4 GiB in 1024 objects
 order 22 (4 MiB objects)
 snapshot_count: 0
 id: 1aa963a21028
 data_pool: rbd-data
 block_name_prefix: rbd_data.2.1aa963a21028
 format: 2
 features: layering, exclusive-lock, object-map, fast-diff,
deep-flatten, data-pool


I haven't tested this through libvirt yet. That's the next thing to test.

Wido




I tried:

$ ceph config set client rbd_default_data_pool rbd-data
$ ceph config set global rbd_default_data_pool rbd-data

They both show up under:

$ ceph config dump

However, newly created RBD images with the 'rbd' CLI tool do not use the
data pool.

If I set this in ceph.conf it works:

[client]
rbd_default_data_pool = rbd-data

Somehow librbd isn't fetching these configuration options. Any hints on
how to get this working?

The end result is that libvirt (which doesn't read ceph.conf) should
also be able to create RBD images with a different data pool.

Wido
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io









--
Jason


[1] https://github.com/ceph/ceph/blob/master/src/tools/rbd/Utils.cc#L680
[2] https://github.com/ceph/ceph/blob/master/src/mon/MonClient.cc#L445

--
Jason


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Mark Nelson
Wow, that's crazy.  You only had 13 compaction events for that OSD over 
roughly 15 days but the average compaction time was 116 seconds!  Notice 
too though that the average compaction output size is 422MB with an 
average output throughput of 3.5MB!  That's really slow with RocksDB 
sitting on an NVMe drive.  You are only processing about 16K records/second.
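
For reference, the headline figures can be re-derived from the parser totals
quoted above (the per-second values in the report are averages over the
individual compaction events, so the overall rate below is only in the same
ballpark):

events       = 13
total_time_s = 1514.47497
total_out_mb = 5495.84953308

print(total_time_s / events)        # ~116.5 s  -> avg compaction time
print(total_out_mb / events)        # ~422.8 MB -> avg output size
print(total_out_mb / total_time_s)  # ~3.6 MB/s overall output throughput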



Here are some of the results from our internal NVMe (Intel P4510) test 
cluster looking at Sharded vs Unsharded rocksdb.  This was based on 
master from last fall so figure it's about halfway between Nautilus and 
Octopus.  These results are not exactly comparable to yours since we're 
using some experimental settings, but your compaction events look like 
they are orders of magnitude slower.



https://docs.google.com/spreadsheets/d/1FYFBxwvE1i28AKoLyqrksHptE1Z523NU3Fag0MELTQo/edit?usp=sharing


No wonder you are seeing periodic stalls.  How many DBs per NVMe drive?  
What's your cluster workload typically like? Also, can you see if the 
NVMe drive aqu-sz is getting big waiting for the requests to be serviced?



Mark


On 7/29/20 8:35 AM, Raffael Bachmann wrote:

Hi Mark

Unfortunately it is the production cluster and I don't have another 
one :-(


This is the output of the log parser. I'have nothing to compare them 
to. Stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset    compaction_time_seconds output_level 
num_output_files    total_output_size num_input_records 
num_output_records  output (MB/s) input (r/s) output (r/s)    
output/input ratio
417.204 70.247058   1   5   261853019   1476689 
138 3.55491754393   21021.3643396   19708.2132607 0.937532547476
546.271 128.652685  2   7   473883973   1674393 
1098908 3.51279861751   13014.8313655   8541.66393807 0.656302313734
5761.795    60.460736   1   4   211033833 1041408 
1013909 3.32873133441   17224.5339521   16769.7098494 0.973594402962
14912.985   64.958415   1   4   231336608 1316575 
1249120 3.3963233477    20267.9668215   19229.5332329 0.948764787422
15152.316   238.925764  2   14  944635417 2445094 
1902084 3.77052068592   10233.6975262   7960.98322825 0.77791855855
24607.857   53.022134   1   4   188414045 1029179 
988116  3.38887973778   19410.36549 18635.915333 0.960101206884
31259.993   55.442826   1   4   210856392 1296725 
1221474 3.62694941814   23388.5083708   22031.2362865 0.941968420444
31574.193   313.736584  2   18  1213247010 2928742 
2359960 3.68794259867   9335.03502416   7522.10650703 0.805793067467
37708.375   49.78089    1   3   171888381 974097 
939847  3.29294101107   19567.6895291   18879.6745096 0.96483923059
43219.745   51.798215   1   4   193360867 1246101 
1172257 3.5600318014    24056.8328465   22631.2238752 0.940739956071
48041.751   56.559014   1   4   208216413 1451105 
1367052 3.5108576209    25656.4762604   24170.3647804 0.942076555453
48368.403   325.833185  2   19  1289359869 3196156 
2489088 3.77380036251   9809.17889011   7639.1482347 0.778775504074
52693.952   45.057464   1   3   164730093 943326 
907000  3.48663339848   20936.0651101   20129.8501842 0.961491573433


cheers
Raffael


On 29/07/2020 15:19, Mark Nelson wrote:

Hi Raffael,


Adam made a PR this year that shards rocksdb data across different 
column families to help reduce compaction overhead. The goal is to 
reduce write-amplification during compaction by storing multiple 
small LSM hierarchies rather than 1 big one. We've seen evidence that 
this lowers compaction time and overhead, sometimes significantly.  
That PR was merged to master on April 26th so I don't believe it's in 
any of the releases yet but you can test it if you have a 
non-production cluster available.  That PR is here:



https://github.com/ceph/ceph/pull/34006


Normally though you should have about 1GB of WAL to absorb writes 
during compaction and rocksdb automatically slows writes down if the 
buffers start filling up.  You should only see a write stall from 
compaction if you completely fill all of the buffers.  Also, you 
shouldn't see compaction at one level blocking IO to the entire 
database.  Something seems off to me here.


If you have OSD logs, you can see a history of the comp

[ceph-users] Re: Usable space vs. Overhead

2020-07-29 Thread Janne Johansson
Den ons 29 juli 2020 kl 16:34 skrev David Orman :

> Thank you, everyone, for the help. I absolutely was mixing up the two,
> which is why I was asking for guidance. The example made it clear. The
> question I was trying to answer was: what would the capacity of the cluster
> be, for actual data, based on the raw disk space + server/drive count +
> erasure coding profile. It sounds like the 'usable' calculation (66% in
> this case) is the accurate number, assuming I were to fill the cluster to
> 100%, which I realize is not ideal with Ceph.
>

It is bad on almost all kinds of storage systems to fill them up like that.
Any storage that has any concept of data that can move (so excluding tapes
or CD-ROMs, more or less) will want to have some extra space, and Ceph will
start to warn/act/refuse when you pass 85, 90 and 95% filled. So aim to
start buying more nodes/disks when your first OSD is over 70% or so,
otherwise you will be doing a lot of manual work like rebalancing and
reweighting in order to not go above 85% until your new drives can be added
to the system.

If you have few nodes, one host outage will represent a large part of the
available storage. One can make all kinds of calculations on overhead and
things like "with EC4+2 I can lose two drives and still recover", but if
you only have 6 hosts and one goes dead (for any reason), your total has
fallen by 16.7%. So if you were at some 70% full with 6 hosts, you are
going to be all but totally filled up with only 5, which will cause issues
(like OSDs refusing IO so as not to go to 100% full), even if you only lost
one chunk of each EC4+2 group from that host.
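
As a rough illustration of that effect (assuming an even data distribution
and ignoring the nearfull/backfillfull/full thresholds mentioned above):

hosts = 6
fill_before = 0.70                     # 70% full with all hosts up

raw_remaining = (hosts - 1) / hosts    # ~83.3% of raw capacity left after one host dies
fill_after = fill_before / raw_remaining

print(raw_remaining, fill_after)       # ~0.83, ~0.84 -> roughly 84% full once data is re-protected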

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-29 Thread Frank Schilder
Dear Igor,

please find below data from "ceph osd df tree" and per-OSD bluestore stats 
pasted together with the script for extraction for reference. We have now:

df USED: 142 TB
bluestore_stored: 190.9TB (142*8/6 = 189, so matches)
bluestore_allocated: 275.2TB
osd df tree USE: 276.1 (so matches with bluestore_allocated as well)

The situation has gotten worse: the mismatch of raw used to stored is now 85TB. 
Compression is almost irrelevant. This matches my earlier report with data 
taken from "ceph osd df tree" alone. Compared with my previous report, what I 
seem to see is that a sequential write of 22TB (user data) causes an excess of 
16TB (raw). This does not make sense and is not explained by the partial 
overwrite amplification you referred me to.

The real question I still have is how can I find out how much of the excess 
usage is attributed to the issue you pointed me to, and how much might be due 
to something else. I would probably need a way to find objects that are 
affected by partial overwrite amplification and account for their total to see 
how much of the excess they explain. Ideally allowing me to identify the RBD 
images responsible.

I do *not* believe that *all* this extra usage is due to the partial overwrite 
amplification. We do not have the use case simulated with the subsequent dd 
commands in your post 
https://lists.ceph.io/hyperkitty/list/d...@ceph.io/thread/OHPO43J54TPBEUISYCK3SRV55SIZX2AT/,
 overwriting old data with an offset. On these images, we store very large 
files (15GB) that are written *only* *once* and not modified again. We 
currently do nothing else but sequential writes to a file system.

The only objects that might see a partial overwrite could be at the tail of 
such a file, when the beginning of a new file is written to an object that 
already holds a tail, and potentially objects holding file system meta data. 
With an RBD object size of 4M, this amounts to a comparably small number of 
objects that almost certainly cannot explain the observed 44% excess even 
assuming worst case amplification.
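
A quick sanity check of those numbers (a sketch only; TB values as reported above):

df_used   = 142.0                  # user data according to 'ceph df'
stored    = df_used * 8 / 6        # expected bluestore_stored for a 6+2 EC pool
print(stored)                      # ~189.3 TB, matches the reported 190.9

allocated = 275.2
excess    = allocated - 190.9
print(excess, excess / 190.9)      # ~84.3 TB raw beyond stored, i.e. ~44% excess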

The data:

NAME                 ID  USED     %USED  MAX AVAIL  OBJECTS
sr-rbd-data-one-hdd  11  142 TiB  71.12  58 TiB     37415413

        osd df tree     blue stats
  ID   SIZE    USE      alloc  store
  84    8.9    6.2       6.1    4.3
 145    8.9    5.6       5.5    3.7
 156    8.9    6.3       6.2    4.2
 168    8.9    6.1       6.0    4.1
 181    8.9    6.6       6.6    4.4
  74    8.9    5.2       5.2    3.7
 144    8.9    5.9       5.9    4.0
 157    8.9    6.6       6.5    4.5
 169    8.9    6.4       6.3    4.4
 180    8.9    6.6       6.6    4.5
  60    8.9    5.7       5.6    4.0
 146    8.9    5.9       5.8    4.0
 158    8.9    6.7       6.7    4.6
 170    8.9    6.5       6.5    4.4
 182    8.9    5.8       5.7    4.0
  63    8.9    5.8       5.8    4.1
 148    8.9    6.5       6.4    4.4
 159    8.9    4.9       4.9    3.3
 172    8.9    6.4       6.3    4.4
 183    8.9    6.5       6.4    4.4
 229    8.9    5.6       5.6    3.8
 232    8.9    6.3       6.2    4.3
 235    8.9    5.0       4.9    3.3
 238    8.9    6.6       6.5    4.4
 259   11      7.5       7.4    5.1
 231    8.9    6.2       6.1    4.2
 233    8.9    6.7       6.6    4.5
 236    8.9    6.3       6.2    4.2
 239    8.9    5.2       5.1    3.5
 263   11      6.5       6.5    4.4
 228    8.9    6.3       6.3    4.3
 230    8.9    6.0       5.9    4.0
 234    8.9    6.5       6.4    4.4
 237    8.9    6.0       5.9    4.1
 260   11      6.6       6.5    4.5
   0    8.9    6.3       6.3    4.3
   2    8.9    6.4       6.4    4.5
  72    8.9    5.4       5.4    3.7
  76    8.9    6.2       6.1    4.3
  86    8.9    5.6       5.5    3.9
   1    8.9    6.0       5.9    4.1
   3    8.9    5.7       5.7    4.0
  73    8.9    6.1       6.0    4.3
  85    8.9    6.8       6.7    4.6
  87    8.9    6.1       6.1    4.3
 SUM  406.8  276.1     275.2  190.9

The script:

#!/bin/bash

format_TB() {
    tmp=$(($1/1024))
    echo "${tmp}.$(( (10*($1-tmp*1024))/1024 ))"
}

blue_stats() {
    al_tot=0
    st_tot=0
    printf "%12s\n" "blue stats"
    printf "%5s  %5s\n" "alloc" "store"
    for o in "$@" ; do
        host_ip="$(ceph osd find "$o" | jq -r '.ip' | cut -d ":" -f1)"
        bs_data="$(ssh "$host_ip" ceph daemon "osd.$o" perf dump | jq '.bluestore')"
        bs_alloc=$(( $(echo "$bs_data" | jq '.bluestore_allocated') /1024/1024/1024 ))
        al_tot=$(( $al_tot+$bs_alloc ))
        bs_store=$(( $(echo "$bs_data" | jq '.bluestore_stored') /1024/1024/1024 ))
        st_tot=$(( $st_tot+$bs_store ))
        printf "%5s  %5s\n" "$(format_TB $bs_alloc)" "$(format_TB $bs_store)"
    done
    printf "%5s  %5s\n" "$(format_TB $al_tot)" "$(format_TB $st_tot)"
}

df_tree_data="$(ceph osd df tree | sed -e "s/  *$//g" | awk 'BEGIN {printf("%18s\n", "osd df tree")} /root default/ {o=0} /datacenter ServerRoom/ {o=1} (o==1 && $2=="hdd") {s+=$5;u+=$7;printf("%4s  %5s  %5s\n", $1, $5, $7)} f==0 {printf("%4s  %5s  %5s\n", $1, $5, $6);f=1} END {printf("%4s

[ceph-users] Stuck removing osd with orch

2020-07-29 Thread Ml Ml
Hello,
yesterday i did:
 ceph osd purge 32 --yes-i-really-mean-it

I also started to upgrade:
 ceph orch upgrade start --ceph-version 15.2.4

It seems it's really gone:
 ceph osd crush remove osd.32  => device 'osd.32' does not appear in
the crush map

ceph orch ps:
 osd.32ceph01  error  4h ago 2h
  docker.io/ceph/ceph:v15   

root@ceph01:~# ceph orch osd rm status
NAME   HOST   PGS STARTED_AT
osd.32 ceph01 n/a 2020-07-29 09:50:11.026643  (some hours ago)


root@ceph01:~# ceph health detail
HEALTH_ERR Module 'cephadm' has failed: auth get failed: failed to
find osd.32 in keyring retval: -2; Low space hindering backfill (add
storage if this doesn't resolve itself): 105 pgs backfill_toofull;
Degraded data redundancy: 9882/33167949 objects degraded (0.030%), 1
pg degraded, 1 pg undersized
[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: auth get failed:
failed to find osd.32 in keyring retval: -2
Module 'cephadm' has failed: auth get failed: failed to find
osd.32 in keyring retval: -2


Any idea what I could try to remove that OSD?

Thanks,
Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: mimic: much more raw used than reported

2020-07-29 Thread Frank Schilder
Hi Igor,

thanks! Here a sample extract for one OSD, time stamp (+%F-%H%M%S) in file 
name. For the second collection I let it run for about 10 minutes after reset:

perf_dump_2020-07-29-142739.osd181:"bluestore_write_big": 10216689,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_big_bytes": 
992602882048,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_big_blobs": 
10758603,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small": 63863813,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_bytes": 
1481631167388,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_unused": 
17279108,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_deferred": 
13629951,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_pre_read": 
13629951,
perf_dump_2020-07-29-142739.osd181:"bluestore_write_small_new": 
32954754,
perf_dump_2020-07-29-142739.osd181:"compress_success_count": 1167212,
perf_dump_2020-07-29-142739.osd181:"compress_rejected_count": 1493508,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed": 149993487447,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed_allocated": 
206610432000,
perf_dump_2020-07-29-142739.osd181:"bluestore_compressed_original": 
362672914432,
perf_dump_2020-07-29-142739.osd181:"bluestore_extent_compress": 
24431903,

perf_dump_2020-07-29-143836.osd181:"bluestore_write_big": 10736,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_big_bytes": 
1363214336,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_big_blobs": 12291,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small": 67527,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_bytes": 
1591140352,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_unused": 
17528,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_deferred": 
13854,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_pre_read": 
13854,
perf_dump_2020-07-29-143836.osd181:"bluestore_write_small_new": 36145,
perf_dump_2020-07-29-143836.osd181:"compress_success_count": 1641,
perf_dump_2020-07-29-143836.osd181:"compress_rejected_count": 2341,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed": 150044304023,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed_allocated": 
206654210048,
perf_dump_2020-07-29-143836.osd181:"bluestore_compressed_original": 
362729676800,
perf_dump_2020-07-29-143836.osd181:"bluestore_extent_compress": 24979,

If necessary, the full outputs for 3 OSDs can be found here:

Before reset:

https://pastebin.com/zNgRwuNv
https://pastebin.com/NDzdbhWc
https://pastebin.com/mpra6PAS

After reset:

https://pastebin.com/Ywrwscea
https://pastebin.com/sLjxK1Jw
https://pastebin.com/ik3n7Xtz

I do see an unreasonable number of small (re-)writes with an average size of 
ca. 20K; it seems not to be due to compression. Unfortunately, I can't see 
anything about the alignment of the writes.
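
For reference, that average can be read straight off the counters; a quick
check (values copied from the post-reset osd.181 dump above):

small_bytes = 1591140352
small_count = 67527
big_bytes   = 1363214336
big_count   = 10736

print(small_bytes / small_count / 1024)   # ~23 KiB average small write
print(big_bytes / big_count / 1024)       # ~124 KiB average big write
print(small_count / big_count)            # ~6.3x more small requests than big ones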

Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


From: Igor Fedotov 
Sent: 29 July 2020 14:04:34
To: Frank Schilder; ceph-users
Subject: Re: [ceph-users] mimic: much more raw used than reported

Hi Frank,

you might want to proceed with perf counters' dump analysis in the
following way:

For 2-3 arbitrary osds

- save current perf counter dump

- reset perf counters

- leave OSD under the regular load for a while.

- dump perf counters again

- share both saved and new dumps and/or check stats on 'big' writes vs.
'small' ones.


Thanks,

Igor

On 7/29/2020 2:49 PM, Frank Schilder wrote:

> Dear Igor,
>
> please find below data from "ceph osd df tree" and per-OSD bluestore stats 
> pasted together with the script for extraction for reference. We have now:
>
> df USED: 142 TB
> bluestore_stored: 190.9TB (142*8/6 = 189, so matches)
> bluestore_allocated: 275.2TB
> osd df tree USE: 276.1 (so matches with bluestore_allocated as well)
>
> The situation has gotten worse, the mismatch of raw used to stored is now 
> 85TB. Compression is almost irrelevant. This matches with my earlier report 
> with data taken from "ceph osd df tree" alone. Compared with my previous 
> report, what I seem to see is that a sequential write of 22TB (user data) 
> causes an excess of 16TB (raw). This does not make sense and is not explained 
> with the partial overwrite amplification you referred me to.
>
> The real question I still have is how can I find out how much of the excess 
> usage is attributed to the issue you pointed me to, and how much might be due 
> to something else. I would probably need a way to find objects that are 
> affected by partial overwrite amplification and account for their total to 
> see how much of the excess they explain. Ideally allowing me to identify the 
> RBD images re

[ceph-users] Re: cephadm and disk partitions

2020-07-29 Thread Robert LeBlanc
Jason,

The family and I are doing well, thanks for asking. I haven't worked
with Octopus yet, so I can't really talk towards that. Ceph
historically hasn't cared about physical disk layout, and personally I
think the Ceph code path is too heavy to really worry about
optimizations there. The LVM layer generally is pretty light and you
usually get more benefit than performance hit, such as relocating and
extending devices online. Sorry, I can't speak to how cephadm operates.

Robert LeBlanc


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1

On Tue, Jul 28, 2020 at 9:25 PM Jason Borden  wrote:
>
> Hi Robert!
>
> Thanks for answering my question. I take it you're working a lot with Ceph 
> these days! On my pre-octopus clusters I did use LVM backed by partitions, 
> but I always kind of wondered if it was a good practice or not as it added an 
> additional layer and obscures the underlying disk topology. Then on this new 
> octopus cluster I wanted to use the new cephadm approach for management and 
> it seems to steer you away from using partitions or LVM directly, thus my 
> question. I don't really have the option to not use partitions in this 
> particular instance. I was merely curious if there was a particular reason 
> that cephadm doesn't consider partitions (or LVM) as being "available" 
> devices. All the storage in this cluster is the same so no need to split 
> metadata on to faster storage in my instance. Anyway, it's good to hear from 
> you. Hope you and your family are doing well.
>
> Thanks,
> Jason
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Re: S3 bucket lifecycle not deleting old objects

2020-07-29 Thread Alex Hussein-Kershaw
Hi Robin,

Thanks for the reply. I'm currently testing this on a bucket with a single 
object, on a Ceph cluster with a very tiny amount of data. 

I've done what you suggested and run the `radosgw-admin lc process` command and 
turned up the RGW logs - but I saw nothing. 

[qs-admin@portala0 ceph]$ ceph --admin-daemon 
/var/run/ceph/ceph-client.rgw.portala0.rgw0.49.94758071386744.asok config set 
debug_rgw 5/5
{
"success": ""
}
[qs-admin@portala0 ceph]$ ceph --admin-daemon 
/var/run/ceph/ceph-client.rgw.portala0.rgw0.49.94758071386744.asok config get 
debug_rgw
{
"debug_rgw": "5/5"
}
[qs-admin@portala0 ceph]$ radosgw-admin lc process
[qs-admin@portala0 ceph]$

No mention of lifecycle in the logs at all - I am surprised here, as I was 
expecting to see a flurry of activity with the logs turned all the way up. Am 
I doing something daft here?

Also for info:

[qs-admin@portala0 ceph]$ radosgw-admin lc list
[
{
"bucket": ":ahk-test:22bef6b9-67c8-41e6-9e51-17eaddf906fb.1444202.1",
"status": "UNINITIAL"
}
]

Thanks,
Alex

-Original Message-
From: Robin H. Johnson  
Sent: 29 July 2020 06:44
To: Alex Hussein-Kershaw 
Cc: ceph-users@ceph.io
Subject: [EXTERNAL] Re: [ceph-users] S3 bucket lifecycle not deleting old 
objects

On Tue, Jul 28, 2020 at 01:28:14PM +, Alex Hussein-Kershaw wrote:
> Hello,
> 
> I have a problem that old versions of S3 objects are not being deleted. Can 
> anyone advise as to why? I'm using Ceph 14.2.9.
How many objects are in the bucket? If it's a lot, then you may run into RGW's 
lifecycle performance limitations: listing each bucket is a very slow operation 
for lifecycle prior to improvements made in later versions (Octopus, with maybe 
a backport to Nautilus?)

If the bucket doesn't have a lot of operations, you could try running the 
'radosgw-admin lc process' directly, with debug logging, and see where it gets 
bogged down.

--
Robin Hugh Johnson
Gentoo Linux: Dev, Infra Lead, Foundation Treasurer
E-Mail   : robb...@gentoo.org
GnuPG FP : 11ACBA4F 4778E3F6 E4EDF38E B27B944E 34884E85 GnuPG FP : 7D0B3CEB 
E9B85B1F 825BCECF EE05E6F6 A48F6136
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Setting rbd_default_data_pool through the config store

2020-07-29 Thread Wido den Hollander



On 29/07/2020 16:54, Wido den Hollander wrote:



On 29/07/2020 16:00, Jason Dillaman wrote:
On Wed, Jul 29, 2020 at 9:07 AM Jason Dillaman  
wrote:


On Wed, Jul 29, 2020 at 9:03 AM Wido den Hollander  
wrote:




On 29/07/2020 14:54, Jason Dillaman wrote:
On Wed, Jul 29, 2020 at 6:23 AM Wido den Hollander  
wrote:


Hi,

I'm trying to have clients read the 'rbd_default_data_pool' config
option from the config store when creating a RBD image.

This doesn't seem to work and I'm wondering if somebody knows why.


It looks like all string-based config overrides for RBD are ignored:

2020-07-29T08:52:44.393-0400 7f2a97fff700  4 set_mon_vals failed to
set rbd_default_data_pool = rbd-data: Configuration option
'rbd_default_data_pool' may not be modified at runtime

librbd always accesses the config options in a thread-safe manner, so
I'll open a tracker ticket to flag all the RBD string config options
are runtime updatable (primitive data type options are implicitly
runtime updatable).


I wasn't updating it at runtime, I just wanted to make sure that I 
don't

have to set this in ceph.conf everywhere (and libvirt doesn't read
ceph.conf)


You weren't updating it at runtime -- the MON's "MConfig" message back
to the client was attempting to set the config option after "rbd" had
already started. However, if it's working under python, perhaps there
is an easy tweak for "rbd" to have it delay flagging the application
as having started until after it has connected to the cluster. Right
now it manages its own CephContext lifetime which it re-uses when
creating a librados connection. It's that CephContext that is flagged
as "running" prior to librados actually connecting to the cluster.


It looks like this is caused by two issues:

-- In [1], this will prevent librados from applying any MON config
overrides (for strings). This line can just be trivially removed.

-- Fixing that, there is a race in librados / MonClient [2] where it
attempts to first pull the config from the MONs, but it uses a
separate thread to actually apply the received config values, which
can race w/ the completion of the bootstrap occurring in the main
thread. This means that the example below may work sometimes -- and
may fail other times.


Interesting! In this case it will be libvirt which runs for ever and 
talks to librbd/librados.


I'll need to see how that works out. I'll test and report back.



I can confirm this works with Libvirt. I created an RBD volume through 
Libvirt's RBD storage driver and this resulted in the 'data-pool' 
feature set and the RBD image using the data pool.


On the hypervisor where libvirt runs no ceph.conf is present. All 
information is provided through Libvirt's XML definitions which only 
contain the Monitors and the Cephx credentials.


In this case librados/librbd fetched the configuration from the Config 
Store and thus detected it needed to use the data pool feature.


I'll keep an eye out to see if this goes wrong and it by accident 
creates an image without this feature.


Running 15.2.4 in this case on Ubuntu 18.04

Wido


Wido




But it seems that Python works:

#!/usr/bin/python3

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
cluster.connect()
ioctx = cluster.open_ioctx('rbd')

rbd_inst = rbd.RBD()
size = 4 * 1024**3  # 4 GiB
rbd_inst.create(ioctx, 'myimage', size)

ioctx.close()
cluster.shutdown()


And then:

$ ceph config set client rbd_default_data_pool rbd-data

rbd image 'myimage':
 size 4 GiB in 1024 objects
 order 22 (4 MiB objects)
 snapshot_count: 0
 id: 1aa963a21028
 data_pool: rbd-data
 block_name_prefix: rbd_data.2.1aa963a21028
 format: 2
 features: layering, exclusive-lock, object-map, fast-diff,
deep-flatten, data-pool


I haven't tested this through libvirt yet. That's the next thing to 
test.


Wido




I tried:

$ ceph config set client rbd_default_data_pool rbd-data
$ ceph config set global rbd_default_data_pool rbd-data

They both show up under:

$ ceph config dump

However, newly created RBD images with the 'rbd' CLI tool do not 
use the

data pool.

If I set this in ceph.conf it works:

[client]
rbd_default_data_pool = rbd-data

Somehow librbd isn't fetching these configuration options. Any 
hints on

how to get this working?

The end result is that libvirt (which doesn't read ceph.conf) should
also be able to create RBD images with a different data pool.

Wido
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io









--
Jason


[1] https://github.com/ceph/ceph/blob/master/src/tools/rbd/Utils.cc#L680
[2] https://github.com/ceph/ceph/blob/master/src/mon/MonClient.cc#L445

--
Jason


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___

[ceph-users] Re: cephadm and disk partitions

2020-07-29 Thread David Orman
cephadm will handle the LVM for you when you deploy using an OSD
specification. For example, we have NVME and rotational drives, and cephadm
will automatically deploy servers with the DB/WAL on NVME and the data on
the rotational drives, with a limit of 12 rotational per NVME - it handles
all the LVM magic as long as we feed it bare drives.

If you're happy with how it works, it makes management/expansion fairly
easy with a well-written OSD specification vs. doing all of it manually (I
had scripts written previously to bootstrap Ceph node storage prior to
deployment).

On Tue, Jul 28, 2020 at 11:25 PM Jason Borden 
wrote:

> Hi Robert!
>
> Thanks for answering my question. I take it you're working a lot with Ceph
> these days! On my pre-octopus clusters I did use LVM backed by partitions,
> but I always kind of wondered if it was a good practice or not as it added
> an additional layer and obscures the underlying disk topology. Then on this
> new octopus cluster I wanted to use the new cephadm approach for management
> and it seems to steer you away from using partitions or LVM directly, thus
> my question. I don't really have the option to not use partitions in this
> particular instance. I was merely curious if there was a particular reason
> that cephadm doesn't consider partitions (or LVM) as being "available"
> devices. All the storage in this cluster is the same so no need to split
> metadata on to faster storage in my instance. Anyway, it's good to hear
> from you. Hope you and your family are doing well.
>
> Thanks,
> Jason
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Raffael Bachmann

Hi Igor

Thanks for your answer. All the disks had latency warnings. "Had", 
because I think the problem is solved.
After moving some data and almost losing the nearfull NVMe pool, because 
one disk had so much latency that Ceph decided to mark it out, I could 
start destroying and recreating each NVMe OSD.
I did this because the latency problem still existed even with only a 
half-full pool. I'm now in the middle of recreating the OSDs one by one.
The old ones still have latency issues when compacting the RocksDB but 
the new ones don't. So I hope the problem will be gone by tomorrow.
There is one difference between the old OSDs and the recreated ones. The 
old ones were partitioned and the mount /var/lib/ceph/osd/ceph-1 was the 
first partition as XFS.
Now they are LVM and /var/lib/ceph/osd/ceph-1 is tmpfs. I'm not yet 
familiar enough with all the Ceph details to know why this changed or what 
exactly the change is. Both old and new are bluestore.


Cheers,
Raffael


On 29/07/2020 16:48, Igor Fedotov wrote:

Hi Raffael,

wondering if all OSDs are suffering from slow compaction or just he 
one which is "near full"?


Do other OSDs has that "log_latency_fn slow operation observed for" 
lines?


Have you tried "osd bench" command for your OSDs? Does it show similar 
numbers for every OSD?


You might want to try manual offline DB compaction using 
ceph-kvstore-tool. Any improvements after that?



Thanks,

Igor

On 7/29/2020 4:35 PM, Raffael Bachmann wrote:

Hi Mark

Unfortunately it is the production cluster and I don't have another 
one :-(


This is the output of the log parser. I'have nothing to compare them 
to. Stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset    compaction_time_seconds output_level 
num_output_files    total_output_size num_input_records 
num_output_records  output (MB/s) input (r/s) output (r/s)    
output/input ratio
417.204 70.247058   1   5   261853019   1476689 
138 3.55491754393   21021.3643396   19708.2132607 0.937532547476
546.271 128.652685  2   7   473883973   1674393 
1098908 3.51279861751   13014.8313655   8541.66393807 0.656302313734
5761.795    60.460736   1   4   211033833 1041408 
1013909 3.32873133441   17224.5339521   16769.7098494 0.973594402962
14912.985   64.958415   1   4   231336608 1316575 
1249120 3.3963233477    20267.9668215   19229.5332329 0.948764787422
15152.316   238.925764  2   14  944635417 2445094 
1902084 3.77052068592   10233.6975262   7960.98322825 0.77791855855
24607.857   53.022134   1   4   188414045 1029179 
988116  3.38887973778   19410.36549 18635.915333 0.960101206884
31259.993   55.442826   1   4   210856392 1296725 
1221474 3.62694941814   23388.5083708   22031.2362865 0.941968420444
31574.193   313.736584  2   18  1213247010 2928742 
2359960 3.68794259867   9335.03502416   7522.10650703 0.805793067467
37708.375   49.78089    1   3   171888381 974097 
939847  3.29294101107   19567.6895291   18879.6745096 0.96483923059
43219.745   51.798215   1   4   193360867 1246101 
1172257 3.5600318014    24056.8328465   22631.2238752 0.940739956071
48041.751   56.559014   1   4   208216413 1451105 
1367052 3.5108576209    25656.4762604   24170.3647804 0.942076555453
48368.403   325.833185  2   19  1289359869 3196156 
2489088 3.77380036251   9809.17889011   7639.1482347 0.778775504074
52693.952   45.057464   1   3   164730093 943326 
907000  3.48663339848   20936.0651101   20129.8501842 0.961491573433


cheers
Raffael


On 29/07/2020 15:19, Mark Nelson wrote:

Hi Raffael,


Adam made a PR this year that shards rocksdb data across different 
column families to help reduce compaction overhead. The goal is to 
reduce write-amplification during compaction by storing multiple 
small LSM hierarchies rather than 1 big one. We've seen evidence 
that this lowers compaction time and overhead, sometimes 
significantly.  That PR was merged to master on April 26th so I 
don't believe it's in any of the releases yet but you can test it if 
you have a non-production cluster available.  That PR is here:



https://github.com/ceph/ceph/pull/34006


Normally though you should have about 1GB of WAL to absorb writes 
during compaction and rocks

[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Raffael Bachmann

Hi Mark

I think it's 15 hours, not 15 days. But the compaction time really does seem 
to be slow. I'm destroying and recreating all NVMe OSDs one by one, and 
the recreated ones don't have latency problems and are also much faster 
at compacting.


This is since two hours:
Compaction Statistics   /var/log/ceph/ceph-osd.0.log
Total OSD Log Duration (seconds)    7909.104
Number of Compaction Events 11
Avg Compaction Time (seconds)   1.26702554545
Total Compaction Time (seconds) 13.937281
Avg Output Size: (MB)   268.282840729
Total Output Size: (MB) 2951.11124802
Total Input Records 7693669
Total Output Records    7670229
Avg Output Throughput (MB/s)    225.29745104
Avg Input Records/second    533134.954087
Avg Output Records/second   531386.725197
Avg Output/Input Ratio  0.996558805862

Not sure if you are interested in the answers to your questions anymore, 
but:
DBs per drive: I think one? (Not yet familiar with all the Ceph details. 
It was "just" a normal OSD: the full disk, bluestore including the DB.)
Workload: Rather low. Having a three-node Proxmox/Ceph cluster is just to 
not have a single point of failure. The CPUs and disks are mostly bored.

In iostat the aqu-sz value gets to about 520 when this occurs.

There is one difference between the old osds and the recreated ones. The 
old ones were partitioned and the mount /var/lib/ceph/osd/ceph-1 was the 
first partition as xfs.
Now they are lvm and /var/lib/ceph/osd/ceph-1 is tmpfs. Both, old and 
new, are bluestore.


I'm still in the middle of recreating them one by one. Luckily it's not a 
petabyte cluster with thousands of disks ;-)


Anyway, thanks everyone for answering and helping so fast. Having a 
mailing list this active is really nice.

Cheers,
Raffael

On 29/07/2020 16:53, Mark Nelson wrote:
Wow, that's crazy.  You only had 13 compaction events for that OSD 
over roughly 15 days but the average compaction time was 116 seconds! 
Notice too though that the average compaction output size is 422MB 
with an average output throughput of 3.5MB!  That's really slow with 
RocksDB sitting on an NVMe drive.  You are only processing about 16K 
records/second.



Here are some of the results from our internal NVMe (Intel P4510) test 
cluster looking at Sharded vs Unsharded rocksdb.  This was based on 
master from last fall so figure it's about halfway between Nautilus 
and Octopus.  These results are not exactly comparable to yours since 
we're using some experimental settings, but your compaction events 
look like they are orders of magnitude slower.



https://docs.google.com/spreadsheets/d/1FYFBxwvE1i28AKoLyqrksHptE1Z523NU3Fag0MELTQo/edit?usp=sharing 




No wonder you are seeing periodic stalls.  How many DBs per NVMe 
drive?  What's your cluster workload typically like? Also, can you see 
if the NVMe drive aqu-sz is getting big waiting for the requests to be 
serviced?



Mark


On 7/29/20 8:35 AM, Raffael Bachmann wrote:

Hi Mark

Unfortunately it is the production cluster and I don't have another 
one :-(


This is the output of the log parser. I'have nothing to compare them 
to. Stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset    compaction_time_seconds output_level 
num_output_files    total_output_size num_input_records 
num_output_records  output (MB/s) input (r/s) output (r/s)    
output/input ratio
417.204 70.247058   1   5   261853019   1476689 
138 3.55491754393   21021.3643396   19708.2132607 0.937532547476
546.271 128.652685  2   7   473883973   1674393 
1098908 3.51279861751   13014.8313655   8541.66393807 0.656302313734
5761.795    60.460736   1   4   211033833 1041408 
1013909 3.32873133441   17224.5339521   16769.7098494 0.973594402962
14912.985   64.958415   1   4   231336608 1316575 
1249120 3.3963233477    20267.9668215   19229.5332329 0.948764787422
15152.316   238.925764  2   14  944635417 2445094 
1902084 3.77052068592   10233.6975262   7960.98322825 0.77791855855
24607.857   53.022134   1   4   188414045 1029179 
988116  3.38887973778   19410.36549 18635.915333 0.960101206884
31259.993   55.442826   1   4   210856392 1296725 
1221474 3.62694941814   23388.5083708   22031.2362865 0.941968420444
31574.193   313.736584  2   18  1213247010 2928742 
2359960 3.687942598

[ceph-users] Re: High io wait when osd rocksdb is compacting

2020-07-29 Thread Mark Nelson


On 7/29/20 7:47 PM, Raffael Bachmann wrote:

Hi Mark

I think its 15 hours not 15 days. But the compaction time seems really 
to be slow. I' destroying and recreating all nvme osds one by one. And 
the recreated ones don't have latency problems and are also much 
faster compacting the disk.


This is since two hours:
Compaction Statistics   /var/log/ceph/ceph-osd.0.log
Total OSD Log Duration (seconds)    7909.104
Number of Compaction Events 11
Avg Compaction Time (seconds)   1.26702554545
Total Compaction Time (seconds) 13.937281
Avg Output Size: (MB)   268.282840729
Total Output Size: (MB) 2951.11124802
Total Input Records 7693669
Total Output Records    7670229
Avg Output Throughput (MB/s)    225.29745104
Avg Input Records/second    533134.954087
Avg Output Records/second   531386.725197
Avg Output/Input Ratio  0.996558805862

Not sure if you are interested in the answers to your questions 
anymore but:
DBs per drive: I think one? (Not yet familiar with all the ceph 
details. It was "just" a normal osd. the full disk, bluestore 
including db)
Workload: Rather low. Having three node proxmox/ceph cluster is just 
for not have a single point of failure.  The cpus and disks are mostly 
bored.

In iostat the aqu-sz value is getting to about 520 when this occours



Well now that is an interesting detail.  That's the number of requests 
that are backed up waiting to be serviced by the device below Ceph.  
That would indicate that the device wasn't servicing requests quickly.  
Not sure what that means in relation to all of your other new findings, 
but something definitely seems to be behaving strangely.






There is one difference between the old osds and the recreated ones. 
The old ones were partitioned and the mount /var/lib/ceph/osd/ceph-1 
was the first partition as xfs.
Now they are lvm and /var/lib/ceph/osd/ceph-1 is tmpfs. Both, old and 
new, are bluestore.


I'm still in the middle recreating one by one. Luckily it's not a 
petabyte cluster with thousand of disks ;-)


Anyway, thanks everyone for answering and helping so fast. Having a 
mailing list this active is really nice.

Cheers,
Raffael

On 29/07/2020 16:53, Mark Nelson wrote:
Wow, that's crazy.  You only had 13 compaction events for that OSD 
over roughly 15 days but the average compaction time was 116 seconds! 
Notice too though that the average compaction output size is 422MB 
with an average output throughput of 3.5MB!  That's really slow with 
RocksDB sitting on an NVMe drive.  You are only processing about 16K 
records/second.



Here are some of the results from our internal NVMe (Intel P4510) 
test cluster looking at Sharded vs Unsharded rocksdb. This was based 
on master from last fall so figure it's about halfway between 
Nautilus and Octopus.  These results are not exactly comparable to 
yours since we're using some experimental settings, but your 
compaction events look like they are orders of magnitude slower.



https://docs.google.com/spreadsheets/d/1FYFBxwvE1i28AKoLyqrksHptE1Z523NU3Fag0MELTQo/edit?usp=sharing 




No wonder you are seeing periodic stalls.  How many DBs per NVMe 
drive?  What's your cluster workload typically like? Also, can you 
see if the NVMe drive aqu-sz is getting big waiting for the requests 
to be serviced?



Mark


On 7/29/20 8:35 AM, Raffael Bachmann wrote:

Hi Mark

Unfortunately it is the production cluster and I don't have another 
one :-(


This is the output of the log parser. I'have nothing to compare them 
to. Stupid me has no more logs from before the upgrade.


python ceph_rocksdb_log_parser.py ceph-osd.1.log
Compaction Statistics   ceph-osd.1.log
Total OSD Log Duration (seconds)    55500.457
Number of Compaction Events 13
Avg Compaction Time (seconds)   116.498074615
Total Compaction Time (seconds) 1514.47497
Avg Output Size: (MB)   422.757656391
Total Output Size: (MB) 5495.84953308
Total Input Records 21019590
Total Output Records    18093259
Avg Output Throughput (MB/s)    3.53010211372
Avg Input Records/second    17994.0419635
Avg Output Records/second   16449.9710169
Avg Output/Input Ratio  0.891530624966

ceph-osd.1.log

start_offset  compaction_time_seconds  output_level  num_output_files  total_output_size  num_input_records  num_output_records  output (MB/s)  input (r/s)    output (r/s)   output/input ratio
417.204       70.247058                1             5                 261853019          1476689            138                 3.55491754393  21021.3643396  19708.2132607  0.937532547476
546.271       128.652685               2             7                 473883973          1674393            1098908             3.51279861751  13014.8313655  8541.66393807  0.656302313734
5761.795      60.460736                1             4                 211033833          1041408            1013909             3.32873133441  17224.5339521  16769.7098494  0.973594402962
14912.985     64.958415                1             4                 231336608          1316575            1249120             3.3963233477   20267.9668215  19229.5332329  0.948764787422
15152.316     238.925764               2             14                944635417          2445094            1902084             3.77052068592  10233.6975262
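
For anyone who wants to reproduce this kind of summary without the 
script, here is a rough Java sketch of the same aggregation. It assumes 
the OSD log contains RocksDB EVENT_LOG_v1 lines with a 
"compaction_finished" event whose field names match the columns above; 
the actual ceph_rocksdb_log_parser.py may parse the log differently, so 
treat this purely as an illustration.

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CompactionStats {

    // Assumed field layout of RocksDB "compaction_finished" event lines;
    // real logs may order or name the fields differently.
    private static final Pattern EVENT = Pattern.compile(
            "\"event\":\\s*\"compaction_finished\".*?"
          + "\"compaction_time_micros\":\\s*(\\d+).*?"
          + "\"total_output_size\":\\s*(\\d+).*?"
          + "\"num_input_records\":\\s*(\\d+).*?"
          + "\"num_output_records\":\\s*(\\d+)");

    public static void main(String[] args) throws IOException {
        long events = 0, timeMicros = 0, outBytes = 0, inRecs = 0, outRecs = 0;

        // Scan the OSD log (path given as the first argument) line by line.
        for (String line : Files.readAllLines(Paths.get(args[0]))) {
            Matcher m = EVENT.matcher(line);
            if (m.find()) {
                events++;
                timeMicros += Long.parseLong(m.group(1));
                outBytes   += Long.parseLong(m.group(2));
                inRecs     += Long.parseLong(m.group(3));
                outRecs    += Long.parseLong(m.group(4));
            }
        }

        if (events == 0) {
            System.out.println("No compaction_finished events found");
            return;
        }

        double timeSec = timeMicros / 1e6;
        double outMb = outBytes / (1024.0 * 1024.0);
        System.out.printf("Number of Compaction Events       %d%n", events);
        System.out.printf("Avg Compaction Time (seconds)     %.3f%n", timeSec / events);
        System.out.printf("Total Output Size (MB)            %.2f%n", outMb);
        System.out.printf("Overall Output Throughput (MB/s)  %.2f%n", outMb / timeSec);
        System.out.printf("Avg Output/Input Ratio            %.4f%n", (double) outRecs / inRecs);
    }
}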


[ceph-users] Re: Not able to access radosgw S3 bucket creation with AWS java SDK. Caused by: java.net.UnknownHostException: issue.

2020-07-29 Thread sathvik vutukuri
Hi Chris,

Thanks for the info. The code worked for me with path-style access 
enabled, without any DNS issues.

BasicAWSCredentials awsCreds = new
BasicAWSCredentials("uiuiusidusiyd898798798",
"HJHGGyugyuyudfyGJHGYGIYIGU");

AmazonS3ClientBuilder s3b = AmazonS3ClientBuilder.standard();
s3b.setEndpointConfiguration(new EndpointConfiguration("http://:80", "us-east"));
s3b.enablePathStyleAccess();
s3b.withCredentials(new AWSStaticCredentialsProvider(awsCreds));

AmazonS3 connection = s3b.build();
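
For reference, here is a more complete sketch of the same path-style 
approach with the imports filled in. The endpoint host, credentials, and 
bucket name are placeholders, not values from this thread; adjust them 
to your RGW setup.

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.client.builder.AwsClientBuilder.EndpointConfiguration;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;

public class RgwPathStyleExample {

    public static void main(String[] args) {
        // Placeholder credentials -- replace with your RGW user's keys.
        BasicAWSCredentials awsCreds =
                new BasicAWSCredentials("ACCESS_KEY", "SECRET_KEY");

        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                // Point the SDK at the RGW endpoint instead of AWS
                // ("rgw-host" is a placeholder for your gateway's hostname).
                .withEndpointConfiguration(
                        new EndpointConfiguration("http://rgw-host:80", "us-east"))
                // Path-style URLs (http://rgw-host/bucket/key) avoid the
                // bucket.hostname DNS lookup that threw UnknownHostException.
                .enablePathStyleAccess()
                .withCredentials(new AWSStaticCredentialsProvider(awsCreds))
                .build();

        // Create the bucket if it does not exist yet, then list all buckets.
        if (!s3.doesBucketExistV2("firstbucket")) {
            s3.createBucket("firstbucket");
        }
        s3.listBuckets().forEach(b -> System.out.println(b.getName()));
    }
}

With path-style access the SDK addresses buckets as 
http://rgw-host/bucket rather than http://bucket.rgw-host, so no 
per-bucket DNS record has to resolve.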




On Wed, Jul 29, 2020 at 1:36 PM sathvik vutukuri <7vik.sath...@gmail.com>
wrote:

> Thanks, I'll check it out.
>
> On Wed, 29 Jul 2020, 13:35 Chris Palmer, 
> wrote:
>
>> This works for me (the code switches between AWS and RGW according to
>> whether s3Endpoint is set). You need the pathStyleAccess unless you have
>> wildcard DNS names etc.
>>
>> String s3Endpoint = "http://my.host:80";
>>
>> AmazonS3ClientBuilder s3b = AmazonS3ClientBuilder.standard ();
>>
>> if (s3Endpoint == null) {
>>
>> s3b.setRegion (s3Region);
>>
>> } else {
>>
>> s3b.setEndpointConfiguration (new EndpointConfiguration 
>> (s3Endpoint, s3Region));
>>
>> s3b.enablePathStyleAccess ();
>>
>> }
>>
>> if (s3Profile != null) s3b.setCredentials (new 
>> ProfileCredentialsProvider (s3Profile));
>>
>> AmazonS3 s3 = s3b.build ();
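
A small hypothetical follow-up for the call that failed in the original 
stack trace (putObject), assuming a client built as above; the bucket 
name, object key, and file path are placeholders.

import java.io.File;

import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.model.PutObjectResult;

public class RgwUploadExample {

    // "s3" is expected to be a client built with enablePathStyleAccess()
    // against the RGW endpoint, as in the snippet above.
    public static void upload(AmazonS3 s3) {
        PutObjectResult result =
                s3.putObject("firstbucket", "hello.txt", new File("/tmp/hello.txt"));
        // With path-style access this request goes to
        // http://my.host/firstbucket/hello.txt, so no firstbucket.my.host
        // DNS record is required.
        System.out.println("Uploaded, ETag: " + result.getETag());
    }
}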
>>
>>
>>
>> On 29/07/2020 08:19, sathvik vutukuri wrote:
>>
>> Hi All,
>>
>> Any update in this from any one?
>>
>> On Tue, Jul 28, 2020 at 4:00 PM sathvik vutukuri <7vik.sath...@gmail.com> 
>> <7vik.sath...@gmail.com>
>> wrote:
>>
>>
>> Hi All,
>>
>> radosgw-admin is configured in ceph-deploy, created a few buckets from the
>> Ceph dashboard, but when accessing through Java AWS S3 code to create a new
>> bucket i am facing the below issue..
>>
>> Exception in thread "main" com.amazonaws.SdkClientException: Unable to
>> execute HTTP request: firstbucket.rgwhost
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleRetryableException(AmazonHttpClient.java:1207)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1153)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704)
>> at
>> com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686)
>> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550)
>> at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.access$300(AmazonS3Client.java:394)
>> at
>> com.amazonaws.services.s3.AmazonS3Client$PutObjectStrategy.invokeServiceCall(AmazonS3Client.java:5950)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.uploadObject(AmazonS3Client.java:1812)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1772)
>> at
>> com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1710)
>> at org.S3.App.main(App.java:71)
>> Caused by: java.net.UnknownHostException: firstbucket.rgwhost
>> at java.net.InetAddress.getAllByName0(InetAddress.java:1281)
>> at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>> at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>> at
>> com.amazonaws.SystemDefaultDnsResolver.resolve(SystemDefaultDnsResolver.java:27)
>> at
>> com.amazonaws.http.DelegatingDnsResolver.resolve(DelegatingDnsResolver.java:38)
>> at
>> org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:112)
>> at
>> org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:374)
>> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> at
>> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>> at
>> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>> at java.lang.reflect.Method.invoke(Method.java:498)
>> at
>> com.amazonaws.http.conn.ClientConnectionManagerFactory$Handler.invoke(ClientConnectionManagerFactory.java:76)
>> at com.amazonaws.http.conn.$Proxy3.connect(Unknown Source)
>> at
>> org.apache.http.impl.execchain.MainClientExec.establishRoute(MainClientExec.java:393)
>> at
>> org.apache.http.impl.execchain.MainClientExec.execute(MainClientExec.java:236)
>> at
>> org.apache.http.impl.execchain.ProtocolExec.execute(ProtocolExec.java:186)
>> at
>> org.apac