[ceph-users] How to change the pg numbers

2020-08-18 Thread norman

Hi guys,

I have an RBD pool with pg_num 2048 and I want to change it to 4096. How can
I do this?

If I change it directly to 4096, it may cause client slow requests. What
would be a better step size?

Thanks,

Kern
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread Hans van den Bogert
I don't think it will lead to more client slow requests if you set it
to 4096 in one step, since there is a cap on how many recovery/backfill
requests there can be per OSD at any given time.


I am not sure though, but I am happy to be proven wrong by the senior
members on this list :)


Hans

On 8/18/20 10:23 AM, norman wrote:

> Hi guys,
>
> I have an RBD pool with pg_num 2048 and I want to change it to 4096. How
> can I do this?
>
> If I change it directly to 4096, it may cause client slow requests. What
> would be a better step size?
>
> Thanks,
>
> Kern

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: OSDs get full with bluestore logs

2020-08-18 Thread Janne Johansson
It says:

 FAILED assert(0 == "bluefs enospc")

Could it be that the OSD disks you use are very very small?

On Mon, 17 Aug 2020 at 20:26, Khodayar Doustar wrote:

> Hi,
>
> I have a 3-node Mimic cluster with 9 OSDs (3 OSDs on each node).
> I use this cluster to test the integration of an application with the S3 API.
>
> The problem is that after a few days all the OSDs start filling up with
> bluestore logs and go down and out, one by one!
> I cannot stop the logs and I cannot find the setting to fix this leakage;
> it must be a leak in the logs, because it's not logical to fill up every
> OSD with bluefs logs.
>
> This is an example of the log entries that keep repeating in the bluestore logs:
>
> [root@server2 ~]# ceph-bluestore-tool --command bluefs-log-dump --path
> /var/lib/ceph/osd/ceph-5
> .
> .
>
> [root@server1 ~]# ceph osd df tree
> ID CLASS WEIGHT  REWEIGHT SIZE   USE     DATA    OMAP  META     AVAIL  %USE VAR  PGS TYPE NAME
> -1       0.16727        -    0 B     0 B     0 B   0 B      0 B    0 B     0    0    - root default
> -3       0.05576        -    0 B     0 B     0 B   0 B      0 B    0 B     0    0    - host server1
>  0   hdd 0.01859      1.0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.0
>  1   hdd 0.01859        0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.1
>  2   hdd 0.01859        0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.2
> -5       0.05576        - 19 GiB 1.4 GiB 360 MiB 3 KiB 1024 MiB 18 GiB     0    0    - host server2
>  3   hdd 0.01859      1.0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.3
>  4   hdd 0.01859        0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.4
>  5   hdd 0.01859      1.0 19 GiB 1.4 GiB 360 MiB 3 KiB 1024 MiB 18 GiB  7.11 1.04 99 osd.5
> -7       0.05576        -    0 B     0 B     0 B   0 B      0 B    0 B     0    0    - host server3
>  6   hdd 0.01859      1.0 19 GiB 1.2 GiB 249 MiB 3 KiB 1024 MiB 18 GiB  6.55 0.96 78 osd.6
>  7   hdd 0.01859      1.0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.7
>  8   hdd 0.01859      1.0    0 B     0 B     0 B   0 B      0 B    0 B     0    0  0 osd.8
>                      TOTAL 38 GiB 2.6 GiB 610 MiB 6 KiB  2.0 GiB 35 GiB  6.83
>
> MIN/MAX VAR: 0/1.04  STDDEV: 5.58
> [root@server1 ~]#
>
>
> I'm kind of a newbie to Ceph, so any help or hint would be appreciated.
> Did I hit a bug, or is something wrong with my configuration?
>

Make the disks larger; those sizes are far too small for any usable
cluster, so I don't think that use case gets tested at all.

The database preallocations, WAL, and other structures that OSDs create in
order to work well on 100 GB to 12/14/18 TB drives make them less useful
for 0.018 TB drives.

I don't think the logs are the real problem: the OSD processes are crashing
because you give them no room, and then they log repeatedly that they can't
restart because they are still out of space.
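
Since the crashed OSDs won't stay up, you can check offline how much of each
device bluefs has claimed (a sketch; I believe mimic's ceph-bluestore-tool
already has this subcommand, but verify on your version):

# inspect bluefs' share of the block device for a stopped OSD
ceph-bluestore-tool --command bluefs-bdev-sizes --path /var/lib/ceph/osd/ceph-5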

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread Stefan Kooman
On 2020-08-18 11:13, Hans van den Bogert wrote:
> I don't think it will lead to more client slow requests if you set it
> to 4096 in one step, since there is a cap on how many recovery/backfill
> requests there can be per OSD at any given time.
> 
> I am not sure though, but I am happy to be proven wrong by the senior
> members on this list :)

Not sure if I qualify for senior, but here are my 2 cents ...

I would argue that you want to do this in one step. Doing this in
multiple steps will trigger data movement every time you change pg_num
(and pgp_num for that matter). Ceph will recalculate the mapping every
time you change the pg(p)_num for a pool (or when you alter CRUSH rules).

Throttle the resulting backfill with:

osd_recovery_max_active = 1
osd_max_backfills = 1

If your cluster can't handle this, then I wonder what a disk / host
failure would trigger.
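
Concretely, something like this (a sketch only; substitute your pool name,
and note that on Nautilus and newer I believe pgp_num is adjusted
automatically to follow pg_num):

ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph osd pool set <pool> pg_num 4096
ceph osd pool set <pool> pgp_num 4096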

Some on this list would argue that you also want the following setting
to avoid client IO starvation:

ceph config set osd osd_op_queue_cut_off high

This is already the default in Octopus.

Gr. Stefan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] fio rados ioengine

2020-08-18 Thread Frank Ritchie
Hi all,

When testing with the fio rados ioengine, is it necessary to run a
write test with a no-cleanup option before running read tests, as
is required with rados bench?
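
For context, the kind of invocation I mean (a minimal sketch; the client
name and pool are assumptions for illustration):

fio --name=prefill --ioengine=rados --clientname=admin --pool=rbd \
    --rw=write --bs=4M --size=1G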

thx
Frank
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] radosgw beast access logs

2020-08-18 Thread Graham Allan
Are there any plans to add access logs to the beast frontend, in the
same way we get them with civetweb? Increasing the "debug rgw" setting
really doesn't provide the same thing.
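
One partial workaround in the meantime may be the rgw ops log (a sketch,
untested; the socket path is an arbitrary choice and your rgw section name
may differ):

ceph config set client.rgw rgw_enable_ops_log true
ceph config set client.rgw rgw_ops_log_socket_path /var/run/ceph/rgw-ops.sock

It records per-request entries, though not in civetweb's access-log format.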


Graham
--
Graham Allan - g...@umn.edu
Associate Director of Operations - Minnesota Supercomputing Institute
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] why ceph-fuse init Objecter with osd_timeout = 0

2020-08-18 Thread Ch Wan
Hi all

I'm using mimic 13.2.4 with ceph-fuse as the client. Recently I've been
hitting a strange problem: on the client machine we can see a TCP connection
in the ESTAB state to an OSD machine, but on the OSD machine no matching
connection can be found. The client then hangs on read/write requests to
this OSD.
So I'm trying to figure out why this happens, and searching for a
configuration option to set a timeout for OSD requests.
I noticed that osdc/Objecter.h has an osd_timeout field, which is
initialized when we create an Objecter.
But ceph-fuse creates its Objecter with a fixed value of 0, meaning no
timeout?

> StandaloneClient::StandaloneClient(Messenger *m, MonClient *mc,
>                                    boost::asio::io_context& ictx)
>   : Client(m, mc, new Objecter(m->cct, m, mc, ictx, 0, 0))

Here are my questions:
1. Why does ceph-fuse set osd_timeout to 0?
2. Are there other configuration options that make OSD requests fail instead
of hanging forever?
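
For comparison, librados clients take their timeouts from these options;
whether ceph-fuse's Objecter honors them is part of what I'm unsure about
(an untested ceph.conf sketch):

[client]
# untested: librados-level op timeouts, in seconds
rados_osd_op_timeout = 30
rados_mon_op_timeout = 30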
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread Joachim Kraftmayer
A few years ago Dan van der Ster and I were working on two similar
scripts for gradually increasing PGs.


Just have a look at the following link:

https://github.com/cernceph/ceph-scripts/blob/master/tools/split/ceph-gentle-split 
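
The basic idea, if you want to hand-roll it (a sketch of the approach only,
not the linked script; the pool name, step size, and jq dependency are my
assumptions):

#!/bin/sh
# bump pgp_num in small steps, waiting for the cluster to settle in between;
# assumes pg_num has already been raised to TARGET so pgp_num can follow
POOL=rbd
TARGET=4096
STEP=64
CUR=$(ceph osd pool get $POOL pgp_num -f json | jq .pgp_num)
while [ "$CUR" -lt "$TARGET" ]; do
    CUR=$((CUR + STEP)); [ "$CUR" -gt "$TARGET" ] && CUR=$TARGET
    ceph osd pool set $POOL pgp_num $CUR
    until ceph health | grep -q HEALTH_OK; do sleep 30; done
done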




___

Clyso GmbH


On 18.08.2020 at 11:27, Stefan Kooman wrote:

> On 2020-08-18 11:13, Hans van den Bogert wrote:
>
>> I don't think it will lead to more client slow requests if you set it
>> to 4096 in one step, since there is a cap on how many recovery/backfill
>> requests there can be per OSD at any given time.
>>
>> I am not sure though, but I am happy to be proven wrong by the senior
>> members on this list :)
>
> Not sure if I qualify for senior, but here are my 2 cents ...
>
> I would argue that you want to do this in one step. Doing this in
> multiple steps will trigger data movement every time you change pg_num
> (and pgp_num for that matter). Ceph will recalculate the mapping every
> time you change the pg(p)_num for a pool (or when you alter CRUSH rules).
>
> Throttle the resulting backfill with:
>
> osd_recovery_max_active = 1
> osd_max_backfills = 1
>
> If your cluster can't handle this, then I wonder what a disk / host
> failure would trigger.
>
> Some on this list would argue that you also want the following setting
> to avoid client IO starvation:
>
> ceph config set osd osd_op_queue_cut_off high
>
> This is already the default in Octopus.
>
> Gr. Stefan

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Alpine linux librados-dev missing

2020-08-18 Thread Marc Roos


I am not sure if I should try this, but I was trying to build the
dovecot-ceph-plugin on Alpine Linux to create a nice small
container image. However, Alpine Linux does not seem to have
librados-dev.

Did anyone do something similar, and does anyone have a workaround for this?
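
What I was hoping for, roughly (untested; the ceph-dev package name is an
assumption about Alpine's community repository):

# untested: librados headers would come from ceph-dev, if the repo ships it
apk add --no-cache build-base cmake ceph-dev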




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread norman

Hans,

I made a big change in my staging cluster before: I set a pool's pg_num
from 8 to 2048, and it caused the cluster to be unavailable for a long time :(

On 18/8/2020 5:13 PM, Hans van den Bogert wrote:

> I don't think it will lead to more client slow requests if you set it
> to 4096 in one step, since there is a cap on how many
> recovery/backfill requests there can be per OSD at any given time.
>
> I am not sure though, but I am happy to be proven wrong by the senior
> members on this list :)
>
> Hans
>
> On 8/18/20 10:23 AM, norman wrote:
>
>> Hi guys,
>>
>> I have an RBD pool with pg_num 2048 and I want to change it to 4096. How
>> can I do this?
>>
>> If I change it directly to 4096, it may cause client slow requests. What
>> would be a better step size?
>>
>> Thanks,
>>
>> Kern

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread norman

Stefan,

I agree with you about the CRUSH rule, but I truly did hit this problem on
the cluster.

I set the values high for a quick recovery:

osd_recovery_max_active 16

osd_max_backfills 32

Is it a very bad setting?


Kern

On 18/8/2020 5:27 PM, Stefan Kooman wrote:

> On 2020-08-18 11:13, Hans van den Bogert wrote:
>
>> I don't think it will lead to more client slow requests if you set it
>> to 4096 in one step, since there is a cap on how many recovery/backfill
>> requests there can be per OSD at any given time.
>>
>> I am not sure though, but I am happy to be proven wrong by the senior
>> members on this list :)
>
> Not sure if I qualify for senior, but here are my 2 cents ...
>
> I would argue that you want to do this in one step. Doing this in
> multiple steps will trigger data movement every time you change pg_num
> (and pgp_num for that matter). Ceph will recalculate the mapping every
> time you change the pg(p)_num for a pool (or when you alter CRUSH rules).
>
> Throttle the resulting backfill with:
>
> osd_recovery_max_active = 1
> osd_max_backfills = 1
>
> If your cluster can't handle this, then I wonder what a disk / host
> failure would trigger.
>
> Some on this list would argue that you also want the following setting
> to avoid client IO starvation:
>
> ceph config set osd osd_op_queue_cut_off high
>
> This is already the default in Octopus.
>
> Gr. Stefan

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] cephadm not working with non-root user

2020-08-18 Thread Amudhan P
Hi,

I am trying to install Ceph 'octopus' using cephadm. In the bootstrap
command, I specified a non-root user account as the ssh-user:
cephadm bootstrap --mon-ip xx.xxx.xx.xx --ssh-user non-rootuser

When the bootstrap was about to complete, it threw an error stating:


INFO:cephadm:Non-zero exit code 2 from /usr/bin/podman run --rm --net=host
--ipc=host -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=node1 -v
/var/log/ceph/ae4ed114-e145-11ea-9c1f-0025900a8ebe:/var/log/ceph:z -v
/tmp/ceph-tmpm22k9j9w:/etc/ceph/ceph.client.admin.keyring:z -v
/tmp/ceph-tmpe1ltigk8:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph
docker.io/ceph/ceph:v15 orch host add node1
INFO:cephadm:/usr/bin/ceph:stderr Error ENOENT: Failed to connect to node1 (node1).
INFO:cephadm:/usr/bin/ceph:stderr Check that the host is reachable and
accepts connections using the cephadm SSH key
INFO:cephadm:/usr/bin/ceph:stderr
INFO:cephadm:/usr/bin/ceph:stderr you may want to run:
INFO:cephadm:/usr/bin/ceph:stderr > ceph cephadm get-ssh-config > ssh_config
INFO:cephadm:/usr/bin/ceph:stderr > ceph config-key get
mgr/cephadm/ssh_identity_key > key
INFO:cephadm:/usr/bin/ceph:stderr > ssh -F ssh_config -i key root@node1
In the above steps, it's trying to connect to the node as root, and when I
downloaded the ssh_config file, 'root' was specified inside it as well. So I
modified the config file and uploaded it back to Ceph, but SSH to node1 is
still not working.

To confirm that the right user was used during bootstrap, I tried the
command below:

" ceph config-key dump mgr/cephadm/ssh_user"
{
"mgr/cephadm/ssh_user": "non-rootuser"
}

and the output shows the user I used during bootstrap, "non-rootuser",

but at the same time, when I run "ceph cephadm get-user", the output
still shows 'root' as the user.

Why is the change not taking effect? Has anyone faced a similar issue
during bootstrap?

Is there any way to avoid using containers with cephadm?
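
For reference, the commands I would expect to resync this (a sketch, assuming
Octopus's cephadm subcommands):

ceph cephadm set-user non-rootuser
ceph cephadm set-ssh-config -i ssh_config
ceph cephadm check-host node1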

regards
Amudhan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to change the pg numbers

2020-08-18 Thread Eugen Block

> I set the values high for a quick recovery:
>
> osd_recovery_max_active 16
>
> osd_max_backfills 32
>
> Is it a very bad setting?


Only bad for the clients. ;-) As Stefan already advised, turn these
values down to 1 and let the cluster rebalance slowly. If client
performance seems fine you can increase them by 1 or so and see how it
behaves. You'll have to find reasonable values for your specific setup
to strike a good balance between quick recovery and not impacting client
performance too much.
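
You can change both at runtime without restarting the OSDs, for example (one
way to do it; the values are just a starting point):

ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'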


Regards,
Eugen


Quoting norman :


> Stefan,
>
> I agree with you about the CRUSH rule, but I truly did hit this problem
> on the cluster.
>
> I set the values high for a quick recovery:
>
> osd_recovery_max_active 16
>
> osd_max_backfills 32
>
> Is it a very bad setting?
>
> Kern
>
> On 18/8/2020 5:27 PM, Stefan Kooman wrote:
>
>> On 2020-08-18 11:13, Hans van den Bogert wrote:
>>
>>> I don't think it will lead to more client slow requests if you set it
>>> to 4096 in one step, since there is a cap on how many recovery/backfill
>>> requests there can be per OSD at any given time.
>>>
>>> I am not sure though, but I am happy to be proven wrong by the senior
>>> members on this list :)
>>
>> Not sure if I qualify for senior, but here are my 2 cents ...
>>
>> I would argue that you want to do this in one step. Doing this in
>> multiple steps will trigger data movement every time you change pg_num
>> (and pgp_num for that matter). Ceph will recalculate the mapping every
>> time you change the pg(p)_num for a pool (or when you alter CRUSH rules).
>>
>> Throttle the resulting backfill with:
>>
>> osd_recovery_max_active = 1
>> osd_max_backfills = 1
>>
>> If your cluster can't handle this, then I wonder what a disk / host
>> failure would trigger.
>>
>> Some on this list would argue that you also want the following setting
>> to avoid client IO starvation:
>>
>> ceph config set osd osd_op_queue_cut_off high
>>
>> This is already the default in Octopus.
>>
>> Gr. Stefan



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

