[ceph-users] Re: ceph fs crashes on simple fio test

2019-09-20 Thread Frank Schilder
Dear all,

I found a partial solution to the problem and I also repeated a bit of testing, 
see below.


# Client-side solution, works for single-client IO

The hard solution is to mount CephFS with the option "sync". This translates all
IO to direct IO and successfully throttles clients no matter how they perform IO;
it even works in multi-client set-ups. A somewhat less restrictive option is to
set low values for vm.dirty_[background_]bytes to allow some buffered IO for
small bursts. I tried with

vm.dirty_background_bytes = 524288
vm.dirty_bytes = 1048576

and less restrictive

vm.dirty_background_bytes = 2097152
vm.dirty_bytes = 67108864

(without the sync mount option) and it seems to have the desired effect. It is 
possible to obtain good throughput for large IO sizes while limiting small-IO 
IOPS to a healthy level. Of course, this does not address destructive 
multi-client IO patterns, which must be addressed on the server side.
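
For reference, applying the two variants could look roughly like this (a sketch;
the monitor address, mount path, credentials and values are placeholders):

# strict variant: force direct IO via the sync mount option
mount -t ceph mon1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret,sync

# relaxed variant: cap the dirty page cache instead (values in bytes)
sysctl -w vm.dirty_background_bytes=2097152
sysctl -w vm.dirty_bytes=67108864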


# Test observations

Today I repeated a shorter test to avoid crashing the cluster badly. We are in 
production and I don't have a test cluster. Therefore, if anyone could try this 
on a test cluster and check whether the observations can be confirmed, that would 
be great.

Here is a one-line command:

fio -name=rand-write -directory=/mnt/cephfs/home/frans/fio 
-filename_format=tmp/fio-\$jobname-\$jobnum-\$filenum -rw=randwrite -bs=4K 
-numjobs=4 -time_based=1 -runtime=5 -filesize=100G -ioengine=sync -direct=0 
-iodepth=1

Adjust runtime and numjobs to increasingly higher values to increase stress. In 
my original tests I observed OSD outages with numjobs=4 and runtime=30 already. 
Note that these occur several minutes after the fio command completes. Here are 
today's observations with "osd_op_queue=wpq" and "osd_op_queue_cut_off=high" 
and a 5 sec run time:

- High IOPS (>4k ops) on the data pool come in two waves.
- The first wave does not cause slow ops.
- There is a phase of low activity.
- A second wave starts and now slow metadata ops are reported by the MDS. The 
health level becomes WARN.
- The cluster crunches through the metadata ops for a minute or so and then 
settles. This is quite a long time considering a 5-second burst.
- OSDs did not go out, but this could be due to not running the test long 
enough.
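
For completeness, the two queue settings used above can be applied with the
config-database commands (a sketch; available on Mimic/Nautilus and newer, and
an OSD restart may be needed for osd_op_queue to take effect):

ceph config set osd osd_op_queue wpq
ceph config set osd osd_op_queue_cut_off high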

Best regards,

=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cannot start virtual machines KVM / LXC

2019-09-20 Thread Thomas Schneider
Hi,

I cannot get rid of
 pgs unknown
because there were 3 disks that couldn't be started.
Therefore I destroyed the relevant OSD and re-created it for the
relevant disks.
Then I added the 3 OSDs to crushmap.

Regards
Thomas

Am 20.09.2019 um 08:19 schrieb Ashley Merrick:
> You need to fix this first.
>
>     pgs: 0.056% pgs unknown
>  0.553% pgs not active
>
> The backfilling will cause slow I/O, but having PGs unknown and not
> active will cause I/O blocking, which you're seeing with the VM booting.
>
> It seems you have 4 OSDs down; if you get them back online you should be
> able to get all the PGs online.
>
>
>  On Fri, 20 Sep 2019 14:14:01 +0800 *Thomas <74cmo...@gmail.com>*
> wrote 
>
> Hi,
>
> here I describe 1 of the 2 major issues I'm currently facing in my 8
> node ceph cluster (2x MDS, 6x OSD).
>
> The issue is that I cannot start any virtual machine KVM or container
> LXC; the boot process just hangs after a few seconds.
> All these KVMs and LXCs have in common that their virtual disks
> reside
> in the same pool: hdd
>
> This pool hdd is relatively small compared to the largest pool:
> hdb_backup
> root@ld3955:~# rados df
> POOL_NAME           USED     OBJECTS   CLONES  COPIES     MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS     RD        WR_OPS     WR       USED COMPR  UNDER COMPR
> backup              0 B      0         0       0          0                   0        0         0          0 B       0          0 B      0 B         0 B
> hdb_backup          589 TiB  51262212  0       153786636  0                   0        124895    12266095   4.3 TiB   247132863  463 TiB  0 B         0 B
> hdd                 3.2 TiB  281884    6568    845652     0                   0        1658      275277357  16 TiB    208213922  10 TiB   0 B         0 B
> pve_cephfs_data     955 GiB  91832     0       275496     0                   0        3038      2103       1021 MiB  102170     318 GiB  0 B         0 B
> pve_cephfs_metadata 486 MiB  62        0       186        0                   0        7         860        1.4 GiB   12393      166 MiB  0 B         0 B
>
> total_objects    51635990
> total_used   597 TiB
> total_avail  522 TiB
> total_space  1.1 PiB
>
> This is the current health status of the ceph cluster:
>   cluster:
>     id: 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
>     health: HEALTH_ERR
>     1 filesystem is degraded
>     1 MDSs report slow metadata IOs
>     1 backfillfull osd(s)
>     87 nearfull osd(s)
>     1 pool(s) backfillfull
>     Reduced data availability: 54 pgs inactive, 47 pgs peering, 1 pg stale
>     Degraded data redundancy: 129598/154907946 objects degraded (0.084%), 33 pgs degraded, 33 pgs undersized
>     Degraded data redundancy (low space): 322 pgs backfill_toofull
>     1 subtrees have overcommitted pool target_size_bytes
>     1 subtrees have overcommitted pool target_size_ratio
>     1 pools have too many placement groups
>     21 slow requests are blocked > 32 sec
>
>   services:
>     mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 14h)
>     mgr: ld5507(active, since 16h), standbys: ld5506, ld5505
>     mds: pve_cephfs:1/1 {0=ld3955=up:replay} 1 up:standby
>     osd: 360 osds: 356 up, 356 in; 382 remapped pgs
>
>   data:
>     pools:   5 pools, 8868 pgs
>     objects: 51.64M objects, 197 TiB
>     usage:   597 TiB used, 522 TiB / 1.1 PiB avail
>     pgs: 0.056% pgs unknown
>  0.553% pgs not active
>  129598/154907946 objects degraded (0.084%)
>  229/154907946 objects misplaced (1.427%)
>  8458 active+clean
>  298  active+remapped+backfill_toofull
>  29   remapped+peering
>  24   active+undersized+degraded+remapped+backfill_toofull
>  22   active+remapped+backfill_wait
>  17   peering
>  5    unknown
>  5    active+recovery_wait+undersized+degraded+remapped
>  3    active+undersized+degraded+remapped+backfill_wait
>  2    activating+remapped
>  1    active+clean+remapped
>  1    stale+peering
>  1    active+remapped+backfilling
>  1    active+recovering+undersized+remapped
>  1    active+recovery_wait+degraded
>
>   io:
>     client:   9.2 KiB/s wr, 0 op/s rd, 1 op/s wr
>
> I believe the cluster is busy with rebalancing pool hdb_backup.
> I set the balance mode upmap recently after the 589TB data was
> written.
> root@ld39


[ceph-users] How to reduce or control memory usage during recovery?

2019-09-20 Thread Amudhan P
Hi,

I am using ceph mimic in a small test setup using the below configuration.

OS: ubuntu 18.04

1 node running (mon, mds, mgr) + 4-core CPU, 4 GB RAM and 1 Gb LAN
3 nodes each having 2 OSDs (disks are 2 TB) + 2-core CPU, 4 GB RAM and 1 Gb LAN
1 node acting as CephFS client + 2-core CPU, 4 GB RAM and 1 Gb LAN

configured cephfs_metadata_pool (3 replica) and cephfs_data_pool erasure
2+1.

When running a script that creates many folders, Ceph started throwing errors
(slow IO) due to the high metadata workload.
Once folder creation completed, the PGs were degraded. I am waiting for the PGs
to complete recovery, but my OSDs keep crashing due to OOM and restarting after
some time.

Now my question: I can wait for recovery to complete, but how do I stop the
OOM kills and OSD crashes? Basically, I want to know how to control memory
usage during recovery and keep the OSDs stable.

I have also set very low PG counts: 8 for the metadata pool and 16 for the data pool.

I have already set "osd memory target" to 1 GB and I have increased
max-backfill from 1 to 8.
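
For reference, a sketch of how such settings can be applied (commands and
values are illustrative, not necessarily exactly what was used here):

ceph config set osd osd_memory_target 1073741824   # ~1 GB per OSD daemon
ceph config set osd osd_max_backfills 1            # throttle backfill concurrency
ceph config set osd osd_recovery_max_active 1      # throttle recovery concurrency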

Attached is the kern.log message from one of the nodes; a snippet of the error
message is included in this mail.

-error msg snippet --
-bash: fork: Cannot allocate memory

Sep 18 19:01:57 test-node1 kernel: [341246.765644] msgr-worker-0 invoked
oom-killer: gfp_mask=0x14200ca(GFP_HIGHUSER_MOVABLE), nodemask=(null),
order=0, oom_score_adj=0
Sep 18 19:02:00 test-node1 kernel: [341246.765645] msgr-worker-0 cpuset=/
mems_allowed=0
Sep 18 19:02:00 test-node1 kernel: [341246.765650] CPU: 1 PID: 1737 Comm:
msgr-worker-0 Not tainted 4.15.0-45-generic #48-Ubuntu

Sep 18 19:02:02 test-node1 kernel: [341246.765833] Out of memory: Kill
process 1727 (ceph-osd) score 489 or sacrifice child
Sep 18 19:02:03 test-node1 kernel: [341246.765919] Killed process 1727
(ceph-osd) total-vm:3483844kB, anon-rss:1992708kB, file-rss:0kB,
shmem-rss:0kB
Sep 18 19:02:03 test-node1 kernel: [341246.899395] oom_reaper: reaped
process 1727 (ceph-osd), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
Sep 18 22:09:57 test-node1 kernel: [352529.433155] perf: interrupt took too
long (4965 > 4938), lowering kernel.perf_event_max_sample_rate to 40250

regards
Amudhan
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] handle_connect_reply_2 connect got BADAUTHORIZER when running ceph pg query

2019-09-20 Thread Thomas

Hi,
ceph health status reports unknown objects.
All these objects reside on the same OSD: osd.9

When I execute ceph pg  query I get this (endless) output:
2019-09-20 14:47:35.922 7f937144f700  0 --1- 10.97.206.91:0/2060489821 
>> v1:10.97.206.93:7054/15812 conn(0x7f935407c120 0x7f935407b120 :-1 
s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 
connect got BADAUTHORIZER

^CTraceback (most recent call last):
  File "/usr/bin/ceph", line 1263, in 
    retval = main()
  File "/usr/bin/ceph", line 1179, in main
    prefix='get_command_descriptions')
  File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1459, 
in json_command

    inbuf, timeout, verbose)
  File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1329, 
in send_command_retry

    return send_command(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1381, 
in send_command

    cluster.pg_command, pgid, cmd, inbuf, timeout=timeout)
  File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1311, 
in run_in_thread

    t.join(timeout=timeout)
  File "/usr/lib/python2.7/threading.py", line 951, in join
    self.__block.wait(delay)
  File "/usr/lib/python2.7/threading.py", line 359, in wait
    _sleep(delay)
KeyboardInterrupt
2019-09-20 14:47:35.950 7f937144f700  0 --1- 10.97.206.91:0/2060489821 
>> v1:10.97.206.93:7054/15812 conn(0x7f935407f4b0 0x7f935407b920 :-1 
s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 
connect got BADAUTHORIZER
2019-09-20 14:47:35.950 7f937144f700  0 --1- 10.97.206.91:0/2060489821 
>> v1:10.97.206.93:7054/15812 conn(0x7f935407c120 0x7f935407b120 :-1 
s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 
connect got BADAUTHORIZER
2019-09-20 14:47:35.950 7f937144f700  0 --1- 10.97.206.91:0/2060489821 
>> v1:10.97.206.93:7054/15812 conn(0x7f935407f4b0 0x7f935407b920 :-1 
s=CONNECTING_SEND_CONNECT_MSG pgs=0 cs=0 l=1).handle_connect_reply_2 
connect got BADAUTHORIZER


How can I fix this issue with pg / osd.9?

THX
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Nautilus dashboard: MDS performance graph doesn't refresh

2019-09-20 Thread Eugen Block

Hi all,

I regularly check the MDS performance graphs in the dashboard;
especially the requests per second are interesting in my case.
Since our upgrade to Nautilus the values in the activity column are
still refreshed every 5 seconds (I believe), but the graphs have not
been refreshed since that upgrade. I couldn't find anything in the
tracker or on the mailing list; can anyone comment on this?


Thank you & best regards,
Eugen

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] RGW backup to tape

2019-09-20 Thread Robert LeBlanc
The question was posed, "What if we want to back up our RGW data to
tape?" Is anyone doing this? Any suggestions? We could probably just
catch any PUT requests and queue them to be written to tape. Our
dataset is so large that traditional backup solutions don't seem
feasible (GFS), so probably a single copy (or two copies on different
tapes at the same time) when the object is created.

Bonus points for being near-line.

Thanks,
Robert LeBlanc

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW backup to tape

2019-09-20 Thread Paul Emmerich
Probably easiest if you get a tape library that supports S3. You might
even have some luck with radosgw's cloud sync module (but I wouldn't
count on it; Octopus should improve things, though).

Just intercepting PUT requests isn't that easy because of multi-part
uploads and load balancing. I.e., if you upload a large file you should
be sending it in chunks and each chunk should go to a different
server; that makes any "simple" solution pretty messy.


Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Fri, Sep 20, 2019 at 8:01 PM Robert LeBlanc  wrote:
>
> The question was posed, "What if we want to backup our RGW data to
> tape?" Anyone doing this? Any suggestions? We could probably just
> catch any PUT requests and queue them to be written to tape. Our
> dataset is so large, that traditional backup solutions don't seem
> feasible (GFS), so probably a single copy (or two copies on different
> tapes at the same time) when the object is created.
>
> Bonus points for being near-line.
>
> Thanks,
> Robert LeBlanc
> 
> Robert LeBlanc
> PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HEALTH_WARN due to large omap object wont clear even after trim

2019-09-20 Thread shubjero
Still trying to solve this one.

Here is the corresponding log entry when the large omap object was found:

ceph-osd.1284.log.2.gz:2019-09-18 11:43:39.237 7fcd68f96700  0
log_channel(cluster) log [WRN] : Large omap object found. Object:
26:86e4c833:::usage.22:head Key count: 2009548 Size (bytes): 369641376

I have since trimmed the entire usage log and disabled it entirely.
You can see from the output below that there's nothing in these usage
log objects.

for i in `rados -p .usage ls`; do echo $i; rados -p .usage
listomapkeys $i | wc -l; done
usage.29
0
usage.12
0
usage.1
0
usage.26
0
usage.20
0
usage.24
0
usage.16
0
usage.15
0
usage.3
0
usage.19
0
usage.23
0
usage.5
0
usage.11
0
usage.7
0
usage.30
0
usage.18
0
usage.21
0
usage.27
0
usage.13
0
usage.22
0
usage.25
0
.
4
usage.10
0
usage.8
0
usage.9
0
usage.28
0
usage.2
0
usage.4
0
usage.6
0
usage.31
0
usage.17
0


root@infra:~# rados -p .usage listomapkeys usage.22
root@infra:~#


On Thu, Sep 19, 2019 at 12:54 PM Charles Alva  wrote:
>
> Could you please share how you trimmed the usage log?
>
> Kind regards,
>
> Charles Alva
> Sent from Gmail Mobile
>
>
> On Thu, Sep 19, 2019 at 11:46 PM shubjero  wrote:
>>
>> Hey all,
>>
>> Yesterday our cluster went in to HEALTH_WARN due to 1 large omap
>> object in the .usage pool (I've posted about this in the past). Last
>> time we resolved the issue by trimming the usage log below the alert
>> threshold but this time it seems like the alert wont clear even after
>> trimming and (this time) disabling the usage log entirely.
>>
>> ceph health detail
>> HEALTH_WARN 1 large omap objects
>> LARGE_OMAP_OBJECTS 1 large omap objects
>> 1 large objects found in pool '.usage'
>> Search the cluster log for 'Large omap object found' for more details.
>>
>> I've bounced ceph-mon, ceph-mgr, radosgw and even issued osd scrub on
>> the two osd's that hold pg's for the .usage pool but the alert wont
>> clear.
>>
>> It's been over 24 hours since I trimmed the usage log.
>>
>> Any suggestions?
>>
>> Jared Baker
>> Cloud Architect, OICR
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HEALTH_WARN due to large omap object wont clear even after trim

2019-09-20 Thread Casey Bodley

Hi Jared,

My understanding is that these 'large omap object' warnings are only 
issued or cleared during scrub, so I'd expect them to go away the next 
time the usage objects get scrubbed.


On 9/20/19 2:31 PM, shubjero wrote:

Still trying to solve this one.

Here is the corresponding log entry when the large omap object was found:

ceph-osd.1284.log.2.gz:2019-09-18 11:43:39.237 7fcd68f96700  0
log_channel(cluster) log [WRN] : Large omap object found. Object:
26:86e4c833:::usage.22:head Key count: 2009548 Size (bytes): 369641376

I have since trimmed the entire usage log and disabled it entirely.
You can see from the output below that there's nothing in these usage
log objects.

for i in `rados -p .usage ls`; do echo $i; rados -p .usage
listomapkeys $i | wc -l; done
usage.29
0
usage.12
0
usage.1
0
usage.26
0
usage.20
0
usage.24
0
usage.16
0
usage.15
0
usage.3
0
usage.19
0
usage.23
0
usage.5
0
usage.11
0
usage.7
0
usage.30
0
usage.18
0
usage.21
0
usage.27
0
usage.13
0
usage.22
0
usage.25
0
.
4
usage.10
0
usage.8
0
usage.9
0
usage.28
0
usage.2
0
usage.4
0
usage.6
0
usage.31
0
usage.17
0


root@infra:~# rados -p .usage listomapkeys usage.22
root@infra:~#


On Thu, Sep 19, 2019 at 12:54 PM Charles Alva  wrote:

Could you please share how you trimmed the usage log?

Kind regards,

Charles Alva
Sent from Gmail Mobile


On Thu, Sep 19, 2019 at 11:46 PM shubjero  wrote:

Hey all,

Yesterday our cluster went in to HEALTH_WARN due to 1 large omap
object in the .usage pool (I've posted about this in the past). Last
time we resolved the issue by trimming the usage log below the alert
threshold but this time it seems like the alert wont clear even after
trimming and (this time) disabling the usage log entirely.

ceph health detail
HEALTH_WARN 1 large omap objects
LARGE_OMAP_OBJECTS 1 large omap objects
 1 large objects found in pool '.usage'
 Search the cluster log for 'Large omap object found' for more details.

I've bounced ceph-mon, ceph-mgr, radosgw and even issued osd scrub on
the two osd's that hold pg's for the .usage pool but the alert wont
clear.

It's been over 24 hours since I trimmed the usage log.

Any suggestions?

Jared Baker
Cloud Architect, OICR
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Doubt about ceph-iscsi and Vmware

2019-09-20 Thread Gesiel Galvão Bernardes
Hi,
I'm testing Ceph with VMware, using the ceph-iscsi gateway. I am reading the
documentation* and have doubts about some points:

- If I understood correctly, in general terms, each VMFS datastore in VMware
will map to one RBD image (consequently, in one RBD image I will possibly
have many VMware disks). Is that correct?

- The documentation says: "gwcli requires a pool with the name rbd, so it
can store metadata like the iSCSI configuration". In part 4 of
"Configuration", it says: "Add a RBD image with the name disk_1 in the pool
rbd". In this part, is the use of the "rbd" pool just an example, so I could
use any pool to store the image, or must the pool be "rbd"?
In short: does gwcli require the "rbd" pool only for metadata while I can use
any pool for images, or do I have to use the "rbd" pool for both image storage
and metadata?

- How much memory does ceph-iscsi use? What is a good amount of RAM?

Regards
Gesiel

* https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Doubt about ceph-iscsi and Vmware

2019-09-20 Thread Paul Emmerich
On Fri, Sep 20, 2019 at 8:55 PM Gesiel Galvão Bernardes
 wrote:
>
> Hi,
> I'm testing Ceph with Vmware, using Ceph-iscsi gateway. I reading 
> documentation*  and have doubts some points:
>
> - If I understanded, in general terms, for each VMFS datastore in VMware will 
> match the an RBD image. (consequently in an RBD image I will possible have 
> many VMWare disks). Its correct?

yes

> - In documentation is this: "gwcli requires a pool with the name rbd, so it 
> can store metadata like the iSCSI configuration". In part 4 of 
> "Configuration", have: "Add a RBD image with the name disk_1 in the pool 
> rbd". In this part, the use of "rbd" pool is a example and I could use any 
> pool for storage of image, or the pool should be "rbd"?

that's just an example, yes

> Resuming: gwcli require "rbd" pool for metadata and I could use any pool for 
> image, or i will use just "rbd pool" for storage image and metadata?

you can even store the metadata elsewhere in newer versions, see
options in the config file
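
For example, in recent ceph-iscsi releases the pool holding the gateway
configuration object can be changed in /etc/ceph/iscsi-gateway.cfg (a sketch;
the pool name below is just a placeholder):

[config]
cluster_name = ceph
pool = iscsi-gw-config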

> - How much memory ceph-iscsi use? Which  is a good number of RAM?

since it can't cache anything: virtually nothing

>
> Regards
> Gesiel
>
> * https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Doubt about ceph-iscsi and Vmware

2019-09-20 Thread Heðin Ejdesgaard Møller
Hello Gesiel,

Some iSCSI settings are stored in an object, and this object is stored in
the rbd pool. Hence the rbd pool is required.

Your LUNs are mapped to {pool}/{rbdimage}. You should treat these as
you treat pools and rbd images in general.

In smallish deployments I try to keep it simple: make one pool for
each device class and make LUNs as big as possible, while still
allowing for setting one LUN in maintenance mode in the datastore cluster
within vSphere, in case we need to re-format as part of a VMFS upgrade.

Remember to set PSP and recovery timeout properly.
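
As an illustration, on the ESXi side that could look roughly like this (a
sketch; the adapter name and values are placeholders, check the current
Ceph/VMware documentation for the recommended settings):

esxcli storage nmp satp rule add -s VMW_SATP_ALUA -V LIO-ORG -P VMW_PSP_RR -c tpgs_on
esxcli iscsi adapter param set -A vmhba64 -k RecoveryTimeout -v 25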

From the LUN level and up, you just treat the storage as any other
iscsi storage connected to vSphere.

The iGW consumes RAM per LUN export. I can't remember the default
settings, but we are talking about single-digit GB with tens of LUNs
exported, so it's fairly lightweight.

/Heðin

On frí, 2019-09-20 at 15:52 -0300, Gesiel Galvão Bernardes wrote:
> Hi,
> I'm testing Ceph with Vmware, using Ceph-iscsi gateway. I reading
> documentation*  and have doubts some points:
> 
> - If I understanded, in general terms, for each VMFS datastore in
> VMware will match the an RBD image. (consequently in an RBD image I
> will possible have many VMWare disks). Its correct?
> 
> - In documentation is this: "gwcli requires a pool with the name rbd,
> so it can store metadata like the iSCSI configuration". In part 4 of
> "Configuration", have: "Add a RBD image with the name disk_1 in the
> pool rbd". In this part, the use of "rbd" pool is a example and I
> could use any pool for storage of image, or the pool should be "rbd"?
> Resuming: gwcli require "rbd" pool for metadata and I could use any
> pool for image, or i will use just "rbd pool" for storage image and
> metadata?
> 
> - How much memory ceph-iscsi use? Which  is a good number of RAM?
> 
> Regards
> Gesiel
> 
> * https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io


signature.asc
Description: This is a digitally signed message part
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How to set timeout on Rados gateway request

2019-09-20 Thread Casey Bodley



On 9/19/19 11:52 PM, Hanyu Liu wrote:

Hi,

We are looking for a way to set timeout on requests to rados gateway. 
If a request takes too long time, just kill it.


1. Is there a command that can set the timeout?


there isn't, no


2. This parameter looks interesting. Can I know what the "open 
threads" means?


rgw op thread timeout

Description: The timeout in seconds for open threads.
Type:        Integer
Default:     600

(from https://docs.ceph.com/docs/nautilus/radosgw/config-ref/)

this thread timeout option is left over from frontends that used ceph's 
internal WorkQueue/ThreadPool infrastructure. i believe the timeout 
option just caused the WorkQueue to print a warning when a request took 
longer to complete. the associated 'rgw op thread suicide timeout' went 
a step further and actually killed the radosgw process. however, these 
options don't apply to either of the currently supported frontends as 
they each have their own threading model




Thanks,
Hanyu




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Cannot start virtual machines KVM / LXC

2019-09-20 Thread Paul Emmerich
On Fri, Sep 20, 2019 at 1:31 PM Thomas Schneider <74cmo...@gmail.com> wrote:
>
> Hi,
>
> I cannot get rid of
>  pgs unknown
> because there were 3 disks that couldn't be started.
> Therefore I destroyed the relevant OSD and re-created it for the
> relevant disks.

and you had it configured to run with replica 3? Well, I guess the
down PGs were located on these three disks that you wiped.

Do you still have the disks? Use ceph-objectstore-tool to export the
affected PGs manually and inject them into another OSD.
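
A rough sketch of that procedure (the OSDs involved must be stopped, and the
paths, OSD IDs and PG ID below are placeholders):

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<old-id> \
    --pgid <pgid> --op export --file /tmp/<pgid>.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<target-id> \
    --op import --file /tmp/<pgid>.export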


Paul

> Then I added the 3 OSDs to crushmap.
>
> Regards
> Thomas
>
> Am 20.09.2019 um 08:19 schrieb Ashley Merrick:
> > You need to fix this first.
> >
> > pgs: 0.056% pgs unknown
> >  0.553% pgs not active
> >
> > The backfilling will cause slow I/O, but having PGs unknown and not
> > active will cause I/O blocking, which you're seeing with the VM booting.
> >
> > It seems you have 4 OSDs down; if you get them back online you should be
> > able to get all the PGs online.
> >
> >
> >  On Fri, 20 Sep 2019 14:14:01 +0800 *Thomas <74cmo...@gmail.com>*
> > wrote 
> >
> > Hi,
> >
> > here I describe 1 of the 2 major issues I'm currently facing in my 8
> > node ceph cluster (2x MDS, 6x ODS).
> >
> > The issue is that I cannot start any virtual machine KVM or container
> > LXC; the boot process just hangs after a few seconds.
> > All these KVMs and LXCs have in common that their virtual disks
> > reside
> > in the same pool: hdd
> >
> > This pool hdd is relatively small compared to the largest pool:
> > hdb_backup
> > root@ld3955:~# rados df
> > POOL_NAME           USED     OBJECTS   CLONES  COPIES     MISSING_ON_PRIMARY  UNFOUND  DEGRADED  RD_OPS     RD        WR_OPS     WR       USED COMPR  UNDER COMPR
> > backup              0 B      0         0       0          0                   0        0         0          0 B       0          0 B      0 B         0 B
> > hdb_backup          589 TiB  51262212  0       153786636  0                   0        124895    12266095   4.3 TiB   247132863  463 TiB  0 B         0 B
> > hdd                 3.2 TiB  281884    6568    845652     0                   0        1658      275277357  16 TiB    208213922  10 TiB   0 B         0 B
> > pve_cephfs_data     955 GiB  91832     0       275496     0                   0        3038      2103       1021 MiB  102170     318 GiB  0 B         0 B
> > pve_cephfs_metadata 486 MiB  62        0       186        0                   0        7         860        1.4 GiB   12393      166 MiB  0 B         0 B
> >
> > total_objects51635990
> > total_used   597 TiB
> > total_avail  522 TiB
> > total_space  1.1 PiB
> >
> > This is the current health status of the ceph cluster:
> >   cluster:
> > id: 6b1b5117-6e08-4843-93d6-2da3cf8a6bae
> > health: HEALTH_ERR
> > 1 filesystem is degraded
> > 1 MDSs report slow metadata IOs
> > 1 backfillfull osd(s)
> > 87 nearfull osd(s)
> > 1 pool(s) backfillfull
> > Reduced data availability: 54 pgs inactive, 47 pgs
> > peering,
> > 1 pg stale
> > Degraded data redundancy: 129598/154907946 objects
> > degraded
> > (0.084%), 33 pgs degraded, 33 pgs undersized
> > Degraded data redundancy (low space): 322 pgs
> > backfill_toofull
> > 1 subtrees have overcommitted pool target_size_bytes
> > 1 subtrees have overcommitted pool target_size_ratio
> > 1 pools have too many placement groups
> > 21 slow requests are blocked > 32 sec
> >
> >   services:
> > mon: 3 daemons, quorum ld5505,ld5506,ld5507 (age 14h)
> > mgr: ld5507(active, since 16h), standbys: ld5506, ld5505
> > mds: pve_cephfs:1/1 {0=ld3955=up:replay} 1 up:standby
> > osd: 360 osds: 356 up, 356 in; 382 remapped pgs
> >
> >   data:
> > pools:   5 pools, 8868 pgs
> > objects: 51.64M objects, 197 TiB
> > usage:   597 TiB used, 522 TiB / 1.1 PiB avail
> > pgs: 0.056% pgs unknown
> >  0.553% pgs not active
> >  129598/154907946 objects degraded (0.084%)
> >  229/154907946 objects misplaced (1.427%)
> >  8458 active+clean
> >  298  active+remapped+backfill_toofull
> >  29   remapped+peering
> >  24
> > active+undersized+degraded+remapped+backfill_toofull
> >  22   active+remapped+backfill_wait
> >  17   peering
> >  5unknown
> >  5active+recovery_wait+undersized+degraded+remapped
> >  3active+undersized+degraded+remapped+backfill_wait
> >  2activating+remapped
> >  1 

[ceph-users] Re: HEALTH_WARN due to large omap object wont clear even after trim

2019-09-20 Thread shubjero
Thanks Casey. I will issue a scrub for the pg that contains this
object to speed things along. Will report back when that's done.
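
For reference, finding and scrubbing the PG that holds the object could look
like this (a sketch; the PG ID is a placeholder):

ceph osd map .usage usage.22     # shows which PG/OSDs hold the object
ceph pg deep-scrub <pgid>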

On Fri, Sep 20, 2019 at 2:50 PM Casey Bodley  wrote:
>
> Hi Jared,
>
> My understanding is that these 'large omap object' warnings are only
> issued or cleared during scrub, so I'd expect them to go away the next
> time the usage objects get scrubbed.
>
> On 9/20/19 2:31 PM, shubjero wrote:
> > Still trying to solve this one.
> >
> > Here is the corresponding log entry when the large omap object was found:
> >
> > ceph-osd.1284.log.2.gz:2019-09-18 11:43:39.237 7fcd68f96700  0
> > log_channel(cluster) log [WRN] : Large omap object found. Object:
> > 26:86e4c833:::usage.22:head Key count: 2009548 Size (bytes): 369641376
> >
> > I have since trimmed the entire usage log and disabled it entirely.
> > You can see from the output below that there's nothing in these usage
> > log objects.
> >
> > for i in `rados -p .usage ls`; do echo $i; rados -p .usage
> > listomapkeys $i | wc -l; done
> > usage.29
> > 0
> > usage.12
> > 0
> > usage.1
> > 0
> > usage.26
> > 0
> > usage.20
> > 0
> > usage.24
> > 0
> > usage.16
> > 0
> > usage.15
> > 0
> > usage.3
> > 0
> > usage.19
> > 0
> > usage.23
> > 0
> > usage.5
> > 0
> > usage.11
> > 0
> > usage.7
> > 0
> > usage.30
> > 0
> > usage.18
> > 0
> > usage.21
> > 0
> > usage.27
> > 0
> > usage.13
> > 0
> > usage.22
> > 0
> > usage.25
> > 0
> > .
> > 4
> > usage.10
> > 0
> > usage.8
> > 0
> > usage.9
> > 0
> > usage.28
> > 0
> > usage.2
> > 0
> > usage.4
> > 0
> > usage.6
> > 0
> > usage.31
> > 0
> > usage.17
> > 0
> >
> >
> > root@infra:~# rados -p .usage listomapkeys usage.22
> > root@infra:~#
> >
> >
> > On Thu, Sep 19, 2019 at 12:54 PM Charles Alva  wrote:
> >> Could you please share how you trimmed the usage log?
> >>
> >> Kind regards,
> >>
> >> Charles Alva
> >> Sent from Gmail Mobile
> >>
> >>
> >> On Thu, Sep 19, 2019 at 11:46 PM shubjero  wrote:
> >>> Hey all,
> >>>
> >>> Yesterday our cluster went in to HEALTH_WARN due to 1 large omap
> >>> object in the .usage pool (I've posted about this in the past). Last
> >>> time we resolved the issue by trimming the usage log below the alert
> >>> threshold but this time it seems like the alert wont clear even after
> >>> trimming and (this time) disabling the usage log entirely.
> >>>
> >>> ceph health detail
> >>> HEALTH_WARN 1 large omap objects
> >>> LARGE_OMAP_OBJECTS 1 large omap objects
> >>>  1 large objects found in pool '.usage'
> >>>  Search the cluster log for 'Large omap object found' for more 
> >>> details.
> >>>
> >>> I've bounced ceph-mon, ceph-mgr, radosgw and even issued osd scrub on
> >>> the two osd's that hold pg's for the .usage pool but the alert wont
> >>> clear.
> >>>
> >>> It's been over 24 hours since I trimmed the usage log.
> >>>
> >>> Any suggestions?
> >>>
> >>> Jared Baker
> >>> Cloud Architect, OICR
> >>> ___
> >>> ceph-users mailing list -- ceph-users@ceph.io
> >>> To unsubscribe send an email to ceph-users-le...@ceph.io
> > ___
> > ceph-users mailing list -- ceph-users@ceph.io
> > To unsubscribe send an email to ceph-users-le...@ceph.io
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: cephfs and selinux

2019-09-20 Thread Andrey Suharev
Thank you for the response, but of course I had tried this before asking. 
It has no effect. SELinux still prevents sshd from opening authorized_keys.


I suppose there is something wrong with the file contexts on my CephFS. For 
instance, 'ls -Z' shows just a '?' as the context, and chcon fails with an 
"Operation not supported" message. Where should I look for the error?




You can setup a custom SELinux module to enable access.  We use the
following snippet to allow sshd to access authorized keys in home
directories on CephFS:

module local-ceph-ssh-auth 1.0;

require {
 type cephfs_t;
 type sshd_t;
 class file { read getattr open };
}

#= sshd_t ==
allow sshd_t cephfs_t:file { read getattr open };

Compiling and persistently installing such a module is covered by
various documentation, such as:
https://wiki.centos.org/HowTos/SELinux#head-aa437f65e1c7873cddbafd9e9a73bbf9d102c072
(7.1. Manually Customizing Policy Modules).  Also covered there is
using audit2allow to create your own module from SELinux audit logs.
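
For example, building such a module from the audit log might look like this
(a sketch; the module name is arbitrary):

grep denied /var/log/audit/audit.log | audit2allow -M local-ceph-ssh-auth
semodule -i local-ceph-ssh-auth.pp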

thanks,
Ben

On Tue, Sep 17, 2019 at 9:22 AM Andrey Suharev  wrote:


 Hi all,

I would like to have my home dir on CephFS and keep SELinux enabled
at the same time.

The trouble is SELinux prevents sshd from accessing the ~/.ssh/authorized_keys
file. Any ideas how to fix it?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: HEALTH_WARN due to large omap object wont clear even after trim

2019-09-20 Thread shubjero
The deep scrub of the PG updated the cluster status now that the large omap object is gone.
HEALTH_OK!

On Fri., Sep. 20, 2019, 2:31 p.m. shubjero,  wrote:

> Still trying to solve this one.
>
> Here is the corresponding log entry when the large omap object was found:
>
> ceph-osd.1284.log.2.gz:2019-09-18 11:43:39.237 7fcd68f96700  0
> log_channel(cluster) log [WRN] : Large omap object found. Object:
> 26:86e4c833:::usage.22:head Key count: 2009548 Size (bytes): 369641376
>
> I have since trimmed the entire usage log and disabled it entirely.
> You can see from the output below that there's nothing in these usage
> log objects.
>
> for i in `rados -p .usage ls`; do echo $i; rados -p .usage
> listomapkeys $i | wc -l; done
> usage.29
> 0
> usage.12
> 0
> usage.1
> 0
> usage.26
> 0
> usage.20
> 0
> usage.24
> 0
> usage.16
> 0
> usage.15
> 0
> usage.3
> 0
> usage.19
> 0
> usage.23
> 0
> usage.5
> 0
> usage.11
> 0
> usage.7
> 0
> usage.30
> 0
> usage.18
> 0
> usage.21
> 0
> usage.27
> 0
> usage.13
> 0
> usage.22
> 0
> usage.25
> 0
> .
> 4
> usage.10
> 0
> usage.8
> 0
> usage.9
> 0
> usage.28
> 0
> usage.2
> 0
> usage.4
> 0
> usage.6
> 0
> usage.31
> 0
> usage.17
> 0
>
>
> root@infra:~# rados -p .usage listomapkeys usage.22
> root@infra:~#
>
>
> On Thu, Sep 19, 2019 at 12:54 PM Charles Alva 
> wrote:
> >
> > Could you please share how you trimmed the usage log?
> >
> > Kind regards,
> >
> > Charles Alva
> > Sent from Gmail Mobile
> >
> >
> > On Thu, Sep 19, 2019 at 11:46 PM shubjero  wrote:
> >>
> >> Hey all,
> >>
> >> Yesterday our cluster went in to HEALTH_WARN due to 1 large omap
> >> object in the .usage pool (I've posted about this in the past). Last
> >> time we resolved the issue by trimming the usage log below the alert
> >> threshold but this time it seems like the alert wont clear even after
> >> trimming and (this time) disabling the usage log entirely.
> >>
> >> ceph health detail
> >> HEALTH_WARN 1 large omap objects
> >> LARGE_OMAP_OBJECTS 1 large omap objects
> >> 1 large objects found in pool '.usage'
> >> Search the cluster log for 'Large omap object found' for more
> details.
> >>
> >> I've bounced ceph-mon, ceph-mgr, radosgw and even issued osd scrub on
> >> the two osd's that hold pg's for the .usage pool but the alert wont
> >> clear.
> >>
> >> It's been over 24 hours since I trimmed the usage log.
> >>
> >> Any suggestions?
> >>
> >> Jared Baker
> >> Cloud Architect, OICR
> >> ___
> >> ceph-users mailing list -- ceph-users@ceph.io
> >> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW backup to tape

2019-09-20 Thread Robert LeBlanc
On Fri, Sep 20, 2019 at 11:10 AM Paul Emmerich  wrote:
>
> Probably easiest if you get a tape library that supports S3. You might
> even have some luck with radosgw's cloud sync module (but I wouldn't
> count on it, Octopus should improve things, though)
>
> Just intercepting PUT requests isn't that easy because of multi-part
> stuff and load balancing. I.e., if you upload a large file you should
> be sending it in chunks and each chunk should go to a different
> server, that makes any "simple" solutions pretty messy.

I wasn't aware of any tape library being S3-aware; usually that's been part
of the backup software. Do you have any suggestions for multi-PB
libraries that have the S3 feature?

The idea with the PUTs was not to intercept them in the data path, but to
basically have RGW log accesses to Logstash; then a job would run to
find all the objects that were PUT within a time frame, read the
objects off the cluster and write them to tape. Maybe that's not as
easy as I'm thinking either.
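
For what it's worth, the final "read and write to tape" step could be as
simple as streaming each object through an S3 client (a rough sketch; bucket,
key, endpoint and tape device are placeholders):

aws --endpoint-url http://rgw.example.com s3 cp s3://mybucket/mykey - | dd of=/dev/nst0 bs=256k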


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: RGW backup to tape

2019-09-20 Thread EDH - Manuel Rios Fernandez
Robert,

There's a storage company that integrates tapes as OSDs for deep-cold Ceph,
but the code is not open source.

Regards


-----Original Message-----
From: Robert LeBlanc 
Sent: Friday, September 20, 2019 23:28
To: Paul Emmerich 
CC: ceph-users 
Subject: [ceph-users] Re: RGW backup to tape

On Fri, Sep 20, 2019 at 11:10 AM Paul Emmerich 
wrote:
>
> Probably easiest if you get a tape library that supports S3. You might 
> even have some luck with radosgw's cloud sync module (but I wouldn't 
> count on it, Octopus should improve things, though)
>
> Just intercepting PUT requests isn't that easy because of multi-part 
> stuff and load balancing. I.e., if you upload a large file you should 
> be sending it in chunks and each chunk should go to a different 
> server, that makes any "simple" solutions pretty messy.

I wasn't aware of any library being S3 aware, usually it's been part of the
backup software. Do you have any suggestions for multi PB libraries that
have the S3 feature?

The idea with the PUT was not to intercept them in the path, but to
basically have RGW log access to LogStash, then a job would run to find all
the objects that were PUT within a time frame, then read the objects off the
cluster and write them to tape. Maybe that's not as easy as I'm thinking
either.


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1
___
ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email
to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Doubt about ceph-iscsi and Vmware

2019-09-20 Thread Mike Christie
On 09/20/2019 01:52 PM, Gesiel Galvão Bernardes wrote:
> Hi,
> I'm testing Ceph with Vmware, using Ceph-iscsi gateway. I reading
> documentation*  and have doubts some points:
> 
> - If I understanded, in general terms, for each VMFS datastore in VMware
> will match the an RBD image. (consequently in an RBD image I will
> possible have many VMWare disks). Its correct?
> 
> - In documentation is this: "gwcli requires a pool with the name rbd, so
> it can store metadata like the iSCSI configuration". In part 4 of
> "Configuration", have: "Add a RBD image with the name disk_1 in the pool
> rbd". In this part, the use of "rbd" pool is a example and I could use
> any pool for storage of image, or the pool should be "rbd"?
> Resuming: gwcli require "rbd" pool for metadata and I could use any pool
> for image, or i will use just "rbd pool" for storage image and metadata?
> 
> - How much memory ceph-iscsi use? Which  is a good number of RAM?
> 

The major memory use is:

1. In RHEL 7.5 kernels and older we allocate max_data_area_mb of kernel
memory per device. The default value for that is 8. You can use gwcli to
configure it. It is allocated when the device is created. In newer
kernels, there is pool of memory and each device can use up to
max_data_area_mb worth of it. The per device default is the same and you
can change it with gwcli. The total pool limit is 2 GB. There is a sysfs
file:

/sys/module/target_core_user/parameters/global_max_data_area_mb

that can be used to change it.

2. Each device uses about 20 MB of memory in userspace. This is not
configurable.
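
For example, raising the global pool limit from point 1 might look like this
(a sketch; the value is in MB and just an example):

echo 4096 > /sys/module/target_core_user/parameters/global_max_data_area_mb
cat /sys/module/target_core_user/parameters/global_max_data_area_mb   # verify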
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] V/v [Ceph] problem with delete object in large bucket

2019-09-20 Thread tuan dung
Hi Ceph team,
Can you explain to me how Ceph object deletion works? I have a bucket with
over 100M objects (file size ~50 KB). When I delete objects to free space,
the deletion speed is very slow (about 30-33 objects/s). I want to tune the
performance of the cluster, but I do not clearly understand how Ceph
deletes objects.
Thank you very much
-
Br,
Dương Tuấn Dũng
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io