[ceph-users] How do I get a sector marked bad?

2020-03-30 Thread David Herselman
Hi,

We have a single inconsistent placement group, on which I subsequently 
triggered a deep scrub and tried a 'pg repair'. The placement group 
remains in an inconsistent state.

How do I discard the objects for this placement group on only the one OSD and 
get Ceph to essentially write the data out fresh? My (limited) understanding is 
that a drive will only remap a sector when it is asked to overwrite the 
problematic sector, or when repeated reads of the failed sector eventually 
succeed.

There is nothing useful in the 'ceph pg 1.35 query' output that I could 
decipher. I then ran 'ceph pg deep-scrub 1.35', and 'rados list-inconsistent-obj 
1.35' thereafter indicates a read error on one of the copies:
{"epoch":25776,"inconsistents":[{"object":{"name":"rbd_data.746f3c94fb3a42.0001e48d","nspace":"","locator":"","snap":"head","version":34866184},"errors":[],"union_shard_errors":["read_error"],"selected_object_info":{"oid":{"oid":"rbd_data.746f3c94fb3a42.0001e48d","key":"","snapid":-2,"hash":3814100149,"max":0,"pool":1,"namespace":""},"version":"22845'1781037","prior_version":"22641'1771494","last_reqid":"client.136837683.0:124047","user_version":34866184,"size":4194304,"mtime":"2020-03-08
 17:59:00.159846","local_mtime":"2020-03-08 
17:59:00.159670","lost":0,"flags":["dirty","data_digest","omap_digest"],"truncate_seq":0,"truncate_size":0,"data_digest":"0x031cb17c","omap_digest":"0x","expected_object_size":4194304,"expected_write_size":4194304,"alloc_hint_flags":0,"manifest":{"type":0},"watchers":{}},"shards":[{"osd":51,"primary":false,"errors":["read_error"],"size":4194304},{"osd":60,"primary":false,"errors":[],"size":4194304,"omap_digest":"0x","data_dig
 
est":"0x031cb17c"},{"osd":82,"primary":true,"errors":[],"size":4194304,"omap_digest":"0x","data_digest":"0x031cb17c"}]}]}

/var/log/syslog:
Mar 30 08:40:40 kvm1e kernel: [74792.229021] ata2.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x0
Mar 30 08:40:40 kvm1e kernel: [74792.230416] ata2.00: irq_stat 0x4008
Mar 30 08:40:40 kvm1e kernel: [74792.231715] ata2.00: failed command: READ FPDMA QUEUED
Mar 30 08:40:40 kvm1e kernel: [74792.233071] ata2.00: cmd 60/00:08:00:7a:50/04:00:c9:00:00/40 tag 1 ncq dma 524288 in
Mar 30 08:40:40 kvm1e kernel: [74792.233071]  res 43/40:00:10:7b:50/00:04:c9:00:00/00 Emask 0x409 (media error)
Mar 30 08:40:40 kvm1e kernel: [74792.235736] ata2.00: status: { DRDY SENSE ERR }
Mar 30 08:40:40 kvm1e kernel: [74792.237045] ata2.00: error: { UNC }
Mar 30 08:40:40 kvm1e ceph-osd[450777]: 2020-03-30 08:40:40.240 7f48a41f3700 -1 bluestore(/var/lib/ceph/osd/ceph-51) _do_read bdev-read failed: (5) Input/output error
Mar 30 08:40:40 kvm1e kernel: [74792.244914] ata2.00: configured for UDMA/133
Mar 30 08:40:40 kvm1e kernel: [74792.244938] sd 1:0:0:0: [sdb] tag#1 FAILED Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Mar 30 08:40:40 kvm1e kernel: [74792.244942] sd 1:0:0:0: [sdb] tag#1 Sense Key : Medium Error [current]
Mar 30 08:40:40 kvm1e kernel: [74792.244945] sd 1:0:0:0: [sdb] tag#1 Add. Sense: Unrecovered read error - auto reallocate failed
Mar 30 08:40:40 kvm1e kernel: [74792.244949] sd 1:0:0:0: [sdb] tag#1 CDB: Read(16) 88 00 00 00 00 00 c9 50 7a 00 00 00 04 00 00 00
Mar 30 08:40:40 kvm1e kernel: [74792.244953] blk_update_request: I/O error, dev sdb, sector 3377494800 op 0x0:(READ) flags 0x0 phys_seg 94 prio class 0
Mar 30 08:40:40 kvm1e kernel: [74792.246238] ata2: EH complete


Regards
David Herselman
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Odd CephFS Performance

2020-03-30 Thread Gabryel Mason-Williams
We have been benchmarking CephFS and comparing it with RADOS to see the 
performance difference and how much overhead CephFS adds. However, we are 
getting odd results with CephFS when using more than one OSD server (each OSD 
server has only one disk), while with RADOS everything appears normal. These 
tests are run on the same Ceph cluster.

OSD servers    CephFS (16 threads)    RADOS (16 threads)
1              289                    316
2              139                    546
3              143                    728
4              142                    844

CephFS is being benchmarked using: fio --name=seqwrite --rw=write --direct=1 
--ioengine=libaio --bs=4M --numjobs=16  --size=1G  --group_reporting
Rados is being benchmarked using: rados bench -p cephfs_data 10 write -t 16

If you could provide some help or insight into why this is happening or how to 
stop it, that would be much appreciated. 

Kind regards,

Gabryel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to use iscsi gateway with https | iscsi-gateway-add returns errors

2020-03-30 Thread Mike Christie
On 03/29/2020 04:43 PM, givemeone  wrote:
> Hi all,
> I am installing ceph Nautilus and getting constantly errors while adding 
> iscsi gateways
> It was working using http schema but after moving to https with wildcard 
> certs gives API errors
> 
> Below some of my configurations
> Thanks for your help
> 
> 
> Command: 
> ceph --cluster ceph dashboard iscsi-gateway-add 
> https://myadmin:admin.01@1.2.3.4:5050
> 
> Error:
> Error EINVAL: iscsi REST API cannot be reached. Please check your 
> configuration and that the API endpoint is accessible
> 
> Tried also disabling ssl verify
> # ceph dashboard set-rgw-api-ssl-verify False
> Option RGW_API_SSL_VERIFY updated
> 
> 
> "/etc/ceph/iscsi-gateway.cfg" 23L, 977C
> # Ansible managed
> [config]
> api_password = admin.01
> api_port = 5050
> # API settings.
> # The API supports a number of options that allow you to tailor it to your
> # local environment. If you want to run the API under https, you will need to
> # create cert/key files that are compatible for each iSCSI gateway node, that 
> is
> # not locked to a specific node. SSL cert and key files *must* be called
> # 'iscsi-gateway.crt' and 'iscsi-gateway.key' and placed in the '/etc/ceph/' 
> directory
> # on *each* gateway node. With the SSL files in place, you can use 
> 'api_secure = true'
> # to switch to https mode.
> # To support the API, the bear minimum settings are:
> api_secure = True


Make sure that after you set this value you restart the rbd-target-api
daemons on all the nodes so the new value is used.

We might also need to set

api_ssl_verify = True

for some gateway-to-gateway operations. I'm not sure what happened with
the docs, because I do not see any info on it.
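
As a rough sketch (assuming systemd-managed ceph-iscsi services; please
verify the exact commands against your release), on each gateway node:

    # in /etc/ceph/iscsi-gateway.cfg under [config], on every gateway:
    #   api_secure = true
    #   api_ssl_verify = true
    systemctl restart rbd-target-api

Also note that 'set-rgw-api-ssl-verify' only affects the RGW frontend. If
the dashboard itself should skip certificate verification for the iSCSI
API, there is a separate switch, something like:

    ceph dashboard set-iscsi-api-ssl-verification false

(command name taken from the dashboard docs, so double-check it on your
version).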

> # Optional settings related to the CLI/API service
> api_user = myadmin
> cluster_name = ceph
> loop_delay = 1
> trusted_ip_list = 1.2.3.3,1.2.3.4
> 
> 
> 
> Log  file
> ==

Are there any errors in /var/log/rbd-target-api/rbd-target-api.log?
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Odd CephFS Performance

2020-03-30 Thread Mark Nelson

Hi Gabryel,


Are the pools always using 1X replication?  The RADOS results are 
scaling as if they were using 1X, but the CephFS results definitely look 
suspect.  Have you tried turning up the iodepth in addition to tuning 
numjobs?  Also, is this kernel CephFS or FUSE?  The FUSE client is far 
slower.  FWIW, on our test cluster with NVMe drives I can get about 
60-65GB/s for large sequential writes across 80 OSDs (using 100 client 
processes with kernel CephFS).  It's definitely possible to scale better 
than what you are seeing here.
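
For example, just as a sketch of the fio invocation (everything else as
in your test):

    fio --name=seqwrite --rw=write --direct=1 --ioengine=libaio --bs=4M \
        --iodepth=16 --numjobs=16 --size=1G --group_reporting

With libaio the --iodepth option controls how many I/Os each job keeps in
flight; numjobs=16 at the default iodepth=1 gives far less total
concurrency than 16 jobs each at iodepth=16.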



https://docs.google.com/spreadsheets/d/1SpwEk3vB9gWzoxvy-K0Ax4NKbRJwd7W1ip-W-qitLlw/edit?usp=sharing


Mark


On 3/30/20 8:56 AM, Gabryel Mason-Williams wrote:

We have been benchmarking CephFS and comparing it with RADOS to see the 
performance difference and how much overhead CephFS adds. However, we are 
getting odd results with CephFS when using more than one OSD server (each OSD 
server has only one disk), while with RADOS everything appears normal. These 
tests are run on the same Ceph cluster.

OSD servers    CephFS (16 threads)    RADOS (16 threads)
1              289                    316
2              139                    546
3              143                    728
4              142                    844

CephFS is being benchmarked using: fio --name=seqwrite --rw=write --direct=1 
--ioengine=libaio --bs=4M --numjobs=16  --size=1G  --group_reporting
Rados is being benchmarked using: rados bench -p cephfs_data 10 write -t 16

If you could provide some help or insight into why this is happening or how to 
stop it, that would be much appreciated.

Kind regards,

Gabryel
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Terrible IOPS performance

2020-03-30 Thread Jarett DeAngelis
Hi folks,

I have a three-node cluster on a 10G network with very little traffic. I have a 
six-OSD flash-only pool with two devices — a 1TB NVMe drive and a 256GB SATA 
SSD — on each node, and here’s how it benchmarks:

Oof. How can I troubleshoot this? Anthony mentioned that I might be able to run 
more than one OSD on the NVMe —  how is that done, and can I do it “on the fly” 
with the system already up and running like this? And, will more OSDs give me 
better IOPS?

Thanks,
Jarett
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Terrible IOPS performance

2020-03-30 Thread Marc Roos



Your system is indeed slow, benchmark results are still not here ;)
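
On the multiple-OSDs-per-NVMe question, a rough sketch of how that is
usually done (the device name is only an example, and an existing OSD has
to be marked out, stopped and purged before its device can be re-split):

    ceph-volume lvm zap --destroy /dev/nvme0n1
    ceph-volume lvm batch --osds-per-device 2 /dev/nvme0n1

So it is not really an "on the fly" change for a device that already
carries an OSD, and more OSDs per NVMe only help IOPS when the single OSD
daemon, rather than the device, is the bottleneck.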


 

-Original Message-
Sent: 27 March 2020 19:44
To: ceph-users@ceph.io
Subject: [ceph-users] Terrible IOPS performance

Hi folks,

I have a three-node cluster on a 10G network with very little traffic. I 
have a six-OSD flash-only pool with two devices (a 1TB NVMe drive and 
a 256GB SATA SSD) on each node, and here's how it benchmarks:

Oof. How can I troubleshoot this? Anthony mentioned that I might be able 
to run more than one OSD on the NVMe - how is that done, and can I do 
it "on the fly" with the system already up and running like this? And, 
will more OSDs give me better IOPS?

Thanks,
Jarett
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: How do I get a sector marked bad?

2020-03-30 Thread Dan van der Ster
Hi,

I have a feeling that the pg repair didn't actually run yet. Sometimes
if the OSDs are busy scrubbing, the repair doesn't start when you ask
it to.
You can force it through with something like:

ceph osd set noscrub
ceph osd set nodeep-scrub
ceph config set osd osd_max_scrubs 3
ceph pg repair <pgid>
ceph status   # and check that the repair really started
ceph config set osd osd_max_scrubs 1
ceph osd unset nodeep-scrub
ceph osd unset noscrub

Once repair runs/completes, it will rewrite the inconsistent object
replica (to a new place on the disk). Check your ceph.log to see when
this happens.

From my experience, the PendingSectors counter will not be decremented
until that sector is written again (which will happen at some random
point in the future when bluestore allocates some new data there).
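
Something like this is what I usually check (the exact log wording varies
between releases; device and pg names below are taken from your paste):

    # did the repair run, and did it fix the bad copy?
    grep -i repair /var/log/ceph/ceph.log | grep 1.35

    # is the drive still reporting the sector as pending?
    smartctl -A /dev/sdb | grep -i -e Current_Pending_Sector -e Offline_Uncorrectable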

Hope that helps,

Dan


On Mon, Mar 30, 2020 at 9:00 AM David Herselman  wrote:
>
> Hi,
>
> We have a single inconsistent placement group where I then subsequently 
> triggered a deep scrub and tried doing a 'pg repair'. The placement group 
> remains in an inconsistent state.
>
> How do I discard the objects for this placement group only on the one OSD and 
> get Ceph to essentially write the data out new. Drives will only mark a 
> sector as remapped when asked to overwrite the problematic sector or repeated 
> reads of the failed sector eventually succeed (this is my limited 
> understanding).
>
> Nothing useful in the 'ceph pg 1.35 query' output that I could decipher. Then 
> ran 'ceph pg deep-scrub 1.35' and 'rados list-inconsistent-obj 1.35' 
> thereafter indicates a read error on one of the copies:
> {"epoch":25776,"inconsistents":[{"object":{"name":"rbd_data.746f3c94fb3a42.0001e48d","nspace":"","locator":"","snap":"head","version":34866184},"errors":[],"union_shard_errors":["read_error"],"selected_object_info":{"oid":{"oid":"rbd_data.746f3c94fb3a42.0001e48d","key":"","snapid":-2,"hash":3814100149,"max":0,"pool":1,"namespace":""},"version":"22845'1781037","prior_version":"22641'1771494","last_reqid":"client.136837683.0:124047","user_version":34866184,"size":4194304,"mtime":"2020-03-08
>  17:59:00.159846","local_mtime":"2020-03-08 
> 17:59:00.159670","lost":0,"flags":["dirty","data_digest","omap_digest"],"truncate_seq":0,"truncate_size":0,"data_digest":"0x031cb17c","omap_digest":"0x","expected_object_size":4194304,"expected_write_size":4194304,"alloc_hint_flags":0,"manifest":{"type":0},"watchers":{}},"shards":[{"osd":51,"primary":false,"errors":["read_error"],"size":4194304},{"osd":60,"primary":false,"errors":[],"size":4194304,"omap_digest":"0x","data_d
 ig
>  
> est":"0x031cb17c"},{"osd":82,"primary":true,"errors":[],"size":4194304,"omap_digest":"0x","data_digest":"0x031cb17c"}]}]}
>
> /var/log/syslog:
> Mar 30 08:40:40 kvm1e kernel: [74792.229021] ata2.00: exception Emask 0x0 
> SAct 0x2 SErr 0x0 action 0x0
> Mar 30 08:40:40 kvm1e kernel: [74792.230416] ata2.00: irq_stat 0x4008
> Mar 30 08:40:40 kvm1e kernel: [74792.231715] ata2.00: failed command: READ 
> FPDMA QUEUED
> Mar 30 08:40:40 kvm1e kernel: [74792.233071] ata2.00: cmd 
> 60/00:08:00:7a:50/04:00:c9:00:00/40 tag 1 ncq dma 524288 in
> Mar 30 08:40:40 kvm1e kernel: [74792.233071]  res 
> 43/40:00:10:7b:50/00:04:c9:00:00/00 Emask 0x409 (media error) 
> Mar 30 08:40:40 kvm1e kernel: [74792.235736] ata2.00: status: { DRDY SENSE 
> ERR }
> Mar 30 08:40:40 kvm1e kernel: [74792.237045] ata2.00: error: { UNC }
> Mar 30 08:40:40 kvm1e ceph-osd[450777]: 2020-03-30 08:40:40.240 7f48a41f3700 
> -1 bluestore(/var/lib/ceph/osd/ceph-51) _do_read bdev-read failed: (5) 
> Input/output error
> Mar 30 08:40:40 kvm1e kernel: [74792.244914] ata2.00: configured for UDMA/133
> Mar 30 08:40:40 kvm1e kernel: [74792.244938] sd 1:0:0:0: [sdb] tag#1 FAILED 
> Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
> Mar 30 08:40:40 kvm1e kernel: [74792.244942] sd 1:0:0:0: [sdb] tag#1 Sense 
> Key : Medium Error [current]
> Mar 30 08:40:40 kvm1e kernel: [74792.244945] sd 1:0:0:0: [sdb] tag#1 Add. 
> Sense: Unrecovered read error - auto reallocate failed
> Mar 30 08:40:40 kvm1e kernel: [74792.244949] sd 1:0:0:0: [sdb] tag#1 CDB: 
> Read(16) 88 00 00 00 00 00 c9 50 7a 00 00 00 04 00 00 00
> Mar 30 08:40:40 kvm1e kernel: [74792.244953] blk_update_request: I/O error, 
> dev sdb, sector 3377494800 op 0x0:(READ) flags 0x0 phys_seg 94 prio class 0
> Mar 30 08:40:40 kvm1e kernel: [74792.246238] ata2: EH complete
>
>
> Regards
> David Herselman
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: ceph cephadm generate-key => No such file or directory: '/tmp/tmp4ejhr7wh/key'

2020-03-30 Thread Sage Weil
On Mon, 30 Mar 2020, Ml Ml wrote:
> Hello List,
> 
> is this a bug?
> 
> root@ceph02:~# ceph cephadm generate-key
> Error EINVAL: Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1413, in _generate_key
> with open(path, 'r') as f:
> FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'
> 
> During handling of the above exception, another exception occurred:
> 
> Traceback (most recent call last):
>   File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
> return self.handle_command(inbuf, cmd)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in
> handle_command
> return dispatch[cmd['prefix']].call(self, cmd, inbuf)
>   File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
> return self.func(mgr, **kwargs)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in 
> wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
>   File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
> return func(*args, **kwargs)
>   File "/usr/share/ceph/mgr/cephadm/module.py", line 1418, in _generate_key
> os.unlink(path)
> FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'

Huh.. yeah looks like it.  My guess is there is a missing openssl 
dependency.  Turn up debugging (debug_mgr=20) and see if there is 
anything helpful in the log?
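
For example (log path assumes a package-based, non-containerized mgr):

    ceph config set mgr debug_mgr 20
    ceph cephadm generate-key            # reproduce the error
    less /var/log/ceph/ceph-mgr.*.log    # look for the failure around _generate_key
    ceph config rm mgr debug_mgr         # revert to the default afterwards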

Note that this is a moot point if you run ceph-mgr in a container.  If 
you're converting to cephadm, you can adopt the mgr daemon (cephadm adopt 
--style legacy --name mgr.whatever) and then retry the same command.

sage

> 
> root@ceph02:~# dpkg -l |grep ceph
> ii  ceph-base15.2.0-1~bpo10+1
> amd64common ceph daemon libraries and management tools
> ii  ceph-common  15.2.0-1~bpo10+1
> amd64common utilities to mount and interact with a ceph
> storage cluster
> ii  ceph-deploy  2.0.1
> all  Ceph-deploy is an easy to use configuration tool
> ii  ceph-mds 15.2.0-1~bpo10+1
> amd64metadata server for the ceph distributed file system
> ii  ceph-mgr 15.2.0-1~bpo10+1
> amd64manager for the ceph distributed storage system
> ii  ceph-mgr-cephadm 15.2.0-1~bpo10+1
> all  cephadm orchestrator module for ceph-mgr
> ii  ceph-mgr-dashboard   15.2.0-1~bpo10+1
> all  dashboard module for ceph-mgr
> ii  ceph-mgr-diskprediction-cloud15.2.0-1~bpo10+1
> all  diskprediction-cloud module for ceph-mgr
> ii  ceph-mgr-diskprediction-local15.2.0-1~bpo10+1
> all  diskprediction-local module for ceph-mgr
> ii  ceph-mgr-k8sevents   15.2.0-1~bpo10+1
> all  kubernetes events module for ceph-mgr
> ii  ceph-mgr-modules-core15.2.0-1~bpo10+1
> all  ceph manager modules which are always enabled
> ii  ceph-mgr-rook15.2.0-1~bpo10+1
> all  rook module for ceph-mgr
> ii  ceph-mon 15.2.0-1~bpo10+1
> amd64monitor server for the ceph storage system
> ii  ceph-osd 15.2.0-1~bpo10+1
> amd64OSD server for the ceph storage system
> ii  cephadm  15.2.0-1~bpo10+1
> amd64cephadm utility to bootstrap ceph daemons with systemd
> and containers
> ii  libcephfs1   10.2.11-2
> amd64Ceph distributed file system client library
> ii  libcephfs2   15.2.0-1~bpo10+1
> amd64Ceph distributed file system client library
> ii  python-ceph-argparse 14.2.8-1
> all  Python 2 utility libraries for Ceph CLI
> ii  python3-ceph-argparse15.2.0-1~bpo10+1
> all  Python 3 utility libraries for Ceph CLI
> ii  python3-ceph-common  15.2.0-1~bpo10+1
> all  Python 3 utility libraries for Ceph
> ii  python3-cephfs   15.2.0-1~bpo10+1
> amd64Python 3 libraries for the Ceph libcephfs library
> root@ceph02:~# cat /etc/debian_version
> 10.3
> 
> Thanks,
> Michael
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] ceph cephadm generate-key => No such file or directory: '/tmp/tmp4ejhr7wh/key'

2020-03-30 Thread Ml Ml
Hello List,

is this a bug?

root@ceph02:~# ceph cephadm generate-key
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1413, in _generate_key
with open(path, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in
handle_command
return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in 
wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1418, in _generate_key
os.unlink(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'


root@ceph02:~# dpkg -l |grep ceph
ii  ceph-base15.2.0-1~bpo10+1
amd64common ceph daemon libraries and management tools
ii  ceph-common  15.2.0-1~bpo10+1
amd64common utilities to mount and interact with a ceph
storage cluster
ii  ceph-deploy  2.0.1
all  Ceph-deploy is an easy to use configuration tool
ii  ceph-mds 15.2.0-1~bpo10+1
amd64metadata server for the ceph distributed file system
ii  ceph-mgr 15.2.0-1~bpo10+1
amd64manager for the ceph distributed storage system
ii  ceph-mgr-cephadm 15.2.0-1~bpo10+1
all  cephadm orchestrator module for ceph-mgr
ii  ceph-mgr-dashboard   15.2.0-1~bpo10+1
all  dashboard module for ceph-mgr
ii  ceph-mgr-diskprediction-cloud15.2.0-1~bpo10+1
all  diskprediction-cloud module for ceph-mgr
ii  ceph-mgr-diskprediction-local15.2.0-1~bpo10+1
all  diskprediction-local module for ceph-mgr
ii  ceph-mgr-k8sevents   15.2.0-1~bpo10+1
all  kubernetes events module for ceph-mgr
ii  ceph-mgr-modules-core15.2.0-1~bpo10+1
all  ceph manager modules which are always enabled
ii  ceph-mgr-rook15.2.0-1~bpo10+1
all  rook module for ceph-mgr
ii  ceph-mon 15.2.0-1~bpo10+1
amd64monitor server for the ceph storage system
ii  ceph-osd 15.2.0-1~bpo10+1
amd64OSD server for the ceph storage system
ii  cephadm  15.2.0-1~bpo10+1
amd64cephadm utility to bootstrap ceph daemons with systemd
and containers
ii  libcephfs1   10.2.11-2
amd64Ceph distributed file system client library
ii  libcephfs2   15.2.0-1~bpo10+1
amd64Ceph distributed file system client library
ii  python-ceph-argparse 14.2.8-1
all  Python 2 utility libraries for Ceph CLI
ii  python3-ceph-argparse15.2.0-1~bpo10+1
all  Python 3 utility libraries for Ceph CLI
ii  python3-ceph-common  15.2.0-1~bpo10+1
all  Python 3 utility libraries for Ceph
ii  python3-cephfs   15.2.0-1~bpo10+1
amd64Python 3 libraries for the Ceph libcephfs library
root@ceph02:~# cat /etc/debian_version
10.3

Thanks,
Michael
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Multiple CephFS creation

2020-03-30 Thread Jarett DeAngelis
Hi guys,

This is documented as an experimental feature, but the documentation doesn't 
explain how to ensure that affinity for a given MDS sticks to the second 
filesystem you create. Has anyone had success implementing a second CephFS? In 
my case it will be based on a completely different pool from my first one.

Thanks.
J
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: samba ceph-vfs and scrubbing interval

2020-03-30 Thread David Disseldorp
Hi Marco and Jeff,

On Fri, 27 Mar 2020 08:04:56 -0400, Jeff Layton wrote:

> > i‘m running a 3 node ceph cluster setup with collocated mons and mds
> > for actually 3 filesystems at home since mimic. I’m planning to
> > downgrade to one FS and use RBD in the future, but this is another
> > story. I’m using the cluster as cold storage on spindles with EC-pools 
> > for archive purposes. The cluster usually does not run 24/7. I
> > actually managed to upgrade to octopus without problems yesterday. So
> > first of all: great job with the release. 
> > 
> > Now I have a little problem and a general question to address.
> > 
> > I have tried to share the CephFS via samba and the ceph-vfs module but
> > I could not manage to get write access (read access is not a problem)
> > to the share (even with the admin key). When I share the mounted path
> > (kernel module or fuser mount) instead as usual there are no problems
> > at all.  Is ceph-vfs generally read only and I missed this point?   
> 
> No. I haven't tested it in some time, but it does allow clients to
> write. When you say you can't get write access, what are you doing to
> test this, and what error are you getting back?

Is write access granted via a supplementary group ID? If so, this might
be https://bugzilla.samba.org/show_bug.cgi?id=14053 .
Fixing libcephfs supplementary group ID fallback behaviour was discussed
earlier via 
https://lists.ceph.io/hyperkitty/list/d...@ceph.io/thread/PCIOZRE5FJCQ2LZXLZCN5O2AA5AYU4KF/

Cheers, David
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Unable to use iscsi gateway with https | iscsi-gateway-add returns errors

2020-03-30 Thread Matthew Oliver
*sigh* and this time reply to all.

rbd-target-api is a little opinionated on where the ssl cert and key files
live and what they're named. It expects:

    cert_files = ['/etc/ceph/iscsi-gateway.crt',
                  '/etc/ceph/iscsi-gateway.key']

So make sure these exist, and are named correctly.
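
Something along these lines on every gateway node (sketch only; the
modulus check applies to RSA keys):

    ls -l /etc/ceph/iscsi-gateway.crt /etc/ceph/iscsi-gateway.key

    # the two digests should match if the wildcard cert and key belong together
    openssl x509 -noout -modulus -in /etc/ceph/iscsi-gateway.crt | openssl md5
    openssl rsa  -noout -modulus -in /etc/ceph/iscsi-gateway.key | openssl md5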

Otherwise, we probably need to see the log :)

On Tue, Mar 31, 2020 at 3:56 AM Mike Christie  wrote:

> On 03/29/2020 04:43 PM, givemeone  wrote:
> > Hi all,
> > I am installing ceph Nautilus and getting constantly errors while adding
> iscsi gateways
> > It was working using http schema but after moving to https with wildcard
> certs gives API errors
> >
> > Below some of my configurations
> > Thanks for your help
> >
> >
> > Command:
> > ceph --cluster ceph dashboard iscsi-gateway-add
> https://myadmin:admin.01@1.2.3.4:5050
> >
> > Error:
> > Error EINVAL: iscsi REST API cannot be reached. Please check your
> configuration and that the API endpoint is accessible
> >
> > Tried also disabling ssl verify
> > # ceph dashboard set-rgw-api-ssl-verify False
> > Option RGW_API_SSL_VERIFY updated
> >
> >
> > "/etc/ceph/iscsi-gateway.cfg" 23L, 977C
> > # Ansible managed
> > [config]
> > api_password = admin.01
> > api_port = 5050
> > # API settings.
> > # The API supports a number of options that allow you to tailor it to
> your
> > # local environment. If you want to run the API under https, you will
> need to
> > # create cert/key files that are compatible for each iSCSI gateway node,
> that is
> > # not locked to a specific node. SSL cert and key files *must* be called
> > # 'iscsi-gateway.crt' and 'iscsi-gateway.key' and placed in the
> '/etc/ceph/' directory
> > # on *each* gateway node. With the SSL files in place, you can use
> 'api_secure = true'
> > # to switch to https mode.
> > # To support the API, the bear minimum settings are:
> > api_secure = True
>
>
> Maybe sure after you set this value you restart the rbd-target-api
> daemons on all the nodes so the new value is used.
>
> We might also need to set
>
> api_ssl_verify = True
>
> for some gateway to gateway operations. I'm not sure what happened with
> the docs, because I do not see any info on it.
>
> > # Optional settings related to the CLI/API service
> > api_user = myadmin
> > cluster_name = ceph
> > loop_delay = 1
> > trusted_ip_list = 1.2.3.3,1.2.3.4
> >
> >
> >
> > Log  file
> > ==
>
> Are there any errors in /var/log/rbd-target-api/rbd-target-api.log?
> ___
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Multiple CephFS creation

2020-03-30 Thread Eugen Block

Hi,

to create a second filesystem you have to use different pools anyway.
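
For example, something like this (pool names and PG counts are just
placeholders; on releases that still require it you may first have to
allow multiple filesystems):

    ceph fs flag set enable_multiple true --yes-i-really-mean-it
    ceph osd pool create cephfs2_metadata 32
    ceph osd pool create cephfs2_data 128
    ceph fs new cephfs2 cephfs2_metadata cephfs2_data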

If you already have one CephFS up and running then you should also have 
at least one standby daemon, right? If you create a new FS and that 
standby daemon is not configured for any specific rank, it will be used 
for the second filesystem. You'll then have two active MDS daemons, each 
holding rank 0 of its own filesystem:


---snip---
ceph:~ # ceph fs status
cephfs - 1 clients
==
+--++---+---+---+---+
| Rank | State  |  MDS  |Activity   |  dns  |  inos |
+--++---+---+---+---+
|  0   | active | host6 | Reqs:0 /s |   10  |   13  |
+--++---+---+---+---+
+-+--+---+---+
|   Pool  |   type   |  used | avail |
+-+--+---+---+
| cephfs_metadata | metadata | 1536k | 92.0G |
|   cephfs_data   |   data   | 5053M | 92.0G |
+-+--+---+---+
cephfs2 - 0 clients
===
+--++---+---+---+---+
| Rank | State  |  MDS  |Activity   |  dns  |  inos |
+--++---+---+---+---+
|  0   | active | host5 | Reqs:0 /s |   10  |   13  |
+--++---+---+---+---+
+--+--+---+---+
|   Pool   |   type   |  used | avail |
+--+--+---+---+
| cephfs2_metadata | metadata | 1536k | 92.0G |
|   cephfs2_data   |   data   |0  | 92.0G |
+--+--+---+---+
+-+
| Standby MDS |
+-+
+-+
---snip---

For the standby daemon you have to be aware of this:

By default, if none of these settings are used, all MDS daemons which do 
not hold a rank will be used as 'standbys' for any rank.
[...]
When a daemon has entered the standby replay state, it will only be used 
as a standby for the rank that it is following. If another rank fails, 
this standby replay daemon will not be used as a replacement, even if no 
other standbys are available.


Some of the mentioned settings are, for example (see the sketch after 
this list):

mds_standby_for_rank
mds_standby_for_name
mds_standby_for_fscid
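
For instance, in ceph.conf something like this (values are placeholders;
newer releases handle MDS affinity differently, e.g. via the mds_join_fs
option, so check the documentation for your version):

    [mds.host5]
        mds_standby_for_fscid = 2   # fscid of cephfs2, see 'ceph fs dump'
        mds_standby_for_rank  = 0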

The easiest way is to have one standby daemon per CephFS and let them  
handle the failover.


Regards,
Eugen


Zitat von Jarett DeAngelis :


Hi guys,

This is documented as an experimental feature, but it doesn’t  
explain how to ensure that affinity for a given MDS sticks to the  
second filesystem you create. Has anyone had success implementing a  
second CephFS? In my case it will be based on a completely different  
pool from my first one.


Thanks.
J
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io



___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io