Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Maged Mokhtar


On 27/08/2019 21:39, Salsa wrote:
I'm running a Ceph installation in a lab to evaluate it for production 
and I have a cluster running, but I need to mount it on different Windows 
servers and desktops. I created an NFS share and was able to mount it 
on my Linux desktop, but not on a Windows 10 desktop. Since it seems that 
Windows Server 2016 is required to mount the NFS share, I abandoned that 
route and decided to try Samba.


I compiled a version of Samba that has the vfs_ceph module, but I 
can't set it up correctly. It seems I'm missing some user 
configuration, as I've hit this error:


"
~$ smbclient -U samba.gw //10.17.6.68/cephfs_a
WARNING: The "syslog" option is deprecated
Enter WORKGROUP\samba.gw's password:
session setup failed: NT_STATUS_LOGON_FAILURE
"
Does anyone know of any good setup tutorial to follow?

This is my smb config so far:

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba

[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf

Thanks

--
Salsa

Sent with ProtonMail Secure Email.




The error seems to be a Samba security issue. Below is a conf file we 
use; it uses the kernel client rather than vfs_ceph, but it may help with 
permissions:


[global]
workgroup = WORKGROUP
server string = Samba Server %v
security = user
map to guest = bad user

# clustering
netbios name= PETASAN
clustering=yes
passdb backend = tdbsam
idmap config * : backend = tdb2
idmap config * : range = 100-199
private dir = /mnt/cephfs/lock

[Public]
   path = /mnt/cephfs/share/public
   browseable = yes
   writable = yes
   guest ok = yes
   guest only = yes
   read only = no
   create mode = 0777
   directory mode = 0777
   force user = nobody

[Protected]
  path = /mnt/cephfs/share/protected
  valid users = @smbgroup
  guest ok = no
  writable = yes
  browsable = yes
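
If it helps, the shares above sit on a kernel-client mount of CephFS that is 
done before smbd starts. A minimal sketch of that mount (the monitor address 
and client name here are only examples, not from our setup):

```
# mount CephFS with the kernel client under the path Samba exports
mkdir -p /mnt/cephfs
mount -t ceph 10.17.6.68:6789:/ /mnt/cephfs \
    -o name=samba,secretfile=/etc/ceph/client.samba.secret
```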

Maged




Re: [ceph-users] health: HEALTH_ERR Module 'devicehealth' has failed: Failed to import _strptime because the import lock is held by another thread.

2019-08-28 Thread Peter Eisch

> Restart of single module is: `ceph mgr module disable devicehealth ; ceph mgr 
> module enable devicehealth`.

Thank you for your reply.  Then I receive an error, as the module can't be 
disabled.

I may have worked through this by restarting the nodes in rapid succession.
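
For what it's worth, a less disruptive option than rebooting whole nodes is 
probably to bounce only the active mgr daemon so the failed module reloads. 
A rough sketch (the mgr name is a placeholder; `ceph mgr dump` shows which 
one is active):

```
# fail over to a standby mgr; modules restart on the newly active daemon
ceph mgr fail <active-mgr-name>

# or restart the mgr service directly on that node
systemctl restart ceph-mgr.target
```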

peter



Peter Eisch
Senior Site Reliability Engineer
T1.612.659.3228
virginpulse.com


Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Salsa
This is the result:

# testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[homes]"
Processing section "[cephfs]"
Processing section "[printers]"
Processing section "[print$]"
Loaded services file OK.
Server role: ROLE_STANDALONE

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba
...
[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf

I cut off some parts I thought were not relevant.

--
Salsa

Sent with [ProtonMail](https://protonmail.com) Secure Email.

‐‐‐ Original Message ‐‐‐
On Wednesday, August 28, 2019 5:44 AM, Maged Mokhtar  
wrote:

> On 27/08/2019 21:39, Salsa wrote:
>
>> I'm running a ceph installation on a lab to evaluate for production and I 
>> have a cluster running, but I need to mount on different windows servers and 
>> desktops. I created an NFS share and was able to mount it on my Linux 
>> desktop, but not a Win 10 desktop. Since it seems that Windows server 2016 
>> is required to mount the NFS share I quit that route and decided to try 
>> samba.
>>
>> I compiled a version of Samba that has this vfs_ceph module, but I can't set 
>> it up correctly. It seems I'm missing some user configuration as I've hit 
>> this error:
>>
>> "
>> ~$ smbclient -U samba.gw //10.17.6.68/cephfs_a
>> WARNING: The "syslog" option is deprecated
>> Enter WORKGROUP\samba.gw's password:
>> session setup failed: NT_STATUS_LOGON_FAILURE
>> "
>> Does anyone know of any good setup tutorial to follow?
>>
>> This is my smb config so far:
>>
>> # Global parameters
>> [global]
>> load printers = No
>> netbios name = SAMBA-CEPH
>> printcap name = cups
>> security = USER
>> workgroup = CEPH
>> smbd: backgroundqueue = no
>> idmap config * : backend = tdb
>> cups options = raw
>> valid users = samba
>>
>> [cephfs]
>> create mask = 0777
>> directory mask = 0777
>> guest ok = Yes
>> guest only = Yes
>> kernel share modes = No
>> path = /
>> read only = No
>> vfs objects = ceph
>> ceph: user_id = samba
>> ceph:config_file = /etc/ceph/ceph.conf
>>
>> Thanks
>>
>> --
>> Salsa
>>
>> Sent with [ProtonMail](https://protonmail.com) Secure Email.
>>
>
> The error seems to be a samba security issue. below is a conf file we use, it 
> uses kernel client rather than vfs, but may help with permission:
>
> [global]
> workgroup = WORKGROUP
> server string = Samba Server %v
> security = user
> map to guest = bad user
>
> # clustering
> netbios name= PETASAN
> clustering=yes
> passdb backend = tdbsam
> idmap config * : backend = tdb2
> idmap config * : range = 100-199
> private dir = /mnt/cephfs/lock
>
> [Public]
>path = /mnt/cephfs/share/public
>browseable = yes
>writable = yes
>guest ok = yes
>guest only = yes
>read only = no
>create mode = 0777
>directory mode = 0777
>force user = nobody
>
> [Protected]
>   path = /mnt/cephfs/share/protected
>   valid users = @smbgroup
>   guest ok = no
>   writable = yes
>   browsable = yes
>
> Maged


Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Salsa
This is the result:

# testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[homes]"
Processing section "[cephfs]"
Processing section "[printers]"
Processing section "[print$]"
Loaded services file OK.
Server role: ROLE_STANDALONE

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba
...
[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf

I cut off some parts I thought were not relevant.

--
Salsa

Sent with [ProtonMail](https://protonmail.com) Secure Email.

‐‐‐ Original Message ‐‐‐
On Wednesday, August 28, 2019 3:09 AM, Konstantin Shalygin  
wrote:

>> I'm running a ceph installation on a lab to evaluate for production and I 
>> have a cluster running, but I need to mount on different windows servers and 
>> desktops. I created an NFS share and was able to mount it on my Linux 
>> desktop, but not a Win 10 desktop. Since it seems that Windows server 2016 
>> is required to mount the NFS share I quit that route and decided to try 
>> samba.
>>
>> I compiled a version of Samba that has this vfs_ceph module, but I can't set 
>> it up correctly. It seems I'm missing some user configuration as I've hit 
>> this error:
>>
>> "
>> ~$ smbclient -U samba.gw //10.17.6.68/cephfs_a
>> WARNING: The "syslog" option is deprecated
>> Enter WORKGROUP\samba.gw's password:
>> session setup failed: NT_STATUS_LOGON_FAILURE
>> "
>> Does anyone know of any good setup tutorial to follow?
>>
>> This is my smb config so far:
>>
>> # Global parameters
>> [global]
>> load printers = No
>> netbios name = SAMBA-CEPH
>> printcap name = cups
>> security = USER
>> workgroup = CEPH
>> smbd: backgroundqueue = no
>> idmap config * : backend = tdb
>> cups options = raw
>> valid users = samba
>>
>> [cephfs]
>> create mask = 0777
>> directory mask = 0777
>> guest ok = Yes
>> guest only = Yes
>> kernel share modes = No
>> path = /
>> read only = No
>> vfs objects = ceph
>> ceph: user_id = samba
>> ceph:config_file = /etc/ceph/ceph.conf
>>
>> Thanks
>
> Your configuration seems correct, but the conf may or may not have special 
> characters such as spaces or lower-case options. The first thing you should 
> do is run `testparm -s` and paste the output here.
>
> k


[ceph-users] Failure to start ceph-mon in docker

2019-08-28 Thread Robert LeBlanc
We are trying to set up a new Nautilus cluster using ceph-ansible with
containers. We got things deployed, but I couldn't run `ceph -s` on the host,
so I decided to `apt install ceph-common`, which installed the Luminous
version from Ubuntu 18.04. For some reason the Docker container that was
running the monitor restarted and now won't start. I added the repo for
Nautilus and upgraded ceph-common, but the problem persists. The manager and
OSD Docker containers don't seem to be affected at all. I see this in the
journal:

Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Starting Ceph Monitor...
Aug 28 20:40:55 sun-gcs02-osd01 docker[2926]: Error: No such container:
ceph-mon-sun-gcs02-osd01
Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Started Ceph Monitor.
Aug 28 20:40:55 sun-gcs02-osd01 docker[2949]: WARNING: Your kernel does not
support swap limit capabilities or the cgroup is not mounted. Memory
limited without swap.
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:40:56
 /opt/ceph-container/bin/entrypoint.sh: Existing mon, trying to rejoin
cluster...
Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: warning: line 41:
'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03
 /opt/ceph-container/bin/entrypoint.sh: /etc/ceph/ceph.conf is already
memory tuned
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03
 /opt/ceph-container/bin/entrypoint.sh: SUCCESS
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: PID 368: spawning
/usr/bin/ceph-mon --cluster ceph --default-log-to-file=false
--default-mon-cluster-log-to-file=false --setuser ceph --setgroup ceph -d
--mon-cluster-log-to-stderr --log-stderr-prefix=debug  -i sun-gcs02-osd01
--mon-data /var/lib/ceph/mon/ceph-sun-gcs02-osd01 --public-addr
10.65.101.21

Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: Waiting 368 to quit
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: warning: line 41:
'osd_memory_target' in section 'osd' redefined
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835
7f401283c180  0 set uid:gid to 167:167 (ceph:ceph)
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835
7f401283c180  0 ceph version 14.2.2
(4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process
ceph-mon, pid 368
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835
7f401283c180 -1 stat(/var/lib/ceph/mon/ceph-sun-gcs02-osd01) (13)
Permission denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28 20:41:03.835
7f401283c180 -1 error accessing monitor data directory at
'/var/lib/ceph/mon/ceph-sun-gcs02-osd01': (13) Permission denied
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: managing teardown
after SIGCHLD
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Waiting PID 368 to
terminate
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Process 368 is
terminated
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Bye Bye, container
will die with return code -1
Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: if you don't want
me to die and have access to a shell to debug this situation, next time run
me with '-e DEBUG=stayalive'
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service:
Main process exited, code=exited, status=255/n/a
Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]: ceph-mon@sun-gcs02-osd01.service:
Failed with result 'exit-code'.

The directories for the monitor are owned by 167:167, which matches the
UID:GID that the container reports.

root@sun-gcs02-osd01:~# ls -lhd /var/lib/ceph/
drwxr-x--- 14 ceph ceph 4.0K Jul 30 22:15 /var/lib/ceph/
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/
total 56K
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mds
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mgr
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-osd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd-mirror
drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rgw
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mds
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mgr
drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mon
drwxr-xr-x  14 167 167 4.0K Jul 30 22:28 osd
drwxr-xr-x   4 167 167 4.0K Aug  1 23:36 radosgw
drwxr-xr-x 254 167 167  12K Aug 28 20:44 tmp
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/
total 4.0K
drwxr-xr-x 3 167 167 4.0K Jul 30 22:16 ceph-sun-gcs02-osd01
root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/ceph-sun-gcs02-osd01/
total 16K
-rw--- 1 167 167   77 Jul 30 22:15 keyring
-rw-r--r-- 1 167 167    8 Jul 30 22:15 kv_backend
-rw-r--r-- 1 167 167    3 Jul 30 22:16 min_mon_release
drwxr-xr-x 2 167 167 4.0K Aug 28 19:16 store.db
root@sun-gcs02-osd01:~# ls -lh
/var/lib/ceph/mon/ceph-sun-gcs02-osd01/store.db/
total 149M
-rw-r--r-- 1 167 167 1.7M Aug 28 19:16 050225.log
-rw-r--r-- 1 167 167  65M Aug 28 19:16 050227.sst
-rw-r--r-- 1 167 167  45M Aug 28 19:16 050228.sst
-rw-r--r-- 1 167 167   16 Aug 16 07:40 

Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Bill Sharer
Your Windows client is failing to authenticate when it tries to mount 
the share.  That could be a simple fix or hideously complicated, 
depending on what type of Windows network you are running in.  Is this 
lab environment using a Windows server running as an Active Directory 
domain controller, or have you just been working with standalone installs 
of Linux and Windows in your lab?  Are your Windows installs simply 
based on a retail version of Windows Home, or do you have the Pro or 
Enterprise versions licensed?


If you are stuck with a Home-only version or simply want to do ad-hoc 
stuff without much further ado (probably why you have the SECURITY=USER 
stanza in your conf), then just look at using smbpasswd to create the 
password hashes necessary for SMB mounting.  This is necessary because 
Windows and Unix/Linux have different hashing schemes.  This Samba wiki 
link will probably be a good starting point for you:


https://wiki.samba.org/index.php/Setting_up_Samba_as_a_Standalone_Server
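
To make that concrete, a minimal sketch of the standalone setup matching the 
`valid users = samba` line in your conf, assuming a local Unix account named 
"samba" on the Samba gateway (adjust the name and share to taste):

```
# create a matching local Unix account (no login shell needed)
useradd -M -s /sbin/nologin samba

# add it to Samba's passdb, set the SMB password hash and enable the account
smbpasswd -a samba
smbpasswd -e samba

# then test against the [cephfs] share from the gateway itself
smbclient -U samba //10.17.6.68/cephfs
```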

If you are on an Active Directory network, you will end up mucking around 
in a lot more config files in order to get your Linux boxes to join the 
directory as members and then authenticate against the domain 
controllers.  That can also be a somewhat simple thing, but it can get 
hairy if your organization has infosec in mind and has applied hardening 
procedures.  That's when you might be breaking out Wireshark and analyzing 
the exchanges between Linux and the DC to figure out what sort of insanity 
is going on in your IT department.  If you aren't the domain admin, or 
aren't good friends with one who also knows Unix/Linux, you may never get 
anywhere.


Bill Sharer



On 8/28/19 2:32 PM, Salsa wrote:

This is the result:

# testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[homes]"
Processing section "[cephfs]"
Processing section "[printers]"
Processing section "[print$]"
Loaded services file OK.
Server role: ROLE_STANDALONE

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba
...
[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf


I cut off some parts I thought were not relevant.

--
Salsa

Sent with ProtonMail  Secure Email.

‐‐‐ Original Message ‐‐‐
On Wednesday, August 28, 2019 3:09 AM, Konstantin Shalygin 
 wrote:





I'm running a ceph installation on a lab to evaluate for production and I have 
a cluster running, but I need to mount on different windows servers and 
desktops. I created an NFS share and was able to mount it on my Linux desktop, 
but not a Win 10 desktop. Since it seems that Windows server 2016 is required 
to mount the NFS share I quit that route and decided to try samba.

I compiled a version of Samba that has this vfs_ceph module, but I can't set it 
up correctly. It seems I'm missing some user configuration as I've hit this 
error:

"
~$ smbclient -U samba.gw //10.17.6.68/cephfs_a
WARNING: The "syslog" option is deprecated
Enter WORKGROUP\samba.gw's password:
session setup failed: NT_STATUS_LOGON_FAILURE
"
Does anyone know of any good setup tutorial to follow?

This is my smb config so far:

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba

[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf

Thanks



Your configuration seems correct, but the conf may or may not have special 
characters such as spaces or lower-case options. The first thing you should 
do is run `testparm -s` and paste the output here.




k






Re: [ceph-users] Failure to start ceph-mon in docker

2019-08-28 Thread Robert LeBlanc
Turns out /var/lib/ceph was owned by ceph:ceph and not 167:167; chowning it
made things work. I guess only the monitor needs that permission; rgw, mgr,
and osd are all happy without it being 167:167.
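
For anyone hitting the same thing, roughly what the fix looked like here (a
sketch; the containerized ceph user maps to UID/GID 167, and in our case only
the top-level directory needed the change):

```
# the mon bind-mounts /var/lib/ceph, so the directory itself must be 167:167
chown 167:167 /var/lib/ceph

# the mon data underneath was already 167:167 (see the listings in the
# previous mail), so no recursion was needed; then restart the mon unit
systemctl restart ceph-mon@sun-gcs02-osd01
```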

Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


On Wed, Aug 28, 2019 at 1:45 PM Robert LeBlanc  wrote:

> We are trying to set up a new Nautilus cluster using ceph-ansible with
> containers. We got things deployed, but I couldn't run `ceph s` on the host
> so decided to `apt install ceph-common and installed the Luminous version
> from Ubuntu 18.04. For some reason the docker container that was running
> the monitor restarted and won't restart. I added the repo for Nautilus and
> upgraded ceph-common, but the problem persists. The Manager and OSD docker
> containers don't seem to be affected at all. I see this in the journal:
>
> Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Starting Ceph Monitor...
> Aug 28 20:40:55 sun-gcs02-osd01 docker[2926]: Error: No such container:
> ceph-mon-sun-gcs02-osd01
> Aug 28 20:40:55 sun-gcs02-osd01 systemd[1]: Started Ceph Monitor.
> Aug 28 20:40:55 sun-gcs02-osd01 docker[2949]: WARNING: Your kernel does
> not support swap limit capabilities or the cgroup is not mounted. Memory
> limited without swap.
> Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:40:56
>  /opt/ceph-container/bin/entrypoint.sh: Existing mon, trying to rejoin
> cluster...
> Aug 28 20:40:56 sun-gcs02-osd01 docker[2949]: warning: line 41:
> 'osd_memory_target' in section 'osd' redefined
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03
>  /opt/ceph-container/bin/entrypoint.sh: /etc/ceph/ceph.conf is already
> memory tuned
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: 2019-08-28 20:41:03
>  /opt/ceph-container/bin/entrypoint.sh: SUCCESS
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: PID 368: spawning
> /usr/bin/ceph-mon --cluster ceph --default-log-to-file=false
> --default-mon-cluster-log-to-file=false --setuser ceph --setgroup ceph -d
> --mon-cluster-log-to-stderr --log-stderr-prefix=debug  -i sun-gcs02-osd01
> --mon-data /var/lib/ceph/mon/ceph-sun-gcs02-osd01 --public-addr
> 10.65.101.21
>
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: exec: Waiting 368 to quit
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: warning: line 41:
> 'osd_memory_target' in section 'osd' redefined
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28
> 20:41:03.835 7f401283c180  0 set uid:gid to 167:167 (ceph:ceph)
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28
> 20:41:03.835 7f401283c180  0 ceph version 14.2.2
> (4f8fa0a0024755aae7d95567c63f11d6862d55be) nautilus (stable), process
> ceph-mon, pid 368
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28
> 20:41:03.835 7f401283c180 -1 stat(/var/lib/ceph/mon/ceph-sun-gcs02-osd01)
> (13) Permission denied
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: debug 2019-08-28
> 20:41:03.835 7f401283c180 -1 error accessing monitor data directory at
> '/var/lib/ceph/mon/ceph-sun-gcs02-osd01': (13) Permission denied
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: managing teardown
> after SIGCHLD
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Waiting PID 368 to
> terminate
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Process 368 is
> terminated
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: Bye Bye, container
> will die with return code -1
> Aug 28 20:41:03 sun-gcs02-osd01 docker[2949]: teardown: if you don't want
> me to die and have access to a shell to debug this situation, next time run
> me with '-e DEBUG=stayalive'
> Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]:
> ceph-mon@sun-gcs02-osd01.service: Main process exited, code=exited,
> status=255/n/a
> Aug 28 20:41:04 sun-gcs02-osd01 systemd[1]:
> ceph-mon@sun-gcs02-osd01.service: Failed with result 'exit-code'.
>
> The directories for the monitor are owned by 167.167 and matches the
> UID.GID that the container reports.
>
> oot@sun-gcs02-osd01:~# ls -lhd /var/lib/ceph/
> drwxr-x--- 14 ceph ceph 4.0K Jul 30 22:15 /var/lib/ceph/
> root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/
> total 56K
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mds
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-mgr
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-osd
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rbd-mirror
> drwxr-xr-x   2 167 167 4.0K Jul 30 22:16 bootstrap-rgw
> drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mds
> drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mgr
> drwxr-xr-x   3 167 167 4.0K Jul 30 22:15 mon
> drwxr-xr-x  14 167 167 4.0K Jul 30 22:28 osd
> drwxr-xr-x   4 167 167 4.0K Aug  1 23:36 radosgw
> drwxr-xr-x 254 167 167  12K Aug 28 20:44 tmp
> root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/mon/
> total 4.0K
> drwxr-xr-x 3 167 167 4.0K Jul 30 22:16 ceph-sun-gcs02-osd01
> root@sun-gcs02-osd01:~# ls -lh /var/lib/ceph/

[ceph-users] Specify OSD size and OSD journal size with ceph-ansible

2019-08-28 Thread Robert LeBlanc
I have a new cluster and I'd like to put the DB on the NVMe device, but
only make it 30GB, then use 100GB of the rest of the NVMe as an OSD for the
RGW metadata pool.

I set up the disks like the conf below without the block_db_size and it
created all the LVs on the HDDs and one LV on the NVMe that took up all the
space.

I've tried using block_db_size in vars, and also as a property in the list
for each OSD disk, but neither works.

With block_db_size in the vars I get:
failed: [sun-gcs02-osd01] (item={u'db': u'/dev/nvme0n1', u'data':
u'/dev/sda', u'crush_device_class': u'hdd'}) => changed=true
 ansible_loop_var: item
 cmd:


 - docker

   - run
 - --rm
 - --privileged
 - --net=host
 - --ipc=host
 - --ulimit
 - nofile=1024:1024
 - -v
 - /run/lock/lvm:/run/lock/lvm:z
 - -v
 - /var/run/udev/:/var/run/udev/:z
 - -v
 - /dev:/dev
 - -v
 - /etc/ceph:/etc/ceph:z
 - -v
 - /run/lvm/:/run/lvm/
 - -v
 - /var/lib/ceph/:/var/lib/ceph/:z
 - -v
 - /var/log/ceph/:/var/log/ceph/:z
 - --entrypoint=ceph-volume
 - docker.io/ceph/daemon:latest
 - --cluster
 - ceph
 - lvm
 - prepare
 - --bluestore
 - --data
 - /dev/sda
 - --block.db
 - /dev/nvme0n1
 - --crush-device-class
 - hdd
 delta: '0:00:05.004777'
 end: '2019-08-28 23:26:39.074850'
 item:
   crush_device_class: hdd
   data: /dev/sda
   db: /dev/nvme0n1
 msg: non-zero return code
 rc: 1
 start: '2019-08-28 23:26:34.070073'
 stderr: '-->  RuntimeError: unable to use device'
 stderr_lines: 
 stdout: |-
   Running command: /bin/ceph-authtool --gen-print-key
   Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new
bcc7b3c3-6203-47c7-9f34-7b2e2060bf59
   Running command: /usr/sbin/vgcreate -s 1G --force --yes
ceph-76cd6a80-17dd-4a89-a35b-0844026bc9d4 /dev/sda
stdout: Physical volume "/dev/sda" successfully created.
stdout: Volume group "ceph-76cd6a80-17dd-4a89-a35b-0844026bc9d4"
successfully created
   Running command: /usr/sbin/lvcreate --yes -l 100%FREE -n
osd-block-bcc7b3c3-6203-47c7-9f34-7b2e2060bf59
ceph-76cd6a80-17dd-4a89-a35b-0844026bc9d4
stdout: Logical volume "osd-block-bcc7b3c3-6203-47c7-9f34-7b2e2060bf59"
created.
   --> blkid could not detect a PARTUUID for device: /dev/nvme0n1
   --> Was unable to complete a new OSD, will rollback changes
   Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd
--keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.21
--yes-i-really-mean-it
stderr: purged osd.21
  stdout_lines: 

... (One for each device) ...

And the LVs are created for all the HDD OSDs and none on the NVMe.

Looking through the code I don't see a way to set a size for the OSD, but
maybe I'm just missing it as I'm really new to Ansible.

osds:
 hosts:
   sun-gcs02-osd[01:43]:
   sun-gcs02-osd[45:60]:
 vars:
   block_db_size: 32212254720
   lvm_volumes:
 - data: '/dev/sda'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdb'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdc'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdd'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sde'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdf'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdg'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdh'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdi'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdj'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdk'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/sdl'
   db: '/dev/nvme0n1'
   crush_device_class: 'hdd'
 - data: '/dev/nvme0n1'  # Use the rest for metadata
   crush_device_class: 'nvme'

With block_db_size set for each disk, I got an error during the parameter
checking phase in Ansible and no LVs were created.

Please help me understand how to configure what I would like to do.
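
In case it helps frame the question: the workaround I'm considering is to
pre-create the LVs myself and point lvm_volumes at them by name instead of at
raw devices. A rough sketch of that idea (VG/LV names are made up, and I
haven't verified it against the ceph-ansible docs):

```
# carve up the NVMe by hand: one 30G DB LV per HDD OSD,
# plus a 100G LV for the RGW metadata OSD
vgcreate nvme-vg /dev/nvme0n1
for i in $(seq 1 12); do lvcreate -L 30G -n db-$i nvme-vg; done
lvcreate -L 100G -n osd-meta nvme-vg
```

If I read the role correctly, lvm_volumes entries can then use data/data_vg
and db/db_vg pairs that reference those LV and VG names, but confirmation
would be appreciated.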

Thank you,
Robert LeBlanc


Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904  C70E E654 3BB2 FA62 B9F1


Re: [ceph-users] iostat and dashboard freezing

2019-08-28 Thread Reed Dier
Just a follow-up 24 hours later: the mgrs seem to be far more stable, and 
have had no issues or weirdness since disabling the balancer module.

Which isn't great, because the balancer plays an important role, but after 
fighting distribution for a few weeks and getting it 'good enough', I'm 
taking the stability.
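
If I have to rebalance again before this is sorted out, the plan is to leave
automatic mode off and only drive the balancer by hand while watching the
mgr. Roughly (the plan name is arbitrary):

```
ceph balancer off                    # keep automatic balancing disabled
ceph balancer mode upmap
ceph balancer optimize manual-plan   # build a one-off plan
ceph balancer show manual-plan       # review the proposed upmap changes
ceph balancer execute manual-plan    # apply it
ceph balancer rm manual-plan
```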

Just wanted to follow up with another 2¢.

Reed

> On Aug 27, 2019, at 11:53 AM, Reed Dier  wrote:
> 
> Just to further piggyback,
> 
> Probably the most "hard" the mgr seems to get pushed is when the balancer is 
> engaged.
> When trying to eval a pool or cluster, it takes upwards of 30-120 seconds for 
> it to score it, and then another 30-120 seconds to execute the plan, and it 
> never seems to engage automatically.
> 
>> $ time ceph balancer status
>> {
>> "active": true,
>> "plans": [],
>> "mode": "upmap"
>> }
>> 
>> real    0m36.490s
>> user    0m0.259s
>> sys     0m0.044s
> 
> 
> I'm going to disable mine as well, and see if I can stop waking up to 'No 
> Active MGR.'
> 
> 
> You can see when I lose mgr's because RBD image stats go to 0 until I catch 
> it.
> 
> Thanks,
> 
> Reed
> 
>> On Aug 27, 2019, at 11:24 AM, Jake Grimmett > > wrote:
>> 
>> Hi Reed, Lenz, John
>> 
>> I've just tried disabling the balancer; so far ceph-mgr is keeping its
>> CPU mostly under 20%, even with both the iostat and dashboard back on.
>> 
>> # ceph balancer off
>> 
>> was
>> [root@ceph-s1 backup]# ceph balancer status
>> {
>>"active": true,
>>"plans": [],
>>"mode": "upmap"
>> }
>> 
>> now
>> [root@ceph-s1 backup]# ceph balancer status
>> {
>>"active": false,
>>"plans": [],
>>"mode": "upmap"
>> }
>> 
>> We are using 8:2 erasure encoding across 324 12TB OSD, plus 4 NVMe OSD
>> for a replicated cephfs metadata pool.
>> 
>> let me know if the balancer is your problem too...
>> 
>> best,
>> 
>> Jake
>> 
>> On 8/27/19 3:57 PM, Jake Grimmett wrote:
>>> Yes, the problem still occurs with the dashboard disabled...
>>> 
>>> Possibly relevant, when both the dashboard and iostat plugins are
>>> disabled, I occasionally see ceph-mgr rise to 100% CPU.
>>> 
>>> as suggested by John Hearns, the output of  gstack ceph-mgr when at 100%
>>> is here:
>>> 
>>> http://p.ip.fi/52sV 
>>> 
>>> many thanks
>>> 
>>> Jake
>>> 
>>> On 8/27/19 3:09 PM, Reed Dier wrote:
 I'm currently seeing this with the dashboard disabled.
 
 My instability decreases, but isn't wholly cured, by disabling
 prometheus and rbd_support, which I use in tandem, as the only thing I'm
 using the prom-exporter for is the per-rbd metrics.
 
> ceph mgr module ls
> {
> "enabled_modules": [
> "diskprediction_local",
> "influx",
> "iostat",
> "prometheus",
> "rbd_support",
> "restful",
> "telemetry"
> ],
 
 I'm on Ubuntu 18.04, so that doesn't corroborate with some possible OS
 correlation.
 
 Thanks,
 
 Reed
 
> On Aug 27, 2019, at 8:37 AM, Lenz Grimmer  > wrote:
> 
> Hi Jake,
> 
> On 8/27/19 3:22 PM, Jake Grimmett wrote:
> 
>> That exactly matches what I'm seeing:
>> 
>> when iostat is working OK, I see ~5% CPU use by ceph-mgr
>> and when iostat freezes, ceph-mgr CPU increases to 100%
> 
> Does this also occur if the dashboard module is disabled? Just wondering
> if this is isolatable to the iostat module. Thanks!
> 
> Lenz
> 
> -- 
> SUSE Software Solutions Germany GmbH - Maxfeldstr. 5 - 90409 Nuernberg
> GF: Felix Imendörffer, HRB 247165 (AG Nürnberg)
> 
 
>>> 
>>> 
>> 
>> 
>> -- 
>> MRC Laboratory of Molecular Biology
>> Francis Crick Avenue,
>> Cambridge CB2 0QH, UK.
>> 
> 





Re: [ceph-users] health: HEALTH_ERR Module 'devicehealth' has failed: Failed to import _strptime because the import lock is held by another thread.

2019-08-28 Thread Konstantin Shalygin

On 8/28/19 8:16 PM, Peter Eisch wrote:


Thank you for your reply. The I receive an error as the module can't 
be disabled.


I may have worked through this by restarting the nodes in a rapid 
succession. 



What exactly is the error? You may have hit a bug and should create a 
Redmine ticket for this issue.
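
For the ticket, the module traceback from the active mgr is usually the most 
useful piece; something like this should collect it (the mgr name is a 
placeholder, and the traceback may instead be in /var/log/ceph/ceph-mgr.*.log):

```
ceph health detail        # the module failure summary
ceph mgr module ls        # confirm the devicehealth module state
journalctl -u ceph-mgr@<active-mgr> | grep -i -A20 devicehealth
```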





k



Re: [ceph-users] Ceph + SAMBA (vfs_ceph)

2019-08-28 Thread Konstantin Shalygin


On 8/29/19 1:32 AM, Salsa wrote:

This is the result:

# testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
Processing section "[homes]"
Processing section "[cephfs]"
Processing section "[printers]"
Processing section "[print$]"
Loaded services file OK.
Server role: ROLE_STANDALONE

# Global parameters
[global]
load printers = No
netbios name = SAMBA-CEPH
printcap name = cups
security = USER
workgroup = CEPH
smbd: backgroundqueue = no
idmap config * : backend = tdb
cups options = raw
valid users = samba
...
[cephfs]
create mask = 0777
directory mask = 0777
guest ok = Yes
guest only = Yes
kernel share modes = No
path = /
read only = No
vfs objects = ceph
ceph: user_id = samba
ceph:config_file = /etc/ceph/ceph.conf


I cut off some parts I thought were not relevant.



Use `map to guest = Bad User` instead of `valid users = samba`.

```

[cephfs]
  path = /
  vfs objects = acl_xattr ceph
  ceph: config_file = /etc/ceph/ceph.conf
  ceph: user_id = samba
  oplocks = no
  kernel share modes = no
  browseable = yes
  public = yes
  writable = yes
  guest ok = yes
  force user = root
  force group = root
  create mask = 0644
  directory mode = 0755
```
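
Also check that smbd can actually reach the cluster as `client.samba`. A 
sketch of creating that cephx user (pool name and caps are only an example, 
adjust for your cluster):

```
# create the user referenced by `ceph: user_id = samba` and put its keyring
# next to the ceph.conf given in the share definition
ceph auth get-or-create client.samba \
    mon 'allow r' mds 'allow rw' osd 'allow rw pool=cephfs_data' \
    -o /etc/ceph/ceph.client.samba.keyring
```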

Reload and try `smbclient -U guest -N //10.17.6.68/cephfs`



k



Re: [ceph-users] iostat and dashboard freezing

2019-08-28 Thread Konstantin Shalygin

Just a follow up 24h later, and the mgr's seem to be far more stable, and have 
had no issues or weirdness after disabling the balancer module.

Which isn't great, because the balancer plays an important role, but after 
fighting distribution for a few weeks and getting it 'good enough' I'm taking 
the stability.

Just wanted to follow up with another 2¢.
What are your balancer settings (`ceph config-key ls`)? Is your mgr running 
in a virtual environment or on bare metal?


How many pools do you have? Please also paste `ceph osd tree` and `ceph osd 
df tree`.


Measure the time of balancer plan creation: `time ceph balancer optimize new`.



k
