Re: [ceph-users] osd prepare issue device-mapper mapping

2018-07-13 Thread Jacob DeGlopper
You have LVM data on /dev/sdb already; you will need to remove that 
before you can use ceph-disk on that device.


Use the LVM commands 'lvs', 'vgs', and 'pvs' to list the logical volumes, 
volume groups, and physical volumes defined.  Once you're sure you don't 
need the data, lvremove, vgremove, and pvremove them, then zero the disk 
using 'dd if=/dev/zero of=/dev/sdb bs=1M count=10'.  Note that this 
command wipes the disk - be absolutely sure you're wiping the right disk.
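
For example (a sketch only - the VG/LV names below are placeholders,
substitute whatever lvs/vgs/pvs actually report):

    lvs ; vgs ; pvs                              # see what's defined on /dev/sdb1
    lvremove /dev/<vg_name>/<lv_name>            # remove each LV you don't need
    vgremove <vg_name>                           # then the volume group
    pvremove /dev/sdb1                           # then the physical volume
    dd if=/dev/zero of=/dev/sdb bs=1M count=10   # finally zero the start of the disk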


    -- jacob


On 07/13/2018 03:26 PM, Satish Patel wrote:

I am installing Ceph on my lab box using ceph-ansible. I have two HDDs
for OSDs, and I am getting the following error on one of the OSDs; not
sure what the issue is.



[root@ceph-osd-01 ~]# ceph-disk prepare --cluster ceph --bluestore /dev/sdb
ceph-disk: Error: Device /dev/sdb1 is in use by a device-mapper
mapping (dm-crypt?): dm-0


[root@ceph-osd-01 ~]# ceph-disk list
/dev/dm-0 other, xfs, mounted on /
/dev/sda :
  /dev/sda1 other, xfs, mounted on /boot
  /dev/sda2 swap, swap
/dev/sdb :
  /dev/sdb1 other, LVM2_member
/dev/sdc :
  /dev/sdc1 ceph data, active, cluster ceph, osd.3, block /dev/sdc2
  /dev/sdc2 ceph block, for /dev/sdc1
/dev/sr0 other, unknown


Re: [ceph-users] osd prepare issue device-mapper mapping

2018-07-13 Thread Jacob DeGlopper
Also, looking at your ceph-disk list output, the LVM on /dev/sdb is probably 
your root filesystem and cannot be wiped.  If you'd like to send the output 
of the 'mount' and 'lvs' commands, we should be able to tell.
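
For example, something along these lines should show it (a quick sketch,
nothing cluster-specific):

    mount | grep ' / '                 # what device is / mounted from?
    lvs -o lv_name,vg_name,devices     # which physical devices back each LV?
    lsblk /dev/sdb                     # is sdb1 part of that same VG?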


    -- jacob




[ceph-users] ceph-container - rbd map failing since upgrade?

2018-08-21 Thread Jacob DeGlopper
I'm seeing an error from the rbd map command running in ceph-container; 
I had initially deployed this cluster as Luminous, but a pull of the 
ceph/daemon container unexpectedly upgraded me to Mimic 13.2.1.


[root@nodeA2 ~]# ceph version
ceph version 13.2.1 (5533ecdc0fda920179d7ad84e0aa65a127b20d77) mimic 
(stable)


[root@nodeA2 ~]# rbd info mysqlTB
rbd image 'mysqlTB':
    size 360 GiB in 92160 objects
    order 22 (4 MiB objects)
    id: 206a962ae8944a
    block_name_prefix: rbd_data.206a962ae8944a
    format: 2
    features: layering
    op_features:
    flags:
    create_timestamp: Sat Aug 11 00:00:36 2018

[root@nodeA2 ~]# rbd map mysqlTB
rbd: failed to add secret 'client.admin' to kernel
In some cases useful info is found in syslog - try "dmesg | tail".
rbd: map failed: (1) Operation not permitted

[root@nodeA2 ~]# type rbd
rbd is a function
rbd ()
{
    sudo docker exec ceph-mon-nodeA2 rbd --cluster ceph ${@}
}

[root@nodeA2 ~]# ls -alF /etc/ceph/ceph.client.admin.keyring
-rw------- 1 ceph ceph 159 May 21 09:27 /etc/ceph/ceph.client.admin.keyring
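
A couple of things still on my list to try (sketch only; the host may not
even have ceph-common installed, so the second one may not apply):

    # confirm the key the wrapper actually sees inside the mon container
    sudo docker exec ceph-mon-nodeA2 ceph auth print-key client.admin

    # try the map with a host-side rbd binary, bypassing the docker exec
    # wrapper and passing the keyring explicitly
    sudo rbd map mysqlTB --id admin --keyring /etc/ceph/ceph.client.admin.keyring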

System is CentOS 7 with the elrepo mainline kernel:

[root@nodeA2 ~]# uname -a
Linux nodeA2 4.18.3-1.el7.elrepo.x86_64 #1 SMP Sat Aug 18 09:30:18 EDT 
2018 x86_64 x86_64 x86_64 GNU/Linux


I see a similar question here with no answer: 
https://github.com/ceph/ceph-container/issues/1030


dmesg shows nothing related:



[ceph-users] Safe to use RBD mounts for Docker volumes on containerized Ceph nodes

2018-09-06 Thread Jacob DeGlopper
I've seen the requirement not to mount RBD devices or CephFS filesystems 
on OSD nodes.  Does this still apply when the OSDs and clients using the 
RBD volumes are all in Docker containers?


That is, is it possible to run a 3-server setup in production with both 
Ceph daemons (mon, mgr, and OSD) in containers, along with applications 
in containers using Ceph as shared storage (Elasticsearch, gitlab, etc)?


    -- jacob



[ceph-users] Does ceph-ansible support the LVM OSD scenario under Docker?

2018-04-26 Thread Jacob DeGlopper
Hi - I'm trying to set up our first Ceph deployment with a small set of 
3 servers, using an SSD boot drive each and 2x Micron 5200 SSDs per 
server for OSD drives.  It appears that Ceph under Docker gives us an 
allowable production config using 3 servers rather than 6.  We are using 
CentOS 7.4 as the host operating system.


I was able to get the cluster up and running using ceph-ansible with one 
OSD per drive, using Bluestore, but 4k block and MySQL performance was 
below the performance of a single SSD.  One possible tuning step appears 
to be running 4-6 OSDs per SSD rather than 1, but I'm having trouble 
getting ceph-ansible to provision that.  My impression is that I should 
preconfigure LVM volumes for the OSDs, and I've done that (this is a 
test on a clean VM install to eliminate any leftovers from my previous 
cluster):


  osda1 vg3    -wi-a- 13.50g
  osda2 vg3    -wi-a- 13.50g
  osda3 vg3    -wi-a- 13.50g
  osda4 vg3    -wi-a- 13.50g
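
(For reference, LVs like these can be created with something like the
following - sketch only, assuming the VG vg3 already exists and has
enough free space:)

    for i in 1 2 3 4; do
        lvcreate -L 13.5G -n osda$i vg3
    done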

I configured osd.yml to include:

osd_scenario: lvm
osd_objectstore: bluestore
lvm_volumes:
  - data: osda1
    data_vg: vg3
  - data: osda2
    data_vg: vg3
  - data: osda3
    data_vg: vg3
  - data: osda4
    data_vg: vg3

and I see some OSD tasks being run by ansible, but there doesn't seem to 
be a startup script enabled for OSDs, and no OSD containers are running:


[root@ceph1 jacob]# systemctl  | grep ceph
ceph-mgr@ceph1.service loaded active running   Ceph Manager
ceph-mon@ceph1.service loaded active running   Ceph Monitor
system-ceph\x2dmgr.slice loaded active active    system-ceph\x2dmgr.slice
system-ceph\x2dmon.slice loaded active active    system-ceph\x2dmon.slice

[root@ceph1 jacob]# docker ps
CONTAINER ID   IMAGE                          COMMAND            CREATED          STATUS          PORTS   NAMES
037197ef9bac   docker.io/ceph/daemon:latest   "/entrypoint.sh"   5 minutes ago    Up 5 minutes            ceph-mgr-ceph1
4947e5c1c544   docker.io/ceph/daemon:latest   "/entrypoint.sh"   17 minutes ago   Up 17 minutes           ceph-mon-ceph1
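
(A few checks that might narrow this down - nothing from the playbook
itself, just a sketch:)

    systemctl list-units --all | grep ceph-osd    # any OSD units at all, even failed ones?
    docker ps -a | grep -i osd                    # any exited OSD containers?
    docker exec ceph-mon-ceph1 ceph osd tree      # does the cluster know about any OSDs?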


all.yml is configured for Docker:

ceph_origin: repository
ceph_repository: community
ceph_stable_release: luminous
fsid: "{{ cluster_uuid.stdout }}"
generate_fsid: true
monitor_interface: eth0
public_network: 192.168.122.0/24
cluster_network: 192.168.122.0/24
ceph_tcmalloc_max_total_thread_cache: 134217728
mon_containerized_deployment: true
osd_containerized_deployment: true
docker: true
containerized_deployment: true

Does ceph-ansible support LVM OSDs using Docker?



Re: [ceph-users] Shared WAL/DB device partition for multiple OSDs?

2018-05-11 Thread Jacob DeGlopper

Thanks, this is useful in general.  I have a semi-related question:

Given an OSD server with multiple SSDs or NVME devices, is there an 
advantage to putting wal/db on a different device of the same speed?  
For example, data on sda1, matching wal/db on sdb1,  and then data on 
sdb2 and wal/db on sda2?


    -- jacob


On 05/11/2018 12:46 PM, David Turner wrote:
This thread is off in left field and needs to be brought back to how 
things work.


While multiple OSDs can use the same device for block/wal partitions, 
they each need their own partition.  osd.0 could use nvme0n1p1, 
osd.2/nvme0n1p2, etc.  You cannot use the same partition for each 
osd.  Ceph-volume will not create the db/wal partitions for you, you 
need to manually create the partitions to be used by the OSD.  There 
is no need to put a filesystem on top of the partition for the 
wal/db.  That is wasted overhead that will slow things down.


Back to the original email.

> Or do I need to use osd-db=/dev/nvme0n1p2 for data=/dev/sdb,
> osd-db=/dev/nvme0n1p3 for data=/dev/sdc, and so on?
This is what you need to do, but as I said above, you need to create 
the partitions for --block-db yourself.  You talked about having a 
10GB partition for this, but the general recommendation for block.db 
partitions is 10GB per 1TB of OSD.  If your OSD is a 4TB disk you 
should be looking closer to a 40GB block.db partition.  If your 
block.db partition is too small, then once it fills up it will spill 
over onto the data volume and slow things down.


> And just to make sure - if I specify "--osd-db", I don't need
> to set "--osd-wal" as well, since the WAL will end up on the
> DB partition automatically, correct?
This is correct.  The wal will automatically be placed on the db if 
not otherwise specified.



I don't use ceph-deploy, but the process for creating the OSDs should 
be something like this.  After the OSDs are created it is a good idea 
to make sure that the OSD is not looking for the db partition by its 
/dev/nvme0n1p2 device name, as that can change on reboots if you have 
multiple nvme devices.


# Make sure the disks are clean and ready to use as an OSD
for hdd in /dev/sd{b..c}; do
  ceph-volume lvm zap $hdd --destroy
done

# Create the nvme db partitions (assuming 10G size for a 1TB OSD)
for partition in {2..3}; do
  sgdisk -n $partition:0:+10G -c $partition:'ceph db' /dev/nvme0n1
done

# Create the OSD
echo "/dev/sdb /dev/nvme0n1p2
/dev/sdc /dev/nvme0n1p3" | while read hdd db; do
  ceph-volume lvm create --bluestore --data $hdd --block.db $db
done

# Fix the OSDs to look for the block.db partition by UUID instead of its device name.

for db in /var/lib/ceph/osd/*/block.db; do
  dev=$(readlink $db | grep -Eo 'nvme[[:digit:]]+n[[:digit:]]+p[[:digit:]]+' || echo false)

  if [[ "$dev" != false ]]; then
    uuid=$(ls -l /dev/disk/by-partuuid/ | awk '/'${dev}'$/ {print $9}')
    ln -sf /dev/disk/by-partuuid/$uuid $db
  fi
done
systemctl restart ceph-osd.target

On Fri, May 11, 2018 at 10:59 AM João Paulo Sacchetto Ribeiro Bastos wrote:


Actually, if you go to
https://ceph.com/community/new-luminous-bluestore/ you will see
that the DB/WAL live on an XFS partition, while the data itself goes on
a raw block device.

Also, I told you the wrong command in the last mail. When I said
--osd-db it should be --block-db.

On Fri, May 11, 2018 at 11:51 AM Oliver Schulz wrote:

Hi,

thanks for the advice! I'm a bit confused now, though. ;-)
I thought DB and WAL were supposed to go on raw block
devices, not file systems?


Cheers,

Oliver


On 11.05.2018 16:01, João Paulo Sacchetto Ribeiro Bastos wrote:
> Hello Oliver,
>
> As far as I know, you can use the same DB device for about 4 or 5
> OSDs; you just need to be aware of the free space. I'm also developing a
> bluestore cluster, and our DB and WAL will be on the same SSD of about
> 480GB serving 4 OSD HDDs of 4 TB each. About the sizes, it's just a
> feeling, because I couldn't yet find any clear rule about how to measure
> the requirements.
>
> * The only concern that took me some time to realize is that you should
> create an XFS partition if using ceph-deploy, because if you don't it will
> simply give you a RuntimeError that doesn't give any hint about what's
> going on.
>
> So, answering your question, you could do something like:
> $ ceph-deploy osd create --bluestore --data=/dev/sdb --block-db
> /dev/nvme0n1p1 $HOSTNAME
> $ ceph-deploy osd create --bluestore --data=/dev/sdc --block-db
> /dev/nvme0n1p1 $HOSTNAME
>
> On Fri, May 11, 2018 at 10:35 AM Oliver Schulz wrote:


Re: [ceph-users] [PROBLEM] Fail in deploy do ceph on RHEL

2018-05-18 Thread Jacob DeGlopper
Hi Antonio - you need to set !requiretty in your sudoers file.  This is 
documented here: 
http://docs.ceph.com/docs/jewel/start/quick-ceph-deploy/   but it 
appears that section may not have been copied into the current docs.
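
Something like this via visudo on node1 should do it (assuming the deploy 
user is 'sds', as in your ssh config):

    Defaults:sds !requiretty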


You can test this by running 'ssh sds@node1 sudo whoami' from your admin 
node.


    -- jacob


On 05/18/2018 09:00 AM, Antonio Novaes wrote:
I tried to create a new Ceph cluster, but on my first command I received 
the error highlighted below. I searched Google for this error, and I believe 
it is an SSH error rather than a Ceph error.


I tried:
alias ssh="ssh -t" on the admin node

I modified the SSH config file:

Host node01
   Hostname node01.domain.local
   User sds
   PreferredAuthentications publickey
   IdentityFile /home/sds/.ssh/id_rsa

I also tried:
- starting the command with sudo
- adding PermitRootLogin without-password to /etc/ssh/sshd_config on the 
host node01


But the error persists:

[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[node01][DEBUG ] connected to host: cadmfsd001.tjba.jus.br 


[node01][INFO ] Running command: ssh -CT -o BatchMode=yes node01
[ceph_deploy.new][WARNIN] could not connect via SSH
[ceph_deploy.new][INFO  ] will connect again with password prompt
[node01][DEBUG ] connected to host: sds@node01
[node01][DEBUG ] detect platform information from remote host
[node01][DEBUG ] detect machine type
[ceph_deploy.new][INFO  ] adding public keys to authorized_keys
[node01][DEBUG ] append contents to file
[node01][DEBUG ] connection detected need for sudo
sudo: I'm sorry, you should have a tty to run sudo
[ceph_deploy][ERROR ] RuntimeError: connecting to host: sds@node01 
resulted in errors: IOError cannot send (already closed?)


Can someone help me?

Att,
Antonio Novaes de C. Jr








Re: [ceph-users] lacp bonding | working as expected..?

2018-06-21 Thread Jacob DeGlopper
Consider trying some variation in source and destination IP addresses 
and port numbers - unless you force it, iperf3 at least tends to pick 
only even port numbers for the ephemeral source port, which leads to all 
traffic being balanced to one link.


In your example, where you see one link being used, I see an even source 
IP paired with an odd destination port number for both transfers, or is 
that a search and replace issue?


Client connecting to a.b.c.10, TCP port 5001
[  3] local a.b.c.9 port 37940 connected with a.b.c.10 port 5001
Client connecting to a.b.c.205, TCP port 5000
[  3] local a.b.c.9 port 48806 connected with a.b.c.10 port 5000

In your "got lucky" example, the second connect is also to a.b.c.10.

    -- jacob


On 06/21/2018 02:54 PM, mj wrote:

Hi,

I'm trying out bonding to improve ceph performance on our cluster. 
(currently in a test setup, using 1G NICs, instead of 10G)


Setup like this on the ProCurve 5412 chassis:


Procurve chassis(config)# show trunk

 Load Balancing Method:  L4-based

  Port | Name                     Type    | Group Type
  ---- + ------------------------ ------- + ----- ----
  D1   | Link to ceph9  - 1       10GbE-T | Trk1  LACP
  D2   | Link to ceph9  - 2       10GbE-T | Trk1  LACP
  D3   | Link to ceph10 - 1       10GbE-T | Trk2  LACP
  D4   | Link to ceph10 - 2       10GbE-T | Trk2  LACP


and on the ceph side:


auto bond0
iface bond0 inet manual
    slaves eth1 eth2
    bond_miimon 100
    bond_mode 802.3ad
    bond_xmit_hash_policy layer3+4

auto vmbr0
iface vmbr0 inet static
    address a.b.c.10
    netmask 255.255.255.0
    gateway a.b.c.1
    bridge_ports bond0
    bridge_stp off
    bridge_fd 0


Then, some testing: On ceph10 I start two iperf listeners, each 
listening on a different port, like:



iperf -s -B a.b.c.10 -p 5001 &
iperf -s -B a.b.c.10 -p 5000 &


Then I launch two different iperf processes on ceph9 to connect to my 
listeners, but to my surprise, most of the time only one link is 
used, for example:



Client connecting to a.b.c.10, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.9 port 37940 connected with a.b.c.10 port 5001

Client connecting to a.b.c.205, TCP port 5000
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.9 port 48806 connected with a.b.c.10 port 5000
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   575 MBytes   482 Mbits/sec
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   554 MBytes   464 Mbits/sec


(and looking at ifconfig on the other side confirms that all traffic 
goes through the same port)


However, trying multiple times I noticed that every 3rd or 4th time I 
will get lucky, and both links WILL be used:



Client connecting to a.b.c.10, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.9 port 37984 connected with a.b.c.10 port 5001

Client connecting to a.b.c.10, TCP port 5000
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.9 port 48850 connected with a.b.c.10 port 5000
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   885 MBytes   742 Mbits/sec


My question is: is this level of "randomness" normal, and expected, or 
is there something wrong with my config/settings? Are there ways to 
improve the way links are chosen?


Specifically: I selected the L4 load-balancing method on the switch, 
as I expected that it would help. And "bond_xmit_hash_policy 
layer3+4" is the one I think I should be using, if I understand 
everything correctly...
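
One thing I can still verify on the Linux side is whether the bond
actually picked up that policy, e.g.:

    cat /proc/net/bonding/bond0 | grep -i "hash policy"
    # expect something like: Transmit Hash Policy: layer3+4 (1)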


I have 8 10GB ports available, and we will be running 4 ceph/proxmox 
servers, each with dual 10GB LACP bonded links.


Ideas?

MJ


Re: [ceph-users] lacp bonding | working as expected..?

2018-06-21 Thread Jacob DeGlopper
OK, that was a search-and-replace error in the original quote. This is 
still something with your layer 3/4 load balancing.


iperf2 does not support setting the source port, but iperf3 does - that 
might be worth a try.
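
For example, something like this (untested here; iperf3's --cport option
pins the client-side source port so you can vary it deliberately):

    iperf3 -s -p 5201                           # on the receiving host
    iperf3 -c a.b.c.205 -p 5201 --cport 50001   # odd source port
    iperf3 -c a.b.c.205 -p 5201 --cport 50002   # even source port - see which slave carries each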


    -- jacob



On 06/21/2018 03:37 PM, mj wrote:

Hi Jacob,

Thanks for your reply. But I'm not sure I completely understand it. :-)

On 06/21/2018 09:09 PM, Jacob DeGlopper wrote:
In your example, where you see one link being used, I see an even 
source IP paired with an odd destination port number for both 
transfers, or is that a search and replace issue?


Well, I left the port numbers as they were; I edited the IPs. Actually the 
machines are not a.b.c.9 and a.b.c.10 but a.b.c.204 and a.b.c.205; 
apart from that, everything is unedited.


So a single line example:

Client connecting to a.b.c.205, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.204 port 60600 connected with a.b.c.205 port 5001

Client connecting to a.b.c.205, TCP port 5000
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.204 port 53788 connected with a.b.c.205 port 5000
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   746 MBytes   625 Mbits/sec
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   383 MBytes   321 Mbits/sec


And a lucky example:

Client connecting to a.b.c.205, TCP port 5001
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.204 port 37984 connected with a.b.c.205 port 5001

Client connecting to a.b.c.205, TCP port 5000
TCP window size: 85.0 KByte (default)

[  3] local a.b.c.204 port 48850 connected with a.b.c.205 port 5000
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec  1.09 GBytes   936 Mbits/sec
[ ID] Interval   Transfer Bandwidth
[  3]  0.0-10.0 sec   885 MBytes   742 Mbits/sec

(reason for the a.b.c.204 is that the IPs are public, and I'd rather 
not put them here)


I don't see the odd/even port number pattern you noticed...? (I could 
very well be missing something, though.)


I see no way to specify which outgoing port iperf should use; otherwise 
I could try again using the same ports to check the pattern.


Thanks again!

MJ


Re: [ceph-users] DockerSwarm and CephFS

2019-01-31 Thread Jacob DeGlopper
Hi Carlos - just a guess, but you might need your credentials from 
/etc/ceph on the host mounted inside the container.
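
As a quick check (sketch only), I'd also compare the key the cluster
expects against whatever ended up in the compose file's 'secret:' field:

    ceph auth print-key client.dockerfs   # run wherever /etc/ceph is available
    ceph auth get client.dockerfs         # also shows the caps, in case mds/osd caps are missing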


    -- jacob

Hey guys!

First post to the list and new Ceph user so I might say/ask some 
stupid stuff ;)


I've set up a Ceph storage cluster (and crashed it two days later), with 2 
ceph-mon, 2 ceph-osd (same host), 2 ceph-mgr and 1 ceph-mds. 
Everything is up and running and works great.
Now I'm trying to integrate the CephFS functionality with my Docker 
Swarm (the rbd part is already working great). I can mount the CephFS 
on the docker host without any problem with a specific client created 
for the purpose (client.dockerfs). It also works great when creating a 
volume with "docker volume create" and then using that volume in a 
container. With a stack (defined in docker-compose.yml), it simply 
doesn't mount the CephFS share, and the ceph-mon daemons log this kind 
of message:
2019-01-30 21:44:56.595 7fed6daf9700  0 cephx server client.dockerfs:  
unexpected key: req.key=cb19d6f224e3099 expected_key=aa096575fa04aa68
2019-01-30 21:45:02.295 7fed6daf9700  0 cephx server client.dockerfs:  
unexpected key: req.key=8a87e7949a095e50 expected_key=1c3fd3ad47398e0a
2019-01-30 21:45:13.711 7fed6daf9700  0 cephx server client.dockerfs:  
unexpected key: req.key=93933c29c40e9b05 expected_key=5b1a8d4f4f0e8dd1


While on the docker host trying to start the container shows this:
Jan 30 23:57:57 docker02 kernel: libceph: auth method 'x' error -1

This is the mount command I use on the docker host to mount the CephFS 
share:
mount -t ceph  ceph-mon:/znc tmp -o 
mds_namespace=dockerfs,name=dockerfs,secret=`ceph auth print-key 
client.dockerfs`


And this is the volume part of the docker-compose.yml file:
volumes:
  data:
    driver: n0r1skcom/docker-volume-cephfs
    driver_opts:
      name: dockerfs
      secret: # Same output as the command above produces
      path: /znc
      monitors: ceph-mon
      mds_namespace: dockerfs


I must be doing something wrong with this because it looks really 
simple to do but, somehow, it isn't working.


Can someone shed any light plz?

Thanks,
Carlos Mogas da Silva


Re: [ceph-users] Experiences with the Samsung SM/PM883 disk?

2019-02-22 Thread Jacob DeGlopper
What are you connecting it to?  We just got the exact same drive for 
testing, and I'm seeing much higher performance, connected to a 
motherboard 6 Gb SATA port on a Supermicro X9 board.


[root@centos7 jacob]# smartctl -a /dev/sda

Device Model: Samsung SSD 883 DCT 960GB
Firmware Version: HXT7104Q
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)

[root@centos7 jacob]# fio --filename=/dev/sda --direct=1 --sync=1 \
    --rw=write --bs=4k --numjobs=1 --iodepth=1 --runtime=60 --time_based \
    --group_reporting --name=journal-test


write: IOPS=15.9k, BW=62.1MiB/s (65.1MB/s)(3728MiB/60001msec)

8 processes:

write: IOPS=58.1k, BW=227MiB/s (238MB/s)(13.3GiB/60003msec)
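
(For reference, the 8-process figure is the same fio invocation with the
job count raised - roughly:)

    fio --filename=/dev/sda --direct=1 --sync=1 --rw=write --bs=4k \
        --numjobs=8 --iodepth=1 --runtime=60 --time_based \
        --group_reporting --name=journal-test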


On 2/22/19 8:47 AM, Paul Emmerich wrote:

Hi,

it looks like the beloved Samsung SM/PM863a is no longer available and
the replacement is the new SM/PM883.

We got an 960GB PM883 (MZ7LH960HAJR-5) here and I ran the usual
fio benchmark... and got horrible results :(

fio --filename=/dev/sdX --direct=1 --sync=1 --rw=write --bs=4k \
    --numjobs=1 --iodepth=1 --runtime=60 --time_based --group_reporting \
    --name=journal-test

  1 thread  - 1150 iops
  4 threads - 2305 iops
  8 threads - 4200 iops
16 threads - 7230 iops

Now that's a factor of 15 or so slower than the PM863a.

Someone here reports better results with a 883:
https://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/

Maybe there's a difference between the SM and PM variant of these new
disks for performance? (This wasn't the case for the 863a)

Does anyone else have these new 883 disks yet?
Any experience reports?

Paul


Re: [ceph-users] Ceph inside Docker containers inside VirtualBox

2019-04-18 Thread Jacob DeGlopper
The ansible deploy is quite a pain to get set up properly, but it does 
work to get the whole stack working under Docker.  It uses the following 
script on Ubuntu to start the OSD containers:



/usr/bin/docker run \
  --rm \
  --net=host \
  --privileged=true \
  --pid=host \
  --memory=64386m \
  --cpus=1 \
  -v /dev:/dev \
  -v /etc/localtime:/etc/localtime:ro \
  -v /var/lib/ceph:/var/lib/ceph:z \
  -v /etc/ceph:/etc/ceph:z \
  -v /var/run/ceph:/var/run/ceph:z \
  --security-opt apparmor:unconfined \
  -e OSD_BLUESTORE=1 \
  -e OSD_DMCRYPT=0 \
  -e CLUSTER=ceph \
  -v /run/lvm/lvmetad.socket:/run/lvm/lvmetad.socket \
  -e CEPH_DAEMON=OSD_CEPH_VOLUME_ACTIVATE \
  -e OSD_ID="$1" \
  --name=ceph-osd-"$1" \
   \
  docker.io/ceph/daemon:latest


Hi!

I am not 100% sure, but I think --net=host does not propagate /dev/ 
inside the container.


From the error message:

2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !


I would say you should add something like --device=/dev/vdd to the 
docker run command for the OSD.
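
Untested, but applied to your original command that would look something
like:

    docker run -d --net=host --pid=host --privileged=true \
      -v /etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ \
      -v /dev/:/dev/ --device=/dev/vdd \
      -e OSD_DEVICE=/dev/vdd ceph/daemon osd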


Br

On 18.04.2019 at 14:46, Varun Singh wrote:

Hi,
I am trying to set up Ceph through Docker inside a VM. My host machine
is a Mac. My VM is an Ubuntu 18.04. The Docker version is 18.09.5, build
e8ff056.
I am following the documentation on the ceph/daemon Docker Hub
page. The idea is that if I spawn docker containers as described on the
page, I should get a Ceph setup without a KV store. I am not worried
about the KV store as I just want to try it out. These are the
commands I am running to bring the containers up:

Monitor:
docker run -d --net=host -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ -e MON_IP=10.0.2.15 -e
CEPH_PUBLIC_NETWORK=10.0.2.0/24 ceph/daemon mon

Manager:
docker run -d --net=host -v /etc/ceph:/etc/ceph -v
/var/lib/ceph/:/var/lib/ceph/ ceph/daemon mgr

OSD:
docker run -d --net=host --pid=host --privileged=true -v
/etc/ceph:/etc/ceph -v /var/lib/ceph/:/var/lib/ceph/ -v /dev/:/dev/ -e
OSD_DEVICE=/dev/vdd ceph/daemon osd

From the above commands I am able to spawn the monitor and manager
properly. I verified this by running this command on both the monitor and
manager containers:
sudo docker exec d1ab985 ceph -s

I get following outputs for both:

   cluster:
 id: 14a6e40a-8e54-4851-a881-661a84b3441c
 health: HEALTH_OK

   services:
 mon: 1 daemons, quorum serverceph-VirtualBox (age 62m)
 mgr: serverceph-VirtualBox(active, since 56m)
 osd: 0 osds: 0 up, 0 in

   data:
 pools:   0 pools, 0 pgs
 objects: 0 objects, 0 B
 usage:   0 B used, 0 B / 0 B avail
 pgs:

However, when I try to bring up the OSD using the above command, it doesn't
work. The Docker logs show this output:
2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: static:
does not generate config
2019-04-18 07:30:06  /opt/ceph-container/bin/entrypoint.sh: ERROR- The
device pointed by OSD_DEVICE (/dev/vdd) doesn't exist !

I am not sure why the doc asks to pass /dev/vdd to the OSD_DEVICE env var.
I know there are five different ways of spawning the OSD, but I am not
able to figure out which one would be suitable for a simple
deployment. If you could please let me know how to spawn OSDs using
Docker, it would help a lot.


