Re: [ceph-users] iscsi resize -vmware datastore cannot increase size - FIXED

2019-10-25 Thread Steven Vacaroaia
ion errors, the datastore cannot be extended from vCenter; IT HAS TO BE DONE DIRECTLY on one of the ESXi servers, which can be accomplished by simply logging in as root at https://esxiserver/ui. Many thanks for your suggestions. Steven On Fri, 25 Oct 2019 at 10:43, Steven Vacaroaia wrote: > the error seem
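
For reference, a minimal sketch of the resize flow described in this thread; the image name rbd/vmware01 is taken from the gwcli output quoted below, while the 8T target size is a placeholder:

  # grow the backing RBD image (size suffixes work on Luminous and later)
  rbd resize --size 8T rbd/vmware01
  # rescan storage on each ESXi host so the new capacity is visible
  esxcli storage core adapter rescan --all
  # then extend the datastore from the ESXi host client (https://<esxi>/ui), not from vCenter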

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
nfo Client Iqn .. iqn.1998-01.com.vmware:vsan1-7bb6d7ac Ip Address .. 10.10.35.111 Alias .. Logged In .. LOGGED_IN Auth - chap .. cephuser/CORRECT_PASSWORD Group Name .. Luns - rbd.rep01.. lun_id=0 - rbd.vmware01 .. lun_id=1 What am I missing ??? On Fri, 25 Oct 2019 at 10:30, Steven Vacaro

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
Vacaroaia wrote: > I can confirm that, after reentering credentials for the target on each > ESXi server and rescanning storage, the device appear and datastore can be > increased > > Thanks for your help and patience > > Steven > > On Fri, 25 Oct 2019 at 09:59, Stev

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
I can confirm that, after reentering credentials for the target on each ESXi server and rescanning storage, the device appears and the datastore can be increased. Thanks for your help and patience. Steven On Fri, 25 Oct 2019 at 09:59, Steven Vacaroaia wrote: > I noticed t

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
I noticed this [vob.iscsi.discovery.login.error] discovery failure on vmhba64 to 10.10.35.202 because the target returned a login status of 0201. Does a restart of the rbd services require re-entering CHAP credentials on the targets? Steven On Fri, 25 Oct 2019 at 09:57, Steven Vacaroaia wrote: >

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
wrote: > On Fri, Oct 25, 2019 at 9:49 AM Steven Vacaroaia wrote: > > > > Thanks for your prompt response > > Unfortunately , still no luck > > Device shows with correct size under "Device backing" but not showing at > all under "increase datastore

Re: [ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
ks> ls o- disks . [13.0T, Disks: 2] o- rbd.rep01 [rep01 (7T)] On Fri, 25 Oct 2019 at 09:24, Jason Dillaman wrote: > On Fri, Oct 25, 2019 at 9:13 AM St

[ceph-users] iscsi resize -vmware datastore cannot increase size

2019-10-25 Thread Steven Vacaroaia
Hi, I am trying to increase the size of a datastore made available through Ceph iSCSI RBD. The steps I followed are depicted below. Basically, gwcli reports correct data and even the VMware device capacity is correct, but when I tried to increase it there is no device listed. I am using ceph-iscsi-config-2.6-42.g

Re: [ceph-users] Openstack ceph - non bootable volumes

2018-12-24 Thread Steven Vacaroaia
with volumes? Do you use cephx and have > you setup rbd_secret on your compute node(s)? > > Regards, > Eugen > > [1] > http://lists.ceph.com/pipermail/ceph-users-ceph.com/2013-May/031454.html > > > Zitat von Steven Vacaroaia : > > > Thanks for your prompt reply

Re: [ceph-users] Openstack ceph - non bootable volumes

2018-12-19 Thread Steven Vacaroaia
works, you > should double check the credentials you created within ceph and > compare them to the credentials in your openstack configs. > That would be my first guess. > > Regards, > Eugen > > Zitat von Steven Vacaroaia : > > > Hi, > > > > I'll a

[ceph-users] Openstack ceph - non bootable volumes

2018-12-19 Thread Steven Vacaroaia
Hi, I'd appreciate it if someone could provide some guidance for troubleshooting / setting up OpenStack (Rocky) + Ceph (Mimic) so that volumes created on Ceph are bootable. I have followed http://docs.ceph.com/docs/mimic/rbd/rbd-openstack/ and enabled debug in both nova and cinder but still not been
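
For anyone hitting the same issue, a minimal sketch of the compute-node settings the rbd-openstack guide expects; the client name cinder, the pool name and the secret UUID are placeholders:

  # /etc/nova/nova.conf on each compute node
  [libvirt]
  images_type = rbd
  images_rbd_pool = vms
  rbd_user = cinder
  rbd_secret_uuid = <uuid of the libvirt secret holding client.cinder's key>

The libvirt secret itself must be defined on every compute node (virsh secret-define / secret-set-value) with the client.cinder key, otherwise libvirt cannot authenticate to the cluster.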

Re: [ceph-users] ceph-iscsi iSCSI Login negotiation failed

2018-12-05 Thread Steven Vacaroaia
Thanks for taking the trouble to respond. I noticed some XFS errors on the /var partition, so I rebooted the server in order to force xfs_repair to run. It is now working. Steven On Wed, 5 Dec 2018 at 11:47, Mike Christie wrote: > On 12/05/2018 09:43 AM, Steven Vacaroaia wrote: >

[ceph-users] ceph-iscsi iSCSI Login negotiation failed

2018-12-05 Thread Steven Vacaroaia
Hi, I have a strange issue: I configured 2 identical iSCSI gateways, but one of them is complaining about negotiations although gwcli reports the correct auth and status (logged-in). Any help will be truly appreciated. Here are some details: ceph-iscsi-config-2.6-42.gccca57d.el7.noarch ceph-iscsi-cl

Re: [ceph-users] cephfs nfs-ganesha rados_cluster

2018-11-15 Thread Steven Vacaroaia
Thanks, Jeff, for taking the trouble to respond and your willingness to help. Here are some questions: - Apparently rados_cluster is gone in 2.8; there are "fs" and "fs_ng" now. However, I was not able to find a config depicting their usage. Would you be able to share your working one? - how would on

[ceph-users] cephfs nfs-ganesha rados_cluster

2018-11-13 Thread Steven Vacaroaia
Hi, I've been trying to set up an active-active (or even active-passive) NFS share for a while without any success, using Mimic 13.2.2 and nfs-ganesha 2.8 with rados_cluster as the recovery mechanism. I focused on corosync/pacemaker as the HA controlling software, but I would not mind using anything else
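
A minimal ganesha.conf sketch for the rados_cluster recovery backend, assuming a pool named nfs-ganesha and a node id of "a" (untested here; the grace database is managed with the ganesha-rados-grace tool):

  NFSv4 {
      RecoveryBackend = rados_cluster;
  }
  RADOS_KV {
      pool = "nfs-ganesha";
      namespace = "grace";
      nodeid = "a";
  }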

[ceph-users] Mimic - EC and crush rules - clarification

2018-11-01 Thread Steven Vacaroaia
Hi, I am trying to create an EC pool on my SSD-based OSDs and would appreciate it if someone could clarify / provide advice about the following: the best K + M combination for 4 hosts, one OSD per host. My understanding is that K+M < number of OSDs, but using K=2, M=1 does not provide any redundancy ( as soon as 1 OSD i
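
For illustration, one common layout for 4 hosts is k=2, m=2 (survives two lost chunks, failure domain = host); the profile and pool names below are placeholders:

  ceph osd erasure-code-profile set ec-4host k=2 m=2 crush-failure-domain=host
  ceph osd pool create ecpool 64 64 erasure ec-4host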

Re: [ceph-users] add monitors - not working

2018-11-01 Thread Steven Vacaroaia
might be useful to ceph users if someone can confirm Thanks Steven On Wed, 31 Oct 2018 at 21:59, Joao Eduardo Luis wrote: > On 10/31/2018 04:48 PM, Steven Vacaroaia wrote: > > On the monitor that works I noticed this > > > > mon.mon01@0(leader) e1 handle_probe ignoring fsid &

Re: [ceph-users] add monitors - not working

2018-10-31 Thread Steven Vacaroaia
On the monitor that works I noticed this mon.mon01@0(leader) e1 handle_probe ignoring fsid d01a0b47-fef0-4ce8-9b8d-80be58861053 != 8e7922c9-8d3b-4a04-9a8a-e0b0934162df Where is that fsid ( 8e7922 ) coming from ? Steven On Wed, 31 Oct 2018 at 12:45, Steven Vacaroaia wrote: > Hi, >

[ceph-users] add monitors - not working

2018-10-31 Thread Steven Vacaroaia
Hi, I've added 2 more monitors to my cluster but they are not joining the cluster. The service is up and ceph.conf is the same; what am I missing? ceph-deploy install, ceph-deploy mon create, ceph-deploy add. I then manually change ceph.conf to contain the following, then I push it to all cluster members
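
A minimal sketch of the usual add-monitor flow with ceph-deploy; host names and addresses below are placeholders:

  # ceph.conf pushed to every node should list all monitors
  [global]
  mon_initial_members = mon01, mon02, mon03
  mon_host = 10.10.35.11, 10.10.35.12, 10.10.35.13
  # push the updated config, then add each monitor
  ceph-deploy --overwrite-conf config push mon01 mon02 mon03
  ceph-deploy mon add mon02
  ceph-deploy mon add mon03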

Re: [ceph-users] crush rules not persisting

2018-10-31 Thread Steven Vacaroaia
Nevermind ...a bit of reading was enough to point me to "osd_crush_update_on_start": "true" Thanks Steven On Wed, 31 Oct 2018 at 10:31, Steven Vacaroaia wrote: > Hi, > I have created a separate root for my ssd drives > All works well but a reboot ( or restart of t
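
For reference, the setting mentioned above; turning it off keeps manually placed OSDs under the custom root across restarts:

  # /etc/ceph/ceph.conf
  [osd]
  osd_crush_update_on_start = false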

[ceph-users] crush rules not persisting

2018-10-31 Thread Steven Vacaroaia
Hi, I have created a separate root for my SSD drives. All works well, but a reboot (or a restart of the services) wipes out all my changes. How can I persist changes to CRUSH rules? Here are some details. Initial / default - this is what I am getting after a restart / reboot. If I just do that on o

Re: [ceph-users] ceph.conf mon_max_pg_per_osd not recognized / set

2018-10-31 Thread Steven Vacaroaia
So, moving the entry from [mon] to [global] worked. This is a bit confusing - I used to put all my configuration settings starting with mon_ under [mon]. Steven On Wed, 31 Oct 2018 at 10:13, Steven Vacaroaia wrote: > I do not think so ..or maybe I did not understand what are you saying > Th
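
A sketch of the working layout for reference; [global] applies to every daemon, which is why the value finally shows up in the OSD's config show:

  # /etc/ceph/ceph.conf
  [global]
  mon_max_pg_per_osd = 400
  # verify on any daemon
  ceph daemon osd.6 config show | grep mon_max_pg_per_osd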

Re: [ceph-users] ceph.conf mon_max_pg_per_osd not recognized / set

2018-10-31 Thread Steven Vacaroaia
password", "config/mgr/mgr/balancer/active", "config/mgr/mgr/balancer/mode", "config/mgr/mgr/dashboard/password", "config/mgr/mgr/dashboard/server_addr", "config/mgr/mgr/dashboard/username", "config/mgr/mgr/prometheus

[ceph-users] ceph.conf mon_max_pg_per_osd not recognized / set

2018-10-31 Thread Steven Vacaroaia
Hi, Any idea why a different value for mon_max_pg_per_osd is not "recognized"? I am using Mimic 13.2.2. Here is what I have in /etc/ceph/ceph.conf: [mon] mon_allow_pool_delete = true mon_osd_min_down_reporters = 1 mon_max_pg_per_osd = 400 Checking the value with ceph daemon osd.6 config show| gr

Re: [ceph-users] node not using cluster subnet

2018-10-30 Thread Steven Vacaroaia
#x27;t > getting through? > -Greg > > On Tue, Oct 30, 2018 at 8:34 AM Steven Vacaroaia wrote: > >> Hi, >> I am trying to add another node to my cluster which is configured to use >> a dedicated subnet >> >> public_network = 10.10.35.0/24 >> cluster

[ceph-users] node not using cluster subnet

2018-10-30 Thread Steven Vacaroaia
Hi, I am trying to add another node to my cluster which is configured to use a dedicated subnet: public_network = 10.10.35.0/24 cluster_network = 192.168.200.0/24 For whatever reason, this node is starting properly and a few seconds later is failing and starting to check for connectivity on the public ne
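
A quick sanity sketch for the new OSD node, using the subnets quoted above (the peer address is a placeholder):

  # /etc/ceph/ceph.conf on the new node must carry both subnets
  [global]
  public_network  = 10.10.35.0/24
  cluster_network = 192.168.200.0/24
  # and the node needs a working interface on the cluster subnet
  ip -4 addr show | grep '192.168.200.'
  ping -c 2 192.168.200.<existing-osd-node>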

Re: [ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Steven Vacaroaia
5 Oct 2018 at 12:32, Steven Vacaroaia wrote: > Thanks for suggestion > However same "bad"server was working fine until I updated firmware > Now, all 4 server have the same firmware but one has lower performance > > I will try what you suggested though although, as I said

Re: [ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Steven Vacaroaia
t.me/MartinVerges > > croit GmbH, Freseniusstr. 31h, 81247 Munich > CEO: Martin Verges - VAT-ID: DE310638492 > Com. register: Amtsgericht Munich HRB 231263 > > Web: https://croit.io > YouTube: https://goo.gl/PGE1Bx > > > 2018-10-25 18:06 GMT+02:00 Steven Vacaroaia : > >

Re: [ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Steven Vacaroaia
seniusstr. 31h, 81247 Munich > CEO: Martin Verges - VAT-ID: DE310638492 > Com. register: Amtsgericht Munich HRB 231263 > > Web: https://croit.io > YouTube: https://goo.gl/PGE1Bx > > > 2018-10-25 17:46 GMT+02:00 Steven Vacaroaia : > > Hi, > > I have 4 x DELL R630 s

[ceph-users] NVME Intel Optane - same servers different performance

2018-10-25 Thread Steven Vacaroaia
Hi, I have 4 x DELL R630 servers with the exact same specs. I installed an Intel Optane SSDPED1K375GA in all of them. When comparing fio performance (both read and write), one is lower than the other 3 (see below - just read). Any suggestions as to what to check/fix? BAD server [root@osd04 ~]# fio --filename=
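
A sketch of a raw-device fio run that makes the four servers directly comparable; destructive on the named device, and /dev/nvme0n1 plus the job parameters are assumptions:

  fio --name=optane-randwrite --filename=/dev/nvme0n1 --ioengine=libaio \
      --direct=1 --rw=randwrite --bs=4k --iodepth=32 --numjobs=4 \
      --runtime=60 --time_based --group_reporting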

[ceph-users] DELL R630 and Optane NVME

2018-10-11 Thread Steven Vacaroaia
Hi, Do any of you use Optane NVMe and are willing to share your experience and tuning settings? There was a discussion started by Wido mentioning using intel_pstate=disable intel_idle.max_cstate=1 processor.max_cstate=1 and disabling irqbalance, but that is all I was able to find. I am using DELL
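
For CentOS 7, a sketch of how those parameters are usually applied (paths assume a BIOS-boot grub2 setup):

  # append to GRUB_CMDLINE_LINUX in /etc/default/grub:
  #   intel_pstate=disable intel_idle.max_cstate=1 processor.max_cstate=1
  grub2-mkconfig -o /boot/grub2/grub.cfg && reboot
  # and disable irqbalance
  systemctl stop irqbalance && systemctl disable irqbalance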

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-11 Thread Steven Vacaroaia
.fc29.noarch.rpm python-rtslib-2.1.fb67-10.g7713d1e.noarch.rpm Many thanks for your patience, time and help Steven On Wed, 10 Oct 2018 at 16:42, Mike Christie wrote: > On 10/10/2018 03:18 PM, Steven Vacaroaia wrote: > > so, it seems OSD03 is having issues when creating disks ( I c

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-10 Thread Steven Vacaroaia
des? > On Wed, Oct 10, 2018 at 4:18 PM Steven Vacaroaia wrote: > > > > so, it seems OSD03 is having issues when creating disks ( I can create > target and hosts ) - here is an excerpt from api.log > > Please note I can create disk on the other node > > > > 2018-

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-10 Thread Steven Vacaroaia
] - _disk change on 127.0.0.1 failed with 500 2018-10-10 16:03:03,439 INFO [_internal.py:87:_log()] - 127.0.0.1 - - [10/Oct/2018 16:03:03] "PUT /api/disk/rbd.dstest2 HTTP/1.1" 500 - I removed the gateway.conf object and installed the latest rpms on it as follows, but the error persists. Also r

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-10 Thread Steven Vacaroaia
9 Oct 2018 at 16:35, Jason Dillaman wrote: > Anything in the rbd-target-api.log on osd03 to indicate why it failed? > > Since you replaced your existing "iscsi-gateway.conf", do your > security settings still match between the two hosts (i.e. on the > trusted_ip_list, s

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-09 Thread Steven Vacaroaia
ot; 500 - On Tue, 9 Oct 2018 at 15:42, Steven Vacaroaia wrote: > It worked. > > many thanks > Steven > > On Tue, 9 Oct 2018 at 15:36, Jason Dillaman wrote: > >> Can you try applying [1] and see if that resolves your issue? >> >> [1] https://github.com/ceph/

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-09 Thread Steven Vacaroaia
It worked. many thanks Steven On Tue, 9 Oct 2018 at 15:36, Jason Dillaman wrote: > Can you try applying [1] and see if that resolves your issue? > > [1] https://github.com/ceph/ceph-iscsi-config/pull/78 > On Tue, Oct 9, 2018 at 3:06 PM Steven Vacaroaia wrote: > >

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-09 Thread Steven Vacaroaia
quot;cephmetrics", or try setting "prometheus_host = 0.0.0.0" since it > sounds like you have the IPv6 stack disabled. > > [1] > https://github.com/ceph/ceph-iscsi-config/blob/master/ceph_iscsi_config/settings.py#L90 > On Tue, Oct 9, 2018 at 2:09 PM Steven Vacaroaia wrote:

Re: [ceph-users] ceph-iscsi upgrade issue

2018-10-09 Thread Steven Vacaroaia
3 rbd-target-gw: _sock = _realsocket(family, type, proto) Oct 9 13:58:35 osd03 rbd-target-gw: socket.error: [Errno 97] Address family not supported by protocol Oct 9 13:58:35 osd03 systemd: rbd-target-gw.service: main process exited, code=exited, status=1/FAILURE On Tue, 9 Oct 2018 at 13:16, S

[ceph-users] ceph-iscsi upgrade issue

2018-10-09 Thread Steven Vacaroaia
Hi, I am using Mimic 13.2 and kernel 4.18. I was using gwcli 2.5 and decided to upgrade to the latest (2.7), as people reported improved performance. What is the proper methodology? How should I troubleshoot this? What I did (and it broke it) was: cd tcmu-runner; git pull; make && make install; cd ce
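
A hedged sketch of a safer upgrade order, one gateway at a time (service names as they appear later in this thread):

  systemctl stop rbd-target-gw rbd-target-api
  # update tcmu-runner, ceph-iscsi-config and ceph-iscsi-cli (rpms or make install)
  systemctl daemon-reload
  systemctl start tcmu-runner rbd-target-api rbd-target-gw
  systemctl status rbd-target-gw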

[ceph-users] Mimic 13.2.2 SCST or ceph-iscsi ?

2018-10-04 Thread Steven Vacaroaia
Hi, Which implementation of iSCSI is recommended for Mimic 13.2.2, and why? Is multipathing supported by both in a VMware environment? Anyone willing to share performance details? Many thanks, Steven

Re: [ceph-users] Ceph and NVMe

2018-09-06 Thread Steven Vacaroaia
Hi , Just to add to this question, is anyone using Intel Optane DC P4800X on DELL R630 ...or any other server ? Any gotchas / feedback/ knowledge sharing will be greatly appreciated Steven On Thu, 6 Sep 2018 at 14:59, Stefan Priebe - Profihost AG < s.pri...@profihost.ag> wrote: > Hello list, > >

[ceph-users] Mimic - Erasure Code Plugin recommendation

2018-08-28 Thread Steven Vacaroaia
Hi, Would you be able to recommend an erasure code plugin? The default is jerasure, but lrc appears to be more efficient. I'll appreciate any hints and/or pointers to resources / best practices. Thanks, Steven

[ceph-users] mimic + cephmetrics + prometheus - working ?

2018-08-27 Thread Steven Vacaroaia
Hi, has anyone been able to use Mimic + cephmetrics + prometheus? I am struggling to make it fully functional, as it appears the data provided by node_exporter has a different name than the one grafana expects. As a result of the above, only certain dashboards are being populated ( the ones ceph speci

[ceph-users] mimic - troubleshooting prometheus

2018-08-24 Thread Steven Vacaroaia
Hi, Any ideas/suggestions for troubleshooting prometheus? What logs/commands are available to find out why OSD-server-specific data (IOPS, disk and network data) is not scraped but cluster-specific data (pools, capacity, etc.) is? Increasing the log level for MGR showed only the following 201

Re: [ceph-users] Mimic prometheus plugin -no socket could be created

2018-08-24 Thread Steven Vacaroaia
ceph/minions/*yml > config/stack/default/global.yml > config/stack/default/ceph/cluster.yml > role-master/cluster/node01.sls > role-admin/cluster/*.sls > role-mon/cluster/*.sls > role-mgr/cluster/*.sls > role-mds/cluster/*.sls > role-ganesha/cluster/*.sls > role-client-nfs/c

Re: [ceph-users] Mimic prometheus plugin -no socket could be created

2018-08-23 Thread Steven Vacaroaia
Did all that .. even tried to change port Also selinux and firewalld are disabled Thanks for taking the trouble to suggest something Steven On Thu, 23 Aug 2018 at 13:46, John Spray wrote: > On Thu, Aug 23, 2018 at 5:18 PM Steven Vacaroaia wrote: > > > > Hi All, > > >

[ceph-users] Mimic prometheus plugin -no socket could be created

2018-08-23 Thread Steven Vacaroaia
Hi All, I am trying to enable the prometheus plugin with no success due to "no socket could be created". The instructions for enabling the plugin are very straightforward and simple. Note: my ultimate goal is to use Prometheus with Cephmetrics. Some of you suggested deploying ceph-exporter, but why do we
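
A sketch of what tends to fix this error when the IPv6 stack is disabled (as suggested elsewhere in these threads): the module binds to "::" by default, so pin it to IPv4; port 9283 is the default and only shown for clarity:

  ceph mgr module enable prometheus
  ceph config set mgr mgr/prometheus/server_addr 0.0.0.0
  ceph config set mgr mgr/prometheus/server_port 9283
  ceph mgr module disable prometheus && ceph mgr module enable prometheus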

[ceph-users] prometheus has failed - no socket could be created

2018-08-22 Thread Steven Vacaroaia
Hi, I am trying to enable prometheus on Mimic so I can use it with cephmetrics, but it fails with the following error. Any help troubleshooting this will be very appreciated. ".. Module 'prometheus' has failed: error('No socket could be created',) .." Here is some info ( all commands ran on MON whe

Re: [ceph-users] limited disk slots - should I ran OS on SD card ?

2018-08-15 Thread Steven Vacaroaia
Thank you all Since all concerns were about reliability I am assuming performance impact of having OS running on SD card is minimal / negligible In other words, an OSD server is not writing/reading from Linux OS partitions too much ( especially with logs at minimum ) so its performance is not de

[ceph-users] limited disk slots - should I ran OS on SD card ?

2018-08-13 Thread Steven Vacaroaia
Hi, I am in the process of deploying Ceph Mimic on a few DELL R620/630 servers. These servers have only 8 disk slots and the largest disk they accept is 2 TB. Should I share an SSD drive between the OS and WAL/DB, or run the OS on internal SD cards and dedicate the SSD to DB/WAL only? Another option is to use

Re: [ceph-users] cephmetrics without ansible

2018-08-10 Thread Steven Vacaroaia
gt; *T: *818-649-7235 *M: *818-434-6195 > [image: ttp://www.hotyellow.com/deximages/dex-thryv-logo.jpg] > <http://www.dexyp.com/> > > > > *From:* ceph-users [mailto:ceph-users-boun...@lists.ceph.com] *On Behalf > Of *Steven Vacaroaia > *Sent:* Thursday, August 09, 2018

[ceph-users] cephmetrics without ansible

2018-08-09 Thread Steven Vacaroaia
Hi, I would be very grateful if any of you would share their experience/knowledge of using cephmetrics without ansible. I have deployed my cluster using ceph-deploy on CentOS 7. I have Grafana, Graphite and collectd installed and running / collecting data. Building dashboards and queries is very

[ceph-users] design question - NVME + NLSAS, SSD or SSD + NLSAS

2018-07-19 Thread Steven Vacaroaia
Hi, I would appreciate any advice (with arguments, if possible) regarding the best design approach considering the facts below: - budget is set to XX amount - goal is to get as much performance / capacity as possible using XX - 4 to 6 servers, DELL R620/R630 with 8 disk slots, 64 G RAM and 8 cores

[ceph-users] iSCSI SCST not working with Kernel 4.17.5

2018-07-09 Thread Steven Vacaroaia
Hi, Just wondering if any of you managed to use SCST with kernel 4.17.5? Apparently SCST works only with kernel 3.10. Alternatively, is ceph-iscsi running properly with the newest kernel? Installation and configuration went well, but accessing the LUN fails with the following error ".. kernel:

Re: [ceph-users] VMWARE and RBD

2018-06-29 Thread Steven Vacaroaia
o use iscsi. By the way, i > found that the performance is much better for SCST than ceph-iscsi. I don't > think ceph-iscsi is production-ready? > > Regards, > Horace Ng > > ------ > *From: *"Steven Vacaroaia" > *To: *"ceph-users

[ceph-users] Centos kernel

2018-06-21 Thread Steven Vacaroaia
Hi, Just wondering if you would recommend using the newest kernel on CentOS (i.e., after installing regular CentOS (3.10.0-862), enable elrepo-kernel and install 4.17) or simply stay with the regular one. Many thanks, Steven

[ceph-users] VMWARE and RBD

2018-06-18 Thread Steven Vacaroaia
Hi, I read somewhere that VMware is planning to support RBD directly. Does anyone here know more about this ..maybe a tentative date / version? Thanks, Steven

[ceph-users] Ceph bonding vs separate provate public network

2018-06-12 Thread Steven Vacaroaia
Hi, I am designing a new ceph cluster and was wondering whether I should bond the 10 GB adapters or use one for public and one for private. The advantage of bonding is simplicity and, maybe, performance. The catch, though, is that I cannot use jumbo frames, as most of my servers that need to "consume" st

Re: [ceph-users] ceph , VMWare , NFS-ganesha

2018-05-29 Thread Steven Vacaroaia
an webinterface for > management and does clustered active-active ISCSI. For us the easy > management was the point to choose this, so we need not to think about how > to configure ISCSI... > Regards, > Dennis > > Am 28.05.2018 um 21:42 schrieb Steven Vacaroaia: > > Hi

[ceph-users] ceph , VMWare , NFS-ganesha

2018-05-28 Thread Steven Vacaroaia
Hi, I need to design and build a storage platform that will be "consumed" mainly by VMware. Ceph is my first choice. As far as I can see, there are 3 ways Ceph storage can be made available to VMware: 1. iSCSI 2. NFS-Ganesha 3. an rbd mounted on a Linux NFS server. Any suggestions / advice as to whic

[ceph-users] ceph 12.2.5 - atop DB/WAL SSD usage 0%

2018-04-27 Thread Steven Vacaroaia
Hi, During rados bench tests, I noticed that HDD usage goes to 100% but the SSD stays at (or very close to) 0. Since I created the OSDs with the DB/WAL on SSD, shouldn't I see some "activity" on the SSD? How can I be sure Ceph is actually using the SSD for the WAL/DB? Note I only have 2 HDD and one SSD per server
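
Two quick checks that show whether the DB/WAL really landed on the SSD (osd id 0 is a placeholder):

  # a bluestore OSD with an external DB/WAL has block.db / block.wal symlinks
  ls -l /var/lib/ceph/osd/ceph-0/block*
  # or ask the cluster; the backing devices are reported in the OSD metadata
  ceph osd metadata 0 | grep -Ei 'bluefs_(db|wal)'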

[ceph-users] Ceph 12.2.4 - performance max_sectors_kb

2018-04-26 Thread Steven Vacaroaia
Hi, I have been struggling with tuning performance for quite a while now. I just noticed this article: http://longwhiteclouds.com/2016/03/06/default-io-size-change-in-linux-kernel/ But I am unable to change max_sectors_kb to 1024. uname -a Linux osd01.tor.medavail.net 4.16.3-1.el7.elrepo.x86_64 Anyone h
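
A sketch of the usual way to raise it, assuming the device is /dev/sdd; the value cannot exceed the max_hw_sectors_kb reported by the disk/controller:

  cat /sys/block/sdd/queue/max_hw_sectors_kb
  echo 1024 > /sys/block/sdd/queue/max_sectors_kb
  # to persist across reboots, e.g. /etc/udev/rules.d/99-max-sectors.rules:
  # ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/max_sectors_kb}="1024"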

Re: [ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?

2018-04-19 Thread Steven Vacaroaia
e > read performance is pretty bad. Have you verified the hardware with micro > benchmarks such as 'fio'? Also try to review storage controller settings. > > On Apr 19, 2018 5:13 PM, "Steven Vacaroaia" wrote: > > replication size is always 2 > > DB/WAL o

Re: [ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?

2018-04-19 Thread Steven Vacaroaia
ow you setup block.db and Wal, are they on the > SSD? > > On Thu, Apr 19, 2018, 14:40 Steven Vacaroaia wrote: > >> Sure ..thanks for your willingness to help >> >> Identical servers >> >> Hardware >> DELL R620, 6 cores, 64GB RAM, 2 x 10 GB ports, >>

Re: [ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?

2018-04-19 Thread Steven Vacaroaia
8.8324 2 2.31401 1.23399 44 16 541 525 47.7226 0 - 1.23399 45 16 541 525 46.6621 0 - 1.23399 46 16 541 525 45.6477 0 - 1.23399 47 16 541 525

Re: [ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?

2018-04-19 Thread Steven Vacaroaia
ing with iperf3 on 10Gbit > [ ID] Interval Transfer Bandwidth Retr Cwnd > [ 4] 0.00-10.00 sec 11.5 GBytes 9.89 Gbits/sec0 1.31 MBytes > [ 4] 10.00-20.00 sec 11.5 GBytes 9.89 Gbits/sec0 1.79 MBytes > > > > -Original Message- >

[ceph-users] ceph luminous 12.2.4 - 2 servers better than 3 ?

2018-04-19 Thread Steven Vacaroaia
Hi, Any idea why 2 servers with one OSD each will provide better performance than 3? The servers are identical. Performance is impacted irrespective of whether I use SSD for WAL/DB or not. Basically, I am getting lots of cur MB/s zero. The network is a separate 10 GB for public and private. I tested it with iperf

[ceph-users] ceph 12.2.4 - which OSD has slow requests ?

2018-04-17 Thread Steven Vacaroaia
Hi, I can see many slow requests in the logs but have no clue which OSD is the culprit. If I remember correctly, that info was provided in the logs in previous Ceph versions. How can I find the culprit? Previous log entry *2017-10-13 21:22:01.833605 osd.5 [WRN] 1 slow requests, 1 included** below; ol
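
A couple of per-OSD admin-socket queries that usually reveal the culprit (run on the OSD's host; osd.5 is just an example id):

  ceph daemon osd.5 dump_ops_in_flight
  ceph daemon osd.5 dump_historic_ops
  # plus the cluster-wide summary
  ceph health detail | grep -i slow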

[ceph-users] ceph version 12.2.4 - slow requests missing from health details

2018-04-12 Thread Steven Vacaroaia
Hi, I am still struggling with my performance issue and I noticed that "ceph health details" does not provide details about where the slow requests are. Some other people have noticed that ( https://www.spinics.net/lists/ceph-users/msg43574.html ). What am I missing and/or how/where do I find the OSD w

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
ze=4M" osd_mkfs_options_xfs = "-f -i size=2048" bluestore_block_db_size = 32212254720 bluestore_block_wal_size = 1073741824 On Wed, 11 Apr 2018 at 08:57, Steven Vacaroaia wrote: > [root@osd01 ~]# ceph osd pool ls detail -f json-pretty > > [ > { > "

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
m_objects": 0, "fast_read": false, "options": {}, "application_metadata": { "rbd": {} } } ] [root@osd01 ~]# ceph osd crush rule dump [ { "rule_id": 0, "rule_name": &

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-11 Thread Steven Vacaroaia
Thanks for the suggestion but , unfortunately, having same number of OSD did not solve the issue Here is with 2 OSD per server, 3 servers - identical servers and osd configuration [root@osd01 ~]# ceph osd tree ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF -1 4.02173 root default -9

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-10 Thread Steven Vacaroaia
riors: 0 2018-04-10 16:05:38.385454 7f33640c4700 5 write_log_and_missing with: dirty_to: 0'0, dirty_from: 4294967295'18446744073709551615, writeout_from: 300'45, trimmed: , trimmed_dups: , clear_divergent_priors: 0 2018-04-10 16:05:38.393238 7f33638c3700 5 write_log_and_missing

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-10 Thread Steven Vacaroaia
[ 4] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec0 sender [ 4] 0.00-10.00 sec 11.5 GBytes 9.90 Gbits/sec receiver On Tue, 10 Apr 2018 at 08:49, Steven Vacaroaia wrote: > Hi, > Thanks for providing guidance > > VD0 is the SSD drive > Many people sugges

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-10 Thread Steven Vacaroaia
Hi, Thanks for providing guidance. VD0 is the SSD drive. Many people suggested not enabling WB for the SSD so that the cache can be used for the HDDs, where it is needed more. The setup is 3 identical DELL R620 servers: OSD01, OSD02, OSD04, 10 GB separate networks, 600 GB Enterprise HDD, 320 GB Enterprise SSD, Bluestore,

Re: [ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-09 Thread Steven Vacaroaia
Exit Code: 0x00 On Fri, 6 Apr 2018 at 15:11, David Turner wrote: > First and foremost, have you checked your disk controller. Of most import > would be your cache battery. Any time I have a single node acting up, the > controller is Suspect #1. > > On Thu, Apr 5, 2018 at 11:23

[ceph-users] Ceph luminous - troubleshooting performance issues overall DSK 100%, busy 1%

2018-04-05 Thread Steven Vacaroaia
Hi, I have a strange issue - OSDs from a specific server are introducing a huge performance issue. This is a brand new installation on 3 identical servers - DELL R620 with PERC H710, bluestore DB and WAL on SSD, 10 GB dedicated private/public networks. When I add the OSDs I see gaps like below and

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
bytes / 512 bytes Thanks again Steven On Thu, 29 Mar 2018 at 15:24, Alfredo Deza wrote: > Seems like the partition table is still around even after calling > sgdisk --zap-all > > On Thu, Mar 29, 2018 at 2:18 PM, Steven Vacaroaia > wrote: > > Thanks for your willingne

Re: [ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
fig.c:1051 Setting global/notify_dbus to 1 #lvmcmdline.c:2987 Completed: pvcreate - /dev/sdc On Thu, 29 Mar 2018 at 12:30, Alfredo Deza wrote: > On Thu, Mar 29, 2018 at 10:25 AM, Steven Vacaroaia > wrote: > > Hi, > > > > I am unable to create OSD beca

[ceph-users] Ceph luminous 12.4 - ceph-volume device not found

2018-03-29 Thread Steven Vacaroaia
Hi, I am unable to create an OSD because "Device /dev/sdc not found (or ignored by filtering)." I tried using ceph-volume (on the host) as well as ceph-deploy (on the admin node). The device is definitely there. Any suggestions will be greatly appreciated. Note: I created the block-db and respe
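
When ceph-volume/LVM rejects a device like this, wiping the leftover signatures before retrying often helps; a sketch, destructive on /dev/sdc:

  ceph-volume lvm zap /dev/sdc
  wipefs -a /dev/sdc
  sgdisk --zap-all /dev/sdc
  partprobe /dev/sdc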

[ceph-users] DELL R620 - SSD recommendation

2018-03-21 Thread Steven Vacaroaia
Hi, It would be appreciated if you could recommend some SSD models (200 GB or less). I am planning to deploy 2 SSDs and 6 HDDs (for a 1-to-3 ratio) in a few DELL R620s with 64 GB RAM. Also, what is the highest HDD capacity that you were able to use in the R620? Note I apologize for asking "research e

Re: [ceph-users] Disk write cache - safe?

2018-03-16 Thread Steven Vacaroaia
Hi All, Can someone please confirm that, for a perfect performance/safety compromise, the following would be the best settings (id 0 is SSD, id 1 is HDD)? Alternatively, any suggestions / shared configurations / advice would be greatly appreciated. Note the server is a DELL R620 with PERC 710, 1GB

Re: [ceph-users] Ceph luminous - Erasure code and iSCSI gateway

2018-02-27 Thread Steven Vacaroaia
te: > Do your pre-created images have the exclusive-lock feature enabled? > That is required to utilize them for iSCSI. > > On Tue, Feb 27, 2018 at 11:09 AM, Steven Vacaroaia > wrote: > > Hi Jason, > > > > Thanks for your prompt response > > > > I have not b

Re: [ceph-users] Ceph luminous - Erasure code and iSCSI gateway

2018-02-27 Thread Steven Vacaroaia
re, you would need to specify > the replicated pool where the image lives when attaching it as a > backing store for iSCSI (i.e. pre-create it via the rbd CLI): > > # gwcli > /iscsi-target...sx01-657d71e0> cd /disks > /disks> create pool=rbd image=image_ec1 size=XYZ > > >

[ceph-users] Ceph luminous - Erasure code and iSCSI gateway

2018-02-27 Thread Steven Vacaroaia
Hi, I noticed it is possible to use erasure-coded pools for RBD and CephFS: https://ceph.com/community/new-luminous-erasure-coding-rbd-cephfs/ This got me thinking that I could deploy iSCSI LUNs on EC pools. However, it appears it is not working. Has anyone been able to do that, or have I misunderstood? Thanks
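
For reference, the pattern that does work on Luminous is keeping the image metadata in a replicated pool and pointing only the data at the EC pool; the pool and image names below are placeholders:

  ceph osd pool set ecpool allow_ec_overwrites true
  rbd create rbd/image_ec1 --size 100G --data-pool ecpool \
      --image-feature layering,exclusive-lock
  # then expose rbd/image_ec1 through gwcli as a pre-created image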

[ceph-users] ceph-deploy ver 2 - [ceph_deploy.gatherkeys][WARNING] No mon key found in host

2018-02-20 Thread Steven Vacaroaia
Hi, I have decided to redeploy my test cluster using the latest ceph-deploy and Luminous. I cannot get past the ceph-deploy mon create-initial stage due to [ceph_deploy.gatherkeys][WARNING] No mon key found in host. Any help will be appreciated. ceph-deploy --version 2.0.0 [cephuser@ceph prodceph]$ ls -

Re: [ceph-users] High Load and High Apply Latency

2018-02-18 Thread Steven Vacaroaia
Hi John, I am trying to squeeze extra performance from my test cluster too: Dell R620 with PERC 710, RAID0, 10 GB network. Would you be willing to share your controller and kernel configuration? For example, I am using the BIOS profile "Performance" with the following added to /etc/default/kernel i

[ceph-users] ceph luminous - ceph tell osd bench performance

2018-02-16 Thread Steven Vacaroaia
Hi, For every consecutive run of the "ceph tell osd.x bench" command I get different and MUCH worse results. Is this expected? The OSD was created with the following command (/dev/sda is an enterprise-class SSD): ceph-deploy osd create --zap-disk --bluestore osd01:sdc --block-db /dev/sda --blo
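
For reference, the command takes optional total-bytes and block-size arguments, so runs can at least be made directly comparable (osd.0 is a placeholder):

  # defaults to 1 GiB of 4 MiB writes
  ceph tell osd.0 bench
  # explicit: 1 GiB total, 4 MiB per write
  ceph tell osd.0 bench 1073741824 4194304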

[ceph-users] Ceph luminous performance - how to calculate expected results

2018-02-14 Thread Steven Vacaroaia
Hi, It is very useful to "set up expectations" from a performance perspective. I have a cluster using 3 DELL R620s with 64 GB RAM and a 10 GB cluster network. I've seen numerous posts and articles about the topic mentioning the following formula (for disks with WAL/DB on them): OSD / replication / 2 Ex
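
As an illustration of that formula with assumed numbers (6 OSDs at roughly 200 MB/s each, replication 3, and the divide-by-2 for the WAL double write):

  expected write throughput = (6 x 200 MB/s) / 3 / 2 = 200 MB/s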

Re: [ceph-users] ceph iscsi kernel 4.15 - "failed with 500"

2018-02-14 Thread Steven Vacaroaia
all your help On 14 February 2018 at 09:32, Jason Dillaman wrote: > Have you updated to ceph-iscsi-config-2.4-1 and ceph-iscsi-cli-2.6-1? > Any error messages in /var/log/rbd-target-api.log? > > On Wed, Feb 14, 2018 at 8:49 AM, Steven Vacaroaia > wrote: > > Thank you fo

Re: [ceph-users] ceph iscsi kernel 4.15 - "failed with 500"

2018-02-14 Thread Steven Vacaroaia
ain in the future, but > in the meantime I pushed and built python-rtslib-2.1.fb67-1 [1]. > > [1] https://shaman.ceph.com/repos/python-rtslib/ > > On Tue, Feb 13, 2018 at 2:09 PM, Steven Vacaroaia > wrote: > > Hi, > > > > I noticed a new ceph kernel (4.15.0-ceph

[ceph-users] ceph iscsi kernel 4.15 - "failed with 500"

2018-02-13 Thread Steven Vacaroaia
Hi, I noticed a new ceph kernel (4.15.0-ceph-g1c778f43da52) was made available, so I upgraded my test environment. Now the iSCSI gateway has stopped working - ERROR [rbd-target-api:1430:call_api()] - _disk change on osd02 failed with 500. So I was thinking that I have to update all the packages I

[ceph-users] ceph luminous - performance IOPS vs throughput

2018-02-05 Thread Steven Vacaroaia
Hi, I noticed a severe inverse correlation between IOPS and throughput. For example: running rados bench write with t=32 shows an average IOPS of 1426 and bandwidth of 5.5 MB/sec; running it with the default (t = 16), average IOPS is 49 and bandwidth is 200 MB/s. Is this expected behavior? How do I
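
The numbers above are consistent with the t=32 run using 4 KiB writes (1426 IOPS x 4 KiB is about 5.5 MB/s) and the default run using 4 MiB objects, so the two runs measure different things; a sketch that makes the comparison explicit (the pool name and durations are placeholders):

  # small 4 KiB writes stress IOPS, the default 4 MiB writes stress bandwidth
  rados bench -p rbd 60 write -t 32 -b 4096 --no-cleanup
  rados bench -p rbd 60 write -t 16 --no-cleanup
  rados -p rbd cleanup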

Re: [ceph-users] ceph luminous performance - disks at 100% , low network utilization

2018-02-02 Thread Steven Vacaroaia
=0 ipv6.disable=1 intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=0 idle=poll numa=off" This is extremely puzzling - any ideas, suggestions for troubleshooting it will be GREATLY appreciated Steven On 2 February 2018 at 10:51, Steven Vacaroaia wrote: > Hi Mark, > >

Re: [ceph-users] ceph luminous performance - disks at 100% , low network utilization

2018-02-02 Thread Steven Vacaroaia
th (MB/sec): 120 > Average IOPS: 40 > Stddev IOPS:6 > Max IOPS: 53 > Min IOPS: 30 > Average Latency(s): 0.396239 > Stddev Latency(s): 0.249998 > Max latency(s): 1.29482 > Min latency(s): 0.06875 > >

[ceph-users] ceph luminous performance - disks at 100% , low network utilization

2018-02-02 Thread Steven Vacaroaia
Hi, I have been struggling to get my test cluster to behave (from a performance perspective): Dell R620, 64 GB RAM, 1 CPU, numa=off, PERC H710, RAID0, Enterprise 10K disks. No SSD - just plain HDD. Local tests (dd, hdparm) confirm my disks are capable of delivering 200 MB/s. Fio with 15 jobs in

[ceph-users] ceph luminous - different performance - same type of disks

2018-02-01 Thread Steven Vacaroaia
Hi, Any idea why the same type of disks, on the same server using the same configuration, will have different bench performance (by 50%)? Testing them with fio shows similar performance: fio --filename=/dev/sdd --direct=1 --sync=1 --rw=write --bs=4k --numjobs=10 --iodepth=1 --runtime=60 --time_based --group

[ceph-users] Ceph - incorrect output of ceph osd tree

2018-01-31 Thread Steven Vacaroaia
Hi, Why does ceph osd tree report that osd.4 is up when the server on which osd.4 is running is actually down? Any help will be appreciated. [root@osd01 ~]# ping -c 2 osd02 PING osd02 (10.10.30.182) 56(84) bytes of data. From osd01 (10.10.30.181) icmp_seq=1 Destination Host Unreachable From os

Re: [ceph-users] Ceph luminous - throughput performance issue

2018-01-31 Thread Steven Vacaroaia
is > only the case when using the PERC? > > Thanks > > On Wed, Jan 31, 2018 at 3:57 PM, Steven Vacaroaia > wrote: > >> >> Raid0 >> >> Hardware >> Controller >> ProductName : PERC H710 Mini(Bus 0, Dev 0) >&
