[ceph-users] cloudfuse

2016-07-16 Thread Marc Roos

I am looking a bit at ceph on a single node. Does anyone have experience 
with cloudfuse?

Do I need to use the rados-gw? Does it even work with ceph? 




- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -. 
F1 Outsourcing Development Sp. z o.o.
Poland 

t:  +48 (0)124466845
f:  +48 (0)124466843
e:  m...@f1-outsourcing.eu


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cloudfuse

2016-07-16 Thread Marc Roos
 

I have created a swift user, and can mount the object store with 
cloudfuse, and can create files in the default pool .rgw.root

How can I have my test user go to a different pool and not use the 
default .rgw.root?

Thanks, 
Marc








Re: [ceph-users] Zabbix plugin for ceph-mgr

2017-06-27 Thread Marc Roos
 
FYI, five or more years ago I tried Zabbix, and I noticed that the load 
on the MySQL server increased as the number of monitored hosts grew. 
Without being able to recall exactly what was wrong (I think every 
sample they took was a separate insert statement), I do remember 
getting quite an 'amateur' impression of these guys. And when they 
apply 'strange logic' in one situation, they are likely to apply it 
more often elsewhere in their code. I then moved to Nagios.



-Original Message-
From: Wido den Hollander [mailto:w...@42on.com] 
Sent: dinsdag 27 juni 2017 11:09
To: ceph-us...@ceph.com
Subject: [ceph-users] Zabbix plugin for ceph-mgr

Hi,

After looking at the documentation [0] on how to write a plugin for 
ceph-mgr I've been playing with the idea to create a Zabbix [1] plugin 
for ceph-mgr.

Before I start writing one I'd like to check if I'm thinking in the 
right direction.

Zabbix supports Items [2] and Triggers. Triggers are based on Items' 
values. An Item can be of the type 'Trapper', where an application can 
simply send key=value pairs, for example:

my.host.name ceph.health HEALTH_OK
my.host.name ceph.osd.up 499
my.host.name ceph.osd.in 498

A simple ceph-mgr module could do:

def serve(self):
    while True:
        send_data_to_zabbix()
        time.sleep(60)

If, for example, the key ceph.health is != OK for more than 1h, Zabbix 
could fire a trigger and send out an alert to an admin.
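The send_data_to_zabbix() step could, for instance, build its payload in the line format accepted by the zabbix_sender tool ("host key value" per line). A minimal sketch of just that formatting step; the hostname and item keys are taken from the example above, and the helper name is illustrative, not an existing ceph-mgr API:

```python
# Sketch: turn a dict of Ceph metrics into the input-line format used
# by `zabbix_sender --input-file -` ("host key value", one per line).
# The hostname and item keys mirror the example above; nothing here is
# a fixed Ceph or Zabbix identifier.

def format_trapper_lines(host, metrics):
    """Return zabbix_sender input lines for a dict of key -> value."""
    return ["%s %s %s" % (host, key, value)
            for key, value in sorted(metrics.items())]

metrics = {
    "ceph.health": "HEALTH_OK",
    "ceph.osd.up": 499,
    "ceph.osd.in": 498,
}

for line in format_trapper_lines("my.host.name", metrics):
    print(line)
```

A real module would then pipe these lines to zabbix_sender (or speak the trapper protocol directly) inside the serve() loop.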

Now, would that be a sane plugin for ceph-mgr, or is this something you 
shouldn't put in the mgr? To me it seems like a good place, since it 
already has all the data present. This way data is pushed to Zabbix 
instead of having to poll and parse the JSON output of 'ceph -s'.

Wido

[0]: http://docs.ceph.com/docs/master/mgr/plugins/
[1]: http://www.zabbix.com/
[2]: https://www.zabbix.com/documentation/3.0/manual/config/items




[ceph-users] Ceph upgrade kraken -> luminous without deploy

2017-07-02 Thread Marc Roos
 
I have updated a test cluster by just updating the rpms and issuing a 
ceph osd require-osd-release, because it was mentioned in the status. 
Is there more you need to do?


- update the packages on all nodes
sed -i 's/Kraken/Luminous/g' /etc/yum.repos.d/ceph.repo
yum update

- then on each node first restart the monitor
systemctl restart ceph-mon@X

- then on each node restart the osds
ceph osd tree
systemctl restart ceph-osd@X

- then on each node restart the mds
systemctl restart ceph-mds@X

ceph osd require-osd-release luminous



-Original Message-
From: Hauke Homburg [mailto:hhomb...@w3-creative.de] 
Sent: zondag 2 juli 2017 13:24
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Ceph Cluster with Deep Scrub Error

Hello,

I have a Ceph cluster with 5 Ceph servers, running under CentOS 7.2 and 
ceph 10.2.5. All OSDs run on RAID6.
In this cluster I have a deep scrub error:
/var/log/ceph/ceph-osd.6.log-20170629.gz:389 .356391 7f1ac4c57700 -1 
log_channel(cluster) log [ERR] : 1.129 deep-scrub 1 errors

This line is the only line I can find with the error.

I tried to repair with ceph osd deep-scrub osd and ceph pg repair. 
Neither fixed the error.

What can I do to repair the error?

Regards

Hauke

--
www.w3-creative.de

www.westchat.de




[ceph-users] strange (collectd) Cluster.osdBytesUsed incorrect

2017-07-04 Thread Marc Roos


On a test cluster with 994GB used, collectd reports an incorrect 
9.3362651136e+10 (93GB) to InfluxDB, while this should be 933GB (or 
actually 994GB). Cluster.osdBytes is reported correctly as 
3.3005833027584e+13 (30TB).



  cluster:
health: HEALTH_OK

  services:
mon: 3 daemons, quorum a,b,c
mgr: c(active), standbys: a, b
mds: 1/1/1 up {0=a=up:active}, 1 up:standby
osd: 6 osds: 6 up, 6 in

  data:
pools:   6 pools, 600 pgs
objects: 3477k objects, 327 GB
usage:   994 GB used, 29744 GB / 30739 GB avail
pgs: 600 active+clean
 
Influxdb:

1499201873311403849 c01  mon.aceph_bytes Cluster.osdBytesAvail   
  3.2912470376448e+13
1499201873311399889 c01  mon.aceph_bytes Cluster.osdBytesUsed
  9.3362651136e+10
1499201873311396462 c01  mon.aceph_bytes Cluster.osdBytes
  3.3005833027584e+13
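A quick conversion of the raw values above shows the mismatch (a sanity check only, using the numbers reported in this post):

```python
# Decode the raw byte counts collectd wrote to InfluxDB, using the
# two values reported above.
osd_bytes_used = 9.3362651136e+10     # Cluster.osdBytesUsed from InfluxDB
osd_bytes      = 3.3005833027584e+13  # Cluster.osdBytes from InfluxDB

# The total decodes to the ~30 TiB the cluster really has...
print(round(osd_bytes / 2**40, 1))    # ~30.0 TiB, matching 30739 GB avail

# ...but the used value decodes to only ~87 GiB (93 GB decimal),
# an order of magnitude below the 994 GB that `ceph -s` shows.
print(round(osd_bytes_used / 2**30))
```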








[ceph-users] osd_bytes=0 reported by monitor

2017-07-07 Thread Marc Roos
 
Does anyone have an idea why I am having these osd_bytes=0?



ceph daemon mon.c perf dump cluster
{
"cluster": {
"num_mon": 3,
"num_mon_quorum": 3,
"num_osd": 6,
"num_osd_up": 6,
"num_osd_in": 6,
"osd_epoch": 3593,
"osd_bytes": 0,
"osd_bytes_used": 0,
"osd_bytes_avail": 0,
"num_pool": 0,
"num_pg": 0,
"num_pg_active_clean": 0,
"num_pg_active": 0,
"num_pg_peering": 0,
"num_object": 0,
"num_object_degraded": 0,
"num_object_misplaced": 0,
"num_object_unfound": 0,
"num_bytes": 0,
"num_mds_up": 1,
"num_mds_in": 1,
"num_mds_failed": 0,
"mds_epoch": 796
}
}
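Comparing the counters programmatically makes the inconsistency easy to spot. A small sketch that parses a (trimmed) copy of the perf dump JSON above and lists the fields that are zero even though the status output below shows pools, PGs and usage:

```python
import json

# Trimmed copy of the `ceph daemon mon.c perf dump cluster` output above.
perf_dump = json.loads("""
{
    "cluster": {
        "num_osd": 6,
        "num_osd_up": 6,
        "osd_bytes": 0,
        "osd_bytes_used": 0,
        "num_pool": 0,
        "num_pg": 0
    }
}
""")

# Fields one would expect to be non-zero on a cluster that reports
# 6 pools, 600 pgs and 1571 GB used in `ceph status`.
expected_nonzero = ["osd_bytes", "osd_bytes_used", "num_pool", "num_pg"]
zeroed = [k for k in expected_nonzero if perf_dump["cluster"][k] == 0]
print(zeroed)  # all four counters are zeroed in the perf dump
```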

ceph status
2017-07-07 23:33:10.866940 7f5a4d94c700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-07-07 23:33:10.902683 7f5a4d94c700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
  cluster:
id: 0f1701f5-453a-4a3b-928d-f652a2bbbcb0
health: HEALTH_OK

  services:
mon: 3 daemons, quorum a,b,c
mgr: a(active), standbys: c, b
mds: 1/1/1 up {0=c=up:active}, 1 up:standby
osd: 6 osds: 6 up, 6 in

  data:
pools:   6 pools, 600 pgs
objects: 5135k objects, 568 GB
usage:   1571 GB used, 29167 GB / 30739 GB avail
pgs: 600 active+clean


[ceph-users] Problems with statistics after upgrade to luminous

2017-07-10 Thread Marc Roos

I need a little help with fixing some errors I am having. 

After upgrading from Kraken I'm getting incorrect values reported for 
placement groups etc. At first I thought it was because I had changed 
the public cluster IP address range and modified the monmap directly. 
But after deleting and re-adding a monitor, this ceph daemon dump is 
still incorrect.




ceph daemon mon.a perf dump cluster
{
"cluster": {
"num_mon": 3,
"num_mon_quorum": 3,
"num_osd": 6,
"num_osd_up": 6,
"num_osd_in": 6,
"osd_epoch": 3842,
"osd_bytes": 0,
"osd_bytes_used": 0,
"osd_bytes_avail": 0,
"num_pool": 0,
"num_pg": 0,
"num_pg_active_clean": 0,
"num_pg_active": 0,
"num_pg_peering": 0,
"num_object": 0,
"num_object_degraded": 0,
"num_object_misplaced": 0,
"num_object_unfound": 0,
"num_bytes": 0,
"num_mds_up": 1,
"num_mds_in": 1,
"num_mds_failed": 0,
"mds_epoch": 816
}
}

2017-07-10 09:51:54.219167 7f5cb7338700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
  cluster:
id: 0f1701f5-453a-4a3b-928d-f652a2bbbcb0
health: HEALTH_OK

  services:
mon: 3 daemons, quorum a,b,c
mgr: c(active), standbys: a, b
mds: 1/1/1 up {0=c=up:active}, 1 up:standby
osd: 6 osds: 6 up, 6 in

  data:
pools:   4 pools, 328 pgs
objects: 5224k objects, 889 GB
usage:   2474 GB used, 28264 GB / 30739 GB avail
pgs: 327 active+clean
 1   active+clean+scrubbing+deep


[ceph-users] Change the meta data pool of cephfs

2017-07-11 Thread Marc Roos


Is it possible to change the CephFS metadata pool? I would like to 
lower the number of PGs, and thought about just making a new pool, 
copying the old pool into it, and then renaming them. But I guess 
CephFS works with the pool ID, not the name? How can this best be done?

Thanks


 




Re: [ceph-users] Crashes Compiling Ruby

2017-07-13 Thread Marc Roos
 
No, but we are using Perl ;)


-Original Message-
From: Daniel Davidson [mailto:dani...@igb.illinois.edu] 
Sent: donderdag 13 juli 2017 16:44
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Crashes Compiling Ruby

We have a weird issue.  Whenever compiling Ruby, and only Ruby, on a 
location served by cephfs, the node in our cluster (not the ceph node) 
will crash.  This always happens, even if we do not use a PXE bootable 
node like the head/management node.  If we compile to local disk, it 
will succeed.  Compiling gems also does this.

Has anyone encountered a similar problem?

Dan





[ceph-users] When are bugs available in the rpm repository

2017-07-15 Thread Marc Roos
 
When are fixes for bugs like this one (http://tracker.ceph.com/issues/20563) 
available in the rpm repository 
(https://download.ceph.com/rpm-luminous/el7/x86_64/)?

I sort of don’t get it from this page 
http://docs.ceph.com/docs/master/releases/. Maybe something could be 
specifically mentioned there about the availability of rpm updates, 
or a date could be added to the release notes pages 
(http://docs.ceph.com/docs/master/release-notes/#v12-0-3-luminous-dev)?








Re: [ceph-users] Installing ceph on Centos 7.3

2017-07-18 Thread Marc Roos
 
We are running on 

Linux c01 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 
x86_64 x86_64 x86_64 GNU/Linux
CentOS Linux release 7.3.1611 (Core)

And we didn’t have any issues installing/upgrading, but we are not using 
ceph-deploy. In fact I am surprised at how easy it is to install.




-Original Message-
From: Götz Reinicke - IT Koordinator 
[mailto:goetz.reini...@filmakademie.de] 
Sent: dinsdag 18 juli 2017 11:25
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Installing ceph on Centos 7.3

Hi,


Am 18.07.17 um 10:51 schrieb Brian Wallis:


I’m failing to get an install of ceph to work on a new CentOS 
7.3.1611 server. I’m following the instructions at 
http://docs.ceph.com/docs/master/start/quick-ceph-deploy/ to no avail.  

First question: is it possible to install ceph on CentOS 7.3, or 
should I choose a different version or a different Linux distribution 
to use for now?

<...>

we run Ceph Jewel 10.2.7 on RHEL 7.3. It is working.

Maybe another guide might help you through the installation steps?

https://www.virtualtothecore.com/en/quickly-build-a-new-ceph-cluster-with-ceph-deploy-on-centos-7/


Regards . Götz




[ceph-users] Updating 12.1.0 -> 12.1.1

2017-07-18 Thread Marc Roos
 
I just updated packages on one CentOS7 node and am getting these errors:

Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 7f4fa1c14e40 -1 
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 7f4fa1c14e40 -1 
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous dev 
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 7f4fa1c14e40 -1 
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous dev 
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 7f4f977d9700 -1 
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; luminous 
dev version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 7f4f977d9700 -1 
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; luminous 
dev version?
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function 
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: FAILED 
assert(msg)
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function 
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: FAILED 
assert(msg)
Jul 18 12:03:34 c01 ceph-mon: ceph version 12.1.1 
(f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
Jul 18 12:03:34 c01 ceph-mon: 1: (ceph::__ceph_assert_fail(char const*, 
char const*, int, char const*)+0x110) [0x7f4fa21f4310]
Jul 18 12:03:34 c01 ceph-mon: 2: 
(Monitor::handle_forward(boost::intrusive_ptr)+0xd70) 
[0x7f4fa1fddcd0]
Jul 18 12:03:34 c01 ceph-mon: 3: 
(Monitor::dispatch_op(boost::intrusive_ptr)+0xd8d) 
[0x7f4fa1fdb29d]
Jul 18 12:03:34 c01 ceph-mon: 4: (Monitor::_ms_dispatch(Message*)+0x7de) 
[0x7f4fa1fdc06e]
Jul 18 12:03:34 c01 ceph-mon: 5: (Monitor::ms_dispatch(Message*)+0x23) 
[0x7f4fa2004303]
Jul 18 12:03:34 c01 ceph-mon: 6: (DispatchQueue::entry()+0x792) 
[0x7f4fa242c812]
Jul 18 12:03:34 c01 ceph-mon: 7: 
(DispatchQueue::DispatchThread::entry()+0xd) [0x7f4fa229a3cd]
Jul 18 12:03:34 c01 ceph-mon: 8: (()+0x7dc5) [0x7f4fa0fbedc5]
Jul 18 12:03:34 c01 ceph-mon: 9: (clone()+0x6d) [0x7f4f9e34a76d]
Jul 18 12:03:34 c01 ceph-mon: NOTE: a copy of the executable, or 
`objdump -rdS ` is needed to interpret this.
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.872654 7f4f977d9700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function 
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230
Jul 18 12:03:34 c01 ceph-mon: ceph version 12.1.1 
(f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
Jul 18 12:03:34 c01 ceph-mon: 1: (ceph::__ceph_assert_fail(char const*, 
char const*, int, char const*)+0x110) [0x7f4fa21f4310]
Jul 18 12:03:34 c01 ceph-mon: 2: 
(Monitor::handle_forward(boost::intrusive_ptr)+0xd70) 
[0x7f4fa1fddcd0]
Jul 18 12:03:34 c01 ceph-mon: 3: 
(Monitor::dispatch_op(boost::intrusive_ptr)+0xd8d) 
[0x7f4fa1fdb29d]
Jul 18 12:03:34 c01 ceph-mon: 4: (Monitor::_ms_dispatch(Message*)+0x7de) 
[0x7f4fa1fdc06e]
Jul 18 12:03:34 c01 ceph-mon: 5: (Monitor::ms_dispatch(Message*)+0x23) 
[0x7f4fa2004303]
Jul 18 12:03:34 c01 ceph-mon: 6: (DispatchQueue::entry()+0x792) 
[0x7f4fa242c812

[ceph-users] Modify pool size not allowed with permission osd 'allow rwx pool=test'

2017-07-18 Thread Marc Roos
 

With ceph auth I have set permissions like below. I can add and delete 
objects in the test pool, but cannot set the size of the test pool. 
What permission do I need to add for this user to be able to modify 
the size of this test pool?

 mon 'allow r' mds 'allow r' osd 'allow rwx pool=test'







[ceph-users] Updating 12.1.0 -> 12.1.1 mon / osd won't start

2017-07-18 Thread Marc Roos
 

I just updated packages on one CentOS7 node and am getting these errors. 
Does anybody have an idea how to resolve this?


Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are enabled: 
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 7f4fa1c14e40 -1 
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous dev 
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 7f4fa1c14e40 -1 
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous dev 
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 7f4f977d9700 -1 
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; luminous 
dev version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 7f4f977d9700 -1 
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; luminous 
dev version?
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230 Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: FAILED
assert(msg)
Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230 Jul 18 12:03:34 c01 ceph-mon: 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: FAILED
assert(msg)
Jul 18 12:03:34 c01 ceph-mon: ceph version 12.1.1
(f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc) Jul 18 12:03:34 
c01 ceph-mon: 1: (ceph::__ceph_assert_fail(char const*, char const*, 
int, char const*)+0x110) [0x7f4fa21f4310] Jul 18 12:03:34 c01 ceph-mon: 
2: 
(Monitor::handle_forward(boost::intrusive_ptr)+0xd70)
[0x7f4fa1fddcd0]
Jul 18 12:03:34 c01 ceph-mon: 3: 
(Monitor::dispatch_op(boost::intrusive_ptr)+0xd8d)
[0x7f4fa1fdb29d]
Jul 18 12:03:34 c01 ceph-mon: 4: (Monitor::_ms_dispatch(Message*)+0x7de)
[0x7f4fa1fdc06e]
Jul 18 12:03:34 c01 ceph-mon: 5: (Monitor::ms_dispatch(Message*)+0x23)
[0x7f4fa2004303]
Jul 18 12:03:34 c01 ceph-mon: 6: (DispatchQueue::entry()+0x792) 
[0x7f4fa242c812] Jul 18 12:03:34 c01 ceph-mon: 7: 
(DispatchQueue::DispatchThread::entry()+0xd) [0x7f4fa229a3cd] Jul 18 
12:03:34 c01 ceph-mon: 8: (()+0x7dc5) [0x7f4fa0fbedc5] Jul 18 12:03:34 
c01 ceph-mon: 9: (clone()+0x6d) [0x7f4f9e34a76d] Jul 18 12:03:34 c01 
ceph-mon: NOTE: a copy of the executable, or `objdump -rdS ` 
is needed to interpret this.
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.872654 7f4f977d9700 -1 
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In function
'PaxosServiceMessage* MForward::claim_message()' thread 7f4f977d9700 
time 2017-07-18 12:03:34.870230 Jul 18 12:03:34 c01 ceph-mon: ceph 
version 12.1.1
(f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc) Jul 18 12:03:34 
c01 ceph-mon: 1: (ceph::__ceph_assert_fail(char const*, char const*, 
int, char const*)+0x110) [0x7f4fa21f4310] Jul 18 12:03:34 c01 ceph-mon: 
2: 
(Monitor::handle_forward(boost::intrusive_ptr)+0xd70)
[0x7f4fa1fddcd0]
Jul 18 12:03:34 c01 ceph-mon: 3: 
(Monitor::dispatch_op(boost::intrusive_ptr)+0xd8d)
[0x7f4fa1fdb29d]
Jul 18 12:03:34 c01 ceph-mon: 4: (Monitor::_ms_dispatch(Message*)+0x7de)
[0x7f4fa1fdc06e]
Jul 18 12:03:34 c01 ceph-mon: 5: (Monitor::ms_dispatch(Message*)+0x23)
[0x7f4fa2004303]
Jul 18 12:03:34 c01 ceph-mon: 6: (DispatchQueue::entr

Re: [ceph-users] Updating 12.1.0 -> 12.1.1

2017-07-19 Thread Marc Roos
 

Thanks! Updating all nodes indeed resolved this.



-Original Message-
From: Gregory Farnum [mailto:gfar...@redhat.com] 
Sent: dinsdag 18 juli 2017 23:01
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Updating 12.1.0 -> 12.1.1

Yeah, some of the message formats changed (incompatibly) during 
development. If you update all your nodes it should go away; that one I 
think is just ephemeral state.

On Tue, Jul 18, 2017 at 3:09 AM Marc Roos  
wrote:



I just updated packages on one CentOS7 node and getting these 
errors:

Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537510 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.537725 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.567250 
7f4fa1c14e40 -1
WARNING: the following dangerous and experimental features are 
enabled:
bluestore
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 
7f4fa1c14e40 -1
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous 
dev
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.589008 
7f4fa1c14e40 -1
mon.a@-1(probing).mgrstat failed to decode mgrstat state; luminous 
dev
version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 
7f4f977d9700 -1
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; 
luminous
dev version?
Jul 18 12:03:34 c01 ceph-mon: 2017-07-18 12:03:34.724836 
7f4f977d9700 -1
mon.a@0(synchronizing).mgrstat failed to decode mgrstat state; 
luminous
dev version?
Jul 18 12:03:34 c01 ceph-mon:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABL
E_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rele
ase/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In 
function
'PaxosServiceMessage* MForward::claim_message()' thread 
7f4f977d9700
time 2017-07-18 12:03:34.870230
Jul 18 12:03:34 c01 ceph-mon:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABL
E_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rele
ase/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: 
FAILED
assert(msg)
Jul 18 12:03:34 c01 ceph-mon:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABL
E_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rele
ase/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: In 
function
'PaxosServiceMessage* MForward::claim_message()' thread 
7f4f977d9700
time 2017-07-18 12:03:34.870230
Jul 18 12:03:34 c01 ceph-mon:
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABL
E_ARC
H/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rele
ase/1
2.1.1/rpm/el7/BUILD/ceph-12.1.1/src/messages/MForward.h: 100: 
FAILED
assert(msg)
Jul 18 12:03:34 c01 ceph-mon: ceph version 12.1.1
(f3e663a190bf2ed12c7e3cda288b9a159572c800) luminous (rc)
Jul 18 12:03:34 c01 ceph-mon: 1: (ceph::__ceph_assert_fail(char 
const*,
char const*, int, char const*)+0x110) [0x7f4fa21f4310]
Jul 18 12:03:34 c01 ceph-mon: 2:
(Monitor::handle_forward(boost::intrusive_ptr)+0xd70)
[0x7f4fa1fddcd0]
Jul 18 12:03:34 c01 ceph-mon: 3:
(Monitor::dispatch_op(boost::intrusive_ptr)+0xd8d)
[0x7f4fa1fdb29d]
Jul 18 12:03:34 c01 ceph-mon: 4: 
(Monitor::_ms_dispatch(Message*)+0x7de)
[0x7f4fa1fdc06e]
Jul 18 12:03:34 c01 ceph-mon: 5: 
(Monitor::ms_dispatch(Message*)+0x23)
[0x7f4fa2004303]
Jul 18 12:03:34 c01 ceph-mon: 6: (DispatchQueue::entry()+0x792)
[0x7f4fa242c812]
Jul 18 12:03:34 c01 ceph-mon: 7:
(DispatchQueue::DispatchThread::entry()+0xd) [0x7f4fa229a3cd]
Jul 18 12:03:34 c01 ceph-mon: 8: (()+0x7dc5) [0x7f4fa0fbedc5]
Jul 18 12:03:34 c01 ceph-mon: 9: (clone()+0x6d) [0x7f4f9e34a76d]
Jul 18 12:03:34 c01 ceph-mon: NOTE: a copy of the executable, or
`objdump -rdS ` is needed to interpret this.

[ceph-users] Report segfault?

2017-07-21 Thread Marc Roos
 

Should we report these?

[840094.519612] ceph[12010]: segfault at 8 ip 7f194fc8b4c3 sp 
7f19491b6030 error 4 in libceph-common.so.0[7f194f9fb000+7e9000]


CentOS Linux release 7.3.1611 (Core)
Linux 3.10.0-514.26.2.el7.x86_64 #1 SMP Tue Jul 4 15:04:05 UTC 2017 
x86_64 x86_64 x86_64 GNU/Linux

ceph-mds-12.1.1-0.el7.x86_64
ceph-12.1.1-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
libcephfs2-12.1.1-0.el7.x86_64
python-cephfs-12.1.1-0.el7.x86_64
ceph-common-12.1.1-0.el7.x86_64
ceph-selinux-12.1.1-0.el7.x86_64
ceph-mgr-12.1.1-0.el7.x86_64
ceph-mon-12.1.1-0.el7.x86_64
ceph-base-12.1.1-0.el7.x86_64
ceph-osd-12.1.1-0.el7.x86_64


[ceph-users] Ceph collectd json errors luminous (for influxdb grafana)

2017-07-21 Thread Marc Roos
 

I would like to work on some grafana dashboards, but since the upgrade 
to the luminous rc something seems to have changed in the JSON output, 
and (a lot of) metrics are not stored in InfluxDB.

Does anyone have an idea when collectd-ceph in the epel repo will be 
updated? Or is there some shell script that I can temporarily use to 
import some sample data?


Jul 21 15:57:29 c01 collectd: ceph plugin: 
cconn_handle_event(name=mds.a,i=2,st=4): error 1
Jul 21 15:57:29 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Jul 21 15:57:29 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Jul 21 15:57:29 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.3,i=1,st=4): error 1
Jul 21 15:57:29 c01 collectd: ceph plugin: ds 
FinisherPurgeQueue.completeLatency.avgtime was not properly initialized.
Jul 21 15:57:29 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Jul 21 15:57:29 c01 collectd: ceph plugin: 
cconn_handle_event(name=mds.a,i=2,st=4): error 1


Re: [ceph-users] Ceph collectd json errors luminous (for influxdb grafana)

2017-07-21 Thread Marc Roos
I am running 12.1.1, and updated to it on the 18th. So I guess this is 
either something else, or the fix was not in the rpms.



-Original Message-
From: Gregory Farnum [mailto:gfar...@redhat.com] 
Sent: vrijdag 21 juli 2017 20:21
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Ceph collectd json errors luminous (for 
influxdb grafana)

This was broken in some of the luminous RCs but fixed in master (and I 
believe the very latest RC release). See 
https://github.com/ceph/ceph/pull/16249

On Fri, Jul 21, 2017 at 7:11 AM Marc Roos  
wrote:




I would like to work on some grafana dashboards, but since the 
upgrade
to luminous rc, there seems to have changed something in json and 
(a lot
of) metrics are not stored in influxdb.

Does any one have an idea when updates to collectd-ceph in the epel 
repo
will be updated? Or is there some shell script that I can 
temporarily
can use to import some sample data?


Jul 21 15:57:29 c01 collectd: ceph plugin:
cconn_handle_event(name=mds.a,i=2,st=4): error 1
Jul 21 15:57:29 c01 collectd: ceph plugin: ds
Bluestore.kvFlushLat.avgtime was not properly initialized.
Jul 21 15:57:29 c01 collectd: ceph plugin: JSON handler failed with
status -1.
Jul 21 15:57:29 c01 collectd: ceph plugin:
cconn_handle_event(name=osd.3,i=1,st=4): error 1
Jul 21 15:57:29 c01 collectd: ceph plugin: ds
FinisherPurgeQueue.completeLatency.avgtime was not properly 
initialized.
Jul 21 15:57:29 c01 collectd: ceph plugin: JSON handler failed with
status -1.
Jul 21 15:57:29 c01 collectd: ceph plugin:
cconn_handle_event(name=mds.a,i=2,st=4): error 1





Re: [ceph-users] Restore RBD image

2017-07-24 Thread Marc Roos
 

I would recommend logging into the host and running your commands from a 
screen session, so they keep running.


-Original Message-
From: Martin Wittwer [mailto:martin.witt...@datonus.ch] 
Sent: zondag 23 juli 2017 15:20
To: ceph-us...@ceph.com
Subject: [ceph-users] Restore RBD image

Hi list

I have a big problem:

I had to resize an RBD image from 100G to 150G, so I used rbd resize 
--size 150G volume01.

Because of a bad internet connection I was kicked from the server a few 
seconds after the start of the resize.

Now the image has a size of only 205M!


I now need to restore the RBD image, or at least the files that were on 
it. Is there a way to restore them?

Best,
Martin




[ceph-users] Manual fix pg with bluestore

2017-07-31 Thread Marc Roos
 
I have an error with a placement group, and can only find solutions 
based on a filestore osd: 
http://ceph.com/geen-categorie/ceph-manually-repair-object/

Anybody have a link to how I can do this with a bluestore osd?
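An untested sketch of what I had in mind; as far as I can tell 
ceph-objectstore-tool detects the store type from the data path, so the 
same commands should work on bluestore (osd id and pgid are from my 
cluster):

```shell
# Stop the OSD first so the tool gets exclusive access to the store
systemctl stop ceph-osd@9

# List the objects in the problematic pg on the bluestore OSD
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 \
    --pgid 17.36 --op list

# Export the pg as a backup before attempting any repair
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-9 \
    --pgid 17.36 --op export --file /tmp/recover.17.36

systemctl start ceph-osd@9
```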


/var/log/ceph/ceph-osd.9.log:48:2017-07-31 14:21:33.929855 7fbbbfe57700 
-1 log_channel(cluster) log [ERR] : 17.36 soid 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to pick 
suitable object info
/var/log/ceph/ceph-osd.9.log:49:2017-07-31 14:22:50.189023 7fbbbfe57700 
-1 log_channel(cluster) log [ERR] : 17.36 repair 3 errors, 0 fixed
/var/log/ceph/ceph-osd.9.log:3743:2017-07-31 14:39:23.345218 
7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 soid 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to pick 
suitable object info
/var/log/ceph/ceph-osd.9.log:3744:2017-07-31 14:40:42.625634 
7f2f623d6700 -1 log_channel(cluster) log [ERR] : 17.36 repair 3 errors, 
0 fixed


[ceph-users] Pg inconsistent / export_files error -5

2017-08-04 Thread Marc Roos

I have got a placement group inconsistency, and saw a manual describing 
how you can export the pg and import it on another osd. But I am 
getting an export error on every osd. 

What does this export_files error -5 actually mean? I thought 3 copies 
should be enough to secure your data.


> PG_DAMAGED Possible data damage: 1 pg inconsistent
>pg 17.36 is active+clean+inconsistent, acting [9,0,12]


> 2017-08-04 05:39:51.534489 7f2f623d6700 -1 log_channel(cluster) log 
[ERR] : 17.36 soid 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to pick 
suitable object info
> 2017-08-04 05:41:12.715393 7f2f623d6700 -1 log_channel(cluster) log 
[ERR] : 17.36 deep-scrub 3 errors
> 2017-08-04 15:21:12.445799 7f2f623d6700 -1 log_channel(cluster) log 
[ERR] : 17.36 soid 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to pick 
suitable object info
> 2017-08-04 15:22:35.646635 7f2f623d6700 -1 log_channel(cluster) log 
[ERR] : 17.36 repair 3 errors, 0 fixed

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 17.36 
--op export --file /tmp/recover.17.36

...
Read #17:6c9f811c:::rbd_data.1b42f52ae8944a.1a32:head#
Read #17:6ca035fc:::rbd_data.1fff61238e1f29.b31a:head#
Read #17:6ca0b4f8:::rbd_data.1fff61238e1f29.6fcc:head#
Read #17:6ca0ffbc:::rbd_data.1fff61238e1f29.a214:head#
Read #17:6ca10b29:::rbd_data.1fff61238e1f29.9923:head#
Read #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head#
Read #17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head#
Read #17:6ca1a791:::rbd_data.1fff61238e1f29.f101:head#
Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#
export_files error -5


Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-04 Thread Marc Roos
 
I am still on 12.1.1; it is still a test 3-node cluster, nothing much 
happening. The 2nd node had some issues a while ago: I had an osd.8 
that didn't want to start, so I replaced it. 



-Original Message-
From: David Turner [mailto:drakonst...@gmail.com] 
Sent: vrijdag 4 augustus 2017 17:52
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Pg inconsistent / export_files error -5

It _should_ be enough. What happened in your cluster recently? Power 
Outage, OSD failures, upgrade, added new hardware, any changes at all. 
What is your Ceph version?

On Fri, Aug 4, 2017 at 11:22 AM Marc Roos  
wrote:



I have got a placement group inconsistency, and saw some manual 
where
you can export and import this on another osd. But I am getting an
export error on every osd.

What does this export_files error -5 actually mean? I thought 3 
copies
should be enough to secure your data.


> PG_DAMAGED Possible data damage: 1 pg inconsistent
>pg 17.36 is active+clean+inconsistent, acting [9,0,12]


> 2017-08-04 05:39:51.534489 7f2f623d6700 -1 log_channel(cluster) 
log
[ERR] : 17.36 soid
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to 
pick
suitable object info
> 2017-08-04 05:41:12.715393 7f2f623d6700 -1 log_channel(cluster) 
log
[ERR] : 17.36 deep-scrub 3 errors
> 2017-08-04 15:21:12.445799 7f2f623d6700 -1 log_channel(cluster) 
log
[ERR] : 17.36 soid
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4: failed to 
pick
suitable object info
> 2017-08-04 15:22:35.646635 7f2f623d6700 -1 log_channel(cluster) 
log
[ERR] : 17.36 repair 3 errors, 0 fixed

ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 
17.36
--op export --file /tmp/recover.17.36

...
Read #17:6c9f811c:::rbd_data.1b42f52ae8944a.1a32:head#
Read #17:6ca035fc:::rbd_data.1fff61238e1f29.b31a:head#
Read #17:6ca0b4f8:::rbd_data.1fff61238e1f29.6fcc:head#
Read #17:6ca0ffbc:::rbd_data.1fff61238e1f29.a214:head#
Read #17:6ca10b29:::rbd_data.1fff61238e1f29.9923:head#
Read #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head#
Read #17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head#
Read #17:6ca1a791:::rbd_data.1fff61238e1f29.f101:head#
Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#
export_files error -5





[ceph-users] 1 pg inconsistent, 1 pg unclean, 1 pg degraded

2017-08-07 Thread Marc Roos

I tried to fix the inconsistent pg by taking osd.12 out, hoping the 
data would be copied to a different osd and that that copy would become 
the active one. 

- Would deleting the whole image in the rbd pool solve this? (or would 
it fail because of this status)

- Should I have done this rather with osd 9?

- Can't I force ceph to just use one of the osds' copies of the 4MB 
object and then e.g. run fsck on the vm that has this image? 


> ok
> PG_STAT STATE UP   UP_PRIMARY 
ACTING   ACTING_PRIMARY
> 17.36   active+degraded+remapped+inconsistent [9,0,13]  9 
[9,0,12]  9





{
"state": "active+degraded+remapped+inconsistent",
"snap_trimq": "[]",
"epoch": 8687,
"up": [
9,
0,
13
],
"acting": [
9,
0,
12
],
"backfill_targets": [
"13"
],
"actingbackfill": [
"0",
"9",
"12",
"13"
],
"info": {
"pgid": "17.36",
"last_update": "8686'95650",
"last_complete": "0'0",
"log_tail": "8387'91830",
"last_user_version": 95650,
"last_backfill": "MAX",
"last_backfill_bitwise": 1,
"purged_snaps": [
{
"start": "1",
"length": "3"
}
],
"history": {
"epoch_created": 3636,
"epoch_pool_created": 3636,
"last_epoch_started": 8685,
"last_interval_started": 8684,
"last_epoch_clean": 8487,
"last_interval_clean": 8486,
"last_epoch_split": 0,
"last_epoch_marked_full": 0,
"same_up_since": 8683,
"same_interval_since": 8684,
"same_primary_since": 8392,
"last_scrub": "8410'93917",
"last_scrub_stamp": "2017-08-05 19:42:14.906055",
"last_deep_scrub": "8410'93917",
"last_deep_scrub_stamp": "2017-08-05 19:42:14.906055",
"last_clean_scrub_stamp": "2017-07-29 20:21:18.626777"
},
"stats": {
"version": "8686'95650",
"reported_seq": "141608",
"reported_epoch": "8687",
"state": "active+degraded+remapped+inconsistent",
"last_fresh": "2017-08-07 17:16:25.902001",
"last_change": "2017-08-07 17:16:25.902001",
"last_active": "2017-08-07 17:16:25.902001",
"last_peered": "2017-08-07 17:16:25.902001",
"last_clean": "2017-08-06 16:52:33.999429",
"last_became_active": "2017-08-07 13:01:14.646736",
"last_became_peered": "2017-08-07 13:01:14.646736",
"last_unstale": "2017-08-07 17:16:25.902001",
"last_undegraded": "2017-08-07 13:01:13.683550",
"last_fullsized": "2017-08-07 17:16:25.902001",
"mapping_epoch": 8684,
"log_start": "8387'91830",
"ondisk_log_start": "8387'91830",
"created": 3636,
"last_epoch_clean": 8487,
"parent": "0.0",
"parent_split_bits": 0,
"last_scrub": "8410'93917",
"last_scrub_stamp": "2017-08-05 19:42:14.906055",
"last_deep_scrub": "8410'93917",
"last_deep_scrub_stamp": "2017-08-05 19:42:14.906055",
"last_clean_scrub_stamp": "2017-07-29 20:21:18.626777",
"log_size": 3820,
"ondisk_log_size": 3820,
"stats_invalid": false,
"dirty_stats_invalid": false,
"omap_stats_invalid": false,
"hitset_stats_invalid": false,
"hitset_bytes_stats_invalid": false,
"pin_stats_invalid": false,
"stat_sum": {
"num_bytes": 7953924096,
"num_objects": 1910,
"num_object_clones": 30,
"num_object_copies": 5730,
"num_objects_missing_on_primary": 1,
"num_objects_missing": 0,
"num_objects_degraded": 0,
"num_objects_misplaced": 1596,
"num_objects_unfound": 1,
"num_objects_dirty": 1910,
"num_whiteouts": 0,
"num_read": 32774,
"num_read_kb": 1341008,
"num_write": 189252,
"num_write_kb": 9369192,
"num_scrub_errors": 3,
"num_shallow_scrub_errors": 0,
"num_deep_scrub_errors": 3,
"num_objects_recovered": 417,
"num_bytes_recovered": 1732386816,
"num_keys_recovered": 0,
"num_objects_omap": 0,
"num_objects_hit_set_archive": 0,
"num_bytes_hit_set_archive": 0,
"num_flush": 0,
"num_flush_kb": 0,
"num_evict": 0,
"num_evict_kb": 0,
"num_promote": 0,
"num_flush_mode_high": 0,
"num_flush_mode

Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-08 Thread Marc Roos
rror ", 19) = 
19 <0.24>
6814  16:08:19.261774 write(2, "-5", 2) = 2 <0.18>
6814  16:08:19.261841 write(2, "\n", 1) = 1 <0.16>
6814  16:08:19.262191 madvise(0x7f4839106000, 16384, MADV_DONTNEED) = 0 
<0.15>
6814  16:08:19.262229 madvise(0x7f483914e000, 16384, MADV_DONTNEED) = 0 
<0.12>
6814  16:08:19.262295 madvise(0x7f48389e6000, 49152, MADV_DONTNEED) = 0 
<0.13>
6814  16:08:19.262498 madvise(0x7f48390ea000, 16384, MADV_DONTNEED) = 0 
<0.13>
6814  16:08:19.262538 madvise(0x7f48390ce000, 16384, MADV_DONTNEED) = 0 
<0.12>
6814  16:08:19.262580 madvise(0x7f483c228000, 24576, MADV_DONTNEED) = 0 
<0.12>
6814  16:08:19.263047 madvise(0x7f48393d8000, 16384, MADV_DONTNEED) = 0 
<0.13>
6814  16:08:19.263081 madvise(0x7f48393d8000, 32768, MADV_DONTNEED) = 0 
<0.16>


I was curious how this would compare to the osd.9

object_info: 
17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head(5236'7640 
client.2074638.1:704364 dirty|data_digest|omap_digest s 4194304 uv 7922 
dd 3bcff64d od  alloc_hint [4194304 4194304 0])
data section offset=0 len=1048576
data section offset=1048576 len=1048576
data section offset=2097152 len=1048576
data section offset=3145728 len=1048576
attrs size 2
omap map size 0
Read #17:6ca1a791:::rbd_data.1fff61238e1f29.f101:head#
size=4194304
object_info: 
17:6ca1a791:::rbd_data.1fff61238e1f29.f101:head(5387'35553 
client.2096993.0:123721 dirty|data_digest|omap_digest s 4194304 uv 35752 
dd f9bc0fbd od  alloc_hint [4194304 4194304 0])
data section offset=0 len=1048576
data section offset=1048576 len=1048576
data section offset=2097152 len=1048576
data section offset=3145728 len=1048576
attrs size 2
omap map size 0
Read #17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#
size=4194304
object_info: 
17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4(5390'56613 
client.2096907.1:3222443 dirty|omap_digest s 4194304 uv 55477 od 
 alloc_hint [0 0 0])
2017-08-08 16:22:00.893216 7f94e10f5100 -1 
bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 
checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, device 
location [0x2daea~1000], logical extent 0x0~1000, object 
#17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#
export_files error -5
2017-08-08 16:22:00.895439 7f94e10f5100  1 
bluestore(/var/lib/ceph/osd/ceph-9) umount
2017-08-08 16:22:00.963774 7f94e10f5100  1 freelist shutdown
2017-08-08 16:22:00.963861 7f94e10f5100  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:217] 
Shutdown: canceling all background work
2017-08-08 16:22:00.968438 7f94e10f5100  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:343] Shutdown 
complete
2017-08-08 16:22:00.984583 7f94e10f5100  1 bluefs umount
2017-08-08 16:22:01.026784 7f94e10f5100  1 bdev(0x7f94e3670e00 
/var/lib/ceph/osd/ceph-9/block) close
2017-08-08 16:22:01.243361 7f94e10f5100  1 bdev(0x7f94e34b5a00 
/var/lib/ceph/osd/ceph-9/block) close


23555 16:26:31.336061 io_getevents(139955679129600, 1, 16,  
23552 16:26:31.336081 futex(0x7ffe7e4c9210, FUTEX_WAKE_PRIVATE, 1) = 0 
<0.000155>
23552 16:26:31.336452 futex(0x7f49fb4d20bc, FUTEX_WAKE_OP_PRIVATE, 1, 1, 
0x7f49fb4d20b8, {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1}) = 1 <0.000129>
23553 16:26:31.336637 <... futex resumed> ) = 0 <16.434259>
23553 16:26:31.336758 futex(0x7f49fb4d2038, FUTEX_WAKE_PRIVATE, 1 

23552 16:26:31.336801 madvise(0x7f4a0cafa000, 2555904, MADV_DONTNEED 

23553 16:26:31.336915 <... futex resumed> ) = 0 <0.000113>
23552 16:26:31.336959 <... madvise resumed> ) = 0 <0.000148>
23553 16:26:31.337040 futex(0x7f49fb4d20bc, FUTEX_WAIT_PRIVATE, 55, NULL 

23552 16:26:31.337070 madvise(0x7f4a0ca7a000, 3080192, MADV_DONTNEED) = 
0 <0.000180>
23552 16:26:31.337424 write(2, "export_files error ", 19) = 
19 <0.000104>
23552 16:26:31.337615 write(2, "-5", 2) = 2 <0.17>
23552 16:26:31.337674 write(2, "\n", 1) = 1 <0.37>
23552 16:26:31.338270 madvise(0x7f4a01ae4000, 16384, MADV_DONTNEED) = 0 
<0.20>
23552 16:26:31.338320 madvise(0x7f4a018cc000, 49152, MADV_DONTNEED) = 0 
<0.14>
23552 16:26:31.338561 madvise(0x7f4a0770a000, 24576, MADV_DONTNEED) = 0 
<0.15>
23552 16:26:31.339161 madvise(0x7f4a02102000, 16384, MADV_DONTNEED) = 0 
<0.15>
23552 16:26:31.339201 madvise(0x7f4a02132000, 16384, MADV_DONTNEED) = 0 
<0.13>
23552 16:26:31.339235 madvise(0x7f4a02102000, 32768, MADV_DONTNEED) = 0

Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-09 Thread Marc Roos
/src/rocksdb/db/db_impl.cc:217] 
Shutdown: canceling all background work
2017-08-09 11:41:25.471514 7f26db8ae100  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:343] Shutdown 
complete
2017-08-09 11:41:25.686088 7f26db8ae100  1 bluefs umount
2017-08-09 11:41:25.705389 7f26db8ae100  1 bdev(0x7f26de472e00 
/var/lib/ceph/osd/ceph-0/block) close
2017-08-09 11:41:25.944548 7f26db8ae100  1 bdev(0x7f26de2b3a00 
/var/lib/ceph/osd/ceph-0/block) close












-Original Message-
From: Sage Weil [mailto:s...@newdream.net] 
Sent: woensdag 9 augustus 2017 4:44
To: Brad Hubbard
Cc: Marc Roos; ceph-users
Subject: Re: [ceph-users] Pg inconsistent / export_files error -5

On Wed, 9 Aug 2017, Brad Hubbard wrote:
> Wee
> 
> On Wed, Aug 9, 2017 at 12:41 AM, Marc Roos  
wrote:
> >
> >
> >
> > The --debug indeed comes up with something
> > bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 
> > checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, 
> > device location [0x15a017~1000], logical extent 0x0~1000,
> >  bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 
> > checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, 
> > device location [0x2daea~1000], logical extent 0x0~1000,

What about the 3rd OSD?

It would be interesting to capture the fsck output for one of these.  
Stop the OSD, and then run

 ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --log-file 
out \
--debug-bluestore 30 --no-log-to-stderr

That'll generate a pretty huge log, but should include dumps of onode 
metadata and will hopefully include something else with the checksum of
0x100ac314 so we can get some clue as to where the bad data came from.

Thanks!
sage


> >
> > I dont know how to interpret this, but am I correct to understand 
> > that data has been written across the cluster to these 3 osd's and 
> > all 3 have somehow received something different?
> 
> Did you run this command on OSD 0? What was the output in that case?
> 
> Possibly, all we currently know for sure is that the crc32c checksum 
> for the object on OSDs 12 and 9 do not match the expected checksum 
> according to the code when we attempt to read the object 
> #17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#. There 
> seems to be some history behind this based on your previous emails 
> regarding these OSDs (12,9,0, and possibly 13) could you give us as 
> much detail as possible about how this issue came about and what you 
> have done in the interim to try to resolve it?
> 
> When was the first indication there was a problem with pg 17.36? Did 
> this correspond with any significant event?
> 
> Are these OSDs all on separate hosts?
> 
> It's possible ceph-bluestore-tool may help here but I would hold off 
> on that option until we understand the issue better.
> 
> 
> >
> >
> > size=4194304 object_info:
> > 17:6ca10b29:::rbd_data.1fff61238e1f29.9923:head(5387'351
> > 57
> > client.2096993.0:78941 dirty|data_digest|omap_digest s 4194304 uv 
> > 35356 dd f53dff2e od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > offset=2097152 len=1048576 data section offset=3145728 len=1048576 
> > attrs size
> > 2 omap map size 0 Read
> > #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head# 
> > size=4194304
> > object_info:
> > 17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head(5163'713
> > 6
> > client.2074638.1:483264 dirty|data_digest|omap_digest s 4194304 uv 
> > 7418 dd 43d61c5d od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > offset=2097152 len=1048576 data section offset=3145728 len=1048576 
> > attrs size
> > 2 omap map size 0 Read
> > #17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head# 
> > size=4194304
> > object_info:
> > 17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head(5236'764
> > 0
> > client.2074638.1:704364 dirty|data_digest|omap_digest s 4194304 uv 
> > 7922 dd 3bcff64d od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > offset=2097152 len=1048576 data section offset=3145728 len=1048576 
> > attrs size
> > 2 omap map size 0 Read
> > #17:6ca1a791:::rbd_data.1fff61238e1f29.0

[ceph-users] Luminous release + collectd plugin

2017-08-11 Thread Marc Roos

I am not sure if I am the only one having this, but there is an issue 
with the collectd plugin and the luminous release. I think I didn't 
have this in Kraken; it looks like something changed in the JSON? I 
also reported it here: https://github.com/collectd/collectd/issues/2343. 
I have no idea who is responsible for this, but it would be nice to 
have it working by the final release of luminous.


Aug 11 18:16:06 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.0,i=1,st=4): error 1
Aug 11 18:16:06 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 11 18:16:06 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 11 18:16:06 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.6,i=2,st=4): error 1
Aug 11 18:16:06 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 11 18:16:06 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 11 18:16:06 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.7,i=3,st=4): error 1
Aug 11 18:16:06 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 11 18:16:06 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 11 18:16:06 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.8,i=4,st=4): error 1
Aug 11 18:16:06 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 11 18:16:06 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 11 18:16:06 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.7,i=3,st=4): error 1
Aug 11 18:16:06 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 11 18:16:06 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
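In case it helps anyone debugging the same thing, the JSON the plugin 
parses can be dumped straight from the daemon admin socket (osd.0 is 
just an example id here):

```shell
# Schema the collectd ceph plugin initializes its datasets from
ceph daemon osd.0 perf schema | python -m json.tool | less

# Current counter values, e.g. the bluestore kv flush latency the
# plugin complains about
ceph daemon osd.0 perf dump | python -m json.tool | grep -A 3 kv_flush_lat
```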


[ceph-users] Luminous / auto application enable detection

2017-08-13 Thread Marc Roos

FYI, when creating these rgw pools, not all of them automatically get 
their application enabled.

I created these

ceph osd pool create default.rgw 
ceph osd pool create default.rgw.meta 
ceph osd pool create default.rgw.control 
ceph osd pool create default.rgw.log 
ceph osd pool create .rgw.root 
ceph osd pool create .rgw.gc 
ceph osd pool create .rgw.buckets 
ceph osd pool create .rgw.buckets.index
ceph osd pool create .rgw.buckets.extra 
ceph osd pool create .intent-log 
ceph osd pool create .usage 
ceph osd pool create .users 
ceph osd pool create .users.email 
ceph osd pool create .users.swift 
ceph osd pool create .users.uid 

And I get only these notifications,

POOL_APP_NOT_ENABLED application not enabled on 4 pool(s)
application not enabled on pool '.rgw.root'
application not enabled on pool 'default.rgw.control'
application not enabled on pool 'default.rgw.log'
application not enabled on pool 'default.rgw.meta'
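I silenced the warnings by tagging the four remaining pools by hand; a 
minimal sketch, assuming they should all carry the rgw application:

```shell
# Tag each complaining pool with the rgw application
for pool in .rgw.root default.rgw.control default.rgw.log default.rgw.meta; do
    ceph osd pool application enable "$pool" rgw
done

# Verify that every pool now has an application set
ceph osd pool ls detail | grep application
```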


Re: [ceph-users] Pg inconsistent / export_files error -5

2017-08-14 Thread Marc Roos
db: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:217] 
Shutdown: canceling all background work
2017-08-09 11:41:25.471514 7f26db8ae100  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.1.1/rpm/el7/BUILD/ceph-12.1.1/src/rocksdb/db/db_impl.cc:343] Shutdown 
complete
2017-08-09 11:41:25.686088 7f26db8ae100  1 bluefs umount
2017-08-09 11:41:25.705389 7f26db8ae100  1 bdev(0x7f26de472e00 
/var/lib/ceph/osd/ceph-0/block) close
2017-08-09 11:41:25.944548 7f26db8ae100  1 bdev(0x7f26de2b3a00 
/var/lib/ceph/osd/ceph-0/block) close












-Original Message-
From: Sage Weil [mailto:s...@newdream.net]
Sent: woensdag 9 augustus 2017 4:44
To: Brad Hubbard
Cc: Marc Roos; ceph-users
Subject: Re: [ceph-users] Pg inconsistent / export_files error -5

On Wed, 9 Aug 2017, Brad Hubbard wrote:
> Wee
> 
> On Wed, Aug 9, 2017 at 12:41 AM, Marc Roos  
wrote:
> >
> >
> >
> > The --debug indeed comes up with something
> > bluestore(/var/lib/ceph/osd/ceph-12) _verify_csum bad crc32c/0x1000 
> > checksum at blob offset 0x0, got 0x100ac314, expected 0x90407f75, 
> > device location [0x15a017~1000], logical extent 0x0~1000,
> >  bluestore(/var/lib/ceph/osd/ceph-9) _verify_csum bad crc32c/0x1000 
> > checksum at blob offset 0x0, got 0xb40b26a7, expected 0x90407f75, 
> > device location [0x2daea~1000], logical extent 0x0~1000,

What about the 3rd OSD?

It would be interesting to capture the fsck output for one of these.  
Stop the OSD, and then run

 ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12 --log-file 
out \
--debug-bluestore 30 --no-log-to-stderr

That'll generate a pretty huge log, but should include dumps of onode 
metadata and will hopefully include something else with the checksum of
0x100ac314 so we can get some clue as to where the bad data came from.

Thanks!
sage


> >
> > I dont know how to interpret this, but am I correct to understand 
> > that data has been written across the cluster to these 3 osd's and 
> > all 3 have somehow received something different?
> 
> Did you run this command on OSD 0? What was the output in that case?
> 
> Possibly, all we currently know for sure is that the crc32c checksum 
> for the object on OSDs 12 and 9 do not match the expected checksum 
> according to the code when we attempt to read the object 
> #17:6ca1f70a:::rbd_data.1f114174b0dc51.0974:4#. There 
> seems to be some history behind this based on your previous emails 
> regarding these OSDs (12,9,0, and possibly 13) could you give us as 
> much detail as possible about how this issue came about and what you 
> have done in the interim to try to resolve it?
> 
> When was the first indication there was a problem with pg 17.36? Did 
> this correspond with any significant event?
> 
> Are these OSDs all on separate hosts?
> 
> It's possible ceph-bluestore-tool may help here but I would hold off 
> on that option until we understand the issue better.
> 
> 
> >
> >
> > size=4194304 object_info:
> > 17:6ca10b29:::rbd_data.1fff61238e1f29.9923:head(5387'351
> > 57
> > client.2096993.0:78941 dirty|data_digest|omap_digest s 4194304 uv
> > 35356 dd f53dff2e od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > offset=2097152 len=1048576 data section offset=3145728 len=1048576 
> > attrs size
> > 2 omap map size 0 Read
> > #17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head#
> > size=4194304
> > object_info:
> > 17:6ca11ab9:::rbd_data.1fa8ef2ae8944a.11b4:head(5163'713
> > 6
> > client.2074638.1:483264 dirty|data_digest|omap_digest s 4194304 uv
> > 7418 dd 43d61c5d od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > offset=2097152 len=1048576 data section offset=3145728 len=1048576 
> > attrs size
> > 2 omap map size 0 Read
> > #17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head#
> > size=4194304
> > object_info:
> > 17:6ca13bed:::rbd_data.1f114174b0dc51.02c6:head(5236'764
> > 0
> > client.2074638.1:704364 dirty|data_digest|omap_digest s 4194304 uv
> > 7922 dd 3bcff64d od  alloc_hint [4194304 4194304 0]) data 
> > section offset=0
> > len=1048576 data section offset=1048576 len=1048576 data section
> > of

[ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

2017-08-19 Thread Marc Roos


Where can you get the nfs-ganesha-ceph rpm? Is there a repository that 
has these?






Re: [ceph-users] RBD encryption options?

2017-08-22 Thread Marc Roos
 

I had some issues with the iscsi software starting too early; maybe 
this can give you some ideas.


systemctl show target.service -p After

mkdir /etc/systemd/system/target.service.d

cat << 'EOF' > /etc/systemd/system/target.service.d/10-waitforrbd.conf
[Unit]
After=systemd-journald.socket sys-kernel-config.mount system.slice 
basic.target network.target local-fs.target rbdmap.service
EOF
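After writing the drop-in, reload systemd and check that the ordering 
picked it up:

```shell
# Pick up the new drop-in file
systemctl daemon-reload

# The After= list should now include rbdmap.service
systemctl show target.service -p After
```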


-Original Message-
From: Daniel K [mailto:satha...@gmail.com] 
Sent: dinsdag 22 augustus 2017 3:03
To: ceph-users@lists.ceph.com
Subject: [ceph-users] RBD encryption options?

Are there any client-side options to encrypt an RBD device?

Using latest luminous RC, on Ubuntu 16.04 and a 4.10 kernel

I assumed adding client-side encryption would be as simple as using 
luks/dm-crypt/cryptsetup after adding the RBD device to /etc/ceph/rbdmap 
and enabling the rbdmap service -- but I failed to consider the order in 
which things load, and it appears that the RBD gets mapped too late for 
dm-crypt to recognize it as valid. It just keeps telling me it's not a 
valid LUKS device.

I know you can run the OSDs on an encrypted drive, but I was hoping for 
something client side, since it's not exactly simple (as far as I can 
tell) to restrict client access to a single RBD (or group of RBDs) 
within a shared pool.

Any suggestions?






[ceph-users] Cephfs user path permissions luminous

2017-08-23 Thread Marc Roos


ceph fs authorize cephfs client.bla /bla rw

Will generate a user with these permissions 

[client.bla]
caps mds = "allow rw path=/bla"
caps mon = "allow r"
caps osd = "allow rw pool=fs_data"

With those permissions I cannot mount; I get a permission denied until 
I change the permissions to e.g. these:

caps mds = "allow r, allow rw path=/bla"
caps mon = "allow r"
caps osd = "allow rwx pool=fs_meta,allow rwx pool=fs_data"

Are these the minimum required permissions for mounting? I guess this 
should also be updated for ceph fs authorize?
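For reference, this is roughly what I ran to widen the caps and test 
the mount (the monitor address and secret file path are placeholders):

```shell
# Replace the generated caps with the set that worked for me
ceph auth caps client.bla \
    mds 'allow r, allow rw path=/bla' \
    mon 'allow r' \
    osd 'allow rwx pool=fs_meta,allow rwx pool=fs_data'

# Mount the subtree with the kernel client
mount -t ceph 192.168.10.1:6789:/bla /mnt/bla \
    -o name=bla,secretfile=/etc/ceph/client.bla.secret
```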



Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

2017-08-29 Thread Marc Roos
 
Ali, very nice! I was creating the rpms based on an old rpm source 
spec. It was a hassle to get them to build, and I am not sure if I 
even used the correct compile settings.



-Original Message-
From: Ali Maredia [mailto:amare...@redhat.com] 
Sent: maandag 28 augustus 2017 22:29
To: TYLin
Cc: Marc Roos; ceph-us...@ceph.com
Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

Marc,

These rpms (and debs) are built with the latest ganesha 2.5 stable 
release and the latest luminous release on download.ceph.com:

http://download.ceph.com/nfs-ganesha/

I just put them up late last week, and I will be maintaining them in the 
future.

-Ali

- Original Message -
> From: "TYLin" 
> To: "Marc Roos" 
> Cc: ceph-us...@ceph.com
> Sent: Sunday, August 20, 2017 11:58:05 PM
> Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7
> 
> You can get rpm from here
> 
> https://download.gluster.org/pub/gluster/glusterfs/nfs-ganesha/old/2.3
> .0/CentOS/nfs-ganesha.repo
> 
> You have to fix the path mismatch error in the repo file manually.
> 
> > On Aug 20, 2017, at 5:38 AM, Marc Roos  
wrote:
> > 
> > 
> > 
> > Where can you get the nfs-ganesha-ceph rpm? Is there a repository 
> > that has these?
> > 
> > 
> > 
> > 
> 
> 




Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

2017-08-29 Thread Marc Roos
 
nfs-ganesha-2.5.2-.el7.x86_64.rpm 
 ^
Is this correct?

-Original Message-
From: Marc Roos 
Sent: dinsdag 29 augustus 2017 11:40
To: amaredia; wooertim
Cc: ceph-users
Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

 
Ali, very nice! I was creating the rpms based on an old rpm source 
spec. It was a hassle to get them to build, and I am not sure if I 
even used the correct compile settings.



-Original Message-
From: Ali Maredia [mailto:amare...@redhat.com]
Sent: maandag 28 augustus 2017 22:29
To: TYLin
Cc: Marc Roos; ceph-us...@ceph.com
Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

Marc,

These rpms (and debs) are built with the latest ganesha 2.5 stable 
release and the latest luminous release on download.ceph.com:

http://download.ceph.com/nfs-ganesha/

I just put them up late last week, and I will be maintaining them in the 
future.

-Ali

- Original Message -
> From: "TYLin" 
> To: "Marc Roos" 
> Cc: ceph-us...@ceph.com
> Sent: Sunday, August 20, 2017 11:58:05 PM
> Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7
> 
> You can get rpm from here
> 
> https://download.gluster.org/pub/gluster/glusterfs/nfs-ganesha/old/2.3
> .0/CentOS/nfs-ganesha.repo
> 
> You have to fix the path mismatch error in the repo file manually.
> 
> > On Aug 20, 2017, at 5:38 AM, Marc Roos 
wrote:
> > 
> > 
> > 
> > Where can you get the nfs-ganesha-ceph rpm? Is there a repository 
> > that has these?
> > 
> > 
> > 
> > 
> 
> 




[ceph-users] Centos7, luminous, cephfs, .snaps

2017-08-29 Thread Marc Roos

Where can I find some examples of creating a snapshot on a directory? 
Can I just do mkdir .snaps? I tried with the stock kernel and 4.12.9-1.
http://docs.ceph.com/docs/luminous/dev/cephfs-snapshots/








Re: [ceph-users] v12.2.0 Luminous released , collectd json update?

2017-08-30 Thread Marc Roos
 


Now that 12.2.0 is released, how and to whom should patches for 
collectd be submitted?

Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.8,i=4,st=4): error 1
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.3,i=0,st=4): error 1
Aug 30 10:40:42 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.0,i=1,st=4): error 1
Aug 30 10:40:42 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.6,i=2,st=4): error 1
Aug 30 10:40:42 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.7,i=3,st=4): error 1
Aug 30 10:40:42 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Aug 30 10:40:42 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Aug 30 10:40:42 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.8,i=4,st=4): error 1





-Original Message-
From: Abhishek Lekshmanan [mailto:abhis...@suse.com] 
Sent: dinsdag 29 augustus 2017 20:20
To: ceph-de...@vger.kernel.org; ceph-us...@ceph.com; 
ceph-maintain...@ceph.com; ceph-annou...@ceph.com
Subject: [ceph-users] v12.2.0 Luminous released


We're glad to announce the first release of Luminous v12.2.x long term 
stable release series. There have been major changes since Kraken
(v11.2.z) and Jewel (v10.2.z), and the upgrade process is non-trivial.
Please read the release notes carefully.

For more details, links & changelog please refer to the complete release 
notes entry at the Ceph blog:
http://ceph.com/releases/v12-2-0-luminous-released/


Major Changes from Kraken
-

- *General*:
  * Ceph now has a simple, built-in web-based dashboard for monitoring 
cluster
status.

- *RADOS*:
  * *BlueStore*:
- The new *BlueStore* backend for *ceph-osd* is now stable and the
  new default for newly created OSDs.  BlueStore manages data
  stored by each OSD by directly managing the physical HDDs or
  SSDs without the use of an intervening file system like XFS.
  This provides greater performance and features.
- BlueStore supports full data and metadata checksums
  of all data stored by Ceph.
- BlueStore supports inline compression using zlib, snappy, or LZ4. 
(Ceph
  also supports zstd for RGW compression but zstd is not recommended 
for
  BlueStore for performance reasons.)

  * *Erasure coded* pools now have full support for overwrites
allowing them to be used with RBD and CephFS.

  * *ceph-mgr*:
- There is a new daemon, *ceph-mgr*, which is a required part of
  any Ceph deployment.  Although IO can continue when *ceph-mgr*
  is down, metrics will not refresh and some metrics-related calls
  (e.g., `ceph df`) may block.  We recommend deploying several
  instances of *ceph-mgr* for reliability.  See the notes on
  Upgrading below.
- The *ceph-mgr* daemon includes a REST-based management API.
  The API is still experimental and somewhat limited but
  will form the basis for API-based management of Ceph going 
forward.
- ceph-mgr also includes a Prometheus exporter plugin, which can 
provide Ceph
  perfcounters to Prometheus.
- ceph-mgr now has a Zabbix plugin. Using zabbix_sender it sends 
trapper
  events to a Zabbix server containing high-level information of the 
Ceph
  cluster. This makes it easy to monitor a Ceph cluster's status and 
send
  out notifications in case of a malfunction.

  * The overall *scalability* of the cluster has improved. We have
successfully tested clusters with up to 10,000 OSDs.
  * Each OSD can now have a device class associated with
it (e.g., `hdd` or `ssd`), allowing CRUSH rules to trivially map
data to a subset of devices in the system.  Manually writing CRUSH
    rules or manual editing of the CRUSH map is normally not required.
  * There is a new upmap exception mechanism that allows individual PGs 
to be moved around to achieve
a *perfect distribution* (this requires luminous clients).
  * Each OSD now adjusts its default configuration based on whether the
backing device is an HDD or SSD. Manual tuning is generally not 
required.
  * The prototype mClock QoS queueing algorithm is now available.
  * There is now a *backoff* mechanism that prevents OSDs from being
overloaded by requests to objects or PGs that are not currently able 
to
process 

Re: [ceph-users] Centos7, luminous, cephfs, .snaps

2017-08-30 Thread Marc Roos

I noticed it is .snap, not .snaps.

[]# cd test/
[]# ls -arlt
total 614400
drwxr-xr-x 1 root root13 Aug 29 23:39 ..
-rw-r--r-- 1 root root 209715200 Aug 29 23:39 test1.img
-rw-r--r-- 1 root root 209715200 Aug 29 23:39 test2.img
-rw-r--r-- 1 root root 209715200 Aug 29 23:40 test3.img
drwxr-xr-x 1 root root 4 Aug 30 00:00 .
-rw-r--r-- 1 root root 0 Aug 30 00:00 test
[]# ls -l .snap
total 0
[]# mkdir .snap/snap1
mkdir: cannot create directory ‘.snap/snap1’: Operation not permitted

Is this because my permissions are insufficient on the client id?

caps: [mds] allow r, allow rw path=/nfs
caps: [mon] allow r
caps: [osd] allow rwx pool=fs_meta,allow rwx pool=fs_data 
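
One likely cause, rather than the client caps (the client already has 
rw on /nfs): on Luminous, CephFS snapshots are still disabled by 
default, and mkdir under .snap returns "Operation not permitted" until 
they are enabled cluster-wide. A hedged sketch — the exact command form 
varies by release:

```shell
# Enable snapshot creation (Luminous-era syntax; newer releases use
# "ceph fs set <fsname> allow_new_snaps true"):
ceph mds set allow_new_snaps true --yes-i-really-mean-it

# After that, creating and removing a snapshot is just mkdir/rmdir
# inside the .snap directory:
mkdir .snap/snap1
rmdir .snap/snap1
```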





-Original Message-
From: Marc Roos 
Sent: dinsdag 29 augustus 2017 23:48
To: ceph-users
Subject: [ceph-users] Centos7, luminous, cephfs, .snaps


Where can I find some examples of creating a snapshot on a directory? 
Can I just do mkdir .snaps? I tried with the stock kernel and 4.12.9-1.
http://docs.ceph.com/docs/luminous/dev/cephfs-snapshots/








[ceph-users] Correct osd permissions

2017-08-30 Thread Marc Roos

I have some OSDs with these permissions, and some without the mgr cap. 
What are the correct ones to have for luminous?

osd.0
caps: [mgr] allow profile osd
caps: [mon] allow profile osd
caps: [osd] allow *

osd.14
caps: [mon] allow profile osd
caps: [osd] allow *
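
The osd.0 set — with the mgr cap — is the one Luminous expects. A 
hedged sketch of bringing osd.14 in line; note that ceph auth caps 
replaces the entity's caps wholesale, so every existing cap must be 
restated:

```shell
# Add the mgr profile that Luminous OSDs use to report to ceph-mgr.
# "ceph auth caps" overwrites ALL caps for the entity, so the existing
# mon and osd caps are repeated here alongside the new mgr cap.
ceph auth caps osd.14 \
    mgr 'allow profile osd' \
    mon 'allow profile osd' \
    osd 'allow *'

# Verify the result:
ceph auth get osd.14
```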


Re: [ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 ->v12.2.0

2017-08-30 Thread Marc Roos
 
I had this once as well. If you update all nodes and then run systemctl 
restart 'ceph-osd@*' on all nodes, you should be fine. But restart the 
monitors first, of course.
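
The restart sequence described above, sketched per node (unit names as 
shipped with the Luminous packages):

```shell
# On the monitor nodes first:
systemctl restart ceph-mon.target

# Then, once the mons are upgraded and back in quorum, on each OSD node:
systemctl restart 'ceph-osd@*'
```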



-Original Message-
From: Thomas Gebhardt [mailto:gebha...@hrz.uni-marburg.de] 
Sent: woensdag 30 augustus 2017 14:10
To: ceph-users@lists.ceph.com
Subject: [ceph-users] osd heartbeat protocol issue on upgrade v12.1.0 
->v12.2.0

Hello,

when I upgraded (yet a single osd node) from v12.1.0 -> v12.2.0 its osds 
start flapping and finally got all marked as down.

As far as I can see, this is due to an incompatibility of the osd 
heartbeat protocol between the two versions:

v12.2.0 node:
7f4f7b6e6700 -1 osd.X 3879 heartbeat_check: no reply from x.x.x.x: 
osd.Y ever on either front or back, first ping sent ...

v12.1.0 node:
7fd854ebf700 -1 failed to decode message of type 70 v4:
buffer::malformed_input: void
osd_peer_stat_t::decode(ceph::buffer::list::iterator&) no longer 
understand old encoding version 1 < struct_compat

(It is puzzling that the *older* v12.1.0 node complains about the *old* 
encoding version of the *newer* v12.2.0 node.)

Any idea how I can go ahead?

Kind regards, Thomas


[ceph-users] (no subject)

2017-08-31 Thread Marc Roos

Should these messages not be gone in 12.2.0?

2017-08-31 20:49:33.500773 7f5aa1756d40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-08-31 20:49:33.501026 7f5aa1756d40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-08-31 20:49:33.540667 7f5aa1756d40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore

ceph-selinux-12.2.0-0.el7.x86_64
ceph-mon-12.2.0-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
ceph-base-12.2.0-0.el7.x86_64
ceph-osd-12.2.0-0.el7.x86_64
ceph-mgr-12.2.0-0.el7.x86_64
ceph-12.2.0-0.el7.x86_64
ceph-common-12.2.0-0.el7.x86_64
ceph-mds-12.2.0-0.el7.x86_64





[ceph-users] Command that lists all client connections (with ips)?

2017-09-05 Thread Marc Roos
What would be the best way to get an overview of all client 
connections? Something similar to the output of rbd lock list.


  cluster:
1 clients failing to respond to capability release
1 MDSs report slow requests


ceph daemon mds.a dump_ops_in_flight
{
"ops": [
{
"description": "client_request(client.2342664:12 create 
#0x11b9177/..discinfo.hJpqTF 2017-09-05 09:56:43.419636 
caller_uid=500, caller_gid=500{500,1,2,3,4,6,10,})",
"initiated_at": "2017-09-05 09:56:43.419708",
"age": 5342.233837,
"duration": 5342.233857,
"type_data": {
"flag_point": "failed to wrlock, waiting",
"reqid": "client.2342664:12",
"op_type": "client_request",
"client_info": {
"client": "client.2342664",
"tid": 12
},
"events": [
{
"time": "2017-09-05 09:56:43.419708",
"event": "initiated"
},
{
"time": "2017-09-05 09:56:43.419913",
"event": "failed to wrlock, waiting"
}
]
}
}
],
"num_ops": 1
}

http://docs.ceph.com/docs/master/cephfs/troubleshooting/
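
For a per-daemon view of connected clients with their addresses, the 
admin sockets can help; a hedged sketch (the daemon names mds.a and 
mon.a are taken from the output above and may differ on your cluster):

```shell
# CephFS: list client sessions on the MDS, including each client's
# address, hostname, and kernel/fuse client version:
ceph daemon mds.a session ls

# RADOS level: the monitor's session list shows everything currently
# connected to that mon (output format varies by release):
ceph daemon mon.a sessions
```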





Re: [ceph-users] PCIe journal benefit for SSD OSDs

2017-09-07 Thread Marc Roos
 
Sorry to cut into your thread. 

> Have you disabled the FLUSH command for the Samsung ones?

We have a test cluster currently with only a spinner pool, but we have 
SM863 drives available to create the SSD pool. Is there something 
specific that needs to be done for the SM863?
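
For context, "disabling the FLUSH command" in this thread usually means 
marking the drive's volatile write cache as write-through, so the 
kernel stops issuing cache-flush commands. This is only safe on drives 
with full power-loss protection, which the SM863 has. A hedged sketch 
(the device name is an assumption; the sysfs knob exists on kernel 
4.7+):

```shell
# Check how the kernel currently treats the drive's write cache:
cat /sys/block/sdb/queue/write_cache

# Tell the kernel the cache is write-through, so flushes are skipped.
# ONLY do this on drives with power-loss protection (capacitors):
echo "write through" > /sys/block/sdb/queue/write_cache
```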




-Original Message-
From: Stefan Priebe - Profihost AG [mailto:s.pri...@profihost.ag] 
Sent: donderdag 7 september 2017 8:04
To: Christian Balzer; ceph-users
Subject: Re: [ceph-users] PCIe journal benefit for SSD OSDs

Hello,
Am 07.09.2017 um 03:53 schrieb Christian Balzer:
> 
> Hello,
> 
> On Wed, 6 Sep 2017 09:09:54 -0400 Alex Gorbachev wrote:
> 
>> We are planning a Jewel filestore based cluster for a performance 
>> sensitive healthcare client, and the conservative OSD choice is 
>> Samsung SM863A.
>>
> 
> While I totally see where you're coming from and me having stated that 

> I'll give Luminous and Bluestore some time to mature, I'd also be 
> looking into that if I were being in the planning phase now, with like 

> 3 months before deployment.
> The inherent performance increase with Bluestore (and having something 

> that hopefully won't need touching/upgrading for a while) shouldn't be 

> ignored.

Yes and that's the point where i'm currently as well. Thinking about how 
to design a new cluster based on bluestore.

> The SSDs are fine, I've been starting to use those recently (though 
> not with Ceph yet) as Intel DC S36xx or 37xx are impossible to get.
> They're a bit slower in the write IOPS department, but good enough for 
me.

I've never used the Intel DC ones, but always the Samsung. Are the 
Intel really faster? Have you disabled the FLUSH command for the 
Samsung ones? They don't skip the command automatically like the Intel 
do. Sadly the Samsung SM863 got more expensive over the last months. 
They were a lot cheaper in the first months of 2016. Maybe the 2.5" 
Optane Intel SSDs will change the game.

>> but was wondering if anyone has seen a positive impact from also 
>> using PCIe journals (e.g. Intel P3700 or even the older 910 series) 
>> in front of such SSDs?
>>
> NVMe journals (or WAL and DB space for Bluestore) are nice and can 
> certainly help, especially if Ceph is tuned accordingly.
> Avoid non DC NVMes, I doubt you can still get 910s, they are 
> officially EOL.
> You want to match capabilities and endurances, a DC P3700 800GB would 
> be an OK match for 3-4 SM863a 960GB for example.

That's a good point but makes the cluster more expensive. Currently 
while using filestore i use one SSD for journal and data which works 
fine.

With bluestore we've block, db and wal so we need 3 block devices per 
OSD. If we need one PCIe or NVMe device per 3-4 devices it get's much 
more expensive per host - currently running 10 OSDs / SSDs per Node.

Have you already done tests how he performance changes with bluestore 
while putting all 3 block devices on the same ssd?

Greets,
Stefan


Re: [ceph-users] output discards (queue drops) on switchport

2017-09-08 Thread Marc Roos
 

Afaik ceph is not supporting/working with bonding. 

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg35474.html
(thread: Maybe some tuning for bonded network adapters)




-Original Message-
From: Andreas Herrmann [mailto:andr...@mx20.org] 
Sent: vrijdag 8 september 2017 13:58
To: ceph-users@lists.ceph.com
Subject: [ceph-users] output discards (queue drops) on switchport

Hello,

I have a fresh Proxmox installation on 5 servers (Supermciro X10SRW-F, 
Xeon E5-1660 v4, 128 GB RAM) with each 8 Samsung SSD SM863 960GB 
connected to a LSI-9300-8i (SAS3008) controller used as OSDs for Ceph 
(12.1.2)

The servers are connected to two Arista DCS-7060CX-32S switches. I'm 
using MLAG bond (bondmode LACP, xmit_hash_policy layer3+4, MTU 9000):
 * backend network for Ceph: cluster network & public network
   Mellanox ConnectX-4 Lx dual-port 25 GBit/s
 * frontend network: Intel Corporation 82599ES 10-Gigabit SFI/SFP+ 
dual-port

Ceph is quite a default installation with size=3.

My problem:
I'm issuing a dd (dd if=/dev/urandom of=urandom.0 bs=10M count=1024) in 
a test virtual machine (the only one running in the cluster) with 
around 210 MB/s. I get output drops on all switchports. The drop rate 
is between 0.1 - 0.9 %. The drop rate of 0.9 % is reached when writing 
with about 1300MB/s into ceph.

First I thought about a problem with the Mellanox cards and used the 
Intel cards for ceph traffic. The problem still exists.

I tried quite a lot and nothing helped:
 * changed the MTU from 9000 to 1500
 * changed bond_xmit_hash_policy from layer3+4 to layer2+3
 * deactivated the bond and just used a single link
 * disabled offloading
 * disabled power management in BIOS
 * perf-bias 0

I analyzed the traffic via tcpdump and got some of those "errors":
 * TCP Previous segment not captured
 * TCP Out-of-Order
 * TCP Retransmission
 * TCP Fast Retransmission
 * TCP Dup ACK
 * TCP ACKed unseen segment
 * TCP Window Update

Is that behaviour normal for ceph, or does anyone have ideas how to 
solve the problem with the output drops at switch side?

With iperf I can reach full 50 GBit/s on the bond with zero output 
drops.

Andreas


[ceph-users] Rgw install manual install luminous

2017-09-12 Thread Marc Roos


I have been trying to set up the rados gateway (without ceph-deploy), 
but I am missing some commands to enable the service, I guess. How do I 
populate /var/lib/ceph/radosgw/ceph-gw1? I didn't see any command for 
this like there is for ceph-mon.

service ceph-radosgw@gw1 start
Gives:
2017-09-12 22:26:06.390523 7fb9d7f27e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:26:06.390537 7fb9d7f27e00  0 deferred set uid:gid to 
167:167 (ceph:ceph)
2017-09-12 22:26:06.390592 7fb9d7f27e00  0 ceph version 12.2.0 
(32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
(unknown), pid 28481
2017-09-12 22:26:06.412882 7fb9d7f27e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:26:06.415335 7fb9d7f27e00 -1 auth: error parsing file 
/var/lib/ceph/radosgw/ceph-gw1/keyring
2017-09-12 22:26:06.415342 7fb9d7f27e00 -1 auth: failed to load 
/var/lib/ceph/radosgw/ceph-gw1/keyring: (5) Input/output error
2017-09-12 22:26:06.415355 7fb9d7f27e00  0 librados: client.gw1 
initialization error (5) Input/output error
2017-09-12 22:26:06.415981 7fb9d7f27e00 -1 Couldn't init storage 
provider (RADOS)
2017-09-12 22:26:06.669892 7f1740d89e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:26:06.669919 7f1740d89e00  0 deferred set uid:gid to 
167:167 (ceph:ceph)
2017-09-12 22:26:06.669977 7f1740d89e00  0 ceph version 12.2.0 
(32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
(unknown), pid 28497
2017-09-12 22:26:06.693019 7f1740d89e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:26:06.695963 7f1740d89e00 -1 auth: error parsing file 
/var/lib/ceph/radosgw/ceph-gw1/keyring
2017-09-12 22:26:06.695971 7f1740d89e00 -1 auth: failed to load 
/var/lib/ceph/radosgw/ceph-gw1/keyring: (5) Input/output error
2017-09-12 22:26:06.695989 7f1740d89e00  0 librados: client.gw1 
initialization error (5) Input/output error
2017-09-12 22:26:06.696850 7f1740d89e00 -1 Couldn't init storage 
provider (RADOS

radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gw1 -f --log-to-stderr 
--debug-rgw=1 --debug-ms=1
Gives:
2017-09-12 22:20:55.845184 7f9004b54e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:20:55.845457 7f9004b54e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:20:55.845508 7f9004b54e00  0 ceph version 12.2.0 
(32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
(unknown), pid 28122
2017-09-12 22:20:55.867423 7f9004b54e00 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-09-12 22:20:55.869509 7f9004b54e00  1  Processor -- start
2017-09-12 22:20:55.869573 7f9004b54e00  1 -- - start start
2017-09-12 22:20:55.870324 7f9004b54e00  1 -- - --> 
192.168.10.111:6789/0 -- auth(proto 0 36 bytes epoch 0) v1 -- 
0x7f9006e6ec80 con 0
2017-09-12 22:20:55.870350 7f9004b54e00  1 -- - --> 
192.168.10.112:6789/0 -- auth(proto 0 36 bytes epoch 0) v1 -- 
0x7f9006e6ef00 con 0
2017-09-12 22:20:55.870824 7f8ff1fc4700  1 -- 
192.168.10.114:0/4093088986 learned_addr learned my addr 
192.168.10.114:0/4093088986
2017-09-12 22:20:55.871413 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.0 192.168.10.111:6789/0 1  
mon_map magic: 0 v1  361+0+0 (1785674138 0 0) 0x7f9006e8afc0 con 
0x7f90070d8800
2017-09-12 22:20:55.871567 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.0 192.168.10.111:6789/0 2  
auth_reply(proto 2 0 (0) Success) v1  33+0+0 (4108244008 0 0) 
0x7f9006e6ec80 con 0x7f90070d8800
2017-09-12 22:20:55.871662 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 --> 192.168.10.111:6789/0 -- auth(proto 2 2 
bytes epoch 0) v1 -- 0x7f9006e6f900 con 0
2017-09-12 22:20:55.871688 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.1 192.168.10.112:6789/0 1  
mon_map magic: 0 v1  361+0+0 (1785674138 0 0) 0x7f9006e8b200 con 
0x7f90070d7000
2017-09-12 22:20:55.871734 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.1 192.168.10.112:6789/0 2  
auth_reply(proto 2 0 (0) Success) v1  33+0+0 (3872865519 0 0) 
0x7f9006e6ef00 con 0x7f90070d7000
2017-09-12 22:20:55.871759 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 --> 192.168.10.112:6789/0 -- auth(proto 2 2 
bytes epoch 0) v1 -- 0x7f9006e6ec80 con 0
2017-09-12 22:20:55.872083 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.0 192.168.10.111:6789/0 3  
auth_reply(proto 2 -22 (22) Invalid argument) v1  24+0+0 (3879741687 
0 0) 0x7f9006e6f900 con 0x7f90070d8800
2017-09-12 22:20:55.872122 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 >> 192.168.10.111:6789/0 conn(0x7f90070d8800 
:-1 s=STATE_OPEN pgs=3828 cs=1 l=1).mark_down
2017-09-12 22:20:55.872166 7f8ff07c1700  1 -- 
192.168.10.114:0/4093088986 <== mon.1 192.168.10.112:6789/0 3  
auth_reply(proto 2 -22 (22) Invalid argument) v1  24+0

Re: [ceph-users] Rgw install manual install luminous

2017-09-13 Thread Marc Roos
 

Yes, this command cannot find the keyring:
service ceph-radosgw@gw1 start

But this one can:
radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gw1 -f

I think I did not populate the /var/lib/ceph/radosgw/ceph-gw1/ folder 
correctly. Maybe the init script checks for a 'done' file or so. I 
manually added the keyring there, but I don't know the exact syntax I 
should use; all variants seem to generate the same errors.

[radosgw.ceph-gw1]
 key = xxx==

My osds have
[osd.12]
 key = xxx==
But my monitors have this one
[mon.]
 key = xxx==
 caps mon = "allow *"
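
For comparison, a hedged sketch of creating the RGW keyring in one 
step. Judging from the "librados: client.gw1" line in the log above, 
the ceph-radosgw@gw1 unit looks up the key as client.gw1, so the 
section header would be [client.gw1] — that name is an assumption, and 
the caps shown are the typical ones for an RGW client:

```shell
# Create the key with typical RGW caps and write it where the unit
# expects it; get-or-create emits the correct [client.gw1] section
# syntax, avoiding hand-written keyring files:
ceph auth get-or-create client.gw1 \
    mon 'allow rw' osd 'allow rwx' \
    -o /var/lib/ceph/radosgw/ceph-gw1/keyring
chown ceph:ceph /var/lib/ceph/radosgw/ceph-gw1/keyring
```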



-Original Message-
From: Jean-Charles Lopez [mailto:jelo...@redhat.com] 
Sent: woensdag 13 september 2017 1:06
To: Marc Roos
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Rgw install manual install luminous

Hi,

see comment in line

Regards
JC

> On Sep 12, 2017, at 13:31, Marc Roos  wrote:
> 
> 
> 
> I have been trying to setup the rados gateway (without deploy), but I 
> am missing some commands to enable the service I guess? How do I 
> populate the /var/lib/ceph/radosgw/ceph-gw1. I didn’t see any command 

> like the ceph-mon.
> 
> service ceph-radosgw@gw1 start
> Gives:
> 2017-09-12 22:26:06.390523 7fb9d7f27e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:26:06.390537 7fb9d7f27e00  0 deferred set uid:gid to
> 167:167 (ceph:ceph)
> 2017-09-12 22:26:06.390592 7fb9d7f27e00  0 ceph version 12.2.0
> (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
> (unknown), pid 28481
> 2017-09-12 22:26:06.412882 7fb9d7f27e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:26:06.415335 7fb9d7f27e00 -1 auth: error parsing file 
> /var/lib/ceph/radosgw/ceph-gw1/keyring
> 2017-09-12 22:26:06.415342 7fb9d7f27e00 -1 auth: failed to load
> /var/lib/ceph/radosgw/ceph-gw1/keyring: (5) Input/output error
> 2017-09-12 22:26:06.415355 7fb9d7f27e00  0 librados: client.gw1 
> initialization error (5) Input/output error
> 2017-09-12 22:26:06.415981 7fb9d7f27e00 -1 Couldn't init storage 
> provider (RADOS)
> 2017-09-12 22:26:06.669892 7f1740d89e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:26:06.669919 7f1740d89e00  0 deferred set uid:gid to
> 167:167 (ceph:ceph)
> 2017-09-12 22:26:06.669977 7f1740d89e00  0 ceph version 12.2.0
> (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
> (unknown), pid 28497
> 2017-09-12 22:26:06.693019 7f1740d89e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:26:06.695963 7f1740d89e00 -1 auth: error parsing file 
> /var/lib/ceph/radosgw/ceph-gw1/keyring
> 2017-09-12 22:26:06.695971 7f1740d89e00 -1 auth: failed to load
> /var/lib/ceph/radosgw/ceph-gw1/keyring: (5) Input/output error
Looks like you don’t have the keyring for the RGW user. The error 
message tells you about the location and the filename to use.
> 2017-09-12 22:26:06.695989 7f1740d89e00  0 librados: client.gw1 
> initialization error (5) Input/output error
> 2017-09-12 22:26:06.696850 7f1740d89e00 -1 Couldn't init storage 
> provider (RADOS
> 
> radosgw -c /etc/ceph/ceph.conf -n client.radosgw.gw1 -f 
> --log-to-stderr
> --debug-rgw=1 --debug-ms=1
> Gives:
> 2017-09-12 22:20:55.845184 7f9004b54e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:20:55.845457 7f9004b54e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:20:55.845508 7f9004b54e00  0 ceph version 12.2.0
> (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
> (unknown), pid 28122
> 2017-09-12 22:20:55.867423 7f9004b54e00 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-09-12 22:20:55.869509 7f9004b54e00  1  Processor -- start
> 2017-09-12 22:20:55.869573 7f9004b54e00  1 -- - start start
> 2017-09-12 22:20:55.870324 7f9004b54e00  1 -- - --> 
> 192.168.10.111:6789/0 -- auth(proto 0 36 bytes epoch 0) v1 -- 
> 0x7f9006e6ec80 con 0
> 2017-09-12 22:20:55.870350 7f9004b54e00  1 -- - --> 
> 192.168.10.112:6789/0 -- auth(proto 0 36 bytes epoch 0) v1 -- 
> 0x7f9006e6ef00 con 0
> 2017-09-12 22:20:55.870824 7f8ff1fc4700  1 --
> 192.168.10.114:0/4093088986 learned_addr learned my addr
> 192.168.10.114:0/4093088986
> 2017-09-12 22:20:55.871413 7f8ff07c1700  1 --
> 192.168.10.114:0/4093088986 <== mon.0 192.168.10.111:6789/0 1  
> mon_map magic: 0 v1  361+0+0 (1785674138 0 0) 0x7f9006e8afc0 con 
> 0x7f90070d8800
> 2017-09-12 22:20:55.871567 7f8ff07c1700  1 --
> 192.168.10.114:0/4093088986 <== mon.0 192.168

[ceph-users] Collectd issues

2017-09-13 Thread Marc Roos


Am I the only one having these JSON issues with collectd? Did I do 
something wrong in the configuration/upgrade?

Sep 13 15:44:15 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Sep 13 15:44:15 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Sep 13 15:44:15 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.6,i=2,st=4): error 1
Sep 13 15:44:15 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Sep 13 15:44:15 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Sep 13 15:44:15 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.7,i=3,st=4): error 1
Sep 13 15:44:15 c01 collectd: ceph plugin: ds 
Bluestore.kvFlushLat.avgtime was not properly initialized.
Sep 13 15:44:15 c01 collectd: ceph plugin: JSON handler failed with 
status -1.
Sep 13 15:44:15 c01 collectd: ceph plugin: 
cconn_handle_event(name=osd.8,i=4,st=4): error 1








[ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread Marc Roos

Is there something like this SCSI rescan for rbd, to rescan the size of 
the rbd device and make it available while it is being used?

echo 1 >  /sys/class/scsi_device/2\:0\:0\:0/device/rescan






Re: [ceph-users] Rbd resize, refresh rescan

2017-09-18 Thread Marc Roos
  
Yes, I think you are right. After I saw this in dmesg, I noticed with 
fdisk that the block device was updated:
 rbd21: detected capacity change from 5368709120 to 6442450944

Maybe this also works (I found something that referred to a /sys/class 
path, which I don't have): echo 1 > /sys/devices/rbd/21/refresh

(I am trying to do an online size increase via KVM, virtio disk, in 
Windows Server 2016.)
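
Putting the pieces above together, a hedged sketch of an online grow 
with krbd (the image and device names are assumptions):

```shell
# Grow the image; the mapped krbd device picks up the change by itself:
rbd resize rbd/disk1 --size 6144     # size is in MB by default

# Confirm the kernel noticed — expect a line like
# "rbd21: detected capacity change from ... to ...":
dmesg | tail -n 2
blockdev --getsize64 /dev/rbd21

# Then grow the partition/filesystem on top as usual.
```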


-Original Message-
From: David Turner [mailto:drakonst...@gmail.com] 
Sent: maandag 18 september 2017 22:42
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Rbd resize, refresh rescan

I've never needed to do anything other than extend the partition and/or 
filesystem when I increased the size of an RBD.  Particularly if I 
didn't partition the RBD I only needed to extend the filesystem.

Which method are you mapping/mounting the RBD?  Is it through a 
Hypervisor or just mapped to a server?  What are you seeing to indicate 
that the RBD isn't already reflecting the larger size?  Which version of 
Ceph are you using?

On Mon, Sep 18, 2017 at 4:31 PM Marc Roos  
wrote:



Is there something like this for scsi, to rescan the size of the 
rbd
device and make it available? (while it is being used)

echo 1 >  /sys/class/scsi_device/2\:0\:0\:0/device/rescan






Re: [ceph-users] What HBA to choose? To expand or not to expand?

2017-09-20 Thread Marc Roos
 


We use these:
NVDATA Product ID  : SAS9207-8i
Serial Attached SCSI controller: LSI Logic / Symbios Logic SAS2308 
PCI-Express Fusion-MPT SAS-2 (rev 05)

Does someone by any chance know how to turn on the drive identification 
lights?
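
For SAS2308-based HBAs like these, the locate LED can usually be driven 
either with LSI's sas2ircu utility or with ledctl from the ledmon 
package. A hedged sketch — the controller index, enclosure:slot pair, 
and device name are assumptions for your chassis:

```shell
# LSI tool: turn the locate LED on for enclosure 2, slot 5 of
# controller 0, then off again:
sas2ircu 0 locate 2:5 ON
sas2ircu 0 locate 2:5 OFF

# Or via SES enclosure services with ledmon:
ledctl locate=/dev/sdf
ledctl locate_off=/dev/sdf
```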




-Original Message-
From: Jake Young [mailto:jak3...@gmail.com] 
Sent: dinsdag 19 september 2017 18:00
To: Kees Meijs; ceph-us...@ceph.com
Subject: Re: [ceph-users] What HBA to choose? To expand or not to 
expand?


On Tue, Sep 19, 2017 at 9:38 AM Kees Meijs  wrote:


Hi Jake,

On 19-09-17 15:14, Jake Young wrote:
> Ideally you actually want fewer disks per server and more 
servers.
> This has been covered extensively in this mailing list. Rule of 
thumb
> is that each server should have 10% or less of the capacity of 
your
> cluster.

That's very true, but let's focus on the HBA.

> I didn't do extensive research to decide on this HBA, it's simply 
what
> my server vendor offered. There are probably better, faster, 
cheaper
> HBAs out there. A lot of people complain about LSI HBAs, but I am
> comfortable with them.

Given a configuration our vendor offered it's about LSI/Avago 
9300-8i
with 8 drives connected individually using SFF8087 on a backplane 
(e.g.
not an expander). Or, 24 drives using three HBAs (6xSFF8087 in 
total)
when using a 4HE SuperMicro chassis with 24 drive bays.

But, what are the LSI complaints about? Or, are the complaints 
generic
to HBAs and/or cryptic CLI tools and not LSI specific?


Typically people rant about how much Megaraid/LSI support sucks. I've 
been using LSI or MegaRAID for years and haven't had any big problems. 

I had some performance issues with Areca onboard SAS chips (non-Ceph 
setup, 4 disks in a RAID10) and after about 6 months of troubleshooting 
with the server vendor and Areca support they did patch the firmware and 
resolve the issue. 




> There is a management tool called storcli that can fully 
configure the
> HBA in one or two command lines.  There's a command that 
configures
> all attached disks as individual RAID0 disk groups. That command 
gets
> run by salt when I provision a new osd server.

The thread I read was about Areca in JBOD but still able to utilise 
the
cache, if I'm not mistaken. I'm not sure anymore if there was 
something
mentioned about BBU.


JBOD with WB cache would be nice so you can get smart data directly from 
the disks instead of having interrogate the HBA for the data.  This 
becomes more important once your cluster is stable and in production.

IMHO if there is unwritten data in a RAM chip, like when you enable WB 
cache, you really, really need a BBU. This is another nice thing about 
using SSD journals instead of HBAs in WB mode, the journaled data is 
safe on the SSD before the write is acknowledged. 




>
> What many other people are doing is using the least expensive 
JBOD HBA
> or the on board SAS controller in JBOD mode and then using SSD
> journals. Save the money you would have spent on the fancy HBA 
for
> fast, high endurance SSDs.

Thanks! And obviously I'm very interested in other comments as 
well.

Regards,
Kees

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





Re: [ceph-users] monitor takes long time to join quorum: STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH got BADAUTHORIZER

2017-09-21 Thread Marc Roos
 

In my case it was syncing, and it was syncing slowly (an hour or so?). 
You should see this in the log file. I wanted to report this because my 
store.db is only 200MB, and I guess you want your monitors up and 
running quickly.

I also noticed that when the 3rd monitor left the quorum, the ceph -s 
command was slow / timing out. Probably trying to connect to the 3rd 
monitor, but why, when this monitor is not in quorum?






-Original Message-
From: Sean Purdy [mailto:s.pu...@cv-library.co.uk] 
Sent: donderdag 21 september 2017 12:02
To: Gregory Farnum
Cc: ceph-users
Subject: Re: [ceph-users] monitor takes long time to join quorum: 
STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH got BADAUTHORIZER

On Wed, 20 Sep 2017, Gregory Farnum said:
> That definitely sounds like a time sync issue. Are you *sure* they 
> matched each other?

NTP looked OK at the time.  But see below.


> Is it reproducible on restart?

Today I did a straight reboot - and it was fine, no issues.


The issue occurs after the machine has been off for a number of hours, 
or has been worked on in the BIOS for a number of hours and then booted, 
and then has perhaps sat waiting at the disk decrypt key prompt.

So I'd suspect hardware clock drift at those times.  (Using Dell R720xd 
machines)


Logs show a time change a few seconds after boot.  After boot it's 
running NTP and within that 45 minute period the NTP state looks the 
same as the other nodes in the (small) cluster.

How much drift is allowed between monitors?


Logs say:

Sep 20 09:45:21 store03 ntp[2329]: Starting NTP server: ntpd.
Sep 20 09:45:21 store03 ntpd[2462]: proto: precision = 0.075 usec (-24) 
...
Sep 20 09:46:44 store03 systemd[1]: Time has been changed
Sep 20 09:46:44 store03 ntpd[2462]: receive: Unexpected origin timestamp 
0xdd6ca972.c694801d does not match aorg 00. from 
server@172.16.0.16 xmt 0xdd6ca974.0c5c18f

So system time was changed about 6 seconds after disks were 
unlocked/boot proceeded.  But there was still 45 minutes of monitor 
messages after that.  Surely the time should have converged sooner than 
45 minutes?
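
For reference on the drift question: a monitor raises a clock-skew health warning when clocks differ by more than mon_clock_drift_allowed, which defaults to 0.05 s (50 ms). A self-contained sketch comparing one of the ntpq offsets above (milliseconds) against that threshold:

```shell
# ntpq reports offsets in milliseconds; mon_clock_drift_allowed defaults
# to 0.05 s, i.e. 50 ms. Compare the absolute offset against that limit.
offset_ms="-0.331"
awk -v o="$offset_ms" 'BEGIN {
    if (o < 0) o = -o                 # absolute drift
    if (o > 50) print "clock skew warning likely"
    else        print "within tolerance"
}'
# prints "within tolerance"
```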



NTP from today, post-problem.  But ntpq at the time of the problem 
looked just as OK:

store01:~$ ntpstat
synchronised to NTP server (172.16.0.19) at stratum 3
   time correct to within 47 ms

store02$ ntpstat
synchronised to NTP server (172.16.0.19) at stratum 3
   time correct to within 63 ms

store03:~$ sudo ntpstat
synchronised to NTP server (172.16.0.19) at stratum 3
   time correct to within 63 ms

store03:~$ ntpq -p
 remote           refid           st t when poll reach   delay   offset  jitter
===============================================================================
+172.16.0.16     85.91.1.164      3 u  561 1024  377    0.287    0.554   0.914
+172.16.0.18     94.125.129.7     3 u  411 1024  377    0.388   -0.331   0.139
*172.16.0.19     158.43.128.33    2 u  289 1024  377    0.282   -0.005   0.103


Sean

 
> On Wed, Sep 20, 2017 at 2:50 AM Sean Purdy  
wrote:
> 
> >
> > Hi,
> >
> >
> > Luminous 12.2.0
> >
> > Three node cluster, 18 OSD, debian stretch.
> >
> >
> > One node is down for maintenance for several hours.  When bringing 
> > it back up, OSDs rejoin after 5 minutes, but health is still 
> > warning.  monitor has not joined quorum after 40 minutes and logs 
> > show BADAUTHORIZER message every time the monitor tries to connect 
to the leader.
> >
> > 2017-09-20 09:46:05.581590 7f49e2b29700  0 -- 172.16.0.45:0/2243 >>
> > 172.16.0.43:6812/2422 conn(0x5600720fb800 :-1 
> > s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
> > l=0).handle_connect_reply connect got BADAUTHORIZER
> >
> > Then after ~45 minutes monitor *does* join quorum.
> >
> > I'm presuming this isn't normal behaviour?  Or if it is, let me know 

> > and I won't worry.
> >
> > All three nodes are using ntp and look OK timewise.
> >
> >
> > ceph-mon log:
> >
> > (.43 is leader, .45 is rebooted node, .44 is other live node in 
> > quorum)
> >
> > Boot:
> >
> > 2017-09-20 09:45:21.874152 7f49efeb8f80  0 ceph version 12.2.0
> > (32ce2a3ae5239ee33d6150705cdb24d43bab910c) luminous (rc), process 
> > (unknown), pid 2243
> >
> > 2017-09-20 09:46:01.824708 7f49e1b27700  0 -- 172.16.0.45:6789/0 >> 
> > 172.16.0.44:6789/0 conn(0x56007244d000 :6789 
> > s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 
> > l=0).handle_connect_msg accept connect_seq 3 vs existing csq=0 
> > existing_state=STATE_CONNECTING 2017-09-20 09:46:01.824723 
> > 7f49e1b27700  0 -- 172.16.0.45:6789/0 >> 172.16.0.44:6789/0 
> > conn(0x56007244d000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH 
> > pgs=0 cs=0 l=0).handle_connect_msg accept we reset (peer sent cseq 
> > 3, 0x5600722c.cseq = 0), sending RESETSESSION 2017-09-20 
> > 09:46:01.825247 7f49e1b27700  0 -- 172.16.0.45:6789/0 >> 
> > 172.16.0.44:6789/0 conn(0x56007244d000 :6789 
> > s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 
> > l=0).handle_connect_msg accept connect_seq 0 vs existing csq=0 
> > existing_state=STATE

Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Marc Roos
 
From the looks of it, too bad the efforts could not be 
combined/coordinated; that seems to be an issue with many open source 
initiatives.


-Original Message-
From: mj [mailto:li...@merit.unu.edu] 
Sent: zondag 24 september 2017 16:37
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

Hi,

I forwarded your announcement to the dovecot mailinglist. The following 
reply was posted there by Timo Sirainen. I'm forwarding it here, as you 
might not be reading the dovecot mailinglist.

Wido:
> First, the Github link:
> https://github.com/ceph-dovecot/dovecot-ceph-plugin
> 
> I am not going to repeat everything which is on Github, but here is a 
short summary:
> 
> - CephFS is used for storing Mailbox Indexes
> - E-Mails are stored directly as RADOS objects
> - It's a Dovecot plugin
> 
> We would like everybody to test librmb and report back issues on 
Github so that further development can be done.
> 
> It's not finalized yet, but all the help is welcome to make librmb the 
best solution for storing your e-mails on Ceph with Dovecot.

Timo:
It would be have been nicer if RADOS support was implemented as lib-fs 
driver, and the fs-API had been used all over the place elsewhere. So 1) 
LibRadosMailBox wouldn't have been relying so much on RADOS specifically 
and 2) fs-rados could have been used for other purposes. There are 
already fs-dict and dict-fs drivers, so the RADOS dict driver may not 
have been necessary to implement if fs-rados was implemented instead 
(although I didn't check it closely enough to verify). (We've had 
fs-rados on our TODO list for a while also.)

BTW. We've also been planning on open sourcing some of the obox pieces, 
mainly fs-drivers (e.g. fs-s3). The obox format maybe too, but without 
the "metacache" piece. The current obox code is a bit too much married 
into the metacache though to make open sourcing it easy. (The metacache 
is about storing the Dovecot index files in object storage and 
efficiently caching them on local filesystem, which isn't planned to be 
open sourced in near future. That's pretty much the only difficult piece 
of the obox plugin, with Cassandra integration coming as a good second. 
I wish there had been a better/easier geo-distributed key-value database 
to use - tombstones are annoyingly troublesome.)

And using rmb-mailbox format, my main worries would be:
  * doesn't store index files (= message flags) - not necessarily a 
problem, as long as you don't want geo-replication
  * index corruption means rebuilding them, which means rescanning list 
of mail files, which means rescanning the whole RADOS namespace, which 
practically means  rescanning the RADOS pool. That most likely is a very 
very slow operation, which you want to avoid unless it's absolutely 
necessary. Need to be very careful to avoid that happening, and in 
general to avoid losing mails in case of crashes or other bugs.
  * I think copying/moving mails physically copies the full data on disk
  * Each IMAP/POP3/LMTP/etc process connects to RADOS separately from 
each others - some connection pooling would likely help here

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

2017-09-25 Thread Marc Roos
 

But from the looks of this dovecot mailinglist post, you didn’t start 
your project by talking to the dovecot guys, or have ongoing 
communication with them during the development. I would think their 
experience could be a valuable asset. I am not talking about just 
handing over some files at the end.

Ps. Is there an index of these slides? I keep having trouble browsing 
back to a specific one.


-Original Message-
From: Danny Al-Gaaf [mailto:danny.al-g...@bisect.de] 
Sent: maandag 25 september 2017 9:37
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] librmb: Mail storage on RADOS with Dovecot

Am 25.09.2017 um 09:00 schrieb Marc Roos:
>  
>>From the looks of it, to bad the efforts could not be
> combined/coordinated, that seems to be an issue with many open source 
> initiatives.

That's not right. The plan is to contribute the librmb code to the Ceph 
project and the Dovecot part back to the Dovecot project (as described 
in the slides) as soon as we know that it will work with real live load.

We simply needed a place to start with it, then we split the code into 
parts to move it to the corresponding projects.

Danny


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Cephfs : security questions?

2017-09-29 Thread Marc Roos
 

Maybe this will get you started with the permissions for only this fs 
path /smb

sudo ceph auth get-or-create client.cephfs.smb mon 'allow r' mds 'allow 
r, allow rw path=/smb' osd 'allow rwx pool=fs_meta,allow rwx 
pool=fs_data'
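
A sketch of mounting that restricted path with the kernel client (the monitor address and secretfile path are examples):

```shell
# Mount only the /smb subtree as client.cephfs.smb; the secret is the
# key created by the "ceph auth get-or-create" above, stored in a file.
mount -t ceph 192.168.1.1:6789:/smb /mnt/smb \
  -o name=cephfs.smb,secretfile=/etc/ceph/cephfs.smb.secret
```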




-Original Message-
From: Yoann Moulin [mailto:yoann.mou...@epfl.ch] 
Sent: vrijdag 29 september 2017 9:36
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Cephfs : security questions?

Hello,

We are working on a POC with containers (kubernetes) and cephfs (for 
permanent storage).

The main idea is to give a user access to a subdirectory of the 
cephfs while being sure he won't be able to access the rest of the 
storage. The way k8s works, the user will have access to the yml file 
where the cephfs mount point is defined. He will be able to change the 
subdirectory mounted inside the container (and set it to /). And inside 
the container, the user is root…

So even if the user doesn't have access to the secret, he will be able 
to mount the whole cephfs volume with read access.

Is there a possibility to have "root_squash" option on cephfs volume for 
a specific client.user + secret?

Is it possible to allow a specific user to mount only /bla and disallow 
to mount the cephfs root "/"?

Or is there another way to do that?

Thanks,

--
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




Re: [ceph-users] Cephfs : security questions?

2017-09-29 Thread Marc Roos
 
I think that is because of the older kernel client, as mentioned here:
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg39734.html





-Original Message-
From: Yoann Moulin [mailto:yoann.mou...@epfl.ch] 
Sent: vrijdag 29 september 2017 10:00
To: ceph-users
Subject: Re: [ceph-users] Cephfs : security questions?


>> We are working on a POC with containers (kubernetes) and cephfs (for 
>> permanent storage).
>> 
>> The main idea is to give to a user access to a subdirectory of the 
>> cephfs but be sure he won't be able to access to the rest of the 
>> storage. As k8s works, the user will have access to the yml file 
>> where the cephfs mount point is defined. He will be able to change 
>> the subdirectory mounted inside the container (and set it to /). And 
>> inside the container, the user is root…
>> 
>> So if even the user doesn't have access to the secret, he will be 
>> able to mount the whole cephfs volume with read access.
>> 
>> Is there a possibility to have "root_squash" option on cephfs volume 
>> for a specific client.user + secret?
>> 
>> Is it possible to allow a specific user to mount only /bla and 
>> disallow to mount the cephfs root "/"?
>> 
>> Or is there another way to do that?
>
> Maybe this will get you started with the permissions for only this fs 
> path /smb
>
> sudo ceph auth get-or-create client.cephfs.smb mon 'allow r' mds 
> 'allow r, allow rw path=/smb' osd 'allow rwx pool=fs_meta,allow rwx 
> pool=fs_data'

What I currently do is :

mkdir /cephfs/foo
chown nobody:foogrp /cephfs/foo
chmod 770 /cephfs/foo
ceph auth get-or-create client.foo mon "allow r" osd "allow rw 
pool=cephfs_data" mds "allow r, allow rw path=/foo"
ceph fs authorize cephfs client.foo / r /foo rw

so I have this for client.foo

[client.foo]
key = [secret]
caps mds = "allow r, allow rw path=/foo"
caps mon = "allow r"
caps osd = "allow rw pool=cephfs_data"

With this, the user foo is able to mount the root of the cephfs and read 
everything. Of course, he cannot modify anything, but my problem here is 
that he still has read access to everything with uid=0.
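
A sketch of what may fix this (untested; exact behaviour also depends on the kernel client version): grant only the path-restricted caps and drop the "/ r" pair entirely, so the client holds no cap on the filesystem root at all:

```shell
# Remove the old caps first, then authorize client.foo for /foo only.
# Note: no "/ r" argument pair, so nothing is granted on the root.
ceph auth del client.foo
ceph fs authorize cephfs client.foo /foo rw
# Inspect the resulting mon/mds/osd caps
ceph auth get client.foo
```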

--
Yoann Moulin
EPFL IC-IT
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





[ceph-users] 1 osd Segmentation fault in test cluster

2017-09-30 Thread Marc Roos
Is this useful for someone?



[Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket 
closed (con state OPEN)
[Sat Sep 30 15:51:11 2017] libceph: osd5 192.168.10.113:6809 socket 
closed (con state CONNECTING)
[Sat Sep 30 15:51:11 2017] libceph: osd5 down
[Sat Sep 30 15:51:11 2017] libceph: osd5 down
[Sat Sep 30 15:52:52 2017] libceph: osd5 up
[Sat Sep 30 15:52:52 2017] libceph: osd5 up



2017-09-30 15:48:08.542202 7f7623ce9700  0 log_channel(cluster) log 
[WRN] : slow request 31.456482 seconds old, received at 2017-09-30 
15:47:37.085589: osd_op(mds.0.9227:1289186 20.2b 20.9af42b6b (undecoded) 
ondisk+write+known_if_redirected+full_force e15675) currently 
queued_for_pg
2017-09-30 15:48:08.542207 7f7623ce9700  0 log_channel(cluster) log 
[WRN] : slow request 31.456086 seconds old, received at 2017-09-30 
15:47:37.085984: osd_op(mds.0.9227:1289190 20.13 20.e44f3f53 (undecoded) 
ondisk+write+known_if_redirected+full_force e15675) currently 
queued_for_pg
2017-09-30 15:48:08.542212 7f7623ce9700  0 log_channel(cluster) log 
[WRN] : slow request 31.456005 seconds old, received at 2017-09-30 
15:47:37.086065: osd_op(mds.0.9227:1289194 20.2b 20.6733bdeb (undecoded) 
ondisk+write+known_if_redirected+full_force e15675) currently 
queued_for_pg
2017-09-30 15:51:12.592490 7f7611cc5700  0 log_channel(cluster) log 
[DBG] : 20.3f scrub starts
2017-09-30 15:51:24.514602 7f76214e4700 -1 *** Caught signal 
(Segmentation fault) **
 in thread 7f76214e4700 thread_name:bstore_mempool

 ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous 
(stable)
 1: (()+0xa29511) [0x7f762e5b2511]
 2: (()+0xf370) [0x7f762afa5370]
 3: (BlueStore::TwoQCache::_trim(unsigned long, unsigned long)+0x2df) 
[0x7f762e481a2f]
 4: (BlueStore::Cache::trim(unsigned long, float, float, float)+0x1d1) 
[0x7f762e4543e1]
 5: (BlueStore::MempoolThread::entry()+0x14d) [0x7f762e45a71d]
 6: (()+0x7dc5) [0x7f762af9ddc5]
 7: (clone()+0x6d) [0x7f762a09176d]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.

--- begin dump of recent events ---
-1> 2017-09-30 15:51:05.105915 7f76284ac700  5 -- 
192.168.10.113:0/27661 >> 192.168.10.111:6810/6617 conn(0x7f766b736000 
:-1 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=19 cs=1 l=1). rx 
osd.0 seq 19546 0x7f76a2daf000 osd_ping(ping_reply e15675 stamp 
2017-09-30 15:51:05.105439) v4
 -> 2017-09-30 15:51:05.105963 7f760fcc1700  1 -- 10.0.0.13:0/27661 
--> 10.0.0.11:6805/6491 -- osd_ping(ping e15675 stamp 2017-09-30 
15:51:05.105439) v4 -- 0x7f7683e98a00 con 0
 -9998> 2017-09-30 15:51:05.105960 7f76284ac700  1 -- 
192.168.10.113:0/27661 <== osd.0 192.168.10.111:6810/6617 19546  
osd_ping(ping_reply e15675 stamp 2017-09-30 15:51:05.105439) v4  
2004+0+0 (1212154800 0 0) 0x7f76a2daf000 con 0x7f766b736000
 -9997> 2017-09-30 15:51:05.105961 7f76274aa700  5 -- 10.0.0.13:0/27661 
>> 10.0.0.11:6808/6646 conn(0x7f766b745800 :-1 
s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=24 cs=1 l=1). rx osd.3 
seq 19546 0x7f769b95f200 osd_ping(ping_reply e15675 stamp 2017-09-30 
15:51:05.105439) v4
 -9996> 2017-09-30 15:51:05.105983 7f760fcc1700  1 -- 
192.168.10.113:0/27661 --> 192.168.10.111:6805/6491 -- osd_ping(ping 
e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f7683e97600 con 0
 -9995> 2017-09-30 15:51:05.106001 7f76274aa700  1 -- 10.0.0.13:0/27661 
<== osd.3 10.0.0.11:6808/6646 19546  osd_ping(ping_reply e15675 
stamp 2017-09-30 15:51:05.105439) v4  2004+0+0 (1212154800 0 0) 
0x7f769b95f200 con 0x7f766b745800
 -9994> 2017-09-30 15:51:05.106015 7f760fcc1700  1 -- 10.0.0.13:0/27661 
--> 10.0.0.11:6807/6470 -- osd_ping(ping e15675 stamp 2017-09-30 
15:51:05.105439) v4 -- 0x7f7683e99800 con 0
 -9993> 2017-09-30 15:51:05.106035 7f760fcc1700  1 -- 
192.168.10.113:0/27661 --> 192.168.10.111:6808/6470 -- osd_ping(ping 
e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f763b72a200 con 0
 -9992> 2017-09-30 15:51:05.106072 7f760fcc1700  1 -- 10.0.0.13:0/27661 
--> 10.0.0.11:6809/6710 -- osd_ping(ping e15675 stamp 2017-09-30 
15:51:05.105439) v4 -- 0x7f768633dc00 con 0
 -9991> 2017-09-30 15:51:05.106093 7f760fcc1700  1 -- 
192.168.10.113:0/27661 --> 192.168.10.111:6804/6710 -- osd_ping(ping 
e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f76667d3600 con 0
 -9990> 2017-09-30 15:51:05.106114 7f760fcc1700  1 -- 10.0.0.13:0/27661 
--> 10.0.0.12:6805/1949 -- osd_ping(ping e15675 stamp 2017-09-30 
15:51:05.105439) v4 -- 0x7f768fcd6200 con 0
 -9989> 2017-09-30 15:51:05.106134 7f760fcc1700  1 -- 
192.168.10.113:0/27661 --> 192.168.10.112:6805/1949 -- osd_ping(ping 
e15675 stamp 2017-09-30 15:51:05.105439) v4 -- 0x7f765f27a800 con 0


...


  -29> 2017-09-30 15:51:24.469620 7f7611cc5700  1 -- 
10.0.0.13:6808/27661 --> 10.0.0.12:6800/1947 -- replica scrub(pg: 
20.3f,from:0'0,to:0'0,epoch:15675/15644,start:20:fc9e27a6:::18d7d4e.
00cd:0,end:20:fc9e74a2:::16a91fc.:0,chunky:1,deep:0,seed
:4294967295,version:7) v7 -- 0x7f76514c1600 con 0
   -28>

[ceph-users] nfs-ganesha / cephfs issues

2017-09-30 Thread Marc Roos


I have nfs-ganesha 2.5.2 (from the ceph download site) running on an osd 
node on luminous 12.2.1. And when I rsync on a vm that has the nfs 
mounted, I get stalls. 

I thought it was related to the number of files when rsyncing the 
centos7 distro. But when I tried to rsync just one file it also stalled. 
It looks like it could not create the update of the 'CentOS_BuildTag' file.

Could this be a problem in the metadata pool of cephfs? Does this sound 
familiar? Is there something like an fsck for cephfs?

drwxr-xr-x 1 500 500 7 Jan 24  2016 ..
-rw-r--r-- 1 500 500    14 Dec  5  2016 CentOS_BuildTag
-rw-r--r-- 1 500 500    29 Dec  5  2016 .discinfo
-rw-r--r-- 1 500 500   946 Jan 12  2017 .treeinfo
drwxr-xr-x 1 500 500 1 Sep  5 15:36 LiveOS
drwxr-xr-x 1 500 500 1 Sep  5 15:36 EFI
drwxr-xr-x 1 500 500 3 Sep  5 15:36 images
drwxrwxr-x 1 500 500    10 Sep  5 23:57 repodata
drwxrwxr-x 1 500 500  9591 Sep 19 20:33 Packages
drwxr-xr-x 1 500 500 9 Sep 19 20:33 isolinux
-rw--- 1 500 500 0 Sep 30 23:49 .CentOS_BuildTag.PKZC1W
-rw--- 1 500 500 0 Sep 30 23:52 .CentOS_BuildTag.gM1C1W
drwxr-xr-x 1 500 50015 Sep 30 23:52 .




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Rbd resize, refresh rescan

2017-10-17 Thread Marc Roos

Rbd resize is picked up automatically on the mapped host.

However for the changes to appear in libvirt/qemu, I have to
virsh qemu-monitor-command vps-test2 --hmp "info block"
virsh qemu-monitor-command vps-test2 --hmp "block_resize 
drive-scsi0-0-0-0 12G" 
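
Put together, the full online-grow flow looks roughly like this (a sketch; the image, domain and drive names are examples based on this thread):

```shell
# 1. Grow the RBD image (picked up automatically on the mapped host)
rbd resize --size 12G rbd/vps-test2-disk
# 2. Tell qemu about the new size so the guest sees it
virsh qemu-monitor-command vps-test2 --hmp "info block"
virsh qemu-monitor-command vps-test2 --hmp "block_resize drive-scsi0-0-0-0 12G"
# 3. Inside the guest, extend the partition/filesystem
```

Newer libvirt also offers `virsh blockresize <domain> <disk> 12G` as a one-step alternative to the raw monitor command.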



-Original Message-
From: Marc Roos 
Sent: maandag 18 september 2017 23:02
To: David Turner
Subject: RE: [ceph-users] Rbd resize, refresh rescan


Yes I can remember, I guess I have to do something in kvm/virt-manager, 
so the change is relayed to the guest.

-Original Message-
From: David Turner [mailto:drakonst...@gmail.com]
Sent: maandag 18 september 2017 23:00
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Rbd resize, refresh rescan

Disk Management in Windows should very easily extend a partition to use 
the rest of the disk.  You should just right click the partition and 
select "Extend Volume" and that's it.  I did it in Windows 10 over the 
weekend for a laptop that had been set up weird.  

On Mon, Sep 18, 2017 at 4:49 PM Marc Roos  
wrote:



Yes, I think you are right, after I saw this in dmesg, I noticed 
with
fdisk the block device was updated
 rbd21: detected capacity change from 5368709120 to 6442450944

Maybe this also works (found something that referred to a /sys/class 
path, which I don’t have): echo 1 > /sys/devices/rbd/21/refresh

(I am trying to online increase the size via kvm, virtio disk in 
win
2016)


-Original Message-
From: David Turner [mailto:drakonst...@gmail.com]
Sent: maandag 18 september 2017 22:42
To: Marc Roos; ceph-users
Subject: Re: [ceph-users] Rbd resize, refresh rescan

I've never needed to do anything other than extend the partition 
and/or
filesystem when I increased the size of an RBD.  Particularly if I
didn't partition the RBD I only needed to extend the filesystem.

Which method are you mapping/mounting the RBD?  Is it through a
Hypervisor or just mapped to a server?  What are you seeing to 
indicate
that the RBD isn't already reflecting the larger size?  Which 
version of
Ceph are you using?
    
On Mon, Sep 18, 2017 at 4:31 PM Marc Roos 

wrote:



Is there something like this for scsi, to rescan the size 
of the
rbd
device and make it available? (while it is being used)

echo 1 >  /sys/class/scsi_device/2\:0\:0\:0/device/rescan




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com







___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] OSD are marked as down after jewel -> luminous upgrade

2017-10-17 Thread Marc Roos
Did you check this?

https://www.mail-archive.com/ceph-users@lists.ceph.com/msg39886.html 








-Original Message-
From: Daniel Carrasco [mailto:d.carra...@i2tic.com] 
Sent: dinsdag 17 oktober 2017 17:49
To: ceph-us...@ceph.com
Subject: [ceph-users] OSD are marked as down after jewel -> luminous 
upgrade

Hello,

Today I've decided to upgrade my Ceph cluster to latest LTS version. To 
do it I've used the steps posted on release notes:
http://ceph.com/releases/v12-2-0-luminous-released/

After upgrade all the daemons I've noticed that all OSD daemons are 
marked as down even when all are working, so the cluster becomes down.

Maybe the problem is the command "ceph osd require-osd-release 
luminous", but all OSD are on Luminous version.


-


-

# ceph versions
{
"mon": {
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 3
},
"mgr": {
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 3
},
"osd": {
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 2
},
"mds": {
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 2
},
"overall": {
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 10
}
}


-


-

# ceph osd versions
{
"ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) 
luminous (stable)": 2 }

# ceph osd tree

ID CLASS WEIGHT  TYPE NAME  STATUS REWEIGHT PRI-AFF 
-1   0.08780 root default   
-2   0.04390 host alantra_fs-01 
 0   ssd 0.04390 osd.0  up  1.0 1.0 
-3   0.04390 host alantra_fs-02 
 1   ssd 0.04390 osd.1  up  1.0 1.0 
-4 0  host alantra_fs-03


-


-

# ceph -s
  cluster:
id: 5f8e66b5-1adc-4930-b5d8-c0f44dc2037e
health: HEALTH_WARN
nodown flag(s) set
 
  services:
mon: 3 daemons, quorum alantra_fs-02,alantra_fs-01,alantra_fs-03
mgr: alantra_fs-03(active), standbys: alantra_fs-01, alantra_fs-02
mds: cephfs-1/1/1 up  {0=alantra_fs-01=up:active}, 1 up:standby
osd: 2 osds: 2 up, 2 in
 flags nodown
 
  data:
pools:   3 pools, 192 pgs
objects: 40177 objects, 3510 MB
usage:   7486 MB used, 84626 MB / 92112 MB avail
pgs: 192 active+clean
 
  io:
client:   564 kB/s rd, 767 B/s wr, 33 op/s rd, 0 op/s wr


-


-
Log:
2017-10-17 16:15:25.466807 mon.alantra_fs-02 [INF] osd.0 marked down 
after no beacon for 29.864632 seconds
2017-10-17 16:15:25.467557 mon.alantra_fs-02 [WRN] Health check failed: 
1 osds down (OSD_DOWN)
2017-10-17 16:15:25.467587 mon.alantra_fs-02 [WRN] Health check failed: 
1 host (1 osds) down (OSD_HOST_DOWN)
2017-10-17 16:15:27.494526 mon.alantra_fs-02 [WRN] Health check failed: 
Degraded data redundancy: 63 pgs unclean (PG_DEGRADED)
2017-10-17 16:15:27.501956 mon.alantra_fs-02 [INF] Health check cleared: 
OSD_DOWN (was: 1 osds down)
2017-10-17 16:15:27.501997 mon.alantra_fs-02 [INF] Health check cleared: 
OSD_HOST_DOWN (was: 1 host (1 osds) down)
2017-10-17 16:15:27.502012 mon.alantra_fs-02 [INF] Cluster is now 
healthy
2017-10-17 16:15:27.518798 mon.alantra_fs-02 [INF] osd.0 
10.20.1.109:6801/3319 boot
2017-10-17 16:15:26.414023 osd.0 [WRN] Monitor daemon marked osd.0 down, 
but it is still running
2017-10-17 16:15:30.470477 mon.alantra_fs-02 [INF] osd.1 marked down 
after no beacon for 25.007336 seconds
2017-10-17 16:15:30.471014 mon.alantra_fs-02 [WRN] Health check failed: 
1 osds down (OSD_DOWN)
2017-10-17 16:15:30.471047 mon.alantra_fs-02 [WRN] Health check failed: 
1 host (1 osds) down (OSD_HOST_DOWN)
2017-10-17 16:15:30.532427 mon.alantra_fs-02 [WRN] overall HEALTH_WARN 1 
osds down; 1 host (1 osds) down; Degraded data redundancy: 63 pgs 
unclean
2017-10-17 16:15:31.590661 mon.alantra_fs-02 [INF] Health check cleared: 
PG_DEGRADED (was: Degraded data redundancy: 63 pgs unclean)
2017-10-17 16:15:34.703027 mon.a

Re: [ceph-users] Luminous can't seem to provision more than 32 OSDs per server

2017-10-19 Thread Marc Roos
 
What about not using deploy?




-Original Message-
From: Sean Sullivan [mailto:lookcr...@gmail.com] 
Sent: donderdag 19 oktober 2017 2:28
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Luminous can't seem to provision more than 32 OSDs 
per server

I am trying to install Ceph luminous (ceph version 12.2.1) on 4 ubuntu 
16.04 servers each with 74 disks, 60 of which are HGST 7200rpm sas 
drives::


HGST HUS724040AL sdbv  sas
root@kg15-2:~# lsblk --output MODEL,KNAME,TRAN | grep HGST | wc -l
60

I am trying to deploy them all with ::
a line like the following::
ceph-deploy osd zap kg15-2:(sas_disk)
ceph-deploy osd create --dmcrypt --bluestore --block-db (ssd_partition) 
kg15-2:(sas_disk)

This didn't seem to work at all so I am now trying to troubleshoot by 
just provisioning the sas disks::
ceph-deploy osd create --dmcrypt --bluestore kg15-2:(sas_disk)

Across all 4 hosts I can only seem to get 32 OSDs up and after that the 
rest fail::
root@kg15-1:~# ps faux | grep '[c]eph-osd' | wc -l
32
root@kg15-2:~# ps faux | grep '[c]eph-osd' | wc -l
32

root@kg15-3:~# ps faux | grep '[c]eph-osd' | wc -l
32

The ceph-deploy tool doesn't seem to log or notice any failure but the 
host itself shows the following in the osd log:


2017-10-17 23:05:43.121016 7f8ca75c9e00  0 set uid:gid to 64045:64045 
(ceph:ceph)
2017-10-17 23:05:43.121040 7f8ca75c9e00  0 ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable), process 
(unknown), pid 69926
2017-10-17 23:05:43.123939 7f8ca75c9e00  1 
bluestore(/var/lib/ceph/tmp/mnt.8oIc5b) mkfs path 
/var/lib/ceph/tmp/mnt.8oIc5b
2017-10-17 23:05:43.124037 7f8ca75c9e00  1 bdev create path 
/var/lib/ceph/tmp/mnt.8oIc5b/block type kernel
2017-10-17 23:05:43.124045 7f8ca75c9e00  1 bdev(0x564b7a05e900 
/var/lib/ceph/tmp/mnt.8oIc5b/block) open path 
/var/lib/ceph/tmp/mnt.8oIc5b/block
2017-10-17 23:05:43.124231 7f8ca75c9e00  1 bdev(0x564b7a05e900 
/var/lib/ceph/tmp/mnt.8oIc5b/block) open size 4000668520448 
(0x3a37a6d1000, 3725 GB) block_size 4096 (4096 B) rotational
2017-10-17 23:05:43.124296 7f8ca75c9e00  1 
bluestore(/var/lib/ceph/tmp/mnt.8oIc5b) _set_cache_sizes max 0.5 < ratio 
0.99
2017-10-17 23:05:43.124313 7f8ca75c9e00  1 
bluestore(/var/lib/ceph/tmp/mnt.8oIc5b) _set_cache_sizes cache_size 
1073741824 meta 0.5 kv 0.5 data 0
2017-10-17 23:05:43.124349 7f8ca75c9e00 -1 
bluestore(/var/lib/ceph/tmp/mnt.8oIc5b) _open_db 
/var/lib/ceph/tmp/mnt.8oIc5b/block.db link target doesn't exist
2017-10-17 23:05:43.124368 7f8ca75c9e00  1 bdev(0x564b7a05e900 
/var/lib/ceph/tmp/mnt.8oIc5b/block) close
2017-10-17 23:05:43.402165 7f8ca75c9e00 -1 
bluestore(/var/lib/ceph/tmp/mnt.8oIc5b) mkfs failed, (2) No such file or 
directory
2017-10-17 23:05:43.402185 7f8ca75c9e00 -1 OSD::mkfs: ObjectStore::mkfs 
failed with error (2) No such file or directory
2017-10-17 23:05:43.402258 7f8ca75c9e00 -1  ** ERROR: error creating 
empty object store in /var/lib/ceph/tmp/mnt.8oIc5b: (2) No such file or 
directory


I am not sure where to start troubleshooting, so I have a few questions.

1.) Anyone have any idea on why 32?
2.) Is there a good guide / outline on how to get the benefit of storing 
the keys in the monitor while still having ceph more or less manage the 
drives but provisioning the drives without ceph-deploy? I looked at the 
manual deployment long and short form and it doesn't mention dmcrypt or 
bluestore at all. I know I can use crypttab and cryptsetup to do this 
and then give ceph-disk the path to the mapped device but I would prefer 
to keep as much management in ceph as possible if I could.  (mailing 
list thread :: 
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg38575.html 
  )

3.) Ideally I would like to provision the drives with the DB on the SSD. 
(Or would it be better to make a cache tier? I read on a reddit thread 
that the tiering in ceph isn't being developed anymore; is it still 
worth it?)

Sorry for the bother and thanks for all the help!!!




Re: [ceph-users] iSCSI gateway for ceph

2017-10-25 Thread Marc Roos
 

Hi Giang,

Can I ask if you used the elrepo kernels? Because I tried these, but 
they are not booting, because of (I think) the mpt2sas/mpt3sas drivers.

Regards,
Marc



-Original Message-
From: GiangCoi Mr [mailto:ltrgian...@gmail.com] 
Sent: woensdag 25 oktober 2017 16:11
To: ceph-us...@ceph.com
Subject: [ceph-users] iSCSI gateway for ceph

Hi all.


I am researching Ceph for storage. I am using 3 VMs: ceph01, ceph02, 
ceph03. All VMs are running CentOS 7.4 with a 4.x kernel (I upgraded). 
Now I want to configure high availability iSCSI with ceph-iscsi-cli.

I installed ceph-iscsi-cli on ceph01. But when I create the iSCSI 
gateway with the following commands, I get an error:
>/iscsi-target create iqn.2003-01.com.redhat.iscsi-gw:blockgw
> goto gateways
> create ceph01 192.168.101.151


It shows: OS is unsupported.


How can I fix this issue? Please help me. Thanks so much


Regards.

Giang Le





Re: [ceph-users] iSCSI gateway for ceph

2017-10-25 Thread Marc Roos
 
I could not get it to boot on CentOS 7 by just installing it. I think 
it is because it boots from mpt2sas, and that driver is replaced with 
mpt3sas in >4.x kernels. I even recreated the boot initrds, but could 
not get it running quickly. 



-Original Message-
From: GiangCoi Mr [mailto:ltrgian...@gmail.com] 
Sent: woensdag 25 oktober 2017 17:08
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] iSCSI gateway for ceph

Yes, I used elrepo to upgrade the kernel; I can boot and it shows 
kernel 4.x. What is the problem?

Sent from my iPhone

> On Oct 25, 2017, at 10:02 PM, Marc Roos  
wrote:
> 
> 
> 
> Hi Giang,
> 
> Can I ask you if you used the elrepo kernels? Because I tried these, 
> but they are not booting because of I think the mpt2sas mpt3sas 
drivers.
> 
> Regards,
> Marc
> 
> 
> 
> -Original Message-
> From: GiangCoi Mr [mailto:ltrgian...@gmail.com]
> Sent: woensdag 25 oktober 2017 16:11
> To: ceph-us...@ceph.com
> Subject: [ceph-users] iSCSI gateway for ceph
> 
> Hi all.
> 
> 
> I am researching with Ceph for Storage. I am using 3 VM: ceph01, 
> ceph02, ceph03. All VM is using CentOS 7.4 with kernel from 4.x (I 
upgraded).
> Now I want to configure high availability iSCSI with ceph-iscsi-cli.
> 
> I installed ceph-iscsi-cli on ceph01. But when I create isci gateway 
> by following command, it have error
>> /iscsi-target create iqn.2003-01.com.redhat.iscsi-gw:blockgw
>> goto gateways
>> create ceph01 192.168.101.151
> 
> 
> It show: OS is unsupported.
> 
> 
> How I can fix this issue? Please help me. Thanks so much
> 
> 
> Regards.
> 
> Giang Le
> 
> 
> 




Re: [ceph-users] Kernel version recommendation

2017-10-28 Thread Marc Roos
 
I hope I can post here a general question/comment regarding 
distributions, because I see a lot of stability issues passing by here. 

Why are people choosing an Ubuntu distribution to run in production? 
Mostly I get an answer like they are accustomed to using it. But is the 
OS not just a tool? And do you not have to choose the correct tool for 
the job (and then learn to use it)? 
When I chose CentOS it was because it is closely related to Red Hat, 
and for critical situations I could choose or switch to the RHEL 
license. There are over 10k people working at Red Hat to produce a 
stable OS. I am very pleased with the level of knowledge here and with 
what Red Hat is doing in general.

I just have to finish with; You people working on Ceph are doing a great 
job and are working on a great project!







-Original Message-
From: Bogdan SOLGA [mailto:bogdan.so...@gmail.com] 
Sent: vrijdag 27 oktober 2017 18:33
To: ceph-users
Cc: Stephen Oliver; Ákos Nagy
Subject: [ceph-users] Kernel version recommendation

Hello, everyone!


We have recently upgraded our Ceph pool to the latest Luminous release. 
On one of the servers that we used as Ceph clients we had several freeze 
issues, which we empirically linked to the concurrent usage of some I/O 
operations - writing in an LXD container (backed by Ceph) while there 
was an ongoing PG rebalancing. We searched for the issue's cause through 
the logs, but we haven't found anything useful.


At that time the server was running Ubuntu 16 with a 4.5 kernel. We 
thought an upgrade to the latest HWE kernel (4.10) would help, but we 
had the same freezing issues after the kernel upgrade. Of course, we're 
aware that we have tried to fix / avoid the issue without understanding 
its cause.

After seeing the OS recommendations from the Ceph page, we reinstalled 
the server (and got the 4.4 kernel), but we ran into a feature set 
mismatch issue when mounting an RBD image. We concluded that the 
feature set requires a kernel > 4.5.


Our question: how would you recommend we proceed? Shall we re-upgrade 
to the HWE kernel (4.10) or to another kernel version? Would you 
recommend an alternative solution?


Thank you very much; we're looking forward to your advice.


Kind regards,

Bogdan





[ceph-users] Adding an extra description with creating a snapshot

2017-10-29 Thread Marc Roos
 
Is it possible to add a longer description to a created snapshot (other 
than using the name)?
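
As a workaround I could probably abuse image-level metadata, keying on 
the snapshot name (assuming rbd image-meta behaves as documented; there 
does not seem to be a per-snapshot description field):

```shell
rbd snap create rbd/vps-test2@before-upgrade
rbd image-meta set rbd/vps-test2 desc.before-upgrade "state before the 12.2.1 upgrade"
rbd image-meta get rbd/vps-test2 desc.before-upgrade
```

The metadata lives on the image, not the snapshot, so it has to be 
cleaned up by hand when the snapshot is removed.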


Re: [ceph-users] No ops on some OSD

2017-11-04 Thread Marc Roos
 

What is the new syntax for "ceph osd status" for luminous?




-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com] 
Sent: donderdag 2 november 2017 6:19
To: ceph-users@lists.ceph.com
Subject: [ceph-users] No ops on some OSD

Hello,

I want to ask about my problem. There are some OSDs that don't have any 
load (indicated by no ops on those OSDs). 

Hereby I attach the ceph osd status result: 
https://pastebin.com/fFLcCbpk . Look at OSDs 17, 61 and 72. There's no 
load or operations happening on those OSDs. How can I fix this?

Thank you
Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719 

E : iswaradr...@gmail.com / iswaradr...@live.com





[ceph-users] Erasure pool

2017-11-08 Thread Marc Roos
 
Can anyone advise on an erasure pool config to store 

- files between 500MB and 8GB, total 8TB
- just for archiving, not much reading (few files a week)
- hdd pool
- now 3 node cluster (4th coming)
- would like to save on storage space

I was thinking of a profile with jerasure k=3 m=2, but maybe LRC is 
better? Or wait for the 4th node and choose k=4 m=2?
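
For what it is worth, the profile I am considering would be created 
roughly like this (pool name and PG count are just examples; with only 
3 nodes and k+m=5 the failure domain has to drop to osd):

```shell
ceph osd erasure-code-profile set archive32 k=3 m=2 crush-failure-domain=osd
ceph osd erasure-code-profile get archive32
ceph osd pool create ecarchive 64 64 erasure archive32
```

With failure-domain=osd a single host can hold several shards of the 
same object, so losing a host can lose data; that is the price of 
running k+m larger than the node count.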




Re: [ceph-users] who is using nfs-ganesha and cephfs?

2017-11-08 Thread Marc Roos
 

I am, in a test environment: CentOS 7, on a Luminous OSD node, with 
binaries from 
download.ceph.com::ceph/nfs-ganesha/rpm-V2.5-stable/luminous/x86_64/

I am getting these:
Nov  6 17:41:34 c01 kernel: ganesha.nfsd[31113]: segfault at 0 ip 
7fa80a151a43 sp 7fa755ffa2f0 error 4 in 
libdbus-1.so.3.7.4[7fa80a12b000+46000]
Nov  6 17:41:34 c01 kernel: ganesha.nfsd[31113]: segfault at 0 ip 
7fa80a151a43 sp 7fa755ffa2f0 error 4 in 
libdbus-1.so.3.7.4[7fa80a12b000+46000]
Nov  6 17:42:16 c01 kernel: ganesha.nfsd[6839]: segfault at 8 ip 
7fc97a5d3f98 sp 7fc8c6ffc2f8 error 6 in 
libdbus-1.so.3.7.4[7fc97a5ac000+46000]
Nov  6 17:42:16 c01 kernel: ganesha.nfsd[6839]: segfault at 8 ip 
7fc97a5d3f98 sp 7fc8c6ffc2f8 error 6 in 
libdbus-1.so.3.7.4[7fc97a5ac000+46000]
Nov  6 17:47:47 c01 kernel: ganesha.nfsd[7662]: segfault at 4 ip 
7f15e2afc060 sp 7f152effc388 error 6 in 
libdbus-1.so.3.7.4[7f15e2ad6000+46000]
Nov  6 17:47:47 c01 kernel: ganesha.nfsd[7662]: segfault at 4 ip 
7f15e2afc060 sp 7f152effc388 error 6 in 
libdbus-1.so.3.7.4[7f15e2ad6000+46000]
Nov  6 17:52:25 c01 kernel: ganesha.nfsd[14415]: segfault at 88 ip 
7f9258eed453 sp 7f91a9ff2348 error 4 in 
libdbus-1.so.3.7.4[7f9258eda000+46000]
Nov  6 17:52:25 c01 kernel: ganesha.nfsd[14415]: segfault at 88 ip 
7f9258eed453 sp 7f91a9ff2348 error 4 in 
libdbus-1.so.3.7.4[7f9258eda000+46000]


And reported this
https://github.com/nfs-ganesha/nfs-ganesha/issues/215



-Original Message-
From: Sage Weil [mailto:sw...@redhat.com] 
Sent: woensdag 8 november 2017 22:42
To: ceph-us...@ceph.com; ceph-de...@vger.kernel.org
Subject: [ceph-users] who is using nfs-ganesha and cephfs?

Who is running nfs-ganesha's FSAL to export CephFS?  What has your 
experience been?

(We are working on building proper testing and support for this into 
Mimic, but the ganesha FSAL has been around for years.)

Thanks!
sage





[ceph-users] Ceph auth profile definitions

2017-11-09 Thread Marc Roos
 
How/where can I see how e.g. 'profile rbd' is defined?

As in 
[client.rbd.client1]
key = xxx==
caps mon = "profile rbd"
caps osd = "profile rbd pool=rbd"







[ceph-users] Librbd, qemu, libvirt xml

2017-11-09 Thread Marc Roos
 
What would be the correct way to convert, in the XML file, rbdmap-mapped 
images to librbd?

I had this:
 

  
  
  
  
  


And for librbd this:


  
  

  
  



  
  
  
  


But this will give me a qemu format drive option:
-drive 
file=rbd:rbd/vps-test2:id=rbd.vps:key=XWHYISTHISEVENHERE==:auth_
supported=cephx\;none:mon_host=192.168.10.111\:6789\;192.168.10.112\:678
9\;192.168.10.113\:6789,format=raw,if=none,id=drive-scsi0-0-0-0,cache=wr
iteback

And not format rbd:
-drive format=rbd,file=rbd:data/squeeze,cache=writeback
As specified here, http://docs.ceph.com/docs/luminous/rbd/qemu-rbd/

If I change type='raw' to type='rbd', I get 
error: unsupported configuration: unknown driver format value 'rbd'
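
For reference, the librbd disk definition I am aiming for looks roughly 
like this (the secret uuid and target dev are placeholders); as far as I 
can tell the driver type stays 'raw', and it is protocol='rbd' on the 
source element that selects librbd:

```xml
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='writeback'/>
  <auth username='rbd.vps'>
    <secret type='ceph' uuid='00000000-0000-0000-0000-000000000000'/>
  </auth>
  <source protocol='rbd' name='rbd/vps-test2'>
    <host name='192.168.10.111' port='6789'/>
    <host name='192.168.10.112' port='6789'/>
    <host name='192.168.10.113' port='6789'/>
  </source>
  <target dev='sda' bus='scsi'/>
</disk>
```

The secret uuid would come from a libvirt secret holding the cephx key, 
so the key no longer appears on the qemu command line.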



Linux c01 3.10.0-693.5.2.el7.x86_64 #1 SMP Fri Oct 20 20:32:50 UTC 2017 
x86_64 x86_64 x86_64 GNU/Linux
ceph-mgr-12.2.1-0.el7.x86_64
ceph-12.2.1-0.el7.x86_64
libcephfs2-12.2.1-0.el7.x86_64
python-cephfs-12.2.1-0.el7.x86_64
ceph-common-12.2.1-0.el7.x86_64
ceph-selinux-12.2.1-0.el7.x86_64
ceph-mon-12.2.1-0.el7.x86_64
ceph-mds-12.2.1-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
ceph-base-12.2.1-0.el7.x86_64
ceph-osd-12.2.1-0.el7.x86_64
ceph-deploy-1.5.39-0.noarch


[ceph-users] Pool shard/stripe settings for file too large files?

2017-11-09 Thread Marc Roos
 
I would like to store objects with

rados -p ec32 put test2G.img test2G.img

error putting ec32/test2G.img: (27) File too large

Changing the pool application from custom to rgw did not help











[ceph-users] Undersized fix for small cluster, other than adding a 4th node?

2017-11-09 Thread Marc Roos
 
I added an erasure k=3,m=2 coded pool on a 3 node test cluster and am 
getting these errors. 

   pg 48.0 is stuck undersized for 23867.00, current state 
active+undersized+degraded, last acting [9,13,2147483647,7,2147483647]
pg 48.1 is stuck undersized for 27479.944212, current state 
active+undersized+degraded, last acting [12,1,2147483647,8,2147483647]
pg 48.2 is stuck undersized for 27479.944514, current state 
active+undersized+degraded, last acting [12,1,2147483647,3,2147483647]
pg 48.3 is stuck undersized for 27479.943845, current state 
active+undersized+degraded, last acting [11,0,2147483647,2147483647,5]
pg 48.4 is stuck undersized for 27479.947473, current state 
active+undersized+degraded, last acting [8,4,2147483647,2147483647,5]
pg 48.5 is stuck undersized for 27479.940289, current state 
active+undersized+degraded, last acting [6,5,11,2147483647,2147483647]
pg 48.6 is stuck undersized for 27479.947125, current state 
active+undersized+degraded, last acting [5,8,2147483647,1,2147483647]
pg 48.7 is stuck undersized for 23866.977708, current state 
active+undersized+degraded, last acting [13,11,2147483647,0,2147483647]

Mentioned here 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-May/009572.html 
is that the problem was resolved by adding an extra node; I already 
changed the min_size to 3. Or should I change to k=2,m=2, but do I then 
still have a good saving on storage? How can you calculate the storage 
saving of an erasure pool?
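
As far as I understand it, the storage saving follows directly from the 
profile: each logical byte costs (k+m)/k raw bytes. A quick sketch of 
the comparison:

```python
def raw_multiplier(k: int, m: int) -> float:
    """Raw bytes stored per logical byte in a k+m erasure profile."""
    return (k + m) / k

# Erasure profiles under consideration vs. plain 3x replication.
for k, m in [(3, 2), (2, 2), (4, 2)]:
    print(f"k={k} m={m}: {raw_multiplier(k, m):.2f}x raw, "
          f"tolerates {m} lost shards")
print("3x replication: 3.00x raw, tolerates 2 lost copies")
```

So k=3,m=2 stores data at 1.67x raw space versus 3x for replication, 
while still surviving two lost shards; k=2,m=2 costs 2.00x.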






Re: [ceph-users] Pool shard/stripe settings for file too large files?

2017-11-09 Thread Marc Roos
 
Yes, I actually changed it back to the default after reading a bit 
about it (https://github.com/ceph/ceph/pull/15520). I wanted to store 
5GB and 12GB files, and that makes recovery not so nice. I thought 
there was a setting to split them up automatically, like with rbd 
pools. 



-Original Message-
From: Kevin Hrpcek [mailto:kevin.hrp...@ssec.wisc.edu] 
Sent: donderdag 9 november 2017 21:09
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Pool shard/stripe settings for file too large 
files?

Marc,

If you're running luminous you may need to increase osd_max_object_size. 
This snippet is from the Luminous change log.

"The default maximum size for a single RADOS object has been reduced 
from 100GB to 128MB. The 100GB limit was completely impractical in 
practice while the 128MB limit is a bit high but not unreasonable. If 
you have an application written directly to librados that is using 
objects larger than 128MB you may need to adjust osd_max_object_size"

Kevin


On 11/09/2017 02:01 PM, Marc Roos wrote:


 
I would like store objects with

rados -p ec32 put test2G.img test2G.img

error putting ec32/test2G.img: (27) File too large

Changing the pool application from custom to rgw did not help









___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com





Re: [ceph-users] Pool shard/stripe settings for file too large files?

2017-11-09 Thread Marc Roos
 
Do you know of a rados client that uses this? Maybe a simple 'mount' so 
I can cp the files onto it?






-Original Message-
From: Christian Wuerdig [mailto:christian.wuer...@gmail.com] 
Sent: donderdag 9 november 2017 22:01
To: Kevin Hrpcek
Cc: Marc Roos; ceph-users
Subject: Re: [ceph-users] Pool shard/stripe settings for file too large 
files?

It should be noted that the general advise is to not use such large 
objects since cluster performance will suffer, see also this thread:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021051.html

libradosstriper might be an option which will automatically break the 
object into smaller chunks
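
If the rados binary was built with libradosstriper support, something 
like this should stripe the upload into smaller RADOS objects (I have 
not verified the flag on this exact version):

```shell
rados --striper -p ec21 put test8G.img test8G.img
rados --striper -p ec21 stat test8G.img
```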

On Fri, Nov 10, 2017 at 9:08 AM, Kevin Hrpcek 
 wrote:
> Marc,
>
> If you're running luminous you may need to increase 
osd_max_object_size.
> This snippet is from the Luminous change log.
>
> "The default maximum size for a single RADOS object has been reduced 
> from 100GB to 128MB. The 100GB limit was completely impractical in 
> practice while the 128MB limit is a bit high but not unreasonable. If 
> you have an application written directly to librados that is using 
> objects larger than 128MB you may need to adjust osd_max_object_size"
>
> Kevin
>
> On 11/09/2017 02:01 PM, Marc Roos wrote:
>
>
> I would like store objects with
>
> rados -p ec32 put test2G.img test2G.img
>
> error putting ec32/test2G.img: (27) File too large
>
> Changing the pool application from custom to rgw did not help
>
>
>
>
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>




[ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-10 Thread Marc Roos
 
OSDs are crashing when putting an (8GB) file into an erasure coded 
pool, just before finishing. The same OSDs are used for replicated 
pools rbd/cephfs, and seem to do fine. Did I make some error, or is 
this a bug? Looks similar to
https://www.spinics.net/lists/ceph-devel/msg38685.html
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html


[@c01 ~]# date ; rados -p ec21 put  $(basename 
"/mnt/disk/blablablalbalblablalablalb.txt") 
blablablalbalblablalablalb.txt
Fri Nov 10 20:27:26 CET 2017

[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket 
closed (con state OPEN)
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket 
error on write
[Fri Nov 10 20:33:52 2017] libceph: osd0 down
[Fri Nov 10 20:33:52 2017] libceph: osd7 down
[Fri Nov 10 20:33:55 2017] libceph: osd0 down
[Fri Nov 10 20:33:55 2017] libceph: osd7 down
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up

[@c02 ~]# rados -p ec21 stat blablablalbalblablalablalb.txt
2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.296290 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.331588 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
ec21/blablablalbalblablalablalb.txt mtime 2017-11-10 20:32:52.00, 
size 8585740288



2017-11-10 20:32:52.287503 7f933028d700  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1510342372287484, "job": 32, "event": "flush_started", 
"num_memtables": 1, "num_entries": 728747, "num_deletes": 363960, 
"memory_usage": 263854696}
2017-11-10 20:32:52.287509 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:293] 
[default] [JOB 32] Level-0 flush table #25279: started
2017-11-10 20:32:52.503311 7f933028d700  4 rocksdb: EVENT_LOG_v1 
{"time_micros": 1510342372503293, "cf_name": "default", "job": 32, 
"event": "table_file_creation", "file_number": 25279, "file_size": 
4811948, "table_properties": {"data_size": 4675796, "index_size": 
102865, "filter_size": 32302, "raw_key_size": 646440, 
"raw_average_key_size": 75, "raw_value_size": 4446103, 
"raw_average_value_size": 519, "num_data_blocks": 1180, "num_entries": 
8560, "filter_policy_name": "rocksdb.BuiltinBloomFilter", 
"kDeletedKeys": "0", "kMergeOperands": "330"}}
2017-11-10 20:32:52.503327 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:319] 
[default] [JOB 32] Level-0 flush table #25279: 4811948 bytes OK
2017-11-10 20:32:52.572413 7f933028d700  4 rocksdb: 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_files.cc:242] 
adding log 25276 to recycle list

2017-11-10 20:32:52.572422 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.503339) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:360] 
[default] Level-0 commit table #25279 started
2017-11-10 20:32:52.572425 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572312) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/memtable_list.cc:383] 
[default] Level-0 commit table #25279: memtable #1 done
2017-11-10 20:32:52.572428 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572328) EVENT_LOG_v1 {"time_micros": 
1510342372572321, "job": 32, "event": "flush_finished", "lsm_state": [4, 
4, 36, 140, 0, 0, 0], "immutable_memtables": 0}
2017-11-10 20:32:52.572430 7f933028d700  4 rocksdb: (Original Log Time 
2017/11/10-20:32:52.572397) 
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_compaction_flush
.cc:132] [default] Level summary: base level 1 max bytes base 268435456 
files[4 4 36 140 0 0 0] max score 1.00

2017-11-10 20:32:52.572491 7f933028d700  4 rocksdb: 
[/home/jenki

Re: [ceph-users] No ops on some OSD

2017-11-12 Thread Marc Roos

[@c03 ~]# ceph osd status
2017-11-12 15:54:13.164823 7f478a6ad700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
2017-11-12 15:54:13.211219 7f478a6ad700 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore
no valid command found; 10 closest matches:
osd map   {}
osd lspools {}
osd count-metadata 
osd versions
osd find 
osd metadata {}
osd getmaxosd
osd ls-tree {} {}
osd getmap {}
osd getcrushmap {}
Error EINVAL: invalid command
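
If I remember correctly, 'ceph osd status' in Luminous is provided by 
the ceph-mgr 'status' module, so it only works with an active mgr that 
has the module enabled; something like this should tell (assuming the 
module name):

```shell
ceph mgr module ls
ceph mgr module enable status
ceph osd status
```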



-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com] 
Sent: zondag 12 november 2017 2:17
Cc: ceph-users
Subject: Re: [ceph-users] No ops on some OSD

Still the same syntax (ceph osd status)

Thanks

Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719 

E : iswaradr...@gmail.com / iswaradr...@live.com


On Sat, Nov 4, 2017 at 6:11 PM, Marc Roos  
wrote:




What is the new syntax for "ceph osd status" for luminous?





-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]
Sent: donderdag 2 november 2017 6:19
To: ceph-users@lists.ceph.com
Subject: [ceph-users] No ops on some OSD

Hello,

I want to ask about my problem. There's some OSD that dont have any 
load
(indicated with No ops on that OSD).

Hereby I attached the ceph osd status result :
https://pastebin.com/fFLcCbpk . Look at OSD 17,61 and 72. There's 
no
load or operation happened at that OSD. How to fix this?

Thank you
Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719

E : iswaradr...@gmail.com / iswaradr...@live.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
<http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com> 






Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
 
1. I don't think an OSD should 'crash' in such a situation. 
2. How else should I 'rados put' an 8GB file?






-Original Message-
From: Christian Wuerdig [mailto:christian.wuer...@gmail.com] 
Sent: maandag 13 november 2017 0:12
To: Marc Roos
Cc: ceph-users
Subject: Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

As per: https://www.spinics.net/lists/ceph-devel/msg38686.html
Bluestore as a hard 4GB object size limit


On Sat, Nov 11, 2017 at 9:27 AM, Marc Roos  
wrote:
>
> osd's are crashing when putting a (8GB) file in a erasure coded pool, 
> just before finishing. The same osd's are used for replicated pools 
> rbd/cephfs, and seem to do fine. Did I made some error is this a bug?
> Looks similar to
> https://www.spinics.net/lists/ceph-devel/msg38685.html
> http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021
> 045.html
>
>
> [@c01 ~]# date ; rados -p ec21 put  $(basename
> "/mnt/disk/blablablalbalblablalablalb.txt")
> blablablalbalblablalablalb.txt
> Fri Nov 10 20:27:26 CET 2017
>
> [Fri Nov 10 20:33:51 2017] libceph: osd9 down [Fri Nov 10 20:33:51 
> 2017] libceph: osd9 down [Fri Nov 10 20:33:51 2017] libceph: osd0 
> 192.168.10.111:6802 socket closed (con state OPEN) [Fri Nov 10 
> 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket error on write 

> [Fri Nov 10 20:33:52 2017] libceph: osd0 down [Fri Nov 10 20:33:52 
> 2017] libceph: osd7 down [Fri Nov 10 20:33:55 2017] libceph: osd0 down 

> [Fri Nov 10 20:33:55 2017] libceph: osd7 down [Fri Nov 10 20:34:41 
> 2017] libceph: osd7 up [Fri Nov 10 20:34:41 2017] libceph: osd7 up 
> [Fri Nov 10 20:35:03 2017] libceph: osd9 up [Fri Nov 10 20:35:03 2017] 

> libceph: osd9 up [Fri Nov 10 20:35:47 2017] libceph: osd0 up [Fri Nov 
> 10 20:35:47 2017] libceph: osd0 up
>
> [@c02 ~]# rados -p ec21 stat blablablalbalblablalablalb.txt 2017-11-10 

> 20:39:31.296101 7f840ad45e40 -1 WARNING: the following dangerous and 
> experimental features are enabled: bluestore 2017-11-10 
> 20:39:31.296290 7f840ad45e40 -1 WARNING: the following dangerous and 
> experimental features are enabled: bluestore 2017-11-10 
> 20:39:31.331588 7f840ad45e40 -1 WARNING: the following dangerous and 
> experimental features are enabled: bluestore 
> ec21/blablablalbalblablalablalb.txt mtime 2017-11-10 20:32:52.00, 
> size 8585740288
>
>
>
> 2017-11-10 20:32:52.287503 7f933028d700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1510342372287484, "job": 32, "event": "flush_started",
> "num_memtables": 1, "num_entries": 728747, "num_deletes": 363960,
> "memory_usage": 263854696}
> 2017-11-10 20:32:52.287509 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_
> AR 
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/releas
> e/ 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:293]
> [default] [JOB 32] Level-0 flush table #25279: started 2017-11-10 
> 20:32:52.503311 7f933028d700  4 rocksdb: EVENT_LOG_v1
> {"time_micros": 1510342372503293, "cf_name": "default", "job": 32,
> "event": "table_file_creation", "file_number": 25279, "file_size":
> 4811948, "table_properties": {"data_size": 4675796, "index_size":
> 102865, "filter_size": 32302, "raw_key_size": 646440,
> "raw_average_key_size": 75, "raw_value_size": 4446103,
> "raw_average_value_size": 519, "num_data_blocks": 1180, "num_entries":
> 8560, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
> "kDeletedKeys": "0", "kMergeOperands": "330"}} 2017-11-10 
> 20:32:52.503327 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_
> AR 
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/releas
> e/ 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:319]
> [default] [JOB 32] Level-0 flush table #25279: 4811948 bytes OK 
> 2017-11-10 20:32:52.572413 7f933028d700  4 rocksdb:
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_
> AR 
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/releas
> e/ 
> 12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/db_impl_files.cc:242]
> adding log 25276 to recycle list
>
> 2017-11-10 20:32:52.572422 7f933028d700  4 rocksdb: (Original Log Time
> 2017/11/10-20:32:52.503339)
> [/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_
> AR 
> CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/re

Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

2017-11-13 Thread Marc Roos
 

I have been asking myself (and here) the same question. I think it is 
because of having this in ceph.conf: 
enable experimental unrecoverable data corrupting features = bluestore
But I am not sure if I can remove this, or have to replace it with 
something else.

ceph-12.2.1-0.el7.x86_64
ceph-base-12.2.1-0.el7.x86_64
ceph-common-12.2.1-0.el7.x86_64
ceph-mds-12.2.1-0.el7.x86_64
ceph-mgr-12.2.1-0.el7.x86_64
ceph-mon-12.2.1-0.el7.x86_64
ceph-osd-12.2.1-0.el7.x86_64
ceph-selinux-12.2.1-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
libcephfs2-12.2.1-0.el7.x86_64
nfs-ganesha-ceph-2.5.2-.el7.x86_64
python-cephfs-12.2.1-0.el7.x86_64




-Original Message-
From: Caspar Smit [mailto:caspars...@supernas.eu] 
Sent: maandag 13 november 2017 9:58
To: ceph-users
Subject: Re: [ceph-users] Getting errors on erasure pool writes k=2, m=1

Hi,

Why would Ceph 12.2.1 give you this message:

2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following 
dangerous and experimental features are enabled: bluestore



Or is that a leftover warning message from an old client?

Kind regards,
Caspar


2017-11-10 21:27 GMT+01:00 Marc Roos :



osd's are crashing when putting a (8GB) file in a erasure coded 
pool,
just before finishing. The same osd's are used for replicated pools
rbd/cephfs, and seem to do fine. Did I made some error is this a 
bug?
Looks similar to
https://www.spinics.net/lists/ceph-devel/msg38685.html 
<https://www.spinics.net/lists/ceph-devel/msg38685.html> 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/
021045.html 
<http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-September/021045.html>
 


[@c01 ~]# date ; rados -p ec21 put  $(basename
"/mnt/disk/blablablalbalblablalablalb.txt")
blablablalbalblablalablalb.txt
Fri Nov 10 20:27:26 CET 2017

[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd9 down
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket
closed (con state OPEN)
[Fri Nov 10 20:33:51 2017] libceph: osd0 192.168.10.111:6802 socket
error on write
[Fri Nov 10 20:33:52 2017] libceph: osd0 down
[Fri Nov 10 20:33:52 2017] libceph: osd7 down
[Fri Nov 10 20:33:55 2017] libceph: osd0 down
[Fri Nov 10 20:33:55 2017] libceph: osd7 down
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:34:41 2017] libceph: osd7 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:03 2017] libceph: osd9 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up
[Fri Nov 10 20:35:47 2017] libceph: osd0 up

[@c02 ~]# rados -p ec21 stat blablablalbalblablalablalb.txt
2017-11-10 20:39:31.296101 7f840ad45e40 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.296290 7f840ad45e40 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore
2017-11-10 20:39:31.331588 7f840ad45e40 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore
ec21/blablablalbalblablalablalb.txt mtime 2017-11-10 
20:32:52.00,
size 8585740288



2017-11-10 20:32:52.287503 7f933028d700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1510342372287484, "job": 32, "event": 
"flush_started",
"num_memtables": 1, "num_entries": 728747, "num_deletes": 363960,
"memory_usage": 263854696}
2017-11-10 20:32:52.287509 7f933028d700  4 rocksdb:
[/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILAB
LE_AR
CH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/rel
ease/
12.2.1/rpm/el7/BUILD/ceph-12.2.1/src/rocksdb/db/flush_job.cc:293]
[default] [JOB 32] Level-0 flush table #25279: started
2017-11-10 20:32:52.503311 7f933028d700  4 rocksdb: EVENT_LOG_v1
{"time_micros": 1510342372503293, "cf_name": "default", "job": 32,
"event": "table_file_creation", "file_number": 25279, "file_size":
4811948, "table_properties": {"data_size": 4675796, "index_size":
102865, "filter_size": 32302, "raw_key_size": 646440,
"raw_average_key_size": 75, "raw_value_size": 4446103,
"raw_average_value_size": 519, "num_data_blocks": 1180, 
"num_entries":
8560, "filter_policy_name": "rocksdb.BuiltinBloomFilter",
"kDeletedKeys": "0", "kMergeOperands": "330"}}
2017-1
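As an aside, the table_properties in the table_file_creation event above can be used to see how compactly that SST was written; a small awk sketch, with the numbers copied from that log line:

```shell
# Numbers copied from the table_file_creation event above.
awk 'BEGIN {
    file_size = 4811948
    raw = 646440 + 4446103   # raw_key_size + raw_value_size
    printf "on-disk bytes per raw byte: %.3f\n", file_size / raw
}'
```

So that particular flush wrote roughly 0.945 bytes on disk per raw byte, i.e. almost no space saving on this data.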

Re: [ceph-users] No ops on some OSD

2017-11-13 Thread Marc Roos

 
Indeed this what I have

[@c01 ceph]# ceph --version
ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous 
(stable)

[@c01 ceph]# ceph tell osd.* version|head
osd.0: {
"version": "ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)"
}
osd.1: {
"version": "ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)"
}
osd.2: {
"version": "ceph version 12.2.1 
(3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)"
}
osd.3: {

[@c01 ceph]# rpm -qa | grep ceph | sort
ceph-12.2.1-0.el7.x86_64
ceph-base-12.2.1-0.el7.x86_64
ceph-common-12.2.1-0.el7.x86_64
ceph-mds-12.2.1-0.el7.x86_64
ceph-mgr-12.2.1-0.el7.x86_64
ceph-mon-12.2.1-0.el7.x86_64
ceph-osd-12.2.1-0.el7.x86_64
ceph-selinux-12.2.1-0.el7.x86_64
collectd-ceph-5.7.1-2.el7.x86_64
libcephfs2-12.2.1-0.el7.x86_64
nfs-ganesha-ceph-2.5.2-.el7.x86_64
python-cephfs-12.2.1-0.el7.x86_64



-Original Message-
From: Caspar Smit [mailto:caspars...@supernas.eu] 
Sent: maandag 13 november 2017 10:51
To: ceph-users
Subject: Re: [ceph-users] No ops on some OSD

Weird

# ceph --version
ceph version 12.2.1 (fc129ad90a65dc0b419412e77cb85ac230da42a6) luminous 
(stable)

# ceph osd status
+++---+---++-++-+
| id |  host  |  used | avail | wr ops | wr data | rd ops | rd data |
+++---+---++-++-+
| 0  | node04 |  115M | 11.6G |0   | 0   |0   | 0   |
+++---+---++-++-+

ps. output is from a single host with 1 (virtual) OSD configured but the 
command works

Try to remove that dangerous and experimental features setting from your 
ceph.conf and see if that solves it.

Caspar

2017-11-12 15:56 GMT+01:00 Marc Roos :



[@c03 ~]# ceph osd status
2017-11-12 15:54:13.164823 7f478a6ad700 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore
2017-11-12 15:54:13.211219 7f478a6ad700 -1 WARNING: the following
dangerous and experimental features are enabled: bluestore
no valid command found; 10 closest matches:
osd map   {}
osd lspools {}
osd count-metadata 
osd versions
osd find 
osd metadata {}
osd getmaxosd
osd ls-tree {} {}
osd getmap {}
osd getcrushmap {}
Error EINVAL: invalid command



-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]

Sent: zondag 12 november 2017 2:17
Cc: ceph-users
Subject: Re: [ceph-users] No ops on some OSD

Still the same syntax (ceph osd status)

Thanks

Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System Engineering

Telkom University

P / SMS / WA : 081 322 070719

E : iswaradr...@gmail.com / iswaradr...@live.com
    

On Sat, Nov 4, 2017 at 6:11 PM, Marc Roos 

wrote:




What is the new syntax for "ceph osd status" for luminous?





-Original Message-
From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]
Sent: donderdag 2 november 2017 6:19
To: ceph-users@lists.ceph.com
Subject: [ceph-users] No ops on some OSD

Hello,

I want to ask about my problem. There's some OSD that dont 
have any
load
(indicated with No ops on that OSD).

Hereby I attached the ceph osd status result :
https://pastebin.com/fFLcCbpk . Look at OSD 17,61 and 72. 
There's
no
load or operation happened at that OSD. How to fix this?

Thank you
Regards,

I Gede Iswara Darmawan

Information System - School of Industrial and System 
Engineering

Telkom University

P / SMS / WA : 081 322 070719

E : iswaradr...@gmail.com / iswaradr...@live.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 






Re: [ceph-users] No ops on some OSD

2017-11-13 Thread Marc Roos
 
Very very nice, Thanks! Is there a heavy penalty to pay for enabling 
this? 



-Original Message-
From: John Spray [mailto:jsp...@redhat.com] 
Sent: maandag 13 november 2017 11:48
To: Marc Roos
Cc: iswaradrmwn; ceph-users
Subject: Re: [ceph-users] No ops on some OSD

On Sun, Nov 12, 2017 at 2:56 PM, Marc Roos  
wrote:
>
> [@c03 ~]# ceph osd status
> 2017-11-12 15:54:13.164823 7f478a6ad700 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore
> 2017-11-12 15:54:13.211219 7f478a6ad700 -1 WARNING: the following 
> dangerous and experimental features are enabled: bluestore no valid 
> command found; 10 closest matches:
> osd map   {}
> osd lspools {}
> osd count-metadata 
> osd versions
> osd find 
> osd metadata {}
> osd getmaxosd
> osd ls-tree {} {}
> osd getmap {}
> osd getcrushmap {}
> Error EINVAL: invalid command

The "osd status" command comes from the ceph-mgr module called "status" 
-- this is enabled by default but it's possible that it got switched off 
on your system?  Check your ceph-mgr logs and whether it's in "ceph mgr 
module ls" (or try enabling with "ceph mgr module enable status")

John

>
>
>
> -Original Message-
> From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]
> Sent: zondag 12 november 2017 2:17
> Cc: ceph-users
> Subject: Re: [ceph-users] No ops on some OSD
>
> Still the same syntax (ceph osd status)
>
> Thanks
>
> Regards,
>
> I Gede Iswara Darmawan
>
> Information System - School of Industrial and System Engineering
>
> Telkom University
>
> P / SMS / WA : 081 322 070719
>
> E : iswaradr...@gmail.com / iswaradr...@live.com
>
>
> On Sat, Nov 4, 2017 at 6:11 PM, Marc Roos 
> wrote:
>
>
>
>
> What is the new syntax for "ceph osd status" for luminous?
>
>
>
>
>
> -Original Message-
> From: I Gede Iswara Darmawan [mailto:iswaradr...@gmail.com]
> Sent: donderdag 2 november 2017 6:19
> To: ceph-users@lists.ceph.com
> Subject: [ceph-users] No ops on some OSD
>
> Hello,
>
> I want to ask about my problem. There's some OSD that dont 
> have any load
> (indicated with No ops on that OSD).
>
> Hereby I attached the ceph osd status result :
> https://pastebin.com/fFLcCbpk . Look at OSD 17,61 and 72. 
> There's no
> load or operation happened at that OSD. How to fix this?
>
> Thank you
> Regards,
>
> I Gede Iswara Darmawan
>
> Information System - School of Industrial and System 
> Engineering
>
> Telkom University
>
> P / SMS / WA : 081 322 070719
>
> E : iswaradr...@gmail.com / iswaradr...@live.com
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> <http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] HW Raid vs. Multiple OSD

2017-11-13 Thread Marc Roos
 
Keep in mind also whether you want to have failover in the future. We 
were running a 2nd server and were replicating the RAID arrays via DRBD. 
Expanding that storage is quite a hassle, compared to just adding a few 
OSDs. 



-Original Message-
From: Oscar Segarra [mailto:oscar.sega...@gmail.com] 
Sent: maandag 13 november 2017 15:26
To: Peter Maloney
Cc: ceph-users
Subject: Re: [ceph-users] HW Raid vs. Multiple OSD

Hi Peter, 

Thanks a lot for your consideration in terms of storage consumption. 

The other question is about having one OSD vs 8 OSDs... will 8 OSDs 
consume more CPU than 1 OSD (RAID5)?

As I want to share compute and osd in the same box, resources consumed 
by OSD can be a handicap.

Thanks a lot.

2017-11-13 12:59 GMT+01:00 Peter Maloney 
:


Once you've replaced an OSD, you'll see it is quite simple... doing 
it for a few is not much more work (you've scripted it, right?). I don't 
see RAID as giving any benefit here at all. It's not tricky...it's 
perfectly normal operation. Just get used to ceph, and it'll be as 
normal as replacing a RAID disk. And for performance degradation, maybe 
it could be better on either... or better on ceph if you don't mind 
setting the rate to the lowest... but when the QoS functionality is 
ready, probably ceph will be much better. Also RAID will cost you more 
for hardware.

And RAID5 is really bad for IOPS. And ceph already replicates, so 
you would have 2 layers of redundancy... and ceph does it cluster-wide, 
not just on one machine. Using ceph with replication is like having all 
your free space as hot spares... you could lose 2 disks on all your 
machines, and it could still run (assuming it had time to recover in 
between, and enough space). And you don't want min_size=1, but with 2 
layers of redundancy you'll probably be tempted to set that.

But for some workloads, like RBD, ceph doesn't balance out the 
workload very evenly for a specific client, only many clients at once... 
raid might help solve that, but I don't see it as worth it.

I would just software RAID1 the OS and mons, and mds, not the OSDs.
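For the capacity side of the comparison, a back-of-envelope awk sketch (assumptions: 8 x 1TB disks per host, ceph size=3, RAID5 losing one disk to parity, with ceph still replicating on top of it):

```shell
# Assumptions: 8 x 1TB disks per host, ceph size=3, RAID5 loses one
# disk to parity (and ceph still replicates on top of it).
awk 'BEGIN {
    disks = 8; disk_tb = 1.0; replica = 3
    raid5 = (disks - 1) * disk_tb / replica   # RAID5 under one OSD, then 3x
    plain = disks * disk_tb / replica         # 8 plain OSDs, 3x only
    printf "raid5+3x: %.2f TB   plain+3x: %.2f TB (usable per host)\n", raid5, plain
}'
```

So the RAID5 layout gives up roughly a disk's worth of usable space per host on top of what replication already costs.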


On 11/13/17 12:26, Oscar Segarra wrote:


Hi,  

I'm designing my infraestructure. I want to provide 8TB (8 
disks x 1TB each) of data per host just for Microsoft Windows 10 VDI. In 
each host I will have storage (ceph osd) and compute (on kvm).

I'd like to hear your opinion about these two configurations:

1.- RAID5 with 8 disks (I will have 7TB but for me it is 
enough) + 1 OSD daemon
2.- 8 OSD daemons

I'm a little bit worried that 8 OSD daemons can affect 
performance because of all the jobs running and scrubbing.

Another question is the procedure for replacing a failed 
disk. In the case of a big RAID, replacement is direct. In the case of 
many OSDs, the procedure is a little bit tricky.


http://ceph.com/geen-categorie/admin-guide-replacing-a-failed-disk-in-a-ceph-cluster/
 

 


What is your advice?

Thanks a lot everybody in advance...

 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com 
 




-- 


Peter Maloney
Brockmann Consult
Max-Planck-Str. 2
21502 Geesthacht
Germany
Tel: +49 4152 889 300  
Fax: +49 4152 889 333  
E-mail: peter.malo...@brockmann-consult.de 
 
Internet: http://www.brockmann-consult.de 
 




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Another OSD broken today. How can I recover it?

2017-11-26 Thread Marc Roos
 
If I am not mistaken, the whole idea of the 3 replicas is that you 
have enough copies to recover from a failed OSD. In my tests this seems 
to go fine automatically. Are you doing something that is not advised?




-Original Message-
From: Gonzalo Aguilar Delgado [mailto:gagui...@aguilardelgado.com] 
Sent: zaterdag 25 november 2017 20:44
To: 'ceph-users'
Subject: [ceph-users] Another OSD broken today. How can I recover it?

Hello, 


I had another blackout with ceph today. It seems that ceph osd's fall 
from time to time and they are unable to recover. I have 3 OSD's down 
now. 1 removed from the cluster and 2 down because I'm unable to recover 
them. 


We really need a recovery tool. It's not normal that an OSD breaks and 
there's no way to recover. Is there any way to do it?


Last one shows this:




] enter Reset
   -12> 2017-11-25 20:34:19.548891 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34(unlocked)] enter Initial
   -11> 2017-11-25 20:34:19.548983 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34( empty local-les=9685 n=0 ec=404 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
exit Initial 0.91 0 0.00
   -10> 2017-11-25 20:34:19.548994 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[0.34( empty local-les=9685 n=0 ec=404 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
enter Reset
-9> 2017-11-25 20:34:19.549166 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36(unlocked)] enter Initial
-8> 2017-11-25 20:34:19.566781 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36( v 9686'7301894 (9686'7298879,9686'7301894] local-les=9685 
n=534 ec=419 les/c/f 9685/9686/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'7301894 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 
0.017614 0 0.00
-7> 2017-11-25 20:34:19.566811 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[10.36( v 9686'7301894 (9686'7298879,9686'7301894] local-les=9685 
n=534 ec=419 les/c/f 9685/9686/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'7301894 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
-6> 2017-11-25 20:34:19.585411 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c(unlocked)] enter Initial
-5> 2017-11-25 20:34:19.602888 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c( empty local-les=9685 n=0 ec=348 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
exit Initial 0.017478 0 0.00
-4> 2017-11-25 20:34:19.602912 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[8.5c( empty local-les=9685 n=0 ec=348 les/c/f 9685/9685/0 
9684/9684/9684) [4,0] r=0 lpr=0 crt=0'0 mlcod 0'0 inactive NIBBLEWISE] 
enter Reset
-3> 2017-11-25 20:34:19.603082 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10(unlocked)] enter Initial
-2> 2017-11-25 20:34:19.615456 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10( v 9686'2322547 (9031'2319518,9686'2322547] local-les=9685 n=261 
ec=417 les/c/f 9685/9685/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'2322547 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] exit Initial 
0.012373 0 0.00
-1> 2017-11-25 20:34:19.615481 7f6e5dc158c0  5 osd.4 pg_epoch: 9686 
pg[9.10( v 9686'2322547 (9031'2319518,9686'2322547] local-les=9685 n=261 
ec=417 les/c/f 9685/9685/0 9684/9684/9684) [4,0] r=0 lpr=0 
crt=9686'2322547 lcod 0'0 mlcod 0'0 inactive NIBBLEWISE] enter Reset
 0> 2017-11-25 20:34:19.617400 7f6e5dc158c0 -1 osd/PG.cc: In 
function 'static int PG::peek_map_epoch(ObjectStore*, spg_t, epoch_t*, 
ceph::bufferlist*)' thread 7f6e5dc158c0 time 2017-11-25 20:34:19.615633
osd/PG.cc: 3025: FAILED assert(values.size() == 2)

 ceph version 10.2.10 (5dc1e4c05cb68dbf62ae6fce3f0700e4654fdbbe)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x80) [0x5562d318d790]
 2: (PG::peek_map_epoch(ObjectStore*, spg_t, unsigned int*, 
ceph::buffer::list*)+0x661) [0x5562d2b4b601]
 3: (OSD::load_pgs()+0x75a) [0x5562d2a9f8aa]
 4: (OSD::init()+0x2026) [0x5562d2aaaca6]
 5: (main()+0x2ef1) [0x5562d2a1c301]
 6: (__libc_start_main()+0xf0) [0x7f6e5aa75830]
 7: (_start()+0x29) [0x5562d2a5db09]
 NOTE: a copy of the executable, or `objdump -rdS ` is 
needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 client
   0/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 5 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 xio
   1/ 5 compressor
   1/ 5 newstore
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 

Re: [ceph-users] ceph all-nvme mysql performance tuning

2017-11-28 Thread Marc Roos
 
I was wondering if there are any statistics available that show the 
performance increase of doing such things?






-Original Message-
From: German Anders [mailto:gand...@despegar.com] 
Sent: dinsdag 28 november 2017 19:34
To: Luis Periquito
Cc: ceph-users
Subject: Re: [ceph-users] ceph all-nvme mysql performance tuning

Thanks a lot Luis, I agree with you regarding the CPUs, but 
unfortunately that was the best CPU model that we could afford :S

For the NUMA part, I managed to pin the OSDs by changing the 
/usr/lib/systemd/system/ceph-osd@.service file and adding the 
CPUAffinity list to it. But this applies to ALL the OSDs on the host, 
with a single CPU list; I can't find a way to specify a list for 
only a specific subset of OSDs. 

Also, I noticed that the NVMe disks are all on the same NUMA node (since 
I'm using half of the shelf - the other half will be pinned to the other 
node), so the lanes of the NVMe disks are all on the same CPU (in this 
case 0). I also found that the IB adapter that is mapped to the OSD 
network (OSD replication) is pinned to CPU 1, so that traffic will cross 
the QPI path.

And for the memory, from the other email, we are already using the 
TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES parameter with a value of 
134217728

In this case I can pin all the current OSDs to CPU 0, but in the near 
future when I add more NVMe disks to the OSD nodes, I'll definitely need 
to pin the other half of the OSDs to CPU 1. Has someone already done 
this?
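On the per-OSD pinning question: systemd supports per-instance drop-ins (ceph-osd@N.service.d), so each OSD id can carry its own CPUAffinity line. A sketch; the target directory is made overridable so it can be tried anywhere (on a real host it would live under /etc/systemd/system), and the osd id and core list are only example assumptions:

```shell
# Per-instance drop-in sketch: pin osd.3 to an example core list (the
# cores and the osd id are assumptions; adjust to your NUMA topology).
BASE="${BASE:-./sysd-affinity}"   # use /etc/systemd/system on a real host
mkdir -p "$BASE/ceph-osd@3.service.d"
cat > "$BASE/ceph-osd@3.service.d/cpuaffinity.conf" <<'EOF'
[Service]
CPUAffinity=0 2 4 6 8 10
EOF
cat "$BASE/ceph-osd@3.service.d/cpuaffinity.conf"
# afterwards: systemctl daemon-reload && systemctl restart ceph-osd@3
```

This keeps the shipped /usr/lib unit file untouched, and each OSD instance can get a different core list by writing one drop-in per instance.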

Thanks a lot,

Best,



German

2017-11-28 6:36 GMT-03:00 Luis Periquito :


There are a few things I don't like about your machines... If you 
want latency/IOPS (as you seemingly do) you really want the highest 
frequency CPUs, even over number of cores. These are not too bad, but 
not great either.

Also you have 2x CPU meaning NUMA. Have you pinned OSDs to NUMA 
nodes? Ideally OSD is pinned to same NUMA node the NVMe device is 
connected to. Each NVMe device will be running on PCIe lanes generated 
by one of the CPUs...

What versions of TCMalloc (or jemalloc) are you running? Have you 
tuned them to have a bigger cache?

These are from what I've learned using filestore - I've yet to run 
full tests on bluestore - but they should still apply...

On Mon, Nov 27, 2017 at 5:10 PM, German Anders 
 wrote:


Hi Nick, 

yeah, we are using the same NVMe disk with an additional 
partition to use as journal/wal. We double-checked the C-states and they 
were not configured to use C1, so we changed that on all the OSD nodes 
and mon nodes, and we're going to run some new tests and see how it 
goes. I'll get back as soon as we've got those tests running.

Thanks a lot,

Best,






German

2017-11-27 12:16 GMT-03:00 Nick Fisk :


From: ceph-users 
[mailto:ceph-users-boun...@lists.ceph.com 
 ] On Behalf Of German Anders
Sent: 27 November 2017 14:44
To: Maged Mokhtar 
Cc: ceph-users 
Subject: Re: [ceph-users] ceph all-nvme mysql 
performance 
tuning

 

Hi Maged,

 

Thanks a lot for the response. We try with different 
number of threads and we're getting almost the same kind of difference 
between the storage types. Going to try with different rbd stripe size, 
object size values and see if we get more competitive numbers. Will get 
back with more tests and param changes to see if we get better :)

 

 

Just to echo a couple of comments. Ceph will always 
struggle to match the performance of a traditional array for mainly 2 
reasons.

 

1.  You are replacing some sort of dual ported SAS 
or 
internally RDMA connected device with a network for Ceph replication 
traffic. This will instantly have a large impact on write latency
2.  Ceph locks at the PG level and a PG will most 
likely cover at least one 4MB object, so lots of small accesses to the 
same blocks (on a block device) will wait on each other and go 
effectively at a single threaded rate.

 

The best thing you can do to mitigate these, is to run 
the fastest journal/WAL devices you can, fastest network connections (ie 
25Gb/s) and run your CPU’s at max C and P states.

 

You stated that you are running the performance profile 
on the CPU’s. Could you also just double check that the C-states are 
being held at C1(e)? There are a few utilities that can show this in 
realtime.

 

   

[ceph-users] Luminous 12.2.2 rpm's not signed?

2017-12-04 Thread Marc Roos
 


Total size: 51 M
Is this ok [y/d/N]: y
Downloading packages:


Package ceph-common-12.2.2-0.el7.x86_64.rpm is not signed



-Original Message-
From: Rafał Wądołowski [mailto:rwadolow...@cloudferro.com] 
Sent: maandag 4 december 2017 14:18
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Monitoring bluestore compression ratio

Hi,

Is there any command or tool to show effectiveness of bluestore 
compression?

I see the difference (in ceph osd df tree), while uploading a object to 
ceph, but maybe there are more friendly method to do it.


-- 
Regards,

Rafał Wądołowski
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Questions about pg num setting

2018-01-03 Thread Marc Roos
 

Is there a disadvantage to just always starting pg_num and pgp_num with 
something low like 8, and then increasing them later when necessary?

The question then is how to identify when that becomes necessary.
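As a rough aid, the commonly cited pgcalc heuristic (aim for about 100 PGs per OSD after replication, rounded up to a power of two) can be sketched like this; the target of 100 is an assumption, not a rule:

```shell
# pgcalc-style heuristic: ~100 PGs per OSD after replication, rounded
# up to a power of two (the target value of 100 is an assumption).
suggest_pg_num() {
    osds=$1; size=$2; target=${3:-100}
    raw=$(( osds * target / size ))
    pg=1
    while [ "$pg" -lt "$raw" ]; do pg=$(( pg * 2 )); done
    echo "$pg"
}
suggest_pg_num 10 3   # 10 OSDs, size=3 -> prints 512
```

That only answers the sizing half; the "when to grow" half is still a judgment call (e.g. watching the PGs-per-OSD count as pools fill up or OSDs are added).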





-Original Message-
From: Christian Wuerdig [mailto:christian.wuer...@gmail.com] 
Sent: dinsdag 2 januari 2018 19:40
To: 于相洋
Cc: Ceph-User
Subject: Re: [ceph-users] Questions about pg num setting

Have you had a look at http://ceph.com/pgcalc/?

Generally if you have too many PGs per OSD you can get yourself into 
trouble during recovery and backfilling operations consuming a lot more 
RAM than you have and eventually making your cluster unusable (some more 
info can be found here for example:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2016-October/013614.html
but there are other threads on the ML).
Also, currently you cannot reduce the number of PGs for a pool, so you 
are much better off starting with a lower value and then gradually 
increasing it.

The fact that the ceph developers introduced a config option which 
prevents users from increasing the number of PGs if it exceeds the 
configured limit should be a tell-tale sign that having too many PGs per 
OSD is considered a problem (see also
https://bugzilla.redhat.com/show_bug.cgi?id=1489064 and linked PRs)

On Wed, Dec 27, 2017 at 3:15 PM, 于相洋  wrote:
> Hi cephers,
>
> I have two questions about pg number setting.
>
> First :
> My storage informaiton is show as belows:
> HDD: 10 * 8TB
> CPU: Intel(R) Xeon(R) CPU E5645 @ 2.40GHz (24 cores)
> Memory: 64GB
>
> As my HDD capacity and memory are quite large, I want to set as many 
> as 300 pgs for each OSD, although 100 pgs per OSD is preferred. I want 
> to know what the disadvantage of setting too many pgs is.
>
>
> Second:
> At the beginning I cannot judge the capacity proportion of my 
> workloads, so I cannot set accurate pg numbers for the different 
> pools. How many pgs should I set for each pool at first?
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" 
> in the body of a message to majord...@vger.kernel.org More majordomo 
> info at  http://vger.kernel.org/majordomo-info.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Performance issues on Luminous

2018-01-05 Thread Marc Roos
 
 
Maybe it is because of this: the 850 EVO / 850 PRO are listed here at 1.9 MB/s and 1.5 MB/s

http://www.sebastien-han.fr/blog/2014/10/10/ceph-how-to-test-if-your-ssd-is-suitable-as-a-journal-device/
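For reference, the test behind that page is a small-block synchronous write. A minimal sketch against a scratch file; note the article tests the raw device with oflag=direct,dsync, which destroys data, so the default here is a temp file with dsync only, and the path, block size, and count are just example parameters:

```shell
# Scratch-file version of the journal suitability test (the article
# tests the raw device with oflag=direct,dsync; parameters here are
# only examples).
TESTFILE="${TESTFILE:-./dsync-test.bin}"
dd if=/dev/zero of="$TESTFILE" bs=4k count=1000 oflag=dsync 2>&1 | tail -n 1
rm -f "$TESTFILE"
```

A drive that is fine for buffered writes can still show single-digit MB/s here, which is what makes consumer SSDs poor journal devices.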




-Original Message-
From: Rafał Wądołowski [mailto:rwadolow...@cloudferro.com]
Sent: donderdag 4 januari 2018 16:56
To: c...@elchaka.de; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Performance issues on Luminous

I have size of 2.

We know about this risk and we accept it, but we still don't know why 
performance is so bad.


Cheers,

Rafał Wądołowski


On 04.01.2018 16:51, c...@elchaka.de wrote:


	I assume you have a size of 3; then divide your expected 400 by 3 
and you are not far away from what you get... 

	In addition, you should never use consumer-grade SSDs for Ceph, as 
they will reach their DWPD limit very soon...


Am 4. Januar 2018 09:54:55 MEZ schrieb "Rafał Wądołowski" 
  : 

Hi folks,

I am currently benchmarking my cluster for a performance 
issue and I have no idea what is going on. I am using these devices in 
qemu.

Ceph version 12.2.2

Infrastructure:

3 x Ceph-mon

11 x Ceph-osd

Ceph-osd has 22x1TB Samsung SSD 850 EVO 1TB

96GB RAM

2x E5-2650 v4

4x10G network (2 separate bonds for cluster and public) with 
MTU 9000


I had tested it with rados bench:

# rados bench -p rbdbench 30 write -t 1

Total time run: 30.055677
Total writes made:  1199
Write size: 4194304
Object size:4194304
Bandwidth (MB/sec): 159.571
Stddev Bandwidth:   6.83601
Max bandwidth (MB/sec): 168
Min bandwidth (MB/sec): 140
Average IOPS:   39
Stddev IOPS:1
Max IOPS:   42
Min IOPS:   35
Average Latency(s): 0.0250656
Stddev Latency(s):  0.00321545
Max latency(s): 0.0471699
Min latency(s): 0.0206325

# ceph tell osd.0 bench
{
 "bytes_written": 1073741824,
 "blocksize": 4194304,
 "bytes_per_sec": 414199397
}

Testing osd directly

# dd if=/dev/zero of=/dev/sdc bs=4M oflag=direct count=100
100+0 records in
100+0 records out
419430400 bytes (419 MB, 400 MiB) copied, 1.0066 s, 417 MB/s

When I do dd inside vm (bs=4M wih direct), I have result like 
in rados 
bench.

I think that the speed should be arround ~400MB/s.

Is there any new parameters for rbd in luminous? Maybe I 
forgot about 
some performance tricks? If more information needed feel free 
to ask.
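One note on the single-threaded run above: with -t 1, rados bench has only one write in flight, so bandwidth is bounded by per-op latency rather than device throughput, and the reported numbers are internally consistent:

```shell
# Values taken from the rados bench run above (4 MiB objects, -t 1).
awk 'BEGIN {
    obj = 4 * 1024 * 1024   # object size in bytes
    lat = 0.0250656         # average latency in seconds
    printf "%.1f MiB/s\n", obj / lat / 1048576   # matches the reported 159.571
}'
```

So to approach the ~400 MB/s raw device numbers, the benchmark needs either more outstanding ops (higher -t) or lower per-op latency.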


 

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com




___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Limitting logging to syslog server

2018-01-08 Thread Marc Roos
 

On a default Luminous test cluster I would like to limit the logging of 
what I guess are successful notifications related to deleted snapshots. 
I don't need 77k of these messages on my syslog server. 
What/where would be the best place to do this? (other than just dropping 
them at the syslog server)

Jan  8 13:11:54 c02 ceph-osd: 2018-01-08 13:11:54.788703 7f54b6b28700 -1 
osd.4 pg_epoch: 18696 pg[17.3( v 18696'1327786 
(18696'1326206,18696'1327786] local-lis/les=18634/18635 n=2629 
ec=3636/3636 lis/c 18634/18634 les/c/f 18635/18652/0 18634/18634/18519) 
[4,5,0] r=0 lpr=18634 luod=18696'1327784 lua=18696'1327784 crt=
18696'1327786 lcod 18696'1327782 mlcod 18696'1327782 
active+clean+snaptrim snaptrimq=[18~1,29~3,2f~8]] removing snap head
Jan  8 13:11:54 c02 ceph-osd: 2018-01-08 13:11:54.788703 7f54b6b28700 -1 
osd.4 pg_epoch: 18696 pg[17.3( v 18696'1327786 
(18696'1326206,18696'1327786] local-lis/les=18634/18635 n=2629 
ec=3636/3636 lis/c 18634/18634 les/c/f 18635/18652/0 18634/18634/18519) 
[4,5,0] r=0 lpr=18634 luod=18696'1327784 lua=18696'1327784 crt=
18696'1327786 lcod 18696'1327782 mlcod 18696'1327782 
active+clean+snaptrim snaptrimq=[18~1,29~3,2f~8]] removing snap head
Jan  8 13:11:54 c02 ceph-osd: 2018-01-08 13:11:54.799958 7f54b6b28700 -1 
osd.4 pg_epoch: 18696 pg[17.3( v 18696'1327790 
(18696'1326206,18696'1327790] local-lis/les=18634/18635 n=2626 
ec=3636/3636 lis/c 18634/18634 les/c/f 18635/18652/0 18634/18634/18519) 
[4,5,0] r=0 lpr=18634 luod=18696'1327788 lua=18696'1327788 crt=
18696'1327790 lcod 18696'1327786 mlcod 18696'1327786 
active+clean+snaptrim snaptrimq=[18~1,29~3,2f~8]] removing snap head
Jan  8 13:11:54 c02 ceph-osd: 2018-01-08 13:11:54.799958 7f54b6b28700 -1 
osd.4 pg_epoch: 18696 pg[17.3( v 18696'1327790 (18696'1326206,18696'
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] MDS cache size limits

2018-01-08 Thread Marc Roos
 
I guess the MDS cache holds files, attributes etc., but how many files 
will the default "mds_cache_memory_limit": "1073741824" hold?
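A back-of-envelope answer, assuming the community ballpark of roughly 2-3 KB of MDS memory per cached inode/dentry (this figure is an assumption, not anything official):

```shell
# Ballpark only: ~2-3 KB of MDS memory per cached inode/dentry is a
# community estimate, not an official figure.
awk 'BEGIN {
    limit = 1073741824   # default mds_cache_memory_limit, 1 GiB
    for (b = 2048; b <= 3072; b += 1024)
        printf "%d bytes/inode -> ~%d cached inodes\n", b, limit / b
}'
```

So the 1 GiB default would hold somewhere in the region of 350k-525k inodes, before accounting for the over-usage tracked in issue 21402.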



-Original Message-
From: Stefan Kooman [mailto:ste...@bit.nl] 
Sent: vrijdag 5 januari 2018 12:54
To: Patrick Donnelly
Cc: Ceph Users
Subject: Re: [ceph-users] MDS cache size limits

Quoting Patrick Donnelly (pdonn...@redhat.com):
> 
> It's expected but not desired: http://tracker.ceph.com/issues/21402
> 
> The memory usage tracking is off by a constant factor. I'd suggest 
> just lowering the limit so it's about where it should be for your 
> system.

Thanks for the info. Yeah, we did exactly that (observe and adjust 
setting accordingly). Is this something worth mentioning in the 
documentation? Escpecially when this "factor" is a constant? Over time 
(with issue 21402 being worked on) things will change. Ceph operators 
will want to make use of as much cache as possible without 
overcommitting (MDS won't notice until there is no more memory left, 
restart, and looses all its cache :/).

Gr. Stefan

-- 
| BIT BV  http://www.bit.nl/Kamer van Koophandel 09090351
| GPG: 0xD14839C6   +31 318 648 688 / i...@bit.nl
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] nfs-ganesha rpm build script has not been adapted for this -

2018-01-09 Thread Marc Roos
 
The build script has not been adapted for this - note the missing 
release number at the end:
http://download.ceph.com/nfs-ganesha/rpm-V2.5-stable/luminous/x86_64/

 
nfs-ganesha-rgw-2.5.4-.el7.x86_64.rpm  
 ^





-Original Message-
From: Marc Roos 
Sent: dinsdag 29 augustus 2017 12:10
To: amare...@redhat.com; Marc Roos; wooer...@gmail.com
Cc: ceph-us...@ceph.com
Subject: RE: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

 
nfs-ganesha-2.5.2-.el7.x86_64.rpm 
 ^
Is this correct?

-Original Message-
From: Marc Roos
Sent: dinsdag 29 augustus 2017 11:40
To: amaredia; wooertim
Cc: ceph-users
Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

 
Ali, Very very nice! I was creating the rpm's based on a old rpm source 
spec. And it was a hastle to get them to build, and I am not sure if I 
even used to correct compile settings.



-Original Message-
From: Ali Maredia [mailto:amare...@redhat.com]
Sent: maandag 28 augustus 2017 22:29
To: TYLin
Cc: Marc Roos; ceph-us...@ceph.com
Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7

Marc,

These rpms (and debs) are built with the latest ganesha 2.5 stable 
release and the latest luminous release on download.ceph.com:

http://download.ceph.com/nfs-ganesha/

I just put them up late last week, and I will be maintaining them in the 
future.

-Ali

- Original Message -
> From: "TYLin" 
> To: "Marc Roos" 
> Cc: ceph-us...@ceph.com
> Sent: Sunday, August 20, 2017 11:58:05 PM
> Subject: Re: [ceph-users] Cephfs fsal + nfs-ganesha + el7/centos7
> 
> You can get rpm from here
> 
> https://download.gluster.org/pub/gluster/glusterfs/nfs-ganesha/old/2.3.0/CentOS/nfs-ganesha.repo
> 
> You have to fix the path mismatch error in the repo file manually.
> 
> > On Aug 20, 2017, at 5:38 AM, Marc Roos 
wrote:
> > 
> > 
> > 
> > Where can you get the nfs-ganesha-ceph rpm? Is there a repository 
> > that has these?
> > 
> > 
> > 
> > 
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] jemalloc on centos7

2018-01-13 Thread Marc Roos

I was thinking of enabling jemalloc. Is there a recommended procedure for 
a default CentOS 7 cluster?
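Not an official procedure, but one approach that has been sketched on this list: install the jemalloc package and preload it for the ceph daemons via a systemd drop-in. The package name and library path are assumptions for CentOS 7, and the target directory is made overridable here so the example can be tried anywhere (on a real host it would be /etc/systemd/system):

```shell
# Sketch only: package name, library path, and drop-in approach are
# assumptions for CentOS 7; verify before using on a real cluster.
#   yum install -y jemalloc
BASE="${BASE:-./sysd-jemalloc}"   # use /etc/systemd/system on a real host
mkdir -p "$BASE/ceph-osd@.service.d"
cat > "$BASE/ceph-osd@.service.d/jemalloc.conf" <<'EOF'
[Service]
Environment=LD_PRELOAD=/usr/lib64/libjemalloc.so.1
EOF
cat "$BASE/ceph-osd@.service.d/jemalloc.conf"
# afterwards: systemctl daemon-reload && systemctl restart ceph-osd.target
```

A drop-in avoids editing the shipped unit file, so the change survives package upgrades; whether jemalloc actually helps your workload is worth benchmarking first.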



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Switching a pool from EC to replicated online ?

2018-01-13 Thread Marc Roos


I regularly read the opposite here, and was thinking of switching to EC. Are 
you sure about what is causing your poor results?
http://ceph.com/community/new-luminous-erasure-coding-rbd-cephfs/
http://ceph.com/geen-categorie/ceph-pool-migration/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Adding a host node back to ceph cluster

2018-01-15 Thread Marc Roos
 
Maybe for the future: 

rpm {-V|--verify} [select-options] [verify-options]

   Verifying a package compares information about the installed 
files in the package with information about the files taken
   from the package metadata stored in the rpm database. Among 
other things, verifying compares the size, digest, permissions,
   type, owner and group of each file. Any discrepancies are 
displayed. Files that were not installed from the package, for
   example, documentation files excluded on installation 
using the "--excludedocs" option, will be silently ignored.





-Original Message-
From: Geoffrey Rhodes [mailto:geoff...@rhodes.org.za] 
Sent: maandag 15 januari 2018 16:39
To: ceph-users@lists.ceph.com
Subject: [ceph-users] Adding a host node back to ceph cluster

Good day,

I'm having an issue re-deploying a host back into my production ceph 
cluster.
Due to some bad memory (picked up by a scrub) which has been replaced I 
felt the need to re-install the host to be sure no host files were 
damaged.

Prior to decommissioning the host I set the crush weights on each osd 
to 0.
Once the osds had flushed all data I stopped the daemons.
I then purged the osds from the crush map with "ceph osd purge", 
followed by "ceph osd crush rm {host}" to remove the host bucket from 
the crush map.
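For reference, the drain-and-remove sequence described above can be sketched as shell (the osd ids and host name are placeholders, not taken from this thread):

```shell
# Drain: move data off the OSDs of the host being decommissioned.
ceph osd crush reweight osd.10 0
ceph osd crush reweight osd.11 0

# Wait until "ceph -s" shows all PGs active+clean again, then stop
# the daemons and remove the OSDs and the now-empty host bucket.
systemctl stop ceph-osd@10 ceph-osd@11
ceph osd purge 10 --yes-i-really-mean-it
ceph osd purge 11 --yes-i-really-mean-it
ceph osd crush rm myhost
```

On Luminous, "ceph osd purge" combines the older "crush remove", "auth del" and "osd rm" steps into one command.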

I also ran "ceph-deploy purge {host}" & "ceph-deploy purgedata {host}" 
from the management node.
I then reinstalled the host and made the necessary config changes 
followed by the appropriate ceph-deploy commands (ceph-deploy 
install..., ceph-deploy admin..., ceph-deploy osd create...) to bring 
the host & it's osd's back into the cluster, - same as I would when 
adding a new host node to the cluster.

Running ceph osd df tree shows the osd's however the host node is not 
displayed.
Inspecting the crush map I see no host bucket has been created or any 
host's osd's listed.
The osd's also did not start which explains the weight being 0 but I 
presume the osd's not starting isn't the only issue since the crush map 
lacks the newly installed host detail.

Could anybody maybe tell me where I've gone wrong?
I'm also assuming there shouldn't be an issue using the same host name 
again. Or do I need to manually add the host bucket and osd detail back 
into the crush map, or should ceph-deploy take care of that?
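If the host bucket does have to be recreated by hand, a sketch (the host name, root and weight are placeholders):

```shell
# Recreate the host bucket and hook it under the default root.
ceph osd crush add-bucket myhost host
ceph osd crush move myhost root=default

# Place each of the host's OSDs under the bucket with its weight.
ceph osd crush set osd.10 0.545 host=myhost
```

OSDs normally place themselves on start when "osd crush update on start" is true (the default), so it may also be worth checking that this setting has not been disabled in ceph.conf.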

Thanks

OS: Ubuntu 16.04.3 LTS
Ceph version: 12.2.1 / 12.2.2 - Luminous


Kind regards
Geoffrey Rhodes



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Ceph Future

2018-01-16 Thread Marc Roos


Hmmm, I have to disagree with

'too many services'
What do you mean, there is a process for each osd, mon, mgr and mds. 
There are fewer processes running than on a default Windows fileserver. 
What is the complaint here?

'manage everything by your command-line'
What is so bad about this? Even Microsoft sees the advantages and 
introduced PowerShell etc. I would recommend hiring a ceph admin, then 
you don't even need to use the web interface. You will have voice 
control on ceph, how cool is that! ;)
(actually maybe we can do feature request to integrate apple siri (not 
forgetting of course google/amazon talk?))

'iscsi'
Afaik this is not even a default install with ceph or a ceph package. I 
am also not complaining to ceph that my Nespresso machine does not have 
triple redundancy.

'check hardware below the hood'
Why waste development on this when there are already enough solutions 
out there? As if it is even possible to make a one size fits all 
solution.

Afaiac I think the ceph team has done a great job. I was pleasantly 
surprised by how easy it is to install, just by installing the rpms 
(not using ceph-deploy). Next to this, I think it is good to have some 
sort of 'threshold' to keep the wordpress admins at a distance. Ceph 
solutions are holding TB/PB of other people's data, and we don't want 
rookies destroying that, nor blaming ceph, for that matter.




-Original Message-
From: Alex Gorbachev [mailto:a...@iss-integration.com] 
Sent: dinsdag 16 januari 2018 6:18
To: Massimiliano Cuttini
Cc: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] Ceph Future

Hi Massimiliano,


On Thu, Jan 11, 2018 at 6:15 AM, Massimiliano Cuttini 
 wrote:
> Hi everybody,
>
> I'm always looking at CEPH for the future.
> But I do see several issues that are left unresolved and block 
> near-future adoption.
> I would like to know if there are some answers already:
>
> 1) Separation between Client and Server distribution.
> At this time you always have to update client & server in order to 
> match the same distribution of Ceph.
> This is ok in the early releases but in future I do expect that the 
> ceph-client is ONE, not many for every major version.
> The client should be able to determine by itself what version of the 
> protocol and which features can be enabled, and connect to at least 3 
> or 5 older major versions of Ceph.
>
> 2) Kernel is old -> feature mismatch
> Ok, kernel is old, and so? Just do not use it and turn to NBD.
> And please don't let me even know, just virtualize under the hood.
>
> 3) Management complexity
> Ceph is amazing, but is just too big to have everything under control 
> (too many services).
> Now there is a management console, but as far as I have read, this 
> management console just shows basic data about performance.
> So it doesn't manage at all... it's just a monitor...
>
> In the end You have just to manage everything by your command-line.
> In order to manage by web it's mandatory:
>
> create, delete, enable, disable services
> If I need to run a redundant iSCSI gateway, do I really need to 
> cut&paste commands from your online docs?
> Of course not. You can script it better than any admin can.
> Just give a few arguments on the html forms and that's all.
>
> create, delete, enable, disable users
> I have to create users and keys for 24 servers. Do you really think 
> it's possible to do that without some bad transcription or bad 
> cut&paste of the keys across all servers?
> Everybody ends up just copying the admin keys across all servers, 
> giving very insecure full permissions to all clients.
>
> create MAPS  (server, datacenter, rack, node, osd).
> This is mandatory to design how the data needs to be replicated.
> It's not good to create this by script or shell; a graph editor is 
> needed which can give you the perspective of what will be copied where.
>
> check hardware below the hood
> It's missing the checking of the health of the hardware below.
> But Ceph was born as storage software that ensures redundancy and 
> protects you from single failures.
> So WHY just ignore checking the health of disks with SMART?
> FreeNAS just does a better job on this, giving lots of tools to 
> understand which disk is which and whether it will fail in the near 
> future.
> Of course Ceph too could really forecast issues by itself, and it 
> needs to start to integrate with basic hardware I/O.
> For example, it should be possible to enable/disable the UID LED on 
> the disks in order to know which one needs to be replaced.

As a technical note, we ran into this need with Storcium, and it is 
pretty easy to utilize UID indicators using both Areca and LSI/Avago 
HBAs.  You will need the standard control tools available from their web 
sites, as well as hardware that supports SGPIO (most enterprise JBODs 
and drives do).  There's likely similar options to other HBAs.

Areca:

UID on:

cli64 curctrl=1 set password=
cli64 curctrl= disk identify drv=

UID OFF:

cli64 curctrl=1 set password=
c

[ceph-users] Hiding stripped objects from view

2018-01-17 Thread Marc Roos
 

Is there a way to hide the striped objects from view? Sort of like with 
the rbd type pool

[@c01 mnt]# rados ls -p ec21 | head
test2G.img.0023
test2G.img.011c
test2G.img.0028
test2G.img.0163
test2G.img.01e7
test2G.img.008d
test2G.img.0129
test2G.img.0150
test2G.img.010e
test2G.img.014b
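There is no server-side filter for this in `rados ls`; a client-side sketch (the hex stripe-suffix pattern is an assumption about the naming scheme) collapses the per-stripe suffixes down to base names:

```shell
# Strip a trailing ".<hex suffix>" from each object name and deduplicate;
# the rados output is simulated here with printf.
printf 'test2G.img.0023\ntest2G.img.011c\ntest2G.img.0150\n' \
  | sed 's/\.[0-9a-f]\{1,16\}$//' | sort -u
# prints: test2G.img
```

Against a real pool the same pipeline would be `rados ls -p ec21 | sed 's/\.[0-9a-f]\{1,16\}$//' | sort -u`.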
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] also having a slow monitor join quorum

2018-01-18 Thread Marc Roos
 

I have seen messages pass by here about monitors taking a while to 
join. I had the monitor disk run out of space; the monitor was killed 
and is now restarting. I can't do a ceph -s and have to wait for this 
monitor to join as well. 



2018-01-18 21:34:05.787749 7f5187a40700  0 -- 192.168.10.111:0/12930 >> 
192.168.10.112:6810/2033 conn(0x558120ab1800 :-1 
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
l=0).handle_connect_reply connect got BADAUTHORIZER
2018-01-18 21:34:20.788612 7f5187a40700  0 -- 192.168.10.111:0/12930 >> 
192.168.10.112:6810/2033 conn(0x558120ab1800 :-1 
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
l=0).handle_connect_reply connect got BADAUTHORIZER
2018-01-18 21:34:20.788739 7f5187a40700  0 -- 192.168.10.111:0/12930 >> 
192.168.10.112:6810/2033 conn(0x558120ab1800 :-1 
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
l=0).handle_connect_reply connect got BADAUTHORIZER
2018-01-18 21:34:35.789475 7f5187a40700  0 -- 192.168.10.111:0/12930 >> 
192.168.10.112:6810/2033 conn(0x558120ab1800 :-1 
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
l=0).handle_connect_reply connect got BADAUTHORIZER
2018-01-18 21:34:35.789608 7f5187a40700  0 -- 192.168.10.111:0/12930 >> 
192.168.10.112:6810/2033 conn(0x558120ab1800 :-1 
s=STATE_CONNECTING_WAIT_CONNECT_REPLY_AUTH pgs=0 cs=0 
l=0).handle_connect_reply connect got BADAUTHORIZER
2018-01-18 21:34:40.333203 7f518d24b700  0 
mon.a@0(synchronizing).data_health(0) update_stats avail 47% total 5990 
MB, used 3124 MB, avail 2865 MB


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] also having a slow monitor join quorum

2018-01-18 Thread Marc Roos
 
Took around 30min for the monitor join and I could execute ceph -s



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Fwd: Ceph team involvement in Rook (Deploying Ceph in Kubernetes)

2018-01-19 Thread Marc Roos
 
Are the guys of Apache Mesos agreeing to this? I have been looking at 
Mesos and DC/OS and still have to make up my mind which way to go. I 
like that Mesos has the unified containerizer that runs the docker 
images so I don't need to run dockerd, and how they adapted to the CNI 
standard. How is this going with Kubernetes? Maybe a link to why I 
should use Kubernetes?



-Original Message-
From: Kai Wagner [mailto:kwag...@suse.com] 
Sent: vrijdag 19 januari 2018 11:55
To: Ceph Users
Subject: [ceph-users] Fwd: Ceph team involvement in Rook (Deploying Ceph 
in Kubernetes)

Just for those of you who are not subscribed to ceph-users.



 Forwarded Message  
Subject:Ceph team involvement in Rook (Deploying Ceph in 
Kubernetes)  
Date:   Fri, 19 Jan 2018 11:49:05 +0100  
From:   Sebastien Han  
 
To: ceph-users  
 , Squid Cybernetic 
 
 , Dan Mick  
 , Chen, Huamin  
 , John Spray  
 , Sage Weil  
 , bas...@tabbara.com   


Everyone,

Kubernetes is getting bigger and bigger. It has become the platform of 
choice to run microservices applications in containers, just like 
OpenStack did for and Cloud applications in virtual machines.

When it comes to container storage there are three key aspects:

* Providing persistent storage to containers, Ceph has drivers in 
Kuberntes already with kRBD and CephFS
* Containerizing the storage itself, so efficiently running Ceph 
services in Containers. Currently, we have ceph-container
(https://github.com/ceph/ceph-container)
* Deploying the containerized storage in Kubernetes, we wrote ceph-helm 
charts (https://github.com/ceph/ceph-helm)

The third piece although it's working great has a particular goal and 
doesn't aim to run Ceph just like any other applications in Kuberntes.
We were also looking for a better abstraction/ease of use for end-users, 
multi-cluster support, operability, life-cycle management, centralized 
operations, to learn more you can read 
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2017-October/021918.html.
As a consequence, we decided to look at what the ecosystem had to offer. 
As a result, Rook came out, as a pleasant surprise. For those who are 
not familiar with Rook, please visit https://rook.io but in a nutshell, 
Rook is an open source orchestrator for distributed storage systems 
running in cloud-native environments. Under the hood, Rook is deploying, 
operating and managing Ceph life cycle in Kubernetes. Rook has a vibrant 
community and committed developers.

Even if Rook is not perfect (yet), it has firm foundations, and we are 
planning on helping to make it better. We already opened issues for that 
and started doing work with Rook's core developers. We are looking at 
reconciling what is available today (rook/ceph-container/helm), reduce 
the overlap/duplication and all work together toward a single and common 
goal. With this collaboration, through Rook, we hope to make Ceph the de 
facto Open Source storage solution for Kubernetes.

These are exciting times, so if you're a user, a developer, or merely 
curious, have a look at Rook and send us feedback!

Thanks!
--
Cheers

––
Sébastien Han
Principal Software Engineer, Storage Architect

"Always give 100%. Unless you're giving blood."

Mail: s...@redhat.com
Address: 11 bis, rue Roquépine - 75008 Paris
--
To unsubscribe from this list: send the line "unsubscribe ceph-devel" in 
the body of a message to majord...@vger.kernel.org More majordomo info 
at  http://vger.kernel.org/majordomo-info.html


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] iSCSI over RBD

2018-01-20 Thread Marc Roos
 

Sorry for asking what is maybe obvious, but is this the kernel 
available in elrepo? Or a different one?





-Original Message-
From: Mike Christie [mailto:mchri...@redhat.com] 
Sent: zaterdag 20 januari 2018 1:19
To: Steven Vacaroaia; Joshua Chen
Cc: ceph-users
Subject: Re: [ceph-users] iSCSI over RBD

On 01/19/2018 02:12 PM, Steven Vacaroaia wrote:
> Hi Joshua,
> 
> I was under the impression that kernel  3.10.0-693 will work with 
iscsi
>  

That kernel works with RHCS 2.5 and below. You need the rpms from that
or the matching upstream releases. Besides trying to dig out the
versions and matching things up, the problem with those releases is that
they were tech preview or only support Linux initiators.

It looks like you are using the newer upstream tools or RHCS 3.0 tools.
For them you need the RHEL 7.5 beta or newer kernel or an upstream one.
For upstream all the patches got merged into the target layer
maintainer's tree yesterday. A new tcmu-runner release has been made.
And I just pushed a test kernel with all the patches based on 4.13 (4.14
had a bug in the login code which is being fixed still) to github, so
people do not have to wait for the next-next kernel release to come out.

Just give us a couple days for the kernel build to be done, to make the
needed ceph-iscsi-* release (current version will fail to create rbd
images with the current tcmu-runner release) and get the documentation
updated because some links are incorrect and some version info needs to
be updated.


> Unfortunately  I still cannot create a disk because qfull_time_out is
> not supported 
> 
> What am I missing / doing wrong?
> 
> 2018-01-19 15:06:45,216 INFO [lun.py:601:add_dev_to_lio()] -
> (LUN.add_dev_to_lio) Adding image 'rbd.disk2' to LIO
> 2018-01-19 15:06:45,295ERROR [lun.py:634:add_dev_to_lio()] - Could
> not set LIO device attribute cmd_time_out/qfull_time_out for device:
> rbd.disk2. Kernel not supported. - error(Cannot find attribute:
> qfull_time_out)
> 2018-01-19 15:06:45,300ERROR [rbd-target-api:731:_disk()] - LUN
> alloc problem - Could not set LIO device attribute
> cmd_time_out/qfull_time_out for device: rbd.disk2. Kernel not 
supported.
> - error(Cannot find attribute: qfull_time_out)
> 
> 
> Many thanks
> 
> Steven
> 
> On 4 January 2018 at 22:40, Joshua Chen  > wrote:
> 
> Hello Steven,
>   I am using CentOS 7.4.1708 with kernel 3.10.0-693.el7.x86_64
>   and the following packages:
> 
> ceph-iscsi-cli-2.5-9.el7.centos.noarch.rpm
> ceph-iscsi-config-2.3-12.el7.centos.noarch.rpm
> libtcmu-1.3.0-0.4.el7.centos.x86_64.rpm
> libtcmu-devel-1.3.0-0.4.el7.centos.x86_64.rpm
> python-rtslib-2.1.fb64-2.el7.centos.noarch.rpm
> python-rtslib-doc-2.1.fb64-2.el7.centos.noarch.rpm
> targetcli-2.1.fb47-0.1.20170815.git5bf3517.el7.centos.noarch.rpm
> tcmu-runner-1.3.0-0.4.el7.centos.x86_64.rpm
> tcmu-runner-debuginfo-1.3.0-0.4.el7.centos.x86_64.rpm
> 
> 
> Cheers
> Joshua
> 
> 
> On Fri, Jan 5, 2018 at 2:14 AM, Steven Vacaroaia  > wrote:
> 
> Hi Joshua,
> 
> How did you manage to use iSCSI gateway ?
> I would like to do that but still waiting for a patched kernel 

> 
> What kernel/OS did you use and/or how did you patch it ?
> 
> Tahnsk
> Steven
> 
> On 4 January 2018 at 04:50, Joshua Chen
> mailto:csc...@asiaa.sinica.edu.tw>>
> wrote:
> 
> Dear all,
>   Although I managed to run gwcli and created some iqns, 
or
> luns,
> but I do need some working config example so that my
> initiator could connect and get the lun.
> 
>   I am familiar with targetcli and I used to do the
> following ACL style connection rather than password, 
> the targetcli setting tree is here:
> 
> (or see this page
> )
> 
> #targetcli ls
> o- / ................................................................ [...]
>   o- backstores ..................................................... [...]
>   | o- block ........................................ [Storage Objects: 1]
>   | | o- vmware_5t .. [/dev/rbd/rbd/vmware_5t (5.0TiB) write-thru activated]
>   | |   o- alua ......................................... [ALUA Groups: 1]
>   | |     o- default_tg_pt_gp
[ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-20 Thread Marc Roos
 
If I test my connections with sockperf via a 1Gbit switch I get around 
25 usec; when I test the 10Gbit connection via the switch I get around 
12 usec. Is that normal, or should there be a difference of 10x? 

sockperf ping-pong 

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.100 sec; SentMessages=432875; 
ReceivedMessages=432874
sockperf: = Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=10.000 sec; SentMessages=428640; 
ReceivedMessages=428640
sockperf: > avg-lat= 11.609 (std-dev=1.684)
sockperf: # dropped messages = 0; # duplicated messages = 0; # 
out-of-order messages = 0
sockperf: Summary: Latency is 11.609 usec
sockperf: Total 428640 observations; each percentile contains 4286.40 
observations
sockperf: ---> <MAX> observation =  856.944
sockperf: ---> percentile  99.99 =   39.789
sockperf: ---> percentile  99.90 =   20.550
sockperf: ---> percentile  99.50 =   17.094
sockperf: ---> percentile  99.00 =   15.578
sockperf: ---> percentile  95.00 =   12.838
sockperf: ---> percentile  90.00 =   12.299
sockperf: ---> percentile  75.00 =   11.844
sockperf: ---> percentile  50.00 =   11.409
sockperf: ---> percentile  25.00 =   11.124
sockperf: ---> <MIN> observation =    8.888

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.100 sec; SentMessages=22065; 
ReceivedMessages=22064
sockperf: = Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=1.000 sec; SentMessages=20056; 
ReceivedMessages=20056
sockperf: > avg-lat= 24.861 (std-dev=1.774)
sockperf: # dropped messages = 0; # duplicated messages = 0; # 
out-of-order messages = 0
sockperf: Summary: Latency is 24.861 usec
sockperf: Total 20056 observations; each percentile contains 200.56 
observations
sockperf: ---> <MAX> observation =   77.158
sockperf: ---> percentile  99.99 =   54.285
sockperf: ---> percentile  99.90 =   37.864
sockperf: ---> percentile  99.50 =   34.406
sockperf: ---> percentile  99.00 =   33.337
sockperf: ---> percentile  95.00 =   27.497
sockperf: ---> percentile  90.00 =   26.072
sockperf: ---> percentile  75.00 =   24.618
sockperf: ---> percentile  50.00 =   24.443
sockperf: ---> percentile  25.00 =   24.361
sockperf: ---> <MIN> observation =   16.746
[root@c01 sbin]# sockperf ping-pong -i 192.168.0.12 -p 5001 -t 10
sockperf: == version #2.6 ==
sockperf[CLIENT] send on:sockperf: using recvfrom() to block on 
socket(s)








___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] What is the should be the expected latency of 10Gbit network connections

2018-01-22 Thread Marc Roos

ping -c 10 -f          # flood ping, minimal interval
ping -M do -s 8972     # 9000-byte MTU test (8972 + 28 header), no fragmentation
 
10Gb ConnectX-3 Pro, DAC + Vlan
rtt min/avg/max/mdev = 0.010/0.013/0.200/0.003 ms, ipg/ewma 0.025/0.014 
ms

8980 bytes from 10.0.0.11: icmp_seq=3 ttl=64 time=0.144 ms
8980 bytes from 10.0.0.11: icmp_seq=4 ttl=64 time=0.205 ms
8980 bytes from 10.0.0.11: icmp_seq=5 ttl=64 time=0.248 ms
8980 bytes from 10.0.0.11: icmp_seq=6 ttl=64 time=0.281 ms
8980 bytes from 10.0.0.11: icmp_seq=7 ttl=64 time=0.187 ms
8980 bytes from 10.0.0.11: icmp_seq=8 ttl=64 time=0.121 ms

I350 Gigabit + bond
rtt min/avg/max/mdev = 0.027/0.038/0.211/0.006 ms, ipg/ewma 0.050/0.041 
ms

8980 bytes from 192.168.0.11: icmp_seq=1 ttl=64 time=0.555 ms
8980 bytes from 192.168.0.11: icmp_seq=2 ttl=64 time=0.508 ms
8980 bytes from 192.168.0.11: icmp_seq=3 ttl=64 time=0.514 ms
8980 bytes from 192.168.0.11: icmp_seq=4 ttl=64 time=0.555 ms



-Original Message-
From: Nick Fisk [mailto:n...@fisk.me.uk] 
Sent: maandag 22 januari 2018 12:38
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] What is the should be the expected latency of 
10Gbit network connections

Anyone with 25G ethernet willing to do the test? Would love to see what 
the latency figures are for that.

 

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of 
Maged Mokhtar
Sent: 22 January 2018 11:28
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] What is the should be the expected latency of 
10Gbit network connections

 

On 2018-01-22 08:39, Wido den Hollander wrote:



On 01/20/2018 02:02 PM, Marc Roos wrote: 

  If I test my connections with sockperf via a 1Gbit switch I 
get around
25usec, when I test the 10Gbit connection via the switch I 
have around
12usec is that normal? Or should there be a differnce of 10x.


No, that's normal.

Tests with 8k ping packets over different links I did:

1GbE:  0.800ms
10GbE: 0.200ms
40GbE: 0.150ms

Wido




sockperf ping-pong

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=10.100 sec; SentMessages=432875;
ReceivedMessages=432874
sockperf: = Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=10.000 sec; 
SentMessages=428640;
ReceivedMessages=428640
sockperf: > avg-lat= 11.609 (std-dev=1.684)
sockperf: # dropped messages = 0; # duplicated messages = 0; #
out-of-order messages = 0
sockperf: Summary: Latency is 11.609 usec
sockperf: Total 428640 observations; each percentile contains 
4286.40
observations
sockperf: ---> <MAX> observation =  856.944
sockperf: ---> percentile  99.99 =   39.789
sockperf: ---> percentile  99.90 =   20.550
sockperf: ---> percentile  99.50 =   17.094
sockperf: ---> percentile  99.00 =   15.578
sockperf: ---> percentile  95.00 =   12.838
sockperf: ---> percentile  90.00 =   12.299
sockperf: ---> percentile  75.00 =   11.844
sockperf: ---> percentile  50.00 =   11.409
sockperf: ---> percentile  25.00 =   11.124
sockperf: ---> <MIN> observation =    8.888

sockperf: Warmup stage (sending a few dummy messages)...
sockperf: Starting test...
sockperf: Test end (interrupted by timer)
sockperf: Test ended
sockperf: [Total Run] RunTime=1.100 sec; SentMessages=22065;
ReceivedMessages=22064
sockperf: = Printing statistics for Server No: 0
sockperf: [Valid Duration] RunTime=1.000 sec; 
SentMessages=20056;
ReceivedMessages=20056
sockperf: > avg-lat= 24.861 (std-dev=1.774)
sockperf: # dropped messages = 0; # duplicated messages = 0; #
out-of-order messages = 0
sockperf: Summary: Latency is 24.861 usec
sockperf: Total 20056 observations; each percentile contains 
200.56
observations
sockperf: ---> <MAX> observation =   77.158
sockperf: ---> percentile  99.99 =   54.285
sockperf: ---> percentile  99.90 =   37.864
sockperf: ---> percentile  99.50 =   34.406
sockperf: ---> percentile  99.00 =   33.337
sockperf: ---> percentile  95.00 =   27.497
sockpe

Re: [ceph-users] OSD servers swapping despite having free memory capacity

2018-01-23 Thread Marc Roos
 

Maybe first check what is using the swap?
swap-use.sh | sort -k 5,5 -n


#!/bin/bash
# Sum per-process swap usage from /proc/<pid>/smaps (values are in kB).

SUM=0
OVERALL=0

for DIR in `find /proc/ -maxdepth 1 -type d | egrep "^/proc/[0-9]"`
  do
  PID=`echo $DIR | cut -d / -f 3`
  PROGNAME=`ps -p $PID -o comm --no-headers 2>/dev/null`

  # Each "Swap:" line in smaps is one mapping's swap usage in kB.
  for SWAP in `grep Swap $DIR/smaps 2>/dev/null | awk '{ print $2 }'`
  do
    let SUM=$SUM+$SWAP
  done
  echo "PID=$PID - Swap used: $SUM kB - ($PROGNAME)"
  let OVERALL=$OVERALL+$SUM
  SUM=0
done
echo "Overall swap used: $OVERALL kB"
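A cheaper alternative to walking every smaps mapping (a sketch; it relies on the VmSwap field that Linux exposes in /proc/<pid>/status since kernel 2.6.34):

```shell
# Per-process swap from VmSwap (kB), sorted numerically, biggest last.
grep VmSwap /proc/[0-9]*/status 2>/dev/null | sort -t: -k3 -n | tail
```

Reading one status line per process is much faster than summing the per-mapping Swap lines in smaps, which matters on hosts with many OSD threads.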





-Original Message-
From: Lincoln Bryant [mailto:linco...@uchicago.edu] 
Sent: dinsdag 23 januari 2018 21:13
To: Samuel Taylor Liston; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] OSD servers swapping despite having free 
memory capacity

Hi Sam,

What happens if you just disable swap altogether? i.e., with `swapoff 
-a`

--Lincoln

On Tue, 2018-01-23 at 19:54 +, Samuel Taylor Liston wrote:
> We have a 9 - node (16 - 8TB OSDs per node) running jewel on centos 
> 7.4.  The OSDs are configured with encryption.  The cluster is 
> accessed via two - RGWs  and there are 3 - mon servers.  The data pool 

> is using 6+3 erasure coding.
> 
> About 2 weeks ago I found two of the nine servers wedged and had to 
> hard power cycle them to get them back.  In this hard reboot 22 - OSDs 

> came back with either a corrupted encryption or data partitions.  
> These OSDs were removed and recreated, and the resultant rebalance 
> moved along just fine for about a week.  At the end of that week two 
> different nodes were unresponsive complaining of page allocation 
> failures.  This is when I realized the nodes were heavy into swap.  
> These nodes were configured with 64GB of RAM as a cost saving going 
> against the 1GB per 1TB recommendation.  We have since then doubled 
> the RAM in each of the nodes giving each of them more than the 1GB per 

> 1TB ratio.
> 
> The issue I am running into is that these nodes are still swapping; a 
> lot, and over time becoming unresponsive, or throwing page allocation 
> failures.  As an example, “free” will show 15GB of RAM usage (out of
> 128GB) and 32GB of swap.  I have configured swappiness to 0 and and 
> also turned up the vm.min_free_kbytes to 4GB to try to keep the kernel 

> happy, and yet I am still filling up swap.  It only occurs when the 
> OSDs have mounted partitions and ceph-osd daemons active.
> 
> Anyone have an idea where this swap usage might be coming from? Thanks 

> for any insight,
> 
> Sam Liston (sam.lis...@utah.edu)
> 
> Center for High Performance Computing
> 155 S. 1452 E. Rm 405
> Salt Lake City, Utah 84112 (801)232-6932 
> 
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Luminous - bad performance

2018-01-24 Thread Marc Roos
 

ceph osd pool application enable XXX rbd

-Original Message-
From: Steven Vacaroaia [mailto:ste...@gmail.com] 
Sent: woensdag 24 januari 2018 19:47
To: David Turner
Cc: ceph-users
Subject: Re: [ceph-users] Luminous - bad performance

Hi ,

I have bonded the public NICs and added 2 more monitors (running on 2 
of the 3 OSD hosts). This seems to improve things but I still have high 
latency. Also the performance of the SSD pool is worse than HDD, which 
is very confusing. 

The SSD pool is using one Toshiba PX05SMB040Y per server (for a total 
of 3 OSDs) while the HDD pool is using 2 Seagate ST600MM0006 disks per 
server (for a total of 6 OSDs).

Note
I have also disabled  C state in the BIOS and added  
"intel_pstate=disable intel_idle.max_cstate=0 processor.max_cstate=0 
idle=poll" to GRUB

Any hints/suggestions will be greatly appreciated 

[root@osd04 ~]# ceph status
  cluster:
id: 37161a51-a159-4895-a7fd-3b0d857f1b66
health: HEALTH_WARN
noscrub,nodeep-scrub flag(s) set
application not enabled on 2 pool(s)
mon osd02 is low on available space

  services:
mon: 3 daemons, quorum osd01,osd02,mon01
mgr: mon01(active)
osd: 9 osds: 9 up, 9 in
 flags noscrub,nodeep-scrub
tcmu-runner: 6 daemons active

  data:
pools:   2 pools, 228 pgs
objects: 50384 objects, 196 GB
usage:   402 GB used, 3504 GB / 3906 GB avail
pgs: 228 active+clean

  io:
client:   46061 kB/s rd, 852 B/s wr, 15 op/s rd, 0 op/s wr

[root@osd04 ~]# ceph osd tree
ID  CLASS WEIGHT  TYPE NAME  STATUS REWEIGHT PRI-AFF
 -9   4.5 root ssds
-10   1.5 host osd01-ssd
  6   hdd 1.5 osd.6  up  1.0 1.0
-11   1.5 host osd02-ssd
  7   hdd 1.5 osd.7  up  1.0 1.0
-12   1.5 host osd04-ssd
  8   hdd 1.5 osd.8  up  1.0 1.0
 -1   2.72574 root default
 -3   1.09058 host osd01
  0   hdd 0.54529 osd.0  up  1.0 1.0
  4   hdd 0.54529 osd.4  up  1.0 1.0
 -5   1.09058 host osd02
  1   hdd 0.54529 osd.1  up  1.0 1.0
  3   hdd 0.54529 osd.3  up  1.0 1.0
 -7   0.54459 host osd04
  2   hdd 0.27229 osd.2  up  1.0 1.0
  5   hdd 0.27229 osd.5  up  1.0 1.0


 rados bench -p ssdpool 300 -t 32 write --no-cleanup && rados bench -p 
ssdpool 300 -t 32  seq

Total time run: 302.058832
Total writes made:  4100
Write size: 4194304
Object size:4194304
Bandwidth (MB/sec): 54.2941
Stddev Bandwidth:   70.3355
Max bandwidth (MB/sec): 252
Min bandwidth (MB/sec): 0
Average IOPS:   13
Stddev IOPS:17
Max IOPS:   63
Min IOPS:   0
Average Latency(s): 2.35655
Stddev Latency(s):  4.4346
Max latency(s): 29.7027
Min latency(s): 0.045166

rados bench -p rbd 300 -t 32 write --no-cleanup && rados bench -p rbd 
300 -t 32  seq
Total time run: 301.428571
Total writes made:  8753
Write size: 4194304
Object size:4194304
Bandwidth (MB/sec): 116.154
Stddev Bandwidth:   71.5763
Max bandwidth (MB/sec): 320
Min bandwidth (MB/sec): 0
Average IOPS:   29
Stddev IOPS:17
Max IOPS:   80
Min IOPS:   0
Average Latency(s): 1.10189
Stddev Latency(s):  1.80203
Max latency(s): 15.0715
Min latency(s): 0.0210309




[root@osd04 ~]# ethtool -k gth0
Features for gth0:
rx-checksumming: on
tx-checksumming: on
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: on
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: on [fixed]
tx-checksum-sctp: on
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
tx-tcp-segmentation: on
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp-mangleid-segmentation: off
tx-tcp6-segmentation: on
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off
receive-hashing: on
highdma: on [fixed]
rx-vlan-filter: on
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: on [fixed]
tx-gre-segmentation: on
tx-gre-csum-segmentation: on
tx-ipxip4-segmentation: on
tx-ipxip6-segmentation: on
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: on
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fw
