Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Mykola Golub
On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:

> I reconfigure the osd service from start, the journal was:

I am not quite sure I understand what you mean here.

> --
> 
> -- Unit ceph-osd@0.service has finished starting up.
> -- 
> -- The start-up result is RESULT.
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
> -1 Public network was set, but cluster network was not set
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
> -1 Using public network also for cluster network
> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365 7f6a27204e80
> -1 journal FileJournal::_open: disabling aio for non-block journal.  Use
> journal_force_aio to force use of a>
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414 7f6a27204e80
> -1 journal do_read_entry(6930432): bad header magic
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729 7f6a27204e80
> -1 osd.0 21 log_to_monitors {default=true}
> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for check of
> host 'localhost' was out of bounds.
> 
> --

Could you please post the full ceph-osd log somewhere? 
/var/log/ceph/ceph-osd.0.log

> but hang at the command: "rbd create libvirt-pool/dimage --size 10240 "

So it hangs forever now instead of returning an error?
What is `ceph -s` output?

-- 
Mykola Golub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Dengke Du


On 2018/11/6 4:16 PM, Mykola Golub wrote:

On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:


I reconfigure the osd service from start, the journal was:

I am not quite sure I understand what you mean here.


--

-- Unit ceph-osd@0.service has finished starting up.
--
-- The start-up result is RESULT.
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
-1 Public network was set, but cluster network was not set
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 7f6a27204e80
-1 Using public network also for cluster network
Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365 7f6a27204e80
-1 journal FileJournal::_open: disabling aio for non-block journal.  Use
journal_force_aio to force use of a>
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414 7f6a27204e80
-1 journal do_read_entry(6930432): bad header magic
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729 7f6a27204e80
-1 osd.0 21 log_to_monitors {default=true}
Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for check of
host 'localhost' was out of bounds.

--

Could you please post the full ceph-osd log somewhere? 
/var/log/ceph/ceph-osd.0.log


I don't have the file /var/log/ceph/ceph-osd.0.log

root@node1:~# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon osd.0
   Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled; 
vendor preset: enabled)

   Active: active (running) since Mon 2018-11-05 18:02:36 UTC; 6h ago
 Main PID: 4487 (ceph-osd)
    Tasks: 64
   Memory: 27.0M
   CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
   └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0

Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object storage daemon 
osd.0...

Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage daemon osd.0.
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 
7f6a27204e80 -1 Public network was set, but cluster network was not set
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915 
7f6a27204e80 -1 Using public network also for cluster network
Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data 
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365 
7f6a27204e80 -1 journal FileJournal::_open: disabling aio for non-block 
journal.  Use journal_force_aio to force use of a>
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414 
7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729 
7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}





but hang at the command: "rbd create libvirt-pool/dimage --size 10240 "

So it hangs forever now instead of returning an error?

It does not return any error, it just hangs.

What is `ceph -s` output?

root@node1:~# ceph -s
  cluster:
    id: 9c1a42e1-afc2-4170-8172-96f4ebdaac68
    health: HEALTH_WARN
    no active mgr

  services:
    mon: 1 daemons, quorum 0
    mgr: no daemons active
    osd: 1 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0  objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Ashley Merrick
If I am reading your ceph -s output correctly, you only have 1 OSD and 0
pools created.

So you'll be unable to create an RBD image until you at least have a pool set up
and configured to create the RBD image in.
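
As a rough sketch (pool name and PG count here are only illustrative, and
`ceph osd pool application enable` / `rbd pool init` assume Luminous or newer):

ceph osd pool create libvirt-pool 64 64
ceph osd pool application enable libvirt-pool rbd
rbd pool init libvirt-pool
rbd create libvirt-pool/dimage --size 10240

Once the pool exists and its PGs go active+clean, the rbd create should return
instead of hanging.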

On Tue, Nov 6, 2018 at 4:21 PM Dengke Du  wrote:

>
> > On 2018/11/6 4:16 PM, Mykola Golub wrote:
> > On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:
> >
> >> I reconfigure the osd service from start, the journal was:
> > I am not quite sure I understand what you mean here.
> >
> >>
> --
> >>
> >> -- Unit ceph-osd@0.service has finished starting up.
> >> --
> >> -- The start-up result is RESULT.
> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
> 7f6a27204e80
> >> -1 Public network was set, but cluster network was not set
> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
> 7f6a27204e80
> >> -1 Using public network also for cluster network
> >> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
> >> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
> 7f6a27204e80
> >> -1 journal FileJournal::_open: disabling aio for non-block journal.  Use
> >> journal_force_aio to force use of a>
> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
> 7f6a27204e80
> >> -1 journal do_read_entry(6930432): bad header magic
> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
> 7f6a27204e80
> >> -1 osd.0 21 log_to_monitors {default=true}
> >> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for
> check of
> >> host 'localhost' was out of bounds.
> >>
> >>
> --
> > Could you please post the full ceph-osd log somewhere?
> /var/log/ceph/ceph-osd.0.log
>
> I don't have the file /var/log/ceph/ceph-osd.o.log
>
> root@node1:~# systemctl status ceph-osd@0
> ● ceph-osd@0.service - Ceph object storage daemon osd.0
> Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled;
> vendor preset: enabled)
> Active: active (running) since Mon 2018-11-05 18:02:36 UTC; 6h ago
>   Main PID: 4487 (ceph-osd)
>  Tasks: 64
> Memory: 27.0M
> CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
> └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0
>
> Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object storage daemon
> osd.0...
> Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage daemon osd.0.
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
> 7f6a27204e80 -1 Public network was set, but cluster network was not set
> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
> 7f6a27204e80 -1 Using public network also for cluster network
> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
> 7f6a27204e80 -1 journal FileJournal::_open: disabling aio for non-block
> journal.  Use journal_force_aio to force use of a>
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
> 7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
> 7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}
>
> >
> >> but hang at the command: "rbd create libvirt-pool/dimage --size 10240 "
> > So it hungs forever now instead of returning the error?
> no returning any error, just hungs
> > What is `ceph -s` output?
> root@node1:~# ceph -s
>cluster:
>  id: 9c1a42e1-afc2-4170-8172-96f4ebdaac68
>  health: HEALTH_WARN
>  no active mgr
>
>services:
>  mon: 1 daemons, quorum 0
>  mgr: no daemons active
>  osd: 1 osds: 0 up, 0 in
>
>data:
>  pools:   0 pools, 0 pgs
>  objects: 0  objects, 0 B
>  usage:   0 B used, 0 B / 0 B avail
>  pgs:
>
>
> >
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Dengke Du


On 2018/11/6 4:24 PM, Ashley Merrick wrote:
If I am reading your ceph -s output correctly, you only have 1 OSD and
0 pools created.

So you'll be unable to create an RBD image until you at least have a pool
set up and configured to create the RBD image in.

root@node1:~# ceph osd lspools
1 libvirt-pool
2 test-pool


I create pools using:

ceph osd pool create libvirt-pool 128 128

following:

http://docs.ceph.com/docs/master/rbd/libvirt/



On Tue, Nov 6, 2018 at 4:21 PM Dengke Du wrote:



On 2018/11/6 4:16 PM, Mykola Golub wrote:
> On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:
>
>> I reconfigure the osd service from start, the journal was:
> I am not quite sure I understand what you mean here.
>
>>

--
>>
>> -- Unit ceph-osd@0.service has finished starting up.
>> --
>> -- The start-up result is RESULT.
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80
>> -1 Public network was set, but cluster network was not set
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80
>> -1 Using public network also for cluster network
>> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
>> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
7f6a27204e80
>> -1 journal FileJournal::_open: disabling aio for non-block
journal.  Use
>> journal_force_aio to force use of a>
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
7f6a27204e80
>> -1 journal do_read_entry(6930432): bad header magic
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
7f6a27204e80
>> -1 osd.0 21 log_to_monitors {default=true}
>> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13
for check of
>> host 'localhost' was out of bounds.
>>
>>

--
> Could you please post the full ceph-osd log somewhere?
/var/log/ceph/ceph-osd.0.log

I don't have the file /var/log/ceph/ceph-osd.o.log

root@node1:~# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon osd.0
    Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled;
vendor preset: enabled)
    Active: active (running) since Mon 2018-11-05 18:02:36 UTC; 6h ago
  Main PID: 4487 (ceph-osd)
 Tasks: 64
    Memory: 27.0M
    CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
    └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0

Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object storage daemon
osd.0...
Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage
daemon osd.0.
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80 -1 Public network was set, but cluster network was
not set
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80 -1 Using public network also for cluster network
Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
7f6a27204e80 -1 journal FileJournal::_open: disabling aio for
non-block
journal.  Use journal_force_aio to force use of a>
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}

>
>> but hang at the command: "rbd create libvirt-pool/dimage --size
10240 "
> So it hungs forever now instead of returning the error?
no returning any error, just hungs
> What is `ceph -s` output?
root@node1:~# ceph -s
   cluster:
 id: 9c1a42e1-afc2-4170-8172-96f4ebdaac68
 health: HEALTH_WARN
 no active mgr

   services:
 mon: 1 daemons, quorum 0
 mgr: no daemons active
 osd: 1 osds: 0 up, 0 in

   data:
 pools:   0 pools, 0 pgs
 objects: 0  objects, 0 B
 usage:   0 B used, 0 B / 0 B avail
 pgs:


>
___
ceph-users mailing list
ceph-users@lists.ceph.com 
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Ashley Merrick
What does

"ceph osd tree" show ?

On Tue, Nov 6, 2018 at 4:27 PM Dengke Du  wrote:

>
> On 2018/11/6 4:24 PM, Ashley Merrick wrote:
>
> If I am reading your ceph -s output correctly you only have 1 OSD, and 0
> pool's created.
>
> So your be unable to create a RBD till you atleast have a pool setup and
> configured to create the RBD within.
>
> root@node1:~# ceph osd lspools
> 1 libvirt-pool
> 2 test-pool
>
>
> I create pools using:
>
> ceph osd pool create libvirt-pool 128 128
>
> following:
>
> http://docs.ceph.com/docs/master/rbd/libvirt/
>
>
> On Tue, Nov 6, 2018 at 4:21 PM Dengke Du  wrote:
>
>>
>> On 2018/11/6 4:16 PM, Mykola Golub wrote:
>> > On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:
>> >
>> >> I reconfigure the osd service from start, the journal was:
>> > I am not quite sure I understand what you mean here.
>> >
>> >>
>> --
>> >>
>> >> -- Unit ceph-osd@0.service has finished starting up.
>> >> --
>> >> -- The start-up result is RESULT.
>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>> 7f6a27204e80
>> >> -1 Public network was set, but cluster network was not set
>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>> 7f6a27204e80
>> >> -1 Using public network also for cluster network
>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
>> >> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
>> 7f6a27204e80
>> >> -1 journal FileJournal::_open: disabling aio for non-block journal.
>> Use
>> >> journal_force_aio to force use of a>
>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
>> 7f6a27204e80
>> >> -1 journal do_read_entry(6930432): bad header magic
>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
>> 7f6a27204e80
>> >> -1 osd.0 21 log_to_monitors {default=true}
>> >> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for
>> check of
>> >> host 'localhost' was out of bounds.
>> >>
>> >>
>> --
>> > Could you please post the full ceph-osd log somewhere?
>> /var/log/ceph/ceph-osd.0.log
>>
>> I don't have the file /var/log/ceph/ceph-osd.o.log
>>
>> root@node1:~# systemctl status ceph-osd@0
>> ● ceph-osd@0.service - Ceph object storage daemon osd.0
>> Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled;
>> vendor preset: enabled)
>> Active: active (running) since Mon 2018-11-05 18:02:36 UTC; 6h ago
>>   Main PID: 4487 (ceph-osd)
>>  Tasks: 64
>> Memory: 27.0M
>> CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
>> └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0
>>
>> Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object storage daemon
>> osd.0...
>> Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage daemon
>> osd.0.
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>> 7f6a27204e80 -1 Public network was set, but cluster network was not set
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>> 7f6a27204e80 -1 Using public network also for cluster network
>> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
>> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
>> 7f6a27204e80 -1 journal FileJournal::_open: disabling aio for non-block
>> journal.  Use journal_force_aio to force use of a>
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
>> 7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
>> 7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}
>>
>> >
>> >> but hang at the command: "rbd create libvirt-pool/dimage --size 10240 "
>> > So it hungs forever now instead of returning the error?
>> no returning any error, just hungs
>> > What is `ceph -s` output?
>> root@node1:~# ceph -s
>>cluster:
>>  id: 9c1a42e1-afc2-4170-8172-96f4ebdaac68
>>  health: HEALTH_WARN
>>  no active mgr
>>
>>services:
>>  mon: 1 daemons, quorum 0
>>  mgr: no daemons active
>>  osd: 1 osds: 0 up, 0 in
>>
>>data:
>>  pools:   0 pools, 0 pgs
>>  objects: 0  objects, 0 B
>>  usage:   0 B used, 0 B / 0 B avail
>>  pgs:
>>
>>
>> >
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Dengke Du


On 2018/11/6 4:29 PM, Ashley Merrick wrote:

What does

"ceph osd tree" show ?

root@node1:~# ceph osd tree
ID CLASS WEIGHT  TYPE NAME  STATUS REWEIGHT PRI-AFF
-2 0 host 0
-1   1.0 root default
-3   1.0 host node1
 0   hdd 1.0 osd.0    down    0 1.0


On Tue, Nov 6, 2018 at 4:27 PM Dengke Du wrote:



On 2018/11/6 4:24 PM, Ashley Merrick wrote:

If I am reading your ceph -s output correctly, you only have 1
OSD and 0 pools created.

So you'll be unable to create an RBD image until you at least have a
pool set up and configured to create the RBD image in.

root@node1:~# ceph osd lspools
1 libvirt-pool
2 test-pool


I create pools using:

ceph osd pool create libvirt-pool 128 128

following:

http://docs.ceph.com/docs/master/rbd/libvirt/



On Tue, Nov 6, 2018 at 4:21 PM Dengke Du <dengke...@windriver.com> wrote:


On 2018/11/6 4:16 PM, Mykola Golub wrote:
> On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:
>
>> I reconfigure the osd service from start, the journal was:
> I am not quite sure I understand what you mean here.
>
>>

--
>>
>> -- Unit ceph-osd@0.service  has
finished starting up.
>> --
>> -- The start-up result is RESULT.
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05
18:02:36.915 7f6a27204e80
>> -1 Public network was set, but cluster network was not set
>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05
18:02:36.915 7f6a27204e80
>> -1 Using public network also for cluster network
>> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at -
osd_data
>> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05
18:02:37.365 7f6a27204e80
>> -1 journal FileJournal::_open: disabling aio for non-block
journal.  Use
>> journal_force_aio to force use of a>
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05
18:02:37.414 7f6a27204e80
>> -1 journal do_read_entry(6930432): bad header magic
>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05
18:02:37.729 7f6a27204e80
>> -1 osd.0 21 log_to_monitors {default=true}
>> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code
of 13 for check of
>> host 'localhost' was out of bounds.
>>
>>

--
> Could you please post the full ceph-osd log somewhere?
/var/log/ceph/ceph-osd.0.log

I don't have the file /var/log/ceph/ceph-osd.o.log

root@node1:~# systemctl status ceph-osd@0
● ceph-osd@0.service  - Ceph
object storage daemon osd.0
    Loaded: loaded (/lib/systemd/system/ceph-osd@.service;
disabled;
vendor preset: enabled)
    Active: active (running) since Mon 2018-11-05 18:02:36
UTC; 6h ago
  Main PID: 4487 (ceph-osd)
 Tasks: 64
    Memory: 27.0M
    CGroup:
/system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service

    └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0

Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object
storage daemon
osd.0...
Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage
daemon osd.0.
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80 -1 Public network was set, but cluster network
was not set
Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
7f6a27204e80 -1 Using public network also for cluster network
Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at -
osd_data
/var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
7f6a27204e80 -1 journal FileJournal::_open: disabling aio for
non-block
journal.  Use journal_force_aio to force use of a>
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}

>
>> but hang at the command: "rbd create libvirt-pool/dimage
--size 10240 "
> So it hungs forever now instead of returning the e

Re: [ceph-users] librbd::image::CreateRequest: 0x55e4fc8bf620 handle_create_id_object: error creating RBD id object

2018-11-06 Thread Ashley Merrick
Is that correct, or have you added more than 1 OSD?

Ceph is never going to work or be able to bring up a pool with only one
OSD. If you really do have more than one OSD and have added them correctly, then
there really is something wrong with your Ceph setup / config and it may be worth
starting from scratch.
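
As a rough first pass at diagnosing it (assuming the single OSD really is osd.0,
as in your output, and that the daemon is the one you showed with systemctl):

ceph health detail
systemctl status ceph-osd@0
ceph osd in 0
ceph osd tree

Even with the OSD up and in, the default replicated size of 3 means PGs can never
go active+clean on a single OSD, so a one-node test setup would also need the pool
size lowered, which is only reasonable for testing.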

On Tue, Nov 6, 2018 at 4:31 PM Dengke Du  wrote:

>
> On 2018/11/6 4:29 PM, Ashley Merrick wrote:
>
> What does
>
> "ceph osd tree" show ?
>
> root@node1:~# ceph osd tree
> ID CLASS WEIGHT  TYPE NAME  STATUS REWEIGHT PRI-AFF
> -2 0 host 0
> -1   1.0 root default
> -3   1.0 host node1
>  0   hdd 1.0 osd.0down0 1.0
>
>
> On Tue, Nov 6, 2018 at 4:27 PM Dengke Du  wrote:
>
>>
>> On 2018/11/6 4:24 PM, Ashley Merrick wrote:
>>
>> If I am reading your ceph -s output correctly you only have 1 OSD, and 0
>> pool's created.
>>
>> So your be unable to create a RBD till you atleast have a pool setup and
>> configured to create the RBD within.
>>
>> root@node1:~# ceph osd lspools
>> 1 libvirt-pool
>> 2 test-pool
>>
>>
>> I create pools using:
>>
>> ceph osd pool create libvirt-pool 128 128
>>
>> following:
>>
>> http://docs.ceph.com/docs/master/rbd/libvirt/
>>
>>
>> On Tue, Nov 6, 2018 at 4:21 PM Dengke Du  wrote:
>>
>>>
>>> On 2018/11/6 4:16 PM, Mykola Golub wrote:
>>> > On Tue, Nov 06, 2018 at 09:45:01AM +0800, Dengke Du wrote:
>>> >
>>> >> I reconfigure the osd service from start, the journal was:
>>> > I am not quite sure I understand what you mean here.
>>> >
>>> >>
>>> --
>>> >>
>>> >> -- Unit ceph-osd@0.service has finished starting up.
>>> >> --
>>> >> -- The start-up result is RESULT.
>>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>>> 7f6a27204e80
>>> >> -1 Public network was set, but cluster network was not set
>>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>>> 7f6a27204e80
>>> >> -1 Using public network also for cluster network
>>> >> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
>>> >> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
>>> 7f6a27204e80
>>> >> -1 journal FileJournal::_open: disabling aio for non-block journal.
>>> Use
>>> >> journal_force_aio to force use of a>
>>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
>>> 7f6a27204e80
>>> >> -1 journal do_read_entry(6930432): bad header magic
>>> >> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
>>> 7f6a27204e80
>>> >> -1 osd.0 21 log_to_monitors {default=true}
>>> >> Nov 05 18:02:47 node1 nagios[3584]: Warning: Return code of 13 for
>>> check of
>>> >> host 'localhost' was out of bounds.
>>> >>
>>> >>
>>> --
>>> > Could you please post the full ceph-osd log somewhere?
>>> /var/log/ceph/ceph-osd.0.log
>>>
>>> I don't have the file /var/log/ceph/ceph-osd.o.log
>>>
>>> root@node1:~# systemctl status ceph-osd@0
>>> ● ceph-osd@0.service - Ceph object storage daemon osd.0
>>> Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled;
>>> vendor preset: enabled)
>>> Active: active (running) since Mon 2018-11-05 18:02:36 UTC; 6h ago
>>>   Main PID: 4487 (ceph-osd)
>>>  Tasks: 64
>>> Memory: 27.0M
>>> CGroup: /system.slice/system-ceph\x2dosd.slice/ceph-osd@0.service
>>> └─4487 /usr/bin/ceph-osd -f --cluster ceph --id 0
>>>
>>> Nov 05 18:02:36 node1 systemd[1]: Starting Ceph object storage daemon
>>> osd.0...
>>> Nov 05 18:02:36 node1 systemd[1]: Started Ceph object storage daemon
>>> osd.0.
>>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>>> 7f6a27204e80 -1 Public network was set, but cluster network was not set
>>> Nov 05 18:02:36 node1 ceph-osd[4487]: 2018-11-05 18:02:36.915
>>> 7f6a27204e80 -1 Using public network also for cluster network
>>> Nov 05 18:02:36 node1 ceph-osd[4487]: starting osd.0 at - osd_data
>>> /var/lib/ceph/osd/ceph-0 /var/lib/ceph/osd/ceph-0/journal
>>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.365
>>> 7f6a27204e80 -1 journal FileJournal::_open: disabling aio for non-block
>>> journal.  Use journal_force_aio to force use of a>
>>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.414
>>> 7f6a27204e80 -1 journal do_read_entry(6930432): bad header magic
>>> Nov 05 18:02:37 node1 ceph-osd[4487]: 2018-11-05 18:02:37.729
>>> 7f6a27204e80 -1 osd.0 21 log_to_monitors {default=true}
>>>
>>> >
>>> >> but hang at the command: "rbd create libvirt-pool/dimage --size 10240
>>> "
>>> > So it hungs forever now instead of returning the error?
>>> no returning any error, just hungs
>>> > What is `ceph -s` output?
>>> root@node1:~# ceph -s
>>>cluster:
>>>  id: 9

[ceph-users] cloud sync module testing

2018-11-06 Thread Roberto Valverde
Hi all,

I'm trying to test this feature but I did not manage to make it work. In my
simple setup, I have a small Mimic cluster with 3 VMs at work and I have access
to an S3 cloud provider (not Amazon).
Here is my period configuration, with one realm, one zonegroup and 2 zones:
--
{
"id": "fc158476-a882-47da-a615-b3dd4c95bc3f",
"epoch": 17,
"predecessor_uuid": "102bd810-6964-4576-8b25-ef2b62122e25",
"sync_status": [],
"period_map": {
"id": "fc158476-a882-47da-a615-b3dd4c95bc3f",
"zonegroups": [
{
"id": "8250957b-bce2-4b30-a8e4-118990c1d545",
"name": "ch",
"api_name": "ch",
"is_master": "true",
"endpoints": [
"http://localhost:8080";
],
"hostnames": [],
"hostnames_s3website": [],
"master_zone": "0b806e14-136e-48c6-99d8-07ba03780538",
"zones": [
{
"id": "0b806e14-136e-48c6-99d8-07ba03780538",
"name": "cephpolbo",
"endpoints": [
"http://localhost:8080";
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false",
"tier_type": "",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
},
{
"id": "570f3df3-20a9-49dd-b9d3-a9b8f2177047",
"name": "exoscale",
"endpoints": [

"[https://**:443](https://sos-ch-dk-2.exo.io:443)"
],
"log_meta": "false",
"log_data": "true",
"bucket_index_max_shards": 0,
"read_only": "false",
"tier_type": "cloud",
"sync_from_all": "true",
"sync_from": [],
"redirect_zone": ""
}
],
"placement_targets": [
{
"name": "default-placement",
"tags": []
}
],
"default_placement": "default-placement",
"realm_id": "f32efe47-1830-4350-9971-0b2ee59c0e36"
}
],
"short_zone_ids": [
{
"key": "0b806e14-136e-48c6-99d8-07ba03780538",
"val": 3335063808
},
{
"key": "570f3df3-20a9-49dd-b9d3-a9b8f2177047",
"val": 1220876408
}
]
},
"master_zonegroup": "8250957b-bce2-4b30-a8e4-118990c1d545",
"master_zone": "0b806e14-136e-48c6-99d8-07ba03780538",
"period_config": {
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
},
"user_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1,
"max_size_kb": 0,
"max_objects": -1
}
},
"realm_id": "f32efe47-1830-4350-9971-0b2ee59c0e36",
"realm_name": "earth",
"realm_epoch": 2
}
---
And this is the cloud zone config:

{
"id": "570f3df3-20a9-49dd-b9d3-a9b8f2177047",
"name": "exoscale",
"domain_root": "exoscale.rgw.meta:root",
"control_pool": "exoscale.rgw.control",
"gc_pool": "exoscale.rgw.log:gc",
"lc_pool": "exoscale.rgw.log:lc",
"log_pool": "exoscale.rgw.log",
"intent_log_pool": "exoscale.rgw.log:intent",
"usage_log_pool": "exoscale.rgw.log:usage",
"reshard_pool": "exoscale.rgw.log:reshard",
"user_keys_pool": "exoscale.rgw.meta:users.keys",
"user_email_pool": "exoscale.rgw.meta:users.email",
"user_swift_pool": "exoscale.rgw.meta:users.swift",
"user_uid_pool": "exoscale.rgw.meta:users.uid",
"otp_pool": "exoscale.rgw.otp",
"system_key": {
"access_key": "system_key",
"secret_key": "secret_key"
},
"placement_pools": [
{
"key": "default-placement",
"val": {
"index_pool": "exoscale.rgw.buckets.index",
"data_pool": "exoscale.rgw.buckets.data",
"data_extra_pool": "exoscale.rgw.buckets.non-ec",
"index_type": 

[ceph-users] cephfs quota limit

2018-11-06 Thread Zhenshi Zhou
Hi,

I'm wondering whether cephfs have quota limit options.
I use kernel client and ceph version is 12.2.8.
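
What I'm hoping for is something along these lines (the path is just an example;
I'm not sure the kernel client on 12.2.8 actually enforces these attributes, which
is why I'm asking):

setfattr -n ceph.quota.max_bytes -v 107374182400 /mnt/cephfs/projects   # 100 GiB
setfattr -n ceph.quota.max_files -v 100000 /mnt/cephfs/projects
getfattr -n ceph.quota.max_bytes /mnt/cephfs/projects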

Thanks
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd mirror journal data

2018-11-06 Thread Jason Dillaman
On Tue, Nov 6, 2018 at 1:12 AM Wei Jin  wrote:
>
> Thanks.
> I found that both minimum and active set are very large in my cluster, is it 
> expected?
> By the way, I take a snapshot of each image every half an hour, and keep snapshots
> for two days.
>
> Journal status:
>
> minimum_set: 671839
> active_set: 1197917
> registered clients:
> [id=, commit_position=[positions=[[object_number=4791670, tag_tid=3, 
> entry_tid=4146742458], [object_number=4791669, tag_tid=3, 
> entry_tid=4146742457], [object_number=4791668, tag_tid=3, 
> entry_tid=4146742456], [object_number=4791671, tag_tid=3, 
> entry_tid=4146742455]]], state=connected]
> [id=89024ad3-57a7-42cc-99d4-67f33b093704, 
> commit_position=[positions=[[object_number=2687357, tag_tid=3, 
> entry_tid=1188516421], [object_number=2687356, tag_tid=3, 
> entry_tid=1188516420], [object_number=2687359, tag_tid=3, 
> entry_tid=1188516419], [object_number=2687358, tag_tid=3, 
> entry_tid=1188516418]]], state=connected]
>

Are you attempting to run the "rbd-mirror" daemon on a remote cluster? It
just appears that either the daemon is not running or that it's so far
behind that it's not able to keep up with the IO workload of the
image. You can run "rbd journal disconnect --image 
--client-id=89024ad3-57a7-42cc-99d4-67f33b093704" to force-disconnect
the remote client and start the journal trimming process.

> > On Nov 6, 2018, at 3:39 AM, Jason Dillaman  wrote:
> >
> > On Sun, Nov 4, 2018 at 11:59 PM Wei Jin  wrote:
> >>
> >> Hi, Jason,
> >>
> >> I have a question about rbd mirroring. When enable mirroring, we observed 
> >> that there are a lot of objects prefix with journal_data, thus it consumes 
> >> a lot of disk space.
> >>
> >> When will these journal objects be deleted? And are there any parameters 
> >> to accelerate it?
> >> Thanks.
> >>
> >
> > Journal data objects should be automatically deleted when the journal
> > is trimmed beyond the position of the object. If you run "rbd journal
> > status --image ", you should be able to see the minimum
> > in-use set and the current active set for new journal entries:
> >
> > $ rbd --cluster cluster1 journal status --image image1
> > minimum_set: 7
> > active_set: 8
> > registered clients:
> > [id=, commit_position=[positions=[[object_number=33, tag_tid=2,
> > entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
> > [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
> > tag_tid=2, entry_tid=49150]]], state=connected]
> > [id=81672c30-d735-46d4-a30a-53c221954d0e,
> > commit_position=[positions=[[object_number=30, tag_tid=2,
> > entry_tid=48034], [object_number=29, tag_tid=2, entry_tid=48033],
> > [object_number=28, tag_tid=2, entry_tid=48032], [object_number=31,
> > tag_tid=2, entry_tid=48031]]], state=connected]
> >
> > $ rados --cluster cluster1 --pool rbd ls | grep journal_data | sort
> > journal_data.1.1029b4577f90.28
> > journal_data.1.1029b4577f90.29
> > journal_data.1.1029b4577f90.30
> > journal_data.1.1029b4577f90.31
> > journal_data.1.1029b4577f90.32
> > journal_data.1.1029b4577f90.33
> > journal_data.1.1029b4577f90.34
> > journal_data.1.1029b4577f90.35
> > <..>
> >
> > $ rbd --cluster cluster1 journal status --image image1
> > minimum_set: 8
> > active_set: 8
> > registered clients:
> > [id=, commit_position=[positions=[[object_number=33, tag_tid=2,
> > entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
> > [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
> > tag_tid=2, entry_tid=49150]]], state=connected]
> > [id=81672c30-d735-46d4-a30a-53c221954d0e,
> > commit_position=[positions=[[object_number=33, tag_tid=2,
> > entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
> > [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
> > tag_tid=2, entry_tid=49150]]], state=connected]
> >
> > $ rados --cluster cluster1 --pool rbd ls | grep journal_data | sort
> > journal_data.1.1029b4577f90.32
> > journal_data.1.1029b4577f90.33
> > journal_data.1.1029b4577f90.34
> > journal_data.1.1029b4577f90.35
> >
> > --
> > Jason
>


-- 
Jason
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-deploy osd creation failed with multipath and dmcrypt

2018-11-06 Thread Pavan, Krish
Trying to create an OSD with multipath and dmcrypt and it failed. Any
suggestions please?
ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
--bluestore --dmcrypt  -- failed
ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
--bluestore - worked

the logs for fail
[ceph-store12][WARNIN] command: Running command: /usr/sbin/restorecon -R 
/var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
[ceph-store12][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
/var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
[ceph-store12][WARNIN] Traceback (most recent call last):
[ceph-store12][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in <module>
[ceph-store12][WARNIN] load_entry_point('ceph-disk==1.0.0', 
'console_scripts', 'ceph-disk')()
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
[ceph-store12][WARNIN] main(sys.argv[1:])
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5687, in main
[ceph-store12][WARNIN] args.func(args)
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2108, in main
[ceph-store12][WARNIN] Prepare.factory(args).prepare()
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2097, in prepare
[ceph-store12][WARNIN] self._prepare()
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2171, in _prepare
[ceph-store12][WARNIN] self.lockbox.prepare()
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2861, in prepare
[ceph-store12][WARNIN] self.populate()
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2818, in populate
[ceph-store12][WARNIN] get_partition_base(self.partition.get_dev()),
[ceph-store12][WARNIN]   File 
"/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 844, in 
get_partition_base
[ceph-store12][WARNIN] raise Error('not a partition', dev)
[ceph-store12][WARNIN] ceph_disk.main.Error: Error: not a partition: /dev/dm-215
[ceph-store12][ERROR ] RuntimeError: command returned non-zero exit status: 1
[ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-disk -v 
prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore 
--cluster ceph --fs-type btrfs -- /dev/mapper/mpathr
[ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy osd creation failed with multipath and dmcrypt

2018-11-06 Thread Kevin Olbrich
I met the same problem. I had to create a GPT table for each disk, create a
first partition over the full space and then feed these to ceph-volume (it
should be similar for ceph-deploy); a rough sketch is below.
Also, I am not sure you can combine fs-type btrfs with bluestore (afaik that
option is for filestore).
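
Roughly what I did, as a sketch only (the device name is taken from your command,
and the partition may appear as mpathr1 or mpathr-part1 depending on your multipath
setup, so check lsblk first):

sgdisk --zap-all /dev/mapper/mpathr
sgdisk -n 1:0:0 /dev/mapper/mpathr
partprobe /dev/mapper/mpathr
ceph-volume lvm create --bluestore --dmcrypt --data /dev/mapper/mpathr1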

Kevin


On Tue, 6 Nov 2018 at 14:41, Pavan, Krish <
krish.pa...@nuance.com> wrote:

> Trying to created OSD with multipath with dmcrypt and it failed . Any
> suggestion please?.
>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr
> --bluestore --dmcrypt  -- failed
>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr
> --bluestore – worked
>
>
>
> the logs for fail
>
> [ceph-store12][WARNIN] command: Running command: /usr/sbin/restorecon -R
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] command: Running command: /usr/bin/chown -R
> ceph:ceph
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] Traceback (most recent call last):
>
> [ceph-store12][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in 
>
> [ceph-store12][WARNIN] load_entry_point('ceph-disk==1.0.0',
> 'console_scripts', 'ceph-disk')()
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
>
> [ceph-store12][WARNIN] main(sys.argv[1:])
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5687, in main
>
> [ceph-store12][WARNIN] args.func(args)
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2108, in main
>
> [ceph-store12][WARNIN] Prepare.factory(args).prepare()
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2097, in prepare
>
> [ceph-store12][WARNIN] self._prepare()
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2171, in _prepare
>
> [ceph-store12][WARNIN] self.lockbox.prepare()
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2861, in prepare
>
> [ceph-store12][WARNIN] self.populate()
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2818, in populate
>
> [ceph-store12][WARNIN] get_partition_base(self.partition.get_dev()),
>
> [ceph-store12][WARNIN]   File
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 844, in
> get_partition_base
>
> [ceph-store12][WARNIN] raise Error('not a partition', dev)
>
> [ceph-store12][WARNIN] ceph_disk.main.Error: Error: not a partition:
> /dev/dm-215
>
> [ceph-store12][ERROR ] RuntimeError: command returned non-zero exit
> status: 1
>
> [ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-disk
> -v prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore
> --cluster ceph --fs-type btrfs -- /dev/mapper/mpathr
>
> [ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph-deploy osd creation failed with multipath and dmcrypt

2018-11-06 Thread Alfredo Deza
On Tue, Nov 6, 2018 at 8:41 AM Pavan, Krish  wrote:
>
> Trying to created OSD with multipath with dmcrypt and it failed . Any 
> suggestion please?.

ceph-disk is known to have issues like this. It is already deprecated
in the Mimic release and will no longer be available for the upcoming
release (Nautilus).

I would strongly suggest you upgrade ceph-deploy to the 2.X.X series
which supports ceph-volume.
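
With the 2.x series the OSD device is passed via --data and the work is handed off
to ceph-volume on the target node; roughly (sketch only, using the host and device
from your original command - exact flags can differ slightly between 2.x versions):

ceph-deploy --overwrite-conf osd create --data /dev/mapper/mpathr --bluestore --dmcrypt ceph-store1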

>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
> --bluestore --dmcrypt  -- failed
>
> ceph-deploy --overwrite-conf osd create ceph-store1:/dev/mapper/mpathr 
> --bluestore – worked
>
>
>
> the logs for fail
>
> [ceph-store12][WARNIN] command: Running command: /usr/sbin/restorecon -R 
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] command: Running command: /usr/bin/chown -R ceph:ceph 
> /var/lib/ceph/osd-lockbox/e15f1adc-feff-4890-a617-adc473e7331e/magic.68428.tmp
>
> [ceph-store12][WARNIN] Traceback (most recent call last):
>
> [ceph-store12][WARNIN]   File "/usr/sbin/ceph-disk", line 9, in 
>
> [ceph-store12][WARNIN] load_entry_point('ceph-disk==1.0.0', 
> 'console_scripts', 'ceph-disk')()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5736, in run
>
> [ceph-store12][WARNIN] main(sys.argv[1:])
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 5687, in main
>
> [ceph-store12][WARNIN] args.func(args)
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2108, in main
>
> [ceph-store12][WARNIN] Prepare.factory(args).prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2097, in prepare
>
> [ceph-store12][WARNIN] self._prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2171, in _prepare
>
> [ceph-store12][WARNIN] self.lockbox.prepare()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2861, in prepare
>
> [ceph-store12][WARNIN] self.populate()
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 2818, in populate
>
> [ceph-store12][WARNIN] get_partition_base(self.partition.get_dev()),
>
> [ceph-store12][WARNIN]   File 
> "/usr/lib/python2.7/site-packages/ceph_disk/main.py", line 844, in 
> get_partition_base
>
> [ceph-store12][WARNIN] raise Error('not a partition', dev)
>
> [ceph-store12][WARNIN] ceph_disk.main.Error: Error: not a partition: 
> /dev/dm-215
>
> [ceph-store12][ERROR ] RuntimeError: command returned non-zero exit status: 1
>
> [ceph_deploy.osd][ERROR ] Failed to execute command: /usr/sbin/ceph-disk -v 
> prepare --dmcrypt --dmcrypt-key-dir /etc/ceph/dmcrypt-keys --bluestore 
> --cluster ceph --fs-type btrfs -- /dev/mapper/mpathr
>
> [ceph_deploy][ERROR ] GenericError: Failed to create 1 OSDs
>
>
>
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
So, currently this is what /var/lib/ceph/osd/ceph-60 shows.  Is it not
correct?   I don't know what I should expect to see.

root@osd1:~# ls -l /var/lib/ceph/osd/ceph-60
total 86252
-rw-r--r-- 1 ceph ceph 384 Nov  2 16:20 activate.monmap
-rw-r--r-- 1 ceph ceph 10737418240 Nov  5 16:32 block
lrwxrwxrwx 1 ceph ceph  14 Nov  2 16:20 block.db -> /dev/ssd0/db60
-rw-r--r-- 1 ceph ceph   2 Nov  2 16:20 bluefs
-rw-r--r-- 1 ceph ceph  37 Nov  2 16:20 ceph_fsid
-rw-r--r-- 1 ceph ceph  37 Nov  2 16:20 fsid
-rw------- 1 ceph ceph  57 Nov  2 16:20 keyring
-rw-r--r-- 1 ceph ceph   8 Nov  2 16:20 kv_backend
-rw-r--r-- 1 ceph ceph  21 Nov  2 16:20 magic
-rw-r--r-- 1 ceph ceph   4 Nov  2 16:20 mkfs_done
-rw-r--r-- 1 ceph ceph  41 Nov  2 16:20 osd_key
-rw-r--r-- 1 ceph ceph   6 Nov  2 16:20 ready
-rw-r--r-- 1 ceph ceph  10 Nov  2 16:20 type
-rw-r--r-- 1 ceph ceph   3 Nov  2 16:20 whoami

The disk I am using for this osd (osd.60) is a 3.7TB hdd.

lsblk shows

sdh  8:112  0   3.7T  0 disk
└─hdd60-data60 252:1    0   3.7T  0 lvm

and "ceph osd tree" shows
60   hdd3.63689 osd.60 up  1.0 1.0



On Mon, Nov 5, 2018 at 11:23 PM, Hector Martin 
wrote:

> On 11/6/18 6:03 AM, Hayashida, Mami wrote:
> > WOW.  With you two guiding me through every step, the 10 OSDs in
> > question are now added back to the cluster as Bluestore disks!!!  Here
> > are my responses to the last email from Hector:
> >
> > 1. I first checked the permissions and they looked like this
> >
> > root@osd1:/var/lib/ceph/osd/ceph-60# ls -l
> > total 56
> > -rw-r--r-- 1 ceph ceph 384 Nov  2 16:20 activate.monmap
> > -rw-r--r-- 1 ceph ceph 10737418240 Nov  2 16:20 block
> > lrwxrwxrwx 1 ceph ceph  14 Nov  2 16:20 block.db ->
> /dev/ssd0/db60
>
> Wait.
>
> That's not right, is it? That's a 10GB raw file being used as a
> BlueStore block device. It should be a symlink to an LVM volume, not a
> file.
>
> Can you check the `block` files/symlinks again for all the OSDs and also
> what is mounted on the OSD directories? It should be tmpfs directories
> with symlinks to block devices. I'm not sure what happened there.
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin


On 11/7/18 12:30 AM, Hayashida, Mami wrote:
> So, currently this is what /var/lib/ceph/osd/ceph-60 shows.  Is it not
> correct?   I don't know what I should expect to see.
> 
> root@osd1:~# ls -l /var/lib/ceph/osd/ceph-60
> total 86252
> -rw-r--r-- 1 ceph ceph         384 Nov  2 16:20 activate.monmap
> -rw-r--r-- 1 ceph ceph 10737418240 Nov  5 16:32 block
> lrwxrwxrwx 1 ceph ceph          14 Nov  2 16:20 block.db -> /dev/ssd0/db60
> -rw-r--r-- 1 ceph ceph           2 Nov  2 16:20 bluefs
> -rw-r--r-- 1 ceph ceph          37 Nov  2 16:20 ceph_fsid
> -rw-r--r-- 1 ceph ceph          37 Nov  2 16:20 fsid
> -rw--- 1 ceph ceph          57 Nov  2 16:20 keyring
> -rw-r--r-- 1 ceph ceph           8 Nov  2 16:20 kv_backend
> -rw-r--r-- 1 ceph ceph          21 Nov  2 16:20 magic
> -rw-r--r-- 1 ceph ceph           4 Nov  2 16:20 mkfs_done
> -rw-r--r-- 1 ceph ceph          41 Nov  2 16:20 osd_key
> -rw-r--r-- 1 ceph ceph           6 Nov  2 16:20 ready
> -rw-r--r-- 1 ceph ceph          10 Nov  2 16:20 type
> -rw-r--r-- 1 ceph ceph           3 Nov  2 16:20 whoami

Are all the other OSDs like this? What is mounted on those directories
(i.e. what do "df" and "mount" show)?

All those files are dated Nov 2, yet it should be a tmpfs and you
rebooted after that date, so I think something's wrong here.

> The disk I am using for this osd (osd.60) is a 3.7TB hdd. 
> 
> lsblk shows
> 
> sdh              8:112  0   3.7T  0 disk 
> └─hdd60-data60 252:1    0   3.7T  0 lvm 
> 
> and "ceph osd tree" shows 
> 60   hdd    3.63689         osd.60         up  1.0 1.0 

That looks correct as far as the weight goes, but I'm really confused as
to why you have a 10GB "block" file. That should be a symlink to the
hdd60/data60 device as far as I know.
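
If ceph-volume still has the metadata for this OSD, my guess is that stopping the
daemon and re-activating it would recreate the tmpfs directory with the proper
symlinks - roughly (the fsid placeholder would come from `ceph-volume lvm list`,
and this is only a guess at the recovery path, not something I've verified):

systemctl stop ceph-osd@60
ceph-volume lvm activate 60 <osd-fsid>
ls -l /var/lib/ceph/osd/ceph-60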

-- 
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
All other OSDs that I converted (#60-69) look basically identical while the
Filestore OSDs (/var/lib/ceph/osd/ceph-70 etc.) look different obviously.
When I run "df" it does NOT list those converted osds (only the Filestore
ones).  In other words, /dev/sdh1 where osd.60 should be is not listed.
(Should it be?)  Neither does mount list that drive.   ("df | grep sdh"
and "mount | grep sdh" both return nothing)




On Tue, Nov 6, 2018 at 10:42 AM, Hector Martin 
wrote:

>
>
> On 11/7/18 12:30 AM, Hayashida, Mami wrote:
> > So, currently this is what /var/lib/ceph/osd/ceph-60 shows.  Is it not
> > correct?   I don't know what I should expect to see.
> >
> > root@osd1:~# ls -l /var/lib/ceph/osd/ceph-60
> > total 86252
> > -rw-r--r-- 1 ceph ceph 384 Nov  2 16:20 activate.monmap
> > -rw-r--r-- 1 ceph ceph 10737418240 Nov  5 16:32 block
> > lrwxrwxrwx 1 ceph ceph  14 Nov  2 16:20 block.db ->
> /dev/ssd0/db60
> > -rw-r--r-- 1 ceph ceph   2 Nov  2 16:20 bluefs
> > -rw-r--r-- 1 ceph ceph  37 Nov  2 16:20 ceph_fsid
> > -rw-r--r-- 1 ceph ceph  37 Nov  2 16:20 fsid
> > -rw--- 1 ceph ceph  57 Nov  2 16:20 keyring
> > -rw-r--r-- 1 ceph ceph   8 Nov  2 16:20 kv_backend
> > -rw-r--r-- 1 ceph ceph  21 Nov  2 16:20 magic
> > -rw-r--r-- 1 ceph ceph   4 Nov  2 16:20 mkfs_done
> > -rw-r--r-- 1 ceph ceph  41 Nov  2 16:20 osd_key
> > -rw-r--r-- 1 ceph ceph   6 Nov  2 16:20 ready
> > -rw-r--r-- 1 ceph ceph  10 Nov  2 16:20 type
> > -rw-r--r-- 1 ceph ceph   3 Nov  2 16:20 whoami
>
> Are all the other OSDs like this? What is mounted on those directories
> (i.e. what do "df" and "mount" show)?
>
> All those files are dated Nov 2, yet it should be a tmpfs and you
> rebooted after that date, so I think something's wrong here.
>
> > The disk I am using for this osd (osd.60) is a 3.7TB hdd.
> >
> > lsblk shows
> >
> > sdh  8:112  0   3.7T  0 disk
> > └─hdd60-data60 252:10   3.7T  0 lvm
> >
> > and "ceph osd tree" shows
> > 60   hdd3.63689 osd.60 up  1.0 1.0
>
> That looks correct as far as the weight goes, but I'm really confused as
> to why you have a 10GB "block" file. That should be a symlink to the
> hdd60/data60 device as far as I know.
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin


On 11/7/18 12:48 AM, Hayashida, Mami wrote:
> All other OSDs that I converted (#60-69) look basically identical while
> the Filestore OSDs (/var/lib/ceph/osd/ceph-70 etc.) look different
> obviously.  When I run "df" it does NOT list those converted osds (only
> the Filestore ones).  In other words, /dev/sdh1 where osd.60 should be
> is not listed.  (Should it be?)  Neither does mount lists that drive. 
>  ("df | grep sdh" and "mount | grep sdh" both return nothing)

/dev/sdh1 no longer exists. Remember, we converted the drives to be LVM
physical volumes. There are no partitions any more. It's all in an LVM
volume backed by /dev/sdh (without the 1).

What *should* be mounted at the OSD paths are tmpfs filesystems, i.e.
ramdisks. Those would not reference sdh so of course those commands will
return nothing. Try "df | grep osd" and "mount | grep osd" instead and
see if ceph-60 through ceph-69 show up.

-- 
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
I see.  Thank you for clarifying lots of things along the way -- this has
been extremely helpful.   Neither "df | grep osd" nor "mount | grep osd"
shows ceph-60 through 69.

On Tue, Nov 6, 2018 at 10:57 AM, Hector Martin 
wrote:

>
>
> On 11/7/18 12:48 AM, Hayashida, Mami wrote:
> > All other OSDs that I converted (#60-69) look basically identical while
> > the Filestore OSDs (/var/lib/ceph/osd/ceph-70 etc.) look different
> > obviously.  When I run "df" it does NOT list those converted osds (only
> > the Filestore ones).  In other words, /dev/sdh1 where osd.60 should be
> > is not listed.  (Should it be?)  Neither does mount lists that drive.
> >  ("df | grep sdh" and "mount | grep sdh" both return nothing)
>
> /dev/sdh1 no longer exists. Remember, we converted the drives to be LVM
> physical volumes. There are no partitions any more. It's all in an LVM
> volume backed by /dev/sdh (without the 1).
>
> What *should* be mounted at the OSD paths are tmpfs filesystems, i.e.
> ramdisks. Those would not reference sdh so of course those commands will
> return nothing. Try "df | grep osd" and "mount | grep osd" instead and
> see if ceph-60 through ceph-69 show up.
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Balancer module not balancing perfectly

2018-11-06 Thread Steve Taylor
I ended up balancing my osdmap myself offline to figure out why the balancer 
couldn't do better. I had similar issues with osdmaptool, which of course is 
what I expected, but it's a lot easier to run osdmaptool in a debugger to see 
what's happening. When I dug into the upmap code I discovered that my problem 
was due to the way that code balances OSDs. In my case the average PG count per 
OSD is 56.882, so as soon as any OSD had 56 PGs it wouldn't get any more no 
matter what I used as my max deviation. I got into a state where each OSD had 
56-61 PGs, and the upmap code wouldn't do any better because there were no 
"underfull" OSDs onto which to move PGs.

I made some changes to the osdmap code to ensure the computed "overfull" and 
"underfull" OSD lists were the same size even if the least or most full OSDs 
were within the expected deviation in order to allow those outside of the 
expected deviation some relief, and it worked nicely. I have two independent, 
production pools that were both in this state, and now every OSD across both 
pools has 56 or 57 PGs as expected.

I intend to put together a pull request to push this upstream. I haven't 
reviewed the balancer module code to see how it's doing things, but assuming it 
uses osdmaptool or the same upmap code as osdmaptool this should also improve 
the balancer module.
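
For anyone who wants to reproduce this offline, the rough procedure I used was (pool
name and deviation value are only illustrative):

ceph osd getmap -o om
osdmaptool om --upmap out.txt --upmap-pool rbd --upmap-max 100 --upmap-deviation .01
cat out.txt

Stepping through osdmaptool in a debugger is what exposed the overfull/underfull list
behavior described above.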






Steve Taylor | Senior Software Engineer | StorageCraft Technology 
Corporation
380 Data Drive Suite 300 | Draper | Utah | 84020
Office: 801.871.2799 |



If you are not the intended recipient of this message or received it 
erroneously, please notify the sender and delete it, together with any 
attachments, and be advised that any dissemination or copying of this message 
is prohibited.



On Tue, 2018-11-06 at 12:23 +0700, Konstantin Shalygin wrote:

From the balancer module's code for v 12.2.7 I noticed [1] these lines which
reference [2] these 2 config options for upmap. You might try using more max
iterations or a smaller max deviation to see if you can get a better balance
in your cluster. I would try to start with [3] these commands/values and see
if it improves your balance and/or allows you to generate a better map.

[1] https://github.com/ceph/ceph/blob/v12.2.7/src/pybind/mgr/balancer/module.py#L671-L672
[2] upmap_max_iterations (default 10)
    upmap_max_deviation (default .01)
[3] ceph config-key set mgr/balancer/upmap_max_iterations 50
    ceph config-key set mgr/balancer/upmap_max_deviation .005


This did not help my 12.2.8 cluster. While the first balancing iterations were 
running I decreased max_misplaced from the default 0.05 to 0.01. After that the 
balancing operations stopped.

After the cluster reached HEALTH_OK, I have not seen any further balancer runs. 
I tried lowering the balancer variables and restarting the mgr - the message is 
still: "Error EALREADY: Unable to find further optimization, or distribution is 
already perfect"

# ceph config-key dump | grep balancer
"mgr/balancer/active": "1",
"mgr/balancer/max_misplaced": ".50",
"mgr/balancer/mode": "upmap",
"mgr/balancer/upmap_max_deviation": ".001",
"mgr/balancer/upmap_max_iterations": "100",


So maybe I need to delete the upmaps and start over?
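
(For reference, clearing the existing upmap exceptions would look roughly like 
this - a sketch only, the PG ids below are placeholders:)

  # See which PGs currently have upmap exceptions
  ceph osd dump | grep pg_upmap

  # Remove them one PG at a time
  ceph osd rm-pg-upmap-items 1.2f
  ceph osd rm-pg-upmap-items 1.30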


ID  CLASS WEIGHT  REWEIGHT SIZE    USE     AVAIL   %USE  VAR  PGS TYPE NAME
 -1       414.0   -        445TiB  129TiB  316TiB  29.01 1.00   - root default
 -7       414.0   -        445TiB  129TiB  316TiB  29.01 1.00   - datacenter rtcloud
 -8       138.0   -        148TiB  42.9TiB 105TiB  28.93 1.00   - rack rack2
 -2        69.0   -        74.2TiB 21.5TiB 52.7TiB 28.93 1.00   - host ceph-osd0
  0   hdd   5.0   1.0      5.46TiB 1.64TiB 3.82TiB 30.06 1.04  62 osd.0
  4   hdd   5.0   1.0      5.46TiB 1.65TiB 3.80TiB 30.29 1.04  64 osd.4
  7   hdd   5.0   1.0      5.46TiB 1.61TiB 3.85TiB 29.44 1.01  63 osd.7
  9   hdd   5.0   1.0      5.46TiB 1.68TiB 3.78TiB 30.77 1.06  63 osd.9
 46   hdd   5.0   1.0      5.46TiB 1.68TiB 3.77TiB 30.86 1.06  65 osd.46
 47   hdd   5.0   1.0      5.46TiB 1.68TiB 3.78TiB 30.73 1.06  66 osd.47
 48   hdd   5.0   1.0      5.46TiB 1.65TiB 3.81TiB 30.22 1.04  66 osd.48
 49   hdd   5.0   1.0      5.46TiB 1.71TiB 3.74TiB 31.41 1.08  65 osd.49
 54   hdd   5.0   1.0      5.46TiB 1.64TiB 3.82TiB 30.08 1.04  65 osd.54
 55   hdd   5.0   1.0      5.46TiB 1.65TiB 3.80TiB 30.30 1.04  64 osd.55
 56   hdd   5.0   1.0      5.46TiB 1.66TiB 3.80TiB 30.35 1.05  64 osd.56
 57   hdd   5.0   1.0      5.46TiB 1.63TiB 3.83TiB 29.81 1.03  64 osd.57
 24  nvme   3.0   1.0      2.89TiB 559GiB  2.34TiB 18.88 0.65  63

Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
But this is correct, isn't it?


root@osd1:~# ceph-volume lvm list --format=json hdd60/data60
{
"60": [
{
"devices": [
"/dev/sdh"
],
"lv_name": "data60",
"lv_path": "/dev/hdd60/data60",
"lv_size": "3.64t",
"lv_tags":
"ceph.block_device=/dev/hdd60/data60,ceph.block_uuid=ycRaVn-O70Q-Ci43-2IN3-U5ua-lnqL-IE9jVb,ceph.cephx_lockbox_secret=,ceph.cluster_fsid=fef5bc3c-3912-4a77-a077-3398f21cc16d,ceph.cluster_name=ceph,ceph.crush_device_class=None,ceph.db_device=/dev/ssd0/db60,ceph.db_uuid=d32eQz-79GQ-2eJD-4ANB-vr0O-bDpb-fjWSD5,ceph.encrypted=0,ceph.osd_fsid=e0d69288-13e1-4023-a812-9d313204f600,ceph.osd_id=60,ceph.type=block,ceph.vdo=0",
"lv_uuid": "ycRaVn-O70Q-Ci43-2IN3-U5ua-lnqL-IE9jVb",
"name": "data60",
"path": "/dev/hdd60/data60",
"tags": {
"ceph.block_device": "/dev/hdd60/data60",
"ceph.block_uuid":
"ycRaVn-O70Q-Ci43-2IN3-U5ua-lnqL-IE9jVb",
"ceph.cephx_lockbox_secret": "",
"ceph.cluster_fsid":
"fef5bc3c-3912-4a77-a077-3398f21cc16d",
"ceph.cluster_name": "ceph",
"ceph.crush_device_class": "None",
"ceph.db_device": "/dev/ssd0/db60",
"ceph.db_uuid": "d32eQz-79GQ-2eJD-4ANB-vr0O-bDpb-fjWSD5",
"ceph.encrypted": "0",
"ceph.osd_fsid": "e0d69288-13e1-4023-a812-9d313204f600",
"ceph.osd_id": "60",
"ceph.type": "block",
"ceph.vdo": "0"
},
"type": "block",
"vg_name": "hdd60"
}
]
}

On Tue, Nov 6, 2018 at 11:00 AM, Hayashida, Mami 
wrote:

> I see.  Thank you for clarifying lots of things along the way -- this has
> been extremely helpful.   Neither "df | grep osd" nor "mount | grep osd"
> shows ceph-60 through 69.
>
> On Tue, Nov 6, 2018 at 10:57 AM, Hector Martin 
> wrote:
>
>>
>>
>> On 11/7/18 12:48 AM, Hayashida, Mami wrote:
>> > All other OSDs that I converted (#60-69) look basically identical while
>> > the Filestore OSDs (/var/lib/ceph/osd/ceph-70 etc.) look different
>> > obviously.  When I run "df" it does NOT list those converted osds (only
>> > the Filestore ones).  In other words, /dev/sdh1 where osd.60 should be
>> > is not listed.  (Should it be?)  Neither does mount lists that drive.
>> >  ("df | grep sdh" and "mount | grep sdh" both return nothing)
>>
>> /dev/sdh1 no longer exists. Remember, we converted the drives to be LVM
>> physical volumes. There are no partitions any more. It's all in an LVM
>> volume backed by /dev/sdh (without the 1).
>>
>> What *should* be mounted at the OSD paths are tmpfs filesystems, i.e.
>> ramdisks. Those would not reference sdh so of course those commands will
>> return nothing. Try "df | grep osd" and "mount | grep osd" instead and
>> see if ceph-60 through ceph-69 show up.
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>>
>
>
>
> --
> *Mami Hayashida*
>
> *Research Computing Associate*
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] list admin issues

2018-11-06 Thread Janne Johansson
Den lör 6 okt. 2018 kl 15:06 skrev Elias Abacioglu
:
> I'm bumping this old thread cause it's getting annoying. My membership get 
> disabled twice a month.
> Between my two Gmail accounts I'm in more than 25 mailing lists and I see 
> this behavior only here. Why is only ceph-users only affected? Maybe 
> Christian was on to something, is this intentional?
> Reality is that there is a lot of ceph-users with Gmail accounts, perhaps it 
> wouldn't be so bad to actually trying to figure this one out?
> So can the maintainers of this list please investigate what actually gets 
> bounced? Look at my address if you want.
> I got disabled 20181006, 20180927, 20180916, 20180725, 20180718 most recently.

Guess it's time for it again.

-- 
May the most significant bit of your life be positive.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
On 11/7/18 1:00 AM, Hayashida, Mami wrote:
> I see.  Thank you for clarifying lots of things along the way -- this
> has been extremely helpful.   Neither "df | grep osd" nor "mount | grep
> osd" shows ceph-60 through 69.

OK, that isn't right then. I suggest you try this:

1) bring down OSD 60-69 (systemctl stop ceph-osd@60 etc)

2) move those directories out of the way, as in:

mkdir /var/lib/ceph/osd_old
mv /var/lib/ceph/osd/ceph-6[0-9] /var/lib/ceph/osd_old

(if this all works out you can delete them, just want to make sure you
don't accidentally wipe something important)

2) run `find /etc/systemd/system | grep ceph-volume` and check the
output. You're looking for symlinks in multi-user.target.wants or similar.

There should be a single "ceph-volume@lvm-<id>-<uuid>" entry for each
OSD, and the id and uuid should match the "ceph.osd_id" and
"ceph.osd_fsid" LVM tags from `ceph-volume lvm list`. You can also use
`lvs -o vg_name,name,lv_tags`

If you see anything of the format "ceph-volume@simple-..." then that is
old junk from previous attempts at using ceph-volume. They should be
symlinks and you should delete them and run `systemctl daemon-reload`.
Same story if you see any @lvm symlinks but with incorrect OSD IDs or
fsids. All of this should be recreated by the next step anyway if
deleted, so it should be safe to delete any symlinks in there that you
think might be wrong.

3) Run `ceph-volume lvm activate --all`

At this point `df` and `mount` should show tmpfs mounts for all your LVM
OSDs, and they should be up. List the OSD directories and check that
both `block` and `block.db` entries are symlinks to the right devices.
The right target symlinks should also have been created/enabled in
/etc/systemd/system/multi-user.target.wants.

The LVM dump you provided is correct. I suspect what happened is that
somewhere during this experiment OSDs were activated into the root
filesystem (instead of a tmpfs), perhaps using the ceph-volume simple
mode, perhaps something else. Since all the metadata is in LVM, it's
safe to move or delete all those OSD directories for BlueStore OSDs and
try activating them cleanly again, which hopefully will do the right thing.

In the end this all might fix your device ownership woes too, making the
udev rule unnecessary. If it all works out, try a reboot and see if
everything comes back up as it should.
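
(To be explicit, the sanity checks after step 3 would look roughly like this - 
a sketch, adjust the OSD id as needed:)

  df -h | grep ceph-6                      # ceph-60..69 should each be a tmpfs mount
  mount | grep ceph-6
  ls -l /var/lib/ceph/osd/ceph-60/block /var/lib/ceph/osd/ceph-60/block.db
  ls /etc/systemd/system/multi-user.target.wants/ | grep ceph-volume@lvm
  systemctl start ceph-osd@60              # then 61-69 once the mounts look right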

-- 
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
Ok. I will go through this this afternoon and let you guys know the
result.  Thanks!

On Tue, Nov 6, 2018 at 11:32 AM, Hector Martin 
wrote:

> On 11/7/18 1:00 AM, Hayashida, Mami wrote:
> > I see.  Thank you for clarifying lots of things along the way -- this
> > has been extremely helpful.   Neither "df | grep osd" nor "mount | grep
> > osd" shows ceph-60 through 69.
>
> OK, that isn't right then. I suggest you try this:
>
> 1) bring down OSD 60-69 (systemctl stop ceph-osd@60 etc)
>
> 2) move those directories out of the way, as in:
>
> mkdir /var/lib/ceph/osd_old
> mv /var/lib/ceph/osd/ceph-6[0-9] /var/lib/ceph/osd_old
>
> (if this all works out you can delete them, just want to make sure you
> don't accidentally wipe something important)
>
> 2) run `find /etc/systemd/system | grep ceph-volume` and check the
> output. You're looking for symlinks in multi-user.target.wants or similar.
>
> There should be a single "ceph-volume@lvm--" entry for each
> OSD, and the id and uuid should match the "ceph.osd_id" and
> "ceph.osd_fsid" LVM tags from `ceph-volume lvm list`. You can also use
> `lvs -o vg_name,name,lv_tags`
>
> If you see anything of the format "ceph-volume@simple-..." then that is
> old junk from previous attempts at using ceph-volume. They should be
> symlinks and you should delete them and run `systemctl daemon-reload`.
> Same story if you see any @lvm symlinks but with incorrect OSD IDs or
> fsids. All of this should be recreated by the next step anyway if
> deleted, so it should be safe to delete any symlinks in there that you
> think might be wrong.
>
> 3) Run `ceph-volume lvm activate --all`
>
> At this point `df` and `mount` should show tmpfs mounts for all your LVM
> OSDs, and they should be up. List the OSD directories and check that
> both `block` and `block.db` entries are symlinks to the right devices.
> The right target symlinks should also have been created/enabled in
> /etc/systemd/system/multi-user.target.wants.
>
> The LVM dump you provided is correct. I suspect what happened is that
> somewhere during this experiment OSDs were activated into the root
> filesystem (instead of a tmpfs), perhaps using the ceph-volume simple
> mode, perhaps something else. Since all the metadata is in LVM, it's
> safe to move or delete all those OSD directories for BlueStore OSDs and
> try activating them cleanly again, which hopefully will do the right thing.
>
> In the end this all might fix your device ownership woes too, making the
> udev rule unnecessary. If it all works out, try a reboot and see if
> everything comes back up as it should.
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
1. Stopped osd.60-69:  no problem
2. Skipped this and went to #3 to check first
3. Here, `find /etc/systemd/system | grep ceph-volume` returned nothing.  I
see in that directory

/etc/systemd/system/ceph-disk@60.service    # and 61 - 69.

No ceph-volume entries.


On Tue, Nov 6, 2018 at 11:43 AM, Hayashida, Mami 
wrote:

> Ok. I will go through this this afternoon and let you guys know the
> result.  Thanks!
>
> On Tue, Nov 6, 2018 at 11:32 AM, Hector Martin 
> wrote:
>
>> On 11/7/18 1:00 AM, Hayashida, Mami wrote:
>> > I see.  Thank you for clarifying lots of things along the way -- this
>> > has been extremely helpful.   Neither "df | grep osd" nor "mount | grep
>> > osd" shows ceph-60 through 69.
>>
>> OK, that isn't right then. I suggest you try this:
>>
>> 1) bring down OSD 60-69 (systemctl stop ceph-osd@60 etc)
>>
>> 2) move those directories out of the way, as in:
>>
>> mkdir /var/lib/ceph/osd_old
>> mv /var/lib/ceph/osd/ceph-6[0-9] /var/lib/ceph/osd_old
>>
>> (if this all works out you can delete them, just want to make sure you
>> don't accidentally wipe something important)
>>
>> 2) run `find /etc/systemd/system | grep ceph-volume` and check the
>> output. You're looking for symlinks in multi-user.target.wants or similar.
>>
>> There should be a single "ceph-volume@lvm--" entry for each
>> OSD, and the id and uuid should match the "ceph.osd_id" and
>> "ceph.osd_fsid" LVM tags from `ceph-volume lvm list`. You can also use
>> `lvs -o vg_name,name,lv_tags`
>>
>> If you see anything of the format "ceph-volume@simple-..." then that is
>> old junk from previous attempts at using ceph-volume. They should be
>> symlinks and you should delete them and run `systemctl daemon-reload`.
>> Same story if you see any @lvm symlinks but with incorrect OSD IDs or
>> fsids. All of this should be recreated by the next step anyway if
>> deleted, so it should be safe to delete any symlinks in there that you
>> think might be wrong.
>>
>> 3) Run `ceph-volume lvm activate --all`
>>
>> At this point `df` and `mount` should show tmpfs mounts for all your LVM
>> OSDs, and they should be up. List the OSD directories and check that
>> both `block` and `block.db` entries are symlinks to the right devices.
>> The right target symlinks should also have been created/enabled in
>> /etc/systemd/system/multi-user.target.wants.
>>
>> The LVM dump you provided is correct. I suspect what happened is that
>> somewhere during this experiment OSDs were activated into the root
>> filesystem (instead of a tmpfs), perhaps using the ceph-volume simple
>> mode, perhaps something else. Since all the metadata is in LVM, it's
>> safe to move or delete all those OSD directories for BlueStore OSDs and
>> try activating them cleanly again, which hopefully will do the right
>> thing.
>>
>> In the end this all might fix your device ownership woes too, making the
>> udev rule unnecessary. If it all works out, try a reboot and see if
>> everything comes back up as it should.
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>>
>
>
>
> --
> *Mami Hayashida*
>
> *Research Computing Associate*
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin
On 11/7/18 5:27 AM, Hayashida, Mami wrote:
> 1. Stopped osd.60-69:  no problem
> 2. Skipped this and went to #3 to check first
> 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
> nothing.  I see in that directory 
> 
> /etc/systemd/system/ceph-disk@60.service    # and 61 - 69. 
> 
> No ceph-volume entries. 

Get rid of those, they also shouldn't be there. Then `systemctl
daemon-reload` and continue, and see if you get into a good state. Basically,
feel free to nuke anything in there related to OSD 60-69, since whatever
is needed should be taken care of by the ceph-volume activation.
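
Concretely, something like this (a sketch - double-check the unit names before
removing anything):

  # Drop the stale ceph-disk units for OSDs 60-69
  rm /etc/systemd/system/ceph-disk@6{0..9}.service
  systemctl daemon-reload

  # Then retry the activation
  ceph-volume lvm activate --all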


-- 
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hayashida, Mami
This is becoming even more confusing. I got rid of those
ceph-disk@6[0-9].service units
(which had been symlinked to /dev/null). Moved
/var/lib/ceph/osd/ceph-6[0-9] to /var/./osd_old/. Then I ran
`ceph-volume lvm activate --all`. Once again I got:

root@osd1:~# ceph-volume lvm activate --all
--> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-1bf13d09fb3d
Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
--> Absolute path not found for executable: restorecon
--> Ensure $PATH environment variable contains common executable locations
Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev
/dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
 stderr: failed to read label for /dev/hdd67/data67: (2) No such file or
directory
-->  RuntimeError: command returned non-zero exit status: 1

But when I ran `df` and `mount`, ceph-67 is the only one that exists (and the
only one in /var/lib/ceph/osd/):

root@osd1:~# df -h | grep ceph-6
tmpfs   126G 0  126G   0% /var/lib/ceph/osd/ceph-67

root@osd1:~# mount | grep ceph-6
tmpfs on /var/lib/ceph/osd/ceph-67 type tmpfs (rw,relatime)

root@osd1:~# ls /var/lib/ceph/osd/ | grep ceph-6
ceph-67

But I cannot restart any of these 10 daemons (`systemctl start ceph-osd@6[0-9]`).

I am wondering if I should zap these 10 OSDs and start over, although at
this point I am afraid even zapping may not be a simple task.



On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin  wrote:

> On 11/7/18 5:27 AM, Hayashida, Mami wrote:
> > 1. Stopped osd.60-69:  no problem
> > 2. Skipped this and went to #3 to check first
> > 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
> > nothing.  I see in that directory
> >
> > /etc/systemd/system/ceph-disk@60.service# and 61 - 69.
> >
> > No ceph-volume entries.
>
> Get rid of those, they also shouldn't be there. Then `systemctl
> daemon-reload` and continue, see if you get into a good state. basically
> feel free to nuke anything in there related to OSD 60-69, since whatever
> is needed should be taken care of by the ceph-volume activation.
>
>
> --
> Hector Martin (hec...@marcansoft.com)
> Public Key: https://mrcn.st/pub
>



-- 
*Mami Hayashida*

*Research Computing Associate*
Research Computing Infrastructure
University of Kentucky Information Technology Services
301 Rose Street | 102 James F. Hardymon Building
Lexington, KY 40506-0495
mami.hayash...@uky.edu
(859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Hector Martin "marcan"
If /dev/hdd67/data67 does not exist, try `vgchange -a y` and that should make 
it exist, then try again. Not sure why this would ever happen, though, since I 
expect lower level stuff to take care of activating LVM LVs.

If it does exist, I get the feeling that your original ceph-volume prepare 
command created the OSD filesystems in your root filesystem as files (probably 
because the OSD directories already existed for some reason). In that case, 
yes, you should re-create them, since the first time it wasn't done correctly. 
Before you do that, make sure you unmount the tmpfs that is now mounted, that 
no osd directories remain for your BlueStore OSDs, that you remove them from 
the mons, etc. You want to make sure your environment is clean so everything 
works as it should. Might be worth removing and re-creating the LVM LVs to make 
sure the tags are gone too.
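
Something along these lines, as a sketch (the VG/LV names match the ones from
your earlier listing, adjust if yours differ):

  # Make sure the LVM volumes are active and the device nodes exist
  vgchange -a y
  lvs -o vg_name,lv_name,lv_path,lv_tags | grep hdd67

  # The block device ceph-volume complained about should now be present
  ls -l /dev/hdd67/data67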

On November 7, 2018 6:12:43 AM GMT+09:00, "Hayashida, Mami" 
 wrote:
>This is becoming even more confusing. I got rid of those
>ceph-disk@6[0-9].service
>(which had been symlinked to /dev/null).  Moved
>/var/lib/ceph/osd/ceph-6[0-9] to  /var/./osd_old/.  Then, I ran
>`ceph-volume lvm activate --all`.  I got once again
>
>root@osd1:~# ceph-volume lvm activate --all
>--> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-1bf13d09fb3d
>Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
>--> Absolute path not found for executable: restorecon
>--> Ensure $PATH environment variable contains common executable
>locations
>Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev
>/dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>stderr: failed to read label for /dev/hdd67/data67: (2) No such file or
>directory
>-->  RuntimeError: command returned non-zero exit status: 1
>
>But when I ran `df` and `mount` ceph-67 is the only one that exists.
>(and
>in  /var/lib/ceph/osd/)
>
>root@osd1:~# df -h | grep ceph-6
>tmpfs   126G 0  126G   0% /var/lib/ceph/osd/ceph-67
>
>root@osd1:~# mount | grep ceph-6
>tmpfs on /var/lib/ceph/osd/ceph-67 type tmpfs (rw,relatime)
>
>root@osd1:~# ls /var/lib/ceph/osd/ | grep ceph-6
>ceph-67
>
>But in I cannot restart any of these 10 daemons (`systemctl start
>ceph-osd@6
>[0-9]`).
>
>I am wondering if I should zap these 10 osds and start over although at
>this point I am afraid even zapping may not be a simple task
>
>
>
>On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin 
>wrote:
>
>> On 11/7/18 5:27 AM, Hayashida, Mami wrote:
>> > 1. Stopped osd.60-69:  no problem
>> > 2. Skipped this and went to #3 to check first
>> > 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
>> > nothing.  I see in that directory
>> >
>> > /etc/systemd/system/ceph-disk@60.service# and 61 - 69.
>> >
>> > No ceph-volume entries.
>>
>> Get rid of those, they also shouldn't be there. Then `systemctl
>> daemon-reload` and continue, see if you get into a good state.
>basically
>> feel free to nuke anything in there related to OSD 60-69, since
>whatever
>> is needed should be taken care of by the ceph-volume activation.
>>
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>>
>
>
>
>-- 
>*Mami Hayashida*
>
>*Research Computing Associate*
>Research Computing Infrastructure
>University of Kentucky Information Technology Services
>301 Rose Street | 102 James F. Hardymon Building
>Lexington, KY 40506-0495
>mami.hayash...@uky.edu
>(859)323-7521

-- 
Hector Martin "marcan" (hec...@marcansoft.com)
Public key: https://mrcn.st/pub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Filestore to Bluestore migration question

2018-11-06 Thread Alfredo Deza
It is pretty difficult to know what step you are missing if all we are
getting is the `activate --all` command.

Maybe try the OSDs one by one, capturing each command throughout the
process along with its output. In the filestore-to-bluestore guides we never
advertise `activate --all`, for example.

Something is missing here, and I can't tell what it is.
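
As a sketch, per OSD that would be something like the following (using the LV
from earlier in this thread; substitute the osd fsid reported by the listing):

  # Show the osd id and osd fsid stored in the LV tags
  ceph-volume lvm list hdd60/data60

  # Activate just that one OSD instead of --all, capturing the full output
  ceph-volume lvm activate 60 <osd_fsid>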
On Tue, Nov 6, 2018 at 4:13 PM Hayashida, Mami  wrote:
>
> This is becoming even more confusing. I got rid of those 
> ceph-disk@6[0-9].service (which had been symlinked to /dev/null).  Moved 
> /var/lib/ceph/osd/ceph-6[0-9] to  /var/./osd_old/.  Then, I ran  
> `ceph-volume lvm activate --all`.  I got once again
>
> root@osd1:~# ceph-volume lvm activate --all
> --> Activating OSD ID 67 FSID 17cd6755-76f9-4160-906c-1bf13d09fb3d
> Running command: mount -t tmpfs tmpfs /var/lib/ceph/osd/ceph-67
> --> Absolute path not found for executable: restorecon
> --> Ensure $PATH environment variable contains common executable locations
> Running command: ceph-bluestore-tool --cluster=ceph prime-osd-dir --dev 
> /dev/hdd67/data67 --path /var/lib/ceph/osd/ceph-67
>  stderr: failed to read label for /dev/hdd67/data67: (2) No such file or 
> directory
> -->  RuntimeError: command returned non-zero exit status: 1
>
> But when I ran `df` and `mount` ceph-67 is the only one that exists. (and in  
> /var/lib/ceph/osd/)
>
> root@osd1:~# df -h | grep ceph-6
> tmpfs   126G 0  126G   0% /var/lib/ceph/osd/ceph-67
>
> root@osd1:~# mount | grep ceph-6
> tmpfs on /var/lib/ceph/osd/ceph-67 type tmpfs (rw,relatime)
>
> root@osd1:~# ls /var/lib/ceph/osd/ | grep ceph-6
> ceph-67
>
> But in I cannot restart any of these 10 daemons (`systemctl start 
> ceph-osd@6[0-9]`).
>
> I am wondering if I should zap these 10 osds and start over although at this 
> point I am afraid even zapping may not be a simple task
>
>
>
> On Tue, Nov 6, 2018 at 3:44 PM, Hector Martin  wrote:
>>
>> On 11/7/18 5:27 AM, Hayashida, Mami wrote:
>> > 1. Stopped osd.60-69:  no problem
>> > 2. Skipped this and went to #3 to check first
>> > 3. Here, `find /etc/systemd/system | grep ceph-volume` returned
>> > nothing.  I see in that directory
>> >
>> > /etc/systemd/system/ceph-disk@60.service# and 61 - 69.
>> >
>> > No ceph-volume entries.
>>
>> Get rid of those, they also shouldn't be there. Then `systemctl
>> daemon-reload` and continue, see if you get into a good state. basically
>> feel free to nuke anything in there related to OSD 60-69, since whatever
>> is needed should be taken care of by the ceph-volume activation.
>>
>>
>> --
>> Hector Martin (hec...@marcansoft.com)
>> Public Key: https://mrcn.st/pub
>
>
>
>
> --
> Mami Hayashida
> Research Computing Associate
>
> Research Computing Infrastructure
> University of Kentucky Information Technology Services
> 301 Rose Street | 102 James F. Hardymon Building
> Lexington, KY 40506-0495
> mami.hayash...@uky.edu
> (859)323-7521
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rbd mirror journal data

2018-11-06 Thread Wei Jin
Yes, we do one-way replication and the 'remote' cluster is the secondary
cluster, so the rbd-mirror daemon runs there.
We can confirm the daemon is working because we have observed IO workload from it.
And the remote cluster is actually bigger than the 'local' cluster, so it should
be able to keep up with the IO workload. So it is confusing why there is so much
journal data that cannot be trimmed immediately. (The local cluster also has the
capacity to handle more IO, including trimming operations.)


> On Nov 6, 2018, at 9:25 PM, Jason Dillaman  wrote:
> 
> On Tue, Nov 6, 2018 at 1:12 AM Wei Jin  > wrote:
>> 
>> Thanks.
>> I found that both minimum and active set are very large in my cluster, is it 
>> expected?
>> By the way, I do snapshot for each image half an hour,and keep snapshots for 
>> two days.
>> 
>> Journal status:
>> 
>> minimum_set: 671839
>> active_set: 1197917
>> registered clients:
>>[id=, commit_position=[positions=[[object_number=4791670, tag_tid=3, 
>> entry_tid=4146742458], [object_number=4791669, tag_tid=3, 
>> entry_tid=4146742457], [object_number=4791668, tag_tid=3, 
>> entry_tid=4146742456], [object_number=4791671, tag_tid=3, 
>> entry_tid=4146742455]]], state=connected]
>>[id=89024ad3-57a7-42cc-99d4-67f33b093704, 
>> commit_position=[positions=[[object_number=2687357, tag_tid=3, 
>> entry_tid=1188516421], [object_number=2687356, tag_tid=3, 
>> entry_tid=1188516420], [object_number=2687359, tag_tid=3, 
>> entry_tid=1188516419], [object_number=2687358, tag_tid=3, 
>> entry_tid=1188516418]]], state=connected]
>> 
> 
> Are you attempting to run "rbd-mirror" daemon on a remote cluster? It
> just appears like either the daemon is not running or that it's so far
> behind that it's just not able to keep up with the IO workload of the
> image. You can run "rbd journal disconnect --image 
> --client-id=89024ad3-57a7-42cc-99d4-67f33b093704" to force-disconnect
> the remote client and start the journal trimming process.
> 
>>> On Nov 6, 2018, at 3:39 AM, Jason Dillaman  wrote:
>>> 
>>> On Sun, Nov 4, 2018 at 11:59 PM Wei Jin  wrote:
 
 Hi, Jason,
 
 I have a question about rbd mirroring. When enable mirroring, we observed 
 that there are a lot of objects prefix with journal_data, thus it consumes 
 a lot of disk space.
 
 When will these journal objects be deleted? And are there any parameters 
 to accelerate it?
 Thanks.
 
>>> 
>>> Journal data objects should be automatically deleted when the journal
>>> is trimmed beyond the position of the object. If you run "rbd journal
>>> status --image ", you should be able to see the minimum
>>> in-use set and the current active set for new journal entries:
>>> 
>>> $ rbd --cluster cluster1 journal status --image image1
>>> minimum_set: 7
>>> active_set: 8
>>> registered clients:
>>> [id=, commit_position=[positions=[[object_number=33, tag_tid=2,
>>> entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
>>> [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
>>> tag_tid=2, entry_tid=49150]]], state=connected]
>>> [id=81672c30-d735-46d4-a30a-53c221954d0e,
>>> commit_position=[positions=[[object_number=30, tag_tid=2,
>>> entry_tid=48034], [object_number=29, tag_tid=2, entry_tid=48033],
>>> [object_number=28, tag_tid=2, entry_tid=48032], [object_number=31,
>>> tag_tid=2, entry_tid=48031]]], state=connected]
>>> 
>>> $ rados --cluster cluster1 --pool rbd ls | grep journal_data | sort
>>> journal_data.1.1029b4577f90.28
>>> journal_data.1.1029b4577f90.29
>>> journal_data.1.1029b4577f90.30
>>> journal_data.1.1029b4577f90.31
>>> journal_data.1.1029b4577f90.32
>>> journal_data.1.1029b4577f90.33
>>> journal_data.1.1029b4577f90.34
>>> journal_data.1.1029b4577f90.35
>>> <..>
>>> 
>>> $ rbd --cluster cluster1 journal status --image image1
>>> minimum_set: 8
>>> active_set: 8
>>> registered clients:
>>> [id=, commit_position=[positions=[[object_number=33, tag_tid=2,
>>> entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
>>> [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
>>> tag_tid=2, entry_tid=49150]]], state=connected]
>>> [id=81672c30-d735-46d4-a30a-53c221954d0e,
>>> commit_position=[positions=[[object_number=33, tag_tid=2,
>>> entry_tid=49153], [object_number=32, tag_tid=2, entry_tid=49152],
>>> [object_number=35, tag_tid=2, entry_tid=49151], [object_number=34,
>>> tag_tid=2, entry_tid=49150]]], state=connected]
>>> 
>>> $ rados --cluster cluster1 --pool rbd ls | grep journal_data | sort
>>> journal_data.1.1029b4577f90.32
>>> journal_data.1.1029b4577f90.33
>>> journal_data.1.1029b4577f90.34
>>> journal_data.1.1029b4577f90.35
>>> 
>>> --
>>> Jason
>> 
> 
> 
> -- 
> Jason

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Balancer module not balancing perfectly

2018-11-06 Thread Konstantin Shalygin

On 11/6/18 11:02 PM, Steve Taylor wrote:
I intend to put together a pull request to push this upstream. I 
haven't reviewed the balancer module code to see how it's doing 
things, but assuming it uses osdmaptool or the same upmap code as 
osdmaptool this should also improve the balancer module.


Indeed you should make a PR to review this improvements.



k

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Packages for debian in Ceph repo

2018-11-06 Thread Nicolas Huillard
Le mardi 30 octobre 2018 à 18:14 +0100, Kevin Olbrich a écrit :
> Proxmox has support for rbd as they ship additional packages as well
> as
> ceph via their own repo.
> 
> I ran your command and got this:
> 
> > qemu-img version 2.8.1(Debian 1:2.8+dfsg-6+deb9u4)
> > Copyright (c) 2003-2016 Fabrice Bellard and the QEMU Project
> > developers
> > Supported formats: blkdebug blkreplay blkverify bochs cloop dmg
> > file ftp
> > ftps gluster host_cdrom host_device http https iscsi iser luks nbd
> > nfs
> > null-aio null-co parallels qcow qcow2 qed quorum raw rbd
> > replication
> > sheepdog ssh vdi vhdx vmdk vpc vvfat
> 
> 
> It lists rbd but still fails with the exact same error.

I stumbled upon the exact same error, and since there was no answer
anywhere, I figured it was a very simple problem: don't forget to
install the qemu-block-extra package (Debian stretch) along with
qemu-utils, which contains the qemu-img command.
The command is actually compiled with rbd support (hence the output
above), but it needs this extra package to pull in the actual support code
and dependencies...
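
In other words, on stretch something like this (a sketch - the pool/image names
are just placeholders):

  apt-get install qemu-utils qemu-block-extra

  # rbd targets should now actually work, e.g.
  qemu-img convert -f qcow2 -O raw disk.qcow2 rbd:rbd/test-image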

-- 
Nicolas Huillard
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com