[ceph-users] Recommended OSD size

2016-05-13 Thread gjprabu
Hi All,



We need some clarification on Ceph OSD, MON and MDS. It would be very
helpful for our understanding to know the details below.

What is the recommended size per OSD (for both HDD/SCSI and SSD)?

Which is recommended: one OSD per machine, or many OSDs per machine?

Do we need to run a separate machine for the monitors?

Where should the MDS run: on a separate machine, or is it better to run it on an OSD host?

We are going to use the CephFS file system in production.




Regards

Prabu GJ









Re: [ceph-users] PGS stuck inactive and osd down

2016-05-13 Thread Vincenzo Pii

> On 12 May 2016, at 19:27, Vincenzo Pii  wrote:
> 
> I have installed a new ceph cluster with ceph-ansible (using the same version 
> and playbook that had worked before, with some necessary changes to 
> variables).
> 
> The only major difference is that now an osd (osd3) has a disk twice as big 
> as the others, thus a different weight (check the crushmap excerpt below).
> 
> Ceph version is jewel (10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)) and 
> the setup has a single monitor node (it will be three in production) and 
> three osds.
> 
> Any help to find the issue will be highly appreciated!
> 
> # ceph status
> cluster f7f42c59-b8ec-4d68-bb09-41f7a10c6223
>  health HEALTH_ERR
> 448 pgs are stuck inactive for more than 300 seconds
> 448 pgs stuck inactive
>  monmap e1: 1 mons at {sbb=10.2.48.205:6789/0}
> election epoch 3, quorum 0 sbb
>   fsmap e8: 0/0/1 up
>  osdmap e10: 3 osds: 0 up, 0 in
> flags sortbitwise
>   pgmap v11: 448 pgs, 4 pools, 0 bytes data, 0 objects
> 0 kB used, 0 kB / 0 kB avail
>  448 creating
> 
> From the crushmap:
> 
> host osd1 {
> id -2   # do not change unnecessarily
> # weight 1.811
> alg straw
> hash 0  # rjenkins1
> item osd.0 weight 1.811
> }
> host osd2 {
> id -3   # do not change unnecessarily
> # weight 1.811
> alg straw
> hash 0  # rjenkins1
> item osd.1 weight 1.811
> }
> host osd3 {
> id -4   # do not change unnecessarily
> # weight 3.630
> alg straw
> hash 0  # rjenkins1
> item osd.2 weight 3.630
> }
> 
> Vincenzo Pii | TERALYTICS
> DevOps Engineer
> Teralytics AG | Zollstrasse 62 | 8005 Zurich | Switzerland 
> phone: +41 (0) 79 191 11 08
> email: vincenzo@teralytics.net 
> 
> www.teralytics.net 
> 
> Company registration number: CH-020.3.037.709-7 | Trade register Canton Zurich
> Board of directors: Georg Polzer, Luciano Franceschina, Mark Schmitz, Yann de 
> Vries
> 

Problem found: I had misconfigured the public_network and cluster_network variables
for some of the hosts (I had moved some configuration to host_vars).
It was easy to spot once I checked the logs of those hosts.



Re: [ceph-users] How do ceph clients determine a monitor's address (and esp. port) for initial connection?

2016-05-13 Thread Christian Sarrasin

Hi Greg,

Thanks again and good guess!  Amending testcluster.conf as follows:

mon host = 192.168.10.201:6788
mon addr = 192.168.10.201:6788

... gets around the problem.

having "mon host = mona:6788" also works.

Should I raise a defect or is this workaround good enough?

Cheers,
Christian

On 12/05/16 22:17, Gregory Farnum wrote:

On Thu, May 12, 2016 at 12:42 PM, Christian Sarrasin
 wrote:

Thanks Greg!

If I understood correctly, you're suggesting this:

cd /etc/ceph
grep -v 'mon host' testcluster.conf > testcluster_client.conf
diff testcluster.conf testcluster_client.conf
4d3
< mon host = mona
ceph -c ./testcluster_client.conf --cluster testcluster status
no monitors specified to connect to.
Error connecting to cluster: ObjectNotFound

So this doesn't seem to work.  Any other suggestion is most welcome.


Hmm, I'm clearly not remembering how the parsing works for these, and
it's a bit messy. You may be stuck using the full IP:port instead of
host names for the "mon host" config, if it's not working without
that. :/
-Greg



Cheers,
Christian


On 12/05/16 21:06, Gregory Farnum wrote:


On Thu, May 12, 2016 at 6:45 AM, Christian Sarrasin
 wrote:


I'm trying to run monitors on a non-standard port and having trouble
connecting to them.  The below shows the ceph client attempting to
connect
to default port 6789 rather than 6788:

ceph --cluster testcluster status
2016-05-12 13:31:12.246246 7f710478c700  0 -- :/2044977896 >>
192.168.10.201:6789/0 pipe(0x7f7100067550 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7f710005add0).fault
2016-05-12 13:31:15.247057 7f710468b700  0 -- :/2044977896 >>
192.168.10.201:6789/0 pipe(0x7f70f4000c00 sd=3 :0 s=1 pgs=0 cs=0 l=1
c=0x7f70f4004ef0).fault
... etc ...
^CError connecting to cluster: InterruptedOrTimeoutError

This is my embryonic config file:

cat /etc/ceph/testcluster.conf
[global]
fsid = fef4370d-6d97-43d2-b156-57c2a0357ee2
mon initial members = mona
mon host = mona
mon addr = 192.168.10.201:6788



This is *supposed* to work, but since it's not I bet the "mon host"
bit there is being used instead of the "mon addr" entry. Try clearing
that out from the client side.
-Greg


auth cluster required = cephx
auth service required = cephx
auth client required = cephx
public network = 192.168.10.0/24
cluster network = 192.168.10.0/24
osd journal size = 100

netstat shows the monitor listening on 6788 as expected.  If I regenerate
the env, changing 6788 to 6789 in the above, everything works as
expected.

I _thought_ ceph would use the IP:port from "mon addr" but clearly I'm
missing something...

This is ceph version 9.2.1 (752b6a3020c3de74e07d2a8b4c5e48dab5a6b6fd)

Background: I want to run monitors for two separate clusters on the same h/w
(OSDs on separate h/w). Both clusters will run different versions of ceph, so
I'm thinking of running mons for the 2nd cluster using docker (with
--net=host to minimize overhead, hence the need to change the port). I have
used a slightly modified version of ceph-docker to deploy the above.

Many thanks!
Christian



Re: [ceph-users] Recommended OSD size

2016-05-13 Thread Christian Balzer
On Fri, 13 May 2016 12:38:05 +0530 gjprabu wrote:

Hello,

> Hi All,
> 
> 
> 
> We need some clarification on CEPH OSD and MON and MDS. It will
> be very helpful and better understand to know below details.
>
You will want to spend more time reading the documentation and hardware
guides, as well as finding similar threads in the ML archives.
 
> 
> 
> Per OSD Recommended SIZE ( Both scsi and ssd ).
> 
With SCSI I suppose you mean HDDs?

And there is no good answer; it depends on your needs and use case.
For example if your main goal is space and not performance, fewer but
larger HDDs will be a better fit.
 
> 
> Which is recommended one (per machine = per OSD) or (Per machine = many
> OSD.)
>
The first part makes no sense; I suppose you mean one or a few OSDs per
server?

And again, it all depends on your goals and budget. 
Find and read the hardware guides; there are other considerations like
RAM and CPU.

Many OSDs per server can be complicated and challenging, unless you know
very well what you're doing.

The usual compromise between cost and density tends to be 2U servers with
12-14 drives.
 
> 
> 
> Do we need run separate machine for monitoring.
> 
If your OSDs are powerful enough (CPU/RAM/fast SSD for leveldb), not
necessarily.
You will want at least 3 MONs for production.

> 
> MDS where we need to run, is it separate machine or OSD itself is better.
> 
Again, it can be shared if you have enough resources on the OSDs.

A safe recommendation would be to have 1-2 dedicated MON and MDS hosts and
the rest of the MONs on OSD nodes.
These dedicated hosts need to have the lowest IPs in your cluster to become
the MON leader.

> 
> 
> CEPHFS file system we are going use for production.
> 
The most important statement/question last.

You will want to build a test cluster and verify that your application(s)
actually work well with CephFS, because if you read the ML there are cases
where this may not be true.

Christian
-- 
Christian Balzer        Network/Systems Engineer
ch...@gol.com           Global OnLine Japan/Rakuten Communications
http://www.gol.com/


Re: [ceph-users] rbd resize option

2016-05-13 Thread M Ranga Swami Reddy
Thank you. Now it's working: "resize2fs" first and then "rbd resize".
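
Roughly, the shrink order is as follows (just a sketch: /dev/vdb is the device from my test VM, while the 5G target, the mount point and the rbd/test-vol image name are only placeholders):

# unmount, check and shrink the filesystem first
umount /mnt/test
e2fsck -f /dev/vdb
resize2fs /dev/vdb 5G
# only then shrink the RBD image to match (rbd sizes are in MB)
rbd resize --size 5120 rbd/test-vol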

Thanks
Swami

On Thu, May 12, 2016 at 7:40 PM, M Ranga Swami Reddy
 wrote:
> sure...checking the resize2fs before using the "rbd resize"...
>
> Thanks
> Swami
>
> On Thu, May 12, 2016 at 7:17 PM, Eneko Lacunza  wrote:
>> You have to shrink FS before RBD block! Now your FS is corrupt! :)
>>
>> El 12/05/16 a las 15:41, M Ranga Swami Reddy escribió:
>>
>>> Used "resize2fs" and its working for resize to higher number (ie from
>>> 10G -> 20G) or so...
>>> If I tried to resize the lower numbers (ie from 10G -> 5G), its
>>> failied...with below message:
>>> ===
>>>  ubuntu@swami-resize-test-vm:/$ sudo resize2fs /dev/vdb
>>>
>>> sudo: unable to resolve host swami-resize-test-vm
>>>
>>> resize2fs 1.42.9 (4-Feb-2014)
>>>
>>> Please run 'e2fsck -f /dev/vdb' first.
>>>
>>>
>>> ubuntu@swami-resize-test-vm:/$ sudo e2fsck -f /dev/vdb
>>>
>>> sudo: unable to resolve host swami-resize-test-vm
>>>
>>> e2fsck 1.42.9 (4-Feb-2014)
>>>
>>> The filesystem size (according to the superblock) is 52428800 blocks
>>>
>>> The physical size of the device is 13107200 blocks
>>>
>>> Either the superblock or the partition table is likely to be corrupt!
>>>
>>> Abort?
>>>
>>> On Thu, May 12, 2016 at 6:37 PM, Eneko Lacunza  wrote:

 Swami,

 You must resize (reduce) a filesystem before shrinking a partition/disk.
 Please search online how to do so with your specific
 filesystem/partitions.

 El 12/05/16 a las 15:00, M Ranga Swami Reddy escribió:

> Not done any FS shrink before "rbd resize". Please let me know what to
> do with FS shink before "rbd resize"
>
> Thanks
> Swami
>
> On Thu, May 12, 2016 at 4:34 PM, Eneko Lacunza 
> wrote:
>>
>> Did you shrink the FS to be smaller than the target rbd size before
>> doing
>> "rbd resize"?
>>
>> El 12/05/16 a las 12:33, M Ranga Swami Reddy escribió:
>>
>>> When I used "rbd resize" option for size shrink,  the image/volume
>>> lost its fs sectors and asking for "fs" not found...
>>> I have used  "mkf" option, then all data lost in it? This happens with
>>> shrink option...
>>>
>>>
>>> Thanks
>>> Swami
>>>
>>> On Wed, May 11, 2016 at 5:28 PM, Christian Balzer 
>>> wrote:

 Hello,

 On Wed, 11 May 2016 13:33:44 +0200 (CEST) Alexandre DERUMIER wrote:

>>> but the fstrim can used with in mount partition...But I wanted to
>>> as
>>> cloud admin...
>
> if you use qemu, you can launch fstrim  through guest-agent
>
 This of course assumes that qemu/kvm is using a disk method that
 allows
 for TRIM.

 And nobody in their right mind uses IDE (performance), while
 virtio-scsi
 isn't the default or even supported with some cloud stacks.

 And of course that the VM in question runs Linux and has fstrim
 installed.

 Otherwise solid advise, I agree.

 Christian
>
>
>
>
> http://dustymabe.com/2013/06/26/enabling-qemu-guest-agent-and-fstrim-again/
>
> - Mail original -
> De: "M Ranga Swami Reddy" 
> À: "Wido den Hollander" 
> Cc: "ceph-users" 
> Envoyé: Mercredi 11 Mai 2016 13:16:27
> Objet: Re: [ceph-users] rbd resize option
>
> Thank you.
>
> but the fstrim can used with in mount partition...But I wanted to as
> cloud admin...
> I have a few uses with high volume size (ie capacity) allotted, but
> only used 5% of the capacity. so I wanted to reduce the size to 10%
> of
> size using the rbd resize command. But in this process, if a
> customer's volume has more than 10% data, then I may end-up with
> data
> lost...
>
> Thanks
> Swami
>
> On Wed, May 11, 2016 at 1:17 PM, Wido den Hollander 
> wrote:
>>>
>>> Op 11 mei 2016 om 8:38 schreef M Ranga Swami Reddy
>>> :
>>>
>>>
>>> Hello,
>>> I wanted to resize an image using 'rbd' resize option, but it
>>> should
>>> be have data loss.
>>> For ex: I have image with 100 GB size (thin provisioned). and this
>>> image has data of 10GB only. Here I wanted to resize this image to
>>> 11GB, so that 10GB data is safe and its resized.
>>>
>>> Can I do the above resize safely.?
>>>
>> No, you can't. You need to resize the filesystem and partitions
>> inside
>> the RBD image to something below 11GB before you can do this.
>>
>> Still, make sure you have backups!
>>
>> Also, why shrink? If you can, run a fstrim on the image, that might
>> reclaim unused space.

[ceph-users] Ceph Recovery

2016-05-13 Thread Lazuardi Nasution
Hi,

After a disaster and restarting for automatic recovery, I found the following
ceph status. Some OSDs cannot be restarted due to file system corruption
(it seems that XFS is fragile).

[root@management-b ~]# ceph status
cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
 health HEALTH_WARN
209 pgs degraded
209 pgs stuck degraded
334 pgs stuck unclean
209 pgs stuck undersized
209 pgs undersized
recovery 5354/77810 objects degraded (6.881%)
recovery 1105/77810 objects misplaced (1.420%)
 monmap e1: 3 mons at {management-a=
10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0
}
election epoch 2308, quorum 0,1,2
management-a,management-b,management-c
 osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
flags sortbitwise
  pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
678 GB used, 90444 GB / 91123 GB avail
5354/77810 objects degraded (6.881%)
1105/77810 objects misplaced (1.420%)
2226 active+clean
 209 active+undersized+degraded
 125 active+remapped
  client io 0 B/s rd, 282 kB/s wr, 10 op/s

Since the number of active PGs equals the total number of PGs, and the number
of degraded PGs equals the number of undersized PGs, does that mean all PGs
have at least one good replica? In that case, can I just mark the down OSDs as
lost or remove them, reformat, and then restart them if there is no hardware
issue with the HDDs? Which PG status should I pay more attention to with
regard to possible lost objects, degraded or undersized?

Best regards,


Re: [ceph-users] Ceph Recovery

2016-05-13 Thread Wido den Hollander

> Op 13 mei 2016 om 11:34 schreef Lazuardi Nasution :
> 
> 
> Hi,
> 
> After disaster and restarting for automatic recovery, I found following
> ceph status. Some OSDs cannot be restarted due to file system corruption
> (it seem that xfs is fragile).
> 
> [root@management-b ~]# ceph status
> cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
>  health HEALTH_WARN
> 209 pgs degraded
> 209 pgs stuck degraded
> 334 pgs stuck unclean
> 209 pgs stuck undersized
> 209 pgs undersized
> recovery 5354/77810 objects degraded (6.881%)
> recovery 1105/77810 objects misplaced (1.420%)
>  monmap e1: 3 mons at {management-a=
> 10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0
> }
> election epoch 2308, quorum 0,1,2
> management-a,management-b,management-c
>  osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
> flags sortbitwise
>   pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
> 678 GB used, 90444 GB / 91123 GB avail
> 5354/77810 objects degraded (6.881%)
> 1105/77810 objects misplaced (1.420%)
> 2226 active+clean
>  209 active+undersized+degraded
>  125 active+remapped
>   client io 0 B/s rd, 282 kB/s wr, 10 op/s
> 
> Since total active PGs same with total PGs and total degraded PGs same with
> total undersized PGs, does it mean that all PGs have at least one good
> replica, so I can just mark lost or remove down OSD, reformat again and
> then restart them if there is no hardware issue with HDDs? Which one of PGs
> status should I pay more attention, degraded or undersized due to lost
> object possibility?
> 

Yes. Your system is not reporting any inactive, unfound or stale PGs, so that 
is good news.

However, I recommend that you wait for the system to become fully active+clean 
before you start removing any OSDs or formatting hard drives. Better be safe 
than sorry.

Wido

> Best regards,


Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-13 Thread Wido den Hollander
No luck either. After a reboot only the Ceph OSD starts, but the monitor does not.

I have checked:
- the service is enabled
- tried to re-enable the service
- checked the MON logs to see if it was started; it wasn't
- checked the systemd log to see if it wants to start the MON; it doesn't

My systemd-foo isn't that good either, so I don't know what is happening here.

Wido

> Op 12 mei 2016 om 15:31 schreef Jan Schermer :
> 
> 
> Btw try replacing
> 
> WantedBy=ceph-mon.target
> 
> With: WantedBy=default.target
> then systemctl daemon-reload.
> 
> See if that does the trick
> 
> I only messed with systemctl to have my own services start, I still hope it 
> goes away eventually... :P
> 
> Jan
> 
> > On 12 May 2016, at 15:01, Wido den Hollander  wrote:
> > 
> > 
> > To also answer Sage's question: No, this is a fresh Jewel install in a few 
> > test VMs. This system was not upgraded.
> > 
> > It was installed 2 hours ago.
> > 
> >> Op 12 mei 2016 om 14:51 schreef Jan Schermer :
> >> 
> >> 
> >> Can you post the contents of ceph-mon@.service file?
> >> 
> > 
> > Yes, here you go:
> > 
> > root@charlie:~# cat /lib/systemd/system/ceph-mon@.service 
> > [Unit]
> > Description=Ceph cluster monitor daemon
> > 
> > # According to:
> > #   http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
> > # these can be removed once ceph-mon will dynamically change network
> > # configuration.
> > After=network-online.target local-fs.target ceph-create-keys@%i.service
> > Wants=network-online.target local-fs.target ceph-create-keys@%i.service
> > 
> > PartOf=ceph-mon.target
> > 
> > [Service]
> > LimitNOFILE=1048576
> > LimitNPROC=1048576
> > EnvironmentFile=-/etc/default/ceph
> > Environment=CLUSTER=ceph
> > ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser ceph 
> > --setgroup ceph
> > ExecReload=/bin/kill -HUP $MAINPID
> > PrivateDevices=yes
> > ProtectHome=true
> > ProtectSystem=full
> > PrivateTmp=true
> > TasksMax=infinity
> > Restart=on-failure
> > StartLimitInterval=30min
> > StartLimitBurst=3
> > 
> > [Install]
> > WantedBy=ceph-mon.target
> > root@charlie:~#
> > 
> >> what does
> >> systemctl is-enabled ceph-mon@charlie
> >> say?
> >> 
> > 
> > root@charlie:~# systemctl is-enabled ceph-mon@charlie
> > enabled
> > root@charlie:~#
> > 
> >> However, this looks like it was just started at a bad moment and died - 
> >> nothing in logs?
> >> 
> > 
> > No, I checked the ceph-mon logs in /var/log/ceph. No sign of it even trying 
> > to start after boot. In /var/log/syslog there also is not a trace of 
> > ceph-mon.
> > 
> > Only the OSD starts.
> > 
> > Wido
> > 
> >> Jan
> >> 
> >> 
> >>> On 12 May 2016, at 14:44, Sage Weil  wrote:
> >>> 
> >>> On Thu, 12 May 2016, Wido den Hollander wrote:
>  Hi,
>  
>  I am setting up a Jewel cluster in VMs with Ubuntu 16.04.
>  
>  ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
>  
>  After a reboot the Ceph Monitors don't start and I have to do so 
>  manually.
>  
>  Three machines, alpha, bravo and charlie all have the same problem.
>  
>  root@charlie:~# systemctl status ceph-mon@charlie
>  ● ceph-mon@charlie.service - Ceph cluster monitor daemon
>   Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor 
>  preset: enabled)
>   Active: inactive (dead)
>  root@charlie:~#
>  
>  I can start it and it works
> >>> 
> >>> Hmm.. my systemd-fu is weak, but if it's enabled it seems like it shoud 
> >>> come up.
> >>> 
> >>> Was this an upgraded package?  What if you do 'systemctl reenable 
> >>> ceph-mon@charlie'?
> >>> 
> >>> sage
> >>> 
> >>> 
> >>> 
>  
>  root@charlie:~# systemctl start ceph-mon@charlie
>  root@charlie:~# systemctl status ceph-mon@charlie
>  ● ceph-mon@charlie.service - Ceph cluster monitor daemon
>   Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor 
>  preset: enabled)
>   Active: active (running) since Thu 2016-05-12 16:08:56 CEST; 1s ago
>  Main PID: 1368 (ceph-mon)
>  
>  I tried removing the /var/log/ceph/ceph-mon.charlie.log file and reboot 
>  to see if the mon was actually invoked, but it wasn't.
>  
>  ceph.target has been started and so is the OSD on the machine. It is 
>  just the monitor which hasn't been started.
>  
>  In the syslog I see:
>  
>  May 12 16:11:19 charlie systemd[1]: Starting Ceph object storage 
>  daemon...
>  May 12 16:11:19 charlie systemd[1]: Starting LSB: Start Ceph distributed 
>  file system daemons at boot time...
>  May 12 16:11:19 charlie systemd[1]: Started LSB: Start Ceph distributed 
>  file system daemons at boot time.
>  May 12 16:11:20 charlie systemd[1]: Started Ceph object storage daemon.
>  May 12 16:11:20 charlie systemd[1]: Started Ceph disk activation: 
>  /dev/sdb2.
>  May 12 16:11:21 charlie systemd[1]: Started Ceph object storage daemon.
>  May 12 16:11:21 charlie syst

Re: [ceph-users] Ceph Recovery

2016-05-13 Thread Lazuardi Nasution
Hi Wido,

The status is the same after 24 hours of running. It seems that the status will
not go to fully active+clean until all of the down OSDs come back. The only way
to bring the down OSDs back is to reformat them, or to replace the HDDs if they
have a hardware issue. Do you think that is a safe way to do it?

Best regards,

On Fri, May 13, 2016 at 4:44 PM, Wido den Hollander  wrote:

>
> > Op 13 mei 2016 om 11:34 schreef Lazuardi Nasution <
> mrxlazuar...@gmail.com>:
> >
> >
> > Hi,
> >
> > After disaster and restarting for automatic recovery, I found following
> > ceph status. Some OSDs cannot be restarted due to file system corruption
> > (it seem that xfs is fragile).
> >
> > [root@management-b ~]# ceph status
> > cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
> >  health HEALTH_WARN
> > 209 pgs degraded
> > 209 pgs stuck degraded
> > 334 pgs stuck unclean
> > 209 pgs stuck undersized
> > 209 pgs undersized
> > recovery 5354/77810 objects degraded (6.881%)
> > recovery 1105/77810 objects misplaced (1.420%)
> >  monmap e1: 3 mons at {management-a=
> >
> 10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0
> > }
> > election epoch 2308, quorum 0,1,2
> > management-a,management-b,management-c
> >  osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
> > flags sortbitwise
> >   pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
> > 678 GB used, 90444 GB / 91123 GB avail
> > 5354/77810 objects degraded (6.881%)
> > 1105/77810 objects misplaced (1.420%)
> > 2226 active+clean
> >  209 active+undersized+degraded
> >  125 active+remapped
> >   client io 0 B/s rd, 282 kB/s wr, 10 op/s
> >
> > Since total active PGs same with total PGs and total degraded PGs same
> with
> > total undersized PGs, does it mean that all PGs have at least one good
> > replica, so I can just mark lost or remove down OSD, reformat again and
> > then restart them if there is no hardware issue with HDDs? Which one of
> PGs
> > status should I pay more attention, degraded or undersized due to lost
> > object possibility?
> >
>
> Yes. Your system is not reporting any inactive, unfound or stale PGs, so
> that is good news.
>
> However, I recommend that you wait for the system to become fully
> active+clean before you start removing any OSDs or formatting hard drives.
> Better be safe than sorry.
>
> Wido
>
> > Best regards,


Re: [ceph-users] Ceph Recovery

2016-05-13 Thread Wido den Hollander

> Op 13 mei 2016 om 11:55 schreef Lazuardi Nasution :
> 
> 
> Hi Wido,
> 
> The status is same after 24 hour running. It seem that the status will not
> go to fully active+clean until all down OSDs back again. The only way to
> make down OSDs to go back again is reformating or replace if HDDs has
> hardware issue. Do you think that it is safe way to do?
> 

Ah, you are probably lacking enough replicas to make the recovery proceed.

If that is needed I would do this OSD by OSD. Your crushmap will probably tell 
you which OSDs you need to bring back before it works again.
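
If you do end up replacing them, a rough per-OSD cycle looks like this (osd.12 and /dev/sdc are just placeholders):

# take one dead OSD out of the cluster
ceph osd lost 12 --yes-i-really-mean-it
ceph osd crush remove osd.12
ceph auth del osd.12
ceph osd rm 12
# re-create it on the reformatted or replaced disk
ceph-disk prepare /dev/sdc
# wait for the cluster to settle before touching the next OSD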

Wido

> Best regards,
> 
> On Fri, May 13, 2016 at 4:44 PM, Wido den Hollander  wrote:
> 
> >
> > > Op 13 mei 2016 om 11:34 schreef Lazuardi Nasution <
> > mrxlazuar...@gmail.com>:
> > >
> > >
> > > Hi,
> > >
> > > After disaster and restarting for automatic recovery, I found following
> > > ceph status. Some OSDs cannot be restarted due to file system corruption
> > > (it seem that xfs is fragile).
> > >
> > > [root@management-b ~]# ceph status
> > > cluster 3810e9eb-9ece-4804-8c56-b986e7bb5627
> > >  health HEALTH_WARN
> > > 209 pgs degraded
> > > 209 pgs stuck degraded
> > > 334 pgs stuck unclean
> > > 209 pgs stuck undersized
> > > 209 pgs undersized
> > > recovery 5354/77810 objects degraded (6.881%)
> > > recovery 1105/77810 objects misplaced (1.420%)
> > >  monmap e1: 3 mons at {management-a=
> > >
> > 10.255.102.1:6789/0,management-b=10.255.102.2:6789/0,management-c=10.255.102.3:6789/0
> > > }
> > > election epoch 2308, quorum 0,1,2
> > > management-a,management-b,management-c
> > >  osdmap e25037: 96 osds: 49 up, 49 in; 125 remapped pgs
> > > flags sortbitwise
> > >   pgmap v9024253: 2560 pgs, 5 pools, 291 GB data, 38905 objects
> > > 678 GB used, 90444 GB / 91123 GB avail
> > > 5354/77810 objects degraded (6.881%)
> > > 1105/77810 objects misplaced (1.420%)
> > > 2226 active+clean
> > >  209 active+undersized+degraded
> > >  125 active+remapped
> > >   client io 0 B/s rd, 282 kB/s wr, 10 op/s
> > >
> > > Since total active PGs same with total PGs and total degraded PGs same
> > with
> > > total undersized PGs, does it mean that all PGs have at least one good
> > > replica, so I can just mark lost or remove down OSD, reformat again and
> > > then restart them if there is no hardware issue with HDDs? Which one of
> > PGs
> > > status should I pay more attention, degraded or undersized due to lost
> > > object possibility?
> > >
> >
> > Yes. Your system is not reporting any inactive, unfound or stale PGs, so
> > that is good news.
> >
> > However, I recommend that you wait for the system to become fully
> > active+clean before you start removing any OSDs or formatting hard drives.
> > Better be safe than sorry.
> >
> > Wido
> >
> > > Best regards,


[ceph-users] Ceph-Disk Prepare Bug

2016-05-13 Thread Lazuardi Nasution
Hi,

It seems there is a bug in the Infernalis "ceph-disk prepare" command when run
against a whole disk. Below are several "parted" results after running that
command. Sometimes the data partition is not created, and sometimes the data
partition is not formatted and prepared properly.

Bad result:
Number  Start   End SizeFile system  Name  Flags
 2  1049kB  5369MB  5368MB  xfs  ceph journal
 1  5370MB  2000GB  1995GB   ceph data

Bad result:
Number  Start   End SizeFile system  Name  Flags
 2  1049kB  5369MB  5368MB  xfs  ceph journal

Good result (after a few seconds the xfs on the journal partition is gone; it
seems to be left over from a previous format of the disk):
Number  Start   End SizeFile system  Name  Flags
 2  1049kB  5369MB  5368MB  xfs  ceph journal
 1  5370MB  2000GB  1995GB  xfs  ceph data

Good result:
Number  Start   End SizeFile system  Name  Flags
 2  1049kB  5369MB  5368MB   ceph journal
 1  5370MB  2000GB  1995GB  xfs  ceph data
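
A rough workaround sketch, assuming the whole disk is /dev/sdb, is to zap it before preparing so no signatures from a previous format are left behind:

ceph-disk zap /dev/sdb          # wipe the old partition table and filesystem signatures
ceph-disk prepare /dev/sdb
partprobe /dev/sdb              # re-read the partition table
parted /dev/sdb print           # verify both ceph partitions show up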

Best regards,


Re: [ceph-users] Recommended OSD size

2016-05-13 Thread Max A. Krasilnikov
Hello! 

On Fri, May 13, 2016 at 04:53:56PM +0900, chibi wrote:

>> We need some clarification on CEPH OSD and MON and MDS. It will
>> be very helpful and better understand to know below details.
>>
> You will want to spend more time reading the documentation and hardware
> guides, as well as finding similar threads in the ML archives.
>  
>> 
>> 
>> Per OSD Recommended SIZE ( Both scsi and ssd ).
>> 
> With SCSI I suppose you mean HDDs?

> And there is not good answer, it depends on your needs and use case.
> For example if your main goal is space and not performance, fewer but
> larger HDDs will be a better fit.

In my deployment, I see slow requests when starting an OSD with 2.5+ TB used on
it.
Due to the slowdowns on start, I have to start the OSDs manually, one by one, so
as not to overload the host, but OSDs with 2.5+ TB of data cause slow requests
anyway :(

This is true for a 6 TB HGST HDN726060ALE610 HDD connected to a SATA2 interface,
Ubuntu 14.04, kernel 4.2.0-34, Ceph Hammer 0.94.6 from the Ubuntu cloud team.

-- 
WBR, Max A. Krasilnikov


Re: [ceph-users] Try to find the right way to enable rbd-mirror.

2016-05-13 Thread Mika c
Hi Dillaman,
Thank you for getting back to me.
My system is Ubuntu, so I am using "sudo rbd-mirror --cluster=local
--log-file=mirror.log --debug-rbd-mirror=20/5" instead. I read your
reply but am still confused.
Image journaling is enabled.
---rbd info start---
$ rbd info test
rbd image 'test':
size 1024 MB in 256 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.853d2ae8944a
format: 2
features: layering, exclusive-lock, object-map, fast-diff,
deep-flatten, journaling
flags:
journal: 853d2ae8944a
mirroring state: enabled
mirroring global id: 912e6673-6dd9-480c-bfe1-f134f79a9989
mirroring primary: true
rbd info end


The logs are below; I do not see any errors, but the images never show up on
site2.
log start 
2016-05-13 10:33:47.076797 7f70d0469c80 20 rbd-mirror: Mirror::run: enter
2016-05-13 10:33:47.076800 7f70d0469c80 20 rbd-mirror:
ClusterWatcher::refresh_pools: enter
2016-05-13 10:33:47.083209 7f70d0469c80 20 rbd-mirror:
ClusterWatcher::read_configs: pool rbd has mirroring enabled for peer uuid:
1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote client: client.admin
2016-05-13 10:33:47.083229 7f70d0469c80 20 rbd-mirror:
Mirror::update_replayers: enter
2016-05-13 10:33:47.083231 7f70d0469c80 20 rbd-mirror:
Mirror::update_replayers: starting replayer for uuid:
1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote client: client.admin
2016-05-13 10:33:47.083271 7f70d0469c80 20 rbd-mirror: Replayer::init:
replaying for uuid: 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote
client: client.admin
2016-05-13 10:33:47.091120 7f70d0469c80 20 rbd-mirror: Replayer::init:
connected to uuid: 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote
client: client.admin
2016-05-13 10:33:47.091178 7f70d0469c80 20 rbd-mirror:
PoolWatcher::refresh_images: enter
2016-05-13 10:33:47.096067 7f707bfff700 20 rbd-mirror: Replayer::run: enter
2016-05-13 10:33:47.096207 7f707bfff700 20 rbd-mirror:
Replayer::set_sources: enter
-log end

Do I need to upgrade the kernel to 4.4 or higher?




Best wishes,
Mika


2016-05-12 21:10 GMT+08:00 Jason Dillaman :

> On Thu, May 12, 2016 at 6:33 AM, Mika c  wrote:
>
> > 4.) Both sites  installed "rbd-mirror".
> >  Start daemon "rbd-mirror" .
> >  On site1:$sudo rbd-mirror -m 192.168.168.21:6789
> >  On site2:$sudo rbd-mirror -m 192.168.168.22:6789
>
> Assuming you use keep "ceph" as the local cluster name and use
> "remote" for its peer, you should be able to do the following:
>
> systemctl enable ceph-rbd-mirror@0
> systemctl start ceph-rbd-mirror@0
>
> You can override the local cluster name by putting
> "CLUSTER=" in /etc/sysconfig/ceph. The name of the remote
> cluster is derived from the registration added via the "rbd mirror
> pool peer add" command.
>
> --
> Jason
>


Re: [ceph-users] Recommended OSD size

2016-05-13 Thread Max Vernimmen
>
>> And there is not good answer, it depends on your needs and use case.
>> For example if your main goal is space and not performance, fewer but
>> larger HDDs will be a better fit.
>
>In my deployment, I have slow requests when starting OSD with 2.5+ TB used on
>it.
>Due to slowdowns on start, I have to start osds manual, one by one, to not
>overload host, but 2.5 TB+ loaded osds causes slow requests anyway :(
>
>THis is true for 6 TB HGST HDN726060ALE610 HDD connected to SATA2 iface, ubuntu
>14.04, 4.2.0-34, ceph hammer 0.94.6 from Ubuntu cloud team.
>

We saw the same, but after upgrading to infernalis the problem went away for us.



Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-13 Thread Sage Weil
This is starting to sound like a xenial systemd issue to me.  Maybe poke 
the canonical folks?

You might edit the unit file and make it touch something in /tmp instead 
of starting Ceph just to rule out ceph...
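
A variation on that idea: a throwaway drop-in with an extra ExecStartPre would show whether systemd ever invokes the unit at all (using /run rather than /tmp, since the unit sets PrivateTmp=true; the file names are just an example):

# /etc/systemd/system/ceph-mon@.service.d/debug.conf
[Service]
ExecStartPre=/bin/touch /run/ceph-mon-%i-invoked

Then run 'systemctl daemon-reload', reboot, and check whether /run/ceph-mon-charlie-invoked appears.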

sage


On Fri, 13 May 2016, Wido den Hollander wrote:

> No luck either. After a reboot only the Ceph OSD starts, but the monitor not.
> 
> I have checked:
> - service is enabled
> - tried to re-enable the service
> - check the MON logs to see if it was started, it wasn't
> - systemd log to see if it wants to start the MON, it doesn't
> 
> My systemd-foo isn't that good either, so I don't know what is happening here.
> 
> Wido
> 
> > Op 12 mei 2016 om 15:31 schreef Jan Schermer :
> > 
> > 
> > Btw try replacing
> > 
> > WantedBy=ceph-mon.target
> > 
> > With: WantedBy=default.target
> > then systemctl daemon-reload.
> > 
> > See if that does the trick
> > 
> > I only messed with systemctl to have my own services start, I still hope it 
> > goes away eventually... :P
> > 
> > Jan
> > 
> > > On 12 May 2016, at 15:01, Wido den Hollander  wrote:
> > > 
> > > 
> > > To also answer Sage's question: No, this is a fresh Jewel install in a 
> > > few test VMs. This system was not upgraded.
> > > 
> > > It was installed 2 hours ago.
> > > 
> > >> Op 12 mei 2016 om 14:51 schreef Jan Schermer :
> > >> 
> > >> 
> > >> Can you post the contents of ceph-mon@.service file?
> > >> 
> > > 
> > > Yes, here you go:
> > > 
> > > root@charlie:~# cat /lib/systemd/system/ceph-mon@.service 
> > > [Unit]
> > > Description=Ceph cluster monitor daemon
> > > 
> > > # According to:
> > > #   http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
> > > # these can be removed once ceph-mon will dynamically change network
> > > # configuration.
> > > After=network-online.target local-fs.target ceph-create-keys@%i.service
> > > Wants=network-online.target local-fs.target ceph-create-keys@%i.service
> > > 
> > > PartOf=ceph-mon.target
> > > 
> > > [Service]
> > > LimitNOFILE=1048576
> > > LimitNPROC=1048576
> > > EnvironmentFile=-/etc/default/ceph
> > > Environment=CLUSTER=ceph
> > > ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser 
> > > ceph --setgroup ceph
> > > ExecReload=/bin/kill -HUP $MAINPID
> > > PrivateDevices=yes
> > > ProtectHome=true
> > > ProtectSystem=full
> > > PrivateTmp=true
> > > TasksMax=infinity
> > > Restart=on-failure
> > > StartLimitInterval=30min
> > > StartLimitBurst=3
> > > 
> > > [Install]
> > > WantedBy=ceph-mon.target
> > > root@charlie:~#
> > > 
> > >> what does
> > >> systemctl is-enabled ceph-mon@charlie
> > >> say?
> > >> 
> > > 
> > > root@charlie:~# systemctl is-enabled ceph-mon@charlie
> > > enabled
> > > root@charlie:~#
> > > 
> > >> However, this looks like it was just started at a bad moment and died - 
> > >> nothing in logs?
> > >> 
> > > 
> > > No, I checked the ceph-mon logs in /var/log/ceph. No sign of it even 
> > > trying to start after boot. In /var/log/syslog there also is not a trace 
> > > of ceph-mon.
> > > 
> > > Only the OSD starts.
> > > 
> > > Wido
> > > 
> > >> Jan
> > >> 
> > >> 
> > >>> On 12 May 2016, at 14:44, Sage Weil  wrote:
> > >>> 
> > >>> On Thu, 12 May 2016, Wido den Hollander wrote:
> >  Hi,
> >  
> >  I am setting up a Jewel cluster in VMs with Ubuntu 16.04.
> >  
> >  ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
> >  
> >  After a reboot the Ceph Monitors don't start and I have to do so 
> >  manually.
> >  
> >  Three machines, alpha, bravo and charlie all have the same problem.
> >  
> >  root@charlie:~# systemctl status ceph-mon@charlie
> >  ● ceph-mon@charlie.service - Ceph cluster monitor daemon
> >   Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; 
> >  vendor preset: enabled)
> >   Active: inactive (dead)
> >  root@charlie:~#
> >  
> >  I can start it and it works
> > >>> 
> > >>> Hmm.. my systemd-fu is weak, but if it's enabled it seems like it shoud 
> > >>> come up.
> > >>> 
> > >>> Was this an upgraded package?  What if you do 'systemctl reenable 
> > >>> ceph-mon@charlie'?
> > >>> 
> > >>> sage
> > >>> 
> > >>> 
> > >>> 
> >  
> >  root@charlie:~# systemctl start ceph-mon@charlie
> >  root@charlie:~# systemctl status ceph-mon@charlie
> >  ● ceph-mon@charlie.service - Ceph cluster monitor daemon
> >   Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; 
> >  vendor preset: enabled)
> >   Active: active (running) since Thu 2016-05-12 16:08:56 CEST; 1s ago
> >  Main PID: 1368 (ceph-mon)
> >  
> >  I tried removing the /var/log/ceph/ceph-mon.charlie.log file and 
> >  reboot to see if the mon was actually invoked, but it wasn't.
> >  
> >  ceph.target has been started and so is the OSD on the machine. It is 
> >  just the monitor which hasn't been started.
> >  
> >  In the syslog I see:
> >  
> >  May 12 16:11:19 charlie systemd[1]: S

Re: [ceph-users] Try to find the right way to enable rbd-mirror.

2016-05-13 Thread Jason Dillaman
On Fri, May 13, 2016 at 6:39 AM, Mika c  wrote:
> Hi Dillaman,
> Thank you for getting back to me.
> My system is ubuntu, so I using "sudo rbd-mirror --cluster=local
> --log-file=mirror.log --debug-rbd-mirror=20/5" instead. I was read your
> reply but still confused.

For upstart systems, you can run:

# sudo touch /var/lib/ceph/rbd-mirror/{cluster-name}-{id}/done
# sudo start ceph-rbd-mirror id={id} cluster={cluster-name}

> The image journaling is enable.
> ---rbd info start---
> $ rbd info test
> rbd image 'test':
> size 1024 MB in 256 objects
> order 22 (4096 kB objects)
> block_name_prefix: rbd_data.853d2ae8944a
> format: 2
> features: layering, exclusive-lock, object-map, fast-diff,
> deep-flatten, journaling
> flags:
> journal: 853d2ae8944a
> mirroring state: enabled
> mirroring global id: 912e6673-6dd9-480c-bfe1-f134f79a9989
> mirroring primary: true
> rbd info end
>
>
> And logs like below, I do not seem any errors. But images never show up on
> site2.
> log start 
> 2016-05-13 10:33:47.076797 7f70d0469c80 20 rbd-mirror: Mirror::run: enter
> 2016-05-13 10:33:47.076800 7f70d0469c80 20 rbd-mirror:
> ClusterWatcher::refresh_pools: enter
> 2016-05-13 10:33:47.083209 7f70d0469c80 20 rbd-mirror:
> ClusterWatcher::read_configs: pool rbd has mirroring enabled for peer uuid:
> 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote client: client.admin
> 2016-05-13 10:33:47.083229 7f70d0469c80 20 rbd-mirror:
> Mirror::update_replayers: enter
> 2016-05-13 10:33:47.083231 7f70d0469c80 20 rbd-mirror:
> Mirror::update_replayers: starting replayer for uuid:
> 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote client: client.admin
> 2016-05-13 10:33:47.083271 7f70d0469c80 20 rbd-mirror: Replayer::init:
> replaying for uuid: 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote
> client: client.admin
> 2016-05-13 10:33:47.091120 7f70d0469c80 20 rbd-mirror: Replayer::init:
> connected to uuid: 1add7cb3-2cee-43a3-a67c-4c5adb9d455d cluster: remote
> client: client.admin
> 2016-05-13 10:33:47.091178 7f70d0469c80 20 rbd-mirror:
> PoolWatcher::refresh_images: enter
> 2016-05-13 10:33:47.096067 7f707bfff700 20 rbd-mirror: Replayer::run: enter
> 2016-05-13 10:33:47.096207 7f707bfff700 20 rbd-mirror:
> Replayer::set_sources: enter
> -log end

The rbd-mirror daemon pulls the updates from a remote cluster, so make
sure (1) you start it with the local cluster name, and (2) when you
configure the peer, the name of the peer aligns with a local
/etc/ceph/<peer-cluster>.conf file which connects to the remote.
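
For example, assuming your local cluster is named "local" and the peer's config and keyring live in /etc/ceph/remote.conf and /etc/ceph/remote.client.admin.keyring, the peer registration on the local side would look roughly like:

rbd --cluster local mirror pool enable rbd pool
rbd --cluster local mirror pool peer add rbd client.admin@remote

and the daemon would then be started against the local cluster (rbd-mirror --cluster local).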

> Do I need upgrade kernel to 4.4 or higher?

There are no kernel-specific features for mirroring.

-- 
Jason


[ceph-users] PG active+clean+inconsistent unexpected clone errors in OSD log

2016-05-13 Thread Remco
Hi all,

We have been hit by http://tracker.ceph.com/issues/12954 which caused two
OSDs to crash during scrub operations. I have upgraded to 0.94.7 from
0.94.6 to apply a fix for this bug, and everything has been stable so far.
However, since this morning 17 scrub errors appeared (which was to be
expected as the OSDs crashed on the inconsistency).

I still have 7 errors left like this one: deep-scrub 2.31
2/15942131/rbd_data.3414ca7956cb.0cde/92b is an unexpected clone

The cluster has a replication level of 2. Only the primary OSD for this PG
is logging this error. What would be the best thing to do?

Thanks,
Remco.


Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-13 Thread Jan Schermer
Can you check that the dependencies have started? Anything about those in the 
logs?

network-online.target local-fs.target ceph-create-keys@%i.service

Jan


> On 13 May 2016, at 14:30, Sage Weil  wrote:
> 
> This is starting to sound like a xenial systemd issue to me.  Maybe poke 
> the canonical folks?
> 
> You might edit the unit file and make it touch something in /tmp instead 
> of starting Ceph just to rule out ceph...
> 
> sage
> 
> 
> On Fri, 13 May 2016, Wido den Hollander wrote:
> 
>> No luck either. After a reboot only the Ceph OSD starts, but the monitor not.
>> 
>> I have checked:
>> - service is enabled
>> - tried to re-enable the service
>> - check the MON logs to see if it was started, it wasn't
>> - systemd log to see if it wants to start the MON, it doesn't
>> 
>> My systemd-foo isn't that good either, so I don't know what is happening 
>> here.
>> 
>> Wido
>> 
>>> Op 12 mei 2016 om 15:31 schreef Jan Schermer :
>>> 
>>> 
>>> Btw try replacing
>>> 
>>> WantedBy=ceph-mon.target
>>> 
>>> With: WantedBy=default.target
>>> then systemctl daemon-reload.
>>> 
>>> See if that does the trick
>>> 
>>> I only messed with systemctl to have my own services start, I still hope it 
>>> goes away eventually... :P
>>> 
>>> Jan
>>> 
 On 12 May 2016, at 15:01, Wido den Hollander  wrote:
 
 
 To also answer Sage's question: No, this is a fresh Jewel install in a few 
 test VMs. This system was not upgraded.
 
 It was installed 2 hours ago.
 
> Op 12 mei 2016 om 14:51 schreef Jan Schermer :
> 
> 
> Can you post the contents of ceph-mon@.service file?
> 
 
 Yes, here you go:
 
 root@charlie:~# cat /lib/systemd/system/ceph-mon@.service 
 [Unit]
 Description=Ceph cluster monitor daemon
 
 # According to:
 #   http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
 # these can be removed once ceph-mon will dynamically change network
 # configuration.
 After=network-online.target local-fs.target ceph-create-keys@%i.service
 Wants=network-online.target local-fs.target ceph-create-keys@%i.service
 
 PartOf=ceph-mon.target
 
 [Service]
 LimitNOFILE=1048576
 LimitNPROC=1048576
 EnvironmentFile=-/etc/default/ceph
 Environment=CLUSTER=ceph
 ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser ceph 
 --setgroup ceph
 ExecReload=/bin/kill -HUP $MAINPID
 PrivateDevices=yes
 ProtectHome=true
 ProtectSystem=full
 PrivateTmp=true
 TasksMax=infinity
 Restart=on-failure
 StartLimitInterval=30min
 StartLimitBurst=3
 
 [Install]
 WantedBy=ceph-mon.target
 root@charlie:~#
 
> what does
> systemctl is-enabled ceph-mon@charlie
> say?
> 
 
 root@charlie:~# systemctl is-enabled ceph-mon@charlie
 enabled
 root@charlie:~#
 
> However, this looks like it was just started at a bad moment and died - 
> nothing in logs?
> 
 
 No, I checked the ceph-mon logs in /var/log/ceph. No sign of it even 
 trying to start after boot. In /var/log/syslog there also is not a trace 
 of ceph-mon.
 
 Only the OSD starts.
 
 Wido
 
> Jan
> 
> 
>> On 12 May 2016, at 14:44, Sage Weil  wrote:
>> 
>> On Thu, 12 May 2016, Wido den Hollander wrote:
>>> Hi,
>>> 
>>> I am setting up a Jewel cluster in VMs with Ubuntu 16.04.
>>> 
>>> ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
>>> 
>>> After a reboot the Ceph Monitors don't start and I have to do so 
>>> manually.
>>> 
>>> Three machines, alpha, bravo and charlie all have the same problem.
>>> 
>>> root@charlie:~# systemctl status ceph-mon@charlie
>>> ● ceph-mon@charlie.service - Ceph cluster monitor daemon
>>> Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor 
>>> preset: enabled)
>>> Active: inactive (dead)
>>> root@charlie:~#
>>> 
>>> I can start it and it works
>> 
>> Hmm.. my systemd-fu is weak, but if it's enabled it seems like it shoud 
>> come up.
>> 
>> Was this an upgraded package?  What if you do 'systemctl reenable 
>> ceph-mon@charlie'?
>> 
>> sage
>> 
>> 
>> 
>>> 
>>> root@charlie:~# systemctl start ceph-mon@charlie
>>> root@charlie:~# systemctl status ceph-mon@charlie
>>> ● ceph-mon@charlie.service - Ceph cluster monitor daemon
>>> Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; vendor 
>>> preset: enabled)
>>> Active: active (running) since Thu 2016-05-12 16:08:56 CEST; 1s ago
>>> Main PID: 1368 (ceph-mon)
>>> 
>>> I tried removing the /var/log/ceph/ceph-mon.charlie.log file and reboot 
>>> to see if the mon was actually invoked, but it wasn't.
>>> 
>>> ceph.target has been started and so is the OSD on the machine. It is 
>>> just the monitor wh

Re: [ceph-users] ceph-mon not starting on boot with systemd and Ubuntu 16.04

2016-05-13 Thread Wido den Hollander

> Op 13 mei 2016 om 14:56 schreef Jan Schermer :
> 
> 
> Can you check that the dependencies have started? Anything about those in the 
> logs?
> 
> network-online.target local-fs.target ceph-create-keys@%i.service
> 

May 13 16:59:15 alpha systemd[1]: Reached target Local File Systems (Pre).
May 13 16:59:15 alpha systemd[1]: Reached target Local File Systems.
..
..
May 13 16:59:18 alpha systemd[1]: Reached target Network.
May 13 16:59:18 alpha systemd[1]: Reached target Network is Online.

In the systemd logs there is no trace of 'ceph-create-keys@%i.service' starting
up, so that seems to be the culprit.
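
A quick way to confirm that (the mon id 'alpha' is just the one from this test VM):

systemctl status ceph-create-keys@alpha
journalctl -b -u ceph-create-keys@alpha
journalctl -b -u ceph-mon@alpha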

Wido

> Jan
> 
> 
> > On 13 May 2016, at 14:30, Sage Weil  wrote:
> > 
> > This is starting to sound like a xenial systemd issue to me.  Maybe poke 
> > the canonical folks?
> > 
> > You might edit the unit file and make it touch something in /tmp instead 
> > of starting Ceph just to rule out ceph...
> > 
> > sage
> > 
> > 
> > On Fri, 13 May 2016, Wido den Hollander wrote:
> > 
> >> No luck either. After a reboot only the Ceph OSD starts, but the monitor 
> >> not.
> >> 
> >> I have checked:
> >> - service is enabled
> >> - tried to re-enable the service
> >> - check the MON logs to see if it was started, it wasn't
> >> - systemd log to see if it wants to start the MON, it doesn't
> >> 
> >> My systemd-foo isn't that good either, so I don't know what is happening 
> >> here.
> >> 
> >> Wido
> >> 
> >>> Op 12 mei 2016 om 15:31 schreef Jan Schermer :
> >>> 
> >>> 
> >>> Btw try replacing
> >>> 
> >>> WantedBy=ceph-mon.target
> >>> 
> >>> With: WantedBy=default.target
> >>> then systemctl daemon-reload.
> >>> 
> >>> See if that does the trick
> >>> 
> >>> I only messed with systemctl to have my own services start, I still hope 
> >>> it goes away eventually... :P
> >>> 
> >>> Jan
> >>> 
>  On 12 May 2016, at 15:01, Wido den Hollander  wrote:
>  
>  
>  To also answer Sage's question: No, this is a fresh Jewel install in a 
>  few test VMs. This system was not upgraded.
>  
>  It was installed 2 hours ago.
>  
> > Op 12 mei 2016 om 14:51 schreef Jan Schermer :
> > 
> > 
> > Can you post the contents of ceph-mon@.service file?
> > 
>  
>  Yes, here you go:
>  
>  root@charlie:~# cat /lib/systemd/system/ceph-mon@.service 
>  [Unit]
>  Description=Ceph cluster monitor daemon
>  
>  # According to:
>  #   http://www.freedesktop.org/wiki/Software/systemd/NetworkTarget
>  # these can be removed once ceph-mon will dynamically change network
>  # configuration.
>  After=network-online.target local-fs.target ceph-create-keys@%i.service
>  Wants=network-online.target local-fs.target ceph-create-keys@%i.service
>  
>  PartOf=ceph-mon.target
>  
>  [Service]
>  LimitNOFILE=1048576
>  LimitNPROC=1048576
>  EnvironmentFile=-/etc/default/ceph
>  Environment=CLUSTER=ceph
>  ExecStart=/usr/bin/ceph-mon -f --cluster ${CLUSTER} --id %i --setuser 
>  ceph --setgroup ceph
>  ExecReload=/bin/kill -HUP $MAINPID
>  PrivateDevices=yes
>  ProtectHome=true
>  ProtectSystem=full
>  PrivateTmp=true
>  TasksMax=infinity
>  Restart=on-failure
>  StartLimitInterval=30min
>  StartLimitBurst=3
>  
>  [Install]
>  WantedBy=ceph-mon.target
>  root@charlie:~#
>  
> > what does
> > systemctl is-enabled ceph-mon@charlie
> > say?
> > 
>  
>  root@charlie:~# systemctl is-enabled ceph-mon@charlie
>  enabled
>  root@charlie:~#
>  
> > However, this looks like it was just started at a bad moment and died - 
> > nothing in logs?
> > 
>  
>  No, I checked the ceph-mon logs in /var/log/ceph. No sign of it even 
>  trying to start after boot. In /var/log/syslog there also is not a trace 
>  of ceph-mon.
>  
>  Only the OSD starts.
>  
>  Wido
>  
> > Jan
> > 
> > 
> >> On 12 May 2016, at 14:44, Sage Weil  wrote:
> >> 
> >> On Thu, 12 May 2016, Wido den Hollander wrote:
> >>> Hi,
> >>> 
> >>> I am setting up a Jewel cluster in VMs with Ubuntu 16.04.
> >>> 
> >>> ceph version 10.2.0 (3a9fba20ec743699b69bd0181dd6c54dc01c64b9)
> >>> 
> >>> After a reboot the Ceph Monitors don't start and I have to do so 
> >>> manually.
> >>> 
> >>> Three machines, alpha, bravo and charlie all have the same problem.
> >>> 
> >>> root@charlie:~# systemctl status ceph-mon@charlie
> >>> ● ceph-mon@charlie.service - Ceph cluster monitor daemon
> >>> Loaded: loaded (/lib/systemd/system/ceph-mon@.service; enabled; 
> >>> vendor preset: enabled)
> >>> Active: inactive (dead)
> >>> root@charlie:~#
> >>> 
> >>> I can start it and it works
> >> 
> >> Hmm.. my systemd-fu is weak, but if it's enabled it seems like it 
> >> shoud 
> >> come up.
> >> 
> >> Was this an upgraded package

[ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread David
Hi,

Been getting some segfaults in our newest ceph cluster running ceph 9.2.1-1 on 
Debian 8.3

segfault at 0 ip 7f27e85120f7 sp 7f27cff9e860 error 4 in 
libtcmalloc.so.4.2.2

I saw there’s already a bug up there on the tracker: 
http://tracker.ceph.com/issues/15628 
Don’t know how many others are affected by it. We stop and start the OSD to
bring it up again, but it’s quite annoying.

I’m guessing this affects Jewel as well?

Kind Regards,

David Majchrzak



Re: [ceph-users] Multiple backend pools on the same cacher tier pool ?

2016-05-13 Thread Haomai Wang
On Fri, May 13, 2016 at 8:11 PM, Florent B  wrote:
> Hi everyone,
>
> I would like to set up Ceph cache tiering, and I would like to know if I
> can have a single cache tier pool used as "hot storage" for multiple
> backend pools?

No, we can't. I think it would be too complex to implement this in the current
cache tier design.

>
> Documentation only takes example with a single backend pool, or I didn't
> find the information.
>
> Thank you.
>
> Florent


Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread Somnath Roy
What is the exact kernel version?
Ubuntu has a new tcmalloc incorporated from the 3.16.0-50 kernel onwards. If you
are using an older kernel than this, it is better to upgrade the kernel, or to try
building the latest tcmalloc and see if this still happens there.
Ceph is not packaging tcmalloc; it uses the tcmalloc available with the distro.
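
A quick way to check what the node is actually running (Debian/Ubuntu):

uname -r                  # running kernel version
dpkg -l 'libtcmalloc*'    # installed tcmalloc (gperftools) package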

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David
Sent: Friday, May 13, 2016 6:13 AM
To: ceph-users
Subject: [ceph-users] Segfault in libtcmalloc.so.4.2.2

Hi,

Been getting some segfaults in our newest ceph cluster running ceph 9.2.1-1 on 
Debian 8.3
segfault at 0 ip 7f27e85120f7 sp 7f27cff9e860 error 4 in 
libtcmalloc.so.4.2.2

I saw there’s already a bug up there on the tracker: 
http://tracker.ceph.com/issues/15628
Don’t know how many other are affected by it. We stop and start the osd to 
bring it up again but it’s quite annoying.

I’m guessing this affects Jewel as well?

Kind Regards,

David Majchrzak



Re: [ceph-users] Weighted Priority Queue testing

2016-05-13 Thread Somnath Roy
Thanks Christian for the input.
I will start digging into the code and look for a possible explanation.

Regards
Somnath

-Original Message-
From: Christian Balzer [mailto:ch...@gol.com]
Sent: Thursday, May 12, 2016 11:52 PM
To: Somnath Roy
Cc: Scottix; ceph-users@lists.ceph.com; Nick Fisk
Subject: Re: [ceph-users] Weighted Priority Queue testing


Hello,

On Fri, 13 May 2016 05:46:41 + Somnath Roy wrote:

> FYI in my test I used osd_max_backfills = 10 which is hammer default.
> Post hammer it's been changed to 1.
>
All my tests, experiences are with Firefly and Hammer.

Also FYI and possibly pertinent to this discussion, I just added a node with 6 
OSDs to one of my clusters.
I did this by initially adding things with a crush weight of 0 (so nothing
happened) and then in one fell swoop set the weights of all those OSDs to 5.
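
For reference, the add-at-zero-then-bump procedure was roughly the following (the OSD ids and host name are just placeholders):

# add each new OSD to the crushmap with weight 0, so nothing moves yet
ceph osd crush add osd.24 0 host=newnode
# ... repeat for the other five OSDs and start the daemons ...
# then raise all six weights in one step
for i in $(seq 24 29); do ceph osd crush reweight osd.$i 5; done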

Now what I'm seeing (and remember seeing before) is that Ceph is processing
this very sequentially, meaning it is currently backfilling the first 2 OSDs
and doing nothing of the sort with the other 4; they are idle.

"osd_max_backfills" is set to 4, which is incidentally the number of backfills 
happening on the new node now, however this is per OSD, so in theory we could 
expect 24 backfills.
The prospective source OSDs aren't pegged with backfills either, they have
1-2 going on.

I'm seriously wondering if this behavior is related to what we're talking about 
here.

Christian

> Thanks & Regards
> Somnath
>
> -Original Message-
> From: Christian Balzer [mailto:ch...@gol.com]
> Sent: Thursday, May 12, 2016 10:40 PM
> To: Scottix
> Cc: Somnath Roy; ceph-users@lists.ceph.com; Nick Fisk
> Subject: Re: [ceph-users] Weighted Priority Queue testing
>
>
> Hello,
>
> On Thu, 12 May 2016 15:41:13 + Scottix wrote:
>
> > We have run into this same scenarios in terms of the long tail
> > taking much longer on recovery than the initial.
> >
> > Either time we are adding osd or an osd get taken down. At first we
> > have max-backfill set to 1 so it doesn't kill the cluster with io.
> > As time passes by the single osd is performing the backfill. So we
> > are gradually increasing the max-backfill up to 10 to reduce the
> > amount of time it needs to recover fully. I know there are a few
> > other factors at play here but for us we tend to do this procedure every 
> > time.
> >
>
> Yeah, as I wrote in my original mail "This becomes even more obvious
> when backfills and recovery settings are lowered".
>
> However my test cluster is at the default values, so it starts with a
> (much too big) bang and ends with a whimper, not because it's
> throttled but simply because there are so few PGs/OSDs to choose from.
> Or so it seems, purely from observation.
>
> Christian
> > On Wed, May 11, 2016 at 6:29 PM Christian Balzer  wrote:
> >
> > > On Wed, 11 May 2016 16:10:06 + Somnath Roy wrote:
> > >
> > > > I bumped up the backfill/recovery settings to match up Hammer.
> > > > It is probably unlikely that long tail latency is a parallelism
> > > > issue. If so, entire recovery would be suffering not the tail
> > > > alone. It's probably a prioritization issue. Will start looking
> > > > and update my findings. I can't add devl because of the table
> > > > but needed to add community that's why ceph-users :-).. Also,
> > > > wanted to know from Ceph's user if they are also facing similar issues..
> > > >
> > >
> > > What I meant with lack of parallelism is that at the start of a
> > > rebuild, there are likely to be many candidate PGs for recovery
> > > and backfilling, so many things happen at the same time, up to the
> > > limits of what is configured (max backfill etc).
> > >
> > > From looking at my test cluster, it starts with 8-10 backfills and
> > > recoveries (out of 140 affected PGs), but later on in the game
> > > there are less and less PGs (and OSDs/nodes) to choose from, so
> > > things slow down around 60 PGs to just 3-4 backfills.
> > > And around 20 PGs it's down to 1-2 backfills, so the parallelism
> > > is clearly gone at that point and recovery speed is down to what a
> > > single PG/OSD can handle.
> > >
> > > Christian
> > >
> > > > Thanks & Regards
> > > > Somnath
> > > >
> > > > -Original Message-
> > > > From: Christian Balzer [mailto:ch...@gol.com]
> > > > Sent: Wednesday, May 11, 2016 12:31 AM
> > > > To: Somnath Roy
> > > > Cc: Mark Nelson; Nick Fisk; ceph-users@lists.ceph.com
> > > > Subject: Re: [ceph-users] Weighted Priority Queue testing
> > > >
> > > >
> > > >
> > > > Hello,
> > > >
> > > > not sure if the Cc: to the users ML was intentional or not, but
> > > > either way.
> > > >
> > > > The issue seen in the tracker:
> > > > http://tracker.ceph.com/issues/15763
> > > > and what you have seen (and I as well) feels a lot like the lack
> > > > of parallelism towards the end of rebuilds.
> > > >
> > > > This becomes even more obvious when backfills and recovery
> > > > settings are lowered.
> > > >
> > > > Regards,
> > > >
> > > > Christian
> > 

[ceph-users] v0.94.7 Hammer released

2016-05-13 Thread Sage Weil
This Hammer point release fixes several minor bugs. It also includes a 
backport of an improved ‘ceph osd reweight-by-utilization’ command for 
handling OSDs with higher-than-average utilizations.

We recommend that all hammer v0.94.x users upgrade.
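For anyone planning to use the improved command, a rough sketch of its use (the threshold is just an example; check the help output of your build for the exact options, which changed with this backport):

$ ceph osd reweight-by-utilization 110   # only adjust OSDs more than 10% above average utilization

Some builds also include a dry-run variant, ceph osd test-reweight-by-utilization, which reports what would change without applying it.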

For more detailed information, see the release announcement at

http://ceph.com/releases/v0-94-7-hammer-released/

or the complete changelog at

http://docs.ceph.com/docs/master/_downloads/v0.94.7.txt

Getting Ceph


* Git at git://github.com/ceph/ceph.git
* Tarball at http://download.ceph.com/tarballs/ceph-0.94.7.tar.gz
* For packages, see http://ceph.com/docs/master/install/get-packages
* For ceph-deploy, see http://ceph.com/docs/master/install/install-ceph-deploy
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Multiple backend pools on the same cacher tier pool ?

2016-05-13 Thread Nick Fisk
> -Original Message-
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
> Haomai Wang
> Sent: 13 May 2016 15:00
> To: Florent B 
> Cc: Ceph Users 
> Subject: Re: [ceph-users] Multiple backend pools on the same cacher tier
> pool ?
> 
> On Fri, May 13, 2016 at 8:11 PM, Florent B  wrote:
> > Hi everyone,
> >
> > I would like to setup Ceph cache tiering and I would like to know if I
> > can have a single cache tier pool, used as "hot storage" for multiple
> > backend pools ?
> 
> no, we can't. I think it's too complex to implement this in the current
> cache tier design

Technically correct, but I am guessing the OP is thinking of an actual pool
of SSDs, rather than a Ceph pool.

So to give a slightly different answer: no, one cache pool can't tier multiple
base pools, but multiple cache pools could be created on the same set of
SSDs to accomplish this, keeping available space in mind.
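As an illustrative sketch only (pool and rule names are made up, and it assumes a CRUSH root named "ssd" that contains the SSD OSDs), that could look roughly like:

$ ceph osd crush rule create-simple ssd-rule ssd host
$ ceph osd pool create cache-a 128 128 replicated ssd-rule
$ ceph osd pool create cache-b 128 128 replicated ssd-rule
$ ceph osd tier add base-a cache-a
$ ceph osd tier add base-b cache-b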

> 
> >
> > Documentation only takes example with a single backend pool, or I
> > didn't find the information.
> >
> > Thank you.
> >
> > Florent
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread David
Linux osd11.storage 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 
(2016-01-17) x86_64 GNU/Linux

apt-show-versions linux-image-3.16.0-4-amd64
linux-image-3.16.0-4-amd64:amd64/jessie-updates 3.16.7-ckt20-1+deb8u3 
upgradeable to 3.16.7-ckt25-2

apt-show-versions libtcmalloc-minimal4
libtcmalloc-minimal4:amd64/jessie 2.2.1-0.2 uptodate



> On 13 May 2016 at 16:02, Somnath Roy wrote:
> 
> What is the exact kernel version ?
> Ubuntu has a new tcmalloc incorporated from 3.16.0.50 kernel onwards. If you 
> are using older kernel than this better to upgrade kernel or try building 
> latest tcmalloc and try to see if this is happening there.
> Ceph is not packaging tcmalloc it is using the tcmalloc available with distro.
>  
> Thanks & Regards
> Somnath
>  
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David
> Sent: Friday, May 13, 2016 6:13 AM
> To: ceph-users
> Subject: [ceph-users] Segfault in libtcmalloc.so.4.2.2
>  
> Hi,
>  
> Been getting some segfaults in our newest ceph cluster running ceph 9.2.1-1 
> on Debian 8.3
> 
> segfault at 0 ip 7f27e85120f7 sp 7f27cff9e860 error 4 in 
> libtcmalloc.so.4.2.2
>  
> I saw there’s already a bug up there on the tracker: 
> http://tracker.ceph.com/issues/15628 
> Don’t know how many other are affected by it. We stop and start the osd to 
> bring it up again but it’s quite annoying.
>  
> I’m guessing this affects Jewel as well?
>  
> Kind Regards,
>  
> David Majchrzak
>  
> PLEASE NOTE: The information contained in this electronic mail message is 
> intended only for the use of the designated recipient(s) named above. If the 
> reader of this message is not the intended recipient, you are hereby notified 
> that you have received this message in error and that any review, 
> dissemination, distribution, or copying of this message is strictly 
> prohibited. If you have received this communication in error, please notify 
> the sender by telephone or e-mail (as shown above) immediately and destroy 
> any and all copies of this message in your possession (whether hard copies or 
> electronically stored copies).

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

2016-05-13 Thread Somnath Roy
I am not sure about Debian, but for Ubuntu the latest tcmalloc is not 
incorporated until 3.16.0.50.
You can use the attached program to detect whether your tcmalloc is okay or not. Do 
this:

$ g++ -o gperftest tcmalloc_test.c -ltcmalloc
$ TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES=67108864 ./gperftest

BTW, I am not saying the latest tcmalloc will fix the issue, but it is worth trying.

Thanks & Regards
Somnath

From: David [mailto:da...@visions.se]
Sent: Friday, May 13, 2016 7:49 AM
To: Somnath Roy
Cc: ceph-users
Subject: Re: [ceph-users] Segfault in libtcmalloc.so.4.2.2

Linux osd11.storage 3.16.0-4-amd64 #1 SMP Debian 3.16.7-ckt20-1+deb8u3 
(2016-01-17) x86_64 GNU/Linux

apt-show-versions linux-image-3.16.0-4-amd64
linux-image-3.16.0-4-amd64:amd64/jessie-updates 3.16.7-ckt20-1+deb8u3 
upgradeable to 3.16.7-ckt25-2

apt-show-versions libtcmalloc-minimal4
libtcmalloc-minimal4:amd64/jessie 2.2.1-0.2 uptodate



On 13 May 2016 at 16:02, Somnath Roy <somnath@sandisk.com> wrote:

What is the exact kernel version ?
Ubuntu has a new tcmalloc incorporated from 3.16.0.50 kernel onwards. If you 
are using older kernel than this better to upgrade kernel or try building 
latest tcmalloc and try to see if this is happening there.
Ceph is not packaging tcmalloc it is using the tcmalloc available with distro.

Thanks & Regards
Somnath

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of David
Sent: Friday, May 13, 2016 6:13 AM
To: ceph-users
Subject: [ceph-users] Segfault in libtcmalloc.so.4.2.2

Hi,

Been getting some segfaults in our newest ceph cluster running ceph 9.2.1-1 on 
Debian 8.3
segfault at 0 ip 7f27e85120f7 sp 7f27cff9e860 error 4 in 
libtcmalloc.so.4.2.2

I saw there’s already a bug up there on the tracker: 
http://tracker.ceph.com/issues/15628
Don’t know how many other are affected by it. We stop and start the osd to 
bring it up again but it’s quite annoying.

I’m guessing this affects Jewel as well?

Kind Regards,

David Majchrzak

PLEASE NOTE: The information contained in this electronic mail message is 
intended only for the use of the designated recipient(s) named above. If the 
reader of this message is not the intended recipient, you are hereby notified 
that you have received this message in error and that any review, 
dissemination, distribution, or copying of this message is strictly prohibited. 
If you have received this communication in error, please notify the sender by 
telephone or e-mail (as shown above) immediately and destroy any and all copies 
of this message in your possession (whether hard copies or electronically 
stored copies).

#include <iostream>
#include <stdlib.h>

#ifdef HAVE_GPERFTOOLS_HEAP_PROFILER_H
#include <gperftools/heap-profiler.h>
#else
#include <google/heap-profiler.h>
#endif

#ifdef HAVE_GPERFTOOLS_MALLOC_EXTENSION_H
#include <gperftools/malloc_extension.h>
#else
#include <google/malloc_extension.h>
#endif

// Checks whether TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES actually overrides
// tcmalloc's internal max_total_thread_cache_bytes setting.

using namespace std;

int main ()
{
  size_t tc_cache_sz;
  size_t env_cache_sz;
  char *env_cache_sz_str;
  int st;

  env_cache_sz_str = getenv("TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES");
  if (env_cache_sz_str) {
env_cache_sz = strtoul(env_cache_sz_str, NULL, 0);
if (env_cache_sz == 33554432) {
cout << "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES Value same as default:"
" 33554432 export a different value for test" << endl;
exit(EXIT_FAILURE);
}
tc_cache_sz = 0;
MallocExtension::instance()->
GetNumericProperty("tcmalloc.max_total_thread_cache_bytes",
&tc_cache_sz);
if (tc_cache_sz == env_cache_sz) {
  cout << "Tcmalloc OK! Internal and Env cache size are same:" <<
  tc_cache_sz << endl;
  st = EXIT_SUCCESS;
} else {
  cout << "Tcmalloc BUG! TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES: "
  << env_cache_sz << " Internal Size: " << tc_cache_sz
  << " different" << endl;
  st = EXIT_FAILURE;
}
  } else {
cout << "TCMALLOC_MAX_TOTAL_THREAD_CACHE_BYTES Env Not Set" << endl;
st = EXIT_FAILURE;
  }
  exit(st);
}
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Steps for Adding Cache Tier

2016-05-13 Thread MailingLists - EWS
I have been reading a lot of information about cache-tiers, and I wanted to
know how best to go about adding the cache-tier to a production environment.

 

Our current setup is Infernalis (9.2.1) 4 nodes with 8 x 4TB SATA drives per
node and 2 x 400GB NVMe acting as journals (1:4 ratio). There is a bunch of
spare space on the NVMe's so we would like to partition that and make them
OSDs for a cache-tier. Each NVMe should have about 200GB of space available
on them giving us plenty of cache space (8 x 200GB), placing the journals on
the NVMe since they have more than enough bandwidth.

 

Our primary usage for Ceph at this time is powering RBD block storage for an
OpenStack cluster. The vast majority of our users use the system mainly for
long term storage (store and hold data) but we do get some "hotspots" from
time to time and we want to help smooth those out a little bit.

 

I have read this page:
http://docs.ceph.com/docs/master/rados/operations/cache-tiering/ and believe
that I have a handle on most of that.

 

I recall some additional information regarding permissions for block device
access (making sure that your cephx permissions allow access to the
cache-tier pool).

 

Our plan is:

 

-  partition the NVMe's and create the OSDs manually with a 0 weight


-  create our new cache pool, and adjust the crushmap to place the
cache pool on these OSDs

-  make sure permissions and settings are taken care of (making sure
our cephx volumes user has rwx on the cache-tier pool)

-  add the cache-tier to our volumes pool

-  ???

-  Profit!

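For what it's worth, the tiering part of the plan above maps onto roughly the following commands. This is only a sketch: "volumes" and "volumes-cache" are assumed pool names, and the thresholds are placeholders that need tuning for the actual cache capacity:

$ ceph osd tier add volumes volumes-cache
$ ceph osd tier cache-mode volumes-cache writeback
$ ceph osd tier set-overlay volumes volumes-cache
$ ceph osd pool set volumes-cache hit_set_type bloom
$ ceph osd pool set volumes-cache hit_set_count 1
$ ceph osd pool set volumes-cache hit_set_period 3600
$ ceph osd pool set volumes-cache target_max_bytes 600000000000
$ ceph osd pool set volumes-cache cache_target_dirty_ratio 0.4
$ ceph osd pool set volumes-cache cache_target_full_ratio 0.8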
 

Is there anything we might be missing here? Are there any other issues that
we might need to be aware of? I seem to recall some discussion on the list
with regard to settings that were required to make caching work correctly,
but my memory seems to indicate that these changes were already added to the
page listed above. Is that assumption correct?

 

Tom Walsh

https://expresshosting.net/

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mount Jewel CephFS to CentOS6

2016-05-13 Thread Andrus, Brian Contractor
So I see that support for RHEL6 and derivatives was dropped in Jewel 
(http://ceph.com/releases/v10-2-0-jewel-released/)

But is there backward compatibility to mount it using hammer on a node? Doesn't 
seem to be and that makes some sense, but how can I mount CephFS from a 
CentOS7-Jewel server to a CentOS6 box?

Thanks in advance for any advice,


Brian Andrus
ITACS/Research Computing
Naval Postgraduate School
Monterey, California
voice: 831-656-6238


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mount Jewel CephFS to CentOS6

2016-05-13 Thread Oliver Dzombic
Hi,

ceph-fuse will be your option.

Or, if you can run a kernel > 2.6.32
(or whenever ceph was introduced into the kernel),

then you can also use the kernel mount with hammer.
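Hedged examples of both approaches (monitor address, mount point and secret file location are placeholders):

$ ceph-fuse -m 10.0.0.1:6789 /mnt/cephfs
$ mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret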


-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


Am 13.05.2016 um 18:49 schrieb Andrus, Brian Contractor:
> So I see that support for RHEL6 and derivatives was dropped in Jewel
> (http://ceph.com/releases/v10-2-0-jewel-released/)
> 
>  
> 
> But is there backward compatibility to mount it using hammer on a node?
> Doesn’t seem to be and that makes some sense, but how can I mount CephFS
> from a CentOS7-Jewel server to a CentOS6 box?
> 
>  
> 
> Thanks in advance for any advice,
> 
>  
> 
>  
> 
> Brian Andrus
> 
> ITACS/Research Computing
> 
> Naval Postgraduate School
> 
> Monterey, California
> 
> voice: 831-656-6238
> 
>  
> 
>  
> 
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Upgrade to Jewel... More interesting things...

2016-05-13 Thread Tu Holmes
So I'm updating a trusty cluster to Jewel and updating the kernel at the
same time.

Got around some mon issues, and that seems ok, but after upgrading one of
my OSD nodes, I'm getting these errors in the old log on that node.

0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22401
2016-05-13 10:51:39.416511 7fe486e73800 0 pidfile_write: ignore empty --pid-file
2016-05-13 10:51:39.423336 7fe486e73800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
2016-05-13 10:51:39.423342 7fe486e73800 -1 osd.6 0 OSD:init: unable to mount object store
2016-05-13 10:51:39.423348 7fe486e73800 -1 ** ERROR: osd init failed: (13) Permission denied
2016-05-13 10:51:39.572099 7f4d86a68800 0 set uid:gid to 1000:1000 (ceph:ceph)
2016-05-13 10:51:39.572126 7f4d86a68800 0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22530
2016-05-13 10:51:39.573744 7f4d86a68800 0 pidfile_write: ignore empty --pid-file
2016-05-13 10:51:39.580543 7f4d86a68800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
2016-05-13 10:51:39.580549 7f4d86a68800 -1 osd.6 0 OSD:init: unable to mount object store
2016-05-13 10:51:39.580554 7f4d86a68800 -1 ** ERROR: osd init failed: (13) Permission denied
2016-05-13 10:51:39.798196 7fc0daf80800 0 set uid:gid to 1000:1000 (ceph:ceph)
2016-05-13 10:51:39.798222 7fc0daf80800 0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22724
2016-05-13 10:51:39.799923 7fc0daf80800 0 pidfile_write: ignore empty --pid-file
2016-05-13 10:51:39.806382 7fc0daf80800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
2016-05-13 10:51:39.806387 7fc0daf80800 -1 osd.6 0 OSD:init: unable to mount object store
2016-05-13 10:51:39.806390 7fc0daf80800 -1 ** ERROR: osd init failed: (13) Permission denied
2016-05-13 10:51:39.954085 7feb28741800 0 set uid:gid to 1000:1000 (ceph:ceph)
2016-05-13 10:51:39.954112 7feb28741800 0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22888
2016-05-13 10:51:39.955839 7feb28741800 0 pidfile_write: ignore empty --pid-file
2016-05-13 10:51:39.962785 7feb28741800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
2016-05-13 10:51:39.962791 7feb28741800 -1 osd.6 0 OSD:init: unable to mount object store
2016-05-13 10:51:39.962796 7feb28741800 -1 ** ERROR: osd init failed: (13) Permission denied


The OSDs are all mounted as expected:



/dev/sdl1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-6
/dev/sdn1 3.7T 2.0T 1.7T 55% /var/lib/ceph/osd/ceph-18
/dev/sdb1 3.7T 2.2T 1.5T 61% /var/lib/ceph/osd/ceph-30
/dev/sdf1 3.7T 2.0T 1.7T 54% /var/lib/ceph/osd/ceph-54
/dev/sdh1 3.7T 1.9T 1.8T 52% /var/lib/ceph/osd/ceph-66
/dev/sde1 3.7T 1.9T 1.8T 51% /var/lib/ceph/osd/ceph-48
/dev/sdd1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-42
/dev/sdk1 3.7T 1.7T 2.0T 46% /var/lib/ceph/osd/ceph-0
/dev/sda1 3.7T 1.9T 1.8T 51% /var/lib/ceph/osd/ceph-24
/dev/sdm1 3.7T 1.9T 1.8T 52% /var/lib/ceph/osd/ceph-12
/dev/sdc1 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/ceph-36
/dev/sdg1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-60


Any ideas as to what could be going on?

//Tu Holmes
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mount Jewel CephFS to CentOS6

2016-05-13 Thread Gregory Farnum
On Friday, May 13, 2016, Andrus, Brian Contractor  wrote:

> So I see that support for RHEL6 and derivatives was dropped in Jewel (
> http://ceph.com/releases/v10-2-0-jewel-released/)
>
>
>
> But is there backward compatibility to mount it using hammer on a node?
> Doesn’t seem to be and that makes some sense, but how can I mount CephFS
> from a CentOS7-Jewel server to a CentOS6 box?
>
Backwards testing is limited (especially for CephFS) but you should be able
to mount with any client. Just beware of the bugs. ;)



>
>
> Thanks in advance for any advice,
>
>
>
>
>
> Brian Andrus
>
> ITACS/Research Computing
>
> Naval Postgraduate School
>
> Monterey, California
>
> voice: 831-656-6238
>
>
>
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mount Jewel CephFS to CentOS6

2016-05-13 Thread Ilya Dryomov
On Fri, May 13, 2016 at 8:02 PM, Gregory Farnum  wrote:
>
>
> On Friday, May 13, 2016, Andrus, Brian Contractor  wrote:
>>
>> So I see that support for RHEL6 and derivatives was dropped in Jewel
>> (http://ceph.com/releases/v10-2-0-jewel-released/)
>>
>>
>>
>> But is there backward compatibility to mount it using hammer on a node?
>> Doesn’t seem to be and that makes some sense, but how can I mount CephFS
>> from a CentOS7-Jewel server to a CentOS6 box?
>
> Backwards testing is limited (especially for CephFS) but you should be able
> to mount with any client. Just beware of the bugs. ;)

... as long as you don't have any incompatible features or CRUSH
tunables enabled on the jewel side.

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade to Jewel... More interesting things...

2016-05-13 Thread LOPEZ Jean-Charles
Hi Tu,

What version were you upgrading from?

In Jewel, all Ceph processes run as the ceph user rather than as root, so maybe you 
should check the permissions of the /var/lib/ceph/osd subdirectories. If you have 
upgraded from hammer, that is very likely the problem.
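A minimal sketch of the fix, assuming the standard /var/lib/ceph layout and that the OSDs on the node are stopped first (the stop/start commands depend on the init system; trusty uses Upstart):

$ sudo stop ceph-osd-all          # or: sudo systemctl stop ceph-osd.target on systemd hosts
$ sudo chown -R ceph:ceph /var/lib/ceph
$ sudo start ceph-osd-all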

Regards
JC


> On May 13, 2016, at 11:00, Tu Holmes  wrote:
> 
> So I'm updating a trusty cluster to Jewel and updating the kernel at the same 
> time. 
> 
> Got around some mon issues, and that seems ok, but after upgrading one of my 
> OSD nodes, I'm getting these errors in the old log on that node.
> 
> 
> 
> 0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process 
> ceph-osd, pid 22401 
> 2016-05-13 10:51:39.416511 7fe486e73800 0 pidfile_write: ignore empty 
> --pid-file 
> 2016-05-13 10:51:39.423336 7fe486e73800 -1 
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access 
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 
> 2016-05-13 10:51:39.423342 7fe486e73800 -1 osd.6 0 OSD:init: unable to mount 
> object store 
> 2016-05-13 10:51:39.423348 7fe486e73800 -1 ** ERROR: osd init failed: (13) 
> Permission denied 
> 2016-05-13 10:51:39.572099 7f4d86a68800 0 set uid:gid to 1000:1000 
> (ceph:ceph) 
> 2016-05-13 10:51:39.572126 7f4d86a68800 0 ceph version 10.2.1 
> (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22530 
> 2016-05-13 10:51:39.573744 7f4d86a68800 0 pidfile_write: ignore empty 
> --pid-file 
> 2016-05-13 10:51:39.580543 7f4d86a68800 -1 
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access 
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 
> 2016-05-13 10:51:39.580549 7f4d86a68800 -1 osd.6 0 OSD:init: unable to mount 
> object store 
> 2016-05-13 10:51:39.580554 7f4d86a68800 -1 ** ERROR: osd init failed: (13) 
> Permission denied 
> 2016-05-13 10:51:39.798196 7fc0daf80800 0 set uid:gid to 1000:1000 
> (ceph:ceph) 
> 2016-05-13 10:51:39.798222 7fc0daf80800 0 ceph version 10.2.1 
> (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22724 
> 2016-05-13 10:51:39.799923 7fc0daf80800 0 pidfile_write: ignore empty 
> --pid-file 
> 2016-05-13 10:51:39.806382 7fc0daf80800 -1 
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access 
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 
> 2016-05-13 10:51:39.806387 7fc0daf80800 -1 osd.6 0 OSD:init: unable to mount 
> object store 
> 2016-05-13 10:51:39.806390 7fc0daf80800 -1 ** ERROR: osd init failed: (13) 
> Permission denied 
> 2016-05-13 10:51:39.954085 7feb28741800 0 set uid:gid to 1000:1000 
> (ceph:ceph) 
> 2016-05-13 10:51:39.954112 7feb28741800 0 ceph version 10.2.1 
> (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22888 
> 2016-05-13 10:51:39.955839 7feb28741800 0 pidfile_write: ignore empty 
> --pid-file 
> 2016-05-13 10:51:39.962785 7feb28741800 -1 
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access 
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 
> 2016-05-13 10:51:39.962791 7feb28741800 -1 osd.6 0 OSD:init: unable to mount 
> object store 
> 2016-05-13 10:51:39.962796 7feb28741800 -1 ** ERROR: osd init failed: (13) 
> Permission denied
> 
> 
> The OSDs are all mounted as expected:
> 
> 
> 
> /dev/sdl1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-6 
> /dev/sdn1 3.7T 2.0T 1.7T 55% /var/lib/ceph/osd/ceph-18 
> /dev/sdb1 3.7T 2.2T 1.5T 61% /var/lib/ceph/osd/ceph-30 
> /dev/sdf1 3.7T 2.0T 1.7T 54% /var/lib/ceph/osd/ceph-54 
> /dev/sdh1 3.7T 1.9T 1.8T 52% /var/lib/ceph/osd/ceph-66 
> /dev/sde1 3.7T 1.9T 1.8T 51% /var/lib/ceph/osd/ceph-48 
> /dev/sdd1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-42 
> /dev/sdk1 3.7T 1.7T 2.0T 46% /var/lib/ceph/osd/ceph-0 
> /dev/sda1 3.7T 1.9T 1.8T 51% /var/lib/ceph/osd/ceph-24 
> /dev/sdm1 3.7T 1.9T 1.8T 52% /var/lib/ceph/osd/ceph-12 
> /dev/sdc1 3.7T 1.7T 2.0T 47% /var/lib/ceph/osd/ceph-36 
> /dev/sdg1 3.7T 1.8T 1.9T 49% /var/lib/ceph/osd/ceph-60
> 
> 
> Any ideas as to what could be going on?
> 
> //Tu Holmes
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade to Jewel... More interesting things...

2016-05-13 Thread MailingLists - EWS
Did you check the permissions of those directories?

 

Part of the steps in the upgrade process mentions the following:

 

chown -R ceph:ceph /var/lib/ceph

 

Tom Walsh

https://expresshosting.net/

 


0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process 
ceph-osd, pid 22401 
2016-05-13 10:51:39.416511 7fe486e73800 0 pidfile_write: ignore empty 
--pid-file 
2016-05-13 10:51:39.423336 7fe486e73800 -1 filestore(/var/lib/ceph/osd/ceph-6) 
FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) 
Permission denied 
2016-05-13 10:51:39.423342 7fe486e73800 -1 osd.6 0 OSD:init: unable to mount 
object store 
2016-05-13 10:51:39.423348 7fe486e73800 -1 ** ERROR: osd init failed: (13) 
Permission denied 
2016-05-13 10:51:39.572099 7f4d86a68800 0 set uid:gid to 1000:1000 (ceph:ceph) 
2016-05-13 10:51:39.572126 7f4d86a68800 0 ceph version 10.2.1 
(3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22530 
2016-05-13 10:51:39.573744 7f4d86a68800 0 pidfile_write: ignore empty 
--pid-file 
2016-05-13 10:51:39.580543 7f4d86a68800 -1 filestore(/var/lib/ceph/osd/ceph-6) 
FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) 
Permission denied 
2016-05-13 10:51:39.580549 7f4d86a68800 -1 osd.6 0 OSD:init: unable to mount 
object store 
2016-05-13 10:51:39.580554 7f4d86a68800 -1 ** ERROR: osd init failed: (13) 
Permission denied 
2016-05-13 10:51:39.798196 7fc0daf80800 0 set uid:gid to 1000:1000 (ceph:ceph) 
2016-05-13 10:51:39.798222 7fc0daf80800 0 ceph version 10.2.1 
(3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22724 
2016-05-13 10:51:39.799923 7fc0daf80800 0 pidfile_write: ignore empty 
--pid-file 
2016-05-13 10:51:39.806382 7fc0daf80800 -1 filestore(/var/lib/ceph/osd/ceph-6) 
FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) 
Permission denied 
2016-05-13 10:51:39.806387 7fc0daf80800 -1 osd.6 0 OSD:init: unable to mount 
object store 
2016-05-13 10:51:39.806390 7fc0daf80800 -1 ** ERROR: osd init failed: (13) 
Permission denied 
2016-05-13 10:51:39.954085 7feb28741800 0 set uid:gid to 1000:1000 (ceph:ceph) 
2016-05-13 10:51:39.954112 7feb28741800 0 ceph version 10.2.1 
(3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22888 
2016-05-13 10:51:39.955839 7feb28741800 0 pidfile_write: ignore empty 
--pid-file 
2016-05-13 10:51:39.962785 7feb28741800 -1 filestore(/var/lib/ceph/osd/ceph-6) 
FileStore::mount: unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) 
Permission denied 
2016-05-13 10:51:39.962791 7feb28741800 -1 osd.6 0 OSD:init: unable to mount 
object store 
2016-05-13 10:51:39.962796 7feb28741800 -1 ** ERROR: osd init failed: (13) 
Permission denied

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Upgrade to Jewel... More interesting things...

2016-05-13 Thread Tu Holmes
That is most likely exactly what my issue is. I must have missed that step.

Thanks.

Will report back.


On Fri, May 13, 2016 at 11:18 AM MailingLists - EWS <
mailingli...@expresswebsystems.com> wrote:

> Did you check the permissions of those directories?
>
>
>
> Part of the steps in the upgrade process mentions the following:
>
>
>
> chown -R ceph:ceph /var/lib/ceph
>
>
>
> Tom Walsh
>
> https://expresshosting.net/
>
> *0 ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process
> ceph-osd, pid 22401 2016-05-13 10:51:39.416511 7fe486e73800 0
> pidfile_write: ignore empty --pid-file 2016-05-13 10:51:39.423336
> 7fe486e73800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount:
> unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
> 2016-05-13 10:51:39.423342 7fe486e73800 -1 osd.6 0 OSD:init: unable to
> mount object store 2016-05-13 10:51:39.423348 7fe486e73800 -1 ** ERROR: osd
> init failed: (13) Permission denied 2016-05-13 10:51:39.572099 7f4d86a68800
> 0 set uid:gid to 1000:1000 (ceph:ceph) 2016-05-13 10:51:39.572126
> 7f4d86a68800 0 ceph version 10.2.1
> (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22530
> 2016-05-13 10:51:39.573744 7f4d86a68800 0 pidfile_write: ignore empty
> --pid-file 2016-05-13 10:51:39.580543 7f4d86a68800 -1
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 2016-05-13
> 10:51:39.580549 7f4d86a68800 -1 osd.6 0 OSD:init: unable to mount object
> store 2016-05-13 10:51:39.580554 7f4d86a68800 -1 ** ERROR: osd init failed:
> (13) Permission denied 2016-05-13 10:51:39.798196 7fc0daf80800 0 set
> uid:gid to 1000:1000 (ceph:ceph) 2016-05-13 10:51:39.798222 7fc0daf80800 0
> ceph version 10.2.1 (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process
> ceph-osd, pid 22724 2016-05-13 10:51:39.799923 7fc0daf80800 0
> pidfile_write: ignore empty --pid-file 2016-05-13 10:51:39.806382
> 7fc0daf80800 -1 filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount:
> unable to access basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied
> 2016-05-13 10:51:39.806387 7fc0daf80800 -1 osd.6 0 OSD:init: unable to
> mount object store 2016-05-13 10:51:39.806390 7fc0daf80800 -1 ** ERROR: osd
> init failed: (13) Permission denied 2016-05-13 10:51:39.954085 7feb28741800
> 0 set uid:gid to 1000:1000 (ceph:ceph) 2016-05-13 10:51:39.954112
> 7feb28741800 0 ceph version 10.2.1
> (3a66dd4f30852819c1bdaa8ec23c795d4ad77269), process ceph-osd, pid 22888
> 2016-05-13 10:51:39.955839 7feb28741800 0 pidfile_write: ignore empty
> --pid-file 2016-05-13 10:51:39.962785 7feb28741800 -1
> filestore(/var/lib/ceph/osd/ceph-6) FileStore::mount: unable to access
> basedir '/var/lib/ceph/osd/ceph-6': (13) Permission denied 2016-05-13
> 10:51:39.962791 7feb28741800 -1 osd.6 0 OSD:init: unable to mount object
> store 2016-05-13 10:51:39.962796 7feb28741800 -1 ** ERROR: osd init failed:
> (13) Permission denied*
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] How do ceph clients determine a monitor's address (and esp. port) for initial connection?

2016-05-13 Thread Gregory Farnum
On Fri, May 13, 2016 at 12:51 AM, Christian Sarrasin
 wrote:
> Hi Greg,
>
> Thanks again and good guess!  Amending testcluster.conf as follows:
>
> mon host = 192.168.10.201:6788
> mon addr = 192.168.10.201:6788
>
> ... gets around the problem.
>
> having "mon host = mona:6788" also works.
>
> Should I raise a defect or is this workaround good enough?

Well, if the docs don't make that clear it sounds like a doc bug, at least!
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Starting a cluster with one OSD node

2016-05-13 Thread Mike Jacobacci
Hello,

I have a quick and probably dumb question… We would like to use Ceph for our 
storage, I was thinking of a cluster with 3 Monitor and OSD nodes.  I was 
wondering if it was a bad idea to start a Ceph cluster with just one OSD node 
(10 OSDs, 2 SSDs), then add more nodes as our budget allows?  We want to spread 
out the purchases of the OSD nodes over a month or two but I would like to 
start moving data over ASAP.

Cheers,
Mike


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Mounting format 2 rbd images (created in Jewel) on CentOS 7 clients

2016-05-13 Thread Steven Hsiao-Ting Lee
Hi,

I’m playing with Jewel and discovered format 1 images have been deprecated. 
Since the rbd kernel module in CentOS/RHEL 7 does not yet support format 2 
images, how do I access RBD images created in Jewel from CentOS/RHEL 7 clients? 
Thanks!


Steven
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Mounting format 2 rbd images (created in Jewel) on CentOS 7 clients

2016-05-13 Thread Ilya Dryomov
On Fri, May 13, 2016 at 10:11 PM, Steven Hsiao-Ting Lee
 wrote:
> Hi,
>
> I’m playing with Jewel and discovered format 1 images have been deprecated. 
> Since the rbd kernel module in CentOS/RHEL 7 does not yet support format 2 
> images, how do I access RBD images created in Jewel from CentOS/RHEL 7 
> clients? Thanks!

It does support format 2 images.  What it doesn't support is the extra
features enabled by default in jewel.

Do

$ rbd feature disable <image> deep-flatten,fast-diff,object-map,exclusive-lock

to disable features unsupported by the kernel client.  If you are using the
kernel client, you should create your images with

$ rbd create --size <size> --image-feature layering <image>

or add

rbd default features = 3

to ceph.conf on the client side.  (Setting rbd default features on the
OSDs will have no effect.)
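To see what a given image actually has enabled, something like this should list the features (pool and image names are placeholders):

$ rbd info rbd/myimage | grep features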

Thanks,

Ilya
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] What's the minimal version of "ceph" client side the current "jewel" release would support?

2016-05-13 Thread Bob R
Yang,

We've got some proxmox hosts which are still running firefly and appear to
be working fine with Jewel. We did have a problem where the firefly clients
wouldn't communicate with the ceph cluster due to mismatched capabilities
flags but this was resolved by setting "ceph osd crush tunables legacy".
Note that moving from optimal tunables to legacy did shuffle quite a bit of
data.
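For reference, the relevant commands are along these lines; as noted above, switching profiles can move quite a bit of data:

$ ceph osd crush show-tunables
$ ceph osd crush tunables legacy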

Bob

On Thu, May 12, 2016 at 3:33 PM, Yang X  wrote:

> See title.
>
> We have Firefly on the client side (SLES11SP3) and it does not seem to
> work well with the "jewel" server nodes (CentOS 7)
>
> Can somebody please provide some guidelines?
>
> Thanks,
>
> Yang
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] ceph-hammer - problem adding / removing monitors

2016-05-13 Thread Michael Kuriger
Hi everyone.  We’re running ceph-hammer, and I was trying to rename our monitor 
servers.  I tried following the procedure for removing a monitor, and adding a 
monitor.  Removing seems to have worked ok, as now I have 2 monitors up.

When I try to add the 3rd monitor, and the ceph-deploy completes,  I see this 
error in the logs:

cephx: verify_reply couldn't decrypt with error: error decoding block for 
decryption


Now I have a cluster with only 2 monitors.  My last resort would be to stop all my 
monitors and manually inject a new monmap.  Hopefully this isn’t necessary.


Can anyone help?  Thanks!


# ceph -s

2016-05-13 23:36:12.927558 7f88d02f6700  0 -- :/920712977 >> 
10.1.161.118:6789/0 pipe(0x7f88cc05f050 sd=3 :0 s=1 pgs=0 cs=0 l=1 
c=0x7f88cc05bcf0).fault

cluster 5f916b57-e171-4d1c-9a43-342a70475fc5

 health HEALTH_OK

 monmap e2: 2 mons at 
{ypec-prod1-ms104=10.1.161.116:6789/0,ypec-prod1-ms105=10.1.161.117:6789/0}

election epoch 78, quorum 0,1 ms104,ms105

 mdsmap e42934: 1/1/1 up {0=ypec-prod1-ms104=up:active}, 1 up:standby

 osdmap e42574: 64 osds: 64 up, 64 in

  pgmap v5593337: 2816 pgs, 5 pools, 487 GB data, 375 kobjects

1788 GB used, 178 TB / 180 TB avail

2816 active+clean

  client io 7334 B/s wr, 2 op/s

 Mike Kuriger



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] straw_calc_version

2016-05-13 Thread Tu Holmes
Hello again Cephers... As I'm learning more and breaking more things, I'm
finding more things I don't know.

So currently, with all of the other things since I started upgrading to
Jewel, I'm seeing this in my logs.

crush map has straw_calc_version=0

Now, yes, I understand the general crush map and what it does, but what is
this straw_calc_version.

Should I change this tunable to a different version?

Is there a potential for it to break something if I do?

I didn't see this before the Jewel upgrade, so I'm inclined to change it,
but I want to make sure I'm not breaking something.

Thanks.

//Tu Holmes
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] straw_calc_version

2016-05-13 Thread Gregory Farnum
On Fri, May 13, 2016 at 5:02 PM, Tu Holmes  wrote:
> Hello again Cephers... As I'm learning more and breaking more things, I'm
> finding more things I don't know.
>
> So currently, with all of the other things since I started upgrading to
> Jewel, I'm seeing this in my logs.
>
> crush map has straw_calc_version=0
>
> Now, yes, I understand the general crush map and what it does, but what is
> this straw_calc_version.
>
> Should I change this tunable to a different version?
>
> Is there a potential for it to break something if I do?
>
> I didn't see this before the Jewel upgrade, so I'm inclined to change it,
> but I want to make sure I'm not breaking something.


See 
http://docs.ceph.com/docs/master/rados/operations/crush-map#straw-calc-version-tunable
and http://docs.ceph.com/docs/master/release-notes/#adjusting-crush-maps

This was initially an option in Firefly, but with Jewel we're pushing
you a little harder to switch. It shouldn't break anything directly
(unlike most other crush tunables, it doesn't change which clients can
connect) but does have potential for data movement.
-Greg
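For reference, the tunable can also be flipped on its own rather than via a full profile change, e.g.:

$ ceph osd crush set-tunable straw_calc_version 1

Again, expect some data movement once it takes effect.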
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] straw_calc_version

2016-05-13 Thread Tu Holmes
Thank you for the info.

Basically I should just set it to 1.


On Fri, May 13, 2016 at 5:12 PM Gregory Farnum  wrote:

> On Fri, May 13, 2016 at 5:02 PM, Tu Holmes  wrote:
> > Hello again Cephers... As I'm learning more and breaking more things, I'm
> > finding more things I don't know.
> >
> > So currently, with all of the other things since I started upgrading to
> > Jewel, I'm seeing this in my logs.
> >
> > crush map has straw_calc_version=0
> >
> > Now, yes, I understand the general crush map and what it does, but what
> is
> > this straw_calc_version.
> >
> > Should I change this tunable to a different version?
> >
> > Is there a potential for it to break something if I do?
> >
> > I didn't see this before the Jewel upgrade, so I'm inclined to change it,
> > but I want to make sure I'm not breaking something.
>
>
> See
> http://docs.ceph.com/docs/master/rados/operations/crush-map#straw-calc-version-tunable
> and http://docs.ceph.com/docs/master/release-notes/#adjusting-crush-maps
>
> This was initially an option in Firefly, but with Jewel we're pushing
> you a little harder to switch. It shouldn't break anything directly
> (unlike most other crush tunables, it doesn't change which clients can
> connect) but does have potential for data movement.
> -Greg
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Starting a cluster with one OSD node

2016-05-13 Thread Alex Gorbachev
On Friday, May 13, 2016, Mike Jacobacci  wrote:

> Hello,
>
> I have a quick and probably dumb question… We would like to use Ceph for
> our storage, I was thinking of a cluster with 3 Monitor and OSD nodes.  I
> was wondering if it was a bad idea to start a Ceph cluster with just one
> OSD node (10 OSDs, 2 SSDs), then add more nodes as our budget allows?  We
> want to spread out the purchases of the OSD nodes over a month or two but I
> would like to start moving data over ASAP.


Hi Mike,

Production or test?  I would strongly recommend against one OSD node in
production.  Not only is there a risk of hangs and data loss due to e.g. a
filesystem or kernel issue, but as you add nodes the data movement will
introduce a good deal of overhead.
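If it is only for testing, a single node can still keep replicas on separate OSDs by using a CRUSH rule with osd as the failure domain; a rough sketch (the rule name is made up, and the rule id has to be looked up before assigning it to a pool):

$ ceph osd crush rule create-simple single-host-rule default osd
$ ceph osd crush rule dump single-host-rule     # note the rule_id
$ ceph osd pool set rbd crush_ruleset <rule-id>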

Regards,
Alex



>
> Cheers,
> Mike
>
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com 
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>


-- 
--
Alex Gorbachev
Storcium
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Weighted Priority Queue testing

2016-05-13 Thread Christian Balzer

Hello again,

On Fri, 13 May 2016 14:17:22 + Somnath Roy wrote:

> Thanks Christian for the input.
> I will start digging the code and look for possible explanation.
> 

To be fair, after a while more PGs become involved, up to to a backfill
count of 18 (that's 9 actually backfill operations as it counts both reads
and writes). But the last OSD of the 6 new ones didn't see any action
until nearly 3 hours into the process.

As they say, a picture is worth a thousand words, this the primary PG
distribution during that backfill operation:

https://i.imgur.com/mp6yUW7.png

The start point is a 3-node cluster: 1 node with 2 large OSDs (to be
replaced with a 6-OSD one later), one node with 6 OSDs and another 6 OSD
node with all OSDs set to a crush weight of 0.

Note that the 6 OSDs nodes have their OSD max backfill set to 4, the node
with the 2 large OSDs is at 1. This of course explains some of the
behavior seen here, but not all of it by far.

So at 15:00 I did set the crush weight of all new 6 OSD to 5 (the same as
all the others). 
As you can see, the first OSD (in order of weight change) starts growing
right away, the second shortly after that.

But it takes 20 minutes until the 3rd OSD sees some action, 1 hour for the
4th, nearly 2 hours for the 5th and as said nearly 3 hours for the 6th and
last one.

Again, some of this can be explained by the max backfill of 1 for the 2
large OSDs, but even they were idle for about 20% of the time and never
should have been.
And the 6 smaller existing OSDs should have seen up to 4 backfills (reads
mostly), but never did. 

So to recap, things happen sequentially, when they should be randomized and
optimized.

My idea of how this _should_ work (and clearly doesn't) would be:

Iterate over all PGs with pending backfill ops (optionally start
each loop at a random point, a la the Exim queue runner), find a target
(write) OSD that is below the max backfill, then match this with a source
(read) OSD that also has enough backfill credits.

The matching bit is the important bit: if there isn't a source OSD
available for any of the waiting backfills on the target OSD, go to the
next source OSD; if all source OSDs for that target OSD are busy, go to the
next target OSD.

This way should get things going at full speed right from the start.

After that one could think about optimizing the above with weighted
priorities and buckets (prioritize the bucket of the OSD with the most
target PGs).

Regards,

Christian

> Regards
> Somnath
> 
> -Original Message-
> From: Christian Balzer [mailto:ch...@gol.com]
> Sent: Thursday, May 12, 2016 11:52 PM
> To: Somnath Roy
> Cc: Scottix; ceph-users@lists.ceph.com; Nick Fisk
> Subject: Re: [ceph-users] Weighted Priority Queue testing
> 
> 
> Hello,
> 
> On Fri, 13 May 2016 05:46:41 + Somnath Roy wrote:
> 
> > FYI in my test I used osd_max_backfills = 10 which is hammer default.
> > Post hammer it's been changed to 1.
> >
> All my tests, experiences are with Firefly and Hammer.
> 
> Also FYI and possibly pertinent to this discussion, I just added a node
> with 6 OSDs to one of my clusters. I did this by initially adding things
> with a crush weight of 0 (so nothing happened) and then in one fell
> swoop set the weights of all those OSDs to 5.
> 
> Now what I'm seeing (and remembering seeing before) is that Ceph is
> processing this very sequentially, meaning it is currently backfilling
> the first 2 OSDs and doing nothing of the sorts with the other 4, they
> are idle.
> 
> "osd_max_backfills" is set to 4, which is incidentally the number of
> backfills happening on the new node now, however this is per OSD, so in
> theory we could expect 24 backfills. The prospective source OSDs aren't
> pegged with backfills either, they have 1-2 going on.
> 
> I'm seriously wondering if this behavior is related to what we're
> talking about here.
> 
> Christian
> 
> > Thanks & Regards
> > Somnath
> >
> > -Original Message-
> > From: Christian Balzer [mailto:ch...@gol.com]
> > Sent: Thursday, May 12, 2016 10:40 PM
> > To: Scottix
> > Cc: Somnath Roy; ceph-users@lists.ceph.com; Nick Fisk
> > Subject: Re: [ceph-users] Weighted Priority Queue testing
> >
> >
> > Hello,
> >
> > On Thu, 12 May 2016 15:41:13 + Scottix wrote:
> >
> > > We have run into this same scenarios in terms of the long tail
> > > taking much longer on recovery than the initial.
> > >
> > > Either time we are adding osd or an osd get taken down. At first we
> > > have max-backfill set to 1 so it doesn't kill the cluster with io.
> > > As time passes by the single osd is performing the backfill. So we
> > > are gradually increasing the max-backfill up to 10 to reduce the
> > > amount of time it needs to recover fully. I know there are a few
> > > other factors at play here but for us we tend to do this procedure
> > > every time.
> > >
> >
> > Yeah, as I wrote in my original mail "This becomes even more obvious
> > when backfills and recovery settings are lowered".
> >
> > Ho

Re: [ceph-users] Steps for Adding Cache Tier

2016-05-13 Thread Christian Balzer

Hello,

On Fri, 13 May 2016 11:57:24 -0400 MailingLists - EWS wrote:

> I have been reading a lot of information about cache-tiers, and I wanted
> to know how best to go about adding the cache-tier to a production
> environment.
>

Did you read my thread titled "Cache tier operation clarifications" and
related posts?

>  
> 
> Our current setup is Infernalis (9.2.1) 4 nodes with 8 x 4TB SATA drives
> per node and 2 x 400GB NVMe acting as journals (1:4 ratio). There is a
> bunch of spare space on the NVMe's so we would like to partition that
> and make them OSDs for a cache-tier. Each NVMe should have about 200GB
> of space available on them giving us plenty of cache space (8 x 200GB),
> placing the journals on the NVMe since they have more than enough
> bandwidth.
> 

That's likely the last Infernalis release, with no more bugfixes for it, so you
should consider going to Jewel once it has had time to settle a bit. 

Jewel also has much improved cache tiering bits.

I assume those are Intel DC P3700 NVMes?

While they have a very nice 10 DWPD endurance, keep in mind that now each
write will potentially (depending on your promotion settings) get
amplified 3 times per NVMe: 
once for the cache tier, 
once for the journal on that cache tier 
and once (eventually) when the data gets flushed to the base tier.

And that's 4x200GB effective cache space of course, because even with the
most reliable and monitored SSDs you want/need a replication of 2 at least.

>  
> 
> Our primary usage for Ceph at this time is powering RBD block storage
> for an OpenStack cluster. The vast majority of our users use the system
> mainly for long term storage (store and hold data) but we do get some
> "hotspots" from time to time and we want to help smooth those out a
> little bit.
> 
>  
> 
> I have read this page:
> http://docs.ceph.com/docs/master/rados/operations/cache-tiering/ and
> believe that I have a handle on most of that.
> 
>  
> 
> I recall some additional information regarding permissions for block
> device access (making sure that your cephx permissions allow access to
> the cache-tier pool).
> 
Not using Openstack, but I don't think so.
From a client perspective, it is talking to the original pool, the cache
is transparently overlaid.


>  
> 
> Our plan is:
> 
>  
> 
> -  partition the NVMe's and create the OSDs manually with a 0
> weight
> 
You will want to create a new root and buckets before creating the OSDs.

> 
> -  create our new cache pool, and adjust the crushmap to place
> the cache pool on these OSDs
>

Since you will have multiple roots on the same node, you will need to set
"osd crush update on start = false".
 
> -  make sure permissions and settings are taken care of (making
> sure our cephx volumes user has rwx on the cache-tier pool)
>
Again, doubt that is needed, but that's what even a tiny, crappy test or
staging environment is for.
 
> -  add the cache-tier to our volumes pool
> 
> -  ???
> 
> -  Profit!
> 
>  
Pretty much.

> 
> Is there anything we might be missing here? Are there any other issues
> that we might need to be aware of? I seem to recall some discussion on
> the list with regard to settings that were required to make caching work
> correctly, but my memory seems to indicate that these changes were
> already added to the page listed above. Is that assumption correct?
> 
> 
Again, this is the kind of operation you want to get comfortable with on a
test cluster first.

Regards, 

Christian
 
> 
> Tom Walsh
> 
> https://expresshosting.net/
> 


-- 
Christian BalzerNetwork/Systems Engineer
ch...@gol.com   Global OnLine Japan/Rakuten Communications
http://www.gol.com/
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com