Re: [ceph-users] rbd ThreadPool threads number

2016-10-13 Thread Venky Shankar
On 16-10-13 14:56:12, tao changtao wrote:
> Hi All,
> 
> why is the rbd ThreadPool thread count hard-coded to 1?

details here: http://tracker.ceph.com/issues/15034
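
For reference, the constructor quoted below ties the pool size to the
rbd_op_threads config option, so the 1 is a default rather than an absolute
constant; whether raising it is safe or honoured is exactly what the tracker
issue above discusses. A minimal ceph.conf sketch (placing it under [client]
is an assumption):

  [client]
  # default is 1; see http://tracker.ceph.com/issues/15034 before raising this
  rbd_op_threads = 1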

> 
> 
> class ThreadPoolSingleton : public ThreadPool {
> public:
>   explicit ThreadPoolSingleton(CephContext *cct)
> : ThreadPool(cct, "librbd::thread_pool", "tp_librbd", 1,
>  "rbd_op_threads") {
> start();
>   }
>   virtual ~ThreadPoolSingleton() {
> stop();
>   }
> };
> 
> 
> changtao...@gmail.com
> --
> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hammer OSD memory usage very high

2016-10-13 Thread Praveen Kumar G T (Cloud Platform)
Hi David,

I am Praveen. We also had a similar problem with hammer 0.94.2; it appeared
when we created a new cluster with an erasure-coded pool (10+5 config).

Root cause:

The high memory usage in our case was because of pg logs. The number of pg
log entries is higher for erasure-coded pools than for replicated pools, so
we started running out of memory when we created the new cluster with
erasure-coded pools.

Solution:

Ceph provides configuration options to control the number of pg log entries.
You can try setting these values in your cluster and check your OSD memory
usage; this will also improve OSD boot-up time. Below are the config
parameters and the values we use:

  osd max pg log entries = 600
  osd min pg log entries = 200
  osd pg log trim min = 200
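
If it helps, a hedged sketch of checking and applying these on a running OSD
(the OSD id is a placeholder; values injected at runtime last only until the
next restart, so keep them in ceph.conf as well):

  # inspect the current values via the admin socket
  ceph daemon osd.0 config show | grep pg_log
  # apply at runtime across all OSDs
  ceph tell osd.* injectargs '--osd_max_pg_log_entries 600 --osd_min_pg_log_entries 200 --osd_pg_log_trim_min 200'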

Other Information:

We dug into this problem for some time before figuring out the root cause,
so we are fairly sure there are no memory leaks in Ceph hammer 0.94.2.

Regards,
Praveen

Date: Fri, 7 Oct 2016 16:04:03 +1100
From: David Burns 
To: ceph-us...@ceph.com
Subject: [ceph-users] Hammer OSD memory usage very high
Message-ID: 
Content-Type: text/plain; charset=UTF-8

Hello all,

We have a small 160TB Ceph cluster used only as a test s3 storage
repository for media content.

Problem
Since upgrading from Firefly to Hammer we are experiencing very high OSD
memory use of 2-3 GB per TB of OSD storage - typical OSD memory 6-10GB.
We have had to increase swap space to bring the cluster to a basic
functional state. Clearly this will significantly impact system performance
and precludes starting all OSDs simultaneously.

Hardware
4 x storage nodes with 16 OSDs/node. OSD nodes are reasonable spec SMC
storage servers with dual Xeon CPUs. Storage is 16 x 3TB SAS disks in each
node.
Installed RAM is 72GB (2 nodes) & 80GB (2 nodes). (We note that the
installed RAM is at least 50% higher than the Ceph recommended 1 GB RAM per
TB of storage.)

Software
OSD node OS is CentOS 6.8 (with updates). One node has been updated to
CentOS 7.2 - no change in memory usage was observed.

"ceph -v" -> ceph version 0.94.9 (fe6d859066244b97b24f09d46552afc2071e6f90)
(all Ceph packages downloaded from download.ceph.com)

The cluster has achieved status HEALTH_OK, so we don't believe this relates
to increased memory due to recovery.

History
Emperor 0.72.2 -> Firefly 0.80.10 -> Hammer 0.94.6 -> Hammer 0.94.7 ->
Hammer 0.94.9

Per-OSD process memory is observed to increase substantially during the
load_pgs phase.

Use of "ceph tell 'osd.*' heap release? has minimal effect - there is no
substantial memory in the heap or cache freelists.
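
For reference, the heap commands referred to above take this form per OSD
(the OSD id is a placeholder):

  ceph tell osd.0 heap stats
  ceph tell osd.0 heap release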

More information can be found in bug #17228 (link
http://tracker.ceph.com/issues/17228)

Any feedback or guidance to further understanding the high memory usage
would be welcomed.

Thanks

David


--
FetchTV Pty Ltd, Level 5, 61 Lavender Street, Milsons Point, NSW 2061





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Chris Murray

On 22/09/2016 15:29, Chris Murray wrote:

Hi all,

Might anyone be able to help me troubleshoot an "apt-get dist-upgrade"
which is stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"?

I'm upgrading from 10.2.2. The two OSDs on this node are up, and think
they are version 10.2.3, but the upgrade doesn't appear to be finishing
... ?

Thank you in advance,
Chris


Hi,

Are there possibly any pointers to help troubleshoot this? I've got a 
test system on which the same thing has happened.


The cluster's status is "HEALTH_OK" before starting. I'm running Debian 
Jessie.


dpkg.log only has the following:

2016-10-13 11:37:25 configure ceph-osd:amd64 10.2.3-1~bpo80+1 
2016-10-13 11:37:25 status half-configured ceph-osd:amd64 10.2.3-1~bpo80+1

At this point, the upgrade gets stuck and doesn't go any further. Where 
could I look for the next clue?


Thanks,

Chris


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph website problems?

2016-10-13 Thread Henrik Korkuc
From the status page it seems that Ceph didn't like the networking problems.
May we find out some details about what happened? Underprovisioned servers
(RAM upgrades were in there too)? Too much load on disks? Something else?


This situation may not be pleasant, but I feel that others can learn from 
it to prevent such situations in the future.


On 16-10-13 06:55, Dan Mick wrote:

Everything should have been back some time ago ( UTC or thereabouts)

On 10/11/2016 10:41 PM, Brian :: wrote:

Looks like they are having major challenges getting that ceph cluster
running again.. Still down.

On Tuesday, October 11, 2016, Ken Dreyer <kdre...@redhat.com> wrote:

I think this may be related:


http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/

On Tue, Oct 11, 2016 at 5:57 AM, Sean Redmond wrote:

Hi,

Looks like the ceph website and related sub domains are giving errors for
the last few hours.

I noticed the below that I use are in scope.

http://ceph.com/
http://docs.ceph.com/
http://download.ceph.com/
http://tracker.ceph.com/

Thanks

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Henrik Korkuc
Is apt/dpkg doing something now? Is the problem repeatable, e.g. by killing 
the upgrade and starting it again? Are there any stuck systemctl processes?
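
A sketch of the generic things worth checking while it hangs (standard
Debian/systemd tools, nothing Ceph-specific):

  # is dpkg/apt actually doing work, or waiting on something?
  ps -ef | grep -E 'dpkg|apt|systemctl' | grep -v grep
  # any queued or hung systemd jobs?
  systemctl list-jobs
  # recent journal messages around the hang
  journalctl -xe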


I had no problems upgrading 10.2.x clusters to 10.2.3

On 16-10-13 13:41, Chris Murray wrote:

On 22/09/2016 15:29, Chris Murray wrote:

Hi all,

Might anyone be able to help me troubleshoot an "apt-get dist-upgrade"
which is stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"?

I'm upgrading from 10.2.2. The two OSDs on this node are up, and think
they are version 10.2.3, but the upgrade doesn't appear to be finishing
... ?

Thank you in advance,
Chris


Hi,

Are there possibly any pointers to help troubleshoot this? I've got a 
test system on which the same thing has happened.


The cluster's status is "HEALTH_OK" before starting. I'm running 
Debian Jessie.


dpkg.log only has the following:

2016-10-13 11:37:25 configure ceph-osd:amd64 10.2.3-1~bpo80+1 
2016-10-13 11:37:25 status half-configured ceph-osd:amd64 
10.2.3-1~bpo80+1


At this point, the upgrade gets stuck and doesn't go any further. Where 
could I look for the next clue?


Thanks,

Chris


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph website problems?

2016-10-13 Thread Oliver Dzombic
Hi,

I fully agree.

If the downtime is related to a problem with a ceph cluster, it would
be very interesting to all of us what happened to this ceph cluster to
cause a downtime of multiple days.

Usually that's quite long for production usage.

So any information with some details is highly appreciated. I assume
that whoever runs the ceph cluster knows how ceph works, so it's even
more important to know what went wrong, so it can be taken into account.

Thank you !

-- 
Mit freundlichen Gruessen / Best regards

Oliver Dzombic
IP-Interactive

mailto:i...@ip-interactive.de

Anschrift:

IP Interactive UG ( haftungsbeschraenkt )
Zum Sonnenberg 1-3
63571 Gelnhausen

HRB 93402 beim Amtsgericht Hanau
Geschäftsführung: Oliver Dzombic

Steuer Nr.: 35 236 3622 1
UST ID: DE274086107


Am 13.10.2016 um 12:46 schrieb Henrik Korkuc:
> from status page it seems that Ceph didn't like networking problems. May
> we find out some details what happened? Underprovisioned servers (RAM
> upgrades were in there too)? Too much load on disks? Something else?
> 
> This situation may be not pleasant but I feel that others can learn from
> it to prevent such situations in the future.
> 
> On 16-10-13 06:55, Dan Mick wrote:
>> Everything should have been back some time ago ( UTC or thereabouts)
>>
>> On 10/11/2016 10:41 PM, Brian :: wrote:
>>> Looks like they are having major challenges getting that ceph cluster
>>> running again.. Still down.
>>>
>>> On Tuesday, October 11, 2016, Ken Dreyer >> > wrote:
 I think this may be related:

>>> http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/
>>>
 On Tue, Oct 11, 2016 at 5:57 AM, Sean Redmond >> > wrote:
> Hi,
>
> Looks like the ceph website and related sub domains are giving
> errors for
> the last few hours.
>
> I noticed the below that I use are in scope.
>
> http://ceph.com/
> http://docs.ceph.com/
> http://download.ceph.com/
> http://tracker.ceph.com/
>
> Thanks
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Modify placement group pg and pgp in production environment

2016-10-13 Thread Vincent Godin
When you increase your pg number, the new pgs will have to peer first, and
during this time they will be unreachable, so you need to put the cluster in
maintenance mode for this operation.

The way to increase the pg_num and pgp_num of a running cluster is:


   - First, it's very important to modify your Ceph settings in ceph.conf
   to keep your cluster responsive for client operations. Otherwise, all the
   IO and CPU will be used for the recovery operations and your cluster will
   be unreachable. Be sure that all these new parameters are in place and
   taken into account by all the running processes before increasing the pg count

osd_max_backfills = 1 (you can increase this a little if the recovery is too
slow, but stay < 5)
osd_recovery_threads = 1
osd_recovery_max_active = 1 (you can increase this a little if the recovery
is too slow, but stay < 5)
osd_client_op_priority = 63
osd_recovery_op_priority = 1

   - stop scrub and deep-scrub operations (and wait until no scrub or
   deep-scrub operations are running)

ceph osd set noscrub
ceph osd set nodeep-scrub

   - set your cluster in maintenance mode with:

ceph osd set norecover
ceph osd set nobackfill
ceph osd set nodown
ceph osd set noout

   - increase the pg number by a small increment such as 256 (example
   commands are sketched below)


   - wait for the cluster to create and peer the new pgs (about 30 seconds)


   - increase the pgp number by the same increment


   - wait for the cluster to create and peer (about 30 seconds)

(Repeat the last four steps until you reach the number of pgs and pgps you
want.)
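
As a sketch of one increment round (the pool name and target value are
placeholders):

ceph osd pool set <pool-name> pg_num <current pg_num + 256>
(wait ~30 seconds for the new pgs to be created and peer)
ceph osd pool set <pool-name> pgp_num <same value>
(wait ~30 seconds again, then repeat until the target is reached)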

At this point, your cluster is still functional.

   - Now you have to unset the maintenance mode

ceph osd unset noout
ceph osd unset nodown
ceph osd unset nobackfill
ceph osd unset norecover

It will take some time to rebalance all the data to the new pgs, but at the
end you will have a cluster with all pgs active+clean. During the whole
operation, your cluster will still be functional if you have kept the
settings from the first step in place.


   - When all the pgs are active+clean, you can re-enable the scrub and
   deep-scrub operations

ceph osd unset noscrub
ceph osd unset nodeep-scrub
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] rgw: How to delete huge bucket?

2016-10-13 Thread Василий Ангапов
Hello,

I have a huge, non-sharded RGW bucket with 180 million objects. The Ceph
version is 10.2.1.
I wonder, is it safe to delete it with the --purge-data option? Will other
buckets be heavily affected by that?

Regards, Vasily.
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Yet another hardware planning question ...

2016-10-13 Thread Patrik Martinsson
Hello everyone, 

We are in the process of buying hardware for our first ceph-cluster. We
will start with some testing and do some performance measurements to
see that we are on the right track, and once we are satisfied with our
setup we'll continue to grow in it as time comes along.

Now, I'm just seeking some thoughts on our future hardware, I know
there are a lot of these kind of questions out there, so please forgive
me for posting another one. 

Details, 
- Cluster will be in the same datacenter, multiple racks as we grow 
- Typical workload (this is incredibly vague, forgive me again) would
be an OpenStack environment hosting 150~200 VMs; we'll have quite a
few databases for Jira/Confluence/etc., some workload coming from
Stash/Bamboo agents, puppet master/foreman, and other typical "core
infra stuff". 

Given these prerequisites, going all-SSD (with NVMe for journals) may seem
like overkill(?), but we feel we can afford it and it will be a benefit
for us in the future. 

Planned hardware, 

Six nodes to begin with (which would give us a cluster size of ~46TB
with a default replica count of three, although probably a bit bigger
since the VMs would be backed by an erasure-coded pool) will look
something like: 
 - 1x  Intel E5-2695 v4 2.1GHz, 45M Cache, 18 Cores
 - 2x  Dell 64 GB RDIMM 2400MT
 - 12x Dell 1.92TB Mix Use MLC 12Gbps (separate OS disks) 
 - 2x  Dell 1.6TB NVMe Mixed usage (6 osd's per NVME)

Network between all nodes within a rack will be 40Gbit (and 200Gbit
between racks), backed by Junipers QFX5200-32C.

Rather than asking the question, 
- "Does this seem reasonable for our workload?", 

I want to ask,
- "Is there any reason *not* to have a setup like this? Are there any
obvious bottlenecks or flaws that we are missing, or could this very
well work as a good start (with the ability to grow by adding more
servers)?"

When it comes to workload-wise-issues, I think we'll just have to see
and grow as we learn. 

We'll be grateful for any input, thoughts, ideas suggestions, you name
it. 

Best regards, 
Patrik Martinsson,
Sweden
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] Loop in radosgw-admin orphan find

2016-10-13 Thread Yoann Moulin
Hello,

I run a cluster on jewel 10.2.2. I have deleted the last bucket of a radosGW 
pool in order to delete this pool and recreate it as EC (it was replicated).

Detail of the pool :

> pool 36 'erasure.rgw.buckets.data' replicated size 3 min_size 2 crush_ruleset 
> 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 31459 flags 
> hashpspool stripe_width 0

> POOLS:
>     NAME                      ID  USED    %USED  MAX AVAIL  OBJECTS
>     erasure.rgw.buckets.data  36  11838M  0      75013G     4735

After the GC, I found that lots of orphan objects still remain in the pool:

> $ rados ls -p erasure.rgw.buckets.data  | egrep -c "(multipart|shadow)"
> 4735
> $ rados ls -p erasure.rgw.buckets.data  | grep -c multipart
> 2368
> $ rados ls -p erasure.rgw.buckets.data  | grep -c shadow
> 2367

example :

> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_CC-MAIN-2016-40/segments/1474738660158.61/warc/CC-MAIN-20160924173740-00147-ip-10-143-35-109.ec2.internal.warc.gz.2~WezpbEQW1C9nskvtnyAteCVoO3D255Q.29
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_CC-MAIN-2016-40/segments/1474738660158.61/warc/CC-MAIN-20160924173740-00147-ip-10-143-35-109.ec2.internal.warc.gz.2~WezpbEQW1C9nskvtnyAteCVoO3D255Q.61
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_segments/1466783398869.97/wet/CC-MAIN-20160624154958-00194-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~7ru9WPCLMf9Lpi__TP1NXuYwjSU7KQK.11_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_crawl-data/CC-MAIN-2016-26/segments/1466783396147.66/wet/CC-MAIN-20160624154956-00071-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~7bKg6WEmNo23IQ6rd8oWF_vbaG0QAFR.6_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_segments/1466783398516.82/wet/CC-MAIN-20160624154958-00172-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~ap5QynCJTco_L7yK6bn4M_bnHBbBe64.14_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_CC-MAIN-2016-40/segments/1474738662400.75/warc/CC-MAIN-20160924173742-00076-ip-10-143-35-109.ec2.internal.warc.gz.2~LEM4bpbbdiTu86rs3Ew_LFNN_oHg_m7.13
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_CC-MAIN-2016-40/segments/1474738662400.75/warc/CC-MAIN-20160924173742-00033-ip-10-143-35-109.ec2.internal.warc.gz.2~FrN02NmencyDwXavvuzwqR8M8WnWNbH.8_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_segments/1466783395560.14/wet/CC-MAIN-20160624154955-00118-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~GqyEUdSepIxGwPOXfKLSxtS8miWGASe.3
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_segments/1466783395346.6/wet/CC-MAIN-20160624154955-00083-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~cTQ86ZEmOvxYD4BUI7zW37X-JcJeMgW.19
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_CC-MAIN-2016-40/segments/1474738660158.61/warc/CC-MAIN-20160924173740-00147-ip-10-143-35-109.ec2.internal.warc.gz.2~WezpbEQW1C9nskvtnyAteCVoO3D255Q.62
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_CC-MAIN-2016-40/segments/1474738662400.75/warc/CC-MAIN-20160924173742-00259-ip-10-143-35-109.ec2.internal.warc.gz.2~1b-olF9koids0gqT9DsO0y1vAsTOasf.12_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_CC-MAIN-2016-40/segments/1474738660338.16/warc/CC-MAIN-20160924173740-00067-ip-10-143-35-109.ec2.internal.warc.gz.2~JxuX8v0DmsSgAr3iprPBoHx6PoTKRi6.19_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_segments/1466783397864.87/wet/CC-MAIN-20160624154957-00110-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~q2_hY5oSoBWaSZgxh0NdK8JvxmEySPB.29
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__shadow_segments/1466783396949.33/wet/CC-MAIN-20160624154956-0-ip-10-164-35-72.ec2.internal.warc.wet.gz.2~kUInFVpsWy23JFm9eWNPiFNKlXrjDQU.18_1
> c9724aff-5fa0-4dd9-b494-57bdb48fab4e.1371134.1__multipart_CC-MAIN-2016-40/segments/1474738662400.75/warc/CC-MAIN-20160924173742-00076-ip-10-143-35-109.ec2.internal.warc.gz.2~LEM4bpbbdiTu86rs3Ew_LFNN_oHg_m7.36

First, can I delete the pool even if there are orphan objects in it? Should I 
delete the other metadata pools (index, data_extra) related to this pool as
defined in the zone? Is there any other data I should clean up to be sure there 
are no side effects from removing those objects by deleting the pool
instead of deleting them with radosgw-admin orphans?

For now, I have followed this doc to find and delete them:

https://access.redhat.com/documentation/en/red-hat-ceph-storage/1.3/single/object-gateway-guide-for-ubuntu/#finding_orphan_objects

I have run this command:

> radosgw-admin --cluster cephprod orphans find --pool=erasure.rgw.buckets.data 
> --job-id=erasure

but it is stuck in a loop; is this normal behavior?

Example of the output I have been getting for at least 2h:

> storing 1 entries at orphan.scan.erasure.linked.2
> storing 1 entries at orphan.scan.erasure.linked.5
> storing 1 entries at orphan.scan.erasure.linked.9
> storing 1 entries at orphan.scan.erasure.linked.19
> storing 1 entries at orphan.scan.erasure.linked.25
> storing 1 entries at orphan.scan.erasure.linked

[ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-13 Thread Stillwell, Bryan J
I have a basement cluster that is partially built with Odroid-C2 boards and 
when I attempted to upgrade to the 10.2.3 release I noticed that this release 
doesn't have an arm64 build.  Are there any plans on continuing to make arm64 
builds?

Thanks,
Bryan
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Yet another hardware planning question ...

2016-10-13 Thread Brady Deetz
Six SSDs per NVMe journal might leave your journal in contention. Can you
provide the specific models you will be using?

On Oct 13, 2016 10:23 AM, "Patrik Martinsson" <
patrik.martins...@trioptima.com> wrote:

> Hello everyone,
>
> We are in the process of buying hardware for our first ceph-cluster. We
> will start with some testing and do some performance measurements to
> see that we are on the right track, and once we are satisfied with our
> setup we'll continue to grow in it as time comes along.
>
> Now, I'm just seeking some thoughts on our future hardware, I know
> there are a lot of these kind of questions out there, so please forgive
> me for posting another one.
>
> Details,
> - Cluster will be in the same datacenter, multiple racks as we grow
> - Typical workload (this is incredible vague, forgive me again) would
> be an Openstack environment, hosting 150~200 vms, we'll have quite a
> few databases for Jira/Confluence/etc. Some workload coming from
> Stash/Bamboo agents, puppet master/foreman, and other typical "core
> infra stuff".
>
> Given this prerequisites just given, the going all SSD's (and NVME for
> journals) may seem as overkill(?), but we feel like we can afford it
> and it will be a benefit for us in the future.
>
> Planned hardware,
>
> Six nodes to begin with (which would give us a cluster size of ~46TB,
> with a default replica of three (although probably a bit bigger since
> the vm's would be backed by a erasure coded pool) will look something
> like,
>  - 1x  Intel E5-2695 v4 2.1GHz, 45M Cache, 18 Cores
>  - 2x  Dell 64 GB RDIMM 2400MT
>  - 12x Dell 1.92TB Mix Use MLC 12Gbps (separate OS disks)
>  - 2x  Dell 1.6TB NVMe Mixed usage (6 osd's per NVME)
>
> Network between all nodes within a rack will be 40Gbit (and 200Gbit
> between racks), backed by Junipers QFX5200-32C.
>
> Rather then asking the question,
> - "Does this seems reasonable for our workload ?",
>
> I want to ask,
> - "Is there any reason *not* have a setup like this, is there any
> obvious bottlenecks or flaws that we are missing or could this may very
> well work as good start (and the ability to grow with adding more
> servers) ?"
>
> When it comes to workload-wise-issues, I think we'll just have to see
> and grow as we learn.
>
> We'll be grateful for any input, thoughts, ideas suggestions, you name
> it.
>
> Best regards,
> Patrik Martinsson,
> Sweden
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Yet another hardware planning question ...

2016-10-13 Thread Patrik Martinsson
On tor, 2016-10-13 at 10:29 -0500, Brady Deetz wrote:
> 6 SSD per nvme journal might leave your journal in contention. Can you
> provide the specific models you will be using?

Well, according to Dell, the card is called "Dell 1.6TB, NVMe, Mixed
Use Express Flash, PM1725", but the specs for the card are listed here:
http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell-PowerEdge-Express-Flash-NVMe-Mixed-Use-PCIe-SSD.pdf

Forgive me for my poor English here, but when you say "leave your
journal in contention", what exactly do you mean by that ?

Best regards, 
Patrik Martinsson
Sweden


> On Oct 13, 2016 10:23 AM, "Patrik Martinsson"
>  wrote:
> > Hello everyone, 
> > 
> > We are in the process of buying hardware for our first ceph-
> > cluster. We
> > will start with some testing and do some performance measurements
> > to
> > see that we are on the right track, and once we are satisfied with
> > our
> > setup we'll continue to grow in it as time comes along.
> > 
> > Now, I'm just seeking some thoughts on our future hardware, I know
> > there are a lot of these kind of questions out there, so please
> > forgive
> > me for posting another one. 
> > 
> > Details, 
> > - Cluster will be in the same datacenter, multiple racks as we
> > grow 
> > - Typical workload (this is incredible vague, forgive me again)
> > would
> > be an Openstack environment, hosting 150~200 vms, we'll have quite
> > a
> > few databases for Jira/Confluence/etc. Some workload coming from
> > Stash/Bamboo agents, puppet master/foreman, and other typical "core
> > infra stuff". 
> > 
> > Given this prerequisites just given, the going all SSD's (and NVME
> > for
> > journals) may seem as overkill(?), but we feel like we can afford
> > it
> > and it will be a benefit for us in the future. 
> > 
> > Planned hardware, 
> > 
> > Six nodes to begin with (which would give us a cluster size of
> > ~46TB,
> > with a default replica of three (although probably a bit bigger
> > since
> > the vm's would be backed by a erasure coded pool) will look
> > something
> > like, 
> >  - 1x  Intel E5-2695 v4 2.1GHz, 45M Cache, 18 Cores
> >  - 2x  Dell 64 GB RDIMM 2400MT
> >  - 12x Dell 1.92TB Mix Use MLC 12Gbps (separate OS disks) 
> >  - 2x  Dell 1.6TB NVMe Mixed usage (6 osd's per NVME)
> > 
> > Network between all nodes within a rack will be 40Gbit (and 200Gbit
> > between racks), backed by Junipers QFX5200-32C.
> > 
> > Rather then asking the question, 
> > - "Does this seems reasonable for our workload ?", 
> > 
> > I want to ask,
> > - "Is there any reason *not* have a setup like this, is there any
> > obvious bottlenecks or flaws that we are missing or could this may
> > very
> > well work as good start (and the ability to grow with adding more
> > servers) ?"
> > 
> > When it comes to workload-wise-issues, I think we'll just have to
> > see
> > and grow as we learn. 
> > 
> > We'll be grateful for any input, thoughts, ideas suggestions, you
> > name
> > it. 
> > 
> > Best regards, 
> > Patrik Martinsson,
> > Sweden
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> > 
-- 
Kindly regards,
Patrik Martinsson
0707 - 27 64 96
System Administrator Linux
Genuine Happiness
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] ceph website problems?

2016-10-13 Thread Sage Weil
On Thu, 13 Oct 2016, Henrik Korkuc wrote:
> from status page it seems that Ceph didn't like networking problems. May we
> find out some details what happened? Underprovisioned servers (RAM upgrades
> were in there too)? Too much load on disks? Something else?
> 
> This situation may be not pleasant but I feel that others can learn from it to
> prevent such situations in the future.

Yep.

These VMs were backed by an old ceph cluster and the cluster fell over 
after a switch failed.  Because it's a beta cluster that's due to be 
decommissioned shortly it wasn't upgraded from firefly.  And because it's 
old the PGs were mistuned (way too many) and machines were 
underprovisioned on RAM (32GB for 12 OSDs; normally probably enough but 
not on a very large cluster with 1000+ OSDs and too many PGs).  It fell 
into the somewhat familiar pattern of OSDs OOMing because of large OSDMaps 
due to a degraded cluster.

The recovery was a bit tedious (tune osdmap caches way down, get all OSDs 
to catch up on maps and rejoin cluster) but it's a procedure that's been 
described on this list before.  Once the core issue was identified it came 
back pretty quickly.
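
For the curious, the "tune osdmap caches way down" step refers to options of
this kind; the values here are illustrative assumptions, not the ones
actually used:

  [osd]
  osd map cache size = 50
  osd map max advance = 25
  osd map share max epochs = 25
  osd map message max = 25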

Had the nodes had more RAM or had the PG counts been better tuned it would 
have been avoided, and had the cluster been upgraded it *might* have been 
avoided (hammer+ is more memory efficient, and newer versions have lower 
default map cache sizes).

This was one of the very first large-scale clusters we ever built, so 
we've learned quite a bit since then.  :)

sage


> 
> On 16-10-13 06:55, Dan Mick wrote:
> > Everything should have been back some time ago ( UTC or thereabouts)
> > 
> > On 10/11/2016 10:41 PM, Brian :: wrote:
> > > Looks like they are having major challenges getting that ceph cluster
> > > running again.. Still down.
> > > 
> > > On Tuesday, October 11, 2016, Ken Dreyer  > > > wrote:
> > > > I think this may be related:
> > > > 
> > > http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/
> > > > On Tue, Oct 11, 2016 at 5:57 AM, Sean Redmond  > > > wrote:
> > > > > Hi,
> > > > > 
> > > > > Looks like the ceph website and related sub domains are giving errors
> > > > > for
> > > > > the last few hours.
> > > > > 
> > > > > I noticed the below that I use are in scope.
> > > > > 
> > > > > http://ceph.com/
> > > > > http://docs.ceph.com/
> > > > > http://download.ceph.com/
> > > > > http://tracker.ceph.com/
> > > > > 
> > > > > Thanks
> > ___
> > ceph-users mailing list
> > ceph-users@lists.ceph.com
> > http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw: How to delete huge bucket?

2016-10-13 Thread Stas Starikevich
Hi,

I have had experience with deleting a big bucket (25M small objects) with the
--purge-data option. It took ~20h (run in screen) and didn't have any
significant effect on the cluster performance.
Stas


On Thu, Oct 13, 2016 at 9:42 AM, Василий Ангапов  wrote:
> Hello,
>
> I have a huge RGW bucket with 180 million objects and non-sharded
> bucket. Ceph version is 10.2.1.
> I wonder is it safe to delete it with --purge-data option? Will other
> buckets be heavily influenced by that?
>
> Regards, Vasily.
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


[ceph-users] cephfs slow delete

2016-10-13 Thread Heller, Chris
I have a directory I’ve been trying to remove from cephfs (via cephfs-hadoop). 
The directory is a few hundred gigabytes in size and contains a few million 
files, though not in a single subdirectory. I started the delete yesterday at 
around 6:30 EST, and it’s still progressing. I can see from (ceph osd df) that 
the overall data usage on my cluster is decreasing, but at the rate it’s going 
it will be a month before the entire subdirectory is gone. Is a recursive 
delete of a directory known to be a slow operation in CephFS, or have I hit upon 
some bad configuration? What steps can I take to better debug this scenario?

-Chris
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Chris Murray

On 13/10/2016 11:49, Henrik Korkuc wrote:
Is apt/dpkg doing something now? Is problem repeatable, e.g. by 
killing upgrade and starting again. Are there any stuck systemctl 
processes?


I had no problems upgrading 10.2.x clusters to 10.2.3

On 16-10-13 13:41, Chris Murray wrote:

On 22/09/2016 15:29, Chris Murray wrote:

Hi all,

Might anyone be able to help me troubleshoot an "apt-get dist-upgrade"
which is stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"?

I'm upgrading from 10.2.2. The two OSDs on this node are up, and think
they are version 10.2.3, but the upgrade doesn't appear to be finishing
... ?

Thank you in advance,
Chris


Hi,

Are there possibly any pointers to help troubleshoot this? I've got a 
test system on which the same thing has happened.


The cluster's status is "HEALTH_OK" before starting. I'm running 
Debian Jessie.


dpkg.log only has the following:

2016-10-13 11:37:25 configure ceph-osd:amd64 10.2.3-1~bpo80+1 
2016-10-13 11:37:25 status half-configured ceph-osd:amd64 
10.2.3-1~bpo80+1


At this point, the upgrade gets stuck and doesn't go any further. 
Where could I look for the next clue?


Thanks,

Chris


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Thank you Henrik, I see it's a systemctl process that's stuck.

It is reproducible for me on every run of  dpkg --configure -a

And, indeed, reproducible across two separate machines.

I'll pursue the stuck "/bin/systemctl start ceph-osd.target".

Thanks again,
Chris

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] cephfs slow delete

2016-10-13 Thread Gregory Farnum
On Thu, Oct 13, 2016 at 12:44 PM, Heller, Chris  wrote:
> I have a directory I’ve been trying to remove from cephfs (via
> cephfs-hadoop), the directory is a few hundred gigabytes in size and
> contains a few million files, but not in a single sub directory. I startd
> the delete yesterday at around 6:30 EST, and it’s still progressing. I can
> see from (ceph osd df) that the overall data usage on my cluster is
> decreasing, but at the rate its going it will be a month before the entire
> sub directory is gone. Is a recursive delete of a directory known to be a
> slow operation in CephFS or have I hit upon some bad configuration? What
> steps can I take to better debug this scenario?

Is it the actual unlink operation taking a long time, or just the
reduction in used space? Unlinks require a round trip to the MDS
unfortunately, but you should be able to speed things up at least some
by issuing them in parallel on different directories.

If it's the used space, you can let the MDS issue more RADOS delete
ops by adjusting the "mds max purge files" and "mds max purge ops"
config values.
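
A sketch of adjusting those at runtime through the MDS admin socket (the
daemon name and values are placeholders to illustrate the knobs, not
recommendations):

  ceph daemon mds.a config set mds_max_purge_files 256
  ceph daemon mds.a config set mds_max_purge_ops 32768
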
-Greg
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-13 Thread Alfredo Deza
On Thu, Oct 13, 2016 at 11:33 AM, Stillwell, Bryan J
 wrote:
> I have a basement cluster that is partially built with Odroid-C2 boards and
> when I attempted to upgrade to the 10.2.3 release I noticed that this
> release doesn't have an arm64 build.  Are there any plans on continuing to
> make arm64 builds?

We have a couple of machines for building ceph releases on ARM64, but
unfortunately they sometimes have issues, and since ARM64 is
considered a "nice to have" at the moment we usually skip them if
anything comes up.

So it is an on-and-off kind of situation (I don't recall what happened
for 10.2.3)

But since you've asked, I can try to get them built and see if we can
get 10.2.3 out.

>
> Thanks,
> Bryan
>
> ___
> ceph-users mailing list
> ceph-users@lists.ceph.com
> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
>
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] rgw: How to delete huge bucket?

2016-10-13 Thread Василий Ангапов
Thanks very much, Stas! Can anyone else confirm this?

2016-10-13 19:57 GMT+03:00 Stas Starikevich :
> Hi,
>
> I had experience with deleting a big bucket (25M small objects) with
> --purge-data option. It took ~20H (run in screen) and didn't made any
> significant effect on the cluster performance.
> Stas
>
>
> On Thu, Oct 13, 2016 at 9:42 AM, Василий Ангапов  wrote:
>> Hello,
>>
>> I have a huge RGW bucket with 180 million objects and non-sharded
>> bucket. Ceph version is 10.2.1.
>> I wonder is it safe to delete it with --purge-data option? Will other
>> buckets be heavily influenced by that?
>>
>> Regards, Vasily.
>> ___
>> ceph-users mailing list
>> ceph-users@lists.ceph.com
>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Missing arm64 Ubuntu packages for 10.2.3

2016-10-13 Thread Stillwell, Bryan J
On 10/13/16, 2:32 PM, "Alfredo Deza"  wrote:

>On Thu, Oct 13, 2016 at 11:33 AM, Stillwell, Bryan J
> wrote:
>> I have a basement cluster that is partially built with Odroid-C2 boards
>>and
>> when I attempted to upgrade to the 10.2.3 release I noticed that this
>> release doesn't have an arm64 build.  Are there any plans on continuing
>>to
>> make arm64 builds?
>
>We have a couple of machines for building ceph releases on ARM64 but
>unfortunately they sometimes have issues and since Arm64 is
>considered a "nice to have" at the moment we usually skip them if
>anything comes up.
>
>So it is an on-and-off kind of situation (I don't recall what happened
>for 10.2.3)
>
>But since you've asked, I can try to get them built and see if we can
>get 10.2.3 out.

Sounds good, thanks Alfredo!

Bryan

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Yet another hardware planning question ...

2016-10-13 Thread Christian Balzer

Hello,

On Thu, 13 Oct 2016 15:46:03 + Patrik Martinsson wrote:

> On tor, 2016-10-13 at 10:29 -0500, Brady Deetz wrote:
> > 6 SSD per nvme journal might leave your journal in contention. Can you
> > provide the specific models you will be using?
> 
> Well, according to Dell, the card is called "Dell 1.6TB, NVMe, Mixed
> Use Express Flash, PM1725", but the specs for the card are listed here:
> http://i.dell.com/sites/doccontent/shared-content/data-sheets/en/Documents/Dell-PowerEdge-Express-Flash-NVMe-Mixed-Use-PCIe-SSD.pdf
>
That's a re-branded (not much, same model number) Samsung.
Both that link and the equivalent Samsung link are not what I would
consider professional, with their "up to" speeds.
That is usually a function of the design and flash modules used, typically
resulting in smaller drives being slower (less parallelism).

Extrapolating from the 3.2 TB model we can assume that these cannot write
more than 2GB/s.

If your 40Gb/s network is single-ported or active/standby (you didn't
mention), then this is fine, as 2 of these journal NVMes would be a
perfect match.
If it's dual-ported with MC-LAG, then you're wasting half of the potential
bandwidth. 

Also these NVMes have a nice, feel good 5 DWPD, for future reference. 

> Forgive me for my poor English here, but when you say "leave your
> journal in contention", what exactly do you mean by that ?
> 
He means that the combined bandwidth of your SSDs will be larger than
that of your journal NVMes, limiting the top bandwidth your nodes can
write at to that of the journals.

In your case we're missing any pertinent details about the SSDs as well.

An educated guess (size, 12Gbs link, Samsung) makes them these:
http://www.samsung.com/semiconductor/products/flash-storage/enterprise-ssd/MZILS1T9HCHP?ia=832
http://www.samsung.com/semiconductor/global/file/media/PM853T.pdf

So 750MB/s sequential writes, 3 of these can already handle more than your
NVMe.

However the 1 DWPD (the PDF is more detailed and gives us a scary 0.3 DWPD
for small I/Os) of these SSDs would definitely stop me from considering
them.
Unless you can quantify your write volume with certainty and it's below
the level these SSDs can support, go for something safer, at least 3 DWPD.

Quick estimate:
24 SSDs (replication of 3) * 1.92TB * 0.3 (worst case) = 13.8TB/day 
That's ignoring further overhead and write amplification by the FS
(journals) and Ceph itself.
So if your cluster sees less than 10TB writes/day, you may at least assume
it won't kill those SSDs within months.

Your journal NVMes are incidentally a decent match endurance wise at a
(much more predictable) 16TB/day.


The above is of course all about bandwidth (sequential writes), which are
important in certain use cases and during backfill/recovery actions.

Since your use case suggest more of a DB, smallish data transactions
scenario, that "waste" of bandwidth may be totally acceptable.
All my clusters certainly favor lower latency over higher bandwidth when
having to choose between either. 

It comes back to use case and write volume: those journal NVMes will help
keep latency low (for your DBs), so if that is paramount, go with
that.

They do feel a bit wasted (1.6TB, of which you'll use 1-200MB at most),
though.
Consider alternative designs where you have special pools for high
performance needs on NVMes and use 3+DWPD SSDs (journals inline) for the
rest.

Also I'd use the E5-2697A v4 CPU instead with SSDs (faster baseline and
Turbo).

Christian

> Best regards, 
> Patrik Martinsson
> Sweden
> 
> 
> > On Oct 13, 2016 10:23 AM, "Patrik Martinsson"
> >  wrote:
> > > Hello everyone, 
> > > 
> > > We are in the process of buying hardware for our first ceph-
> > > cluster. We
> > > will start with some testing and do some performance measurements
> > > to
> > > see that we are on the right track, and once we are satisfied with
> > > our
> > > setup we'll continue to grow in it as time comes along.
> > > 
> > > Now, I'm just seeking some thoughts on our future hardware, I know
> > > there are a lot of these kind of questions out there, so please
> > > forgive
> > > me for posting another one. 
> > > 
> > > Details, 
> > > - Cluster will be in the same datacenter, multiple racks as we
> > > grow 
> > > - Typical workload (this is incredible vague, forgive me again)
> > > would
> > > be an Openstack environment, hosting 150~200 vms, we'll have quite
> > > a
> > > few databases for Jira/Confluence/etc. Some workload coming from
> > > Stash/Bamboo agents, puppet master/foreman, and other typical "core
> > > infra stuff". 
> > > 
> > > Given this prerequisites just given, the going all SSD's (and NVME
> > > for
> > > journals) may seem as overkill(?), but we feel like we can afford
> > > it
> > > and it will be a benefit for us in the future. 
> > > 
> > > Planned hardware, 
> > > 
> > > Six nodes to begin with (which would give us a cluster size of
> > > ~46TB,
> > > with a default replica of three (although prob

Re: [ceph-users] Hammer OSD memory usage very high

2016-10-13 Thread David Burns

> On 7 Oct. 2016, at 22:53, Haomai Wang  wrote:
> 
> do you try to restart osd to se the memory usage?
> 

Restarting OSDs does not change the memory usage.

(Apologies for delay in reply - was offline due to illness.)

Regards
David


-- 
FetchTV Pty Ltd, Level 5, 61 Lavender Street, Milsons Point, NSW 2061





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Hammer OSD memory usage very high

2016-10-13 Thread David Burns
On 13 Oct. 2016, at 20:21, Praveen Kumar G T (Cloud Platform) 
 wrote:
> 
> 
> Hi David,
> 
> I am Praveen, we also had a similar problem with hammer 0.94.2. We had the 
> problem when we created a new cluster with erasure coding pool (10+5 config). 
> 
> Root cause:
> 
> The high memory usage in our case was because of pg logs. The number of pg 
> logs are higher in case of erasure coding pool compared to replica pools. so 
> in our case we started running out of memory when we created the new cluster 
> with erasure coding pools
> 
> Solution:
> 
> Ceph provides configuration to control the number of pg log entries. You can 
> try setting this value in your cluster and check your OSD memory usage. This 
> will also improve the osd boot up time. Below are the config parameters and 
> the values we use
> 
>   osd max pg log entries = 600
>   osd min pg log entries = 200
>   osd pg log trim min = 200
> 
> Other Information:
> 
> We dug around this problem for some time before figuring out the root cause. 
> So we are fairly sure there are no memory leaks in ceph hammer 0.94.2 
> version. 
> 
> Regards,
> Praveen
> 

Hello Praveen,

Thankyou for your suggestions.

We’ve previously attempted tuning these parameters with no effect.

Regardless, today we’ve tested your parameters (which are more aggressive than 
what we tried) on one of the OSDs … but there was no change.

Next step is to examine debug output but this may take some time to interpret...
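
For reference, the tcmalloc heap profiler can be driven through the same tell
interface as "heap release"; a sketch, with the OSD id as a placeholder (the
resulting dumps are analyzed with pprof):

  ceph tell osd.0 heap start_profiler
  # ...reproduce the memory growth...
  ceph tell osd.0 heap dump
  ceph tell osd.0 heap stop_profiler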

NB At the moment we’re only using replicated pools. We’d like to evaluate EC 
pools but this is on the backburner until we can fix the high OSD memory usage.

Regards,
David


-- 
FetchTV Pty Ltd, Level 5, 61 Lavender Street, Milsons Point, NSW 2061





___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Re: [ceph-users] Stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"

2016-10-13 Thread Henrik Korkuc

On 16-10-13 22:46, Chris Murray wrote:

On 13/10/2016 11:49, Henrik Korkuc wrote:
Is apt/dpkg doing something now? Is problem repeatable, e.g. by 
killing upgrade and starting again. Are there any stuck systemctl 
processes?


I had no problems upgrading 10.2.x clusters to 10.2.3

On 16-10-13 13:41, Chris Murray wrote:

On 22/09/2016 15:29, Chris Murray wrote:

Hi all,

Might anyone be able to help me troubleshoot an "apt-get dist-upgrade"
which is stuck at "Setting up ceph-osd (10.2.3-1~bpo80+1)"?

I'm upgrading from 10.2.2. The two OSDs on this node are up, and think
they are version 10.2.3, but the upgrade doesn't appear to be 
finishing

... ?

Thank you in advance,
Chris


Hi,

Are there possibly any pointers to help troubleshoot this? I've got 
a test system on which the same thing has happened.


The cluster's status is "HEALTH_OK" before starting. I'm running 
Debian Jessie.


dpkg.log only has the following:

2016-10-13 11:37:25 configure ceph-osd:amd64 10.2.3-1~bpo80+1 
2016-10-13 11:37:25 status half-configured ceph-osd:amd64 
10.2.3-1~bpo80+1


At this point, the upgrade gets stuck and doesn't go any further. 
Where could I look for the next clue?


Thanks,

Chris


___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com


Thank you Henrik, I see it's a systemctl process that's stuck.

It is reproducible for me on every run of  dpkg --configure -a

And, indeed, reproducible across two separate machines.

I'll pursue the stuck "/bin/systemctl start ceph-osd.target".



You can try checking whether "systemctl daemon-reexec" helps to solve this 
problem. I couldn't find a link quickly, but it seems that Jessie's systemd 
sometimes manages to get stuck on systemctl calls.
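
A sketch of the checks involved (standard systemd/dpkg commands, nothing
Ceph-specific):

  # re-execute the systemd manager in case it is the one that is wedged
  systemctl daemon-reexec
  # list queued or stuck jobs
  systemctl list-jobs
  # then retry the configure step
  dpkg --configure -a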



Thanks again,
Chris

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com



___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com