I set mon_data to “/home/ceph/software/ceph/var/lib/ceph/mon”, and its owner
has always been “ceph” since we were running Hammer.
I also tried setting the permissions to “777”, but that didn’t work either.
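(For completeness, a sketch of how the ownership can be double-checked after a Hammer -> Jewel upgrade; the default path is shown, substitute the custom mon_data directory mentioned above:)

    # confirm which user owns the mon store that the systemd unit will open
    ls -ld /var/lib/ceph/mon/ceph-ceph1
    # Jewel runs the daemons as the "ceph" user by default, so ownership must match
    chown -R ceph:ceph /var/lib/ceph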
From: Linh Vu [mailto:v...@unimelb.edu.au]
Sent: 22 June 2017 14:26
To: 许雪寒; ceph-users@lists.ceph.
Permissions of your mon data directory under /var/lib/ceph/mon/ might have
changed as part of Hammer -> Jewel upgrade. Have you had a look there?
From: ceph-users on behalf of 许雪寒
Sent: Thursday, 22 June 2017 3:32:45 PM
To: ceph-users@lists.ceph.com
Subject: [c
Hello,
We currently have a pool of SSDs running as a cache in front of an EC pool.
The cache is very underused and the SSDs spend most of their time idle. I would
like to create a small SSD pool for a selection of very small RBD disks used as
scratch disks within the OS. Should I expect any issues running the
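In case it is useful, a rough sketch of carving out such a pool (pool name, PG count and ruleset id are placeholders; this assumes an SSD-only CRUSH rule already exists):

    # small replicated pool for the scratch RBD images (size the PG count properly)
    ceph osd pool create scratch-ssd 64 64 replicated
    # pin the pool to the SSD rule (Jewel-era syntax; rule id 1 is just an example)
    ceph osd pool set scratch-ssd crush_ruleset 1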
Hi, everyone.
I upgraded one of our ceph clusters from Hammer to Jewel. After upgrading, I
can’t start ceph-mon through “systemctl start ceph-mon@ceph1”; on the other
hand, I can start ceph-mon, either as user ceph or root, if I directly
call “/usr/bin/ceph-mon --cluster ceph --id ceph1
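For reference, a minimal sketch of the systemd-side checks (unit name as above):

    # see why the unit refuses to start; the journal usually shows permission/setuser errors
    systemctl status ceph-mon@ceph1
    journalctl -u ceph-mon@ceph1 --no-pager -n 50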
On Mon, Jun 19, 2017 at 3:12 AM Wido den Hollander wrote:
>
> > On 19 June 2017 at 5:15, Alex Gorbachev wrote:
> >
> >
> > Has anyone run into such config where a single client consumes storage
> from
> > several ceph clusters, unrelated to each other (different MONs and OSDs,
> > and keys)?
>
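A minimal sketch of what that looks like on the client side, in case it helps (cluster and pool names are examples): each cluster gets its own /etc/ceph/<name>.conf and matching keyring, selected with --cluster:

    # reads /etc/ceph/clusterA.conf and the corresponding keyring
    rbd --cluster clusterA ls poolX
    # same client host, completely separate cluster
    rbd --cluster clusterB ls poolY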
Hello,
Hmm, gmail client not grokking quoting these days?
On Wed, 21 Jun 2017 20:40:48 -0500 Brady Deetz wrote:
> On Jun 21, 2017 8:15 PM, "Christian Balzer" wrote:
>
> On Wed, 21 Jun 2017 19:44:08 -0500 Brady Deetz wrote:
>
> > Hello,
> > I'm expanding my 288 OSD, primarily cephfs, cluster
On Jun 21, 2017 8:15 PM, "Christian Balzer" wrote:
On Wed, 21 Jun 2017 19:44:08 -0500 Brady Deetz wrote:
> Hello,
> I'm expanding my 288 OSD, primarily cephfs, cluster by about 16%. I have 12
> osd nodes with 24 osds each. Each osd node has 2 P3700 400GB NVMe PCIe
> drives providing 10GB journal
On Wed, 21 Jun 2017 19:44:08 -0500 Brady Deetz wrote:
> Hello,
> I'm expanding my 288 OSD, primarily cephfs, cluster by about 16%. I have 12
> osd nodes with 24 osds each. Each osd node has 2 P3700 400GB NVMe PCIe
> drives providing 10GB journals for groups of 12 6TB spinning rust drives
> and 2x
Hello,
I'm expanding my 288 OSD, primarily cephfs, cluster by about 16%. I have 12
osd nodes with 24 osds each. Each osd node has 2 P3700 400GB NVMe PCIe
drives providing 10GB journals for groups of 12 6TB spinning rust drives
and 2x LACP 40Gbps Ethernet.
Our hardware provider is recommending that
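(As an aside, the ceph.conf fragment corresponding to those 10GB filestore journals would be roughly the following; the value is in MB and is purely illustrative:)

    [osd]
    # 10 GB journal per OSD, carved out of the shared P3700 NVMe
    osd journal size = 10240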
Hello,
On Wed, 21 Jun 2017 11:15:26 +0200 Fabian Grünbichler wrote:
> On Wed, Jun 21, 2017 at 05:30:02PM +0900, Christian Balzer wrote:
> >
> > Hello,
> >
> > On Wed, 21 Jun 2017 09:47:08 +0200 (CEST) Alexandre DERUMIER wrote:
> >
> > > Hi,
> > >
> > > Proxmox is maintaining a ceph-luminous
You can specify an option in ceph-deploy to tell it which release of ceph
to install, jewel, kraken, hammer, etc. `ceph-deploy --release jewel`
would pin the command to using jewel instead of kraken.
While running a mixed environment is supported, it should always be tested
before assuming it wil
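For example, a hedged sketch of pinning the release (host names are placeholders):

    # install Jewel packages on the target nodes instead of the current default release
    ceph-deploy install --release jewel node1 node2 node3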
David,
Thanks for the reply.
The scenario:
A monitor node fails for whatever reason: bad blocks on the HD, a motherboard
failure, whatever.
Procedure:
Remove the monitor from the cluster, replace the hardware, reinstall the OS, and
add the monitor back to the cluster.
That is exactly what I did. However, my ceph-deploy nod
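(For reference, the procedure above maps roughly to the following commands; the mon id and hostname are placeholders, not a verbatim transcript of what was run:)

    # drop the failed monitor from the monmap
    ceph mon remove mon1
    # after the OS reinstall, re-add it from the admin node
    ceph-deploy mon add mon1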
Do your VMs or OSDs show blocked requests? If you disable scrub or
restart the blocked OSD, does the issue go away? If yes, it most
likely is this issue [1].
[1] http://tracker.ceph.com/issues/20041
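Concretely, to test that (the flags are cluster-wide; remember to unset them afterwards):

    # temporarily stop scrubbing and see whether the blocked requests clear
    ceph osd set noscrub
    ceph osd set nodeep-scrub
    # or bounce the OSD that reports the blocked request (osd id is a placeholder)
    systemctl restart ceph-osd@12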
On Wed, Jun 21, 2017 at 3:33 PM, Hall, Eric wrote:
> The VMs are using stock Ubuntu14/16 images s
The VMs are using stock Ubuntu 14/16 images so yes, there is the default
“/sbin/fstrim --all” in /etc/cron.weekly/fstrim.
--
Eric
On 6/21/17, 1:58 PM, "Jason Dillaman" wrote:
Are some or many of your VMs issuing periodic fstrims to discard
unused extents?
On Wed, Jun 21, 2017 a
Are some or many of your VMs issuing periodic fstrims to discard
unused extents?
On Wed, Jun 21, 2017 at 2:36 PM, Hall, Eric wrote:
> After following/changing all suggested items (turning off exclusive-lock
> (and associated object-map and fast-diff), changing host cache behavior,
> etc.) this is
After following/changing all suggested items (turning off exclusive-lock (and
associated object-map and fast-diff), changing host cache behavior, etc.) this
is still a blocking issue for many uses of our OpenStack/Ceph installation.
We have upgraded Ceph to 10.2.7, are running 4.4.0-62 or later
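(For reference, the per-image feature changes were applied along these lines; the image spec is a placeholder, and dependent features have to be disabled first:)

    # fast-diff depends on object-map, which depends on exclusive-lock
    rbd feature disable volumes/volume-xyz fast-diff object-map exclusive-lock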
Hello!
It is clear what happens after an OSD goes OUT - PGs are backfilled to other OSDs
and PGs whose primary copies were on the lost OSD get new primary OSDs. But when
the OSD comes back, it looks like all the data for which the OSD was holding
primary copies is read from that OSD and re-written t
You have a point, it depends on your needs.
Based on recovery time and usage, I may find it acceptable to lock writes
during recovery.
Thank you for that insight.
On 21/06/2017 18:47, David Turner wrote:
> I disagree that Replica 2 will ever truly be sane if you care about your
> data. The biggest issue
I disagree that Replica 2 will ever truly be sane if you care about your
data. The biggest issue with replica 2 has nothing to do with drive
failures, restarting osds/nodes, power outages, etc. The biggest issue
with replica 2 is the min_size. If you set min_size to 2, then your data
is locked i
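Purely to illustrate the knob being discussed (the pool name is a placeholder, and this is not a recommendation):

    # with size=2, min_size=2 blocks I/O as soon as either copy is unavailable;
    # min_size=1 keeps I/O flowing but then accepts writes with a single copy
    ceph osd pool get rbd min_size
    ceph osd pool set rbd min_size 1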
2r on filestore == "I do not care about my data"
This is not because of the OSDs' failure chance.
When you have a write error (i.e. data is badly written on the disk,
with no error reported), your data is just corrupted, without hope of
redemption.
Just as you expect your drives to die, expect your drive
Hi all,
I'm doing some work to evaluate the risks involved in running 2r storage
pools. On the face of it my naive disk failure calculations give me 4-5
nines for a 2r pool of 100 OSDs (no copyset awareness, i.e., secondary disk
failure based purely on chance of any 1 of the remaining 99 OSDs fail
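In formula form, the naive model is approximately (AFR per OSD and the recovery window t_r, in hours, are assumed inputs; no copyset awareness):

\[
P_{\text{loss/yr}} \;\approx\; N \cdot \mathrm{AFR} \cdot \Big(1 - \big(1 - \mathrm{AFR}\cdot\tfrac{t_r}{8760}\big)^{N-1}\Big) \;\approx\; N\,(N-1)\,\mathrm{AFR}^2\,\frac{t_r}{8760},
\]

i.e. the chance that any of the N = 100 OSDs fails in a year, times the chance that any one of the remaining 99 fails within the recovery window.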
On 06/14/17 11:59, Dan van der Ster wrote:
> Dear ceph users,
>
> Today we had O(100) slow requests which were caused by deep-scrubbing
> of the metadata log:
>
> 2017-06-14 11:07:55.373184 osd.155
> [2001:1458:301:24::100:d]:6837/3817268 7387 : cluster [INF] 24.1d
> deep-scrub starts
> ...
> 2017-
Hi cephers,
I noticed something I don't understand about ceph's behavior when adding
an OSD. When I start with a clean cluster (all PGs active+clean) and
add an OSD (via ceph-deploy for example), the crush map gets updated and
PGs get reassigned to different OSDs, and the new OSD starts gett
On Wed, 21 Jun 2017, Piotr Dałek wrote:
> > > > > I tested on few of our production images and it seems that about 30%
> > > > > is
> > > > > sparse. This will be lost on any cluster wide event (add/remove nodes,
> > > > > PG grow, recovery).
> > > > >
> > > > > How this is/will be handled in Blue
That patch looks reasonable. You could also try raising the values of
osd_op_thread_suicide_timeout and filestore_op_thread_suicide_timeout on
that osd in order to trim more at a time.
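For example, a hedged sketch of bumping those values at runtime on the affected OSD (osd id and values are placeholders):

    # give the trim more headroom before the op thread is declared suicidal
    ceph tell osd.155 injectargs '--osd-op-thread-suicide-timeout 600 --filestore-op-thread-suicide-timeout 600'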
On 06/21/2017 09:27 AM, Dan van der Ster wrote:
Hi Casey,
I managed to trim up all shards except for that bi
On 17-06-21 03:35 PM, Jason Dillaman wrote:
On Wed, Jun 21, 2017 at 3:05 AM, Piotr Dałek wrote:
I saw that RBD (librbd) does that - replacing writes with discards when
buffer contains only zeros. Some code that does the same in librados could
be added and it shouldn't impact performance much, c
On 17-06-21 03:24 PM, Sage Weil wrote:
On Wed, 21 Jun 2017, Piotr Dałek wrote:
On 17-06-14 03:44 PM, Sage Weil wrote:
On Wed, 14 Jun 2017, Paweł Sadowski wrote:
On 04/13/2017 04:23 PM, Piotr Dałek wrote:
On 04/06/2017 03:25 PM, Sage Weil wrote:
On Thu, 6 Apr 2017, Piotr Dałek wrote:
[snip]
On Wed, Jun 21, 2017 at 3:05 AM, Piotr Dałek wrote:
> I saw that RBD (librbd) does that - replacing writes with discards when
> buffer contains only zeros. Some code that does the same in librados could
> be added and it shouldn't impact performance much, current implementation of
> mem_is_zero is
Hi Casey,
I managed to trim up all shards except for that big #54. The others
all trimmed within a few seconds.
But 54 is proving difficult. It's still going after several days, and
now I see that the 1000-key trim is indeed causing osd timeouts. I've
manually compacted the relevant osd leveldbs,
On Wed, 21 Jun 2017, Piotr Dałek wrote:
> On 17-06-14 03:44 PM, Sage Weil wrote:
> > On Wed, 14 Jun 2017, Paweł Sadowski wrote:
> > > On 04/13/2017 04:23 PM, Piotr Dałek wrote:
> > > > On 04/06/2017 03:25 PM, Sage Weil wrote:
> > > > > On Thu, 6 Apr 2017, Piotr Dałek wrote:
> > > > > > [snip]
> > >
> On 21 June 2017 at 12:38, Osama Hasebou wrote:
>
>
> Hi Guys,
>
> Has anyone used flash SSD drives for nodes hosting Monitor nodes only?
>
> If yes, any major benefits against just using SAS drives ?
>
Yes:
- Less latency
- Faster store compaction
- More reliable
Buy an Intel S3710 or
On 06/21/2017 12:38 PM, Osama Hasebou wrote:
> Hi Guys,
>
> Has anyone used flash SSD drives for nodes hosting Monitor nodes only?
>
> If yes, any major benefits against just using SAS drives ?
We are using such a setup for big (>500 OSDs) clusters. It makes it less
painful when such a cluster rebalan
If you just mean normal DC-rated SSDs, then that’s what I am running across a
~120 OSD cluster.
When checking, they are mostly idle and see minimal use; however, I can imagine
the lower random latency will always help.
So if you can, I would.
,Ashley
Sent from my iPhone
On 21 Jun 2017, at 6:39 PM, O
On 17-06-20 02:44 PM, Richard Hesketh wrote:
Is there a way, either by individual PG or by OSD, that I can prioritise
backfill/recovery on a set of PGs which are currently particularly important to
me?
For context, I am replacing disks in a 5-node Jewel cluster on a node-by-node
basis - mark out
Hi Guys,
Has anyone used flash SSD drives for nodes hosting Monitor nodes only?
If yes, any major benefits against just using SAS drives ?
Thanks.
Regards,
Ossi
On Wed, Jun 21, 2017 at 05:30:02PM +0900, Christian Balzer wrote:
>
> Hello,
>
> On Wed, 21 Jun 2017 09:47:08 +0200 (CEST) Alexandre DERUMIER wrote:
>
> > Hi,
> >
> > Proxmox is maintaining a ceph-luminous repo for stretch
> >
> > http://download.proxmox.com/debian/ceph-luminous/
> >
> >
> >
Hello,
On Wed, 21 Jun 2017 09:47:08 +0200 (CEST) Alexandre DERUMIER wrote:
> Hi,
>
> Proxmox is maintaining a ceph-luminous repo for stretch
>
> http://download.proxmox.com/debian/ceph-luminous/
>
>
> git is here, with patches and modifications to get it to work
> https://git.proxmox.com/?p=ceph
Hi,
Proxmox is maintaining a ceph-luminous repo for stretch
http://download.proxmox.com/debian/ceph-luminous/
git is here, with patches and modifications to get it to work
https://git.proxmox.com/?p=ceph.git;a=summary
- Mail original -
De: "Alfredo Deza"
À: "Christian Balzer"
Cc: "ceph
On 17-06-14 03:44 PM, Sage Weil wrote:
On Wed, 14 Jun 2017, Paweł Sadowski wrote:
On 04/13/2017 04:23 PM, Piotr Dałek wrote:
On 04/06/2017 03:25 PM, Sage Weil wrote:
On Thu, 6 Apr 2017, Piotr Dałek wrote:
[snip]
I think the solution here is to use sparse_read during recovery. The
PushOp da