Hi Will,
there is a dedicated mailing list for ceph-ansible:
http://lists.ceph.com/listinfo.cgi/ceph-ansible-ceph.com
Best,
Martin
On Thu, Jan 31, 2019 at 5:07 PM Will Dennis wrote:
>
> Hi all,
>
>
>
> Trying to utilize the ‘ceph-ansible’ project
> (https://github.com/ceph/ceph-ansible ) to de
Upgrading to 4.15.0-43-generic fixed the problem.
Best,
Martin
On Fri, Jan 25, 2019 at 9:43 PM Ilya Dryomov wrote:
>
> On Fri, Jan 25, 2019 at 9:40 AM Martin Palma wrote:
> >
> > > Do you see them repeating every 30 seconds?
> >
> > yes:
> >
> > Jan
> Do you see them repeating every 30 seconds?
yes:
Jan 25 09:34:37 sdccgw01 kernel: [6306813.737615] libceph: mon4
10.8.55.203:6789 session lost, hunting for new mon
Jan 25 09:34:37 sdccgw01 kernel: [6306813.737620] libceph: mon3
10.8.55.202:6789 session lost, hunting for new mon
Jan 25 09:34:37
Hi Ilya,
thank you for the clarification. After setting "osd_map_messages_max"
to 10, the IO errors and the MDS error "MDS_CLIENT_LATE_RELEASE" are
gone.
The "mon session lost, hunting for new mon" messages didn't go
away... could this be related to
https://tracker.ceph.com/is
We are experiencing the same issues on clients with CephFS mounted
using the kernel client and 4.x kernels.
The problem shows up when we add new OSDs, on reboots after
installing patches and when changing the weight.
Here are the logs of a misbehaving client:
[6242967.890611] libceph: mon4 10.8.55.
Hello,
maybe a dumb question, but is there a way to correlate the ceph kernel
module version with a specific Ceph version? For example, can I figure
this out using "modinfo ceph"?
What's the best way to check whether a specific client is running at
least Luminous?
Best,
Martin
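One way to answer this, assuming the cluster itself is on Luminous or
newer, is "ceph features", which groups connected clients by the
release their feature bits correspond to (for kernel clients this
mapping is only approximate):

  ceph features
  # per-client detail is also available via the mon admin socket, e.g.
  ceph daemon mon.$(hostname -s) sessions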
Same here also on Gmail with G Suite.
On Mon, Oct 8, 2018 at 12:31 AM Paul Emmerich wrote:
>
> I'm also seeing this once every few months or so on Gmail with G Suite.
>
> Paul
> Am So., 7. Okt. 2018 um 08:18 Uhr schrieb Joshua Chen
> :
> >
> > I also got removed once, got another warning once (nee
> Mons are also on a 30s timeout.
> Even a short loss of quorum isn't noticeable for ongoing IO.
>
> Paul
>
> > Am 04.10.2018 um 11:03 schrieb Martin Palma :
> >
> > What about monitor elections? That is our biggest fear, since the monitor
> > nodes will not see
What about monitor elections? That is our biggest fear, since the monitor
nodes will not see each other for that timespan...
On Thu, Oct 4, 2018 at 10:21 AM Paul Emmerich wrote:
>
> 10 seconds is far below any relevant timeout values (generally 20-30
> seconds); so you will be fine without any specia
Hi all,
our Ceph cluster is distributed across two datacenters. Due to network
maintenance, the link between the two datacenters will be down for ca.
8-10 seconds. During this time the Ceph public network between the two
DCs will also be down.
What is the best way to handle this scenario to have m
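A common precaution for a short, planned outage like this (only a
sketch, using standard cluster flags) is to stop Ceph from reacting to
the blip and then remove the flags afterwards:

  ceph osd set noout        # don't mark OSDs out during the window
  ceph osd set norebalance  # avoid data movement while the link is down
  # ... maintenance window ...
  ceph osd unset norebalance
  ceph osd unset noout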
Thanks for the suggestions; we will check for LVM volumes etc. in the
future. The kernel version is 3.10.0-327.4.4.el7.x86_64 and the OS is
CentOS 7.2.1511 (Core).
Best,
Martin
On Mon, Sep 10, 2018 at 12:23 PM Ilya Dryomov wrote:
>
> On Mon, Sep 10, 2018 at 10:46 AM Martin Palma
We are trying to unmap an rbd image from a host for deletion and are
hitting the following error:
rbd: sysfs write failed
rbd: unmap failed: (16) Device or resource busy
We used commands like "lsof" and "fuser" but nothing is reported to be
using the device. We also checked for watchers with "rados -p pool
li
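For completeness, the usual way to list watchers looks roughly like
this (pool and image names are placeholders):

  rbd status <pool>/<image>                       # shows current watchers
  rbd info <pool>/<image> | grep block_name_prefix
  # the id at the end of block_name_prefix (rbd_data.<id>) names the header object:
  rados -p <pool> listwatchers rbd_header.<id>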
Since Prometheus uses a pull model over HTTP for collecting metrics,
what are the best practices to secure these HTTP endpoints? (A sketch
of the reverse-proxy option follows the list below.)
- With a reverse proxy with authentication?
- Expose the node_exporter only on the cluster network? (not usable
for the mgr plugin and for nodes like mons, mdss,...)
- N
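As an illustration of the first option only (ports, paths and the
localhost binding of node_exporter are assumptions), a minimal nginx
reverse proxy with basic authentication could look roughly like this:

  server {
      listen 9100;                      # port exposed to the Prometheus server
      location / {
          auth_basic           "metrics";
          auth_basic_user_file /etc/nginx/htpasswd;
          # node_exporter assumed to listen on 127.0.0.1:9101 only
          proxy_pass           http://127.0.0.1:9101/;
      }
  }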
Hi all,
In our current production cluster we have the following CRUSH
hierarchy, see https://pastebin.com/640Q4XSH or the attached image.
This reflects the real physical deployment 1:1. We also currently use a
replication factor of 3 with the following CRUSH rule on our pools:
rule hdd_replicated {
id
Hello,
Is it possible to get directory/file layout information (size, pool)
of a CephFS directory directly from a metadata server without the need
to mount the fs? Or better through the restful plugin...
When mounted, I can get info about the directory/file layout using the
getfattr command...
B
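For reference, the mounted variant mentioned above looks like this
(paths are placeholders; ceph.dir.layout is only reported for
directories that have an explicit layout set):

  getfattr -n ceph.dir.layout /mnt/cephfs/some/dir
  getfattr -n ceph.file.layout /mnt/cephfs/some/file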
Just ran into this problem on our production cluster.
It would have been nice if the release notes of 12.2.4 had been
updated to inform users about this.
Best,
Martin
On Wed, Mar 14, 2018 at 9:53 PM, Gregory Farnum wrote:
> On Wed, Mar 14, 2018 at 12:41 PM, Lars Marowsky-Bree wrote:
>> On 20
. If you were to reset the
> weights for the previous OSDs, you would only incur an additional round of
> reweighting for no discernible benefit.
>
> On Mon, Feb 26, 2018 at 7:13 AM Martin Palma wrote:
>>
>> Hello,
>>
>> from some OSDs in our cluster we got th
Hello,
From some OSDs in our cluster we got the "nearfull" warning message, so
we ran the "ceph osd reweight-by-utilization" command to better
distribute the data.
Now that we have expanded our cluster with new nodes, should we reset
the weight of the changed OSDs to 1.0?
Best,
Martin
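For reference, if one did want to reset the override weights, a sketch
(the OSD id is just an example) would be:

  ceph osd df              # the REWEIGHT column shows which OSDs were changed
  ceph osd reweight 12 1.0 # reset the override weight of osd.12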
Hello,
is there a way to get librados for MacOS? Has anybody tried to build
librados for MacOS? Is this even possible?
Best,
Martin
Hi,
Calamari is deprecated; as far as I know it was replaced by ceph-mgr [0].
Bye,
Martin
[0] http://docs.ceph.com/docs/master/mgr/
On Wed, Jul 19, 2017 at 6:28 PM, Oscar Segarra wrote:
> Hi,
>
> Anybody has been able to setup Calamari on Centos7??
>
> I've done a lot of Google but I haven
Can the "sortbitwise" also be set if we have a cluster running OSDs on
10.2.6 and some OSDs on 10.2.9? Or should we wait that all OSDs are on
10.2.9?
Monitor nodes are already on 10.2.9.
Best,
Martin
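A quick way to see which versions the OSDs are actually running before
deciding (a sketch using standard commands):

  ceph tell osd.* version   # each OSD reports its running version
  ceph osd set sortbitwise  # once you are satisfied with the versions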
On Fri, Jul 14, 2017 at 1:16 PM, Dan van der Ster wrote:
> On Mon, Jul 10, 2017 at 5:06 PM, Sag
Thank you for the clarification and yes we saw that v10.2.9 was just
released. :-)
Best,
Martin
On Fri, Jul 14, 2017 at 3:53 PM, Patrick Donnelly wrote:
> On Fri, Jul 14, 2017 at 12:26 AM, Martin Palma wrote:
>> So only the ceph-mds is affected? Let's say if we have mons and osd
So only the ceph-mds is affected? Say we have mons and OSDs on 10.2.8
and the MDS on 10.2.6 or 10.2.7, would we then be "safe"?
I'm asking since we need to add new storage nodes to our production cluster.
Best,
Martin
On Wed, Jul 12, 2017 at 10:44 PM, Patrick Donnelly wrote:
> On Wed, Jul 12
> [429280.254400] attempt to access beyond end of device
> [429280.254412] sdi1: rw=0, want=19134412768, limit=19134412767
We are seeing the same for our OSDs that have the journal as a
separate partition on the same disk, and only for OSDs that we
added after our cluster was upgraded to j
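A quick way to inspect the partition limit the kernel reports (device
names are taken from the log above, purely as an example):

  blockdev --getsz /dev/sdi1   # partition size in 512-byte sectors
  lsblk -b /dev/sdi            # byte-exact sizes of the disk and its partitions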
Hi Wido,
thank you for the clarification. We will wait until recovery is over;
we have plenty of space on the mons :-)
Best,
Martin
On Tue, Jan 31, 2017 at 10:35 AM, Wido den Hollander wrote:
>
>> Op 31 januari 2017 om 10:22 schreef Martin Palma :
>>
>>
>> Hi all,
>
Hi all,
our cluster is currently undergoing a big expansion and is in recovery
(we doubled the OSD count and the raw capacity from 600 TB to 1.2 PB).
Now we get the following message from our monitor nodes:
mon.mon01 store is getting too big! 18119 MB >= 15360 MB -- 94% avail
Reading [0] it says that it is
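For reference, once recovery has finished the store can usually be
compacted; a sketch using standard mon commands (the mon id is taken
from the warning above):

  ceph tell mon.mon01 compact
  # or, to compact on every mon restart, in ceph.conf:
  [mon]
  mon_compact_on_start = true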
> where it is on the monitor nodes? Only in memory, or
> persisted in any files or DBs? Looks like it's not just in memory, but I
> cannot find where those values are saved, thanks!
>
> Best Regards,
> Dave Chen
>
> From: Martin Palma [mailto:mar...@palma.bz]
> Sent: Friday, Jan
Hi,
They are stored on the monitor nodes.
Best,
Martin
On Fri, 20 Jan 2017 at 04:53, Chen, Wei D wrote:
> Hi,
>
>
>
> I have read through some documents about authentication and user
> management about ceph, everything works fine with me, I can create
>
> a user and play with the keys and cap
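For reference, the entities and caps kept in the monitors' key-value
store can be inspected with standard commands, e.g.:

  ceph auth list                    # all entities and their caps
  ceph auth get client.admin        # one entity, keyring-formatted
  ceph auth print-key client.admin  # just the secret key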
user that will
> perform the "snap unprotect" has the "allow class-read object_prefix
> rbd_children" on all pools [1].
>
> [1] http://docs.ceph.com/docs/master/man/8/ceph-authtool/#capabilities
>
> On Thu, Jan 12, 2017 at 10:56 AM, Martin Palma w
Hi all,
what permissions do I need to unprotect a protected rbd snapshot?
Currently the key interacting with the pool containing the rbd image
has the following permissions:
mon 'allow r'
osd 'allow rwx pool=vms'
When I try to unprotect a snapshot with the following command "rbd
snap unprotect
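Following the advice quoted above, a sketch of the adjusted caps could
look like this (the client name is hypothetical; the pool name is
taken from the message):

  ceph auth caps client.vms \
      mon 'allow r' \
      osd 'allow class-read object_prefix rbd_children, allow rwx pool=vms'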
Thanks all for the clarification.
Best,
Martin
On Mon, Dec 5, 2016 at 2:14 PM, John Spray wrote:
> On Mon, Dec 5, 2016 at 12:35 PM, David Disseldorp wrote:
>> Hi Martin,
>>
>> On Mon, 5 Dec 2016 13:27:01 +0100, Martin Palma wrote:
>>
>>> Ok, just discovere
OK, just discovered that with the fuse client we have to add the '-r
/path' option to treat that path as the root. So I assume the cap 'mds
allow r' is only needed if we also want to be able to mount the
directory with the kernel client. Right?
Best,
Martin
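As a rough sketch (client name, path and pool are hypothetical), a
path-restricted client on Jewel or newer could be created like this:

  ceph auth get-or-create client.projectA \
      mon 'allow r' \
      mds 'allow r, allow rw path=/projectA' \
      osd 'allow rw pool=cephfs_data'
  # fuse client, treating /projectA as its root:
  ceph-fuse --id projectA -r /projectA /mnt/projectA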
On Mon, Dec 5, 2016 at 1
Hello,
is it possible to prevent a CephFS client from mounting the root of a
CephFS filesystem and browsing through it?
We want to restrict CephFS clients to a particular directory, but when
we define a specific cephx auth key for a client we need to add the
following cap: "mds 'allow r'", which then gives t
> I was wondering how exactly you accomplish that?
> Can you do this with a "ceph-deploy create" with "noin" or "noup" flags
> set, or does one need to follow the manual steps of adding an osd?
You can do it either way (manual or with ceph-deploy). Here are the
steps using ceph-deploy:
1. Add "os
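As a sketch of the flag-based variant mentioned above (the OSD id is
just an example):

  ceph osd set noin   # newly added OSDs come up but stay "out"
  # ... create/activate the new OSDs (manually or with ceph-deploy) ...
  ceph osd in 201     # bring them in one at a time, watching recovery
  ceph osd unset noin # once all new OSDs are in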
Hi James,
> o) Based on our performance testing we’re seeing the kernel client by far
> out-performs the fuse client – older mailing list posts from 2014 suggest
> this is expected, is the recommendation still to use the kernel client?
I guess so, but about this I'm not 100% sure.
> o) Ref: http
r wrote:
>
>> Op 9 augustus 2016 om 17:44 schreef Martin Palma :
>>
>>
>> Hi Wido,
>>
>> thanks for your advice.
>>
>
> Just keep in mind, you should update the CRUSHMap in one big bang. The
> cluster will be calculating and peering for 1 or 2 m
Hi Wido,
thanks for your advice.
Best,
Martin
On Tue, Aug 9, 2016 at 10:05 AM, Wido den Hollander wrote:
>
>> Op 8 augustus 2016 om 16:45 schreef Martin Palma :
>>
>>
>> Hi all,
>>
>> we are in the process of expanding our cluster and I would like to
>
Hi all,
we are in the process of expanding our cluster and I would like to
know if there are some best practices in doing so.
Our current cluster is composed as follows:
- 195 OSDs (14 Storage Nodes)
- 3 Monitors
- Total capacity 620 TB
- Used 360 TB
We will expand the cluster by another 14 Stora
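Not a full answer, but one knob that is commonly used while adding
that many OSDs is throttling backfill (a sketch; tune the values to
your hardware):

  ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'
  # optionally pause data movement entirely while the new OSDs are created:
  ceph osd set norebalance
  # ... add the nodes/OSDs ...
  ceph osd unset norebalance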
I assume you installed Ceph using 'ceph-deploy'. I noticed the same
thing on CentOS when deploying a cluster for testing...
As Wido already noted the OSDs are marked as down & out. From each OSD
node you can do a "ceph-disk activate-all" to start the OSDs.
On Mon, Jul 18, 2016 at 12:59 PM, Wido d
It seems that the packages "ceph-release-*.noarch.rpm" contain a
ceph.repo pointing to the baseurl
"http://ceph.com/rpm-hammer/rhel7/$basearch", which does not exist. It
should probably point to "http://ceph.com/rpm-hammer/el7/$basearch".
- Martin
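A corrected ceph.repo stanza would then look roughly like this (sketch
only):

  [ceph]
  name=Ceph packages for $basearch
  baseurl=http://ceph.com/rpm-hammer/el7/$basearch
  enabled=1
  gpgcheck=1
  # gpgkey=... (unchanged from the shipped ceph.repo)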
On Thu, Jul 7,
Hi All,
it seems that the "rhel7" folder/symlink on
"download.ceph.com/rpm-hammer" does not exist anymore therefore
ceph-deploy fails to deploy a new cluster. Just tested it by setting
up a new lab environment.
We have the same issue on our production cluster currently, which
keeps us of updating
--
> From: m...@palma.bz [mailto:m...@palma.bz] On Behalf Of Martin Palma
> Sent: Wednesday, June 15, 2016 16:03
> To: DAVY Stephane OBS/OCB
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] Failing upgrade from Hammer to Jewel on Centos 7
>
> Hi Stéphane,
>
> We
Hi Stéphane,
We had the same issue:
https://www.mail-archive.com/ceph-users%40lists.ceph.com/msg27507.html
Since then we have applied the fix suggested by Dan by simply adding
"ceph-disk activate-all" to rc.local.
Best,
Martin
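For reference, the workaround amounts to something like this (a
sketch; make sure rc.local is executable):

  # /etc/rc.local
  ceph-disk activate-all
  exit 0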
On Wed, Jun 15, 2016 at 10:39 AM, wrote:
> Hello ceph users,
>
>
>
idd
> Sr. Software Maintenance Engineer
> Red Hat Ceph Storage
> +1 919-442-8878
>
> On Tue, Mar 15, 2016 at 11:41 AM, Martin Palma wrote:
>>
>> Hi all,
>>
>> The documentation [0] gives us the following formula for calculating
>> the number of PG if th
Hi all,
The documentation [0] gives us the following formula for calculating
the number of PGs if the cluster is bigger than 50 OSDs:
Total PGs = (OSDs * 100) / pool size
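As a worked example with made-up numbers: 200 OSDs and a pool size of 3
give 200 * 100 / 3 = 6666.7, which would then typically be rounded up
to the next power of two, i.e. 8192 PGs.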
When we have mixed storage servers (HDD disks and SSD disks) and we
have
> To clarify, I didn't notice this issue in 0.94.6 specifically... I
> just don't trust the udev magic to work every time after every kernel
> upgrade, etc.
>
> -- Dan
>
> On Mon, Mar 7, 2016 at 10:20 AM, Martin Palma wrote:
>> Hi Dan,
>>
>> thanks for the
he time anyway just in case...)
>
> -- Dan
>
> On Mon, Mar 7, 2016 at 9:38 AM, Martin Palma wrote:
>> Hi All,
>>
>> we are in the middle of patching our OSD servers and noticed that
>> after rebooting no OSD disk is mounted and therefore no OSD service
>>
Hi All,
we are in the middle of patching our OSD servers and noticed that
after rebooting no OSD disk is mounted and therefore no OSD service
starts.
We then have to manually call "ceph-disk-activate /dev/sdX1" for all
our disks in order to mount them and start the OSD services again.
Here are the versio
Hi Maruthi,
happy to hear that it is working now.
Yes, with the latest stable release, infernalis, the "ceph" username is
reserved for the Ceph daemons.
Best,
Martin
On Tuesday, 5 January 2016, Maruthi Seshidhar
wrote:
> Thank you Martin,
>
> Yes, "nslookup " was not working.
> After configu
Hi Maruthi,
and did you test that DNS name lookups work properly (e.g. nslookup
ceph-mon1 etc.) on all hosts?
From the output of 'ceph-deploy' it seems that the host can only
resolve its own name but not the others:
[ceph-mon1][DEBUG ] "monmap": {
[ceph-mon1][DEBUG ] "created": "0.0
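A quick sanity check that can be run on every host (hostnames other
than ceph-mon1 are hypothetical):

  for h in ceph-mon1 ceph-mon2 ceph-mon3; do
      getent hosts $h || echo "$h does not resolve"
  done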
Hi,
it seems you are missing the "client.admin" keyring.
When you disabled cephx authentication in ceph.conf, did you push the
config to all nodes? If yes, maybe you need to restart the monitor
services for the change to take effect.
Best,
Martin
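For reference, the two steps mentioned above as a sketch (hostnames
are placeholders; the restart command depends on the init system of
your release):

  ceph-deploy --overwrite-conf config push node1 node2 node3
  # systemd-based releases:
  systemctl restart ceph-mon@$(hostname -s)
  # sysvinit-based installs use something like: /etc/init.d/ceph restart mon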
On Thu, Dec 24, 2015 at 4:11 PM, Selim Dinc
Currently, we use approach #1 with kerberized NFSv4 and Samba (with AD as
KDC) - desperately waiting for CephFS :-)
Best,
Martin
On Tue, Dec 15, 2015 at 11:51 AM, Wade Holler wrote:
> Keep it simple is my approach. #1
>
> If needed Add rudimentary HA with pacemaker.
>
> http://linux-ha.org/wiki
Hi,
from what I'm seeing, your ceph.conf isn't quite right if we take into
account your cluster description "...with one monitor node and one osd...".
The parameters "mon_initial_members" and "mon_host" should only contain
monitor nodes, not all the nodes in your cluster.
Moreover, you should provide
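As a rough illustration (fsid, hostname and IP are placeholders), the
monitor-related part of a minimal ceph.conf would look like:

  [global]
  fsid = <cluster fsid>
  mon_initial_members = ceph-mon1
  mon_host = 192.168.0.10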
> And a swap partition is still needed even though the memory is big.
> > On 18 September 2015 at 23:07, Martin Palma wrote: Hi,
> >
> > Is it a good idea to use a software raid for the system disk (Operating
> > System) on a Ceph storage node? I mean only for the OS not for the
Hi,
Is it a good idea to use a software RAID for the system disk (operating
system) on a Ceph storage node? I mean only for the OS, not for the OSD
disks.
And what about a swap partition? Is that needed?
Best,
Martin
.
>
> In this case as you said, distributing the ssds across all nodes should be
> your correct approach.
>
> Hope this helps,
>
>
>
> Thanks & Regards
>
> Somnath
>
>
>
> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf
> Of
find like other that cache-tiers currently aren't all great
> performance wise.
>
> Christian.
>
> On Sat, 30 May 2015 10:36:39 +0200 Martin Palma wrote:
>
> > Hello,
> >
> > We are planing to deploy our first Ceph cluster with 14 storage nodes
> >
Hello,
We are planning to deploy our first Ceph cluster with 14 storage nodes
and 3 monitor nodes. The storage nodes have 12 SATA disks and 4 SSDs.
We plan to use 2 of the SSDs as journal disks and 2 for cache tiering.
Now the question was raised in our team whether it would be better to
put all SSDs, let's s