Hi Team,
If I lose the admin node, what will the recovery procedure be to keep using the same keys?
Regards
Prabu
___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
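For what it's worth, a rough sketch of one common approach, assuming at least one
monitor is still up and default paths (the mon id/host "mon1" is a placeholder):
  # on a surviving monitor, authenticate with the local mon. keyring and
  # re-export the existing admin key, so the keys stay the same:
  ceph --name mon. --keyring /var/lib/ceph/mon/ceph-mon1/keyring \
      auth get client.admin -o /etc/ceph/ceph.client.admin.keyring
  # or, if ceph-deploy was used, from a rebuilt admin box:
  ceph-deploy config pull mon1
  ceph-deploy gatherkeys mon1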
Hello!
On Mon, Oct 05, 2015 at 09:35:26PM -0600, robert wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
> With some off-list help, we have adjusted
> osd_client_message_cap=1. This seems to have helped a bit and we
> have seen some OSDs with a value of up to 4,000 for client messages
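(For anyone wanting to try the same, a minimal sketch of changing that cap at
runtime; the value 100 below is only an illustration, not the number used here:)
  ceph tell osd.* injectargs '--osd_client_message_cap 100'
  # or persistently in ceph.conf, [osd] section:
  #   osd client message cap = 100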
Short: how do I reliably avoid (if possible) fs freezes when 1 of 3 MDSes rejoins?
ceph version 0.94.3-242-g79385a8 (79385a85beea9bccd82c99b6bda653f0224c4fcd)
I am moving 2 VM clients from ocfs2 (which started to deadlock the VMs on snapshot) to cephfs
(at least I can back it up). Maybe I just didn't see it before, may
On Tue, Oct 6, 2015 at 11:43 AM, Dzianis Kahanovich
wrote:
> Short: how do I reliably avoid (if possible) fs freezes when 1 of 3 MDSes rejoins?
>
> ceph version 0.94.3-242-g79385a8 (79385a85beea9bccd82c99b6bda653f0224c4fcd)
>
> I am moving 2 VM clients from ocfs2 (which started to deadlock the VMs on snapshot) to
> cephf
John Spray writes:
On Tue, Oct 6, 2015 at 11:43 AM, Dzianis Kahanovich
wrote:
Short: how do I reliably avoid (if possible) fs freezes when 1 of 3 MDSes rejoins?
ceph version 0.94.3-242-g79385a8 (79385a85beea9bccd82c99b6bda653f0224c4fcd)
I am moving 2 VM clients from ocfs2 (which started to deadlock the VMs on snapsho
PS: This is a standard 3-node (MON+MDS+OSDs, the initial 3x setup) cluster plus 1 OSD
node added later. Nothing special. OSDs are balanced to nearly equal size per host.
Dzianis Kahanovich writes:
John Spray writes:
On Tue, Oct 6, 2015 at 11:43 AM, Dzianis Kahanovich
wrote:
Short: how do I reliably avoid (if possible) fs
On Tue, Oct 6, 2015 at 12:07 PM, Dzianis Kahanovich
wrote:
> John Spray writes:
>>
>> On Tue, Oct 6, 2015 at 11:43 AM, Dzianis Kahanovich
>> wrote:
>>>
>>> Short: how do I reliably avoid (if possible) fs freezes when 1 of 3 MDSes rejoins?
>>>
>>> ceph version 0.94.3-242-g79385a8
>>> (79385a85beea9bccd82c99b6
John Spray writes:
Short: how do I reliably avoid (if possible) fs freezes when 1 of 3 MDSes rejoins?
ceph version 0.94.3-242-g79385a8
(79385a85beea9bccd82c99b6bda653f0224c4fcd)
I am moving 2 VM clients from ocfs2 (which started to deadlock the VMs on snapshot) to
cephfs (at least I can back it up). Maybe I just don't
Hi,
Context:
Firefly 0.80.9
8 storage nodes
176 osds : 14*8 sas and 8*8 ssd
3 monitors
I created an alternate crushmap in order to fulfill a tiering requirement, i.e. to
select ssd or sas.
I created specific buckets "host-ssd" and "host-sas" and regrouped them into
"tier-ssd" and "tier-sas" under a "tier-
On Tue, Oct 6, 2015 at 1:22 PM, Dzianis Kahanovich
wrote:
> Even now, after I removed "mds standby replay = true":
> e7151: 1/1/1 up {0=b=up:active}, 2 up:standby
> The cluster gets stuck when I KILL the active mds.b. How do I correctly stop an mds to get
> behaviour like on the MONs - leader->down/peon->leader?
It's not clear to
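(For reference, a minimal sketch of cleanly handing off the active MDS; mds.b /
rank 0 are the names from the status line above, the stop command is only an
example:)
  ceph mds stat        # confirm which daemon is active (here mds.b, rank 0)
  ceph mds fail 0      # mark rank 0 failed so a standby takes over
  # then stop the daemon on its node, e.g.:
  #   service ceph stop mds.b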
On Mon, 5 Oct 2015, Robert LeBlanc wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> With some off-list help, we have adjusted
> osd_client_message_cap=1. This seems to have helped a bit and we
> have seen some OSDs with a value of up to 4,000 for client messages. But
> it does not s
John Spray writes:
On Tue, Oct 6, 2015 at 1:22 PM, Dzianis Kahanovich
wrote:
Even now, after I removed "mds standby replay = true":
e7151: 1/1/1 up {0=b=up:active}, 2 up:standby
The cluster gets stuck when I KILL the active mds.b. How do I correctly stop an mds to get
behaviour like on the MONs - leader->down/peon->leader?
It'
Sorry, skipped some...
John Spray writes:
On Tue, Oct 6, 2015 at 1:22 PM, Dzianis Kahanovich
wrote:
Even now, after I removed "mds standby replay = true":
e7151: 1/1/1 up {0=b=up:active}, 2 up:standby
The cluster gets stuck when I KILL the active mds.b. How do I correctly stop an mds to get
behaviour like on the MONs - leader->d
On Tue, Oct 6, 2015 at 2:21 PM, Dzianis Kahanovich
wrote:
> John Spray writes:
>>
>> On Tue, Oct 6, 2015 at 1:22 PM, Dzianis Kahanovich
>> wrote:
>>>
>>> Even now, after I removed "mds standby replay = true":
>>> e7151: 1/1/1 up {0=b=up:active}, 2 up:standby
>>> The cluster gets stuck when I KILL the active mds.b. How do I
On Mon, Oct 5, 2015 at 11:21 AM, Egor Kartashov wrote:
> Hello!
>
> I have a cluster of 3 machines with ceph 0.80.10 (the package shipped with Ubuntu
> Trusty). Ceph successfully mounts on all of them. On an external machine I'm
> receiving the error "can't read superblock" and dmesg shows records like:
>
> [1
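(In case it helps narrow it down, the basic checks I'd run from the external
machine; the monitor address and secret path are placeholders:)
  mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs -o name=admin,secretfile=/etc/ceph/admin.secret
  dmesg | tail -n 20             # the kernel client logs the real reason here
  ceph osd crush show-tunables   # old kernels reject newer crush tunables/feature bits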
On Mon, Oct 5, 2015 at 10:36 PM, Dmitry Ogorodnikov
wrote:
> Good day,
>
> I think I will use wheezy for now for tests. The bad thing is that wheezy full
> support ends in 5 months, so wheezy is not OK for a persistent production
> cluster.
>
> I can't find out what the ceph team offers to debian users, move to ot
Thanks for your time Sage. It sounds like a few people may be helped if you
can find something.
I did a recursive chown as in the instructions (although I didn't know
about the doc at the time). I did an osd debug at 20/20 but didn't see
anything. I'll also do ms and make the logs available. I'll
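(For anyone following along, a sketch of raising those levels at runtime; the
exact values are illustrative:)
  ceph tell osd.* injectargs '--debug_osd 20/20 --debug_ms 1/1'
  # and back down afterwards:
  ceph tell osd.* injectargs '--debug_osd 0/5 --debug_ms 0/5'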
On Mon, Oct 5, 2015 at 10:40 PM, Serg M wrote:
> What is the difference between the memory statistics of "ceph tell {daemon}.{id} heap
> stats"
Assuming you're using tcmalloc (by default you are) this will get
information straight from the memory allocator about what the actual
daemon memory usage is.
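For example (osd.0 is just a placeholder id):
  ceph tell osd.0 heap stats     # tcmalloc's view of the daemon's heap
  ceph tell osd.0 heap release   # ask tcmalloc to return freed memory to the OS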
> ,
> Most users in the apt family have deployed on Ubuntu
> though, and that's what our tests run on, fyi.
That is good to know - I wouldn't be surprised if the same packages could be
used on Ubuntu and Debian, especially if the release dates of the Ubuntu and
Debian versions were similar.
Thanks
All four machines are located in different datacenters and networks. All of those
networks are routable with each other. The public network section of ceph.conf
contains all of those networks.
--
Best regards,
Egor Kartashov
http://staff/kartvep
06.10.2015, 17:23, "Gregory Farnum" :
> On Mon, Oct 5, 2015
On Tue, 6 Oct 2015, Robert LeBlanc wrote:
> Thanks for your time Sage. It sounds like a few people may be helped if you
> can find something.
>
> I did a recursive chown as in the instructions (although I didn't know about
> the doc at the time). I did an osd debug at 20/20 but didn't see anything
On Tue, Oct 6, 2015 at 10:29 AM, Gregory Farnum wrote:
> On Mon, Oct 5, 2015 at 10:36 PM, Dmitry Ogorodnikov
> wrote:
>> Good day,
>>
>> I think I will use wheezy for now for tests. The bad thing is that wheezy full
>> support ends in 5 months, so wheezy is not OK for a persistent production
>> cluster.
>>
John Spray writes:
On Tue, Oct 6, 2015 at 2:21 PM, Dzianis Kahanovich
wrote:
John Spray writes:
On Tue, Oct 6, 2015 at 1:22 PM, Dzianis Kahanovich
wrote:
Even now, after I removed "mds standby replay = true":
e7151: 1/1/1 up {0=b=up:active}, 2 up:standby
The cluster gets stuck when I KILL the active mds.b. How do I co
I have encountered a rather interesting issue with Ubuntu 14.04 LTS running
3.19.0-30 kernel (Vivid) using Ceph Hammer (0.94.3).
With everything else identical in our testing cluster, no other changes
other than the kernel (apt-get install linux-image-generic-lts-vivid and
then a reboot), we ar
On 10/06/2015 10:14 AM, MailingLists - EWS wrote:
I have encountered a rather interesting issue with Ubuntu 14.04 LTS
running 3.19.0-30 kernel (Vivid) using Ceph Hammer (0.94.3).
With everything else identical in our testing cluster, no other changes
other than the kernel (apt-get install linux-
Could you share some of your testing methodology? I'd like to repeat your
tests.
I have a cluster that is currently running mostly 3.13 kernels, but the
latest patch of that version breaks the onboard 1Gb NIC in the servers I'm
using. I recently had to redeploy several of these servers due to SSD
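In the meantime, a generic way to compare the two kernels, not the original
poster's methodology (pool name and run lengths are placeholders):
  rados bench -p rbd 60 write -t 16 --no-cleanup
  rados bench -p rbd 60 seq -t 16
  rados -p rbd cleanup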
I downgraded to the hammer gitbuilder branch, but it looks like I've
passed the point of no return:
2015-10-06 09:44:52.210873 7fd3dd8b78c0 -1 ERROR: on disk data
includes unsupported features:
compat={},rocompat={},incompat={7=support shec erasure code}
2015-10-06 09:44:52.210922 7fd3dd8b78c0 -1
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
I've only done a 'step take <bucket>' where <bucket> is a root entry. I haven't tried
it with the bucket being under the root. I would suspect it would work, but you
can try to put your tiers in a root section and test it there.
-
Robert LeBlanc
PGP Fingerprin
On Tue, 6 Oct 2015, Robert LeBlanc wrote:
> I downgraded to the hammer gitbuilder branch, but it looks like I've
> passed the point of no return:
>
> 2015-10-06 09:44:52.210873 7fd3dd8b78c0 -1 ERROR: on disk data
> includes unsupported features:
> compat={},rocompat={},incompat={7=support shec era
On Tue, Oct 6, 2015 at 8:38 AM, Sage Weil wrote:
> Oh.. I bet you didn't upgrade the osds to 0.94.4 (or latest hammer build)
> first. They won't be allowed to boot until that happens... all upgrades
> must stop at 0.94.4 first.
This sounds pretty crucial. Are there Redmine ticket(s)?
- Ken
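For anyone hitting the same wall, a rough sketch of the order Sage describes
(package upgrade commands depend on the distro and are omitted):
  ceph osd set noout        # avoid rebalancing while daemons restart
  # upgrade every node to 0.94.4 (or the latest hammer build), restarting
  # mons first, then OSDs, one node at a time; verify with:
  ceph tell osd.* version
  ceph osd unset noout
  # only then move on to infernalis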
_
On Tue, 6 Oct 2015, Ken Dreyer wrote:
> On Tue, Oct 6, 2015 at 8:38 AM, Sage Weil wrote:
> > Oh.. I bet you didn't upgrade the osds to 0.94.4 (or latest hammer build)
> > first. They won't be allowed to boot until that happens... all upgrades
> > must stop at 0.94.4 first.
>
> This sounds pretty
> Hi,
>
> Very interesting! Did you upgrade the kernel on both the OSDs and clients or
> just some of them? I remember there were some kernel performance
> regressions a little while back. You might try running perf during your tests
> and look for differences. Also, iperf might be worth tryin
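(Concretely, something along these lines on both kernels; host names are
placeholders:)
  iperf -s                                 # on one storage node
  iperf -c storage-node-1 -P 4 -t 30       # from another node, compare 3.13 vs 3.19
  perf top -p $(pidof ceph-osd | awk '{print $1}')   # where one OSD spends its time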
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
This was from the monitor (can't bring it up with Hammer now, complete
cluster is down, this is only my lab, so no urgency).
I got it up and running this way:
1. Upgraded the mon node to Infernalis and started the mon.
2. Downgraded the OSDs to to-be
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
OK, an interesting point. Running ceph version 9.0.3-2036-g4f54a0d
(4f54a0dd7c4a5c8bdc788c8b7f58048b2a28b9be) looks a lot better. I got
messages when the OSD was marked out:
2015-10-06 11:52:46.961040 osd.13 192.168.55.12:6800/20870 81 :
cluster [WR
On Tue, 6 Oct 2015, Robert LeBlanc wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> OK, an interesting point. Running ceph version 9.0.3-2036-g4f54a0d
> (4f54a0dd7c4a5c8bdc788c8b7f58048b2a28b9be) looks a lot better. I got
> messages when the OSD was marked out:
>
> 2015-10-06 11:52:
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
I can't think of anything. In my dev cluster the only thing that has
changed is the Ceph versions (no reboot). What I like is that even though
the disks are 100% utilized, it is performing as I expect now. Client
I/O is slightly degraded during the recove
On Tue, 6 Oct 2015, Robert LeBlanc wrote:
> -BEGIN PGP SIGNED MESSAGE-
> Hash: SHA256
>
> I can't think of anything. In my dev cluster the only thing that has
> changed is the Ceph versions (no reboot). What I like is that even though
> the disks are 100% utilized, it is performing as I expect
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
I'll capture another set of logs. Is there any other debugging you
want turned up? I've seen the same thing where I see the message
dispatched to the secondary OSD, but the message just doesn't show up
for 30+ seconds in the secondary OSD logs.
- ---
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
On my second test (a much longer one), it took nearly an hour, but a
few messages have popped up over a 20 window. Still far less than I
have been seeing.
-
Robert LeBlanc
PGP Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62
I'm wondering if you are hitting the "bug" with the readahead changes?
I know the changes to limit readahead to 2MB were introduced in 3.15, but I
don't know if they were backported into 3.13 or not. I have a feeling this may
also limit the maximum request size to 2MB.
If you look in iostat do y
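(A quick way to check both suspicions; the device name is a placeholder for an
OSD data disk:)
  cat /sys/block/sda/queue/read_ahead_kb    # 2048 here would match a 2MB cap
  cat /sys/block/sda/queue/max_sectors_kb   # ceiling on individual request size
  iostat -x 1 sda                           # watch avgrq-sz for the actual request sizes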
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256
I upped the debug on about everything and ran the test for about 40
minutes. I took OSD.19 on ceph1 down and then brought it back in.
There was at least one op on osd.19 that was blocked for over 1,000
seconds. Hopefully this will have something that
Hello,
a bit of back story first; it may prove educational for others and future
generations.
As some may recall, I have a firefly production cluster with a storage node
design that was both optimized for the use case at the time and with an
estimated capacity to support 140 VMs (all running the s
Hi,
proxmox 4.0 has been released:
http://forum.proxmox.com/threads/23780-Proxmox-VE-4-0-released!
Some Ceph improvements:
- lxc containers with krbd support (multiple disks + snapshots)
- qemu with jemalloc support (improves librbd performance)
- qemu iothread option per disk (improves scaling
Hi,
I have a cluster of one monitor and eight OSDs. These OSDs are running on four
hosts (each host has two OSDs). When I set up everything and started Ceph, I got
this:
esta@monitorOne:~$ sudo ceph -s
[sudo] password for esta:
cluster 0b9b05db-98fe-49e6-b12b-1cce0645c015
health HEALTH_W
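(Generic next steps to see what the warning actually is; nothing here is specific
to this cluster:)
  ceph health detail          # spells out which PGs/OSDs trigger the warning
  ceph osd tree               # check all 8 OSDs are up and in across the 4 hosts
  ceph osd dump | grep pool   # check pool size/min_size vs. the number of hosts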
Hello,
On Wed, 7 Oct 2015 12:57:58 +0800 (CST) wikison wrote:
This is a very old bug/misfeature.
It creeps up every week or so here; google is your friend.
> Hi,
> I have a cluster of one monitor and eight OSDs. These OSDs are running
> on four hosts (each host has two OSDs). When I set up ev
Hi Christian,
Interesting use case :-) How many OSDs / hosts do you have? And how are they
connected together?
Cheers
On 07/10/2015 04:58, Christian Balzer wrote:
>
> Hello,
>
> a bit of back story first; it may prove educational for others and future
> generations.
>
> As some may recall, I