AH! Sorry for the false alarm, I clearly have a hard drive problem:
ata2.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
ata2.00: BMDMA stat 0x24
ata2.00: failed command: READ DMA
ata2.00: cmd c8/00:08:38:bd:70/00:00:00:00:00/ef tag 0 dma 4096 in
res 51/40:00:3f:bd:70/40:00:2
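A quick way to confirm a failing drive from errors like these is to look at its SMART data (a sketch; /dev/sda is a placeholder for whichever disk ata2 maps to):
# dump SMART health, attributes and the drive's error log
smartctl -a /dev/sda
# kick off a short self-test and re-check the results a few minutes later
smartctl -t short /dev/sda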
Hmm, looks like leveldb is hitting a problem. Is there anything in the
kernel log (dmesg) that suggests a disk or file system problem? Are you
able to, say, tar up the current/omap directory without problems?
This is a single OSD, right? None of the others have been upgraded yet?
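Something along these lines would exercise the reads (a sketch; the OSD data path matches the ls output further down, adjust to the failing OSD):
# look for disk or filesystem errors around the time of the crash
dmesg | tail -n 200 | grep -iE 'error|ata|xfs|ext4'
# force a full read of everything under omap by tarring it somewhere else
tar czf /tmp/omap-check.tar.gz -C /var/lib/ceph/osd/us-west01-0/current omap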
sage
On T
[root@ceph-node20 ~]# ls /var/lib/ceph/osd/us-west01-0/current
0.10_head 0.1a_head 0.23_head 0.2c_head 0.37_head 0.3_head 0.b_head
1.10_head 1.1b_head 1.24_head 1.2c_head 1.3a_head 1.b_head 2.16_head
2.1_head 2.2a_head 2.32_head 2.3a_head 2.a_head omap
0.11_head 0.1d_head
On Thu, Nov 13, 2014 at 3:11 PM, Anthony Alba wrote:
> Thanks! What happens when the lone rule fails? Is there a fallback
> rule that will place the blob in a random PG? Say I misconfigure, and
> my choose/chooseleaf don't add up to pool min size.
There's no built-in fallback rule or anything li
On Thu, 13 Nov 2014, Joshua McClintock wrote:
> I upgraded my mons to the latest version and they appear to work, I then
> upgraded my mds and it seems fine.
> I then upgraded one OSD node and the OSD fails to start with the following
> dump, any help is appreciated:
>
> --- begin dump of recent
Hi,
Sure, Thanks.
As described in the link, is downgrading to 2014.1.10 the only way to avoid this issue?
vagrant@precise64:~$ dpkg -l | grep salt
ii salt-common 2014.1.13-1precise1 Shared libraries that salt requires
for all packages
ii salt-minion 2014.1.13-1precise1 This packag
I upgraded my mons to the latest version and they appear to work, I then
upgraded my mds and it seems fine.
I then upgraded one OSD node and the OSD fails to start with the following
dump, any help is appreciated:
--- begin dump of recent events ---
0> 2014-11-13 18:20:15.625793 7fbd973ce7a
Hi,
Currently, there's a bug in that salt version for Ubuntu Precise. See
https://github.com/saltstack/salt/issues/17227
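If downgrading is the route you take, something like this should work on precise (a sketch; check which 2014.1.10 package revision your repo actually carries):
# list the salt versions the configured repos offer
apt-cache madison salt-minion salt-common
# pin to the 2014.1.10 revision reported above (the version string here is a placeholder)
apt-get install salt-common=2014.1.10-1precise1 salt-minion=2014.1.10-1precise1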
On 11/14/14 10:07 AM, idzzy wrote:
Hi,
vagrant@precise64:/git$ salt-call --version
salt-call 2014.1.13 (Hydrogen)
As of now I can see the calamari-client pkg in vagrant
Hi,
vagrant@precise64:/git$ salt-call --version
salt-call 2014.1.13 (Hydrogen)
As of now I can see the calamari-client pkg in the vagrant:/git directory.
Does this mean the package build succeeded?
But then what about the error message which I sent in my previous mail?
vagrant@precise64:/git$ ls -l
total 3340
Hi,
Which version are you currently running?
# salt-call --version
On 11/14/14 9:34 AM, idzzy wrote:
Hello,
I'm trying to setup calamari with reference to
http://ceph.com/category/ceph-gui/.
I could create the calamari server package, but the creation of the
calamari client package failed.
Foll
Hello,
I’m trying to setup calamari with reference to
http://ceph.com/category/ceph-gui/.
I could create the calamari server package, but the creation of the calamari
client package failed.
Following is the procedure. The build process failed. How can I fix this?
# git clone https://github.com/ceph/cal
Hello,
On Wed, 12 Nov 2014 17:29:43 +0100 Christoph Adomeit wrote:
> Hi,
>
> I installed a Ceph cluster with 50 OSDs on 4 hosts and finally I am
> really happy with it.
>
> Linux and Windows VMs run really fast in KVM on the Ceph Storage.
>
> Only my Solaris 10 guests are terribly slow on cep
Thanks! What happens when the lone rule fails? Is there a fallback
rule that will place the blob in a random PG? Say I misconfigure, and
my choose/chooseleaf don't add up to pool min size.
(This also explains why all examples in the wild use only 1 rule per ruleset.)
On Fri, Nov 14, 2014 at 7:
On Thu, Nov 13, 2014 at 2:58 PM, Anthony Alba wrote:
> Hi list,
>
>
> When there are multiple rules in a ruleset, is it the case that "first
> one wins"?
>
> When a rule fails, does it fall through to the next rule?
> Are min_size, max_size the only determinants?
>
> Are there any examples?
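For reference, the single-rule pattern the docs show looks like this (a sketch; the bucket name, rule name and size bounds are illustrative):
rule replicated_ruleset {
        ruleset 0
        type replicated
        # the rule only applies to pools whose size falls in [min_size, max_size]
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}
If the steps cannot map enough OSDs, the affected PGs simply stay degraded/undersized; CRUSH does not fall through to another rule.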
Hi Christoph,
Am 12.11.2014 17:29, schrieb Christoph Adomeit:
> Hi,
>
> I installed a Ceph cluster with 50 OSDs on 4 hosts and finally I am really
> happy with it.
>
> Linux and Windows VMs run really fast in KVM on the Ceph Storage.
>
> Only my Solaris 10 guests are terribly slow on ceph rbd
Hi list,
When there are multiple rules in a ruleset, is it the case that "first
one wins"?
When a rule fails, does it fall through to the next rule?
Are min_size, max_size the only determinants?
Are there any examples? The only examples I've seen put one rule per
ruleset (e.g. the docs ha
This appears to be a buggy libtcmalloc. Ceph hasn't gotten to main() yet
from the looks of things.. tcmalloc is still initializing.
Hopefully Fedora has a newer version of the package?
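Checking and updating it would look roughly like this (a sketch; gperftools-libs is assumed to be the Fedora package that ships libtcmalloc):
# see which tcmalloc build is installed
rpm -q gperftools-libs
# pull in whatever newer build the repos have
yum update gperftools-libs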
sage
On Thu, 13 Nov 2014, Harm Weites wrote:
> Hi Sage,
>
> Here you go: http://paste.openstack.org/show/1
Hi Sage,
Here you go: http://paste.openstack.org/show/132936/
Harm
Op 13-11-14 om 00:44 schreef Sage Weil:
> On Wed, 12 Nov 2014, Harm Weites wrote:
>> Hi,
>>
>> When trying to add a new OSD to my cluster the ceph-osd process hangs:
>>
>> # ceph-osd -i $id --mkfs --mkkey
>>
>>
>> At this point
On 11/13/2014 10:17 AM, David Moreau Simard wrote:
> Running into weird issues here as well in a test environment. I don't have a
> solution either but perhaps we can find some things in common..
>
> Setup in a nutshell:
> - Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with separa
That's interesting.
Although I'm running 3.16.7 and I'd expect the patch to be in already,
I'll downgrade to the "working" 3.16.0 kernel and report back if this fixes the
issue.
Thanks for the pointer.
--
David Moreau Simard
> On Nov 13, 2014, at 1:15 PM, German Anders wrote:
>
> Is po
Are there any special parameters (or best practices) regarding the offload
settings for the NICs? I've got two ports: p4p1 (public net) and p4p2
(Cluster internal), the cluster internal has MTU 9000 across all the
OSD servers and of course on the SW ports:
ceph@cephosd01:~$ ethtool -k p4p1
Features for p4p1
> >> Indeed, there must be something! But I can't figure it out yet. Same
> >> controllers, tried the same OS, direct cables, but the latency is 40%
> >> higher.
Wido,
just an educated guess:
Did you check the offload settings of your NIC?
ethtool -k should show you that.
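For example (a sketch; p4p1 is the interface from the earlier mail, and whether disabling any of these actually helps is something to measure):
# show the current offload settings
ethtool -k p4p1
# toggle the common offloads off for a test run
ethtool -K p4p1 gro off gso off tso off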
- Stepha
Moreover, if I restart the service on ceph-node1, which is the
initial monitor and has an OSD and MDS:
[root@ceph-node1 ~]# service ceph restart
=== mon.ceph-node1 ===
=== mon.ceph-node1 ===
Stopping Ceph mon.ceph-node1 on ceph-node1...kill 1215...done
=== mon.ceph-node1 ===
Starting Ceph mon
Hi all,
Just providing an update to this -- I started the mds daemon on a new server
and rebooted a box with a hung CephFS mount (from the first crash) and the
problem seems to have gone away.
I'm still not sure why the mds was shutting down with a "Caught signal",
though.
Cheers,
Lincoln
Hi,
thank you for your answer:
On 11/13/2014 06:17 PM, Gregory Farnum wrote:
What does "ceph -s" output when things are working?
Does the ceph.conf on your admin node
BEFORE the problem (from ceph -w, because I don't have ceph -s output):
[rzgceph@admin-node my-cluster]$ ceph -w
cluster 6fa39bb3-de
What does "ceph -s" output when things are working?
Does the ceph.conf on your admin node
contain the address of each monitor? (Paste in the relevant lines.) It will
need to, or the ceph tool won't be able to find the monitors even though the
system is working.
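The relevant lines look something like this (a sketch; the names match the nodes in this thread, the addresses are placeholders):
[global]
# the ceph CLI uses these entries to find a monitor to talk to
mon initial members = ceph-node1, ceph-node2, ceph-node3
mon host = 192.168.1.11, 192.168.1.12, 192.168.1.13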
-Greg
On Thu, Nov 13, 2014 at 9:11 AM
Hi,
On 11/13/2014 06:05 PM, Artem Silenkov wrote:
Hello!
Only 1 monitor instance? It won't work in most cases.
Make more and ensure quorum for survivability.
No, three monitor instances, one for each ceph-node, as designed in the
quick-ceph-deploy guide.
I tried to kill one of them (the in
Hello!
Only 1 monitor instance? It won't work in most cases.
Make more and ensure quorum for survivability.
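To see how many monitors are in the map and which ones are in quorum (assuming a working admin keyring on the node):
# one-line summary of the monmap and current quorum
ceph mon stat
# more detail, including which monitor is the leader
ceph quorum_status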
Regards, Silenkov Artem
---
artem.silen...@gmail.com
2014-11-13 20:02 GMT+03:00 Luca Mazzaferro :
> Dear Users,
> I followed the instruction of the storage cluster quick start here
Hi Cephers,
Overnight, our MDS crashed, failing over to the standby, which also crashed!
Upon trying to restart them this morning, I find that they no longer start and
always seem to crash on the same file in the logs. I've pasted part of a "ceph
mds tell 0 injectargs '--debug-mds 20 --debug-ms
Dear Users,
I followed the instruction of the storage cluster quick start here:
http://ceph.com/docs/master/start/quick-ceph-deploy/
I simulated a small storage cluster with 4 VMs: ceph-node[1,2,3] and an admin-node.
Everything worked fine until I shut down the initial monitor node
(ceph-node1).
Also
Hi Sage,
Thank you for your answer.
So, there is no anticipated problem with how I did it?
Does the 'data' pool's performance directly affect my filesystem
performance, even if there are no files on it?
Do I need to have the same performance policy on the 'data' pool as on
the other pools?
Can I us
On Thu, 13 Nov 2014, Thomas Lemarchand wrote:
> Hi Ceph users,
>
> I need to have different filesystem trees in different pools, mainly for
> security reasons.
>
> So I have ceph users (cephx) with specific access on specific pools.
>
> I have one metadata pool ('metadata') and three data pools (
Running into weird issues here as well in a test environment. I don't have a
solution either but perhaps we can find some things in common..
Setup in a nutshell:
- Ceph cluster: Ubuntu 14.04, Kernel 3.16.7, Ceph 0.87-1 (OSDs with separate
public/cluster network in 10 Gbps)
- iSCSI Proxy node (ta
Hi Ceph users,
I need to have different filesystem trees in different pools, mainly for
security reasons.
So I have ceph users (cephx) with specific access on specific pools.
I have one metadata pool ('metadata') and three data pools ('data',
'wimi-files', 'wimi-recette-files').
I used file layou
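For context, pointing a directory tree at one of those pools looks roughly like this (a sketch; the mount point is a placeholder and the pool must already exist):
# allow CephFS to place file data in the extra pool
ceph mds add_data_pool wimi-files
# direct everything created under this directory to that pool
setfattr -n ceph.dir.layout.pool -v wimi-files /mnt/cephfs/wimi-files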
Hi,
The Ceph cluster we are running had a few OSDs approaching 95% usage a week
or so ago, so I ran a reweight to balance it out and, in the meantime,
instructed the application to purge data it no longer required. But after a
large amount of data was purged from the application side (all OSDs' usage
dropped below 20%), the cl
Hi,
On Thu Nov 13 2014 at 3:35:55 PM Anthony Alba
wrote:
> Ah no.
> On 13 Nov 2014 21:49, "Dan van der Ster"
> wrote:
>
>> Hi,
>> Did you mkjournal the reused journal?
>>
>>ceph-osd -i $ID --mkjournal
>>
>> Cheers, Dan
>>
>
> No - however the man page states that "--mkjournal" is for :
> "
Ah no.
On 13 Nov 2014 21:49, "Dan van der Ster" wrote:
> Hi,
> Did you mkjournal the reused journal?
>
>ceph-osd -i $ID --mkjournal
>
> Cheers, Dan
>
No - however the man page states that "--mkjournal" is for :
"Create a new journal file to match an existing object repository. This is
usef
Hi,
Did you mkjournal the reused journal?
ceph-osd -i $ID --mkjournal
Cheers, Dan
On Thu Nov 13 2014 at 2:34:51 PM Anthony Alba
wrote:
> When I create a new OSD with a block device as journal that has
> existing data on it, ceph hits a FAILED assert. The block device
> is a journal fr
When I create a new OSD with a block device as journal that has
existing data on it, ceph hits a FAILED assert. The block device
is a journal from a previous experiment. It can safely be
overwritten.
If I zero the block device with dd if=/dev/zero bs=512 count=1000
of=MyJournalDev
then the a
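For what it's worth, the two approaches mentioned in this thread look like this back to back (a sketch; the device and OSD id are placeholders, and zeroing obviously destroys whatever is on the device):
# wipe the stale journal header left over from the previous experiment
dd if=/dev/zero of=/dev/sdX bs=512 count=1000
# or, for an OSD that already has an object store, recreate its journal
ceph-osd -i $ID --mkjournal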
On 12-11-14 21:12, Udo Lembke wrote:
> Hi Wido,
> On 12.11.2014 12:55, Wido den Hollander wrote:
>> (back to list)
>>
>>
>> Indeed, there must be something! But I can't figure it out yet. Same
>> controllers, tried the same OS, direct cables, but the latency is 40%
>> higher.
>>
>>
> perhaps someth
Hi David,
Yes, it's cloned to the ceph folder. It's only that module which seems to
complain, which is a bit odd.
I might try and pop onto IRC at some point.
Many Thanks,
Nick
-Original Message-
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
David Moreau Simard
Se