If your issue is caused by the bug I presume, you need to use the
newest client (0.72 ceph-fuse or 3.12 kernel)
Regards
Yan, Zheng
Hi,
We are running 0.72-1 throughout the cluster but on kernel
2.6.32-358.6.2.el6.x86_64... This is a big deployment (~1 cores),
not so easy to update the kernel.
Thanks.
--
Regards
Dominik
2013/12/3 Yehuda Sadeh :
> For bobtail at this point yes. You can try the unofficial version with
> that fix off the gitbuilder. Another option is to upgrade everything
> to dumpling.
>
> Yehuda
>
> On Mon, Dec 2, 2013 at 10:24 PM, Dominik Mostowiec
> wrote:
>> Thanks
Dear ceph users,
I have a ceph cluster running 0.67.4. Two osd are down in "ceph -s".
They are still there in ps and I can't stop them (service ceph stop osd.x
or kill or even kill -9)!
Any idea?
> Dear ceph users,
>
> I have a ceph cluster running 0.67.4. Two osd are down in "ceph -s".
> They are still there in ps and I can't stop them (service ceph stop osd.x
> or kill or even kill -9)!
>
> Any idea?
>
Do you know why the OSDs went down? If their state is D in 'ps -ax' then check
dmesg.
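A generic way to check both (not specific to this cluster) would be something like:
$ ps -eo pid,stat,comm | grep ceph-osd    # a 'D' in the STAT column means uninterruptible I/O wait
$ dmesg | tail -n 50                      # look for hung-task, XFS or controller errors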
Hi Emmanuel,
Funny that you are 20 meters away :-) From the dmesg output you just showed me
it looks like you've been hit by
http://tracker.ceph.com/issues/6301 ceph-osd hung by XFS using linux 3.10
Cheers
On 03/12/2013 10:12, Emmanuel Lacour wrote:
>
> Dear ceph users,
>
>
> I have a ceph
On Tue, Dec 03, 2013 at 10:12:03AM +0100, Emmanuel Lacour wrote:
>
> Dear ceph users,
>
>
> I have a ceph cluster running 0.67.4. Two osd are down in "ceph -s".
> They are still there in ps and I can't stop them (service ceph stop osd.x
> or kill or even kill -9)!
>
this seems related to some
Hi Brian and Robert,
Thanks for your replies! Appreciate.
Can I safely say that there will be no downtime to the cluster when I
increase the pg_num and pgp_num values?
Looking forward to your reply, thank you.
Cheers.
On Tue, Dec 3, 2013 at 2:31 PM, Robert van Leeuwen <
robert.vanleeu...@spi
> On 3 dec. 2013, at 10:49, "Indra Pramana" wrote:
>
> Hi Brian and Robert,
>
> Thanks for your replies! Appreciate.
>
> Can I safely say that there will be no downtime to the cluster when I
> increase the pg_num and pgp_num values?
There is no actual downtime.
What I did see is that IOs will
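For reference, the increase itself is only two commands; the pool name and PG count
below are just examples and have to be sized for your own cluster:
$ ceph osd pool set rbd pg_num 1024     # split the PGs first
$ ceph osd pool set rbd pgp_num 1024    # then raise pgp_num, which triggers the actual data movement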
Hi,
Upgraded to emperor, restarted all nodes.
Still have "31 active+remapped" pgs.
Compared remapped and healthy pg query output - some remapped pgs do
not have data, some do, some have been scrubbed, some haven't. Now
running read for whole rbd - may be that would trigger those stuck
pgs.
state on
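For anyone who wants to repeat the comparison, the per-PG detail comes from pg query
(the PG id below is just an example):
$ ceph pg dump_stuck unclean           # list the stuck / remapped PGs
$ ceph pg 3.7f query > pg-3.7f.json    # full state of one PG, for diffing against a healthy one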
Hi Ceph,
When an SSD partition is used to store a journal
https://github.com/ceph/ceph/blob/master/src/os/FileJournal.cc#L90
how is it trimmed ?
http://en.wikipedia.org/wiki/Trim_%28computing%29
Cheers
--
Loïc Dachary, Artisan Logiciel Libre
Since the journal partitions are generally small, it shouldn't need to
be.
For example, implement it with substantial under-provisioning, either via
HPA or simple partitions.
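As a rough sketch, assuming a fresh device (the device name and sizes are placeholders),
either of these achieves it:
$ parted /dev/sdX mklabel gpt
$ parted /dev/sdX mkpart journal 0% 80%    # leave ~20% of the device unallocated
$ hdparm -Np<sector-count> --yes-i-know-what-i-am-doing /dev/sdX    # or shrink the visible capacity via HPA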
On 2013-12-03 12:18, Loic Dachary wrote:
Hi Ceph,
When an SSD partition is used to store a journal
https://github.com/c
On Tue, Dec 03, 2013 at 12:38:54PM +, James Pearce wrote:
> Since the journal partitions are generally small, it shouldn't need
> to be.
>
here with 2 journals (2 osds) on two ssd (samsung 850 pro, soft raid1 +
lvm + xfs) trim is just obligatory. We forgot to set it at cluster setup
and one w
How much (%) is left unprovisioned on those (840s?) ? And were they
trim'd/secure erased before deployment?
On 2013-12-03 12:45, Emmanuel Lacour wrote:
On Tue, Dec 03, 2013 at 12:38:54PM +, James Pearce wrote:
Since the journal partitions are generally small, it shouldn't need
to be.
h
On Tue, Dec 03, 2013 at 12:48:21PM +, James Pearce wrote:
> How much (%) is left unprovisioned on those (840s?) ? And were they
> trim'd/secure erased before deployment?
>
unfortunately, everything was provisioned (though there is free space
in the VG) due to lack of knowledge.
Nothing sp
Most likely. When fully provisioned the device has a much smaller pool
of cells to manage (i.e. charge) in the background, hence once that pool
is exhausted the device has no option but to stall whilst it clears
(re-charges) a cell, which takes something like 2-5ms.
Daily cron task is though
Hi Kyle,
All OSDs are SATA drives in JBOD. The journals are all on a pair of SAS
in RAID0. All of those are on a shared backplane with a single RAID
controller (8 ports -> 12 disks).
I also have a pair of SAS in RAID1 holding the OS, which may be on a
different port/data-path. I am going to
I'm trying to deploy Ceph on a group of Raspberry Pis using the procedure
documented in: http://ceph.com/docs/master/start/quick-ceph-deploy/
There used to be a site: http://ceph.com/docs/master/start/quick-start/ but
that page is no longer valid.
The first thing I noticed is that the command lsb
I would really appreciate it if someone could:
- explain why the journal setup is way more important than striping
settings;
I'm not sure if it's what you're asking, but any write must be
physically written to the journal before the operation is acknowledged.
So the overall cluster performa
On Tue, Dec 03, 2013 at 01:22:50PM +, James Pearce wrote:
>
> Daily cron task is though still a good idea - enabling discard mount
option is generally counter-productive since trim is issued way too
> often, destroying performance (in my testing).
>
yes that's why we are using cron here.
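For anyone else doing the same, the cron job can be as simple as this (the mount point
is just an example):
$ cat /etc/cron.daily/fstrim-journals
#!/bin/sh
# trim the filesystem holding the SSD journals once a day
fstrim /srv/ceph/journals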
On Tue, Dec 3, 2013 at 8:55 AM, Shlomo Dubrowin wrote:
> I'm trying to deploy Ceph on a group of Raspberry Pis using the procedure
> documented in: http://ceph.com/docs/master/start/quick-ceph-deploy/
>
> There used to be a site: http://ceph.com/docs/master/start/quick-start/ but
> that page is no
Alfredo,
Thank you for your response. I simply did apt-get install ceph on the
nodes.
My /etc/apt/sources.list.d/ceph.list contains:
deb http://ceph.com/debian-emperor/ wheezy main
and the versions I received are what I got.
Shlomo
-
Shlomo Dubrowin
The Soluti
On Tue, Dec 3, 2013 at 9:21 AM, Shlomo Dubrowin wrote:
> Alfredo,
>
> Thank you for your response. I simply did apt-get install ceph on the
> nodes.
>
> My /etc/apt/sources.list.d/ceph.list contains:
>
> deb http://ceph.com/debian-emperor/ wheezy main
>
Was that added manually? ceph-deploy can t
Alfredo,
I started that way, but I run into an error:
$ ceph-deploy install baxter
[ceph_deploy.cli][INFO ] Invoked (1.3.3): /usr/bin/ceph-deploy install
baxter
[ceph_deploy.install][DEBUG ] Installing stable version emperor on cluster
ceph hosts baxter
[ceph_deploy.install][DEBUG ] Detecting pl
On Tue, Dec 3, 2013 at 9:56 AM, Shlomo Dubrowin wrote:
> Alfredo,
>
> I started that way, but I run into an error:
>
> $ ceph-deploy install baxter
> [ceph_deploy.cli][INFO ] Invoked (1.3.3): /usr/bin/ceph-deploy install
> baxter
> [ceph_deploy.install][DEBUG ] Installing stable version emperor o
Hi,
what do you think to use the same SSD as journal and as root partition?
For example:
1x 128GB SSD
6 OSD
15GB for each journal, for each OSD
5GB as root partition for OS.
This gives me 105GB of used space and 23GB of unused space (I've read
somewhere that it is better to not use the whole SSD f
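As a sketch of that layout with sgdisk (the device name is a placeholder and the sizes
are up to you):
$ sgdisk -n 1:0:+5G -c 1:os-root /dev/sdX       # 5GB root partition
$ sgdisk -n 2:0:+15G -c 2:journal-0 /dev/sdX    # repeat for partitions 2..7, one 15GB journal per OSD
# whatever is left at the end of the device stays unallocated for wear levelling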
On 03/12/2013 13:38, James Pearce wrote:
> Since the journal partitions are generally small, it shouldn't need to be.
>
> For example implement with substantial under-provisioning, either via HPA or
> simple partitions.
>
Does that mean the problem will happen much later or that it will never
Robert,
Interesting results on the effect of # of PG/PGPs. My cluster struggles
a bit under the strain of heavy random small-sized writes.
The IOPS you mention seem high to me given 30 drives and 3x replication
unless they were pure reads or on high-rpm drives. Instead of assuming,
I want to
Hi ceph-users,
I've been playing around with radosgw and I notice there is an
inconsistency between the Ubuntu and CentOS startup scripts.
On Ubuntu, if I run a start ceph-all (which will start radosgw), or I run
the init script /etc/init.d/radosgw start - the radosgw process starts up
fine, but
Guys, I don't think we have pre-released packages of anything new that
is going to work on the pi regardless if you use ceph-deploy. Look at
our armhf packages file:
http://ceph.com/debian-emperor/dists/wheezy/main/binary-armhf/Packages
Unless I'm mistaken, you're going to have to compile it
Hi all,
I have some questions about the journaling logs.
My setup: I have 3 hosts, each having 12 * 4TB sata disks and 2 *
200GB ssd disks, 12 cores (2 hexacores) and 32GB RAM per host.
- I read that it is not a good idea to put the OS on the same SSD as
the journals
- I also read that it
On Tue, Dec 3, 2013 at 10:21 AM, Mark Nelson wrote:
> Guys, I don't think we have pre-released packages of anything new that is
> going to work on the pi regardless if you use ceph-deploy. Look at our
> armhf packages file:
>
> http://ceph.com/debian-emperor/dists/wheezy/main/binary-armhf/Package
OK,
So I'll need to do the installation manually, but the rest of the commands
I should run via ceph-deploy? What version should I be trying to grab for
the manual compilation? Should I be grabbing from git or is there a better
place?
Shlomo
-
Shlomo Dubrowin
The
On 12/03/2013 03:21 PM, Mark Nelson wrote:
Guys, I don't think we have pre-released packages of anything new that
is going to work on the pi regardless if you use ceph-deploy. Look at
our armhf packages file:
http://ceph.com/debian-emperor/dists/wheezy/main/binary-armhf/Packages
Unless I'm mis
Hi Mike,
I am using filebench within a KVM virtual machine (like an actual workload we will
have).
Using 100% synchronous 4k writes with a 50GB file on a 100GB volume with 32
writer threads.
Also tried from multiple KVM machines from multiple hosts.
Aggregate performance stays at 2k+ IOPS
The disks a
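For comparison, a roughly equivalent synchronous 4k random-write test with fio instead of
filebench (run inside the guest; the file path is arbitrary) would be:
$ fio --name=syncwrite --filename=/data/testfile --rw=randwrite --bs=4k \
      --size=50g --numjobs=32 --sync=1 --direct=1 --group_reporting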
It shouldn't happen, provided the sustained write rate doesn't exceed
the sustained erase capabilities of the device I guess. Daily fstrim
will not hurt though.
Essentially the mapping between LBAs and physical cells isn't
persistent in SSDs (unlike LBA and physical sectors on an HDD).
Prov
Robert,
Do you have rbd writeback cache enabled on these volumes? That could
certainly explain the higher than expected write performance. Any chance
you could re-test with rbd writeback on vs. off?
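For the record, the client-side switch is the rbd cache option in ceph.conf on the
hypervisor; a minimal sketch (the size shown is just the default):
[client]
    # enable writeback caching for librbd clients (32MB is the default cache size)
    rbd cache = true
    rbd cache size = 33554432
and the same section with rbd cache = false for the comparison run.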
Thanks,
Mike Dawson
On 12/3/2013 10:37 AM, Robert van Leeuwen wrote:
Hi Mike,
I am using fi
On Tue, 3 Dec 2013, Andy McCrae wrote:
> Hi ceph-users,
> I've been playing around with radosgw and I notice there is an inconsistency
> between the Ubuntu and CentOS startup scripts.
>
> On Ubuntu, if I run a start ceph-all (which will start radosgw), or I run
> the init script /etc/init.d/radosg
Tracked it down - it's the start-stop-daemon command flag, -u isn't for user
change, that should be -c, so I'll submit a fix soon.
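For anyone curious, the distinction is that -u/--user only matches processes owned by
that user (useful with --stop), while -c/--chuid actually switches to that user before
starting the daemon, roughly like this (the user and arguments here are illustrative):
start-stop-daemon --start --chuid apache -x /usr/bin/radosgw -- -n client.radosgw.gateway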
On Tue, Dec 3, 2013 at 4:06 PM, Sage Weil wrote:
> On Tue, 3 Dec 2013, Andy McCrae wrote:
> > Hi ceph-users,
> > I've been playing around with radosgw and I notice t
Not specifically set, according to the docs the default is off...
I am using the async qemu cuttlefish rpm.
Maybe it does cache something but I think not:
Specifically setting writeback on in the client config yielded different
results:
In our DEV environment we had issues with the virtual becom
Hello Cephers
Need your guidance
In my setup the ceph cluster and OpenStack are working well, and I am able to create
volumes using cinder as well.
What I want is to mount a ceph volume to a VM instance, but I am getting errors
like this. Expecting your help with this.
[root@rdo /(keystone_admin)
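For reference, the attach flow being attempted is the standard one (the IDs and device
name below are placeholders):
$ cinder create --display-name test-vol 10                  # 10GB RBD-backed volume
$ nova volume-attach <instance-id> <volume-id> /dev/vdb     # nova-compute also needs the libvirt secret for the cinder rbd user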
On 12/03/2013 03:38 PM, Shlomo Dubrowin wrote:
Joao,
Is there a reason you aren't putting OSDs on the Pis? Do you expect that
the OSD won't run on the Pi?
The question really isn't about why I'm not putting the OSDs on the pi's
but why I'd prefer to put them on the Cubietruck: unlike the pi,
I found the network to be the most limiting factor in Ceph.
Any chance to move to 10G+ networking would be beneficial.
I did have success with bonding, and just doing a simple round-robin (RR) setup increased the
throughput.
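For reference, assuming a Debian-style /etc/network/interfaces setup, a simple
round-robin bond looks roughly like this (interface names and addresses are examples):
auto bond0
iface bond0 inet static
    address 10.0.0.10
    netmask 255.255.255.0
    bond-slaves eth0 eth1
    bond-mode balance-rr
    bond-miimon 100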
On Mon, Dec 2, 2013 at 10:17 PM, Kyle Bader wrote:
> > Is having two cluster networks like this a supporte
CRUSH is failing to map all the PGs to the right number of OSDs.
You've got a completely empty host which has ~1/3 of the cluster's
total weight, and that is probably why — remove it!
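Something along these lines, where the bucket name is a placeholder:
$ ceph osd tree                         # shows the empty host bucket carrying weight but no OSDs
$ ceph osd crush remove <empty-host>    # drop the bucket so CRUSH stops assigning it weight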
-Greg
Software Engineer #42 @ http://inktank.com | http://ceph.com
On Tue, Dec 3, 2013 at 3:13 AM, Ugis wrote:
>
Crystal clear, thanks for the tutorial :-)
On 03/12/2013 16:52, James Pearce wrote:
> It shouldn't happen, provided the sustained write rate doesn't exceed the
> sustained erase capabilities of the device I guess. Daily fstrim will not
> hurt though.
>
> Essentially the mapping between LBAs an
I got them and they appear to work.
Thanks.
On 12/02/2013 04:21 PM, Dan Van Der Ster wrote:
You need the rhev package.
https://www.mail-archive.com/ceph-users@lists.ceph.com/msg05962.html
On Dec 2, 2013 10:18 PM, Alvin Starr wrote:
I have been looking at the src rpm qemu-img-0.12.1.2-2.415.
In trying to download the RPM packages for CEPH, the yum commands timed
out. I then tried just downloading them via Chrome browser (
http://ceph.com/rpm-emperor/el6/x86_64/ceph-0.72.1-0.el6.x86_64.rpm) and it
only downloaded 64KB. (The website www.ceph.com is slow too)
The website is responding quickly now, but I'm getting a very strange error:
===
Downloaded more than max size for
http://ceph.com/rpm-emperor/el6/x86_64/ceph-0.72.1-0.el6.x86_64.rpm:
15221737 > 13835720.
I didn't get this before. Is something outdated on th
Hello Everyone
Still waiting... Any help with this would be highly appreciated.
Many Thanks
Karan Singh
- Original Message -
From: "Karan Singh"
To: ceph-users@lists.ceph.com
Sent: Tuesday, 3 December, 2013 6:27:29 PM
Subject: [ceph-users] Openstack+ceph volume mounting to vm
Hello
Greetings all,
Does anyone have any recommendations for using ceph as a reliable,
distributed backend for any existing parallel filesystems? My final
goal would be to have data reliability and availability handled by ceph
and the serving of a filesystem handled by .. well, a distributed,
parallel
Background
New ceph setup with 3 nodes and a mon running on each node. OSDs are
split up across the nodes. This is a brand new cluster and no data has
been added.
I zapped osd.0 and re-added it and now I am stuck with:
health HEALTH_WARN 12 pgs degraded; 12 pgs stale; 12 pgs stuck stale
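For context, the usual diagnostic commands in this situation would be:
$ ceph health detail         # lists the affected PGs and why they are stuck
$ ceph pg dump_stuck stale   # shows which OSDs the stale PGs were last mapped to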
On 12/02/2013 03:26 PM, Bill Eldridge wrote:
Hi all,
We're looking at using Ceph's copy-on-write for a ton of users'
replicated cloud image environments,
and are wondering how efficient Ceph is for adding user data to base
images -
is data added in normal 4kB or 64kB sizes, or can you specify bl
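For anyone unfamiliar with the layering being discussed, the basic copy-on-write clone
workflow is roughly this, with example image names:
$ rbd create base-image --size 20480 --image-format 2    # format 2 is required for cloning
$ rbd snap create base-image@gold
$ rbd snap protect base-image@gold
$ rbd clone base-image@gold user-image-001               # writes go to the clone; unmodified data is read from the parent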
I don't think there is any inherent limitation to using RADOS or RBD
as a backend for a non-CephFS file system, as CephFS is inherently
built on top of RADOS (though I suppose it doesn't directly use
librados). However, the challenge would be in configuring and tuning
the two independent systems
On Tue, Dec 3, 2013 at 4:00 PM, Miguel Afonso Oliveira
wrote:
>
>> If your issue is caused by the bug I presume, you need to use the
>> newest client (0.72 ceph-fuse or 3.12 kernel)
>>
>> Regards
>> Yan, Zheng
>
>
> Hi,
>
> We are running 0.72-1 throughout the cluster but on kernel
> 2.6.32-358.6.
I have been testing osd on btrfs, and the first thing I notice is that there is
constant write activity when idle.
The write activity hovers between 5Mbytes/second and 30Mbytes/second, and
averages around 9Mbytes/second (as determined by iostat -x 30). On average,
iostat is showing around 90 w/
Hello,
I can't understand an error I have been getting:
HEALTH_WARN pool .rgw.buckets has too few pgs.
Do you have any ideas ?
Some info :
[root@admin ~]# ceph --version
ceph version 0.72.1 (4d923861868f6a15dcb33fef7f50f674997322de)
[root@admin ~]# ceph osd pool get .rgw.buckets pgp_num
pgp_num:
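The warning generally means the pool holds a disproportionate share of the cluster's
objects relative to its PG count; it clears after raising pg_num and pgp_num, e.g.
(the target value is only an example and must be sized for your cluster):
$ ceph osd pool set .rgw.buckets pg_num 512
$ ceph osd pool set .rgw.buckets pgp_num 512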