Has anyone seen something like this: osd id == 2147483647
(2147483647 == 2^31 - 1). Looks like some int casting bug
but I have no idea where to look for it (and I don't know
exact steps to reproduce this - I was just doing osd in/osd out
multiple times to test recovery speed under some client load).
Turbo Boost will not hurt performance. Unless you have 100% load on all cores
it will actually improve performance (vastly, in terms of bursty workloads).
The issue you have could be related to CPU cores going to sleep mode.
Put "intel_idle.max_cstate=3” on the kernel command line (I ran with =2
Hi,
When an erasure coded pool
pool 4 'ecpool' erasure size 5 min_
does not have enough OSDs to map a PG, the missing OSDs show up as 2147483647, and
that's what you have in
[7,2,2147483647,6,10]
In the case of a replicated pool, the missing OSDs would be omitted instead. In
Hammer 214748
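A quick way to list the PGs that ended up in that state (just a sketch, relying
on the plain-text output of pg dump):

    # PGs whose up/acting set contains the "no OSD found" placeholder
    ceph pg dump | grep 2147483647

    # then look at one of them in detail (replace <pgid> with one of the matches)
    ceph pg map <pgid>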
Hi Loic,
Thanks for the quick response. Before I started putting osd in/out there
were no such problems.
Cluster health has been OK. And the second thing is that I'm using a *rack*
failure domain (and there are three racks), so shouldn't there be two missing
OSDs?
mon-01-01525673-bc76-433e-8a68-12578d797b1
Hello,
Thanks for your detailed explanation, and for the pointer to the
"Unexplainable slow request" thread.
After investigating osd logs, disk SMART status, etc., the disk under
osd.71 seems OK, so we restarted the osd... And voilà, problem seems
to be solved! (or at least, the "slow request" me
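(For anyone hitting the same thing, the usual starting points for pinning a
slow request to a specific OSD; osd.71 below only because that's the one in
our case:)

    ceph health detail | grep -i 'slow\|blocked'    # names the OSDs with blocked requests
    ceph daemon osd.71 dump_historic_ops            # run on that OSD's host: recent slow ops with timings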
On 05/26/15 10:06, Jan Schermer wrote:
> Turbo Boost will not hurt performance. Unless you have 100% load on
> all cores it will actually improve performance (vastly, in terms of
> bursty workloads).
> The issue you have could be related to CPU cores going to sleep mode.
Another possibility is tha
I think we (i.e. Christian) found the problem:
We created a test VM with 9 mounted RBD volumes (no NFS server). As soon as he
hit all disks, we started to experience these 120-second timeouts. We realized
that the QEMU process on the hypervisor is opening a TCP connection to every
OSD for every
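(If anyone wants to check the same on their hypervisor, a rough way to count
what the qemu process holds open; the process name qemu-kvm is an assumption,
adjust as needed:)

    pid=$(pidof qemu-kvm | awk '{print $1}')
    ls /proc/$pid/fd | wc -l                 # open file descriptors, sockets included
    grep 'open files' /proc/$pid/limits      # the per-process limit those count against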
Hi all,
I've built a Ceph 0.8 cluster of 2 nodes, each containing 5 OSDs (SSD), with a
100MB/s network. Testing an RBD device with the default configuration, the
result is not ideal. To get better performance, apart from the random r/w
capability of the SSDs, what should be changed?
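(For reference, the kind of baseline test meant here; a sketch using rados
bench against the default rbd pool, not necessarily the exact command used:)

    rados bench -p rbd 30 write --no-cleanup    # 30 seconds of 4 MB writes
    rados bench -p rbd 30 seq                   # sequential reads of what was just written
    rados -p rbd cleanup                        # remove the benchmark objects afterwards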
Hi ,
you should definitely increase the speed of the network. 100Mbit/s is
way too slow for all use cases I could think of, as it results in a
maximum data transfer of less than 10 Mbyte per second, which is
slower than a USB 2.0 thumb drive.
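Just to put rough numbers on that (theoretical line rate, ignoring all protocol
overhead):

    echo $((100 * 1000 * 1000 / 8 / 1024 / 1024))   # -> 11 MiB/s at best, under 10 MB/s in practice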
Best,
Karsten
2015-05-26 15:53 GMT+02:00 lixuehui...@
Hi,
Hi, I guess the author here means that for random loads 100Mb network
should generate 2500-3000 IOPS for 4k blocks.
So the complaint is reasonable, I suppose.
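(The arithmetic behind that estimate, for what it's worth: line rate divided by
the bits in one 4 KiB request:)

    echo $((100 * 1000 * 1000 / (4096 * 8)))   # -> 3051 requests/s as the theoretical ceiling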
Regards, Vasily.
On Tue, May 26, 2015 at 5:27 PM, Karsten Heymann
wrote:
> Hi ,
>
> you should definitely increase the speed of the
On 05/26/2015 08:53 AM, lixuehui...@126.com wrote:
> Hi ALL:
Hi!
> I've built a ceph0.8 cluster including 2 nodes ,which contains 5
> osds(ssd) each , with 100MB/s network . Testing a rbd device with
> default configuration ,the result is no ideal.To got better performance
> ,except the capabili
All our Ceph clusters are on CentOS 7 and I am trying to install Calamari on
one of the nodes. I am using the instructions from
http://karan-mj.blogspot.fi/2014/09/ceph-calamari-survival-guide.html. They are
written for CentOS 6. I tried using them but they did not work.
Has anyone tried installing calamar
What version of Ceph are you using? I seem to remember an enhancement
of ceph-disk for Hammer that is more aggressive in reusing previous
partitions.
-
Robert LeBlanc
GPG Fingerprint 79A2 9CA4 6CC4 45DD A904 C70E E654 3BB2 FA62 B9F1
Hi,
After upgrading our OpenStack cluster to Kilo and upgrading Ceph from Giant to
Hammer, RadosGW stopped working. All other services using Ceph work fine.
RadosGW is configured to use Keystone for authentication.
# swift list
Account GET failed: http://object.api.openstack.cyso.net/swift/v1?f
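(For context, the keystone-related settings in ceph.conf look roughly like
this; section name and values below are placeholders, option names as in the
Hammer docs:)

    [client.radosgw.gateway]
    rgw keystone url = http://keystone-host:35357
    rgw keystone admin token = <admin token>
    rgw keystone accepted roles = Member, admin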
Hi,
It's firefly 0.80.9, so if the improvement is in Hammer I haven't seen it.
Will check back when I upgrade the cluster.
Thanks
Eneko
On 26/05/15 17:45, Robert LeBlanc wrote:
> What version of Ceph are you using? I seem to remember an enhanceme
I've seen I/O become stuck after we have done network torture tests.
It seems that after so many retries the OSD peering just gives up
and doesn't retry any more. An OSD restart kicks off another round of
retries and the I/O completes. It seems
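(The kind of kick that gets things moving again, for reference; the
sysvinit-style command and the OSD id are just examples:)

    service ceph restart osd.12    # restart the stuck OSD, or:
    ceph osd down 12               # mark it down so peering starts over without a full restart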
In my experience with HP hardware, it was set to Econo mode in the
BIOS, which is just plain junk. It will halt cores without regard to
workload to provide energy savings.
We found that by setting the power mode to "OS controlled" we got
almost the
It should be noted that not all power saving is bad - you can save a lot of
power by enabling some sleep states, throttling down, idling, or enabling low
voltage mode on memory, with zero performance impact. In the end you can end up
with more performance because of higher Turbo Boost TDP reserv
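(To see what a box is actually doing, something like this; the intel_pstate
path only exists on newer kernels using that driver:)

    cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor   # performance / powersave / ondemand ...
    cat /sys/devices/system/cpu/intel_pstate/no_turbo           # 0 means Turbo Boost is allowed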
Jens-Christian,
how did you test that? Did you just try to write to them
simultaneously? Any other tests that one can perform to verify that?
In our installation we have a VM with 30 RBD volumes mounted which are
all exported via NFS to other VMs.
No one has complained for the moment but th
Dear Ceph Team,
Our cluster includes three Ceph nodes with 1 MON and 1 OSD in each. All nodes
are running on CentOS 6.5 (kernel 2.6.32) VMs in a testing cluster, not
production. The script we’re using is a simplified sequence of steps that does
more or less what the ceph-cookbook does. Using Op
Shailesh,
I was trying to do the same, but came across several compile errors, so
I decided to deploy the Calamari Server on a CentOS 6 machine. Even
then I was not able to complete the installation.
See:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2015-May/001543.html
http://list
Hey cephers,
We now have two more Ceph Day events confirmed and on the books [0]:
16 July -- Los Angeles
18 August -- Chicago
If you are interested in sharing your Ceph experience with the
community-at-large we'd love to have you! Right now we're especially
interested in real-world use cases and
I followed the Calamari build instructions here:
http://ceph.com/category/ceph-step-by-step/
I used an Ubuntu 14.04 system to build all of the Calamari client and server
packages for CentOS 6.5 and Ubuntu Trusty (14.04).
Once the packages were built I also referenced the Calamari instructions h
Due to popular demand we are expanding the Ceph lists to include a
Chinese-language list to allow for direct communications for all of
our friends in China.
ceph...@lists.ceph.com
It was decided that there are many fragmented discussions going on in
the region due to unfamiliarity or discomfort w
Hey cephers,
A while ago I sent out a note to let people know that Red Hat and
Intel will be partnering to host a real 4-day hackathon at Intel's
offices in Hillsboro, OR on Aug 10-13. Our goal is to keep this a
small (20-25 max) gathering of people doing real work. Right now the
focus is on perfo
It's that time again, time to gird up our loins and submit blueprints
for all work slated for the Jewel release of Ceph.
http://ceph.com/uncategorized/ceph-developer-summit-jewel/
The one notable change for this CDS is that we'll be using the new
wiki (on tracker.ceph.com) that is still undergoin
Hi,
What block size does Ceph use, and what is the optimal size? I'm assuming
it uses whatever the file system has been formatted with.
Thanks
Pankaj
Hi everyone,
This is announcing a new release of ceph-deploy that fixes a
security-related issue, improves SUSE support, and improves support for RGW on
RPM systems. ceph-deploy can be installed from ceph.com hosted repos
for Firefly, Giant, Hammer, and testing, and is also available on
PyPI.
Ea
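(The PyPI route, for anyone not using the ceph.com repos; assumes pip is
already available on the admin node:)

    pip install --upgrade ceph-deploy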
Hello,
your problem is of course that the weight is 0 for all your OSDs.
Thus no data can be placed anywhere at all.
You will want to re-read the manual deployment documentation or dissect
ceph-deploy/ceph-disk more.
Your script misses the crush add bit of that process:
ceph osd crush add {id-or
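For reference, the full form from the CRUSH docs, with an illustrative id,
weight and host bucket (adjust all three to your layout):

    # generic form: ceph osd crush add {id-or-name} {weight} [{bucket-type}={bucket-name} ...]
    ceph osd crush add osd.0 1.0 host=node1
    ceph osd tree    # the OSD should now show a non-zero weight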
Hello,
On Tue, 26 May 2015 10:00:13 -0600 Robert LeBlanc wrote:
> I've seen I/O become stuck after we have done network torture tests.
> It seems that after so many retries the OSD peering just gives up
> and doesn't retry any more. An