Re: [ceph-users] Fwd: 70+ OSD are DOWN and not coming up

2014-05-25 Thread Sage Weil
89568 pi=492-289566/9900 crt=0'0 inactive NOTIFY] > handle_activate_map: Not dirtying info: last_persisted is 289568 while > current is 289568 > 2014-05-25 15:10:41.607946 7fbc0b54c700 10 osd.58 350946 advance_pg advanced > by max 200 past min epoch 289568 ... will requeue pg[0.185( em

Re: [ceph-users] Fwd: 70+ OSD are DOWN and not coming up

2014-05-25 Thread Sage Weil
On Sun, 25 May 2014, Sage Weil wrote: > Hi Karan, > > Can you confirm that the next several map(s) (289569, 289570, etc.) exist > on those OSDs? > > cd /var/lib/ceph/osd/ceph-58/current/meta > find . | grep 28957 Assuming those maps are in fact missing, then I see the b

Re: [ceph-users] Multiple L2 LAN segments with Ceph

2014-05-28 Thread Sage Weil
On Wed, 28 May 2014, Travis Rhoden wrote: > Hi folks, > > Does anybody know if there are any issues running Ceph with multiple L2 > LAN segments?  I'm picturing a large multi-rack/multi-row deployment > where you may give each rack (or row) its own L2 segment, then connect > them all with L3/

Re: [ceph-users] ceph hostnames

2014-05-29 Thread Sage Weil
On Thu, 29 May 2014, Ignazio Cassano wrote: > Hi all, I am planning to install a ceph cluster and I have 3 nodes with 2 > nic for each one. > I read the documentation and it suggests to set a public network and a > cluster network. > Firstly I need to know if public network is the network used by

[ceph-users] v0.81 released

2014-06-02 Thread Sage Weil
ts (Yan, Zheng) * ceph-fuse, libcephfs: improve traceless reply handling (Sage Weil) * clang build fixes (John Spray, Danny Al-Gaaf) * config: support G, M, K, etc. suffixes (Joao Eduardo Luis) * coverity cleanups (Danny Al-Gaaf) * doc: cache tiering (John Wilkins) * doc: keystone integration docs

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Sage Weil
On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > Hi Sage, all, > > On 21 May 2014, at 22:02, Sage Weil wrote: > > > * osd: allow snap trim throttling with simple delay (#6278, Sage Weil) > > Do you have some advice about how to use the snap trim throttle? I saw > os

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Sage Weil
On Wed, 4 Jun 2014, Andrey Korolyov wrote: > On 06/04/2014 06:06 PM, Sage Weil wrote: > > On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > >> Hi Sage, all, > >> > >> On 21 May 2014, at 22:02, Sage Weil wrote: > >> > >>> * osd: allow

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Sage Weil
On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > On 04 Jun 2014, at 16:06, Sage Weil wrote: > > > On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > >> Hi Sage, all, > >> > >> On 21 May 2014, at 22:02, Sage Weil wrote: > >> > >>> * osd: al

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Sage Weil
On Wed, 4 Jun 2014, Andrey Korolyov wrote: > On 06/04/2014 07:22 PM, Sage Weil wrote: > > On Wed, 4 Jun 2014, Andrey Korolyov wrote: > >> On 06/04/2014 06:06 PM, Sage Weil wrote: > >>> On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > >>>> Hi Sage, all,

Re: [ceph-users] v0.67.9 Dumpling released

2014-06-04 Thread Sage Weil
On Wed, 4 Jun 2014, Dan Van Der Ster wrote: > On 04 Jun 2014, at 16:06, Sage Weil wrote: > > > You can adjust this on running OSDs with something like 'ceph daemon > > osd.NN config set osd_snap_trim_sleep .01' or with 'ceph tell osd.* > > injectargs --
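For reference, the snap trim throttle mentioned above can be applied to a running cluster roughly as follows; osd.12 and the 0.01-second sleep are placeholder values, not recommendations from this thread:

    # per-OSD, via the admin socket on the OSD host
    ceph daemon osd.12 config set osd_snap_trim_sleep 0.01

    # or cluster-wide via injectargs
    ceph tell osd.* injectargs '--osd_snap_trim_sleep 0.01'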

Re: [ceph-users] librbd cache

2014-06-05 Thread Sage Weil
On Thu, 5 Jun 2014, Wido den Hollander wrote: > On 06/05/2014 08:59 AM, Stuart Longland wrote: > > Hi all, > > > > I'm looking into other ways I can boost the performance of RBD devices > > on the cluster here and I happened to see these settings: > > > > http://ceph.com/docs/next/rbd/rbd-config-

Re: [ceph-users] ceph osd down and out

2014-06-05 Thread Sage Weil
This usually happens on larger clusters when you hit the max fd limit. Add max open files = 131072 in the [global] section of ceph.conf to fix it (default is 16384). sage On Thu, 5 Jun 2014, Cao, Buddy wrote: > > Hi,  several osds were down/out with similar logs as below, could you
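As a sketch, the suggested ceph.conf fragment would look like this (131072 is the value from the reply above; tune it to your own fd limits):

    [global]
        max open files = 131072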

Re: [ceph-users] Spurious error about TMAP2OMAP from mds?

2014-06-05 Thread Sage Weil
This may happen if the mon wasn't upgraded to 0.80.x before all of the OSDs were restarted. You should be able to find the OSD(s) that were restarted before the mons with ceph osd dump -f json-pretty | grep features Look for features set to 0 instead of some big number. sage On Thu, 5 Jun

Re: [ceph-users] active+degraded even after following best practice

2014-06-07 Thread Sage Weil
On Sat, 7 Jun 2014, Anil Dhingra wrote: > HI Guys > > Finally writing ..after losing my patience to configure my cluster multiple > times but still not able to achieve active+clean .. looks like it's almost > impossible to configure this on centos 6.5. > > As I have to prepare a POC ceph+cinder b

Re: [ceph-users] I have PGs that I can't deep-scrub

2014-06-11 Thread Sage Weil
Hi Craig, It's hard to say what is going wrong with that level of logs. Can you reproduce with debug ms = 1 and debug osd = 20? There were a few things fixed in scrub between emperor and firefly. Are you planning on upgrading soon? sage On Tue, 10 Jun 2014, Craig Lewis wrote: > Every time

Re: [ceph-users] Disabling OSD journals, parallel reads and eventual consistency for RBD

2014-06-12 Thread Sage Weil
On Thu, 12 Jun 2014, Charles 'Boyo wrote: > Hello list. > > Is it possible, or will it ever be possible to disable the OSD's > journalling activity? > > I understand it is risky and has the potential for data loss but in my > use case, the data is easily re-built from scratch and I'm really >

Re: [ceph-users] Disabling OSD journals, parallel reads and eventual consistency for RBD

2014-06-12 Thread Sage Weil
equires the ceph journal for crash safety). > So can you show me how to turn off journalling using the xfs FileStore > backend? :) You could point the journal at tmpfs. But, you *will* lose data if you do that! sage > > Charles > --Original Message-- > From: Sage Wei

Re: [ceph-users] Disabling OSD journals, parallel reads and eventual consistency for RBD

2014-06-12 Thread Sage Weil
ailure we can rely on accurate metadata to quickly resync without having to walk the data on the disk. It's not directly related to the distributed consistency model, though. sage > > Charles > --Original Message-- > From: Sage Weil > To: Charles 'Boyo > C

Re: [ceph-users] Strange qemu-rbd I/O behavior when booting Windows VM

2014-06-13 Thread Sage Weil
Right now, no. We could add a minimum read size to librbd when caching is enabled... that would not be particularly difficult. sage On Fri, 13 Jun 2014, Ke-fei Lin wrote: > 2014-06-13 22:04 GMT+08:00 Andrey Korolyov : > > > > On Fri, Jun 13, 2014 at 5:50 PM, Ke-fei Lin wrote: > > > Thanks,

Re: [ceph-users] Strange qemu-rbd I/O behavior when booting Windows VM

2014-06-13 Thread Sage Weil
On Sat, 14 Jun 2014, Ke-fei Lin wrote: > 2014-06-14 0:11 GMT+08:00 Sage Weil : > > Right now, no. > > > > We could add a minimum read size to librbd when caching is enabled... > > that would not be particularly difficult. > > > > sage > > Thank

Re: [ceph-users] /etc/ceph/rbdmap

2014-06-19 Thread Sage Weil
On Thu, 19 Jun 2014, Chad Seys wrote: > Hi all, > Also /etc/ceph/rbdmap in librbd1 rather than ceph? This is for mapping kernel rbd devices on system startup, and belong with ceph-common (which hasn't yet been but soon will be split out from ceph) along with the 'rbd' cli utility. It isn't di

Re: [ceph-users] /etc/ceph/rbdmap

2014-06-19 Thread Sage Weil
On Thu, 19 Jun 2014, Chad Seys wrote: > > This is for mapping kernel rbd devices on system startup, and belong with > > ceph-common (which hasn't yet been but soon will be split out from ceph) > > Great! Yeah, I was hoping to map /dev/rbd without installing all the ceph > daemons! The package c
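As an illustration of the rbdmap file itself, each line maps a pool/image to a kernel rbd device at boot; the pool, image, user and keyring path below are placeholders:

    # /etc/ceph/rbdmap
    # poolname/imagename      id=client-id,keyring=/path/to/keyring
    rbd/myimage               id=admin,keyring=/etc/ceph/ceph.client.admin.keyring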

[ceph-users] v0.82 released

2014-06-27 Thread Sage Weil
g: add tox tests (Alfredo Deza) * common: perfcounters now use atomics and go faster (Sage Weil) * doc: CRUSH updates (John Wilkins) * doc: osd primary affinity (John Wilkins) * doc: pool quotas (John Wilkins) * doc: pre-flight doc improvements (Kevin Dalley) * doc: switch to an unencumbered font

Re: [ceph-users] Release notes for firefly not very clear wrt the tunables

2014-07-07 Thread Sage Weil
Hi Sylvain, Thanks for pointing this out. Is this clearer? * The default CRUSH rules and layouts are now using the 'bobtail' tunables and defaults. Upgraded clusters using the old values will now present with a health WARN state. This can be disabled by adding 'mon warn on legacy crush tu

Re: [ceph-users] Release notes for firefly not very clear wrt the tunables

2014-07-07 Thread Sage Weil
On Mon, 7 Jul 2014, James Harper wrote: > > > > Hi Sage, > > > > > Thanks for pointing this out. Is this clearer? > > > > Yes. Although it would probably be useful to say that using 'ceph osd > > crush tunables bobtail' will be enough to get rid of the warning and > > will not break compatibili
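For illustration, the two options discussed in this thread look roughly like the following; whether the bobtail tunables are safe for your clients depends on their kernel and librados versions:

    # adopt the bobtail tunables (may trigger data movement)
    ceph osd crush tunables bobtail

    # or simply suppress the warning, in ceph.conf on the monitors
    [mon]
        mon warn on legacy crush tunables = false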

Re: [ceph-users] ceph mount not working anymore

2014-07-10 Thread Sage Weil
Have you made any other changes after the upgrade? (Like adjusting tunables, or creating EC pools?) See if there is anything in 'dmesg' output. sage On Thu, 10 Jul 2014, Joshua McClintock wrote: > I upgraded my cluster to .80.1-2 (CentOS).  My mount command just freezes > and outputs an error

Re: [ceph-users] ceph mount not working anymore

2014-07-10 Thread Sage Weil
ephfs1-0.80.1-0.el6.x86_64 > > [root@ceph-mon01 ~]# > > > [root@ceph-mon02 ~]# rpm -qa|grep ceph > > libcephfs1-0.80.1-0.el6.x86_64 > > ceph-0.80.1-2.el6.x86_64 > > ceph-release-1-0.el6.noarch > > [root@ceph-mon02 ~]# > > > [root@ceph-mon03

Re: [ceph-users] ceph mount not working anymore

2014-07-11 Thread Sage Weil
gt;       "choose_total_tries": 50, >       "chooseleaf_descend_once": 1, >       "profile": "bobtail", >       "optimal_tunables": 0, >       "legacy_tunables": 0, >       "require_feature_tunables": 1, >       "require_featur

Re: [ceph-users] logrotate

2014-07-11 Thread Sage Weil
On Fri, 11 Jul 2014, James Eckersall wrote: > Upon further investigation, it looks like this part of the ceph logrotate > script is causing me the problem: > > if [ -e "/var/lib/ceph/$daemon/$f/done" ] && [ -e > "/var/lib/ceph/$daemon/$f/upstart" ] && [ ! -e > "/var/lib/ceph/$daemon/$f/sysvinit" ]

Re: [ceph-users] scrub error on firefly

2014-07-11 Thread Sage Weil
One other thing we might also try is catching this earlier (on first read of corrupt data) instead of waiting for scrub. If you are not super performance sensitive, you can add filestore sloppy crc = true filestore sloppy crc block size = 524288 That will track and verify CRCs on any large (
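In ceph.conf form, the settings quoted above would live in the [osd] section (section placement here is an assumption; the values are the ones given in the reply):

    [osd]
        filestore sloppy crc = true
        filestore sloppy crc block size = 524288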

[ceph-users] v0.80.3 released

2014-07-11 Thread Sage Weil
in manifest decoding (#8804, Sage Weil) For more detailed information, see: http://ceph.com/docs/master/_downloads/v0.80.3.txt v0.80.2 Firefly === This is the second Firefly point release. It contains a range of important fixes, including several bugs in the OSD cache tiering

Re: [ceph-users] logrotate

2014-07-12 Thread Sage Weil
om] On Behalf Of >> James Eckersall >> Sent: Freitag, 11. Juli 2014 17:06 >> To: Sage Weil >> Cc: ceph-us...@ceph.com >> Subject: Re: [ceph-users] logrotate >> >> Hi Sage, >> >> Many thanks for the info. >> I have inherited this cluster, but I

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Sage Weil
I've added some additional notes/warnings to the upgrade and release notes: https://github.com/ceph/ceph/commit/fc597e5e3473d7db6548405ce347ca7732832451 If there is somewhere else where you think a warning flag would be useful, let me know! Generally speaking, we want to be able to cope with

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-14 Thread Sage Weil
... we have a cluster with 5 > storage node and 12 4TB-osd-disk each (60 osd), replica 2. The cluster > is 60% filled. > Network connection 10Gb. > Takes tunables optimal in such a configuration one, two or more days? > > Udo > > On 14.07.2014 18:18, Sage We

Re: [ceph-users] Firefly Upgrade

2014-07-14 Thread Sage Weil
On Mon, 14 Jul 2014, Quenten Grasso wrote: > > Hi All, > > Just a quick question for the list, has anyone seen a significant increase > in ram usage since firefly? I upgraded from 0.72.2 to 80.3 now all of my > Ceph servers are using about double the ram they used to. One significant change is t

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-15 Thread Sage Weil
erformance impact is much higher. sage > > Thanks, > Andrija > > > On 14 July 2014 18:18, Sage Weil wrote: > I've added some additional notes/warnings to the upgrade and > release > notes: > >  https://github.com/ceph/ceph/commit/fc597e5e3

[ceph-users] v0.80.4 Firefly released

2014-07-15 Thread Sage Weil
This Firefly point release fixes a potential data corruption problem when ceph-osd daemons run on top of XFS and service Firefly librbd clients. A recently added allocation hint that RBD utilizes triggers an XFS bug on some kernels (Linux 3.2, and likely others) that leads to data corruption and

Re: [ceph-users] scrub error on firefly

2014-07-15 Thread Sage Weil
>>> I'm using xfs. >>> >>> Also, when, in a previous email, you asked if I could send the >>> object,

Re: [ceph-users] v0.80.4 Firefly released

2014-07-16 Thread Sage Weil
automatic CEPH service restart after updating packages? > > We are instructed to first update/restart MONs, and after that OSD - > but that is impossible if we have MON+OSDs on same host...since the > ceph is automatically restarted with YUM/RPM, but NOT automatically > restarted on U

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-16 Thread Sage Weil
On Wed, 16 Jul 2014, Gregory Farnum wrote: > On Wed, Jul 16, 2014 at 4:45 PM, Craig Lewis > wrote: > > One of the things I've learned is that many small changes to the cluster are > > better than one large change. Adding 20% more OSDs? Don't add them all at > > once, trickle them in over time.

Re: [ceph-users] ceph osd crush tunables optimal AND add new OSD at the same time

2014-07-17 Thread Sage Weil
On Thu, 17 Jul 2014, Quenten Grasso wrote: > Hi Sage & List > > I understand this is probably a hard question to answer. > > I mentioned previously our cluster is co-located MON's on OSD servers, which > are R515's w/ 1 x AMD 6 Core processor & 11 3TB OSD's w/ dual 10GBE. > > When our cluster i

Re: [ceph-users] Regarding ceph osd setmaxosd

2014-07-18 Thread Sage Weil
On Fri, 18 Jul 2014, Anand Bhat wrote: > I have question on intention of Ceph setmaxosd command. From source code, it > appears as if this is present as a way to limit the number of OSDs in the > Ceph cluster.  Yeah. It's basically sizing the array of OSDs in the OSDMap. It's a bit obsolete sin

Re: [ceph-users] Ceph OSDs failing to start after upgrade to kernel 3.13

2014-07-22 Thread Sage Weil
Try increasing /proc/sys/fs/aio-max-nr and see if that helps? It's teh io_setup syscall that is failing. sage On Tue, 22 Jul 2014, Dane Elwell wrote: > Hi, > > We recently tried to switch to kernel 3.13 (from 3.5) using the Ubuntu > Precise backported kernels (e.g. package linux-generic-lt
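A rough sketch of checking and raising that limit; the 1048576 value is arbitrary and not taken from this thread:

    # check the current limit
    cat /proc/sys/fs/aio-max-nr

    # raise it at runtime
    sysctl -w fs.aio-max-nr=1048576

    # persist it across reboots
    echo 'fs.aio-max-nr = 1048576' >> /etc/sysctl.conf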

Re: [ceph-users] Ceph and Infiniband

2014-07-22 Thread Sage Weil
On Tue, 22 Jul 2014, Riccardo Murri wrote: > Hello, > > a few questions on Ceph's current support for Infiniband > > (A) Can Ceph use Infiniband's native protocol stack, or must it use > IP-over-IB? Google finds a couple of entries in the Ceph wiki related > to native IB support (see [1], [2]),

Re: [ceph-users] slow read speeds from kernel rbd (Firefly 0.80.4)

2014-07-23 Thread Sage Weil
On Wed, 23 Jul 2014, Steve Anthony wrote: > Hello, > > Recently I've started seeing very slow read speeds from the rbd images I > have mounted. After some analysis, I suspect the root cause is related > to krbd; if I run the rados benchmark, I see read bandwith in the > 400-600MB/s range, however

Re: [ceph-users] firefly osds stuck in state booting

2014-07-26 Thread Sage Weil
On Sat, 26 Jul 2014, 10 minus wrote: > Hi, > > I just setup a test ceph installation on 3 node Centos 6.5  . > two of the nodes are used for hosting osds and the third acts as mon . > > Please note I'm using LVM so had to set up the osd using the manual install > guide. > > --snip-- > ceph -s >

Re: [ceph-users] anti-cephalopod question

2014-07-28 Thread Sage Weil
On Mon, 28 Jul 2014, Joao Eduardo Luis wrote: > On 07/28/2014 02:07 PM, Robert Fantini wrote: > > Is the '15 minutes or so ' something that can be configured at run time? > > Someone who knows this better than I do should probably chime in, but from a > quick look throughout the code it seems to

Re: [ceph-users] ceph metrics

2014-07-28 Thread Sage Weil
On Mon, 28 Jul 2014, James Eckersall wrote: > Hi, > I'm trying to understand what a lot of the values mean that are reported by > "perf dump" on the ceph admin socket.  I have a collectd plugin which sends > all of these values to graphite. > > Does anyone have a cross-reference list that explains

[ceph-users] v0.80.5 Firefly released

2014-07-29 Thread Sage Weil
radosgw. Notable Changes --- * ceph-dencoder: do not needlessly link to librgw, librados, etc. (Sage Weil) * do not needlessly link binaries to leveldb (Sage Weil) * mon: fix mon crash when no auth keys are present (#8851, Joao Eduardo Luis) * osd: fix cleanup (and avoid

[ceph-users] v0.83 released

2014-07-29 Thread Sage Weil
ing library for librados (Sebastien Ponce) * libs3: update to latest (Danny Al-Gaaf) * log: fix derr level (Joao Eduardo Luis) * logrotate: fix osd log rotation on ubuntu (Sage Weil) * mds: fix xattr bug triggered by ACLs (Yan, Zheng) * misc memory leaks, cleanups, fixes (Danny Al-Gaaf, Sahid Ferdjaou

Re: [ceph-users] cache pool osds crashing when data is evicting to underlying storage pool

2014-07-31 Thread Sage Weil
Hi Kenneth, On Thu, 31 Jul 2014, Kenneth Waegeman wrote: > Hi all, > > We have an erasure coded pool 'ecdata' and a replicated pool 'cache' acting as > writeback cache upon it. > When running 'rados -p ecdata bench 1000 write', it starts filling up the > 'cache' pool as expected. > I want to see w

Re: [ceph-users] GPF kernel panics

2014-07-31 Thread Sage Weil
> > > If there is a 250 limit, can you confirm where this is documented? > > > In this very ML, see the "v0.75 released" thread: > --- > On Thu, 16 Jan 2014 15:51:17 +0200 Ilya Dryomov wrote: > > > On Wed, Jan 15, 2014 at 5:42 AM, Sage Weil wrote: >

Re: [ceph-users] cache pool osds crashing when data is evicting to underlying storage pool

2014-08-01 Thread Sage Weil
On Fri, 1 Aug 2014, Kenneth Waegeman wrote: > > On Thu, 31 Jul 2014, Kenneth Waegeman wrote: > > > Hi all, > > > > > > We have an erasure coded pool 'ecdata' and a replicated pool 'cache' acting > > > as > > > writeback cache upon it. > > > When running 'rados -p ecdata bench 1000 write', it starts

Re: [ceph-users] 0.80.5-1precise Not Able to Map RBD & CephFS

2014-08-01 Thread Sage Weil
On Fri, 1 Aug 2014, Ilya Dryomov wrote: > On Fri, Aug 1, 2014 at 10:06 PM, Ilya Dryomov > wrote: > > On Fri, Aug 1, 2014 at 4:22 PM, Ilya Dryomov > > wrote: > >> On Fri, Aug 1, 2014 at 4:05 PM, Gregory Farnum wrote: > >>> We appear to have solved this and then immediately re-broken it by > >>>

Re: [ceph-users] Firefly OSDs stuck in creating state forever

2014-08-03 Thread Sage Weil
Hi Bruce, On Sun, 3 Aug 2014, Bruce McFarland wrote: > Yes I looked at tcpdump on each of the OSDs and saw communications between > all 3 OSDs before I sent my first question to this list. When I disabled > selinux on the one offending server based on your feedback (typically we > have this disabl

Re: [ceph-users] Firefly OSDs stuck in creating state forever

2014-08-03 Thread Sage Weil
n the > monitor box. Do you have one of the messages handy? I'm curious whether it is an OSD or a mon. Thanks! sage > Thanks for the feedback. > > > > On Aug 3, 2014, at 8:30 AM, "Sage Weil" wrote: > > > > Hi Bruce, > > > >&

Re: [ceph-users] Using Valgrind with Teuthology

2014-08-04 Thread Sage Weil
On Mon, 4 Aug 2014, Sarang G wrote: > Hi, > > I am configuring Ceph Cluster using teuthology. I want to use Valgrind. > > My yaml File contains: > > check-locks: false > > roles: > - [mon.0, osd.0] > - [mon.1, osd.1] > - [mon.2, osd.2, client.0] BTW I would use mon.[abc] instead of [012] as th
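A minimal sketch of the roles section with that naming suggestion applied (same layout as the quoted yaml):

    roles:
    - [mon.a, osd.0]
    - [mon.b, osd.1]
    - [mon.c, osd.2, client.0]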

Re: [ceph-users] cache questions

2014-08-04 Thread Sage Weil
On Mon, 4 Aug 2014, Kenneth Waegeman wrote: > Hi, > > I have been doing some tests with rados bench write on a EC storage pool with > a writeback cache pool(replicated, size 3), and have some questions: > > * I had set target_max_bytes to 280G, and after some time of writing, the > cache pool sta

Re: [ceph-users] Erronous stats output (ceph df) after increasing PG number

2014-08-04 Thread Sage Weil
On Mon, 4 Aug 2014, Konstantinos Tompoulidis wrote: > Hi all, > > We recently added many OSDs to our production cluster. > This brought us to a point where the number of PGs we had assigned to our > main (heavily used) pool was well below the recommended value. > > We increased the PG number (in

Re: [ceph-users] Firefly OSDs stuck in creating state forever

2014-08-04 Thread Sage Weil
0 sd=5 :0 s=1 pgs=0 cs=0 l=1 > c=0x7f4204003eb0).fault > > 209.243.160.35 - monitor > 209.243.160.51 - osd.0 > 209.243.160.52 - osd.3 > 209.243.160.59 - osd.2 > > -Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Sunday, August 03, 2014

Re: [ceph-users] Firefly OSDs stuck in creating state forever

2014-08-04 Thread Sage Weil
om over a year ago. Is that still the case with 0.81 firefly code? Yep! Here's a recentish dump: http://tracker.ceph.com/issues/8880 sage > > -----Original Message- > From: Sage Weil [mailto:sw...@redhat.com] > Sent: Monday, August 04, 2014 10:09 AM >

Re: [ceph-users] v0.83 released

2014-08-05 Thread Sage Weil
These development releases can either be found at ceph.com, or perhaps they can go into debian unstable or rawhide or similar. I don't think it is the release that most users will want when they 'apt-get install ceph' on a generic system, though! sage > > > 2014-

Re: [ceph-users] osd disk location - comment field

2014-08-05 Thread Sage Weil
On Tue, 5 Aug 2014, Kenneth Waegeman wrote: > I'm trying to find out the location(mountpoint/device) of an osd through ceph > dumps, but I don't seem to find a way. I can of course ssh to it and check the > symlink under /var/lib/ceph/ceph-{osd_id}, but I would like to parse it out of > ceph comman

Re: [ceph-users] v0.83 released

2014-08-05 Thread Sage Weil
Oops, fixing ceph-maintainers address. On Tue, 5 Aug 2014, Sage Weil wrote: > On Tue, 5 Aug 2014, debian Only wrote: > > Good news.  when this release will public in Debian Wheezy pkglist ?thanks > > for ur good job > > I think that, in general, the strategy should

Re: [ceph-users] Erroneous stats output (ceph df) after increasing PG number

2014-08-05 Thread Sage Weil
On Mon, 4 Aug 2014, Konstantinos Tompoulidis wrote: > Sage Weil writes: > > > > > On Mon, 4 Aug 2014, Konstantinos Tompoulidis wrote: > > > Hi all, > > > > > > We recently added many OSDs to our production cluster. > > > This brought us to

Re: [ceph-users] Erroneous stats output (ceph df) after increasing PG number

2014-08-05 Thread Sage Weil
On Tue, 5 Aug 2014, Konstantinos Tompoulidis wrote: > We decided to perform a scrub and see the impact now that we have 4x PGs. > It seems that now that the PGs are "smaller", the impact is not that high. > We kept osd-max-scrubs to 1 which is the default setting. > Indeed the output of "ceph df"

Re: [ceph-users] [Ceph-community] Remote replication

2014-08-06 Thread Sage Weil
On Tue, 5 Aug 2014, Craig Lewis wrote: > There currently isn't a backup tool for CephFS.  CephFS is a POSIX > filesystem, so your normal tools should work.  It's a really large POSIX > filesystem though, so normal tools may not scale well. Note that CephFS does have one feature that should make ef

Re: [ceph-users] slow OSD brings down the cluster

2014-08-06 Thread Sage Weil
You can use the ceph osd perf command to get recent queue latency stats for all OSDs. With a bit of sorting this should quickly tell you if any OSDs are going significantly slower than the others. We'd like to automate this in calamari or perhaps even in the monitor, but it is not immediate
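For example, something like the following surfaces the slowest OSDs; the column number is an assumption about the plain-text output of 'ceph osd perf' (osd, fs_commit_latency, fs_apply_latency):

    ceph osd perf | sort -n -k3 | tail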

Re: [ceph-users] librbd tuning?

2014-08-06 Thread Sage Weil
On Wed, 6 Aug 2014, Mark Nelson wrote: > On 08/05/2014 06:19 PM, Mark Kirkwood wrote: > > On 05/08/14 23:44, Mark Nelson wrote: > > > On 08/05/2014 02:48 AM, Mark Kirkwood wrote: > > > > On 05/08/14 03:52, Tregaron Bayly wrote: > > > > > Does anyone have any insight on how we can tune librbd to per

Re: [ceph-users] What is difference in storing data between rbd and rados ?

2014-08-07 Thread Sage Weil
On Thu, 7 Aug 2014, debian Only wrote: > Hope an expert can give me some light > > > 2014-08-06 18:01 GMT+07:00 debian Only : > I am confused about how files are stored in Ceph. > > I did two tests. Where is the file or the object for the file? > > rados put Python.msi Python.msi -p data > rbd -

Re: [ceph-users] Regarding cache tier understanding

2014-08-07 Thread Sage Weil
On Thu, 7 Aug 2014, Somnath Roy wrote: > Hi Sage, > > I went through the tiering agent code base and here is my understanding on > the agent behavior. Please let me know if that is correct. > >   > > 1.   Agent will be always in idle state if target_max_bytes or > target_max_objects not set

[ceph-users] v0.67.10 Dumpling released

2014-08-12 Thread Sage Weil
leset ...' (#8599, John Spray) * mon: shut down if mon is removed from cluster (#6789, Joao Eduardo Luis) * osd: fix filestore perf reports to mon (Sage Weil) * osd: force any new or updated xattr into leveldb if E2BIG from XFS (#7779, Sage Weil) * osd: lock snapdir object during write to fix race wit

Re: [ceph-users] Can't export cephfs via nfs

2014-08-13 Thread Sage Weil
On Wed, 13 Aug 2014, Micha Krause wrote: > Hi, > > any ideas? Have you confirmed that if you unmount cephfs on /srv/micha the NFS export works? sage > > Micha Krause > > Am 11.08.2014 16:34, schrieb Micha Krause: > > Hi, > > > > Im trying to build a cephfs to nfs gateway, but somehow i can'

Re: [ceph-users] running Firefly client (0.80.1) against older version (dumpling 0.67.10) cluster?

2014-08-14 Thread Sage Weil
On Thu, 14 Aug 2014, Nigel Williams wrote: > Anyone know if this is safe in the short term? we're rebuilding our > nova-compute nodes and can make sure the Dumpling versions are pinned > as part of the process in the future. It's safe, with the possible exception of radosgw, that generally needs

Re: [ceph-users] Cache tiering and target_max_bytes

2014-08-14 Thread Sage Weil
On Thu, 14 Aug 2014, Paweł Sadowski wrote: > Hello, > > I've a cluster of 35 OSD (30 HDD, 5 SSD) with cache tiering configured. > During tests it looks like ceph is not respecting target_max_bytes > settings. Steps to reproduce: > - configure cache tiering > - set target_max_bytes to 32G (on hot
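For reference, target_max_bytes is set per cache pool roughly like this; 'hot' is a placeholder pool name and 32G is the value from the report above, expressed in bytes:

    ceph osd pool set hot target_max_bytes $((32 * 1024 * 1024 * 1024))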

Re: [ceph-users] cache pools on hypervisor servers

2014-08-14 Thread Sage Weil
On Thu, 14 Aug 2014, Andrei Mikhailovsky wrote: > Hi guys, > > Could someone from the ceph team please comment on running osd cache pool on > the hypervisors? Is this a good idea, or will it create a lot of performance > issues? It doesn't sound like an especially good idea. In general you want

Re: [ceph-users] Cache tiering and target_max_bytes

2014-08-14 Thread Sage Weil
On Thu, 14 Aug 2014, Paweł Sadowski wrote: > On 14.08.2014 17:20, Sage Weil wrote: > > On Thu, 14 Aug 2014, Paweł Sadowski wrote: > >> Hello, > >> > >> I've a cluster of 35 OSD (30 HDD, 5 SSD) with cache tiering configured. > >>

Re: [ceph-users] ceph cluster inconsistency?

2014-08-15 Thread Sage Weil
On Fri, 15 Aug 2014, Haomai Wang wrote: > Hi Kenneth, > > I don't find valuable info in your logs, it lack of the necessary > debug output when accessing crash code. > > But I scan the encode/decode implementation in GenericObjectMap and > find something bad. > > For example, two oid has same ha

Re: [ceph-users] Fixed all active+remapped PGs stuck forever (but I have no clue why)

2014-08-18 Thread Sage Weil
On Mon, 18 Aug 2014, John Morris wrote: > rule by_bank { > ruleset 3 > type replicated > min_size 3 > max_size 4 > step take default > step choose firstn 0 type bank > step choose firstn 0 type osd > step emit > } You probably want:
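Sage's actual suggestion is truncated above; the sketch below is a common shape for this kind of rule, picking N banks and one OSD under each, and is an assumption rather than a quote from the thread:

    rule by_bank {
            ruleset 3
            type replicated
            min_size 3
            max_size 4
            step take default
            step chooseleaf firstn 0 type bank
            step emit
    }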

[ceph-users] v0.84 released

2014-08-18 Thread Sage Weil
fields; these have been removed. Please use 'read_bytes' and 'write_bytes' instead (and divide by 1024 if appropriate). Notable Changes --- * ceph-conf: flush log on exit (Sage Weil) * ceph-dencoder: refactor build a bit to limit dependencies (Sage Weil, Dan Mick) * ceph.spec: spli

Re: [ceph-users] cephfs set_layout / setfattr ... does not work anymore for pools

2014-08-18 Thread Sage Weil
/cephfs/ssd-r2 > getfattr: Removing leading '/' from absolute path names > # file: mnt/cephfs/ssd-r2 > ceph.dir.entries="0" > ceph.dir.files="0" > ceph.dir.rbytes="0" > ceph.dir.rctime="0.090" > ceph.dir.rentries="1" >

Re: [ceph-users] v0.84 released

2014-08-18 Thread Sage Weil
ev events, not not using 'enable' thing where systemd persistently registers that a service is to be started...? sage > Thanks, > Robert LeBlanc > > > On Mon, Aug 18, 2014 at 1:14 PM, Sage Weil wrote: > The next Ceph development release is here!  This release

Re: [ceph-users] v0.84 released

2014-08-19 Thread Sage Weil
ly part of the ceph package: $ dpkg -L ceph | grep udev /lib/udev /lib/udev/rules.d /lib/udev/rules.d/60-ceph-partuuid-workaround.rules /lib/udev/rules.d/95-ceph-osd.rules sage > Robert LeBlanc > > > On Mon, Aug 18, 2014 at 5:49 PM, Sage Weil wrote: > On Mon, 18 A

Re: [ceph-users] Translating a RadosGW object name into a filename on disk

2014-08-20 Thread Sage Weil
On Wed, 20 Aug 2014, Craig Lewis wrote: > Looks like I need to upgrade to Firefly to get ceph-kvstore-tool > before I can proceed. > I am getting some hits just from grepping the LevelDB store, but so > far nothing has panned out. FWIW if you just need the tool, you can wget the .deb and 'dpkg -x
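Roughly, pulling a single binary out of a package without installing it looks like this; the package name, URL, and the path of ceph-kvstore-tool inside the package are placeholders, not the actual firefly package details:

    wget http://ceph.com/debian-firefly/pool/main/c/ceph/ceph-test_0.80.5-1_amd64.deb   # placeholder URL
    dpkg -x ceph-test_0.80.5-1_amd64.deb ./extracted
    ./extracted/usr/bin/ceph-kvstore-tool --help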

Re: [ceph-users] pool with cache pool and rbd export

2014-08-22 Thread Sage Weil
On Fri, 22 Aug 2014, Andrei Mikhailovsky wrote: > Does that also mean that scrubbing and deep-scrubbing also squishes data > out of the cache pool? Could someone from the ceph community confirm > this? Scrubbing activities have no effect on the cache; don't worry. :) sage > > Thanks > > >

Re: [ceph-users] pool with cache pool and rbd export

2014-08-22 Thread Sage Weil
On Fri, 22 Aug 2014, Andrei Mikhailovsky wrote: > So it looks like using rbd export / import will negatively affect the > client performance, which is unfortunate. Is this really the case? Any > plans on changing this behavior in future versions of ceph? There will always be some impact from imp

Re: [ceph-users] Is it safe to enable rbd cache with qemu?

2014-08-23 Thread Sage Weil
For Giant, we have changed the default librbd caching options to: rbd cache = true rbd cache writethrough until flush = true The second option enables the cache for reads but does writethrough until we observe a FLUSH command come through, which implies that the guest OS is issuing barriers.
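Spelled out as a client-side ceph.conf fragment, those defaults correspond to the following (placing them under [client] is the usual convention, not something stated in this message):

    [client]
        rbd cache = true
        rbd cache writethrough until flush = true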

Re: [ceph-users] Deadlock in ceph journal

2014-08-25 Thread Sage Weil
g and Mark have seen? sage On Wed, 20 Aug 2014, Somnath Roy wrote: > > I think this is the issue.. > >   > > http://tracker.ceph.com/issues/9073 > >   > > Thanks & Regards > > Somnath > >   > > From: Somnath Roy > Sent: Tuesday, August 1

Re: [ceph-users] Prioritize Heartbeat packets

2014-08-27 Thread Sage Weil
On Wed, 27 Aug 2014, Robert LeBlanc wrote: > I'm looking for a way to prioritize the heartbeat traffic higher than the > storage and replication traffic. I would like to keep the ceph.conf as > simple as possible by not adding the individual osd IP addresses and ports, > but it looks like the liste

Re: [ceph-users] Prioritize Heartbeat packets

2014-08-27 Thread Sage Weil
On Wed, 27 Aug 2014, Matt W. Benjamin wrote: > - "Sage Weil" wrote: > > What would be best way for us to mark which sockets are heartbeat > > related? > > Is there some setsockopt() type call we should be using, or should we

Re: [ceph-users] OSD is crashing while running admin socket

2014-09-08 Thread Sage Weil
> -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Monday, September 08, 2014 5:22 PM > To: Somnath Roy > Cc: Sage Weil (sw...@redhat.com); ceph-de...@vger.kernel.org; > ceph-users@lists.ceph.com > Subject: Re: OSD is crashing while running

Re: [ceph-users] all my osds are down, but ceph -s tells they are up and in.

2014-09-08 Thread Sage Weil
On Tue, 9 Sep 2014, yuelongguang wrote: > hi,all >   > that is crazy. > 1. > all my osds are down, but ceph -s tells they are up and in. why? Peer OSDs normally handle failure detection. If all OSDs are down, there is nobody to report the failures. After 5 or 10 minutes if the OSDs don't report

Re: [ceph-users] ceph cluster inconsistency keyvaluestore

2014-09-08 Thread Sage Weil
t;>>> > >>>> What could here be the problem? > >>>> Thanks again!! > >>>> > >>>> Kenneth > >>>> > >>>> > >>>> - Message from Haomai Wang - > >>>> Date: Tue, 26 Aug 2014 17:11:43 +0

[ceph-users] CephFS roadmap (was Re: NAS on RBD)

2014-09-09 Thread Sage Weil
On Tue, 9 Sep 2014, Blair Bethwaite wrote: > > Personally, I think you're very brave to consider running 2PB of ZoL > > on RBD. If I were you I would seriously evaluate the CephFS option. It > > used to be on the roadmap for ICE 2.0 coming out this fall, though I > > noticed it's not there anymor

Re: [ceph-users] ceph data consistency

2014-09-09 Thread Sage Weil
On Thu, 4 Sep 2014, wrote: > > hi, guys: >   >   when I read filestore.cc, I find that ceph uses crc to check the data. > Why should it check the data? > >   To my knowledge,  the disk has error-correcting code (ECC) for each > sector. Looking at wiki: http://en.wikipedia.org/wiki/Disk_s

Re: [ceph-users] question about RGW

2014-09-10 Thread Sage Weil
[Moving this to ceph-devel, where you're more likely to get a response from a developer!] On Wed, 10 Sep 2014, baijia...@126.com wrote: > when I read RGW code,  and can't  understand  master_ver  inside struct > rgw_bucket_dir_header . > who can explain this struct , in especial master_ver and s

Re: [ceph-users] OpTracker optimization

2014-09-10 Thread Sage Weil
> > https://github.com/ceph/ceph/pull/2440 > > Thanks & Regards > Somnath > > -Original Message- > From: Samuel Just [mailto:sam.j...@inktank.com] > Sent: Wednesday, September 10, 2014 3:25 PM > To: Somnath Roy > Cc: Sage Weil (sw...@redhat.com); ceph-de...

Re: [ceph-users] Cephfs upon Tiering

2014-09-11 Thread Sage Weil
On Thu, 11 Sep 2014, Gregory Farnum wrote: > On Thu, Sep 11, 2014 at 4:13 AM, Kenneth Waegeman > wrote: > > Hi all, > > > > I am testing the tiering functionality with cephfs. I used a replicated > > cache with an EC data pool, and a replicated metadata pool like this: > > > > > > ceph osd pool cr

Re: [ceph-users] Cephfs upon Tiering

2014-09-11 Thread Sage Weil
On Thu, 11 Sep 2014, Gregory Farnum wrote: > On Thu, Sep 11, 2014 at 11:39 AM, Sage Weil wrote: > > On Thu, 11 Sep 2014, Gregory Farnum wrote: > >> On Thu, Sep 11, 2014 at 4:13 AM, Kenneth Waegeman > >> wrote: > >> > Hi all, > >> > > >

Re: [ceph-users] Regarding key/value interface

2014-09-11 Thread Sage Weil
Hi Somnath, On Fri, 12 Sep 2014, Somnath Roy wrote: > > Hi Sage/Haomai, > > If I have a key/value backend that support transaction, range queries (and I > don't need any explicit caching etc.) and I want to replace filestore (and > leveldb omap) with that,  which interface you recommend me to de
