Re: [ceph-users] Cache tiering

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 8:13 AM, Dan van der Ster wrote: > Hi, > > > Gregory Farnum wrote: > > 3) The cost of a cache miss is pretty high, so they should only be > used when the active set fits within the cache and doesn't change too > frequently. > > > Can

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 8:44 AM, Dan van der Ster wrote: > Hi, > > > Sage Weil wrote: > > * *Primary affinity*: Ceph now has the ability to skew selection of > OSDs as the "primary" copy, which allows the read workload to be > cheaply skewed away from parts of the cluster without migrating any

Re: [ceph-users] v0.80 Firefly released

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 11:18 AM, Mike Dawson wrote: > > On 5/7/2014 11:53 AM, Gregory Farnum wrote: >> >> On Wed, May 7, 2014 at 8:44 AM, Dan van der Ster >> wrote: >>> >>> Hi, >>> >>> >>> Sage Weil wrote: >>> >>

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-07 Thread Gregory Farnum
On Wed, May 7, 2014 at 5:57 PM, Christian Balzer wrote: > > Hello, > > ceph 0.72 on Debian Jessie, 2 storage nodes with 2 OSDs each. The journals > are on (separate) DC 3700s, the actual OSDs are RAID6 behind an Areca 1882 > with 4GB of cache. > > Running this fio: > > fio --size=400m --ioengine=l

Re: [ceph-users] Deep-Scrub Scheduling

2014-05-07 Thread Gregory Farnum
Is it possible you're running into the max scrub intervals and jumping up to one-per-OSD from a much lower normal rate? On Wednesday, May 7, 2014, Mike Dawson wrote: > My write-heavy cluster struggles under the additional load created by > deep-scrub from time to time. As I have instrumented the

Re: [ceph-users] Slow IOPS on RBD compared to journal and backing devices

2014-05-07 Thread Gregory Farnum
do you have begin the raid6 ?) > > > Aslo, I known that direct ios can be quite slow with ceph, > > maybe can you try without --direct=1 > > and also enable rbd_cache > > ceph.conf > [client] > rbd cache = true > > > > > - Mail original - > > De: &q

Re: [ceph-users] 0.67.7 rpms changed today??

2014-05-08 Thread Gregory Farnum
On Thu, May 8, 2014 at 9:09 AM, Dan van der Ster wrote: > Dear ceph repo admins, > > Today our repo synchronization detected that the 0.67.7 rpms from > http://ceph.com/rpm-dumpling/el6/x86_64/ have changed, namely: > > Repo: ceph-dumpling-el6 > + ceph-0.67.7-0.el6.x86_64.

Re: [ceph-users] Migrate whole clusters

2014-05-09 Thread Gregory Farnum
I don't think anybody's done this before, but that will functionally work, yes. Depending on how much of the data in the cluster you actually care about, you might be better off just taking it out (rbd export/import or something) instead of trying to incrementally move all the data over, but...*shr
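
For reference, a minimal sketch of the export/import route Greg mentions, assuming an image named image1 in pool rbd and a second cluster reachable through its own conf file (names are illustrative):

    rbd export rbd/image1 /tmp/image1.img
    rbd -c /etc/ceph/new-cluster.conf import /tmp/image1.img rbd/image1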

Re: [ceph-users] issues with ceph

2014-05-09 Thread Gregory Farnum
I'm less current on the kernel client, so maybe there are some since-fixed bugs I'm forgetting, but: On Fri, May 9, 2014 at 8:55 AM, Aronesty, Erik wrote: > I can always remount and see them. > > But I wanted to preserve the "broken" state and see if I could figure out why > it was happening.

Re: [ceph-users] Low latency values

2014-05-09 Thread Gregory Farnum
The recovery_state "latencies" are all about how long your PGs are in various states of recovery; they're not per-operation latencies. 3 days still seems awfully long, but if you had a lot of data that needed to get recovered and were throttling it tightly enough that could happen. -Greg Software E

Re: [ceph-users] Low latency values

2014-05-09 Thread Gregory Farnum
On Fri, May 9, 2014 at 10:49 AM, Dan Ryder (daryder) wrote: > Thanks Greg, > > That makes sense. > > Can you also confirm that latency values are always in seconds? > I haven't seen any documentation for it and want to be sure before I say it > is one way or the other. I believe that's the case,
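
For reference, the per-op latency counters come from the admin socket as a sum (in seconds) plus an avgcount, so average latency = sum / avgcount. A quick way to inspect one (daemon name illustrative):

    ceph daemon osd.0 perf dump | python -m json.tool | grep -A 3 op_r_latency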

Re: [ceph-users] Where is the SDK of ceph object storage

2014-05-13 Thread Gregory Farnum
On Mon, May 12, 2014 at 11:55 PM, wsnote wrote: > Hi, everyone! > Where can I find the SDK of ceph object storage? > Python: boto > C++: libs3 which I found in the src of ceph and github.com/ceph/libs3. > where are those for other languages? Does ceph supply them? > Otherwise I use the SDK of Amazon

Re: [ceph-users] crushmap question

2014-05-13 Thread Gregory Farnum
You just use a type other than "rack" in your chooseleaf rule. In your case, "host". When using chooseleaf, the bucket type you specify is the failure domain which it must segregate across. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com On Tue, May 13, 2014 at 12:52 AM, Cao, B
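
A minimal sketch of the kind of rule Greg describes (names and numbers illustrative):

    rule replicated_per_host {
            ruleset 1
            type replicated
            min_size 1
            max_size 10
            step take default
            step chooseleaf firstn 0 type host
            step emit
    }

With chooseleaf ... type host, CRUSH picks N distinct hosts and one OSD under each, so no two replicas share a host.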

Re: [ceph-users] Occasional Missing Admin Sockets

2014-05-13 Thread Gregory Farnum
On Tue, May 13, 2014 at 9:06 AM, Mike Dawson wrote: > All, > > I have a recurring issue where the admin sockets > (/var/run/ceph/ceph-*.*.asok) may vanish on a running cluster while the > daemons keep running Hmm. >(or restart without my knowledge). I'm guessing this might be involved: > I see

Re: [ceph-users] Migrate whole clusters

2014-05-13 Thread Gregory Farnum
On Tue, May 13, 2014 at 11:36 AM, Fred Yang wrote: > I have to say I'm shocked to see the suggestion is rbd import/export if 'you > care the data'. These kind of operation is common use case and should be an > essential part of any distributed storage. What if I have a hundred node > cluster runni

Re: [ceph-users] Occasional Missing Admin Sockets

2014-05-13 Thread Gregory Farnum
nning 0.80.1 just like the description in > Issue 7188 [0]. > > 0: http://tracker.ceph.com/issues/7188 > > Should that bug be reopened? > > Thanks, > Mike Dawson > > > > On 5/13/2014 2:10 PM, Gregory Farnum wrote: >> >> On Tue, May 13, 2014 at 9:06 AM

Re: [ceph-users] Migrate whole clusters

2014-05-13 Thread Gregory Farnum
Assuming you have the spare throughput-/IOPS for Ceph to do its thing without disturbing your clients, this will work fine. -Greg On Tuesday, May 13, 2014, Gandalf Corvotempesta < gandalf.corvotempe...@gmail.com> wrote: > 2014-05-13 21:21 GMT+02:00 Gregory Farnum > >: > &

Re: [ceph-users] crushmap question

2014-05-14 Thread Gregory Farnum
) > > -Original Message- > From: Cao, Buddy > Sent: Wednesday, May 14, 2014 1:30 PM > To: 'Gregory Farnum' > Cc: ceph-users@lists.ceph.com > Subject: RE: [ceph-users] crushmap question > > Thanks Gregory so much, it solved the problem! > > > Wei C

Re: [ceph-users] Advanced CRUSH map rules

2014-05-14 Thread Gregory Farnum
On Wed, May 14, 2014 at 9:56 AM, Fabrizio G. Ventola wrote: > Hi everybody, > > Is it possible with CRUSH map to make a rule that puts R-1 replicas on > a node and the remaining one on a different node of the same failure > domain (for example datacenter) putting the replicas considering a > deepe

Re: [ceph-users] Advanced CRUSH map rules

2014-05-14 Thread Gregory Farnum
On Wed, May 14, 2014 at 10:52 AM, Pavel V. Kaygorodov wrote: > Hi! > >> CRUSH can do this. You'd have two choose ...emit sequences; >> the first of which would descend down to a host and then choose n-1 >> devices within the host; the second would descend once. I think >> something like this shoul
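
A rough sketch of the two-pass rule shape being described, with illustrative names; note that the two take/emit passes don't exclude each other's choices, so the extra replica from the second pass isn't guaranteed to land on a different host:

    rule n_minus_one_per_host {
            ruleset 2
            type replicated
            min_size 2
            max_size 10
            # first pass: one host, then n-1 OSDs inside it
            step take default
            step choose firstn 1 type host
            step choose firstn -1 type osd
            step emit
            # second pass: one more replica on some host
            step take default
            step chooseleaf firstn 1 type host
            step emit
    }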

Re: [ceph-users] Why number of objects increase when a PG is added

2014-05-14 Thread Gregory Farnum
On Wed, May 14, 2014 at 12:12 PM, Shesha Sreenivasamurthy wrote: > Hi, >I was experimenting with Ceph and found an interesting behavior (at > least to me) : Number of objects doubled when a new placement group was > added. > > Experiment Set Up: > > 3 Nodes with one OSD per node > Replication

Re: [ceph-users] simultaneous access to ceph via librados and s3 gw

2014-05-14 Thread Gregory Farnum
On Wed, May 14, 2014 at 2:42 PM, Lukac, Erik wrote: > Hi there, > > does anybody have an idea, how I can access my files created via librados > through the s3 gateway on my ceph-cluster? > > Uploading via librados and then accessing via s3 seems to be impossible > because I only see a bunch of ent

Re: [ceph-users] Does CEPH rely on any multicasting?

2014-05-15 Thread Gregory Farnum
On Thu, May 15, 2014 at 9:52 AM, Amit Vijairania wrote: > Hello! > > Does CEPH rely on any multicasting? Appreciate the feedback.. Nope! All networking is point-to-point. -Greg Software Engineer #42 @ http://inktank.com | http://ceph.com ___ ceph-users

Re: [ceph-users] attive+degraded cluster

2014-05-16 Thread Gregory Farnum
On Friday, May 16, 2014, Ignazio Cassano wrote: > Hi all, I successfully installed a ceph cluster firefly version made up > of 3 osd and one monitor host. > After that I created a pool and 1 rbd object for kvm. > It works fine. > I verified my pool has a replica size = 3 but as I read the de

Re: [ceph-users] CephFS parallel reads from multiple replicas ?

2014-05-19 Thread Gregory Farnum
On Sat, May 17, 2014 at 5:40 PM, Michal Pazdera wrote: > Hi everyone, > > I wonder if CephFS is able to read from all replicas simultaneously, which > would result in doubled read performance if > a replica count of 2 is used. Nope, definitely not. > I have done some humble testing on 5PCs (2OSD >

Re: [ceph-users] metadata pool : size growing

2014-05-19 Thread Gregory Farnum
On Mon, May 19, 2014 at 4:05 AM, Florent B wrote: > On 05/19/2014 12:40 PM, Wido den Hollander wrote: >> On 05/19/2014 12:31 PM, Florent B wrote: >>> Hi all, >>> >>> I use CephFS and I am wondering what does "metadata" pool contains >>> exactly ? >>> >> >> It contains the directory structure of yo

Re: [ceph-users] is cephfs ready for production ?

2014-05-19 Thread Gregory Farnum
On Mon, May 19, 2014 at 9:05 AM, Ignazio Cassano wrote: > Hi all, I'd like to know if cephfs is in heavy development or if it is ready > for production. > Documentation reports it is not for production, but I think the documentation on > ceph.com is not recent enough. There are groups successfully us

Re: [ceph-users] Expanding pg's of an erasure coded pool

2014-05-20 Thread Gregory Farnum
This failure means the messenger subsystem is trying to create a thread and is getting an error code back — probably due to a process or system thread limit that you can turn up with ulimit. This is happening because a replicated PG primary needs a connection to only its replicas (generally 1 or 2
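
To check and raise the limit in question (paths and values illustrative):

    # see what a running OSD is actually allowed
    grep 'Max processes' /proc/$(pgrep -o ceph-osd)/limits
    # raise the limit in the shell or init script that launches the daemons
    ulimit -u 32768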

Re: [ceph-users] Expanding pg's of an erasure coded pool

2014-05-21 Thread Gregory Farnum
On Wed, May 21, 2014 at 3:52 AM, Kenneth Waegeman wrote: > Thanks! I increased the max processes parameter for all daemons quite a lot > (until ulimit -u 3802720) > > These are the limits for the daemons now.. > [root@ ~]# cat /proc/17006/limits > Limit Soft Limit Har

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 9:20 PM, Erming Pei wrote: > Hi, > > I tried to set up a read-only permission for a client but it looks always > writable. > > I did the following: > > ==Server end== > > [client.cephfs_data_ro] > key = AQxx== > caps mon = "allow r" > caps
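
One common cause of an apparently writable "read-only" mount is OSD caps that still allow writes; a sketch of caps restricted to read on the data pool (pool name assumed to be 'data'):

    ceph auth caps client.cephfs_data_ro \
            mon 'allow r' \
            mds 'allow' \
            osd 'allow r pool=data'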

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Gregory Farnum
On Wed, Sep 2, 2015 at 10:00 AM, Janusz Borkowski wrote: > Hi! > > I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). > > The effect is the same when doing "echo >>" from another machine and from a > machine keeping the file open. > > The file is opened with open( .., > O_WRONLY|O_LA

Re: [ceph-users] Appending to an open file - O_APPEND flag

2015-09-02 Thread Gregory Farnum
Whoops, forgot to add Zheng. On Wed, Sep 2, 2015 at 10:11 AM, Gregory Farnum wrote: > On Wed, Sep 2, 2015 at 10:00 AM, Janusz Borkowski > wrote: >> Hi! >> >> I mount cephfs using kernel client (3.10.0-229.11.1.el7.x86_64). >> >> The effect is the same when

Re: [ceph-users] cephfs read-only setting doesn't work?

2015-09-02 Thread Gregory Farnum
pool? Mounting it on another client and seeing if changes are reflected there would do it. Or unmounting the filesystem, mounting again, and seeing if the file has really changed. -Greg > > Thanks! > > Erming > > > > On 9/2/15, 2:44 AM, Gregory Farnum wrote: >> >>

Re: [ceph-users] OSD respawning -- FAILED assert(clone_size.count(clone))

2015-09-03 Thread Gregory Farnum
On Thu, Sep 3, 2015 at 7:48 AM, Chris Taylor wrote: > I removed the latest OSD that was respawing (osd.23) and now I having the > same problem with osd.30. It looks like they both have pg 3.f9 in common. I > tried "ceph pg repair 3.f9" but the OSD is still respawning. > > Does anyone have any idea

Re: [ceph-users] How objects are reshuffled on addition of new OSD

2015-09-08 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 2:31 AM, Shesha Sreenivasamurthy wrote: > I had a question regarding how OSD locations are determined by CRUSH. > > From the CRUSH paper I gather that the replica locations of an object (A) is > a vector (v) that is got by the function c(r,x) = (hash (x) + rp) mod m). It is

Re: [ceph-users] Inconsistency in 'ceph df' stats

2015-09-08 Thread Gregory Farnum
This comes up periodically on the mailing list; see eg http://www.spinics.net/lists/ceph-users/msg15907.html I'm not sure if your case fits within those odd parameters or not, but I bet it does. :) -Greg On Mon, Aug 31, 2015 at 8:16 PM, Stillwell, Bryan wrote: > On one of our staging ceph cluste

Re: [ceph-users] how to improve ceph cluster capacity usage

2015-09-08 Thread Gregory Farnum
On Tue, Sep 1, 2015 at 3:58 PM, huang jun wrote: > hi,all > > Recently, i did some experiments on OSD data distribution, > we set up a cluster with 72 OSDs,all 2TB sata disk, > and ceph version is v0.94.3 and linux kernel version is 3.18, > and set "ceph osd crush tunables optimal". > There are 3
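
Two commands that help with uneven utilization on a hammer cluster (the threshold is illustrative; it reweights OSDs more than 110% of mean utilization):

    ceph osd df
    ceph osd reweight-by-utilization 110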

Re: [ceph-users] rebalancing taking very long time

2015-09-08 Thread Gregory Farnum
On Wed, Sep 2, 2015 at 9:34 PM, Bob Ababurko wrote: > When I lose a disk OR replace a OSD in my POC ceph cluster, it takes a very > long time to rebalance. I should note that my cluster is slightly unique in > that I am using cephfs(shouldn't matter?) and it currently contains about > 310 million

Re: [ceph-users] osds on 2 nodes vs. on one node

2015-09-08 Thread Gregory Farnum
On Fri, Sep 4, 2015 at 12:24 AM, Deneau, Tom wrote: > After running some other experiments, I see now that the high single-node > bandwidth only occurs when ceph-mon is also running on that same node. > (In these small clusters I only had one ceph-mon running). > If I compare to a single-node wher

Re: [ceph-users] CephFS/Fuse : detect package upgrade to remount

2015-09-08 Thread Gregory Farnum
On Fri, Sep 4, 2015 at 9:15 AM, Florent B wrote: > Hi everyone, > > I would like to know if there is a way on Debian to detect an upgrade of > ceph-fuse package, that "needs" remouting CephFS. > > When I upgrade my systems, I do a "aptitude update && aptitude > safe-upgrade". > > When ceph-fuse pa

Re: [ceph-users] CephFS and caching

2015-09-08 Thread Gregory Farnum
On Thu, Sep 3, 2015 at 11:58 PM, Kyle Hutson wrote: > I was wondering if anybody could give me some insight as to how CephFS does > its caching - read-caching in particular. > > We are using CephFS with an EC pool on the backend with a replicated cache > pool in front of it. We're seeing some very

Re: [ceph-users] CephFS/Fuse : detect package upgrade to remount

2015-09-08 Thread Gregory Farnum
On Tue, Sep 8, 2015 at 2:33 PM, Florent B wrote: > > > On 09/08/2015 03:26 PM, Gregory Farnum wrote: >> On Fri, Sep 4, 2015 at 9:15 AM, Florent B wrote: >>> Hi everyone, >>> >>> I would like to know if there is a way on Debian to detect an upgrade of >

Re: [ceph-users] A few questions and remarks about cephx

2015-09-08 Thread Gregory Farnum
On Sun, Sep 6, 2015 at 10:07 AM, Marin Bernard wrote: > Hi, > > I've just setup Ceph Hammer (latest version) on a single node (1 MON, 1 > MDS, 4 OSDs) for testing purposes. I used ceph-deploy. I only > configured CephFS as I don't use RBD. My pool config is as follows: > > $ sudo ceph df > GLOBAL:

Re: [ceph-users] CephFS and caching

2015-09-09 Thread Gregory Farnum
of the developer testing kernels or something? I think Ilya might have mentioned some issues with readahead being artificially blocked, but that might have only been with RBD. Oh, are the files you're using sparse? There was a bug with sparse files not filling in pages that just got patched yester

Re: [ceph-users] CephFS and caching

2015-09-09 Thread Gregory Farnum
On Wed, Sep 9, 2015 at 4:26 PM, Kyle Hutson wrote: > > > On Wed, Sep 9, 2015 at 9:34 AM, Gregory Farnum wrote: >> >> On Wed, Sep 9, 2015 at 3:27 PM, Kyle Hutson wrote: >> > We are using Hammer - latest released version. How do I check if it's >> >

Re: [ceph-users] Ceph.conf

2015-09-10 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 9:44 AM, Shinobu Kinjo wrote: > Hello, > > I'm seeing 859 parameters in the output of: > > $ ./ceph --show-config | wc -l > *** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH *** > 859 > > In: > > $ ./ceph --version > *** DEVELOPER MODE: se
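
To see what a running daemon is actually using, and (on builds that support it) only the values that differ from the compiled-in defaults (daemon name illustrative):

    ceph daemon osd.0 config show | wc -l
    ceph daemon osd.0 config diff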

Re: [ceph-users] higher read iop/s for single thread

2015-09-10 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 2:34 PM, Stefan Priebe - Profihost AG wrote: > Hi, > > while we're happy running ceph firefly in production and also reach > enough 4k read iop/s for multithreaded apps (around 23 000) with qemu 2.2.1. > > We've now a customer having a single threaded application needing ar

Re: [ceph-users] higher read iop/s for single thread

2015-09-11 Thread Gregory Farnum
On Fri, Sep 11, 2015 at 9:52 AM, Nick Fisk wrote: >> -Original Message- >> From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >> Mark Nelson >> Sent: 10 September 2015 16:20 >> To: ceph-users@lists.ceph.com >> Subject: Re: [ceph-users] higher read iop/s for single thr

Re: [ceph-users] 9 PGs stay incomplete

2015-09-11 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 9:46 PM, Wido den Hollander wrote: > Hi, > > I'm running into a issue with Ceph 0.94.2/3 where after doing a recovery > test 9 PGs stay incomplete: > > osdmap e78770: 2294 osds: 2294 up, 2294 in > pgmap v1972391: 51840 pgs, 7 pools, 220 TB data, 185 Mobjects >755 TB

Re: [ceph-users] Query about contribution regarding monitoring of Ceph Object Storage

2015-09-14 Thread Gregory Farnum
On Sat, Sep 12, 2015 at 6:13 AM, pragya jain wrote: > Hello all > > I am carrying out research in the area of cloud computing under Department > of CS, University of Delhi. I would like to contribute my research work > regarding monitoring of Ceph Object Storage to the Ceph community. > > Please h

Re: [ceph-users] rados bench seq throttling

2015-09-14 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 1:02 PM, Deneau, Tom wrote: > Running 9.0.3 rados bench on a 9.0.3 cluster... > In the following experiments this cluster is only 2 osd nodes, 6 osds each > and a separate mon node (and a separate client running rados bench). > > I have two pools populated with 4M objects.

Re: [ceph-users] Ceph performance, empty vs part full

2015-09-14 Thread Gregory Farnum
formance, empty vs part full > Hrm, I think it will follow the merge/split rules if it's out of > whack given the new settings, but I don't know that I've ever > teste

Re: [ceph-users] CephFS and caching

2015-09-14 Thread Gregory Farnum
On Thu, Sep 10, 2015 at 1:07 PM, Kyle Hutson wrote: > A 'rados -p cachepool ls' takes about 3 hours - not exactly useful. > > I'm intrigued that you say a single read may not promote it into the cache. > My understanding is that if you have an EC-backed pool the clients can't > talk to them direct

Re: [ceph-users] Cephfs total throughput

2015-09-15 Thread Gregory Farnum
On Tue, Sep 15, 2015 at 9:10 AM, Barclay Jameson wrote: > So, I asked this on the irc as well but I will ask it here as well. > > When one does 'ceph -s' it shows client IO. > > The question is simple. > > Is this total throughput or what the clients would see? > > Since it's replication factor of

Re: [ceph-users] Cephfs total throughput

2015-09-15 Thread Gregory Farnum
lay Jameson wrote: >>> >>> Unfortunately, it's not longer idle as my CephFS cluster is now in >>> production :) >>> >>> On Tue, Sep 15, 2015 at 11:17 AM, Gregory Farnum >>> wrote: >>>> >>>> On Tue, Sep 15, 2015 at 9:10

Re: [ceph-users] ceph-fuse failed with mount connection time out

2015-09-17 Thread Gregory Farnum
On Thu, Sep 17, 2015 at 1:15 AM, Fulin Sun wrote: > Hi, experts > > While doing the command > ceph-fuse /home/ceph/cephfs > > I got the following error : > > ceph-fuse[28460]: starting ceph client > 2015-09-17 16:03:33.385602 7fabf999b780 -1 init, newargv = 0x2c730c0 > newargc=11 > ceph-fuse

Re: [ceph-users] benefit of using stripingv2

2015-09-17 Thread Gregory Farnum
On Wed, Sep 16, 2015 at 11:56 AM, Corin Langosch wrote: > Hi guys, > > afaik rbd always splits the image into chunks of size 2^order (2^22 = 4MB by > default). What's the benefit of specifying > the feature flag "STRIPINGV2"? I couldn't find any documenation about it > except > http://ceph.com/d

Re: [ceph-users] leveldb compaction error

2015-09-17 Thread Gregory Farnum
On Thu, Sep 17, 2015 at 12:41 AM, Selcuk TUNC wrote: > hello, > > we have noticed leveldb compaction on mount causes a segmentation fault in > hammer release(0.94). > It seems related to this pull request (github.com/ceph/ceph/pull/4372). Are > you planning to backport > this fix to next hammer re

Re: [ceph-users] benefit of using stripingv2

2015-09-17 Thread Gregory Farnum
On Thu, Sep 17, 2015 at 7:55 AM, Corin Langosch wrote: > Hi Greg, > > On 17.09.2015 at 16:42, Gregory Farnum wrote: >> Briefly, if you do a lot of small direct IOs (for instance, a database >> journal) then striping lets you send each sequential write to a >> separ
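
A sketch of creating an image with a non-default striping layout (sizes illustrative): 64 KB stripe units striped across 8 objects inside the usual 4 MB (order 22) objects, which spreads a stream of small sequential writes over several objects/OSDs:

    rbd create rbd/journal-img --image-format 2 --size 10240 \
            --order 22 --stripe-unit 65536 --stripe-count 8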

Re: [ceph-users] Using cephfs with hadoop

2015-09-18 Thread Gregory Farnum
On Thu, Sep 17, 2015 at 7:48 PM, Fulin Sun wrote: > Hi, guys > > I am wondering if I am able to deploy ceph and hadoop into different cluster > nodes and I can > > still use cephfs as the backend for hadoop access. > > For example, ceph in cluster 1 and hadoop in cluster 2, while cluster 1 and > c

Re: [ceph-users] multi-datacenter crush map

2015-09-18 Thread Gregory Farnum
On Fri, Sep 18, 2015 at 4:57 AM, Wouter De Borger wrote: > Hi all, > > I have found on the mailing list that it should be possible to have a multi > datacenter setup, if latency is low enough. > > I would like to set this up, so that each datacenter has at least two > replicas and each PG has a re
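
A sketch of a rule for the size=4 layout discussed later in the thread (two replicas in each of two datacenters; bucket names illustrative):

    rule two_dc {
            ruleset 3
            type replicated
            min_size 2
            max_size 4
            step take default
            step choose firstn 2 type datacenter
            step chooseleaf firstn 2 type host
            step emit
    }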

Re: [ceph-users] Potential OSD deadlock?

2015-09-21 Thread Gregory Farnum
So it sounds like you've got two different things here: 1) You get a lot of slow operations that show up as warnings. 2) Rarely, you get blocked op warnings that don't seem to go away until the cluster state changes somehow. (2) is the interesting one. Since you say the cluster is under heavy loa
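
The per-OSD admin socket is the usual way to see what those blocked requests are stuck on (OSD id illustrative):

    ceph daemon osd.12 dump_ops_in_flight
    ceph daemon osd.12 dump_historic_ops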

Re: [ceph-users] multi-datacenter crush map

2015-09-21 Thread Gregory Farnum
On Mon, Sep 21, 2015 at 7:07 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > > > > On Mon, Sep 21, 2015 at 3:02 AM, Wouter De Borger wrote: >> Thank you for your answer! We will use size=4 and min_size=2, which should >> do the trick. >> >> For the monitor issue,

Re: [ceph-users] Potential OSD deadlock?

2015-09-21 Thread Gregory Farnum

Re: [ceph-users] CephFS Fuse Issue

2015-09-21 Thread Gregory Farnum
Do you have a core file from the crash? If you do and can find out which pointers are invalid that would help...I think "cct" must be the broken one, but maybe it's just the Inode* or something. -Greg On Mon, Sep 21, 2015 at 2:03 PM, Scottix wrote: > I was rsyncing files to ceph from an older mac

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Gregory Farnum
On Mon, Sep 21, 2015 at 11:43 PM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > I'm starting to wonder if this has to do with some OSDs getting full > or the 0.94.3 code. Earlier this afternoon, I cleared out my test > cluster so there was no pools. I created anew r

Re: [ceph-users] Potential OSD deadlock?

2015-09-22 Thread Gregory Farnum
On Tue, Sep 22, 2015 at 7:24 AM, Robert LeBlanc wrote: > -BEGIN PGP SIGNED MESSAGE- > Hash: SHA256 > > Is there some way to tell in the logs that this is happening? You can search for the (mangled) name _split_collection > I'm not > seeing much I/O, CPU usage during these times. Is there

Re: [ceph-users] Basic object storage question

2015-09-24 Thread Gregory Farnum
On Thu, Sep 24, 2015 at 2:06 AM, Ilya Dryomov wrote: > On Thu, Sep 24, 2015 at 7:05 AM, Robert LeBlanc wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA256 >> >> If you use RADOS gateway, RBD or CephFS, then you don't need to worry >> about striping. If you write your own application that

Re: [ceph-users] Seek advice for using Ceph to provice NAS service

2015-09-24 Thread Gregory Farnum
On Tue, Sep 22, 2015 at 7:21 PM, Jevon Qiao wrote: > Hi Sage and other Ceph experts, > > This is a greeting from Jevon, I'm from China and working in a company which > are using Ceph as the backend storage. At present, I'm evaluating the > following two options of using Ceph cluster to provide NAS

Re: [ceph-users] Basic object storage question

2015-09-24 Thread Gregory Farnum
On Sep 24, 2015 5:12 PM, "Cory Hawkless" wrote: > > Hi all, thanks for the replies. > So my confusion was because I was using "rados put test.file someobject testpool" > This command does not seem to split my 'files' into chunks when they are saved as 'objects', hence the terminology > > Upon bolt

Re: [ceph-users] [sepia] debian jessie repository ?

2015-09-29 Thread Gregory Farnum
On Tue, Sep 29, 2015 at 3:59 AM, Jogi Hofmüller wrote: > Hi, > > On 2015-09-25 at 22:23, Udo Lembke wrote: > >> you can use this sources-list >> >> cat /etc/apt/sources.list.d/ceph.list >> deb http://gitbuilder.ceph.com/ceph-deb-jessie-x86_64-basic/ref/v0.94.3 >> jessie main > > The thing is: wh

Re: [ceph-users] CephFS file to rados object mapping

2015-09-29 Thread Gregory Farnum
The formula for objects in a file is <inode number (in hex)>.<object sequence number>. So you'll have noticed they all look something like 12345.0001, 12345.0002, 12345.0003, ... So if you've got a particular inode and file size, you can generate a list of all the possible objects in it. To find the object->OSD mapping you'd need t
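
A sketch of the whole chain on a live cluster, assuming the data pool is named cephfs_data and the filesystem is mounted at /mnt/cephfs:

    ino=$(stat -c %i /mnt/cephfs/path/to/file)
    printf -v hexino '%x' "$ino"
    # list the file's backing objects
    rados -p cephfs_data ls | grep "^${hexino}\."
    # map one of them to its PG and OSDs
    obj=$(rados -p cephfs_data ls | grep "^${hexino}\." | head -1)
    ceph osd map cephfs_data "$obj"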

Re: [ceph-users] Fwd: CephFS : check if rados objects are linked to inodes

2015-10-01 Thread Gregory Farnum
On Thu, Oct 1, 2015 at 4:42 AM, John Spray wrote: > On Thu, Oct 1, 2015 at 12:35 PM, Florent B wrote: >> Thank you John, I think about it. I did : >> >> # get inodes in pool >> rados -p my_pool ls | cut -d '.' -f 1 | uniq >> -u # returns 132886 unique inode

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-02 Thread Gregory Farnum
On Fri, Oct 2, 2015 at 1:57 AM, John Spray wrote: > On Fri, Oct 2, 2015 at 2:42 AM, Goncalo Borges > wrote: >> Dear CephFS Gurus... >> >> I have a question regarding ceph-fuse and its memory usage. >> >> 1./ My Ceph and CephFS setups are the following: >> >> Ceph: >> a. ceph 9.0.3 >> b. 32 OSDs d

Re: [ceph-users] Can't mount cephfs to host outside of cluster

2015-10-06 Thread Gregory Farnum
On Mon, Oct 5, 2015 at 11:21 AM, Egor Kartashov wrote: > Hello! > > I have a cluster of 3 machines with ceph 0.80.10 (package shipped with Ubuntu > Trusty). Ceph successfully mounts on all of them. On an external machine I'm > receiving the error "can't read superblock" and dmesg shows records like: > > [1

Re: [ceph-users] Correct method to deploy on jessie

2015-10-06 Thread Gregory Farnum
On Mon, Oct 5, 2015 at 10:36 PM, Dmitry Ogorodnikov wrote: > Good day, > > I think I will use wheezy for now for tests. Bad thing is wheezy full > support ends in 5 months, so wheezy is not ok for a persistent production > cluster. > > I can't find out what the ceph team offers to debian users, move to ot

Re: [ceph-users] memory stats

2015-10-06 Thread Gregory Farnum
On Mon, Oct 5, 2015 at 10:40 PM, Serg M wrote: > What difference between memory statistics of "ceph tell {daemon}.{id} heap > stats" Assuming you're using tcmalloc (by default you are) this will get information straight from the memory allocator about what the actual daemon memory usage is. > ,
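
The tcmalloc commands referred to here (daemon name illustrative):

    ceph tell osd.0 heap stats
    ceph tell osd.0 heap release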

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-08 Thread Gregory Farnum
On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke wrote: > Hammer 0.94.3 does not support a 'dump cache' mds command. > 'dump_ops_in_flight' does not list any pending operations. Is there any > other way to access the cache? "dumpcache", it looks like. You can get all the supported commands with "he

Re: [ceph-users] OSD reaching file open limit - known issues?

2015-10-08 Thread Gregory Farnum
On Fri, Sep 25, 2015 at 10:04 AM, Jan Schermer wrote: > I get that, even though I think it should be handled more gracefuly. > But is it expected to also lead to consistency issues like this? I don't think it's expected, but obviously we never reproduced it in the lab. Given that dumpling is EOL

Re: [ceph-users] Peering algorithm questions

2015-10-08 Thread Gregory Farnum
On Tue, Sep 29, 2015 at 12:08 AM, Balázs Kossovics wrote: > Hey! > > I'm trying to understand the peering algorithm based on [1] and [2]. There > are things that aren't really clear or I'm not entirely sure if I understood > them correctly, so I'd like to ask some clarification on the points below

Re: [ceph-users] CephFS file to rados object mapping

2015-10-08 Thread Gregory Farnum
copy of an object when scrubbing. If you have 3+ copies I'd recommend checking each of them and picking the one that's duplicated... -Greg > > Andras > > > On 9/29/15, 9:58 AM, "Gregory Farnum" wrote: > >>The formula for objects in a file is .>seque

Re: [ceph-users] Rados python library missing functions

2015-10-08 Thread Gregory Farnum
On Thu, Oct 8, 2015 at 5:01 PM, Rumen Telbizov wrote: > Hello everyone, > > I am very new to Ceph so, please excuse me if this has already been > discussed. I couldn't find anything on the web. > > We are interested in using Ceph and access it directly via its native rados > API with python. We no

Re: [ceph-users] CephFS file to rados object mapping

2015-10-08 Thread Gregory Farnum
On Thu, Oct 8, 2015 at 6:45 PM, Francois Lafont wrote: > Hi, > > On 08/10/2015 22:25, Gregory Farnum wrote: > >> So that means there's no automated way to guarantee the right copy of >> an object when scrubbing. If you have 3+ copies I'd recommend checking &

Re: [ceph-users] Initial performance cluster SimpleMessenger vs AsyncMessenger results

2015-10-12 Thread Gregory Farnum
On Mon, Oct 12, 2015 at 9:50 AM, Mark Nelson wrote: > Hi Guy, > > Given all of the recent data on how different memory allocator > configurations improve SimpleMessenger performance (and the effect of memory > allocators and transparent hugepages on RSS memory usage), I thought I'd run > some test
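
For anyone wanting to repeat the comparison, the messenger implementation is selected in ceph.conf; in hammer-era builds the async messenger was still experimental and may additionally require the experimental-features switch:

    [global]
    ms type = async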

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-13 Thread Gregory Farnum
On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke wrote: > Hi, > > On 10/08/2015 09:14 PM, John Spray wrote: >> >> On Thu, Oct 8, 2015 at 7:23 PM, Gregory Farnum wrote: >>> >>> On Thu, Oct 8, 2015 at 6:29 AM, Burkhard Linke >>> wrote: >>>

Re: [ceph-users] CephFS file to rados object mapping

2015-10-13 Thread Gregory Farnum
On Fri, Oct 9, 2015 at 5:49 PM, Francois Lafont wrote: > Hi, > > Thanks for your answer Greg. > > On 09/10/2015 04:11, Gregory Farnum wrote: > >> The size of the on-disk file didn't match the OSD's record of the >> object size, so it rejected it. This wor

Re: [ceph-users] error while upgrading to infernalis last release on OSD serv

2015-10-19 Thread Gregory Farnum
As the infernalis release notes state, if you're upgrading you first need to step through the current development hammer branch or the (not-quite-release 0.94.4). -Greg On Thu, Oct 15, 2015 at 7:27 AM, German Anders wrote: > Hi all, > > I'm trying to upgrade a ceph cluster (prev hammer release) t

Re: [ceph-users] Ceph journal - isn't it a bit redundant sometimes?

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 11:18 AM, Jan Schermer wrote: > I'm sorry for appearing a bit dull (on purpose), I was hoping I'd hear what > other people using Ceph think. > > If I were to use RADOS directly in my app I'd probably rejoice at its > capabilities and how useful and non-legacy it is, but m

Re: [ceph-users] CephFS namespace

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 3:06 PM, Erming Pei wrote: > Hi, > >Is there a way to list the namespaces in cephfs? How to set it up? > >From man page of ceph.mount, I see this: > > To mount only part of the namespace: > > mount.ceph monhost1:/some/small/thing /mnt/thing > > But how t

Re: [ceph-users] CephFS namespace

2015-10-19 Thread Gregory Farnum
On Mon, Oct 19, 2015 at 3:26 PM, Erming Pei wrote: > I see. That's also what I needed. > Thanks. > > Can we only allow a part of the 'namespace' or directory tree to be mounted > from server end? Just like NFS exporting? > And even setting of permissions as well? This just got merged into the mas

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
On Tue, Oct 20, 2015 at 7:22 AM, John-Paul Robinson wrote: > Hi folks > > I've been rebuilding drives in my cluster to add space. This has gone > well so far. > > After the last batch of rebuilds, I'm left with one placement group in > an incomplete state. > > [sudo] password for jpr: > HEALTH_WA

Re: [ceph-users] pg incomplete state

2015-10-21 Thread Gregory Farnum
ver got started? > > An alternative idea I had was to take osd.30 back out of the cluster so > that pg 3.ae [30,11] would get mapped to some other osd to maintain > replication. This seems a bit heavy handed though, given that only this > one pg is affected. > > Thanks for an

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-21 Thread Gregory Farnum
On Tue, Oct 13, 2015 at 10:09 PM, Goncalo Borges wrote: > Hi all... > > Thank you for the feedback, and I am sorry for my delay in replying. > > 1./ Just to recall the problem, I was testing cephfs using fio in two > ceph-fuse clients: > > - Client A is in the same data center as all OSDs connecte

Re: [ceph-users] CephFS file to rados object mapping

2015-10-21 Thread Gregory Farnum
On Wed, Oct 14, 2015 at 7:20 PM, Francois Lafont wrote: > Hi, > > On 14/10/2015 06:45, Gregory Farnum wrote: > >>> Ok, however during my tests I had been careful to replace the correct >>> file by a bad file with *exactly* the same size (the content of the >>&

Re: [ceph-users] ceph-fuse crush

2015-10-21 Thread Gregory Farnum
On Thu, Oct 15, 2015 at 10:41 PM, 黑铁柱 wrote: > > cluster info: >cluster b23b48bf-373a-489c-821a-31b60b5b5af0 > health HEALTH_OK > monmap e1: 3 mons at > {node1=192.168.0.207:6789/0,node2=192.168.0.208:6789/0,node3=192.168.0.209:6789/0}, > election epoch 24, quorum 0,1,2 node1,node2,n

Re: [ceph-users] cephfs best practice

2015-10-21 Thread Gregory Farnum
On Wed, Oct 21, 2015 at 3:12 PM, Erming Pei wrote: > Hi, > > I am just wondering which use case is better: (within one single file > system) set up one data pool for each project, or let project to share a big > pool? I don't think anybody has that kind of operational experience. I think that i
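
If you do go the pool-per-project route, directories are typically pointed at a pool via file layouts once the pool has been added to the filesystem; a hammer-era sketch with illustrative names:

    ceph mds add_data_pool projectpool
    setfattr -n ceph.dir.layout.pool -v projectpool /mnt/cephfs/projects/projA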

Re: [ceph-users] CephFS and page cache

2015-10-21 Thread Gregory Farnum
On Sun, Oct 18, 2015 at 8:27 PM, Yan, Zheng wrote: > On Sat, Oct 17, 2015 at 1:42 AM, Burkhard Linke > wrote: >> Hi, >> >> I've noticed that CephFS (both ceph-fuse and kernel client in version 4.2.3) >> remove files from page cache as soon as they are not in use by a process >> anymore. >> >> Is

Re: [ceph-users] ceph-fuse and its memory usage

2015-10-22 Thread Gregory Farnum
On Thu, Oct 22, 2015 at 1:59 AM, Yan, Zheng wrote: > direct IO only bypass kernel page cache. data still can be cached in > ceph-fuse. If I'm correct, the test repeatedly writes data to 8M > files. The cache make multiple write assimilate into single OSD > write Ugh, of course. I don't see a tra

Re: [ceph-users] "stray" objects in empty cephfs data pool

2015-10-23 Thread Gregory Farnum
On Fri, Oct 23, 2015 at 7:08 AM, Burkhard Linke wrote: > Hi, > > On 10/14/2015 06:32 AM, Gregory Farnum wrote: >> >> On Mon, Oct 12, 2015 at 12:50 AM, Burkhard Linke >> wrote: >>> >>> > *snipsnap* >>> >>> Thanks, that did the tric

Re: [ceph-users] why was osd pool default size changed from 2 to 3.

2015-10-23 Thread Gregory Farnum
On Fri, Oct 23, 2015 at 8:17 AM, Stefan Eriksson wrote: > Hi > > I have been looking for info about "osd pool default size" and the reason > its 3 as default. > > I see it got changed in v0.82 from 2 to 3, > > Here its 2. > http://docs.ceph.com/docs/v0.81/rados/configuration/pool-pg-config-ref/ >
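
The relevant options for anyone who wants to change the defaults for their own deployment (the values shown are just the common 3-copies / 2-required-to-serve-writes arrangement):

    [global]
    osd pool default size = 3
    osd pool default min size = 2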
