My 0.02: there are two kinds of balance, one for space utilization and another for performance.
Now it seems you will be fine on space utilization, but you might suffer a bit on performance as the disk density increases. The new rack will hold 1/3 of the data on 1/5 of the disks, if we assume the work
Pre-allocate the volume by running "dd" across the entire RBD before you do any performance test :).
In this case, you may want to re-create the RBD, pre-allocate, and try again.
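A minimal sketch of what I mean by pre-allocating, using the kernel client (pool/image names are placeholders; an fio or librbd-based fill works just as well):
$ sudo rbd map rbd/testimage                               # exposes the image as e.g. /dev/rbd0
$ sudo dd if=/dev/zero of=/dev/rbd0 bs=4M oflag=direct     # touches every object once; dd stops with "no space left", which is expected
$ sudo rbd unmap /dev/rbd0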
> -Original Message-
> From: ceph-devel-ow...@vger.kernel.org [mailto:ceph-devel-
> ow...@vger.kernel.org] On Behalf O
Hi Mark,
The Async messenger result at 128K drops quickly after some point; is that because of the testing methodology?
The other conclusion, as it looks to me, is that simple messenger + jemalloc is the best practice so far, as it has the same performance as async but uses much less memory?
-Xiaoxi
Hi,
1. In short, the OSD needs to heartbeat with up to #PG x (#Replica - 1) peers, but in practice the number will be much lower since most of the peers are redundant.
For example, an OSD (say OSD 1) is holding 100 PGs; for some of these PGs, say PG 1, OSD 1 is the primary OSD of PG 1, so OSD 1 needs to peer
Hi Francois,
Actually you are discussing two separate questions here :)
1. With the 5 mons (2 in dc1, 2 in dc2, 1 across the WAN), can the monitors form a quorum? And how to offload the mon across the WAN?
Yes and no: in the case where you lose either of your DCs completely, that's fine, the remaining 3 monitors could
if there is data to be
> trimmed. I'm not a big fan of a "--skip-trimming" option as there is
> the potential to leave some orphan objects that may not be cleaned up
> correctly.
>
> On Tue, Jan 6, 2015 at 8:09 AM, Jake Young wrote:
> >
> >
> >
What do you think?
From: Jake Young [mailto:jak3...@gmail.com]
Sent: Monday, January 5, 2015 9:45 PM
To: Chen, Xiaoxi
Cc: Edwin Peer; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] rbd resize (shrink) taking forever and a day
On Sunday, January 4, 2015, Chen, Xiaoxi <xia
Some low-level caching might help: flashcache, dm-cache, etc.
But that may hurt reliability to some extent, and make things harder for the operator ☺
From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of
Lindsay Mathieson
Sent: Monday, January 5, 2015 12:14 PM
To: Christian Balzer
You could use rbd info to see the block_name_prefix; the object name is formed as <block_name_prefix>.<object_number>, so for example rb.0.ff53.3d1b58ba.e6ad should be the 0xe6ad'th object of the volume with block_name_prefix rb.0.ff53.3d1b58ba.
$ rbd info huge
rbd image 'huge':
size 1024 TB in 26843
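If you want to double-check the mapping, you can list the backing objects by grepping for the prefix (pool name "rbd" here is just an example; note that objects only exist once something has been written to them, so a fresh image may list very little):
$ rbd info huge | grep block_name_prefix
$ rados -p rbd ls | grep '^rb\.0\.ff53\.3d1b58ba\.' | head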
Did you shut down the node with 2 mons?
I think it might be impossible to have redundancy with only 2 nodes; the Paxos quorum is the reason:
say you have N (N = 2K+1) monitors, then you always have one node (let's name it node A) with the majority of the MONs (>= K+1), and another node (node B) with the minority
Hi,
First of all, the data is safe since it is persistent in the journal; if an error occurs on the OSD data partition, replaying the journal will get the data back.
Also, there is a wbthrottle there; you can configure how much data (IOs, bytes, inodes) you want to remain in memory. A background thread will
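For reference, these are the knobs I mean; a minimal ceph.conf sketch assuming an XFS-backed filestore (the numbers are only illustrative, not a recommendation):
[osd]
    filestore wbthrottle enable = true
    filestore wbthrottle xfs bytes start flusher = 41943040     # start flushing at ~40 MB dirty
    filestore wbthrottle xfs bytes hard limit = 419430400       # block writers at ~400 MB dirty
    filestore wbthrottle xfs ios start flusher = 500
    filestore wbthrottle xfs ios hard limit = 5000
    filestore wbthrottle xfs inodes start flusher = 500
    filestore wbthrottle xfs inodes hard limit = 5000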
Hi Yang bin,
I am not sure you followed the right docs. I suspect you didn't, because you should use ceph-disk and specify an FS-Type in the command.
I think you might have been misled by the quick start (http://ceph.com/docs/master/start/quick-ceph-deploy/#create-a-cluster), which uses a directory inst
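Something along these lines is what I had in mind (the device name is only an example, and ceph-disk will repartition it, so double-check before running):
$ sudo ceph-disk prepare --fs-type xfs /dev/sdb    # creates data + journal partitions and runs mkfs.xfs
$ sudo ceph-disk activate /dev/sdb1                # registers and starts the new OSD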
...@gmail.com]
Sent: Tuesday, December 2, 2014 1:27 PM
To: Chen, Xiaoxi
Cc: ceph-us...@ceph.com; Haomai Wang
Subject: Re: [ceph-users] LevelDB support status is still experimental on Giant?
Hi Xiaoxi,
Thanks for very useful information.
Can you share more details about the "terribly bad performance"?
had better optimize the key-value backend code to support the specific kind of load.
From: Haomai Wang [mailto:haomaiw...@gmail.com]
Sent: Monday, December 1, 2014 10:14 PM
To: Chen, Xiaoxi
Cc: Satoru Funai; ceph-us...@ceph.com
Subject: Re: [ceph-users] LevelDB support status is still
We have tested it for a while; basically it seems kind of stable, but it shows terribly bad performance.
This is not the fault of Ceph but of LevelDB, or more generally of all K-V storage with an LSM design (RocksDB, etc.); the LSM tree structure naturally introduces very large write amplification, 10X to
Hi Simon
Does your workload have lots of read-after-write (RAW)? Since Ceph has a RW lock on each object, if you have a write to the RBD and the following read happens to hit the same object, the latency will be higher.
Another possibility is the OSD op_wq; it's a priority queue, but read and write have the same pr
Hi Chris,
I am not an expert on LIO, but from your result it seems RBD/Ceph works well (RBD on the local system, no iSCSI) and LIO works well (ramdisk (no RBD) -> LIO target), and if you change LIO to use another interface (file, loopback) to play with RBD, it also works well.
So see
Hi Mark
It's client IOPS and we use replica = 2; the journal and OSD are hosted on the same SSDs, so the real IOPS is 23K * 2 * 2 = 92K, still far from the HW limit (30K+ for a single DCS3700).
CPU% is ~62% at peak (2 VMs), interrupts distributed.
As an additional piece of information, it seems the cluster is in a kind
Could you show your cache tiering configuration? Especially these three parameters:
ceph osd pool set hot-storage cache_target_dirty_ratio 0.4
ceph osd pool set hot-storage cache_target_full_ratio 0.8
ceph osd pool set {cachepool} target_max_bytes {#bytes}
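You can read back what is actually set from the OSD map (pool name "hot-storage" as in the examples above; on newer releases "ceph osd pool get" also accepts the cache parameters directly):
$ ceph osd dump | grep hot-storage
$ ceph osd pool get hot-storage cache_target_dirty_ratio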
From: ceph-users [mailto:ceph-users-bo
Yes, but usually a system has several layers of error-detecting/recovering mechanisms at different granularities.
Disk CRC works at the sector level, Ceph CRC mostly works at the object level, and we also have replication/erasure coding at the system level.
The CRC in Ceph mainly handles the case where, imagining you have a
Sent from my iPhone
The "random" may come from ceph trunks. For RBD, Ceph trunk the image to
4M(default) objects, for Rados bench , it already 4M objects if you didn't set
the parameters. So from XFS's view, there are lots of 4M files, in default,
with ag!=1 (allocation group, specified during mkfs, default seems t
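For reference, the allocation group count is fixed at mkfs time; a hypothetical example (device, mount point and count are placeholders, not a recommendation):
$ sudo mkfs.xfs -f -d agcount=4 /dev/sdb1              # create the OSD filesystem with 4 allocation groups
$ xfs_info /var/lib/ceph/osd/ceph-0 | grep agcount     # check what an existing filesystem was created with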
From: zrz...@gmail.com [mailto:zrz...@gmail.com] On Behalf Of Rongze Zhu
Sent: Monday, July 29, 2013 2:18 PM
To: Chen, Xiaoxi
Cc: Gregory Farnum; ceph-users@lists.ceph.com
Subject: Re: [ceph-users] add crush rule in one command
On Sat, Jul 27, 2013 at 4:25 PM, Chen, Xiaoxi <xiaoxi.c
My 0.02:
1. Why do you need to set the map simultaneously for your purpose? It is obviously very important for Ceph to have an atomic CLI, but that is because the map may be changed by the cluster itself (losing a node or the like), which is not your case. Since the map can be auto-distributed by ce
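If it helps, the usual way to apply a whole batch of CRUSH changes in one step is to edit the decompiled map offline and inject it back as a whole; a sketch (file names are arbitrary):
$ ceph osd getcrushmap -o crushmap.bin         # fetch the current map
$ crushtool -d crushmap.bin -o crushmap.txt    # decompile to text
$ vi crushmap.txt                              # add/adjust all rules in one go
$ crushtool -c crushmap.txt -o crushmap.new    # recompile
$ ceph osd setcrushmap -i crushmap.new         # inject; the whole change lands at once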
Sent from my iPhone
On Jul 23, 2013, at 4:35, "Charles 'Boyo" <charlesb...@gmail.com> wrote:
Hi,
On Mon, Jul 22, 2013 at 2:08 AM, Chen, Xiaoxi <xiaoxi.c...@intel.com> wrote:
Hi,
> Can you share any information on the SSD you are using, is it PCIe
connect
Sent from my iPhone
On Jul 22, 2013, at 23:16, "Gandalf Corvotempesta" wrote:
> 2013/7/22 Chen, Xiaoxi :
>> With “journal writeahead”, the data is first written to the journal, acked to the
>> client, and then written to the OSD; note that the data is always kept in memory before
>> it is written to both
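For completeness, the journal mode being described is selected in ceph.conf; a minimal sketch (writeahead is already the default on XFS, so this is mostly illustrative):
[osd]
    filestore journal writeahead = true    # journal first, ack, then write to the filestore
    # filestore journal parallel = true    # btrfs only: journal and filestore written in parallel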
Basically I think endurance is the most important thing for a Ceph journal, since the workload for the journal is pure writes; you can easily calculate how long your SSD will take to burn out. Even if we assume your SSD only runs at 100MB/s on average, you will burn through about 8TB/day and 240TB/month.
The DCS 3500 is definitely not use
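A quick back-of-envelope check of that figure (plain arithmetic, nothing Ceph-specific):
$ awk 'BEGIN { print 100 * 86400 / 1e6, "TB/day" }'    # 100 MB/s sustained for one day
8.64 TB/day
So roughly 8-9 TB written per day, or around 260 TB per month, which is in the same ballpark as the numbers above.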
Sent from my iPhone
On Jul 23, 2013, at 0:21, "Gandalf Corvotempesta" wrote:
> 2013/7/22 Chen, Xiaoxi :
>> Imagine you have several writes that have been flushed to the journal and acked, but
>> not yet written to disk. Now the system crashes from a kernel panic or power
>> failure; you will lose
Hi,
My 0.02 :
> Secondly, I'm unclear about how OSDs use the journal. It appears they write to the journal (in all cases, can't be turned
> off), ack to the client and then read the journal later to write to backing
> storage. Is that correct?
I would like to say no; the journal w
2:17 PM
To: Chen, Xiaoxi
Cc: ceph-de...@vger.kernel.org; ceph-us...@ceph.com
Subject: Re: Any concern about Ceph on CentOS
Hi Xiaoxi,
we are really running Ceph on CentOS-6.4
(6 server nodes, 3 client nodes, 160 OSDs).
We put a 3.8.13 Kernel on top and installed the ceph-0.61.4 cluster with
mkc
rstanding of the issue is that for the actual cluster itself, it should be OK.
I could be wrong here, but I thought the kernel module was only needed for mounting CephFS (and even then, there's a FUSE module that you *can* use anyway).
On 07/17/2013 11:18 AM, Chen, Xiaoxi
Hi list,
I would like to ask if anyone really runs Ceph on CentOS/RHEL. Since the kernel version for CentOS/RHEL is much older than that of Ubuntu, I am wondering whether there are any known performance or functionality issues.
Thanks to everyone who can share their insight on Ceph+CentOS.
threads. This is still too high for 8-core or 16-core CPUs and will waste a lot of cycles in context switching.
Sent from my iPhone
On Jun 7, 2013, at 0:21, "Gregory Farnum" wrote:
> On Thu, Jun 6, 2013 at 12:25 AM, Chen, Xiaoxi wrote:
>>
>> Hi,
>> From the code, each pi
Hi,
From the code, each pipe (which contains a TCP socket) forks 2 threads, a reader and a writer. We really do observe 100+ threads per OSD daemon with 30 instances of rados bench as clients.
But this number seems a bit crazy: if I have a 40-disk node, I will thus have 40 OSDs; we
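If anyone wants to reproduce the observation, counting the threads of a single OSD daemon is enough (this assumes one ceph-osd process on the node; otherwise pick a specific PID):
$ ls /proc/$(pidof -s ceph-osd)/task | wc -l    # threads in one OSD daemon
$ ps -eLf | grep '[c]eph-osd' | wc -l           # or: threads across all OSD daemons on the node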
iaoxi
-Original Message-
From: Gregory Farnum [mailto:g...@inktank.com]
Sent: June 4, 2013 0:37
To: Chen, Xiaoxi
Cc: ceph-de...@vger.kernel.org; Mark Nelson (mark.nel...@inktank.com);
ceph-us...@ceph.com
Subject: Re: [ceph-users] Ceph killed by OS because of OOM under high load
On Mon, Jun 3, 2013 at 8:
My 0.02: you really don't need to wait for HEALTH_OK between your recovery steps, just go ahead. Every time a new map is generated and broadcast, the old map and any in-progress recovery will be canceled.
Sent from my iPhone
On Jun 2, 2013, at 11:30, "Nigel Williams" wrote:
> Could I have a critique of this approach pl
Hi,
As my previous mail reported some weeks ago, we are suffering from OSD crashes / OSD flipping / system reboots and so on, and all these stability issues really stop us from digging further into Ceph characterization.
The good news is that we seem to have found the cause. Let me explain our experiment
Hi,
Can I assume I am safe without this patch if I don't use any RBD cache?
Sent from my iPhone
On May 29, 2013, at 16:00, "Alex Bligh" wrote:
>
> On 28 May 2013, at 06:50, Wolfgang Hennerbichler wrote:
>
>> for anybody who's interested, I've packaged the latest qemu-1.4.2 (not 1.5,
>> it didn't work nicel
I cannot agree more. When I try to promote Ceph to internal stakeholders, they always complain about the stability of Ceph, especially when they evaluate Ceph under high enough pressure and it cannot stay healthy during the test.
Sent from my iPhone
On May 29, 2013, at 19:13, "Wolfgang Hennerbichler" wrote:
> H
ormal.
Xiaoxi
-Original Message-
From: Chen, Xiaoxi
Sent: May 16, 2013 6:38
To: 'Sage Weil'
Subject: RE: [ceph-users] OSD state flipping when cluster-network in high
utilization
Uploaded to /home/cephdrop/xiaoxi_flip_osd/osdlog.tar.gz
Thanks
-Original Me
Thanks, but I am not quite sure how to determine whether the monitor is overloaded. And if it is, will starting several monitors help?
Sent from my iPhone
On May 15, 2013, at 23:07, "Jim Schutt" wrote:
> On 05/14/2013 09:23 PM, Chen, Xiaoxi wrote:
>>> How responsive generally is the machine
853'4329,4103'5330] local-les=4092 n=154 ec
=100 les/c 4092/4093 4091/4091/4034) [319,46] r=0 lpr=4091 mlcod 4103'5329
active+clean] do_op mode is idle(wr=0)
2013-05-15 15:29:22.513295 7f0253340700 10 osd.319 pg_epoch: 4113 pg[3.d7( v
4103'5330 (3853'4329,4103'5330]
d like to say it may be related to the CPU scheduler? The heartbeat thread (in a busy OSD) fails to get enough CPU cycles.
-Original Message-
From: ceph-devel-ow...@vger.kernel.org
[mailto:ceph-devel-ow...@vger.kernel.org] On Behalf Of Sage Weil
Sent: May 15, 2013 7:23
To: Chen, Xiaoxi
Cc: Mark
h >30% iowait). Enabling jumbo frames **seems** to make things worse (just a feeling, no data to support it).
Sent from my iPhone
On May 14, 2013, at 23:36, "Mark Nelson" wrote:
> On 05/14/2013 10:30 AM, Sage Weil wrote:
>> On Tue, 14 May 2013, Chen, Xiaoxi wrote:
>>>
>>> Hi
>>>
Hi
We are suffering from our OSDs flipping between up and down (OSD X gets voted down due to 3 missed pings, and after a while it tells the monitor "map xxx wrongly marked me down"). Because we are running sequential write performance tests on top of RBDs, and the cluster network NICs are really in h
Why would you want a pool with replication=0?
Sent from my iPhone
On Apr 10, 2013, at 18:59, "Witalij Poljatchek" <witalij.poljatc...@aixit.com> wrote:
Hello,
I need help to solve a segfault on all OSDs in my test cluster.
I set up Ceph from scratch.
service ceph -a start
ceph -w
health HEALTH_OK
m
Hi Mark,
I think you are the right man for these questions :) I really don't understand how osd_client_message_size_cap, objecter_inflight_op_bytes/ops, and ms_dispatch_throttle_bytes work, and how they affect performance.
Especially objecter_inflight_op_bytes, which seems to be used
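For reference, the options I am asking about live in ceph.conf; the values below are only what I believe the shipped defaults to be, quoted here so the question is concrete (not a tuning suggestion):
[client]
    objecter inflight ops = 1024                 # max client ops in flight
    objecter inflight op bytes = 104857600       # ~100 MB of op data in flight
[osd]
    osd client message size cap = 524288000      # ~500 MB of client messages held in memory
    ms dispatch throttle bytes = 104857600       # ~100 MB waiting to be dispatched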
Are you using a partition as the journal?
From: ceph-users-boun...@lists.ceph.com
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Aleksey Samarin
Sent: March 26, 2013 20:45
To: ceph-us...@ceph.com
Subject: [ceph-users] Journal size
Hello everyone!
I have a question about the journal. The Ceph cluster is
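For context, both a dedicated partition and a file-backed journal are configured in ceph.conf; a minimal sketch with placeholder paths (the size is only an example, in MB, and mostly matters for file-backed journals):
[osd]
    osd journal = /dev/disk/by-partlabel/osd0-journal    # raw partition; its size is taken from the partition itself
    # osd journal = /var/lib/ceph/osd/ceph-0/journal     # file-backed alternative
    # osd journal size = 10240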
-Original Message-
From: Sage Weil [mailto:s...@inktank.com]
Sent: March 25, 2013 23:35
To: Chen, Xiaoxi
Cc: 'ceph-users@lists.ceph.com' (ceph-users@lists.ceph.com);
ceph-de...@vger.kernel.org
Subject: Re: [ceph-users] Ceph Crach at sync_thread_timeout after heavy random
writes.
Hi Xiaox
Let me rephrase it to make it clearer.
From: ceph-users-boun...@lists.ceph.com
[mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Chen, Xiaoxi
Sent: March 25, 2013 17:02
To: 'ceph-users@lists.ceph.com' (ceph-users@lists.ceph.com)
Cc: ceph-de...@vger.kernel.org
Subject: [ceph-users] Cep
say the issue I hit is a different issue? (not #3737)
> Wolfgang
xiaoxi
>
> On 03/25/2013 10:15 AM, Chen, Xiaoxi wrote:
>>
>>
>> Hi Wolfgang,
>>
>> Thanks for the reply, but why is my problem related to issue #3737? I
is could be related to this issue here and has been reported multiple
> times:
>
> http://tracker.ceph.com/issues/3737
>
> In short: They're working on it, they know about it.
>
> Wolfgang
>
> On 03/25/2013 10:01 AM, Chen, Xiaoxi wrote:
>> Hi list,
>&g
Hi list,
We have hit and reproduced this issue several times: Ceph will suicide because of "FileStore: sync_entry timed out" after very heavy random IO on top of the RBD.
My test environment is:
a 4-node Ceph cluster with 20 HDDs for OSDs and 4 Intel
Hi List,
I cannot start my monitor after updating my cluster to v0.59. Please note that I am not trying to upgrade, but am reinstalling the Ceph software stack and rerunning mkcephfs. I have seen that the monitor changed a lot after 0.58; does mkcephfs still have bugs?
Below is the log:
Thanks Josh, the problem was solved by updating Ceph on the Glance node.
Sent from my iPhone
On Mar 20, 2013, at 14:59, "Josh Durgin" wrote:
> On 03/19/2013 11:03 PM, Chen, Xiaoxi wrote:
>> I think Josh may be the right man for this question ☺
>>
>> To be more precise, I would l
I think Josh may be the right man for this question ☺
To be more precise, I would like to add a few more words about the status:
1. We have configured “show_image_direct_url = True” in Glance, and from the Cinder-volume log we can make sure we have got a direct_url, for example
image_id 6565d775-
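For anyone following along, this is the Glance side of that setting; a minimal glance-api.conf sketch (only this option is relevant here):
[DEFAULT]
show_image_direct_url = True    # expose the rbd:// location in image metadata so Cinder/Nova can clone instead of copying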
For me, we have seen a Supermicro machine which is 2U with 2 CPUs and 24 2.5-inch SATA/SAS drives, together with 2 onboard 10Gb NICs. I think it's good enough for both density and computing power.
At the other end, we are also planning to evaluate small nodes for Ceph, say an Atom with 2/4 disks per