On Mon, Jul 29, 2013 at 2:55 PM, Chen, Xiaoxi <xiaoxi.c...@intel.com> wrote:
> From: zrz...@gmail.com [mailto:zrz...@gmail.com] On Behalf Of Rongze Zhu
> Sent: Monday, July 29, 2013 2:18 PM
> To: Chen, Xiaoxi
> Cc: Gregory Farnum; ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] add crush rule in one command
>
> On Sat, Jul 27, 2013 at 4:25 PM, Chen, Xiaoxi <xiaoxi.c...@intel.com> wrote:
>
> My 0.02:
>
> 1. Why do you need to set the map simultaneously for your purpose? It is
> obviously important for Ceph to have an atomic CLI, but that is because the
> map may be changed by the cluster itself (losing a node and so on), not for
> your case. Since the map is auto-distributed by Ceph, I really think it is a
> good idea to just change your own code so that the map changes happen on one
> node only.
>
> We need auto-scaling. When a storage node is added to the cluster, the puppet
> agent will deploy Ceph on the node and create a local pool for it. A
> dedicated node for creating pools adds more complexity, because we would need
> to elect that dedicated node, and it would be a single point of failure.
>
> Well, from my point of view, it is not that easy to implement an atomic
> map-change CLI for Ceph; combining these 3 commands is obviously not enough.
> But I would be very happy if anybody implemented one.
>
> Technically speaking, yes, it is a SPOF, but I would say it is acceptable
> from an engineering point of view. Adding a node to a cluster is not
> something that happens every day, and you certainly cannot add a physical
> node "automatically", so it is easy to check whether a dedicated control node
> (for map management) is alive before you do so. For instance, would you mind
> the console proxy machine in front of your backend cluster being a SPOF? If
> yes, just run two and try the other on failure; would you really write an
> "auto-election" and "fail-detection" application for that?

There are many ways to do that, and we will evaluate them. Our goal is to make
it highly available for our customers :)

> 2. We are also evaluating the pros and cons of "local pools"; the only pro is
> that you save network bandwidth on reads. You may want to mention latency; I
> would have agreed before, but we now have a complete latency breakdown for
> Ceph showing that network latency is negligible, even on a full-SSD setup.
> The remaining question is "how much bandwidth can be saved?" Unless you can
> state up front that the workload is mostly reads, you still have to use a
> 10GbE link for Ceph to get balanced throughput. And the drawbacks are really
> obvious: live-migration complexity, management complexity, and so on.
>
> I agree that "local pools" have some drawbacks :) But the network is a shared
> resource, and we should keep Ceph from using excessive network bandwidth
> (many enterprises in China still use 1GbE links).
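The "local pool" being discussed is typically expressed as a CRUSH rule that
descends from a single host bucket instead of the cluster root. A minimal
sketch, assuming a host bucket named node1 already exists in the hierarchy;
the bucket name, rule name and ruleset id below are only illustrative, not
taken from either poster's setup:

    # rule added to the decompiled CRUSH map text
    rule local-node1 {
        ruleset 3
        type replicated
        min_size 1
        max_size 10
        step take node1                      # start from this host bucket only
        step chooseleaf firstn 0 type osd    # pick OSDs under that host
        step emit
    }

A pool is then pointed at the rule with something like
"ceph osd pool set nova-node1 crush_ruleset 3" (pool name again illustrative;
on the Ceph releases of that era the pool setting is crush_ruleset, later
renamed crush_rule).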
> The bandwidth needed is mostly decided by how much write throughput you want
> to achieve. Either customers are satisfied with <100MB/s of aggregate
> bandwidth per compute node (typically a 2U WSM/SNB node has 32 cores, which
> usually means at least 16 VMs, so less than 6MB/s per VM), or they will go to
> a 10Gb solution. There are still no enterprises using Ceph in China (AFAIK),
> but judging from this list, quite a lot of users go to a full 10Gb or even IB
> network.
>
> 10GbE is much cheaper than we first thought: a 48-port 10GbE switch will only
> cost you 5~6K USD.

Thanks for the information, it is useful for me.

> Xiaoxi
>
> From: ceph-users-boun...@lists.ceph.com [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Rongze Zhu
> Sent: Friday, July 26, 2013 2:29 PM
> To: Gregory Farnum
> Cc: ceph-users@lists.ceph.com
> Subject: Re: [ceph-users] add crush rule in one command
>
> On Fri, Jul 26, 2013 at 2:27 PM, Rongze Zhu <ron...@unitedstack.com> wrote:
>
> On Fri, Jul 26, 2013 at 1:22 PM, Gregory Farnum <g...@inktank.com> wrote:
>
> On Thu, Jul 25, 2013 at 7:41 PM, Rongze Zhu <ron...@unitedstack.com> wrote:
> > Hi folks,
> >
> > Recently I have been using puppet to deploy Ceph and integrate Ceph with
> > OpenStack. We put compute and storage together in the same cluster, so
> > nova-compute and OSDs will be on each server. We will create a local pool
> > for each server, and that pool will only use the disks of that server.
> > Local pools will be used by Nova for the root disk and ephemeral disk.
>
> Hmm, this is constraining Ceph quite a lot; I hope you've thought
> about what this means in terms of data availability and even
> utilization of your storage. :)
>
> We will also create a global pool for Cinder; the IOPS of the global pool
> will be better than a local pool.
>
> The benefit of a local pool is reducing the network traffic between servers
> and improving the manageability of the storage. We use the same Ceph Gluster
> for Nova, Cinder and Glance, and create different pools (and different rules)
> for them. Maybe it needs more testing :)
>
> s/Gluster/Cluster/g
>
> > In order to use the local pools, I need to add some rules for them to
> > ensure that the local pools only use local disks. The only way to add a
> > rule in Ceph today is:
> >
> > ceph osd getcrushmap -o crush-map
> > crushtool -d crush-map -o crush-map.txt
> > (edit crush-map.txt to add the rule)
> > crushtool -c crush-map.txt -o new-crush-map
> > ceph osd setcrushmap -i new-crush-map
> >
> > If multiple servers set the crush map simultaneously (the puppet agents
> > will do that), there is the possibility of consistency problems. So a
> > command for adding a single rule would be very convenient, such as:
> >
> > ceph osd crush add rule -i new-rule-file
> >
> > Could I add that command to Ceph?
>
> We love contributions to Ceph, and this is an obvious hole in our
> atomic CLI-based CRUSH manipulation which a fix would be welcome for.
> Please be aware that there was a significant overhaul to the way these
> commands are processed internally between Cuttlefish and
> Dumpling-to-be that you'll need to deal with if you want to cross that
> boundary. I also recommend looking carefully at how we do the
> individual pool changes and how we handle whole-map injection to make
> sure the interface you use and the places you do data extraction make
> sense. :)
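As a side note for anyone landing on this thread later: one-shot, monitor-side
rule manipulation of this kind did end up in the Ceph CLI. Exact availability
depends on your release, so treat the following as a sketch and check the help
output ("ceph -h") for your version first; the rule and bucket names are only
examples:

    # create a replicated rule: take the given root bucket, then choose
    # leaves across the given failure-domain type
    ceph osd crush rule create-simple local-node1 node1 osd

    # inspect or remove rules without touching the rest of the map
    ceph osd crush rule ls
    ceph osd crush rule dump
    ceph osd crush rule rm local-node1

Because each of these is a single monitor command, concurrent puppet agents on
different nodes no longer race on a whole-map getcrushmap/setcrushmap cycle.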
> Thank you for your quick reply, it is very useful for me :)
>
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com

--
Rongze Zhu - 朱荣泽
Email: zrz...@gmail.com
Blog: http://way4ever.com
Weibo: http://weibo.com/metaxen
Github: https://github.com/zhurongze
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com