I do not have deep knowledge of Cassandra, but as far as I know, there is no such tool. I believe such a tool would be worthwhile.

Thanks,
Indika

On Thu, Jan 20, 2011 at 6:15 PM, Mimi Aluminium <mimi.alumin...@gmail.com> wrote:

> Hi,
>
> I have a question that is somewhat related to the above.
> Is there a tool that predicts resource consumption (i.e., memory, disk,
> CPU) in an offline mode? That is, it is given the storage conf
> parameters, keyspaces, CFs and data model, plus application parameters such
> as average read/write rates. It should output the required sizes for memory,
> disk, etc.
>
> I need to estimate costs for various configurations we might have, and
> thus I am working on building a "simple" Excel sheet for my own data model, but
> then it came to my mind to ask whether something like that already exists.
>
> BTW, I think such a tool can also help with the issues that were discussed
> before. Even though it will be built on averages, which are probably not so
> fine-grained, it can provide worst-case numbers to the application
> that uses Cassandra.
>
> Thanks,
> Miriam
>
>
> ==========
> Miriam Allalouf
> n
>
> On Thu, Jan 20, 2011 at 1:53 PM, indika kumara <indika.k...@gmail.com> wrote:
>
>> Thanks David.... We decided to do it on our client side as the initial
>> implementation. I will investigate approaches for supporting fine-grained
>> control of the resources consumed by a server, tenant, and CF.
>>
>> Thanks,
>>
>> Indika
>>
>> On Thu, Jan 20, 2011 at 3:20 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>>> As far as I can tell, if Cassandra supports three levels of configuration
>>> (server, keyspace, column family) we can support multi-tenancy. It is
>>> trivial to give each tenant their own keyspace (e.g. just use the tenant's
>>> id as the keyspace name) and let them go wild. (Any out-of-bounds behavior
>>> on the CF level will be stopped at the keyspace and server level before
>>> doing any damage.)
>>>
>>> I don't think Cassandra needs to know about end-users. From Cassandra's
>>> point of view the tenant is the user.
>>>
>>> On Thu, Jan 20, 2011 at 7:00 AM, indika kumara <indika.k...@gmail.com> wrote:
>>>
>>>> +1 Are there JIRAs for these requirements? I would like to contribute
>>>> within my capacity.
>>>>
>>>> As per my understanding, to support some multi-tenant models, it is
>>>> necessary to qualify keyspaces' names, CFs' names, etc. with the tenant
>>>> namespace (or id). The easiest way to do this would be to modify the
>>>> corresponding constructs transparently. I thought of a stage (optional and
>>>> configurable) prior to authorization. Are there any better solutions? I
>>>> appreciate the community's suggestions.
>>>>
>>>> Moreover, it is necessary to send the tenant NS (id) with the user
>>>> credentials (a user belongs to this tenant (org.)). For that purpose, I
>>>> thought of using the user credentials in the AuthenticationRequest. Is there
>>>> any better solution?
>>>>
>>>> I would like to have MT support at the Cassandra level which is
>>>> optional and configurable.
>>>>
>>>> Thanks,
>>>>
>>>> Indika
>>>>
>>>>
>>>> On Wed, Jan 19, 2011 at 7:40 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>>
>>>>> Yes, the way I see it - and it becomes even more necessary for a
>>>>> multi-tenant configuration - there should be completely separate
>>>>> configurations for applications and for servers.
>>>>>
>>>>> - Application configuration is based on data and usage characteristics
>>>>> of your application.
>>>>> - Server configuration is based on the specific hardware limitations of
>>>>> the server.
>>>>>
>>>>> Obviously, server limitations take priority over application
>>>>> configuration.
>>>>>
>>>>> Assuming that each tenant in a multi-tenant environment gets one
>>>>> keyspace, you would also want to enforce limitations based on keyspace
>>>>> (which correspond to parameters that the tenant paid for).
>>>>>
>>>>> So now we have three levels:
>>>>>
>>>>> 1. Server configuration (top priority)
>>>>> 2. Keyspace configuration (paid-for service - second priority)
>>>>> 3. Column family configuration (configuration provided by tenant -
>>>>> third priority)
>>>>>
>>>>>
>>>>> On Wed, Jan 19, 2011 at 3:15 PM, indika kumara
>>>>> <indika.k...@gmail.com> wrote:
>>>>>
>>>>>> As the actual problem is mostly related to the number of CFs in the
>>>>>> system (and maybe the number of columns), I still believe that
>>>>>> exposing Cassandra ‘as-is’ to a tenant is doable and suitable, though
>>>>>> it needs some fixes. That multi-tenancy model allows a tenant to use the
>>>>>> programming model of Cassandra ‘as-is’, enabling the seamless migration
>>>>>> of an application that uses Cassandra into the cloud. Moreover, in order
>>>>>> to support different SLA requirements of different tenants, the
>>>>>> configurability of keyspaces, CFs, etc., per tenant may be critical.
>>>>>> However, there are trade-offs among usability, memory consumption, and
>>>>>> performance. I believe that it is important to consider the SLA
>>>>>> requirements of different tenants when deciding the strategies for
>>>>>> controlling resource consumption.
>>>>>>
>>>>>> I like the idea of system-wide parameters for controlling resource
>>>>>> usage. I believe that the tenant-specific parameters are equally
>>>>>> important. There are resources, and each tenant can claim a portion of
>>>>>> them based on SLA. For instance, if there is a threshold on the number of
>>>>>> columns per node, it should be possible to decide how many columns a
>>>>>> particular tenant can have. It allows selecting a suitable Cassandra
>>>>>> cluster for a tenant based on his or her SLA. I believe the capability to
>>>>>> configure resource controlling parameters per keyspace would be important
>>>>>> to support a keyspace per tenant model. Furthermore, in order to maximize
>>>>>> the resource sharing among tenants, a threshold (on a resource) per
>>>>>> keyspace should not be a hard limit.
>>>>>> Rather, it should vary between a hard minimum and a maximum.
>>>>>> For example, if a particular tenant needs more resources at a given
>>>>>> time, he or she should be able to borrow from the others up to the
>>>>>> maximum. The threshold is only considered when a tenant is assigned to a
>>>>>> cluster - the remaining resources of a cluster should be equal to or
>>>>>> higher than the resource limit of the tenant. It may be necessary to
>>>>>> spread a single keyspace across multiple clusters, especially when there
>>>>>> are not enough resources in a single cluster.
>>>>>>
>>>>>> I believe that it would be better to have the flexibility to switch
>>>>>> seamlessly between multi-tenancy implementation models such as Cassandra
>>>>>> ‘as-is’, the keyspace per tenant model, a keyspace for all tenants, and
>>>>>> so on. Based on what I have learnt, each model requires adding the tenant
>>>>>> id (namespace) to a keyspace’s name, CF’s name, row key, or column’s
>>>>>> name. Would it be better to have a kind of pluggable handler that can
>>>>>> access those resources prior to doing the actual operation so that the
>>>>>> required changes can be made? Maybe prior to authorization.
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Indika
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
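
For illustration only, the pluggable handler idea raised in the oldest message of the thread (qualify names with the tenant id before authorization) might look roughly like the sketch below. It is written in Python for brevity and is not Cassandra code: `Request`, `TenantNamespaceHandler`, the `_` separator, and the two flags are hypothetical names chosen for this example, and a real stage would have to hook into whatever request types the server actually uses.

```python
from dataclasses import dataclass, replace

SEPARATOR = "_"  # assumption: any separator that is legal in a name would do


@dataclass(frozen=True)
class Request:
    """Hypothetical stand-in for an incoming operation; not a Cassandra type."""
    tenant_id: str
    keyspace: str
    column_family: str


class TenantNamespaceHandler:
    """Hypothetical pluggable stage meant to run before authorization."""

    def __init__(self, qualify_keyspace=True, qualify_column_family=False):
        # Which names get the tenant prefix depends on the multi-tenancy model.
        self.qualify_keyspace = qualify_keyspace
        self.qualify_column_family = qualify_column_family

    def _qualify(self, tenant_id, name):
        # Prefix the name with the tenant id, e.g. "acme" + "orders" -> "acme_orders".
        return f"{tenant_id}{SEPARATOR}{name}"

    def apply(self, request):
        """Return a copy of the request with tenant-qualified names."""
        changes = {}
        if self.qualify_keyspace:
            changes["keyspace"] = self._qualify(request.tenant_id, request.keyspace)
        if self.qualify_column_family:
            changes["column_family"] = self._qualify(
                request.tenant_id, request.column_family
            )
        return replace(request, **changes)


# Keyspace-per-tenant model: only the keyspace name is qualified.
per_tenant = TenantNamespaceHandler()
print(per_tenant.apply(Request("acme", "orders", "items")))

# One keyspace for all tenants: the CF name carries the tenant id instead.
shared = TenantNamespaceHandler(qualify_keyspace=False, qualify_column_family=True)
print(shared.apply(Request("acme", "shared", "items")))
```

Keeping the choice of which names to qualify in configuration is what would let the same stage serve the keyspace-per-tenant model and the shared-keyspace model without touching application code.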