Hi,

I have a question that is somewhat related to the above.
Is there a tool that predicts resource consumption (i.e., memory, disk,
CPU) in an offline mode? That is, it is given the storage configuration
parameters, keyspaces, CFs, and data model, plus application parameters such
as average read/write rates, and it outputs the required sizes for memory,
disk, etc.

I need to estimate costs for the various configurations we might have, and
so I am building a "simple" Excel sheet for my own data model - but
then it occurred to me to ask whether something like that already exists.

BTW, I think such a tool could also help with the issues that were discussed
before. Even though it would be built on averages, which are probably not so
fine-grained, it could provide worst-case numbers to the application
that uses Cassandra.
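
A rough sketch of the kind of calculation such a spreadsheet might do is
below. The class name, the function name, the formula, and the default
replication and overhead factors are all assumptions made for illustration;
none of them come from Cassandra itself, and a real estimate would need
measured values.

```python
from dataclasses import dataclass


@dataclass
class ColumnFamilyLoad:
    """Hypothetical per-CF inputs; every field is an illustrative assumption."""
    name: str
    writes_per_sec: float   # average write rate
    avg_row_bytes: int      # average size of one written row
    retention_days: float   # how long data is kept before it is deleted


def estimate_disk_bytes(cfs, replication_factor=3, overhead_factor=2.0):
    """Estimate cluster disk as rate * size * retention * replicas * overhead.

    overhead_factor is an assumed multiplier for on-disk format overhead and
    compaction headroom; it is not a measured Cassandra constant.
    """
    seconds_per_day = 86_400
    total = 0.0
    for cf in cfs:
        logical = (cf.writes_per_sec * cf.avg_row_bytes
                   * cf.retention_days * seconds_per_day)
        total += logical * replication_factor * overhead_factor
    return total


# Example input: one CF with made-up averages.
cfs = [ColumnFamilyLoad("events", writes_per_sec=100,
                        avg_row_bytes=1_000, retention_days=10)]
print(estimate_disk_bytes(cfs) / 1e9, "GB")
```

Memory and CPU would need similar per-CF terms; they are left out here
because the inputs for them depend heavily on the configuration.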

Thanks,
Miriam


==========
Miriam Allalouf

On Thu, Jan 20, 2011 at 1:53 PM, indika kumara <indika.k...@gmail.com> wrote:

> Thanks David. We decided to do it on our client side as the initial
> implementation. I will investigate approaches for supporting fine-grained
> control of the resources consumed by a server, tenant, and CF.
>
> Thanks,
>
> Indika
>
> On Thu, Jan 20, 2011 at 3:20 PM, David Boxenhorn <da...@lookin2.com> wrote:
>
>> As far as I can tell, if Cassandra supports three levels of configuration
>> (server, keyspace, column family) we can support multi-tenancy. It is
>> trivial to give each tenant their own keyspace (e.g. just use the tenant's
>> id as the keyspace name) and let them go wild. (Any out-of-bounds behavior
>> on the CF level will be stopped at the keyspace and server level before
>> doing any damage.)
>>
>> I don't think Cassandra needs to know about end-users. From Cassandra's
>> point of view the tenant is the user.
>>
>> On Thu, Jan 20, 2011 at 7:00 AM, indika kumara <indika.k...@gmail.com> wrote:
>>
>>> +1. Are there JIRAs for these requirements? I would like to contribute
>>> within my capacity.
>>>
>>> As I understand it, supporting some multi-tenant models requires
>>> qualifying keyspace names, CF names, etc. with the tenant namespace
>>> (or id). The easiest way to do this would be to modify the corresponding
>>> constructs transparently. I thought of a stage (optional and configurable)
>>> prior to authorization. Are there any better solutions? I would appreciate
>>> the community's suggestions.
>>>
>>> Moreover, the tenant NS (id) needs to be sent with the user
>>> credentials (a user belongs to this tenant (org.)). For that purpose, I
>>> thought of using the user credentials in the AuthenticationRequest. Is there
>>> any better solution?
>>>
>>> I would like to have MT support at the Cassandra level that is
>>> optional and configurable.
>>>
>>> Thanks,
>>>
>>> Indika
>>>
>>>
>>> On Wed, Jan 19, 2011 at 7:40 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>>
>>>> Yes, the way I see it - and it becomes even more necessary for a
>>>> multi-tenant configuration - there should be completely separate
>>>> configurations for applications and for servers.
>>>>
>>>> - Application configuration is based on data and usage characteristics
>>>> of your application.
>>>> - Server configuration is based on the specific hardware limitations of
>>>> the server.
>>>>
>>>> Obviously, server limitations take priority over application
>>>> configuration.
>>>>
>>>> Assuming that each tenant in a multi-tenant environment gets one
>>>> keyspace, you would also want to enforce limitations based on keyspace
>>>> (which correspond to parameters that the tenant paid for).
>>>>
>>>> So now we have three levels:
>>>>
>>>> 1. Server configuration (top priority)
>>>> 2. Keyspace configuration (paid-for service - second priority)
>>>> 3. Column family configuration (configuration provided by tenant - third
>>>> priority)
>>>>
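
One way to read the three levels above is as nested caps on a single
numeric limit. The sketch below assumes that "priority" means a higher
level clamps the levels below it; the function name and the use of None
for "not set" are hypothetical and not part of any Cassandra API.

```python
def effective_limit(server_limit, keyspace_limit=None, cf_limit=None):
    """Resolve one numeric resource limit across the three levels.

    Assumption: the server limit caps the keyspace limit, which in turn
    caps the column family limit. None means the level sets no limit.
    """
    value = server_limit
    for lower in (keyspace_limit, cf_limit):
        if lower is not None:
            # A lower level may only tighten the limit, never raise it.
            value = min(value, lower)
    return value


# A CF asks for 60 units, the keyspace paid for 40, the server allows 100.
print(effective_limit(100, keyspace_limit=40, cf_limit=60))
```

With this reading, an out-of-bounds value at the CF level is stopped at the
keyspace level, and an out-of-bounds keyspace value is stopped at the server
level.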
>>>>
>>>> On Wed, Jan 19, 2011 at 3:15 PM, indika kumara <indika.k...@gmail.com> wrote:
>>>>
>>>>> As the actual problem is mostly related to the number of CFs in the
>>>>> system (and maybe the number of columns), I still believe that exposing
>>>>> Cassandra ‘as-is’ to a tenant is doable and suitable, though it needs
>>>>> some fixes. That multi-tenancy model allows a tenant to use the
>>>>> Cassandra programming model ‘as-is’, enabling the seamless migration
>>>>> of an application that uses Cassandra into the cloud. Moreover, in order
>>>>> to support the different SLA requirements of different tenants, the
>>>>> configurability of keyspaces, CFs, etc. per tenant may be critical.
>>>>> However, there are trade-offs among usability, memory consumption, and
>>>>> performance. I believe that it is important to consider the SLA
>>>>> requirements of different tenants when deciding on strategies for
>>>>> controlling resource consumption.
>>>>>
>>>>> I like the idea of system-wide parameters for controlling resource
>>>>> usage. I believe that tenant-specific parameters are equally important.
>>>>> There are resources, and each tenant can claim a portion of them based on
>>>>> its SLA. For instance, if there is a threshold on the number of columns
>>>>> per node, it should be possible to decide how many columns a particular
>>>>> tenant can have. This allows selecting a suitable Cassandra cluster for a
>>>>> tenant based on his or her SLA. I believe the capability to configure
>>>>> resource-controlling parameters per keyspace would be important for
>>>>> supporting a keyspace-per-tenant model. Furthermore, in order to maximize
>>>>> resource sharing among tenants, a threshold (on a resource) per keyspace
>>>>> should not be a hard limit. Rather, it should vary between a hard
>>>>> minimum and a maximum. For example, if a particular tenant needs more
>>>>> resources at a given time, he or she should be able to borrow from the
>>>>> others up to the maximum. The threshold is only considered when a tenant
>>>>> is assigned to a cluster: the remaining resources of a cluster should be
>>>>> equal to or higher than the resource limit of the tenant. It may be
>>>>> necessary to spread a single keyspace across multiple clusters,
>>>>> especially when there are not enough resources in a single cluster.
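
The borrowing idea above can be sketched as a small admission check. This
is only an illustration under assumptions: the class and function names are
hypothetical, resources are counted in abstract integer units, and a request
is allowed only if the other tenants can still reach their guaranteed
minimums afterwards.

```python
from dataclasses import dataclass


@dataclass
class TenantQuota:
    guaranteed: int  # hard minimum that must always remain reachable
    maximum: int     # ceiling the tenant may reach by borrowing
    used: int = 0


def try_allocate(quotas, tenant, amount, capacity):
    """Grant `amount` units to `tenant` if it stays within its maximum and
    every other tenant can still reach its guaranteed minimum."""
    quota = quotas[tenant]
    new_used = quota.used + amount
    if new_used > quota.maximum:
        return False
    # Capacity that must stay free for the other tenants' unused guarantees.
    reserved = sum(max(other.guaranteed - other.used, 0)
                   for name, other in quotas.items() if name != tenant)
    total_used = sum(other.used for other in quotas.values()) + amount
    if total_used + reserved > capacity:
        return False
    quota.used = new_used
    return True


quotas = {"a": TenantQuota(guaranteed=40, maximum=80),
          "b": TenantQuota(guaranteed=40, maximum=80)}
# Tenant "a" borrows 20 units beyond its guarantee while "b" is idle.
print(try_allocate(quotas, "a", 60, capacity=100))
```

Reclaiming borrowed resources when the lender needs them back is not
covered here; that part would need its own policy.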
>>>>>
>>>>> I believe that it would be better to have the flexibility to switch
>>>>> seamlessly between multi-tenancy implementation models such as Cassandra
>>>>> ‘as-is’, the keyspace-per-tenant model, a keyspace for all tenants, and
>>>>> so on. Based on what I have learnt, each model requires adding the tenant
>>>>> id (namespace) to a keyspace’s name, CF’s name, row key, or column’s
>>>>> name. Would it be better to have a kind of pluggable handler that can
>>>>> access those resources prior to the actual operation so that the
>>>>> required changes can be made? Maybe prior to authorization.
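
A minimal sketch of such a pluggable handler follows. Everything in it is
hypothetical: the Request type, the qualifier functions, the underscore
separator, and the authorize and execute callbacks stand in for whatever the
server would actually provide, and are not Cassandra APIs.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Request:
    tenant_id: str
    keyspace: str
    column_family: str


def qualify_keyspace(req):
    """Keyspace-per-tenant style: prefix the keyspace name with the tenant id."""
    return replace(req, keyspace=f"{req.tenant_id}_{req.keyspace}")


def qualify_column_family(req):
    """Shared-keyspace style: prefix the column family name instead."""
    return replace(req, column_family=f"{req.tenant_id}_{req.column_family}")


def handle(req, qualifier, authorize, execute):
    """Run the pluggable qualifier stage before authorization and execution."""
    qualified = qualifier(req)
    authorize(qualified)  # expected to raise if the request is not allowed
    return execute(qualified)


req = Request(tenant_id="acme", keyspace="orders", column_family="items")
# Swapping the qualifier switches the multi-tenancy model.
print(handle(req, qualify_keyspace, lambda r: None, lambda r: r.keyspace))
```

Because the qualifier is just a function passed in, changing from one
multi-tenancy model to another would not touch the authorization or
execution stages.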
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Indika
>>>>>
>>>>
>>>>
>>>
>>
>
