Re: the process of reading and writing

2010-09-03 Thread Ying Tang
In Dynamo's paper, it says:

Each key, k, is assigned to a coordinator node.
The coordinator is in charge of the replication of the data items that fall
within its range.

On Fri, Sep 3, 2010 at 2:56 PM, Benjamin Black  wrote:

> On Thu, Sep 2, 2010 at 8:19 PM, Ying Tang  wrote:
> > Recently, I read the paper about Cassandra again,
> > and now I have some concepts about reading and writing.
> > We all know Cassandra uses NWR.
> > When reading:
> > the request ---> a random node in Cassandra. This node acts as a proxy,
> > and it routes the request.
> > Here,
> > 1. Does the proxy node route this request to the key's coordinator, and the
> > coordinator then routes the request to the other N-1 nodes, OR does the
> > proxy route the read request to all N nodes?
>
> The coordinator node is the proxy node.
>
> > 2. If it is the former situation, does the read repair occur on the key's
> > coordinator?
> >    If it is the latter, does the read repair occur on the proxy node?
>
> Depends on the CL requested.  QUORUM and ALL cause the RR to be
> performed by the coordinator.  ANY and ONE cause RR to be delegated to
> one of the replicas for the key.
>
> > When writing:
> > the request ---> a random node in Cassandra. This node acts as a proxy,
> > and it routes the request.
> > Here,
> > 3. Does the proxy node route this request to the key's coordinator, and the
> > coordinator then routes the request to the other N-1 nodes, OR does the
> > proxy route the request to all N nodes?
> >
> >
>
> For writes, the coordinator sends the writes directly to the replicas
> regardless of CL (rather than delegating for weakly consistent CLs).
>
> > 4. N isn't the number of copies of the data, it's just a range. In this N
> > range there must be W copies, so W is the number of copies.
> > So in this N range, R+W>N can guarantee the data's validity. Right?
> >
>
> Sorry, I can't even parse this.
>
>
> b
>



-- 
Best regards,

Ivy Tang
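As background for question 4 above: in Cassandra, N is simply the replication factor (the number of replicas that hold each key), and R and W are the replica counts demanded by the read and write consistency levels. The point of R+W>N is that any set of R replicas must then intersect any set of W replicas, so a read is guaranteed to touch at least one replica holding the latest write. That overlap property can be brute-force checked with a short sketch (plain illustrative Python, not Cassandra code):

```python
from itertools import combinations

def sets_always_overlap(n: int, r: int, w: int) -> bool:
    """True if every r-subset of n replicas intersects every w-subset."""
    replicas = range(n)
    return all(
        set(reads) & set(writes)
        for reads in combinations(replicas, r)
        for writes in combinations(replicas, w)
    )

# QUORUM/QUORUM with N=3 (R=W=2): 2+2 > 3, overlap is guaranteed.
assert sets_always_overlap(3, 2, 2)
# ONE/ONE with N=3 (R=W=1): 1+1 <= 3, a read can miss the written replica.
assert not sets_always_overlap(3, 1, 1)
```

The replica in the overlap is what lets the coordinator detect stale responses and trigger read repair on the others.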


Re: 4k keyspaces... Maybe we're doing it wrong?

2010-09-03 Thread Mike Peters

 Very interesting. Thank you

So it sounds like, other than being able to quickly truncate 
customer keyspaces, with Cassandra there's no real benefit in keeping 
each customer's data in a separate keyspace.

We'll suffer on the memory side with all the switching between keyspaces, 
so we're better off storing all customer data under the same keyspace?



On 9/2/2010 11:29 PM, Aaron Morton wrote:
Create one big happy love-in keyspace. Use the key structure to 
identify the different clients' data.


There is more support planned for multi-tenancy systems, but a lot of the memory 
configuration is per keyspace/column family, so you cannot run that 
many keyspaces.


This page has some more information 
http://wiki.apache.org/cassandra/MultiTenant


Aaron


On 03 Sep 2010, at 01:25 PM, Mike Peters wrote:



Hi,

We're in the process of migrating 4,000 MySQL client databases to
Cassandra. All database schemas are identical.

With MySQL, we used to provision a separate 'database' per each client,
to make it easier to shard and move things around.

Does it make sense to migrate the 4,000 MySQL databases to 4,000
keyspaces in Cassandra? Or should we stick with a single keyspace?

My concerns are -
#1. Will every single node end up with 4k folders under /cassandra/data/?

#2. Performance: Will Cassandra work better with a single keyspace +
lots of keys, or thousands of keyspaces?

-

Granted it's 'cleaner' to have a separate keyspace per each client, but
maybe that's not the best approach with Cassandra.

Thoughts?
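Aaron's advice to "use the key structure" amounts to a row-key namespacing convention: prefix every row key with a tenant identifier inside one shared keyspace. A minimal sketch of that convention (the separator choice and helper names here are illustrative, not any Cassandra API):

```python
SEP = ":"  # assumes tenant ids themselves never contain ":"

def tenant_key(tenant_id: str, key: str) -> str:
    """Build a row key that namespaces one customer's data."""
    return f"{tenant_id}{SEP}{key}"

def split_key(row_key: str) -> tuple[str, str]:
    """Recover (tenant, key); splits on the first separator only,
    so the customer's own keys may still contain ':'."""
    tenant, _, key = row_key.partition(SEP)
    return tenant, key

rk = tenant_key("client0042", "user:jane")
assert rk == "client0042:user:jane"
assert split_key(rk) == ("client0042", "user:jane")
```

With an order-preserving partitioner this also keeps each tenant's rows contiguous, which makes per-tenant range scans (and deletions) straightforward.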




indexing methods

2010-09-03 Thread Courtney Robinson
A few of us are working on a book for Cassandra, and we got to the point where we (well, 
I did anyway) wanted to include an example of a non-trivial inverted index. 

I've been playing around with different ideas on how I could store the data, 
and I've had a look at the previous threads that touched on the subject, but 
with the 2 or 3 ideas I've seen on the list, someone always points out something 
in the approach that punches a hole in it.

I've been playing around with the idea of using a column family for the index, 
where I store the terms as the keys; each column name is a 64-bit long and 
its value is the doc id. If the column name represents a ranking for the doc id 
it stores, and the CompareWith option is LongType, then once a term is retrieved 
the first x columns would represent the most related docs for that 
term. 

I'd go into more detail, but I'm using my phone to write this and I think that 
gets the idea across.
Of course my first thought about this is: is it scalable? In a system where 
possibly millions of docs are related to one term, is it a good idea to have 
potentially that many columns in one row, all associated with the one row key 
which is the term?

I just want to know what others think, if you have any suggestions or have a 
similar thing implemented and you're able to share.

On a side note, there has been a bit of talk about secondary indexes in 
0.7. Can anyone shed some light on that, or point me to a presentation or the 
like where it's mentioned, so I can get a better idea of what it's for?

Thanks,
Courtney
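Courtney's layout (row key = term, column names = 64-bit longs compared with LongType, column values = doc ids) can be mimicked in a few lines of plain Python to show why the first x columns come back already ranked. Whether the lowest or the highest long means "most related" depends on how the rank is encoded; this sketch assumes lower is better, and is only an illustration of the data layout, not Cassandra client code:

```python
import bisect
from collections import defaultdict

# index[term] is a list kept sorted by the 64-bit rank, mimicking a row
# whose column names compare with LongType.
index = defaultdict(list)

def add_doc(term: str, rank: int, doc_id: str) -> None:
    """Insert one (column name, column value) pair in sorted position."""
    bisect.insort(index[term], (rank, doc_id))

def top_docs(term: str, x: int) -> list[str]:
    """The first x 'columns' of the row = the x best-ranked docs."""
    return [doc_id for _, doc_id in index[term][:x]]

add_doc("cassandra", 10, "doc-a")
add_doc("cassandra", 3, "doc-b")
add_doc("cassandra", 7, "doc-c")
assert top_docs("cassandra", 2) == ["doc-b", "doc-c"]
```

In the real column family, a get_slice on the term's row with a column count of x does the `[:x]` step server-side; the open question Courtney raises, how wide a single row can reasonably grow, remains the layout's weak spot.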
  

Re: 4k keyspaces... Maybe we're doing it wrong?

2010-09-03 Thread vineet daniel
If I am correct, you need to restart Cassandra whenever you add a new
keyspace. That's another concern.

Vineet Daniel
Cell  : +91-8106217121
Websites : Blog | Linkedin | Twitter





On Fri, Sep 3, 2010 at 2:58 PM, Mike Peters
wrote:

>  Very interesting. Thank you
>
> So it sounds like other than being able to quickly truncate
> customer-keyspaces, with Cassandra there's no real benefit in keeping each
> customer data in a separate keyspace.
>
> We'll suffer on the memory side with all the switching between keyspaces
> and we're better off storing all customer data under the same keyspace?
>
>
>
> On 9/2/2010 11:29 PM, Aaron Morton wrote:
>
> Create one big happy love-in keyspace. Use the key structure to identify
> the different clients' data.
>
>  There is more support planned for multi-tenancy systems, but a lot of the memory
> configuration is per keyspace/column family, so you cannot run that many
> keyspaces.
>
>  This page has some more information
> http://wiki.apache.org/cassandra/MultiTenant
>
>   Aaron
>
>
> On 03 Sep 2010, at 01:25 PM, Mike Peters wrote:
>
>Hi,
>
> We're in the process of migrating 4,000 MySQL client databases to
> Cassandra. All database schemas are identical.
>
> With MySQL, we used to provision a separate 'database' per each client,
> to make it easier to shard and move things around.
>
> Does it make sense to migrate the 4,000 MySQL databases to 4,000
> keyspaces in Cassandra? Or should we stick with a single keyspace?
>
> My concerns are -
> #1. Will every single node end up with 4k folders under /cassandra/data/?
>
> #2. Performance: Will Cassandra work better with a single keyspace +
> lots of keys, or thousands of keyspaces?
>
> -
>
> Granted it's 'cleaner' to have a separate keyspace per each client, but
> maybe that's not the best approach with Cassandra.
>
> Thoughts?
>
>
>


Re: 4k keyspaces... Maybe we're doing it wrong?

2010-09-03 Thread Mike Peters

 We're using 0.7


On 9/3/2010 6:48 AM, vineet daniel wrote:
If I am correct, you need to restart Cassandra whenever you add 
a new keyspace. That's another concern.


Vineet Daniel
Cell  : +91-8106217121
Websites : Blog | Linkedin | Twitter







On Fri, Sep 3, 2010 at 2:58 PM, Mike Peters 
> wrote:


Very interesting. Thank you

So it sounds like other than being able to quickly truncate
customer-keyspaces, with Cassandra there's no real benefit in
keeping each customer data in a separate keyspace.

We'll suffer on the memory side with all the switching between
keyspaces and we're better off storing all customer data under the
same keyspace?



On 9/2/2010 11:29 PM, Aaron Morton wrote:

Create one big happy love-in keyspace. Use the key structure to
identify the different clients' data.

There is more support planned for multi-tenancy systems, but a lot of the
memory configuration is per keyspace/column family, so you cannot
run that many keyspaces.

This page has some more information
http://wiki.apache.org/cassandra/MultiTenant

Aaron


On 03 Sep 2010, at 01:25 PM, Mike Peters wrote:


Hi,

We're in the process of migrating 4,000 MySQL client databases to
Cassandra. All database schemas are identical.

With MySQL, we used to provision a separate 'database' per each
client,
to make it easier to shard and move things around.

Does it make sense to migrate the 4,000 MySQL databases to 4,000
keyspaces in Cassandra? Or should we stick with a single keyspace?

My concerns are -
#1. Will every single node end up with 4k folders under
/cassandra/data/?

#2. Performance: Will Cassandra work better with a single
keyspace +
lots of keys, or thousands of keyspaces?

-

Granted it's 'cleaner' to have a separate keyspace per each
client, but
maybe that's not the best approach with Cassandra.

Thoughts?







Re: indexing methods

2010-09-03 Thread Jake Luciani
Hi Courtney,

You can take a look at Lucandra (http://github.com/tjake/Lucandra), which uses
the Lucene API to maintain an inverted index in Cassandra. There are a couple of
articles and presentations in the README that give more info on how this is
done.

-Jake

On Fri, Sep 3, 2010 at 6:26 AM, Courtney Robinson  wrote:

> A few of us working on a book for casanadra and got to the point where we
> (well I did anyway)  wanted to include an example of a non trivial inverted
> index.
>
> I've been playing around  with different ideas on how I could store the
> data and I've had a look at the previous threads that touched on the subject
> but with the 2 or 3 ideas I've seen on the list someone always points out
> something in the approach that punches a hole in it.
>
> I've been playing around with the idea of using a Columnfamily for the
> index where I store the terms as the key then each column name is a 64 bit
> long and its value is the doc id. If the column name represents a ranking
> for the doc id it stores and the compare with option is LongType then once a
> term is retrieved the first x amount of columns would represent the most
> related docs for that term.
>
> I'd go on in more detail but I'm using my phone to write this and I think
> that gets the idea across.
> Ofcourse my first thought to this is, is it scalable? In a system where
> possibly millions of docs are related to one term, is that a good idea to
> have potentially that many columns in one row all associated to the one row
> key which is the term?
>
> I just want to know what others think, if you have any suggestions or have
> a similar thing implemented and you're able to share.
>
> On a side note to that, there has been a bit of talk about secondary
> indexes in 0.7 can anyone shed some light on that, or point me to any
> presentation or the like where its mentioned so I can get a better idea of
> what its for.
>
> Thanks,
> Courtney
>


Consistency issue

2010-09-03 Thread Hugo

 Hi,

I'm performing tests with Cassandra 0.6.5 and Hector 0.6.0-14 on a 
single machine (a one-node cluster). I've noticed an issue with consistency.


In my tests I perform a KeySpace.batchMutate() to update a column and 
immediately after that I perform a KeySpace.getSlice() on the same 
column (from within the same thread). I noticed that occasionally I get 
back the previous value rather than the value I've just written.


My guess is that this occurs because Hector uses pooled connections and 
both my requests are executed on different connections. I suspect this 
causes a race condition in Cassandra between the getSlice() and the 
batchMutate().


Can anyone confirm my suspicions and does anyone have a solution for this?

Groets, Hugo.
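For background on Hugo's report: each Cassandra column carries a client-supplied timestamp, and update reconciliation is last-write-wins by that timestamp. A write whose timestamp is not newer than the stored one can therefore be silently superseded, which is one way a write-then-read can return the old value even on a single node. A simplified sketch of the rule (real Cassandra additionally breaks exact timestamp ties by comparing values, which this sketch ignores):

```python
# column -> (timestamp, value); the highest timestamp wins.
store = {}

def write(column: str, value: str, timestamp: int) -> None:
    """Apply the write only if its timestamp beats the stored one."""
    current = store.get(column)
    if current is None or timestamp > current[0]:
        store[column] = (timestamp, value)

def read(column: str) -> str:
    return store[column][1]

write("name", "old", timestamp=1000)
write("name", "new", timestamp=1000)  # same timestamp: update is dropped here
assert read("name") == "old"
write("name", "new", timestamp=1001)  # strictly newer timestamp: it sticks
assert read("name") == "new"
```

If two pooled connections generate timestamps from clocks that are not strictly monotonic across them, the second write can land with a timestamp that does not beat the first, producing exactly this symptom.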


Re: Consistency issue

2010-09-03 Thread Nick Telford
Which ConsistencyLevels did you use for your batchMutate() and getSlice()
operations?

ConsistencyLevels directly dictate the level of consistency you will get
with your data.

Regards,

Nick Telford

On 3 September 2010 12:03, Hugo  wrote:

>  Hi,
>
> I'm performing tests with Cassandra 0.6.5 with Hector 0.6.0-14 on a single
> machine (one node cluster). I've noticed an issue with consistency.
>
> In my tests I perform a KeySpace.batchMutate() to update a column and
> immediately after that I perform a KeySpace.getSlice() on the same column
> (from within the same thread). I noticed that occasionally I get back the
> previous value rather than the value I've just written.
>
> My guess is that this occurs because Hector uses pooled connections and
> both my requests are executed on different connections. I suspect this
> causes a race condition in Cassandra between the getSlice() and the
> batchMutate().
>
> Can anyone confirm my suspicions and does anyone have a solution for this?
>
> Groets, Hugo.
>


Re: Consistency issue

2010-09-03 Thread Hugo

 I'm using QUORUM, but in my single-node setup this doesn't matter IMHO.

On 9/3/2010 1:51 PM, Nick Telford wrote:
Which ConsistencyLevels did you use for your batchMutate() and 
getSlice() operations?


ConsistencyLevels directly dictate the level of consistency you will 
get with your data.


Regards,

Nick Telford

On 3 September 2010 12:03, Hugo > wrote:


 Hi,

I'm performing tests with Cassandra 0.6.5 with Hector 0.6.0-14 on
a single machine (one node cluster). I've noticed an issue with
consistency.

In my tests I perform a KeySpace.batchMutate() to update a column
and immediately after that I perform a KeySpace.getSlice() on the
same column (from within the same thread). I noticed that
occasionally I get back the previous value rather than the value
I've just written.

My guess is that this occurs because Hector uses pooled
connections and both my requests are executed on different
connections. I suspect this causes a race condition in Cassandra
between the getSlice() and the batchMutate().

Can anyone confirm my suspicions and does anyone have a solution
for this?

Groets, Hugo.




Re: Consistency issue

2010-09-03 Thread Nick Telford
Are you using QUORUM for both writes and reads?

The behaviour you're seeing sounds like something I'd expect to see if you
used NONE for writes.

On 3 September 2010 14:12, Hugo  wrote:

>  I'm using QUORUM, but in my single-node setup this doesn't matter IMHO.
>
>
> On 9/3/2010 1:51 PM, Nick Telford wrote:
>
> Which ConsistencyLevels did you use for your batchMutate() and getSlice()
> operations?
>
>  ConsistencyLevels directly dictate the level of consistency you will get
> with your data.
>
>  Regards,
>
>  Nick Telford
>
> On 3 September 2010 12:03, Hugo  wrote:
>
>>  Hi,
>>
>> I'm performing tests with Cassandra 0.6.5 with Hector 0.6.0-14 on a single
>> machine (one node cluster). I've noticed an issue with consistency.
>>
>> In my tests I perform a KeySpace.batchMutate() to update a column and
>> immediately after that I perform a KeySpace.getSlice() on the same column
>> (from within the same thread). I noticed that occasionally I get back the
>> previous value rather than the value I've just written.
>>
>> My guess is that this occurs because Hector uses pooled connections and
>> both my requests are executed on different connections. I suspect this
>> causes a race condition in Cassandra between the getSlice() and the
>> batchMutate().
>>
>> Can anyone confirm my suspicions and does anyone have a solution for this?
>>
>> Groets, Hugo.
>>
>
>


Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Viktor Jevdokimov
Hi,

We're not setting the cache capacity upon creation of a Column Family, since the type 
and capacity are unknown at that time. By default it is 0.

After a Column Family has enough data and we can decide on the cache type (Row or 
Key) and capacity, we connect with JConsole and set the cache capacity manually on 
every node. But after a Cassandra restart the cache capacity is 0 again.

How do we avoid losing the cache capacity after a restart?


Viktor


Re: Consistency issue

2010-09-03 Thread Hugo

 I'm using the Hector defaults, which are QUORUM for reads and writes.

On 9/3/2010 3:18 PM, Nick Telford wrote:

Are you using QUORUM for both writes and reads?

The behaviour you're seeing sounds like something I'd expect to see if 
you used NONE for writes.


On 3 September 2010 14:12, Hugo > wrote:


I'm using QUORUM, but in my single-node setup this doesn't matter
IMHO.


On 9/3/2010 1:51 PM, Nick Telford wrote:

Which ConsistencyLevels did you use for your batchMutate() and
getSlice() operations?

ConsistencyLevels directly dictate the level of consistency you
will get with your data.

Regards,

Nick Telford

On 3 September 2010 12:03, Hugo  wrote:

 Hi,

I'm performing tests with Cassandra 0.6.5 with Hector
0.6.0-14 on a single machine (one node cluster). I've noticed
an issue with consistency.

In my tests I perform a KeySpace.batchMutate() to update a
column and immediately after that I perform a
KeySpace.getSlice() on the same column (from within the same
thread). I noticed that occasionally I get back the previous
value rather than the value I've just written.

My guess is that this occurs because Hector uses pooled
connections and both my requests are executed on different
connections. I suspect this causes a race condition in
Cassandra between the getSlice() and the batchMutate().

Can anyone confirm my suspicions and does anyone have a
solution for this?

Groets, Hugo.






Re: Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Edward Capriolo
On Fri, Sep 3, 2010 at 9:22 AM, Viktor Jevdokimov
 wrote:
> Hi,
>
>
>
> We’re not setting cache capacity upon creation of Column Family, since the
> type and capacity is unknown at that time. By default it = 0.
>
>
>
> After Column Family has enough data and we could decide on cache type (Row
> or Key) and capacity, we connect with JConsole and set cache capacity
> manually on every node. But after Cassandra restart cache capacity is 0
> again.
>
>
>
> How to avoid losing cache capacity after restart?
>
>
>
>
>
> Viktor

Viktor,
I will assume you are using 0.6.x.

In 0.6.x, changes to cache capacity through JMX are NOT saved. Use the
KeysCached and RowsCached attributes in the ColumnFamily definition instead.
Be warned the attributes are CaSe SeNSitive!

Re: the process of reading and writing

2010-09-03 Thread Jonathan Ellis
To the degree that this suggests that there is a "master" node for
each range, IMO it is a "bug" in the paper.  (There are several of
these.)  Certainly there are no master nodes in Cassandra.

On Fri, Sep 3, 2010 at 12:02 AM, Ying Tang  wrote:
> In dynamo's paper ,it says:
> Each key, k, is assigned to a coordinator node .
> The coordinator is in charge of the replication of the data items that fall
> within its range.
> On Fri, Sep 3, 2010 at 2:56 PM, Benjamin Black  wrote:
>>
>> On Thu, Sep 2, 2010 at 8:19 PM, Ying Tang  wrote:
>> > Recently , i read the paper about Cassandra again .
>> > And now i have some concepts about  the reading and writing .
>> > We all know Cassandra uses NWR ,
>> > When read :
>> > the request ---> a random node in Cassandra .This node acts as a proxy
>> > ,and
>> > it routes the request.
>> > Here ,
>> > 1. the proxy node route this request to this key's coordinator , the
>> > coordinator then routes request to other N-1 nodes   OR   the proxy
>> > routes
>> > the read request to N nodes ?
>>
>> The coordinator node is the proxy node.
>>
>> > 2. If it is the former situation , the read repair occurs on the  key's
>> > coordinator ?
>> >    If  it is the latter , the  read repair occurs on the proxy node ?
>>
>> Depends on the CL requested.  QUORUM and ALL cause the RR to be
>> performed by the coordinator.  ANY and ONE cause RR to be delegated to
>> one of the replicas for the key.
>>
>> > When write :
>> > the request ---> a random node in Cassandra .This node acts as a proxy
>> > ,and
>> > it routes the request.
>> > Here ,
>> > 3. the proxy node route this request to this key's coordinator , the
>> > coordinator then routes request to other N-1 nodes   OR   the proxy
>> > routes
>> > the request to N nodes ?
>> >
>>
>> For writes, the coordinator sends the writes directly to the replicas
>> regardless of CL (rather than delegating for weakly consistent CLs).
>>
>> > 4. The N isn't the data's copy numbers , it's just a  range . In this  N
>> > range , there must be W copies .So W is the copy numbers.
>> > So in this N range , R+W>N can guarantee the data's validity. Right?
>> >
>>
>> Sorry, I can't even parse this.
>>
>>
>> b
>
>
>
> --
> Best regards,
> Ivy Tang
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


RE: Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Viktor Jevdokimov
Forgot to mention the version: 0.7 beta 1

-Original Message-
From: Edward Capriolo [mailto:edlinuxg...@gmail.com] 
Sent: Friday, September 03, 2010 4:59 PM
To: user@cassandra.apache.org
Subject: Re: Cache capacity set with JConsole is lost after restart

On Fri, Sep 3, 2010 at 9:22 AM, Viktor Jevdokimov
 wrote:
> Hi,
>
>
>
> We're not setting cache capacity upon creation of Column Family, since the
> type and capacity is unknown at that time. By default it = 0.
>
>
>
> After Column Family has enough data and we could decide on cache type (Row
> or Key) and capacity, we connect with JConsole and set cache capacity
> manually on every node. But after Cassandra restart cache capacity is 0
> again.
>
>
>
> How to avoid losing cache capacity after restart?
>
>
>
>
>
> Viktor

Viktor,
I will assume you are using 6.X.

In 6.X changes to cache capacity through JMX are NOT saved. Use the
KeysCached and RowsCached attributes in the ColumnFamily definition.

Re: Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Jonathan Ellis
That doesn't matter, the config file is the Source Of Truth for the
values it has.

On Fri, Sep 3, 2010 at 7:12 AM, Viktor Jevdokimov
 wrote:
> Forgot to mention the version: 0.7 beta 1
>
> -Original Message-
> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
> Sent: Friday, September 03, 2010 4:59 PM
> To: user@cassandra.apache.org
> Subject: Re: Cache capacity set with JConsole is lost after restart
>
> On Fri, Sep 3, 2010 at 9:22 AM, Viktor Jevdokimov
>  wrote:
>> Hi,
>>
>>
>>
>> We're not setting cache capacity upon creation of Column Family, since the
>> type and capacity is unknown at that time. By default it = 0.
>>
>>
>>
>> After Column Family has enough data and we could decide on cache type (Row
>> or Key) and capacity, we connect with JConsole and set cache capacity
>> manually on every node. But after Cassandra restart cache capacity is 0
>> again.
>>
>>
>>
>> How to avoid losing cache capacity after restart?
>>
>>
>>
>>
>>
>> Viktor
>
> Viktor,
> I will assume you are using 6.X.
>
> In 6.X changes to cache capacity through JMX are NOT saved. Use the
> KeysCached and RowsCached
>
>                      ColumnType="Super"
>                    CompareWith="UTF8Type"
>                    CompareSubcolumnsWith="UTF8Type"
>                    RowsCached="1"
>                    KeysCached="111"
>
> Be warned the attributes are CaSe SeNSitive!
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Impact on running cassandra cluster from changing hostnames...

2010-09-03 Thread Jonathan Ellis
It's supposed to work, but it's definitely a rare thing to do.

You should be concerned about other nodes rejecting the new node when it
says, "remember this token? I'm at IP Y now instead of Z."  If that
happens, shutting down all nodes and restarting will fix it.

On Thu, Sep 2, 2010 at 3:03 PM, Ned Wolpert  wrote:
> Folks-
>   What is the correct process of changing the hostnames and IPs of each
> server in a cassandra cluster. In my use-case we're shutting it down and
> then changing the names and ips. No changes to hardware during the
> processes. Beyond config changes, what should I be concerned about?
> --
> Virtually, Ned Wolpert
>
> "Settle thy studies, Faustus, and begin..."   --Marlowe
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Re: Broken pipe

2010-09-03 Thread Jonathan Shook
I have been able to reproduce this, although it was a bug in
application client code: if you keep a Thrift client around
after it has had an exception, it may generate this error.

In my case, I was holding a reference via a ThreadLocal<> to a stale
storage object.

Another symptom which may help identify this scenario is that the
broken client will not initiate any network traffic, not even a SYN
packet. You may have to shut down other client traffic on the client
node in order to see this...
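The remedy Jonathan describes is to never reuse a client whose transport has already thrown. A hedged sketch of that discard-on-error pattern (the client factory and RPC callables here are placeholders, not Thrift's actual API):

```python
class ReconnectingClient:
    """Wraps a client factory; any failure discards the broken client
    so the next call starts from a freshly built connection."""

    def __init__(self, make_client):
        self._make_client = make_client
        self._client = None

    def call(self, fn, *args):
        if self._client is None:
            self._client = self._make_client()
        try:
            return fn(self._client, *args)
        except Exception:
            self._client = None  # never reuse a client that has failed
            raise

created = []

def make_client():
    created.append(object())  # stand-in for opening a Thrift transport
    return created[-1]

def failing_rpc(client):
    raise RuntimeError("Broken pipe")

def ok_rpc(client):
    return "ok"

rc = ReconnectingClient(make_client)
try:
    rc.call(failing_rpc)
except RuntimeError:
    pass
assert rc.call(ok_rpc) == "ok"
assert len(created) == 2  # a fresh client was built after the failure
```

The same idea applies to ThreadLocal-held clients: clear the thread-local slot on any transport exception instead of keeping the stale object around.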


2010/4/28 Jonathan Ellis :
> did you check the log for exceptions?
>
> On Wed, Apr 28, 2010 at 12:08 AM, Bingbing Liu  wrote:
>> but the situation is that at the beginning everything goes well; then, when
>> get_range_slices has fetched about 13,000,000 rows (with the key range set
>> to 2000), the exception happens.
>>
>> And when I do the same thing on a smaller data set, no such thing happens.
>>
>> 2010-04-28
>> 
>> Bingbing Liu
>> 
>> From: Jonathan Ellis
>> Sent: 2010-04-27 20:51:11
>> To: user
>> Cc: rucbing
>> Subject: Re: Broken pipe
>> get_range_slices works fine in the system tests, so something is wrong
>> on your client side.  Some possibilities:
>>  - sending to a non-Thrift port
>>  - using an incompatible set of Thrift bindings than the one your
>> server supports
>>  - mixing a framed client with a non-framed server or vice versa
>> [moving followups to user list]
>> 2010/4/27 Bingbing Liu :
>>> when i use get_range_slices, i get the exceptions , i don't know what 
>>> happens
>>>
>>> hope someone can help me
>>>
>>>
>>> org.apache.thrift.transport.TTransportException: java.net.SocketException: 
>>> Broken pipe
>>>at 
>>> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:142)
>>>at 
>>> org.apache.thrift.protocol.TBinaryProtocol.writeI32(TBinaryProtocol.java:152)
>>>at 
>>> org.apache.thrift.protocol.TBinaryProtocol.writeMessageBegin(TBinaryProtocol.java:80)
>>>at 
>>> org.apache.cassandra.thrift.Cassandra$Client.send_get_range_slices(Cassandra.java:592)
>>>at 
>>> org.apache.cassandra.thrift.Cassandra$Client.get_range_slices(Cassandra.java:586)
>>>at org.clouddb.test.GrepSelect.main(GrepSelect.java:64)
>>> Caused by: java.net.SocketException: Broken pipe
>>>at java.net.SocketOutputStream.socketWrite0(Native Method)
>>>at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
>>>at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
>>>at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
>>>at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
>>>at 
>>> org.apache.thrift.transport.TIOStreamTransport.write(TIOStreamTransport.java:140)
>>>... 5 more
>>>
>>>
>>> 2010-04-27
>>>
>>>
>>>
>>> Bingbing Liu
>>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


RE: Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Jeremiah Jordan
But the config file doesn't hold those values anymore with 0.7.
There is a JIRA ticket out there for APIs to modify column family 
settings in 0.7; it may cover doing this.  I would guess that the tracking 
tables in the system keyspace aren't being updated with the values you are 
setting from JMX, so on restart they go back to what they were on creation.

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com] 
Sent: Friday, September 03, 2010 9:28 AM
To: user@cassandra.apache.org
Subject: Re: Cache capacity set with JConsole is lost after restart

That doesn't matter, the config file is the Source Of Truth for the
values it has.

On Fri, Sep 3, 2010 at 7:12 AM, Viktor Jevdokimov
 wrote:
> Forgot to mention the version: 0.7 beta 1
>
> -Original Message-
> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
> Sent: Friday, September 03, 2010 4:59 PM
> To: user@cassandra.apache.org
> Subject: Re: Cache capacity set with JConsole is lost after restart
>
> On Fri, Sep 3, 2010 at 9:22 AM, Viktor Jevdokimov
>  wrote:
>> Hi,
>>
>>
>>
>> We're not setting cache capacity upon creation of Column Family, since the
>> type and capacity is unknown at that time. By default it = 0.
>>
>>
>>
>> After Column Family has enough data and we could decide on cache type (Row
>> or Key) and capacity, we connect with JConsole and set cache capacity
>> manually on every node. But after Cassandra restart cache capacity is 0
>> again.
>>
>>
>>
>> How to avoid losing cache capacity after restart?
>>
>>
>>
>>
>>
>> Viktor
>
> Viktor,
> I will assume you are using 6.X.
>
> In 6.X changes to cache capacity through JMX are NOT saved. Use the
> KeysCached and RowsCached
>
>                      ColumnType="Super"
>                    CompareWith="UTF8Type"
>                    CompareSubcolumnsWith="UTF8Type"
>                    RowsCached="1"
>                    KeysCached="111"
>
> Be warned the attributes are CaSe SeNSitive!
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Cache capacity set with JConsole is lost after restart

2010-09-03 Thread Jonathan Ellis
Right, in 0.7 the system calls for modifying these are the source of
truth.  See https://issues.apache.org/jira/browse/CASSANDRA-1285

On Fri, Sep 3, 2010 at 11:17 AM, Jeremiah Jordan
 wrote:
> But the config file doesn't hold those values anymore with 0.7.
> There is a JIRA ticket out there for api's to modify stuff about column 
> families in 0.7, it may cover doing this.  I would guess that the tracking 
> tables in the system keyspace aren't being updated with the values you are 
> setting from JMX, so on restart they go back to what they were on creation.
>
> -Original Message-
> From: Jonathan Ellis [mailto:jbel...@gmail.com]
> Sent: Friday, September 03, 2010 9:28 AM
> To: user@cassandra.apache.org
> Subject: Re: Cache capacity set with JConsole is lost after restart
>
> That doesn't matter, the config file is the Source Of Truth for the
> values it has.
>
> On Fri, Sep 3, 2010 at 7:12 AM, Viktor Jevdokimov
>  wrote:
>> Forgot to mention the version: 0.7 beta 1
>>
>> -Original Message-
>> From: Edward Capriolo [mailto:edlinuxg...@gmail.com]
>> Sent: Friday, September 03, 2010 4:59 PM
>> To: user@cassandra.apache.org
>> Subject: Re: Cache capacity set with JConsole is lost after restart
>>
>> On Fri, Sep 3, 2010 at 9:22 AM, Viktor Jevdokimov
>>  wrote:
>>> Hi,
>>>
>>>
>>>
>>> We're not setting cache capacity upon creation of Column Family, since the
>>> type and capacity is unknown at that time. By default it = 0.
>>>
>>>
>>>
>>> After Column Family has enough data and we could decide on cache type (Row
>>> or Key) and capacity, we connect with JConsole and set cache capacity
>>> manually on every node. But after Cassandra restart cache capacity is 0
>>> again.
>>>
>>>
>>>
>>> How to avoid losing cache capacity after restart?
>>>
>>>
>>>
>>>
>>>
>>> Viktor
>>
>> Viktor,
>> I will assume you are using 6.X.
>>
>> In 6.X changes to cache capacity through JMX are NOT saved. Use the
>> KeysCached and RowsCached
>>
>>  >                    ColumnType="Super"
>>                    CompareWith="UTF8Type"
>>                    CompareSubcolumnsWith="UTF8Type"
>>                    RowsCached="1"
>>                    KeysCached="111"
>>
>> Be warned the attributes are CaSe SeNSitive!
>>
>>
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


servers for cassandra

2010-09-03 Thread vineet daniel
Hi

I am just curious to know whether there is any hosting company that provides
servers at a very low cost, where I can run Cassandra over a WAN. I have
a Cassandra setup on my LAN and want to test it in real conditions; taking
dedicated servers just for testing purposes is not at all feasible for me,
not even the pay-as-you-go types. I'd really appreciate it if anybody can share
information on such hosting providers.

Vineet Daniel
Cell  : +918106217121
Websites : Blog | Linkedin | Twitter