Re: Passing client as parameter

2010-06-10 Thread Ran Tavory
You can look at
http://github.com/rantav/hector/blob/master/src/main/java/me/prettyprint/cassandra/service/CassandraClientFactory.java

so, to close the client you can just get the transport out of the client
(bold):

  private void closeClient(CassandraClient cclient) {
    log.debug("Closing client {}", cclient);
    ((CassandraClientPoolImpl) pool).reportDestroyed(cclient);
    Cassandra.Client client = cclient.getCassandra();
    *client.getInputProtocol().getTransport().close();*
    *client.getOutputProtocol().getTransport().close();*
    cclient.markAsClosed();
  }

But to create a client you need a transport (bold):

  private Cassandra.Client createThriftClient(String url, int port)
      throws TTransportException, TException {
    log.debug("Creating a new thrift connection to {}:{}", url, port);
    TTransport tr;
    if (useThriftFramedTransport) {
      tr = new TFramedTransport(new TSocket(url, port, timeout));
    } else {
      tr = new TSocket(url, port, timeout);
    }
    TProtocol proto = new TBinaryProtocol(tr);
    *Cassandra.Client client = new Cassandra.Client(proto);*
    try {
      tr.open();
    } catch (TTransportException e) {
      // Thrift exceptions aren't very good in reporting, so we have to
      // catch the exception here and add details to it.
      log.error("Unable to open transport to " + url + ":" + port, e);
      clientMonitor.incCounter(Counter.CONNECT_ERROR);
      throw new TTransportException("Unable to open transport to " + url
          + ":" + port + " , " + e.getLocalizedMessage(), e);
    }
    return client;
  }


So instead of passing a client to the method, you can pass a URL (host and port)
to the method. The method would then open the transport, create a client, perform
some cassandra operations and close the transport.
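
A minimal sketch of that pattern (a hypothetical helper, not part of Hector; it
assumes a plain unframed TSocket and leaves the actual Cassandra calls to the
caller):

  import org.apache.cassandra.thrift.Cassandra;
  import org.apache.thrift.TException;
  import org.apache.thrift.protocol.TBinaryProtocol;
  import org.apache.thrift.protocol.TProtocol;
  import org.apache.thrift.transport.TSocket;
  import org.apache.thrift.transport.TTransport;

  // Hypothetical helper: open, use, close around each unit of work.
  public class ThriftPerCallExample {
    public void withCassandra(String url, int port) throws TException {
      TTransport tr = new TSocket(url, port);        // fresh transport per call
      TProtocol proto = new TBinaryProtocol(tr);
      Cassandra.Client client = new Cassandra.Client(proto);
      tr.open();
      try {
        // ... perform get/insert/batch_mutate calls on 'client' here ...
      } finally {
        tr.close();                                  // always release the socket
      }
    }
  }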

On Wed, Jun 9, 2010 at 10:35 PM, Steven Haar wrote:

> C#
>
>
> On Wed, Jun 9, 2010 at 2:34 PM, Ran Tavory  wrote:
>
>> Some languages have higher level clients that might help you. What
>> language are you using?
>>
>> On Jun 9, 2010 9:01 PM, "Steven Haar"  wrote:
>>
>> What is the best way to pass a Cassandra client as a parameter? If you
>> pass it as a parameter, do you also have to pass the transport in order to
>> be able to close the connection? Is there any way to open or close the
>> transport directly from the client?
>>
>> Essentially what I want to do is pass a Cassandra client to a method and
>> then within that method be able to open the transport, execute a get or set
>> to the Cassandra database, and then close the transport all within the
>> method. The only way I see to do this is to also pass the transport to the
>> method.
>>
>>
>


Re: Range search on keys not working?

2010-06-10 Thread David Boxenhorn
My experience is the same as Philip's. My point was simply that there is no
way to get a range more restrictive than "all" if you use random
partitioning.

2010/6/9 Philip Stanhope 

> If you are using random partitioner, and you want to do an EXPENSIVE row
> scan ... I found that I could iterate using start_key="" end_key="" for
> first call ... and then all other calls you'd provide the
> start_key="LAST_KEY" from previous iteration. If you set count to 1000, then
> you'll get 1000 keys first time ... 999 for each additional iteration ...
> until you receive a result that is < count and then you are done. In another
> century this is a crude pager or cursor approach with no server-side
> knowledge of the state.
>
> Caveat: There are no changes occurring in the column family while you are
> doing this type of scan through the keys of a CF. Depending on what you are
> trying to do ... this may not be acceptable.
>
> Another caveat: Once you have sufficient amount of random keys in a CF ...
> there are practical limits that you'll soon reach over the amount of data
> you can receive in a Thrift response and/or the cost of building the
> response (timeouts may occur or you may exhaust memory at the node servicing
> the request).
>
> The same concerns apply to columns accessed via get_slice ... the # of
> columns and the values of those columns will run the potential of causing a
> timeout on the request or too much data to satisfy the request.
>
> Once you have sufficiently large keyspace (10M, 100M?) this approach is not
> sufficient or scalable. If you want to perform analysis it may very well be
> better to get the data into another format that is more appropriate for
> analytics (hadoop?). My production environment will have 4+ different
> distributed data stores: file system, relational (clustered on distributed
> file system), distributed key store (cassandra) and analytics (tbd, could be
> multiple). They each serve different purposes for historical and
> performance/operational considerations.
>
> Why you would want to iterate over every single key in a random partitioned
> CF is another thing altogether. I had my own reasons (to validate a
> batch_mutate that was inserting 5K - 10K rows at a shot). NOTE: I was
> getting < 1000ms per 5K batch_mutate call ... or > 5K inserts per second per
> thrift client, per node. When this was parallelized using multiple thrift
> clients and hitting multiple nodes in the cluster, I was seeing > 25K
> inserts per second (write consistency, read consistency and replication
> factor are other considerations). Other caveats apply to batch_mutate, it is
> not atomic, but when it works it is much much faster than batching single
> insert calls.
>
> -phil
>
> On Jun 9, 2010, at 12:07 PM, David Boxenhorn wrote:
>
> I don't get what you're saying. If you want to loop over your entire range
> of keys, you can do it with a range query, and start and finish will both be
> "". Is there any scenario where you would want to do a range query where
> start and/or finish do not equal "", if you use random partitioning?
>
> 2010/6/9 Philip Stanhope 
>
>> I feel that there is a significant bit of confusion here.
>>
>> You CAN use start/finish when using get_range_slices with random
>> partitioner. But you can't make any assumptions about what key will be next
>> in the range which is the whole point of "random". If you do know a specific
>> key that you care about, you can use that as a start, but again, you don't
>> know what will come next.
>>
>> If you have a CF with 1M keys ... you can effectively do a full row scan
>> ... it is expensive and you'd have to ask yourself why you'd be wanting to
>> do this in the first place.
>>
>> Ordering with columns for a particular key is completely dependent on the
>> CompareWith choice you make when you defined the column family. For example,
>> you can make assumptions about the sequencing of columns returned from
>> get_slice (NOT get_range_slices).
>>
>> -phil
>>
>> On Jun 9, 2010, at 7:29 AM, David Boxenhorn wrote:
>>
>> To use start and finish parameters at all, you need to use OPP. Start and
>> finish parameters don't work if you don't use OPP, i.e. the result set won't
>> be:  start <= resultSet < finish
>>
>> 2010/6/9 Ben Browning 
>>
>>> OPP stands for Order-Preserving Partitioner. For more information on
>>> partitioners, look here:
>>>
>>> http://wiki.apache.org/cassandra/StorageConfiguration#Partitioner
>>>
>>> To do key range slices that use both start and finish parameters and
>>> retrieve keys in-order, you need to use an ordered partitioner -
>>> either the built-in OPP or your own custom one.
>>>
>>> Ben
>>>
>>> On Tue, Jun 8, 2010 at 10:26 PM, sina  wrote:
>>> > what's the mean of opp? And How can i make the "start" and "finish"
>>> useful
>>> > and make sense?
>>> >
>>> >
>>> > 2010-06-09
>>> > 
>>> > 9527
>>> > 
>>> > 发件人: Ben Browning
>>> > 发送时间: 2010-06-02  2

Granularity SSTables.

2010-06-10 Thread xavier manach
Hi.

  I'm trying to understand tricks I can use with SSTables for
faster manipulation of data across clusters.

I learned how to copy a keyspace from the data directories to a new node and
change the replication factor (thx Jonathan).

If I understood correctly, each SSTable has 3 files:
  ColumnFamily-ID-Data.db
  ColumnFamily-ID-Index.db
  ColumnFamily-ID-Filter.db

  Suppose I want to merge data from 2 clusters that have different keys (each
key exists in only one cluster) but the same ColumnFamily.
Can I copy all the SSTable files with the same method?
> 1. nodetool drain & stop original node
> 2. copy everything ***the sstable files*** in data/ directories (but not the system
> keyspace!) to the new node
> 3. restart with autobootstrap=false [the default]

Thx.



On Tue, Jun 8, 2010 at 7:12 AM, xavier manach  wrote:
> Hi.
>
>   I have a cluster with only 1 node holding a lot of data (500 GB).
>   I want to add a new node with the same data (with a ReplicationFactor
> of 2).
>
> The normal method is:
> stop node.
> add a node.
> change replication factor to 2.
> start nodes
> use nodetool repair
>
>   But I don't know if this other method is valid, and if it can
> be faster:
> stop nodes.
> copy all SSTables
> change replication factor.
> start nodes
> and
> use nodetool repair
>
>   Do you have an idea of the fastest valid method?
>
> Thx.
>


single node capacity

2010-06-10 Thread hive13 Wong
Hi,

How much data load can a single typical cassandra instance handle?
It seems like we are getting into trouble when one of our nodes' data load grows
beyond 200 GB. Both read latency and write latency are increasing,
varying from 10 to several thousand milliseconds.
Machine config is 16 CPUs, 32 GB RAM.
Heap size is 10 GB.
Any suggestions for tuning?
Or should I start considering adding more nodes when the data grows this
big?

Thanks


RE: single node capacity

2010-06-10 Thread Dr. Martin Grabmüller
Your problem is probably not the amount of data you store, but the number of
SSTable files.  When these increase, read latency goes up.  Write latency may
also go up because of compaction.  Check in the data directory whether there are
many data files, and check via JMX whether compaction is happening.

My recommendation is to reduce the write traffic to the nodes so that each node
can keep up with compaction.  If reducing the load is not possible, you have to
add nodes (or get faster hard disks, but that is often not possible).

Martin




From: hive13 Wong [mailto:hiv...@gmail.com] 
Sent: Thursday, June 10, 2010 2:58 PM
To: user@cassandra.apache.org
Subject: single node capacity


Hi,  

How much data load can a single typical cassandra instance handle?
It seems like we are getting into trouble when one of our node's load 
grows to bigger than 200g. Both read latency and write latency are increasing, 
varying from 10 to several thousand milliseconds.
machine config is 16*cpu 32G RAM 
Heap size is 10G
Any suggestion of tuning?
Or should I start considering adding more nodes when the data grows to 
this big?

Thanks



Re: single node capacity

2010-06-10 Thread hive13 Wong
You are right, our write traffic is indeed pretty intense as we are now at the
stage of initializing data.
Then we do need some more nodes here.

Thanks very much Martin.

On Thu, Jun 10, 2010 at 9:04 PM, Dr. Martin Grabmüller <
martin.grabmuel...@eleven.de> wrote:

>  Your problem is probably not the amount of data you store, but the number
> of
> SSTable files.  When these increase, read latency goes up.  Write latency
> maybe
> goes up because of compaction.  Check in the data directory, whether there
> are many
> data files, and check via JMX whether compaction is happening.
>
> My recommendation is to reduce the write traffic to the nodes so that each
> node can
> keep up with compaction.  If reducing the load is not possible, you have to
> add nodes
> (or get faster hard disks, but that is often not possible).
>
> Martin
>
>  --
> *From:* hive13 Wong [mailto:hiv...@gmail.com]
> *Sent:* Thursday, June 10, 2010 2:58 PM
> *To:* user@cassandra.apache.org
> *Subject:* single node capacity
>
> Hi,
>
> How much data load can a single typical cassandra instance handle?
> It seems like we are getting into trouble when one of our node's load grows
> to bigger than 200g. Both read latency and write latency are increasing,
> varying from 10 to several thousand milliseconds.
> machine config is 16*cpu 32G RAM
> Heap size is 10G
> Any suggestion of tuning?
> Or should I start considering adding more nodes when the data grows to this
> big?
>
> Thanks
>
>


Best way of adding new nodes

2010-06-10 Thread hive13 Wong
Hi, guys

Regarding the 2 ways of adding new nodes: when we add with bootstrapping, since
we've already got lots of data, it often takes many hours to complete the
bootstrapping and will probably affect the performance of existing nodes. But if
we add without bootstrapping, the data load on the new node could be quite
unbalanced.

Is there a better way of adding a new node, one with less cost but more
balance? Maybe a steady approach that takes longer but doesn't affect the
current nodes much.

Thanks


keyrange for get_range_slices

2010-06-10 Thread Dop Sun
Hi,

 

As documented in http://wiki.apache.org/cassandra/API, the key range for
get_range_slices is inclusive on both ends.

 

As discussed in this thread:
http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453cde067d3,
there is a case where a user wants to discover all keys (a huge number)
in a column family.

 

What I am thinking is to do it in batches: use the empty string as both start and
finish for the first query, then use the last key returned as the start of the
second query, and so on.

 

My question is: using this method, the last key returned by the first
query will be returned again as the first key of the second query, which is
a duplicate. Is there any other API to discover keys without duplicates
in the current implementation?
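
For reference, a minimal sketch of the batched scan I have in mind (assuming
the 0.6 Thrift Java API with an open Cassandra.Client named client, placeholder
keyspace/column family names, and java.util collections; the duplicated boundary
key is simply skipped):

  KeyRange range = new KeyRange();
  range.setCount(1000);                                // page size
  range.setStart_key("");
  range.setEnd_key("");
  SlicePredicate predicate = new SlicePredicate();
  predicate.setColumn_names(new ArrayList<byte[]>());  // ask for no columns, keys only
  String lastKey = null;
  while (true) {
      List<KeySlice> page = client.get_range_slices("Keyspace1",
              new ColumnParent("Standard1"), predicate, range, ConsistencyLevel.ONE);
      for (KeySlice slice : page) {
          String key = slice.getKey();
          if (key.equals(lastKey)) {
              continue;                                // skip the duplicated boundary key
          }
          lastKey = key;
          // ... process key ...
      }
      if (page.size() < 1000) {
          break;                                       // a short page means we reached the end
      }
      range.setStart_key(lastKey);                     // next batch starts where this one ended
  }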

 

Thanks,

Regards,

Dop



Re: keyrange for get_range_slices

2010-06-10 Thread Philip Stanhope
No ... and I personally don't have a problem with this if you think about what 
is actually going on under the covers.

Note, however, that this is an expensive operation and as a result if there are 
parallel updates to the indexes while you are performing a full keyscan 
(rowscan) you will potentially miss keys because they are inserted earlier in 
the index than you are currently processing.

A further concern is that the keys (and indexes) are spread around a cluster. 
Unless R=N you will be hitting the network during this type of scan.

Lastly, be careful about how you specify the SlicePredicate. A keyscan can 
easily turn into a "dump the entire datastore" if you aren't careful.

On Jun 10, 2010, at 10:03 AM, Dop Sun wrote:

> Hi,
>  
> As documented in the http://wiki.apache.org/cassandra/API, the key range for 
> get_range_slices are both inclusive.
>  
> As discussed in this thread: 
> http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453cde067d3,
>  there is a case that user want to discover all keys (huge number) in a 
> column family.
>  
> What I think  is doing batchly: using empty string as start and finish first, 
> then using the last key returned as start and query second.
>  
> My question is: using this method, the last key returned for the first query, 
> will be returned again in the second query as the first key. And it’s a 
> duplication. Is there any other API to discover keys without duplications in 
> current implementation?
>  
> Thanks,
> Regards,
> Dop



Re: Quick help on Cassandra please: cluster access and performance

2010-06-10 Thread li wei
Thank you very much, Per!



- Original Message 
From: Per Olesen 
To: "user@cassandra.apache.org" 
Sent: Wed, June 9, 2010 4:02:52 PM
Subject: Re: Quick help on Cassandra please: cluster access and performance


On Jun 9, 2010, at 9:47 PM, li wei wrote:

> Thanks a lot.
> We are set READ one, WRITE ANY. Is this better than QUORUM in performance.

Yes, but less consistency safe.

> Do you think the cassandra  Cluster (with 2 or  nodes) should be always 
> faster than Single one node in the reality and theory?
> Or it depends?

It depends :-)

I think the idea with cassandra is that it scales linearly. So, if you have 
obtained some performance number X for read performance, and you then get lots of 
new users and data, you can keep having X simply by adding new nodes.

But I think there are others on this list with much more insight into this than 
mine!

/Per


  


File Descriptor leak

2010-06-10 Thread Matt Conway
Hi All,

I'm running a small 4-node cluster with minimal load using
the 2010-06-08_12-31-16 build from trunk, and it's exhausting file
descriptors pretty quickly (65K in less than an hour).  Here's a list of the
files I see it leaking; I can do a more specific query if you'd like.  Am I
doing something wrong, is this a known problem, something being done wrong
from the client side, or something else?  Any help appreciated, thanks,

Matt

r...@cassandra01:~# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print
$1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
  3 /mnt/cassandra/data/system/Schema-c-2-Data.db
   1278 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-7-Data.db
   1405 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-9-Data.db
   1895 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-5-Data.db
  26655 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-11-Data.db


Running Cassandra as a Windows Service

2010-06-10 Thread Kochheiser,Todd W - TO-DITT1
For various reasons I am required to deploy systems on Windows.  As such, I 
went looking for information on running Cassandra as a Windows service.  I've 
read some of the user threads regarding running Cassandra as a Windows service, 
such as this one:

http://www.mail-archive.com/user@cassandra.apache.org/msg01656.html

I also found the following JIRA issue:

https://issues.apache.org/jira/browse/CASSANDRA-292

As it didn't look like anyone has contributed a formal solution and having some 
experience using Apache's Procrun 
(http://commons.apache.org/daemon/procrun.html), I decided to go ahead and 
write a batch script and a simple "WindowsService" class to accomplish the 
task.  The WindowsService class only makes calls to public methods in 
CassandraDaemon and is fairly simple.  In combination with the batch script, it 
is very easy to install and remove the service.  At this point, I've installed 
Cassandra as a Windows service on XP (32 bit), Windows 7 (64 bit) and Windows 
Server 2008 R1/R2 (64 bit).  It should work fine on other versions of Windows 
(2K, 2K3).
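
For illustration, a stripped-down sketch of the kind of entry points such a class
needs (this is not my actual class; it assumes procrun's jvm mode with
--StartMethod/--StopMethod pointing at these static methods, and the 0.6-era
org.apache.cassandra.thrift.CassandraDaemon entry point):

  import org.apache.cassandra.thrift.CassandraDaemon;

  // Hypothetical procrun wrapper: --StartClass/--StopClass point at this class.
  public class WindowsService {
      // procrun calls this on "start service"
      public static void start(String[] args) {
          CassandraDaemon.main(args);   // run the normal Cassandra entry point
      }

      // procrun calls this on "stop service"
      public static void stop(String[] args) {
          // A clean JVM exit is treated by procrun as a normal service stop.
          System.exit(0);
      }
  }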

Questions:

1.  Has anyone else already done this work?
2.  If not, I wouldn't mind sharing the code/script or contributing it back 
to the project.  Is there any interest in this from the Cassandra dev team or 
the user community?

Ideally the WindowsService could be included in the distributed source/binary 
distributions (perhaps in a contrib area) as well as the batch script and 
associated procrun executables.  Or, perhaps it could be posted to a Cassandra 
community site (is there one?).

Todd







Re: Running Cassandra as a Windows Service

2010-06-10 Thread Gary Dusbabek
IMO this is one of those things that would bitrot fairly quickly if it
were not maintained.  It may be useful in contrib, where curious
parties could pick it up, get it back in shape, and send in the
changes to be committed.

Judging by the sparse interest so far, this probably wouldn't be a
good fit in core since there don't seem to be many (any?) cassandra
developers who run windows.

Gary.


On Thu, Jun 10, 2010 at 12:34, Kochheiser,Todd W - TO-DITT1
 wrote:
> For various reasons I am required to deploy systems on Windows.  As such, I
> went looking for information on running Cassandra as a Windows service.
> I’ve read some of the user threads regarding running Cassandra as a Windows
> service, such as this one:
>
>     http://www.mail-archive.com/user@cassandra.apache.org/msg01656.html
>
> I also found the following JIRA issue:
>
>     https://issues.apache.org/jira/browse/CASSANDRA-292
>
> As it didn’t look like anyone has contributed a formal solution and having
> some experience using Apache’s Procrun
> (http://commons.apache.org/daemon/procrun.html), I decided to go ahead and
> write a batch script and a simple “WindowsService” class to accomplish the
> task.  The WindowsService class only makes calls to public methods in
> CassandraDeamon and is fairly simple.  In combination with the batch script,
> it is very easy to install and remove the service.  At this point, I’ve
> installed Cassandra as a Windows service on XP (32 bit), Windows 7 (64 bit)
> and Windows Server 2008 R1/R2 (64 bit).  It should work fine on other
> version of Windows (2K, 2K3).
>
> Questions:
>
>
> Has anyone else already done this work?
> If not, I wouldn’t mind sharing the code/script or contributing it back to
> the project.  Is there any interest in this from the Cassandra dev team or
> the user community?
>
>
> Ideally the WindowsService could be included in the distributed
> source/binary distributions (perhaps in a contrib area) as well as the batch
> script and associated procrun executables.  Or, perhaps it could be posted
> to a Cassandra community site (is there one?).
>
> Todd
>
>
>
>
>


Re: Running Cassandra as a Windows Service

2010-06-10 Thread Ben Standefer
"For various reasons I am required to deploy systems on Windows."

I don't think it would be difficult to argue the business case for running
Cassandra on Linux.  It's still a young project and everybody in IRC and the
mailing list is running it on Linux.  You should really re-think whatever
factors are requiring you to run it on Windows and try to overcome those
obstacles.

-Ben



On Thu, Jun 10, 2010 at 10:58 AM, Gary Dusbabek  wrote:

> IMO this is one of those things that would bitrot fairly quickly if it
> were not maintained.  It may be useful in contrib, where curious
> parties could pick it up, get it back in shape, and send in the
> changes to be committed.
>
> Judging by the sparse interest so far, this probably wouldn't be a
> good fit in core since there don't seem to be many (any?) cassandra
> developers who run windows.
>
> Gary.
>
>
> On Thu, Jun 10, 2010 at 12:34, Kochheiser,Todd W - TO-DITT1
>  wrote:
> > For various reasons I am required to deploy systems on Windows.  As such,
> I
> > went looking for information on running Cassandra as a Windows service.
> > I’ve read some of the user threads regarding running Cassandra as a
> Windows
> > service, such as this one:
> >
> >
> http://www.mail-archive.com/user@cassandra.apache.org/msg01656.html
> >
> > I also found the following JIRA issue:
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-292
> >
> > As it didn’t look like anyone has contributed a formal solution and
> having
> > some experience using Apache’s Procrun
> > (http://commons.apache.org/daemon/procrun.html), I decided to go ahead
> and
> > write a batch script and a simple “WindowsService” class to accomplish
> the
> > task.  The WindowsService class only makes calls to public methods in
> > CassandraDeamon and is fairly simple.  In combination with the batch
> script,
> > it is very easy to install and remove the service.  At this point, I’ve
> > installed Cassandra as a Windows service on XP (32 bit), Windows 7 (64
> bit)
> > and Windows Server 2008 R1/R2 (64 bit).  It should work fine on other
> > version of Windows (2K, 2K3).
> >
> > Questions:
> >
> >
> > Has anyone else already done this work?
> > If not, I wouldn’t mind sharing the code/script or contributing it back
> to
> > the project.  Is there any interest in this from the Cassandra dev team
> or
> > the user community?
> >
> >
> > Ideally the WindowsService could be included in the distributed
> > source/binary distributions (perhaps in a contrib area) as well as the
> batch
> > script and associated procrun executables.  Or, perhaps it could be
> posted
> > to a Cassandra community site (is there one?).
> >
> > Todd
> >
> >
> >
> >
> >
>


read operation is slow

2010-06-10 Thread Caribbean410
Hello,

I am testing the performance of cassandra. We write 200k records to the
database and each record is 1k in size. Then we read these 200k records.
It takes more than 400s to finish the read, which is much slower than
mysql (around 20s). I read some discussion online and someone suggested
making multiple connections to make it faster. But I am not sure how
to do that: do I need to change my storage settings file or just change
the java client code?

Here is my read code:

Properties info = new Properties();
info.put(DriverManager.CONSISTENCY_LEVEL, ConsistencyLevel.ONE.toString());

// 1. Get a connection
IConnection connection = DriverManager.getConnection(
        "thrift://localhost:9160", info);

// 2. Get a KeySpace by name
IKeySpace keySpace = connection.getKeySpace("Keyspace1");

// 3. Get a ColumnFamily by name
IColumnFamily cf = keySpace.getColumnFamily("Standard2");

ByteArray nameFirst = ByteArray.ofASCII("first");
ICriteria criteria = cf.createCriteria();
long readBytes = 0;
long start = System.currentTimeMillis();
for (int i = 0; i < numOfRecords; i++) {
    int n = random.nextInt(numOfRecords);
    userName = keySet[n];
    criteria.keyList(Lists.newArrayList(userName))
            .columnRange(nameFirst, nameFirst, 10);
    Map<String, List<IColumn>> map = criteria.select();
    List<IColumn> list = map.get(userName);
    ByteArray bloc = list.get(0).getValue();
    byte[] byteArrayloc = bloc.toByteArray();
    loc = new String(byteArrayloc);
    // System.out.println(userName + " " + loc);
    readBytes = readBytes + loc.length();
}

long finish = System.currentTimeMillis();

I once commented out these lines:

    ByteArray bloc = list.get(0).getValue();
    byte[] byteArrayloc = bloc.toByteArray();
    loc = new String(byteArrayloc);
    // System.out.println(userName + " " + loc);
    readBytes = readBytes + loc.length();

And the performance doesn't improve much.
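
One way I could try the multiple-connections suggestion is to leave the storage
settings alone and only change the client code: give each thread its own
connection and a share of the keys. A rough sketch reusing the same jassandra
calls as above (it assumes the fields info, keySet, nameFirst and numOfRecords
are accessible, uses java.util.concurrent, and an 8-thread pool chosen
arbitrarily):

final int threads = 8;                                 // arbitrary thread count
ExecutorService pool = Executors.newFixedThreadPool(threads);
for (int t = 0; t < threads; t++) {
    final int offset = t;
    pool.submit(new Runnable() {
        public void run() {
            try {
                // one connection per thread
                IConnection conn = DriverManager.getConnection(
                        "thrift://localhost:9160", info);
                IColumnFamily cf = conn.getKeySpace("Keyspace1")
                        .getColumnFamily("Standard2");
                ICriteria criteria = cf.createCriteria();
                // each thread reads its own slice of the key set
                for (int i = offset; i < numOfRecords; i += threads) {
                    criteria.keyList(Lists.newArrayList(keySet[i]))
                            .columnRange(nameFirst, nameFirst, 10);
                    criteria.select();
                }
                // release the connection here if the API provides a close/release call
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    });
}
pool.shutdown();                                       // then await termination and stop the timer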

Any suggestion is welcome. Thanks,


Re: Best way of adding new nodes

2010-06-10 Thread Jonathan Ellis
It's not just a matter of being balanced: if you add a new node without
bootstrapping, the others will think it has data on it that hasn't
actually been moved there.

On Thu, Jun 10, 2010 at 6:50 AM, hive13 Wong  wrote:
> Hi, guys
> The 2 ways of adding new nodes, when add with bootstrapping, since we've
> already got lots of data, often it will take many hours to complete the
> bootstrapping and probably affect the performance of existing nodes. But if
> we add without bootstrapping, the data load on the new node could be quite
> unbalanced.
> Is there a better way of adding a new node with less cost while more
> balanced? Maybe a steady approach at a cost of longer time while not
> affecting current nodes much.
> Thanks



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: cassandra out of heap space crash

2010-06-10 Thread Ran Tavory
I can't say exactly how much memory is the correct amount, but surely 1G is
very little.
By replicating 3 times, your cluster now does 3 times more work than it used
to, both on reads and on writes, while the readers/writers continue
hammering it at the same pace.

So once you've upped your memory (try 4g, if not enough 8g, etc.), if this
still doesn't help, you want to look at either adding capacity or slowing
down your writes.
Which consistency level are you writing with? You can try ALL; this will
slow down your writes just as much as needed for the cluster to catch its breath
(or so I hope, I never actually tried that...)

On Fri, Jun 11, 2010 at 12:26 AM, Julie  wrote:

> I am running an 8 node cassandra cluster with each node on its own
> dedicated VM.
>
> My app very quickly populates the database with about 100,000 rows of data
> (each row is about 100K bytes) times the number of nodes in my cluster so
> there's about 100,000 rows of data on each node (seems very evenly
> distributed).
>
> I have been running my app fairly successfully but today changed the
> replication
> factor from 1 to 3. (I first took down the servers, nuked their data
> directories, copied over the new storage-conf.xml to each node, then
> restarted
> the servers.)  My app begins by populating the database with fresh data.
>  During
> the writing phase, all the cassandra servers, one by one, started getting
> an
> out-of-memory exception.  Here's the output from the first to die:
>
> INFO [COMMIT-LOG-WRITER] 2010-06-10 14:18:54,609 CommitLog.java (line 407)
> Discarding obsolete commit
>
> log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1276193883235.log)
>
> INFO [ROW-MUTATION-STAGE:5] 2010-06-10 14:18:55,499 ColumnFamilyStore.java
> (line 609) Enqueuing flush of Memtable(Standard1)@19571399
>
> INFO [GMFD:1] 2010-06-10 14:19:01,556 Gossiper.java (line 568)
> InetAddress /10.210.69.221 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:35,136 Gossiper.java (line 568)
> InetAddress /10.254.242.228 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:35,137 Gossiper.java (line 568)
> InetAddress /10.201.207.129 is now UP
> INFO [GMFD:1] 2010-06-10 14:20:36,922 Gossiper.java (line 568)
> InetAddress /10.198.37.241 is now UP
>
> INFO [GC inspection] 2010-06-10 14:19:03,722 GCInspector.java (line 110)
> GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048
> used;
> max is 1174339584
> INFO [GC inspection] 2010-06-10 14:21:09,068 GCInspector.java (line 110) GC
> for
> ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used;
> max is
> 1174339584
> INFO [Timer-1] 2010-06-10 14:21:09,068 Gossiper.java (line 179)
> InetAddress /10.198.37.241 is now dead.
> INFO [Timer-1] 2010-06-10 14:21:12,045 Gossiper.java (line 179)
> InetAddress /10.210.69.221 is now dead.
>  INFO [GMFD:1] 2010-06-10 14:21:12,046 Gossiper.java (line 568)
> InetAddress /10.210.203.210 is now UP
>  INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568)
> InetAddress /10.210.69.221 is now UP
>  INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568)
> InetAddress /10.192.218.117 is now UP
>  INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568)
> InetAddress /10.198.37.241 is now UP
>  INFO [GMFD:1] 2010-06-10 14:21:12,307 Gossiper.java (line 568)
> InetAddress /10.254.138.226 is now UP
> ERROR [ROW-MUTATION-STAGE:25] 2010-06-10 14:21:15,127 CassandraDaemon.java
> (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:25,5,main]
> java.lang.OutOfMemoryError: Java heap space
>at
>
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:84)
>at
>
> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:29)
>at
> org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns
> (ColumnFamilySerializer.java:117)
>at
> org.apache.cassandra.db.ColumnFamilySerializer.deserialize
> (ColumnFamilySerializer.java:108)
>at
> org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps
> (RowMutation.java:359)
>at
> org.apache.cassandra.db.RowMutationSerializer.deserialize
> (RowMutation.java:369)
>at
> org.apache.cassandra.db.RowMutationSerializer.deserialize
> (RowMutation.java:322)
>at
> org.apache.cassandra.db.RowMutationVerbHandler.doVerb
> (RowMutationVerbHandler.java:45)
>at
> org.apache.cassandra.net.MessageDeliveryTask.run
> (MessageDeliveryTask.java:40)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask
> (ThreadPoolExecutor.java:886)
>at
> java.util.concurrent.ThreadPoolExecutor$Worker.run
> (ThreadPoolExecutor.java:908)
>at java.lang.Thread.run(Thread.java:619)
> ERROR [ROW-MUTATION-STAGE:18] 2010-06-10 14:21:15,129 CassandraDaemon.java
> (line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:18,5,main]
>
>
>
> Within 15 minutes, all 8 nodes died while my app continued trying to
> populate
> the database.  Is there something I am doing wrong?  I am populating 

Re: scans stopped returning values for some keys

2010-06-10 Thread Jonathan Ellis
How is your CF defined?  (what comparator?)

did you try start=empty byte array instead of Long.MAX_VALUE?

On Wed, Jun 9, 2010 at 8:06 AM, Pawel Dabrowski  wrote:
> Hi,
>
> I'm using Cassandra to store some aggregated data in a structure like this:
>
> KEY - product_id
> SUPER COLUMN NAME - timestamp
> and in the super column, I have a few columns with actual data.
>
> I am using a scan operation to find the latest super column 
> (start=Long.MAX_VALUE, reversed=true, count=1) for a key, which worked fine 
> for quite some time.
> But recently I needed to remove some of the columns within the super columns.
> After that things got weird: for some keys, the scan for latest super column 
> work normally, but for some of them they stopped returning any results. I 
> checked the data using the CLI and the data is obviously there. I can get it 
> if I specify the super column name, but scanning for latest does not work. If 
> I scan for previous data (start=some other timestamp less than maximum 
> timestamp in cassandra), it works fine.
> I compared the data for keys that work, and those that don't, but there is no 
> difference - the super column names are exactly the same and they contain the 
> same amounts of columns.
>
> But the really weird thing is that the scans did not stop working immediately 
> after some columns were removed. I was able to scan for the data and verify 
> that the columns were removed correctly and only after a couple of minutes 
> some scans stopped returning data. When I looked in the log, I've seen that 
> Cassandra has been doing some compacting, flushing and deleting of .db files 
> more or less at the time that the scans stopped working.
> I tried restarting Cassandra, but it did not help.
> Anyone had a similar problem?
>
> regards
> Pawel Dabrowski



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


cassandra out of heap space crash

2010-06-10 Thread Julie
I am running an 8 node cassandra cluster with each node on its own dedicated VM.

My app very quickly populates the database with about 100,000 rows of data
(each row is about 100K bytes) times the number of nodes in my cluster so
there's about 100,000 rows of data on each node (seems very evenly distributed).

I have been running my app fairly successfully but today changed the replication
factor from 1 to 3. (I first took down the servers, nuked their data
directories, copied over the new storage-conf.xml to each node, then restarted
the servers.)  My app begins by populating the database with fresh data.  During
the writing phase, all the cassandra servers, one by one, started getting an
out-of-memory exception.  Here's the output from the first to die:

INFO [COMMIT-LOG-WRITER] 2010-06-10 14:18:54,609 CommitLog.java (line 407)
Discarding obsolete commit
log:CommitLogSegment(/var/lib/cassandra/commitlog/CommitLog-1276193883235.log)

INFO [ROW-MUTATION-STAGE:5] 2010-06-10 14:18:55,499 ColumnFamilyStore.java
(line 609) Enqueuing flush of Memtable(Standard1)@19571399

INFO [GMFD:1] 2010-06-10 14:19:01,556 Gossiper.java (line 568) 
InetAddress /10.210.69.221 is now UP
INFO [GMFD:1] 2010-06-10 14:20:35,136 Gossiper.java (line 568) 
InetAddress /10.254.242.228 is now UP
INFO [GMFD:1] 2010-06-10 14:20:35,137 Gossiper.java (line 568) 
InetAddress /10.201.207.129 is now UP
INFO [GMFD:1] 2010-06-10 14:20:36,922 Gossiper.java (line 568) 
InetAddress /10.198.37.241 is now UP

INFO [GC inspection] 2010-06-10 14:19:03,722 GCInspector.java (line 110) 
GC for ConcurrentMarkSweep: 2164 ms, 8754168 reclaimed leaving 1070909048 used;
max is 1174339584
INFO [GC inspection] 2010-06-10 14:21:09,068 GCInspector.java (line 110) GC for
ConcurrentMarkSweep: 2151 ms, 78896080 reclaimed leaving 994679752 used; max is
1174339584
INFO [Timer-1] 2010-06-10 14:21:09,068 Gossiper.java (line 179) 
InetAddress /10.198.37.241 is now dead.
INFO [Timer-1] 2010-06-10 14:21:12,045 Gossiper.java (line 179) 
InetAddress /10.210.69.221 is now dead.
 INFO [GMFD:1] 2010-06-10 14:21:12,046 Gossiper.java (line 568) 
InetAddress /10.210.203.210 is now UP
 INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) 
InetAddress /10.210.69.221 is now UP
 INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) 
InetAddress /10.192.218.117 is now UP
 INFO [GMFD:1] 2010-06-10 14:21:12,306 Gossiper.java (line 568) 
InetAddress /10.198.37.241 is now UP
 INFO [GMFD:1] 2010-06-10 14:21:12,307 Gossiper.java (line 568) 
InetAddress /10.254.138.226 is now UP
ERROR [ROW-MUTATION-STAGE:25] 2010-06-10 14:21:15,127 CassandraDaemon.java 
(line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:25,5,main]
java.lang.OutOfMemoryError: Java heap space
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:84)
    at org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:29)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserializeColumns(ColumnFamilySerializer.java:117)
    at org.apache.cassandra.db.ColumnFamilySerializer.deserialize(ColumnFamilySerializer.java:108)
    at org.apache.cassandra.db.RowMutationSerializer.defreezeTheMaps(RowMutation.java:359)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:369)
    at org.apache.cassandra.db.RowMutationSerializer.deserialize(RowMutation.java:322)
    at org.apache.cassandra.db.RowMutationVerbHandler.doVerb(RowMutationVerbHandler.java:45)
    at org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:40)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
    at java.lang.Thread.run(Thread.java:619)
ERROR [ROW-MUTATION-STAGE:18] 2010-06-10 14:21:15,129 CassandraDaemon.java 
(line 78) Fatal exception in thread Thread[ROW-MUTATION-STAGE:18,5,main]



Within 15 minutes, all 8 nodes died while my app continued trying to populate
the database.  Is there something I am doing wrong?  I am populating the
database very quickly by writing 100 rows at once in each of 8 clients, until
each client has written 100,000 rows.   All of my cassandra servers are started
up with 1GB of heap space:  /usr/bin/java -ea -Xms128M -Xmx1G …

Thank you for your help!
Julie



RE: Running Cassandra as a Windows Service

2010-06-10 Thread Kochheiser,Todd W - TO-DITT1
I agree that bitrot might happen if all of the core Cassandra developers are 
using Linux. Your suggestion of putting things in a contrib area, where curious 
(or desperate) parties suffering on the Windows platform could pick it up, seems 
like a reasonable place to start.  It might also be an opportunity to increase 
the number of "application" developers using Cassandra if Cassandra were 
slightly more approachable on the Windows platform.

Any suggestions on next steps?

Todd. 

-Original Message-
From: Gary Dusbabek [mailto:gdusba...@gmail.com] 
Sent: Thursday, June 10, 2010 10:59 AM
To: user@cassandra.apache.org
Subject: Re: Running Cassandra as a Windows Service

IMO this is one of those things that would bitrot fairly quickly if it
were not maintained.  It may be useful in contrib, where curious
parties could pick it up, get it back in shape, and send in the
changes to be committed.

Judging by the sparse interest so far, this probably wouldn't be a
good fit in core since there don't seem to be many (any?) cassandra
developers who run windows.

Gary.


On Thu, Jun 10, 2010 at 12:34, Kochheiser,Todd W - TO-DITT1
 wrote:
> For various reasons I am required to deploy systems on Windows.  As such, I
> went looking for information on running Cassandra as a Windows service.
> I've read some of the user threads regarding running Cassandra as a Windows
> service, such as this one:
>
>     http://www.mail-archive.com/user@cassandra.apache.org/msg01656.html
>
> I also found the following JIRA issue:
>
>     https://issues.apache.org/jira/browse/CASSANDRA-292
>
> As it didn't look like anyone has contributed a formal solution and having
> some experience using Apache's Procrun
> (http://commons.apache.org/daemon/procrun.html), I decided to go ahead and
> write a batch script and a simple "WindowsService" class to accomplish the
> task.  The WindowsService class only makes calls to public methods in
> CassandraDeamon and is fairly simple.  In combination with the batch script,
> it is very easy to install and remove the service.  At this point, I've
> installed Cassandra as a Windows service on XP (32 bit), Windows 7 (64 bit)
> and Windows Server 2008 R1/R2 (64 bit).  It should work fine on other
> version of Windows (2K, 2K3).
>
> Questions:
>
>
> Has anyone else already done this work?
> If not, I wouldn't mind sharing the code/script or contributing it back to
> the project.  Is there any interest in this from the Cassandra dev team or
> the user community?
>
>
> Ideally the WindowsService could be included in the distributed
> source/binary distributions (perhaps in a contrib area) as well as the batch
> script and associated procrun executables.  Or, perhaps it could be posted
> to a Cassandra community site (is there one?).
>
> Todd
>
>
>
>
>


RE: keyrange for get_range_slices

2010-06-10 Thread Dop Sun
Thanks for your quick and detailed explanation of the key scan. This is really
helpful!

 

Dop

 

From: Philip Stanhope [mailto:pstanh...@wimba.com] 
Sent: Thursday, June 10, 2010 10:40 PM
To: user@cassandra.apache.org
Subject: Re: keyrange for get_range_slices

 

No ... and I personally don't have a problem with this if you think about
what is actually going on under the covers.

 

Note, however, that this is an expensive operation and as a result if there
are parallel updates to the indexes while you are performing a full keyscan
(rowscan) you will potentially miss keys because they are inserted earlier
in the index than you are currently processing.

 

A further concern is that the keys (and indexes) are spread around a
cluster. Unless R=N you will be hitting the network during this type of
scan.

 

Lastly, be careful about how you specify the SlicePredicate. A keyscan can
easily turn into a "dump the entire datastore" if you aren't careful.

 

On Jun 10, 2010, at 10:03 AM, Dop Sun wrote:





Hi,

 

As documented in the http://wiki.apache.org/cassandra/API, the key range for
get_range_slices are both inclusive.

 

As discussed in this thread:
http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453c
de067d3, there is a case that user want to discover all keys (huge number)
in a column family.

 

What I think  is doing batchly: using empty string as start and finish
first, then using the last key returned as start and query second.

 

My question is: using this method, the last key returned for the first
query, will be returned again in the second query as the first key. And it's
a duplication. Is there any other API to discover keys without duplications
in current implementation?

 

Thanks,

Regards,

Dop

 



Re: Range Slices timing question

2010-06-10 Thread Jonathan Ellis
get_range_slices is faster in 0.7 but there's not much you can do in 0.6.

On Wed, Jun 9, 2010 at 11:04 AM, Carlos Sanchez
 wrote:
> I have about a million rows (each row with 100 cols) of the form 
> domain/!date/!id  (e.g. gwm.com/!20100430/!CFRA4500) So I am interested in 
> getting all the ids (all cols) for a particular domain/date (e.g. 
> "gwm.ml.com/!20100430/!A" "gwm.ml.com/!20100430/!D"). I am looping in chunks 
> of 6000 rows / 500 cols at a time. However, it is taken in my 5 node cluster 
> (each  machine has 32gb in ram, RF=3 and OPP, v0.6.1) 36 secs to get all the 
> required rows (stats below); which I think it is a bit high. I am wondering 
> if a possible cause it's the way my string keys are constructed (suggestions 
> are welcome) that makes Cassandra work 'harder' when doing a 'range slices'. 
> Does Cassandra examines all row keys to search for matches? Are there any 
> settings I can tweak to try to make the retrieval faster?
>
> Thanks
>
> Carlos
>
> row(s) found 6000 in 35086ms
> total cols(s) found 593502
> row bytes 228000
> col bytes 38422670
> total bytes 38650670  (36.86015 MB)
>
>
>
>
> This email message and any attachments are for the sole use of the intended 
> recipients and may contain proprietary and/or confidential information which 
> may be privileged or otherwise protected from disclosure. Any unauthorized 
> review, use, disclosure or distribution is prohibited. If you are not an 
> intended recipient, please contact the sender by reply email and destroy the 
> original message and any copies of the message as well as any attachments to 
> the original message.
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Granularity SSTables.

2010-06-10 Thread Jonathan Ellis
Only if your clusters have the same number of nodes, with the same tokens.

Trying to get too clever is not usually advisable.

On Thu, Jun 10, 2010 at 3:54 AM, xavier manach  wrote:
> Hi.
>
>  I try to understand tricks that I can use with the SSTables, for
> faster manipulation of datas in clusters.
>
> I learn I how copy a keyspaces from data directories to a new node and
> change replicationfactor (thx Jonathan).
>
> If I understood, Each SSTable have 3 files :
>  ColumnFamily-ID-Datas.db
>  ColumnFamily-ID-Index.db
>  ColumnFamily-ID-Filter.db
>
>  If I want merge datas from 2 clusters, with differents keys (each
> key is only in one cluster) but with the same ColumnFamily.
> Can I copy all the files from SSTables with the same methode ?
>> 1. nodetool drain & stop original node
>> 2. copy everything  ***files sstables*** in data/ directories (but not 
>> system keyspace!) to new node
>> 3. restart and autobootstrap=false [the default]
>
> Thx.
>
>
>
> On Tue, Jun 8, 2010 at 7:12 AM, xavier manach  wrote:
>> Hi.
>>
>>   I have a cluster with only 1 node with a lot of datas (500 Go) .
>>   I want add a new node with the same datas (with a ReplicationFactor
>> 2)
>>
>> The method normal is :
>> stop node.
>> add a node.
>> change replication factor to 2.
>> start nodes
>> use nodetool repair
>>
>>   But , I didn't know if this other method is valid, and if it's can
>> be faster :
>> stop nodes.
>> copy all SSTables
>> change replication factor.
>> start nodes
>> and
>> use nodetool repair
>>
>>   Have you an idea for the faster valid method ?
>>
>> Thx.
>>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: File Descriptor leak

2010-06-10 Thread Jonathan Ellis
Fixed in https://issues.apache.org/jira/browse/CASSANDRA-1178

On Thu, Jun 10, 2010 at 9:01 AM, Matt Conway  wrote:
> Hi All,
> I'm running a small 4-node cluster with minimal load using
> the 2010-06-08_12-31-16 build from trunk, and its exhausting file
> descriptors pretty quickly (65K in less than an hour).  Here's a list of the
> files I see it  leaking, I can do a more specific query if you'd like.  Am I
> doing something wrong, is this a known problem, something being done wrong
> from the client side, or something else?  Any help appreciated, thanks,
> Matt
> r...@cassandra01:~# lsof -p `ps ax | grep [C]assandraDaemon | awk '{print
> $1}'` | awk '{print $9}' | sort | uniq -c | sort -n | tail -n 5
>       3 /mnt/cassandra/data/system/Schema-c-2-Data.db
>    1278 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-7-Data.db
>    1405 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-9-Data.db
>    1895 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-5-Data.db
>   26655 /mnt/cassandra/data/MyKeyspace/MyColumnType-c-11-Data.db
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


RE: Range Slices timing question

2010-06-10 Thread Carlos Sanchez
Thx a lot

-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Thursday, June 10, 2010 4:28 PM
To: user@cassandra.apache.org
Subject: Re: Range Slices timing question

get_range_slices is faster in 0.7 but there's not much you can do in 0.6.

On Wed, Jun 9, 2010 at 11:04 AM, Carlos Sanchez
 wrote:
> I have about a million rows (each row with 100 cols) of the form 
> domain/!date/!id  (e.g. gwm.com/!20100430/!CFRA4500) So I am interested in 
> getting all the ids (all cols) for a particular domain/date (e.g. 
> "gwm.ml.com/!20100430/!A" "gwm.ml.com/!20100430/!D"). I am looping in chunks 
> of 6000 rows / 500 cols at a time. However, it is taken in my 5 node cluster 
> (each  machine has 32gb in ram, RF=3 and OPP, v0.6.1) 36 secs to get all the 
> required rows (stats below); which I think it is a bit high. I am wondering 
> if a possible cause it's the way my string keys are constructed (suggestions 
> are welcome) that makes Cassandra work 'harder' when doing a 'range slices'. 
> Does Cassandra examines all row keys to search for matches? Are there any 
> settings I can tweak to try to make the retrieval faster?
>
> Thanks
>
> Carlos
>
> row(s) found 6000 in 35086ms
> total cols(s) found 593502
> row bytes 228000
> col bytes 38422670
> total bytes 38650670  (36.86015 MB)
>
>
>
>
> This email message and any attachments are for the sole use of the intended 
> recipients and may contain proprietary and/or confidential information which 
> may be privileged or otherwise protected from disclosure. Any unauthorized 
> review, use, disclosure or distribution is prohibited. If you are not an 
> intended recipient, please contact the sender by reply email and destroy the 
> original message and any copies of the message as well as any attachments to 
> the original message.
>



--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com



Cassandra Write Performance, CPU usage

2010-06-10 Thread Rishi Bhardwaj
Hi

I am investigating Cassandra write performance and see very heavy CPU usage 
from Cassandra. I have a single node Cassandra instance running on a dual core 
(2.66 Ghz Intel ) Ubuntu 9.10 server. The writes to Cassandra are being 
generated from the same server using BatchMutate(). The client makes exactly 
one RPC call at a time to Cassandra. Each BatchMutate() RPC contains 2 MB of 
data and once it is acknowledged by Cassandra, the next RPC is done. Cassandra 
has two separate disks, one for commitlog with a sequential b/w of 130MBps and 
the other a solid state disk for data with b/w of 90MBps. Tuning various 
parameters, I observe that I am able to attain a maximum write performance of 
about 45 to 50 MBps from Cassandra. I see that the Cassandra java process 
consistently uses 100% to 150% of CPU resources (as shown by top) during the 
entire write operation. Also, iostat clearly shows that the max disk bandwidth 
is never reached during the write operation: every now and then the I/O activity 
on the "commitlog" disk and the data disk spikes, but it is never consistently 
maintained by Cassandra close to their peak. I would imagine that the CPU is 
probably the bottleneck here. Does 
anyone have any idea why Cassandra beats the heck out of the CPU here? Any 
suggestions on how to go about finding the exact bottleneck here?

Some more information about the writes: I have 2 column families, the data 
though is mostly written in one column family with column sizes of around 32k 
and each row having around 256 or 512 columns. I would really appreciate any 
help here.

Thanks,
Rishi



  

Re: Cassandra Write Performance, CPU usage

2010-06-10 Thread vd
Hi Rishi

Writes in Cassandra are not written directly to disk; they are
stored in memory and later flushed to disk. Maybe that's why you are
not getting much out of iostat. Can't say about the high CPU usage.
___
Vineet Daniel
___

Let your email find you


On Fri, Jun 11, 2010 at 6:12 AM, Rishi Bhardwaj wrote:

> Hi
>
> I am investigating Cassandra write performance and see very heavy CPU usage
> from Cassandra. I have a single node Cassandra instance running on a dual
> core (2.66 Ghz Intel ) Ubuntu 9.10 server. The writes to Cassandra are being
> generated from the same server using BatchMutate(). The client makes exactly
> one RPC call at a time to Cassandra. Each BatchMutate() RPC contains 2 MB of
> data and once it is acknowledged by Cassandra, the next RPC is done.
> Cassandra has two separate disks, one for commitlog with a sequential b/w of
> 130MBps and the other a solid state disk for data with b/w of 90MBps. Tuning
> various parameters, I observe that I am able to attain a maximum write
> performance of about 45 to 50 MBps from Cassandra. I see that the
> Cassandra java process consistently uses 100% to 150% of CPU resources (as
> shown by top) during the entire write operation. Also, iostat clearly shows
> that the max disk bandwidth is not reached anytime during the write
> operation, every now and then the i/o activity on "commitlog" disk and the
> data disk spike but it is never consistently maintained by cassandra close
> to their peak. I would imagine that the CPU is probably the bottleneck
> here. Does anyone have any idea why Cassandra beats the heck out of the CPU
> here? Any suggestions on how to go about finding the exact bottleneck here?
>
> Some more information about the writes: I have 2 column families, the data
> though is mostly written in one column family with column sizes of around
> 32k and each row having around 256 or 512 columns. I would really appreciate
> any help here.
>
> Thanks,
> Rishi
>
>
>


Re: Cassandra Write Performance, CPU usage

2010-06-10 Thread Jonathan Shook
You are testing Cassandra in a way that it was not designed to be used.
Bandwidth to disk is not a meaningful example for nearly anything
except for filesystem benchmarking and things very nearly the same as
filesystem benchmarking.
Unless the usage patterns of your application match your test data,
there is not a good reason to expect a strong correlation between this
test and actual performance.

Cassandra is not simply shuffling data through IO when you write.
There are calculations that have to be done as writes filter their way
through various stages of processing. The point of this is to minimize
the overall effort Cassandra has to make in order to retrieve the data
again. One example would be bloom filters. Each column that is written
requires bloom filter processing and potentially auxiliary IO. Some of
these steps are allowed to happen in the background, but if you try,
you can cause them to stack up on top of the available CPU and memory
resources.
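
To give a feel for the kind of per-write CPU work involved, here is a
deliberately simplified bloom filter sketch (not Cassandra's actual
implementation; sizes and hashing are arbitrary). Every column written implies
this sort of repeated hashing and bit-setting:

import java.util.BitSet;

// Toy bloom filter: each add() hashes the key several times and sets bits.
class ToyBloomFilter {
    private static final int SIZE = 1 << 20;          // 1M bits, arbitrary
    private static final int HASHES = 5;              // arbitrary hash count
    private final BitSet bits = new BitSet(SIZE);

    void add(byte[] key) {
        for (int i = 0; i < HASHES; i++) {
            bits.set(indexFor(key, i));               // hashing + bit-setting on every write
        }
    }

    boolean mightContain(byte[] key) {
        for (int i = 0; i < HASHES; i++) {
            if (!bits.get(indexFor(key, i))) {
                return false;                         // definitely not present
            }
        }
        return true;                                  // possibly present (false positives allowed)
    }

    private int indexFor(byte[] key, int seed) {
        int h = seed * 31 + 17;                       // simple non-cryptographic mixing
        for (byte b : key) {
            h = h * 31 + b;
        }
        return Math.abs(h % SIZE);
    }
}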

In such a case (continuous bulk writes), you are causing all of these
costs to be taken in more of a synchronous (not delayed) fashion. You
are not allowing the background processing that helps reduce client
blocking (by deferring some processing) to do its magic.



On Thu, Jun 10, 2010 at 7:42 PM, Rishi Bhardwaj  wrote:
> Hi
> I am investigating Cassandra write performance and see very heavy CPU usage
> from Cassandra. I have a single node Cassandra instance running on a dual
> core (2.66 Ghz Intel ) Ubuntu 9.10 server. The writes to Cassandra are being
> generated from the same server using BatchMutate(). The client makes exactly
> one RPC call at a time to Cassandra. Each BatchMutate() RPC contains 2 MB of
> data and once it is acknowledged by Cassandra, the next RPC is done.
> Cassandra has two separate disks, one for commitlog with a sequential b/w of
> 130MBps and the other a solid state disk for data with b/w of 90MBps. Tuning
> various parameters, I observe that I am able to attain a maximum write
> performance of about 45 to 50 MBps from Cassandra. I see that the Cassandra
> java process consistently uses 100% to 150% of CPU resources (as shown by
> top) during the entire write operation. Also, iostat clearly shows that the
> max disk bandwidth is not reached anytime during the write operation, every
> now and then the i/o activity on "commitlog" disk and the data disk spike
> but it is never consistently maintained by cassandra close to their peak. I
> would imagine that the CPU is probably the bottleneck here. Does anyone have
> any idea why Cassandra beats the heck out of the CPU here? Any suggestions
> on how to go about finding the exact bottleneck here?
> Some more information about the writes: I have 2 column families, the data
> though is mostly written in one column family with column sizes of around
> 32k and each row having around 256 or 512 columns. I would really appreciate
> any help here.
> Thanks,
> Rishi
>
>


Re: Cassandra Write Performance, CPU usage

2010-06-10 Thread Rishi Bhardwaj
Hi Jonathan

Thanks for such an informative reply. My application may end up doing exactly 
this kind of continuous bulk writing to Cassandra, which is why I am interested 
in this performance case. I was wondering what all the CPU overheads are for 
each row/column written to Cassandra. You mentioned updating of bloom filters; 
would that be the main CPU overhead, or is there also copying of data 
happening? I want to investigate all the factors in play here and whether there 
is room for improvement. Is it possible to profile Cassandra and see what the 
bottleneck is? As for the auxiliary I/O you mentioned for the bloom filters, 
wouldn't that happen together with the I/O for the SSTable, so that the extra 
I/O for the bloom filter gets piggybacked on the SSTable I/O? I guess I don't 
understand the Cassandra internals too well, but I wanted to see how much 
Cassandra can achieve for continuous bulk writes.

Has anyone done any bulk write experiments with Cassandra? Is Cassandra 
performance always expected to be bottlenecked by CPU when doing continuous 
bulk writes?
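For concreteness, the write loop I have in mind is roughly the sketch below,
against the 0.6 Thrift API (org.apache.cassandra.thrift.*). The keyspace and
column family names and the sizes are placeholders rather than my actual code:

// needs java.util.* plus the generated org.apache.cassandra.thrift.* classes
void writeRow(Cassandra.Client client, String rowKey, int columnCount) throws Exception {
  byte[] value = new byte[32 * 1024];               // ~32k per column, as above
  long timestamp = System.currentTimeMillis() * 1000;

  List<Mutation> mutations = new ArrayList<Mutation>();
  for (int i = 0; i < columnCount; i++) {
    Column col = new Column(("col-" + i).getBytes("UTF-8"), value, timestamp);
    ColumnOrSuperColumn cosc = new ColumnOrSuperColumn();
    cosc.setColumn(col);
    Mutation m = new Mutation();
    m.setColumn_or_supercolumn(cosc);
    mutations.add(m);
  }

  Map<String, List<Mutation>> byColumnFamily = new HashMap<String, List<Mutation>>();
  byColumnFamily.put("Standard1", mutations);

  Map<String, Map<String, List<Mutation>>> mutationMap =
      new HashMap<String, Map<String, List<Mutation>>>();
  mutationMap.put(rowKey, byColumnFamily);

  // One synchronous RPC per batch; the next call is not made until this returns.
  client.batch_mutate("Keyspace1", mutationMap, ConsistencyLevel.ONE);
}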

Thanks for all the help,
Rishi




From: Jonathan Shook 
To: user@cassandra.apache.org
Sent: Thu, June 10, 2010 7:39:24 PM
Subject: Re: Cassandra Write Performance, CPU usage

You are testing Cassandra in a way that it was not designed to be used.
Bandwidth to disk is not a meaningful example for nearly anything
except for filesystem benchmarking and things very nearly the same as
filesystem benchmarking.
Unless the usage patterns of your application match your test data,
there is not a good reason to expect a strong correlation between this
test and actual performance.

Cassandra is not simply shuffling data through IO when you write.
There are calculations that have to be done as writes filter their way
through various stages of processing. The point of this is to minimize
the overall effort Cassandra has to make in order to retrieve the data
again. One example would be bloom filters. Each column that is written
requires bloom filter processing and potentially auxiliary IO. Some of
these steps are allowed to happen in the background, but if you try,
you can cause them to stack up on top of the available CPU and memory
resources.

In such a case (continuous bulk writes), you are causing all of these
costs to be taken in more of a synchronous (not delayed) fashion. You
are not allowing the background processing that helps reduce client
blocking (by deferring some processing) to do its magic.



On Thu, Jun 10, 2010 at 7:42 PM, Rishi Bhardwaj  wrote:
> Hi
> I am investigating Cassandra write performance and see very heavy CPU usage
> from Cassandra. I have a single node Cassandra instance running on a dual
> core (2.66 Ghz Intel ) Ubuntu 9.10 server. The writes to Cassandra are being
> generated from the same server using BatchMutate(). The client makes exactly
> one RPC call at a time to Cassandra. Each BatchMutate() RPC contains 2 MB of
> data and once it is acknowledged by Cassandra, the next RPC is done.
> Cassandra has two separate disks, one for commitlog with a sequential b/w of
> 130MBps and the other a solid state disk for data with b/w of 90MBps. Tuning
> various parameters, I observe that I am able to attain a maximum write
> performance of about 45 to 50 MBps from Cassandra. I see that the Cassandra
> java process consistently uses 100% to 150% of CPU resources (as shown by
> top) during the entire write operation. Also, iostat clearly shows that the
> max disk bandwidth is not reached anytime during the write operation, every
> now and then the i/o activity on "commitlog" disk and the data disk spike
> but it is never consistently maintained by cassandra close to their peak. I
> would imagine that the CPU is probably the bottleneck here. Does anyone have
> any idea why Cassandra beats the heck out of the CPU here? Any suggestions
> on how to go about finding the exact bottleneck here?
> Some more information about the writes: I have 2 column families, the data
> though is mostly written in one column family with column sizes of around
> 32k and each row having around 256 or 512 columns. I would really appreciate
> any help here.
> Thanks,
> Rishi
>
>



  

Re: Cassandra Write Performance, CPU usage

2010-06-10 Thread Jonathan Shook
Rishi,

I am not yet knowledgeable enough to answer your question in more
detail. I would like to know more about the specifics as well.
There are counters you can use via JMX to show logical events, but
this will not always translate to good baseline information that you
can use in scaling estimates.
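As a starting point, here is a minimal sketch of listing those beans
programmatically. The JMX port (8080) and the object-name pattern are guesses
based on the 0.6 defaults, so check them against your own install, or just
browse the same tree interactively with jconsole:

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Lists every MBean Cassandra registers; the write/flush/compaction counters
// live under the org.apache.cassandra domains.
public class JmxPeek {
  public static void main(String[] args) throws Exception {
    JMXServiceURL url =
        new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
    JMXConnector connector = JMXConnectorFactory.connect(url);
    try {
      MBeanServerConnection mbs = connector.getMBeanServerConnection();
      Set<ObjectName> names =
          mbs.queryNames(new ObjectName("org.apache.cassandra*:*"), null);
      for (ObjectName name : names) {
        System.out.println(name);
      }
    } finally {
      connector.close();
    }
  }
}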
I would like to see a good analysis that characterizes the scaling
factors of different parts of the system, both in terms of workload and
from an algorithmic perspective.

This is a common area of inquiry. Maybe we should start
http://wiki.apache.org/cassandra/ScalabilityFactors


On Thu, Jun 10, 2010 at 11:05 PM, Rishi Bhardwaj  wrote:
> Hi Jonathan
> Thanks for such an informative reply. My application may end up doing exactly
> this kind of continuous bulk writing to Cassandra, which is why I am
> interested in this performance case. I was wondering what all the CPU
> overheads are for each row/column written to Cassandra. You mentioned
> updating of bloom filters; would that be the main CPU overhead, or is there
> also copying of data happening? I want to investigate all the factors in play
> here and whether there is room for improvement. Is it possible to profile
> Cassandra and see what the bottleneck is? As for the auxiliary I/O you
> mentioned for the bloom filters, wouldn't that happen together with the I/O
> for the SSTable, so that the extra I/O for the bloom filter gets piggybacked
> on the SSTable I/O? I guess I don't understand the Cassandra internals too
> well, but I wanted to see how much Cassandra can achieve for continuous bulk
> writes.
> Has anyone done any bulk write experiments with Cassandra? Is Cassandra
> performance always expected to be bottlenecked by CPU when doing continuous
> bulk writes?
> Thanks for all the help,
> Rishi
> 
> From: Jonathan Shook 
> To: user@cassandra.apache.org
> Sent: Thu, June 10, 2010 7:39:24 PM
> Subject: Re: Cassandra Write Performance, CPU usage
>
> You are testing Cassandra in a way that it was not designed to be used.
> Bandwidth to disk is not a meaningful example for nearly anything
> except for filesystem benchmarking and things very nearly the same as
> filesystem benchmarking.
> Unless the usage patterns of your application match your test data,
> there is not a good reason to expect a strong correlation between this
> test and actual performance.
>
> Cassandra is not simply shuffling data through IO when you write.
> There are calculations that have to be done as writes filter their way
> through various stages of processing. The point of this is to minimize
> the overall effort Cassandra has to make in order to retrieve the data
> again. One example would be bloom filters. Each column that is written
> requires bloom filter processing and potentially auxiliary IO. Some of
> these steps are allowed to happen in the background, but if you try,
> you can cause them to stack up on top of the available CPU and memory
> resources.
>
> In such a case (continuous bulk writes), you are causing all of these
> costs to be taken in more of a synchronous (not delayed) fashion. You
> are not allowing the background processing that helps reduce client
> blocking (by deferring some processing) to do its magic.
>
>
>
> On Thu, Jun 10, 2010 at 7:42 PM, Rishi Bhardwaj 
> wrote:
>> Hi
>> I am investigating Cassandra write performance and see very heavy CPU
>> usage
>> from Cassandra. I have a single node Cassandra instance running on a dual
>> core (2.66 Ghz Intel ) Ubuntu 9.10 server. The writes to Cassandra are
>> being
>> generated from the same server using BatchMutate(). The client makes
>> exactly
>> one RPC call at a time to Cassandra. Each BatchMutate() RPC contains 2 MB
>> of
>> data and once it is acknowledged by Cassandra, the next RPC is done.
>> Cassandra has two separate disks, one for commitlog with a sequential b/w
>> of
>> 130MBps and the other a solid state disk for data with b/w of 90MBps.
>> Tuning
>> various parameters, I observe that I am able to attain a maximum write
>> performance of about 45 to 50 MBps from Cassandra. I see that the
>> Cassandra
>> java process consistently uses 100% to 150% of CPU resources (as shown by
>> top) during the entire write operation. Also, iostat clearly shows that
>> the
>> max disk bandwidth is not reached anytime during the write operation,
>> every
>> now and then the i/o activity on "commitlog" disk and the data disk spike
>> but it is never consistently maintained by cassandra close to their
>> peak. I
>> would imagine that the CPU is probably the bottleneck here. Does anyone
>> have
>> any idea why Cassandra beats the heck out of the CPU here? Any suggestions
>> on how to go about finding the exact bottleneck here?
>> Some more information about the writes: I have 2 column families, the data
>> though is mostly written in one column family with column sizes of around
>> 32k and each row having around 256 or 512 columns. I would really
>> appreciate
>> any help here.

Re: keyrange for get_range_slices

2010-06-10 Thread Shuai Yuan

Hi,

Since you're iterating over the whole set a few records at a time, your 
code knows when it's handling the first batch.


Why not simply do something like:

if (!_first_time) {
    _iter++; // skip the first record: it is the last key of the previous batch
} else {
    _first_time = false;
}
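
Fleshed out a bit, the whole scan could look like the sketch below. It is
written against the 0.6 Thrift bindings (org.apache.cassandra.thrift.*), where
keys are plain Strings, and the keyspace/column family names are only
placeholders, so treat it as a starting point rather than tested code:

// needs java.util.List plus the generated org.apache.cassandra.thrift.* classes
void dumpAllKeys(Cassandra.Client client) throws Exception {
  // Up to 1000 keys per round trip, starting from the beginning of the ring.
  KeyRange range = new KeyRange(1000);
  range.setStart_key("");
  range.setEnd_key("");

  // We only care about keys, so keep the per-row column slice tiny.
  SlicePredicate predicate = new SlicePredicate();
  predicate.setSlice_range(new SliceRange(new byte[0], new byte[0], false, 1));

  ColumnParent parent = new ColumnParent("Standard1");
  boolean firstBatch = true;

  while (true) {
    List<KeySlice> slices = client.get_range_slices(
        "Keyspace1", parent, predicate, range, ConsistencyLevel.ONE);

    for (KeySlice slice : slices) {
      // Both ends of the range are inclusive, so every batch after the
      // first starts with the key we already handled -- skip it.
      if (!firstBatch && slice.getKey().equals(range.getStart_key())) {
        continue;
      }
      System.out.println(slice.getKey());
    }

    if (slices.size() < range.getCount()) {
      break;                      // short batch: we walked off the end
    }
    range.setStart_key(slices.get(slices.size() - 1).getKey());
    firstBatch = false;
  }
}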

Kevin Yuan,
Supertool Corp.
www.yuan-shuai.info


On 2010-06-10 22:03, Dop Sun wrote:


Hi,

As documented at http://wiki.apache.org/cassandra/API, both ends of the key 
range for get_range_slices are inclusive.


As discussed in this thread: 
http://groups.google.com/group/jassandra-user/browse_thread/thread/c2e56453cde067d3, 
there is a case where a user wants to discover all keys (a huge number) in a 
column family.


What I am thinking is to do this in batches: use the empty string as both start 
and finish for the first query, then use the last key returned as the start for 
the next query.


My question is: with this method, the last key returned by the first query 
will be returned again as the first key of the second query, which is a 
duplicate. Is there any other API to discover keys without duplicates in the 
current implementation?


Thanks,

Regards,

Dop