Strange streaming behaviour

2010-05-03 Thread Dr. Martin Grabmüller
Hello list,

I encountered a problem with streaming in my test cluster but
found nothing like this in JIRA or on the list.

I'm running a test cluster of three nodes, RF=3, Cassandra 0.6.1.
I started the first node and inserted some data, then bootstrapped
the other machines one after the other.  No problems here.

Then, I inserted data over the weekend and checked back this morning,
finding a lot of -tmp-files in the data directory of the two bootstrapped
machines which seem to be leftovers from streaming.

The interesting thing is that they are mostly all about the same size on each node. 
Additionally,
this size is the same as the size of some file to be streamed (see the output of
nodetool streams at the end of this post):

[Node 2:]

-rw-r--r-- 1 cassandra cassandra  2360 2010-05-03 07:39 
AccountList-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra  2360 2010-05-03 00:39 
AddressList-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra  2360 2010-05-03 07:53 
IndexQueue-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra  2360 2010-05-03 05:19 
MailMetadata-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra  2360 2010-05-03 00:53 
Statistics-tmp-7-Data.db

[Node 3:]

-rw-r--r-- 1 cassandra cassandra 10994 2010-05-03 07:39 
AccountList-tmp-6-Data.db
-rw-r--r-- 1 cassandra cassandra   325 2010-05-03 00:39 
AddressList-tmp-6-Data.db
-rw-r--r-- 1 cassandra cassandra 10994 2010-05-03 08:02 
CustomerList-tmp-6-Data.db
-rw-r--r-- 1 cassandra cassandra 10994 2010-05-03 07:53 
IndexQueue-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra 10994 2010-05-03 01:02 
MailMetadata-tmp-7-Data.db
-rw-r--r-- 1 cassandra cassandra 10994 2010-05-03 00:53 
Statistics-tmp-7-Data.db

Checking the logs, I find a few streaming-related error messages:

[Node 1 (yes, the node without leftover files):]

ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-02 14:39:10,675 
DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No such 
file or directory)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No such 
file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at 
org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:84)
at 
org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more
ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-02 14:39:10,675 CassandraDaemon.java 
(line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
java.lang.RuntimeException: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No such 
file or directory)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No such 
file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
at 
org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:84)
at 
org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
... 3 more

[Node 3:]

ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-03 00:39:46,656 
DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
java.lang.RuntimeException: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/Headers-7-Data.db (No such file or 
directory)
at 
org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
at java.lang.Thread.run(Thread.java:636)
Caused by: java.io.FileNotFoundException: 
/mnt/data000/cassandra/data/Archive/stream/Headers-7-Data.db (No such file or 
directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)

Re: ColumnFamilyOutputFormat?

2010-05-03 Thread Johan Oskarsson
I wrote this CassandraOutputFormat last year. It most likely does not work 
against newer/current versions of Cassandra, but if you want something to work 
with, it can be used as a starting point.

http://github.com/johanoskarsson/cassandraoutputformat

/Johan

On 30 apr 2010, at 14.14, Utku Can Topçu wrote:

> Hey All,
> 
> I've been looking at the documentation and related articles about Cassandra 
> and Hadoop integration, I'm only seeing ColumnFamilyInputFormat for now.
> What if I want to write directly to Cassandra after a reduce?
> 
> What comes to my mind is: in the Reducer's setup I'd initialize a Cassandra 
> client so that, rather than emitting the results to the MR framework, it would 
> be possible to output them to Cassandra in a simple way.
> 
> Can you think of any other high level solutions like an OutputFormat or so?
> 
> Best Regards,
> Utku
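
For what it's worth, the client-in-the-reducer approach described above would look
roughly like this against the 0.6 Thrift API (a sketch only; the keyspace, column
family, column name, and host/port are placeholders, and pooling/error handling is
left out):

import java.io.IOException;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class CassandraWritingReducer
        extends Reducer<Text, Text, NullWritable, NullWritable>
{
    private TTransport transport;
    private Cassandra.Client client;

    @Override
    protected void setup(Context context) throws IOException
    {
        try
        {
            // One Thrift connection per reduce task, opened once in setup().
            transport = new TSocket("localhost", 9160);   // placeholder host/port
            client = new Cassandra.Client(new TBinaryProtocol(transport));
            transport.open();
        }
        catch (Exception e)
        {
            throw new IOException(e);
        }
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context) throws IOException
    {
        // Write each reduced value as a column; CF, column name and keyspace are placeholders.
        ColumnPath path = new ColumnPath("Standard1");
        path.setColumn("result".getBytes());
        for (Text value : values)
        {
            try
            {
                client.insert("Keyspace1", key.toString(), path, value.toString().getBytes(),
                              System.currentTimeMillis() * 1000, ConsistencyLevel.QUORUM);
            }
            catch (Exception e)
            {
                throw new IOException(e);
            }
        }
    }

    @Override
    protected void cleanup(Context context)
    {
        if (transport != null)
            transport.close();
    }
}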



Feeding in specific Cassandra columns into Hadoop

2010-05-03 Thread Mark Schnitzius
Hi all...  I am trying to feed a specific list of Cassandra column names in
as input to a Hadoop process, but for some reason it only feeds in some of
the columns I specify, not all.

This is a short description of the problem - I'll see if anyone might have
some insight before I dump a big load of code on you...

1.  I've uploaded a bunch of data into Cassandra; the column names are longs
(dates, basically) converted to byte[8].

2.  I can successfully set a SlicePredicate using setSlice_range to return
all the data for a set of columns.

3.  However, if I instead call setColumn_names on the SlicePredicate, only
some of the specified columns get fed into Hadoop.

4.  This faulty behavior is repeatable, with the same columns going missing
each time for the same input parameters.

5.  For the values that fail, I've made fairly certain that the value for
the column name is getting inserted successfully, and that the exact same
column name is specified in the call to setColumn_names.
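
For reference, the column names and the predicate are built roughly like this (an
illustrative sketch only; the class and method names here are mine, not from the
actual job, and however the predicate is handed to the job configuration is
unchanged):

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

import org.apache.cassandra.thrift.SlicePredicate;

public class DateColumnNames
{
    /** Encode a long (a date, in my case) as the big-endian byte[8] used for column names. */
    public static byte[] toColumnName(long date)
    {
        return ByteBuffer.allocate(8).putLong(date).array();
    }

    /** Build the predicate that names the exact columns to feed into the job. */
    public static SlicePredicate predicateFor(long... dates)
    {
        List<byte[]> names = new ArrayList<byte[]>();
        for (long date : dates)
            names.add(toColumnName(date));
        SlicePredicate predicate = new SlicePredicate();
        predicate.setColumn_names(names);
        return predicate;
    }
}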

Any clues?


AdTHANKSvance,
Mark


Re: Login failure with SimpleAuthenticator

2010-05-03 Thread Julio Carlos Barrera Juez
Hi again.

My system log says:

ERROR [pool-1-thread-1] 2010-05-03 12:54:03,801 Cassandra.java (line 1153)
Internal error processing login
java.lang.RuntimeException: Unexpected authentication problem
at
org.apache.cassandra.auth.SimpleAuthenticator.login(SimpleAuthenticator.java:113)
at
org.apache.cassandra.thrift.CassandraServer.login(CassandraServer.java:651)
at
org.apache.cassandra.thrift.Cassandra$Processor$login.process(Cassandra.java:1147)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1125)
at
org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
at
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.NullPointerException
at java.io.FileInputStream.<init>(FileInputStream.java:103)
at java.io.FileInputStream.<init>(FileInputStream.java:66)
at
org.apache.cassandra.auth.SimpleAuthenticator.login(SimpleAuthenticator.java:82)
... 7 more

Maybe it is a problem with the configuration file. Do I need to add
something more than the
<Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>
line? It seems that Cassandra can't find the access.properties
and passwd.properties files. I have put them in the conf directory, but do I
need to put something more in the storage-conf.xml file?

The keyspace name, user name, and password are made up; they are only for the
example.

2010/4/29 roger schildmeijer 

> Are you sure that your keyspace is named "keyspace", and not "Keyspace1"
> (default)?
>
>
>
> / Roger Schildmeijer
>
>
> On Thu, Apr 29, 2010 at 2:47 PM, Jonathan Ellis  wrote:
>
>> If you're getting an internalerror, you need to check the server logs
>> for the exception that caused it
>>
>> On Wed, Apr 28, 2010 at 6:20 AM, Julio Carlos Barrera Juez
>>  wrote:
>> > Hi all!
>> > I am using org.apache.cassandra.auth.SimpleAuthenticator to use
>> > authentication in my cluster with one node (with cassandra 0.6.1). I
>> have
>> > put:
>> >
>> <Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>
>> > in storage-conf.xml file, and:
>> > keyspace=username
>> > in access.properties file, and:
>> > username=password
>> > in passwd.properties file.
>> > When I try to use cassandra client I am using:
>> > ./cassandra-cli --host localhost --port 9160 --username username
>> --password
>> > password --keyspace keyspace --debug
>> > and it returns this:
>> > org.apache.thrift.TApplicationException: Internal error processing login
>> > at
>> >
>> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>> > at
>> >
>> org.apache.cassandra.thrift.Cassandra$Client.recv_login(Cassandra.java:300)
>> > at
>> org.apache.cassandra.thrift.Cassandra$Client.login(Cassandra.java:282)
>> > at org.apache.cassandra.cli.CliMain.connect(CliMain.java:109)
>> > at org.apache.cassandra.cli.CliMain.main(CliMain.java:239)
>> > Login failure. Did you specify 'keyspace', 'username' and 'password'?
>> > When I try the same process with Java Thrift API:
>> > TTransport tr = new TSocket(ip, port);
>> > static Cassandra.Client client = new Cassandra.Client(new
>> > TBinaryProtocol(tr));
>> > Map<String, String> credentials = new HashMap<String, String>();
>> > credentials.put(SimpleAuthenticator.USERNAME_KEY, username);
>> > credentials.put(SimpleAuthenticator.PASSWORD_KEY, password);
>> > try {
>> > tr.open();
>> > client.login(KEY_SPACE, new AuthenticationRequest(credentials));
>> > catch{...}
>> > ..
>> > I get:
>> > org.apache.thrift.TApplicationException: Internal error processing login
>> > at
>> >
>> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>> > at
>> >
>> org.apache.cassandra.thrift.Cassandra$Client.recv_login(Cassandra.java:300)
>> > at
>> org.apache.cassandra.thrift.Cassandra$Client.login(Cassandra.java:282)
>> > ...
>> > What I am doing wrong?
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>>
>
>


Re: Login failure with SimpleAuthenticator

2010-05-03 Thread roger schildmeijer
You need to define two more system properties: passwd.properties and
access.properties (hint:
-Dpasswd.properties=/user/schildmeijer/cassandra/conf/passwd.properties, and
analogously for access.properties).
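
The NullPointerException in the posted trace is consistent with that: the file
location arrives as a JVM system property and is then opened, so without the -D
flag the path is null. A simplified illustration of that lookup (not the actual
SimpleAuthenticator source):

import java.io.FileInputStream;
import java.util.Properties;

public class PropertyLookupSketch
{
    public static void main(String[] args) throws Exception
    {
        // The file location comes from a JVM system property,
        // e.g. -Dpasswd.properties=/path/to/passwd.properties
        String passwdFile = System.getProperty("passwd.properties");
        if (passwdFile == null)
        {
            // Without the -D flag this is null, and new FileInputStream(null)
            // is exactly the kind of call that ends in a NullPointerException.
            System.err.println("passwd.properties system property not set");
            return;
        }
        Properties passwd = new Properties();
        passwd.load(new FileInputStream(passwdFile));
        System.out.println("loaded " + passwd.size() + " user entries");
    }
}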



// Roger Schildmeijer


On Mon, May 3, 2010 at 1:06 PM, Julio Carlos Barrera Juez <
juliocar...@gmail.com> wrote:

> Hi again.
>
> My system log says:
>
> ERROR [pool-1-thread-1] 2010-05-03 12:54:03,801 Cassandra.java (line 1153)
> Internal error processing login
> java.lang.RuntimeException: Unexpected authentication problem
>  at
> org.apache.cassandra.auth.SimpleAuthenticator.login(SimpleAuthenticator.java:113)
> at
> org.apache.cassandra.thrift.CassandraServer.login(CassandraServer.java:651)
>  at
> org.apache.cassandra.thrift.Cassandra$Processor$login.process(Cassandra.java:1147)
> at
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:1125)
>  at
> org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:253)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>  at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.NullPointerException
> at java.io.FileInputStream.<init>(FileInputStream.java:103)
> at java.io.FileInputStream.<init>(FileInputStream.java:66)
>  at
> org.apache.cassandra.auth.SimpleAuthenticator.login(SimpleAuthenticator.java:82)
> ... 7 more
>
> Maybe it is a problem with the configuration file. Do I need to add
> something more than the
> <Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>
> line? It seems that Cassandra can't find the access.properties
> and passwd.properties files. I have put them in the conf directory, but do I
> need to put something more in the storage-conf.xml file?
>
> The keyspace name, user name, and password are made up; they are only for the
> example.
>
> 2010/4/29 roger schildmeijer 
>
>  Are you sure that your keyspace is named "keyspace", and not "Keyspace1"
>> (default)?
>>
>>
>>
>> / Roger Schildmeijer
>>
>>
>> On Thu, Apr 29, 2010 at 2:47 PM, Jonathan Ellis wrote:
>>
>>> If you're getting an internalerror, you need to check the server logs
>>> for the exception that caused it
>>>
>>> On Wed, Apr 28, 2010 at 6:20 AM, Julio Carlos Barrera Juez
>>>  wrote:
>>> > Hi all!
>>> > I am using org.apache.cassandra.auth.SimpleAuthenticator to use
>>> > authentication in my cluster with one node (with cassandra 0.6.1). I
>>> have
>>> > put:
>>> >
>>> <Authenticator>org.apache.cassandra.auth.SimpleAuthenticator</Authenticator>
>>> > in storage-conf.xml file, and:
>>> > keyspace=username
>>> > in access.properties file, and:
>>> > username=password
>>> > in passwd.properties file.
>>> > When I try to use cassandra client I am using:
>>> > ./cassandra-cli --host localhost --port 9160 --username username
>>> --password
>>> > password --keyspace keyspace --debug
>>> > and it returns this:
>>> > org.apache.thrift.TApplicationException: Internal error processing
>>> login
>>> > at
>>> >
>>> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>>> > at
>>> >
>>> org.apache.cassandra.thrift.Cassandra$Client.recv_login(Cassandra.java:300)
>>> > at
>>> org.apache.cassandra.thrift.Cassandra$Client.login(Cassandra.java:282)
>>> > at org.apache.cassandra.cli.CliMain.connect(CliMain.java:109)
>>> > at org.apache.cassandra.cli.CliMain.main(CliMain.java:239)
>>> > Login failure. Did you specify 'keyspace', 'username' and 'password'?
>>> > When I try the same process with Java Thrift API:
>>> > TTransport tr = new TSocket(ip, port);
>>> > static Cassandra.Client client = new Cassandra.Client(new
>>> > TBinaryProtocol(tr));
>>> > Map credentials = new HashMap();
>>> > credentials.put(SimpleAuthenticator.USERNAME_KEY, username);
>>> > credentials.put(SimpleAuthenticator.PASSWORD_KEY, password);
>>> > try {
>>> > tr.open();
>>> > client.login(KEY_SPACE, new AuthenticationRequest(credentials));
>>> > catch{...}
>>> > ..
>>> > I get:
>>> > org.apache.thrift.TApplicationException: Internal error processing
>>> login
>>> > at
>>> >
>>> org.apache.thrift.TApplicationException.read(TApplicationException.java:108)
>>> > at
>>> >
>>> org.apache.cassandra.thrift.Cassandra$Client.recv_login(Cassandra.java:300)
>>> > at
>>> org.apache.cassandra.thrift.Cassandra$Client.login(Cassandra.java:282)
>>> > ...
>>> > What I am doing wrong?
>>>
>>>
>>>
>>> --
>>> Jonathan Ellis
>>> Project Chair, Apache Cassandra
>>> co-founder of Riptano, the source for professional Cassandra support
>>> http://riptano.com
>>>
>>
>>
>


Distributed export and import into cassandra

2010-05-03 Thread Utku Can Topçu
Hey All,

I have a simple sample use case,
The aim is to export the columns in a column family into flat files with the
keys in range from k1 to k2.
Since all the nodes in the cluster are supposed to contain some portion of the
data, is it possible to make each node dump its own local
data volume to a flat file?

Best Regards,
Utku


RE: Cassandra on Windows network latency

2010-05-03 Thread Viktor Jevdokimov
Yes, we have already figured that out :)

Thanks!

-Original Message-
From: Carlos Alvarez [mailto:cbalva...@gmail.com] 
Sent: Thursday, April 29, 2010 4:03 PM
To: user@cassandra.apache.org
Subject: Re: Cassandra on Windows network latency

Are you using TSocket in the client? If yes, use TBufferedTransport instead.


Carlos

On 4/29/10, Viktor Jevdokimov  wrote:
> Thrift C# sources, thrift generated Cassandra sources, test app built with
> C#. Simple connect/write/read operations. No pooling or anything else.
>
> From: Heath Oderman [mailto:he...@526valley.com]
> Sent: Thursday, April 29, 2010 2:17 PM
> To: user@cassandra.apache.org
> Subject: Re: Cassandra on Windows network latency
>
> I learned the hard way that running py_stress in the src/contrib directory
> is a great way to test what kind of speeds you are really getting.
>
> What tools / client are you using to test to get the 200ms number?
>
> stu
> On Thu, Apr 29, 2010 at 7:12 AM, Viktor Jevdokimov
> mailto:viktor.jevdoki...@adform.com>> wrote:
> Hi all,
>
> We have installed Cassandra on Windows and found that with any number of
> Cassandra nodes (single, or a 3 node cluster) on Windows Vista or Windows Server
> 2008, 32 or 64 bit, with any load or number of requests, we have:
>
> When client and server are on the same machine, connect/read/write latencies
> ~0-1ms
> When client on another machine, same network, on the same switch, connection
> latency 0-1ms (as a ping), read/write latencies >=200ms.
>
> What causes 200ms latency accessing Cassandra on Windows through network?
> Does anybody experience such behavior?
>
> Cassandra 0.6.1
> Java SE 6 u20
>
>
> Best regards,
> Viktor
>
>
>

-- 
Sent from my mobile device

Perhaps there was an error in the spelling. Or in the articulation of the Sacred Name.


Primary and Backup clusters

2010-05-03 Thread Viktor Jevdokimov
Hello,

Our system (not Cassandra) has a backup cluster in a different datacenter in case 
of primary cluster unavailability or for software upgrades.
100% of traffic goes to the primary cluster. We switch 100% of traffic to the backup 
cluster in the cases above for a short time; then, when the issues are resolved, traffic 
is switched back to the primary.

We'd like to have primary and backup Cassandra clusters in different 
datacenters for the same reasons.
We do not want to have high traffic between the primary and backup datacenters.

Now the questions:

1. How to sync Cassandra clusters (backup<->primary) with minimal traffic?
2. How to configure Cassandra in such case?


Thanks,

Viktor


Re: inserting new rows with one key vs. inserting new columns in a row performance

2010-05-03 Thread malsmith
I've seen this too (your second case) - it seems like the entire row
contents (or some big subset of the row) are loaded to memory on the
server before any column value is returned.  The partitioner selection
did not make any difference to performance in my case.  I did not find a
way around this except to take a strategy similar to your first case.



On Mon, 2010-05-03 at 09:33 +0300, Даниел Симеонов wrote:

> Hello,
>    It seems that I have experienced network problems (a local
> pre-installed firewall) and some REST/HTTP inefficiencies, so I think
> that it behaves the same in both cases. I am sorry to have taken up
> your time.
> Best regards, Daniel.
> 
> On 30 April 2010 20:46, Даниел Симеонов wrote:
> 
> Hi, 
>    I've checked two similar scenarios and one of them seems to
> be more performant. Timestamped data is being appended; the
> first use case is with an OPP and new rows being created, each
> with only one column (there are about 7-8 CFs). The second
> case is to have rows with more columns and RandomPartitioner;
> although every row gets much more than one column appended,
> the inserts are relatively uniformly distributed among rows.
> Yet the first scenario is faster than the second, and the
> second one starts with good response times (about 20-30 ms)
> and gradually the mean time increases (to about 150-200 ms).
> What could be the reason?
> Thank you very much! 
> Best regards, Daniel. 
> 
> 
> 




Re: inserting new rows with one key vs. inserting new columns in a row performance

2010-05-03 Thread Sylvain Lebresne
Make sure you have disabled the row cache. If you have the row cache enabled, the entire
row does get loaded into memory. Otherwise it is not.

On Mon, May 3, 2010 at 3:06 PM, malsmith  wrote:
> I've seen this too (your second case) - it seems like the entire row
> contents (or some big subset of the row) are loaded to memory on the server
> before any column value is returned.  The partitioner selection did not make
> any difference to performance in my case.  I did not find a way around this
> except to take a strategy similar to your first case.
>
>
>
> On Mon, 2010-05-03 at 09:33 +0300, Даниел Симеонов wrote:
>
> Hello,
>    It seems that I have experienced network problems (local pre-installed
> firewall) and some rest http inefficiencies, so I think that it behaves the
> same in both cases. I am sorry to have taken from your time.
> Best regards, Daniel.
>
> On 30 April 2010 20:46, Даниел Симеонов wrote:
>
> Hi,
>    I've checked two similar scenarios and one of them seem to be more
> performant. So timestamped data is being appended, the first use case is
> with an OPP and new rows being created every with only one column (there are
> about 7-8 CFs). The second cases is to have rows with more columns and
> RandomPartitioner, although every row gets much more than one column
> appended yet the inserts are relatively uniformly distributed among rows.
> Yet the first scenario is faster than the second, and the second one starts
> with good response times (about 20-30 ms) and gradually the mean time
> increases (to about 150-200 ms). What could be the reason?
> Thank you very much!
> Best regards, Daniel.
>
>
>
>


Re: Search Sample and Relation question because UDDI as Key

2010-05-03 Thread Jonathan Shook
I am only speaking to your second question.

It may be helpful to think of modeling your storage layout in terms of
* lists
* sets
* hash maps
... and certain combinations of these.

Since there are no schema-defined relations, your relations may appear
implicit between different views or "copies" of your data. The relationship
can be assumed to be explicit to the extent that it is used in that way or
even (in some cases) enforced by a boundary layer in your software.

For accessing data by value, you can try to do your bookkeeping (indexing)
as you go, by maintaining auxiliary maps directly via your application.
Scanning by value is really not a strong point for Cassandra, and in fact is
one of the trade-offs made when moving to a DHT (
http://en.wikipedia.org/wiki/Distributed_hash_table) data store.
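
As a concrete illustration of that "bookkeeping as you go" idea, the write path
might look roughly like this against the 0.6 Thrift API (a sketch only; the
keyspace and the Users/UsersByCity column family names are invented for the
example):

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class IndexBookkeeping
{
    /** Write a record and, at the same time, maintain an auxiliary "by value" row. */
    public static void insertUserCity(Cassandra.Client client, String userKey, String city)
            throws Exception
    {
        long timestamp = System.currentTimeMillis() * 1000;

        // 1. The record itself: Users[userKey]["city"] = city
        ColumnPath dataPath = new ColumnPath("Users");
        dataPath.setColumn("city".getBytes());
        client.insert("Keyspace1", userKey, dataPath, city.getBytes(),
                      timestamp, ConsistencyLevel.QUORUM);

        // 2. The bookkeeping: UsersByCity[city][userKey] = (empty value).
        //    "Find users by city" later becomes a slice of this row instead
        //    of a scan over Users.
        ColumnPath indexPath = new ColumnPath("UsersByCity");
        indexPath.setColumn(userKey.getBytes());
        client.insert("Keyspace1", city, indexPath, new byte[0],
                      timestamp, ConsistencyLevel.QUORUM);
    }
}

The auxiliary row is just another write, so it has to be kept in step by the
application (or by a boundary layer, as above) on every insert and delete.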

There has been discussion around putting some form of value indexing in at
some point in the future, but the plans appear indefinite. Even with this,
it would move workload into the hub which may otherwise be better handled in
a client node.


On Sun, May 2, 2010 at 4:33 PM, CleverCross | Falk Wolsky <
falk.wol...@clevercross.eu> wrote:

> Hello,
>
> 1) Can you provide a solution or a sample for searching (Column and
> SuperColumn) (full-text)?
> What is the way to realize this? Hadoop/MapReduce? Do you see a possibility to
> build/use an index for columns?
>
> Why this: in a given data model we "must" use UUIDs as keys and actually have
> no chance to search values from "Columns"? (or not?)
>
> 2) How can we realize a "relation"?
>
> For example: (http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
> )
> Arin gives a good description of a simple data model for building a blog. But how
> can we read (filter) all posts in "BlogEntries" from a single author?
> (filter the SuperColumns by a column inside of a SuperColumn)
>
> The "relation" in this example is Author -> BlogEntries...
> To filter the data there is a need to specify a Column/Value combination
> in a "get(...)" function...
>
> I know well that Cassandra is not a "relational database"! But without
> these relations the usage is very "limited" (specialized).
>
> Thanks in advance! - and thanks for Cassandra!
> With Hector I built an (Apache) Cocoon Transformer...
>
> With Kind Regards,
> Falk Wolsky
>


Re: A simple Cassandra CLI in Python with readline support

2010-05-03 Thread Jonathan Ellis
That's fine, although GPLv3 software cannot be included in Apache
projects.  http://www.apache.org/licenses/GPL-compatibility.html

On Sat, May 1, 2010 at 8:25 PM, Shuge Lee  wrote:
> Thanks for reply.
> Add GPLv3 license, github.com/shuge/shuge-cassandra/downloads.
>
> 2010/5/1 Jonathan Ellis 
>>
>> Nice work!
>>
>> Can you add license information, e.g., Apache license?
>>
>> On Sat, May 1, 2010 at 5:06 AM, Shuge Lee  wrote:
>> > Hi all:
>> >
>> > I write a simple Cassandra CLI in Python with readline support,
>> >
>> > FEATURES
>> >
>> > inherited all standard Apache Cassandra CLI features
>> > readline
>> >
>> > for more information, see http://github.com/shuge/shuge-cassandra .
>> >
>> > --
>> > Shuge Lee | Lee Li
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>
>
> --
> Shuge Lee | Lee Li | 李蠡
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Design Query

2010-05-03 Thread Jonathan Ellis
On Sat, May 1, 2010 at 6:34 AM, Rakesh Rajan  wrote:
> I am evaluating cassandra to implement activity streams. We currently have
> over 100 feeds with total entries exceeding 32000 implemented using
> redis ( ~320 entries / feed). Would like hear from the community on how to
> use cassandra to solve the following cases:
>
> Ability to fetch entries by applying a few filters ( like show me only likes
> from a given user). This would include range query to support pagination. So
> this would mean indices on a few columns like the feed id, feed type etc.

Sounds like you've got it: you need to denormalize in your app to
other CFs for things that you need "filtered" server-side.  Everything
else you have to filter client-side.
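
For example, a denormalized per-user "likes" row could be paged through like this
(a rough sketch against the 0.6 Thrift API; the keyspace and column family names
are invented, and the TimeUUID encoding of the column names is left out):

import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

public class LikesFeed
{
    /** Fetch one page of a per-user "likes" row, newest entries first. */
    public static List<ColumnOrSuperColumn> page(Cassandra.Client client, String userId,
                                                 byte[] startFrom, int pageSize)
            throws Exception
    {
        // One row per user; each like is a column whose name is a TimeUUID.
        ColumnParent parent = new ColumnParent("UserLikes");

        // reversed = true pages from the newest column backwards; startFrom is the
        // column name to start at (inclusive), or an empty byte[] for the first page.
        SliceRange range = new SliceRange(startFrom, new byte[0], true, pageSize);
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(range);

        return client.get_slice("Feeds", userId, parent, predicate, ConsistencyLevel.QUORUM);
    }
}

A filter such as "only likes from a given user" then becomes a plain slice on a
row the application maintains itself, which is the denormalization described above.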

> We have around 3 machines with 4GB RAM for this purpose and thinking of
> having replication factor 2. Would 4GB * 3 be enough for cassandra for this
> kind of data? I read that cassandra does not keep all the data in memory but
> want to be sure that we have the right server config to handle this data
> using cassandra.

Depends on how much of the data is "hot."  Cassandra does not require
all data to be in memory, but of course if you request data faster
than the disk can keep up then that will be your bottleneck.

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: strange get_range_slices behaviour v0.6.1

2010-05-03 Thread Jonathan Ellis
Util.range returns a Range object which is end-exclusive.  (You want
"Bounds" for end-inclusive.)

On Sun, May 2, 2010 at 7:19 AM, aaron morton  wrote:
> He there, I'm still getting odd behavior with get_range_slices. I've created
> a JUNIT test that illustrates the case.
> Could someone take a look and either let me know where my understanding is
> wrong or is this is a real issue?
>
>
> I added the following to ColumnFamilyStoreTest.java
>
>
>    private ColumnFamilyStore insertKey1Key2Key3() throws IOException,
> ExecutionException, InterruptedException
>    {
>        List<RowMutation> rms = new LinkedList<RowMutation>();
>        RowMutation rm;
>        rm = new RowMutation("Keyspace2", "key1".getBytes());
>        rm.add(new QueryPath("Standard1", null, "Column1".getBytes()),
> "asdf".getBytes(), 0);
>        rms.add(rm);
>
>        rm = new RowMutation("Keyspace2", "key2".getBytes());
>        rm.add(new QueryPath("Standard1", null, "Column1".getBytes()),
> "asdf".getBytes(), 0);
>        rms.add(rm);
>
>        rm = new RowMutation("Keyspace2", "key3".getBytes());
>        rm.add(new QueryPath("Standard1", null, "Column1".getBytes()),
> "asdf".getBytes(), 0);
>        rms.add(rm);
>        return Util.writeColumnFamily(rms);
>    }
>
>
>    @Test
>    public void testThreeKeyRangeAll() throws IOException,
> ExecutionException, InterruptedException
>    {
>        ColumnFamilyStore cfs = insertKey1Key2Key3();
>
>        IPartitioner p = StorageService.getPartitioner();
>        RangeSliceReply result =
> cfs.getRangeSlice(ArrayUtils.EMPTY_BYTE_ARRAY,
>                                                   Util.range(p, "key1",
> "key3"),
>                                                   10,
>                                                   null,
>
> Arrays.asList("Column1".getBytes()));
>        assertEquals(3, result.rows.size());
>    }
>
>    @Test
>    public void testThreeKeyRangeSkip1() throws IOException,
> ExecutionException, InterruptedException
>    {
>        ColumnFamilyStore cfs = insertKey1Key2Key3();
>
>        IPartitioner p = StorageService.getPartitioner();
>        RangeSliceReply result =
> cfs.getRangeSlice(ArrayUtils.EMPTY_BYTE_ARRAY,
>                                                   Util.range(p, "key2",
> "key3"),
>                                                   10,
>                                                   null,
>
> Arrays.asList("Column1".getBytes()));
>        assertEquals(2, result.rows.size());
>    }
>
> Running this with "ant test" the partial output is
>
>    [junit] Testsuite: org.apache.cassandra.db.ColumnFamilyStoreTest
>    [junit] Tests run: 7, Failures: 2, Errors: 0, Time elapsed: 1.405 sec
>    [junit]
>    [junit] Testcase:
> testThreeKeyRangeAll(org.apache.cassandra.db.ColumnFamilyStoreTest):
>  FAILED
>    [junit] expected:<3> but was:<2>
>    [junit] junit.framework.AssertionFailedError: expected:<3> but was:<2>
>    [junit]     at
> org.apache.cassandra.db.ColumnFamilyStoreTest.testThreeKeyRangeAll(ColumnFamilyStoreTest.java:170)
>    [junit]
>    [junit]
>    [junit] Testcase:
> testThreeKeyRangeSkip1(org.apache.cassandra.db.ColumnFamilyStoreTest):
>  FAILED
>    [junit] expected:<2> but was:<1>
>    [junit] junit.framework.AssertionFailedError: expected:<2> but was:<1>
>    [junit]     at
> org.apache.cassandra.db.ColumnFamilyStoreTest.testThreeKeyRangeSkip1(ColumnFamilyStoreTest.java:184)
>    [junit]
>    [junit]
>    [junit] Test org.apache.cassandra.db.ColumnFamilyStoreTest FAILED
>
>
> Any help appreciated.
>
> Aaron
>
>
> On 27 Apr 2010, at 09:38, aaron wrote:
>
>>
>> I've broken this case down further to some pyton code that works against
>> the thrift generated
>> client and am still getting the same odd results. With keys obejct1,
>> object2 and object3 an
>> open ended get_range_slice starting with "object1" only returns object1
>> and
>> 2.
>>
>> I'm guessing that I've got something wrong or my expectation of how
>> get_range_slice works
>> is wrong, but I cannot see where I've gone wrong. Any help would be
>> appreciated.
>>
>> They python code to add and read keys is below, assumes a Cassandra.Client
>> connection.
>>
>> import time
>> from cassandra import Cassandra,ttypes
>> from thrift import Thrift
>> from thrift.protocol import TBinaryProtocol
>> from thrift.transport import TSocket, TTransport
>>
>>
>> def add_data(conn):
>>
>>   col_path = ttypes.ColumnPath(column_family="Standard1",
>> column="col_name")
>>   consistency = ttypes.ConsistencyLevel.QUORUM
>>
>>   for key in ["object1", "object2", "object3"]:
>>       conn.insert("Keyspace1", key, col_path, "col_value",
>>           int(time.time() * 1e6), consistency)
>>   return
>>
>> def read_range(conn, start_key, end_key):
>>
>>   col_parent = ttypes.ColumnParent(column_family="Standard1")
>>
>>   predicate = ttypes.SlicePredicate(column_names=["col_name"])
>>   range = ttypes.KeyRange(start_key=start_key, end_key=end_key,
>> count=1000)
>>   consistency = ttypes.ConsistencyLevel.Q

Re: Row slice / cache performance

2010-05-03 Thread Jonathan Ellis
On Sun, May 2, 2010 at 1:00 PM, James Golick  wrote:
> the ConcurrentSkipListMap (ColumnFamily.columns_).
> SliceQueryFilter.getMemColumnIterator @ ~30% - Virtually all the time in
> here is spent in ConcurrentSkipListMap$Values.toArrray()

Besides the UUID optimization you posted, we should do an audit of
ColumnFamily.getSortedColumns and replace with iteration where
possible (in this case, we'd be left with one copy of most of the
columns, but that's better than two).

We can get rid of the other copy by fixing the logic in
Memtable.getSliceIterator, which says "copy all the columns, so we can
do a binary search on them to find where to start," but since columns
are natively in sorted order we could just use an iterator and a while
loop.

Created https://issues.apache.org/jira/browse/CASSANDRA-1046 for this.
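
The shape of that change, in miniature (an illustration of the idea only, not the
actual Cassandra code; the skip list here stands in for the column map):

import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

public class SliceIterationSketch
{
    public static void main(String[] args)
    {
        ConcurrentSkipListMap<String, byte[]> columns = new ConcurrentSkipListMap<String, byte[]>();
        columns.put("a", new byte[0]);
        columns.put("b", new byte[0]);
        columns.put("c", new byte[0]);

        // Instead of copying every column with toArray() and binary-searching for
        // the slice start, ask the skip list for an ordered tail view and iterate.
        for (Map.Entry<String, byte[]> entry : columns.tailMap("b").entrySet())
            System.out.println(entry.getKey());   // prints b, then c
    }
}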

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Feeding in specific Cassandra columns into Hadoop

2010-05-03 Thread Jonathan Ellis
Can you reproduce outside the Hadoop environment, i.e. w/ Thrift code?

On Mon, May 3, 2010 at 5:49 AM, Mark Schnitzius
 wrote:
> Hi all...  I am trying to feed a specific list of Cassandra column names in
> as input to a Hadoop process, but for some reason it only feeds in some of
> the columns I specify, not all.
> This is a short description of the problem - I'll see if anyone might have
> some insight before I dump a big load of code on you...
> 1.  I've uploaded a bunch of data into Cassandra; the column names as longs
> (dates, basically) converted to byte[8].
> 2.  I can successfully set a SlicePredicate using setSlice_range to return
> all the data for a set of columns.
> 3.  However, if I instead call setColumn_names on the SlicePredicate, only
> some of the specified columns get fed into Hadoop.
> 4.  This faulty behavior is repeatable, with the same columns going missing
> each time for the same input parameters.
> 5.  For the values that fail, I've made fairly certain that the value for
> the column name is getting inserted successfully, and that the exact same
> column name is specified in the call to setColumn_names.
> Any clues?
>
> AdTHANKSvance,
> Mark



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Distributed export and import into cassandra

2010-05-03 Thread Jonathan Ellis
sstable2json does this.  (you'd want to perform nodetool compact
first, so there is only one sstable for the CF you want.)

On Mon, May 3, 2010 at 6:17 AM, Utku Can Topçu  wrote:
> Hey All,
>
> I have a simple sample use case,
> The aim is to export the columns in a column family into flat files with the
> keys in range from k1 to k2.
> Since all the nodes in the cluster is supposed to contain some of the
> distribution of data, is it possible to make each node dump its own local
> data volume to a flat file?
>
> Best Regards,
> Utku
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Primary and Backup clusters

2010-05-03 Thread Jonathan Ellis
I suppose you could use rsync (sstable files are immutable, so you
don't need to worry about not getting a "consistent" version of the
data files), but compared to letting Cassandra handle the replication
the way it's designed to,

# you'll generate a lot of disk i/o doing that vs
# your backup cluster will miss a relatively large amount of the most
recent updates
# your failover process will be more error-prone

On Mon, May 3, 2010 at 8:05 AM, Viktor Jevdokimov
 wrote:
> Hello,
>
> Our system (not Cassandra) have backup cluster in different datacenter in 
> case of primary cluster unavailability or for software upgrades.
> 100% of traffic goes to primary cluster. We switch 100% traffic to backup 
> cluster in case above for a short time, then when issues are resolved, 
> traffic is switched back to primary.
>
> We'd like to have primary and backup Cassandra clusters in different 
> datacenters for the same reasons.
> We do not want to have a high traffic between primary and backup datacenters.
>
> Now the questions:
>
> 1. How to sync Cassandra clusters (backup<->primary) with minimal traffic?
> 2. How to configure Cassandra in such case?
>
>
> Thanks,
>
> Viktor
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Strange streaming behaviour

2010-05-03 Thread Gary Dusbabek
Martin,

Please create a ticket and include the relevant parts of your storage-conf.

To summarize, the output gives you the impression that bootstrap has
completed normally, but when you check, it appears to be hung on the
receiving nodes?

Do you mind turning debug on and seeing if you can reproduce?  The
strange part is that the source node doesn't think it's streaming at
all.

Gary.


On Mon, May 3, 2010 at 02:30, Dr. Martin Grabmüller
 wrote:
> Hello list,
>
> I encountered a problem with streaming in my test cluster but
> found nothing like this in JIRA or on the list.
>
> I'm running a test cluster of three nodes, RF=3, Cassandra 0.6.1.
> I started the first node and inserted some data, then bootstrapped
> the other machines one after the other.  No problems here.
>
> Then, I inserted data over the weekend and checked back this morning,
> finding a lot of -tmp-files in the data directory of the two bootstrapped
> machines which seem to be leftovers from streaming.
>
> Interesting thing is, they are mostly all about the same size on each node. 
> Additionally,
> this size is the same as the size of some file to be streamed (see the output 
> of
> nodetool streams at the end of this post):
>
> [Node 2:]
>
> -rw-r--r-- 1 cassandra cassandra      2360 2010-05-03 07:39 
> AccountList-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra      2360 2010-05-03 00:39 
> AddressList-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra      2360 2010-05-03 07:53 
> IndexQueue-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra      2360 2010-05-03 05:19 
> MailMetadata-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra      2360 2010-05-03 00:53 
> Statistics-tmp-7-Data.db
>
> [Node 3:]
>
> -rw-r--r-- 1 cassandra cassandra     10994 2010-05-03 07:39 
> AccountList-tmp-6-Data.db
> -rw-r--r-- 1 cassandra cassandra       325 2010-05-03 00:39 
> AddressList-tmp-6-Data.db
> -rw-r--r-- 1 cassandra cassandra     10994 2010-05-03 08:02 
> CustomerList-tmp-6-Data.db
> -rw-r--r-- 1 cassandra cassandra     10994 2010-05-03 07:53 
> IndexQueue-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra     10994 2010-05-03 01:02 
> MailMetadata-tmp-7-Data.db
> -rw-r--r-- 1 cassandra cassandra     10994 2010-05-03 00:53 
> Statistics-tmp-7-Data.db
>
> Checking the logs, I find a few streaming-related error messages:
>
> [Node 1 (yes, the node without leftover files):]
>
> ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-02 14:39:10,675 
> DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No 
> such file or directory)
>        at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>        at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>        at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>        at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.FileNotFoundException: 
> /mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No 
> such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>        at 
> org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:84)
>        at 
> org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
>        at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>        ... 3 more
> ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-02 14:39:10,675 CassandraDaemon.java 
> (line 78) Fatal exception in thread Thread[MESSAGE-STREAMING-POOL:1,5,main]
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No 
> such file or directory)
>        at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:34)
>        at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>        at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>        at java.lang.Thread.run(Thread.java:636)
> Caused by: java.io.FileNotFoundException: 
> /mnt/data000/cassandra/data/Archive/stream/LocalpartMapping-2-Data.db (No 
> such file or directory)
>        at java.io.RandomAccessFile.open(Native Method)
>        at java.io.RandomAccessFile.<init>(RandomAccessFile.java:233)
>        at 
> org.apache.cassandra.net.FileStreamTask.stream(FileStreamTask.java:84)
>        at 
> org.apache.cassandra.net.FileStreamTask.runMayThrow(FileStreamTask.java:63)
>        at 
> org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:30)
>        ... 3 more
>
> [Node 3:]
>
> ERROR [MESSAGE-STREAMING-POOL:1] 2010-05-03 00:39:46,656 
> DebuggableThreadPoolExecutor.java (line 101) Error in ThreadPoolExecutor
> java.lang.RuntimeException: java.io.FileNotFoundException: 
> /mnt/data000/cassandra/

Error in TBaseHelper compareTo(byte [] a , byte [] b)

2010-05-03 Thread Erik Holstad
Hey!
We are currently using Cassandra 0.5.1 and I'm getting a StackOverflowError
when
comparing two ColumnOrSuperColumn objects. It turns out that the compareTo
function
for byte[] has an infinite loop in libthrift-r820831.jar.

We are planning to upgrade to 0.6.1 but are not ready to do it today, so I just
wanted to check
if it is possible to get a jar that works with 0.5 where that bug has been fixed,
so we can
just replace it.

-- 
Regards Erik


Bootstrap problem

2010-05-03 Thread David Koblas
Trying to add a node to an existing cluster and getting the following 
error (using 0.6.1):


 INFO [main] 2010-05-03 08:36:58,960 CommitLog.java (line 169) Log 
replay complete
 INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164) Saved 
Token found: 113225717064305079230489016527619806663
 INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179) Saved 
ClusterName found: Image Cluster
 INFO [main] 2010-05-03 08:36:59,003 StorageService.java (line 317) 
Starting up server gossip
 INFO [main] 2010-05-03 08:36:59,019 StorageService.java (line 378) 
Joining: getting load information
 INFO [main] 2010-05-03 08:36:59,019 StorageLoadBalancer.java (line 
365) Sleeping 9 ms to wait for load information...
 INFO [main] 2010-05-03 08:38:29,020 StorageService.java (line 378) 
Joining: getting bootstrap token
ERROR [main] 2010-05-03 08:38:29,023 CassandraDaemon.java (line 195) 
Exception encountered during startup.

java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
at 
org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
at 
org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
at 
org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
at 
org.apache.cassandra.service.StorageService.initServer(StorageService.java:347)
at 
org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
at 
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)


I can poll the ring and every other node has an "UP" status.

Any ideas of where to look?


Skip large size (Configurable) SSTable in minor or/and major compaction

2010-05-03 Thread Schubert Zhang
We make a patch to 0.6 branch and 0.6.1 for this feature.

https://issues.apache.org/jira/browse/CASSANDRA-1041


Re: Bootstrap problem

2010-05-03 Thread Schubert Zhang
It seems the node you are adding is not a "new" node.

 INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164) Saved Token
found: 113225717064305079230489016527619806663
 INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179) Saved
ClusterName found: Image Cluster

The log above says this node has a saved system table, which indicates the added node
already belongs to "Image Cluster" and its token is
113225717064305079230489016527619806663


Schubert

On Tue, May 4, 2010 at 1:09 AM, David Koblas  wrote:

> Trying to add a node to an existing cluster and getting the following error
> (using 0.6.1):
>
>  INFO [main] 2010-05-03 08:36:58,960 CommitLog.java (line 169) Log replay
> complete
>  INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164) Saved
> Token found: 113225717064305079230489016527619806663
>  INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179) Saved
> ClusterName found: Image Cluster
>  INFO [main] 2010-05-03 08:36:59,003 StorageService.java (line 317)
> Starting up server gossip
>  INFO [main] 2010-05-03 08:36:59,019 StorageService.java (line 378)
> Joining: getting load information
>  INFO [main] 2010-05-03 08:36:59,019 StorageLoadBalancer.java (line 365)
> Sleeping 9 ms to wait for load information...
>  INFO [main] 2010-05-03 08:38:29,020 StorageService.java (line 378)
> Joining: getting bootstrap token
> ERROR [main] 2010-05-03 08:38:29,023 CassandraDaemon.java (line 195)
> Exception encountered during startup.
> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>at
> org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
>at
> org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
>at
> org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
>at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:347)
>at
> org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
>at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
>
> I can poll the ring and every other node has an "UP" status.
>
> Any ideas of where to look?
>


Re: Bootstrap problem

2010-05-03 Thread David Koblas
It started out new, didn't cut and paste the "original" startup, but 
here it is...


 INFO [main] 2010-05-03 08:34:43,305 DatabaseDescriptor.java (line 229) 
Auto DiskAccessMode determined to be mmap
 INFO [main] 2010-05-03 08:34:43,637 SystemTable.java (line 139) Saved 
Token not found. Using 113225717064305079230489016527619806663
 INFO [main] 2010-05-03 08:34:43,638 SystemTable.java (line 145) Saved 
ClusterName not found. Using Image Cluster
 INFO [main] 2010-05-03 08:34:43,647 CommitLogSegment.java (line 50) 
Creating new commitlog segment 
/data/var/lib/cassandra/commitlog/CommitLog-1272900883647.log
 INFO [main] 2010-05-03 08:34:43,712 StorageService.java (line 317) 
Starting up server gossip
 INFO [main] 2010-05-03 08:34:43,737 StorageService.java (line 378) 
Joining: getting load information
 INFO [main] 2010-05-03 08:34:43,738 StorageLoadBalancer.java (line 
365) Sleeping 9 ms to wait for load information...
 INFO [main] 2010-05-03 08:36:13,738 StorageService.java (line 378) 
Joining: getting bootstrap token
ERROR [main] 2010-05-03 08:36:13,741 CassandraDaemon.java (line 195) 
Exception encountered during startup.


--koblas

On 5/3/10 10:15 AM, Schubert Zhang wrote:

Seems your adding node is not a "new" node.

 INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164) Saved 
Token found: 113225717064305079230489016527619806663
 INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179) Saved 
ClusterName found: Image Cluster


Above log says, this node have system table which indicates the adding 
node is belongs to "Image Cluster" and it's token is 
113225717064305079230489016527619806663



Schubert

On Tue, May 4, 2010 at 1:09 AM, David Koblas wrote:


Trying to add a node to an existing cluster and getting the
following error (using 0.6.1):

 INFO [main] 2010-05-03 08:36:58,960 CommitLog.java (line 169) Log
replay complete
 INFO [main] 2010-05-03 08:36:58,993 SystemTable.java (line 164)
Saved Token found: 113225717064305079230489016527619806663
 INFO [main] 2010-05-03 08:36:58,994 SystemTable.java (line 179)
Saved ClusterName found: Image Cluster
 INFO [main] 2010-05-03 08:36:59,003 StorageService.java (line
317) Starting up server gossip
 INFO [main] 2010-05-03 08:36:59,019 StorageService.java (line
378) Joining: getting load information
 INFO [main] 2010-05-03 08:36:59,019 StorageLoadBalancer.java
(line 365) Sleeping 9 ms to wait for load information...
 INFO [main] 2010-05-03 08:38:29,020 StorageService.java (line
378) Joining: getting bootstrap token
ERROR [main] 2010-05-03 08:38:29,023 CassandraDaemon.java (line
195) Exception encountered during startup.
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
   at

org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
   at

org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
   at

org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
   at

org.apache.cassandra.service.StorageService.initServer(StorageService.java:347)
   at
org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
   at
org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)

I can poll the ring and every other node has an "UP" status.

Any ideas of where to look?




debian packages

2010-05-03 Thread Lee Parker
Is there a reason why the jvm options are so different in the debian version
from the standard cassandra.in.sh?  The following lines are completely
missing from the init script:
-XX:TargetSurvivorRatio=90 \
-XX:+AggressiveOpts \
-XX:+UseParNewGC \
-XX:+UseConcMarkSweepGC \
-XX:+CMSParallelRemarkEnabled \
-XX:+HeapDumpOnOutOfMemoryError \
-XX:SurvivorRatio=128 \
-XX:MaxTenuringThreshold=0 \

I see these in the JVM_EXTRA_OPS line of /etc/defaults/cassandra, but I
don't see where this data is actually passed into the jvm in the init
script.  Am I missing something?

Lee Parker


Re: debian packages

2010-05-03 Thread Eric Evans
On Mon, 2010-05-03 at 13:30 -0500, Lee Parker wrote:
> I see these in the JVM_EXTRA_OPS line of /etc/defaults/cassandra, but
> I don't see where this data is actually passed into the jvm in the
> init script.  Am I missing something? 

JVM_EXTRA_OPS should be getting passed (they used to be), so this is a
bug. It's been fixed now in SVN (both the 0.6 branch, and trunk); if
you're not running from SVN you can apply the following one-liner:

https://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/debian/init?r1=940575&r2=940574&pathrev=940575

Thanks for spotting this.

-- 
Eric Evans
eev...@rackspace.com



replication with large rows

2010-05-03 Thread Lee Parker
I have a CF on our cluster which has several rows with 200k+ columns of
TimeUUID data.  I have noticed recently that this CF is reaching my memtable
thresholds (128M or 1.5 mill obj) far more frequently than I would expect
(every 10 minutes or so).  This CF is used as an index of items in another
CF.  So, all of the columns only have a single value, but there are lots of
them.  In the other CF, the rows all have about 10-15 columns, but there are
millions of rows.  I have reviewed our code several times and cannot see
where we would be writing millions of columns to the index CF with this kind
of frequency.  Could this be caused by the replication of data between
nodes?  When one node has new data for a row, does it pass the entire row to
the other nodes for replication or does it just pass the portion of the row
that has changed? I have two nodes with a replication factor of 2.  In the
end, this is causing both of my servers to constantly work on compacting the
files for the index CF.

Lee Parker


Re: debian packages

2010-05-03 Thread Lee Parker
Thanks.  I'll apply the patch.  I'm not real familiar with the JVM options,
but I assume that on a production machine I should remove -Xdebug and the
-Xrunjdwp options.

Lee Parker
On Mon, May 3, 2010 at 2:29 PM, Eric Evans  wrote:

> On Mon, 2010-05-03 at 13:30 -0500, Lee Parker wrote:
> > I see these in the JVM_EXTRA_OPS line of /etc/defaults/cassandra, but
> > I don't see where this data is actually passed into the jvm in the
> > init script.  Am I missing something?
>
> JVM_EXTRA_OPS should be getting passed (they used to be), so this is a
> bug. It's been fixed now in SVN (both the 0.6 branch, and trunk); if
> you're not running from SVN you can apply the following one-liner:
>
>
> https://svn.apache.org/viewvc/cassandra/branches/cassandra-0.6/debian/init?r1=940575&r2=940574&pathrev=940575
>
> Thanks for spotting this.
>
> --
> Eric Evans
> eev...@rackspace.com
>
>


Re: debian packages

2010-05-03 Thread Eric Evans
On Mon, 2010-05-03 at 14:39 -0500, Lee Parker wrote:
> Thanks.  I'll apply the patch.  I'm not real familiar with the JVM
> options, but I assume that on a production machine I should remove
> -Xdebug and the -Xrunjdwp options.

Yes.

-- 
Eric Evans
eev...@rackspace.com



replication and memtable flush

2010-05-03 Thread Lee Parker
I have a cluster which contains two CFs.  One is a bunch of rows with 10-15
columns per row.  The other is an index of those items with only a few rows,
but thousands of columns per row.  I am noticing that the replication of
data between the nodes in the cluster is causing a lot of memtable flushing
for the index CF.  Is this normal?  My memtable thresholds are 128M or 1.5m
objects.  Some of the rows in the index CF have 500k+ columns.  Also, this
doesn't seem to happen in my dev environment, which has a single Cassandra
node.  It also slows down the rate of flush/compact cycles when there is
only one node up in the prod cluster.

Lee Parker


Determining Cassandra System/Memory Requirements

2010-05-03 Thread Jon Graham
Hello Everyone,

Is there a practical formula for determining Cassandra system requirements
using OrderPreservingPartitioner ?

We have hundreds of millions of rows in a single column family with a
potential target of maybe a billion rows.

How can we estimate the Cassandra system requirements given factors such as:

N=number of nodes
M=memory allocated for Cassandra
R=replication factor
K=key size
D=individual column data size
CR=columns/row
NR=number of rows (keys) in column family

It seems like the compaction process gets more stressed as we add more data,
but I have no idea how close we are
to a breaking point.

Thanks,
Jon


Re: Error in TBaseHelper compareTo(byte [] a , byte [] b)

2010-05-03 Thread Jonathan Ellis
You'd need to check out thrift r820831, fix the compareTo code, then
build a new jar.

You can't just use the jar from 0.6 or from current thrift trunk
because Thrift breaks backwards compatibility frequently, and there
were such changes between our 0.5 and 0.6.

So, yes, you could do it, but it's probably easier to upgrade to 0.6.
And a good idea anyway because of the other fixes that went in.
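
For reference, a correct unsigned, lexicographic byte[] comparison looks roughly
like this (an illustration of what the fix needs to do, not the actual Thrift patch):

public class ByteArrayCompare
{
    /** Compare two byte arrays lexicographically as unsigned bytes; a shorter prefix sorts first. */
    public static int compare(byte[] a, byte[] b)
    {
        int length = Math.min(a.length, b.length);
        for (int i = 0; i < length; i++)
        {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);   // compare as unsigned bytes
            if (diff != 0)
                return diff;
        }
        return a.length - b.length;
    }
}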

On Mon, May 3, 2010 at 11:52 AM, Erik Holstad  wrote:
> Hey!
> We are currently using Cassandra 0.5.1 and I'm getting a StackOverflowError
> when
> comparing two ColumnOrSuperColumn objects. It turns out the the comparTo
> function
> for byte [] has an infinite loop in libthrift-r820831.jar.
>
> We are planning to upgrade to 0.6.1 but not ready to do it today', so just
> wanted to check
> if it is possible to get a jar where that bug has been fixed that works with
> 0.5, so we can
> just replace it?
>
> --
> Regards Erik
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Determining Cassandra System/Memory Requirements

2010-05-03 Thread Jonathan Ellis
Short answer: no, there is no formula into which you can plug numbers.

Longer answer: benchmark with a subset of your data and extrapolate.
The closer the test data is to real data, the more accurate it will
be.  Yes, compaction is O(N) wrt the amount of data in the system, so
don't do it more than necessary (increase memtable flush thresholds;
go easy on nodetool compact).

On Mon, May 3, 2010 at 4:34 PM, Jon Graham  wrote:
> Hello Everyone,
>
> Is there a practical formula for determining Cassandra system requirements
> using OrderPreservingPartitioner ?
>
> We have hundreds of millions of rows in a single column family with a
> potential target of maybe a billion rows.
>
> How can we estimate the Cassandra system requirements given factors such as:
>
> N=number of nodes
> M=memory allocated for Cassandra
> R=replication factor
> K=key size
> D=individual column data size
> CR=columns/row
> NR=number of rows (keys) in column family
>
> It seems like the compaction process gets more stressed as we add more data,
> but I have no idea how close we are
> to a breaking point.
>
> Thanks,
> Jon
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Error in TBaseHelper compareTo(byte [] a , byte [] b)

2010-05-03 Thread Anthony Molinaro

On Mon, May 03, 2010 at 05:46:51PM -0500, Jonathan Ellis wrote:
> You'd need to check out thrift r820831, fix the compareTo code, then
> build a new jar.
> 
> You can't just use the jar from 0.6 or from current thrift trunk
> because Thrift breaks backwards compatibility frequently, and there
> were such changes between our 0.5 and 0.6.

Wow, what were the backwards compatibility breakages?  I've been testing
using the old thrift library for clients and the new thrift for the server,
with inserts and fetches, and so far haven't noticed any oddities.

-Anthony

-- 

Anthony Molinaro   


Re: Feeding in specific Cassandra columns into Hadoop

2010-05-03 Thread Mark Schnitzius
If I take the exact same SlicePredicate that fails in the Hadoop example,
and pass it in to a multiget_slice, the data is returned successfully.  So
it appears the problem does lie somewhere in the tie-in to Hadoop.

I will try to create a maximally trimmed-down, self-contained example that
demonstrates the failure, and will post it here.  I was just hoping there
might have been an easy fix recognizable from my description before I had
to resort to that...


Thanks
Mark


On Tue, May 4, 2010 at 1:40 AM, Jonathan Ellis  wrote:

> Can you reproduce outside the Hadoop environment, i.e. w/ Thrift code?
>
> On Mon, May 3, 2010 at 5:49 AM, Mark Schnitzius
>  wrote:
> > Hi all...  I am trying to feed a specific list of Cassandra column names
> in
> > as input to a Hadoop process, but for some reason it only feeds in some
> of
> > the columns I specify, not all.
> > This is a short description of the problem - I'll see if anyone might
> have
> > some insight before I dump a big load of code on you...
> > 1.  I've uploaded a bunch of data into Cassandra; the column names as
> longs
> > (dates, basically) converted to byte[8].
> > 2.  I can successfully set a SlicePredicate using setSlice_range to
> return
> > all the data for a set of columns.
> > 3.  However, if I instead call setColumn_names on the SlicePredicate,
> only
> > some of the specified columns get fed into Hadoop.
> > 4.  This faulty behavior is repeatable, with the same columns going
> missing
> > each time for the same input parameters.
> > 5.  For the values that fail, I've made fairly certain that the value for
> > the column name is getting inserted successfully, and that the exact same
> > column name is specified in the call to setColumn_names.
> > Any clues?
> >
> > AdTHANKSvance,
> > Mark
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: How do you, Bloom filter of the false positive rate or remove the problem of distributed databases?

2010-05-03 Thread Kauzki Aranami
Let me rephrase my question.


How does Cassandra deal with bloom filter's false positives on deleted records?


Bloom filters can return false positives, especially for deleted
records. How does Cassandra detect them?
And how does Cassandra remove those *detected* false positives from
the bloom filter?


---
  Kazuki Aranami

 Twitter: http://twitter.com/kimtea
 Email: kazuki.aran...@gmail.com
 http://d.hatena.ne.jp/kazuki-aranami/
 ---



2010/5/3 Kauzki Aranami :
> Hi
>
> This data structure recognizes to the way based on the idea of
> Eventually Consistency of BASE
> though Bloom filter is adopted for the data structure in Cassandra as
> shape to allow no limited adjustment.
>
> In short, this creates the problem of false positives.
> Moreover, deleting data is a problem common to existing OS
> filesystems and to distributed databases, including Google's
> BigTable.
>
> For the deletion of data, I had thought of attempting a solution
> using Interval Tree Clocks, a kind of vector clock (a Lamport-style
> logical clock).
>
> So, the question:
>
> How does the Bloom filter detect false positives, or how is the
> problem resolved?
> My guess is that Merkle trees and tombstones are involved?
>
>
> PS. I have contributed a Japanese translation to the Cassandra wiki, to
> the best of my poor ability. :-)
>
> ---
>  Kazuki Aranami
>
>  Twitter: http://twitter.com/kimtea
>  Email: kazuki.aran...@gmail.com
>  http://d.hatena.ne.jp/kazuki-aranami/
>  ---
>


Re: Feeding in specific Cassandra columns into Hadoop

2010-05-03 Thread Jonathan Ellis
We serialize the SlicePredicate as part of the Hadoop Configuration
string.  It's quite possible that either

 - one of your column names is exposing a bug in the Thrift json serializer
 - Hadoop is silently truncating large predicates

You should test that getSlicePredicate(conf).equals(originalPredicate)

On Mon, May 3, 2010 at 8:15 PM, Mark Schnitzius
 wrote:
> If I take the exact same SlicePredicate that fails in the Hadoop example,
> and pass it in to a multiget_slice, the data is returned successfully.  So
> it appears the problem does lie somewhere in the tie-in to Hadoop.
> I will try to create a maximally-trimmed-down example that's complete enough
> to run on its own that demonstrates the failure, and will post here.  I was
> just hoping that there might've been an easy fix recognizable from my
> description before I had to resort to that...
>
> Thanks
> Mark
>
>
> On Tue, May 4, 2010 at 1:40 AM, Jonathan Ellis  wrote:
>>
>> Can you reproduce outside the Hadoop environment, i.e. w/ Thrift code?
>>
>> On Mon, May 3, 2010 at 5:49 AM, Mark Schnitzius
>>  wrote:
>> > Hi all...  I am trying to feed a specific list of Cassandra column names
>> > in
>> > as input to a Hadoop process, but for some reason it only feeds in some
>> > of
>> > the columns I specify, not all.
>> > This is a short description of the problem - I'll see if anyone might
>> > have
>> > some insight before I dump a big load of code on you...
>> > 1.  I've uploaded a bunch of data into Cassandra; the column names as
>> > longs
>> > (dates, basically) converted to byte[8].
>> > 2.  I can successfully set a SlicePredicate using setSlice_range to
>> > return
>> > all the data for a set of columns.
>> > 3.  However, if I instead call setColumn_names on the SlicePredicate,
>> > only
>> > some of the specified columns get fed into Hadoop.
>> > 4.  This faulty behavior is repeatable, with the same columns going
>> > missing
>> > each time for the same input parameters.
>> > 5.  For the values that fail, I've made fairly certain that the value
>> > for
>> > the column name is getting inserted successfully, and that the exact
>> > same
>> > column name is specified in the call to setColumn_names.
>> > Any clues?
>> >
>> > AdTHANKSvance,
>> > Mark
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: How do you, Bloom filter of the false positive rate or remove the problem of distributed databases?

2010-05-03 Thread Jonathan Ellis
On Mon, May 3, 2010 at 8:45 PM, Kauzki Aranami  wrote:
> Let me rephrase my question.
>
> How does Cassandra deal with bloom filter's false positives on deleted 
> records?

The same way it deals with tombstones that it encounters otherwise
(part of a row slice, or in a memtable).

All the bloom filter does is keep you from having to check rows that
don't have any data at all for a given key.  Tombstones are not the
same as "no data at all," we do need to propagate tombstones during
replication.
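
A conceptual sketch of that read path (this is not Cassandra's actual code,
just an illustration of the roles the bloom filter and tombstones play):

import java.util.HashMap;
import java.util.Map;

// Conceptual sketch only -- not Cassandra's actual read path.  The bloom
// filter just answers "this store may contain the key"; a tombstone is real
// data, so it passes the filter and is handled like any other column.
public class BloomReadSketch {
    static class Row { boolean tombstone; String value; }

    private final Map<String, Row> store = new HashMap<String, Row>();

    // stand-in for a real bloom filter: never a false negative,
    // occasionally a false positive for keys that were never written
    private boolean bloomMightContain(String key) {
        return store.containsKey(key) || key.hashCode() % 10 == 0;
    }

    public void delete(String key) {
        Row r = new Row();
        r.tombstone = true;      // deletes are writes: the tombstone stays around
        store.put(key, r);       // so it can be propagated to replicas
    }

    public String read(String key) {
        if (!bloomMightContain(key)) {
            return null;                 // definite miss, no lookup needed
        }
        Row row = store.get(key);        // false positive: lookup finds nothing
        if (row == null || row.tombstone) {
            return null;                 // tombstone reads as "deleted"
        }
        return row.value;
    }
}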

-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Error in TBaseHelper compareTo(byte [] a , byte [] b)

2010-05-03 Thread Jonathan Ellis
Thrift is good about wire compatibility.  We're talking about running
Java code built against one API, against another version of the thrift
jar.  Different ball game.

On Mon, May 3, 2010 at 6:00 PM, Anthony Molinaro
 wrote:
>
> On Mon, May 03, 2010 at 05:46:51PM -0500, Jonathan Ellis wrote:
>> You'd need to check out thrift r820831, fix the compareTo code, then
>> build a new jar.
>>
>> You can't just use the jar from 0.6 or from current thrift trunk
>> because Thrift breaks backwards compatibility frequently, and there
>> were such changes between our 0.5 and 0.6.
>
> Wow, what were the backwards compatibility breakages?  I've been testing
> using old thrift library for clients and new thrift for server, with
> inserts and fetches and so far haven't noticed any oddities.
>
> -Anthony
>
> --
> 
> Anthony Molinaro                           
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: replication with large rows

2010-05-03 Thread Jonathan Ellis
replication in Cassandra is per-operation, not per-row

On Mon, May 3, 2010 at 2:40 PM, Lee Parker  wrote:
> I have a CF on our cluster which has several rows with 200k+ columns of
> TimeUUID data.  I have noticed recently that this CF is reaching my memtable
> thresholds (128M or 1.5 mill obj) far more frequently than I would expect
> (every 10 minutes or so).  This CF is used as an index of items in another
> CF.  So, all of the columns only have a single value, but there are lots of
> them.  In the other CF, the rows all have about 10-15 columns, but there are
> millions of rows.  I have reviewed our code several times and cannot see
> where we would be writing millions of columns to the index CF with this kind
> of frequency.  Could this be caused by the replication of data between
> nodes?  When one node has new data for a row, does it pass the entire row to
> the other nodes for replication or does it just pass the portion of the row
> that has changed? I have two nodes with a replication factor of 2.  In the
> end, this is causing both of my servers to constantly work on compacting the
> files for the index CF.
>
> Lee Parker



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Cassandra and Request routing

2010-05-03 Thread Olivier Mallassi
Hi all,

I can't figure out how to deal with request routing...

In fact I have two nodes in the "Test Cluster" and I wrote the client as
specified here: http://wiki.apache.org/cassandra/ThriftExamples#Java. The
keyspace is the default one (Keyspace1, ReplicationFactor 1).
The seeds are well configured (using the IPs), i.e. the Cassandra log
indicates that the servers are up.

Everything goes well if I write and read the data on node #1, for instance.
Yet, if I write the data on node #1 and then read the same data (using the
key) on node #2, no data is found.

Did I miss something?
As far as I understood, I should be able to reach any node in the cluster,
and that node should be able to "redirect" the request to the "right" node.

Thank you for your answers and your time.

Best Regards.

Olivier.

-- 

Olivier Mallassi
OCTO Technology

50, Avenue des Champs-Elysées
75008 Paris

Mobile: (33) 6 28 70 26 61
Tél: (33) 1 58 56 10 00
Fax: (33) 1 58 56 10 01

http://www.octo.com
Octo Talks! http://blog.octo.com


Re: Feeding in specific Cassandra columns into Hadoop

2010-05-03 Thread Mark Schnitzius
>
> You should test that getSlicePredicate(conf).equals(originalPredicate)
>
>
That's it!  The byte arrays are slightly different after setting it on the
Hadoop config.  Below is a simple test which demonstrates the bug -- it
should print "true" but instead prints "false".  Please let me know if a bug
gets raised so I can track it.


Thanks
Mark


import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.hadoop.conf.Configuration;

import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

/**
 * A class which demonstrates a bug in Cassandra's ConfigHelper.
 */
public class SlicePredicateTest {

    public static void main(String[] args) {
        long columnValue = 127125360L;
        byte[] columnBytes = getBytes(columnValue);
        // column names are a List of byte[] in the 0.6 thrift API
        List<byte[]> columnNames = new ArrayList<byte[]>();
        columnNames.add(columnBytes);
        SlicePredicate originalPredicate = new SlicePredicate();
        originalPredicate.setColumn_names(columnNames);
        Configuration conf = new Configuration();
        ConfigHelper.setSlicePredicate(conf, originalPredicate);

        // should print "true", but prints "false"
        System.out.println(
            ConfigHelper.getSlicePredicate(conf).equals(originalPredicate));
    }

    private static byte[] getBytes(long l) {
        byte[] bytes = new byte[8];
        ByteBuffer.wrap(bytes).putLong(l);
        return bytes;
    }
}
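
Whatever the root cause turns out to be (the JSON-serializer bug Jonathan
mentioned, or lossy String handling of the binary column names in the
Configuration), the general technique for stashing a thrift struct in a
Configuration safely is to serialize it and base64-encode the bytes.  A
sketch of that idea, using a hypothetical helper rather than Cassandra's own
ConfigHelper, with Apache commons-codec (which ships with Hadoop) for the
base64 step:

import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.commons.codec.binary.Base64;
import org.apache.thrift.TDeserializer;
import org.apache.thrift.TException;
import org.apache.thrift.TSerializer;

// Hypothetical helper illustrating a lossless way to put a thrift struct
// into a Configuration string: serialize, then base64-encode.  This shows
// the general technique, not the fix that went into ConfigHelper.
public class PredicateCodec {
    public static String encode(SlicePredicate p) throws TException {
        return new String(Base64.encodeBase64(new TSerializer().serialize(p)));
    }

    public static SlicePredicate decode(String s) throws TException {
        SlicePredicate p = new SlicePredicate();
        new TDeserializer().deserialize(p, Base64.decodeBase64(s.getBytes()));
        return p;
    }
}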


performance tuning - where does the slowness come from?

2010-05-03 Thread Ran Tavory
I'm looking into performance issues on a 0.6.1 cluster. I see two symptoms:
1. Reads and writes are slow
2. One of the hosts is doing a lot of GC.

1 is slow in the sense that in its normal state the cluster used to make around
3-5k reads and writes per second (6-10k operations per second), but now it's
in the order of 200-400 ops per second, sometimes even less.
2 looks like this:
$ tail -f /outbrain/cassandra/log/system.log
 INFO [GC inspection] 2010-05-04 00:42:18,636 GCInspector.java (line 110) GC for ParNew: 672 ms, 166482384 reclaimed leaving 2872087208 used; max is 4432068608
 INFO [GC inspection] 2010-05-04 00:42:28,638 GCInspector.java (line 110) GC for ParNew: 498 ms, 166493352 reclaimed leaving 2836049448 used; max is 4432068608
 INFO [GC inspection] 2010-05-04 00:42:38,640 GCInspector.java (line 110) GC for ParNew: 327 ms, 166091528 reclaimed leaving 2796888424 used; max is 4432068608
... and it goes on and on for hours, no stopping...

The cluster is made of 6 hosts, 3 in one DC and 3 in another.
Each host has 8G RAM.
-Xmx=4G

For some reason, the load isn't distributed evenly b/w the hosts, although
I'm not sure this is the cause for slowness
$ nodetool -h localhost -p 9004 ring
Address           Status  Load       Range                                      Ring
                                     144413773383729447702215082383444206680
192.168.252.99    Up      15.94 GB   66002764663998929243644931915471302076    |<--|
192.168.254.57    Up      19.84 GB   81288739225600737067856268063987022738    |   ^
192.168.254.58    Up      973.78 MB  86999744104066390588161689990810839743    v   |
192.168.252.62    Up      5.18 GB    88308919879653155454332084719458267849    |   ^
192.168.254.59    Up      10.57 GB   142482163220375328195837946953175033937   v   |
192.168.252.61    Up      11.36 GB   144413773383729447702215082383444206680   |-->|

The slow host is 192.168.252.61 and it isn't the most loaded one.

The host is waiting a lot on IO and the load average is usually 6-7
$ w
 00:42:56 up 11 days, 13:22,  1 user,  load average: 6.21, 5.52, 3.93

$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  8 2147844  45744   1816 445738465663252  1  1 96
 2  0
 0  8 2147164  49020   1808 4451596  3850  234558 3372 9957  2  2 78
18  0
 0  3 2146432  45704   1812 4453956  3420  2274   108 3937 10732  2  2
78 19  0
 0  1 2146252  44696   1804 4453436  345  164  1939   294 3647 7833  2  2 78
18  0
 0  1 2145960  46924   1744 4451260  1580  2423   122 4354 14597  2  2
77 18  0
 7  1 2138344  44676952 4504148 1722  403  1722   406 1388  439 87  0 10
 2  0
 7  2 2137248  45652956 4499436 1384  655  1384   658 1356  392 87  0 10
 3  0
 7  1 2135976  46764956 4495020 1366  718  1366   718 1395  380 87  0  9
 4  0
 0  8 2134484  46964956 4489420 1673  555  1814   586 1601 215590 14  2
68 16  0
 0  1 2135388  47444972 4488516  785  833  2390   995 3812 8305  2  2 77
20  0
 0 10 2135164  45928980 4488796  788  543  2275   626 36

So, the host is swapping like crazy...

top shows that it's using a lot of memory. As noted before, -Xmx=4G, and
nothing else seems to be using a lot of memory on the host except for the
cassandra process; however, of the 8G RAM on the host, 92% is used by
cassandra. How's that?
Top shows there's 3.9g Shared and 7.2g Resident and *15.9g Virtual*. Why
does it have 15g virtual? And why 7.2g RES? This could explain the swapping,
and hence the slowness.

$ top
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20281 cassandr  25   0 15.9g 7.2g 3.9g S 33.3 92.6 175:30.27 java

So, can the total memory be controlled?
Or perhaps I'm looking in the wrong direction...

I've looked at all the cassandra JMX counts and nothing seemed suspicious so
far. By suspicious I mean a large number of pending tasks - there were
always very small numbers in each pool.
About read and write latencies, I'm not sure what the normal state is, but
here's an example of what I see on the problematic host:

#mbean = org.apache.cassandra.service:type=StorageProxy:
RecentReadLatencyMicros = 30105.888180684495;
TotalReadLatencyMicros = 78543052801;
TotalWriteLatencyMicros = 4213118609;
RecentWriteLatencyMicros = 1444.4809201925639;
ReadOperations = 4779553;
RangeOperations = 0;
TotalRangeLatencyMicros = 0;
RecentRangeLatencyMicros = NaN;
WriteOperations = 4740093;

And the only pool that I do see some pending tasks is the ROW-READ-STAGE,
but it doesn't look like much, usually around 6-8:
#mbean = org.apache.cassandra.concurrent:type=ROW-READ-STAGE:
ActiveCount = 8;
PendingTasks = 8;
CompletedTasks = 5427955;

Any help finding the solution is appreciated, thanks...

Below are a few more JMXes I collected from the system that may be
interesting.

#mbean = java.lang:type=Memory:
Verbose = false;

HeapMemoryUsage = {
  committed = 3767279616;
  init = 134217728;
  max = 4293656576;
  used = 1237105080;
 };

NonHeapMe

Re: Cassandra and Request routing

2010-05-03 Thread Jonathan Shook
I think you may have found the "eventually" in eventually consistent. With a
replication factor of 1, you are allowing the client thread to go on to the
read on node #2 before the data has been replicated there. Try setting your
replication factor higher for different results.
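
For what it's worth, being explicit about the consistency level on both the
write and the read makes this easy to experiment with.  A minimal sketch
against the 0.6 Thrift API (the host names, keyspace and column family below
are just placeholders based on the default config and the wiki example; with
RF >= 2, QUORUM on both sides guarantees the read sees the write):

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Sketch only: write through node #1, read through node #2, with explicit
// consistency levels.  Host names, keyspace and column family are assumed.
public class TwoNodeRoundTrip {
    public static void main(String[] args) throws Exception {
        TTransport t1 = new TSocket("node1", 9160);
        t1.open();
        Cassandra.Client writer = new Cassandra.Client(new TBinaryProtocol(t1));

        ColumnPath path = new ColumnPath("Standard1");
        path.setColumn("name".getBytes("UTF-8"));
        writer.insert("Keyspace1", "key1", path, "value1".getBytes("UTF-8"),
                      System.currentTimeMillis(), ConsistencyLevel.QUORUM);
        t1.close();

        TTransport t2 = new TSocket("node2", 9160);
        t2.open();
        Cassandra.Client reader = new Cassandra.Client(new TBinaryProtocol(t2));
        ColumnOrSuperColumn cosc =
            reader.get("Keyspace1", "key1", path, ConsistencyLevel.QUORUM);
        System.out.println(new String(cosc.getColumn().getValue(), "UTF-8"));
        t2.close();
    }
}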

Jonathan

On Tue, May 4, 2010 at 12:14 AM, Olivier Mallassi wrote:

> Hi all,
>
> I can't figure out how to deal with request routing...
>
> In fact I have two nodes in the "Test Cluster" and I wrote the client as
> specified here http://wiki.apache.org/cassandra/ThriftExamples#Java. The
> Keyspace is the default one (KeySpace1, replicatorFactor 1..)
> The Seeds are well configured (using the IP) : ie. the cassandra log
> indicates that the servers are up.
>
>  Everything goes
> well if I write and read the data on node#1 for instance. Yet, if I write
> the data on node#1 and then read the same data (using the key) on node#2,
>  no data is found.
>
> Did I miss something?
> As far as I understood, I should be able to reach any nodes from the
> cluster and the node should be able to "redirect" the request to the "good"
> node
>
> Thank you for your answers and your time.
>
> Best Regards.
>
> Olivier.
>
> --
> 
> Olivier Mallassi
> OCTO Technology
> 
> 50, Avenue des Champs-Elysées
> 75008 Paris
>
> Mobile: (33) 6 28 70 26 61
> Tél: (33) 1 58 56 10 00
> Fax: (33) 1 58 56 10 01
>
> http://www.octo.com
> Octo Talks! http://blog.octo.com
>
>
>