undo effect of CASSANDRA-3989

2012-04-10 Thread Radim Kolar
what is the method to undo the effect of CASSANDRA-3989 (too many unnecessary
levels)? running a major compaction or cleanup does nothing.


Re: cassandra and .net

2012-04-10 Thread puneet loya
I checked the Cassandra logs at the DEBUG level.

This is the response I got:

DEBUG [ScheduledTasks:1] 2012-04-10 14:49:29,654 LoadBroadcaster.java (line
86) Disseminating load info ...
DEBUG [Thrift:7] 2012-04-10 14:50:00,820 CustomTThreadPoolServer.java (line
197) Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
at
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
DEBUG [Thrift:7] 2012-04-10 14:50:00,820 ClientState.java (line 104) logged
out: #
On Tue, Apr 10, 2012 at 12:03 PM, Pierre Chalamet wrote:

> **
> Another tentative : try using TFramedTransport with an instance of TSocket
> directly.
>
> - Pierre
> --
> *From: * puneet loya 
> *Date: *Tue, 10 Apr 2012 11:06:22 +0530
> *To: *
> *ReplyTo: * user@cassandra.apache.org
> *Subject: *Re: cassandra and .net
>
> hi,
>
> sorry i posted the port as 7000. I m using 9160 but still has the same
> error.
>
> "Cannot read, Remote side has closed".
> Can u guess whats happening??
>
> On Tue, Apr 10, 2012 at 11:00 AM, Pierre Chalamet wrote:
>
>> hello,
>>
>> 9160 is probably the port to use if you use the default config.
>>
>> - Pierre
>>
>> On Apr 10, 2012, at 7:26 AM, puneet loya  wrote:
>>
>> > using System;
>> > using System.Collections.Generic;
>> > using System.Linq;
>> > using System.Text;
>> > using Thrift.Collections;
>> > using Thrift.Protocol;
>> > using Thrift.Transport;
>> > using Apache.Cassandra;
>> >
>> > namespace ConsoleApplication1
>> > {
>> > class Program
>> > {
>> > static void Main(string[] args)
>> > {
>> > TTransport transport=null;
>> > try
>> > {
>> > transport = new TBufferedTransport(new
>> TSocket("127.0.0.1", 7000));
>> >
>> >
>> > //if(buffered)
>> > //trans = new TBufferedTransport(trans as
>> TStreamTransport);
>> > //if (framed)
>> > //trans = new TFramedTransport(trans);
>> >
>> > TProtocol protocol = new TBinaryProtocol(transport);
>> > Cassandra.Client client = new
>> Cassandra.Client(protocol);
>> >
>> > Console.WriteLine("Opening connection");
>> >
>> > if (!transport.IsOpen)
>> > transport.Open();
>> >
>> > client.describe_keyspace("abc");   //
>> Crashing at this point
>> >
>> >   }
>> > catch (Exception ex)
>> > {
>> > Console.WriteLine(ex.Message);
>> > }
>> > finally
>> > { if(transport!=null)
>> > transport.Close(); }
>> > Console.ReadLine();
>> > }
>> > }
>> > }
>> >
>> > I m trying to interact with cassandra server(database) from .net. For
>> that i have referred two libraries i.e, apacheCassandra08.dll and
>> thrift.dll.. In the following piece of code the connection is getting
>> opened but when i m using client object it is giving an error stating
>> "Cannot read, Remote side has closed".
>> >
>> > Can any1 help me out with this? Has any1 faced the same prob?
>> >
>> >
>>
>
>


Re: cassandra and .net

2012-04-10 Thread puneet loya
Log is showing the following exception

DEBUG [ScheduledTasks:1] 2012-04-10 14:49:29,654 LoadBroadcaster.java (line
86) Disseminating load info ...
DEBUG [Thrift:7] 2012-04-10 14:50:00,820 CustomTThreadPoolServer.java (line
197) Thrift transport error occurred during processing of message.
org.apache.thrift.transport.TTransportException
at
org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
at
org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
at
org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
at
org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
at
org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
at
org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
at
org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)
DEBUG [Thrift:7] 2012-04-10 14:50:00,820 ClientState.java (line 104) logged
out: #

On Tue, Apr 10, 2012 at 11:24 AM, Maki Watanabe wrote:

> Check your cassandra log.
> If you can't find any interesting log, set cassandra log level
> to DEBUG and run your program again.
>
> maki
>
> 2012/4/10 puneet loya :
> > hi,
> >
> > sorry i posted the port as 7000. I m using 9160 but still has the same
> > error.
> >
> > "Cannot read, Remote side has closed".
> > Can u guess whats happening??
> >
> > On Tue, Apr 10, 2012 at 11:00 AM, Pierre Chalamet 
> > wrote:
> >>
> >> hello,
> >>
> >> 9160 is probably the port to use if you use the default config.
> >>
> >> - Pierre
> >>
> >> On Apr 10, 2012, at 7:26 AM, puneet loya  wrote:
> >>
> >> > using System;
> >> > using System.Collections.Generic;
> >> > using System.Linq;
> >> > using System.Text;
> >> > using Thrift.Collections;
> >> > using Thrift.Protocol;
> >> > using Thrift.Transport;
> >> > using Apache.Cassandra;
> >> >
> >> > namespace ConsoleApplication1
> >> > {
> >> > class Program
> >> > {
> >> > static void Main(string[] args)
> >> > {
> >> > TTransport transport=null;
> >> > try
> >> > {
> >> > transport = new TBufferedTransport(new
> >> > TSocket("127.0.0.1", 7000));
> >> >
> >> >
> >> > //if(buffered)
> >> > //trans = new TBufferedTransport(trans as
> >> > TStreamTransport);
> >> > //if (framed)
> >> > //trans = new TFramedTransport(trans);
> >> >
> >> > TProtocol protocol = new TBinaryProtocol(transport);
> >> > Cassandra.Client client = new
> >> > Cassandra.Client(protocol);
> >> >
> >> > Console.WriteLine("Opening connection");
> >> >
> >> > if (!transport.IsOpen)
> >> > transport.Open();
> >> >
> >> > client.describe_keyspace("abc");   //
> >> > Crashing at this point
> >> >
> >> >   }
> >> > catch (Exception ex)
> >> > {
> >> > Console.WriteLine(ex.Message);
> >> > }
> >> > finally
> >> > { if(transport!=null)
> >> > transport.Close(); }
> >> > Console.ReadLine();
> >> > }
> >> > }
> >> > }
> >> >
> >> > I m trying to interact with cassandra server(database) from .net. For
> >> > that i have referred two libraries i.e, apacheCassandra08.dll and
> >> > thrift.dll.. In the following piece of code the connection is getting
> opened
> >> > but when i m using client object it is giving an error stating
> "Cannot read,
> >> > Remote side has closed".
> >> >
> >> > Can any1 help me out with this? Has any1 faced the same prob?
> >> >
> >> >
> >
> >
>


Re: cassandra and .net

2012-04-10 Thread Henrik Schröder
In your code you are using BufferedTransport, but in the Cassandra logs
you're getting errors when it tries to use FramedTransport. If I remember
correctly, BufferedTransport is gone, so you should only use
FramedTransport. Like this:

TTransport transport = new TFramedTransport(new TSocket(host, port));
TProtocol protocol = new TBinaryProtocol(transport);
var client = new Cassandra.Client(protocol);
transport.Open();
client.describe_keyspace("abc");


/Henrik
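
For completeness, a hedged end-to-end sketch of the fix above, assuming the
default Thrift RPC port 9160 mentioned earlier in the thread and the same
Thrift/Apache.Cassandra assemblies as the original program; it is a sketch,
not the thread author's exact code:

using System;
using Thrift.Protocol;
using Thrift.Transport;
using Apache.Cassandra;

class FramedExample
{
    static void Main(string[] args)
    {
        // Framed transport on the Thrift RPC port (9160 by default).
        TTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        TProtocol protocol = new TBinaryProtocol(transport);
        var client = new Cassandra.Client(protocol);
        try
        {
            transport.Open();
            // With a buffered (or wrongly framed) transport this call is where
            // "Cannot read, Remote side has closed" shows up.
            client.describe_keyspace("abc");
            Console.WriteLine("describe_keyspace succeeded");
        }
        finally
        {
            if (transport.IsOpen)
                transport.Close();
        }
    }
}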

On Tue, Apr 10, 2012 at 11:23, puneet loya  wrote:

>
> Log is showing the following exception
>
> DEBUG [ScheduledTasks:1] 2012-04-10 14:49:29,654 LoadBroadcaster.java
> (line 86) Disseminating load info ...
> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 CustomTThreadPoolServer.java
> (line 197) Thrift transport error occurred during processing of message.
> org.apache.thrift.transport.TTransportException
> at
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>  at
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
> at
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>  at
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>  at
> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
> at
> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>  at java.lang.Thread.run(Unknown Source)
> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 ClientState.java (line 104)
> logged out: #
>
> On Tue, Apr 10, 2012 at 11:24 AM, Maki Watanabe 
> wrote:
>
>> Check your cassandra log.
>> If you can't find any interesting log, set cassandra log level
>> to DEBUG and run your program again.
>>
>> maki
>>
>> 2012/4/10 puneet loya :
>> > hi,
>> >
>> > sorry i posted the port as 7000. I m using 9160 but still has the same
>> > error.
>> >
>> > "Cannot read, Remote side has closed".
>> > Can u guess whats happening??
>> >
>> > On Tue, Apr 10, 2012 at 11:00 AM, Pierre Chalamet 
>> > wrote:
>> >>
>> >> hello,
>> >>
>> >> 9160 is probably the port to use if you use the default config.
>> >>
>> >> - Pierre
>> >>
>> >> On Apr 10, 2012, at 7:26 AM, puneet loya  wrote:
>> >>
>> >> > using System;
>> >> > using System.Collections.Generic;
>> >> > using System.Linq;
>> >> > using System.Text;
>> >> > using Thrift.Collections;
>> >> > using Thrift.Protocol;
>> >> > using Thrift.Transport;
>> >> > using Apache.Cassandra;
>> >> >
>> >> > namespace ConsoleApplication1
>> >> > {
>> >> > class Program
>> >> > {
>> >> > static void Main(string[] args)
>> >> > {
>> >> > TTransport transport=null;
>> >> > try
>> >> > {
>> >> > transport = new TBufferedTransport(new
>> >> > TSocket("127.0.0.1", 7000));
>> >> >
>> >> >
>> >> > //if(buffered)
>> >> > //trans = new TBufferedTransport(trans as
>> >> > TStreamTransport);
>> >> > //if (framed)
>> >> > //trans = new TFramedTransport(trans);
>> >> >
>> >> > TProtocol protocol = new TBinaryProtocol(transport);
>> >> > Cassandra.Client client = new
>> >> > Cassandra.Client(protocol);
>> >> >
>> >> > Console.WriteLine("Opening connection");
>> >> >
>> >> > if (!transport.IsOpen)
>> >> > transport.Open();
>> >> >
>> >> > client.describe_keyspace("abc");   //
>> >> > Crashing at this point
>> >> >
>> >> >   }
>> >> > catch (Exception ex)
>> >> > {
>> >> > Console.WriteLine(ex.Message);
>> >> > }
>> >> > finally
>> >> > { if(transport!=null)
>> >> > transport.Close(); }
>> >> > Console.ReadLine();
>> >> > }
>> >> > }
>> >> > }
>> >> >
>> >> > I m trying to interact with cassandra server(database) from .net. For
>> >> > that i have referred two libraries i.e, apacheCassandra08.dll and
>> >> > thrift.dll.. In the following piece of code the connection is
>> getting opened
>> >> > but when i m using client object it is giving an error stating
>> "Cannot read,
>> >> > Remote side has closed".
>> >> >
>> >> > Can any1 help me out with this? Has any1 faced the same prob?
>> >> >
>> >> >
>> >
>> >
>>
>
>


Re: Listen and RPC address

2012-04-10 Thread aaron morton
Schema may not be fully propagated. 
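
One hedged way to check, assuming cassandra-cli is available: run the command
below against a node; it lists the schema versions reported across the ring,
and more than one version means the schema has not fully propagated yet.

describe cluster;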

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 9/04/2012, at 10:18 PM, Rishabh Agrawal wrote:

> Thanks, it just worked. Though I am able to load sstables but I get following 
> error:
>  
> ERROR 15:44:23,557 Error in ThreadPoolExecutor
> java.lang.IllegalArgumentException: Unknown CF 1000
>  
> what could be the reason.
> From: aaron morton [mailto:aa...@thelastpickle.com] 
> Sent: Monday, April 09, 2012 3:30 PM
> To: user@cassandra.apache.org
> Subject: Re: Listen and RPC address
>  
> Background: "Configuration" section 
> http://www.datastax.com/dev/blog/bulk-loading
>  
> I *think* you can get by with changing the rpc_port and storage_port for the 
> bulkl loader config. If that does not work create another loop back interface 
> and bind the bulk loader to it…
>  
> sudo ifconfig lo0 alias 127.0.0.2 up
>  
> Cheers
>  
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>  
> On 9/04/2012, at 8:31 PM, Rishabh Agrawal wrote:
> 
> 
> Hello,
> 
> I have three nodes cluster with having listen address as xxx.xx.1.101, 
> xxx.xx.1.102, xxx.xx.1.103 and Rpc address to be xxx.xx.1.111, xxx.xx.1.112, 
> xxx.xx.1.113. rpc_port and storage_port are 9160 and 7000 respectively.
> 
> Now when I run sstableloader tool I get following error:
> 
> org.apache.cassandra.config.ConfigurationException: /xxx.xx.1.101:7000 is in 
> use by another process.  Change listen_address:storage_port in cassandra.yaml 
> to values that do not conflict with other service.
> 
> Can someone help me with what am I missing in configuration.
> 
>  
> Thanks and Regards
> 
> Rishabh Agarawal
> 
>  
> 



Re: composite columns vs super columns

2012-04-10 Thread aaron morton
Super columns: a top-level column holds a list of sub columns.
e.g.
row key: foo
column: bar
sub columns:
baz = qux

Composite columns: the data type is defined by combining multiple types;
instances of the type are compared by comparing each component in turn.
e.g.
row key: foo
column: (bar, baz) = qux

In general, prefer composite columns over super columns.
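
A hedged cassandra-cli sketch of the composite approach (the column family
name and component types are placeholders, not from the original question):

create column family Timeline
  with comparator = 'CompositeType(UTF8Type, UTF8Type)'
  and key_validation_class = UTF8Type
  and default_validation_class = UTF8Type;

Each column name is then a (bar, baz)-style pair, and columns sort by the
first component, then the second.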

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/04/2012, at 4:39 AM, Michael Cherkasov wrote:

> Hi all,
> 
> Can someone describe difference between super and composite columns?
> 
> And describe when each type is preferred to use?
> 
> Thanks.



Re: Request timeout and host marked down

2012-04-10 Thread aaron morton
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> ... 31 more
This looks like a client side timeout to me. 

AFAIK it will use this 
http://rantav.github.com/hector//source/content/API/core/1.0-1/me/prettyprint/cassandra/service/CassandraHost.html#getCassandraThriftSocketTimeout()

if it's > 0; otherwise the value of the CASSANDRA_THRIFT_SOCKET_TIMEOUT JVM param;
otherwise 0, I think.
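
A hedged Hector sketch of where that client-side timeout is set; the host,
cluster name, and the 15000 ms value are placeholders, and the setter is the
counterpart of the getter linked above:

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.factory.HFactory;

public class TimeoutConfigExample {
    public static void main(String[] args) {
        CassandraHostConfigurator conf = new CassandraHostConfigurator("127.0.0.1:9160");
        // Thrift socket read timeout in milliseconds; when left at 0 Hector
        // falls back to the CASSANDRA_THRIFT_SOCKET_TIMEOUT JVM property.
        conf.setCassandraThriftSocketTimeout(15000);
        Cluster cluster = HFactory.getOrCreateCluster("TestCluster", conf);
    }
}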

Hector is one of the many things I am not an expert on. Try the hector user 
list if you are still having problems. 


> 
> [cassy@s2.dsat4 ~]$  ~/bin/nodetool -h localhost tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         3         3      414129625         0                 0
Looks fine. 

Hope that helps.


-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/04/2012, at 8:08 AM, Daning Wang wrote:

> Thanks Aaron! Here is the exception, is that the timeout between nodes? any 
> parameter I can change to reduce timeout?
> 
> me.prettyprint.hector.api.exceptions.HectorTransportException: 
> org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> at 
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:33)
> at 
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:130)
> at 
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:100)
> at 
> me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
> at 
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:246)
> at 
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
> at me.prettyprint.cassandra.model.CqlQuery.execute(CqlQuery.java:99)
> at 
> com.netseer.cassandra.cache.dao.CacheReader.getRows(CacheReader.java:267)
> at 
> com.netseer.cassandra.cache.dao.CacheReader.getCache0(CacheReader.java:55)
> at 
> com.netseer.cassandra.cache.dao.CacheDao.getCaches(CacheDao.java:85)
> at com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:71)
> at 
> com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:149)
> at 
> com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:55)
> at 
> com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:28)
> at 
> com.netseer.dsat.cache.CassandraDSATCacheImpl.get(CassandraDSATCacheImpl.java:62)
> at 
> com.netseer.dsat.cache.CassandraDSATCacheImpl.getTimedValue(CassandraDSATCacheImpl.java:144)
> at 
> com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:427)
> at 
> com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:423)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.thrift.transport.TTransportException: 
> java.net.SocketTimeoutException: Read timed out
> at 
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
> at 
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at 
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql_query(Cassandra.java:1698)
> at 
> org.apache.cassandra.thrift.Cassandra$Client.execute_cql_query(Cassandra.java:1682)
> at 
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:106)
> ... 21 more
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at 
> o

Re: undo effect of CASSANDRA-3989

2012-04-10 Thread Watanabe Maki
Plz refer to the following thread:

http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/cleanup-crashing-with-quot-java-util-concurrent-ExecutionException-java-lang-ArrayIndexOutOfBoundsEx-td7371682.html

maki

From iPhone


On 2012/04/10, at 17:21, Radim Kolar  wrote:

> what is method for undo effect of CASSANDRA-3989 (too many unnecessary 
> levels)? running major compact or cleanup does nothing.


Re: cassandra and .net

2012-04-10 Thread puneet loya
thanks  :) :) it works :)

On Tue, Apr 10, 2012 at 3:07 PM, Henrik Schröder  wrote:

> In your code you are using BufferedTransport, but in the Cassandra logs
> you're getting errors when it tries to use FramedTransport. If I remember
> correctly, BufferedTransport is gone, so you should only use
> FramedTransport. Like this:
>
> TTransport transport = new TFramedTransport(new TSocket(host, port));
>
> TProtocol protocol = new TBinaryProtocol(transport);
> var client = new Cassandra.Client(protocol);
> transport.Open();
> client.describe_keyspace("abc");
>
>
> /Henrik
>
>
> On Tue, Apr 10, 2012 at 11:23, puneet loya  wrote:
>
>>
>> Log is showing the following exception
>>
>> DEBUG [ScheduledTasks:1] 2012-04-10 14:49:29,654 LoadBroadcaster.java
>> (line 86) Disseminating load info ...
>> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 CustomTThreadPoolServer.java
>> (line 197) Thrift transport error occurred during processing of message.
>> org.apache.thrift.transport.TTransportException
>> at
>> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>  at
>> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>> at
>> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>>  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>> at
>> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>  at
>> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>> at
>> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>  at
>> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
>> at
>> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>>  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
>> Source)
>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>  at java.lang.Thread.run(Unknown Source)
>> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 ClientState.java (line 104)
>> logged out: #
>>
>> On Tue, Apr 10, 2012 at 11:24 AM, Maki Watanabe 
>> wrote:
>>
>>> Check your cassandra log.
>>> If you can't find any interesting log, set cassandra log level
>>> to DEBUG and run your program again.
>>>
>>> maki
>>>
>>> 2012/4/10 puneet loya :
>>> > hi,
>>> >
>>> > sorry i posted the port as 7000. I m using 9160 but still has the same
>>> > error.
>>> >
>>> > "Cannot read, Remote side has closed".
>>> > Can u guess whats happening??
>>> >
>>> > On Tue, Apr 10, 2012 at 11:00 AM, Pierre Chalamet >> >
>>> > wrote:
>>> >>
>>> >> hello,
>>> >>
>>> >> 9160 is probably the port to use if you use the default config.
>>> >>
>>> >> - Pierre
>>> >>
>>> >> On Apr 10, 2012, at 7:26 AM, puneet loya 
>>> wrote:
>>> >>
>>> >> > using System;
>>> >> > using System.Collections.Generic;
>>> >> > using System.Linq;
>>> >> > using System.Text;
>>> >> > using Thrift.Collections;
>>> >> > using Thrift.Protocol;
>>> >> > using Thrift.Transport;
>>> >> > using Apache.Cassandra;
>>> >> >
>>> >> > namespace ConsoleApplication1
>>> >> > {
>>> >> > class Program
>>> >> > {
>>> >> > static void Main(string[] args)
>>> >> > {
>>> >> > TTransport transport=null;
>>> >> > try
>>> >> > {
>>> >> > transport = new TBufferedTransport(new
>>> >> > TSocket("127.0.0.1", 7000));
>>> >> >
>>> >> >
>>> >> > //if(buffered)
>>> >> > //trans = new TBufferedTransport(trans
>>> as
>>> >> > TStreamTransport);
>>> >> > //if (framed)
>>> >> > //trans = new TFramedTransport(trans);
>>> >> >
>>> >> > TProtocol protocol = new TBinaryProtocol(transport);
>>> >> > Cassandra.Client client = new
>>> >> > Cassandra.Client(protocol);
>>> >> >
>>> >> > Console.WriteLine("Opening connection");
>>> >> >
>>> >> > if (!transport.IsOpen)
>>> >> > transport.Open();
>>> >> >
>>> >> > client.describe_keyspace("abc");   //
>>> >> > Crashing at this point
>>> >> >
>>> >> >   }
>>> >> > catch (Exception ex)
>>> >> > {
>>> >> > Console.WriteLine(ex.Message);
>>> >> > }
>>> >> > finally
>>> >> > { if(transport!=null)
>>> >> > transport.Close(); }
>>> >> > Console.ReadLine();
>>> >> > }
>>> >> > }
>>> >> > }
>>> >> >
>>> >> > I m trying to interact with cassandra server(database) from .net.
>>> For
>>> >> > that i have referred two libraries i.e, apacheCassandra08.dll and
>>> >> > thrift.dll.. In the following piece of code the connection is
>>> getting opened
>>> >> > but when i m using client object it is giving an error stating
>>> "Cannot read,
>>> >> > Rem

Re: cassandra and .net

2012-04-10 Thread Jake Luciani
You can also look at using a .net client wrapper like
https://github.com/managedfusion/fluentcassandra

On Tue, Apr 10, 2012 at 8:06 AM, puneet loya  wrote:

> thankk  :) :) it works :)
>
>
> On Tue, Apr 10, 2012 at 3:07 PM, Henrik Schröder wrote:
>
>> In your code you are using BufferedTransport, but in the Cassandra logs
>> you're getting errors when it tries to use FramedTransport. If I remember
>> correctly, BufferedTransport is gone, so you should only use
>> FramedTransport. Like this:
>>
>> TTransport transport = new TFramedTransport(new TSocket(host, port));
>>
>> TProtocol protocol = new TBinaryProtocol(transport);
>> var client = new Cassandra.Client(protocol);
>> transport.Open();
>> client.describe_keyspace("abc");
>>
>>
>> /Henrik
>>
>>
>> On Tue, Apr 10, 2012 at 11:23, puneet loya  wrote:
>>
>>>
>>> Log is showing the following exception
>>>
>>> DEBUG [ScheduledTasks:1] 2012-04-10 14:49:29,654 LoadBroadcaster.java
>>> (line 86) Disseminating load info ...
>>> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 CustomTThreadPoolServer.java
>>> (line 197) Thrift transport error occurred during processing of message.
>>> org.apache.thrift.transport.TTransportException
>>> at
>>> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
>>> at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>>  at
>>> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
>>> at
>>> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
>>>  at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
>>> at
>>> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
>>>  at
>>> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
>>> at
>>> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
>>>  at
>>> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2877)
>>> at
>>> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:187)
>>>  at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown
>>> Source)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>>>  at java.lang.Thread.run(Unknown Source)
>>> DEBUG [Thrift:7] 2012-04-10 14:50:00,820 ClientState.java (line 104)
>>> logged out: #
>>>
>>> On Tue, Apr 10, 2012 at 11:24 AM, Maki Watanabe >> > wrote:
>>>
 Check your cassandra log.
 If you can't find any interesting log, set cassandra log level
 to DEBUG and run your program again.

 maki

 2012/4/10 puneet loya :
 > hi,
 >
 > sorry i posted the port as 7000. I m using 9160 but still has the same
 > error.
 >
 > "Cannot read, Remote side has closed".
 > Can u guess whats happening??
 >
 > On Tue, Apr 10, 2012 at 11:00 AM, Pierre Chalamet <
 pie...@chalamet.net>
 > wrote:
 >>
 >> hello,
 >>
 >> 9160 is probably the port to use if you use the default config.
 >>
 >> - Pierre
 >>
 >> On Apr 10, 2012, at 7:26 AM, puneet loya 
 wrote:
 >>
 >> > using System;
 >> > using System.Collections.Generic;
 >> > using System.Linq;
 >> > using System.Text;
 >> > using Thrift.Collections;
 >> > using Thrift.Protocol;
 >> > using Thrift.Transport;
 >> > using Apache.Cassandra;
 >> >
 >> > namespace ConsoleApplication1
 >> > {
 >> > class Program
 >> > {
 >> > static void Main(string[] args)
 >> > {
 >> > TTransport transport=null;
 >> > try
 >> > {
 >> > transport = new TBufferedTransport(new
 >> > TSocket("127.0.0.1", 7000));
 >> >
 >> >
 >> > //if(buffered)
 >> > //trans = new TBufferedTransport(trans
 as
 >> > TStreamTransport);
 >> > //if (framed)
 >> > //trans = new TFramedTransport(trans);
 >> >
 >> > TProtocol protocol = new
 TBinaryProtocol(transport);
 >> > Cassandra.Client client = new
 >> > Cassandra.Client(protocol);
 >> >
 >> > Console.WriteLine("Opening connection");
 >> >
 >> > if (!transport.IsOpen)
 >> > transport.Open();
 >> >
 >> > client.describe_keyspace("abc");   //
 >> > Crashing at this point
 >> >
 >> >   }
 >> > catch (Exception ex)
 >> > {
 >> > Console.WriteLine(ex.Message);
 >> > }
 >> > finally
 >> > { if(transport!=null)
 >> > transport.Close(); }
 >> > Console.ReadLine();
 >> > }
 >> > }
 >> > }
 >> >
 >> > I m trying t

json2sstable error: Can not write to the Standard columns Super Column Family

2012-04-10 Thread Aliou SOW

Dear All,

I am new
to Cassandra 1.0.8, and I use the tool json2sstable for bulk insert, but I
still have the error:

  

Repair Process Taking too long

2012-04-10 Thread Frank Ng
Hello,

I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours to
complete.  Is it normal for the repair process to take this long?  I wonder
if it's because I am using the ext3 file system.

thanks


json2sstable error: Can not write to the Standard columns Super Column Family

2012-04-10 Thread Aliou SOW



Dear
All,



I am new to Cassandra 1.0.8, and I use the tool json2sstable for bulk
insert, but I still have the error:



java.lang.RuntimeException: Can not write to the Standard columns Super Column Family.
 at org.apache.cassandra.tools.SSTableImport.importSorted(SSTableImport.java:368)
 at org.apache.cassandra.tools.SSTableImport.importJson(SSTableImport.java:255)
 at org.apache.cassandra.tools.SSTableImport.main(SSTableImport.java:479)

ERROR: Can not write to the Standard columns Super Column Family.

Before that, I first created a keyspace "testkeyspace", then a column family
"testCF" in which I defined the key as varchar. Here is the structure of my
JSON file:

 

{
    "rs12354060": {
        "X1714T": 1.0,
        "X1905T": 1.0,
        ...
        "X3155T": 1.0
    }
    "rs3115850": {
        "X1714T": 0938,
        "X1905T": 0879,
        ...
        "X3155T": 0822
    }
}
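
A hedged aside on the tool's expectations: if I recall SSTableImport
correctly, a standard column family row is given as a JSON array of
[name, value, timestamp] entries (the format sstable2json produces), and a
map of name/value pairs like the one above is taken to be a super column
family layout, which would explain the error. Invocation looks roughly like
this, where data.json and the output SSTable path are placeholders:

bin/json2sstable -K testkeyspace -c testCF data.json /path/to/testkeyspace-testCF-hc-1-Data.db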

 

Please help, and if there is another, easier way to do bulk inserts, please
let me know.

Kind regards, Aliou.

 

 

  

Why so many SSTables?

2012-04-10 Thread Romain HARDOUIN
Hi,

We are surprised by the number of files generated by Cassandra.
Our cluster consists of 9 nodes and each node handles about 35 GB. 
We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
We have 30 CF.

We've got roughly 45,000 files under the keyspace directory on each node:
ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
44372

The biggest CF is spread over 38,000 files:
ls -l Documents* | wc -l
37870

ls -l Documents*-Data.db | wc -l
7586

Many SSTable are about 4 MB:

19 MB -> 1 SSTable
12 MB -> 2 SSTables
11 MB -> 2 SSTables
9.2 MB -> 1 SSTable
7.0 MB to 7.9 MB -> 6 SSTables
6.0 MB to 6.4 MB -> 6 SSTables
5.0 MB to 5.4 MB -> 4 SSTables
4.0 MB to 4.7 MB -> 7139 SSTables
3.0 MB to 3.9 MB -> 258 SSTables
2.0 MB to 2.9 MB -> 35 SSTables
1.0 MB to 1.9 MB -> 13 SSTables
87 KB to  994 KB -> 87 SSTables
0 KB -> 32 SSTables

FYI here is CF information:

ColumnFamily: Documents
  Key Validation Class: org.apache.cassandra.db.marshal.BytesType
  Default column value validator: 
org.apache.cassandra.db.marshal.BytesType
  Columns sorted by: org.apache.cassandra.db.marshal.BytesType
  Row cache size / save period in seconds / keys to save : 0.0/0/all
  Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
  Key cache size / save period in seconds: 20.0/14400
  GC grace seconds: 1728000
  Compaction min/max thresholds: 4/32
  Read repair chance: 1.0
  Replicate on write: true
  Column Metadata:
Column Name: refUUID (7265664944)
  Validation Class: org.apache.cassandra.db.marshal.BytesType
  Index Name: refUUID_idx
  Index Type: KEYS
  Compaction Strategy: 
org.apache.cassandra.db.compaction.LeveledCompactionStrategy
  Compression Options:
sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor

Is it a bug? If not, how can we tune Cassandra to avoid this?

Regards,

Romain

Trouble with wrong data

2012-04-10 Thread Alain RODRIGUEZ
Hi, I'm experiencing a strange and very annoying phenomenon.

I had a problem with the commit log size, which grew too much and filled one
of the hard disks on all my nodes at almost the same time (2 nodes only,
RF=2, so the 2 nodes behave exactly the same way).

My data are mounted on another partition that was not full. However, after
recovering from this issue (freeing some space and fixing the value of
"commitlog_total_space_in_mb" in cassandra.yaml) I realized that all
statistics were destroyed. I have had bad values on every single counter
since I started using them (September)!

Has anyone experienced something similar, or have any clue about this?

Do you need more information?

Alain


Re: Trouble with wrong data

2012-04-10 Thread Alain RODRIGUEZ
By the way, I am using Cassandra 1.0.7, CL = ONE (R/W), RF = 2, 2 EC2
c1.medium nodes cluster

Alain

2012/4/10 Alain RODRIGUEZ 

> Hi, I'm experimenting a strange and very annoying phenomena.
>
> I had a problem with the commit log size which grew too much and full one
> of the hard disks in all my nodes almost at the same time (2 nodes only,
> RF=2, so the 2 nodes are behaving exactly in the same way)
>
> My data are mounted in an other partition that was not full. However after
> recovering from this issue (freeing some space and fixing the value of
>  "commitlog_total_space_in_mb" in cassandra.yaml) I realized that all
> statistics were all destroyed. I have bad values on every single counter
> since I start using them (september) !
>
> Does anyone experimented something similar or have any clue on this ?
>
> Do you need more information ?
>
> Alain
>


Pycassa help?

2012-04-10 Thread Mucklow, Blaine (GE Energy)
Hi all,

I had a lot of success using Java+Hector, but was trying to migrate to pycassa 
and was having some 'simple' issues.  What I am trying to do is create a column 
family where the following occurs:
KEY -> String
ColumnName -> LongType
ColumnValue -> DoubleType

Basically this is time series data for an 'id'.  I had no problems with this in 
Hector but am struggling in pycassa.  I have the following code snippets:

system.create_column_family(keyspace=keyspace,name=columnfamily,comparator_type=BytesType)
system.close()
//Iterative loop to build my time series data
columns[long(ms_since_epoch)] = float(measurement)
//After loop
col_family.insert(key=id,columns=columns)

I then get the following error:
TypeError: A str or unicode value was expected, but long was received instead 
(1321574400)

Any thoughts?

Thanks,

Blaine Mucklow
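
A hedged pycassa sketch of the likely fix (keyspace, column family, and
values are placeholders): declare the comparator and validator types when
creating the column family, so pycassa packs long column names and double
values instead of expecting strings.

import pycassa
from pycassa.system_manager import (SystemManager, LONG_TYPE, DOUBLE_TYPE,
                                    UTF8_TYPE)

sys_mgr = SystemManager('127.0.0.1:9160')
sys_mgr.create_column_family('Keyspace1', 'Measurements',
                             comparator_type=LONG_TYPE,             # column names are longs
                             default_validation_class=DOUBLE_TYPE,  # values are doubles
                             key_validation_class=UTF8_TYPE)        # row keys are strings
sys_mgr.close()

pool = pycassa.ConnectionPool('Keyspace1')
cf = pycassa.ColumnFamily(pool, 'Measurements')
# ms_since_epoch -> measurement, packed as LongType/DoubleType automatically.
cf.insert('sensor-1', {1321574400000: 42.5})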


Re: Repair Process Taking too long

2012-04-10 Thread Igor

Hi

You can check with nodetool which part of the repair process is slow -
network streams or validation compactions. Use nodetool netstats or
compactionstats.
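
Hedged examples of those two checks (the host is a placeholder):

nodetool -h localhost netstats
nodetool -h localhost compactionstats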


On 04/10/2012 05:16 PM, Frank Ng wrote:

Hello,

I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours 
to complete.  Is it normal for the repair process to take this long?  
I wonder if it's because I am using the ext3 file system.


thanks




Re: Resident size growth

2012-04-10 Thread ruslan usifov
mmap doesn't depend on jna

2012/4/9 Jeremiah Jordan 

>  He says he disabled JNA.  You can't mmap without JNA can you?
>
>  On Apr 9, 2012, at 4:52 AM, aaron morton wrote:
>
>  see http://wiki.apache.org/cassandra/FAQ#mmap
>
>  Cheers
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
>  On 9/04/2012, at 5:09 AM, ruslan usifov wrote:
>
> mmap sstables? It's normal
>
> 2012/4/5 Omid Aladini 
>
>> Hi,
>>
>> I'm experiencing a steady growth in resident size of JVM running
>> Cassandra 1.0.7. I disabled JNA and off-heap row cache, tested with
>> and without mlockall disabling paging, and upgraded to JRE 1.6.0_31 to
>> prevent this bug [1] to leak memory. Still JVM's resident set size
>> grows steadily. A process with Xmx=2048M has grown to 6GB resident
>> size and one with Xmx=8192M to 16GB in a few hours and increasing. Has
>> anyone experienced this? Any idea how to deal with this issue?
>>
>> Thanks,
>> Omid
>>
>> [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129
>>
>
>
>
>


Re: Resident size growth

2012-04-10 Thread ruslan usifov
Also, I suggest setting disk_access_mode: mmap_index_only
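
A hedged sketch of where that setting lives in cassandra.yaml:

# map only index files into memory; data files are read with standard I/O
disk_access_mode: mmap_index_only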

2012/4/9 Omid Aladini 

> Thanks. Yes it's due to mmappd SSTables pages that count as resident size.
>
> Jeremiah: mmap isn't through JNA, it's via java.nio.MappedByteBuffer I
> think.
>
> -- Omid
>
> On Mon, Apr 9, 2012 at 4:15 PM, Jeremiah Jordan
>  wrote:
> > He says he disabled JNA.  You can't mmap without JNA can you?
> >
> > On Apr 9, 2012, at 4:52 AM, aaron morton wrote:
> >
> > see http://wiki.apache.org/cassandra/FAQ#mmap
> >
> > Cheers
> >
> > -
> > Aaron Morton
> > Freelance Developer
> > @aaronmorton
> > http://www.thelastpickle.com
> >
> > On 9/04/2012, at 5:09 AM, ruslan usifov wrote:
> >
> > mmap sstables? It's normal
> >
> > 2012/4/5 Omid Aladini 
> >>
> >> Hi,
> >>
> >> I'm experiencing a steady growth in resident size of JVM running
> >> Cassandra 1.0.7. I disabled JNA and off-heap row cache, tested with
> >> and without mlockall disabling paging, and upgraded to JRE 1.6.0_31 to
> >> prevent this bug [1] to leak memory. Still JVM's resident set size
> >> grows steadily. A process with Xmx=2048M has grown to 6GB resident
> >> size and one with Xmx=8192M to 16GB in a few hours and increasing. Has
> >> anyone experienced this? Any idea how to deal with this issue?
> >>
> >> Thanks,
> >> Omid
> >>
> >> [1] http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=7066129
> >
> >
> >
> >
>


Re: Pycassa help?

2012-04-10 Thread Mucklow, Blaine (GE Energy)
Turns out if you read to the bottom of the tutorial you find answers.
Please disregard this mail, I found my answer.

:)

On 4/10/12 10:41 AM, "Mucklow, Blaine (GE Energy)" 
wrote:

>Hi all,
>
>I had a lot of success using Java+Hector, but was trying to migrate to
>pycassa and was having some 'simple' issues.  What I am trying to do is
>create a column family where the following occurs:
>KEY-> String ColumnName-> LongType ColumnValue-> DoubleType
>
>Basically this is time series data for an 'id'.  I had no problems with
>this in Hector but am struggling in pycassa.  I have the following code
>snippets:
>
>system.create_column_family(keyspace=keyspace,name=columnfamily,comparator
>_type=BytesType)
>system.close()
>//Iterative loop to build my time series data
>columns[long(ms_since_epoch)] = float(measurement)
>//After loop
>col_family.insert(key=id,columns=columns)
>
>I then get the following error:
>TypeError: A str or unicode value was expected, but long was received
>instead (1321574400)
>
>Any thoughts?
>
>Thanks,
>
>Blaine Mucklow



Re: Why so many SSTables?

2012-04-10 Thread Jonathan Ellis
LCS explicitly tries to keep sstables under 5MB to minimize extra work
done by compacting data that didn't really overlap across different
levels.
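
If the running version exposes the sstable_size_in_mb compaction strategy
option, the target size can be raised per column family; the cassandra-cli
syntax below is a hedged sketch from memory and should be checked against
"help update column family;" on the version in use:

update column family Documents
  with compaction_strategy = 'LeveledCompactionStrategy'
  and compaction_strategy_options = {sstable_size_in_mb: 10};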

On Tue, Apr 10, 2012 at 9:24 AM, Romain HARDOUIN
 wrote:
>
> Hi,
>
> We are surprised by the number of files generated by Cassandra.
> Our cluster consists of 9 nodes and each node handles about 35 GB.
> We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
> We have 30 CF.
>
> We've got roughly 45,000 files under the keyspace directory on each node:
> ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
> 44372
>
> The biggest CF is spread over 38,000 files:
> ls -l Documents* | wc -l
> 37870
>
> ls -l Documents*-Data.db | wc -l
> 7586
>
> Many SSTable are about 4 MB:
>
> 19 MB -> 1 SSTable
> 12 MB -> 2 SSTables
> 11 MB -> 2 SSTables
> 9.2 MB -> 1 SSTable
> 7.0 MB to 7.9 MB -> 6 SSTables
> 6.0 MB to 6.4 MB -> 6 SSTables
> 5.0 MB to 5.4 MB -> 4 SSTables
> 4.0 MB to 4.7 MB -> 7139 SSTables
> 3.0 MB to 3.9 MB -> 258 SSTables
> 2.0 MB to 2.9 MB -> 35 SSTables
> 1.0 MB to 1.9 MB -> 13 SSTables
> 87 KB to  994 KB -> 87 SSTables
> 0 KB -> 32 SSTables
>
> FYI here is CF information:
>
> ColumnFamily: Documents
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: org.apache.cassandra.db.marshal.BytesType
>   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds / keys to save : 0.0/0/all
>   Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>   Key cache size / save period in seconds: 20.0/14400
>   GC grace seconds: 1728000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Column Metadata:
>     Column Name: refUUID (7265664944)
>       Validation Class: org.apache.cassandra.db.marshal.BytesType
>       Index Name: refUUID_idx
>       Index Type: KEYS
>   Compaction Strategy:
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy
>   Compression Options:
>     sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>
> Is it a bug? If not, how can we tune Cassandra to avoid this?
>
> Regards,
>
> Romain



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com


Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I think both processes are taking a while.  When it starts up, netstats and
compactionstats show nothing.  Anyone out there successfully using ext3 and
their repair processes are faster than this?

On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:

> Hi
>
> You can check with nodetool  which part of repair process is slow -
> network streams or verify compactions. use nodetool netstats or
> compactionstats.
>
>
> On 04/10/2012 05:16 PM, Frank Ng wrote:
>
>> Hello,
>>
>> I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours to
>> complete.  Is it normal for the repair process to take this long?  I wonder
>> if it's because I am using the ext3 file system.
>>
>> thanks
>>
>
>


Re: Repair Process Taking too long

2012-04-10 Thread Igor

On 04/10/2012 07:16 PM, Frank Ng wrote:

Short answer - yes.
But you are asking the wrong question.

I think both processes are taking a while.  When it starts up, 
netstats and compactionstats show nothing.  Anyone out there 
successfully using ext3 and their repair processes are faster than this?


On Tue, Apr 10, 2012 at 10:42 AM, Igor > wrote:


Hi

You can check with nodetool  which part of repair process is slow
- network streams or verify compactions. use nodetool netstats or
compactionstats.


On 04/10/2012 05:16 PM, Frank Ng wrote:

Hello,

I am on Cassandra 1.0.7.  My repair processes are taking over
30 hours to complete.  Is it normal for the repair process to
take this long?  I wonder if it's because I am using the ext3
file system.

thanks







Re: Repair Process Taking too long

2012-04-10 Thread Jonathan Rhone
Data size, number of nodes, RF?

Are you using size-tiered compaction on any of the column families that
hold a lot of your data?

Do your cassandra logs say you are streaming a lot of ranges?
zgrep -E "(Performing streaming repair|out of sync)"


On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:

>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>
> Short answer - yes.
> But you are asking wrong question.
>
>
> I think both processes are taking a while.  When it starts up, netstats
> and compactionstats show nothing.  Anyone out there successfully using ext3
> and their repair processes are faster than this?
>
>  On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:
>
>> Hi
>>
>> You can check with nodetool  which part of repair process is slow -
>> network streams or verify compactions. use nodetool netstats or
>> compactionstats.
>>
>>
>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>>
>>> Hello,
>>>
>>> I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours
>>> to complete.  Is it normal for the repair process to take this long?  I
>>> wonder if it's because I am using the ext3 file system.
>>>
>>> thanks
>>>
>>
>>
>
>


-- 
Jonathan Rhone
Software Engineer

*TinyCo*
800 Market St., Fl 6
San Francisco, CA 94102
www.tinyco.com


Re: Repair Process Taking too long

2012-04-10 Thread Igor

also - JVM heap size, and anything related to memory pressure

On 04/10/2012 07:56 PM, Jonathan Rhone wrote:

Data size, number of nodes, RF?

Are you using size-tiered compaction on any of the column families 
that hold a lot of your data?


Do your cassandra logs say you are streaming a lot of ranges?
zgrep -E "(Performing streaming repair|out of sync)"


On Tue, Apr 10, 2012 at 9:45 AM, Igor > wrote:


On 04/10/2012 07:16 PM, Frank Ng wrote:

Short answer - yes.
But you are asking wrong question.



I think both processes are taking a while.  When it starts up,
netstats and compactionstats show nothing.  Anyone out there
successfully using ext3 and their repair processes are faster
than this?

On Tue, Apr 10, 2012 at 10:42 AM, Igor <i...@4friends.od.ua> wrote:

Hi

You can check with nodetool  which part of repair process is
slow - network streams or verify compactions. use nodetool
netstats or compactionstats.


On 04/10/2012 05:16 PM, Frank Ng wrote:

Hello,

I am on Cassandra 1.0.7.  My repair processes are taking
over 30 hours to complete.  Is it normal for the repair
process to take this long?  I wonder if it's because I am
using the ext3 file system.

thanks








--
Jonathan Rhone
Software Engineer

*TinyCo*
800 Market St., Fl 6
San Francisco, CA 94102
www.tinyco.com 





Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I have 12 nodes with approximately 1TB load per node.  The RF is 3.  I am
considering moving to ext4.

I checked the ranges and the numbers go from 1 to the 9000s.

On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone  wrote:

> Data size, number of nodes, RF?
>
> Are you using size-tiered compaction on any of the column families that
> hold a lot of your data?
>
> Do your cassandra logs say you are streaming a lot of ranges?
> zgrep -E "(Performing streaming repair|out of sync)"
>
>
> On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:
>
>>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>>
>> Short answer - yes.
>> But you are asking wrong question.
>>
>>
>> I think both processes are taking a while.  When it starts up, netstats
>> and compactionstats show nothing.  Anyone out there successfully using ext3
>> and their repair processes are faster than this?
>>
>>  On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:
>>
>>> Hi
>>>
>>> You can check with nodetool  which part of repair process is slow -
>>> network streams or verify compactions. use nodetool netstats or
>>> compactionstats.
>>>
>>>
>>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>>>
 Hello,

 I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours
 to complete.  Is it normal for the repair process to take this long?  I
 wonder if it's because I am using the ext3 file system.

 thanks

>>>
>>>
>>
>>
>
>
> --
> Jonathan Rhone
> Software Engineer
>
> *TinyCo*
> 800 Market St., Fl 6
> San Francisco, CA 94102
> www.tinyco.com
>
>


Re: Request timeout and host marked down

2012-04-10 Thread Daning Wang
Thanks Aaron, will seek help from hector team.

On Tue, Apr 10, 2012 at 3:41 AM, aaron morton wrote:

> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketRead0(Native Method)
> at java.net.SocketInputStream.read(SocketInputStream.java:129)
> at
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:127)
> ... 31 more
>
> This looks like a client side timeout to me.
>
> AFAIK it will use this
>
> http://rantav.github.com/hector//source/content/API/core/1.0-1/me/prettyprint/cassandra/service/CassandraHost.html#getCassandraThriftSocketTimeout()
>
> if it's > 0 otherwise the value of the CASSANDRA_THRIFT_SOCKET_TIMEOUT JVM
> param
>
> otherwise 0 i think.
>
> Hector is one of the many things I am not an expert on. Try the hector
> user list if you are still having problems.
>
>
>
> [cassy@s2.dsat4 ~]$  ~/bin/nodetool -h localhost tpstats
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                         3         3      414129625         0                 0
>
> Looks fine.
>
> Hope that helps.
>
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10/04/2012, at 8:08 AM, Daning Wang wrote:
>
> Thanks Aaron! Here is the exception, is that the timeout between nodes?
> any parameter I can change to reduce timeout?
>
> me.prettyprint.hector.api.exceptions.HectorTransportException:
> org.apache.thrift.transport.TTransportException:
> java.net.SocketTimeoutException: Read timed out
> at
> me.prettyprint.cassandra.service.ExceptionsTranslatorImpl.translate(ExceptionsTranslatorImpl.java:33)
> at
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:130)
> at
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:100)
> at
> me.prettyprint.cassandra.service.Operation.executeAndSetResult(Operation.java:103)
> at
> me.prettyprint.cassandra.connection.HConnectionManager.operateWithFailover(HConnectionManager.java:246)
> at
> me.prettyprint.cassandra.model.ExecutingKeyspace.doExecuteOperation(ExecutingKeyspace.java:97)
> at
> me.prettyprint.cassandra.model.CqlQuery.execute(CqlQuery.java:99)
> at
> com.netseer.cassandra.cache.dao.CacheReader.getRows(CacheReader.java:267)
> at
> com.netseer.cassandra.cache.dao.CacheReader.getCache0(CacheReader.java:55)
> at
> com.netseer.cassandra.cache.dao.CacheDao.getCaches(CacheDao.java:85)
> at
> com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:71)
> at
> com.netseer.cassandra.cache.dao.CacheDao.getCache(CacheDao.java:149)
> at
> com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:55)
> at
> com.netseer.cassandra.cache.service.CacheServiceImpl.getCache(CacheServiceImpl.java:28)
> at
> com.netseer.dsat.cache.CassandraDSATCacheImpl.get(CassandraDSATCacheImpl.java:62)
> at
> com.netseer.dsat.cache.CassandraDSATCacheImpl.getTimedValue(CassandraDSATCacheImpl.java:144)
> at
> com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:427)
> at
> com.netseer.dsat.serving.GenericCacheManager$4.call(GenericCacheManager.java:423)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:619)
> Caused by: org.apache.thrift.transport.TTransportException:
> java.net.SocketTimeoutException: Read timed out
> at
> org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129)
> at
> org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at
> org.apache.thrift.transport.TFramedTransport.readFrame(TFramedTransport.java:129)
> at
> org.apache.thrift.transport.TFramedTransport.read(TFramedTransport.java:101)
> at
> org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:378)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:297)
> at
> org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:204)
> at
> org.apache.cassandra.thrift.Cassandra$Client.recv_execute_cql_query(Cassandra.java:1698)
> at
> org.apache.cassandra.thrift.Cassandra$Client.execute_cql_query(Cassandra.java:1682)
> at
> me.prettyprint.cassandra.model.CqlQuery$1.execute(CqlQuery.java:106)
> ... 21 more
> Caused by: java.net.SocketTimeoutException: Read timed out
> at java.net.SocketInputStream.socketR

Re: Repair Process Taking too long

2012-04-10 Thread Frank Ng
I am not using size-tiered compaction.


On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone  wrote:

> Data size, number of nodes, RF?
>
> Are you using size-tiered compaction on any of the column families that
> hold a lot of your data?
>
> Do your cassandra logs say you are streaming a lot of ranges?
> zgrep -E "(Performing streaming repair|out of sync)"
>
>
> On Tue, Apr 10, 2012 at 9:45 AM, Igor  wrote:
>
>>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>>
>> Short answer - yes.
>> But you are asking wrong question.
>>
>>
>> I think both processes are taking a while.  When it starts up, netstats
>> and compactionstats show nothing.  Anyone out there successfully using ext3
>> and their repair processes are faster than this?
>>
>>  On Tue, Apr 10, 2012 at 10:42 AM, Igor  wrote:
>>
>>> Hi
>>>
>>> You can check with nodetool  which part of repair process is slow -
>>> network streams or verify compactions. use nodetool netstats or
>>> compactionstats.
>>>
>>>
>>> On 04/10/2012 05:16 PM, Frank Ng wrote:
>>>
 Hello,

 I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours
 to complete.  Is it normal for the repair process to take this long?  I
 wonder if it's because I am using the ext3 file system.

 thanks

>>>
>>>
>>
>>
>
>
> --
> Jonathan Rhone
> Software Engineer
>
> *TinyCo*
> 800 Market St., Fl 6
> San Francisco, CA 94102
> www.tinyco.com
>
>


RE: Issue with SStable loader.

2012-04-10 Thread Rishabh Agrawal
I don't understand the config for sstableloader. I thought sstableloader just takes
the cassandra.yaml file and uses it. Please throw some more light on this.
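
A hedged sketch of the piece of that cassandra.yaml the loader cares about
here (the addresses are placeholders): make sure the decommissioned node is
no longer listed as a seed in the yaml that sstableloader reads.

seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "xxx.xx.1.101,xxx.xx.1.102,xxx.xx.1.103"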

From: aaron morton [aa...@thelastpickle.com]
Sent: 10 April 2012 11:37 PM
To: user@cassandra.apache.org
Subject: Re: Issue with SStable loader.

Did you update the config for sstableloader ?

Are their any data files in the data directory pointed to by the sstableloader 
config ?

Cheers

-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 10/04/2012, at 11:56 PM, Rishabh Agrawal wrote:

Hello,

I had a three-node cluster which I converted to a four-node one. Later I
decommissioned one of the nodes and load-balanced the data across the remaining
three. I removed the decommissioned node from the ‘seed list’. I restarted all
nodes and performed compaction. After that, when I use sstableloader it tries
to connect to the decommissioned node and therefore fails.

Can someone provide me a solution to this?

Regards
Rishabh Agrawal





cassandra_jobs on twitter

2012-04-10 Thread Jeremy Hanna
some time back, I created the account cassandra_jobs on twitter.  if you email 
the user list or better yet just cc cassandra_jobs on twitter, I'll retweet it 
there so that the information can get out to more people.

https://twitter.com/#!/cassandra_jobs

cheers,

Jeremy

Re: issue with composite row key on CassandraStorage pig?

2012-04-10 Thread Janne Jalkanen

There doesn't seem to be an open JIRA ticket for it - can you please make one 
at https://issues.apache.org/jira/browse/CASSANDRA? That ensures that at some 
point someone will take a look at it and it just won't be forgotten in the 
endless barrage of emails...

Yup, I did the composite columns support. I'd start by looking at 
CassandraStorage.getNext().

/Janne

On 9 Apr 2012, at 22:02, Janwar Dinata wrote:

> Hi Janne,
> 
> Do you happen to know if support for composite row key is in the pipeline?
> 
> It seems that you did a patch for composite columns support on 
> CassandraStorage.java.
> Do you have any pointers for implementing composite row key feature?
> 
> Thanks.
> 
> On Mon, Apr 9, 2012 at 11:32 AM, Janne Jalkanen  
> wrote:
> 
> I don't think the Pig code supports Composite *keys* yet. The 1.0.9 code 
> supports Composite Column Names tho'...
> 
> /Janne
> 
> On Apr 8, 2012, at 06:02 , Janwar Dinata wrote:
> 
>> Hi,
>> 
>> I have a column family that uses DynamicCompositeType for its 
>> keys_validation_class.
>> When I try to dump the row keys using pig, it fails with
>> java.lang.ClassCastException: org.apache.pig.data.DataByteArray cannot be 
>> cast to org.apache.pig.data.Tuple
>> 
>> This is how I create the column family
>> create column family CompoKey
>>with
>>  key_validation_class =
>>'DynamicCompositeType(
>>  a=>AsciiType,
>>  o=>BooleanType,
>>  b=>BytesType,
>>  e=>DateType,
>>  d=>DoubleType,
>>  f=>FloatType,
>>  i=>IntegerType,
>>  x=>LexicalUUIDType,
>>  l=>LongType,
>>  t=>TimeUUIDType,
>>  s=>UTF8Type,
>>  u=>UUIDType)' and
>>  comparator =
>>'DynamicCompositeType(
>>  a=>AsciiType,
>>  o=>BooleanType,
>>  b=>BytesType,
>>  e=>DateType,
>>  d=>DoubleType,
>>  f=>FloatType,
>>  i=>IntegerType,
>>  x=>LexicalUUIDType,
>>  l=>LongType,
>>  t=>TimeUUIDType,
>>  s=>UTF8Type,
>>  u=>UUIDType)' and
>>  default_validation_class = CounterColumnType;   
>> 
>> This is my pig script
>> rows =  LOAD 'cassandra://PigTest/CompoKey' USING CassandraStorage();
>> keys = FOREACH rows GENERATE flatten(key);
>> dump keys;
>> 
>> I'm on cassandra 1.0.9 and pig 0.9.2.
>> 
>> Thanks.
> 
> 



Re: Materialized Views or Index CF - data model question

2012-04-10 Thread Data Craftsman
Hi Aaron,

Thanks for the quick answer, I'll build a prototype to benchmark each
approach next week.

Here are more questions based on your reply:

a) "These queries are not easily supported on standard Cassandra"
select * from book where price  < 992   order by price descending limit 30;

From my understanding, this is a typical timeline (time series data) query,
well supported by Cassandra.

b) "You do not need a different CF for each custom secondary index.
Try putting the name of the index in the row key. "

I couldn't understand it. Can you help by building a demo with the CF
structure and some sample data?

Thanks,
Charlie | DBA developer
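
One possible reading of Aaron's suggestion in (b) -- a sketch only, with
invented names and values, and not necessarily what he had in mind -- is a
single wide-row index CF whose row key is the index name:

  create column family BookIndex
    with key_validation_class = UTF8Type
    and comparator = 'DynamicCompositeType(l=>LongType, s=>UTF8Type, u=>UUIDType)'
    and default_validation_class = BytesType;

  row "price" : {(978, book_id_a): '', (987, book_id_b): '', (992, book_id_c): '', ...}
  row "isbn"  : {("0345391801", book_id_a): '', ("0345391802", book_id_b): '', ...}

The composite column name is (indexed value, book_id), kept sorted by the
value, with empty column values. The "price < 992 ... limit 30" query then
becomes a reversed column slice of the single row "price" starting at (992),
and on an update you only rewrite the rows whose indexed values actually
changed.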



On Sun, Apr 8, 2012 at 2:30 PM, aaron morton  wrote:
> We need to query data by each column, do pagination as below,
>
> select * from book where isbn   < "XYZ" order by ISBN   descending limit 30;
> select * from book where price  < 992   order by price  descending limit 30;
> select * from book where col_n1 < 789   order by col_n1 descending limit 30;
> select * from book where col_n2 < "MUJ" order by col_n2 descending limit 30;
> ...
> select * from book where col_nm < 978 order by col_nm descending limit 30;
>
> These queries are not easily supported on standard Cassandra. If you need
> this level of query complexity, consider DataStax Enterprise, Solr, or an
> RDBMS.
>
> If we choose Materialized Views approach, we have to update all
> 20 Materialized View column family(s), for each base row update.
> Will the Cassandra write performance be acceptable?
>
> Yes, depending on the size of the cluster and the machine spec.
>
> It's often a good idea to design CF's to match the workloads. If you have
> some data that changes faster than the rest, consider splitting it into
> different CFs.
>
> Should we just normalize the data, create base book table with book_id
> as primary key, and then
> build 20 index column family(s), use wide row column slicing approach,
> with index column data value as column name and book_id as value?
>
> You do not need a different CF for each custom secondary index. Try putting
> the name of the index in the row key.
>
> What will you recommend?
>
> Take another look at the queries you *need* to support. Then build a small
> proof of concept to see if Cassandra will work for you.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6/04/2012, at 6:46 AM, Data Craftsman wrote:
>
> Howdy,
>
> Can I ask a data model question here?
>
> We have a book table with 20 columns, 300 million rows, average row
> size is 1500 bytes.
>
> create table book(
> book_id,
> isbn,
> price,
> author,
> title,
> ...
> col_n1,
> col_n2,
> ...
> col_nm
> );
>
> Data usage:
>
> We need to query data by each column, do pagination as below,
>
> select * from book where isbn   < "XYZ" order by ISBN   descending limit 30;
> select * from book where price  < 992   order by price  descending limit 30;
> select * from book where col_n1 < 789   order by col_n1 descending limit 30;
> select * from book where col_n2 < "MUJ" order by col_n2 descending limit 30;
> ...
> select * from book where col_nm < 978 order by col_nm descending limit 30;
>
> Write: 100 million updates a day.
> Read : 16  million queries a day. 200 queries per second, one query
> returns 30 rows.
>
> ***
> Materialized Views approach
>
> {"ISBN_01",book_object1},{"ISBN_02",book_object2},...,{"ISBN_N",book_objectN}
> ...
> We will end up with 20 timelines.
>
>
> ***
> Index approach - create 2nd Column Family as Index
>
> 'ISBN_01': 'book_id_a01','book_id_a02',...,'book_id_aN'
> 'ISBN_02': 'book_id_b01','book_id_b02',...,'book_id_bN'
> ...
> 'ISBN_0m': 'book_id_m01','book_id_m02',...,'book_id_mN'
>
> This way, we will create 20 index Column Family(s).
>
> ---
>
> If we choose Materialized Views approach, we have to update all
> 20 Materialized View column family(s), for each base row update.
> Will the Cassandra write performance be acceptable?
>
> Redis recommends building an index for the query on each column; that
> is your 1st strategy - create a 2nd index CF:
> http://redis.io/topics/data-types-intro
> (see the section [ Pushing IDs instead of the actual data in Redis lists ])
>
> Should we just normalize the data, create base book table with book_id
> as primary key, and then
> build 20 index column family(s), use wide row column slicing approach,
> with index column data value as column name and book_id as value?
> This way, we only need to update the affected column families whose
> column values changed, not all 20 Materialized View CF(s).
>
> Another option would be using Redis to store master book data, using
> Cassandra Column Family to maintain 2nd index.
>
> What will you recommend?
>
> Charlie (@mujiang) 一个 木匠
> ===
> Data Architect Developer
> http://mujiang.blogspot.com
>
>
> p.s.
>
> Gist from datastax dev blog (
> http://www.datastax.com/dev/blog/advanced-time-series-with-cassandra )
> "
> If the same event is tracked 

Re: Repair Process Taking too long

2012-04-10 Thread David Leimbach
I had this happen when I had really poorly generated tokens for the ring.
 Cassandra seems to accept numbers that are too big.  You get hot spots
when you think you should be balanced and repair never ends (I think there
is a 48 hour timeout).
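
For what it's worth, a minimal sketch of generating evenly spaced tokens for
RandomPartitioner (4 nodes assumed here; recompute for your own ring size):

  python -c "n=4; print('\n'.join(str(i*(2**127)//n) for i in range(n)))"
  0
  42535295865117307932921825928971026432
  85070591730234615865843651857942052864
  127605887595351923798765477786913079296

Tokens outside the 0 .. 2**127 range are the "numbers that are too big"
mentioned above; nodetool move (followed by cleanup) is the usual way to put a
node back on a sane token.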

On Tuesday, April 10, 2012, Frank Ng wrote:

> I am not using size-tiered compaction.
>
>
> On Tue, Apr 10, 2012 at 12:56 PM, Jonathan Rhone wrote:
>
>> Data size, number of nodes, RF?
>>
>> Are you using size-tiered compaction on any of the column families that
>> hold a lot of your data?
>>
>> Do your cassandra logs say you are streaming a lot of ranges?
>> zgrep -E "(Performing streaming repair|out of sync)"
>>
>>
>> On Tue, Apr 10, 2012 at 9:45 AM, Igor wrote:
>>
>>>  On 04/10/2012 07:16 PM, Frank Ng wrote:
>>>
>>> Short answer - yes.
>>> But you are asking the wrong question.
>>>
>>>
>>> I think both processes are taking a while.  When it starts up, netstats
>>> and compactionstats show nothing.  Is anyone out there successfully using
>>> ext3 with repair processes faster than this?
>>>
>>>  On Tue, Apr 10, 2012 at 10:42 AM, Igor wrote:
>>>
 Hi

 You can check with nodetool which part of the repair process is slow -
 the network streams or the validation compactions. Use nodetool netstats
 or compactionstats.


 On 04/10/2012 05:16 PM, Frank Ng wrote:

> Hello,
>
> I am on Cassandra 1.0.7.  My repair processes are taking over 30 hours
> to complete.  Is it normal for the repair process to take this long?  I
> wonder if it's because I am using the ext3 file system.
>
> thanks
>


>>>
>>>
>>
>>
>> --
>> Jonathan Rhone
>> Software Engineer
>>
>> *TinyCo*
>> 800 Market St., Fl 6
>> San Francisco, CA 94102
>> www.tinyco.com
>>
>>
>


Re: Why so many SSTables?

2012-04-10 Thread Maki Watanabe
You can configure the sstable size with the sstable_size_in_mb parameter for LCS.
The default value is 5 MB.
You should also check that you don't have many pending compaction tasks,
using nodetool tpstats and compactionstats.
If you have enough IO throughput, you can increase
compaction_throughput_mb_per_sec
in cassandra.yaml to reduce pending compactions.

maki
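
Concretely, the knobs mentioned above look something like this (values are
examples only, and the cli syntax is from memory, so verify it with
'help update column family;' on your version):

  update column family Documents
    with compaction_strategy = 'LeveledCompactionStrategy'
    and compaction_strategy_options = {sstable_size_in_mb: 10};

  nodetool -h node1 tpstats            # pending/blocked stages
  nodetool -h node1 compactionstats    # pending compaction tasks

  # cassandra.yaml -- default is 16
  compaction_throughput_mb_per_sec: 32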

2012/4/10 Romain HARDOUIN :
>
> Hi,
>
> We are surprised by the number of files generated by Cassandra.
> Our cluster consists of 9 nodes and each node handles about 35 GB.
> We're using Cassandra 1.0.6 with LeveledCompactionStrategy.
> We have 30 CF.
>
> We've got roughly 45,000 files under the keyspace directory on each node:
> ls -l /var/lib/cassandra/data/OurKeyspace/ | wc -l
> 44372
>
> The biggest CF is spread over 38,000 files:
> ls -l Documents* | wc -l
> 37870
>
> ls -l Documents*-Data.db | wc -l
> 7586
>
> Many SSTables are about 4 MB:
>
> 19 MB -> 1 SSTable
> 12 MB -> 2 SSTables
> 11 MB -> 2 SSTables
> 9.2 MB -> 1 SSTable
> 7.0 MB to 7.9 MB -> 6 SSTables
> 6.0 MB to 6.4 MB -> 6 SSTables
> 5.0 MB to 5.4 MB -> 4 SSTables
> 4.0 MB to 4.7 MB -> 7139 SSTables
> 3.0 MB to 3.9 MB -> 258 SSTables
> 2.0 MB to 2.9 MB -> 35 SSTables
> 1.0 MB to 1.9 MB -> 13 SSTables
> 87 KB to  994 KB -> 87 SSTables
> 0 KB -> 32 SSTables
>
> FYI here is CF information:
>
> ColumnFamily: Documents
>   Key Validation Class: org.apache.cassandra.db.marshal.BytesType
>   Default column value validator: org.apache.cassandra.db.marshal.BytesType
>   Columns sorted by: org.apache.cassandra.db.marshal.BytesType
>   Row cache size / save period in seconds / keys to save : 0.0/0/all
>   Row Cache Provider: org.apache.cassandra.cache.SerializingCacheProvider
>   Key cache size / save period in seconds: 20.0/14400
>   GC grace seconds: 1728000
>   Compaction min/max thresholds: 4/32
>   Read repair chance: 1.0
>   Replicate on write: true
>   Column Metadata:
>     Column Name: refUUID (7265664944)
>       Validation Class: org.apache.cassandra.db.marshal.BytesType
>       Index Name: refUUID_idx
>       Index Type: KEYS
>   Compaction Strategy:
> org.apache.cassandra.db.compaction.LeveledCompactionStrategy
>   Compression Options:
>     sstable_compression: org.apache.cassandra.io.compress.SnappyCompressor
>
> Is it a bug? If not, how can we tune Cassandra to avoid this?
>
> Regards,
>
> Romain