Re: Blob vs. "normal" columns (internals) difference?

2013-04-03 Thread Alan Ristić
I forgot, let's stay on the edge with the C* 1.2.* branch ;)

Thanks and regards,
*Alan Ristić*

*w*: personal blog
*t*: @alanristic
*l*: linkedin.com/alanristic
*m*: 068 15 73 88


2013/4/3 Alan Ristić 

> Hi guys,
>
> Here is an example (fictional) model I have for learning purposes...
>
> I'm currently storing the "User" object in a Tweet as a blob value, i.e.
> taking the JSON of 'User' and storing it as a blob. I'm wondering why this is
> better vs. just prefixing and flattening column names?
>
> Tweet {
>  id uuid,
>  user blob
> }
>
> vs.
>
> Tweet {
>  id uuid,
>  user_id uuid,
>  user_name text,
>  
> }
>
> In one or other
>
> 1. Is size getting bigger in either one in storing one Tweet?
> 2. Does either choice have an impact on read/write performance at large scale?
> 3. Anything else I should be considering here? Your view/thinking would be
> great.
>
> *Here is my understanding:*
> For 'ease' of update: if, for example, a user changes their name, I'm aware I
> need to (re)write the whole object in all Tweets in the *first "blob"* example,
> and only the user_name column in the *second 'flattened'* example. Which brings
> me to this: if I wanted to actually do this "updating/rewriting" for every
> Tweet, I'd use the *second 'flattened'* example, since the payload of only
> user_name is smaller than the whole User blob for every Tweet, right?
>
> Nothing urgent, any input is valuable, tnx guys :)
>
>
>
> Thanks and regards,
> *Alan Ristić*
>
> *w*: personal blog
> *t*: @alanristic
> *l*: linkedin.com/alanristic
> *m*: 068 15 73 88
>


Re: Cassandra freezes

2013-04-03 Thread Joel Samuelsson
It seems this problem is back and I am unsure how to solve it. I have a
test setup like this:
4 machines run 8 processes each. Each process has 2 threads, one writing
100,000 rows and one reading another 100,000 rows. Each machine (and
process) reads and writes the exact same rows, so it is essentially the same
200,000 rows being read / written.
The Cassandra cluster is a single-node cluster.
The first 10-20 runs of the test described above go smoothly; after that,
tests take an increasingly long time, with GC happening almost all the time.

Here is my CASS-FREEZE-001 form answers:

How big is your JVM heap ?
2GB

How many CPUs ?
A virtual environment so I can't be perfectly sure but according to their
specification, "8 cores".

Garbage collection taking long ? ( look for log lines from GCInspector)
Yes, these are a few lines seen during 1 test run:
INFO [ScheduledTasks:1] 2013-04-03 08:47:40,757 GCInspector.java (line 122)
GC for ParNew: 40370 ms for 3 collections, 565045688 used; max is 2038693888
INFO [ScheduledTasks:1] 2013-04-03 08:48:24,720 GCInspector.java (line 122)
GC for ParNew: 39840 ms for 2 collections, 614065528 used; max is 2038693888
INFO [ScheduledTasks:1] 2013-04-03 08:49:09,319 GCInspector.java (line 122)
GC for ParNew: 37666 ms for 2 collections, 682352952 used; max is 2038693888
INFO [ScheduledTasks:1] 2013-04-03 08:50:02,577 GCInspector.java (line 122)
GC for ParNew: 44590 ms for 1 collections, 792861352 used; max is 2038693888


Running out of heap ? ( "heap is .. full" log lines )
Yes. Same run as above:
WARN [ScheduledTasks:1] 2013-04-03 08:54:35,108 GCInspector.java (line 139)
Heap is 0.8596674853032178 full.  You may need to reduce memtable and/or
cache sizes.  Cassandra is now reducing cache sizes to free up memory.
 Adjust reduce_cache_sizes_at threshold in cassandra.yaml if you don't want
Cassandra to do this automatically
WARN [ScheduledTasks:1] 2013-04-03 08:54:36,831 GCInspector.java (line 145)
Heap is 0.8596674853032178 full.  You may need to reduce memtable and/or
cache sizes.  Cassandra will now flush up to the two largest memtables to
free up memory.  Adjust flush_largest_memtables_at threshold in
cassandra.yaml if you don't want Cassandra to do this automatically


Any tasks backing up / being dropped ? ( nodetool tpstats and ".. dropped
in last .. ms" log lines )
Yes. Same run as above:
INFO [ScheduledTasks:1] 2013-04-03 08:52:04,943 MessagingService.java (line
673) 31 MUTATION messages dropped in last 5000ms
INFO [ScheduledTasks:1] 2013-04-03 08:52:04,944 MessagingService.java (line
673) 8 READ messages dropped in last 5000ms

Are writes really slow? ( nodetool cfhistograms Keyspace ColumnFamily )
Not sure how to interpret the output of nodetool cfhistograms, but here it
is (I hope it's fairly readable):
Offset   SSTables   Write Latency   Read Latency   Row Size   Column Count
1        38162520   0               0              0          20
2        0          22              0              0          0
3        0          1629            0              0          0
4        0          9990            0              0          0
5        0          40169           0              0          0
6        0          161538          0              0          0
7        0          487266          0              0          0
8        0          1096601         0              0          0
10       0          4842978         0              0          0
12       0          7976003         0              0          0
14       0          8673230         0              0          0
17       0          9805730         0              0          0
20       0          5083707         0              0          0
24       0          2541157         0              0          0
29       0          768916          0              0          0
35       0          220440          0              0          0
42       0          112915          0              0          0
50       0          71469           0              0          0
60       0          48909           0              0          0
72       0          50714           0              0          0
86       0          45390           0              0          0
103      0          41975           0              0          0
124      0          40371           0

Repair does not fix inconsistency

2013-04-03 Thread Michal Michalski

Hi,

TL;DR: I have inconsistent data (1 live row on node A & 1 tombstoned row
on node B) that does not get fixed by repair. What could the problem be?


Long version:

I have a CF containing Users' info, which I sometimes query by key, and
sometimes by indexed columns like email. I'm using RF=2. I write with
CL.ONE, but this CF is very rarely updated, so C* has a lot of time
to fix any inconsistencies that may occur, so I'm fine with this (at least
in theory ;-) ).


To be clear:
- I've run a successful cluster-wide repair on this CF before testing,
so I do not expect any inconsistency
- All indexes are built, I've rebuilt them manually before testing, so I 
expect them to work properly (I mention it because it seems to be 
somehow related to indexes, but I'm not sure - see below)


The problem is:

When I query (cqlsh) some rows by key (CL is default = ONE) I _always_ 
get a correct result.  However, when I query it by indexed column, it 
returns nothing.


When tracing a query with CL.ALL in cqlsh, I get info that C* has:

Read 0 live cells and 1 tombstoned   // for first replica node
Read 1 live cells and 0 tombstoned   // for second replica node

When CL is ONE it never asks the second replica for data (possibly due
to DynamicSnitch scores or so), so it returns nothing.


Switching to CL >= TWO obviously fixes this problem for us, but it's not 
the solution I'd like to use as I'd rather rely on fast read/write 
requests with CL.ONE + frequent repairs, allowing some short-term 
inconsistency.


Any ideas why it may happen that data are still inconsistent after 
repair? Is there something I could have missed?


I'm mainly surprised that repair does not fix this inconsistency in ANY 
way - either by pulling missing data to first replica _OR_ tombstoning 
it on second replica. First one would be correct (delete was made a long 
time ago and then the row reappeared), but both could make sense, as 
both will make the data consistent. In this state it's definitely 
inconsistent and I don't understand it :-)



M.


Re: Problem with streaming data from Hadoop: DecoratedKey(-1, )

2013-04-03 Thread Michal Michalski

Strange things happen.

It wasn't a single row, but one single "part" file of the Hadoop input
that failed - we didn't manage to find a specific row that causes the
problem. However, it keeps failing only in production, where we can't
experiment with it a lot. We tried to reproduce it in a few ways in 3
different environments, but we were unable to do so.


We have to leave this problem for now.
Thanks for help anyway :-)

M.

On 02.04.2013 10:02, Michal Michalski wrote:

Thanks for the reply, Aaron. Unfortunately, I think that's not the case - we did
some quick tests last week and for now it _seems_ that:

1) There was no empty / zero-length key in the data we loaded - that was the
first thing we checked
2) By "bisecting" the data, we found out that the row that causes the
problem is the one with the longest key (184 characters; much longer than
other keys we have in this file, but it's still not much and definitely
far, far below the 64K limit mentioned here:
http://wiki.apache.org/cassandra/FAQ#max_key_size ) - not sure yet if it
matters, but it's the only thing that makes it different. It has only
one short column - nothing special.
3) Loading the same data using Thrift finished with no error, but the
row we have a problem with is NOT present in Cassandra - this is so
strange, that I'll double-check it.

However, we'll try to do a few more tests in the next few days to make 100%
sure what in our data causes the problem. I'll update you if we learn
something new.

M.

On 31.03.2013 12:01, aaron morton wrote:

  but yesterday one of 600 mappers failed


:)


From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
finished?) input buffer (?) causing the token to be set to -1, which is
improper for RandomPartitioner:

Yes, there is a zero length key which has a -1 token.


However, I can't figure out what's the root cause of this problem.
Any ideas?

mmm, the BulkOutputFormat uses a SSTableSimpleUnsortedWriter and
neither of them check for zero length row keys. I would look there first.

There is no validation in the AbstractSSTableSimpleWriter; not sure
if that is by design or an oversight. Can you catch the zero length
key in your map job?

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 28/03/2013, at 2:26 PM, Michal Michalski  wrote:


We're streaming data to Cassandra directly from MapReduce job using
BulkOutputFormat. It's been working for more than a year without any
problems, but yesterday one of 600 mappers failed and we got a
strange-looking exception on one of the C* nodes.

IMPORTANT: It happens on one node and on one cluster only. We've
loaded the same data to test cluster and it worked.


ERROR [Thread-1340977] 2013-03-28 06:35:47,695 CassandraDaemon.java
(line 133) Exception in thread Thread[Thread-1340977,5,main]
java.lang.RuntimeException: Last written key
DecoratedKey(5664330507961197044404922676062547179,
302c6461696c792c32303133303332352c312c646f6d61696e2c756e6971756575736572732c633a494e2c433a6d63635f6d6e635f636172726965725f43656c6c4f6e655f4b61726e6174616b615f2842616e67616c6f7265295f494e2c643a53616d73756e675f47542d49393037302c703a612c673a3133)
>= current key DecoratedKey(-1, ) writing into
/cassandra/production/IndexedValues/production-IndexedValues-tmp-ib-240346-Data.db

at
org.apache.cassandra.io.sstable.SSTableWriter.beforeAppend(SSTableWriter.java:133)

at
org.apache.cassandra.io.sstable.SSTableWriter.appendFromStream(SSTableWriter.java:209)

at
org.apache.cassandra.streaming.IncomingStreamReader.streamIn(IncomingStreamReader.java:179)

at
org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:122)

at
org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:226)

at
org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:166)

at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)



From what I can understand by looking into the C* source, it seems
to me that the problem is caused by an empty (or unexpectedly
finished?) input buffer (?) causing the token to be set to -1, which is
improper for RandomPartitioner:

public BigIntegerToken getToken(ByteBuffer key)
{
    if (key.remaining() == 0)
        return MINIMUM;    // Which is -1
    return new BigIntegerToken(FBUtilities.hashToBigInteger(key));
}

However, I can't figure out what's the root cause of this problem.
Any ideas?

Of course I can't exclude a bug in my code which streams this data,
but - as I said - it works when loading the same data to the test cluster
(which has a different number of nodes, thus different token
assignment, which might be a factor too).

Michał
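
A minimal sketch of the guard Aaron suggests for the map job; the class and
method names below are made up for illustration, not part of BulkOutputFormat:

import java.nio.ByteBuffer;

/**
 * Hypothetical helper for the map/reduce code that feeds BulkOutputFormat:
 * reject zero-length row keys before they reach the SSTable writer, since a
 * zero-length key hashes to token -1 under RandomPartitioner (see getToken above).
 */
public final class RowKeyGuard {

    private RowKeyGuard() {}

    /** True if the key is safe to hand to the SSTable writer. */
    public static boolean isWritable(ByteBuffer rowKey) {
        return rowKey != null && rowKey.remaining() > 0;
    }
}

In the map task, rows failing this check would be counted (e.g. via a Hadoop
counter) and skipped instead of being written to the output format.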







Linear scalability problems

2013-04-03 Thread Anand Somani
Hi,

I am running some tests trying to scale out our application from a 3-node
cluster to a 6-node cluster. The thing I observed is that with the 3-node
cluster I was able to handle about 41 req/second, so I added 3 more
nodes thinking it should close to double, but instead it only goes up to about
47 req/second!! I am doing something wrong and it is not obvious, so I wanted
some help on what stats I could/should monitor to tell me things like whether a
node gets more requests or whether the load distribution is not random enough?

Note I am using direct thrift (old code base) and cassandra 1.1.6. The data
model is for storing blobs (split across columns) and has around 6 CF, RF=3
and all operations are at quorum. Also at the end of the run nodetool ring
reports the same data size.

Thanks
Anand


Alter table drop column seems not working

2013-04-03 Thread julien Campan
Hi,

I'm working with cassandra 1.2.2.

When I try to drop a column, it's not working.

This is what I tried :

CREATE TABLE cust (
  ise text PRIMARY KEY,
  id_avatar_1 uuid,
  id_avatar_2 uuid,
  id_avatar_3 uuid,
  id_avatar_4 uuid
) ;


cqlsh> ALTER TABLE cust DROP id_avatar_1 ;

==>Bad Request: line 1:17 no viable alternative at input 'DROP'
==>Perhaps you meant to use CQL 2? Try using the -2 option when ==>starting
cqlsh.

Can someone tell me how to drop a column, or is this a bug?

Thanks


Re: Linear scalability problems

2013-04-03 Thread Tyler Hobbs
If I had to guess, I would say that your client is the bottleneck, not the
cluster.  Are you inserting data with multiple threads or processes?
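
A generic sketch of what driving the cluster from many client threads can look
like; insertOne() is just a placeholder for the test harness's Thrift call, not
code from this thread. A single synchronous client thread caps throughput at
roughly 1 / request latency, which can easily look like ~40-50 req/s no matter
how many nodes sit behind it:

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Generic multi-threaded load sketch: many in-flight requests are needed
// before added nodes can translate into added throughput.
public final class ParallelLoadSketch {

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(32);   // client threads
        for (int i = 0; i < 100000; i++) {
            final int row = i;
            pool.submit(new Runnable() {
                public void run() {
                    insertOne(row);   // placeholder for the real Thrift insert
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.HOURS);
    }

    private static void insertOne(int row) {
        // the actual request to the cluster goes here
    }
}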


On Wed, Apr 3, 2013 at 8:49 AM, Anand Somani  wrote:

> Hi,
>
> I am running some tests trying to scale out our application from a 3-node
> cluster to a 6-node cluster. The thing I observed is that with the 3-node
> cluster I was able to handle about 41 req/second, so I added 3 more
> nodes thinking it should close to double, but instead it only goes up to about
> 47 req/second!! I am doing something wrong and it is not obvious, so I wanted
> some help on what stats I could/should monitor to tell me things like whether a
> node gets more requests or whether the load distribution is not random enough?
>
> Note I am using direct thrift (old code base) and cassandra 1.1.6. The
> data model is for storing blobs (split across columns) and has around 6 CF,
> RF=3 and all operations are at quorum. Also at the end of the run nodetool
> ring reports the same data size.
>
> Thanks
> Anand
>



-- 
Tyler Hobbs
DataStax 


upgrading 1.1.x to 1.2.x via sstableloader

2013-04-03 Thread Michał Czerwiński
Does anyone know the best process to move data from Cassandra 1.1.x
(1.1.7 to be more precise) to Cassandra 1.2.3?

I am trying to use sstableloader and stream data to a new cluster, but I get:

ERROR [Thread-125] 2013-04-03 16:37:27,330 IncomingTcpConnection.java (line
183) Received stream using protocol version 5 (my version 6). Terminating
connection

ERROR [Thread-141] 2013-04-03 16:38:05,704 CassandraDaemon.java (line 164)
Exception in thread Thread[Thread-141,5,main]

java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db
is not compatible with current version ib

at
org.apache.cassandra.streaming.StreamIn.getContextMapping(StreamIn.java:77)

at
org.apache.cassandra.streaming.IncomingStreamReader.<init>(IncomingStreamReader.java:87)

at
org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)

at
org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)

at
org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)


I've changed Murmur3Partitioner to RandomPartitioner already, and I've
noticed I am not able to use 1.1.7's sstableloader, so I copied sstables to
the new nodes and tried doing it locally on Cassandra 1.2.3, but it seems the
protocol versions do not match (see error above).

The reason why I want to use sstableloader is that I have a different number
of nodes and would like to avoid using rsync and then repairing/cleaning up
excess data.

Thanks!


how are reads done on compressed sstables?

2013-04-03 Thread Hiller, Dean
I was reading this great article on sstables

http://www.igvita.com/2012/02/06/sstable-and-log-structured-storage-leveldb/

And the only thing it does not touch upon is how lookups are done when
compression is enabled.  Is Cassandra using the snappy framing concept so it
can random-access into the file as well?  I mean, some of the STCS files are
HUGE while all my LCS files are 10MB.  Under the covers, how is that working?
(If I were to guess, snappy framing is used and on a read it either keeps
halving the file and reading just pieces, or it has some index info as to which
frame the key is in)…I would guess on LCS that the whole file is one frame,
always, due to the small size, and the whole file is uncompressed to read the one
key from that file.
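
For what it's worth, the general idea (a hedged sketch, not Cassandra's actual
classes) is chunk-based: data is compressed in fixed-size uncompressed chunks
(the chunk_length_kb compression option, 64KB by default if I recall correctly),
and a small per-SSTable index of compressed-chunk offsets lets a read locate and
decompress only the chunk covering the position it needs, rather than inflating
the whole file:

/**
 * Conceptual sketch only (not Cassandra's real reader): with fixed-size
 * uncompressed chunks and an index of compressed-chunk offsets, a reader can
 * decompress just the one chunk that covers a given uncompressed position.
 */
public final class ChunkedReadSketch {

    private final int chunkLength;      // uncompressed bytes per chunk, e.g. 64 * 1024
    private final long[] chunkOffsets;  // file offset of each compressed chunk

    public ChunkedReadSketch(int chunkLength, long[] chunkOffsets) {
        this.chunkLength = chunkLength;
        this.chunkOffsets = chunkOffsets.clone();
    }

    /** File offset of the compressed chunk holding this uncompressed position. */
    public long chunkOffsetFor(long uncompressedPosition) {
        int chunkIndex = (int) (uncompressedPosition / chunkLength);
        return chunkOffsets[chunkIndex];   // seek here, inflate one chunk, done
    }
}

The uncompressed position itself would come from the usual index / key-cache
lookup, so neither STCS's huge files nor LCS's 10MB files need to be
decompressed wholesale.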

Though I would love to read a blog on this somewhere about how this works in 
detail.  (all the writes/updates/deletes make complete sense but no one seems 
to write about reads in the face of compression).

Thanks,
Dean


unsubscribe

2013-04-03 Thread puneet loya
Can you please unsubscribe me from this group


Any plans for read-before-write update operations in CQL3?

2013-04-03 Thread Drew Kutcharian
Hi Guys,

Are there any short/long term plans to support UPDATE operations that require 
read-before-write, such as increment on a numeric non-counter column? 
i.e. 

UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1;

UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix";

etc.

I know this goes against keeping updates idempotent, but there are times you 
need to do these kinds of operations. We currently do things like this in 
client code, but it would be great to be able to do this on the server side to 
minimize the chance of race conditions.

-- Drew
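
For context, a sketch of the client-side version mentioned above — read, add,
write back — assuming a CQL3 client such as the DataStax Java driver; the
keyspace/table/column names just mirror the pseudo-CQL above and are made up,
and the comment marks the race that makes a server-side version attractive:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;

// Client-side read-modify-write sketch: two concurrent clients can both read
// the same value and one increment is silently lost, since nothing serializes
// the read and the write.
public final class ClientSideIncrement {

    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("my_keyspace");   // keyspace name is made up

        Row row = session.execute(
                "SELECT non_counter_numeric_column FROM cf WHERE id = 42").one();
        long current = (row == null) ? 0 : row.getLong("non_counter_numeric_column");

        session.execute("UPDATE cf SET non_counter_numeric_column = "
                + (current + 1) + " WHERE id = 42");

        cluster.shutdown();
    }
}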

Re: unsubscribe

2013-04-03 Thread Radek Gruchalski
To remove your address from the list, send a message to:
user-unsubscr...@cassandra.apache.org


Kind regards,

Radek Gruchalski
radek.gruchal...@technicolor.com |
radek.gruchal...@portico.io |
ra...@gruchalski.com
00447889948663





Confidentiality:
This communication is intended for the above-named person and may be 
confidential and/or legally privileged.
If it has come to you in error you must take no action based on it, nor must 
you copy or show it to anyone; please delete/destroy and inform the sender 
immediately.


On Wednesday, 3 April 2013 at 21:01, puneet loya wrote:

> Can you please unsubscribe me from this group  
>  




Re: Unable to prefix in astyanax read query

2013-04-03 Thread aaron morton
>> I have created this column family using CQL and defined the primary key
>> as 
What was the create table statement ? 

>> BadRequestException: [host=localhost(127.0.0.1):9160, latency=6(6),
>> attempts=1]InvalidRequestException(why:Not enough bytes to read value of
>> component 0)
Unless the CQL 3 create table statement specifies WITH COMPACT STORAGE it will 
use composites in the column names, and Astyanax may not be expecting this. 

Unless astyanax specifically says it can write to CQL 3 tables it's best to 
only access them using CQL 3. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 2/04/2013, at 6:07 PM, "Hiller, Dean"  wrote:

> We ran into some similar errors in playorm development.  Basically, you
> probably defined a composite but are not correctly using that composite.
> I am not sure about queries though, as we had the issue when saving data
> (i.e. using deviceID+deviceName did not work and we had to create a
> full-blown composite object).  I think you need to read up on how astyanax
> works with composites… I am not sure this is a cassandra question
> really… more of an astyanax one.
> 
> Dean
> 
> On 4/1/13 11:48 PM, "Apurva Jalit"  wrote:
> 
>> I have a scheme as follows:
>> 
>> TimeStamp
>> Device ID
>> Device Name
>> Device Owner
>> Device location
>> 
>> I have created this column family using CQL and defined the primary key
>> as 
>> (TimeStamp, Device ID, Device Name). Through a serializable object that
>> has fields for DeviceID, name, and a field name (which stores either Device
>> Owner or Device Location), I have inserted some records using Astyanax.
>> 
>> As per my understanding, the columns for a row are created by combining
>> Device 
>> ID, Device Name and field name as column name and the value to be the
>> value for 
>> that particular field. Thus for a particular timestamp and device, the
>> column 
>> names would be in the pattern (Device ID:Device Name: ...).
>> 
>> So I believe we can use these 2 fields as prefix to obtain all the
>> entries for a 
>> particular time-device combination.
>> 
>> I am using the following query to obtain the results:
>> 
>> RowSliceQuery query = adu.keyspace
>>     .prepareQuery(columnFamily)
>>     .getKeySlice(timeStamp)
>>     .withColumnRange(new RangeBuilder()
>>         .setStart(deviceID + deviceName + "_\u0000")
>>         .setEnd(deviceID + deviceName + "_\uffff")
>>         .setLimit(batch_size)
>>         .build());
>> 
>> But on executing the above query I get the following Exception:
>> 
>> BadRequestException: [host=localhost(127.0.0.1):9160, latency=6(6),
>> attempts=1]InvalidRequestException(why:Not enough bytes to read value of
>> component 0)
>> 
>> Can any one help to understand where am I going wrong?
>> 
>> 
> 
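
A minimal sketch of the "full blown composite object" approach Dean describes,
using Astyanax's AnnotatedCompositeSerializer; the class and field names are
hypothetical, and this style only applies to thrift / COMPACT STORAGE column
families — for a CQL3-created table, follow Aaron's advice and query it with
CQL 3 instead:

import com.netflix.astyanax.annotations.Component;
import com.netflix.astyanax.serializers.AnnotatedCompositeSerializer;

// Hypothetical composite column-name class: components are written and queried
// through a composite serializer, never by concatenating strings by hand.
public class DeviceColumn {

    @Component(ordinal = 0)
    public String deviceId;

    @Component(ordinal = 1)
    public String deviceName;

    @Component(ordinal = 2)
    public String fieldName;   // "owner", "location", ...

    public DeviceColumn() {}   // a no-arg constructor so Astyanax can instantiate it
}

// One serializer instance shared by reads and writes:
// AnnotatedCompositeSerializer<DeviceColumn> serializer =
//         new AnnotatedCompositeSerializer<DeviceColumn>(DeviceColumn.class);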



Re: how to test our transfer speeds

2013-04-03 Thread aaron morton
JBOD as talked about here
http://www.datastax.com/wp-content/uploads/2012/08/C2012-StateofCassandra-JonathanEllis.pdf
and defined by disk_failure_policy.

The idea is that when you have very large nodes, a failed disk does not require a full
node replacement. But if you are using a high-level RAID, I guess that's not necessary.

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 2/04/2013, at 6:35 PM, "Hiller, Dean"  wrote:

> Oh, JBOD, not JBOB.
> 
> No, we were using RAID 5 and RAID 6 from what I understand.  I am trying to
> get a test run with just one disk to make sure the test is correct, as one
> disk should have much less performance than 20 in the case of random access.
> For sequential access, I think performance would be the same (i.e. both would be
> 250MB/sec in throughput is my guess).
> Thanks,
> Dean
> 
> From: "Hiller, Dean" (NREL) <dean.hil...@nrel.gov>
> Date: Tuesday, April 2, 2013 6:40 AM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: how to test our transfer speeds
> 
> Is 1.2 JBOB an April fools joke?  Heh, seriously though, I have no idea what 
> you are talking about there.  I am trying to get raw disk performance with no 
> cassandra involved before involving cassandra…..which is the next step.
> 
> Thanks,
> Dean
> 
> From: aaron morton <aa...@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Date: Monday, April 1, 2013 11:01 PM
> To: "user@cassandra.apache.org" <user@cassandra.apache.org>
> Subject: Re: how to test our transfer speeds
> 
> If not, maybe I just generate the same 1,000,000 files on each machine, then 
> randomly delete 1/2 the files and stream them from the other machine as 
> writing those files would all be in random locations again forcing a much 
> worse measurement of MB/sec I would think.
> Not sure I understand the question. But you could just scrub the data off a 
> node and rebuild it.
> 
> Note that streaming is throttled, and it will also generate compaction.
> 
> He has twenty 1T drives on each machine and I think he also tried with one 1T 
> drive seeing the same performance which makes sense if writing sequentially
> Are you using the 1.2 JBOB configuration?
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 1/04/2013, at 11:01 PM, "Hiller, Dean" <dean.hil...@nrel.gov> wrote:
> 
> (we plan on running similar performance tests on cassandra but wanted to 
> understand the raw foot print first)…..
> 
> Someone in ops was doing a test transferring 1T of data from one node to 
> another.  I had a huge concern I emailed him that this could end up being a 
> completely sequential write not testing random access speeds.  He has twenty 
> 1T drives on each machine and I think he also tried with one 1T drive seeing 
> the same performance which makes sense if writing sequentially.  Does anyone 
> know of something that could generate a random access pattern such that we 
> could time that?  Right now, he was measuring 253MB / second from the time 
> it took and the 1T of data.  I would like to find the much worse case of 
> course.
> 
> If not, maybe I just generate the same 1,000,000 files on each machine, then 
> randomly delete 1/2 the files and stream them from the other machine as 
> writing those files would all be in random locations again forcing a much 
> worse measurement of MB/sec I would think.
> 
> Thoughts?
> 
> Thanks,
> Dean
> 
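
For the "random access pattern" question quoted above, a generic random-read
micro-benchmark sketch (not from the thread): it reads 4KB blocks at random
offsets of an existing large file and reports MB/s, which is much closer to the
worst case than one long sequential copy. Use a file well larger than RAM (or
drop the page cache first) or the numbers will mostly measure memory:

import java.io.File;
import java.io.RandomAccessFile;
import java.util.Random;

// Random 4KB reads over an existing file; run against a file much larger than
// RAM so the page cache does not hide the disk.
public final class RandomReadBench {

    public static void main(String[] args) throws Exception {
        File file = new File(args[0]);
        int reads = 100000;
        byte[] buf = new byte[4096];
        Random random = new Random();

        long start = System.nanoTime();
        RandomAccessFile raf = new RandomAccessFile(file, "r");
        try {
            long maxOffset = raf.length() - buf.length;
            for (int i = 0; i < reads; i++) {
                raf.seek((long) (random.nextDouble() * maxOffset));
                raf.readFully(buf);
            }
        } finally {
            raf.close();
        }

        double seconds = (System.nanoTime() - start) / 1e9;
        double mb = reads * (double) buf.length / (1024.0 * 1024.0);
        System.out.printf("%.1f MB in %.1f s = %.1f MB/s%n", mb, seconds, mb / seconds);
    }
}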



Re: CorruptSSTableException in system keyspace

2013-04-03 Thread aaron morton
There is a ticket there from an older version but I doubt that's it:
https://issues.apache.org/jira/browse/CASSANDRA-4837

You may be hitting an edge case by quickly creating 200 to 300 CFs. Can you
reproduce the problem outside of your test infrastructure? If so, can you raise
a ticket at https://issues.apache.org/jira/browse/CASSANDRA

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 2/04/2013, at 6:39 PM, Alexander Shutyaev  wrote:

> Hi, Aaron!
> 
> We were using 1.2.2. Now after your suggestion I've upgraded it to 1.2.3. 
> That CorruptSSTableException is now gone, but the problem is still here - 
> after some time we start getting another exception in cassandra logs (and 
> transport exception on the client side). The new exception is 
> java.lang.IllegalStateException: One row required, 0 found. Full stacktrace 
> available here [1]
> 
> [1] http://pastebin.com/W0nL6hK7
> 
> Thanks in advance,
> Alexander
> 
> 
> 2013/4/2 aaron morton 
> What version are you using ?
> 
> There are two tickets with similar issues fixed in 1.2.X releases.
> 
> https://issues.apache.org/jira/browse/CASSANDRA-5225
> https://issues.apache.org/jira/browse/CASSANDRA-5365
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 1/04/2013, at 1:45 PM, Alexander Shutyaev  wrote:
> 
> > Hi all!
> >
> > Our app currently moves from hibernate to cassandra+solr combination (we 
> > use our own storage abstraction for this). We have a lot of junit 
> > integration tests that use database. Now we've decided to use these tests 
> > to test our cassandra+solr storage implementation (given that on hibernate 
> > implementation these tests run fine). For that purpose we've installed 
> > cassandra and solr on a dedicated vm and we've launched the tests.
> >
> > Now there are a few things that should be mentioned about these tests and how 
> > we use cassandra.
> >
> > 1. We have 1 keyspace and about 200-300 column families.
> > 2. Tests are written in such a manner that at the beginning of each test we 
> > drop the keyspace and then recreate it from scratch (together with all 
> > column families).
> > 3. On client side we use hector.
> >
> > The problem is that each time we launch the tests they start failing 
> > randomly at some point. In cassandra logs we see a lot of 
> > CorruptSSTableException-s in system keyspace caused by 
> > CorruptBlockException-s. An example of full stack trace of such exception 
> > can be found here [1]. The tests start to fail due to thrift transport 
> > exceptions. At that point we are also unable to connect to cassandra using 
> > cassandra-cli. Then, after some time cassandra goes back to normal state 
> > all by itself.
> >
> > [1] http://pastebin.com/9SpEJpap
> >
> > P.S. I can provide full system.log and output.log from clean start (all 
> > data folders erased) till the errors and later when the system is ok once 
> > again. Although I'm not sure which file sharing service to utilise.
> >
> > Thanks in advance,
> > Alexander
> 
> 



Re: how to stop out of control compactions?

2013-04-03 Thread aaron morton
>  And it appears I can't set min > 32
Why did you want to set it so high ?
If you want to disable compaction set it to 0. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 2/04/2013, at 8:43 PM, William Oberman  wrote:

> I just tried to use this setting (I'm using 1.1.9).  And it appears I can't 
> set min > 32, as that's the max max now (using nodetool at least).  Not sure 
> if JMX would allow more access, but I don't like bypassing things I don't 
> fully understand.  I think I'll just leave my compaction killers running 
> instead (not that killing compactions constantly isn't messing with things as 
> well).
> 
> will
> 
> 
> On Tue, Apr 2, 2013 at 10:43 AM, William Oberman  
> wrote:
> Edward, you make a good point, and I do think am getting closer to having to 
> increase my cluster size (I'm around ~300GB/node now).  
> 
> In my case, I think it was neither.  I had one node OOM after working on a 
> large compaction but it continued to run in a zombie like state (constantly 
> GC'ing), which I didn't have an alert on.  Then I had the bad luck of a 
> "close token" also starting a large compaction.  I have RF=3 with some of my 
> R/W patterns at quorum, causing that segment of my cluster to get slow (e.g. 
> a % of my traffic started to slow).  I was running 1.1.2 (I haven't had to 
> poke anything for quite some time, obviously), so I upgraded before moving on 
> (as I saw a lot of bug fixes to compaction issues in release notes).  But the 
> upgrade caused even more nodes to start compactions.  Which lead to my 
> original email... I had a cluster where 80% of my nodes were compacting, and 
> I really needed to boost production traffic and couldn't seem to "tamp 
> cassandra down" temporarily.  
> 
> Thanks for the advice everyone!
> 
> will
> 
> 
> On Tue, Apr 2, 2013 at 10:20 AM, Edward Capriolo  
> wrote:
> Settings do not make compactions go away. If your compactions are "out of 
> control" it usually means one of these things,
> 1)  you have a corrupt table that the compaction never finishes on, sstables 
> count keep growing
> 2) you do not have enough hardware to handle your write load
> 
> 
> On Tue, Apr 2, 2013 at 7:50 AM, William Oberman  
> wrote:
> Thanks Gregg & Aaron. Missed that setting! 
> 
> On Tuesday, April 2, 2013, aaron morton wrote:
>> Set the min and max 
>> compaction thresholds for a given column family
> +1 for setting the max_compaction_threshold (as well as the min) on the a CF 
> when you are getting behind. It can limit the size of the compactions and 
> give things a chance to complete in a reasonable time. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 2/04/2013, at 3:42 AM, Gregg Ulrich  wrote:
> 
>> You may want to set compaction threshold and not throughput.  If you set the 
>> min threshold to something very large (10), compactions will not start 
>> until cassandra finds this many files to compact (which it should not).
>> 
>> In the past I have used this to stop compactions on a node, and then run an 
>> offline major compaction to get though the compaction, then set the min 
>> threshold back.  Not everyone likes major compactions though.
>> 
>> 
>> 
>>   setcompactionthreshold - 
>> Set the min and max 
>> compaction thresholds for a given column family
>> 
>> 
>> 
>> On Mon, Apr 1, 2013 at 12:38 PM, William Oberman  
>> wrote:
>> I'll skip the prelude, but I worked myself into a bit of a jam.  I'm 
>> recovering now, but I want to double check if I'm thinking about things 
>> correct.
>> 
>> Basically, I was in a state where a majority of my servers wanted to do 
>> compactions, and rather large ones.  This was impacting my site performance. 
>>  I tried nodetool stop COMPACTION.  I tried setcompactionthroughput=1.  I 
>> tried restarting servers, but they'd restart the compactions pretty much 
>> immediately on boot.
>> 
>> Then I realized that:
>> nodetool stop COMPACTION
>> only stopped running compactions, and then the compactions would re-enqueue 
>> themselves rather quickly.
>> 
>> So, right now I have:
>> 1.) scripts running on N-1 servers looping on "nodetool stop COMPACTION" in 
>> a tight loop
>> 2.) On the "Nth" server I've disabled gossip/thrift and turned up 
>> setcompactionthroughput to 999
>> 3.) When the Nth server completes, I pick from the remaining N-1 (well, I'm 
>> still running the first compaction, which is going to take 12 more hours, 
>> but that is the plan at least).
>> 
>> Does this make sense?  Other than the fact there was probably warning signs 
>> that would have prevented me from getting into this state in the first 
>> place? :-)
>> 
>> will
>> 
> 
> 
> 
> 



Re: creation of yaml-configuration from existing Config-object fails

2013-04-03 Thread aaron morton
Compare the yaml file you created with the default one, something is different. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 12:17 AM, Heinrich Götzger  wrote:

> Hello,
> 
> I try to read, edit, write and re-read a cassandra-yaml-config file 
> programmatically, but this unfortunately does not work as expected:
> 
> Here is my example:
> 
> /* start */
> package com.fue;
> 
> import java.io.*;
> 
> import org.apache.cassandra.config.*;
> import org.apache.cassandra.utils.SkipNullRepresenter;
> import org.yaml.snakeyaml.*;
> import org.yaml.snakeyaml.DumperOptions.*;
> import org.yaml.snakeyaml.constructor.*;
> import org.yaml.snakeyaml.nodes.*;
> 
> public class YamlConfigTest {
> 
>   public static void main(String[] args) throws IOException {
> 
>  // 1. reading an existing config-file,
>  InputStream input = new FileInputStream("/tmp/cassandra-1.2.3.yaml");
> 
>  Constructor constructor = new Constructor(Config.class);
>  TypeDescription seedDesc = new TypeDescription(SeedProviderDef.class);
>  seedDesc.putMapPropertyType("parameters", String.class, String.class);
>  constructor.addTypeDescription(seedDesc);
>  Yaml cassandraConfYaml = new Yaml(new Loader(constructor));
> 
>  Config conf = (Config) cassandraConfYaml.load(input);
> 
>  // 2. change the setting in the Config-object
>  SeedProviderDef spd = conf.seed_provider;
>  spd.parameters.put("seeds", "192.168.1.2");
> 
>  DumperOptions options = new DumperOptions();
>  options.setDefaultFlowStyle(DumperOptions.FlowStyle.BLOCK);
>  options.setDefaultScalarStyle(ScalarStyle.PLAIN);
> 
>  SkipNullRepresenter representer = new SkipNullRepresenter();
>  representer.addClassTag(Config.class, Tag.MAP);
> 
>  Dumper dumper = new Dumper(representer, options);
> 
>  Yaml yamlWriter = new Yaml(new Loader(constructor), dumper);
> 
>  // 3. and writing it back to another file.
>  String fileName = "/tmp/myCassandra.yaml";
>  Writer output = new FileWriter(fileName);
>  yamlWriter.dump(conf, output);
> 
>  Reader reader = new FileReader(fileName);
> 
>  Yaml myCassandraConfYaml = new Yaml(new Loader(constructor));
> 
>  // This procedure loses these two dashes and makes the configuration not
>  // usable for further processing in other tasks.
>  Config myConf = (Config) myCassandraConfYaml.load(reader);
> 
>   }
> }
> /* end */
> 
> And this is the output:
> 
> Exception in thread "main" Can't construct a java object for 
> tag:yaml.org,2002:org.apache.cassandra.config.Config; exception=Cannot create 
> property=seed_provider for 
> JavaBean=org.apache.cassandra.config.Config@3c3ac93e; 
> java.lang.NoSuchMethodException: 
> org.apache.cassandra.config.SeedProviderDef.<init>()
> in 'reader', line 1, column 1:
>authenticator: org.apache.cassan ...
>^
> 
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:333)
>   at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
>   at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructDocument(BaseConstructor.java:141)
>   at 
> org.yaml.snakeyaml.constructor.BaseConstructor.getSingleData(BaseConstructor.java:127)
>   at org.yaml.snakeyaml.Yaml.loadFromReader(Yaml.java:481)
>   at org.yaml.snakeyaml.Yaml.load(Yaml.java:424)
>   at com.fue.YamlConfigTest.main(YamlConfigTest.java:53)
> Caused by: org.yaml.snakeyaml.error.YAMLException: Cannot create 
> property=seed_provider for 
> JavaBean=org.apache.cassandra.config.Config@3c3ac93e; 
> java.lang.NoSuchMethodException: 
> org.apache.cassandra.config.SeedProviderDef.<init>()
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:299)
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:189)
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructYamlObject.construct(Constructor.java:331)
>   ... 6 more
> Caused by: org.yaml.snakeyaml.error.YAMLException: 
> java.lang.NoSuchMethodException: 
> org.apache.cassandra.config.SeedProviderDef.<init>()
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.createEmptyJavaBean(Constructor.java:219)
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.construct(Constructor.java:189)
>   at 
> org.yaml.snakeyaml.constructor.BaseConstructor.constructObject(BaseConstructor.java:182)
>   at 
> org.yaml.snakeyaml.constructor.Constructor$ConstructMapping.constructJavaBean2ndStep(Constructor.java:296)
>   ... 8 more
> Caused by: java.lang.NoSuchMethodException: 
> org.apache.cassandra.config.SeedProviderDef.<init>()
>   at java.lang.Class.getConstructor0(Class.java:2715)
>   at java.lang.Class.getDeclaredConstructor(Class.java:1987)
>   at 
> org.ya

Re: IndexOutOfBoundsException during repair, streaming

2013-04-03 Thread aaron morton
> We deleted and recreated those CFs before moving into
> production mode. 
We have a winner. 

The comparator is applying the current schema to the byte value read from disk
(schema on read), but that value describes a composite with more than 2 components.
It's then trying to apply the current schema so it can type-cast the bytes for
comparison.

Something must have gone wrong in the "deleted" part of your statement above.
We do not store schema with data, so this is a problem of changing the schema in
a way that is incompatible with existing data.

nodetool scrub is probably your best bet. I've not checked that it handles this 
specific problem, but in general it will drop rows from SSTables that cannot be 
read or have some other problem. Best thing to do is snapshot and copy the data 
from one prod node to a QA box and run some tests.

hope that helps. 

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 2:11 AM, Dane Miller  wrote:

> On Mon, Apr 1, 2013 at 10:19 PM, aaron morton  wrote:
>> ERROR [Thread-232] 2013-04-01 22:22:21,760 CassandraDaemon.java (line
>> 133) Exception in thread Thread[Thread-232,5,main]
>> java.lang.IndexOutOfBoundsException: index (2) must be less than size (2)
>>   at
>> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:305)
>>   at
>> com.google.common.base.Preconditions.checkElementIndex(Preconditions.java:284)
>>   at
>> com.google.common.collect.RegularImmutableList.get(RegularImmutableList.java:81)
>>   at
>> org.apache.cassandra.db.marshal.CompositeType.getComparator(CompositeType.java:96)
>> 
>> Something odd in the schema world perhaps.
>> 
>> Has the schema changed recently?
> 
> No, not recently.  But during development we experimented with other
> comparator types for those CFs.  More info below.
> 
>> Do yo have more than one schema in the cluster ? (describe cluster in
>> cassandra-cli)
> 
> I don't think so, that command in cassandra-cli shows just a single
> schema in the cluster:
> 
> Cluster Information:
>   Snitch: org.apache.cassandra.locator.Ec2Snitch
>   Partitioner: org.apache.cassandra.dht.RandomPartitioner
>   Schema versions:
> 126b31ad-3660-3831-9d4f-c6763c9acc97: [ ...ip list... ]
> 
> 
> This error happened on a CF where the composite is: (Integer, UTF8)
> 
> I'm a bit stumped about how we could get the an index==2 in that code
> pathway.  See here:
> https://github.com/apache/cassandra/blob/cassandra-1.2.1/src/java/org/apache/cassandra/db/marshal/AbstractCompositeType.java
> 
> ...start at line 63 in compare.
> 
> My Java is terrible, but all of our CompositeTypes are composites of
> only two types.  Thus the counter 'i' should never get up to 2 (it is
> used to access by index to individual comparator types within the
> composite), unless the value of the first two components in each of
> the column names being compared are equal, which should be impossible.
> 
> During development we experimented with other comparator types for
> those CFs.  We deleted and recreated those CFs before moving into
> production mode.  Is there a chance Cassandra 'remembers' these old
> types from a deleted CF that shares a name with an existing CF?  Could
> that be causing improper parsing of comparator column names?
> 
> That we enter this code pathway from a section that seems to want to
> clean up tombstones makes me think this is a possibility, that there
> is a tombstone somewhere whose composite column name is causing
> issues.
> 
> Dane



Data Model and Query

2013-04-03 Thread shubham srivastava
Hi,



What's the recommendation on querying a data model like StartDate > “X” and
counter > “Y”? It's a kind of range query across multiple columns and the key.

I have the flexibility to model the data for the above
query accordingly.



Regards,

Shubham


Re: Cassandra 1.0.10 to 1.2.3 upgrade "post-mortem"

2013-04-03 Thread aaron morton
> I just wanted to share our experience of upgrading 1.0.10 to 1.2.3
In general it's dangerous to skip a major release when upgrading. 

> ERROR [MutationStage:33] 2013-03-31 09:00:02,899 CassandraDaemon.java (line 
> 164) Exception in thread Thread[MutationStage:33,5,main]
> java.lang.AssertionError: Missing host ID for 10.0.1.8
Was 10.0.1.8 been updated ?

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 4:09 AM, Rustam Aliyev  wrote:

> Hi,
> 
> I just wanted to share our experience of upgrading 1.0.10 to 1.2.3. It
> happened that we first upgraded both of our two seeds to 1.2.3, and basically
> after that the old nodes couldn't communicate with the new ones anymore. The cluster
> was down until we upgraded all nodes to 1.2.3. We don't have many nodes and that
> process didn't take long. Yet it caused an outage of ~10 mins.
> 
> Here are some logs:
> 
> On the new, freshly upgraded seed node (v1.2.3):
> 
> ERROR [OptionalTasks:1] 2013-03-31 08:48:19,370 CassandraDaemon.java (line 
> 164) Exception in thread Thread[OptionalTasks:1,5,main]
> java.lang.NullPointerException
> at 
> org.apache.cassandra.service.MigrationManager$1.run(MigrationManager.java:137)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:98)
> at 
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:206)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> WARN [MutationStage:20] 2013-03-31 08:48:23,613 StorageProxy.java (line 577) 
> Unable to store hint for host with missing ID, /10.0.1.8 (old node?)
> 
> 
> 
> ERROR [MutationStage:33] 2013-03-31 09:00:02,899 CassandraDaemon.java (line 
> 164) Exception in thread Thread[MutationStage:33,5,main]
> java.lang.AssertionError: Missing host ID for 10.0.1.8
> at 
> org.apache.cassandra.service.StorageProxy.writeHintForMutation(StorageProxy.java:580)
> at 
> org.apache.cassandra.service.StorageProxy$5.runMayThrow(StorageProxy.java:555)
> at 
> org.apache.cassandra.service.StorageProxy$HintRunnable.run(StorageProxy.java:1643)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
> at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> 
> 
> 
> At the same time, old nodes (v1.0.10) were blinded:
> 
> 
> ERROR [RequestResponseStage:441] 2013-03-31 09:04:07,955 
> AbstractCassandraDaemon.java (line 139) Fatal exception in thread 
> Thread[RequestResponseStage:441,5,main]
> java.io.IOError: java.io.EOFException
> at 
> org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:71)
> at org.apache.cassandra.service.ReadCallback.response(ReadCallback.java:132)
> at 
> org.apache.cassandra.net.ResponseVerbHandler.doVerb(ResponseVerbHandler.java:45)
> at 
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:59)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.io.EOFException
> at java.io.DataInputStream.readFully(DataInputStream.java:180)
> at 
> org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:100)
> at 
> org.apache.cassandra.db.ReadResponseSerializer.deserialize(ReadResponse.java:81)
> at 
> org.apache.cassandra.service.AbstractRowResolver.preprocess(AbstractRowResolver.java:64)
> ... 6 more
> 
> .
> 
> INFO [GossipStage:3] 2013-03-31 09:06:08,885 Gossiper.java (line 804) 
> InetAddress /10.0.1.8 is now UP
> ERROR [GossipStage:3] 2013-03-31 09:06:08,885 AbstractCassandraDaemon.java 
> (line 139) Fatal exception in thread Thread[GossipStage:3,5,main]
> java.lang.UnsupportedOperationException: Not a time-based UUID
> at java.util.UUID.timestamp(UUID.java:308)
> at 
> org.apache.cassandra.service.MigrationManager.updateHighestKnown(MigrationManager.java:121)
> at 
> org.apache.cassandra.service.MigrationManager.rectify(MigrationManager.java:99)
> at 
> org.apache.cassandra.service.MigrationManager.onAlive(MigrationManager.java:83)
> at org.apache.cassandra.gms.Gossiper.markAlive(Gos

Re: Blob vs. "normal" columns (internals) difference?

2013-04-03 Thread aaron morton
> 1. Is size getting bigger in either one in storing one Tweet?
If you store the data in one blob then we only store one column name and the 
blob. If they are in different cols then we store the column names and their 
values.

> 2. Has either choice have impact on read/write performance on large scale?
If you store data in a blob you can only read and update it as a blob, so
chances are you will be wasting effort as you do read-modify-write operations.
Unless you have a good reason, split things up and store them as columns.

cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 1:08 PM, Alan Ristić  wrote:

> Hi guys,
> 
> Here is example (fictional) model I have for learning purposes...
> 
> I'm currently storing the "User" object in a Tweet as blob value. So taking 
> JSON of 'User' and storing it as blob. I'm wondering why is this better vs. 
> just prefixing and flattening column names?
> 
> Tweet {
>  id uuid,
>  user blob
> }
> 
> vs.
> 
> Tweet {
>  id uuid,
>  user_id uuid,
>  user_name text,
>  
> }
> 
> In one or other
> 
> 1. Is size getting bigger in either one in storing one Tweet?
> 2. Has either choice have impact on read/write performance on large scale?
> 3. Anything else I should be considering here? Your view/thinking would be 
> great.
> 
> Here is my understanding:
> For 'ease' of update if for example user changes its name I'm aware I need to 
> (re)write whole object in all Tweets in first "blob" example and only 
> user_name column in second 'flattened' example. Which brings me that If I'd 
> wanted to actually do this "updating/rewriting" for every Tweet I'd use 
> second 'flattened' example since payload of only user_name is smaller than 
> whole User blob for every Tweet right?
> 
> Nothing urgent, any input is valuable, tnx guys :)
> 
> 
> 
> Thanks and regards,
> Alan Ristić
> 
> w: personal blog  
>  t: @alanristic
>  l: linkedin.com/alanristic
> m: ​068 15 73 88​



Re: Blob vs. "normal" columns (internals) difference?

2013-04-03 Thread Chidambaran Subramanian
On Thu, Apr 4, 2013 at 6:58 AM, aaron morton wrote:

> > 1. Is size getting bigger in either one in storing one Tweet?
> If you store the data in one blob then we only store one column name and
> the blob. If they are in different cols then we store the column names and
> their values.
>
> > 2. Has either choice have impact on read/write performance on large
> scale?
> If you store data in a blob you can only read and update it as a blob, so
> chances are you will be wasting effort as you do read-modify-write
> operations. Unless you have a good reason split things up and store them as
> columns.
>
> If it's mostly read-only data that can be cached outside Cassandra, storing
it in one column looks like a good idea to me. What is the downside, anyway?



> cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/04/2013, at 1:08 PM, Alan Ristić  wrote:
>
> > Hi guys,
> >
> > Here is example (fictional) model I have for learning purposes...
> >
> > I'm currently storing the "User" object in a Tweet as blob value. So
> taking JSON of 'User' and storing it as blob. I'm wondering why is this
> better vs. just prefixing and flattening column names?
> >
> > Tweet {
> >  id uuid,
> >  user blob
> > }
> >
> > vs.
> >
> > Tweet {
> >  id uuid,
> >  user_id uuid,
> >  user_name text,
> >  
> > }
> >
> > In one or other
> >
> > 1. Is size getting bigger in either one in storing one Tweet?
> > 2. Has either choice have impact on read/write performance on large
> scale?
> > 3. Anything else I should be considering here? Your view/thinking would
> be great.
> >
> > Here is my understanding:
> > For 'ease' of update if for example user changes its name I'm aware I
> need to (re)write whole object in all Tweets in first "blob" example and
> only user_name column in second 'flattened' example. Which brings me that
> If I'd wanted to actually do this "updating/rewriting" for every Tweet I'd
> use second 'flattened' example since payload of only user_name is smaller
> than whole User blob for every Tweet right?
> >
> > Nothing urgent, any input is valuable, tnx guys :)
> >
> >
> >
> > Thanks and regards,
> > Alan Ristić
> >
> > w: personal blog
> >  t: @alanristic
> >  l: linkedin.com/alanristic
> > m: ​068 15 73 88​
>
>


Re: Repair does not fix inconsistency

2013-04-03 Thread aaron morton
What version are you on ? 

Can you run a repair on the CF and check:

Does the repair detect differences in the CF and stream changes ? 
After the streaming does it run a secondary index rebuild on the new sstable ? 
(Should be in the logs)

Can you provide the full query trace ? 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 5:25 PM, Michal Michalski  wrote:

> Hi,
> 
> TL;DR: I have inconsistent data (1 live row on node A & 1 tombstoned row on 
> node B) that do not get fixed by repair. What can be a problem?
> 
> Long version:
> 
> I have a CF containing Users' info, which I sometimes query by key, and 
> sometimes by indexed columns like email. I'm using RF=2. I write with CL.ONE, 
> but  this CF is very rarely updated, so C* has a looot of time to fix 
> inconsistencies that may occur, so I'm fine with this (at least in theory ;-) 
> ).
> 
> To be clear:
> - I've run a successfull cluster-wide repair on this CF before testing, so I 
> do not expect any inconsistency
> - All indexes are built, I've rebuilt them manually before testing, so I 
> expect them to work properly (I mention it because it seems to be somehow 
> related to indexes, but I'm not sure - see below)
> 
> The problem is:
> 
> When I query (cqlsh) some rows by key (CL is default = ONE) I _always_ get a 
> correct result.  However, when I query it by indexed column, it returns 
> nothing.
> 
> When tracing a query with CL.ALL in cqlsh, I get info that C* has:
> 
> Read 0 live cells and 1 tombstoned   // for first replica node
> Read 1 live cells and 0 tombstoned   // for second replica node
> 
> When CL is ONE it's never asking second replica for data (possibly due to 
> DynamicSnitch scores or so), so it returns nothing.
> 
> Switching to CL >= TWO obviously fixes this problem for us, but it's not the 
> solution I'd like to use as I'd rather rely on fast read/write requests with 
> CL.ONE + frequent repairs, allowing some short-term inconsistency.
> 
> Any ideas why it may happen that data are still inconsistent after repair? Is 
> there something I could have missed?
> 
> I'm mainly surprised that repair does not fix this inconsistency in ANY way - 
> either by pulling missing data to first replica _OR_ tombstoning it on second 
> replica. First one would be correct (delete was made a long time ago and then 
> the row reappeared), but both could make sense, as both will make the data 
> consistent. In this state it's definitely inconsistent and I don't understand 
> it :-)
> 
> 
> M.



Re: Alter table drop column seems not working

2013-04-03 Thread aaron morton
I don't think it's supported:
http://www.datastax.com/docs/1.2/cql_cli/cql/ALTER_TABLE#dropping-typed-col

Anyone else know?

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 8:11 PM, julien Campan  wrote:

> Hi,
> 
> I'm working with cassandra 1.2.2. 
> 
> When I try to drop a column , it's not working. 
> 
> This is what I tried : 
> 
> CREATE TABLE cust (
>   ise text PRIMARY KEY,
>   id_avatar_1 uuid,
>   id_avatar_2 uuid,
>   id_avatar_3 uuid,
>   id_avatar_4 uuid
> ) ;
> 
> 
> cqlsh> ALTER TABLE cust DROP id_avatar_1 ;
> 
> ==>Bad Request: line 1:17 no viable alternative at input 'DROP'
> ==>Perhaps you meant to use CQL 2? Try using the -2 option when ==>starting 
> cqlsh.
> 
> Can someone  tell me how to drop a column or if it is a bug ? 
> 
> Thank



Re: upgrading 1.1.x to 1.2.x via sstableloader

2013-04-03 Thread aaron morton
> java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db is 
> not compatible with current version ib
You cannot stream files that have a different on disk format. 

1.2 can read the old files, but cannot accept them as streams. You can copy the 
files to the new machines and use nodetool refresh to load them, then 
upgradesstables to re-write them before running repair. 

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 3/04/2013, at 10:53 PM, Michał Czerwiński  wrote:

> Does anyone knows what is the best process to put data from cassandra 1.1.x 
> (1.1.7 to be more precise) to cassandra 1.2.3 ?
> 
> I am trying to use sstableloader and stream data to a new cluster but I get.
> 
> ERROR [Thread-125] 2013-04-03 16:37:27,330 IncomingTcpConnection.java (line 
> 183) Received stream using protocol version 5 (my version 6). Terminating 
> connection
> 
> ERROR [Thread-141] 2013-04-03 16:38:05,704 CassandraDaemon.java (line 164) 
> Exception in thread Thread[Thread-141,5,main]
> 
> java.lang.UnsupportedOperationException: SSTable zzz/xxx/yyy-hf-47-Data.db is 
> not compatible with current version ib
> 
> at 
> org.apache.cassandra.streaming.StreamIn.getContextMapping(StreamIn.java:77)
> 
> at 
> org.apache.cassandra.streaming.IncomingStreamReader.<init>(IncomingStreamReader.java:87)
> 
> at 
> org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:238)
> 
> at 
> org.apache.cassandra.net.IncomingTcpConnection.handleStream(IncomingTcpConnection.java:178)
> 
> at 
> org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:78)
> 
> 
> 
> I've already changed Murmur3Partitioner to RandomPartitioner, and I noticed 
> I am not able to use 1.1.7's sstableloader, so I copied the sstables to the 
> new nodes and tried loading them locally on Cassandra 1.2.3, but it seems the 
> protocol versions do not match (see error above).
> 
> The reason I want to use sstableloader is that I have a different number of 
> nodes and would like to avoid using rsync and then repairing/cleaning up the 
> excess data.
> 
> Thanks!
> 



Re: Any plans for read-before-write update operations in CQL3?

2013-04-03 Thread aaron morton
I would guess not. 

> I know this goes against keeping updates idempotent, 
There are also issues with consistency, i.e. does the read happen locally or at 
the requested CL? 
And it makes things slower.

>  We currently do things like this in client code, but it would be great to be 
> able to do this on the server side to minimize the chance of race conditions.
Sometimes you can write the plus one into a new column and then apply the 
changes in the reading client thread. 
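
One way to read that suggestion, as a sketch only (the table and column names 
below are invented for illustration): have each writer append its own increment 
as a new row, so no read is needed at write time, and let the reading client 
merge them.

CREATE TABLE counter_deltas (
  key text,
  delta_id timeuuid,
  delta int,
  PRIMARY KEY (key, delta_id)
);

-- each writer blindly inserts its own increment, no read-before-write:
INSERT INTO counter_deltas (key, delta_id, delta) VALUES ('page-hits', now(), 1);

-- the reading client then selects the partition, sums the deltas itself, and
-- can periodically fold them back into a single base value.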

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 12:48 AM, Drew Kutcharian  wrote:

> Hi Guys,
> 
> Are there any short/long term plans to support UPDATE operations that require 
> read-before-write, such as increment on a numeric non-counter column? 
> i.e. 
> 
> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1;
> 
> UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix";
> 
> etc.
> 
> I know this goes against keeping updates idempotent, but there are times you 
> need to do these kinds of operations. We currently do things like this in 
> client code, but it would be great to be able to this on the server side to 
> minimize the chance of race conditions.
> 
> -- Drew



Re: Blob vs. "normal" columns (internals) difference?

2013-04-03 Thread aaron morton
>  What is the downside, anyway?
your code is now the only thing that can read the data, so it's harder to 
inspect in a CLI tool. 

IMHO just store the data in columns. 
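
For example, with separate columns a user name change stays a small 
single-column write instead of re-writing the whole blob (a sketch using the 
column names from the model quoted below; the uuid is a placeholder):

UPDATE Tweet SET user_name = 'New Name' WHERE id = 62c36092-82a1-3a00-93d1-46196ee77204;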

Cheers

-
Aaron Morton
Freelance Cassandra Consultant
New Zealand

@aaronmorton
http://www.thelastpickle.com

On 4/04/2013, at 7:04 AM, Chidambaran Subramanian  wrote:

> 
> 
> 
> On Thu, Apr 4, 2013 at 6:58 AM, aaron morton  wrote:
> > 1. Is size getting bigger in either one in storing one Tweet?
> If you store the data in one blob then we only store one column name and the 
> blob. If they are in different cols then we store the column names and their 
> values.
> 
> > 2. Has either choice have impact on read/write performance on large scale?
> If you store data in a blob you can only read and update it as a blob, so 
> chances are you will be wasting effort as you do read-modify-write 
> operations. Unless you have a good reason, split things up and store them as 
> columns.
> 
> If it's mostly read-only data that can be cached outside Cassandra, storing it 
> in one column looks like a good idea to me. What is the downside, anyway?
> 
>  
> cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 3/04/2013, at 1:08 PM, Alan Ristić  wrote:
> 
> > Hi guys,
> >
> > Here is example (fictional) model I have for learning purposes...
> >
> > I'm currently storing the "User" object in a Tweet as blob value. So taking 
> > JSON of 'User' and storing it as blob. I'm wondering why is this better vs. 
> > just prefixing and flattening column names?
> >
> > Tweet {
> >  id uuid,
> >  user blob
> > }
> >
> > vs.
> >
> > Tweet {
> >  id uuid,
> >  user_id uuid,
> >  user_name text,
> >  
> > }
> >
> > In one or other
> >
> > 1. Is size getting bigger in either one in storing one Tweet?
> > 2. Has either choice have impact on read/write performance on large scale?
> > 3. Anything else I should be considering here? Your view/thinking would be 
> > great.
> >
> > Here is my understanding:
> > For 'ease' of update if for example user changes its name I'm aware I need 
> > to (re)write whole object in all Tweets in first "blob" example and only 
> > user_name column in second 'flattened' example. Which brings me that If I'd 
> > wanted to actually do this "updating/rewriting" for every Tweet I'd use 
> > second 'flattened' example since payload of only user_name is smaller than 
> > whole User blob for every Tweet right?
> >
> > Nothing urgent, any input is valuable, tnx guys :)
> >
> >
> >
> > Hvala in lp,
> > Alan Ristić
> >
> > w: personal blog
> >  t: @alanristic
> >  l: linkedin.com/alanristic
> > m: ​068 15 73 88​
> 
> 



Re: Any plans for read-before-write update operations in CQL3?

2013-04-03 Thread Edward Capriolo
Counters are currently read-before-write, and some collection operations on
lists are read-before-write as well.
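
For example, counter columns already support this kind of in-place increment (a 
sketch; the table below is made up), and setting a list element by index 
likewise triggers an internal read of the list:

CREATE TABLE page_views (
  page text PRIMARY KEY,
  views counter
);

UPDATE page_views SET views = views + 1 WHERE page = 'home';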


On Wed, Apr 3, 2013 at 9:59 PM, aaron morton wrote:

> I would guess not.
>
> I know this goes against keeping updates idempotent,
>
> There are also issues with consistency. i.e. is the read local or does it
> happen at the CL level ?
> And it makes things go slower.
>
>  We currently do things like this in client code, but it would be great to
> be able to do this on the server side to minimize the chance of race
> conditions.
>
> Sometimes you can write the plus one into a new column and then apply the
> changes in the reading client thread.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
>
> @aaronmorton
> http://www.thelastpickle.com
>
> On 4/04/2013, at 12:48 AM, Drew Kutcharian  wrote:
>
> Hi Guys,
>
> Are there any short/long term plans to support UPDATE operations that
> require read-before-write, such as increment on a numeric non-counter
> column?
> i.e.
>
> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1;
>
> UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix";
>
> etc.
>
> I know this goes against keeping updates idempotent, but there are times
> you need to do these kinds of operations. We currently do things like this
> in client code, but it would be great to be able to do this on the server side
> to minimize the chance of race conditions.
>
> -- Drew
>
>
>


Re: Why do Datastax docs recommend Java 6?

2013-04-03 Thread Edward Capriolo
Hey guys, what gives!
Apparently Cassandra ran on 1.7 like 5 years ago. Was there a regression?

jk

https://github.com/jbellis/helenus

Installation

* Please use jdk 1.7; Cassandra will run with 1.6 but
  frequently core dumps on quad-core machines
* Unpack the tar ball:




On Wed, Feb 6, 2013 at 12:56 PM, Wei Zhu  wrote:

> Does anyone have first-hand experience with the Zing JVM, which is claimed to
> be pauseless? How do they charge, per CPU?
>
> Thanks
> -Wei
>   --
> *From:* Edward Capriolo 
> *To:* user@cassandra.apache.org
> *Sent:* Wednesday, February 6, 2013 7:07 AM
> *Subject:* Re: Why do Datastax docs recommend Java 6?
>
> Oracle already did this once, it was called JRockit :)
> http://www.oracle.com/technetwork/middleware/jrockit/overview/index.html
>
> Typically Oracle acquires the technology and then the bits are merged
> into the standard JVM.
>
> On Wed, Feb 6, 2013 at 2:13 AM, Viktor Jevdokimov <
> viktor.jevdoki...@adform.com> wrote:
>
>  I would prefer Oracle to own Azul's Zing JVM over any other (GC) and to
> provide it for free to anyone :)
>
>Best regards / Pagarbiai
> *Viktor Jevdokimov*
> Senior Developer
>
> Email: viktor.jevdoki...@adform.com
> Phone: +370 5 212 3063, Fax +370 5 261 0453
> J. Jasinskio 16C, LT-01112 Vilnius, Lithuania
>
>
>   *From:* jef...@gmail.com [mailto:jef...@gmail.com]
> *Sent:* Wednesday, February 06, 2013 02:23
> *To:* user@cassandra.apache.org
> *Subject:* Re: Why do Datastax docs recommend Java 6?
>
> Oracle now owns the Sun HotSpot team, which is inarguably the highest-powered
> Java VM team in the world. It's still really the epicenter of all
> Java VM development.
>  Sent from my Verizon Wireless BlackBerry
>  --
>  *From: *"Ilya Grebnov" 
>  *Date: *Tue, 5 Feb 2013 14:09:33 -0800
>  *To: *
>  *ReplyTo: *user@cassandra.apache.org
>  *Subject: *RE: Why do Datastax docs recommend Java 6?
>
>  Also, what is the particular reason to use the Oracle JDK over OpenJDK? Sorry,
> I could not find this information online.
>
> Thanks,
> Ilya
>  *From:* Michael Kjellman 
> [mailto:mkjell...@barracuda.com]
>
> *Sent:* Tuesday, February 05, 2013 7:29 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Why do Datastax docs recommend Java 6?
>
>  There have been tons of threads/convos on this.
>
>  In the early days of Java 7 it was pretty unstable and there was pretty
> much no convincing reason to use Java 7 over Java 6.
>
>  Now that Java 7 has stabilized and Java 6 is EOL it's a reasonable
> decision to use Java 7 and we do it in production with no issues to speak
> of.
>
>  That being said, there was one potential situation we've seen as a
> community where bootstrapping a new node was using 3x more CPU and getting
> significantly less throughput. However, reproducing this consistently never
> happened AFAIK.
>
>  I think once more people use Java 7 in production and prove it doesn't
> cause any additional bugs/performance issues, Datastax will update their
> docs. Until then, I'd say it's a safe bet to use Java 7 with vanilla C*
> 1.2.1. I hope this helps!
>
>  Best,
>  Michael
>
>  *From: *Baron Schwartz 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Tuesday, February 5, 2013 7:21 AM
> *To: *"user@cassandra.apache.org" 
> *Subject: *Why do Datastax docs recommend Java 6?
>
>   The Datastax docs repeatedly say (e.g.
> http://www.datastax.com/docs/1.2/install/install_jre) that Java 7 is not
> recommended, but they don't say why. It would be helpful to know this. Does
> anyone know?
>
>  The same documentation is referenced from the Cassandra wiki, for
> example, http://wiki.apache.org/cassandra/GettingStarted
>
>  - Baron
>
>
>
>
>

Re: Any plans for read-before-write update operations in CQL3?

2013-04-03 Thread Drew Kutcharian
I guess it'd be safe to say that the read consistency could be the same as the 
consistency of the update. But regardless, that would be a lot better than 
reading a value, modifying it on the client side and then writing it back.


On Apr 3, 2013, at 7:12 PM, Edward Capriolo  wrote:

> Counters are currently read before write, some collection operations on List 
> are read before write.
> 
> 
> On Wed, Apr 3, 2013 at 9:59 PM, aaron morton  wrote:
> I would guess not. 
> 
>> I know this goes against keeping updates idempotent, 
> There are also issues with consistency. i.e. is the read local or does it 
> happen at the CL level ? 
> And it makes things go slower.
> 
>>  We currently do things like this in client code, but it would be great to 
>> be able to do this on the server side to minimize the chance of race conditions.
> Sometimes you can write the plus one into a new column and then apply the 
> changes in the reading client thread. 
> 
> Cheers
> 
> -
> Aaron Morton
> Freelance Cassandra Consultant
> New Zealand
> 
> @aaronmorton
> http://www.thelastpickle.com
> 
> On 4/04/2013, at 12:48 AM, Drew Kutcharian  wrote:
> 
>> Hi Guys,
>> 
>> Are there any short/long term plans to support UPDATE operations that 
>> require read-before-write, such as increment on a numeric non-counter 
>> column? 
>> i.e. 
>> 
>> UPDATE CF SET NON_COUNTER_NUMERIC_COLUMN = NON_COUNTER_NUMERIC_COLUMN + 1;
>> 
>> UPDATE CF SET STRING_COLUMN = STRING_COLUMN + "postfix";
>> 
>> etc.
>> 
>> I know this goes against keeping updates idempotent, but there are times you 
>> need to do these kinds of operations. We currently do things like this in 
>> client code, but it would be great to be able to do this on the server side to 
>> minimize the chance of race conditions.
>> 
>> -- Drew
> 
> 



Apache Cassandra for Developers-Starter

2013-04-03 Thread Vivek Mishra
Hi,
Just wanted to share that I recently worked with Packt Publishing to author
a quick Cassandra reference in the form of a book. Here it is:
http://www.packtpub.com/apache-cassandra-for-developers/book


Sincerely,
-Vivek


Issues running Bulkloader program on AIX server

2013-04-03 Thread praveen.akunuru
Hi All,

I am facing issues running the Java bulk loader program from an AIX server. The 
program works fine on a Linux server. I am receiving the error below on AIX. 
Can anyone help me get this working?

java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at 
org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:317)
at org.xerial.snappy.SnappyLoader.load(SnappyLoader.java:219)
at org.xerial.snappy.Snappy.(Snappy.java:44)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at 
org.apache.cassandra.io.compress.SnappyCompressor.create(SnappyCompressor.java:45)
at 
org.apache.cassandra.io.compress.SnappyCompressor.isAvailable(SnappyCompressor.java:55)
at 
org.apache.cassandra.io.compress.SnappyCompressor.(SnappyCompressor.java:37)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at org.apache.cassandra.config.CFMetaData.(CFMetaData.java:82)
at java.lang.J9VMInternals.initializeImpl(Native Method)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:200)
at 
org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.(SSTableSimpleUnsortedWriter.java:80)
at 
org.apache.cassandra.io.sstable.SSTableSimpleUnsortedWriter.(SSTableSimpleUnsortedWriter.java:93)
at BulkLoadExample.main(BulkLoadExample.java:55)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:60)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:37)
at java.lang.reflect.Method.invoke(Method.java:611)
at 
org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.UnsatisfiedLinkError: snappyjava (Not found in 
java.library.path)
at java.lang.ClassLoader.loadLibraryWithPath(ClassLoader.java:1011)
at 
java.lang.ClassLoader.loadLibraryWithClassLoader(ClassLoader.java:975)
at java.lang.System.loadLibrary(System.java:469)
at 
org.xerial.snappy.SnappyNativeLoader.loadLibrary(SnappyNativeLoader.java:52)
... 25 more
log4j:WARN No appenders could be found for logger 
(org.apache.cassandra.io.compress.SnappyCompressor).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more 
info.
Unhandled exception
Type=Segmentation error vmState=0x
J9Generic_Signal_Number=0004 Signal_Number=000b Error_Value= 
Signal_Code=0032
Handler1=09001000A06FF5A0 Handler2=09001000A06F60F0

Regards,
Praveen
