Re: How to retrieve snappy compressed data from Cassandra using Datastax?

2014-01-28 Thread Alex Popescu
Wouldn't it be better to delegate the compression part to Cassandra (which supports Snappy [1])? This way the compression part will be completely transparent to your application. [1] http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-compression On Tue, Jan 28, 2014 at 8:51 PM, Check Pe

Re: Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Kumar Ranjan
Hi Michael - Yes, 7000, 7001, 9042, 9160 are all open on EC2. The issue was that the seeds address and listen_address were set to 127.0.0.1 and private_ip. This may help anyone with the same problem: http://stackoverflow.com/questions/20690987/apache-cassandra-unable-to-gossip-with-any-seeds On Wed, Jan 29, 2014 at 1:12 AM, Michael Sh
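A minimal sketch of the kind of sanity check that would catch this misconfiguration, assuming the problem was a loopback seed address on a node that listens on its private IP. The function name and the 10.0.0.x addresses are hypothetical; this is not part of Cassandra or any Cassandra tooling.

```python
import ipaddress


def find_seed_problems(listen_address, seeds):
    """Return human-readable problems with a node's seed settings.

    Hypothetical helper for illustration only.
    """
    problems = []
    if ipaddress.ip_address(listen_address).is_loopback:
        problems.append(
            f"listen_address {listen_address} is loopback; other nodes cannot reach this node"
        )
    for seed in seeds:
        if ipaddress.ip_address(seed).is_loopback:
            problems.append(
                f"seed {seed} is loopback; it does not point at the seed node's private IP"
            )
    return problems


# Node 2 listening on its (hypothetical) private IP but seeded with 127.0.0.1.
print(find_seed_problems("10.0.0.12", ["127.0.0.1"]))
# Same node seeded with the seed node's (hypothetical) private IP.
print(find_seed_problems("10.0.0.12", ["10.0.0.11"]))
```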

Re: Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Michael Shuler
Did you open up the ports so they can talk to each other? http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installAMISecurityGroup.html -- Michael

Re: OpenJDK is not recommended? Why

2014-01-28 Thread Kumar Ranjan
Yes, I got rid of OpenJDK and installed the Oracle version, and the warning went away. Happy happy... Thank you, folks. On Tue, Jan 28, 2014 at 11:59 PM, Michael Shuler wrote: > On 01/28/2014 09:55 PM, Kumar Ranjan wrote: > >> I am in the process of setting up a 2-node cluster with C* version 2.0.4. When I >> started

Issues with seeding on EC2 for C* 2.0.4 - help needed

2014-01-28 Thread Kumar Ranjan
Hey Folks - I am burning the midnight oil fast but can't figure out what I am doing wrong. The log files have this. I have also listed partial configurations for both the seed node and node 2. INFO [main] 2014-01-29 05:15:11,515 CommitLog.java (line 127) Log replay complete, 46 replayed mutations INFO [main

Re: OpenJDK is not recommended? Why

2014-01-28 Thread Michael Shuler
On 01/28/2014 09:55 PM, Kumar Ranjan wrote: I am in the process of setting up a 2-node cluster with C* version 2.0.4. When I started each node, they failed to communicate; thus each is running separately and not in the same ring. So I started looking at the log files and saw the message below: This is probably j

How to retrieve snappy compressed data from Cassandra using Datastax?

2014-01-28 Thread Check Peck
I am working on a project in which I am supposed to store the snappy compressed data in Cassandra, so that when I retrieve the same data from Cassandra, it should be snappy compressed in memory and then I will decompress that data using snappy to get the actual data from it. I am having a byte arr

Re: OpenJDK is not recommended? Why

2014-01-28 Thread Colin
OpenJDK has known issues, and they will rear their ugly little heads from time to time; I have experienced them myself. To be safe, I would use the latest Oracle 7 release. You may also be experiencing a configuration issue; make sure one node is specified as the seed node and that the other nod

Re: question about secondary index or not

2014-01-28 Thread Jimmy Lin
In my #2 example: select * from people where company_id='xxx' and gender='male' I already specify the first part of the primary key (row key) in my where clause, so how does the secondary indexed column gender='male' help determine which row to return? It is more like filtering a list of column fro

OpenJDK is not recommended? Why

2014-01-28 Thread Kumar Ranjan
I am in the process of setting up a 2-node cluster with C* version 2.0.4. When I started each node, they failed to communicate; thus each is running separately and not in the same ring. So I started looking at the log files and saw the message below: WARN [main] 2014-01-28 06:02:17,861 CassandraDaemon.java (line 15

Re: Heavy update dataset and compaction

2014-01-28 Thread Robert Wille
> > Perhaps a log structured database with immutable data files is not best suited > for this use case? Perhaps not, but I have other data structures I'm moving to Cassandra as well. This is just the first. Cassandra has actually worked quite well for this first step, in spite of it not being an

Re: resetting nodetool info exception count

2014-01-28 Thread Robert Coli
On Tue, Jan 28, 2014 at 2:16 PM, John Pyeatt wrote: > Is there any way of resetting the value of a nodetool info Exceptions > value manually? > > Is there a JMX call I can make? > Almost certainly not. =Rob

Re: GC eden filled instantly (any size). Dropping messages.

2014-01-28 Thread Arya Goudarzi
Dimetrio, Look at my last post. I showed you how to turn on all useful GC logging flags. From there we can get information on why GC has long pauses. From the changes you have made, it seems you are changing things without knowing the effect. Here are a few things to consider: - Having a 9GB NewG

Re: Help me on Cassandra Data Modelling

2014-01-28 Thread Thunder Stumpges
Hey Naresh, Unfortunately I don't have any further advice. I keep feeling like you're looking at a search problem instead of a lookup problem. Perhaps Cassandra is not the right tool for your need in this case. Perhaps something with a full-text index type feature would help. Or perhaps someone m

resetting nodetool info exception count

2014-01-28 Thread John Pyeatt
Is there any way of resetting the value of a nodetool info Exceptions value manually? Is there a JMX call I can make? -- John Pyeatt Singlewire Software, LLC www.singlewire.com -- 608.661.1184 john.pye...@singlewire.com

Re: question about secondary index or not

2014-01-28 Thread Mullen, Robert
I would do #2. Take a look at this blog which talks about secondary indexes, cardinality, and what it means for cassandra. Secondary indexes in cassandra are a different beast, so often old rules of thumb about indexes don't apply. http://www.wentnet.com/blog/?p=77 On Tue, Jan 28, 2014 at 1

Re: no more zookeeper?

2014-01-28 Thread S Ahmed
Sorry guys, I am confusing things with HBase. But Nate's JIRA link sure looks interesting, thanks. On Tue, Jan 28, 2014 at 12:25 PM, Edward Capriolo wrote: > Some people had done some custom cassandra zookeeper integration back in > the day. Triggers, there is some reference in the original faceb

Re: A question to OutboundTcpConnection.expireMessages()

2014-01-28 Thread Robert Coli
On Mon, Jan 27, 2014 at 11:40 PM, Lu, Boying wrote: > When I read the codes of OutboundTcpConnection.expireMessages(), I found > the following snippet in a loop: > > if (qm.timestamp >= System.currentTimeMillis() - > qm.message.getTimeout()) > > *return*; > > > > My understan
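A minimal sketch of the kind of loop the quoted check could sit in, assuming a queue held in enqueue order where expiry stops at the first message that is still live. This is an illustration in Python, not Cassandra's actual implementation; the `QueuedMessage` fields and the `expire_messages` function are hypothetical stand-ins for `qm.timestamp` and `qm.message.getTimeout()`.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class QueuedMessage:
    timestamp_ms: int  # when the message was enqueued
    timeout_ms: int    # how long it may wait before it counts as expired


def expire_messages(queue, now_ms):
    """Drop expired messages from the head of the queue.

    Returns as soon as the head message is still live, mirroring the
    early `return` in the quoted snippet.
    """
    dropped = 0
    while queue:
        qm = queue[0]
        if qm.timestamp_ms >= now_ms - qm.timeout_ms:
            return dropped  # head is still live; stop scanning
        queue.popleft()
        dropped += 1
    return dropped


queue = deque([
    QueuedMessage(timestamp_ms=0, timeout_ms=100),
    QueuedMessage(timestamp_ms=50, timeout_ms=100),
    QueuedMessage(timestamp_ms=950, timeout_ms=100),
])
print(expire_messages(queue, now_ms=1000), len(queue))
```

In this sketch the early return is only exact when messages expire in the order they were enqueued; with differing per-message timeouts, an expired message behind a live one stays in the queue until a later pass.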

Re: Heavy update dataset and compaction

2014-01-28 Thread Robert Coli
On Tue, Jan 28, 2014 at 7:57 AM, Robert Wille wrote: > I have a dataset which is heavy on updates. The updates are actually > performed by inserting new records and deleting the old ones the following > day. Some records might be updated (replaced) a thousand times before they > are finished. >

question about secondary index or not

2014-01-28 Thread Jimmy Lin
I have a simple column family like the following: create table people( company_id text, employee_id text, gender text, primary key(company_id, employee_id) ); If I want to find all the "male" employees for a given company id, I can do 1/ select * from people where company_id=' and loop through
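A minimal sketch of option 1/ (read the whole partition for a company, then filter on gender client side), using an in-memory dictionary as a stand-in for the `people` table so that it runs without a cluster. The helper names and the sample rows are hypothetical; no driver API is used or implied.

```python
# Hypothetical in-memory stand-in for the `people` table:
# partition key (company_id) -> list of rows in that partition.
people = {}


def insert(company_id, employee_id, gender):
    """Add one row to the partition for company_id."""
    people.setdefault(company_id, []).append(
        {"company_id": company_id, "employee_id": employee_id, "gender": gender}
    )


def employees_by_gender(company_id, gender):
    """Option 1/: fetch every row for the company, then filter in the client."""
    rows = people.get(company_id, [])  # stands in for the select on company_id
    return [row["employee_id"] for row in rows if row["gender"] == gender]


insert("acme", "e1", "male")
insert("acme", "e2", "female")
insert("acme", "e3", "male")
insert("other", "e9", "male")
print(employees_by_gender("acme", "male"))
```

The cost of this approach grows with the size of the partition, since every row for the company is read before any are discarded.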

Re: question about secondary index or not

2014-01-28 Thread Edward Capriolo
Generally, indexes on binary fields (true/false, male/female) are not terribly effective. On Tue, Jan 28, 2014 at 12:40 PM, Jimmy Lin wrote: > I have a simple column family like the following > > create table people( > company_id text, > employee_id text, > gender text, > primary key(company_id, em

Re: no more zookeeper?

2014-01-28 Thread Edward Capriolo
Some people had done some custom cassandra zookeeper integration back in the day. Triggers, there is some reference in the original facebook thrown over the wall to zk. No official release has ever used zk directly, though people have suggested it. On Tue, Jan 28, 2014 at 12:08 PM, Andrey Ilinykh

Re: no more zookeeper?

2014-01-28 Thread Andrey Ilinykh
Why would cassandra use zookeeper? On Tue, Jan 28, 2014 at 7:18 AM, S Ahmed wrote: > Does C* no longer use zookeeper? > > I don't see a reference to it in the > https://github.com/apache/cassandra/blob/trunk/build.xml > > If not, what replaced it? >

Re: Help me on Cassandra Data Modelling

2014-01-28 Thread Naresh Yadav
Please provide inputs on my last email, if any. On Tue, Jan 28, 2014 at 7:18 AM, Naresh Yadav wrote: > yes thunder you are right, i had simplified that by moving *tags > *search(partial/exact) > in separate column family tagcombination which will act as index for all > search based on tags and in my o

Re: Possible optimization: avoid creating tombstones for TTLed columns if updates to TTLs are disallowed

2014-01-28 Thread horschi
Hi Donald, I reported the ticket you mentioned, so I kind of feel like I should answer this :-) I presume the point is that GCable tombstones can still do work > (preventing spurious writing from nodes that were down) but only until the > data is flushed to disk. > I am not sure I understand

Re: no more zookeeper?

2014-01-28 Thread Nate McCall
AFAIK zookeeper was never in use. It was discussed once or twice over the years, but never seriously. If you are talking about the origins of the current lightweight transactions in 2.0, take a look at this issue (warning - it's one of the longer ASF jira issues I've seen, but some good stuff in t

Re: Heavy update dataset and compaction

2014-01-28 Thread Nate McCall
LeveledCompactionStrategy is ideal for update heavy workloads. If you are using a pre 1.2.8 version make sure you set the sstable_size_in_mb up to the new default of 160. Also, keep an eye on "Average live cells per slice" and "Average tombstones per slice" (available in versions > 1.2.11 - so I g

Heavy update dataset and compaction

2014-01-28 Thread Robert Wille
I have a dataset which is heavy on updates. The updates are actually performed by inserting new records and deleting the old ones the following day. Some records might be updated (replaced) a thousand times before they are finished. As I watch SSTables get created and compacted on my staging serve

RE: no more zookeeper?

2014-01-28 Thread S Ahmed
Does C* no longer use zookeeper? I don't see a reference to it in the https://github.com/apache/cassandra/blob/trunk/build.xml If not, what replaced it?

Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Sylvain Lebresne
> > > I have actually set up one of our application streams such that the same > key is only overwritten with a monotonically increasing ttl. > > For example, a breaking news item might have an initial ttl of 60 seconds, > followed in 45 seconds by an update with a ttl of 3000 seconds, followed by

Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Laing, Michael
Thanks again Sylvain! I have actually set up one of our application streams such that the same key is only overwritten with a monotonically increasing ttl. For example, a breaking news item might have an initial ttl of 60 seconds, followed in 45 seconds by an update with a ttl of 3000 seconds, fo
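The arithmetic behind the breaking-news example can be sketched as follows, assuming that what matters is that each overwrite expires no earlier than the write it replaces. That reading is an assumption for illustration, and the function names are hypothetical.

```python
def expiry_time(write_time_s, ttl_s):
    """Absolute time (in seconds) at which a write with this ttl expires."""
    return write_time_s + ttl_s


def expiries_never_decrease(writes):
    """writes: list of (write_time_s, ttl_s) pairs in write order.

    True when every overwrite expires at or after the previous write.
    """
    expiries = [expiry_time(t, ttl) for t, ttl in writes]
    return all(a <= b for a, b in zip(expiries, expiries[1:]))


# Initial ttl of 60 seconds, then an update 45 seconds later with a ttl of 3000 seconds.
news_item = [(0, 60), (45, 3000)]
print([expiry_time(t, ttl) for t, ttl in news_item])
print(expiries_never_decrease(news_item))
```

In this example the first write would expire at second 60 and the update at second 3045, so the newer value always outlives the older one.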

Re: No deletes - is periodic repair needed? I think not...

2014-01-28 Thread Sylvain Lebresne
On Tue, Jan 28, 2014 at 1:05 AM, Edward Capriolo wrote: > If you have only ttl columns, and you never update the column I would not > think you need a repair. > Right, no deletes and no updates is the case 1. of Michael on which I think we all agree 'periodic repair to avoid resurrected columns'