Re: Filter data on row key in Cassandra Hadoop's Random Partitioner

2012-12-13 Thread Ayush V.
Thanks Hiller and Shamim. Let me share more details. I want to use Cassandra MR to calculate some KPIs on data that is stored in Cassandra continuously. Fetching the whole data set from Cassandra every time seems like unnecessary overhead to me. The row key I'm using looks like "(timestamp/6)_otherid";
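A minimal sketch of the bucketed row-key scheme mentioned in this post, assuming the timestamp is an integer and that "/6" means integer division (the post states neither). The function names are hypothetical. With RandomPartitioner, rows are ordered by hashed token rather than by key, so one possible way to avoid reading everything is to compute the keys for the time window of interest and fetch only those rows, provided the set of otherid values is known in advance.

```python
def row_key(timestamp, other_id, bucket=6):
    """Build a key of the form '<timestamp // bucket>_<other_id>'."""
    return f"{timestamp // bucket}_{other_id}"


def keys_for_window(start_ts, end_ts, other_ids, bucket=6):
    """List every row key whose time bucket overlaps [start_ts, end_ts]."""
    first, last = start_ts // bucket, end_ts // bucket
    return [f"{b}_{oid}" for b in range(first, last + 1) for oid in other_ids]
```

The resulting key list could then be passed to a multi-get style read instead of a full scan; whether that beats a full MR pass depends on how many buckets and ids the window covers.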

Re: Multiple Data Center shows very uneven load

2012-12-13 Thread Sergey Olefir
I'll try nodetool drain, thanks. But more generally -- are you basically saying that I should not worry about these things? Data will not keep accumulating indefinitely in production and it'll not affect performance negatively (despite vast differences in node load)? Best regards, Sergey aaron

Does a scrub remove deleted/expired columns?

2012-12-13 Thread Mike Smith
I'm using 1.0.12 and I find that large sstables tend to get compacted infrequently. I've got data that gets deleted or expired frequently. Is it possible to use scrub to accelerate the clean up of expired/deleted data? -- Mike Smith Director Development, MailChannels

Best Java Driver for Cassandra?

2012-12-13 Thread Stephen.M.Thompson
There seem to be a number of good options listed ... FireBrand and Hector seem to have the most attractive sites, but that doesn't necessarily mean anything. :) Can anybody make a case for one of the drivers over another, especially in terms of which ones seem to be most used in major implemen

Re: Why Secondary indexes is so slowly by my test?

2012-12-13 Thread Edward Capriolo
Until the change that stops secondary indexes from doing a read before write is in a release and stabilized, you should follow Ed Anuff's blog and do your indexing yourself with composites. On Thursday, December 13, 2012, aaron morton wrote: > The IndexClause for the get_indexed_slices takes a start key. You can page the
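To make the "do your indexing yourself with composites" idea concrete, here is a rough in-memory sketch. It is not Cassandra client code: the class `ManualIndex` and its methods are hypothetical, and a sorted Python list stands in for one wide index row whose column names are (indexed value, row key) composites. The `start_key` argument mirrors the paging-by-start-key idea in the quoted reply.

```python
import bisect


class ManualIndex:
    """In-memory stand-in for one wide index row with composite column names."""

    def __init__(self):
        # Sorted list of (indexed_value, row_key) pairs, like sorted columns.
        self._columns = []

    def add(self, value, row_key):
        """Insert the composite entry, keeping the list sorted and unique."""
        entry = (value, row_key)
        pos = bisect.bisect_left(self._columns, entry)
        if pos == len(self._columns) or self._columns[pos] != entry:
            self._columns.insert(pos, entry)

    def remove(self, value, row_key):
        """Delete the composite entry if it is present."""
        entry = (value, row_key)
        pos = bisect.bisect_left(self._columns, entry)
        if pos < len(self._columns) and self._columns[pos] == entry:
            del self._columns[pos]

    def lookup(self, value, start_key="", limit=100):
        """Return up to `limit` row keys for `value`, from `start_key` onward."""
        pos = bisect.bisect_left(self._columns, (value, start_key))
        keys = []
        while pos < len(self._columns) and len(keys) < limit:
            indexed_value, row_key = self._columns[pos]
            if indexed_value != value:
                break
            keys.append(row_key)
            pos += 1
        return keys
```

Paging works by passing the last key of one page as `start_key` for the next (the start is inclusive, so the caller drops the first result). The cost of this approach is that the application must keep the index entries in step with the data itself.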

Re: Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-13 Thread Edward Capriolo
It should be good stuff. Brian eats this stuff for lunch. On Wednesday, December 12, 2012, Brian O'Neill wrote: > FWIW -- > I'm presenting tomorrow for the Datastax C*ollege Credit Webinar Series: > http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html > > I hope to

Re: Help on MMap of SSTables

2012-12-13 Thread Edward Capriolo
This issue has to be looked at from both a micro and a macro level. On the micro level the "best" way is workload specific. On the macro level this mostly boils down to data and memory size. Compactions are going to churn cache, this is unavoidable. Imho solid state makes the micro optimization meaningless in th

Re: Best Java Driver for Cassandra?

2012-12-13 Thread Brian O'Neill
Well, we'll talk a bit about this in my webinar later today... http://brianoneill.blogspot.com/2012/12/presenting-for-datastax-college-credit.html I put together a quick decision matrix for all of the options based on production-readiness, potential and momentum. I think the slides will be made a

Re: Why Secondary indexes is so slowly by my test?

2012-12-13 Thread Alain RODRIGUEZ
Hi Edward, can you share the link to this blog? Alain 2012/12/13 Edward Capriolo > Ed Anuff's

Re: Why Secondary indexes is so slowly by my test?

2012-12-13 Thread Edward Capriolo
Here is a good start. http://www.anuff.com/2011/02/indexing-in-cassandra.html On Thu, Dec 13, 2012 at 11:35 AM, Alain RODRIGUEZ wrote: > Hi Edward, can you share the link to this blog? > > Alain > > 2012/12/13 Edward Capriolo > >> Ed Anuff's > > >

Re: Why Secondary indexes is so slowly by my test?

2012-12-13 Thread Tyler Hobbs
If anyone's interested in a little more background on the read-before-write fix that Ed mentioned, see: https://issues.apache.org/jira/browse/CASSANDRA-2897 On Thu, Dec 13, 2012 at 11:31 AM, Edward Capriolo wrote: > Here is a good start. > > http://www.anuff.com/2011/02/indexing-in-cassandra.htm

Re: Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-13 Thread Wei Zhu
I tried to register and got the following page, but I haven't received the email yet. I registered 10 minutes ago. Thank you for registering to attend: Is My App a Good Fit for Apache Cassandra? Details about this webinar have also been sent to your email, including a link to the webinar's URL. W

Re: Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra

2012-12-13 Thread Wei Zhu
Never mind, the email arrived after 15 minutes or so... From: Wei Zhu To: "user@cassandra.apache.org" Sent: Thursday, December 13, 2012 10:06 AM Subject: Re: Datastax C*ollege Credit Webinar Series : Create your first Java App w/ Cassandra I tried to regis

State of Cassandra and Java 7

2012-12-13 Thread Drew Kutcharian
Hey Guys, With Java 6 being EOL-ed soon (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the status of Cassandra's Java 7 support? Anyone using it in production? Any outstanding *known* issues? -- Drew

Re: State of Cassandra and Java 7

2012-12-13 Thread Michael Kjellman
Works just fine for us. On 12/13/12 11:43 AM, "Drew Kutcharian" wrote: >Hey Guys, > >With Java 6 being EOL-ed soon >(https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's >the status of Cassandra's Java 7 support? Anyone using it in production? >Any outstanding *known* issues? >

BulkOutputFormat error - org.apache.thrift.transport.TTransportException

2012-12-13 Thread ANAND_BALARAMAN
Hi, I am a newbie to Cassandra. I was trying out a sample (word count) program using BulkOutputFormat and got stuck on an error. What I am trying to do is migrate all Hive tables (from a Hadoop cluster) to Cassandra column families. My MR program is configured to run on Hadoop cluster v 0.20.2 (cdh3u3

Re: Multiple Data Center shows very uneven load

2012-12-13 Thread aaron morton
There is a limit on the size of the commit log and on how long Hints are stored for. I'm not sure why your load was different, I think it was leftover hints and commit log. But it's not always easy to diagnose things via email. Hopefully nodetool drain or deleting the rest system and starting a

Re: Does a scrub remove deleted/expired columns?

2012-12-13 Thread aaron morton
> Is it possible to use scrub to accelerate the clean up of expired/deleted > data? No. Scrub, and upgradesstables, are used to re-write each file on disk. Scrub may remove some rows from a file because of corruption, however upgradesstables will not. If you have long lived rows and a mixed w

Re: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

2012-12-13 Thread aaron morton
Looks like it cannot connect to the server >conf.set("cassandra.output.thrift.address", "localhost"); Is this the same address as the rpc_address in the cassandra config ? Cheers - Aaron Morton Freelance Cassandra Developer New Zealand @aaronmorton http://www.thelastpic
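As a quick way to test the "cannot connect" theory from the machine that runs the job, a plain TCP check like the sketch below may help. It is only an assumption-laden probe: the helper name `can_connect` is hypothetical, 9160 is the usual default Thrift rpc_port (check cassandra.yaml for the actual value), and a successful TCP connection does not prove the Thrift interface is healthy.

```python
import socket


def can_connect(host, port=9160, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False
```

Note that "localhost" resolves on whichever machine runs the code, so a Hadoop task node checking "localhost" is probing itself, not the Cassandra node.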

Re: State of Cassandra and Java 7

2012-12-13 Thread Rob Coli
On Thu, Dec 13, 2012 at 11:43 AM, Drew Kutcharian wrote: > With Java 6 being EOL-ed soon > (https://blogs.oracle.com/java/entry/end_of_public_updates_for), what's the > status of Cassandra's Java 7 support? Anyone using it in production? Any > outstanding *known* issues? I'd love to see an off

Re: Why Secondary indexes is so slowly by my test?

2012-12-13 Thread Chengying Fang
I did miss this important article about indexing, which discusses the main concerns. In fact, I have used composite columns to resolve my problem. In some contexts the data model can serve as an 'alternate index', but it's complicated and can result in new problems: data redundancy and maintenanc

ETL Tools to transfer data from Cassandra into other relational databases

2012-12-13 Thread cko2...@gmail.com
We will use Cassandra as logging storage in one of our web applications. The application only inserts rows into Cassandra and never updates or deletes any rows. The CF is expected to grow by about 0.5 million rows per day. We need to transfer the data in Cassandra to another relational database dai

Re: Does a scrub remove deleted/expired columns?

2012-12-13 Thread Mike Smith
Thanks for the great explanation. I'd just like some clarification on the last point. Is it the case that if I constantly add new columns to a row, while periodically trimming the row by deleting the oldest columns, the deleted columns won't get cleaned up until all fragments of the row exist i

Re: ETL Tools to transfer data from Cassandra into other relational databases

2012-12-13 Thread Milind Parikh
Why would you use Cassandra as the primary store for logging information? Have you considered Kafka? You could, of course, then fan out the logs to both Cassandra (on a near-real-time basis) and then on a daily basis (if you wish) extract the "deltas" from Kafka into an RDBMS; with no PIG/Hive etc.

RE: BulkOutputFormat error - org.apache.thrift.transport.TTransportException

2012-12-13 Thread ANAND_BALARAMAN
Aaron, both the rpc_address in the cassandra.yaml file and the job configuration are the same (localhost). I will try connecting to a different Cassandra cluster and test it again. -Original Message- From: aaron morton [mailto:aa...@thelastpickle.com] Sent: Thursday, December 13, 2012 9:03 PM To: user