For full disclosure, I've been in the Apache Cassandra community since 2010 and
at DataStax since 2012.
So DataStax moved on to focus on things for their customers, effectively
putting most development effort into DataStax Enterprise. However, there have
been a lot of fixes and improvements co
Generally, if you foresee the partitions getting out of control in terms of
size, a method often employed is to bucket according to some criterion. For
example, if I have a time series use case, I might bucket by month or week.
That presumes you can foresee it though. As far as limiting that ca
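To make the bucketing concrete, here's a minimal sketch in Java (the key layout and all names are hypothetical, just to illustrate the idea):

import java.text.SimpleDateFormat;
import java.util.Date;

public class BucketedKeys {
    // One partition per entity per month, e.g. "sensor42:2012-09".
    private static final SimpleDateFormat MONTH = new SimpleDateFormat("yyyy-MM");

    public static String monthBucket(String entityId, Date eventTime) {
        return entityId + ":" + MONTH.format(eventTime);
    }
}

Reads and writes for a given month then address that bucket's row key, which caps how large any single row can grow.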
For an individual node, you can check the status of building indexes using
nodetool compactionstats. Similarly, if you want to speed up building the
indexes (and you have the extra I/O), you can increase or unthrottle your
compaction throughput temporarily - nodetool setcompactionthroughput 0 to
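For reference, the sequence looks something like this (16 restores the default mentioned in cassandra.yaml; use whatever your cluster normally runs with):

nodetool compactionstats
nodetool setcompactionthroughput 0   # unthrottle while the rebuild runs
nodetool setcompactionthroughput 16  # restore the usual value afterwards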
Starting with Java 1.6.0_34, you'll need Xss set to 180k. It's updated with the
forthcoming 1.1.5 as well as the next minor rev of 1.0.x (1.0.12).
https://issues.apache.org/jira/browse/CASSANDRA-4631
See also the comments on https://issues.apache.org/jira/browse/CASSANDRA-4602
for the reference to wh
Are there any deletions in your data? The Hadoop support doesn't filter out
tombstones, though you may not be filtering them out in your code either. I've
used the hadoop support for doing a lot of data validation in the past and as
long as you're sure that the code is sound, I'm pretty confid
A couple of guesses:
- are you mixing versions of Cassandra? Streaming differences between versions
might throw this error. That is, are you bulk loading with one version of
Cassandra into a cluster that's a different version?
- (shot in the dark) is your cluster overwhelmed for some reason?
I
Generally the main knob for compaction performance is
compaction_throughput_mb_per_sec in cassandra.yaml. It defaults to 16. You can
use nodetool setcompactionthroughput to set it on a running server. The next
time the Cassandra server starts, it will use what's in the yaml again. You
might try
usin
Another option that may or may not work for you is the support in Cassandra
1.1+ to use a secondary index as an input to your mapreduce job. What you
might do is add a field to the column family that represents which virtual
column family it is part of. Then when doing mapreduce jobs, you
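As a rough sketch of that wiring with the 1.1 Hadoop support (the "vcf" column name and value are hypothetical placeholders for whatever field you add, and it would need a secondary index on it):

import java.util.Arrays;
import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.IndexExpression;
import org.apache.cassandra.thrift.IndexOperator;
import org.apache.cassandra.utils.ByteBufferUtil;

// conf is the Hadoop job's Configuration. Only rows whose indexed "vcf"
// column equals "events" get fed to the mappers.
ConfigHelper.setInputRange(conf, Arrays.asList(
    new IndexExpression(ByteBufferUtil.bytes("vcf"),
                        IndexOperator.EQ,
                        ByteBufferUtil.bytes("events"))));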
It's always had data locality (since hadoop support was added in 0.6).
You don't need to specify a partition, you specify the input predicate with
ConfigHelper or the cassandra.input.predicate property.
On Oct 2, 2012, at 2:26 PM, "Hiller, Dean" wrote:
> So you're saying that you can access th
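For anyone finding this in the archive, a minimal sketch of the ConfigHelper setup being described (host, keyspace, and column family names are placeholders; conf is the job's Hadoop Configuration):

import org.apache.cassandra.hadoop.ConfigHelper;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;
import org.apache.cassandra.utils.ByteBufferUtil;

ConfigHelper.setInputInitialAddress(conf, "127.0.0.1");
ConfigHelper.setInputColumnFamily(conf, "MyKeyspace", "MyColumnFamily");
// The predicate says which columns each map task pulls per row; an empty
// slice range with a high count means "all of them".
SlicePredicate predicate = new SlicePredicate().setSlice_range(
    new SliceRange(ByteBufferUtil.EMPTY_BYTE_BUFFER,
                   ByteBufferUtil.EMPTY_BYTE_BUFFER, false, Integer.MAX_VALUE));
ConfigHelper.setInputSlicePredicate(conf, predicate);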
The Dachis Group (where I just came from, now at DataStax) uses pig with
cassandra for a lot of things. However, we weren't using the widerow
implementation yet since wide row support is new to 1.1.x and we were on 0.7,
then 0.8, then 1.0.x.
I think since it's new to 1.1's hadoop support, it s
> On Thu, Oct 11, 2012 at 11:25 AM, Jeremy Hanna
> wrote:
> The Dachis Group (where I just came from, now at DataStax) uses pig with
> cassandra for a lot of things. However, we weren't using the widerow
> implementation yet since wide row support is new to 1.1.x and we were on 0
On Oct 18, 2012, at 3:52 PM, Andrey Ilinykh wrote:
> On Thu, Oct 18, 2012 at 1:34 PM, Michael Kjellman
> wrote:
>> Not sure I understand your question (if there is one..)
>>
>> You are more than welcome to do CL ONE and assuming you have hadoop nodes
>> in the right places on your ring things
LCS works well in specific circumstances; this blog post gives some good
considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction
On Nov 8, 2012, at 1:33 PM, Aaron Turner wrote:
> "kill performance" is relative. Leveled Compaction basically costs 2x disk
> IO. Look at
You can check nodetool compactionstats to see progress for current cleanup
operations. It essentially traverses all of your sstables and removes data
that the node isn't responsible for. That's the overall operation, so you
would estimate in terms of how long it would take to go through
Hi Naveen,
You can start with http://wiki.apache.org/cassandra/HadoopSupport but there's
also a commercial product that you can use, DataStax Enterprise:
http://www.datastax.com/docs/datastax_enterprise2.2/solutions/hadoop_index
which makes things more streamlined, but it's a commercial product
See https://issues.apache.org/jira/browse/CASSANDRA-5168 - should be fixed in
1.1.10 and 1.2.2.
On Jan 30, 2013, at 9:18 AM, Tejas Patil wrote:
> While reading data from Cassandra in map-reduce, I am getting
> "InvalidRequestException(why:Start token sorts after end token)"
>
> Below is the c
Fwiw - here are some changes that a friend said should make C*'s Hadoop
support work with CDH4 - for ColumnFamilyRecordReader.
https://gist.github.com/jeromatron/4967799
On Feb 16, 2013, at 8:23 AM, Edward Capriolo wrote:
> Here is the deal.
>
> http://wiki.apache.org/hadoop/Defining%20Hado
Does this help? Links at the bottom show the CQL statements to add/modify
users:
http://www.datastax.com/docs/1.2/security/native_authentication
On Feb 26, 2013, at 4:06 PM, C.F.Scheidecker Antunes
wrote:
> Hello all,
>
> Cassandra has changed and now has a default authentication and authori
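For the archive, the statements in question look roughly like this in 1.2's CQL3 (user names and passwords are examples; the linked docs are authoritative):

CREATE USER jsmith WITH PASSWORD 'secret' NOSUPERUSER;
ALTER USER jsmith WITH PASSWORD 'newsecret';
LIST USERS;
DROP USER jsmith;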
If I remember correctly, when I configured pig, cassandra, and oozie to work
together, I just used vanilla pig but gave it the jars it needed.
What is the problem you’re experiencing that prevents you from doing this?
Jeremy
On 28 Nov 2013, at 12:56, Miguel Angel Martin junquera
wrote:
> hi all;
rt#Oozie
>
> I am using Cassandra 1.2.10, Oozie 4.0.0 and Pig 0.11.1.
>
> I will try to test these options and see if it works.
>
> Thanks in advance
>
> 2013/11/28 Jeremy Hanna
>
>> If I rememb
With RHEL, there is a problem with snappy 1.0.5. You’d need to use 1.0.4.1,
which works fine, but you need to download it separately and put it in your
lib directory. You can find the 1.0.4.1 file at
https://github.com/apache/cassandra/tree/cassandra-1.1.12/lib
Jeremy
On 29 Nov 2013, at 10:1
I need to update those to be current with the Cassandra source download.
You’re right, you would just use what’s in the examples directory now for Pig.
You should be able to run the examples, but generally you need to specify the
partitioner of the cluster, the host name of a node in the clust
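If memory serves, the pig wrapper script in the examples directory picks those up from environment variables, something like:

export PIG_INITIAL_ADDRESS=localhost
export PIG_RPC_PORT=9160
export PIG_PARTITIONER=org.apache.cassandra.dht.RandomPartitioner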
Of the 16 active committers, 8 are not at DataStax. See
http://wiki.apache.org/cassandra/Committers. That said, active involvement
varies and there are other contributors inside DataStax and in the community.
You can look at the dev mailing list as well to look for involvement in more
detail
ething to do with different address for rpc_address
>> and listen_address but not sure what it is...
>>
>>
>>
>> -Original Message-
>> From: Jeremy Hanna [mailto:jeremy.hanna1...@gmail.com]
>> Sent: Friday, May 06, 2011 11:10 PM
>> To: u...@
I downloaded a fresh 0.8 beta2 and created keyspaces fine - including the ones
below.
I don't know if there are relics of a previous install somewhere or something
wonky about the classpath. You said that you might have /var/lib/cassandra
data left over so one thing to try is starting fresh there
Take a look at cassandra.yaml in your 0.8 download at the very bottom. There
are docs and examples there.
e.g.
http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.8.0-beta2/conf/cassandra.yaml
On May 16, 2011, at 6:36 PM, Sameer Farooqui wrote:
> I understand that 0.8.0 has configurable
eystore .keystore -rfc -file jdoe.cer
> 4) cat jdoe.cer
> 5) keytool -import -alias jdoecert -file jdoe.cer -keystore .truststore
> 6) keytool -list -v -keystore .truststore
>
>
> - Sameer
>
> On Mon, May 16, 2011 at 5:35 PM, Jeremy Hanna
> wrote:
> Take a
FWIW, as I mentioned in the 1497 comments, the patch makes it abstract so that
you can have any rpc/marshalling format you want with a simple extension point.
So if we want to move to something besides avro, or even like I mentioned do
something with Dumbo for streaming, it's easy to extend.
O
On May 23, 2011, at 2:23 PM, Ryan King wrote:
> On Mon, May 23, 2011 at 12:06 PM, Yang wrote:
>> Thanks Ryan,
>>
>> could you please share more details: according to what you observed in
>> testing, why was performance worse if you do not do extra buffering?
>>
>> I was thinking (could be wr
The link on cassandra.apache.org/download was fixed a couple of hours ago. For
the time being it may be better to scroll down to the Backup Sites section and
use one of those links.
On May 24, 2011, at 12:24 PM, Sameer Farooqui wrote:
> http://cassandra.apache.org/download
>
> If you click th
For the purposes of clearing out disk space, you might also occasionally check
to see if you have snapshots that you no longer need. Certain operations
create snapshots (point-in-time backups of sstables) in the (default)
/var/lib/cassandra/data//snapshots directory.
If you are absolutely sure
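(For reference: the command to drop snapshots you no longer need is nodetool clearsnapshot. It removes the snapshot directories on that node, so double-check that you don't need them for a restore first.)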
Some more recent documentation can be found here:
http://wiki.apache.org/cassandra/Counters but even that may be out of date.
One thing that has been added is support for multiple consistency levels.
There are a lot of other tickets that have been completed post 1072. Search
for "cassandra
In 0.8 (and 0.7) you can create a script to run on the CLI that creates your
schema. We create something like a DDL file and run it on a new cluster. You
just pass it to the cli with -f .
On Jun 3, 2011, at 11:14 AM, Paul Loy wrote:
> We embed cassandra in our app. Whe
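As a sketch, such a script might look like this in 0.8 CLI syntax (keyspace and column family names are examples), run with something like bin/cassandra-cli -h localhost -f schema.txt:

create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = [{replication_factor:1}];
use MyKeyspace;
create column family Users with comparator = UTF8Type;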
I think that's partly the idea of it. CQL could end up being a way forward;
it currently builds on thrift. Then if it becomes the API/client of record to
build on, it could move to something else underneath that's more efficient and
CQL itself wouldn't have to change at all.
On Jun 8,
I need to update the wiki with better pig info. I did put some information in
the getting started docs of pygmalion, but it would be good to transfer that to
cassandra's wiki and add to it.
fwiw - https://github.com/jeromatron/pygmalion/wiki/Getting-Started
Thanks for the rundown William!
On
Have you looked at the TTL column feature in 0.7?
http://www.datastax.com/dev/blog/whats-new-cassandra-07-expiring-columns
Those will automatically expire columns after a certain time period - not when
you near the column limit, but might be helpful for you.
On Jun 9, 2011, at 10:51 AM, Bahadur
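To illustrate at the raw Thrift level (a hedged sketch - client is assumed to be an open Cassandra.Client, and the key, column family, and values are made up; most higher-level clients expose ttl directly):

import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.utils.ByteBufferUtil;

Column column = new Column();
column.setName(ByteBufferUtil.bytes("session_token"));
column.setValue(ByteBufferUtil.bytes("abc123"));
column.setTimestamp(System.currentTimeMillis() * 1000); // microseconds
column.setTtl(86400); // seconds; the column disappears after a day
client.insert(ByteBufferUtil.bytes("user42"),
              new ColumnParent("Users"), column, ConsistencyLevel.ONE);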
I would take a look at pycassa - https://github.com/pycassa/pycassa though
there is also a twisted client named Telephus -
http://github.com/driftx/Telephus.
The complete list of current client language options is found here:
http://wiki.apache.org/cassandra/ClientOptions
On Jun 10, 2011, at
Yes - avro is alive and well. Avro as an RPC alternative for Cassandra is
dead. See reasoning here: http://goo.gl/urENc
On Jun 15, 2011, at 8:28 AM, Holger Hoffstaette wrote:
> On Wed, 15 Jun 2011 10:04:53 +1200, aaron morton wrote:
>
>> Avro is dead.
>
> Just so that this is not misundersto
We started doing this recently and thought it might be useful to others.
Pig (and Hive) have a sample function that allows you to sample data from your
data store.
In pig it looks something like this:
mysample = SAMPLE myrelation 0.01;
One possible use for this, with pig and cassandra, is to sol
ng keys even if you sampled in a way that didn't actually
> produce any, etc.
>
> D
>
> On Wed, Jun 15, 2011 at 10:35 AM, Jeremy Hanna
> wrote:
>> We started doing this recently and thought it might be useful to others.
>>
>> Pig (and Hive) have a sample
Hi Will,
That's partly why I like to use FromCassandraBag and ToCassandraBag from
pygmalion - it does the work for you to get it back into a form that cassandra
understands.
Others may know better how to massage the data into that form using just pig,
but if all else fails, you could write a u
(the script).
>
> On Wed, Jun 15, 2011 at 3:04 PM, Jeremy Hanna
> wrote:
>
>> Hi Will,
>>
>> That's partly why I like to use FromCassandraBag and ToCassandraBag from
>> pygmalion - it does the work for you to get it back into a form that
>> cassandr
Try running with cdh3u0 version of pig and see if it has the same problem.
They backported the patch (to pig 0.9 which should be out in time for the
hadoop summit next week) that adds the updated jackson dependency for avro.
The download URL for that is -
http://archive.cloudera.com/cdh/3/pig
> for jar in `ls *.jar`
> do
> jar -tf $jar | grep TypeParser
> if [ $? -eq 0 ]; then
> echo $jar
> fi
> done
>
> Shows me nothing in all the lib dirs
>
>
>
> On Mon, Jun 20, 2011 at 8:44 PM, Jeremy Hanna
> wrote:
>> Try running with cdh3u0 v
sr/local/src/apache-cassandra-0.8.0-src# echo $?
>> 1
>> /usr/local/src/apache-cassandra-0.8.0-src#
>>
>> /usr/local/src/apache-cassandra-0.8.0-src# grep -Ri TypeError .
>> /usr/local/src/apache-cassandra-0.8.0-src# echo $?
>> 1
>> /usr/local/src/apache-cassa
ype$3.class    BytesType.class
>> MarshalException.class    UUIDType.class
>> AbstractType$4.class    CounterColumnType.class
>> TimeUUIDType.class
>> AbstractType$5.class    IntegerType.class
>> UTF8Type$1.class
>>
>
Also - there is an open ticket to create a .NET CQL driver - may be worth
watching or if you'd like to help out with it somehow:
https://issues.apache.org/jira/browse/CASSANDRA-2634
On Jun 21, 2011, at 9:31 AM, Stephen Pope wrote:
> We just recently switched to 0.8 (from 0.7.4), and it looks lik
Just wanted to mention that there is also a #solandra irc channel on freenode
in case people are interested.
On Jun 21, 2011, at 1:26 PM, Mark Kerzner wrote:
> Me too!
>
> I would be interested to know how such queries are done in Solandra. I would
> understand it if it creates a complete Luce
This ticket's outcome replaces what BMT was supposed to do:
https://issues.apache.org/jira/browse/CASSANDRA-1278
0.8.1 is being voted on now and will hopefully be out in the next day or two.
You can try it out with the 0.8-branch if you want - looking near the bottom of
the comments on the ticke
The replacement is to use the replication_factor variable in strategy options.
If you look in
http://svn.apache.org/repos/asf/cassandra/tags/cassandra-0.8.0/conf/schema-sample.txt
you can see an example of that.
The issue to do that was https://issues.apache.org/jira/browse/CASSANDRA-1263
The
A coworker of mine in the UK has been having problems with inserting UTF8
Strings into Cassandra using the Ruby thrift client. I'm just wondering if
anyone else is seeing this or if they have a workaround.
It may have to do with ruby/thrift itself:
https://issues.apache.org/jira/browse/THRIFT-
Anyone know if secondary index performance should be in the 100-500 ms range?
That's what we're seeing right now when doing lookups on a single value. We've
increased keys_cached and rows_cached to 100% for that column family and assume
that the secondary index gets the same attributes. I've
On Jul 3, 2011, at 4:29 PM, Jeremy Hanna wrote:
> Anyone know if secondary index performance should be in the 100-500 ms range?
> That's what we're seeing right now when doing lookups on a single value.
> We've increased keys_cached and rows_cached to 100% for t
I'm seeing some strange behavior and not sure how it is possible. We updated
some data using a pig script and that wrote back to cassandra. We get the
value and list the value on the Cassandra CLI and it's the updated value - from
MARKET to market. However, when doing a pig script to filter b
n the background, so it
> may only be visible to subsequent reads.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 6 Jul 2011, at 20:52, Jeremy Hanna wrote:
>
>>
+1 - We do a lot of this with Pig - joining over several column families. Pig
makes it just work. I think Hive does something similar. Unless you really
need that much control over your process, I would just use one of those two.
On Jul 15, 2011, at 5:28 PM, Jonathan Ellis wrote:
> The eas
I know additional types have been added as of 0.8.1:
https://issues.apache.org/jira/browse/CASSANDRA-2530
However, I'm not sure how those have propagated up to validators, the CLI, and
hector.
On Jul 18, 2011, at 4:16 PM, Sameer Farooqui wrote:
> I wrote some data to a standard column fa
If you look at the bin/nodetool file, it's just a shell script to run
org.apache.cassandra.tools.NodeCmd. You could probably call that directly from
your code.
On Jul 20, 2011, at 3:18 PM, cbert...@libero.it wrote:
> Hi all,
> I'd like to build something like "nodetool" to show the status of t
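If you'd rather stay in-process than shell out, NodeCmd is built on org.apache.cassandra.tools.NodeProbe, which you can instantiate yourself; a hedged sketch, assuming the default JMX port:

import org.apache.cassandra.tools.NodeProbe;

// Connects over JMX; the constructor throws IOException on failure.
NodeProbe probe = new NodeProbe("localhost", 7199);
System.out.println("Live nodes: " + probe.getLiveNodes());
System.out.println("Unreachable: " + probe.getUnreachableNodes());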
Just saw this and created a lhf ticket for it -
http://issues.apache.org/jira/browse/CASSANDRA-2932
On Jul 21, 2011, at 8:20 AM, Stephen Pope wrote:
> Boo-urns. Ok, thanks.
>
> -Original Message-
> From: Brandon Williams [mailto:dri...@gmail.com]
> Sent: Thursday, July 21, 2011 9:10 AM
Try help on the CLI for how to do it, specifically "help update column family;"
It looks like you're missing the "with."
update column family columnfamily2 memtable_throughput=155;
should be
update column family columnfamily2 with memtable_throughput=155;
On Jul 27, 2011, at 12:49 PM, lebron j
See http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting - I would
probably start with setting your rpc_timeout_in_ms to something like 3.
On Jul 28, 2011, at 11:09 AM, Jian Fang wrote:
> Hi,
>
> I run Cassandra 0.8.2 and hadoop 0.20.2 on three nodes, each node includes a
> Cassa
exceptions when I use hector to get
> back data.
>
> Thanks,
>
> John
>
> On Thu, Jul 28, 2011 at 12:45 PM, Jian Fang
> wrote:
>
> My current setting is 1. I will try 3.
>
> Thanks,
>
> John
>
> On Thu, Jul 28, 2011 at 12:
fwiw - https://issues.apache.org/jira/browse/CASSANDRA-2970
thoughts? (please post on the ticket)
On Jul 29, 2011, at 7:08 PM, Ryan King wrote:
> It'd be great if we had different settings for inter- and intra-DC read
> repair.
>
> -ryan
>
> On Fri, Jul 29, 2011 at 5:06 PM, Jake Luciani wrot
Check out http://wiki.apache.org/cassandra/HadoopSupport#ClusterConfig and that
whole page to see an intro to configuring your cluster. Brisk extends these
basic ideas.
On Jul 31, 2011, at 12:31 PM, mcasandra wrote:
> Is it possible to add brisk nodes for analytics to already existing real tim
Some quick thoughts that might be helpful:
- use ephemeral instances and RAID0 over the local volumes for both cassandra's
data and the log directory. The log directory matters because if you crash due
to heap size, the heap dump will be stored there. You don't want that to go i
That is something we have to update - thanks for mentioning it. We should
just be depending on apache hadoop components now that we are no longer
supporting hadoop output streaming.
On Aug 5, 2011, at 10:27 AM, Dean Hiller wrote:
> oh, cloudera repo is down like a previous poster just said...
It won't be required in the future:
https://issues.apache.org/jira/browse/CASSANDRA-2998
On Aug 5, 2011, at 1:34 PM, Martin Lansler wrote:
> It solved itself as the cloudera repo is up again now...
>
> -Martin
>
> On Fri, Aug 5, 2011 at 12:06 PM, Martin Lansler
> wrote:
>> Hi,
>>
>> I'm tryin
Yes - that ticket was done by Nirmal Ranganathan with the intention of getting
support into Cassandra. That's just for a java client though.
In the future, I wonder if the CQL driver level is the right place for client
encryption.
On Aug 11, 2011, at 11:26 PM, Vijay wrote:
> https://issues.apach
/browse/THRIFT-151
C# (patch attached but no progress in a while):
https://issues.apache.org/jira/browse/THRIFT-181
PHP (patch attached but no progress in a while):
https://issues.apache.org/jira/browse/THRIFT-948
On Aug 12, 2011, at 9:39 AM, Jeremy Hanna wrote:
> Yes - that ticket was done by Nir
http://wiki.apache.org/cassandra/FAQ#dropped_messages
As to what's causing them - look in the logs and it will print the equivalent
of a nodetool tpstats right after the dropped-message log lines. That should
give you a clue as to why there are dropped messages - which thread pools are
backed
up
We're trying to bootstrap some new nodes and it appears when adding a new node
that there is a lot of logging on hints being flushed and compacted. It's been
taking about 75 minutes thus far to bootstrap for only about 10 GB of data.
It's ballooned up to over 40 GB on the new node. I do 'ls -
:
> I would assume it's because it thinks some node is down and is
> creating hints for it.
>
> On Thu, Aug 18, 2011 at 6:31 PM, Jeremy Hanna
> wrote:
>> We're trying to bootstrap some new nodes and it appears when adding a new
>> node that there is a lot
We've been having issues where as soon as we start doing heavy writes (via
hadoop) recently, it really hammers 4 nodes out of 20. We're using random
partitioner and we've set the initial tokens for our 20 nodes according to the
general spacing formula, except for a few token offsets as we've re
On Aug 23, 2011, at 2:25 AM, Peter Schuller wrote:
>> We've been having issues where as soon as we start doing heavy writes (via
>> hadoop) recently, it really hammers 4 nodes out of 20. We're using random
>> partitioner and we've set the initial tokens for our 20 nodes according to
>> the ge
m unreasonable - about a MB. I turned up
logging to DEBUG for that class and I get plenty of dropped READ_REPAIR
messages, but nothing coming out of DEBUG in the logs to indicate the time
taken that I can see.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cass
At the point that book was written (it was finalized about a year ago), vector
clocks were planned. In August or September of last year, they were removed.
0.7 was released in January. The ticket for vector clocks is here and you can
see the reasoning for not using them at the bottom.
https
ext
token of a different rack (depending on which it is looking for). So that is
why alternating by rack is important. That might be able to be smarter in the
future which would be nice - to not have to care and let Cassandra spread the
replication around intelligently.
On Aug 23, 2011, at 6:02 A
in token order, that can lead to
serious hotspots. For more on this with ec2, see:
http://www.slideshare.net/mattdennis/cassandra-on-ec2/5 where he talks about
alternating zones.
On Aug 25, 2011, at 10:45 AM, mcasandra wrote:
> Thanks for the update
>
> Jeremy Hanna wrote:
>&
I was watching compactionstats via opscenter and saw one of my nodes was minor
compacting a secondary index column family. Problem is I removed all of my
secondary indexes on Friday and just double checked on the CLI with 'show
keyspaces;' and sure enough, no secondary indexes. Is this a bug?
Just wanted to let people know about a great presentation that Matt Dennis did
here at the Cassandra Austin meetup. It's on Cassandra best practices on EC2.
We found the presentation extremely helpful.
http://www.slideshare.net/mattdennis/cassandra-on-ec2
FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to
potentially move to Brisk because of the simplicity of operations there.
Not sure what you mean about the true power of Hadoop. In my mind the true
power of Hadoop is the ability to parallelize jobs and send each task to wher
I would not use nano time with cassandra. Internally and throughout the
clients, milliseconds is pretty much a standard. You can get into trouble
because when comparing nanoseconds with milliseconds as long numbers,
nanoseconds will always win. That bit us a while back when we deleted
someth
Ed - you're right - milliseconds * 1000. The other stuff about nano time still
stands, but you're right - microseconds. Sorry about that.
On Aug 30, 2011, at 1:20 PM, Edward Capriolo wrote:
>
>
> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna
> wr
0.
>
> Anyone sees problem with this approach?
>
> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo
> wrote:
>>
>>
>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna
>> wrote:
>>>
>>> I would not use nano time with cassandra. Internall
/repos/asf/cassandra/trunk/contrib/pig. Are there any
> other resource that you can point me to? There seems to be a lack of samples
> on this subject.
>
> On Tue, Aug 30, 2011 at 10:56 PM, Jeremy Hanna
> wrote:
> FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to
rive the current time in
> nano seconds though?
>
> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
> wrote:
>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is
>> because System.nanoTime javadoc says "This method can only be used to
>> mea
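In other words, the conventional client-side timestamp is just:

// Microseconds since the epoch, per the convention discussed above.
// System.nanoTime() only measures elapsed time, not wall-clock time,
// so it isn't suitable here.
long timestamp = System.currentTimeMillis() * 1000;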
We moved off of ubuntu because of kernel issues in the AMIs we found in 10.04
and 10.10 in ec2. So we're now on debian squeeze with ext4. It's been great
for us.
One thing that bit us is we'd been using property file snitch and the
availability zones as racks and had an equal number of nodes
I would look at http://www.slideshare.net/mattdennis/cassandra-on-ec2
Also, people generally do raid0 on the ephemerals.
EBS is a bad fit for cassandra - see the presentation above. However, that
means you'll need to have a backup strategy, which is also mentioned in the
presentation.
Also ar
I dont remember setting up snitch.
>
> The servers are all in a VPC, the only thing I did was configure the seed IP
> so all the nodes can see each other.
>
> Ben
>
> On Sat, Sep 3, 2011 at 11:13 PM, Jeremy Hanna
> wrote:
> I would look at http://www.slideshar
Thanks William - so you were able to get everything running correctly, right?
FWIW, we're in the process of upgrading to 0.8.4 and found that all we needed
was that first link you mentioned - the VersionedValue modification. It's
running fine on our staging cluster and we're in the process of m
The voting started on Monday and is a 72-hour vote. So if there aren't any
problems that people find, it should be released sometime Thursday (7
September).
On Sep 7, 2011, at 10:41 AM, Roshan Dawrani wrote:
> Hi,
>
> Quick check: is there a tentative date for release of Cassandra 0.8.5?
>
>
We run 0.8 in production and it's been working well for us. There are some new
settings that we had to tune for - for example, the default concurrent
compaction is the number of cores. We had to tune that down because we also
run hadoop jobs on our nodes.
On Sep 8, 2011, at 4:44 PM, Anand Som
We are experiencing massive writes to column families when only doing reads
from Cassandra. A set of 5 hadoop jobs are reading from Cassandra and then
writing out to hdfs. That is the only thing operating on the cluster. We are
reading at CL.QUORUM with hadoop and have written with CL.QUORUM.
0         0
InternalResponseStage             0         0         0
HintedHandoff                     0         0         0
CompactionManager               n/a        29
MessagingService                n/a      0,34
On Sep 10, 2011, at 3:38 PM, Jeremy Hanna wrote:
> We are experiencing mass
Oh and we're running 0.8.4 and the RF is 3.
On Sep 10, 2011, at 3:49 PM, Jeremy Hanna wrote:
> In addition, the mutation stage and the read stage are backed up like:
>
> Pool Name                    Active   Pending   Blocked
> ReadStage                        32
> 2) You have something doing writes that you're not aware of, I guess
> you could track that down using wireshark to see where the write
> messages are coming from
>
> On Sat, Sep 10, 2011 at 3:56 PM, Jeremy Hanna
> wrote:
> > Oh and we're running 0.8.4 and the RF
We just tried to disable hinted handoff by setting:
hinted_handoff_enabled: false
in all the nodes of our cluster and restarting them. When they come back up,
we continue to see things like this:
INFO [HintedHandoff:1] 2011-09-10 22:41:40,813 HintedHandOffManager.java (line
323) Started hinted h
rowse/CASSANDRA-3176
On Sep 10, 2011, at 5:50 PM, Jeremy Hanna wrote:
> INFO [HintedHandoff:1] 2011-09-10 22:41:40,813 HintedHandOffManager.java
> (line 323) Started hinted handoff for endpoint /10.1.2.3
> INFO [HintedHandoff:1] 2011-09-10 22:41:40,813 HintedHandOffManager.java
> (l
Turned out that wasn't a problem - I put some notes on the ticket.
On Sep 10, 2011, at 6:22 PM, Jeremy Hanna wrote:
> I tried looking through the source to see if the log statements would happen
> regardless but it doesn't look like it. Also I looked at one of the nodes
>
Yeah - I would bootstrap at an initial_token of the current one minus 1. Then
once that has bootstrapped, decommission the old one. Avoid trying to use
removetoken on anything before 0.8.3. Use decommission if you can when you're
dealing with a live node.
On Sep 12, 2011, at 10:42 AM, Kyle Gibson
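(Spelled out, the sequence being suggested: set initial_token in the new node's cassandra.yaml to the old node's token minus 1, start it and let it finish bootstrapping, then run nodetool decommission on the old node.)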