interesting,
first, just to make sure: since replicateOnWrite is for counters, you
are using counters (you use the word "insert" instead of
"add/increment"), right?
if you are using counters, supposedly the leader runs
replicateOnWrite; somehow all your adds find the one box as leader,
that's prob
It's definitely for counters, and some of the rows I'm inserting are long-ish,
if 1.3MB is long.
Maybe it would help if I said I was using counter super columns. I'm also
writing to only a handful of rows at a time, until they are full. It looks
like the counter super column code in addReadCommandFromColumnFamily
We are happy to announce the release of Kundera 2.0.1.
Kundera 2.0.1 is a JPA-compliant Object-Datastore Mapping Library for NoSQL
Datastores. The idea behind Kundera is to make working with NoSQL Databases
drop-dead simple and fun. Some salient features of Kundera are:
1. Fully JPA 1.0 compliant.
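(Aside, not from the announcement: for readers new to JPA-on-NoSQL, a
minimal sketch of the kind of annotated entity such a mapper persists.
The class, table, and column names are invented for illustration.)

    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;
    import javax.persistence.Table;

    // Hypothetical entity: standard JPA annotations, which a JPA-compliant
    // mapper translates to a datastore row (@Id becomes the row key, each
    // @Column becomes a column).
    @Entity
    @Table(name = "users")
    public class User {
        @Id
        private String userId;

        @Column(name = "first_name")
        private String firstName;

        @Column(name = "city")
        private String city;

        // getters and setters omitted for brevity
    }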
>From "Cassandra the definitive guide" - Basic Maintenance - Repair
Running nodetool repair causes Cassandra to execute a Major Compaction [...]
AntiEntropyService implements the Singleton pattern and also defines
the static Differencer class, which is used to compare two trees. If it
finds a
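(Aside, not the actual AntiEntropyService code: a toy sketch of what
"comparing two trees" means here: walk two same-shaped hash trees in
lockstep and collect the leaf ranges whose hashes differ, pruning any
subtree whose root hashes already match.)

    import java.util.List;

    public class TreeDiffSketch {
        // Toy Merkle node: a hash, optional children, a range label at leaves.
        static class Node {
            final String hash;
            final String range;        // non-null only at leaves
            final Node left, right;
            Node(String hash, String range, Node left, Node right) {
                this.hash = hash; this.range = range;
                this.left = left; this.right = right;
            }
        }

        // Collect leaf ranges where the two trees disagree; equal hashes
        // prune whole subtrees, which is the point of a Merkle tree.
        static void differences(Node a, Node b, List<String> out) {
            if (a.hash.equals(b.hash)) {
                return;                         // subtree identical, skip it
            }
            if (a.left == null) {
                out.add(a.range);               // a differing leaf range
                return;
            }
            differences(a.left, b.left, out);
            differences(a.right, b.right, out);
        }
    }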
Hi David,
This is an interesting topic and it would be interesting to hear from
someone who is using it in prod.
In particular - how does your fs implementation behave for medium/large
files, e.g. > 1MB?
If you store large files, how large is your store per node, and how
does it handle compactions?
Recently we released Kundera 2.0.1.
One feature included in this release is secondary index support over
Super Columns.
I have put together a blog post on this, which can be found for
reference at:
http://mevivs.wordpress.com/2011/07/12/cassandra-play-kundera-orm/
-Vivek
I'm thinking about secondary indexes as an alternative to using
column families to index my data.
I'd appreciate some answers to basic questions about this new feature.
1. Can you insert a row into a column family with predefined column
metadata (and secondary indexes for these columns) that does n
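(Aside, not an answer from the thread: for reference, this is roughly how
column metadata with a KEYS index is declared through the 0.8-era Thrift
API; the keyspace, column family, and column names are invented.)

    import java.nio.ByteBuffer;
    import java.nio.charset.StandardCharsets;
    import java.util.Arrays;
    import org.apache.cassandra.thrift.Cassandra;
    import org.apache.cassandra.thrift.CfDef;
    import org.apache.cassandra.thrift.ColumnDef;
    import org.apache.cassandra.thrift.IndexType;

    public class IndexedCfSketch {
        // Creates a column family whose "state" column is validated as
        // UTF8 and carries a KEYS secondary index.
        static void addIndexedCf(Cassandra.Client client) throws Exception {
            ColumnDef state = new ColumnDef(
                    ByteBuffer.wrap("state".getBytes(StandardCharsets.UTF_8)),
                    "UTF8Type");
            state.setIndex_type(IndexType.KEYS);
            state.setIndex_name("state_idx");

            CfDef cf = new CfDef("Keyspace1", "Users");
            cf.setColumn_metadata(Arrays.asList(state));
            client.system_add_column_family(cf);
        }
    }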
### Preamble
There have been several reports on the mailing list of the JVM running
Cassandra using "too much" memory. That is, the resident set size
is >> (max java heap size + mmapped segments) and continues to grow
until the process swaps, the kernel OOM killer comes along, or
performance just degrades.
>From "Cassandra the definitive guide" - Basic Maintenance - Repair
"Running nodetool repair causes Cassandra to execute a major compaction.
During a major compaction (see “Compaction” in the Glossary), the
server initiates a
TreeRequest/TreeReponse conversation to exchange Merkle trees with ne
> So now I'm confused ... Cassandra doc says that I have to run it by myself,
> Cassandra book says I don't have to.
> Did I misunderstand something?
The book is wrong, at least as of current versions of Cassandra (I'm
basing that on the quote you pasted; I don't know the context).
nodetool repair m
Yang, I'm not sure I understand what you mean by "prefix of the HLog".
Also, can you explain what failure scenario you are talking about? The
major failure that I see is when the leader node confirms to the client
a successful local write, but then fails before the write can be
replicated to
> From "Cassandra the definitive guide" - Basic Maintenance - Repair
> "Running nodetool repair causes Cassandra to execute a major compaction.
> During a major compaction (see “Compaction” in the Glossary), the
> server initiates a
> TreeRequest/TreeResponse conversation to exchange Merkle trees
Just confirming. Thanks for the clarification.
On Tue, Jul 12, 2011 at 10:53 AM, Peter Schuller
wrote:
>> From "Cassandra the definitive guide" - Basic Maintenance - Repair
>> "Running nodetool repair causes Cassandra to execute a major compaction.
>> During a major compaction (see “Compaction
With big data requirements pressuring me to pack up to a terabyte on one
node, I suspect that even 32 GB of RAM just will not be large enough for
Cass' various memory caches to be effective. 32/1000 is a tiny working
set to data store ratio... even assuming non-random reads. So, I'm
investiga
We actually cap the maximum file size pretty low (30-50MB) because
this system is mainly designed to handle images and audio from browser
uploads. Most files are accessed via HTTP requests, which get cached
in our edge layer. So, mostly the design aims to be distributed and
available more than blazingly fast.
> Do any Cass developers have any thoughts on this and whether or not it would
> be helpful considering Cass' architecture and operation?
A well-functioning L2 cache should definitely be very useful with
Cassandra for read-intensive workloads where the request distribution
is such that additional
If you're interested in this idea, you should read up about Spinnaker:
http://www.vldb.org/pvldb/vol4/p243-rao.pdf
-ryan
On Mon, Jul 11, 2011 at 2:48 PM, Yang wrote:
> I'm not proposing any changes to be done, but this looks like a very
> interesting topic for thought/hack/learning, so the follo
I have been reading through some code in TokenMetadata.java,
specifically with ringIterator() and firstTokenIndex(). I am trying
to get a very firm grasp on how nodes are collected for writes.
I have run into a bit of confusion about what happens when the data's
token is larger than the largest token in the ring.
for example,
coord writes records 1, 2, 3, 4, 5 in sequence
if you have replicas A, B, C
currently A can have 1, 3
B can have 1, 3, 4
C can have 2, 3, 4, 5
by "prefix", I mean I want them to have only 1..n where n is some
number between 1 and 5,
for example A having 1, 2, 3
B having 1, 2, 3, 4
C having 1, 2, 3, 4,
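(Illustration, mine rather than Yang's: the property can be restated as
"a replica's received records are exactly 1..n for some n, with no gaps",
which a trivial check makes concrete.)

    import java.util.Set;

    public class PrefixCheck {
        // A replica satisfies the "prefix" property iff it holds records
        // 1..n for some n: any gap means an earlier write was skipped.
        static boolean isPrefix(Set<Integer> received) {
            for (int i = 1; i <= received.size(); i++) {
                if (!received.contains(i)) {
                    return false;
                }
            }
            return true;
        }

        public static void main(String[] args) {
            System.out.println(isPrefix(Set.of(1, 3)));    // false: gap at 2
            System.out.println(isPrefix(Set.of(1, 2, 3))); // true: prefix 1..3
        }
    }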
thanks, let me read it...
On Tue, Jul 12, 2011 at 9:27 AM, Ryan King wrote:
> If you're interested in this idea, you should read up about Spinnaker:
> http://www.vldb.org/pvldb/vol4/p243-rao.pdf
>
> -ryan
>
> On Mon, Jul 11, 2011 at 2:48 PM, Yang wrote:
>> I'm not proposing any changes to be do
Do we need to do anything special to turn off-heap cache on?
https://issues.apache.org/jira/browse/CASSANDRA-1969
-Raj
what do you mean by "until they are full"?
right now I guess a quick black-box testing method for this problem is
to try inserting only shorter rows, and see if the problem persists.
as you said, it could be that addReadCommandFromColumnFamily is taking
a lot of time to read; if that's from disk, it's
>The book is wrong, at least by current versions of Cassandra (I'm
>basing that on the quote you pasted, I don't know the context).
To be sure that I didn't misunderstand (English is not my mother tongue), here
is what the entire "repair paragraph" says ...
Basic Maintenance
There are a few tasks
On Mon, Jul 11, 2011 at 11:51 PM, Casey Deccio wrote:
> java.lang.RuntimeException: Cannot recover SSTable with version f (current
> version g).
You need to scrub before any streaming is performed.
-Brandon
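(For reference, a hedged sketch of the invocation; the host, keyspace,
and column family arguments are placeholders:)

    nodetool -h <host> scrub [keyspace] [column_families...]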
On 7/12/2011 10:19 AM, Peter Schuller wrote:
Do any Cass developers have any thoughts on this and whether or not it would
be helpful considering Cass' architecture and operation?
A well-functioning L2 cache should definitely be very useful with
Cassandra for read-intensive workloads where the re
Have you seen
http://www.datastax.com/docs/0.8/troubleshooting/index#nodes-are-dying-with-oom-errors
?
On Mon, Jul 11, 2011 at 1:55 PM, Anurag Gujral wrote:
> Hi All,
> I am getting following error from cassandra:
> ERROR [ReadStage:23] 2011-07-10 17:19:18,300
> DebuggableThreadPoolEx
Sounds like https://issues.apache.org/jira/browse/CASSANDRA-2870 to
me. You can disable the dynamic snitch as a workaround, or use a
different consistency level.
On Mon, Jul 11, 2011 at 11:38 AM, Hefeng Yuan wrote:
> Hi,
> We're using Cassandra with 2 DCs
> - one OLTP Cassandra, 6 nodes, with RF 3
You need to set row_cache_provider=SerializingCacheProvider on the
columnfamily definition (via the cli)
On Tue, Jul 12, 2011 at 9:57 AM, Raj N wrote:
> Do we need to do anything special to turn off-heap cache on?
> https://issues.apache.org/jira/browse/CASSANDRA-1969
> -Raj
--
Jonathan Ellis
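(For example, in 0.8-era cassandra-cli syntax; the column family name is
a placeholder:)

    update column family Users with row_cache_provider = 'SerializingCacheProvider';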
> Running nodetool repair causes Cassandra to execute a major compaction
This is not what I would call factually accurate. Repair does not run a major
compaction. Major compaction is when all SSTables for a CF are compacted down
to one SSTable.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
When you do counter increment at CL.ONE, the write is acknowledged as
soon as the first replica getting the write has pushed the
increment into its memtable. However, there is a read happening for
the replication to the other replicas (this is required by the
counter design). What is happening
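(To make that path concrete, a sketch of the client-side call being
discussed, assuming Hector's 0.8-era counter API; the cluster address,
keyspace, column family, and names are invented.)

    import me.prettyprint.cassandra.model.ConfigurableConsistencyLevel;
    import me.prettyprint.cassandra.serializers.StringSerializer;
    import me.prettyprint.hector.api.Cluster;
    import me.prettyprint.hector.api.HConsistencyLevel;
    import me.prettyprint.hector.api.Keyspace;
    import me.prettyprint.hector.api.factory.HFactory;
    import me.prettyprint.hector.api.mutation.Mutator;

    public class CounterAddSketch {
        public static void main(String[] args) {
            Cluster cluster = HFactory.getOrCreateCluster("test", "localhost:9160");

            // Write at CL.ONE: the coordinator acks once one replica has
            // applied the increment to its memtable; replicate_on_write
            // then performs the read described above to ship the current
            // counter context to the other replicas.
            ConfigurableConsistencyLevel cl = new ConfigurableConsistencyLevel();
            cl.setDefaultWriteConsistencyLevel(HConsistencyLevel.ONE);
            Keyspace ks = HFactory.createKeyspace("Keyspace1", cluster, cl);

            Mutator<String> m = HFactory.createMutator(ks, StringSerializer.get());
            m.incrementCounter("row1", "Counters", "hits", 1L); // an add, not an insert
        }
    }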
You're going to be mad at how simple the answer turns out to be. :)
Nodes "own" the range from (previous, token], NOT from [token, next).
So, the last node will get from (50, 75] and the first will get from
(75, 0].
On Tue, Jul 12, 2011 at 9:38 AM, Eric tamme wrote:
> I have been reading through
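(A minimal sketch of that rule, mine rather than Cassandra's actual
ringIterator code; the 0/25/50/75 ring matches the example above.)

    public class RingWrapSketch {
        // A node owns (previousToken, token], so a key token beyond the
        // largest ring token wraps to the node with the smallest token.
        static int firstTokenIndex(long[] sortedTokens, long keyToken) {
            for (int i = 0; i < sortedTokens.length; i++) {
                if (keyToken <= sortedTokens[i]) {
                    return i;              // first token >= the key token
                }
            }
            return 0;                      // wrap around the ring
        }

        public static void main(String[] args) {
            long[] ring = {0, 25, 50, 75};
            System.out.println(firstTokenIndex(ring, 60)); // 3: token 75 owns (50, 75]
            System.out.println(firstTokenIndex(ring, 80)); // 0: token 0 owns (75, 0]
        }
    }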
On Tue, Jul 12, 2011 at 3:32 PM, Jonathan Ellis wrote:
> You're going to be mad at how simple the answer turns out to be. :)
>
> Nodes "own" the range from (previous, token], NOT from [token, next).
> So, the last node will get from (50, 75] and the first will get from
> (75, 0].
>
Okay, I figured
Hey there. I'm trying to convert one of my sstables to json, but it doesn't
appear to be escaping quotes. As a result, I've got a line in my resulting json
like this:
"3230303930373139313734303236efbfbf3331313733": [["6d6573736167655f6964",
""<66AA9165386616028BD3FECF893BBAC204347F3BAF@CONFLIC
Well, I was using a large number of clients: I tried configuring a hector pool
of 20-200 to see what effect that had on throughput. There's definitely a
point after which there's no gain, so I dialed it back down. To clarify a few
other things, when I say inserts I mean increments, as this te
the intended meaning of "initial" is "use this the first time you
start up; it will be ignored after that, if you use nodetool to move
it around." Sorry for the confusion.
On Tue, Jul 12, 2011 at 12:53 PM, Eric tamme wrote:
> On Tue, Jul 12, 2011 at 3:32 PM, Jonathan Ellis wrote:
>> You're goin
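(In config terms, assuming an 0.8-era setup; the token value below is
arbitrary, for illustration:)

    # cassandra.yaml: consulted only on a node's very first startup
    initial_token: 85070591730234615865843651857942052864

    # after that, relocation is done live and supersedes the yaml value
    nodetool -h <host> move 85070591730234615865843651857942052864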
You can upgrade to 0.8.1 to fix this. :)
On Tue, Jul 12, 2011 at 1:03 PM, Stephen Pope wrote:
> Hey there. I'm trying to convert one of my sstables to json, but it doesn't
> appear to be escaping quotes. As a result, I've got a line in my resulting
> json like this:
>
> "3230303930373139313734
On Tue, Jul 12, 2011 at 11:42 PM, David Hawthorne wrote:
> Well, I was using a large number of clients: I tried configuring a hector
> pool of 20-200 to see what affect that had on throughput. There's definitely
> a point after which there's no gain, so I dialed it back down. To clarify a
> f
Hello,
I'm testing using a 2 node cluster. My CF only has counters and I have
replicate_on_write=true.
I loaded data into the CF using CL=ALL.
At some point, it couldn't write anymore (looked like a GC froze one of the
machines) which makes sense.
However, I happened to run nodetool repair and it is h
Hi Jonathan,
Thanks for the answer. I wanted to report on the improvements I got because
someone else is bound to run into the same questions...
> > C) I want to access a key that is at the 50th position in that table,
> > Cassandra will seek position 0 and then do a sequential read of the file
>
On Jul 12, 2011, at 3:02 PM, Sylvain Lebresne wrote:
> On Tue, Jul 12, 2011 at 11:42 PM, David Hawthorne
> wrote:
>> Well, I was using a large number of clients: I tried configuring a hector
>> pool of 20-200 to see what affect that had on throughput. There's definitely
>> a point after whic
Hi Jonathan,
Thanks for your mail, but none of the things
mentioned in the link pertains to the OOM error we are seeing.
thanks
Anurag
On Tue, Jul 12, 2011 at 10:42 AM, Jonathan Ellis wrote:
> Have you seen
> http://www.datastax.com/docs/0.8/troubleshooting/index#nodes-are-d
I'll post more tomorrow ... However, we set up one node in a single-node
cluster and have left it with no data. Reviewing memory consumption
graphs... it increased daily until it gobbled (highly technical term) all
memory... the system is now running just below 100% memory usage, which I
find peculiar
On Wed, Jul 13, 2011 at 12:18 AM, David Hawthorne wrote:
>
> On Jul 12, 2011, at 3:02 PM, Sylvain Lebresne wrote:
>
>> On Tue, Jul 12, 2011 at 11:42 PM, David Hawthorne
>> wrote:
>>> Well, I was using a large number of clients: I tried configuring a hector
>>> pool of 20-200 to see what affect
On Tue, Jul 12, 2011 at 10:10 AM, Brandon Williams wrote:
> On Mon, Jul 11, 2011 at 11:51 PM, Casey Deccio wrote:
> > java.lang.RuntimeException: Cannot recover SSTable with version f
> (current
> > version g).
>
> You need to scrub before any streaming is performed.
>
>
Okay, turns out that my
On Wed, Jul 13, 2011 at 1:00 AM, David Hawthorne wrote:
> Thanks for looking at that.
>
> Our use case involves supercolumns that have 2-20,000 counters within them.
> For a set of continuous updates to one supercolumn, the behavior you're
> describing is:
Here's your problem. Don't do that. I
Thanks for the update, that is very useful!
On Tue, Jul 12, 2011 at 3:16 PM, Philippe wrote:
> Hi Jonathan,
> Thanks for the answer, I wanted to report on the improvements I got because
> someone else is bound to run into the same questions...
>
>>
>> > C) I want to access a key that is at the 50
Then you'll want to use MAT to analyze the dump the JVM gave you of
the heap at OOM time. (http://www.eclipse.org/mat/)
On Tue, Jul 12, 2011 at 3:22 PM, Anurag Gujral wrote:
> Hi Jonathan,
> Thanks for your mail. But no-one of the things
> mentioned in the link pertains to O
Using Cassandra 0.8.1 and cql 1.0.3 and following the syntax mentioned
in https://issues.apache.org/jira/browse/CASSANDRA-2473
cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
KEY = '1_20110728_ifoutmulticastpkts';
Bad Request: line 1:51 no viable alternative at character '+'
Try quoting the column name.
On Tue, Jul 12, 2011 at 5:30 PM, Aaron Turner wrote:
> Using Cassandra 0.8.1 and cql 1.0.3 and following the syntax mentioned
> in https://issues.apache.org/jira/browse/CASSANDRA-2473
>
> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
> KEY = '1_
Doesn't seem to help:
cqlsh> UPDATE RouterAggWeekly SET '1310367600' = '1310367600' + 17
WHERE KEY = '1_20110728_ifoutmulticastpkts';
Bad Request: line 1:55 no viable alternative at character '+'
cqlsh> UPDATE RouterAggWeekly SET 1310367600 = '1310367600' + 17 WHERE
KEY = '1_20110728_ifoutmultica
As Ryan said, Cassandra's really designed to manage DAS. That's
probably a big part of the problem.
If you have to use a SAN, I recommend checking out this thread:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Cassandra-on-iSCSI-td5945217.html
On Tue, Jun 14, 2011 at 8:16 AM,
> Thanks Peter, but... hmmm, are you saying that even after a cache miss which
> results in a disk read and blocks being moved to the ssd, that by the next
> cache miss for the same data and subsequent same file blocks, that the ssd
> is unlikely to have those same blocks present anymore?
I am say
> To be sure that I didn't misunderstand (English is not my mother tongue) here
> is what the entire "repair paragraph" says ...
Read it, I maintain my position - the book is wrong or at the very
least strongly misleading.
You *definitely* need to run nodetool repair periodically for the
reasons
Running version 0.7.6-2, recently upgraded from 0.7.3.
I am getting a timeout exception when I run a particular
get_indexed_slices, which results in the following error showing up on
a few nodes:
ERROR [ReadStage:16] 2011-07-12 23:01:31,424
AbstractCassandraDaemon.java (line 114) Fatal exception in
On 7/12/2011 9:02 PM, Peter Schuller wrote:
Thanks Peter, but... hmmm, are you saying that even after a cache miss which
results in a disk read and blocks being moved to the ssd, that by the next
cache miss for the same data and subsequent same file blocks, that the ssd
is unlikely to have those
On 7/12/2011 10:48 AM, Yang wrote:
for example,
coord writes records 1, 2, 3, 4, 5 in sequence
if you have replicas A, B, C
currently A can have 1, 3
B can have 1, 3, 4
C can have 2, 3, 4, 5
by "prefix", I mean I want them to have only 1..n where n is some
number between 1 and 5,
for example A having 1, 2
that is not an important issue; it's separate from the replication
question I'm thinking about.
For now I'll just think about the case where every node owns the same
key range, or N=RF.
> Are you saying: All replicas will receive the value whether or not they
> actually own the key range for th