Hi R. Verlangen!
On 2011.12.27 at 15:50:24 +0100, R. Verlangen wrote:
> You might consider a hybrid solution with a transactional db for all data
> that should be ACID compliant and Cassandra for the huge amounts of data
> you want to store.
>
> 2011/12/27 Radim Kolar
>
> >
> > makes me
Yep. They put them up here:
http://www.datastax.com/events/cassandranyc2011/presentations
-brian
On Dec 27, 2011, at 4:52 AM, Alain RODRIGUEZ wrote:
> Anything new about this?
>
> I'm specifically interested in the Joe Stein (Medialets) talk about how to
> manage real-time multidimensional metrics.
I don't know what you are basing that on. It seems unlikely to me that
the working set of a compaction is 600 MB. However, it may very well
be that the allocation rate is such that it contributes to an
additional 600 MB average heap usage after a CMS phase has completed.
I will investigate the situation more closely using GC via jconsole, but
isn't the bloom filter for a new sstable entirely in memory?
Hi,
I have a 3-node cluster running Cassandra 1.0.3 with replication
factor 3.
Recently I've noticed that some previously deleted rows have started to
reappear for some reason, and now I wonder: is this a known issue with
1.0.3?
Repairs have been running every weekend (gc_grace is 1
You might consider a hybrid solution with a transactional db for all data
that should be ACID compliant and Cassandra for the huge amounts of data
you want to store.
2011/12/27 Radim Kolar
>
> makes me feel disappointed about consistency in Cassandra, but I wonder if
> there is a way to work around it.
Kevin,
I just pulled the code and read through the design. Great stuff.
Any thought to potentially using this for real-time processing as well? Right
now, we have a set of Hadoop M/R jobs that operate against Cassandra for ETL.
We were looking at using Storm for the real-time processing side
Anything new about this?
I'm specifically interested in the Joe Stein (Medialets) talk about how to
manage real-time multidimensional metrics.
2011/12/10 Jonathan Ellis
> Not yet -- we're working on it.
>
> On Fri, Dec 9, 2011 at 1:48 PM, Brian O'Neill
> wrote:
> >
> > I may have missed it..
> That is a good reason for both to be configurable IMO.
Index sampling is currently configurable only per node; it would be
better to have it per keyspace, because we are using OLTP-like and OLAP
keyspaces in the same cluster. OLAP keyspaces have about 1000x more rows.
But it's difficult to estimate
I want to store an ID and a date, and I want to retrieve all entries from
dateA up to dateB. What exactly do I need to be able to perform:
select from my_column_family where date >= dateA and date < dateB;
@so: http://stackoverflow.com/q/8638646/226201
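One common way to model this in the 1.0 era (a sketch of the usual wide-row
approach, not the only answer): one row per entity, the date as the column
name in a sortable form, and a column slice for the range. The keyspace,
column family, and key names below are made up; this uses pycassa, a common
Python client:

    import pycassa

    # Hypothetical keyspace/column family; the comparator should sort
    # dates, e.g. UTF8Type over ISO-8601 strings.
    pool = pycassa.ConnectionPool('MyKeyspace', ['localhost:9160'])
    events = pycassa.ColumnFamily(pool, 'events')

    # Column name = date, column value = the ID you want to store.
    events.insert('sensor-1', {'2011-12-27': 'id-123'})

    # A column slice gives "date >= dateA and date < dateB". Note that
    # column_finish is inclusive in pycassa, so either pass the largest
    # value strictly below dateB or drop the boundary client-side.
    result = events.get('sensor-1',
                        column_start='2011-12-01',
                        column_finish='2011-12-31')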
>> A key innovation here is a partitioning layout algorithm that can support
>> fast
>> many to many recovery similar to HDFS but still support partitioned
>> operation
>> with deterministic key placement.
>>
>
> Thanks for your contribution.
>
> Is there more detailed info on this point?
>
yes... our
On Tue, Dec 27, 2011 at 2:31 PM, Kevin Burton wrote:
>
> I'm pleased to announce Peregrine 0.5.0 - a new map reduce framework
> optimized
> for iterative and pipelined map reduce jobs.
>
> http://peregrine_mapreduce.bitbucket.org/
>
> This originally started off with some internal work at Spinn3r
makes me feel disappointed about consistency in Cassandra, but I wonder if
there is a way to work around it.
Cassandra is not suitable for this kind of program. CouchDB is slightly
better: it has transactions but no locking, and I am not sure whether
transaction isolation is supported now. MongoDB
> I will investigate the situation more closely using GC via jconsole, but
> isn't the bloom filter for a new sstable entirely in memory? On disk there
> are only two files, Index and Data.
> -rw-r--r-- 1 root wheel 1388969984 Dec 27 09:25
> sipdb-tmp-hc-4634-Index.db
> -rw-r--r-- 1 root wheel 1096522137
> But is there any way of implementing a minimal required ACID subset on
> top of Cassandra?
Try this; it's a NoSQL ACID-compliant store. I haven't tested it; it will
most likely have pretty slow writes and a lot of bugs, like any other Oracle
application.
http://www.oracle.com/technetwork/database/nosqld
> on a node with 300M rows (a small node), there will be 585,937 index sample
> entries with 512 sampling. Let's say 100 bytes per entry: this will be
> 585 MB; bloom filters are 884 MB. With the default sampling of 128, sampled
> entries will use a majority of node memory. Index sampling should be
> reworked like bloom filters.
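For what it's worth, a back-of-the-envelope check of those numbers (the
per-entry cost is an assumption; 585,937 entries at a literal 100 bytes each
is closer to 59 MB, so the 585 MB figure implies an in-heap cost nearer 1 KB
per sample once key bytes and JVM object overhead are counted):

    # Rough heap estimate for index samples; per-entry cost is assumed.
    rows = 300 * 10**6         # rows on the node
    interval = 512             # index sampling interval
    bytes_per_entry = 1000     # assumed: key + JVM object overhead

    entries = rows // interval                  # ~585,937 samples
    heap_mb = entries * bytes_per_entry / 1e6   # ~586 MB at 1 KB/sample
    print(entries, heap_mb)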
Compaction should delete empty rows once gc_grace_seconds is passed, right?
Feng Qu
> Compaction should delete empty rows once gc_grace_seconds is passed, right?
Yes.
--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>> Compaction should delete empty rows once gc_grace_seconds is passed, right?
>
> Yes.
But just to be extra clear: data will not actually be removed until the
row in question participates in a compaction, and compactions will not be
actively triggered by Cassandra for tombstone processing reasons.
--
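To spell out the timeline implied above (a summary, not a quote from the
thread):

    t0                        row is deleted; a tombstone is written
    t0 + gc_grace_seconds     the tombstone becomes eligible for purging
    next compaction of the    the tombstone and the data it shadows are
    row after that point      actually removed from disk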
> But just to be extra clear: data will not actually be removed until the
> row in question participates in a compaction, and compactions will not be
> actively triggered by Cassandra for tombstone processing reasons.
Leveled compaction is really good for this because it compacts often
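For anyone who wants to try it, switching an existing column family over
looks roughly like this from cassandra-cli on 1.0 (the column family name is
made up; test outside production first):

    update column family sipdb with
      compaction_strategy = 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy';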
Hi!
I was trying to get an understanding of the real strengths of Cassandra
against other competitors. It's actually not that simple, and depends a lot
on the details of the actual requirements.
Reading the following comparison:
http://kkovacs.eu/cassandra-vs-mongodb-vs-couchdb-vs-redis
It felt like
This is not really a comparison of anything, because each NoSQL gets its own
bullet points like:
Boats: great for traveling on water
Cars: great for traveling on land
So what conclusion should I gather?
Also, the Cassandra bullet points are really thin (and wrong).
Such as:
Cassan
> Also, when comparing these technologies, very subtle differences in design
> have profound effects on operation and performance. Thus someone trying
> to paper over 6 technologies and compare them with a few bullet points is
> really doing the world an injustice.
+1. Same goes for 99% of all be
If I change endpoint_snitch from SimpleSnitch to PropertyFileSnitch,
does it require a restart of Cassandra on that node?
Thanks.
demo, it will be in cassandra 1.0.7
standard Cassandra bloom filter:
-rw-r--r-- 1 root wheel 19307376721 Dec 27 20:06 sipdb-hc-4634-Data.db
-rw-r--r-- 1 root wheel 63 Dec 27 20:06
sipdb-hc-4634-Digest.sha1
-rw-r--r-- 1 root wheel  770714896 Dec 27 20:06 sipdb-hc-4634-Filter.db
-rw
> How large are the bloom filters in total? I.e., the sizes of the
> *-Filter.db files.
On a moderate node, about 6.5 GB; index sampling will be about 4 GB; the
heap is 12 GB. (That is roughly 10.5 GB, or about 88% of the heap, before
anything else.)
> In general, don't expect to be able to run at close to heap capacity;
there *will* be spikes.
I try to tune for 80% of heap.
>> In general, don't expect to be able to run at close to heap capacity;
>> there *will* be spikes.
> I try to tune for 80% of heap.
Just FYI: at 80% target heap usage, my guess is you're likely to have
fallbacks to full compacting GCs. If you are doing analytics only and
aren't latency critical,
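For reference, the knob that controls when CMS starts collecting lives in
cassandra-env.sh; starting CMS earlier (a lower initiating occupancy) leaves
more headroom above the live set and reduces fallbacks to full GCs (example
values only, not a recommendation):

    # standard HotSpot flags, as set in cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
    JVM_OPTS="$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=75"
    JVM_OPTS="$JVM_OPTS -XX:+UseCMSInitiatingOccupancyOnly"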
You are totally right. I'm far from being an expert on the subject, but the
comparison felt inconsistent and incomplete. (I could not express that in my
first email, so as not to bias the opinions.)
Do you know of any similar comparison that is not biased towards some
particular technology or solution
Don't trust NoSQL benchmarks. It's not that they lie, but NoSQL performs
differently in many different environments.
Benchmark with your real environment, and choose based on that.
Thank you.
2011/12/28 Igor Lino
> You are totally right. I'm far from being an expert on the subject, but
> the comparison fe
> If I change endpoint_snitch from SimpleSnitch to PropertyFileSnitch,
> does it require a restart of Cassandra on that node?
Yes.
--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
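For anyone following along: PropertyFileSnitch reads
conf/cassandra-topology.properties, which maps each node's IP to a data
center and rack. The addresses and names here are made up:

    # cassandra-topology.properties (illustrative entries)
    192.168.1.10=DC1:RAC1
    192.168.1.11=DC1:RAC2
    10.0.0.10=DC2:RAC1
    # fallback for any node not listed above
    default=DC1:RAC1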
I do major compactions, and I have run into bloom filters causing OOM. One
trick I used was to lower the size of the row/key caches with nodetool before
triggering the compaction, and to raise them after the compaction finished.
As suggested, running with spare heap is a very good idea; it lowers the
chance of a stop-the-world GC.
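The cache trick above, sketched with nodetool (keyspace, column family, and
capacities are made up; setcachecapacity takes the key-cache capacity, then
the row-cache capacity, in entries):

    # shrink the caches before the major compaction
    nodetool -h localhost setcachecapacity MyKeyspace sipdb 10000 0
    nodetool -h localhost compact MyKeyspace sipdb
    # restore them once the compaction has finished
    nodetool -h localhost setcachecapacity MyKeyspace sipdb 200000 100000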
Hello,
I am new to the world of non-relational databases. Cassandra is
refreshingly easy to set up and has a great command-line environment. I
genuinely like the command-line tools and look forward to learning
more.
However, I have been asked to set up a php/cassandra site that also has
some mysql