Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
> My point still applies though. Caching HFile blocks on a single node >> vs individual "datums" on N nodes may not be more efficient. Thus >> terms like "Slower" and "Less Efficient" could be very misleading. >> > I seem to have missed this the first time around. Next time I correct the summary I

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 2:39 PM, Edward Capriolo wrote: > @Todd. Good catch about caching HFile blocks. > > My point still applies though. Caching HFile blocks on a single node > vs individual "datums" on N nodes may not be more efficient. Thus > terms like "Slower" and "Less Efficient" could be

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
Seems accurate to me. One small correction - the daemon in HBase that serves regions is known as a "region server" rather than a region master. The RS is the equivalent of the tablet server in Bigtable terminology. -Todd On Mon, Nov 22, 2010 at 4:50 PM, David Jeske wrote: > This is my second at

unsubscribe

2010-11-22 Thread ywf2008
2010-11-23 ywf2008

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
This is my second attempt at a summary of Cassandra vs HBase consistency and performance for an HBase-acceptable workload. I think these tricky subtleties are hard to understand, yet it's helpful for the community to understand them. I'm not trying to state my own facts (or opinion) but merely summa

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Edward Capriolo
On Mon, Nov 22, 2010 at 5:48 PM, David Jeske wrote: > > > On Mon, Nov 22, 2010 at 2:44 PM, David Jeske wrote: >> >> On Mon, Nov 22, 2010 at 2:39 PM, Edward Capriolo >> wrote: >>> >>> Return messages such as "your data was written to at least 1 node but >>> not enough to make your write-consisten

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
On Mon, Nov 22, 2010 at 2:44 PM, David Jeske wrote: > On Mon, Nov 22, 2010 at 2:39 PM, Edward Capriolo wrote: > >> Return messages such as "your data was written to at least 1 node but >> not enough to make your write-consistency count". Do not help the >> situation. As the client that writes the

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
On Mon, Nov 22, 2010 at 2:39 PM, Edward Capriolo wrote: > Return messages such as "your data was written to at least 1 node but > not enough to make your write-consistency count". Do not help the > situation. As the client that writes the data would be aware of the > inconsistency, but the other c

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Edward Capriolo
On Mon, Nov 22, 2010 at 5:14 PM, Todd Lipcon wrote: > On Mon, Nov 22, 2010 at 1:58 PM, David Jeske wrote: >> >> On Mon, Nov 22, 2010 at 11:52 AM, Todd Lipcon wrote: >>> >>> Not quite. The replica synchronization code is pretty messy, but >>> basically it will take the longest replica that may ha

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 1:58 PM, David Jeske wrote: > On Mon, Nov 22, 2010 at 11:52 AM, Todd Lipcon wrote: > >> Not quite. The replica synchronization code is pretty messy, but basically >> it will take the longest replica that may have been synced, not a quorum. >> >> i.e. the guarantee is that

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
On Mon, Nov 22, 2010 at 11:52 AM, Todd Lipcon wrote: > Not quite. The replica synchronization code is pretty messy, but basically > it will take the longest replica that may have been synced, not a quorum. > > i.e. the guarantee is that "if you successfully sync() data, it will be > present after

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 1:26 PM, Edward Capriolo wrote: > For cassandra all writes must be transmitted to all replicas. > CASSANDRA-1314 does not change how writes happen. Write operations > will still affect cache (possibly evicting things if cache is full). > Reads however will prefer a single n

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
On Mon, Nov 22, 2010 at 1:26 PM, Edward Capriolo wrote: > For cassandra all writes must be transmitted to all replicas. > I thought that was only true if you set the number of replicas required for the write to the same as the number of replicas. Further, we've established in this thread that ev

Re: Newbie question on Cassandra mem usage

2010-11-22 Thread Aaron Morton
They are memtable_throughput_in_mb, memtable_flush_after_mins, memtable_operations_in_millions. Under 0.7 these are per-CF settings; in 0.6 they are cluster-wide. To start with, try turning the mb one down to something like 64 or 128, ops to 0.5 and mins to 60. What version are you using? Aaron On 23 Nov, 20

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Edward Capriolo
For cassandra all writes must be transmitted to all replicas. CASSANDRA-1314 does not change how writes happen. Write operations will still affect cache (possibly evicting things if cache is full). Reads however will prefer a single node of its possible replicas. This should cause better cache uti

Re: Newbie question on Cassandra mem usage

2010-11-22 Thread Trung Tran
Hi, Is it the min_compaction_threshold and max_compaction_threshold? Do I need to lower the memtable setting also? Thanks, Trung. On Mon, Nov 22, 2010 at 12:02 PM, Jonathan Ellis wrote: > Set your columnfamily thresholds lower. > > On Mon, Nov 22, 2010 at 12:45 PM, Trung Tran wrote: >> Hi, >>

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 12:03 PM, Edward Capriolo wrote: > What of reads that are not in the cache? > Cassandra can use memory-mapped I/O for its data and index files. HBase > has a very expensive read path for things that are not in cache. HDFS > random read performance is historically poor. > Ye

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
> > 2) Cassandra has a less efficient memory footprint for data pinned in > memory (or cached). With 3 replicas on Cassandra, each element of data > pinned in-memory is kept in memory on 3 servers, whereas in hbase only > region masters keep the data in memory, so there is only one copy of > each data e

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Edward Capriolo
On Mon, Nov 22, 2010 at 2:56 PM, Edward Capriolo wrote: > On Mon, Nov 22, 2010 at 2:52 PM, Todd Lipcon wrote: >> On Mon, Nov 22, 2010 at 10:01 AM, David Jeske wrote: >>> >>> I haven't used either Cassandra or hbase, so please don't take any part of >>> this message as me attempting to state facts

Re: Newbie question on Cassandra mem usage

2010-11-22 Thread Jonathan Ellis
Set your columnfamily thresholds lower. On Mon, Nov 22, 2010 at 12:45 PM, Trung Tran wrote: > Hi, > > I have a test cluster of 3 nodes, 14Gb of mem in each node, > replication factor = 3. With default -Xms and Xmx, my nodes are set to > have max-heap-size = 7Gb. After initial load with about 200M

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Edward Capriolo
On Mon, Nov 22, 2010 at 2:52 PM, Todd Lipcon wrote: > On Mon, Nov 22, 2010 at 10:01 AM, David Jeske wrote: >> >> I haven't used either Cassandra or hbase, so please don't take any part of >> this message as me attempting to state facts about either system. However, >> I'm very familiar with data-s

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread Todd Lipcon
On Mon, Nov 22, 2010 at 10:01 AM, David Jeske wrote: > I haven't used either Cassandra or hbase, so please don't take any part of > this message as me attempting to state facts about either system. However, > I'm very familiar with data-storage design details, and I've worked > extensively optimiz

Re: Newbie question on Cassandra mem usage

2010-11-22 Thread Trung Tran
Hi, Thanks for the guideline. I did not turn up any memory setting; the nodes are configured with all default settings (except that disk access is using mmap). I have 3 nodes with 1 client using hector, 8 writing threads. There are 3 CF, 1 standard and 2 super. Thanks, Trung. On Mon, Nov 22, 2010

Re: Rows missing after new node bootstrapped

2010-11-22 Thread Jonathan Ellis
I think you'll need to show us how to reproduce without your custom LoadFunc, e.g., with normal index scans outside of pig. On Wed, Nov 17, 2010 at 3:56 PM, Christian Decker wrote: > On Tue, Nov 16, 2010 at 6:58 PM, Jonathan Ellis wrote: >> >> I'm pretty sure that "reading an index" and "using p

Re: Newbie question on Cassandra mem usage

2010-11-22 Thread Aaron Morton
The higher memory usage for the Java process may be because of memory-mapped file access; take a look at the disk_access_mode in cassandra.yaml. WRT going OutOfMemory: - What are your Memtable thresholds in cassandra.yaml? - How many Column Families do you have? - What are your row and key cache set

Newbie question on Cassandra mem usage

2010-11-22 Thread Trung Tran
Hi, I have a test cluster of 3 nodes, 14Gb of mem in each node, replication factor = 3. With default -Xms and Xmx, my nodes are set to have max-heap-size = 7Gb. After initial load with about 200M rows (write with hector default consistencylevel = quorum), my nodes' memory usage is up to 13.5Gb, sh

Re: cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
I already noticed a mistake in my own facts... On Mon, Nov 22, 2010 at 10:01 AM, David Jeske wrote: > *4) Cassandra (N3/W3/R1) takes longer to allow data to become writable > again in the face of a node-failure than HBase/HDFS.* Cassandra must > repair the keyrange to bring N from 2 to 3 to resu

Re: Cassandra memtable and GC

2010-11-22 Thread Terje Marthinussen
. > > - graph-diskT-stat-with-jmx.png: graph of cpu load, LiveSSTableCount > > and logarithm of MemtableDataSize. > > - log-gc.20101122-12:41.160M.log.gz: GC log with -XX:+PrintGC > > -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > > > > As you can see from the second g

cassandra vs hbase summary (was facebook messaging)

2010-11-22 Thread David Jeske
I haven't used either Cassandra or hbase, so please don't take any part of this message as me attempting to state facts about either system. However, I'm very familiar with data-storage design details, and I've worked extensively optimizing applications running on MySQL, Oracle, BerkeleyDB (including

Re: Cassandra memtable and GC

2010-11-22 Thread Edward Capriolo
aph-diskT-stat-with-jmx.png: graph of cpu load, LiveSSTableCount > and logarithm of MemtableDataSize. > - log-gc.20101122-12:41.160M.log.gz: GC log with -XX:+PrintGC > -XX:+PrintGCDetails -XX:+PrintGCTimeStamps > > As you can see from the second graph, logarithm of MemtableDataSize > and cp

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread Nick Telford
Provided at least one node receives the write, it will eventually be written to all replicas. A failure to meet the requested ConsistencyLevel is just that; not a failure to write the data itself. Once the write is received by a node, it will eventually reach all replicas, there is no roll back. T
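
The behaviour described in this message (a write that misses its ConsistencyLevel is still kept by whichever node received it, and later spreads to the other replicas) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `write` and `repair` functions and the dict-based replicas are hypothetical names invented for illustration, not Cassandra's code or API, and the single `repair` pass stands in loosely for whatever background mechanism eventually propagates the data.

```python
from typing import Dict, List, Tuple

# Toy replica: key -> (timestamp, value). Hypothetical model, not Cassandra code.
Replica = Dict[str, Tuple[int, int]]


def write(replicas: List[Replica], reachable: List[int], key: str,
          ts: int, value: int, required: int) -> bool:
    """Apply the write on every reachable replica; report whether enough acked.

    A False return means the requested consistency level was not met.
    Nothing is rolled back on the replicas that did accept the write.
    """
    acks = 0
    for i in reachable:
        replicas[i][key] = (ts, value)
        acks += 1
    return acks >= required


def repair(replicas: List[Replica], key: str) -> None:
    """Propagate the version with the highest timestamp to every replica."""
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return
    newest = max(versions)  # tuples compare by timestamp first
    for r in replicas:
        r[key] = newest


# Three replicas all hold value 11 written at timestamp 1.
replicas: List[Replica] = [{"k": (1, 11)} for _ in range(3)]

# A quorum write of 12 reaches only replica 0: the client sees a failure...
ok = write(replicas, reachable=[0], key="k", ts=2, value=12, required=2)

# ...but after repair every replica converges on 12, not 11.
repair(replicas, "k")
```

In this model the client is told the write failed, yet the newer timestamp wins once the replicas converge, which is the 11-versus-12 question raised elsewhere in this thread.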

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread David Boxenhorn
Yes, but the value is supposed to be 11, since the write failed. On Mon, Nov 22, 2010 at 2:27 PM, André Fiedler wrote: > Doesn't Cassandra sync all nodes once the network is up again? I think this > was one of the reasons for storing a timestamp with every key/value pair. > So I think the response will

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread André Fiedler
Doesn't Cassandra sync all nodes once the network is up again? I think this was one of the reasons for storing a timestamp with every key/value pair. So I think the response will only temporarily be 11. Once all nodes have synced, it should be 12. Or isn't that so? Greetings, André 2010/11/22 Samuel Carrière

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread Samuel Carrière
>Cassandra can work in a consistent way, see some of this discussion and the >Consistency section here http://wiki.apache.org/cassandra/ArchitectureOverview > >If you always read and write with CL.Quorum (or the other way discussed) you >will have consistency. Even if some of the replicas are tem

Re: How can I get rows in groups?

2010-11-22 Thread aaron morton
If you are working inside the cassandra code base, take a look at o.a.c.hadoop.ColumnFamilyRecordReader. It reads all the rows in a CF using tokens. I'm not sure that code cares too much about reading a row twice. AFAIK using tokens for this is considered an internal feature. WRT the start key / end

Re: data size

2010-11-22 Thread aaron morton
Is this from a clean install? Have you been deleting data? Could this be your problem? http://wiki.apache.org/cassandra/FAQ#i_deleted_what_gives If not you'll need to provide some more details: which version, what the files are on disk, what was the data you loaded etc. Hope that helps Aar

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread aaron morton
Cassandra can work in a consistent way, see some of this discussion and the Consistency section here http://wiki.apache.org/cassandra/ArchitectureOverview If you always read and write with CL.Quorum (or the other way discussed) you will have consistency. Even if some of the replicas are temporar
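
The quorum rule mentioned in this message comes down to simple arithmetic: with N replicas, a read of R replicas is guaranteed to overlap a write acknowledged by W replicas whenever R + W > N. A minimal sketch of that check follows; the function names are made up for illustration and are not part of any Cassandra client.

```python
def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1


def read_overlaps_write(n: int, w: int, r: int) -> bool:
    """True when any r replicas read must include one of the w replicas written."""
    return r + w > n


n = 3
q = quorum(n)

# Quorum reads plus quorum writes always overlap on at least one replica.
quorum_both = read_overlaps_write(n, w=q, r=q)

# Writing to all replicas lets a single-replica read see the write (N3/W3/R1).
write_all_read_one = read_overlaps_write(n, w=3, r=1)

# Write-one/read-one gives no overlap guarantee: eventual consistency only.
one_one = read_overlaps_write(n, w=1, r=1)
```

With N = 3 the quorum is 2, so quorum reads and writes satisfy 2 + 2 > 3, as does the N3/W3/R1 configuration discussed elsewhere on the list.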

Re: How can I get rows in groups?

2010-11-22 Thread altanis
I am not using any client, I am trying to extend Cassandra with a new API call so that a _node_ will do that on behalf of clients. Thank you for the answer, but it doesn't answer my question! Alexander > Most of the high level clients do this for you. > > For example, pycassa and phpcassa both do

Re: Facebook messaging and choice of HBase over Cassandra - what can we learn?

2010-11-22 Thread David Boxenhorn
It's true that Cassandra has "tunable consistency", but if eventual consistency is not sufficient for most of your use cases, Cassandra becomes much less attractive. Am I wrong? On Sun, Nov 21, 2010 at 7:56 PM, Eric Evans wrote: > On Sun, 2010-11-21 at 11:32 -0500, Simon Reavely wrote: > > As