Re: [SPAM] Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 14:29, David Boxenhorn wrote: Jonathan, are Donal Zang's results (10x slowdown) typical? On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > Another thing I notice

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Jonathan Ellis
If the rows you are updating are not cached, yes. (Otherwise maybe 10% slower.) On Mon, Jun 6, 2011 at 7:29 AM, David Boxenhorn wrote: > Jonathan, are Donal Zang's results (10x slowdown) typical? > > On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: >> >> On Mon, Jun 6, 2011 at 6:28 AM, Don

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread David Boxenhorn
Jonathan, are Donal Zang's results (10x slowdown) typical? On Mon, Jun 6, 2011 at 3:14 PM, Jonathan Ellis wrote: > On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > > Another thing I noticed is: if you first do insertion, and then build the secondary index use "update column family ...

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Jonathan Ellis
On Mon, Jun 6, 2011 at 6:28 AM, Donal Zang wrote: > Another thing I noticed is: if you first do the insertion, then build the secondary index using "update column family ...", and then do a select based on the index, the result is not right (seems the index is still being built though the "upda

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 10:15, David Boxenhorn wrote: Is there really a 10x difference between indexed CFs and non-indexed CFs? Well, as for my test, it is! I'm using 0.7.6-2, 9 nodes, 3 replicas, write_consistency_level QUORUM, about 90,000,000 rows (~1K per row). I use 20 processes, 20 rows for each inse

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread David Boxenhorn
Is there really a 10x difference between indexed CFs and non-indexed CFs? On Mon, Jun 6, 2011 at 11:05 AM, Donal Zang wrote: > On 06/06/2011 05:38, Jonathan Ellis wrote: > >> Index updates require read-before-write (to find out what the prior >> version was, if any, and update the index accordin

Re: [SPAM] Re: slow insertion rate with secondary index

2011-06-06 Thread Donal Zang
On 06/06/2011 05:38, Jonathan Ellis wrote: Index updates require read-before-write (to find out what the prior version was, if any, and update the index accordingly). This is random i/o. Index creation on the other hand is a lot of sequential i/o, hence more efficient. So, the classic bulk loa

Re: slow insertion rate with secondary index

2011-06-05 Thread Jonathan Ellis
Index updates require read-before-write (to find out what the prior version was, if any, and update the index accordingly). This is random i/o. Index creation on the other hand is a lot of sequential i/o, hence more efficient. So, the classic bulk load advice to ingest data prior to creating ind
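To make the read-before-write point concrete, here is a minimal sketch in Python. It is a toy in-memory model, not Cassandra's actual code; the names ToyIndexedStore and build_index are made up for illustration. It only shows why every indexed insert needs a lookup of the prior value, while building the index afterwards is a single pass over data that already exists.

```python
class ToyIndexedStore:
    """Toy in-memory model of an indexed store (hypothetical, not Cassandra)."""

    def __init__(self):
        self.rows = {}    # row key -> value of the indexed column
        self.index = {}   # column value -> set of row keys
        self.reads = 0    # number of read-before-write lookups performed

    def insert(self, key, value):
        # Read-before-write: fetch the prior value so that a stale
        # index entry can be removed. On disk this is a random read.
        self.reads += 1
        old = self.rows.get(key)
        if old is not None and old != value:
            self.index[old].discard(key)
            if not self.index[old]:
                del self.index[old]
        self.rows[key] = value
        self.index.setdefault(value, set()).add(key)

    def lookup(self, value):
        return sorted(self.index.get(value, set()))


def build_index(rows):
    """Build the index in one pass over existing rows (load first, index later)."""
    index = {}
    for key, value in rows.items():
        index.setdefault(value, set()).add(key)
    return index


store = ToyIndexedStore()
store.insert("r1", "a")
store.insert("r2", "a")
store.insert("r1", "b")   # overwrite: the old entry under "a" must be removed
```

In this model, store.reads grows by one per insert, whereas build_index(store.rows) never needs to look up a previous version. That is the asymmetry behind the bulk-load advice above.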

slow insertion rate with secondary index

2011-06-05 Thread Donal Zang
I did an insertion test with and without secondary indexes, and found that: Without secondary index: ~10864 rows inserted per second. With secondary index on one column (BytesType): ~1515 rows inserted per second. Is this normal? Why would a secondary index have so much effect? I noticed that if I bu
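For reference, the slowdown implied by the two rates quoted in this message can be worked out directly. This is plain arithmetic on the numbers above; the variable names are only for illustration.

```python
# Insert rates quoted in the message (rows per second).
without_index = 10864
with_index = 1515

# Ratio of the two rates: how many times slower indexed inserts were.
slowdown = without_index / with_index
print(f"{slowdown:.1f}x slower with the secondary index")
```

These two figures give roughly a 7x difference; the "10x" mentioned later in the thread is presumably a round number or refers to measurements not shown in this snippet.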