Re: mmap

2010-07-15 Thread Peter Schuller
> This would require that Cassandra run as root on Linux systems, as 'man > mlockall' states: IIRC, mlock() (as opposed to mlockall()) does not require root privileges - but is subject to resource limitations. However, given a lack of control of how memory is allocated in the JVM I suppose mlock

Re: Bootstrap question

2010-07-15 Thread Anthony Molinaro
Okay, so things were pretty messed up. I shut down all the new nodes, then the old nodes started doing the half the ring is down garbage which pretty much requires a full restart of everything. So I had to shut everything down, then bring the seed back, then the rest of the nodes, so they finally

Re: key types and grouping related rows together

2010-07-15 Thread Aaron Morton
Yes, you need to maintain the secondary index yourself. Send a batch_mutation and write the article and website article columns at the same time. I think you're safe up to a large number of cols, say 1M. Not sure, may try to track the info down one day. A On 16 Jul, 2010, at 03:39 PM, S Ahmed wrote: S

Re: A very short summary on Cassandra for a book

2010-07-15 Thread David Strauss
On 2010-07-16 01:57, Dave Viner wrote: > I am no expert... but parts seem accurate, parts not. > > "Cassandra stores four or five dimension associated arrays" > not sure what you're counting as a dimension of the associated array, > but here are the 2 associative array-like syntaxes: > > ColumnFa

Re: Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

2010-07-15 Thread Ilya Maykov
The column names are arbitrary strings, so it's not obvious what the "next" value should be at any step. So, I just set the start of the next page to the end of the last page and eliminate the duplicate value when joining the 2 pages together. The paging direction does not matter in my case, as I
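
The paging approach described in this message (start each page at the last column of the previous page, then drop the duplicate) can be sketched roughly as follows. This is only an illustration in Python against an in-memory sorted list. `get_slice_stub` is a hypothetical stand-in, not the Thrift get_slice() call, and it assumes the start column is inclusive, which is why the boundary column shows up twice.

```python
from bisect import bisect_left


def get_slice_stub(columns, start, count):
    """Hypothetical stand-in for a slice query.

    Returns up to `count` column names that are >= `start`,
    taken from an already sorted list of column names.
    """
    i = bisect_left(columns, start)
    return columns[i:i + count]


def page_all(columns, page_size):
    """Collect every column name by paging forward.

    Each page starts at the last column seen so far, so the first
    element of every page after the first is a duplicate and is dropped.
    """
    if page_size < 2:
        # With a page size of 1 every later page would hold only the duplicate.
        raise ValueError("page_size must be at least 2")
    result = []
    start = ""  # the empty string sorts before any other column name
    while True:
        page = get_slice_stub(columns, start, page_size)
        if result and page and page[0] == result[-1]:
            page = page[1:]  # drop the duplicated boundary column
        if not page:
            break
        result.extend(page)
        start = result[-1]
    return result
```

Note that the page size has to be at least 2 for this scheme to make progress, since one slot per page is spent on the repeated boundary column.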

Re: key types and grouping related rows together

2010-07-15 Thread S Ahmed
So am I to keep track of the # of columns for a given key in CF WebsiteArticle? i.e. if I want to do a get_slice for the first 10 OR last 10 (I would need to know the count to get the last 10). >>Am assuming RP. There are some recommendations on the number of cols per key, in the millions I think

Re: Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

2010-07-15 Thread Paul Brown
You should make sure that your directions and interval endpoints are chosen correctly. I recall the semantics of the call being like an old-school for with the descending flag as a step of +1 or -1. -- Spelling by mobile. On Jul 15, 2010, at 20:19, Ilya Maykov wrote: > Hi all, > > I'm tryi

Seeing very weird results on 0.6.2 when paginating through a ColumnFamily with get_slice()

2010-07-15 Thread Ilya Maykov
Hi all, I'm trying to debug some pretty weird behavior when paginating through a ColumnFamily with get_slice(). It basically looks like Cassandra does not respect the limit parameter in the SlicePredicate, sometimes returning more than limit columns. It also sometimes silently drops columns. I'm r

Re: A very short summary on Cassandra for a book

2010-07-15 Thread Dave Viner
I am no expert... but parts seem accurate, parts not. "Cassandra stores four or five dimension associated arrays" not sure what you're counting as a dimension of the associated array, but here are the 2 associative array-like syntaxes: ColumnFamily[row-key][column-name] = value1 ColumnFamily[row-

Re: ERROR 22:59:00,329 Error in ThreadPoolExecutor

2010-07-15 Thread Claire Chang
I saw this in the kernel log: jsvc uses 32-bit capabilities. Is this right? Our server is Linux 2.6.32-23-generic #37-Ubuntu SMP Fri Jun 11 08:03:28 UTC 2010 x86_64 GNU/Linux On Jul 15, 2010, at 11:04 AM, Claire Chang wrote: > I am using Random Partitioner. The other 2 nodes are working fine.

Re: key types and grouping related rows together

2010-07-15 Thread Aaron Morton
You could build a secondary index, e.g. CFArticles : { article_id1 : {}, article_id2 : {} } CFWebsiteArticle : { website_id1 : { time_uuid : article_id1, time_uuid2 : article_id2 } } When you want to get the last 10 for a website, get_slice from the WebsiteArticle CF, then multiget from Articles. Am assuming
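
The two-step lookup suggested in this message (read the newest entries from the per-website index row, then fetch those articles) might look roughly like this. It is a sketch using plain Python dictionaries in place of the two column families. The sample data, the integer keys standing in for TimeUUIDs, and the function name `last_articles` are all assumptions made for illustration.

```python
# In-memory stand-ins for the two column families (hypothetical data).
articles = {
    "article_id1": {"title": "first"},
    "article_id2": {"title": "second"},
    "article_id3": {"title": "third"},
}

# website id -> {time-ordered column name: article id}.
# Integers stand in for TimeUUID column names here.
website_articles = {
    "website_id1": {1: "article_id1", 2: "article_id2", 3: "article_id3"},
}


def last_articles(website_id, count):
    """Return the `count` most recent articles for a website, newest first."""
    index_row = website_articles.get(website_id, {})
    # Step 1: slice the index row in reverse order (the "get_slice" step).
    newest_columns = sorted(index_row, reverse=True)[:count]
    article_ids = [index_row[name] for name in newest_columns]
    # Step 2: fetch the referenced rows (the "multiget" step).
    return [articles[a] for a in article_ids if a in articles]
```

Because the index row is ordered by time, the "last 10" query reads the row in reverse and does not need a separate column count.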

A very short summary on Cassandra for a book

2010-07-15 Thread Karoly Negyesi
Hi, I am writing a scalability chapter in a book and I need to mention Apache Cassandra although it's just a mention. Still I would not like to be sloppy and would like to get verification whether my summary is accurate. "Cassandra stores four or five dimension associated arrays. The first dimensi

Re: mmap

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 5:46 PM, Clint Byrum wrote: > One other approach that works on Linux is to use HugeTLB. This post details > the process for doing so with a jvm: > > http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html > > Basically when mmapping using HUGETLB you don't have t

Re: mmap

2010-07-15 Thread Clint Byrum
On Jul 15, 2010, at 2:52 PM, Jonathan Ellis wrote: > On Thu, Jul 15, 2010 at 3:56 PM, Carlos Alvarez wrote: >> On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis wrote: >>> The main problem is not the syscall so much as Java insisting on >>> zeroing out any buffer you create, which is a big hit to

Re: How to change the RF and repair

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 5:29 PM, Mubarak Seyed wrote: >  Just want to verify with group that what i am doing wrt RF is correct. > 1. Nodes were running with RF=2 > 2. Stopped all the nodes, changed the RF to 4 > 3. Started all the nodes, verify the cluster ring using nodetool, all the > nodes are

How to change the RF and repair

2010-07-15 Thread Mubarak Seyed
Just want to verify with group that what i am doing wrt RF is correct. 1. Nodes were running with RF=2 2. Stopped all the nodes, changed the RF to 4 3. Started all the nodes, verify the cluster ring using nodetool, all the nodes are part of cluster 4. Ran nodetool repair on all the nodes 5. Ran n

Re: mmap

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 3:56 PM, Carlos Alvarez wrote: > On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis wrote: >> The main problem is not the syscall so much as Java insisting on >> zeroing out any buffer you create, which is a big hit to performance >> when you're allocating buffers for file i/

Re: key types and grouping related rows together

2010-07-15 Thread S Ahmed
Given a CF like: Articles : { key1 : { title:"some title", body: "this is my article body...", }, key1 : { title:"some title", body: "this is my article body...", } } Now these articles could be for different websites e.g. www.website1.com, www.website2.com If I want to get the

Re: key types and grouping related rows together

2010-07-15 Thread S Ahmed
Benjamin, Ah, thanks for clarifying that. key sorting is changing in .7 I believe to support a binary array? On Thu, Jul 15, 2010 at 3:26 PM, Benjamin Black wrote: > Keys are always sorted (in 0.6) as UTF8 strings. The CompareWith > applies to _columns_ within rows, _not_ to row keys. > > On

Re: mmap

2010-07-15 Thread Carlos Alvarez
On Thu, Jul 15, 2010 at 2:01 PM, Jonathan Ellis wrote: > The main problem is not the syscall so much as Java insisting on > zeroing out any buffer you create, which is a big hit to performance > when you're allocating buffers for file i/o on each request instead of > just mmaping things.  Re-using

Re: Bootstrap question

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 3:28 PM, Anthony Molinaro wrote: > Is the fact that 2 new nodes are in the range messing it up? Probably. >  And if so > how do I recover (I'm thinking, shutdown new nodes 2,3,4,5, the bringing > up nodes 2,4, waiting for them to finish, then bringing up 3,5?). Yes. You

Re: Bootstrap question

2010-07-15 Thread Anthony Molinaro
Oh, and looking at the load on the new machines it appears that New 2 and New 6 have gotten some data (although neither is in the ring yet). Not sure if that clears anything up though. -Anthony On Thu, Jul 15, 2010 at 01:28:06PM -0700, Anthony Molinaro wrote: > This is a cluster which is horri

Re: Bootstrap question

2010-07-15 Thread Anthony Molinaro
This is a cluster which is horribly imbalanced because I didn't assign initial tokens, so I'm adding 6 nodes with tokens according to the operations page (ie, i * (2^127/N) with N = 6). So here's what the ring will look like when bootstrap finishes 151901684708361811491018697
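
The token formula quoted in this message, i * (2^127/N), can be evaluated with a few lines of Python. This is a minimal sketch that assumes i runs from 0 to N-1 and that the division is integer division. The function name `initial_tokens` is made up for the example.

```python
def initial_tokens(node_count):
    """Return evenly spaced tokens i * (2**127 // node_count), i = 0..node_count-1."""
    if node_count <= 0:
        raise ValueError("node_count must be positive")
    step = 2 ** 127 // node_count
    return [i * step for i in range(node_count)]


if __name__ == "__main__":
    # Print one token per line for a six-node ring, as in the message.
    for token in initial_tokens(6):
        print(token)
```

Python integers are arbitrary precision, so the 127-bit values need no special handling.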

Re: key types and grouping related rows together

2010-07-15 Thread Benjamin Black
Keys are always sorted (in 0.6) as UTF8 strings. The CompareWith applies to _columns_ within rows, _not_ to row keys. On Wed, Jul 14, 2010 at 1:44 PM, S Ahmed wrote: > Where is the link that describes the various key types and their impact on > sorting? (I believe I read it before, can't seem to

Re: nodetool repair

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 1:54 PM, B. Todd Burruss wrote: > if i have N=3 and run nodetool repair on node X.  i assume that merkle > trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since > N=3).  when the repair is finished are nodes X, X+1, and X+2 all in sync > with respect to node X

Re: mmap

2010-07-15 Thread Peter Schuller
> I'm convinced. :)  See comments on > https://issues.apache.org/jira/browse/CASSANDRA-1214 Noted :) To be clear I only mentioned it as an acknowledgement that everyone didn't necessarily agree with what I was saying. > The main problem is not the syscall so much as Java insisting on > zeroing ou

Re: CassandraBulkLoader

2010-07-15 Thread Torsten Curdt
> If you could can you please share the command line function (to load TSV)? There is no command line function ... you have to write code for this. > and Can you please help me on storing storage-conf.xml on HDFS part? As I said. Maybe you better start with a simpler scenario and leave out HDFS

nodetool repair

2010-07-15 Thread B. Todd Burruss
if i have N=3 and run nodetool repair on node X. i assume that merkle trees (at a minimum) are calculated on nodes X, X+1, and X+2 (since N=3). when the repair is finished are nodes X, X+1, and X+2 all in sync with respect to node X's data? or does X have the latest data and X+1 and X+2 still in

Re: ERROR 22:59:00,329 Error in ThreadPoolExecutor

2010-07-15 Thread Claire Chang
I am using Random Partitioner. The other 2 nodes are working fine. There are no Errors in the log files for the 2 good nodes. There were no log messages within 30 minutes before the exception occurs. Here is the last log statement before the exception occurred. INFO [COMPACTION-POOL:1] 2010-07

ERROR 22:59:00,329 Error in ThreadPoolExecutor

2010-07-15 Thread Claire Chang
I am trying to set up a 3 node cluster. RF=3 and CL=1 for most of the request. The initial seeding took about 1 hour to complete which loaded each node with 2G of data. After the seeding completed, one node started having this exception and hung. Read/Write with CL=ALL timed out but CL=QUORUM wa

Re: CassandraBulkLoader

2010-07-15 Thread Mubarak Seyed
Hi Torsten, If you could can you please share the command line function (to load TSV)? and Can you please help me on storing storage-conf.xml on HDFS part? Thanks, Mubarak On Tue, Jul 13, 2010 at 1:27 AM, Torsten Curdt wrote: > On Tue, Jul 13, 2010 at 04:35, Mubarak Seyed > wrote: > > Where c

Re: mmap

2010-07-15 Thread Jonathan Ellis
On Thu, Jul 15, 2010 at 11:41 AM, Peter Schuller wrote: > Not really. That is, the intent of mmap is to let the OS dynamically > choose what gets swapped in and out. The practical problem is that the > OS will often tend to swap too much. I got the impression jbellis > wasn't convinced, but my ane

Re: Hintedhandoff will never complete when a BIG rowmutation

2010-07-15 Thread Schubert Zhang
Yes, I think the current HintedHandOff implementation in 0.6.x cannot support large hints; it is a risk in a production system. On Tue, Jun 29, 2010 at 12:31 AM, albert_e wrote: > In 0.6.2, HH sending MUTATION message using the same OutboundTcpConnection > with READ message. When HH transferring big

Re: mmap

2010-07-15 Thread Schubert Zhang
I found, for large dataset, long-term random reading test, the performance with mmap is very bad. See the attached chart in https://issues.apache.org/jira/browse/CASSANDRA-1214. On Fri, Jul 16, 2010 at 12:41 AM, Peter Schuller < peter.schul...@infidyne.com> wrote: > > Can someone please explain t

Re: mmap

2010-07-15 Thread Peter Schuller
> Can someone please explain the mmap issue. > mmap is default for all storage files for 64bit machines. > according to this case https://issues.apache.org/jira/browse/CASSANDRA-1214 > it might not be a good thing. > Is it right to say that you should use mmap only if your MAX expected data > is sm

Re: key types and grouping related rows together

2010-07-15 Thread S Ahmed
Do you think a composite key using a key type of Bytes would work? How many bytes can it be? public static byte [] createRowKey(int websiteid, long stamp) throws Exception { byte [] websiteidBytes = Bytes.toBytes(websiteid); byte [] stampBytes = Bytes.toBytes(stamp); return Bytes.add(websi
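
For comparison, the composite-key idea in this message can be sketched in Python with the standard `struct` module. This assumes a 4-byte website id followed by an 8-byte timestamp, both big-endian, giving a 12-byte key. The function names are hypothetical, and whether a given Cassandra version accepts raw bytes as a row key is not addressed here.

```python
import struct

# ">iq" = big-endian, 4-byte signed int followed by 8-byte signed long.
_KEY_FORMAT = ">iq"


def create_row_key(website_id, stamp):
    """Pack the website id and timestamp into a 12-byte composite key."""
    return struct.pack(_KEY_FORMAT, website_id, stamp)


def parse_row_key(key):
    """Unpack a composite key back into (website_id, stamp)."""
    return struct.unpack(_KEY_FORMAT, key)
```

With big-endian packing and non-negative values, byte-wise comparison of two keys orders them first by website id and then by timestamp, which is what keeps a website's rows adjacent under a byte-ordered comparison.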

want to change algorithm used in OPP for token and key comparison

2010-07-15 Thread Sagar Agrawal
Hi, I am using OrderPreservingPartitioner, and my keys are integers which are stored as strings, I want to manually assign token values equal to my key values such that data is equally distributed. So for this to work, I want to convert the token and key strings to integers before doing compareTo
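
The ordering problem behind this question is that integers stored as strings compare character by character, not by value. The sketch below shows the difference in plain Python. The zero-padding helper `pad_key` is an alternative added here for illustration, not something proposed in the message, and it assumes non-negative integers of at most ten digits.

```python
keys = ["9", "10", "2", "100"]

# Plain string comparison, as an order-preserving comparison on strings would do.
string_order = sorted(keys)

# Comparison after converting each key to an integer.
numeric_order = sorted(keys, key=int)


def pad_key(key, width=10):
    """Left-pad a non-negative integer string with zeros to a fixed width."""
    return key.zfill(width)


# Padded strings sort the same way as the integers they represent.
padded_order = sorted(keys, key=pad_key)
```

Here `string_order` places "10" and "100" before "2", while `numeric_order` and `padded_order` agree with each other.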

Re: key types and grouping related rows together

2010-07-15 Thread S Ahmed
Well I'm not talking about a specific column family here, as ALL my column families will have content that is specific to a certain website, so I need a strategy that I will use on almost all my column families. On Wed, Jul 14, 2010 at 9:20 PM, Schubert Zhang wrote: > for your apps, how about th

Re: Bootstrap Token collision

2010-07-15 Thread Gary Dusbabek
Did you add a new node to the cluster at the time you restarted it? If not, I would think that each node already had a token that would make such a collision impossible, unless we have a new bug to troubleshoot. Gary. On Wed, Jul 14, 2010 at 20:46, Mubarak Seyed wrote: > The cluster nodes were r

Re: Data in Cassandra

2010-07-15 Thread Jonathan Ellis
Short answer: yes, this is normal. Longer answer: this was discussed at length on this list a few days ago, check the archives. On Wed, Jul 14, 2010 at 10:55 PM, Hendro Kaskus wrote: > Hi everyone, > > I'm newbie to Cassandra :D.. I try to insert data from MySQL to Cassandra. > Data dump from My

mmap

2010-07-15 Thread shimi
Can someone please explain the mmap issue. mmap is default for all storage files for 64bit machines. According to this case https://issues.apache.org/jira/browse/CASSANDRA-1214 it might not be a good thing. Is it right to say that you should use mmap only if your MAX expected data is smaller than th

Re: Data in Cassandra

2010-07-15 Thread Dimitry Lvovsky
It could be that your Cassandra nodes haven't fully compacted yet. On Thu, Jul 15, 2010 at 5:55 AM, Hendro Kaskus wrote: > Hi everyone, > > I'm newbie to Cassandra :D.. I try to insert data from MySQL to Cassandra. > Data dump from MySQL is about 11 MB (64716 records). But when i'm insert to > Cas