unsibscribe

2012-05-30 Thread Maxim Potekhin

row cache -- does it have data from other nodes?

2012-05-17 Thread Maxim Potekhin
Hello, when I chose to have a rowcache -- will it contain data that is owned by other nodes? Thanks Maxim

Re: Cassandra search performance

2012-05-07 Thread Maxim Potekhin
Thanks for the comments, much appreciated. Maxim On 5/7/2012 3:22 AM, David Jeske wrote: On Sun, Apr 29, 2012 at 4:32 PM, Maxim Potekhin <potek...@bnl.gov> wrote: Looking at your example, as I think you understand, you forgo indexes by combining two conditions in

Re: Cassandra search performance

2012-04-29 Thread Maxim Potekhin
Jason, I'm using plenty of secondary indexes with no problem at all. Looking at your example, as I think you understand, you forgo indexes by combining two conditions in one query, thinking along the lines of what is often done in RDBMS. A scan is expected in this case, and there is no magic to av

Re: Server Side Logic/Script - Triggers / StoreProc

2012-04-29 Thread Maxim Potekhin
About a year ago I started getting a strange feeling that the noSQL community is busy re-creating RDBMS in minute detail. Why did we bother in the first place? Maxim On 4/27/2012 6:49 PM, Data Craftsman wrote: > Howdy, > > Some Polyglot Persistence(NoSQL) products started support server side >

Re: RMI/JMX errors, weird

2012-04-24 Thread Maxim Potekhin
using nodetool disablegossip and disablethrift, and then turn off the IO limiter with nodetool setcompactionthroughput 0. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 20/04/2012, at 12:29 AM, Maxim Potekhin wrote: Hello Aaron, how sh

Re: RMI/JMX errors, weird

2012-04-18 Thread Maxim Potekhin
17) On 4/12/2012 10:03 PM, aaron morton wrote: Look at the server side logs for errors. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 13/04/2012, at 11:47 AM, Maxim Potekhin wrote: Hello, I'm doing compactions under 0.8.8.

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
The "offending" CF only has one. The other one, that seems to behave well, has nine. Maxim On 4/17/2012 10:20 AM, Jake Luciani wrote: How many indexes are there? On Tue, Apr 17, 2012 at 10:16 AM, Maxim Potekhin <potek...@bnl.gov> wrote: Yes. Sorry I didn

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
family? -Jake On Tue, Apr 17, 2012 at 10:09 AM, Maxim Potekhin <potek...@bnl.gov> wrote: I understand that indexes are CFs. But the compaction stats says it's building the index, not compacting the corresponding CF. Either that's an ambiguous diagnost

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
since the secondary indexes are themselves column families they too are compacted along with everything else. On Tue, Apr 17, 2012 at 10:02 AM, Maxim Potekhin <potek...@bnl.gov> wrote: Thanks Jake. Then I am definitely seeing weirdness, as there are tons of

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
use compaction manager to rebuild. On Tue, Apr 17, 2012 at 9:47 AM, Maxim Potekhin <potek...@bnl.gov> wrote: Thanks Aaron. Just to be clear, every time I do a compaction, I rebuild all indexes from scratch. Right? Maxim On 4/17/2012 6:16 AM, aaron morton wrote:

Re: Is the secondary index re-built under compaction?

2012-04-17 Thread Maxim Potekhin
http://www.thelastpickle.com On 17/04/2012, at 1:06 PM, Maxim Potekhin wrote: I noticed that "nodetool compactionstats" shows the building of the secondary index while I initiate compaction. Is this to be expected? Cassandra version 0.8.8. Thank you Maxim

Is the secondary index re-built under compaction?

2012-04-16 Thread Maxim Potekhin
I noticed that "nodetool compactionstats" shows the building of the secondary index while I initiate compaction. Is this to be expected? Cassandra version 0.8.8. Thank you Maxim

RMI/JMX errors, weird

2012-04-12 Thread Maxim Potekhin
Hello, I'm doing compactions under 0.8.8. Recently, I started seeing a stack trace like the one below, and I can't figure out what causes this to appear. The cluster has been in operation for more than half a year w/o errors like this one. Any help will be appreciated, Thanks Maxim WARNING: F

a very simple indexing question (strange thing seen in CLI)

2012-04-07 Thread Maxim Potekhin
Greetings, Cassandra 0.8.8 is used. I'm trying to create an additional CF which is trivial in all respects. Just ascii columns and a few indexes. This is how I add an index: update column family files with column_metadata = [{column_name : '1', validation_class : AsciiType, index_type : 0, i

Re: import

2012-04-01 Thread Maxim Potekhin
Since Python has a native csv module, it's trivial to achieve. I load lots of csv data into my database daily. Maxim On 3/27/2012 11:44 AM, R. Verlangen wrote: You can write your own script to parse the excel file (export as csv) and import it with batch inserts. Should be pretty easy if you
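A minimal sketch of the approach mentioned above: read a CSV export with Python's standard csv module and group the rows into fixed-size batches for insertion. The helper names (iter_batches, csv_batches), the sample data and the batch size are illustrative assumptions, not taken from the thread; the actual insert call depends on the client library in use and is deliberately left out.

```python
import csv
import io
from itertools import islice


def iter_batches(rows, batch_size):
    """Yield lists of at most batch_size items from any iterable."""
    it = iter(rows)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def csv_batches(fileobj, batch_size=500):
    """Parse CSV with a header row and yield batches of dict rows."""
    reader = csv.DictReader(fileobj)
    yield from iter_batches(reader, batch_size)


# Hypothetical sample standing in for a spreadsheet exported as CSV.
sample = io.StringIO("id,name\n1,a\n2,b\n3,c\n")
batches = list(csv_batches(sample, batch_size=2))
```

Each batch can then be handed to whatever batch-insert call the client provides, which keeps memory use bounded even for large daily loads.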

Re: Building a brand new cluster and readying it for production -- advice needed

2012-03-13 Thread Maxim Potekhin
cache can use this and the OS can cache disk blocks with it. Edward On Tue, Mar 13, 2012 at 3:15 PM, Maxim Potekhin wrote: Dear All, after all the testing and continuous operation of my first cluster, I've been given an OK to build a second production Cassandra cluster in Europe. There wer

Building a brand new cluster and readying it for production -- advice needed

2012-03-13 Thread Maxim Potekhin
Dear All, after all the testing and continuous operation of my first cluster, I've been given an OK to build a second production Cassandra cluster in Europe. There were posts in recent weeks regarding the most stable and solid Cassandra version. I was wondering if anything better has appeare

Re: Implications of length of column names

2012-02-28 Thread Maxim Potekhin
When I migrated data from our RDBMS, I hashed column names to integers. This makes for some footwork, but the space gain is clearly there so it's worth it. I de-hash on read. Maxim On 2/10/2012 5:15 PM, Narendra Sharma wrote: It is good to have short column names. They save space all the way
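One way to picture the "hash on write, de-hash on read" idea is a fixed lookup table from long column names to short integer names. This is only a sketch under assumptions: the class name ColumnNameCodec, the sample column names and the choice of a sequential mapping (rather than an actual hash function) are all hypothetical, since the message does not say how the mapping was built.

```python
class ColumnNameCodec:
    """Map long column names to short integer names and back."""

    def __init__(self, names):
        # The order of `names` defines the integer assigned to each column,
        # so the same list must be used for writing and for reading.
        self._to_id = {name: i for i, name in enumerate(names)}
        self._to_name = {i: name for name, i in self._to_id.items()}

    def encode(self, row):
        """Replace each column name with its short integer form (as a string)."""
        return {str(self._to_id[name]): value for name, value in row.items()}

    def decode(self, row):
        """Restore the original column names on read."""
        return {self._to_name[int(key)]: value for key, value in row.items()}


# Hypothetical column names, for illustration only.
codec = ColumnNameCodec(["computing_site", "job_status", "modification_time"])
stored = codec.encode({"job_status": "finished", "computing_site": "BNL"})
restored = codec.decode(stored)
```

The mapping itself has to be kept somewhere stable, because changing the order of the names would silently change the meaning of already stored columns.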

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
Thank you so much, looks nice, I'll be looking into it. On 2/22/2012 3:08 PM, Rob Coli wrote: On Wed, Feb 22, 2012 at 10:37 AM, Maxim Potekhin <potek...@bnl.gov> wrote: The idea was to provide redundancy, resilience, automatic load balancing and automatic

Re: Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
l. On Wed, Feb 22, 2012 at 9:04 AM, Maxim Potekhin <potek...@bnl.gov> wrote: Hello everybody, I'm being asked whether we can serve an "object", which I assume is a blob, of 750MB size?

Please advise -- 750MB object possible?

2012-02-22 Thread Maxim Potekhin
Hello everybody, I'm being asked whether we can serve an "object", which I assume is a blob, of 750MB size? I guess the real question is how to chunk it, and whether it's even possible to chunk it. Thanks! Maxim
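On the chunking part of the question, a client-side split is straightforward in principle. The sketch below cuts a byte stream into numbered chunks and reassembles them; the function names, the 1 MB default and the idea of storing each chunk under its index are assumptions for illustration, not a recommendation from the thread.

```python
import io


def iter_chunks(stream, chunk_size=1024 * 1024):
    """Yield (index, bytes) pairs read sequentially from a binary stream."""
    index = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            return
        yield index, data
        index += 1


def reassemble(chunks):
    """Join (index, bytes) pairs back into one blob, whatever order they arrive in."""
    return b"".join(data for _, data in sorted(chunks))


# Small stand-in for a large object: 2560 bytes cut into 1000-byte chunks.
blob = bytes(range(256)) * 10
chunks = list(iter_chunks(io.BytesIO(blob), chunk_size=1000))
rebuilt = reassemble(reversed(chunks))
```

Each (index, bytes) pair could be written as one column or one row, with the index preserving the order needed to rebuild the object on read.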

Re: nodetool hangs and didn't print anything with firewall

2012-02-08 Thread Maxim Potekhin
That's good to hear because it does present a problem for a strictly managed and firewalled campus environment. Maxim On 2/6/2012 11:57 AM, Nick Bailey wrote: JMX is not very firewall friendly. The problem is that JMX is a two connection process. The first connection happens on port 7199 and t

Re: Encrypting traffic between Hector client and Cassandra server

2012-01-31 Thread Maxim Potekhin
Hello, do you see any value in having a web service over cassandra, with actual client-clients talking to it via https/ssl? This way the cluster can be firewalled and therefore protected, plus you get decent auth/auth right there. Maxim On 1/31/2012 5:21 PM, Xaero S wrote: I have been try

Re: Restart cassandra every X days?

2012-01-28 Thread Maxim Potekhin
Sorry if this has been covered, I was concentrating solely on 0.8x -- can I just d/l 1.0.x and continue using same data on same cluster? Maxim On 1/28/2012 7:53 AM, R. Verlangen wrote: Ok, seems that it's clear what I should do next ;-) 2012/1/28 aaron morton >

Problematic deletes in 0.8.8

2012-01-27 Thread Maxim Potekhin
Hello, after I thought I was out of the woods with data deletion in 0.8.8, I unfortunately see "undead" data and other strange behavior. Let me clarify: a) I do run repair and compaction well within GC_GRACE b) deletes happen daily c) after a few repairs, when I run an indexed query on the dat

Re: Restart cassandra every X days?

2012-01-25 Thread Maxim Potekhin
I also do repair, compact and cleanup every couple of days, and also have daily restarts on crontab. It doesn't hurt and I avoid having a node becoming unresponsive after many days of operation, that has happened before. Older files get cleaned up on restart. It doesn't take long to shut down

Re: Cassandra & usage

2012-01-24 Thread Maxim Potekhin
You provide zero information on what you are planning to do with the data. Thus, your question is impossible to answer. On 1/24/2012 9:38 PM, francesco.tangari@gmail.com wrote: Do you think that for a standard project with 50.000.000 of rows on 2-3 machines cassandra is appropriate or i sh

Re: Cassandra x MySQL Sharded - Insert Comparison

2012-01-24 Thread Maxim Potekhin
a) I hate to break it to you, but 6GB x 4 cores != 'high-end machine'. It's pretty much middle of the road consumer level these days. b) Hosting the client and Cassandra on the same node is a Bad Idea. It will depend on what exactly the client will do, but in my experience it won't work too we

Re: Cassandra x MySQL Sharded - Insert Comparison

2012-01-22 Thread Maxim Potekhin
Hello, I have some experience in benchmarking Cassandra against Oracle and in running on a VM cluster. While the VM solution will work for many applications, it simply won't cut it for all. In particular, I observed a large difference in insert performance when I moved from VM to real hardwar

Re: ideal cluster size

2012-01-20 Thread Maxim Potekhin
You can also scale not "horizontally" but "diagonally", i.e. raid SSDs and have multicore CPUs. This means that you'll have same performance with fewer nodes, making it far easier to manage. SSDs by themselves will give you an order of magnitude improvement on I/O. On 1/19/2012 9:17 PM, Thorsten

Re: Cassandra to Oracle?

2012-01-20 Thread Maxim Potekhin
Another way of solving this could be to index the fields in search engine. On Fri, Jan 20, 2012 at 7:37 PM, Maxim Potekhin wrote: What makes you think that RDBMS will give you acceptable performance? I guess you will try to index it to death (because otherwise the "ad hoc" queries won't

Re: Cassandra to Oracle?

2012-01-20 Thread Maxim Potekhin
What makes you think that RDBMS will give you acceptable performance? I guess you will try to index it to death (because otherwise the "ad hoc" queries won't work well if at all), and at this point you may be hit with a performance penalty. It may be a good idea to interview users and build d

Re: delay in data deleting in cassadra

2012-01-20 Thread Maxim Potekhin
Did you run repairs within GC_GRACE all the time? On 1/20/2012 3:42 AM, Shammi Jayasinghe wrote: Hi, I am experiencing a delay in delete operations in cassandra. Its as follows. I am running a thread which contains following three steps. Step 01: Read data from column family "foo"[1]

Re: Using 5-6 bytes for cassandra timestamps vs 8…

2012-01-18 Thread Maxim Potekhin
I must have accidentally deleted all messages in this thread save this one. At face value, we are talking about saving 2 bytes per column. I know it can add up with many columns, but relative to the size of the column -- is it THAT significant? I made an effort to minimize my CF footprint

Re: About initial token, autobootstraping and load balance

2012-01-15 Thread Maxim Potekhin
I see. Sure, that's a bit more complicated and you'd have to move tokens after adding a machine. Maxim On 1/15/2012 4:40 AM, ??? wrote: It's nothing wrong for 3 nodes. It's a problem for cluster of 20+ nodes, growing. 2012/1/14 Maxim Potekhin mailto:potek..

Re: About initial token, autobootstraping and load balance

2012-01-14 Thread Maxim Potekhin
I'm just wondering -- what's wrong with manual specification of tokens? I'm so glad I did it and have not had problems with balancing and all. Before I was indeed stuck with 25/25/50 setup in a 3 machine cluster, when I had to move tokens to make it 33/33/33 and I screwed up a little in that the
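For the manual token assignment discussed here, evenly spaced tokens can be computed with a few lines of Python. This sketch assumes the RandomPartitioner, whose tokens range from 0 to 2**127; the function names are hypothetical, and other partitioners would need a different ring size or scheme.

```python
def balanced_tokens(node_count, ring_size=2**127):
    """Return evenly spaced initial tokens, one per node (RandomPartitioner assumed)."""
    return [i * ring_size // node_count for i in range(node_count)]


def ownership(tokens, ring_size=2**127):
    """Return the fraction of the ring between each token and the next one."""
    ordered = sorted(tokens)
    following = ordered[1:] + [ordered[0] + ring_size]
    return [(nxt - cur) / ring_size for cur, nxt in zip(ordered, following)]


tokens = balanced_tokens(3)
shares = ownership(tokens)
```

With three nodes each share comes out at roughly one third, which corresponds to the 33/33/33 layout mentioned above; adding a node later means recomputing the list and moving tokens.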

Exception thrown during repair, contains jmx classes -- why?

2012-01-11 Thread Maxim Potekhin
As per the trace below, there is jmx.mbeanserver involved. What I ran was a common repair. Is that right? What does this failure indicate? at org.apache.cassandra.service.StorageService.forceTableRepair(StorageService.java:1613) at sun.reflect.NativeMethodAccessorImpl.invoke0(Nativ

Re: Should I throttle deletes?

2012-01-10 Thread Maxim Potekhin
Thanks, this makes sense. I'll try that. Maxim On 1/6/2012 10:51 AM, Vitalii Tymchyshyn wrote: Do you mean on writes? Yes, your timeouts must be so that your write batch could complete until timeout elapsed. But this will lower write load, so reads should not timeout. Best regards, Vitalii T

Re: How does Cassandra decide when to do a minor compaction?

2012-01-07 Thread Maxim Potekhin
at 10:03 AM, aaron morton <aa...@thelastpickle.com> wrote: http://www.datastax.com/docs/1.0/operations/tuning#tuning-compaction - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 7/01/2012, at 3:17

How does Cassandra decide when to do a minor compaction?

2012-01-06 Thread Maxim Potekhin
The subject says it all -- pointers appreciated. Thanks Maxim

Re: How to find out when a nodetool operation has ended?

2012-01-06 Thread Maxim Potekhin
Thanks, so I take it there is no solution outside of Opcenter. I mean of course I can redirect the output, with additional timestamps if needed, to a log file -- which I can access remotely. I just thought there would be some "status" command by chance, to tell me what maintenance the node is

How to find out when a nodetool operation has ended?

2012-01-06 Thread Maxim Potekhin
Suppose I start a repair on one or a few nodes in my cluster, from an interactive machine in the office, and leave for the day (which is a very realistic scenario imho). Is there a way to know, from a remote machine, when a particular action, such as compaction or repair, has been finished? I fi

Re: Should I throttle deletes?

2012-01-05 Thread Maxim Potekhin
Thanks, that's quite helpful. I'm wondering though if multiplying the number of clients will end up doing same thing. On 1/5/2012 3:29 PM, Philippe wrote: Then I do have a question, what do people generally use as the batch size? I used to do batches from 500 to 2000 like you do. Afte

Re: Should I throttle deletes?

2012-01-05 Thread Maxim Potekhin
Hello Aaron, On 1/5/2012 4:25 AM, aaron morton wrote: I use a batch mutator in Pycassa to delete ~1M rows based on a longish list of keys I'm extracting from an auxiliary CF (with no problem of any sort). What is the size of the deletion batches ? 2000 mutations. Now, it appears that suc

Should I throttle deletes?

2012-01-04 Thread Maxim Potekhin
Now that my cluster appears to run smoothly and after a few successful repairs and compacts, I'm back in the business of deletion of portions of data based on its date of insertion. For reasons too lengthy to be explained here, I don't want to use TTL. I use a batch mutator in Pycassa to delete ~

Re: Strange OOM when doing "list" in CLI

2012-01-04 Thread Maxim Potekhin
ry. I have counters using composite keys and about 1k columns causes this to happen. We should have some paging support with list. On Tuesday, January 3, 2012, Maxim Potekhin <potek...@bnl.gov> wrote: > I came back from Xmas vacation only to see that what always was an innocuou

Strange OOM when doing "list" in CLI

2012-01-03 Thread Maxim Potekhin
I came back from Xmas vacation only to see that what always was an innocuous procedure in CLI now reliably results in OOM -- does anyone have ideas why? It never happened before. Version of Cassandra is 0.8.8. 2956 java -ea -javaagent:/home/cassandra/cassandra/bin/../lib/jamm-0.2.2.jar -XX:+

Re: Cassandra WebUI with Sources released

2012-01-03 Thread Maxim Potekhin
Congrats on what seems to be a nice piece of work, need to check it out. Nicely complements other tools. Maxim On 1/2/2012 12:48 PM, Markus Wiesenbacher | Codefreun.de wrote: Hi, I wish you all a happy and healthy new year! As you may remember, I coded a little GUI for Apache Cassandra. Now

Can I slice on composite indexes?

2011-12-20 Thread Maxim Potekhin
Let's say I have rows with composite columns Like ("key1", {('xyz', 'abc'): 'colval1'}, {('xyz', 'def'): 'colval2'}) ("key2", {('ble', 'meh'): 'otherval'}) Is it possible to create a composite type index such that I can query on 'xyz' and get the first two columns? Thanks Maxim

Re: Doubts related to composite type column names/values

2011-12-20 Thread Maxim Potekhin
properties of a comment or all properties for comments between two comment_id's Finally, the client library knows what's going on. Hope that helps. - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 21/12/2011, at 7:43 AM, Maxim Potekhin wro

Re: Doubts related to composite type column names/values

2011-12-20 Thread Maxim Potekhin
With regards to static, what are major benefits as it compares with string catenation (with some convenient separator inserted)? Thanks Maxim On 12/20/2011 1:39 PM, Richard Low wrote: On Tue, Dec 20, 2011 at 5:28 PM, Ertio Lew wrote: With regard to the composite columns stuff in Cassandra,

Best way to implement indexing for high-cardinality values?

2011-12-14 Thread Maxim Potekhin
I now have a CF with extremely skinny rows (in the current implementation), and the application will want to query by more than one column value. Problem is that the values in a lot of cases will be high cardinality. One other factor is that I want to rotate data in and out of the system in one d

Crazy compactionstats

2011-12-14 Thread Maxim Potekhin
Hello I ran repair like this: nohup repair.sh & where repair.sh contains simply nodetool repair plus timestamp. The process dies while dumping this: Exception in thread "main" java.io.IOException: Repair command #1: some repair session(s) failed (see log for details). at org.apache.c

Asymmetric load

2011-12-14 Thread Maxim Potekhin
What could be the reason I see unequal loads on a 3-node cluster? This all started happening during repairs (which again are not going smoothly). Maxim

Re: Keys for deleted rows visible in CLI

2011-12-14 Thread Maxim Potekhin
#range_ghosts On Wed, Dec 14, 2011 at 4:36 AM, Radim Kolar wrote: Dne 14.12.2011 1:15, Maxim Potekhin napsal(a): Thanks. It could be hidden from a human operator, I suppose :) I agree. Open JIRA for it.

Re: commit log size

2011-12-14 Thread Maxim Potekhin
Alexandru, Jeremiah -- what setting needs to be tweaked, and what's the recommended value? I observed similar behavior this morning. Maxim On 11/28/2011 2:53 PM, Jeremiah Jordan wrote: Yes, the low volume memtables are causing the problem. Lower the thresholds for those tables if you don't

Re: Keys for deleted rows visible in CLI

2011-12-13 Thread Maxim Potekhin
that an operation has been performed to delete the data. Harold -Original Message- From: Maxim Potekhin [mailto:potek...@bnl.gov] Sent: Tuesday, December 13, 2011 4:03 PM To: user@cassandra.apache.org Subject: Keys for deleted rows visible in CLI Hello, I searched the archives and it

Keys for deleted rows visible in CLI

2011-12-13 Thread Maxim Potekhin
Hello, I searched the archives and it appears that this question was once asked but was not answered. I just deleted a lot of rows, and want to "list" in cli. I still see the keys. This is not the same as getting slices, is it? Anyhow, what's the reason and rationale? I run 0.8.8. Thanks Max

"show schema" bombs in 0.8.6

2011-12-13 Thread Maxim Potekhin
Running cli --debug: [default@PANDA] show schema; null java.lang.RuntimeException at org.apache.cassandra.cli.CliClient.executeCLIStatement(CliClient.java:310) at org.apache.cassandra.cli.CliMain.processStatement(CliMain.java:217) at org.apache.cassandra.cli.CliMain.mai

Deleted rows re-appearing on repair in 0,8.6

2011-12-12 Thread Maxim Potekhin
Hello, I know that this problem used to exist in 0.8.1 -- I delete rows, run a repair and these rows are back with a vengeance. I recall I was told that this was fixed in 0.8.6 -- is that the case? I still keep seeing that behavior. Thanks Maxim

Re: Cassandra 0.8.8

2011-12-09 Thread Maxim Potekhin
Hello everyone, so what's the update on 0.8.8? Many thanks Maxim On 12/2/2011 4:49 AM, Patrik Modesto wrote: Hi, It's been almost 2 months since the release of the 0.8.7 version and there are quite some changes in 0.8.8, so I'd like to ask is there a release date? Regards, Patrik

Really old files in the data directory

2011-12-09 Thread Maxim Potekhin
Hello, I varied the GC grace a few times over the period of my cluster's lifetime, but I never went above 10 days. I did compactions, repairs etc. Now, I see that some files in the data directories of the nodes that were there from day one carry timestamps back from July. There are files cont

Cassandra behavior too fragile?

2011-12-07 Thread Maxim Potekhin
OK, thanks to the excellent help of Datastax folks, some of the more severe inconsistencies in my Cassandra cluster were fixed (after a node was down and compactions failed etc). I'm still having problems as reported in "repairs 0.8.6." thread. Thing is, why is it so easy for the repair proces

Re: Repair failure under 0.8.6

2011-12-07 Thread Maxim Potekhin
I'm still having tons of problems with repairs and compactions, where the nodes are declared dead in their log files, although they were online at all times. This leads to problem behavior, i.e. once again I see that repair fails, and the cluster becomes unusable since there is no space to com

Re: exporting data from Cassandra cluster

2011-12-07 Thread Maxim Potekhin
Hello Alexandru, as you probably know, my group is using Amazon S3 to permanently (or semi-permanently) park the data in CSV format, which makes it portable and we can load it into anything if needed, or analyze on its own. Just my half of a Swiss centime :) And, because the S3 option is not f

forceUserDefinedCompaction -- how to use it?

2011-12-07 Thread Maxim Potekhin
Can anyone provide an example of how to use forceUserDefinedCompaction? Thanks Maxim

Could not reach schema agreement... 0.8.6

2011-12-05 Thread Maxim Potekhin
Hello, upon startup, in my cluster of 3 machines, I see similar messages in system.log on each node (below). I start nodes one by one, after I ascertain the previous one is online. So they can't reach schema agreement, all of them. Why? No unusual load visible in Ganglia plots. ERROR [Hinted

Re: Repair failure under 0.8.6

2011-12-05 Thread Maxim Potekhin
Basically I tweaked the phi, put in more verbose GC reporting and decided to do a compaction before I proceed. I'm getting this on the node where compaction is being run. And the system log for the other two nodes follows. It's obvious that the cluster is sick, but I can't determine why -- ther

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
As a side effect of the failed repair (so it seems) the disk usage on the affected node prevents compaction from working. It still works on the remaining nodes (we have 3 total). Is there a way to scrub the extraneous data? Thanks Maxim On 12/4/2011 4:29 PM, Peter Schuller wrote: I will try

Re: can not create a column family named 'index'

2011-12-04 Thread Maxim Potekhin
I seem to recall problems when using a cf called "indexRegistry", don't remember much detail now. Maxim On 11/30/2011 7:24 PM, Shu Zhang wrote: Hi, just wondering if this is intentional: [default@test] create column family index; Syntax error at position 21: mismatched input 'index' expecting

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
Please disregard the GC part of the question -- I found it. On 12/4/2011 4:12 PM, Maxim Potekhin wrote: Thanks Peter! I will try to increase phi_convict -- I will just need to restart the cluster after the edit, right? I do recall that I see nodes temporarily marked as down, only to pop up

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
Thanks Peter! I will try to increase phi_convict -- I will just need to restart the cluster after the edit, right? I do recall that I see nodes temporarily marked as down, only to pop up later. In the current situation, there is no load on the cluster at all, outside the maintenance like

Re: Repair failure under 0.8.6

2011-12-04 Thread Maxim Potekhin
at org.apache.cassandra.gms.Gossiper.access$700(Gossiper.java:57) at org.apache.cassandra.gms.Gossiper$GossipTask.run(Gossiper.java:157) On 12/3/2011 8:34 PM, Maxim Potekhin wrote: Thank you Peter. Before I look into details as you suggest, may I ask what you mean "automatically restarted"?

Re: Repair failure under 0.8.6

2011-12-03 Thread Maxim Potekhin
Thank you Peter. Before I look into details as you suggest, may I ask what you mean by "automatically restarted"? The way the box and Cassandra are set up in my case is such that the death of either is final. Also, how do I look for full GC? I just realized that in the latest install, I might have

Repair failure under 0.8.6

2011-12-03 Thread Maxim Potekhin
Please help -- I've been having pretty consistent failures that look like this one. Don't know how to proceed. Below text comes from the system log. The cluster was all up before and after the attempted repair, so I don't quite understand how Cassandra declared a node dead (in the below). Was is

Re: Yanking a dead node

2011-11-29 Thread Maxim Potekhin
Thanks! Looks pretty obvious in retrospect... Regards, Maxim On 11/24/2011 6:54 AM, Filipe Gonçalves wrote: Just remove its token from the ring using nodetool removetoken 2011/11/23 Maxim Potekhin: This was discussed a long time ago, but I need to know what's the state of the art a

How many indexes to keep? Guidelines

2011-11-29 Thread Maxim Potekhin
As a matter of practice, how many secondary indexes on a CF do you usually keep? What are rules of thumb? Is 10 too many? 100? 1000? Thanks Maxim

Yanking a dead node

2011-11-23 Thread Maxim Potekhin
This was discussed a long time ago, but I need to know what's the state of the art answer to that: assume one of my few nodes is very dead. I have no resources or time to fix it. Data is replicated so the data is still available in the cluster. How do I completely remove the dead node without ha

Re: 7199

2011-11-22 Thread Maxim Potekhin
Thanks. I'm trying to look up HttpAdaptor and what it does, can you give any pointers? Thanks. I didn't find much useful info just yet. Maxim On 11/22/2011 9:52 PM, Jeremiah Jordan wrote: Yes, that is the port nodetool needs to access. On Nov 22, 2011, at 8:43 PM, Maxim Pote

7199

2011-11-22 Thread Maxim Potekhin
Hello, I have this in my cassandra-env.sh JMX_PORT="7199" Does this mean that if I use nodetool from another node, it will try to connect to that particular port? Thanks, Maxim

Re: read performance problem

2011-11-19 Thread Maxim Potekhin
Try to see if there is a lot of paging going on, and run some benchmarks on the disk itself. Are you running Windows or Linux? Do you think the disk may be fragmented? Maxim On 11/19/2011 8:58 PM, Kent Tong wrote: Hi, On my computer with 2G RAM and a core 2 duo CPU E4600 @ 2.40GHz, I am tes

Re: A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
Should I file a ticket? I consistently see this behavior after a mass delete. On 11/17/2011 12:46 PM, Maxim Potekhin wrote: Thanks Jonathan. I get the below error. Don't have a clue as to what it means. null java.lang.RuntimeException

Re: Data Model Design for Login Servie

2011-11-17 Thread Maxim Potekhin
1122: { gender: MALE birthdate: 1987.11.09 name: Alfred Tester pwd: e72c504dc16c8fcd2fe8c74bb492affa alias1: alfred.tes...@xyz.de alias2: alf...@aad.de alias3: a...@dd.de

Varying number of rows coming from same query on same database

2011-11-17 Thread Maxim Potekhin
Hello, I'm running the same query repeatedly. It's a secondary index query, done from a Pycassa client. I see that when I iterate the "result" object, I get a slightly different number of entries when running the test serially. There are no deletions in the database, and no writes, it's static for n

What sort of load do the tombstones create on the cluster?

2011-11-17 Thread Maxim Potekhin
In view of my unpleasant discovery last week that deletions in Cassandra lead to a very real and serious performance loss, I'm working on a strategy of moving forward. If the tombstones do cause such problem, where should I be looking for performance bottlenecks? Is it disk, CPU or something el

Re: A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
11/17/2011 12:28 PM, Jonathan Ellis wrote: If CLI returns null it means there was an error -- run with --debug to check the exception. On Thu, Nov 17, 2011 at 11:20 AM, Maxim Potekhin wrote: Hello everyone, I run a query on a secondary index. For some queries, I get 0 rows returned. In other

A Cassandra CLI question: null vs 0 rows

2011-11-17 Thread Maxim Potekhin
Hello everyone, I run a query on a secondary index. For some queries, I get 0 rows returned. In other cases, I just get a string that reads "null". What's going on? TIA Maxim

Re: Mass deletion -- slowing down

2011-11-14 Thread Maxim Potekhin
eeks, structure your query so that your slice range only goes back 2 weeks, rather than to the beginning of time. this would avoid iterating over all the tombstones from prior to the 2 week window. this wouldn't work if you are deleting arbitrary days in the middle of your date range. O

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Thanks Peter, I'm not sure I entirely follow. By the oldest data, do you mean the primary key corresponding to the limit of the time horizon? Unfortunately, unique IDs and the timestamps do not correlate in the sense that chronologically "newer" entries might have a smaller sequential ID. That's

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
M, Brandon Williams wrote: On Sun, Nov 13, 2011 at 7:25 PM, Maxim Potekhin wrote: Each row represents a computational task (a job) executed on the grid or in the cloud. It naturally has a timestamp as one of its attributes, representing the time of the last update. This timestamp is used to grou

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
, but that doesn't seem to help. Thanks, Maxim On 11/13/2011 8:00 PM, Brandon Williams wrote: On Sun, Nov 13, 2011 at 6:55 PM, Maxim Potekhin wrote: Thanks to all for valuable insight! Two comments: a) this is not actually time series data, but yes, each item has a timestamp and thus chro

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
Thanks to all for valuable insight! Two comments: a) this is not actually time series data, but yes, each item has a timestamp and thus chronological attribution. b) so, what do you practically recommend? I need to delete half a million to a million entries daily, then insert fresh data. What's

Re: Mass deletion -- slowing down

2011-11-13 Thread Maxim Potekhin
such behavior? Thanks, Maxim On 11/10/2011 8:30 PM, Maxim Potekhin wrote: Hello, My data load comes in batches representing one day in the life of a large computing facility. I index the data by the day it was produced, to be able to quickly pull data for a specific day within the last ye

Is there a way to get only keys with get_indexed_slices?

2011-11-10 Thread Maxim Potekhin
Is there a way to get only keys with get_indexed_slices? Looking at the code, it's not possible, but -- is there some way anyhow? I don't want to extract any data, just a list of matching keys. TIA, Maxim

Mass deletion -- slowing down

2011-11-10 Thread Maxim Potekhin
Hello, My data load comes in batches representing one day in the life of a large computing facility. I index the data by the day it was produced, to be able to quickly pull data for a specific day within the last year or two. There are 6 other indexes. When it comes to retiring the data, I in

"Error connection to remote JMX agent" during repair

2011-11-07 Thread Maxim Potekhin
Hello, I'm trying to run "repair" on one of my nodes which needs to be repopulated after a failure of the hard drive. What I'm getting is below. Note: I'm not loading JMX with Cassandra, it always worked before... The version is 0.8.6. Any help will be appreciated, Maxim Error connection t

Re: Tool for SQL -> Cassandra data movement

2011-11-01 Thread Maxim Potekhin
Just a short comment -- we are going the CSV way as well because of its compactness and extreme portability. The CSV files are kept in the cloud as backup. They can also find other uses. JSON would work as well, but it would be at least twice as large in size. Maxim On 9/22/2011 1:25 PM, Nehal

Re: CMS GC initial-mark taking 6 seconds , bad?

2011-10-20 Thread Maxim Potekhin
Hello Aaron, I happen to have 48GB on each of the machines I use in the cluster. Can I assume that I can't really use all of this memory productively? Do you have any suggestion related to that? Can I run more than one instance of Cassandra on the same box (using different ports) to take advantage of

Re: [RELEASE] Apache Cassandra 1.0 released

2011-10-18 Thread Maxim Potekhin
There was a problem in early 0.8 where the repair was taking forever -- am I right to assume this was fixed in 1.0? Many thanks to you guys, Maxim On 10/18/2011 2:25 PM, Thibaut Britz wrote: Great news! Especially the improved read performance and compactions are great! Thanks, Thibaut On
