Re: slow read

2012-03-05 Thread ruslan usifov
2012/3/5 Jeesoo Shin > Hi all. > > I have very SLOW READ here. :-( > I made a cluster with three node (aws xlarge, replication = 3) > Cassandra version is 1.0.6 > I have inserted 1,000,000 rows. (standard column) > Each row has 200 columns. > Each column has 16 byte key, 512 byte value. > > I us

Re: Secondary indexes don't go away after metadata change

2012-03-05 Thread aaron morton
The secondary index CFs are marked as no longer required / marked as compacted. Under 1.x they would then be deleted reasonably quickly, and definitely deleted after a restart. Is there a zero-length .Compacted file there? > Also, when adding a new node to the ring the new node will build i

Re: slow read

2012-03-05 Thread Jeesoo Shin
Thank you for reply. :) Yes I did multiple thread. 160, 320 gave me same result. On 3/5/12, ruslan usifov wrote: > 2012/3/5 Jeesoo Shin > >> Hi all. >> >> I have very SLOW READ here. :-( >> I made a cluster with three node (aws xlarge, replication = 3) >> Cassandra version is 1.0.6 >> I have ins

Re: can't find rows

2012-03-05 Thread aaron morton
I am guessing a lot here, but I would check if auto_bootstrap is enabled. It is by default. When a new node joins, reads are not directed to it until it is marked as "UP" (writes are sent to it as it is joining). So reads should continue to go to the original UP node. Sounds like it's all runnin

Re: Schema change causes exception when adding data

2012-03-05 Thread aaron morton
I don't have a lot of Hector experience but it sounds like the way to go. The CLI and cqlsh will take care of this. Cheers - Aaron Morton Freelance Developer @aaronmorton On 2/03/2012, at 10:12 AM, Tharindu Mathew wrote: > There are 2. I'd like to

Re: slow read

2012-03-05 Thread ruslan usifov
And sum of all rq/s threads is 160?? 2012/3/5 Jeesoo Shin > Thank you for reply. :) > Yes I did multiple thread. > 160, 320 gave me same result. > > On 3/5/12, ruslan usifov wrote: > > 2012/3/5 Jeesoo Shin > > > >> Hi all. > >> > >> I have very SLOW READ here. :-( > >> I made a cluster with th

Re: composite types in CQL

2012-03-05 Thread aaron morton
It's not currently supported in CQL You can do it using the CLI, see the online help. Cheers - Aaron Morton Freelance Developer @aaronmorton On 2/03/2012, at 10:39 AM, Bayle Shanks wrote: > hi,

Re: Test Data creation in Cassandra

2012-03-05 Thread aaron morton
try tools/stress in the source distribution. Cheers - Aaron Morton Freelance Developer @aaronmorton On 3/03/2012, at 6:01 AM, A J wrote: > What is the best way to create millions of test data in Cassandra ? > > I would like to have some script wher

RE: cli question

2012-03-05 Thread Rishabh Agrawal
I faced the same issue some time back. Solution which fit my bill is as follows: CREATE COLUMN FAMILY aaa with comparator = 'CompositeType(UTF8Type,UTF8Type)' and default_validation_class = 'UTF8Type' and key_validation_class = 'CompositeType(UTF8Type,UTF8Type,UTF8Type,)'; notice I ha

running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Hi! I have a Cassandra cluster with two nodes: nodetool ring -h localhost Address DC Rack Status State Load Owns Token 85070591730234615865843651857942052864 datacenter1 rack1 Up Normal 488.74 KB 50.00% 0 datac

Re: cli question

2012-03-05 Thread Tamar Fraenkel
Thanks! I decided to just replace all ":" with "^" and I can simply run: get a_b_indx ['AAA:BBB^CCC']; *Tamar Fraenkel * Senior Software Engineer, TOK Media Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 5, 2012 at

Re: Maximum Row Size in Cassandra : Potential Bottleneck

2012-03-05 Thread aaron morton
> Is there any way in which the writes can be made pretty slow on different > nodes. Ideally I would like data to be written on one node and eventually > replicating across other nodes I dont really need a real time update, so can > pretty much live with slow writes. Replicating "inside" the mut

Re: Writing Data To A Super Column That Is In A Column Family With A Type Of Standard

2012-03-05 Thread aaron morton
> Is it possible to mix both Standard and Super columns in the same > Column Family? No. > create column family users >with comparator = UTF8T >and key_validation_class=UTF8TYpe >and compression_options = { sstable_compression:SnappyCompressor, > chunk_length_kb:64} >and column_m

Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hi All, While benchmarking Cassandra I found "Mutation Dropped" messages in the logs. Now I know this is a good old question. It will be really great if someone can provide a check list to recover when such a thing happens. I am looking for answers of the following questions - 1. Whic

Re: slow read

2012-03-05 Thread aaron morton
Where is the client running from? To see if a node is keeping up with requests, look at nodetool tpstats and check if the read stage is backing up. To see how long a read takes, use nodetool cfstats and look at the read latency (this is the latency of a read on that node, not cluster-wide). To see

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
> Would the rings be separate? Yes. But I would recommend you give them different cluster names. It's a good protection against nodes accidentally joining the wrong cluster. cheers - Aaron Morton Freelance Developer @aaronmorton On 5/03/2012, at 1

Re: running two rings on the same subnet

2012-03-05 Thread Hontvári József Levente
You have to use PropertyFileSnitch and NetworkTopologyStrategy to create a multi-datacenter setup with two circles. You can start reading from this page: Moreover all token

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
> 1. Which parameters to tune in the config files? – Especially looking > for heavy writes The node is overloaded. It may be because there are not enough nodes, or the node is under temporary stress such as GC or repair. If you have spare IO / CPU capacity you could increase the current_wri

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Do you want to create two separate clusters or a single cluster with two data centres? If it's the latter, token selection is discussed here > Moreover all tokens must be unique (even across datacenters), although - fro

Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Version 0.8.9 We run a 2 node cluster with RF=2. We ran a scrub and after that ran the clearsnapshot to remove the backup snapshot created by scrub. It seems that instead of removing the snapshot, clearsnapshot moved the data files from the snapshot directory to the parent directory and the size o

Re: how stable is 1.0 these days?

2012-03-05 Thread Viktor Jevdokimov
1.0.7 is very stable, weeks in high-load production environment without any exception, 1.0.8 should be even more stable, check changes.txt for what was fixed. 2012/3/2 Marcus Eriksson > beware of though if > you have many keys per node > > ot

Re: Huge amount of empty files in data directory.

2012-03-05 Thread Viktor Jevdokimov
After running Cassandra for 2 years in production on Windows servers, starting from 0.7 beta2 up to 1.0.7 we have moved to Linux and forgot all the hell we had on Windows. Having JNA, off-heap row cache and normally working MMAP on Linux you're getting a lot better performance and stability compari

Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
I want two separate clusters. *Tamar Fraenkel * Senior Software Engineer, TOK Media Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Mon, Mar 5, 2012 at 12:48 PM, aaron morton wrote: > Do you want to create two separate cluster

Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Hontvári József Levente
I am thinking about the frequent example: dc1 - node1: 0 dc1 - node2: large...number dc2 - node1: 1 dc2 - node2: large...number + 1 In theory using the same tokens in dc2 as in dc1 does not significantly affect key distribution, specifically the two keys on the border will move to the next on
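The token layout in the example above can be sketched in a few lines of Python. This is only an illustration, assuming RandomPartitioner (token range 0 to 2**127) and the common convention of shifting each additional datacenter's tokens by one; the function names `balanced_tokens` and `tokens_by_datacenter` are hypothetical and are not part of Cassandra.

```python
# Assumption: RandomPartitioner, whose token range is 0 .. 2**127.
RING_SIZE = 2 ** 127


def balanced_tokens(node_count, dc_offset=0):
    """Return evenly spaced tokens for one datacenter, shifted by dc_offset."""
    return [(i * RING_SIZE) // node_count + dc_offset for i in range(node_count)]


def tokens_by_datacenter(nodes_per_dc):
    """nodes_per_dc lists the node count of each datacenter.

    The n-th datacenter (counting from 0) gets offset n, so no token
    is repeated anywhere in the cluster.
    """
    return [
        balanced_tokens(count, dc_offset=offset)
        for offset, count in enumerate(nodes_per_dc)
    ]


if __name__ == "__main__":
    for dc, tokens in enumerate(tokens_by_datacenter([2, 2]), start=1):
        for node, token in enumerate(tokens, start=1):
            print(f"dc{dc} - node{node}: {token}")
```

The offset does not change how keys are distributed in any meaningful way; it only keeps every token unique across the whole cluster.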

RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Thanks a lot for the concurrent_writes hint that really improves the throughput. Do you mean dropped messages and no timedoutexception will mean the data is written somewhere in the cluster and by taking corrective measures desired CL can be achieved? From: aaron morton [mailto:aa...@thelastp

RE: Mutation Dropped Messages

2012-03-05 Thread Tiwari, Dushyant
Hey Aaron, I increased the size of the cluster also the concurrent_writes parameter. Still there is a node which keeps on dropping the mutation messages. The other nodes are not dropping mutation messages. I am using Hector API and had done nothing for load balancing so far. Just provided the h

Adding a second datacenter

2012-03-05 Thread David Koblas
Everything that I've read about data centers focuses on setting things up at the beginning of time. I have the following situation: 10 machines in a datacenter (DC1), with replication factor of 2. I want to set up a second data center (DC2) with the following configuration: 20 machines w

Re: Adding a second datacenter

2012-03-05 Thread Jeremiah Jordan
You need to make sure your clients are reading using LOCAL_* settings so that they don't try to get data from the other data center. But you shouldn't get errors while replication_factor is 0. Once you change the replication factor to 4, you should get missing data if you are using LOCAL_* fo

Division by zero

2012-03-05 Thread Vanger
After upgrading from version 1.0.1 to 1.0.8 we started to get exception: ERROR [http-8095-1] - get: key1 - {type=RANGE, start=0, end=9223372036854775807, orderDesc=false, limit=1} me.prettyprint.hector.api.exceptions.HCassandraInternalException: Cassandra encount

Re: Rationale behind incrementing all tokens by one in a different datacenter (was: running two rings on the same subnet)

2012-03-05 Thread Jeremiah Jordan
There is a requirement that all nodes have a unique token. There is still one global cluster/ring that each node needs to be unique on. The logically separate rings that NetworkTopologyStrategy puts them into are hidden from the rest of the code. -Jeremiah On 03/05/2012 05:13 AM, Hontvári Jó

Re: how stable is 1.0 these days?

2012-03-05 Thread Thibaut Britz
Thanks for the feedback. I will certainly execute scrub after the update. On Mon, Mar 5, 2012 at 11:55 AM, Viktor Jevdokimov wrote: > 1.0.7 is very stable, weeks in high-load production environment without > any exception, 1.0.8 should be even more stable, check changes.txt for what > was fixed.

Re: Adding a second datacenter

2012-03-05 Thread David Koblas
Jeremiah, Thanks! I'm running 1.0.8, two interesting things to note: - I don't have sufficient disk space to handle the straight bump to a replication factor of 4, so I think I'm going to have to do it one by one (1,2,3 and 4) with a bunch of cleanups in between. - Also, using a LOCAL_QUORU
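For the step-by-step replication factor increase mentioned above, it may help to see how many replicas a quorum needs at each step. The sketch below assumes the usual quorum formula, floor(RF / 2) + 1, applied to the replication factor of the local datacenter for LOCAL_QUORUM; the helper names `quorum` and `tolerated_down` are hypothetical.

```python
def quorum(replication_factor):
    """Replicas that must respond for a quorum: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def tolerated_down(replication_factor):
    """Replicas that can be unavailable while a quorum is still reachable."""
    return replication_factor - quorum(replication_factor)


if __name__ == "__main__":
    # Walk through the replication factors 1, 2, 3 and 4 from the message.
    for rf in (1, 2, 3, 4):
        print(f"RF={rf}: quorum={quorum(rf)}, tolerated down={tolerated_down(rf)}")
```

Note that going from RF 1 to RF 2 raises the quorum size without adding any tolerance for a replica being down; the tolerance only improves at RF 3.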

Re: Issue with nodetool clearsnapshot

2012-03-05 Thread aaron morton
> It seems that instead of removing the snapshot, clearsnapshot moved the data > files from the snapshot directory to the parent directory and the size of the > data for that keyspace has doubled. That is not possible, there is only code there to delete files in the snapshot. Note that in th

Re: running two rings on the same subnet

2012-03-05 Thread aaron morton
Create nodes that do not share seeds, and give the clusters different names as a safety measure. Cheers - Aaron Morton Freelance Developer @aaronmorton On 6/03/2012, at 12:04 AM, Tamar Fraenkel wrote: > I want two separate clusters. > Tamar Fraenke

Re: Division by zero

2012-03-05 Thread aaron morton
(Commented in the ticket as well) What is the error in the server log ? Cheers - Aaron Morton Freelance Developer @aaronmorton On 6/03/2012, at 5:04 AM, Vanger wrote: > After upgrading from version 1.0.1 to 1.0.8 we started to get exception: > > E

Re: Mutation Dropped Messages

2012-03-05 Thread aaron morton
> I increased the size of the cluster also the concurrent_writes parameter. > Still there is a node which keeps on dropping the mutation messages. Ensure all the nodes have the same spec, and the nodes have the same config. In a virtual environment consider moving the node. > Is this due to some

Re: Issue with nodetool clearsnapshot

2012-03-05 Thread B R
Hi Aaron, 1) Since you mentioned hard links, I would like to add that our data directory itself is a sym-link. Could that be causing an issue? 2) Yes, there are 0 byte files of the same numbers in Keyspace1 directory 0 Mar 4 01:33 Standard1-g-7317-Compacted 0 Mar 3 22:58 Standard1-g-7968-Compact

hector connection pool

2012-03-05 Thread Daning Wang
I recently got this error in a few clients: "All host pools marked down. Retry burden pushed out to client." The clients could not recover, and we had to restart the client application. We are using Hector. At that time we were running a compaction for a CF, which takes several hours, so the server was busy. But

RE: Secondary indexes don't go away after metadata change

2012-03-05 Thread Frisch, Michael
Thank you very much for your response. It is true that the older, previously existing nodes are not snapshotting the indexes that I had removed. I'll go ahead and just delete those SSTables from the data directory. They may be around still because they were created back when we used 0.8. The

Cassandra cache patterns with thiny and wide rows

2012-03-05 Thread Maciej Miklas
I've asked this question already on stackoverflow but without answer - I will try again: My use case expects heavy read load - there are two possible model design strategies: 1. Tiny rows with row cache: In this case row is small enough to fit into RAM and all columns are being cached.

Re: hector connection pool

2012-03-05 Thread Maciej Miklas
Have you tried to change: me.prettyprint.cassandra.service.CassandraHostConfigurator#retryDownedHostsDelayInSeconds ? Hector will ping down hosts every xx seconds and recover connection. Regards, Maciej On Mon, Mar 5, 2012 at 8:13 PM, Daning Wang wrote: > I just got this error ": All host pool

Re: Cassandra cache patterns with thiny and wide rows

2012-03-05 Thread Viktor Jevdokimov
Depends on how large the data set is, specifically the hot data, compared to available RAM, what counts as a heavy read load, and what the latency requirements are. 2012/3/6 Maciej Miklas > I've asked this question already on stackoverflow but without answer - I > will try again: > > > My use case expects h

Re: running two rings on the same subnet

2012-03-05 Thread Tamar Fraenkel
Works.. But during the night my setup encountered a problem. I have two VMs on my cluster (running on VMware ESXi). Each VM has 1 GB memory, and two Virtual Disks of 16 GB. They are running on a small server with 4 CPUs (2.66 GHz), and 4 GB memory (together with two other VMs). I put cassandra data on