date:20110616

Re: Multi data center configuration - A question on read correction

2011-06-16 Thread Sylvain Lebresne

Yes, that's the way to do it. On Wed, Jun 15, 2011 at 9:43 PM, Selva Kumar wrote: > Thanks Jonathan. Can we turn off RR by setting READ_REPAIR_CHANCE = 0? Please > advise. > > Selva > > > From: Jonathan Ellis > To: user@cassandra.apache.org > Sent: Tue, June 14, 2011 8:5

Re: sstable2json2sstable bug with json data stored

2011-06-16 Thread Timo Nentwig

On 6/15/11 17:41, Timo Nentwig wrote: (json can likely be boiled down even more...) Any JSON (well, probably anything with quotes...) breaks it: { "74657374": [["data", "{"foo":"bar"}", 1308209845388000]] } [default@foo] set transactions[test][data]='{"foo":"bar"}'; I feared that storing dat
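
The export quoted in this thread embeds a JSON document inside a JSON string without escaping the inner quotes, so the outer document no longer parses. The following is a minimal Python sketch of that failure and of what an escaped export would look like. It is not the sstable2json code and not the fix posted in the JIRA ticket; the helper names `export_row` and `is_valid_json` are hypothetical, and only the row key, column name, value, and timestamp come from the thread.

```python
import json

# Export exactly as quoted in the thread: the inner quotes are not escaped.
BROKEN_EXPORT = '{ "74657374": [["data", "{"foo":"bar"}", 1308209845388000]] }'

# The row key in the export is hex-encoded; it decodes to the key used in the CLI.
ROW_KEY_HEX = "74657374"
row_key = bytes.fromhex(ROW_KEY_HEX).decode("ascii")


def is_valid_json(text):
    """Return True if text parses as JSON, False otherwise."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


def export_row(key_hex, column_name, value, timestamp):
    """Hypothetical helper: serialize one row in the shape shown in the
    thread, letting json.dumps escape quotes inside the column value."""
    return json.dumps({key_hex: [[column_name, value, timestamp]]})


# The column value is itself a JSON document stored as a plain string.
exported = export_row(ROW_KEY_HEX, "data", '{"foo":"bar"}', 1308209845388000)
restored = json.loads(exported)[ROW_KEY_HEX][0][1]

if __name__ == "__main__":
    print(row_key)
    print(is_valid_json(BROKEN_EXPORT), is_valid_json(exported))
    print(exported)
```

Letting a JSON library do the escaping means the stored value survives the round trip unchanged, whatever quotes or backslashes it contains.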

Re: sstable2json2sstable bug with json data stored

2011-06-16 Thread Sasha Dolgy

The JSON you are showing below is an export from cassandra? { "74657374": [["data", "{"foo":"bar"}", 1308209845388000]] } Does this work? { 74657374: [["data", {foo:"bar"}, 1308209845388000]] } -sd On Thu, Jun 16, 2011 at 9:49 AM, Timo Nentwig wrote: > On 6/15/11 17:41, Timo Nentwig wrote: >>

Re: sstable2json2sstable bug with json data stored

2011-06-16 Thread Timo Nentwig

On 6/16/11 10:06, Sasha Dolgy wrote: The JSON you are showing below is an export from cassandra? Yes. Just posted the solution: https://issues.apache.org/jira/browse/CASSANDRA-2780?focusedCommentId=13050274&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13050274

Re: sstable2json2sstable bug with json data stored

2011-06-16 Thread Timo Nentwig

On 6/16/11 10:12, Timo Nentwig wrote: On 6/16/11 10:06, Sasha Dolgy wrote: The JSON you are showing below is an export from cassandra? Yes. Just posted the solution: https://issues.apache.org/jira/browse/CASSANDRA-2780?focusedCommentId=13050274&page=com.atlassian.jira.plugin.system.issuetabpan

Getting Started website is out of date

2011-06-16 Thread Christian Straube

Hi, the "Getting started" website (http://wiki.apache.org/cassandra/GettingStarted) is out of date -> the link to the Twissandra demo is broken -> the new CQL is not mentioned :-) Besides this, I love Cassandra! Best Christian

Re: Migration question

2011-06-16 Thread aaron morton

Lots of folk use a single disk or raid-1 for the system and commit log and raid-0 for the data volumes http://wiki.apache.org/cassandra/CassandraHardware Your money is probably better spent on more nodes with more disks and more memory. More nodes is always better. Happy to hear reasons other

Re: Slowdowns during repair

2011-06-16 Thread aaron morton

Look for log messages at the ERROR level first to find out why it's crashing. Check for GC pressure during the repair, either using JConsole or log messages from the GCInspector. Check nodetool tpstats to get an idea if the nodes are saturated, i.e. are there tasks in the pending list. Or

Re: Where is my data?

2011-06-16 Thread aaron morton

I wrote a blog post about this sort of thing the other day http://thelastpickle.com/2011/06/13/Down-For-Me/ Let me know if you spot any problems. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 16 Jun 2011, at 02:20, AJ wrote:

Re: What's the best approach to search in Cassandra

2011-06-16 Thread Jake Luciani

Mark, Solandra doesn't use secondary indexes; the functionality is too limited for the Lucene API. It maintains its own indexes in regular column families. I suggest you look at Solr and decide if this is the functionality you need; Solandra offers the same API but on Cassandra's distributed m

Re: Force a node to form part of quorum

2011-06-16 Thread aaron morton

Short answer: No. Medium answer: No, all nodes are equal. It could create a single point of failure if a QUORUM could not be formed without a specific node. Writes are sent to every replica. Reads with Read Repair enabled are also sent to every replica. For reads the "closest" UP node as dete

Re: Atomicity of batch updates

2011-06-16 Thread aaron morton

See http://wiki.apache.org/cassandra/FAQ#batch_mutate_atomic Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 16 Jun 2011, at 06:26, chovatia jaydeep wrote: > Cassandra write operation is atomic for all the columns/super columns fo

Re: Easy way to overload a single node on purpose?

2011-06-16 Thread aaron morton

> DEBUG 14:36:55,546 ... timed out Is logged when the coordinator times out waiting for the replicas to respond, the timeout setting is rpc_timeout in the yaml file. This results in the client getting a TimedOutException. AFAIK There is no global everything is good / bad flags to check. e.

Re: Is there a way from a running Cassandra node to determine whether or not itself is "up"?

2011-06-16 Thread aaron morton

take a look at mx4j http://wiki.apache.org/cassandra/Operations#Monitoring_with_MX4J someone told me once you can call the JMX ops via http, i've not checked though. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 16 Jun 2011, a

Re: Docs: Token Selection

2011-06-16 Thread aaron morton

See this thread for background http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Replica-data-distributing-between-racks-td6324819.html In a multi DC environment, if you calculate the initial tokens for the entire cluster data will not be evenly distributed. Cheers

Querying superColumn

2011-06-16 Thread Vivek Mishra

I have a question about querying super column For example: I have a supercolumnFamily DEPARTMENT with dynamic superColumn 'EMPLOYEE'( name, country). Now for rowKey 'DEPT1' I have inserted multiple super column like: Employee1{ Name: Vivek country: India } Employee2{ Name: Vivs country: US

Re: Important Variables for Scaling

2011-06-16 Thread aaron morton

It's a difficult question to answer in the abstract. Some thoughts... Scaling by adding one node at a time is not optimal. The best case scenario is to double the number of nodes, as this means existing nodes only have to stream their data to a new node. Obviously this is not always possible. Wh

Upgrading Cassandra cluster from 0.6.3 to 0.7.5

2011-06-16 Thread Ali Ahsan

Hi All, We are upgrading Cassandra from 0.6.3 to 0.7.5. We have two nodes in the cluster. I am a bit confused about how to upgrade them; do you have any guide? -- S.Ali Ahsan Senior System Engineer e-Business (Pvt) Ltd 49-C Jail Road, Lahore, P.O. Box 676 Lahore 54000, Pakistan Tel: +92 (0)42 3758 7140 E

Re: Querying superColumn

2011-06-16 Thread Donal Zang

Well, you are looking for a secondary index. But for now, AFAIK, super columns cannot use secondary indexes. On 16/06/2011 13:55, Vivek Mishra wrote: Now for rowKey 'DEPT1' I have inserted multiple super column like: *Employee1{* *Name: Vivek* *country: India* *}* ** *Employee2{* *Nam

snitch & thrift

2011-06-16 Thread Terje Marthinussen

Hi all! Assuming a node ends up in GC land for a while, there is a good chance that even though it performs terribly and the dynamic snitching will help you to avoid it on the gossip side, it will not really help you much if thrift still accepts requests and the thrift interface has choppy perform

Re: Docs: Token Selection

2011-06-16 Thread Eric tamme

AJ, sorry I seemed to miss the original email on this thread. As Aaron said, when computing tokens for multiple data centers, you should compute them independently for each data center - as if it were its own Cassandra cluster. You can have "overlapping" token ranges between multiple data center
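
The per-data-center token calculation described in this thread can be sketched in Python as follows. This is an illustration under stated assumptions, not code from the list: it assumes RandomPartitioner (token space 0 to 2**127), and it assumes each data center's tokens are offset by its index so that no two nodes share a token, which is the "+1" convention mentioned later in the thread. The function name `dc_tokens` is hypothetical.

```python
# Assumption: RandomPartitioner, whose tokens fall in the range 0 .. 2**127.
RING_SIZE = 2 ** 127


def dc_tokens(node_count, dc_index=0):
    """Return evenly spaced initial tokens for ONE data center, computed as
    if that data center were its own ring.

    dc_index (0 for the first data center, 1 for the second, ...) is added
    to every token so that tokens stay unique across data centers.
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [
        (i * RING_SIZE // node_count + dc_index) % RING_SIZE
        for i in range(node_count)
    ]


if __name__ == "__main__":
    # Two data centers with four nodes each.
    for dc in range(2):
        print("DC%d" % (dc + 1))
        for token in dc_tokens(4, dc_index=dc):
            print("  initial_token: %d" % token)
```

Each data center gets the same even spacing; only the small offset differs, so every data center holds a balanced share of the ring while all tokens in the cluster remain distinct.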

Re: Docs: Token Selection

2011-06-16 Thread AJ

LOL, I feel Eric's pain. This double-ring thing can throw you for a loop since, like I said, there is only one place it is documented and it is only *implied*, so one is not sure he is interpreting it correctly. Even the source for NTS doesn't mention this. Thanks for everyone's help on this

Re: Docs: Token Selection

2011-06-16 Thread AJ

Thanks Eric! I've finally got it! I feel like I've just been initiated or something by discovering this "secret". I kid! But, I'm thinking about using OldNetworkTopStrat. Do you, or anyone else, know if the same rules for token assignment applies to ONTS? On 6/16/2011 7:21 AM, Eric tamme

Cassandra JVM GC settings

2011-06-16 Thread Sebastien Coutu

Hi Everyone, I'm seeing Cassandra GC a lot and I would like to tune the Young space and the Tenured space. Anyone would have recommendations on the NewRatio or NewSize/MaxNewSize to use for an environment where Cassandra has several column families and in which we are doing a mixed load of reading

client API

2011-06-16 Thread karim abbouh

I use JDK 1.6 to install and launch Cassandra on a Linux platform, but can I use JDK 1.5 for my Cassandra client?

Re: Querying superColumn

2011-06-16 Thread Sasha Dolgy

Have 1 row with employee info for country/office/division, each column an employee id and json info about the employee or a reference to another row id for that employee data. No more supercolumn. On Jun 16, 2011 1:56 PM, "Vivek Mishra" wrote: > I have a question about querying super column >

Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

Good morning all. Hypothetical Setup: 1 data center RF = 3 Total nodes > 3 Problem: Suppose I need maximum consistency for one critical operation; thus I specify CL = ALL for reads. However, this will fail if only 1 replica endpoint is down. I don't see why this fail is necessary all of the

Re: Docs: Token Selection

2011-06-16 Thread Sasha Dolgy

So, with ec2 ... 3 regions (DC's), each one is +1 from another? On Jun 16, 2011 3:40 PM, "AJ" wrote: > Thanks Eric! I've finally got it! I feel like I've just been initiated > or something by discovering this "secret". I kid! > > But, I'm thinking about using OldNetworkTopStrat. Do you, or any

Re: Docs: Token Selection

2011-06-16 Thread Eric tamme

On Thu, Jun 16, 2011 at 11:11 AM, Sasha Dolgy wrote: > So, with ec2 ... 3 regions (DC's), each one is +1 from another? I don't use ec2, so I am not familiar with the specifics of deployment there. That said, if you have 3 data centers with equal nodes in each (so that you would calculate the

Unable to access column family in CLI after building CF in CQL

2011-06-16 Thread yikes bigdata

Hi, I was following the CQL example on the DataStax website and was able to create a new column family and query it. But when I viewed the column family in the CLI, it gives me the following error. # Unable to read column family created from CQL [default@store] list users2; *users2 not found in

Re: Upgrading Cassandra cluster from 0.6.3 to 0.7.5

2011-06-16 Thread Jonathan Ellis

Read NEWS.txt. 0.7.6 is better than 0.7.5, btw. On Thu, Jun 16, 2011 at 5:03 AM, Ali Ahsan wrote: > Hi All, > > We are upgrading Cassandra from 0.6.3 to 0.7.5. We have two nodes in the cluster. I > am a bit confused about how to upgrade them; do you have any guide? > > -- > S.Ali Ahsan > > Senior System Engine

Re: Unable to access column family in CLI after building CF in CQL

2011-06-16 Thread Jonathan Ellis

If you create CFs outside the cli, you may need to restart it to refresh its internal cache of the schema. On Thu, Jun 16, 2011 at 8:51 AM, yikes bigdata wrote: > Hi, > I was following the CQL example on the DataStax website and was able to > create a new column family and query it. But when I vi

Re: Unable to access column family in CLI after building CF in CQL

2011-06-16 Thread Konstantin Naryshkin

The second error (the CQL select) is because you have different Key Validation Class values for your two user columns. users is org.apache.cassandra.db.marshal.BytesType, while users2 is org.apache.cassandra.db.marshal.UTF8Type. The select is failing because you are comparing a String to a bu

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King

On Thu, Jun 16, 2011 at 8:18 AM, AJ wrote: > Good morning all. > > Hypothetical Setup: > 1 data center > RF = 3 > Total nodes > 3 > > Problem: > Suppose I need maximum consistency for one critical operation; thus I > specify CL = ALL for reads. However, this will fail if only 1 replica > endpoint

Re: snitch & thrift

2011-06-16 Thread Ryan King

On Thu, Jun 16, 2011 at 6:11 AM, Terje Marthinussen wrote: > Hi all! > Assuming a node ends up in GC land for a while, there is a good chance that > even though it performs terribly and the dynamic snitching will help you to > avoid it on the gossip side, it will not really help you much if thrift

Re: Unable to access column family in CLI after building CF in CQL

2011-06-16 Thread yikes bigdata

Ah that works. Thanks everyone for the help. On Thu, Jun 16, 2011 at 9:04 AM, Konstantin Naryshkin wrote: > The second error (the CQL select) is because you have different Key > Validation Class values for your two user columns. users is > org.apache.cassandra.db.marshal.BytesType, > while us

Re: Cassandra Statistics and Metrics

2011-06-16 Thread Viktor Jevdokimov

There's a possibility to use a command-line JMX client with the standard Zabbix agent to request JMX counters without incorporating zapcat into Cassandra or another Java app. I'm investigating this feature right now, will post results when finished. 2011/6/15 Viktor Jevdokimov > http://www.kjkoster.org/za

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

On 6/16/2011 10:05 AM, Ryan King wrote: I don't think this buys you anything that you can't get with quorum reads and writes. -ryan QUORUM <= ALL_AVAIL <= ALL == RF
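
The replica counts behind this exchange can be worked through in a short Python sketch. It models only the existing levels ONE, QUORUM, and ALL; the proposed ALL_AVAIL is left out because its replica count depends on which nodes happen to be up. The function names are hypothetical, and the overlap check is the usual R + W > RF rule rather than anything taken from Cassandra's source.

```python
def quorum(rf):
    """Replicas needed for QUORUM: a strict majority of the replication factor."""
    return rf // 2 + 1


def replicas_required(level, rf):
    """Number of replicas that must respond for a given consistency level."""
    required = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
    return required[level]


def tolerated_failures(level, rf):
    """How many replicas of a key can be down before the operation fails."""
    return rf - replicas_required(level, rf)


def read_sees_latest_write(read_level, write_level, rf):
    """True when every set of read replicas must share at least one node
    with every set of write replicas (R + W > RF)."""
    r = replicas_required(read_level, rf)
    w = replicas_required(write_level, rf)
    return r + w > rf


if __name__ == "__main__":
    rf = 3  # the hypothetical setup from the thread
    for level in ("ONE", "QUORUM", "ALL"):
        print(level, replicas_required(level, rf), tolerated_failures(level, rf))
    print(read_sees_latest_write("QUORUM", "QUORUM", rf))
```

With RF = 3, a read at ALL tolerates no down replicas, while QUORUM reads paired with QUORUM writes still overlap on at least one replica and tolerate one down node.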

RE: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Dan Hendry

I think this would add a lot of complexity behind the scenes and be conceptually confusing, particularly for new users. The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways. It would also only really be useful when the value has a high pro

Re: Cassandra Statistics and Metrics

2011-06-16 Thread Héctor Izquierdo Seliva

This is what I use: http://code.google.com/p/simple-cassandra-monitoring/ Disclaimer: I did it myself, don't expect too much :P El jue, 16-06-2011 a las 19:35 +0300, Viktor Jevdokimov escribió: > There's possibility to use command line JMX client with standard > Zabbix agent to request JMX count

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Yang

A consistency level definition should be a definition of a requirement from the application perspective; it should not be tied to some ephemeral state in the system (a node being deemed "available/up" or down is determined by gossip and changes every second). What you want can be simply achieve

Re: snitch & thrift

2011-06-16 Thread Jonathan Ellis

Seems like a more robust solution would be to implement dynamic-snitch-like behavior in the client. Hector has done this for a few months now. https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/connection/DynamicLoadBalancingPolicy.java On Thu, Jun 16, 2011 a

Re: need some help with counters

2011-06-16 Thread Ian Holsman

On Jun 13, 2011, at 5:10 AM, aaron morton wrote: >> I am wondering how to index on the most recent hour as well. (ie show me top >> 5 URLs type query).. > > AFAIK thats not a great application for counters. You would need range > support in the secondary indexes so you could get the first X r

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

On 6/16/2011 10:58 AM, Dan Hendry wrote: I think this would add a lot of complexity behind the scenes and be conceptually confusing, particularly for new users. I'm not so sure about this. Cass is already somewhat sophisticated and I don't see how this could trip-up anyone who can already gras

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King

On Thu, Jun 16, 2011 at 1:05 PM, AJ wrote: > On 6/16/2011 10:58 AM, Dan Hendry wrote: >> >> I think this would add a lot of complexity behind the scenes and be >> conceptually confusing, particularly for new users. > > I'm not so sure about this. Cass is already somewhat sophisticated and I > don

Visiting Auckland

2011-06-16 Thread aaron morton

So long as the Volcanic Ash stays away I'll be visiting Auckland next week on the 23rd and 24th. Drop me an email if you would like to meet to talk about things Cassandra. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

On 6/16/2011 2:37 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 1:05 PM, AJ wrote: The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways. It would also only really be useful when the value has a high probability of being updated between

Re: Force a node to form part of quorum

2011-06-16 Thread A J

It would be great if Cassandra put this on their roadmap. There are a lot of durability benefits to incorporating DC awareness into the write consistency equation. MongoDB has this feature in their upcoming release: http://www.mongodb.org/display/DOCS/Data+Center+Awareness#DataCenterAwareness-Taggin

Re: Force a node to form part of quorum

2011-06-16 Thread Peter Schuller

> It would be great if Cassandra puts this on their roadmap. There is > lot of durability benefits by incorporating dc awareness into the > write consistency equation. You may be interested in the discussion here: https://issues.apache.org/jira/browse/CASSANDRA-2338 -- / Peter Schuller

Re: Easy way to overload a single node on purpose?

2011-06-16 Thread Suan Aik Yeo

> Having a ping column can work if every key is replicated to every node. It would tell you the cluster is working, sort of. Once the number of nodes is greater than the RF, it tells you a subset of the nodes works. The way our check works is that each node checks itself, so in this context we're

compression for regular column names?

2011-06-16 Thread E R

Hi all, As a way of gaining familiarity with Cassandra I am migrating a table that is currently stored in a relational database and mapping it into a Cassandra column family. We add about 700,000 new rows a day to this table, and the average disk space used per row is ~ 300 bytes including indexes

Re: compression for regular column names?

2011-06-16 Thread Ryan King

On Thu, Jun 16, 2011 at 3:41 PM, E R wrote: > Hi all, > > As a way of gaining familiarity with Cassandra I am migrating a table > that is currently stored in a relational database and mapping it into > a Cassandra column family. We add about 700,000 new rows a day to this > table, and the average

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Ryan King

On Thu, Jun 16, 2011 at 2:12 PM, AJ wrote: > On 6/16/2011 2:37 PM, Ryan King wrote: >> >> On Thu, Jun 16, 2011 at 1:05 PM, AJ wrote: > >> The Cassandra consistency model is pretty elegant and this type of approach breaks that elegance in many ways. It would also only really be >>>

Re: jsvc hangs shell

2011-06-16 Thread Ken Brumer

Anton Belyaev writes: > > I guess it is not trivial to modify the package to make it use JSW > instead of JSVC. > I am still not sure the JSVC itself is a culprit. Maybe something is > wrong in my setup. > > > I am seeing similar behavior using the Brisk Debian packages for Mave

Brisk .rpm packages for CentOS/RH/Fedora

2011-06-16 Thread Marcos Ortiz Valmaseda

Regards to all Cassandra users. I don't know if Brisk has its own mailing list, so I ask here. Does Brisk have .rpm packages for Red Hat-based distributions (CentOS/Fedora)? If so, where can I find them? Thanks a lot for your time. -- Marcos Luís Ortíz Valmaseda Software Engineer (Larg

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Dan Hendry

How would your solution deal with complete network partitions? A node being 'down' does not actually mean it is dead, just that it is unreachable from whatever is making the decision to mark it 'down'. Following from Ryan's example, consider nodes A, B, and C but within a fully partitioned network

cassandra crash

2011-06-16 Thread Donna Li

All: Why does Cassandra crash after printing the following log? INFO [SSTABLE-CLEANUP-TIMER] 2011-06-16 14:19:01,020 SSTableDeletingReference.java (line 104) Deleted /usr/local/rss/DDB/data/data/PSCluster/CsiStatusTab-206-Data.db INFO [SSTABLE-CLEANUP-TIMER] 2011-06-16 14:19:01,020 SSTableDeletingRe

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

UPDATE to my suggestion is below. On 6/16/2011 5:50 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 2:12 PM, AJ wrote: On 6/16/2011 2:37 PM, Ryan King wrote: On Thu, Jun 16, 2011 at 1:05 PM, AJ wrote: The Cassandra consistency model is pretty elegant and this type of approach breaks that

Re: Brisk .rpm packages for CentOS/RH/Fedora

2011-06-16 Thread Nate McCall

Yes, there is a brisk list: brisk-us...@googlegroups.com Packages are available via rpm.datastax.com On Thu, Jun 16, 2011 at 8:21 PM, Marcos Ortiz Valmaseda wrote: > Regards to all Cassandra´ users > I don´t know if Brisk has its own mailing list, so I ask here. > Has Brisk .rpm packages for Red

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

On 6/16/2011 7:56 PM, Dan Hendry wrote: How would your solution deal with complete network partitions? A node being 'down' does not actually mean it is dead, just that it is unreachable from whatever is making the decision to mark it 'down'. Following from Ryan's example, consider nodes A, B,

Re: Cassandra JVM GC settings

2011-06-16 Thread aaron morton

It would help if you can provide some log messages from the GCInspector so people can see how much GC is going on. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 17 Jun 2011, at 02:46, Sebastien Coutu wrote: > Hi Everyone, >

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread Dan Hendry

"Help me out here. I'm trying to visualize a situation where the clients can access all the C* nodes but the nodes can't access each other. I don't see how that can happen on a regular ethernet subnet in one data center. Well, I'm sure there is a case that you can point out. Ok, I will concede

Re: client API

2011-06-16 Thread aaron morton

"The Thrift Java compiler creates code that is not compliant with Java 5." https://issues.apache.org/jira/browse/THRIFT-1170 So you may have trouble getting the thrift API to run. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On

Re: Docs: Token Selection

2011-06-16 Thread aaron morton

> But, I'm thinking about using OldNetworkTopStrat. NetworkTopologyStrategy is where it's at. A - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 17 Jun 2011, at 01:39, AJ wrote: > Thanks Eric! I've finally got it! I feel like I've jus

Re: Propose new ConsistencyLevel.ALL_AVAIL for reads

2011-06-16 Thread AJ

On 6/16/2011 9:36 PM, Dan Hendry wrote: "Help me out here. I'm trying to visualize a situation where the clients can access all the C* nodes but the nodes can't access each other. I don't see how that can happen on a regular ethernet subnet in one data center. Well, I'm sure there is a case

Re: Docs: Token Selection

2011-06-16 Thread AJ

On 6/16/2011 9:45 PM, aaron morton wrote: But, I'm thinking about using OldNetworkTopStrat. NetworkTopologyStrategy is where it's at. Oh yeah? It didn't look like it would serve my requirements. I want 2 full production geo-diverse data centers with each serving as a failover for the other

Re: client API

2011-06-16 Thread Jonathan Ellis

Cassandra also uses a bunch of classes that are new in JDK6. JDK5 is end-of-lifed, time to let it rest in peace. On Thu, Jun 16, 2011 at 10:41 PM, aaron morton wrote: > "The Thrift Java compiler creates code that is not compliant with Java 5." > > https://issues.apache.org/jira/browse/THRIFT-117

Re: Docs: Token Selection

2011-06-16 Thread Jonathan Ellis

Replication location is determined by the row key, not the location of the client that inserted it. (Otherwise, without knowing what DC a row was inserted in, you couldn't look it up to read it!) On Fri, Jun 17, 2011 at 12:20 AM, AJ wrote: > On 6/16/2011 9:45 PM, aaron morton wrote: >>> >>> But,

68 matches
