Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Thanks Brandon! Out of curiosity, would making schema changes through a thrift interface (via hector) be any different? In other words, would using hector instead of the cli make schema changes possible without upgrading? On Thu, Oct 13, 2011 at 8:22 AM, Brandon Williams wrote: > You're runnin

Restore snapshots suggestion

2011-10-13 Thread Daning
If I need to restore snapshots from all nodes, but I can only shut down one node at a time since it is production, is there a way I can stop data syncing between nodes temporarily? I don't want the existing data to overwrite the snapshot. I found this undocumented parameter DoConsistencyChecksBoolea

Re: Cassandra as session store under heavy load

2011-10-13 Thread Jonathan Ellis
Or upgrade to 1.0 and use leveled compaction (http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra) On Thu, Oct 13, 2011 at 4:28 PM, aaron morton wrote: > They only have a minimum time, gc_grace_seconds for deletes. > > If you really want to watch disk space, reduce the compa

Re: Cassandra as session store under heavy load

2011-10-13 Thread aaron morton
They only have a minimum time, gc_grace_seconds for deletes. If you really want to watch disk space, reduce the compaction thresholds on the CF. Or run a major compaction as part of maintenance. cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.the

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
Then just use a soundex function on the first word in the text... that will shrink it sufficiently and give nice buckets in near sequential order (http://en.wikipedia.org/wiki/Soundex) On 13 October 2011 21:21, Matthias Pfau wrote: > Hi Stephen, > we are hashing the first 8 bytes (8 US-ASCII chara
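
A minimal sketch of the Soundex idea mentioned above, assuming the common American Soundex variant (first letter plus three digits). The function name and the sample words are illustrative and not taken from the thread; the point is only that many different words collapse into one short code, and that the codes sort by first letter.

```python
# Illustrative American Soundex: first letter + up to three digits, zero-padded.
# Names here (CODES, soundex) are hypothetical helpers, not part of any Cassandra API.
CODES = {}
for group, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                     ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in group:
        CODES[ch] = digit


def soundex(word):
    # Keep ASCII letters only; everything else is ignored.
    letters = [c for c in word.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with the same code
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated code is emitted again
        if len(out) == 4:
            break
    return "".join(out).ljust(4, "0")


if __name__ == "__main__":
    for w in ("Robert", "Rupert", "Tymczak", "Pfister"):
        print(w, soundex(w))
```

Because the mapping is many-to-one, the code alone does not identify the original word, which is the property the suggestion relies on.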

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Stephen, we are hashing the first 8 bytes (8 US-ASCII characters) of text that has been written by humans. Wouldn't it be easy for the attacker to do a dictionary attack on this text, especially if he knows the language of the text? Kind regards Matthias On 10/13/2011 08:20 PM, Stephen Con

Re: MapReduce with two ethernet cards

2011-10-13 Thread Brandon Williams
On Thu, Oct 13, 2011 at 1:17 PM, Scott Fines wrote: > When I look at the source for ColumnFamilyInputFormat, it appears that it > does a call to client.describe_ring; when you do the equivalent call  with > nodetool, you get the 10.1.1.* addresses.  This seems to indicate to me that > I should

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
in theory, however they have less than 32 bits of entropy from which they can do that, leaving them with at least 32 more bits of combinations to try... that's 2 billion or so... must be a big dictionary - Stephen --- Sent from my Android phone, so random spelling mistakes, random nonsense words

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
When I look at the source for ColumnFamilyInputFormat, it appears that it does a call to client.describe_ring; when you do the equivalent call with nodetool, you get the 10.1.1.* addresses. This seems to indicate to me that I should open up the firewall and attempt to contact those IPs instead

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
The listen address on all machines is set to the 10.1.1.* addresses, while the thrift rpc address is set to the 172.28.* addresses From: Brandon Williams [dri...@gmail.com] Sent: Thursday, October 13, 2011 12:28 PM To: user@cassandra.apache.org Subject: Re: MapR

Re: MapReduce with two ethernet cards

2011-10-13 Thread Brandon Williams
What is your rpc_address set to? If it's 0.0.0.0 (bind everything) then that's not going to work if listen_address is blocked. -Brandon On Thu, Oct 13, 2011 at 11:13 AM, Scott Fines wrote: > I upgraded to cassandra 0.8.7, and the problem persists. > > Scott > ___

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Stephen, this sounds very reasonable. But wouldn't this enable an attacker to execute dictionary attacks in order to "decrypt" the first 8 bytes of the plain text? Kind regards Matthias On 10/13/2011 05:03 PM, Stephen Connolly wrote: It wouldn't be unencrypted... which is the point you u

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Zach, thanks for your additional input. You are absolutely right: The long namespace should be big enough. We are going to insert up to 2^32 values into the list. We only need support for get(index), insert(index) and remove(index) while get and insert will be used very often. Remove is al

Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Patricio Echagüe
On Thu, Oct 13, 2011 at 9:39 AM, Don Smith wrote: > ** > It's actually setStartKey that's the important method call (in combination > with setRowCount). So I should have been clearer. > > The following code performs as expected, as far as returning the expected > data in the expected order. I be

Re: Efficiency of hector's setRowCount (and setStartKey!)

2011-10-13 Thread Don Smith
It's actually setStartKey that's the important method call (in combination with setRowCount). So I should have been clearer. The following code performs as expected, as far as returning the expected data in the expected order. I believe that the use of IndexedSliceQuery's setStartKey will sup

Re: Hector has a website

2011-10-13 Thread Patricio Echagüe
Hi Aaron. does it still happen ? We didn't set up any password on the page. On Tue, Oct 11, 2011 at 9:15 AM, Aaron Turner wrote: > Just a FYI: > > http://hector-client.org is requesting a username/pass > http://www.hector-client.org is working fine > > On Fri, Oct 7, 2011 at 12:51 AM, aaron mort

Re: Efficiency of hector's setRowCount

2011-10-13 Thread Patricio Echagüe
Hi Don. No it will not. IndexedSlicesQuery will read just the amount of rows specified by RowCount and will go to the DB to get the new page when needed. SetRowCount is doing indexClause.setCount(rowCount); On Mon, Oct 10, 2011 at 3:52 PM, Don Smith wrote: > Hector's IndexedSlicesQuery has a se
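
The paging behaviour described above (fetch at most RowCount rows, then go back to the database for the next page) can be sketched as follows. This is not Hector code: `fetch_page` is a hypothetical stand-in for one query round trip, and it assumes rows come back in key order with an inclusive start key, which is a simplification.

```python
from bisect import bisect_left


def fetch_page(sorted_keys, start_key, row_count):
    """Hypothetical stand-in for one round trip: up to row_count keys >= start_key."""
    i = bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + row_count]


def iterate_all(sorted_keys, row_count):
    """Yield every key, pulling one page at a time."""
    if row_count < 2:
        # With an inclusive start key, a page size of 1 could never advance.
        raise ValueError("row_count must be at least 2")
    start = ""
    first_page = True
    while True:
        page = fetch_page(sorted_keys, start, row_count)
        # The start key is inclusive, so every page after the first
        # begins with the last row already seen; skip it.
        rows = page if first_page else page[1:]
        for key in rows:
            yield key
        if len(page) < row_count:
            return  # short page: nothing left to fetch
        start = page[-1]
        first_page = False


if __name__ == "__main__":
    print(list(iterate_all(["a", "b", "c", "d", "e"], 2)))
```

Only one page is held in memory at a time, so the row count bounds the work done per round trip rather than the total result size.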

RE: MapReduce with two ethernet cards

2011-10-13 Thread Scott Fines
I upgraded to cassandra 0.8.7, and the problem persists. Scott From: Brandon Williams [dri...@gmail.com] Sent: Monday, October 10, 2011 12:28 PM To: user@cassandra.apache.org Subject: Re: MapReduce with two ethernet cards On Mon, Oct 10, 2011 at 11:47 AM,

Re: Hector Problem Basic one

2011-10-13 Thread Patricio Echagüe
Hi, Hector does not retry on a down server. In the unit tests where you have just one server, Hector will pass the exception to the client. Can you tell us please what your test looks like ? 2011/10/12 Wangpei (Peter) > I only saw this error message when all Cassandra nodes are down. > > H

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread Dean Hiller
great video, thanks! On Thu, Oct 13, 2011 at 7:45 AM, hani elabed wrote: > Hi Dean, > I don't have an answer to your question, but just in case you haven't > seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot. > http://blip.tv/datastax/indexing-in-cassandra-5495633 >

Re: Storing pre-sorted data

2011-10-13 Thread Stephen Connolly
It wouldn't be unencrypted... which is the point: you use a one-way linear hash function to take the first, say 8 bytes, of unencrypted data and turn it into 4 bytes of a sort prefix. You've lost half the data in the process, so effectively each bit is an OR of two bits and you can only infer

Re: Storing pre-sorted data

2011-10-13 Thread Zach Richardson
Matthias, Answers below. On Thu, Oct 13, 2011 at 9:03 AM, Matthias Pfau wrote: > Hi Zach, > thanks for that good idea. Unfortunately, our list needs to be rewritten > often because our data is far away from being evenly distributed. This shouldn't be a problem if you use longs. If you were to

Re: [Solved] column index offset miscalculation

2011-10-13 Thread Thomas Richter
Thanks for the hint. Ticket created: https://issues.apache.org/jira/browse/CASSANDRA-3358 Best, Thomas On 10/13/2011 03:27 PM, Sylvain Lebresne wrote: > JIRA is not read-only, you should be able to create a ticket at > https://issues.apache.org/jira/browse/CASSANDRA, though > that probably requ

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Brandon Williams
You're running into https://issues.apache.org/jira/browse/CASSANDRA-3259 Try upgrading and doing a rolling restart. -Brandon On Thu, Oct 13, 2011 at 9:11 AM, Eric Czech wrote: > Nope, there was definitely no intersection of the seed nodes between the two > clusters so I'm fairly certain that th

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
Nope, there was definitely no intersection of the seed nodes between the two clusters so I'm fairly certain that the second cluster found out about the first through what was in the LocationInfo* system tables. Also, I don't think that procedure will really help because I don't actually want the s

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Zach, thanks for that good idea. Unfortunately, our list needs to be rewritten often because our data is far away from being evenly distributed. However, we could get this under control but there is a more severe problem: Random access is very hard to implement on a structure with undefine

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Mohit Anchlia
Do you have the same seed node specified in cass-analysis-1 as cass-1,2,3? I am thinking that changing the seed node in cass-analysis-2 and following the directions in http://wiki.apache.org/cassandra/FAQ#schema_disagreement might solve the problem. Someone please correct me. On Thu, Oct 13, 2011 at 12

Re: supercolumns vs. prefixing columns of same data type?

2011-10-13 Thread hani elabed
Hi Dean, I don't have an answer to your question, but just in case you haven't seen this screencast by Ed Anuff on Cassandra Indexes, it helped me a lot. http://blip.tv/datastax/indexing-in-cassandra-5495633 Hani On Wed, Oct 12, 2011 at 12:18 PM, Dean Hiller wrote: > I heard cassandra may

Re: [Solved] column index offset miscalculation (was: Existing column(s) not readable)

2011-10-13 Thread Sylvain Lebresne
JIRA is not read-only, you should be able to create a ticket at https://issues.apache.org/jira/browse/CASSANDRA, though that probably requires that you create an account. -- Sylvain On Thu, Oct 13, 2011 at 3:20 PM, Thomas Richter wrote: > Hi Aaron, > > the fix does the trick. I wonder why nobody

[Solved] column index offset miscalculation (was: Existing column(s) not readable)

2011-10-13 Thread Thomas Richter
Hi Aaron, the fix does the trick. I wonder why nobody else ran into this before... I checked org/apache/cassandra/db/ColumnIndexer.java in 0.7.9, 0.8.7 and 1.0.0-rc2 and all seem to be affected. Looks like the public JIRA is read-only, so I'm not sure how to continue. Best, Thomas On 10/13/2

Re: Storing pre-sorted data

2011-10-13 Thread Zach Richardson
Matthias, This is an interesting problem. I would consider using longs as the column type, where your column names are evenly distributed longs in sort order when you first write your list out. So if you have items A and C with the long column names 1000 and 2000, and then you have to insert B,
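
The message above is cut off, so the following is only a sketch of one way the scheme could work, assuming B is given a column name halfway between its neighbours. The helper names and the gap of 1000 are illustrative assumptions.

```python
def initial_names(count, gap=1000):
    """Evenly spaced long column names for the first write of the list."""
    return [gap * (i + 1) for i in range(count)]


def name_between(lower, upper):
    """Pick a column name strictly between two neighbours (here: the midpoint)."""
    if upper - lower < 2:
        # No free value left between the neighbours; the surrounding
        # names would have to be rewritten with fresh spacing.
        raise ValueError("no free name between %d and %d" % (lower, upper))
    return lower + (upper - lower) // 2


if __name__ == "__main__":
    a, c = 1000, 2000
    b = name_between(a, c)
    print(a, b, c)
```

Each insert between the same two neighbours halves the remaining gap, so a gap of 1000 allows roughly ten such inserts before that region needs respacing; a wider initial gap buys more.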

Re: Cassandra as session store under heavy load

2011-10-13 Thread Maciej Miklas
durable_writes sounds great - thank you! I really do not need the commit log here. Another question: is it possible to configure the lifetime of tombstones? Regards, Maciej

Re: Existing column(s) not readable

2011-10-13 Thread Thomas Richter
Hi Aaron, I guess I found it :-). I added logging for the used IndexInfo to SSTableNamesIterator.readIndexedColumns and got negative index positions for the missing columns. This is the reason why the columns are not loaded from sstable. So I had a look at ColumnIndexer.serializeInternal and ther

Re: Storing pre-sorted data

2011-10-13 Thread Matthias Pfau
Hi Stephen, this is a great idea but unfortunately doesn't work for us either as we cannot store the data in an unencrypted form. Kind regards Matthias On 10/12/2011 07:42 PM, Stephen Connolly wrote: could you prefix the data with 3-4 bytes of a linear hash of the unencrypted data? it wouldn'

Re: Schema versions reflect schemas on unwanted nodes

2011-10-13 Thread Eric Czech
I don't think that's what I'm after here since the unwanted nodes were originally assimilated into the cluster with the same initial_token values as other nodes that were already in the cluster (that have, and still do have, useful data). I know this is an awkward situation so I'll try to depict i