date:20110605

Re: Direct control over where data is stored?

2011-06-05 Thread Watanabe Maki

You can find out which endpoints Cassandra will store your key on with getNaturalEndpoints, but you can't specify the endpoint you want to use with this API. The partitioner decides which key will go to which node. With OPP, you may be able to predict which key range will be stored on a node, so you can c
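To illustrate the point that the partitioner, not the client, decides placement, here is a minimal sketch of a hash-based token ring. It is a toy model under stated assumptions, not Cassandra's code: the node names, the evenly spaced tokens, and the helpers `token_for` and `natural_endpoints` are all hypothetical, and the MD5-based ring only loosely mirrors how a hash partitioner behaves.

```python
import bisect
import hashlib

RING_SIZE = 2 ** 127  # token space of this toy ring (an assumption for illustration)

def token_for(key: bytes) -> int:
    """Hash a row key to a position on the ring (toy stand-in for a partitioner)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big") % RING_SIZE

def natural_endpoints(key: bytes, ring: list[tuple[int, str]], replicas: int) -> list[str]:
    """Return the owners of `key`: the first node whose token is >= the key's
    token, followed by its successors clockwise around the ring."""
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(replicas, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]

# Hypothetical 4-node ring with evenly spaced tokens, sorted by token.
ring = [(i * RING_SIZE // 4, f"node{i + 1}") for i in range(4)]
print(natural_endpoints(b"some-row-key", ring, replicas=3))
```

The caller can ask where a key lands, but the answer follows from the key's hash and the ring layout; there is no parameter for choosing a node.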

Re: [RELEASE] 0.8.0

2011-06-05 Thread Terje Marthinussen

0.8 under load may turn out to be more stable and better behaved than any release so far. Been doing a few test runs stuffing more than 1 billion records into a 12 node cluster and things look better than ever. VMs stable and nice at 11GB. No data corruptions, dead nodes, full GCs or any of the ot

Re: Direct control over where data is stored?

2011-06-05 Thread Khanh Nguyen

On Sun, Jun 5, 2011 at 11:26 PM, Maki Watanabe wrote: > getNaturalEndpoints tells you which key will be stored on which nodes, > but we can't force cassandra to store given key to specific nodes. > > maki I'm confused. Didn't you mention previously that I can use OrderPreservingPartitioner to sto

Re: slow insertion rate with secondary index

2011-06-05 Thread Jonathan Ellis

Index updates require read-before-write (to find out what the prior version was, if any, and update the index accordingly). This is random i/o. Index creation on the other hand is a lot of sequential i/o, hence more efficient. So, the classic bulk load advice to ingest data prior to creating ind
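The difference between the two paths can be sketched with a toy in-memory table. This is only an illustration of read-before-write versus a one-pass index build; the class `IndexedTable` and its methods are hypothetical and do not reflect Cassandra's storage engine.

```python
from collections import defaultdict

class IndexedTable:
    """Toy table with one secondary index on a single column value."""

    def __init__(self):
        self.rows = {}                 # row key -> indexed column value
        self.index = defaultdict(set)  # column value -> set of row keys
        self.reads = 0                 # lookups performed to maintain the index

    def insert_indexed(self, key, value):
        # Read-before-write: fetch the prior value so the stale index entry can go.
        self.reads += 1
        old = self.rows.get(key)
        if old is not None:
            self.index[old].discard(key)
        self.rows[key] = value
        self.index[value].add(key)

    def insert_unindexed(self, key, value):
        # Plain write: no lookup of the previous version.
        self.rows[key] = value

    def build_index(self):
        # One sequential pass over all rows, as when the index is created after loading.
        self.index = defaultdict(set)
        for key, value in self.rows.items():
            self.index[value].add(key)

live, bulk = IndexedTable(), IndexedTable()
for i in range(1000):
    live.insert_indexed(f"row{i}", i % 10)
    bulk.insert_unindexed(f"row{i}", i % 10)
bulk.build_index()
print(live.reads, bulk.reads)
```

Both tables end up with the same index, but only the first one paid for a lookup on every insert. In the toy model that lookup is a dictionary access; on disk it is the random I/O described above.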

Re: Direct control over where data is stored?

2011-06-05 Thread Maki Watanabe

getNaturalEndpoints tells you which key will be stored on which nodes, but we can't force cassandra to store given key to specific nodes. maki 2011/6/6 mcasandra : > > Khanh Nguyen wrote: >> >> Is there a way to tell where a piece of data is stored in a cluster? >> For example, can I tell if Last

Re: Direct control over where data is stored?

2011-06-05 Thread Watanabe Maki

It may not be what you want, but please read about Network Topology Strategy and DC_QUORUM. http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers You can configure your Cassandra to be "Data Center aware". Your reads and writes will be resolved in the local DC, but will be replica

Re: CQL How to do

2011-06-05 Thread Jeffrey Kesselman

Fair enough. I do have to keep reminding myself that a REST interface requires text. And it does make more sense, at least, when coming from a human as opposed to when you make a computer spend cycles converting binary to text just so another computer can spend cycles turning it back again. On Su

Re: Direct control over where data is stored?

2011-06-05 Thread mcasandra

Khanh Nguyen wrote: > > Is there a way to tell where a piece of data is stored in a cluster? > For example, can I tell if LastNameColumn['A'] is stored at node 1 in > the ring? > I have not used it but you can see getNaturalEndpoints in jmx. It will tell you which nodes are responsible for a gi

Re: how to know there are some columns in a row

2011-06-05 Thread aaron morton

You can create columns without values. Are you talking about reading them back through the API ? I would suggest looking at your data model to see if there is a better way to support your read patterns. Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://

Re: CQL How to do

2011-06-05 Thread aaron morton

From what I've seen of CQL there is no comparison between the potential complexity of a CQL statement and that of a SQL statement. IMHO CQL is more or less a human readable form of the current API, it does not add features. SQL statements are arbitrarily complex and may generate many possible qu

Re: problems with many columns on a row

2011-06-05 Thread aaron morton

Oops, I misread "150 GB" in one of your earlier emails as "150 MB" so forget what I said before. You have loads of free space :) How many files do you have in your data directory? If it's 1 then that log message was a small bug that has been fixed. Cheers - Aaron Morton Freel

slow insertion rate with secondary index

2011-06-05 Thread Donal Zang

I did an insertion test with and without secondary indexes, and found that: Without secondary index: ~10864 rows inserted per second. With secondary index on one column (BytesType): ~1515 rows inserted per second. Is this normal? Why would a secondary index have so much effect? I noticed that If I bu

Re: how to know there are some columns in a row

2011-06-05 Thread Patrick de Torcy

It would definitely be useful to be able to have column (or super column) names WITHOUT their values. If the values are pretty big or if there are a lot of columns, fetching them would generate traffic that is not necessarily needed (if in the end you are just interested in some columns). Moreover it doesn't seem

Re: Paging Columns from a Row

2011-06-05 Thread Joseph Stein

So I can have one PagedIndex CF that holds a row for each data file I am processing. That row (in my example) would have X columns and I can make those columns' values be 100 strings that represent keys in another PagedData CF. This other PagedData CF for each row would have 10,000

Re: Paging Columns from a Row

2011-06-05 Thread Jonathan Ellis

If you need to parallelize (and scale) you need to distribute across multiple rows. One Big Row means all your 100 workers are hammering the same 3 (for instance) replicas at the same time. On Sun, Jun 5, 2011 at 1:43 PM, Joseph Stein wrote: > What is the best practices here to page and slice col
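One way to read this advice is to split the one big row into fixed-size buckets, each stored under its own row key, so that different workers hit different replicas. The sketch below assumes the figures from the original question (1,000,000 columns, 10,000 per worker); the `base:bucket` key format and both helper names are hypothetical.

```python
def bucket_row_key(base_key: str, column_position: int, columns_per_row: int = 10_000) -> str:
    """Map a logical column position to the key of the row (bucket) that holds it."""
    return f"{base_key}:{column_position // columns_per_row}"

def worker_row_keys(base_key: str, total_columns: int, columns_per_row: int = 10_000) -> list[str]:
    """List the row keys to hand out, one bucket row per worker."""
    buckets = -(-total_columns // columns_per_row)  # ceiling division
    return [f"{base_key}:{bucket}" for bucket in range(buckets)]

# Hypothetical data file identifier used as the base of every bucket key.
keys = worker_row_keys("datafile-42", 1_000_000)
print(len(keys), keys[0], keys[-1])
```

Because each bucket key hashes to its own position on the ring, the 100 reads spread over the cluster instead of all landing on the replicas of a single row.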

Re: How to delete UUIDs from the CLI?

2011-06-05 Thread Jonathan Ellis

You're going to need to get a lot more specific. On Sun, Jun 5, 2011 at 12:12 PM, Kevin wrote: > Jonathan, I've upgraded to 0.8.0 and the problem got worse. Now, I can't > delete any rows from the CLI, regardless of the type they're stored as. > > > > -Original Message- > From: Jonathan E

Re: Direct control over where data is stored?

2011-06-05 Thread Khanh Nguyen

On Sun, Jun 5, 2011 at 2:17 PM, mcasandra wrote: > Please give more detailed info about what exactly you are worried about or > trying to solve. In general, we are trying to devise a partitioning and replication scheme that takes into account social relations between data. > Please take a step b

Re: Direct control over where data is stored?

2011-06-05 Thread Khanh Nguyen

Great. Thank you, Eric. -k On Sun, Jun 5, 2011 at 2:13 PM, Eric tamme wrote: > On Sun, Jun 5, 2011 at 12:18 PM, Khanh Nguyen > wrote: >> Hi Maki and Adrian, >> >> Thank you very much for the promptness. It's weekend after all :). >> >> I realized I forgot a part of my question until Adrian men

Paging Columns from a Row

2011-06-05 Thread Joseph Stein

What are the best practices here to page and slice columns from a row? So let's say I have 1,000,000 columns in a row. I read the row but want to have 1 thread read columns 0 - , second thread (actor in my case) 1 - 1 ... and so on so I can have 100 workers processing 10,000 columns for

Re: Direct control over where data is stored?

2011-06-05 Thread mcasandra

Please give more detailed info about what exactly you are worried about or trying to solve. Please take a step back and look at cassandra's architecture again and what it's trying to solve. It's a distributed database so if you do what you are describing there is a potential of getting hotspots. W

Re: Direct control over where data is stored?

2011-06-05 Thread Eric tamme

On Sun, Jun 5, 2011 at 12:18 PM, Khanh Nguyen wrote: > Hi Maki and Adrian, > > Thank you very much for the promptness. It's weekend after all :). > > I realized I forgot a part of my question until Adrian mentioned the > replication factor. Is it also possible to set where the replicas are > store

RE: How to delete UUIDs from the CLI?

2011-06-05 Thread Kevin

Jonathan, I've upgraded to 0.8.0 and the problem got worse. Now, I can't delete any rows from the CLI, regardless of the type they're stored as. -Original Message- From: Jonathan Ellis [mailto:jbel...@gmail.com] Sent: Sunday, June 05, 2011 10:56 AM To: user@cassandra.apache.org Subject:

Re: Direct control over where data is stored?

2011-06-05 Thread Khanh Nguyen

Hi Maki and Adrian, Thank you very much for the promptness. It's weekend after all :). I realized I forgot a part of my question until Adrian mentioned the replication factor. Is it also possible to set where the replicas are stored as well? Thanks. This is a research experiment we're exploring

Re: CQL How to do

2011-06-05 Thread Eric Evans

On Sun, 2011-06-05 at 00:51 -0400, Jeffrey Kesselman wrote: > Is CQL really the path for the future for Cassandra? CQL is no more or less "official" than the Thrift interface, and TTBMK, there is no secret cabal that met to decide it would be The Way. People will use what works best for them, and

Re: CQL/JDBC: Cannot locate cassandra.yaml

2011-06-05 Thread Jonathan Ellis

On Sun, Jun 5, 2011 at 9:38 AM, Timo Nentwig wrote: > Hmm, worked-around that by setting -Dcassandra.config (hmm, the client needs > the server's config...?). Yes, this is fixed for 0.8.1. > Not very verbose :-\ May have something to do with my l/p being just "/" for > AllowAll. Correct, that's

Re: How to delete UUIDs from the CLI?

2011-06-05 Thread Jonathan Ellis

If you're not using 0.8.0 the cli deals poorly with non-string row keys. On Sat, Jun 4, 2011 at 7:48 PM, Kevin wrote: > Currently I'm using a client (Pelops) to insert UUIDs (both lexical and > time) in to Cassandra. I haven't yet implemented a facility to remove them > with Pelops; i'm testing a

Re: Troubleshooting IO performance ?

2011-06-05 Thread Jonathan Ellis

You may be swapping. http://spyced.blogspot.com/2010/01/linux-performance-basics.html explains how to check this as well as how to see what threads are busy in the Java process. On Sat, Jun 4, 2011 at 5:34 PM, Philippe wrote: > Hello, > I am evaluating using cassandra and I'm running into some s

Re: CQL/JDBC: Cannot locate cassandra.yaml

2011-06-05 Thread Timo Nentwig

On 6/5/11 16:26, Timo Nentwig wrote: $ CLASSPATH=~/sqlshell/lib/ ~/sqlshell/bin/sqlshell org.apache.cassandra.cql.jdbc.CassandraDriver,jdbc:cassandra:foo/bar@localhost:9160/ks 2011-06-05 16:21:54,452 INFO [main] org.apache.cassandra.cql.jdbc.Connection - Connected to localhost:9160 2011-06-05

CQL/JDBC: Cannot locate cassandra.yaml

2011-06-05 Thread Timo Nentwig

$ CLASSPATH=~/sqlshell/lib/ ~/sqlshell/bin/sqlshell org.apache.cassandra.cql.jdbc.CassandraDriver,jdbc:cassandra:foo/bar@localhost:9160/ks 2011-06-05 16:21:54,452 INFO [main] org.apache.cassandra.cql.jdbc.Connection - Connected to localhost:9160 2011-06-05 16:21:54,517 ERROR [main] org.apache

Re: When should I use Solandra?

2011-06-05 Thread Jean-Nicolas Boulay Desjardins

Perfect thanks! On Sun, Jun 5, 2011 at 4:43 AM, Victor Kabdebon wrote: > Again I don't really know the specifics of Solandra but in Solr (so > Solandra being a cousin of Solr it should be true too) you have XML fields > like this : > > Just turn indexed to false and it's not going to be indexed.

Re: problems with many columns on a row

2011-06-05 Thread Mario Micklisch

I found a patch for the php extension here: https://issues.apache.org/jira/browse/THRIFT-1067 … this seemed to fix the issue. Thank you Jonathan and Aaron for taking time to provide me with some help! Regarding the compaction I would still love to hear your feedback on how to configure Cassandra

Re: problems with many columns on a row

2011-06-05 Thread Mario Micklisch

I tracked down the timestamp submission and everything was fine within the PHP libraries. The Thrift PHP extension however seems to have an overflow, because it was now setting timestamps with negative values as well ( -1242277493 ). I disabled the PHP extension and as a result I now got correct
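As a rough illustration of how such negative values can arise, the sketch below narrows a microsecond timestamp to a signed 32-bit integer. It assumes the timestamp is a 64-bit count of microseconds since the epoch; whether this is the actual cause in the PHP extension is not established here, and `truncate_to_int32` and the sample timestamp are hypothetical.

```python
import struct

def truncate_to_int32(value: int) -> int:
    """Keep only the low 32 bits of `value` and reinterpret them as a signed integer."""
    return struct.unpack("<i", struct.pack("<I", value & 0xFFFFFFFF))[0]

# Hypothetical timestamp: microseconds since the epoch for 2011-06-05 00:00:00 UTC.
timestamp_us = 1_307_232_000_000_000

print(timestamp_us.bit_length())        # far more than 32 bits are needed
print(truncate_to_int32(timestamp_us))  # negative for this value once narrowed
```

A column written with a timestamp like this would sort before every correctly stamped write, which is why values of this kind are worth checking for when updates appear to be ignored.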

Re: problems with many columns on a row

2011-06-05 Thread Mario Micklisch

Thanks for the feedback Aaron! The schema of the CF is default, I just defined the name and the rest is default, have a look: Keyspace: TestKS Read Count: 65 Read Latency: 657.8047076923076 ms. Write Count: 10756 Write Latency: 0.03237039791744143 ms. Pending Tasks: 0 Column Family: CFTest SSTa

Re: When should I use Solandra?

2011-06-05 Thread Victor Kabdebon

Again I don't really know the specifics of Solandra but in Solr (so Solandra being a cousin of Solr it should be true too) you have XML fields like this : Just turn indexed to false and it's not going to be indexed... Thrift won't affect Solandra at all. 2011/6/4 Jean-Nicolas Boulay Desjardins

34 matches
