Re: What's the best approach to search in Cassandra

2011-06-04 Thread Paul Loy
I use ElasticSearch myself, which is a distributed Lucene. http://www.elasticsearch.org On Sat, Jun 4, 2011 at 1:56 AM, Mark Kerzner wrote: > Hi, > > I need to store, say, 10M-100M documents, with each document having say 100 > fields, like author, creation date, access date, etc., and then I w

CQL How to do

2011-06-04 Thread Yonder
Hi, In Cassandra 0.8, CQL has become the primary client interface, but I don't know how to use it in a non-command-line environment. I could not find any how-to docs on the wiki or on DataStax's website. Thanks

Re: how to know there are some columns in a row

2011-06-04 Thread Yonder
Thank you very much, but I'm afraid it's not a graceful approach if there are a billion columns in a row. > >From: Michal Augustýn >To: user@cassandra.apache.org; Yonder >Sent: Thursday, June 2, 2011, 9:25 PM >Subject: Re: how to know there are some columns in a row > >Hi, >

When should I use Solandra?

2011-06-04 Thread Jean-Nicolas Boulay Desjardins
Hi, I am planning to use Cassandra to store my users' passwords and, at the same time, data for my website that needs to be accessible via search. My question is: should I use two DBs, Cassandra (for users' passwords) and Solandra (for the website's data), or can I put everything in Solandra? Is there a w

problems with many columns on a row

2011-06-04 Thread Mario Micklisch
Hello there! I have run into a strange problem several times now and I wonder if someone here has a solution for me: For some of my data I want to keep track of all the IDs I have used. To do that, I am putting the IDs as columns into rows. At first I wanted to put all IDs into one row (because

Re: When should I use Solandra?

2011-06-04 Thread Victor Kabdebon
Why do you need Solandra for storing data? If you want to retrieve data, simply use Cassandra. Solandra is for searching and indexing; it is a search engine. I do not recommend storing data solely in a search engine. Use the following design: *Store ALL data in Cassandra then extract from C

Re: When should I use Solandra?

2011-06-04 Thread Norman Maurer
Are you sure you really need Cassandra for this? To me it sounds like MySQL or another database would be a better fit for you (if you don't need to store a very huge amount of data...) Bye, Norman 2011/6/4 Jean-Nicolas Boulay Desjardins : > Hi, > I am planning to use Cassandra to store my users

Re: problems with many columns on a row

2011-06-04 Thread Jonathan Ellis
It sounds like you're trying to read entire rows at once. Past a certain point (depending on your heap size) you won't be able to do that; you need to "page" through them N columns at a time. On Sat, Jun 4, 2011 at 12:27 PM, Mario Micklisch wrote: > Hello there! > I have ran into a strange proble
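
[Editor's sketch] The "page through them N columns at a time" advice can be illustrated as below. This is only a minimal sketch, not any client's real API: `get_slice(start, count)` is a hypothetical stand-in for whatever slice call your client exposes, assumed to return up to `count` (name, value) columns in sorted order starting at column name `start` inclusive, and `make_in_memory_row` is an in-memory fake used purely to exercise the loop.

```python
import bisect
from typing import Callable, Iterator, List, Tuple

Column = Tuple[str, str]  # (column name, value)


def page_columns(get_slice: Callable[[str, int], List[Column]],
                 page_size: int = 100) -> Iterator[List[Column]]:
    """Yield the columns of one row in pages of at most page_size.

    get_slice is a HYPOTHETICAL callable (an assumption, not a real
    client API): it returns up to `count` columns, sorted by name,
    beginning at `start` inclusive ("" means the start of the row).
    """
    start = ""
    first = True
    while True:
        # The slice start is inclusive, so later pages ask for one extra
        # column and discard the one already returned by the previous page.
        count = page_size if first else page_size + 1
        cols = get_slice(start, count)
        if not first and cols and cols[0][0] == start:
            cols = cols[1:]
        if not cols:
            return
        yield cols
        if len(cols) < page_size:
            return  # short page: the row is exhausted
        start = cols[-1][0]
        first = False


def make_in_memory_row(n: int) -> Callable[[str, int], List[Column]]:
    """Build a fake wide row with n columns, for demonstration only."""
    row = sorted((f"id{i:06d}", "") for i in range(n))
    names = [name for name, _ in row]

    def get_slice(start: str, count: int) -> List[Column]:
        i = bisect.bisect_left(names, start)
        return row[i:i + count]

    return get_slice


if __name__ == "__main__":
    for page in page_columns(make_in_memory_row(250), page_size=100):
        print(len(page), page[0][0], page[-1][0])
```

Only one page is held in memory at a time, which is the point of the advice: heap use is bounded by the page size rather than by the width of the row.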

Re: problems with many columns on a row

2011-06-04 Thread Mario Micklisch
Thank you for the reply! I am not trying to read a row with too many columns into memory; the lock I am experiencing is write-related only and happens for everything added prior to an unknown event. I just ran into the same thing again and the column count is maybe not the real issue here (as I

Re: problems with many columns on a row

2011-06-04 Thread Jonathan Ellis
Did you check the server log for errors? See if the problem persists after running nodetool compact. If it does, use sstable2json to export the row in question. On Sat, Jun 4, 2011 at 3:21 PM, Mario Micklisch wrote: > Thank you for the reply! I am not trying to read a row with too many columns >

Re: When should I use Solandra?

2011-06-04 Thread Kirk Peterson
I think the OP was asking if you can use the same Cassandra cluster that Solandra is integrated with to store non-Solandra data in a different keyspace. This would remove the need to run two Cassandra clusters, one for storing his Solandra index, and another for his other data. I'm not sure if Solandra

Re: problems with many columns on a row

2011-06-04 Thread Mario Micklisch
Yes, I checked the log file; no errors there. With debug logging enabled, it confirms that it received the write, and the write is also in the commitlog. DEBUG 22:00:14,057 insert writing local RowMutation(keyspace='TestKS', key='44656661756c747c6532356231342d373937392d313165302d613663382d31323331336330616334

Troubleshooting IO performance ?

2011-06-04 Thread Philippe
Hello, I am evaluating Cassandra and I'm running into some strange IO behavior that I can't explain; I'd like some help/ideas to troubleshoot it. I am running a 1-node cluster with a keyspace consisting of two column families, one of which has dozens of supercolumns, themselves containing dozens

Re: When should I use Solandra?

2011-06-04 Thread Jake Luciani
On Saturday, June 4, 2011, Kirk Peterson wrote: > I think the OP was asking if you can use the same Cassandra cluster > that Solandra is integrated with to store non-Solandra in a different > keyspace. This would remove the need to run two Cassandra clusters, one for > storing his Solandra inde

How to delete UUIDs from the CLI?

2011-06-04 Thread Kevin
Currently I'm using a client (Pelops) to insert UUIDs (both lexical and time) into Cassandra. I haven't yet implemented a facility to remove them with Pelops; I'm testing and refining the insertion mechanism. As such, I would like to use the CLI to delete test UUID values. It seems, however, t

Direct control over where data is stored?

2011-06-04 Thread Khanh Nguyen
Hi everyone, Is it possible to have direct control over where objects are stored in Cassandra? For example, I have a Cassandra cluster of 4 machines and 4 objects A, B, C, D; I want to store A at machine 1, B at machine 2, C at machine 3 and D at machine 4. My guess is that I need to intervene the

Re: When should I use Solandra?

2011-06-04 Thread Jean-Nicolas Boulay Desjardins
Hi, So, if I understand Solandra: all the data is in Solandra and you can query it like you normally would with a normal Cassandra setup and search through it. The data from the Solr indexing is stored in a Cassandra column family... Second question: I have Thrift already installed; will it a

Re: Problem compiling

2011-06-04 Thread aaron morton
It's a Maven task that is failing; check which version you have (no idea if it's automatically packaged with Ant). Works for me with these... $ ant -version Apache Ant(TM) version 1.8.2 compiled on February 28 2011 $ /usr/bin/mvn -V Apache Maven 3.0.2 (r1056850; 2011-01-09 13:58:10+1300) Hope

Re: CQL How to do

2011-06-04 Thread aaron morton
I may be wrong, but as far as I know Thrift is still the official API, for now. CQL is in its first release and still has a few things to be added to it https://issues.apache.org/jira/browse/CASSANDRA-2472 . That said, jump in and try it out :) The best documentation I can point you to is https

Re: Re: how to know there are some columns in a row

2011-06-04 Thread aaron morton
You can also slice a range of columns for a row, e.g. the first 100 columns after column "". What client are you using? Cheers - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 5 Jun 2011, at 04:19, Yonder wrote: > > Thanks you

Re: Direct control over where data is stored?

2011-06-04 Thread Maki Watanabe
You may be able to do it with the Order Preserving Partitioner by making a key-to-node mapping before storing data, or you may need your own custom Partitioner. Please note that you are responsible for distributing load between nodes in this case. From an application design perspective, it is not clear for
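
[Editor's sketch] The key-to-node mapping idea can be modelled roughly as follows. This is a simplified model under stated assumptions, not Cassandra code: it assumes an order-preserving ring in which each node owns the keys greater than the previous node's token and up to its own token, with keys past the last token wrapping to the first node. The token strings, machine names and key prefixes are all hypothetical, chosen only so that objects A to D land on machines 1 to 4 as in the original question.

```python
import bisect
from typing import Dict


def owner(tokens: Dict[str, str], key: str) -> str:
    """Return the node that would own `key` in this simplified ring model.

    tokens maps a token (a string compared in plain sort order) to a node
    name. ASSUMPTION: a node owns the range (previous token, its token],
    and keys above the highest token wrap around to the first node.
    """
    ring = sorted(tokens)
    i = bisect.bisect_left(ring, key)
    return tokens[ring[i % len(ring)]]


# Hypothetical tokens: "~" sorts after "/" and alphanumerics, so any key
# starting with "1/" is less than "1~" and falls in machine1's range.
TOKENS = {
    "1~": "machine1",
    "2~": "machine2",
    "3~": "machine3",
    "4~": "machine4",
}

if __name__ == "__main__":
    for key in ("1/A", "2/B", "3/C", "4/D"):
        print(key, "->", owner(TOKENS, key))
```

Placement here is driven entirely by the key prefix the application picks, which is why load balancing becomes the application's responsibility in this setup.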

Re: problems with many columns on a row

2011-06-04 Thread aaron morton
It is rarely a good idea to let the data disk get too far over 50% utilisation. With so little free space the compaction process will have trouble running http://wiki.apache.org/cassandra/MemtableSSTable As you are on RC1 I would just drop the data and start again. If you need to keep it you

Re: CQL How to do

2011-06-04 Thread Jeffrey Kesselman
Is CQL really the path for the future for Cassandra? It seems to me that by introducing a textual language that has to be parsed and understood, you are adding back in some of the inefficiency of SQL... 2011/6/4 aaron morton : > May be wrong but as far as I know thrift is still the official API, for n

Re: Direct control over where data is stored?

2011-06-04 Thread Adrian Cockcroft
Sounds like Khanh thinks he can do joins... :-) User-oriented data is easy: key by Facebook ID and let Cassandra handle location. Set replication factor=3 so you don't lose data and can do consistent (but slower) read-after-write using quorum when you need to. If you are running on AWS you should distr
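
[Editor's sketch] The arithmetic behind "replication factor=3 plus quorum gives consistent read-after-write" can be checked in a few lines. This is only an illustration of the usual overlap rule (a read is guaranteed to touch at least one replica that acknowledged the write when write replicas + read replicas > replication factor); the function names are made up for this example.

```python
def quorum(replication_factor: int) -> int:
    """Size of a quorum: a strict majority of the replicas."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int,
                        write_replicas: int,
                        read_replicas: int) -> bool:
    """True if every read set must intersect every acknowledged write set."""
    return write_replicas + read_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print("quorum for RF=3:", q)
    print("quorum write + quorum read overlap:", read_overlaps_write(rf, q, q))
    print("one write + one read overlap:", read_overlaps_write(rf, 1, 1))
```

With RF=3 a quorum is 2 replicas, so quorum writes and quorum reads always share at least one replica, while each operation waits on two nodes instead of one; that is the "consistent but slower" trade-off mentioned above.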