Re: Schema advice: (Single row or multiple row!?) How do I store millions of columns when I need to read a set of around 500 columns at a single read query using column names ?
In addition, if you don't know how many rows will be needed, you can store the key of the next row in each row, just like in a linked list. Or have one row that holds the keys of all your other rows: first select the main row (with the keys), then select the other rows.

On Mon, Jul 23, 2012 at 3:40 PM, rohit bhatia wrote:
> You should probably try to break the one row scheme to
> 2*Number_of_nodes rows scheme.. This should ensure proper distribution
> of rows and still allow u to query from a few fixed number of rows.
> How u do it depends on how are u gonna choose ur 200-500 columns
> during reading (try having them in the same row)
>
> Even if u r forced to put them in separate rows, u can make the row
> key as "some modulus of hash of column name", ensuring symmetry and
> easy access of columns...
>
> On Mon, Jul 23, 2012 at 6:02 PM, Ertio Lew wrote:
> > Any ideas/suggestions please?
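The "modulus of hash of column name" suggestion quoted above can be sketched roughly as follows. This is only an illustration under assumptions: the helper names `bucket_row_key` and `group_columns_by_row`, the `base:bucket` key format, and the choice of MD5 as a stable hash are all hypothetical and not taken from the thread.

```python
import hashlib
from collections import defaultdict


def bucket_row_key(base_key: str, column_name: str, num_buckets: int) -> str:
    """Map a column name to one of num_buckets row keys (hypothetical key format)."""
    # A stable hash is needed so that writers and readers agree on the bucket.
    digest = hashlib.md5(column_name.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % num_buckets
    return f"{base_key}:{bucket}"


def group_columns_by_row(base_key, column_names, num_buckets):
    """Group the columns wanted by one read so each row is queried only once."""
    rows = defaultdict(list)
    for name in column_names:
        rows[bucket_row_key(base_key, name, num_buckets)].append(name)
    return dict(rows)
```

With `num_buckets` set to something like 2 * number of nodes, a read of 500 column names touches at most that many rows, and the row for any single column can be computed without a lookup.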
Re: Any meet ups in southern california
You can use Watchitoo.com (like GoToMeeting/WebEx) to host an event. Using that tool, everyone around the world can join and take action. The great thing about it is that it's FREE!

On Wed, Jul 6, 2011 at 10:25 PM, Mike Rapuano wrote:
> Hi all
>
> Are there any active cassandra meet ups in southern California?
>
> Thanks
> Mike
Re: Pre-CassandraSF Happy Hour on Sunday
Can you please use Watchitoo.com (it's free) and broadcast the event?

On Fri, Jul 8, 2011 at 8:54 PM, Richard Low wrote:
> Hi all,
>
> If you're in San Francisco for CassandraSF on Monday 11th, then come
> and join fellow Cassandra users and committers on Sunday evening.
> Starting at 6:30pm at ThirstyBear, the famous brewing company. We'll
> have drinks, food and more.
>
> RSVP at Eventbrite: http://pre-cassandrasf-happyhour.eventbrite.com/
>
> Hope you can join us!
>
> --
> Richard Low
> Acunu | http://www.acunu.com | @acunu
Cassandra Secondary index/Twissandra
Hi,
I have a few questions:

*Secondary index*
1. Is there a limit on the number of columns in a single column family that serve as secondary indexes?
2. Does performance decrease (significantly) if the uniqueness of the column’s values is high?

*Twissandra*
1. Why, in the source (and in every tutorial I've read), do the "Userline"/"Timeline" CFs have a comparator of "LONG_TYPE" and not TimeUUID?
   https://github.com/twissandra/twissandra/blob/master/tweets/management/commands/sync_cassandra.py
2. Does performance decrease (significantly) if the uniqueness of the column’s name is high when the comparator is LONG_TYPE/TimeUUID and each row has lots of columns?

Thanks!
Eldad
Re: Cassandra Secondary index/Twissandra
Aaron - Thank you for the fast response!

   1. Does performance decrease (significantly) if the uniqueness of the column’s name is high when comparator is LONG_TYPE/TimeUUID and each row has lots of columns?

> Depends on what sort of operations you are doing. Some read operations have to pay a constant cost to decode the row level column index, this can be tuned though. AFAIK the comparator type has very little to do with the performance.

In Twissandra, the columns are used as an "alternative" index for the Userline/Timeline, so the operation I'm going to do is slice_range: I'm going to get (for example) the first 50 columns (using a comparator of TimeUUID/LONG).
Can you recommend a better way of doing that, or a way to tune Cassandra to support those 2 CFs?

Thanks!

On Sun, Jul 10, 2011 at 3:26 AM, aaron morton wrote:
>
>    1. Is there a limit on the number of columns in a single column family
>    that serve as secondary indexes?
>
> AFAIK there is no coded limit, however every index is implemented as
> another (hidden) Column Family that inherits the settings of the parent CF.
> So under 0.7 you may run out of memory, under 0.8 you may flush a lot.
> Also, when an indexed column is updated there are potentially 3 operations
> that have to happen: read the old value, delete the old value, write the new
> value. More indexes == more index updating, just like any other database.
>
>    1. Does performance decrease (significantly) if the uniqueness of the
>    column’s values is high?
>
> Low cardinality is recommended
>
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Secondary-indices-Why-low-cardinality-td6160509.html
>
>    1. The CF for "Userline"/"Timeline" - have comparator of "LONG_TYPE"
>    and not TimeUUID?
>
> Probably just to make the demo easier. It's used to order tweets in the
> user and public timelines by the current time
> https://github.com/twissandra/twissandra/blob/master/cass.py#L204
>
>    1. Does performance decrease (significantly) if the uniqueness of the
>    column’s name is high when comparator is LONG_TYPE/TimeUUID and each row
>    has lots of columns?
>
> Depends on what sort of operations you are doing. Some read operations have
> to pay a constant cost to decode the row level column index, this can be
> tuned though. AFAIK the comparator type has very little to do with the
> performance.
>
> Hope that helps.
>
> -
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 9 Jul 2011, at 12:15, Eldad Yamin wrote:
>
> Hi,
> I have few questions:
>
> *Secondary index*
>
>    1. Is there a limit on the number of columns in a single column family
>    that serve as secondary indexes?
>    2. Does performance decrease (significantly) if the uniqueness of the
>    column’s values is high?
>
> *Twissandra*
>
>    1. Why in the source (or any tutorial I've read):
>    The CF for "Userline"/"Timeline" - have comparator of "LONG_TYPE" and
>    not TimeUUID?
>    https://github.com/twissandra/twissandra/blob/master/tweets/management/commands/sync_cassandra.py
>    2. Does performance decrease (significantly) if the uniqueness of the
>    column’s name is high when comparator is LONG_TYPE/TimeUUID and each row
>    has lots of columns?
>
> Thanks!
> Eldad
Re: Cassandra Secondary index/Twissandra
Hi Aaron,
Thank you again for your response.
I've read the article but I didn't understand everything. It would be great if the benchmark included the actual CLI/Python commands (that way it would be easier to understand the queries), plus an explanation of row pages: what are they?

Anyway, for a sense of scale, take the average Facebook/Twitter user, who can have 100K columns (Userline). What is needed is to take the first 50 columns (ordered by TimeUUID), then columns 51 to 100, 101 to 150, etc.
Any suggestion on how fast that will be? How would you recommend configuring Cassandra? Or is there a different way of achieving that goal?

Thanks,
Eldad.

On Sun, Jul 10, 2011 at 8:31 PM, aaron morton wrote:
> Can you recommend on a better way of doing that or a way to tune Cassandra
> to support those 2 CF?
>
> A select with no start or finish column name, a column count and not in
> reversed order is about the fastest read query.
>
> You will need to do a reversed query, which will be a little slower. But
> may still be plenty fast enough, depending on scale and throughput and all
> those other things. see
> http://thelastpickle.com/2011/07/04/Cassandra-Query-Plans/
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 10 Jul 2011, at 00:14, Eldad Yamin wrote:
>
> Aaron - Thank you for the fast response!
>
>    1. Does performance decrease (significantly) if the uniqueness of the
>    column’s name is high when comparator is LONG_TYPE/TimeUUID and each row
>    has lots of columns?
>
> >Depends on what sort of operations you are doing. Some read operations
> have to pay a constant cost to decode the row level column index, this can
> be tuned though. AFAIK the comparator type has very little to do with the
> performance.
>
> In Twissandra, the columns are used as "alternative" index for the
> Userline/Timeline. therefore the operation I'm going to do is slice_range.
> I'm going to get (for example) the first 50 columns (using comparator of > TimeUUID/LONG). > Can you recommend on a better way of doing that or a way to tune Cassandra > to support those 2 CF? > > > Thanks! > > On Sun, Jul 10, 2011 at 3:26 AM, aaron morton wrote: > >> >>1. Is there a limit on the number of columns in a single column family >>that serve as secondary indexes? >> >> AFAIK there is no coded limit, however every index is implemented as >> another (hidden) Column Family that inherits the settings of the parent CF. >> So under 0.7 you may run out of memory, under 0.8 you may flush a lot. >> Also, when an indexed column is updated there are potentially 3 operations >> that have to happen: read the old value, delete the old value, write the new >> value. More indexes == more index updating, just like any other database. >> >> >>1. Does performance decrease (significantly) if the uniqueness of the >>column’s values is high? >> >> Low cardinality is recommended >> >> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Secondary-indices-Why-low-cardinality-td6160509.html >> >> >>1. The CF for "Userline"/"Uimeline" - have comparator of "LONG_TYPE" >>and not TimeUUID? >> >> Probably just to make the demo easier. It's used to order tweets in the >> user and public timelines by the current time >> https://github.com/twissandra/twissandra/blob/master/cass.py#L204 >> >> >>1. Does performance decrease (significantly) if the uniqueness of the >>column’s name is high when comparator is LONG_TYPE/TimeUUID and each row >> has >>lots of columns? >> >> Depends on what sort of operations you are doing. Some read operations >> have to pay a constant cost to decode the row level column index, this can >> be tuned though. AFAIK the comparator type has very little to do with the >> performance. >> >> Hope that helps. 
>> >> - >> - >> Aaron Morton >> Freelance Cassandra Developer >> @aaronmorton >> http://www.thelastpickle.com >> >> On 9 Jul 2011, at 12:15, Eldad Yamin wrote: >> >> Hi, >> I have few questions: >> >> *Secondary index* >> >>1. Is there a limit on the number of columns in a single column family >>that serve as secondary indexes? >>2. Does performance decrease (significantly) if the uniqueness of the >>column’s values is high? >> >> >> *Twissandra* >> >>1. Why in the source (or any tutorial I've read): >>The CF for "Userline"/"Uimeline" - have comparator of "LONG_TYPE" and >>not TimeUUID? >> >> >> https://github.com/twissandra/twissandra/blob/master/tweets/management/commands/sync_cassandra.py >>2. Does performance decrease (significantly) if the uniqueness of the >>column’s name is high when comparator is LONG_TYPE/TimeUUID and each row >> has >>lots of columns? >> >> >> Thanks! >> Eldad >> >> >> > >
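The paging asked about in this thread (the newest 50 columns, then the next 50, and so on) is commonly done by restarting each slice at the last column already seen. The sketch below runs against an in-memory sorted list rather than a real client: `get_slice` is a hypothetical stand-in for a column slice with an inclusive start, not an actual Cassandra or pycassa call, and it assumes column names are unique within the row.

```python
from bisect import bisect_left, bisect_right


def get_slice(names, start=None, count=50, reverse=False):
    """In-memory stand-in for a column slice; names is sorted ascending, start is inclusive."""
    if reverse:
        end = len(names) if start is None else bisect_right(names, start)
        return names[:end][::-1][:count]
    begin = 0 if start is None else bisect_left(names, start)
    return names[begin:begin + count]


def iter_pages(names, page_size=50, reverse=True):
    """Yield pages of column names, newest first when reverse is True."""
    start = None
    while True:
        # After the first page, ask for one extra column: because the start is
        # inclusive, the first column returned repeats the last one already seen.
        want = page_size if start is None else page_size + 1
        page = get_slice(names, start, want, reverse)
        if start is not None:
            page = page[1:]
        if not page:
            return
        yield page
        start = page[-1]
```

Each page costs one slice of at most `page_size + 1` columns, so the client never has to count past earlier pages to reach "columns 51 to 100".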
b-tree
Hello,
Is there any good way of storing a binary tree in Cassandra?
I wonder if someone has already implemented something like that, and how they accomplished it without transaction support (while the tree keeps evolving)?

I'm asking because I want to save geospatial data, and SimpleGeo did it using a b-tree:
http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php

Thanks!
Re: b-tree
Aaron,
Nested set is exactly what I had in mind.
But how will you be able to maintain it while it evolves and new data is added without transactions?

Thanks!

On Thu, Jul 21, 2011 at 1:44 AM, aaron morton wrote:
> Just throwing out a (half baked) idea, perhaps the Nested Set Model of
> trees would work http://en.wikipedia.org/wiki/Nested_set_model
>
> * Every row would represent a set with a left and right encoded into the key
> * Members are inserted as columns into *every* set / row they are a member.
>   So we are de-normalising and trading space for time.
> * May need to maintain a custom secondary index of the materialised sets.
>   e.g. slice a row to get the first column >= the left value you are
>   interested in, that is the key for the set.
>
> I've not thought it through much further than that, a lot would depend on
> your data. The top sets may get very big.
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 21 Jul 2011, at 08:33, Jeffrey Kesselman wrote:
>
> Im not sure if I have an answer for you, anyway, but I'm curious
>
> A b-tree and a binary tree are not the same thing. A binary tree is a
> basic fundamental data structure, A b-tree is an approach to storing and
> indexing data on disc for a database.
>
> Which do you mean?
>
> On Wed, Jul 20, 2011 at 4:30 PM, Eldad Yamin wrote:
>
>> Hello,
>> Is there any good way of storing a binary-tree in Cassandra?
>> I wonder if someone already implement something like that and how
>> accomplished that without transaction supports (while the tree keep
>> evolving)?
>>
>> I'm asking that because I want to save geospatial-data, and SimpleGeo did
>> it using b-tree:
>> http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php
>>
>> Thanks!
>>
>
> --
> It's always darkest just before you are eaten by a grue.
Re: b-tree
Hi Jeffrey,
I meant a binary tree. Go and watch the video (in my first email), it will give you a better understanding.
Eldad

On Wed, Jul 20, 2011 at 11:33 PM, Jeffrey Kesselman wrote:
> Im not sure if I have an answer for you, anyway, but I'm curious
>
> A b-tree and a binary tree are not the same thing. A binary tree is a
> basic fundamental data structure, A b-tree is an approach to storing and
> indexing data on disc for a database.
>
> Which do you mean?
>
> On Wed, Jul 20, 2011 at 4:30 PM, Eldad Yamin wrote:
>
>> Hello,
>> Is there any good way of storing a binary-tree in Cassandra?
>> I wonder if someone already implement something like that and how
>> accomplished that without transaction supports (while the tree keep
>> evolving)?
>>
>> I'm asking that because I want to save geospatial-data, and SimpleGeo did
>> it using b-tree:
>> http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php
>>
>> Thanks!
>>
>
> --
> It's always darkest just before you are eaten by a grue.
Re: how to stop the whole cluster, start the whole cluster like in hadoop/hbase?
I wonder if it won't cause problems... Has anyone done it already?

On Jul 21, 2011 10:39 PM, "Jonathan Ellis" wrote:
> dsh -c -g cassandra /etc/init.d/cassandra stop
>
> http://www.netfort.gr.jp/~dancer/software/dsh.html.en
>
> P.S. mostly people are concerned about making sure their entire
> cluster does NOT stop at the same time :)
>
> On Thu, Jul 21, 2011 at 2:23 PM, Dean Hiller wrote:
>> Is there a framework for stopping all nodes/starting all nodes for
>> cassandra? I am okay with something like password-less ssh setup that
>> hadoop scripts did...just something that allows me to start and stop the
>> whole cluster.
>>
>> thanks,
>> Dean
>>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
Re: b-tree
In order to split the nodes. SimpleGeo has a maximum of 1,000 records (i.e. places) on each node in the tree; if the number is >1,000 they split the node.
To prevent more than one process from editing/splitting the same node at the same time, a transaction is needed.

On Jul 22, 2011 1:01 AM, "aaron morton" wrote:
>> But how will you be able to maintain it while it evolves and new data is added without transactions?
>
> What is the situation you think you need transactions for ?
>
> Cheers
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 22 Jul 2011, at 00:06, Eldad Yamin wrote:
>
>> Aaron,
>> Nested set is exactly what I had in mind.
>> But how will you be able to maintain it while it evolves and new data is added without transactions?
>>
>> Thanks!
>>
>> On Thu, Jul 21, 2011 at 1:44 AM, aaron morton wrote:
>> Just throwing out a (half baked) idea, perhaps the Nested Set Model of trees would work http://en.wikipedia.org/wiki/Nested_set_model
>>
>> * Every row would represent a set with a left and right encoded into the key
>> * Members are inserted as columns into *every* set / row they are a member. So we are de-normalising and trading space for time.
>> * May need to maintain a custom secondary index of the materialised sets. e.g. slice a row to get the first column >= the left value you are interested in, that is the key for the set.
>>
>> I've not thought it through much further than that, a lot would depend on your data. The top sets may get very big.
>>
>> Cheers
>>
>> -
>> Aaron Morton
>> Freelance Cassandra Developer
>> @aaronmorton
>> http://www.thelastpickle.com
>>
>> On 21 Jul 2011, at 08:33, Jeffrey Kesselman wrote:
>>
>>> Im not sure if I have an answer for you, anyway, but I'm curious
>>>
>>> A b-tree and a binary tree are not the same thing. A binary tree is a basic fundamental data structure, A b-tree is an approach to storing and indexing data on disc for a database.
>>>
>>> Which do you mean?
>>> >>> On Wed, Jul 20, 2011 at 4:30 PM, Eldad Yamin wrote: >>> Hello, >>> Is there any good way of storing a binary-tree in Cassandra? >>> I wonder if someone already implement something like that and how accomplished that without transaction supports (while the tree keep evolving)? >>> >>> I'm asking that becouse I want to save geospatial-data, and SimpleGeo did it using b-tree: >>> http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php >>> >>> Thanks! >>> >>> >>> >>> -- >>> It's always darkest just before you are eaten by a grue. >> >> >
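The split rule described in this message (a node holds at most 1,000 records and is split once it exceeds that) can be illustrated with a small in-memory sketch. This is not SimpleGeo's implementation: the `Leaf` class, the one-dimensional sorted keys, and the median split are assumptions made only to show the rule. It is also single-process; two writers splitting the same node concurrently is exactly the race the message is worried about.

```python
from bisect import insort

MAX_RECORDS = 1000  # threshold quoted in the message above


class Leaf:
    """Hypothetical tree node holding up to MAX_RECORDS sorted keys."""

    def __init__(self, keys=None):
        self.keys = sorted(keys or [])

    def insert(self, key):
        """Insert key; return [self], or two new leaves if the node overflowed."""
        insort(self.keys, key)
        if len(self.keys) <= MAX_RECORDS:
            return [self]
        # Overflow: split at the median so both halves stay well under the limit.
        mid = len(self.keys) // 2
        return [Leaf(self.keys[:mid]), Leaf(self.keys[mid:])]
```

The caller is responsible for replacing the old node with the returned leaves in the parent, which is the step that would need some form of coordination when several processes write at once.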
Question about eventually consistent
Hi,
Let’s say that I have 2 datacenters, and a key is changed in both of them at exactly the same time (or within 1-2 seconds of each other).
Datacenter #1 removes a column and Datacenter #2 adds 2 new columns.
Is there any problem with consistency, or will Cassandra handle this situation easily?

Thanks!
Question about eventually consistent in Cassandra
Hi,
Let’s say that I have 2 datacenters, and a key is changed in both of them at exactly the same time (or within 1-2 seconds of each other).
Datacenter #1 adds column "abc" with value X.
Datacenter #2 adds column "abc" with value Y.

What is the result in that situation?
Is there any difference if the changes are made within the same datacenter?

Thanks!
Eldad Yamin
HOW TO select a column or all columns that start with X
Hello,
I wonder if I can select a column, or all columns, whose names start with X.
E.g. I have columns ABC_1, ABC_2, ZZZ_1 and I want to select all columns that start with ABC_ - is that possible?

Thanks!
cassandra consistency level
Does consistency level "All" for writes actually guarantee that my data is updated on all of my nodes?
Does it apply to reads as well?

I've read about it on the wiki, I just want to make sure.
Thanks!
geo-data in Cassandra
Hello,
I'm trying to save geo-data in Cassandra. According to SimpleGeo, they did that using a nested tree:
http://www.readwriteweb.com/cloud/2011/02/video-simplegeo-cassandra.php

I wonder if someone has already implemented something like that, and how they accomplished it without transaction support (while the tree keeps evolving)?
In addition, what consistency level did they use?

Thanks!
Re: cassandra consistency level
So what you're saying is that no matter what consistency level I'm using, the data will be written to all CF nodes right away, and the consistency level is just for making sure that all CF nodes are UP and all data is written.
In other words, if one of the nodes is down, the write (or read) will fail.

I'm asking because I'm a bit worried about consistency. For example:
Every action that my client performs is stored in CF.x, in a specific column keyed by his user_id. I do that by de-serializing the data already found in the column, adding the new data (the action), then serializing and storing the result.
So I'm worried that some of the user's actions will be "dropped" due to low consistency when there are lots of changes to a specific column in a short period of time.
I know I can solve this in a different way, by storing each action in a new column etc., but this is just an example that explains my question in a simple way.

Thanks!

On Wed, Aug 3, 2011 at 3:21 AM, aaron morton wrote:
> Not sure I understand your question exactly, but will take a shot…
>
> Writes are sent to every UP node, the consistency level is how many nodes
> we require to complete before we say the request completed successfully. So
> we also make sure that CL nodes are UP before we start the request. If you
> run CL ALL then Replication Factor nodes must be up for each key you are
> writing.
>
> With the exception of CL ONE reads are also sent to all UP replicas.
>
> Hope that helps.
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3 Aug 2011, at 09:32, Eldad Yamin wrote:
>
> > Is consistency level "All" for write actually grenty that my data is
> > updated in all of my node?
> > is it apply to read actions as-well?
> >
> > I've read it on the wiki, I just want to make sure.
> > Thanks!
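The point in the quoted reply, that the consistency level is the number of replicas that must respond rather than the number the write is sent to, can be put into a few lines. This is a minimal sketch covering only ONE, QUORUM and ALL; the function names `required_acks` and `can_serve` are hypothetical and not part of any Cassandra client.

```python
def required_acks(consistency_level: str, replication_factor: int) -> int:
    """Replicas that must respond before the request counts as successful."""
    levels = {
        "ONE": 1,
        "QUORUM": replication_factor // 2 + 1,  # a strict majority of replicas
        "ALL": replication_factor,
    }
    return levels[consistency_level.upper()]


def can_serve(consistency_level: str, replication_factor: int, replicas_up: int) -> bool:
    """True if enough replicas are up for the request to be attempted at all."""
    return replicas_up >= required_acks(consistency_level, replication_factor)
```

With a replication factor of 3 and one replica down, `can_serve("QUORUM", 3, 2)` holds while `can_serve("ALL", 3, 2)` does not, which matches the "Replication Factor nodes must be up" remark for CL ALL.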
Re: HOW TO select a column or all columns that start with X
Thanks!

On Wed, Aug 3, 2011 at 3:03 PM, aaron morton wrote:
> and AsciiType
>
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3 Aug 2011, at 16:35, eldad87 wrote:
>
> Thank you!
> Will this situation work only for UTF8Type comparator?
>
> On Wed, Aug 3, 2011 at 4:50 AM, Tyler Hobbs wrote:
>
>> A minor correction:
>>
>> To get all columns starting with "ABC_", you would set column_start="ABC_"
>> and column_finish="ABC`" (the '`' character comes after '_'), and ignore the
>> last column in your results if it happened to be "ABC`".
>>
>> column_finish, or the "slice end" in other clients, is inclusive. You
>> could of course use "ABC_~" as column_finish and avoid the check if you know
>> that you don't have column names like "ABC_~FOO" that you want to include.
>>
>> On Tue, Aug 2, 2011 at 7:17 PM, aaron morton wrote:
>>
>>> Yup, that's a pretty common pattern. How exactly depends on the client you
>>> are using.
>>>
>>> Say you were using pycassa, you would do a get()
>>> http://pycassa.github.com/pycassa/api/pycassa/columnfamily.html#pycassa.columnfamily.ColumnFamily.get
>>>
>>> with column_start="ABC_" , count to whatever, and column_finish not
>>> provided.
>>>
>>> You can also provide a finish and use the highest encoded character, e.g.
>>> ascii 126 is ~ so if you used column_finish = "ABC_~" you would get
>>> everything that starts with ABC_
>>>
>>> Cheers
>>>
>>> -
>>> Aaron Morton
>>> Freelance Cassandra Developer
>>> @aaronmorton
>>> http://www.thelastpickle.com
>>>
>>> On 3 Aug 2011, at 09:28, Eldad Yamin wrote:
>>>
>>> Hello,
>>> I wonder if I can select a column or all columns that start with X.
>>> E.g I have columns ABC_1, ABC_2, ZZZ_1 and I want to select all columns
>>> that start with ABC_ - is that possible?
>>>
>>> Thanks!
>>>
>>
>> --
>> Tyler Hobbs
>> Software Engineer, DataStax <http://datastax.com/>
>> Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
>> Python client library
>>
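The prefix trick from the quoted correction (start at "ABC_", finish at "ABC`" because '`' is the character after '_', then drop a column named exactly "ABC`") can be checked against an in-memory sorted list. This sketch does not call pycassa; `prefix_bounds` and `slice_by_prefix` are hypothetical helpers, and they assume a non-empty ASCII prefix so that Python's string ordering matches an AsciiType/UTF8Type comparator.

```python
from bisect import bisect_left, bisect_right


def prefix_bounds(prefix: str):
    """Return (start, finish): finish is the prefix with its last character incremented."""
    return prefix, prefix[:-1] + chr(ord(prefix[-1]) + 1)


def slice_by_prefix(sorted_names, prefix):
    """Emulate an inclusive column slice and keep only names that start with prefix."""
    start, finish = prefix_bounds(prefix)
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, finish)  # the finish bound is inclusive
    # Because the finish is inclusive, a column named exactly `finish` can
    # come back as the last result and has to be ignored.
    return [name for name in sorted_names[lo:hi] if name != finish]
```

For the column names in the original question, `slice_by_prefix(["ABC_1", "ABC_2", "ZZZ_1"], "ABC_")` returns the two `ABC_` columns and leaves `ZZZ_1` out.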
Install Cassandra on EC2
Hi, Is there any manual or important notes I should know before I try to install Cassandra on EC2? Thanks!
Re: Install Cassandra on EC2
Thanks!
But I'd prefer to learn how to install it first, if you have any good references (I didn't find any, not even a general installation guide for an EC2/regular machine).
I'm also going to try to install Solandra; I hope that Whirr will support it in the near future.

On Wed, Aug 3, 2011 at 5:43 PM, John Conwell wrote:
> One thing you might want to look at is the Apache Whirr project (which is
> awesome by the way!). It automagically handles spinning up a cluster of
> resources on EC2 (or rackspace for that matter), installing and configuring
> cassandra, and starting it.
>
> One thing to be aware of if you go this route. By default in the yaml file
> all data is written under the /var folder. But on a server started by
> Whirr, this folder only has something like 4gb. Most of the hard disk
> space is under the /mnt folder. So you'll either need to change what
> folders are pointed to what drives (not sure if you can or not...I'm sure
> you could), or change the yaml file to point the /mnt folder.
>
> On Wed, Aug 3, 2011 at 6:28 AM, Eldad Yamin wrote:
>
>> Hi,
>> Is there any manual or important notes I should know before I try to
>> install Cassandra on EC2?
>>
>> Thanks!
>>
>
> --
>
> Thanks,
> John C
Cassandra and Solandra Installation guide
Hi,
I'd like to get tutorials on how to install Cassandra and Solandra - I couldn't find anything helpful.
In addition, tutorials on how to use Solandra (index/search) would be great.

Thanks!
Installation Exception
Hi,
I'm trying to install Cassandra on Amazon EC2 without success. This is what I did:

1. Created a new "Small" EC2 instance (this is just for testing), running Ubuntu - custom AMI (ami-596f3c1c) from:
   http://uec-images.ubuntu.com/releases/11.04/release/
2. Installed Java:
   # sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
   # sudo apt-get update
   # sudo apt-get install sun-java6-jre sun-java6-plugin sun-java6-fonts openjdk-6-jre
3. Upgraded:
   # sudo apt-get upgrade
4. Downloaded Cassandra:
   # cd /usr/src/
   # sudo wget http://apache.mivzakim.net//cassandra/0.8.2/apache-cassandra-0.8.2-src.tar.gz
   # sudo tar xvfz apache-cassandra-*
   # cd apache-cassandra-*
5. Config (according to README.txt):
   # sudo mkdir -p /var/log/cassandra
   # sudo chown -R `whoami` /var/log/cassandra
   # sudo mkdir -p /var/lib/cassandra
   # sudo chown -R `whoami` /var/lib/cassandra
6. Ran Cassandra:
   # bin/cassandra -f

Then I got this exception:

"ubuntu@ip-10-170-31-128:/usr/src/apache-cassandra-0.8.2-src$ bin/cassandra -f
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/cassandra/thrift/CassandraDaemon
Caused by: java.lang.ClassNotFoundException: org.apache.cassandra.thrift.CassandraDaemon
    at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: org.apache.cassandra.thrift.CassandraDaemon. Program will exit."

Any idea what is wrong?
Thanks!
Re: Installation Exception
Thanks Jonathan,
I saw the EC2 AMI made by DataStax - I prefer not to use it because I want to learn how to install Cassandra first.

On Wed, Aug 3, 2011 at 8:03 PM, Jonathan Ellis wrote:
>
> http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami
>
> On Wed, Aug 3, 2011 at 10:44 AM, Eldad Yamin wrote:
> > Hi,
> > I'm trying to install Cassandra on Amazon EC2 without success, this is what
> > I did:
> >
> > Created new "Small" EC2 instance (this is just for testing), running Ubuntu
> > OS - custom AMI (ami-596f3c1c) from:
> > http://uec-images.ubuntu.com/releases/11.04/release/
> > Installed Java:
> > # sudo add-apt-repository "deb http://archive.canonical.com/ lucid partner"
> > # sudo apt-get update
> > # sudo apt-get install sun-java6-jre sun-java6-plugin sun-java6-fonts
> > openjdk-6-jre
> > Upgraded:
> > # sudo apt-get upgrade
> > Downloaded Cassandra:
> > # cd /usr/src/
> > # sudo wget
> > http://apache.mivzakim.net//cassandra/0.8.2/apache-cassandra-0.8.2-src.tar.gz
> > # sudo tar xvfz apache-cassandra-*
> > # cd apache-cassandra-*
> > Config (according to README.txt)
> > # sudo mkdir -p /var/log/cassandra
> > # sudo chown -R `whoami` /var/log/cassandra
> > # sudo mkdir -p /var/lib/cassandra
> > # sudo chown -R `whoami` /var/lib/cassandra
> > RUN CASSANDRA
> > # bin/cassandra -f
> >
> > The I got Exception:
> > "ubuntu@ip-10-170-31-128:/usr/src/apache-cassandra-0.8.2-src$ bin/cassandra
> > -f
> > Exception in thread "main" java.lang.NoClassDefFoundError:
> > org/apache/cassandra/thrift/CassandraDaemon
> > Caused by: java.lang.ClassNotFoundException:
> > org.apache.cassandra.thrift.CassandraDaemon
> > at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
> > at java.security.AccessController.doPrivileged(Native Method)
> > at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
> > at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
> > at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
> > Could not find the main class: org.apache.cassandra.thrift.CassandraDaemon.
> > Program will exit."
> >
> > Any idea what is wrong?
> > Thanks!
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of DataStax, the source for professional Cassandra support
> http://www.datastax.com
Re: Installation Exception
Thanks! I missed that lol! BTW, how do I compile it? Thanks! On Wed, Aug 3, 2011 at 6:51 PM, samal wrote: > did u compile source code? :) > you have downloaded source code not binary. > > try with binary. > > On Wed, Aug 3, 2011 at 9:14 PM, Eldad Yamin wrote: > >> Hi, >> I'm trying to install Cassandra on Amazon EC2 without success, this is >> what I did: >> >>1. Created new "Small" EC2 instance (this is just for testing), >>running Ubuntu OS - custom AIM (ami-596f3c1c) from: >>http://uec-images.ubuntu.com/releases/11.04/release/ >>2. Installed Java: >># sudo add-apt-repository "deb http://archive.canonical.com/ lucid >>partner" >># sudo apt-get update >># sudo apt-get install sun-java6-jre sun-java6-plugin sun-java6-fonts >>openjdk-6-jre >>3. Upgraded: >># sudo apt-get upgrade >>4. Downloaded Cassandra: >># cd /usr/src/ >># sudo wget >> >> http://apache.mivzakim.net//cassandra/0.8.2/apache-cassandra-0.8.2-src.tar.gz >> >># sudo tar xvfz apache-cassandra-* >># cd apache-cassandra-* >>5. Config (according to README.txt) >># sudo mkdir -p /var/log/cassandra >># sudo chown -R `whoami` /var/log/cassandra >># sudo mkdir -p /var/lib/cassandra >># sudo chown -R `whoami` /var/lib/cassandra >>6. 
RUN CASSANDRA >># bin/cassandra -f >> >> The I got Exception: >> "ubuntu@ip-10-170-31-128:/usr/src/apache-cassandra-0.8.2-src$ >> bin/cassandra -f >> Exception in thread "main" java.lang.NoClassDefFoundError: >> org/apache/cassandra/thrift/CassandraDaemon >> Caused by: java.lang.ClassNotFoundException: >> org.apache.cassandra.thrift.CassandraDaemon >> at java.net.URLClassLoader$1.run(URLClassLoader.java:217) >> at java.security.AccessController.doPrivileged(Native Method) >> at java.net.URLClassLoader.findClass(URLClassLoader.java:205) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:321) >> at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294) >> at java.lang.ClassLoader.loadClass(ClassLoader.java:266) >> Could not find the main class: >> org.apache.cassandra.thrift.CassandraDaemon. Program will exit." >> >> >> Any idea what is wrong? >> Thanks! >> > >
Solandra
Hello,
I have a cluster of 3 Cassandra nodes and I would like to start using Solandra.

1. How can I install Solandra and make it use the existing nodes?
2. Would it be better to install Solandra on a new node and add it to the existing cluster?
3. How does Solandra index? Does it operate automatically, or do I need to "tell" Solandra to index CF.keys every time a key is created or updated?

Thanks!
Re: Planet Cassandra (an aggregation site for Cassandra News)
Great! I hope it will be open soon! On Wed, Aug 3, 2011 at 10:33 PM, Ed Anuff wrote: > Awesome, great news! > > > On Wed, Aug 3, 2011 at 11:53 AM, Lynn Bender wrote: > >> Greetings all, >> >> I just wanted to send a note out to let everyone know about Planet >> Cassandra -- an aggregation site for Cassandra news and blogs. Andrew >> Llavore from DataStax and I built the site. >> >> We are currently waiting for approval from the Apache Software Foundation >> before we publicly launch. However, in the meantime, we'd love to hear from >> you. If you have any favorite Cassandra-related blogs, or blogs that >> frequently contain quality Cassandra content, please send us the URL, so >> that we can contact the author about including a site feed. >> >> If you have any questions or comments, please send them to >> pla...@geekaustin.org. >> >> -Lynn Bender >> >> -- >> -Lynn Bender >> http://geekaustin.org >> http://linuxagainstpoverty.org >> http://twitter.com/linearb >> http://twitter.com/geekaustin >> >> >> >> >
Re: Install Cassandra on EC2
Hi Aaron, Thanks for your reply. I've already seen that, but at the moment I'm interested in installing Cassandra from scratch - I want to learn. Well, yesterday I installed 1 node - now I'm looking at how to add more nodes and reading more about Cassandra's tools (node reaper etc.) Thanks! On Thu, Aug 4, 2011 at 1:23 AM, aaron morton wrote: > Pre build AMI here > > http://www.datastax.com/dev/blog/setting-up-a-cassandra-cluster-with-the-datastax-ami > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 4 Aug 2011, at 03:24, Jeremy Hanna wrote: > > Some quick thoughts that might be helpful: > > - use ephemeral instances and RAID0 over the local volumes for both > cassandra's data as well as the log directory. The log directory because if > you crash due to heap size, the heap dump will be stored in the log > directory. you don't want that to go in your root/OS partition. > > - probably want to stripe across AZs so that a single AZ failure doesn't > affect you as much. > > - for seeds, it's nice to use elastic ips so that your seed configuration > doesn't have to change if a node is replaced. > > - the ec2snitch makes it so each AZ appears as a rack wrt topology - > simpler as it inspects the ec2 metadata. if you need more than one DC in > your cluster (we need a second virtual DC for analytics), you'll probably > want to use the property file snitch. there's a cross region ec2snitch > that's coming in 1.0. > > would probably be good to add some ec2 specific tips in the wiki. the page > that dave mentioned is a good step-by-step, but there's been a lot of > community knowledge accumulated about best practices in the year since that > was done. > > On Aug 3, 2011, at 8:28 AM, Eldad Yamin wrote: > > Hi, > > Is there any manual or important notes I should know before I try to > install Cassandra on EC2? > > > Thanks! > > > >
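For anyone planning a from-scratch install along these lines, Jeremy's tip about striping across AZs can be sketched roughly as below. This is only a planning helper under my own assumptions: the function name, node names and zone labels are made up for illustration, and nothing here talks to EC2 or Cassandra. It just spreads nodes round-robin so that each AZ (which the ec2snitch would treat as a rack, per the note above) holds about the same number of nodes.

```python
from collections import Counter


def stripe_across_zones(node_names, zones):
    """Assign nodes to zones round-robin so zone sizes differ by at most one."""
    if not zones:
        raise ValueError("at least one zone is required")
    # Node i goes to zone i modulo the number of zones.
    return {name: zones[i % len(zones)] for i, name in enumerate(node_names)}


if __name__ == "__main__":
    # Hypothetical node names and zone labels, for illustration only.
    nodes = ["node1", "node2", "node3", "node4", "node5", "node6"]
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    placement = stripe_across_zones(nodes, zones)
    for name, zone in placement.items():
        print(name, zone)
    # Per-zone node counts, to confirm the spread is even.
    print(Counter(placement.values()))
```

With an even spread like this, losing one AZ takes out roughly 1/len(zones) of the nodes rather than an arbitrary share.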
Re: Planet Cassandra (an aggregation site for Cassandra News)
Great! If possible, please blog about full-text-search options + how to use them (Solandra, Elastic Search, Sphinx etc). Thanks! On Sun, Aug 7, 2011 at 5:58 AM, Edward Capriolo wrote: > > > On Thu, Aug 4, 2011 at 5:12 AM, Boris Yen wrote: > >> Looking forward to it. ^^ >> >> On Thu, Aug 4, 2011 at 1:56 PM, Eldad Yamin wrote: >> >>> Great! I hope it will be open soon! >>> >>> >>> On Wed, Aug 3, 2011 at 10:33 PM, Ed Anuff wrote: >>> >>>> Awesome, great news! >>>> >>>> >>>> On Wed, Aug 3, 2011 at 11:53 AM, Lynn Bender wrote: >>>> >>>>> Greetings all, >>>>> >>>>> I just wanted to send a note out to let everyone know about Planet >>>>> Cassandra -- an aggregation site for Cassandra news and blogs. Andrew >>>>> Llavore from DataStax and I built the site. >>>>> >>>>> We are currently waiting for approval from the Apache Software >>>>> Foundation before we publicly launch. However, in the meantime, we'd love >>>>> to >>>>> hear from you. If you have any favorite Cassandra-related blogs, or blogs >>>>> that frequently contain quality Cassandra content, please send us the URL, >>>>> so that we can contact the author about including a site feed. >>>>> >>>>> If you have any questions or comments, please send them to >>>>> pla...@geekaustin.org. >>>>> >>>>> -Lynn Bender >>>>> >>>>> -- >>>>> -Lynn Bender >>>>> http://geekaustin.org >>>>> http://linuxagainstpoverty.org >>>>> http://twitter.com/linearb >>>>> http://twitter.com/geekaustin >>>>> >>>>> >>>>> >>>>> >>>> >>> >> > I have started a blog to support the High Performance Cassandra Cookbook: > > http://www.jointhegrid.com/highperfcassandra/ > > I am going to use blog to continue writing about features and tips for > Cassandra in the writing style used for the book. > > Lynn, please consider it for syndication. All others, please enjoy. > >
Re: Best practices when deploying & upgrading a cassandra cluster
Is there any good reason why we shouldn't build the latest version from source? Thanks! On Fri, Aug 12, 2011 at 12:18 AM, aaron morton wrote: > In a non dev system it's a lot easier to use the packages > http://wiki.apache.org/cassandra/DebianPackaging > http://www.datastax.com/docs/0.8/install/packaged_releases > > Cheers > > - > Aaron Morton > Freelance Cassandra Developer > @aaronmorton > http://www.thelastpickle.com > > On 12 Aug 2011, at 02:30, Martin Lansler wrote: > > (Note: This is a repost from another thread which did not have a > relevant subject, sorry for the spamming) > > Hi Eldad / All, > > On Wed, Aug 10, 2011 at 8:32 AM, Eldad Yamin wrote: > > Can you please explain how did you upgraded. something like step-by-step. > > Thanks! > > > I took the liberty of replying to the group as it would be interesting > to hear how other folks out there are doing it... > > I'm *not* running a prod system, just a test system of three nodes on > my laptop. So it would be nice to hear about real setups. Here is my > test setup: > > apache-cassandra -> apache-cassandra-0.8.3 > apache-cassandra-0.8.2/ > apache-cassandra-0.8.3/ > node1/ > node2/ > node3/ > > All nodeX look like: > bin -> ../apache-cassandra/bin/ > commitlog/ > conf/ > data/ > interface -> ../apache-cassandra/interface/ > lib -> ../apache-cassandra/lib/ > saved_caches/ > > The 'conf' directory is copied into each node from the virgin > cassandra distribution. I then create a local GIT repo and add the > 'conf' directory so I can track any configuration changes on a node. > Then relevant node specific configuration settings are set. The > 'commitlog', 'data' and 'saved_caches' are created by cassandra and > must be configured in 'cassandra.yaml' for each node. > > When I upgrade I do the following: > > 1. > Make a diff of the new conf files from the new version so that get > new parameters etc... I use emacs ediff-mode. > 2.
> Remove the old "apache-cassandra" symlink and point it to the new cassandra > dist > 3. > In a rolling fashion stop one node, and then restart it... as the > symlink is changes it will then boot with the upgraded cassandra dist. > (remember to cd out & in of the bin/ dir otherwise you will still be > in the old directory). > (4). > Should something break... just re-create the old symlink and restart > the node (provided cassandra has not performed any non backwards > compatible changes to the db files, should be noted in the README) > > That's pretty much it. > > On a prod setup one would probably use a tool such as puppet > (www.puppetlabs.com/) to ease setting up on many nodes... But there > are many ways to do this, for instance pssh > (http://code.google.com/p/parallel-ssh/). > > Regards, > -Martin > > >
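The symlink switch in step 2 (and the rollback in step 4) of Martin's procedure could be scripted roughly like this. It is a sketch under my own assumptions, not Martin's actual method: he removes and re-creates the link, whereas this variant builds the new link under a temporary name and renames it over the old one, so the "apache-cassandra" link is never missing. The helper name point_symlink is made up, and the rename trick assumes a POSIX filesystem. Stopping and restarting each node is still done by hand, one node at a time.

```python
import os
import tempfile


def point_symlink(link_path, target):
    """Repoint link_path at target and return what the link now points to."""
    # Create the new link under a temporary name, then rename it over the
    # old link. On POSIX the rename replaces the link itself in one step.
    tmp_link = link_path + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    os.replace(tmp_link, link_path)
    return os.readlink(link_path)


if __name__ == "__main__":
    # Recreate the test layout from the post in a scratch directory.
    base = tempfile.mkdtemp()
    for version in ("apache-cassandra-0.8.2", "apache-cassandra-0.8.3"):
        os.makedirs(os.path.join(base, version, "bin"))
    link = os.path.join(base, "apache-cassandra")

    point_symlink(link, "apache-cassandra-0.8.2")  # starting point
    point_symlink(link, "apache-cassandra-0.8.3")  # step 2: upgrade
    print(os.readlink(link))
    point_symlink(link, "apache-cassandra-0.8.2")  # step (4): roll back
    print(os.readlink(link))
```

The relative target keeps the layout relocatable, matching the "apache-cassandra -> apache-cassandra-0.8.3" entry in the listing above.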
Re: Cassandra London: failure modes and HBase
Hi Dave, unfortunately, some very interested people and I won't be able to get all the way to London. Can you please consider using a video streaming service? I recommend using Watchitoo.com (I used to work there). At the moment it's free. Thanks! On Tue, Aug 16, 2011 at 12:47 PM, Dave Gardner wrote: > Hi all, > > I'm pleased to announce our next Cassandra meetup on 5th September in > London. > > http://www.meetup.com/Cassandra-London/events/29668191/ > > We will be looking at failure modes in Cassandra (how it deals with nodes > failing and returning etc..) as well as a comparison with HBase. It's a > great opportunity to meet other users of Cassandra, so please come along! > > > Dave >