Re: Retrieve all composite columns from a row, whose composite name's first component matches from a list of Integers
I currently have

scf[c1][sc1]=value
scf[c1][sc2]=value
...
scf[c2][sc1]=value
scf[c2][sc2]=value
scf[c2][sc3]=value
scf[c2][sc4]=value

99% of the time, I do multiget super slices: for multiple keys, I query for columns explicitly (c1, c2, c10, c12).
1% of the time, I do a multigetrange superslice where, for multiple keys, I query for a range of super columns.

As Tyler said, it can be done by specifying supercolumns in the slice predicate; it will implicitly return all its columns. I use Hector and it works great.

Now interestingly enough, column names sc1, sc2, sc3 are in fact home-made composite columns. I could and would switch to full composite columns because I am fishing for every drop of performance I can. However, I would need "Letting multiget_slice accept multiple SlicePredicates per key could also accomplish this."

Can anyone on the dev team comment on doing this? Is it a no-no?

Thanks

2011/12/29 Edward Capriolo

> Hum...
>
> Do you have this?
> scf [b][1][a]=value
> scf [b][1][x]=value
> scf [b][7][b]=value
>
> and you want to slice:
> scf [b][1][*]
>
> Which would result in
>
> scf [b][1][a]=value
> scf [b][1][x]=value
>
> ?
>
> The composite version of this would be:
> cf [b][1:a]=value
> cf [b][1:x]=value
> cf [b][7:b]=value
>
> I am not sure exactly what you are doing because a SlicePredicate
> takes either a list of columns or a SliceRange. A ColumnPath takes a
> single SuperColumn.
>
> I do not see how this is done with Columns or SuperColumns. Maybe you
> can provide a code snippet and/or some sample data?
>
> On 12/29/11, Aditya wrote:
> > @Edward: Perhaps you missed that I always need to retrieve 'all
> > columns' under the supercolumn at any time, and as per my query
> > requirements, if I use composite columns instead of supercolumns then
> > it is impossible to do wildcard queries like the ones asked in this
> > thread's headline, which is much easier to do through the use of
> > supercolumns.
> >
> > On Thu, Dec 29, 2011 at 11:06 PM, Edward Capriolo wrote:
> >
> >> The use case in question was: Only accessing some columns.
> >>
> >> Even if that is not the case:
> >>
> >> SuperColumns: 1 extra level of nesting
> >> Composite Columns: Arbitrary levels of nesting
> >>
> >> SuperColumns: More overhead (space on disk) than using your own
> >> delimiter '_'
> >> SuperColumns: Likely going to be replaced in future c* version behind
> >> the scenes by composite columns anyway
> >> SuperColumns: Usually an afterthought for API developers (support for
> >> them comes "later")
> >> SuperColumns: Almost always utilized incorrectly by users; users speak
> >> of '10%' performance gains after they switch away from them.
> >>
> >> There are some (a small % of cases) where SuperColumns are a better
> >> choice, but this is rare. With composites and concatenating columns
> >> they have no great purpose any more, (bad analogy coming!) like a
> >> mechanical typewriter.
> >>
> >> On 12/29/11, Philippe wrote:
> >> > Would you stand by that statement in case all columns inside the
> >> > super column need to be read? Why?
> >> >
> >> > Thanks
> >> > On 28 Dec 2011 19:26, "Edward Capriolo" wrote:
> >> >
> >> >> Super columns have the same fundamental problem and perform worse
> >> >> in general. So switching from composites to super columns is NEVER
> >> >> a good idea.
> >> >>
> >> >>
> >> >> On Wed, Dec 28, 2011 at 1:19 PM, Aditya wrote:
> >> >>
> >> >>> Since I have around 20 items to query, I guess making 20 queries
> >> >>> to retrieve activities by all followies on all of those 20 columns
> >> >>> would be too inefficient, so to take advantage of more efficient
> >> >>> queries, are supercolumns recommended for this case? Anyway, in
> >> >>> case I use supercolumns, I need to retrieve the entire supercolumn
> >> >>> at any point of time, and I am writing subcolumn(s) to the
> >> >>> supercolumn at different times, not at once.
> >> >>>
> >> >>> On Wed, Dec 28, 2011 at 8:07 PM, Edward Capriolo wrote:
> >> >>>
> >> >>>> You need to execute one get slice operation for each item id, or
> >> >>>> if the row is not large, you can try one large get slice on the
> >> >>>> entire row and deal with the results client side.
> >> >>>>
> >> >>>> If you try method 1: when doing slices on composites, you can set
> >> >>>> the start inclusive or exclusive values to get only the column
> >> >>>> you want and not some extra columns up to slice range size.
> >> >>>>
> >> >>>>
> >> >>>> On Tuesday, December 27, 2011, Aditya wrote:
> >> >>>> > I need to store data of all activities by user's followies in a
> >> >>>> > single row. I am trying to do that making use of composite
> >> >>>> > column names in a single user specific row named 'rowX'.
> >> >>>> > On any activity by a user's followie on an item, a column is
> >> >>>> > stored in 'rowX'. The column has a composite type column name
> >> >>>> > made up of
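
The two approaches suggested in the quoted thread (one slice per item id, or one large slice filtered client side) can be illustrated with a small sketch. This is plain Python simulating a row whose composite column names are kept in sorted order; it does not use Hector or the Thrift API, and the names `slice_by_first_component`, `filter_client_side` and the sample row are hypothetical. It assumes the first component is an integer, as in the thread's subject.

```python
from bisect import bisect_left

def slice_by_first_component(columns, wanted_ids):
    """Method 1: one prefix slice per wanted id.

    columns: list of ((first, second), value) pairs, sorted by composite name.
    """
    names = [name for name, _ in columns]
    result = []
    for item_id in sorted(set(wanted_ids)):
        # Every name whose first component equals item_id sorts at or after
        # (item_id,) and strictly before (item_id + 1,).
        start = bisect_left(names, (item_id,))
        end = bisect_left(names, (item_id + 1,))
        result.extend(columns[start:end])
    return result

def filter_client_side(columns, wanted_ids):
    """Method 2: read the whole row, then keep only the matching columns."""
    wanted = set(wanted_ids)
    return [col for col in columns if col[0][0] in wanted]

# Hypothetical row: composite names (item_id, sub_name) in sorted order.
row = sorted([
    ((1, "a"), "v1"),
    ((1, "x"), "v2"),
    ((7, "b"), "v3"),
    ((12, "c"), "v4"),
])

matched = slice_by_first_component(row, [1, 12])
```

Both functions return the same columns for the sample row. The first touches only the matching ranges but needs one slice per id, which is the cost Aditya was worried about with around 20 ids; the second needs a single read but transfers the entire row.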
Dealing with "Corrupt (negative) value length encountered"
Hello,

Running a combination of 0.8.6 and 0.8.8 with RF=3, I am getting the following while repairing one node (all other nodes completed successfully). Can I just stop the instance, erase the SSTable and restart cleanup?

Thanks

ERROR [Thread-402484] 2011-12-29 14:51:03,687 AbstractCassandraDaemon.java (line 139) Fatal exception in thread Thread[Thread-402484,5,main]
java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.io.IOError: java.io.IOException: Corrupt (negative) value length encountered
        at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:154)
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:63)
        at org.apache.cassandra.net.IncomingTcpConnection.stream(IncomingTcpConnection.java:189)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:117)
Caused by: java.util.concurrent.ExecutionException: java.io.IOError: java.io.IOException: Corrupt (negative) value length encountered
        at java.util.concurrent.FutureTask$Sync.innerGet(FutureTask.java:222)
        at java.util.concurrent.FutureTask.get(FutureTask.java:83)
        at org.apache.cassandra.streaming.StreamInSession.closeIfFinished(StreamInSession.java:138)
        ... 3 more
CLI exception :: A long is exactly 8 bytes: 1
Hi Everyone,

Been a while .. without any problems. Thanks for grinding out a good product! On 1.0.6, I applied an update to a column family to add a secondary index, and now via the CLI, when I perform a "get user where something=1" I receive the following result:

org.apache.cassandra.db.marshal.MarshalException: A long is exactly 8 bytes: 1

This behaviour doesn't seem to be affecting phpcassa or hector retrieving the results of that query ... is this a silly something I've done, or something a bit more buggy with the CLI?

Thanks in advance,
-sd

--
Sasha Dolgy
sasha.do...@gmail.com
Re: CLI exception :: A long is exactly 8 bytes: 1
I think you need to mention the data type in your command. You have to run the following command first:

assume <CFName> keys as <TypeName, e.g., utf8>

Otherwise, you need to mention the type with each command, e.g., utf8('keyname').

http://wiki.apache.org/cassandra/CassandraCli

Moshiur

On Fri, Dec 30, 2011 at 10:50 AM, Sasha Dolgy wrote:

> Hi Everyone,
>
> Been a while .. without any problems. Thanks for grinding out a good
> product! On 1.0.6, I applied an update to a column family to add a
> secondary index, and now via the CLI, when I perform a "get user where
> something=1" I receive the following result:
>
> org.apache.cassandra.db.marshal.MarshalException: A long is exactly 8
> bytes: 1
>
> This behaviour doesn't seem to be affecting phpcassa or hector
> retrieving the results of that query ... is this a silly something
> i've done, or something a bit more buggy with the CLI?
>
> Thanks in advance,
> -sd
>
> --
> Sasha Dolgy
> sasha.do...@gmail.com
Re: CLI exception :: A long is exactly 8 bytes: 1
As per the wiki link you sent, I changed my query to:

get user where something = '1';

Still throws the error ... This was fine *before* I ran the update CF command ..

To Query Data
get User where age = '12';

On Fri, Dec 30, 2011 at 6:05 PM, Moshiur Rahman wrote:

> I think you need to mention data type in your command. You have to run the
> following command first:
> assume <CFName> keys as <TypeName>
>
> Otherwise, you need to mention type with each command, e.g.,
> utf8('keyname').
> http://wiki.apache.org/cassandra/CassandraCli
>
> Moshiur
>
> On Fri, Dec 30, 2011 at 10:50 AM, Sasha Dolgy wrote:
>>
>> Hi Everyone,
>>
>> Been a while .. without any problems. Thanks for grinding out a good
>> product! On 1.0.6, I applied an update to a column family to add a
>> secondary index, and now via the CLI, when I perform a "get user where
>> something=1" I receive the following result:
>>
>> org.apache.cassandra.db.marshal.MarshalException: A long is exactly 8
>> bytes: 1
>>
>> This behaviour doesn't seem to be affecting phpcassa or hector
>> retrieving the results of that query ... is this a silly something
>> i've done, or something a bit more buggy with the CLI?
>>
>> Thanks in advance,
>> -sd
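
For anyone reading the error message literally, here is a small sketch of what "a long is exactly 8 bytes" means at the byte level. It is plain Python, not Cassandra or CLI code, and it rests on two assumptions that the thread does not confirm: that the indexed column is validated as a long type, and that the trailing "1" in the message is the number of bytes received. The function names `encode_long` and `validate_long` are hypothetical.

```python
import struct

def encode_long(value):
    """Encode an integer as an 8-byte big-endian signed long."""
    return struct.pack(">q", value)

def validate_long(raw):
    """Reject anything that is not exactly 8 bytes, then decode it."""
    if len(raw) != 8:
        # Mirrors the wording of the CLI error; the number is a byte count
        # (an assumption about the real message, see above).
        raise ValueError("A long is exactly 8 bytes: %d" % len(raw))
    return struct.unpack(">q", raw)[0]

as_long = encode_long(1)   # 8 bytes
as_text = b"1"             # the character '1' is a single byte
```

Under those assumptions, a client that sends the text '1' where a long is expected would hand the validator one byte instead of eight, which matches the shape of the exception, while clients that encode the value as a long would not be affected.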
rename column family
How can I rename a column family? (If version matters, I'm interested in both 0.8.x and 1.0.x.)

Thanks,
Jim
Cassandra performance question
Hi, could anyone tell me whether this is possible with Cassandra using an appropriately sized EC2 cluster?

100,000 clients writing 50k each to their own specific row at 5 second intervals?
Re: Cassandra performance question
This might be helpful:
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

On Dec 30, 2011, at 1:59 PM, Dom Wong wrote:

> Hi, could anyone tell me whether this is possible with Cassandra using an
> appropriately sized EC2 cluster.
>
> 100,000 clients writing 50k each to their own specific row at 5 second
> intervals?
Re: Cassandra performance question
We did some benchmarking as well.

http://blog.vcider.com/2011/09/virtual-networks-can-run-cassandra-up-to-60-faster/

Although we were primarily interested in the networking issues.

CM

On Fri, Dec 30, 2011 at 12:08 PM, Jeremy Hanna wrote:

> This might be helpful:
> http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
>
> On Dec 30, 2011, at 1:59 PM, Dom Wong wrote:
>
> > Hi, could anyone tell me whether this is possible with Cassandra using
> > an appropriately sized EC2 cluster.
> >
> > 100,000 clients writing 50k each to their own specific row at 5 second
> > intervals?
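
To compare the question against published benchmark numbers, it helps to turn the workload into rates first. The sketch below is only arithmetic, under stated assumptions: "50k" is read as 50 KiB per write (the original post does not say bytes or kilobytes), and the replication factor of 3 is a hypothetical value, not something stated in this thread. The function name `write_load` is also hypothetical.

```python
def write_load(clients, payload_bytes, interval_s, replication_factor=1):
    """Return (writes per second, client bytes per second, replicated bytes per second)."""
    writes_per_s = clients / interval_s
    ingest_bytes_per_s = writes_per_s * payload_bytes
    replicated_bytes_per_s = ingest_bytes_per_s * replication_factor
    return writes_per_s, ingest_bytes_per_s, replicated_bytes_per_s

# Assumption: "50k" means 50 KiB; RF=3 is illustrative only.
writes, ingest, replicated = write_load(
    clients=100_000,
    payload_bytes=50 * 1024,
    interval_s=5,
    replication_factor=3,
)
```

With these assumptions the workload is 20,000 writes per second and roughly 1 GB per second of incoming data before replication, so the payload volume, rather than the operation count alone, is likely the figure to size the cluster and its network against.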