I agree, producing a Delphi generator would be preferable, and may be the end
result in any case. There are some issues with this, such as the fact that the
compiled generator program apparently does not run on Windows (is there a
prebuilt version of this that can be downloaded?). Not having generics
Peter,
Do you think 0-padding the entries would be more efficient than just
implementing your own comparator?
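
A minimal sketch of the two options being compared, assuming numeric values are stored as strings and compared lexically. The helper name pad_key and the width of 10 digits are illustrative choices, not anything from the thread.

```python
def pad_key(n: int, width: int = 10) -> str:
    """Return n as a fixed-width, zero-padded decimal string."""
    text = str(n)
    if n < 0 or len(text) > width:
        raise ValueError(f"{n} does not fit in {width} digits")
    return text.zfill(width)


counts = [9, 100, 25, 3]

# Unpadded strings sort lexically, which is not numeric order.
plain = sorted(str(n) for n in counts)

# Zero-padded strings sort lexically in the same order as the numbers.
padded = sorted(pad_key(n) for n in counts)

# The alternative: keep the raw strings and use a numeric comparator.
by_comparator = sorted((str(n) for n in counts), key=int)

print(plain)
print(padded)
print(by_comparator)
```

Padding moves the cost to write time and fixes a maximum width up front; a comparator keeps the stored values compact but has to run wherever the sort happens.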
On Wed, Mar 24, 2010 at 10:57 PM, Peter Chang wrote:
> If there's not much overhead, I recommend client side as well.
> Otherwise, you can only sort on column. Therefore, you could creat
Hi Chris,
So, if I get it right, you suggest that I pull all the columns in a
single row and do the sorting client side?
The user-friends-messages was just an example and maybe not the best I could
come up with, because I agree that there are not too many friends in general
that send you messages
I am not clear on how this works when I want to increase the count of
user-1.
Thanks
Erez
On Thu, Mar 25, 2010 at 12:57 AM, Peter Chang wrote:
> If there's not much overhead, I recommend client side as well.
>
> Otherwise, you can only sort on column. Therefore, you could create some
> sort of
Hi,
I wondered if you were alluding to something more complex. You'd probably
want to create an index using something along the lines that Peter suggested.
:)
But I'm a Cassandra / Column DB newbie, so my experience ends just about ...
here. :)
Cheers,
Chris
On 25 March 2010 08:59, Erez Efrati
You are correct Chris.
I am a newbie too in this field.
I like the Cassandra/NoSQL way and I am trying to see if it can fit my
model.
Thanks,
Erez
On Thu, Mar 25, 2010 at 11:03 AM, Christopher Brind <
christopher.br...@googlemail.com> wrote:
> Hi,
>
> I wondered if you were alluding to something
Hi there,
Thanks to all for the replies. But I still have a question:
1. When I use Twitter via Tweetie, which is an iPhone application, I can see
the unique ID for each user on their personal profile page. It seems
like an incremental number. As far as I know, Twitter uses Cassandra for its
back-end.
Hi everyone,
We're trying to implement a virtual datastore for our users where they can
set up "tables" and "indexes" to store objects and have them indexed on
arbitrary properties. And we did a test implementation for Cassandra in the
following way:
Objects are stored in one columnfamily, each k
I don't know if that could play any role, but if you have disabled the
assertions when running Cassandra (that is, you removed the -ea line in
cassandra.in.sh), there was a bug in 0.6beta2 that makes reads of rows
with lots of columns quite slow.
Another problem you may have is if you have
The FAQ page makes mention of using separate disks for the commit log and
data directory. How would one go about achieving this in a cloud deployment
such as Rackspace cloud servers or EC2 EBS? Or is it just preferred to use
dedicated hardware to get the optimal performance?
Thanks In Advance!
Be
On 03/25/2010 11:10 AM, Mark Greene wrote:
The FAQ page makes mention of using separate disks for the commit log
and data directory. How would one go about achieving this in a cloud
deployment such as Rackspace cloud servers or EC2 EBS? Or is it
just preferred to use dedicated hardware to get t
Hi All:
I am thinking about a more precise query in Cassandra.
Could we have a query API like this:
List> get_slice_condition(String keyspace, List
keys, ColumnParent column_parent, Map
queryConditions, int consistency_level)
So we could use this API to query more precise data like age column's valu
On 03/25/2010 11:18 AM, Ethan Rowe wrote:
[snip]
I'll defer to the Rackspace folks regarding Rackspace Cloud; it has
been I/O on average since you're dealing with a real, local disk. But
I don't know about getting a second disk in that environment, though.
That should have said "better I/O o
If you have enough data or insert volume that you can reasonably use
dedicated hardware, you should probably use that.
(http://spyced.blogspot.com/2010/03/why-your-data-may-not-belong-in-cloud.html)
If you don't, then having CL + data on the same volume isn't going to
hurt nearly as much as sharin
I wanted to check my understanding of the load balance operation. Let's say I
have 5 nodes, each of them has been assigned at startup 1/5 of the ring, and
the load is equal across them (say using random partitioner). The load on the
cluster gets high, so I add a sixth server. During bootstrap, t
Hi All,
I am designing an application where I need to store data as
key-value pair without the present need to use column/super-column family
stuff.
Does my use case fit Cassandra? My traffic will be 70-80% read traffic. The
latency requirement is 100ms.
Thanks
Anurag
The advantage to doing it the way Cassandra does is that you can keep
keys sorted with OrderPreservingPartitioner for range scans. grabbing
one token of many from each node in the ring would prohibit that.
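
A small sketch of why key order matters for range scans. It assumes a hash-based placement that uses MD5 tokens (as RandomPartitioner does); the key names and the helper functions md5_token and positions are made up for illustration, and plain Python lists stand in for the ring.

```python
import hashlib


def md5_token(key: str) -> int:
    """Map a key to an integer token by hashing it with MD5."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


def positions(ordered_keys, wanted):
    """Return the ring positions at which the wanted keys are found."""
    return [i for i, k in enumerate(ordered_keys) if k in wanted]


keys = [f"user-{i:03d}" for i in range(8)]
wanted = {"user-002", "user-003", "user-004"}

# Order-preserving placement: keys sit on the ring in key order, so a
# key range maps to one contiguous stretch of the ring.
order_preserving = sorted(keys)

# Hash-based placement: keys sit on the ring in token order, so the same
# key range is generally scattered around the ring.
hashed = sorted(keys, key=md5_token)

print(positions(order_preserving, wanted))
print(positions(hashed, wanted))
```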
So we rely on active load balancing to get to a "good enough" balance,
say within 50%. It
Cassandra gives you a superset of simple key/value, so why not?
reddit is using Cassandra like this, fwiw.
On Thu, Mar 25, 2010 at 10:55 AM, Anurag Gujral wrote:
> Hi All,
> I am designing an application where I need to store data as
> key-value pair without the present need to use c
Cassandra is not being used to generate the Twitter identifiers.
Twitter, like most places using Cassandra, has more than one database
system in production.
UUIDs are not at risk of conflicts with billions of rows.
b
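
To put a rough number on this, here is a sketch using the standard birthday approximation. It assumes randomly generated (version 4) UUIDs with 122 random bits; the thread does not say which UUID version is meant, and the function name collision_probability is illustrative.

```python
import math

RANDOM_BITS = 122  # a version 4 UUID has 122 random bits; 6 bits are fixed


def collision_probability(n: int, bits: int = RANDOM_BITS) -> float:
    """Approximate the chance of any collision among n random IDs.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** bits
    x = n * (n - 1) / (2 * space)
    # expm1 keeps precision when x is tiny, where 1 - exp(-x) would round to 0.
    return -math.expm1(-x)


for n in (10 ** 6, 10 ** 9, 10 ** 12):
    print(f"{n:>16,d} rows -> p ~= {collision_probability(n):.3e}")
```

Under these assumptions the probability for a billion rows is on the order of 1e-19.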
On Thu, Mar 25, 2010 at 5:57 AM, Jaepil Jeong wrote:
> Hi there,
> Thanks to
On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote:
> I don't know If that could play any role, but if ever you have
> disabled the assertions
> when running cassandra (that is, you removed the -ea line in
> cassandra.in.sh), there
> was a bug in 0.6beta2 that will make read in row with lots o
I noticed you turned key caching off in your ColumnFamily declaration.
Have you tried experimenting with this on and playing with the key caching
configuration? Also, have you looked at the JMX output for what
commands are pending execution? That is always helpful to me in
hunting down bottlenecks.
-Nate
On Thu, Mar 25, 2010 at 10:56 AM, Jonathan Ellis wrote:
> The advantage to doing it the way Cassandra does is that you can keep
> keys sorted with OrderPreservingPartitioner for range scans. grabbing
> one token of many from each node in the ring would prohibit that.
>
> So we rely on active load
Do you mean on the client? It really depends on how many items you're
sorting. In terms of computer runtime, client-side will likely always be
faster, but if you take bandwidth into account, having a pre-sorted
list will be better for large lists.
Creating 0-padded numbers is pretty straightf
On Thu, Mar 25, 2010 at 5:31 PM, Henrik Schröder wrote:
> On Thu, Mar 25, 2010 at 15:17, Sylvain Lebresne wrote:
>>
>> I don't know If that could play any role, but if ever you have
>> disabled the assertions
>> when running cassandra (that is, you removed the -ea line in
>> cassandra.in.sh), the
On Thu, Mar 25, 2010 at 11:40 AM, Jeremy Dunck wrote:
> On Thu, Mar 25, 2010 at 10:56 AM, Jonathan Ellis wrote:
>> The advantage to doing it the way Cassandra does is that you can keep
>> keys sorted with OrderPreservingPartitioner for range scans. grabbing
>> one token of many from each node in
Erez,
To make this work you have to make your model fit Cassandra, not the
other way around. As a rule, you either do complex queries via client
code to process the results of several, simpler queries or via a CF
you create to act as an index. Yes, this means you have to write data
to each index
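
A rough sketch of the "CF you create to act as an index" idea, using plain Python dicts as stand-ins for column families. Nothing here is Cassandra client API; the "city" column and the names put_user and users_in_city are hypothetical.

```python
from collections import defaultdict

# Stand-in for a data CF: row key -> columns.
users = {}

# Stand-in for an index CF: indexed value -> {row key: ""}.
users_by_city = defaultdict(dict)


def put_user(key, columns):
    """Write a user row and keep the city index in step with it."""
    old = users.get(key)
    if old is not None and old["city"] != columns["city"]:
        # The indexed value changed, so drop the stale index entry.
        users_by_city[old["city"]].pop(key, None)
    users[key] = dict(columns)
    users_by_city[columns["city"]][key] = ""


def users_in_city(city):
    """Answer the query from the index, then fetch each matching row."""
    return [users[k] for k in sorted(users_by_city.get(city, {}))]


put_user("u1", {"name": "a", "city": "Oslo"})
put_user("u2", {"name": "b", "city": "Oslo"})
put_user("u1", {"name": "a", "city": "Rome"})

print([u["name"] for u in users_in_city("Oslo")])
print([u["name"] for u in users_in_city("Rome")])
```

The point of the sketch is that every write touches both the data and each index built over it, and the application is responsible for removing stale index entries.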
On Thu, Mar 25, 2010 at 9:20 AM, Benjamin Black wrote:
> Cassandra is not being used to generate the Twitter identifiers.
> Twitter, like most places using Cassandra, has more than one database
> system in production.
>
> UUIDs are not at risk of conflicts with billions of rows.
Exactly: UUIDs we
On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis wrote:
> The advantage to doing it the way Cassandra does is that you can keep
> keys sorted with OrderPreservingPartitioner for range scans. grabbing
> one token of many from each node in the ring would prohibit that.
>
> So we rely on active load
Hi all,
I have a question about load-balancing.
I have easily built a cluster with two nodes, but I am wondering how my
client should connect to this cluster.
- Run queries against one node (but all data will transit through this node
and this way creates a SPOF)
- Run queries against an external
On Thu, Mar 25, 2010 at 1:17 PM, Mike Malone wrote:
> On Thu, Mar 25, 2010 at 9:56 AM, Jonathan Ellis wrote:
>>
>> The advantage to doing it the way Cassandra does is that you can keep
>> keys sorted with OrderPreservingPartitioner for range scans. grabbing
>> one token of many from each node in
On Thu, Mar 25, 2010 at 1:20 PM, Y Aw wrote:
> Hi all,
> I have a question about load-balancing.
http://wiki.apache.org/cassandra/FAQ#node_clients_connect_to
Does that help?
Commented on the Jira issue.
Curious how badly out of date that patch is now. :)
On Wed, Mar 24, 2010 at 12:55 PM, Ran Tavory wrote:
> I'm willing to give it a try.
> Where do I start, except for applying the patch in the bug?
>
> On Wed, Mar 24, 2010 at 2:30 PM, Jonathan Ellis wrote:
>>
>> Cur
On Thu, Mar 25, 2010 at 1:26 PM, Jonathan Ellis wrote:
> Pretty much everything assumes that there is a 1:1 correspondence
> between IP and Token. It's probably in the ballpark of "one month to
> code, two to get the bugs out." Gossip is one of the trickier parts
> of our code base, and this wou
I was originally using 0.5.0 but I've reproduced the behavior with
0.5.1 and 0.6.0-beta3.
On Wed, Mar 24, 2010 at 3:00 PM, Jonathan Ellis wrote:
> Are you using 0.5.0? Because this sounds like a bug that was fixed in 0.5.1.
>
> On Mon, Mar 22, 2010 at 5:13 PM, Bob Florian wrote:
>> I'm new to
Can you create a ticket with a test case?
On Thu, Mar 25, 2010 at 3:39 PM, Bob Florian wrote:
> I was originally using 0.5.0 but I've reproduced the behavior with
> 0.5.1 and 0.6.0-beta3.
>
>
> On Wed, Mar 24, 2010 at 3:00 PM, Jonathan Ellis wrote:
>> Are you using 0.5.0? Because this sounds li
One problem is if the heaviest node is next to a node that is
lighter than average, instead of heavier. Then if the new node takes
extra from the heaviest, say 75% instead of just 1/2, and then we take
1/2 of the heaviest's neighbor and put it on the heaviest, you made
that lighter-than-average
As promised, here is the official invite to register for the hackathon in
SF. The event starts at 6:30pm on April 22nd.
http://cassandrahackathon.eventbrite.com/
--
Chris Goffinet
no compaction.
Jonathan Ellis wrote:
did you check jmx to see if a compaction is going on?
On Mon, Mar 22, 2010 at 5:14 PM, Todd Burruss wrote:
after running my cluster for a while performance has become unacceptable,
200+ ms for reads. if running well, i see reads <10ms. when i run iost
Sure thing. Here it is:
https://issues.apache.org/jira/browse/CASSANDRA-920
On Thu, Mar 25, 2010 at 4:44 PM, Jonathan Ellis wrote:
> Can you create a ticket with a test case?
>
> On Thu, Mar 25, 2010 at 3:39 PM, Bob Florian wrote:
>> I was originally using 0.5.0 but I've reproduced the behavior
On Thu, Mar 25, 2010 at 5:57 AM, Jaepil Jeong wrote:
> Hi there,
> Thanks to all for the replies. But I still have a question:
> 1. When I use Twitter via Tweetie, which is an iPhone application, I can see
> the unique ID for each user on their personal profile page. It seems
> like an incremental number
I agree it's only a problem with 'small' clusters - but it seems like 'small'
is 'most users'? Even with 10 nodes it looks like a pretty big imbalance if I
add an 11th node, and don't add the other 9 or move a large part of the ring.
Or in practice have folks not had trouble with incremental sca
It is much more likely that you always increase your cluster in size by a
certain large percentage. With a 10 node cluster, you are likely to add 5 nodes
at a time, and with a 100 node cluster you'll probably add 25 to 50 per batch.
-Original Message-
From: "Daniel Kluesing"
Sent: Thur
On Thu, Mar 25, 2010 at 8:33 AM, Henrik Schröder wrote:
> Hi everyone,
>
> We're trying to implement a virtual datastore for our users where they can
> set up "tables" and "indexes" to store objects and have them indexed on
> arbitrary properties. And we did a test implementation for Cassandra in
Cassandra mmaps your data files which show up as RES and SHR. This is normal.
c0d1p1 is completely maxed out. Assuming that is your data disk and
not your commitlog one, you need to tell Cassandra to cache more rows
(or keys, depending).
If you are maxing out your caches and still seeing this t