On Thu, 2014-04-10 at 11:17 -0700, motta.lrd wrote:
> What is the minimum database size and number of Operations/Second (reads and
> write) for which I should seriously consider this database?
A significant number of writes per second -> possibly a good use case for
Cassandra.
Database size is a di
137GB would fairly easily fit in core memory on a single node these days,
so it seems a very low amount for a 27-node cluster.
Off the top of my head: would 99th-percentile latency be improved by using
replication factor 5, assuming you are doing quorum operations?
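One way to sanity-check that: with quorum operations the coordinator
effectively waits for the quorum-th fastest replica, and the 3rd-of-5 order
statistic has a thinner tail than the 2nd-of-3. A minimal simulation sketch
(the per-replica latency distribution is hypothetical, and it ignores the
extra write load RF=5 adds):

import random

def p99_of_quorum(rf, trials=100_000):
    quorum = rf // 2 + 1
    waits = []
    for _ in range(trials):
        # Hypothetical per-replica latency: ~1ms exponential, with a 2%
        # chance of a 50ms hiccup (GC pause, slow disk, ...).
        lats = [random.expovariate(1.0) + (50.0 if random.random() < 0.02 else 0.0)
                for _ in range(rf)]
        waits.append(sorted(lats)[quorum - 1])  # wait for the quorum-th reply
    waits.sort()
    return waits[int(trials * 0.99)]

print(p99_of_quorum(3), p99_of_quorum(5))  # RF=5 should show the lower p99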
Sent from my phone
On 1 Mar 2
I have found in (limited) practice that it's fairly hard to estimate
due to compression and compaction behaviour. I think measuring and
extrapolating (with an understanding of the data structures) is the most
effective approach.
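In that spirit, a minimal extrapolation sketch: load a representative
sample, read the real on-disk size afterwards (e.g. "Space used" from
nodetool cfstats), and scale linearly. The headroom factor for in-flight
compaction below is a rule-of-thumb assumption, not a measurement:

def estimate_disk_usage(sample_rows, sample_disk_bytes, target_rows,
                        replication_factor=3, compaction_headroom=2.0):
    # sample_disk_bytes should be the compressed, post-compaction size
    # actually observed on disk, so compression ratios are baked in.
    bytes_per_row = sample_disk_bytes / sample_rows
    return int(bytes_per_row * target_rows
               * replication_factor * compaction_headroom)

# e.g. a 1M-row sample occupying 2 GiB on disk, extrapolated to 500M rows:
print(estimate_disk_usage(1_000_000, 2 * 1024**3, 500_000_000) / 1024**4, "TiB")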
Tim
Sent from my phone
On 6 Dec 2013 20:54, "John Sanda" wrote:
> I hav
On Wed, 2013-08-21 at 10:42 -0700, Robert Coli wrote:
> On Wed, Aug 21, 2013 at 3:58 AM, Tim Wintle wrote:
>
> > What would the best way to achieve this? (We can tolerate a fairly short
> > period of downtime).
> >
>
> I think this would work, but may require a
Hi,
Suppose we have two networks:
10.1.0.0/16 and 10.2.0.0/16.
It is not possible to route packets between the two networks, but all
nodes have interfaces on both networks, so any node can communicate with
any address on either network.
We are currently running all our nodes on one network, but
I might be missing something, but if it is all on one machine, then why use
Cassandra or Hadoop?
Sent from my phone
On 13 Jul 2013 01:16, "Martin Arrowsmith" wrote:
> Dear Cassandra experts,
>
> I have an HP Proliant ML350 G8 server, and I want to put virtual
> servers on it. I would like to put
On Mon, 2013-06-03 at 17:20 -0700, Aiman Parvaiz wrote:
> @Faraaz check out the comment by Aaron Morton here:
> http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/Seed-Nodes-td6077958.html
> Having the same seed nodes is a good idea, but it is not necessary.
> > In your case, sure the nodes w
Hi,
I've tried searching for this all over the place, but I can't find an
answer anywhere...
What is the (theoretical) time complexity of basic C* operations?
I assume that single lookups are O(log(R/N)) for R rows across N nodes
(as SSTable lookups should be O(log(n)) and there are R/N rows per node).
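As a back-of-envelope check of that assumption (the SSTable count actually
probed per read is the big unknown in practice, since bloom filters skip
most of them, so sstables_probed below is a hypothetical parameter):

import math

def lookup_comparisons(total_rows, nodes, sstables_probed=1):
    # O(S * log(R/N)): each probed SSTable costs ~log2 of the rows it
    # holds, assuming rows are spread evenly over the N nodes.
    rows_per_node = total_rows / nodes
    return sstables_probed * math.log2(rows_per_node)

print(lookup_comparisons(10**9, 27))                     # ~25 comparisons
print(lookup_comparisons(10**9, 27, sstables_probed=4))  # ~100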
On Tue, 2013-02-05 at 13:51 -0500, Edward Capriolo wrote:
> Without stating the obvious, if you are interested in scale, then why
> pick Python?
I would (kind of) agree with this point.
If you absolutely need performance here, then Python isn't the right
choice.
If, however, you are currently w
On Tue, 2013-02-05 at 21:38 +1300, aaron morton wrote:
> The first thing I noticed is your script uses python threading library, which
> is hampered by the Global Interpreter Lock
> http://docs.python.org/2/library/threading.html
>
> You don't really have multiple threads running in parallel, tr
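For anyone hitting the same wall: the usual workaround for CPU-bound load
generators is the multiprocessing module, which sidesteps the GIL by running
separate interpreter processes. A minimal sketch (the worker body is
hypothetical; a real one would open its own Cassandra connection per
process, since connections don't survive fork()):

from multiprocessing import Pool

def insert_batch(batch):
    # Hypothetical worker: a real version would connect to Cassandra here
    # and write the batch. Counting rows keeps the sketch runnable as-is.
    return len(batch)

if __name__ == "__main__":
    batches = [list(range(i, i + 1000)) for i in range(0, 10000, 1000)]
    with Pool(processes=4) as pool:  # one interpreter per process: no shared GIL
        print(sum(pool.map(insert_batch, batches)))  # -> 10000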
On Fri, 2012-10-12 at 10:20 +0000, Viktor Jevdokimov wrote:
> IMO, in most cases you'll be limited by the RAM first.
+1 - I've seen our 8-core boxes limited by RAM and inter-rack
networking, but not by CPU (yet).
Tim
es -
assuming the number of categories is significantly smaller than the
number of documents, that could make a major difference to latency.
Tim
>
> Regards,
> Clément
>
> 2012/9/28 Tim Wintle
>
> > On Fri, 2012-09-28 at 18:20 +0200, Clement Honore wrote:
> > > Hi,
On Fri, 2012-09-28 at 18:53 +, Xu, Zaili wrote:
> Hi,
>
> I have an existing Cassandra Cluster. I removed a node from the cluster. Then
> I decommissioned the removed node, stopped it, updated its config so that it
> only has itself as the seed and in the cassandra-topology.properties file,
On Fri, 2012-09-28 at 18:20 +0200, Clement Honore wrote:
> Hi,
>
>
> I have hierarchical data.
>
> I'm storing them in a CF with a row key somewhat like (category, doc id), and
> plenty of columns for a doc definition.
>
>
> I have hierarchical data traversal too.
>
> -
> Aaron Morton
> Freelance Developer
> @aaronmorton
> http://www.thelastpickle.com
>
> On 3/09/2012, at 8:20 PM, Tim Wintle wrote:
>
> > On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> > > Sorry I don't understand your q
On Tue, 2012-08-28 at 16:57 +1200, aaron morton wrote:
> Sorry I don't understand your question.
>
> Can you explain it a bit more or maybe someone else knows.
I believe the question is why the maximum is 2**127 and not
0xffffffffffffffffffffffffffffffff (i.e. 2**128 - 1).
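For context, my reading of why 2**127 is the ceiling: RandomPartitioner
takes the 128-bit MD5 digest as a *signed* big integer and uses its absolute
value, which cannot exceed 2**127. Roughly, in Python:

import hashlib

def random_partitioner_token(key):
    digest = hashlib.md5(key).digest()
    # A signed 128-bit value lies in [-2**127, 2**127 - 1], so abs()
    # tops out at 2**127 rather than 2**128 - 1.
    return abs(int.from_bytes(digest, "big", signed=True))

assert random_partitioner_token(b"some row key") <= 2**127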
Tim
>
> Cheers
>
> -
> Aaron Morton
>
Data layer into parts that are stateless and parts
which aren't, then you can load balance the horizontally scalable parts
of that layer using something like haproxy too if you need to.
Tim Wintle
> Tamar Fraenkel
> Senior Software Engineer, TOK Media
>
>
> ta...@tok-media.com
> Tel: +972 2 6409736
> Mob: +972 54 8356490
> Fax: +972 2 5612956
>
>
> On Mon, Jul 30, 2012 at 3:14 PM, Tim Wintle
> wrote:
> On Mon, 2012-07
On Mon, 2012-07-30 at 14:40 +0300, Tamar Fraenkel wrote:
> Hi!
> To clarify it a bit more, let's assume the setup is changed to:
> RF=3
> W_CL=QUORUM (or TWO, for that matter)
> R_CL=ONE
> The setup will now work for both reads and writes in the case of one node
> failure.
> What are the disadvantages,
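The standard way to reason about that trade-off is the overlap rule: a read
is guaranteed to see the latest acknowledged write only when R + W > RF.
Expressed as replica counts:

def read_sees_latest_write(rf, w, r):
    # Read and write replica sets must intersect: R + W > RF.
    return r + w > rf

quorum = 3 // 2 + 1  # QUORUM for RF=3 is 2
print(read_sees_latest_write(3, quorum, 1))       # False: W=QUORUM, R=ONE
print(read_sees_latest_write(3, quorum, quorum))  # True:  QUORUM/QUORUM

So the main disadvantage is exactly that gap: with W_CL=QUORUM and R_CL=ONE,
a read can land on the one replica the write has not reached yet and return
stale data.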
Would it be possible to support this in a more general case by providing a
distributed |= operator over arbitrary byte strings (like the + operator on
counter columns)? That would allow distributed Bloom filters as well.
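Sketching the guarantee such an operator would need: a bitwise OR over byte
strings is commutative, associative, and idempotent, so replicas could apply
merges in any order and still converge, the same property that makes the
counter + distributable. A hypothetical merge function:

def merge_or(a, b):
    # Pad the shorter value with zero bytes, then OR bytewise.
    if len(a) < len(b):
        a, b = b, a
    b = b.ljust(len(a), b"\x00")
    return bytes(x | y for x, y in zip(a, b))

# Order and duplication don't matter -- exactly what replica-side
# merging needs:
assert merge_or(b"\x01\x02", b"\x02\x01") == b"\x03\x03"
assert merge_or(b"\x03\x03", b"\x01\x02") == b"\x03\x03"  # replaying is harmless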
Tim Wintle
On Fri, Jun 29, 2012 at 6:31 AM, Chris Burroughs
wrote:
> W
entire
dataset in the single-node cluster, or has it been lost along the way?
What is the replication factor for your data?
Tim Wintle
On Tue, 2012-05-01 at 11:00 -0700, Aaron Turner wrote:
> Tens or a few hundred MB per row seems reasonable. You could do
> thousands of MB if you wanted to, but that can make things harder to
> manage.
Thanks (both Aarons).
> Depending on the size of your data, you may find that the overhead of
> ea
I believe that the general design for time-series schemas looks
something like this (correct me if I'm wrong):
(storing time series for X dimensions for Y different users)
Row Keys: "{USER_ID}_{TIMESTAMP/BUCKETSIZE}"
Columns: "{DIMENSION_ID}_{TIMESTAMP%BUCKETSIZE}" -> {Counter}
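That layout as a small sketch (the bucket size is a hypothetical choice;
in practice it's picked so one row stays within a manageable column count):

BUCKET_SIZE = 3600  # seconds per row bucket; tune to target row width

def row_key(user_id, ts):
    return "%s_%d" % (user_id, ts // BUCKET_SIZE)  # one row per user per bucket

def column_name(dimension_id, ts):
    return "%s_%d" % (dimension_id, ts % BUCKET_SIZE)  # offset inside the bucket

print(row_key("u42", 1351000000), column_name("cpu", 1351000000))
# -> u42_375277 cpu_2800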
But I've not fou