Okay, this is going to be a pretty long post, but I think it's an interesting
data model, and hopefully someone will find it worth going through.
First, I think it will be easier to understand the modeling choices I made if
you see the end product. Go to
http://www.fold3.com/browse.php#249|hzUkL
Hmmm... If you serialize the tree properly in a partition, you could always
read an entire sub-tree as a single slice (consecutive CQL rows). Is there
much more to it?
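Jack's point — a properly serialized sub-tree reads as one consecutive run of rows — can be illustrated without Cassandra at all. A minimal pure-Python sketch of the idea; the path encoding and all names here are mine, not the original poster's:

```python
# Sketch: if each node's clustering key is its "path" from the root,
# ordering rows by path makes every sub-tree a contiguous slice.
# Pure-Python illustration only; encoding and names are hypothetical.

def path_key(path):
    # Encode a path like (1, 4, 2) as a sortable string key.
    # Fixed-width segments keep lexicographic order == tree order.
    return "".join("%08x" % p for p in path)

# A tiny tree, one "row" per node: (path, payload)
rows = sorted(
    [
        ((1,), "root"),
        ((1, 1), "child-a"),
        ((1, 1, 1), "grandchild"),
        ((1, 2), "child-b"),
        ((2,), "other-root"),
    ],
    key=lambda r: path_key(r[0]),
)

def subtree(rows, path):
    # All descendants of `path` share its key as a prefix, so they form
    # one consecutive run of rows -- the "single slice" read.
    prefix = path_key(path)
    return [r for r in rows if path_key(r[0]).startswith(prefix)]

print(subtree(rows, (1, 1)))
```

In CQL terms, the equivalent would be a slice restriction on a clustering column; the sketch only shows why prefix ordering makes that slice contiguous.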
-- Jack Krupansky
On Fri, Mar 27, 2015 at 7:35 PM, Ben Bromhead wrote:
> +1 would love to see how you do it
>
> On 27 March 201
+1 would love to see how you do it
On 27 March 2015 at 07:18, Jonathan Haddad wrote:
> I'd be interested to see that data model. I think the entire list would
> benefit!
>
> On Thu, Mar 26, 2015 at 8:16 PM Robert Wille wrote:
>
>> I have a cluster which stores tree structures. I keep several hu
Thanks Robert!
Yes, I tried what you said: clean the data and re-bootstrap. But it still
failed, once at the point of 600 GB transferred and once at 1.1 TB :(
But I could see the following exceptions from time to time:
=
java.io.IOException: net.jpountz.lz4.LZ4Exception: Error decodin
One other thing to keep in mind / check: when doing these tests locally,
the Cassandra driver will connect over the network stack, whereas Postgres
supports local connections over a Unix domain socket (this is also enabled
by default).
Unix domain sockets are significantly faster than TCP, as you
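To put rough numbers on that claim yourself, a stdlib-only micro-benchmark works — no Postgres or Cassandra needed. The helper names below are made up; absolute numbers vary wildly by machine, so this only illustrates how to measure the difference:

```python
# Round-trip a tiny message over a Unix domain socket vs. TCP loopback.
import os
import socket
import tempfile
import threading
import time

def echo_server(server_sock):
    # Accept one connection and echo everything back until EOF.
    conn, _ = server_sock.accept()
    with conn:
        while True:
            data = conn.recv(64)
            if not data:
                break
            conn.sendall(data)

def round_trips(client, n=500):
    # Average seconds per request/response round trip.
    start = time.perf_counter()
    for _ in range(n):
        client.sendall(b"x")
        client.recv(64)
    return (time.perf_counter() - start) / n

def bench(family, address):
    srv = socket.socket(family, socket.SOCK_STREAM)
    srv.bind(address)
    srv.listen(1)
    threading.Thread(target=echo_server, args=(srv,), daemon=True).start()
    cli = socket.socket(family, socket.SOCK_STREAM)
    cli.connect(srv.getsockname())
    try:
        return round_trips(cli)
    finally:
        cli.close()
        srv.close()

tcp = bench(socket.AF_INET, ("127.0.0.1", 0))
unix = bench(socket.AF_UNIX, os.path.join(tempfile.mkdtemp(), "bench.sock"))
print("tcp  %.1f us/rt" % (tcp * 1e6))
print("unix %.1f us/rt" % (unix * 1e6))
```

On loopback the gap is often small in absolute terms, which is worth knowing before attributing the whole 1-2 ms to the transport.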
Actually I am in the middle of setting up the same sort of thing for
PostgreSQL using psycopg2 and pyev.
I'll be using Cassandra and PostgreSQL in an IoT experiment as the backend
for swarms of MQTT brokers at something in the 10-100M client range.
On Fri, Mar 27, 2015 at 4:59 PM, Laing, Mich
I use callback chaining with the python driver and can confirm that it is
very fast.
You can "chain the chains" together to perform sequential processing. I do
this when retrieving "metadata" and then the referenced "payload" for
example, when the metadata has been inverted and the payload is larg
Since you're executing queries sequentially, you may want to look into
using callback chaining to avoid the cross-thread signaling that results in
the 1ms latencies. Basically, just use session.execute_async() and attach
a callback to the returned future that will execute your next query. The
cal
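The pattern looks roughly like this, sketched with stdlib futures rather than a live session — `fake_execute_async` below is a made-up stand-in for the driver's `session.execute_async()`:

```python
# Callback chaining: each callback issues the next "query" from the
# completion callback instead of waking the calling thread between
# queries, avoiding cross-thread signaling on every step.
from concurrent.futures import ThreadPoolExecutor
import threading

pool = ThreadPoolExecutor(max_workers=1)  # stands in for the driver's I/O thread
done = threading.Event()
results = []

def fake_execute_async(query):
    # Pretend to run a query; returns a future resolving to its "rows".
    return pool.submit(lambda: "rows-for:" + query)

def run_chain(queries):
    def on_result(future):
        results.append(future.result())
        if queries:
            # Issue the next query directly from the callback.
            fake_execute_async(queries.pop(0)).add_done_callback(on_result)
        else:
            done.set()  # whole chain finished; signal the caller once

    fake_execute_async(queries.pop(0)).add_done_callback(on_result)

run_chain(["q1", "q2", "q3"])
done.wait()
print(results)  # ['rows-for:q1', 'rows-for:q2', 'rows-for:q3']
```

With the real driver the shape is the same: attach the callback to the returned future and only signal the application thread when the whole chain completes.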
I think that in your example Postgres spends most of its time waiting for
fsync() to complete. On Linux, with a battery-backed RAID controller,
it's safe to mount ext4 filesystem with "barrier=0" option which
improves fsync() performance a lot. I have partitions mounted with this
option and I did a
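If you want to see what fsync() costs on your own mount before and after changing the option, a throwaway sketch like this (helper name is mine) gives a rough number:

```python
# Time a batch of small write+fsync cycles to estimate per-fsync cost.
# Caveat on the advice above: barrier=0 is only safe when the
# controller's write cache really is power-protected.
import os
import tempfile
import time

def fsync_latency(n=100):
    fd, path = tempfile.mkstemp()
    try:
        start = time.perf_counter()
        for _ in range(n):
            os.write(fd, b"x" * 128)
            os.fsync(fd)
        return (time.perf_counter() - start) / n
    finally:
        os.close(fd)
        os.unlink(path)

print("avg write+fsync: %.2f ms" % (fsync_latency() * 1000))
```

Run it against a file on the partition in question; the difference with and without barriers is usually obvious.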
Yes, I'm concerned about the latency. Throughput can be high even when
using Python: http://datastax.github.io/python-driver/performance.html.
But in my scenarios I need to run queries sequentially, so latencies
matter. And Cassandra requires issuing more queries than SQL databases
so these lat
Latency can be so variable even when testing things locally. I quickly
fired up postgres and did the following with psql:
ben=# CREATE TABLE foo(i int, j text, PRIMARY KEY(i));
CREATE TABLE
ben=# \timing
Timing is on.
ben=# INSERT INTO foo VALUES(2, 'yay');
INSERT 0 1
Time: 1.162 ms
ben=# INSERT I
hi
I have run the Cassandra source in Eclipse Juno by following this
document:
http://brianoneill.blogspot.in/2015/03/getting-started-with-cassandra.html.
But I'm getting exceptions. Please help me solve this.
INFO 17:43:40 Node localhost/127.0.0.1 state jump to normal
INFO 17:43:41 Netty u
Just to check, are you concerned about minimizing that latency or
maximizing throughput?
I'll assume that latency is what you're actually concerned about. A fair amount
of that latency is probably happening in the python driver. Although it
can easily execute ~8k operations per second (using cpython),
Yeah that's the one :) sorry, was on my phone and didn't want to look up
the exact name.
Cheers,
Thunder
On Mar 27, 2015 6:17 AM, "Brice Dutheil" wrote:
> Would it help here to not actually issue a delete statement but instead
> use date based compaction and a dynamically calculated ttl that is
Running upgrade is a no-op if the tables don't need to be upgraded. I
consider the cost of this to be less than the cost of missing an upgrade.
On Thu, Mar 26, 2015 at 4:23 PM Robert Coli wrote:
> On Wed, Mar 25, 2015 at 7:16 PM, Jonathan Haddad
> wrote:
>
>> There's no downside to running upgrad
I'd be interested to see that data model. I think the entire list would
benefit!
On Thu, Mar 26, 2015 at 8:16 PM Robert Wille wrote:
> I have a cluster which stores tree structures. I keep several hundred
> unrelated trees. The largest has about 180 million nodes, and the smallest
> has 1 node. T
On 3/26/15 10:15 PM, Robert Wille wrote:
I have a cluster which stores tree structures. I keep several hundred unrelated
trees. The largest has about 180 million nodes, and the smallest has 1 node.
The largest fanout is almost 400K. Depth is arbitrary, but in practice is
probably less than 10.
Would it help here to not actually issue a delete statement but instead use
date based compaction and a dynamically calculated ttl that is some safe
distance in the future from your key?
I’m not sure about this part, *date based compaction*. Do you mean
DateTieredCompactionStrategy?
Anyway w
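Assuming DateTieredCompactionStrategy is indeed what was meant, the suggestion might look like this in CQL (Cassandra 2.0/2.1 syntax; the table, columns, and TTL value are all hypothetical):

```sql
-- Sketch of the idea: write with a computed TTL instead of issuing
-- DELETEs, and let date-tiered compaction drop whole expired SSTables.
CREATE TABLE events (
    key text,
    ts timestamp,
    payload blob,
    PRIMARY KEY (key, ts)
) WITH compaction = { 'class': 'DateTieredCompactionStrategy' };

-- TTL computed per insert, some safe distance past the key's lifetime:
INSERT INTO events (key, ts, payload)
VALUES ('k1', '2015-03-27 12:00:00', 0xCAFE)
USING TTL 864000;  -- e.g. 10 days
```

The win is that expiry happens at compaction time with no tombstones written by the application.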
http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configCassandra_yaml_r.html?scroll=reference_ds_qfg_n1r_1k__num_tokens
So go with a default 256, and leave initial token empty:
num_tokens: 256
# initial_token:
Cassandra will always give each node the same number of t
Hi All,
We are using Cassandra version 2.1.2 with cqlsh 5.0.1 (a cluster of three
nodes with RF 2).
I need to load around 40 million records into a Cassandra table. I
have created batches of 1 million records (a batch of 1 record also gives the
same error) in CSV format. When I use the COPY command t
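For context, a COPY invocation for one of those CSV chunks would look roughly like this (keyspace, table, column, and file names are made up); for 40 million rows, sstableloader is generally the sturdier bulk-load route than cqlsh COPY:

```sql
-- Hypothetical sketch: load one CSV chunk via cqlsh.
COPY mykeyspace.mytable (id, col1, col2)
FROM 'records_part_01.csv'
WITH HEADER = TRUE;
```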
2015-03-27 11:58 GMT+01:00 Sibbald, Charles :
> Cassandra’s Vnodes config
Thank you. Yes, we are using vnodes! The num_tokens parameter controls the
number of vnodes assigned to a specific node.
It might be that I am seeing problems where there are none.
Let me rephrase my question: How does Cassandra know
I'm running Cassandra locally and I see that the execution time for the
simplest queries is 1-2 milliseconds. By a simple query I mean either
INSERT or SELECT from a small table with short keys.
While this number is not high, it's about 10-20 times slower than
PostgreSQL (even if INSERTs are w
Rob, the cluster is now upgraded to Cassandra 1.0.12 (default hd version,
in Descriptor.java), and I ensured all sstables in the current cluster were
hd version before upgrading to Cassandra 1.1. I have also checked that in
Cassandra 1.1.12 the sstable version is hf. So I guess
nodetool upgradesstables i
I would recommend you utilise Cassandra’s vnodes config and let it manage this
itself.
This means it will create and manage them all on its own, which allows
quick and easy scaling and bootstrapping.
From: Björn Hachmann
mailto:bjoern.hachm...@metrigo.de>>
Reply-To: "user@cassandra.apache
Hi,
we currently plan to add a second data center to our Cassandra cluster. I
have read about this procedure in the documentation (e.g.
https://www.datastax.com/documentation/cassandra/2.1/cassandra/operations/ops_add_dc_to_cluster_t.html),
but at least one question remains:
Do I have to provide a
Hi All,
This is possible with cassandra-driver-core-2.1.5, with
'row.getLong("sum")'.
Thanks
On Fri, Mar 27, 2015 at 2:51 PM, Amila Paranawithana
wrote:
> in Apache Cassandra Java Driver 2.1 how to read counter type values from a
> row when iterating over result set.
>
> eg: If I have a counte
Hi,
This post [1] may be useful, but note that it was done with an older
version of Cassandra, so there may be a newer way to do it.
[1].
http://amilaparanawithana.blogspot.com/2012/06/bulk-loading-external-data-to-cassandra.html
Thanks,
On Fri, Mar 27, 2015 at 11:40 AM, Rahul Bhardwaj <
rahul.bhard.
In the Apache Cassandra Java Driver 2.1, how do I read counter-type values from a
row when iterating over a result set?
For example: if I have a counter table called 'countertable' with a key and a counter
column 'sum', how can I read the value of the counter column using the Java
driver?
If I say row.getInt("sum"), this g
Hi Robert,
We're trying to do something similar to the OP and finding it a bit
difficult. Would it be possible to provide more details about how you're
doing it?
Thanks.
On Fri, Mar 27, 2015 at 3:15 AM, Robert Wille wrote:
> I have a cluster which stores tree structures. I keep several hundred