On Wed, Sep 19, 2012 at 5:12 PM, Michael Kjellman
wrote:
> Sounds like you are losing your system keyspace. When you say nothing
> important changed between yaml files do you mean with or without your
> changes?
>
I compared the 1.1.1 cassandra.yaml (with my changes) to the cassandra.yaml
distri
"I think that's inconsistent with the hypothesis that unclean shutdown is
the sole cause of these problems"
I agree: we never shut down any node, nor did any node crash, and yet
we have these bugs.
About your side note:
We know about it, but we couldn't find any other way to be able to prov
I wrote an answer on the blog post
(http://www.datastax.com/dev/blog/cql3_collections#comment-127093).
--
Sylvain
On Thu, Sep 20, 2012 at 7:13 AM, Roshni Rajagopal
wrote:
> Hi,
>
> CQL3 has collections support as described in this link
> http://www.datastax.com/dev/blog/cql3_collections
>
> So
On Wed, Sep 19, 2012 at 2:00 PM, Roshni Rajagopal
wrote:
> Hi,
>
> There was a conversation on this some time earlier, and to continue it
>
> Suppose I want to associate a user to an item, and I want to also store 3
> commonly used attributes without needing to go to an entity item column
> famil
Oh, I just saw your first mail.
"I don't see a negative number in you paste?"
(03a227f0-a5c3-11e1--b7f5e49dceff, 1, -1) and
(03a227f0-a5c3-11e1--b7f5e49dceff,
1, 1)
(03a227f0-a5c3-11e1--b7f5e49dceff, 4, -5000) and
(03a227f0-a5c3-11e1--b7f5e49dceff, 4, 2)
(03a227f0-a5c3-11e1-00
On Wed, Sep 19, 2012 at 3:32 PM, Brian O'Neill wrote:
> That said, I'm keeping a close watch on:
> https://issues.apache.org/jira/browse/CASSANDRA-3647
>
> But if this is CQL only, I'm not sure how much use it will be for us
> since we're coming in from different clients.
> Anyone know how/if coll
I am testing the performance of 1 Cassandra node on a production server. I
wrote a script to insert 1 million items into Cassandra. The data looks like
this:
    prefix = "benchmark_"
    dct = {}
    for i in range(0, 100):
        key = "%s%d" % (prefix, i)
        dct[key] = "abc" * 200
and the inserting
I forgot to mention that the RPC configuration in cassandra.yaml is:
rpc_timeout_in_ms: 2
The Cassandra version on the production server is 1.1.3.
The Cassandra version I am using on my MacBook is 1.0.10.
On Thu, Sep 20, 2012 at 6:07 PM, Yan Chunlu wrote:
> I am testing the performance of 1 cas
Hi, all!
We have a cluster with 7 virtual nodes (disk storage is connected to the
nodes with iSCSI). The storage schema is:
Reports:{
1:{
1:{"value1":"some val", "value2":"some val"},
2:{"value1":"some val", "value2":"some val"}
...
},
2:{
1:{"value1":"some
As I understand from the link below, burning column index-info onto the
sstable index files will not only eliminate sstables but also reduce disk
seeks from 3 to 2 for wide rows.
Our index files are always mmapped, so there is only one random seek for a
named column query. I think that is a wonder
A follow-up:
Currently I'm back on version 1.1.1.
I tried - unsuccessfully - the following things:
1. Create the missing keyspace on the 1.1.5 node, then copy the files back
into the data directory.
This failed, since the keyspace was already known on the other node in the
cluster.
2. shut down
p.s. Cassandra 1.1.4
On Thu, Sep 20, 2012 at 3:27 PM, Denis Gabaydulin wrote:
> Hi, all!
>
> We have a cluster with 7 virtual nodes (disk storage is connected to the
> nodes with iSCSI). The storage schema is:
>
> Reports:{
> 1:{
> 1:{"value1":"some val", "value2":"some val"},
> 2
Hello,
We are trying to add new nodes to our *6-node* cassandra cluster with
RF=3 cassandra version 1.0.11. We are *adding 18 new nodes* one-by-one.
The first strange thing I've noticed is that the number of completed
MigrationStage in nodetool tpstats grows for every new node, while
schema is not c
Hi, when heap usage goes above 70%, you should see many flushes in the
log, or the row cache size being reduced. Did you restart the Cassandra
daemon on the node that threw the OOM?
On Thu, Sep 20, 2012 at 9:11 PM, Vanger wrote:
> Hello,
> We are trying to add new nodes to
> As I understand from the link below, burning column index-info onto the
> sstable index files will not only eliminate sstables but also reduce disk
> seeks from 3 to 2 for wide rows.
Yes.
> Shouldn't we be wary of the spike in heap usage by promoting column indexes
> to index file?
If you're t
That's showing a client-side socket timeout. By default, the timeout for
pycassa connections is fairly low, at 0.5 seconds. With the default batch
insert size of 100 rows, you're probably hitting this timeout
occasionally. I suggest lowering the batch size and using multiple threads
for the highe
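The advice about smaller batches can be sketched roughly as below. This is only an illustration: `chunked` is a hypothetical helper, not part of pycassa, and the batch size of 20 is an arbitrary value picked for the example. The commented usage assumes a pycassa `ColumnFamily` object named `cf` connected to a running cluster.

```python
from itertools import islice


def chunked(rows, batch_size):
    """Yield successive dicts holding at most batch_size rows each.

    `rows` maps row key -> {column name: value}, the shape that a
    batch insert call would normally receive in one go.
    """
    items = iter(rows.items())
    while True:
        batch = dict(islice(items, batch_size))
        if not batch:
            return
        yield batch


# Hypothetical usage (assumes `cf` is a pycassa ColumnFamily):
#   for batch in chunked(all_rows, 20):
#       cf.batch_insert(batch)
```

Each call then carries fewer rows, so a single request is less likely to outlast the client-side socket timeout.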
While disk space is cheap, nodes are not, and systems usually have a 1 TB
limit per node. That means we would really prefer not to add more nodes
until we hit 70% disk usage, instead of the usual 50% that we have read about
due to compaction.
Is there any way to use less disk space du
1. Use compression
2. Use Leveled Compaction
Also, 1TB/node is a lot larger than the normal recommendation,
which is generally more in the 300-400GB range.
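To put rough numbers on the 50% versus 70% fill thresholds being discussed, here is a back-of-the-envelope sketch. The 6000 GB total is a made-up figure for illustration only, and the helper names are hypothetical; the 1000 GB per node matches the 1T limit mentioned in the question.

```python
def usable_gb(disk_gb, max_fill_pct):
    """Data a node may hold before reaching the fill threshold."""
    return disk_gb * max_fill_pct // 100


def nodes_needed(total_data_gb, disk_gb, max_fill_pct):
    """Smallest node count that keeps every node under the threshold."""
    per_node = usable_gb(disk_gb, max_fill_pct)
    return -(-total_data_gb // per_node)  # ceiling division


# Assumed example: 6000 GB of replicated data on 1000 GB nodes.
at_50 = nodes_needed(6000, 1000, 50)
at_70 = nodes_needed(6000, 1000, 70)
print(at_50, at_70)
```

With these assumed numbers, the 50% rule needs 12 nodes and a 70% ceiling needs 9, which is the saving the question is after.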
On Thu, Sep 20, 2012 at 8:10 PM, Hiller, Dean wrote:
> While diskspace is cheap, nodes are not that cheap, and usually systems have
> a
Hi,
I'd like to incrementally synchronize data written to Cassandra into
an external store without having to maintain an index to do this, so I
was wondering whether anybody is using the commit log to establish
what updates have taken place since a given point in time?
Cheers,
Ben
Hi Everyone,
I'm writing a conversion tool from CSV files to SSTable
using SSTableSimpleUnsortedWriter and am unable to find a good example of
using CompositeType.Builder with SSTableSimpleUnsortedWriter.
It would also be great if someone had sample code for inserting/updating only
a single value in com
This should explain the schema issue in 1.0 that has been fixed in 1.1:
http://www.datastax.com/dev/blog/the-schema-management-renaissance
On Thu, Sep 20, 2012 at 10:17 AM, Jason Wee wrote:
> Hi, when the heap is going more than 70% usage, you should be able to see
> in the log, many flushing, o
I'm not 100% sure that I understand your data model and read patterns correctly,
but it sounds like you have large supercolumns and are requesting some of
the subcolumns from individual super columns. If that's the case, the
issue is that Cassandra must deserialize the entire supercolumn in memory
when
If you're seeing that in cassandra-cli, it's possible that there are some
non-printable characters in the name that the cli doesn't display, like the
NUL char (ascii 0). I opened a ticket for that somewhere, but in the
meantime, you may want to verify that they are identical with a real client.
O
This would be a good new feature. I guess the development team hasn't
had time for this yet. ;)
On Thu, Sep 20, 2012 at 1:29 PM, Ben Hood <0x6e6...@gmail.com> wrote:
> Hi,
>
> I'd like to incrementally synchronize data written to Cassandra into
> an external store without having to maintain an ind
+1. Would be a pretty cool feature
Right now I write once to cassandra and once to kafka.
On 9/20/12 4:13 PM, "Data Craftsman 木匠"
wrote:
>This will be a good new feature. I guess the development team don't
>have time on this yet. ;)
>
>
>On Thu, Sep 20, 2012 at 1:29 PM, Ben Hood <0x6e6...@gmai
Along those lines...
We sought to use triggers for external synchronization. If you read through
this issue:
https://issues.apache.org/jira/browse/CASSANDRA-1311
You'll see the idea of leveraging a commit log for synchronization, via
triggers.
We went ahead and implemented this concept in:
> Actually, if I use community edition for now, I wouldn't be able to use
> hadoop against data stored in CFS?
AFAIK DSC is a packaged deployment of Apache Cassandra. You should be able to
use Hadoop against it, in the same way you can use hadoop against Apache
Cassandra.
You "can do" anything
Set the caching attribute for the CF. It defaults to keys_only, other values
are both or rows_only.
See http://www.datastax.com/dev/blog/caching-in-cassandra-1-1
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 19/09/2012, at 1:34 PM, Jaso
> Would it help if I partitioned the computing resources of my physical
> machines into VMs?
No.
Just like cutting a cake into smaller pieces does not mean you can eat more
without getting fat.
In the general case, with regular HDDs, 1 GbE, 8 to 16 virtual cores, and 8GB to
16GB of RAM, you can e
> Also, Cassandra is great for writes but not as optimized for reads.
As of Cassandra 1.0, read throughput is on a par with writes:
http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance
Your mileage may vary depending on the workload.
Cheers
-
Aaron Morton
Freelanc
> I want to set the replication factor = 2,
This is part of the CREATE KEYSPACE command, not sure where this is in
solandra.
I would recommend using RF 3 as a minimum.
> , and the default replications strategy to be RackAwareStrategy.
That's a very old strategy.
The default is NetworkTopolog
> I created the following model: an UserCF, whose key is a userID generated by
> TimeUUID, and a RequestCF, whose key is composite: UserUUID + timestamp. For
> each user, I will store basic data and, for each request, I will insert a lot
> of columns.
I would consider:
# User CF
* row_key: use
With Solandra you can also use the Cassandra CLI to do this. The
location would be [~/Solandra/bin/].
Regards,
Shubham
On Fri, Sep 21, 2012 at 6:56 AM, aaron morton wrote:
> I want to set the replication factor = 2,
>
> This is part of the CREATE KEYSPACE command, not sure where this
Rows are actually stored on disk in the order of the hash of their keys
when using RandomPartitioner.
Furthermore, the rows are stored in SSTables, which are immutable, and are
periodically compacted together. There's no shifting involved. This gives
an overview: http://wiki.apache.org/cassandra
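As a rough illustration of "ordered by the hash of their keys", the sketch below sorts a few row keys by an MD5-derived token. It assumes the token is the absolute value of the MD5 digest read as a signed big-endian integer, which is my understanding of how RandomPartitioner derives tokens; treat it as an approximation, not the server's actual code. The function names are hypothetical.

```python
import hashlib


def token(key: bytes) -> int:
    """Approximate RandomPartitioner-style token for a row key (assumption)."""
    digest = hashlib.md5(key).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def hash_order(keys):
    """Return keys in the order of their tokens, not their natural order."""
    return sorted(keys, key=token)


for k in hash_order([b"user1", b"user2", b"user3"]):
    print(k, token(k))
```

The printed order generally differs from the lexical order of the keys, which is why range scans over keys are not meaningful with this partitioner.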
Ended up switching the biggest offending column families back to size tiered
compaction and pending compactions across the cluster dropped to 0 very quickly.
On Sep 19, 2012, at 10:55 PM, "Michael Kjellman"
wrote:
> After changing my ss_table_size as recommended my pending compactions across
Got it. Thanks for the replies
On Fri, Sep 21, 2012 at 6:30 AM, aaron morton wrote:
> Set the caching attribute for the CF. It defaults to keys_only, other
> values are both or rows_only.
>
> See http://www.datastax.com/dev/blog/caching-in-cassandra-1-1
>
> Cheers
>
> -
> Aaron Mo