Any thoughts?
Thanks.
-Wei
From: Wei Zhu
To: Cassandra usergroup
Sent: Wednesday, November 7, 2012 12:47 PM
Subject: composite column validation_class question
Hi All,
I am trying to design my schema using composite columns. One thing I am a bit
confused
Hi All,
I am doing a benchmark on Cassandra. I have a three node cluster with RF=3. I
generated 6M rows with sequence numbers from 1 to 6M, so the rows should be
evenly distributed among the three nodes, disregarding the replicas.
I am running the benchmark with read-only requests; I generate re
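To illustrate, a minimal sketch of one way such a read-only load could be generated
with Hector (the cluster, keyspace, and column family names here are hypothetical,
and keys are drawn uniformly from the 1..6,000,000 range described above; this is
only an assumed load generator, not the benchmark actually used):

import java.util.Random;

import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.ColumnQuery;

public class ReadBench {
    public static void main(String[] args) {
        // Hypothetical cluster/keyspace/CF names; three-node cluster as above.
        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster",
                new CassandraHostConfigurator("node1:9160,node2:9160,node3:9160"));
        Keyspace ks = HFactory.createKeyspace("BenchKS", cluster);
        Random rnd = new Random();

        int reads = 100000;
        long start = System.currentTimeMillis();
        for (int i = 0; i < reads; i++) {
            // Pick one of the 6M sequentially numbered rows at random.
            String key = String.valueOf(1 + rnd.nextInt(6000000));
            ColumnQuery<String, String, String> q = HFactory.createStringColumnQuery(ks);
            q.setColumnFamily("BenchCF").setKey(key).setName("payload");
            HColumn<String, String> col = q.execute().get();  // null if the column is missing
        }
        long elapsed = System.currentTimeMillis() - start;
        System.out.println(reads + " reads in " + elapsed + " ms");

        HFactory.shutdownCluster(cluster);
    }
}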
For those looking to index data in Cassandra with Elastic Search, here
is what we decided to do:
http://brianoneill.blogspot.com/2012/11/big-data-quadfecta-cassandra-storm.html
-brian
--
Brian ONeill
Lead Architect, Health Market Science (http://healthmarketscience.com)
mobile:215.588.6024
blog:
It is not as bad with hector, but still each Keyspace object is
another socket open to Cassandra. If you have 500 webservers and 10
keyspaces, instead of having 500 connections you now have 5000.
On Thu, Nov 8, 2012 at 6:35 PM, sankalp kohli wrote:
> I think this code is from the thrift part. I
I did overlook something. get_range_slice will invoke cfs.getRawCachedRow
instead of cfs.getThroughCache. Hence, no row will be cached if it's not
already present in the row cache. Well, this puzzles me further as to how the
range of rows is expected to get stored in the row cache in the first
place.
If I have multiple clusters, can I replicate a keyspace from each of those
clusters to a separate cluster?
Pierre Chalamet chalamet.net> writes:
>
> Hi, you do not need to have 700 GB of data in RAM. Cassandra is able to store
> data on disk and query it from there if it is not cached in memory. Caches are
> maintained by C* itself, but you still have to do some configuration. Supposing
> you want to store around
I think this code is from the thrift part. I use hector. In hector, I can
create multiple keyspace objects for each keyspace and use them when I want
to talk to that keyspace. Why would it need to do a round trip to the server
for each switch?
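For reference, a minimal sketch of the Hector usage being described (cluster,
keyspace, and column family names are hypothetical); the Keyspace objects are
created client-side from a single Cluster, which is why the extra round trip is
not obvious from the API:

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.cassandra.service.CassandraHostConfigurator;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.mutation.Mutator;

public class TwoKeyspaces {
    public static void main(String[] args) {
        Cluster cluster = HFactory.getOrCreateCluster("Test Cluster",
                new CassandraHostConfigurator("127.0.0.1:9160"));

        // One Keyspace object per keyspace, both backed by the same cluster.
        Keyspace ks1 = HFactory.createKeyspace("Keyspace1", cluster);
        Keyspace ks2 = HFactory.createKeyspace("Keyspace2", cluster);

        // Writes simply go through whichever Keyspace object you pick.
        Mutator<String> m1 = HFactory.createMutator(ks1, StringSerializer.get());
        m1.insert("row1", "Standard1", HFactory.createStringColumn("name", "value"));

        Mutator<String> m2 = HFactory.createMutator(ks2, StringSerializer.get());
        m2.insert("row1", "Standard1", HFactory.createStringColumn("name", "value"));

        HFactory.shutdownCluster(cluster);
    }
}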
On Thu, Nov 8, 2012 at 3:28 PM, Edward Capriolo wrote:
In the old days the API looked like this.
client.insert("Keyspace1",
key_user_id,
new ColumnPath("Standard1", null, "name".getBytes("UTF-8")),
"Chris Goffinet".getBytes("UTF-8"),
timestamp,
ConsistencyLevel.ONE);
bu
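For contrast, a rough sketch of how the same insert looks against the current
Thrift API (the key and column values here are placeholders), where the keyspace
is per-connection state and selecting it is its own set_keyspace RPC, which is
the round trip being discussed:

import java.nio.ByteBuffer;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;

public class SetKeyspaceExample {
    public static void main(String[] args) throws Exception {
        TFramedTransport transport = new TFramedTransport(new TSocket("127.0.0.1", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));

        // The keyspace is now per-connection state: one RPC just to select it.
        client.set_keyspace("Keyspace1");

        Column col = new Column(ByteBuffer.wrap("name".getBytes("UTF-8")));
        col.setValue(ByteBuffer.wrap("Chris Goffinet".getBytes("UTF-8")));
        col.setTimestamp(System.currentTimeMillis() * 1000);

        client.insert(ByteBuffer.wrap("key_user_id".getBytes("UTF-8")),
                      new ColumnParent("Standard1"),
                      col,
                      ConsistencyLevel.ONE);

        // Talking to a different keyspace on this connection costs another round trip.
        client.set_keyspace("Keyspace2");

        transport.close();
    }
}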
I am a bit confused. One connection pool I know of is the one that
MessagingService has to other nodes. Then there will be incoming connections
via thrift from clients. How are they affected by multiple keyspaces?
On Thu, Nov 8, 2012 at 3:14 PM, Edward Capriolo wrote:
> Any connection pool. Imagine
Any connection pool. Imagine if you have 10 column families in 10
keyspaces. You pull a connection off the pool and the odds are 1 in 10
of it being connected to the keyspace you want. So 9 out of 10 times
you have to have a network round trip just to change the keyspace, or
you have to build a key
Can it be that you have tons and tons of tombstoned columns in the middle
of these two? I've seen plenty of performance issues with wide
rows littered with column tombstones (you could check by dumping the
sstables...)
Just a thought...
Josep M.
On Thu, Nov 8, 2012 at 12:23 PM, André Cruz wro
Hi,
Let's say I am reading with consistency TWO and my replication factor is 3. The
read is eligible for global read repair. It will send a request to get data
from one node and digest requests to the other two.
If there is a digest mismatch, from what I am reading in the code it looks like
it will get the data from
Which connection pool are you talking about?
On Thu, Nov 8, 2012 at 2:19 PM, Edward Capriolo wrote:
> it is better to have one keyspace unless you need to replicate the
> keyspaces differently. The main reason for this is that changing
> keyspaces requires an RPC operation. Having 10 keyspaces w
it is better to have one keyspace unless you need to replicate the
keyspaces differently. The main reason for this is that changing
keyspaces requires an RPC operation. Having 10 keyspaces would mean
having 10 connection pools.
On Thu, Nov 8, 2012 at 4:59 PM, sankalp kohli wrote:
> Is it better t
Is it better to have 10 keyspaces with 10 CFs in each keyspace, or 100
keyspaces with 1 CF each?
I am asking in terms of memory footprint.
Also, I would be interested to know how much better one is than the other.
Thanks,
Sankalp
Is there a ticket open for this for 1.1.6?
We also noticed this after upgrading from 1.1.3 to 1.1.6. Every node runs a
0 row hinted handoff every 10 minutes. N-1 nodes hint to the same node,
while that node hints to another node.
On Tue, Oct 30, 2012 at 1:35 PM, Vegard Berget wrote:
> Hi,
>
>
@ben, thx, we will be deploying 2.2.1 of DSE soon and will try to
set up a traffic sampling node so we can test leveled compaction.
we essentially keep a rolling window of data written once. it is
written, then after N days it is deleted, so it seems that leveled
compaction should help
On Thu, No
These are the two columns in question:
=> (super_column=13957152-234b-11e2-92bc-e0db550199f4,
(column=attributes, value=, timestamp=1351681613263657)
(column=blocks,
value=A4edo5MhHvojv3Ihx_JkFMsF3ypthtBvAZkoRHsjulw06pez86OHch3K3OpmISnDjHODPoCf69bKcuAZSJj-4Q,
timestamp=1351681613263657
thanks for the links! i had forgotten about live sampling
On Thu, Nov 8, 2012 at 11:41 AM, Brandon Williams wrote:
> On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote:
>> There are also ways to bring up a test node and just run Level Compaction on
>> that. Wish I had a URL handy, but hopefull
Also to answer your question, LCS is well suited to workloads where
overwrites and tombstones come into play. The tombstones are _much_ more
likely to be merged with LCS than STCS.
I would be careful with the patch that was referred to above, it hasn't
been reviewed, and from a glance it appears t
On Thu, Nov 8, 2012 at 1:33 PM, Aaron Turner wrote:
> There are also ways to bring up a test node and just run Level Compaction on
> that. Wish I had a URL handy, but hopefully someone else can find it.
This rather handsome fellow wrote a blog about it:
http://www.datastax.com/dev/blog/whats-new
http://www.datastax.com/docs/1.1/operations/tuning#testing-compaction-and-compression
Write Survey mode.
After you have it up and running you can modify the column family mbean to
use LeveledCompactionStrategy on that node to see how your hardware/load
fares with LCS.
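A rough sketch of doing that over JMX (the host, keyspace, and column family
names are hypothetical; this assumes the column family MBean exposes a writable
CompactionStrategyClass attribute, as the 1.1 line does):

import javax.management.Attribute;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SetLcsOnSurveyNode {
    public static void main(String[] args) throws Exception {
        // 7199 is the default JMX port; the host name is hypothetical.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://survey-node.example.com:7199/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();

        ObjectName cf = new ObjectName(
                "org.apache.cassandra.db:type=ColumnFamilies,keyspace=MyKeyspace,columnfamily=MyCF");

        // Switch only this node's column family to LCS; the rest of the ring
        // keeps whatever strategy is defined in the schema.
        mbs.setAttribute(cf, new Attribute("CompactionStrategyClass",
                "org.apache.cassandra.db.compaction.LeveledCompactionStrategy"));

        connector.close();
    }
}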
On Thu, Nov 8, 2012 at 11:
LCS works well in specific circumstances, this blog post gives some good
considerations: http://www.datastax.com/dev/blog/when-to-use-leveled-compaction
On Nov 8, 2012, at 1:33 PM, Aaron Turner wrote:
> "kill performance" is relative. Leveled Compaction basically costs 2x disk
> IO. Look at
"kill performance" is relative. Leveled Compaction basically costs 2x disk
IO. Look at iostat, etc and see if you have the headroom.
There are also ways to bring up a test node and just run Level Compaction
on that. Wish I had a URL handy, but hopefully someone else can find it.
Also, if you'r
we are running Datastax enterprise and cannot patch it. how bad is
"kill performance"? if it is so bad, why is it an option?
On Thu, Nov 8, 2012 at 10:17 AM, Radim Kolar wrote:
> On 8.11.2012 19:12, B. Todd Burruss wrote:
>
>> my question is would leveled compaction help to get rid of the
Hi All,
We are happy to announce the release of Kundera 2.2.
Kundera is a JPA 2.0 based, object-datastore mapping library for NoSQL
datastores. The idea behind Kundera is to make working with NoSQL Databases
drop-dead simple and fun. It currently supports Cassandra, HBase, MongoDB and
relational da
Ok, this article answered all the confusion in my head:
http://www.datastax.com/dev/blog/thrift-to-cql3
It's a must-read for noobs (like me). It perfectly explains the mappings and
differences between the internals and CQL3 (abstractions). First read this and
THEN go study all the resources out there ;)
Lp,
Ala
On 8.11.2012 19:12, B. Todd Burruss wrote:
my question is would leveled compaction help to get rid of the
tombstoned data faster than size tiered, and therefore reduce the disk
space usage?
leveled compaction will kill your performance. get patch from jira for
maximum sstable size per CF
we are having a problem where we have huge SSTables with tombstoned data
in them that is not being compacted soon enough (because size-tiered
compaction requires, by default, 4 like-sized SSTables). this is using
more disk space than we anticipated.
we are very write heavy compared to reads, an
What is the size of the columns? Probably those two are huge.
On Thu, Nov 8, 2012 at 4:01 AM, André Cruz wrote:
> On Nov 7, 2012, at 12:15 PM, André Cruz wrote:
>
> > This error also happens on my application that uses pycassa, so I don't
> think this is the same bug.
>
> I have narrowed it down t
Hi there!
I'm struggling to figure out (for quite a few hours now) how I can insert, for
example, a column with a TimeUUID name and an empty value in CQL3 in a fictional
table. And what's the table design? I'm interested in the syntax (e.g. an example).
I'm trying to do something like Matt Dennis did here (*Cassandr
No, we haven't changed RF, but it's been a very long time since we repaired
last, so we're guessing this is an effect of not running repair regularly,
and that doing it regularly will fix it. It would just be nice to know.
Also, running major compaction after the repair made the data size shrink
b
Did you change the RF or have a node down since you repaired last time?
2012/11/8 Henrik Schröder
> No, we're not using columns with TTL, and I performed a major compaction
> before the repair, so there shouldn't be vast amounts of tombstones moving
> around.
>
> And the increase happened durin
No, we're not using columns with TTL, and I performed a major compaction
before the repair, so there shouldn't be vast amounts of tombstones moving
around.
And the increase happened during the repair, the nodes gained ~20-30GB each.
/Henrik
On Thu, Nov 8, 2012 at 12:40 PM, horschi wrote:
> H
On Nov 7, 2012, at 12:15 PM, André Cruz wrote:
> This error also happens on my application that uses pycassa, so I don't think
> this is the same bug.
I have narrowed it down to a slice between two consecutive columns. Observe
this behaviour using pycassa:
>>> DISCO_CASS.col_fam_nsrev.get(uui
Hi,
We recently ran a major compaction across our cluster, which reduced the
storage used by about 50%. This is fine, since we do a lot of updates to
existing data, so that's the expected result.
The day after, we ran a full repair -pr across the cluster, and when that
finished, each storage node
Hi,
Is there a way we can limit the data of a particular user on the Cassandra
cluster?
Say, for example, I have three users, namely Jsmith, Elvis, and Dilbert, configured
in my Cassandra deployment.
And I want to limit their data usage as follows.
Jsmith - 1 GB
Elvis - 2 GB
Dilbert - 500 M
Hi
A bit more info on that
I have one working setup with
python-cql 1.0.9-1
python-thrift 0.6.0-2~riptano1
cassandra 1.0.8
The setup where cqlsh is not working has:
python-cql 1.0.10-1
python-thrift 0.6.0-2~riptano1
cassandra 1.0.11
Maybe this will give someone a hint of what the pr