Index queries (ColumnFamilyStore.scan) don't do any low-level i/o
themselves, they go through CFS.getColumnFamily, which is what normal
row fetches also go through. So if there is a leak there it's
unlikely to be specific to indexes.
What is your open-file limit (remember that sockets count towar
What's going on in the logs? CPU? i/o?
On Thu, Mar 31, 2011 at 4:20 AM, Or Yanay wrote:
> Hi all,
>
>
>
> My production cluster reads got stuck.
>
> The ring gives:
>
>
>
> Address Status State Load Owns Token
>
>
>
I've got a single node of cassandra 0.7.4, and I used the java stress tool
to insert about 100 million records.
The inserts took about 6 hours (45k inserts/sec) but the following minor
compactions last for 2 days and the pending compaction jobs are still
increasing.
From jconsole I can read the M
Yup, I screwed up the token setting, my bad.
Now, I moved the tokens. I still observe that read latency deteriorated with
3 machines vs original one. Replication factor is 1, Cassandra version 0.7.2
(didn't have time to upgrade as I need results by this weekend).
Key and row caching was disabled
On Thu, Mar 31, 2011 at 8:25 PM, mcasandra wrote:
> It looks like if I use system schema it fails. Is it because of
> LocalPartitioner?
>
> I ran with other keyspace and got following output.
>
> Offset SSTables Write Latency Read Latency Row Size Column Count
> 1 0 0 0 0 0
> 2 0 0 0 0 0
> 179 0 0
Hi All,
I am trying out a very simple scenario and I don't seem to get it working. It
would be great if someone could point me to what I am missing.
I have set up a 2 node cluster, cassandra.yaml being the default and same for
each other than the seed: being each other and I have set the Thrift RPC
addres
Gregori,
Congrats on winning the FUD-liest post of the month award. Firstly, if
you don't like updates, give up on computers and software. Especially
give up on anything that has to do with NoSQL, because it is fast
evolving.
If you think you have a problem with the cassandra api, then what you
really
ant on my command line had completed without error.
Next I tried to build cassandra 0.7.4 in eclipse, and had luck.
So I'll explore cassandra code with eclipse, rather than IDEA.
maki
2011/3/31 Maki Watanabe :
> Not yet. I'll try.
>
> maki
>
> 2011/3/31 Tommy Tynjä :
>> Have you assured you are a
It looks like if I use system schema it fails. Is it because of
LocalPartitioner?
I ran with other keyspace and got following output.
Offset SSTables Write Latency Read Latency Row Size Column Count
1 0 0 0 0 0
2 0 0 0 0 0
179 0 0 0 320 320
Can someone please help me understand the output in fi
Cassandra 0.7.4:
nodetool -h `hostname` cfhistograms system schema
Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
at $Proxy5.getRecentReadLatencyHistogramMicros(Unknown Source)
at org.apache.cassandra.tools.NodeCmd.printCfHistograms(NodeCmd.java:452)
On Thu, Mar 31, 2011 at 4:19 PM, Ryan King wrote:
> We have a solution for time series data on cassandra at Twitter that
> we'd like to open source, but it requires 0.8/trunk so we're not going
> to release it until that's stable.
>
> See
> http://www.slideshare.net/kevinweil/rainbird-realtime-an
On Thu, Mar 31, 2011 at 6:15 PM, Eric Gilmore wrote:
> A script that I have says the following:
>
> $ python ctokens.py
> How many nodes are in your cluster? 2
> node 0: 0
> node 1: 85070591730234615865843651857942052864
>
> The first token should be zero, for the reasons discussed here:
> http://
Just finished looking at the slides. It looks awesome!
On 3/31/11 4:19 PM, "Ryan King" wrote:
>We have a solution for time series data on cassandra at Twitter that
>we'd like to open source, but it requires 0.8/trunk so we're not going
>to release it until that's stable.
>
>See
>http://www.slid
It iterates over all the SSTables on disk and estimates the number of keys by
looking at how big the index is. It does not count the actual keys.
aaron
On 31 Mar 2011, at 17:46, Sheng Chen wrote:
> I just found an estimateKeys() method of the ColumnFamilyStoreMBean.
> Is there any indication
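The estimate described in the reply above could be sketched roughly as follows. This is not the Cassandra source: the function name estimate_keys is hypothetical, and it assumes each SSTable keeps one in-memory index sample per fixed interval of keys (128 is assumed here as that interval).

```python
# Sketch only: approximate key count from index size, not an exact count.
# ASSUMED_INDEX_INTERVAL is an assumption, not read from any real node.
ASSUMED_INDEX_INTERVAL = 128

def estimate_keys(index_sample_counts, index_interval=ASSUMED_INDEX_INTERVAL):
    """Estimate total keys from the number of index samples per SSTable.

    index_sample_counts: one integer per SSTable, the number of sampled
    index entries held for that SSTable.
    """
    return sum(count * index_interval for count in index_sample_counts)
```

Because the same key can live in several SSTables until compaction merges them, an estimate of this kind may count a key more than once.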
We have a solution for time series data on cassandra at Twitter that
we'd like to open source, but it requires 0.8/trunk so we're not going
to release it until that's stable.
See
http://www.slideshare.net/kevinweil/rainbird-realtime-analytics-at-twitter-strata-2011
-ryan
On Thu, Mar 31, 2011 at
Where are the connection refused messages? Are they client side? Can you
connect to the cluster with nodetool and run the ring command?
Aaron
On 31 Mar 2011, at 11:44, Anurag Gujral wrote:
> I restarted the cassandra node with more disks when I try to connect to
> cassandra i get connection
I know cloudkick is doing something like this, and we're developing our own
in-house method, but it would be nice for there to be a generically-available
package that would do this. Lately I've been wishing that someone would take
graphite (written in python) and put the frontend on top of cass
There is no reason to change the RF on the system keyspace; it should probably
not be allowed.
The system keyspace uses a LocalPartitioner and its data is not replicated
through the same mechanism as a user keyspace.
Aaron
On 31 Mar 2011, at 10:22, Jeremy Stribling wrote:
> On 03/30/2011
I've been looking at replacing our PostgreSQL backend for RTG (an SNMP
based polling and graphing solution for network traffic/ports) with
something using Cassandra in order to solve our scalability and
redundancy requirements. Based on a lot of what I've read, Cassandra
is an ideal data store for
It does not have a yaml file, so I am assuming it's the default Random
Partitioner.
Aaron
On 1 Apr 2011, at 04:51, Drew Kutcharian wrote:
> Thanks Aaron,
>
> I have already checked out Twissandra. I was mainly looking to see how
> Secondary Indexes can be used and how they affect Data Modeling
Peter, I want to join everyone else thanking you for helping out so much
with this thread, and especially for pointing out the problems with the DS
docs on this topic. We have some corrections posted today, and will keep
looking to improve the information.
On Thu, Mar 31, 2011 at 3:11 PM, Peter S
A script that I have says the following:
$ python ctokens.py
How many nodes are in your cluster? 2
node 0: 0
node 1: 85070591730234615865843651857942052864
The first token should be zero, for the reasons discussed here:
http://www.datastax.com/dev/tutorials/getting_started_0_7/configuring#initial
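The ctokens.py script itself is not shown in this thread. A minimal sketch of how such evenly spaced initial tokens could be computed, assuming the RandomPartitioner token space runs from 0 to 2**127 (the function name initial_tokens is hypothetical):

```python
# Sketch only: evenly spaced initial tokens for RandomPartitioner.
# Assumes a token space of 0 .. 2**127; this is not the actual ctokens.py.
RING_SIZE = 2 ** 127

def initial_tokens(node_count):
    """Return one token per node, spaced evenly around the ring."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * RING_SIZE // node_count for i in range(node_count)]

if __name__ == "__main__":
    for i, token in enumerate(initial_tokens(2)):
        print("node %d: %d" % (i, token))
```

For two nodes this yields 0 and 85070591730234615865843651857942052864, matching the values quoted above.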
I am experiencing something that looks exactly like
https://issues.apache.org/jira/browse/CASSANDRA-1178
on Cassandra 0.7.3 when using index slice queries (lots of them). It is
crashing multiple nodes and rendering the cluster useless, but I have no clue
where to look if index queries still leak fd
Does any
> Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why
> it is so important to run a repair before the GCGraceSeconds kicks in. Does
> this mean a delete does not get "replicated" ? In other words when I delete
> something on a node, doesn't cassandra set tombstones
I just configured a cluster of two nodes -- do these token values make sense?
The reason I'm asking is that so far I don't see load balancing
happening, judging from performance.
Address Status State Load Owns Token
Peter -
Thanks a lot for elaborating on repairs. Still, it's a bit fuzzy to me why
it is so important to run a repair before the GCGraceSeconds kicks in. Does
this mean a delete does not get "replicated" ? In other words when I delete
something on a node, doesn't cassandra set tombstones
On Thu, Mar 31, 2011 at 2:53 PM, Peter Schuller
wrote:
>> Only the following Levels are provided, I am wondering if the ZERO
>> consistency level is removed in Cassandra 0.7.X ?
>
> Yes, it's gone.
>
>> If so, Could you please explain why was it removed and what is the best
>> option I have given
> Only the following Levels are provided, I am wondering if the ZERO
> consistency level is removed in Cassandra 0.7.X ?
Yes, it's gone.
> If so, Could you please explain why was it removed and what is the best
> option I have given my context.
https://issues.apache.org/jira/browse/CASSANDRA-160
Hi,
I am dealing with reporting with not so important data and I am okay with data
being lost.
I would like to minimize the time taken for the actual data insert.
I am using Cassandra 0.7.4
If it matters, I am using Hector to connect to Cassandra.
cZERO consistency level in Thrift Generated code
org.ap
Thanks Aaron,
I have already checked out Twissandra. I was mainly looking to see how
Secondary Indexes can be used and how they affect Data Modeling. There doesn't
seem to be a lot of coverage on them.
In addition, I couldn't tell what kind of Partitioner Twissandra is using and
why.
cheers,
ConnectionPool has a set_server_list() method that you can use to update the
list of servers. (It appears this method did not make it into the docs;
I'll make sure it gets in there.) Pycassa doesn't make any attempt to
update the server list automatically right now.
By the way, there is a pycass
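A minimal sketch of how a manual refresh could be wrapped. Only set_server_list() comes from the message above; the helper name refresh_server_list and the "host:port" strings are assumptions for illustration, and the pool argument can be any object exposing that method.

```python
# Sketch only: push an updated node list to a pool object.
# refresh_server_list is a hypothetical helper, not part of pycassa.
def refresh_server_list(pool, servers):
    """De-duplicate servers (keeping order) and hand them to the pool."""
    unique = []
    for server in servers:
        if server not in unique:
            unique.append(server)
    if not unique:
        # An empty list would leave the pool with nowhere to connect.
        raise ValueError("refusing to set an empty server list")
    pool.set_server_list(unique)
    return unique
```

In practice, pool would be the pycassa.pool.ConnectionPool instance created with the server_list parameter, and the helper would be called whenever nodes are decommissioned or added, since the list is not updated automatically.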
I'm rebalancing a cluster of 2 nodes at this point. Netstats on the "source"
node reports progress of the stream, whereas on the receiving end netstats
states that progress = 0. Did anyone see that?
Do I need both nodes listed as seeds in cassandra.yaml?
TIA
--
View this message in context:
ht
In the pycassa.pool.ConnectionPool class, I can specify all the nodes
in server_list parameter.
But over time, when nodes get decommissioned and new nodes with new IPs
get added, how can the server_list parameter be refreshed?
Do I have to modify it manually, or is there a way to update the list
au
If I am not wrong, node repair needs to be run on all the nodes in a staggered
manner. It is required to take care of tombstones. Please correct me, team, if
I am wrong :)
See Distributed Deletes:
http://wiki.apache.org/cassandra/Operations
--
View this message in context:
http://cassandra-user-in
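The reason repair has to run before GCGraceSeconds expires can be illustrated with a toy simulation. This is a sketch under stated assumptions, not Cassandra code: replicas are plain dicts, a tombstone is a cell whose value is None, the newest timestamp wins, and the grace period is assumed to be 864000 seconds. All names are hypothetical.

```python
# Toy model only: shows how a delete that missed one replica can
# "come back" if tombstones are purged before a repair has run.
GC_GRACE_SECONDS = 864000  # assumed grace period (10 days)

def repair(replicas, key):
    """Copy the newest cell (by timestamp) for key to every replica."""
    cells = [r[key] for r in replicas if key in r]
    if not cells:
        return
    newest = max(cells, key=lambda cell: cell[0])
    for r in replicas:
        r[key] = newest

def compact(replica, now):
    """Drop tombstones (value None) older than the grace period."""
    for key in list(replica):
        timestamp, value = replica[key]
        if value is None and now - timestamp > GC_GRACE_SECONDS:
            del replica[key]

def scenario(repair_in_time):
    a, b, c = {}, {}, {}
    for r in (a, b, c):
        r["k"] = (0, "v")        # the write reaches all three replicas
    for r in (a, b):
        r["k"] = (100, None)     # the delete misses replica c
    if repair_in_time:
        repair([a, b, c], "k")   # c receives the tombstone in time
    later = 100 + GC_GRACE_SECONDS + 1
    for r in (a, b, c):
        compact(r, later)        # tombstones are purged after the grace period
    repair([a, b, c], "k")       # a later repair (or read repair)
    return a.get("k")
```

With repair_in_time=True the key stays deleted everywhere; with False, replica c still holds the old value after the tombstones are gone, and the later repair copies it back to the other replicas.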
> silly question, would every cassandra installation need to have manual
> repairs done on it?
>
> It would seem cassandra's "read repair" and regular compaction would take
> care of keeping the data clean.
>
> Am I missing something?
See my previous posts in this thread for the distinct reasons
Thanks Edward,
Anyone able to provide some answers for the other questions?
On 03/26/2011 07:25 AM, Edward Capriolo wrote:
On Fri, Mar 25, 2011 at 2:11 PM, ian douglas wrote:
On 03/25/2011 10:12 AM, Jonathan Ellis wrote:
On Fri, Mar 25, 2011 at 11:59 AM, ian douglas wrote:
(we're runnin
silly question, would every cassandra installation need to have manual repairs
done on it?
It would seem cassandra's "read repair" and regular compaction would take care
of keeping the data clean.
Am I missing something?
On Mar 30, 2011, at 7:46 PM, Peter Schuller wrote:
>> I just wanted t
On Thu, Mar 31, 2011 at 3:52 AM, T Akhayo wrote:
> Hi Aaron,
>
> Thank you for your reply, i appreciate the suggestions you made.
>
> Yesterday i managed to get everything (our main read) in one CF, with the
> use of a structure in a value like you suggested.
>
> Designing a new data model is diff
From my understanding of replica copies, cassandra picks which nodes to
replicate the data based on replication strategy, and those same "replica
partner" nodes are always used according to token ring distribution.
If you change the replication strategy, does cassandra pick new nodes to
repl
Ok, we'll do it for sure!
Thanks,
Roberto
On 31 March 2011 14:56, aaron morton wrote:
> Next time it happens take a note of the snapshot folder, different
> processes name the folder differently. It may help track down what created
> the snapshot.
>
> Cheers
> Aaron
>
> On 31 Mar 2011, at 01:13
--
Darío Bravo
AFAIK Cassandra will just pick the directory with the most space.
Also AFAIK using multiple directories should only be considered a safety valve
to fix problems such as the one you describe; see
http://www.mail-archive.com/user@cassandra.apache.org/msg07874.html
Aaron
On 31 Mar 2011, at 15:1
The CassandraBulkLoader example is written to use Super Columns, so that seems odd.
Do you have the rest of the error stack?
Aaron
On 31 Mar 2011, at 04:54, George Ciubotaru wrote:
> Hello,
>
> I’m using CassandraBulkLoader.java
> (https://svn.apache.org/repos/asf/cassandra/trunk/contrib/bmt
Next time it happens, take a note of the snapshot folder name; different processes
name the folder differently. It may help track down what created the snapshot.
Cheers
Aaron
On 31 Mar 2011, at 01:13, Roberto Bentivoglio wrote:
> Hi Aaron,
> I already deleted the snapshot folder unfortunately.
> We
Drew,
The Twissandra project is a twitter clone in cassandra, it may give you
some insight into how things can be modelled
https://github.com/thobbs/twissandra
If you are just starting then consider something like...
- CF to hold the user, their data and their network l
Not yet. I'll try.
maki
2011/3/31 Tommy Tynjä :
> Have you assured you are able to build Cassandra outside
> of IDEA, e.g. on command line?
>
> Best regards,
> Tommy
> @tommysdk
>
> On Thu, Mar 31, 2011 at 3:56 AM, Maki Watanabe
> wrote:
>> Hello,
>>
>> I'm trying to build and run cassandra 0.7
I had trouble setting up my Cassandra IDE on IntelliJ IDEA 10 as
well. The problems were related to IDEA not finding all the libraries
necessary so I had to make sure all necessary libraries were
downloaded and that hadoop directories etc were marked as
source-folders in the project. I don't recog
To all who replied on this topic, thanks for your patience and explanations.
Hi,
When I iteratively get data with a secondary index and an index clause,
the result acquired at consistency level "one" is different from
the one at consistency level "quorum". The result at consistency level "one"
is correct, but the one at consistency level "quorum" is incorrect
and some d
Thanks a lot for sharing your inputs, guys...
On Thu, Mar 31, 2011 at 6:47 AM, Drew Kutcharian wrote:
> Hi Ed,
>
> Cool, I guess we both read/interpreted his post differently and gave two
> valid answers ;)
>
> - Drew
>
> On Mar 30, 2011, at 5:40 PM, Ed Anuff wrote:
>
> > Hey Drew, I'm somewhat
I am using Cassandra 0.7.0 and Random Partitioner.
From: Or Yanay [mailto:o...@peer39.com]
Sent: Thursday, March 31, 2011 12:20 PM
To: user@cassandra.apache.org
Subject: Requests stuck on production cluster
Hi all,
My production cluster reads got stuck.
The ring gives:
Address Status St
Hi all,
My production cluster reads got stuck.
The ring gives:
Address      Status  State   Load       Owns    Token
                                                146231632500721020374621781629360107476
10.39.21.7   Up      Normal  118.86 GB  18.15%  696879268146680791533
Hi Aaron,
Thank you for your reply, I appreciate the suggestions you made.
Yesterday I managed to get everything (our main read) in one CF, with the
use of a structure in a value like you suggested.
Designing a new data model is different from what I'm used to, but if you
keep in mind that you d
> Would you cassandra team think to add an alias name for nodetool
> "repair" command?
That thought has crossed my mind lately too; particularly in one of
the recent threads.
The problem seems analogous to 'fsck', and the distinction between
fully expected by-design behavior needing fsck/repair is