Collectd with GenericJMX pushing data into Graphite is what we use.
You can monitor the Graphite graphs directly instead of having an extra JMX
interface on the Cassandra nodes for monitoring.
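
For anyone who has not seen what ends up in Graphite, here is a minimal sketch of one data point, assuming the metrics arrive over Graphite's plaintext protocol (one `path value timestamp` line per point). The metric path below is hypothetical, not something collectd is guaranteed to emit.

```python
import time


def graphite_line(path, value, timestamp=None):
    # Graphite's plaintext protocol: "<metric path> <value> <unix timestamp>\n"
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


# Hypothetical metric path for a Cassandra node's JVM heap usage.
print(graphite_line("cassandra.node1.jvm.heap_used", 123456789, 1332360960), end="")
```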
On Wed, Mar 21, 2012 at 8:16 PM, Jeremiah Jordan <
jeremiah.jor...@morningstar.com> wrote:
> You can al
Hi
I'm going with yes to all three of your questions.
I found a very heavily hit index which we have since reworked to remove the
secondary index entirely.
This fixed a large portion of the problem but during the panic of the
overloaded cluster we did the simple scaling out trick of doubling the
c
Hi,
We recently moved from 0.8.2 to 1.0.8 and the behaviour seems to have
changed so that tombstones are now not being deleted.
Our application continually adds and removes columns from Cassandra. We
have set a short gc_grace time (3600) since our application would
automatically delete zombies i
Over the past 9 months, we've learned a lot about indexing in Cassandra and
we've had a few false starts. I've tried to capture what we learned in the
hopes that we can save a few others from false starts.
http://brianoneill.blogspot.com/2012/03/cassandra-indexing-good-bad-and-ugly.html
We've al
This is a known issue (or issues) to be fixed (I can't find the exact tickets on the tracker).
The controller that receives the truncate command checks that all nodes are up and sends
a truncate message to all of them (including itself), waiting for an answer for
rpc_timeout_in_ms (this will be changed to a separate timeout setting).
If any nod
I'd double-check the firewall on each node to make sure the storage and RPC
ports aren't being blocked.
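
A minimal sketch of that check, to be run from each node against every other node. It assumes the default storage_port (7000) and rpc_port (9160); adjust if your cassandra.yaml differs. The helper names are made up for illustration.

```python
import socket


def port_open(host, port, timeout=2.0):
    # True if a TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_nodes(nodes, ports=(7000, 9160)):
    # Assumed defaults: 7000 = storage port, 9160 = Thrift RPC port.
    return {(node, port): port_open(node, port) for node in nodes for port in ports}
```

Calling `check_nodes([...])` with your node addresses returns a dict keyed by `(node, port)`; any `False` entry is a pair worth checking against the firewall rules.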
We've found that "Up" in nodetool ring output reflects the gossip status, which
means only that one of the nodes can contact the node, not necessarily the
entire ring. It's possible for the e
Aaron
Thanks for the reply. Yes, I am shutting down the server.
I also noticed that I put the wrong Cassandra version number in my email.
I have multiple installations. The version that I used for the test was 0.7
I will upgrade and check against a more recent version.
/Martien
On 21 March 2012 10:04
You can also use any network/server monitoring tool which can talk to JMX. We
are currently using vFabric Hyperic's JMX plugin for this.
IIRC there are some cacti and nagios scripts on github for getting the data
into those.
-Jeremiah
From: R. Verlangen [ro..
With the crazy message size, the version exception, and the frequency, is
there a service that is connecting to the thrift port on occasion and
sending garbage (a port scan or something else of that nature)? Can you rule
that out?
On Wed, Mar 21, 2012 at 11:15 AM, aaron morton wrote:
> 1. or
run 'show keyspaces' in the cassandra-cli and paste the details for ks1
here.
On Wed, Mar 21, 2012 at 12:33 PM, Cyril Scetbon wrote:
> Hi,
>
> I'm using version 1.0.7 and when I try to truncate a CF in cqlsh it raises
> an error message, but it seems incorrect ...
>
> cqlsh:ks1> truncate core;
>
Hi,
I'm using version 1.0.7 and when I try to truncate a CF in cqlsh it
raises an error message, but it seems incorrect ...
cqlsh:ks1> truncate core;
Unable to complete request: *one or more nodes were unavailable*.
>nodetool -h localhost ring
The node is overloaded with hints.
I'll just grab the comments from code...
// avoid OOMing due to excess hints. we need to do this check even
for "live" nodes, since we can
// still generate hints for those if it's overloaded or simply dead
but not yet known-to-be-dead
> Regarding bloom filters, have I understood correctly that they are stored on
> Heap,
yes.
> and that the "Bloom Filter Space Used" reported by 'nodetool cfstats' is an
> approximation of the heap space used by bloom filters?
Yes, it's the serialised on-disk size. This will be smaller than
Yes, that is good enough for now. Thanks.
On Fri, Mar 16, 2012 at 6:49 PM, Watanabe Maki wrote:
> How about padding the smaller numbers with leading zeros?
> Ex. 0001, 0002, etc
>
> maki
>
>
> On 2012/03/17, at 6:29, A J wrote:
>
>> If I define my rowkeys to be Integer
>> (key_validation_class=Inte
and we are on 0.8.6.
On Wed, Mar 21, 2012 at 10:24 AM, Daning Wang wrote:
> Hi All,
>
>
> We got lots of exceptions in the log, and later the server crashed. Any
> idea what is happening and how to fix it?
>
> ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482
> AbstractCassandraDaemon.java
Hi All,
We got lots of exceptions in the log, and later the server crashed. Any idea
what is happening and how to fix it?
ERROR [RequestResponseStage:4] 2012-03-21 04:16:30,482
AbstractCassandraDaemon.java (line 139) Fatal exception in thread
Thread[RequestResponseStage:4,5,main]
java.io.IOError:
> 1. org.apache.thrift.TException: Message length exceeded: 134218240
thrift_max_message_length_in_mb
https://github.com/apache/cassandra/blob/cassandra-1.0/conf/cassandra.yaml#L243
(134218240 is about 128MB, which is a lot of data)
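
As a quick check of that figure, a small sketch converting the length from the exception into MiB (the variable names are only illustrative):

```python
MIB = 1024 * 1024


def bytes_to_mib(n):
    # Convert a byte count to mebibytes.
    return n / MIB


reported = 134218240  # from "Message length exceeded: 134218240"
print(bytes_to_mib(reported))  # 128.00048828125
print(reported - 128 * MIB)    # 512 bytes over exactly 128 MiB
```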
> 2. org.apache.thrift.protocol.TProtocolException: Missi
There is a forced flusher that kicks in when your heap becomes full.
Look for log lines from GCInspector.
There is a bug that prevents flushing a memtable when it contains only full-key
delete mutations; see https://issues.apache.org/jira/browse/CASSANDRA-3741
For me it happened when we've started to mo
I remember problems in the past with Windows and deleting files. Are you
shutting down the server in the test teardown?
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 21/03/2012, at 7:02 PM, martien huijsmans wrote:
> Hi,
>
> I have
> How will you recommend doing schema level health checks (consistency) for
> Cassandra within the cluster?
"describe cluster" in the cli can be used to see how many schema versions there
are.
Similar functionality is included in most other clients.
Cheers
-
Aaron Morton
F
Hello
I am trying to upgrade our 1-node setup from 0.8.9 to 1.0.8 and seeing the
following exception when starting up 1.0.8. We have been running 0.8.9
without any issues.
DEBUG [OptionalTasks:1] 2012-03-21 11:59:14,731 IntervalNode.java (line 45)
Creating IntervalNode from
[Interval(DecoratedKe
Hi,
I need to attach a virtual DC to an existing cluster. My existing
cluster uses the Ec2Snitch as endpoint_snitch in cassandra.yaml.
Since both my clusters will live in the same AZ in AWS, it seems as
though my best bet is to switch to a property file snitch to divide.
A couple of questions:
*
On 2012-03-21 16:36, Erik Forsberg wrote:
Hi!
We're using the bulkloader to load data to Cassandra. During and after
bulkloading, the minor compaction process seems to result in larger
sstables being created. An example:
This is on Cassandra 1.1, btw.
\EF
Hi!
We're using the bulkloader to load data to Cassandra. During and after
bulkloading, the minor compaction process seems to result in larger
sstables being created. An example:
INFO [CompactionExecutor:105] 2012-03-21 15:18:46,608
CompactionTask.java (line 115) Compacting [SSTableReader(p
I have increased index_interval. Will let you know if I see a difference.
My theory is that memtables are not getting flushed. If I manually
flush them, the heap consumption goes down drastically.
I think when memtable_total_space_in_mb is exceeded not enough
memtables are getting flushed. There
Hi!
We're currently testing Cassandra with a large number of row keys per
node - nodetool cfstats estimates the number of keys at something like
700M per node. This seems to have caused very high heap consumption.
After reading
http://wiki.apache.org/cassandra/LargeDataSetConsiderations I
Hi all
I'm running into a weird error on Cassandra 1.0.7.
As my cluster's load gets heavier many of the nodes seem to hit the same
error around the same time, resulting in MutationStage backing up and never
clearing down. The only way to recover the cluster is to kill all the nodes
and start them u
Note: I still have the problem in a pre-beta 1.1 custom build that
seems to include the fix. I am going to upgrade to the 1.1 beta, check whether
the problem goes away, and file a bug if it still exists.
BTW: It would be great for cassandra to exit on any fatal errors, like
assertion problems
Hello.
There is also a primary row index. Its space can be controlled with the
index_interval setting. I don't know if you can look up its memory usage
somewhere. If I were you, I'd take the jmap tool and examine a heap histogram
first, a heap dump second.
Best regards, Vitalii Tymchyshyn
20.03.12 18
Hi Cassandra Users,
A couple of questions on the server side exceptions that I see sometimes -
1. org.apache.thrift.TException: Message length exceeded: 134218240
   How to configure message length?
2. org.apache.thrift.protocol.TProtocolException: Missing version in
readMessageB