Thanks a lot dear. I will try it out and will let you know if the problem
persists.
On Thu, Jul 14, 2011 at 5:52 AM, Sameer Farooqui wrote:
> As long as you have no data in this cluster, try clearing out the
> /var/lib/cassandra directory from all nodes and restart Cassandra.
>
> The only way to
Okay, I am not sure if it is an infinite loop. I changed log4j to "DEBUG" only
because Cassandra never comes online after I run it; it seems to just halt.
Once I enabled debug, it started showing those messages very fast and never ended.
I have just run nodetool cleanup, and it started reading the commitlog. It seems
setting server.config -> $SERVER_PATH/Cassandra.yaml as a system property
should resolve this?
-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Thursday, July 14, 2011 3:53 AM
To: user@cassandra.apache.org
Subject: Re: JDBC CQL Driver unable to locate cassandra.yaml
That says "I'm collecting data to answer requests."
I don't see anything here that indicates an infinite loop.
I do see that it's saying "N of 2147483647" which looks like you're
doing slices with a much larger limit than is advisable (good way to
OOM the way you already did).
On Wed, Jul 13, 20
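For comparison, a hedged sketch of a bounded slice via the 0.8-era Thrift API
(the column family and page size are illustrative assumptions, not taken from
this thread):

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

// Slice across the whole row, but cap how many columns come back
// instead of passing Integer.MAX_VALUE (2147483647).
SliceRange range = new SliceRange(
        ByteBuffer.allocate(0),  // start: beginning of the row
        ByteBuffer.allocate(0),  // finish: end of the row
        false,                   // not reversed
        1000);                   // a bounded page size
SlicePredicate predicate = new SlicePredicate().setSlice_range(range);
// To read more, re-issue the slice starting from the last column seen.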
On Wed, Jul 13, 2011 at 9:45 PM, Konstantin Naryshkin
wrote:
> Do you mean that it is using all of the available heap? That is the
> expected behavior of most long running Java applications. The JVM will not
> GC until it needs memory (or you explicitly ask it to) and will only free up
> a bit of
Consistency and Availability trade off against each other.
If you use RF=7 + CL=ONE, your reads/writes will succeed as long as you have one
node alive while data is replicated to 7 nodes.
Of course you will have a chance of reading old data in this case.
If you need strong consistency, you must use CL=QUORUM.
maki
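To make the trade-off concrete, a minimal sketch of the replica arithmetic
(plain Java, nothing Cassandra-specific; RF=7 as in the example above):

int rf = 7;               // replication factor
int quorum = rf / 2 + 1;  // 4 when RF=7
// Reads at CL=R and writes at CL=W overlap on at least one
// up-to-date replica whenever R + W > RF.
boolean oneOne = 1 + 1 > rf;                  // false: stale reads possible
boolean quorumQuorum = quorum + quorum > rf;  // true: strong consistency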
The problem is I can't bring Cassandra back. Is that because there is not enough
memory for Cassandra?
On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson wrote:
> How much total memory does your machine have?
>
> --
> Bret
>
> On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:
>
> I gave cassandra 8GB
16GB
On Thu, Jul 14, 2011 at 11:29 AM, Bret Palsson wrote:
> How much total memory does your machine have?
>
> --
> Bret
>
> On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:
>
> I gave Cassandra an 8GB heap size and somehow it ran out of memory and
> crashed. After I start it, it just run
How much total memory does your machine have?
--
Bret
On Wednesday, July 13, 2011 at 9:27 PM, Yan Chunlu wrote:
> I gave Cassandra an 8GB heap size and somehow it ran out of memory and crashed.
> After I start it, it just runs into the following infinite loop; the last
> line:
> DEBUG [main] 2
I gave Cassandra an 8GB heap size and somehow it ran out of memory and crashed.
After I start it, it just runs into the following infinite loop; the last
line:
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 100zs:false:14@1310168625866434
goes for e
DEBUG [main] 2011-07-13 22:19:00,586 SliceQueryFilter.java (line 123)
collecting 0 of 2147483647: 100zs:false:14@1310168625866434
Thanks for the reply Peter.
The goal is to configure a cluster in which reads and writes can
complete successfully even if only 1 node is online. For this to work,
each node would need the entire dataset. Your example of a 3 node ring
with RF=3 would satisfy this requirement. However, if two nodes
As long as you have no data in this cluster, try clearing out the
/var/lib/cassandra directory from all nodes and restart Cassandra.
The only way to change tokens after they've been set is using a nodetool
move or clearing /var/lib/cassandra.
On Wed, Jul 13, 2011 at 7:41 AM, Abdul Haq Shaik <
a
Running Cassandra 0.8.1. Ran major compaction via:
sudo /home/ubuntu/brisk/resources/cassandra/bin/nodetool -h localhost
compact &
From what I'd read about Cassandra, I thought that after compaction all of
the different SSTables on disk for a Column Family would be merged into one
new file.
How
> Read and write operations should succeed even if only 1 node is online.
>
> When a read is performed, it is performed against all active nodes.
Using QUORUM is the closest thing you get for reads without modifying
Cassandra. You can't make it wait for all nodes that happen to be up.
> When a wr
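A hedged sketch of a QUORUM read via the 0.8-era Thrift API (row key, column
family and column name are placeholders I made up; assumes an open
Cassandra.Client named client):

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import org.apache.cassandra.thrift.*;

static ByteBuffer utf8(String s) {
    return ByteBuffer.wrap(s.getBytes(Charset.forName("UTF-8")));
}

// QUORUM waits for a majority of replicas, not for every node
// that happens to be up.
ColumnPath path = new ColumnPath("MyCF");
path.setColumn(utf8("name"));
ColumnOrSuperColumn result =
        client.get(utf8("rowkey"), path, ConsistencyLevel.QUORUM);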
The current version of the driver does require having the server's
cassandra.yaml on the classpath. This is a bug.
On Wed, Jul 13, 2011 at 3:13 PM, Derek Tracy wrote:
> I am trying to integrate the Cassandra JDBC CQL driver with my company's ETL
> product.
> We have an interface that performs da
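Until that's fixed, a hedged workaround sketch: put the directory holding
cassandra.yaml on the classpath, or point the cassandra.config system property
at the file before loading the driver (the property name and URL format here
are my assumptions about that era's driver, not something confirmed in this
thread):

import java.sql.Connection;
import java.sql.DriverManager;

// Tell Cassandra's config loader exactly where the yaml lives.
System.setProperty("cassandra.config", "file:///etc/cassandra/cassandra.yaml");
Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
Connection conn = DriverManager.getConnection(
        "jdbc:cassandra:user/pass@localhost:9160/MyKeyspace");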
I am trying to integrate the Cassandra JDBC CQL driver with my company's ETL
product.
We have an interface that performs database queries using their respective
JDBC drivers.
When I try to use the Cassandra CQL JDBC driver I keep getting a stacktrace:
Unable to locate cassandra.yaml
I am using Ca
I am wondering if the following cluster configuration is possible with
cassandra, and if so, how it could be achieved. Please also feel free
to point out any issues that may make this configuration undesired
that I may not have thought of.
Suppose a cluster of N nodes.
Each node replicates the data
>Note that if GCGraceSeconds is 10 days, you want to run repair often
>enough that there will never be a moment where there is more than
>exactly 10 days since the last successfully completed repair
>*STARTED*.
>When scheduling repairs, factor in things like - what happens if
>repair fails? Who ge
> # wait for a bit until no one is sending it writes anymore
More accurately, until all other nodes have realized it's down
(nodetool ring on each respective host).
--
/ Peter Schuller (@scode on twitter)
> What are the other ways to stop Cassandra?
nodetool disablegossip
nodetool disablethrift
# wait for a bit until no one is sending it writes anymore
nodetool flush # only relevant if in periodic mode
# then kill it
> What's the difference between batch vs periodic?
Search for "batch" on http://
Peter Schuller wrote:
>
>> Recently upgraded to 0.8.1 and noticed what seems to be missing data
>> after a
>> commitlog replay on a single-node cluster. I start the node, insert a
>> bunch
>> of stuff (~600MB), stop it, and restart it. There are log messages
>
> If you stop by a kill, make sure
> Recently upgraded to 0.8.1 and noticed what seems to be missing data after a
> commitlog replay on a single-node cluster. I start the node, insert a bunch
> of stuff (~600MB), stop it, and restart it. There are log messages
If you stop by a kill, make sure you use batched commitlog sync mode
in
Thanks. Looks like we tracked the problem down to the DataStax 0.8.1
rpm actually being 0.8.0.
rpm -qa | grep cassandra
apache-cassandra08-0.8.1-1
grep ' Cassandra version:' /var/log/cassandra/system.log | tail -1
INFO [main] 2011-07-13 12:04:31,039 StorageService.java (line 368)
Cassandra version:
If you can provide some more details on the use case we may be able to provide
some data model help.
You can always use a dedicated CF for the counters, and use the same row key.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 1
Have you verified that data you expect to see is not in the server after
shutdown?
WRT the difference between the Memtable data size and the SSTable
live size, don't believe everything you read :)
Memtable live size is increased by the serialised byte size of every column
inserted,
How do I ensure it is indeed using the SerializingCacheProvider?
Thanks
-Rajesh
On Tue, Jul 12, 2011 at 1:46 PM, Jonathan Ellis wrote:
> You need to set row_cache_provider=SerializingCacheProvider on the
> columnfamily definition (via the cli)
>
> On Tue, Jul 12, 2011 at 9:57 AM, Raj N wrote:
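One hedged way to check (the column family name here is hypothetical): after
running something like
update column family Users with row_cache_provider='SerializingCacheProvider';
in the cli, run "describe keyspace <your keyspace>;" and confirm the row cache
provider listed for that column family. The serializing provider also stores
cached rows off-heap, so the row cache should no longer grow the JVM heap.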
> In the company I work for I suggested many times to run repair at least once
> every 10 days (gcgraceseconds is set to approx 10 days in our config) -- but
> this book has been used against me :-) I will ask to run repair asap
Note that if GCGraceSeconds is 10 days, you want to run repair often
eno
I'll have to apologize on that one. Just saw that the JMX call I was
talking about doesn't work as it should.
I'll fix that for 0.8.2 but in the meantime you'll want to use
sstableloader on a different IP, as pointed out by Jonathan.
--
Sylvain
On Wed, Jul 13, 2011 at 5:11 PM, Sylvain Lebresne wrote:
You can escape quotes but I don't think you can escape semicolons.
Can you create a ticket for us to fix this?
On Wed, Jul 13, 2011 at 10:16 AM, Blake Visin wrote:
> I am trying to get all the columns named "fmd:" in cqlsh.
> I am using:
> select 'fmd:'..'fmd;' from feeds where;
> But I am gettin
And fixed! A co-worker put in a bad host line entry last night that threw it
all off :( Thanks for the assist, guys.
--
Ray Slakinski
On Wednesday, July 13, 2011 at 1:32 PM, Ray Slakinski wrote:
> Was all working before, but we ran out of file handles and ended up
> restarting the nodes. No
>
> >>> cqlsh> UPDATE RouterAggWeekly SET 1310367600 = 1310367600 + 17 WHERE
> >>> KEY = '1_20110728_ifoutmulticastpkts';
> >>> Bad Request: line 1:51 no viable alternative at character '+'
>
I'm able to insert it.
___
cqlsh>
cqlsh> UPDATE counts SET 1310367600 = 1310367600 +
I've tried using the Thrift/execute_cql_query() API as well, and it
doesn't work either. I've also tried using a CF where the column
names are of AsciiType to see if that was the problem (quoted and
unquoted column names) and I get the exact same error of: no viable
alternative at character '+'
F
Was all working before, but we ran out of file handles and ended up restarting
the nodes. No yaml changes have occurred.
Ray Slakinski
On 2011-07-13, at 12:55 PM, Sasha Dolgy wrote:
> any firewall changes? ping is fine ... but if you can't get from
> node(a) to nodes(n) on the specific ports
I am trying to get all the columns named "fmd:" in cqlsh.
I am using:
select 'fmd:'..'fmd;' from feeds where;
But I am getting errors (as expected). Is there any way to escape the colon
or semicolon in cqlsh?
Thanks,
Blake
Also note that if you have a cassandra node running on the local node
from which you want to bulk load sstables, there is a JMX
(StorageService->bulkLoad) call to do just that. May be simpler than
using sstableloader if that is what you want to do.
--
Sylvain
On Wed, Jul 13, 2011 at 3:46 PM, Step
"data grids", it seems that this really does not have much
relationship to "java", since all major noSQL solutions explicitly
create interfaces in almost all languages and try to be
language-agnostic by using RPC like thrift,avro etc.
On Wed, Jul 13, 2011 at 9:06 AM, Pete Muir wrote:
> Hi,
>
> I
any firewall changes? ping is fine ... but if you can't get from
node(a) to nodes(n) on the specific ports...
On Wed, Jul 13, 2011 at 6:47 PM, samal wrote:
> Check that the seed IP is the same on all nodes and is not a loopback IP on the cluster.
>
> On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski
> wrote:
>
Hi,
I am looking to "round out" the EG membership of JSR-347 so that we can get
going with discussions. It would be great if someone from the Cassandra
community could join to represent the experiences of developing HBase :-)
We'll be communicating using https://groups.google.com/forum/#!forum/
Check that the seed IP is the same on all nodes and is not a loopback IP on the cluster.
On Wed, Jul 13, 2011 at 8:40 PM, Ray Slakinski wrote:
> One of our nodes, which happens to be the seed, thinks it's Up and all the
> other nodes are down. However, all the other nodes think the seed is down
> instead. The l
Got it.
Thanks!
On Wed, Jul 13, 2011 at 6:05 PM, Jonathan Ellis wrote:
> (1) the hash calculation is a small amount of CPU -- MD5 is
> specifically designed to be efficient in this kind of situation
> (2) we compute one hash per query, so for multiple columns the
> advantage over timestamp-per-
Ahhh..ok. Thanks.
-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Wednesday, July 13, 2011 11:35 AM
To: user@cassandra.apache.org
Subject: Re: BulkLoader
Because it's hooking directly into gossip, so the local instance it's
ignoring is the bulkloader process, no
Because it's hooking directly into gossip, so the local instance it's
ignoring is the bulkloader process, not Cassandra.
You'd need to run the bulkloader from a different IP than Cassandra.
On Wed, Jul 13, 2011 at 8:22 AM, Stephen Pope wrote:
> Fair enough. My original question stands then. :)
Fair enough. My original question stands then. :)
Why aren't you allowed to talk to a local installation using BulkLoader?
-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Wednesday, July 13, 2011 11:06 AM
To: user@cassandra.apache.org
Subject: Re: BulkLoader
One of our nodes, which happens to be the seed, thinks it's Up and all the other
nodes are down. However, all the other nodes think the seed is down instead.
The logs for the seed node show everything is running as it should be. I've
tried restarting the node, turning on/off gossip and thrift and
Sure, that will work fine with a single machine. The advantage of
bulkloader is it handles splitting the sstable up and sending each
piece to the right place(s) when you have more than one.
On Wed, Jul 13, 2011 at 7:47 AM, Stephen Pope wrote:
> I think I've solved my own problem here. After gen
(1) the hash calculation is a small amount of CPU -- MD5 is
specifically designed to be efficient in this kind of situation
(2) we compute one hash per query, so for multiple columns the
advantage over timestamp-per-column gets large quickly.
On Wed, Jul 13, 2011 at 7:31 AM, David Boxenhorn wrote
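To illustrate point (2), a minimal sketch of what a per-query digest costs (my
own illustration, not Cassandra's actual code): however many columns a read
touches, the replica returns one 16-byte MD5 instead of per-column metadata:

import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

static byte[] digest(byte[]... serializedColumns) throws NoSuchAlgorithmException {
    MessageDigest md5 = MessageDigest.getInstance("MD5");
    for (byte[] column : serializedColumns)
        md5.update(column);  // one running hash over the whole result set
    return md5.digest();     // always 16 bytes, no matter how many columns
}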
This (https://issues.apache.org/jira/browse/CASSANDRA-2653) is fixed
in 0.7.7, which will be out soon.
On Tue, Jul 12, 2011 at 9:13 PM, Kyle Gibson
wrote:
> Running version 0.7.6-2, recently upgraded from 0.7.3.
>
> I am get a time out exception when I run a particular
> get_indexed_slices, which
I think I've solved my own problem here. After generating the sstable using
json2sstable it looks like I can simply copy the created sstable into my data
directory.
Can anyone think of any potential problems with doing it this way?
-Original Message-
From: Stephen Pope [mailto:stephen
Hi,
I have deleted the data, commitlog and saved cache directories. I have
removed one of the nodes from the seeds of cassandra.yaml. When I tried to
use nodetool, it is showing the removed node as up.
Thanks,
Abdul
Is that the actual reason?
This seems like a big inefficiency to me. For those of us who don't worry
about this extreme edge case (that probably will NEVER happen in real life,
for most applications), is there a way to turn this off?
Or am I wrong about this making the operation MUCH more expensi
A ColumnPath can contain a super column, so you should be fine inserting into a
super column family (in fact I do that). Quoting cassandra.thrift:
struct ColumnPath {
3: required string column_family,
4: optional binary super_column,
5: optional binary column,
}
- Original Message ---
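A hedged sketch of such an insert on a 0.7/0.8-era cluster (row key and names
are made up; assumes an open Cassandra.Client named client). Note that insert()
there actually takes a ColumnParent, which carries the optional super_column
the same way ColumnPath does:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import org.apache.cassandra.thrift.*;

static ByteBuffer utf8(String s) {
    return ByteBuffer.wrap(s.getBytes(Charset.forName("UTF-8")));
}

ColumnParent parent = new ColumnParent("MySuperCF"); // a super column family
parent.setSuper_column(utf8("group1"));              // the super column
Column column = new Column(utf8("name"), utf8("value"),
        System.currentTimeMillis() * 1000);          // microsecond timestamp
client.insert(utf8("rowkey"), parent, column, ConsistencyLevel.ONE);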
Do you mean that it is using all of the available heap? That is the expected
behavior of most long running Java applications. The JVM will not GC until it
needs memory (or you explicitly ask it to) and will only free up a bit of
memory at a time. That is very good behavior from a performance sta
I'm trying to figure out how to use the BulkLoader, and it looks like there's
no way to run it against a local machine, because of this:
Set<InetAddress> hosts = Gossiper.instance.getLiveMembers(); // every live node gossip knows about
hosts.remove(FBUtilities.getLocalAddress());                 // minus the local (bulkloader) address
if (hosts.isEmpty(
Perfect, thanks!
-Original Message-
From: Jonathan Ellis [mailto:jbel...@gmail.com]
Sent: Tuesday, July 12, 2011 5:53 PM
To: user@cassandra.apache.org
Subject: Re: sstabletojson
You can upgrade to 0.8.1 to fix this. :)
On Tue, Jul 12, 2011 at 1:03 PM, Stephen Pope wrote:
> Hey there.
For a specific column, if there are two versions with the same timestamp,
the value of the column is used to break the tie.
if v1.value().compareTo(v2.value()) < 0, it means that v2 wins.
On Wed, Jul 13, 2011 at 7:13 PM, David Boxenhorn wrote:
> How would you know which data is correct, if they
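A minimal self-contained sketch of that tie-break rule (my own types, not
Cassandra's internals):

import java.nio.ByteBuffer;

final class Version {
    final long timestamp;
    final ByteBuffer value;
    Version(long timestamp, ByteBuffer value) {
        this.timestamp = timestamp;
        this.value = value;
    }

    static Version reconcile(Version v1, Version v2) {
        if (v1.timestamp != v2.timestamp)                  // normal case:
            return v1.timestamp > v2.timestamp ? v1 : v2;  // newest wins
        // Tie: the greater value wins, so every replica deterministically
        // picks the same winner.
        return v1.value.compareTo(v2.value) < 0 ? v2 : v1;
    }
}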
How would you know which data is correct, if they both have the same
timestamp?
On Wed, Jul 13, 2011 at 12:40 PM, Boris Yen wrote:
> I can only say, "data" does matter; that is why the developers use a hash
> instead of a timestamp. If the hash value that comes from another node is not a
> match, a read repair
I can only say, "data" does matter; that is why the developers use a hash
instead of a timestamp. If the hash value that comes from another node is not a
match, a read repair is performed so that correct data can be returned.
On Wed, Jul 13, 2011 at 5:08 PM, David Boxenhorn wrote:
> If you have two pieces of
If you have two pieces of data that are different but have the same
timestamp, how can you resolve consistency?
This is a pathological situation to begin with, why should you waste effort
to (not) solve it?
On Wed, Jul 13, 2011 at 12:05 PM, Boris Yen wrote:
> I guess it is because the timestamp
I guess it is because the timestamp does not guarantee data consistency, but a
hash does.
Boris
On Wed, Jul 13, 2011 at 4:27 PM, David Boxenhorn wrote:
> I just saw this
>
> http://wiki.apache.org/cassandra/DigestQueries
>
> and I was wondering why it returns a hash of the data. Wouldn't it be
>
I just saw this
http://wiki.apache.org/cassandra/DigestQueries
and I was wondering why it returns a hash of the data. Wouldn't it be better
and easier to return the timestamp? You don't really care what the data is,
you only care whether it is more or less recent than another piece of data.
>
> Can you give me a bit of an idea of how key_cache and row_cache affect the
> performance of Cassandra? How do these things work in different scenarios
> depending on the data size?
>
While reading, if row_cache is set, it checks the row_cache first, then
key_cache, memtable & disk.
row_cache stores al
I'll write a FAQ for this topic :-)
maki
2011/7/13 Peter Schuller :
>> To be sure that I didn't misunderstand (English is not my mother tongue) here
>> is what the entire "repair paragraph" says ...
>
> Read it, I maintain my position - the book is wrong or at the very
> least strongly misleading
For batch_insert, I think you could use batch_mutate instead.
For multi_get, I think you could use multiget_slice instead.
Boris
魏金仙 wrote:
insert(key, column_path, column, consistency_level) can only insert a
standard column. Is batch_mutate the only API to insert a super column?
and also
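For reference, a hedged batch_mutate sketch standing in for the removed
batch_insert (all names are placeholders; assumes an open Cassandra.Client
named client). The mutation map nests row key -> column family -> mutations:

import java.nio.ByteBuffer;
import java.nio.charset.Charset;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import org.apache.cassandra.thrift.*;

static ByteBuffer utf8(String s) {
    return ByteBuffer.wrap(s.getBytes(Charset.forName("UTF-8")));
}

Column column = new Column(utf8("name"), utf8("value"),
        System.currentTimeMillis() * 1000);
Mutation mutation = new Mutation().setColumn_or_supercolumn(
        new ColumnOrSuperColumn().setColumn(column));
Map<ByteBuffer, Map<String, List<Mutation>>> mutationMap =
        Collections.singletonMap(utf8("rowkey"),
                Collections.singletonMap("MyCF",
                        Collections.singletonList(mutation)));
client.batch_mutate(mutationMap, ConsistencyLevel.QUORUM);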
Thanks for the confirmation, Peter.
In the company I work for I suggested many times to run repair at least once
every 10 days (gcgraceseconds is set to approx 10 days in our config) -- but
this book has been used against me :-) I will ask to run repair asap
>Original Message
>From: peter.s
insert(key, column_path, column, consistency_level) can only insert a standard
column.
Is batch_mutate the only API to insert a super column?
And also, can someone tell me why batch_insert and multi_get were removed in
version 0.7.4?
row_cache caches a whole row; key_cache caches the key and the row location.
Thus, if the request hits the row_cache, the result can be returned without a
disk seek. If it hits the key_cache, the result can be obtained after one disk seek.
Without key_cache or row_cache, it will check the index file f
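A rough self-contained sketch of that lookup order (plain maps stand in for
the two caches; the two disk helpers are hypothetical stubs, not Cassandra's
actual code):

import java.nio.ByteBuffer;
import java.util.Map;

byte[] read(Map<ByteBuffer, byte[]> rowCache,
            Map<ByteBuffer, Long> keyCache, ByteBuffer key) {
    byte[] row = rowCache.get(key);
    if (row != null)
        return row;                      // row cache hit: no disk seek at all
    Long offset = keyCache.get(key);
    if (offset != null)
        return readDataAt(offset);       // key cache hit: one seek, straight to data
    long found = searchIndexFile(key);   // miss: seek the index file first,
    return readDataAt(found);            // then seek the data file
}

byte[] readDataAt(long offset) { /* stub: seek + read the data file */ return new byte[0]; }
long searchIndexFile(ByteBuffer key) { /* stub: search the index */ return 0L; }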
Hi All,
Can you give me a bit of an idea of how key_cache and row_cache affect the
performance of Cassandra? How do these things work in different scenarios
depending on the data size?
Thank You
Nilabja Banerjee