We dumped the thread stack traces and noticed the following:
"manual-repair-d08349af-189f-47cb-9cc3-452538ce04d1" daemon prio=10
tid=0x406a3000 nid=0x1890 waiting on condition [0x7f5c97be8000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
As somewhat of a conclusion to this thread, we have resolved the major issue
having to do with the hotspots. We were balanced between availability zones in
aws/ec2 us-east - a,b,c with the number of nodes in our cluster. However we
didn't alternate by rack with the token order. We are using t
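A minimal sketch of what "alternating by rack in token order" could look like, assuming RandomPartitioner (token space 0 to 2**127) and evenly spaced tokens. The node count and zone names below are illustrative assumptions, not the poster's actual layout.

```python
# Sketch only: evenly spaced RandomPartitioner tokens, with availability
# zones interleaved so that neighbours in token order sit in different zones.
# Zone names and node count are illustrative assumptions.
RING_SIZE = 2 ** 127  # size of the RandomPartitioner token space


def alternating_ring(node_count, zones):
    """Return (token, zone) pairs in token order with zones round-robined."""
    ring = []
    for i in range(node_count):
        token = i * RING_SIZE // node_count  # evenly spaced token
        zone = zones[i % len(zones)]         # a, b, c, a, b, c, ...
        ring.append((token, zone))
    return ring


if __name__ == "__main__":
    for token, zone in alternating_ring(6, ["us-east-1a", "us-east-1b", "us-east-1c"]):
        print(zone, token)
```

With the zones interleaved like this, walking the ring from any token reaches a node in a different zone on the very next step, which is the property the message says was missing.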
Hello,
thanks for your time.
I have suggested an SCF, but I am still testing the system with a CF, running
some tests on the data flow (insert / select).
Storing the subdata as JSON already crossed my mind, but it's not possible because
later I will need to apply filters to that data, and if
I've been doing multi-tenant with cassandra for a while, and from what I
have found, it is better to keep your keyspaces down in number. That said,
I have been using composite keys for my multi-tenancy now and it works
great:
Column Family: User
Key: [AccountId]/[UserId]
This makes it super han
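A small sketch of the [AccountId]/[UserId] row-key convention described above. The helper names and the rule that account ids may not contain the separator are assumptions made for illustration.

```python
# Sketch only: build and parse "[AccountId]/[UserId]" row keys.
# Helper names and the separator restriction are illustrative assumptions.
SEPARATOR = "/"


def make_user_key(account_id, user_id):
    """Join the tenant (account) id and user id into one row key."""
    if SEPARATOR in account_id:
        raise ValueError("account id must not contain the separator")
    return f"{account_id}{SEPARATOR}{user_id}"


def split_user_key(key):
    """Split a row key back into (account_id, user_id)."""
    account_id, user_id = key.split(SEPARATOR, 1)
    return account_id, user_id


if __name__ == "__main__":
    key = make_user_key("acme", "42")
    print(key, split_user_key(key))
```

Splitting only on the first separator keeps the account id unambiguous even if a user id happens to contain a slash.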
hi, all -
I am very new to Cassandra, so please bear with me if this is really a
FAQ. We are exploring whether Cassandra is suitable for a data
management project. The basic characteristics of the data are the
following:
- it centers around data files, each data file's size can be very
small to very
Depends of course a lot on how many tenants you have.
Hopefully the new off-heap memtables in 1.0 may help as well, as Java GC on
large heaps is becoming a much bigger issue than memory cost.
Regards,
Terje
On 25 Aug 2011, at 14:20, Himanshi Sharma wrote:
>
> I am working on similar sort of st
Yes, this is what I am worrying about.
2011/8/24 Ryan King
> On Tue, Aug 23, 2011 at 10:03 AM, Alvin UW wrote:
> > Hello,
> >
> > As mentioned by Ed Anuff in his blog and slides, one way to build
> customized
> > secondary index is:
> > We use one CF, each row to represent a secondary index, wi
We are using Cassandra 0.8.0 with 8 node ring and only one CF.
Every column has a TTL of 86400 (24 hours). We also set 'GC grace seconds' to
43200 (12 hours). We have to store a massive amount of data for one day now, and
eventually for five days if we get more disk space.
Even for one day, we do run ou
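A rough worked calculation for the settings quoted above. It assumes an expired column is treated like a tombstone and can only be dropped by compaction once the GC grace period has also elapsed. That assumption is not stated in the message, and compaction timing can delay reclamation further.

```python
# Sketch only: earliest age at which a TTL'd column could be purged,
# ASSUMING purge requires both TTL expiry and GC grace to have elapsed.
TTL_SECONDS = 86400       # 24 hours, from the message
GC_GRACE_SECONDS = 43200  # 12 hours, from the message


def earliest_purge_age(ttl_seconds, gc_grace_seconds):
    """Minimum age in seconds before the data is eligible for removal."""
    return ttl_seconds + gc_grace_seconds


if __name__ == "__main__":
    age = earliest_purge_age(TTL_SECONDS, GC_GRACE_SECONDS)
    print(age, "seconds =", age / 3600, "hours")
```

Under that assumption, "one day" of data occupies disk for at least a day and a half, before counting the temporary space compaction itself needs.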
Using 0.8.2, I've created a column family called "_Schema" (without the
quotes). For some reason, I can't seem to list the rows in it from the cli:
I've tried:
[default@BIM] list _Schema;
Syntax error at position 5: unexpected "_" for `list _Schema;`.
[default@BIM] list '_Schema';
Syntax error a
Hi,
If you want to store files with partitioning/replication, you could use a
distributed file system (DFS),
like http://hadoop.apache.org/hdfs/
or any other:
http://en.wikipedia.org/wiki/Distributed_file_system
You could still use Cassandra to store any metadata and the file path in the DFS.
So: Cassandra + H
Hey Eric, the one thing I do not agree with is that it is the element of least
surprise. I would argue that the default behavior for *nix applications is
that they find out what their home directory is and operate relative to
that. Something like:
script_dir="$(dirname "$(readlink -f "${BASH_SOURCE[0]}")")"
Hmm...I've tried changing my column family name to "MySchema" instead. Now the
cli is behaving normally, but the OOM error still occurs when I
get_range_slices from my code.
From: Stephen Pope [mailto:stephen.p...@quest.com]
Sent: Thursday, August 25, 2011 11:10 AM
To: user@cassandra.apache.org
Never mind. I've got a hard-coded Count on the KeyRange set to 2 billion, which
is apparently beyond the maximum allowable.
From: Stephen Pope [mailto:stephen.p...@quest.com]
Sent: Thursday, August 25, 2011 11:15 AM
To: user@cassandra.apache.org
Subject: RE: Column Family names
Hmm...I've tried
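Rather than asking for billions of rows in one call, a range scan is usually paged. A minimal sketch of that pattern, assuming the range query treats the start key as inclusive, so each later page repeats the last key of the previous one. `fetch_rows` is a hypothetical in-memory stand-in, not a real client call, and it orders by key for simplicity.

```python
import bisect


def fetch_rows(sorted_keys, start_key, count):
    """Hypothetical stand-in for a range query: up to `count` keys >= start_key."""
    i = bisect.bisect_left(sorted_keys, start_key)
    return sorted_keys[i:i + count]


def iterate_all(sorted_keys, page_size=100):
    """Yield every key once, fetching at most `page_size` keys per call."""
    if page_size < 2:
        raise ValueError("page_size must be at least 2")
    start = ""
    skip_first = False
    while True:
        page = fetch_rows(sorted_keys, start, page_size)
        # Later pages begin with the key that ended the previous page.
        for key in (page[1:] if skip_first else page):
            yield key
        if len(page) < page_size:
            return  # a short page means the range is exhausted
        start = page[-1]
        skip_first = True


if __name__ == "__main__":
    keys = sorted(f"k{i:03d}" for i in range(25))
    print(len(list(iterate_all(keys, page_size=10))))
```

The page size stays small and bounded, so neither the client nor the server has to hold the whole range in memory at once.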
Hi Aaron,
Thanks a lot for your suggestions. I am worn out by the error below. It
would be great if you could point out what went wrong with my approach.
I wanted to install cassandra-0.8.4 on 3 nodes and to run a Map/Reduce job that
uploads data from HDFS to Cassandra.
I have installed Cassandra on
Thanks for the update
Jeremy Hanna wrote:
>
> It appears though that when choosing the non-local replicas, it looks for
> the next token in the ring of the same rack and the next token of a
> different rack (depending on which it is looking for).
Can you please explain this little more?
--
V
hi Evgeny
I appreciate the input. The concern with HDFS is that it has its own
share of problems: its name node, which is essentially a metadata
server, loads all file information into memory (roughly 300 MB per
million files) and its failure handling is far less attractive ... on
top of configuring an
You can chunk the files into pieces and store the pieces in Cassandra...
Munge all the pieces back together when delivering back to the client...
On Aug 25, 2011 6:33 PM, "Ruby Stevenson" wrote:
> hi Evgeny
>
> I appreciate the input. The concern with HDFS is that it has own
> share of problems -
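A minimal sketch of the chunking idea suggested above: split a file's bytes into fixed-size pieces and join them again on read. Modelling a file as one row with one column per chunk index is an assumption for illustration, and the chunk size is arbitrary.

```python
# Sketch only: split a file into fixed-size chunks keyed by chunk index,
# then reassemble. The row/column layout and chunk size are assumptions.
def split_into_chunks(data, chunk_size):
    """Return {chunk_index: chunk_bytes} for the given data."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        offset // chunk_size: data[offset:offset + chunk_size]
        for offset in range(0, len(data), chunk_size)
    }


def reassemble(chunks):
    """Join chunks back together in index order."""
    return b"".join(chunks[index] for index in sorted(chunks))


if __name__ == "__main__":
    payload = bytes(range(256)) * 10
    pieces = split_into_chunks(payload, 1000)
    print(len(pieces), reassemble(pieces) == payload)
```

Keeping each piece small bounds the size of any single column value, which is the concern raised elsewhere in this thread.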
I believe this is conceptually similar to what Brisk is doing under CassandraFS
(HDFS compliant file system on top of cassandra).
Robert Jackson
[1] - https://github.com/riptano/brisk
- Original Message -
From: "Sasha Dolgy"
To: user@cassandra.apache.org
Sent: Thursday, August
hi Sasha -
Yes indeed. This solution was in the second part of my original
question. It just seems "out of the norm" for what people usually use
Cassandra for; I guess I am looking for some reassurance before I roll
up my sleeves and try it.
Thanks
Ruby
On Thu, Aug 25, 2011 at 12:36 PM, Sasha Dol
How many unique last names do you anticipate having? How many characters in
the last name do you anticipate keeping in your index? You can easily do
the math to figure out how many you could fit on a node. I think you'll
find that the ceiling might be quite a bit higher than you think. If you
h
On Thu, Aug 25, 2011 at 9:33 AM, Yang wrote:
> http://twitoaster.com/country-us/lenn0x/testing-out-a-slab-allocator-for-cassandra-to-reduce-gc-promotion-failures-by-stuhood-cassandra-memtables-gc-cc-jointheflock/
>
> hi: I'm interested in learning more about the slaballocator, anyone
> has a copy
Ruby Stevenson wrote:
>
> hi Sasha -
>
> Yes indeed. this solution was in the second part of my original
> question - it just seems "out of norm" on what people usually use
> Cassandra for, I guess I am looking for some reassurance before I roll
> up the sleeve of trying it.
>
> Thanks
>
> Rub
Hmm, I somehow came across some links that mention the Cassandra SF
conference along with this one; maybe I was wrong.
Anyway, I found this link that gives very good background (on HBase, though):
http://www.cloudera.com/blog/2011/03/avoiding-full-gcs-in-hbase-with-memstore-local-allocation-buffers-part-3
On Thu, Aug 25, 2011 at 10:26 AM, Yang wrote:
> hmmm, I somehow came across some links that mentions cassandra SF
> conference with this one, maybe I was wrong.
Yes there was a conference and this topic was discussed a bit.
> anyway, found this link that gives a very good background (on Hbase th
We have a 'virtual keyspaces' feature baked into the Hector client
that might be of interest:
https://github.com/rantav/hector/wiki/Virtual-Keyspaces
On Thu, Aug 25, 2011 at 8:23 AM, Terje Marthinussen
wrote:
>
> Depends of course a lot on how many tenants you have.
> Hopefully the new off heap m
hi Robert -
This is quite interesting. CassandraFS on Google Code seems
inactive now; I don't see any release from it.
Do you know if Brisk is considered stable at all or still very experimental?
thanks
Ruby
On Thu, Aug 25, 2011 at 12:44 PM, Robert Jackson
wrote:
> I believe this is
As far as I know the CassandraFS google code project has nothing to do with the
current implementation in Brisk (although I really have no idea about that).
Some additional information about CFS in Brisk can be found in the
presentations from Cassandra SF 2011 [1]. There is a nice presentation
Why are you keeping all your indexes in the same row? We do a similar thing
(maintain several indexes over the same data) and we just have an index column
family with keys like "dest192.168.0.1" which means destination index of
192.168.0.1. You can do rows like User_Keys_By_Last_Name_adams and
Agreed, that's what I meant by "there are a lot of simple ways to split it
up over multiple rows", assuming it's necessary.
On Thu, Aug 25, 2011 at 4:24 PM, Konstantin Naryshkin
wrote:
> Why are you keeping all your indexes in the same row? We do a similar thing
> (maintain several indexes over the
Thanks.
Assume I use this approach, use the last names as the row keys of secondary
index, and use the base column family key as the column name.
There may be a duplicate key issue. We could solve it with a composite key, like
"adams_1", "adams_2".
Then we can query these indexes with a range query starting
Ernst,
Can you share the logs from just before the crash? Especially the
GCInspector logs. Check the last reported used heap space and whether
it was close to the threshold for full GC.
Also, how frequent are your OOM crashes?
The cassandra default for kicking in full GC is 75% (
-XX:CMSInitiati
With a fresh Cassandra install and a pre-built client, what error do you get?
Can you connect with nodetool? If not, what error do you get?
What about the Cassandra CLI?
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 25/08/2011, at 2:51 AM,
Could you put together some information on this in a ticket and reference this
one: https://issues.apache.org/jira/browse/CASSANDRA-3071
The short term fix is to disable HH. You will still get consistent reads.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
ht
What about if you do a get? What happens if you restart the cassandra-cli?
If you can reproduce the fault with a CLI script, please create a JIRA ticket
here: https://issues.apache.org/jira/browse/CASSANDRA
Thanks
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http:
> One final question: should I add new nodes as Brisk instances instead of my
> home brew cassandra + hadoop nodes? I've obviously already put in the
> pain/effort of learning how to run hadoop + cassandra…
Yes, it will make your life easier.
> create keyspace civicscience with replication_factor=3 an
Well you could group all the duplicate adams as columns in the same row. This
has several advantages:
* one, I am not sure what partitioner you plan to use, but if you plan to do
key range queries over all the same last names, you cannot use a
RandomPartitioner since it does not support key ran
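A small in-memory sketch of the layout suggested above: one index row per last name, with the base column family's row keys stored as column names, so duplicates such as several "adams" share one row. The class and method names are hypothetical, and a plain dict stands in for the index column family.

```python
from collections import defaultdict


class LastNameIndex:
    """Sketch only: index row key = last name, column names = base row keys."""

    def __init__(self):
        # {last_name: {user_key: empty value}}; column values are unused.
        self._rows = defaultdict(dict)

    def add(self, last_name, user_key):
        self._rows[last_name.lower()][user_key] = b""

    def remove(self, last_name, user_key):
        self._rows.get(last_name.lower(), {}).pop(user_key, None)

    def lookup(self, last_name):
        """Return all base row keys for a last name, in column order."""
        return sorted(self._rows.get(last_name.lower(), {}))


if __name__ == "__main__":
    index = LastNameIndex()
    index.add("Adams", "user:17")
    index.add("adams", "user:03")
    print(index.lookup("ADAMS"))
```

Because every user with the same last name lands in the same row, a lookup is a single-row read rather than a key range query, which sidesteps the RandomPartitioner limitation mentioned above.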
That's a thread waiting for other threads / activities to complete. Nothing
unusual there.
Work out how far the repair gets. Is there a validation compaction listed in
nodetool compactionstats? Are there any streams running in nodetool netstats?
Look through the logs on the machine you st
> later i will need to apply filter to that data,
Sounds like a read query you should support by denormalising the data.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 25/08/2011, at 10:50 PM, Helder Oliveira wrote:
> Hello,
>
If Cassandra does not have enough disk space to create a new file, it will
provoke a JVM GC, which should result in compacted SSTables that are no longer
needed being deleted. Otherwise they are deleted at some time in the future.
Compacted SSTables have a file written out with a "compacted" extens
Hi,
I am looking for an efficient way to migrate a portion of the data existing in
a Cassandra cluster to another, separate Cassandra cluster. What I need is
to solve the typical live migration problem that appears in any "DB
sharding" setup, where you need to transfer "ownership" of certain rows from DB1 to
D
Thanks. The only logs I have are system and cassandra. I've included
those. I don't have gcinspector logs. I log gc via munin on other
machines, but I need to install it on these.
On 8/25/11 2:22 PM, Adi wrote:
Ernst,
Can you share the logs just before the crash. Specially the
GC
>
>
> create keyspace civicscience with replication_factor=3 and
> strategy_options = [{us-east:3}] and
> placement_strategy='org.apache.cassandra.locator.NetworkTopologyStrategy';
>
> FYI the replication_factor property with the NTS is incorrect, the next(?)
> revision of 0.8 will raise an error
No pending tasks for compactionstats and netstats.
On Fri, Aug 26, 2011 at 6:07 AM, aaron morton wrote:
> That's a thread waiting for other threads / activities to complete. Nothing
> unusual there.
>
> Work out how far the repair gets. Is there a validation compaction listed
> in nodetool compa
Dear all,
My Cassandra 082 server is using a very large amount of swap memory.
JConsole shows just 2.9GB of memory used, but htop (top) shows the Cassandra
process taking 8700MB.
Here is my config:
MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="400M"
JVM_OPTS="$JVM_OPTS -XX:+UseParNewGC"
JVM_OPTS="$JVM_OPTS -XX:+UseConcMarkSweepGC"
JVM_O
On Thu, Aug 25, 2011 at 10:13 AM, Koert Kuipers wrote:
> hey eric, the one thing i do not agree that it is the element of least
> surprise. i would argue that the default behavior for *nix applications is
> that they find out what their home directory is and operate relative to
> that. something
Sounds like http://wiki.apache.org/cassandra/FAQ#mmap
On Thu, Aug 25, 2011 at 10:36 PM, King JKing wrote:
> Dear all,
> My Cassandra 082 server had very large swap memory.
> JConsole show memory used just 2.9GB. But htop (top) show Cassandra process
> take 8700MB.
> Here is my config:
> MAX_HEAP_
On Thu, Aug 25, 2011 at 6:31 AM, Ruby Stevenson wrote:
> - Although Cassandra (and other decentralized NoSQL data store) has
> been reported to handle very large data in total, my preliminary
> understanding is the individual "column value" is quite limited. I
> have read some posts saying you sho
Dear Jonathan,
The Cassandra process has a virtual size of 63.5 GB.
I am referring to the RES column in top. RES is 8.3G, much larger than the 2.5G
of used memory shown in JConsole.
On Fri, Aug 26, 2011 at 11:04 AM, Jonathan Ellis wrote:
> Sounds like http://wiki.apache.org/cassandra/FAQ#mmap
>
> On Thu, Aug 25,
Hi All,
It looks like this is a known issue with Cassandra-0.8.4. So either I have to wait
until 0.8.5 is released or switch to 0.7.8, if this has been resolved in
that.
Ref: https://issues.apache.org/jira/browse/CASSANDRA-3044
Regards,
Thamizhannal P
--- On Thu, 25/8/11, Thamizh wrote:
Fr