I have seen supercolumn usage discouraged most of the time.
However, sometimes supercolumns seem to fit the scenario most
appropriately, not only in terms of how the data is stored but also in
terms of how it is retrieved. Some of the queries we need are ones that
only SCs seem capable of supporting.
Some thoughts on the plan:
* You are monkeying around with things, do not be surprised when surprising
things happen.
* Deliberately unbalancing the cluster may lead to Bad Things happening.
* In the design discussed it is perfectly reasonable for data not to be on the
archive node.
* Truncat
In a traditional database it's not a good idea to have hundreds of tables,
but is it also bad to have hundreds of column families in Cassandra? Thank
you.
Hi Aaron,
Thanks for your response!
I re-ran test case #5 (Nodes 1 & 2 running, Nodes 3 & 4 down, Node 1
contains the data, CL ONE and RF 2). I was connected to Node 1 while I ran
the test. I still did not get any data. See the logs below:
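For reference, a minimal pycassa sketch of the kind of read involved; the keyspace 'Keyspace1', column family 'TestCF', node address 'node1', and row key 'testkey' are hypothetical stand-ins. With RF 2 and CL ONE the read only needs one live replica to answer, so the outcome depends on which replicas hold the row and are up:

import pycassa
from pycassa import ConsistencyLevel, NotFoundException

# Connect only to Node 1 so we know which coordinator handles the read.
pool = pycassa.ConnectionPool('Keyspace1', ['node1:9160'])
cf = pycassa.ColumnFamily(pool, 'TestCF',
                          read_consistency_level=ConsistencyLevel.ONE)

try:
    print(cf.get('testkey'))   # succeeds if at least one live replica returns the row
except NotFoundException:
    print('no data returned')  # the row was not found by the replicas consulted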
Now that my cluster appears to run smoothly, and after a few successful
repairs and compactions, I'm back in the business of deleting portions
of data based on their date of insertion. For reasons too lengthy to
explain here, I don't want to use TTL.
I use a batch mutator in Pycassa to delete ~
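For illustration, a minimal sketch of batch deletion with pycassa's Mutator; the keyspace 'Keyspace1', column family 'Events', and the rows_to_trim data below are hypothetical placeholders, not the poster's actual schema:

import pycassa
from pycassa.batch import Mutator

pool = pycassa.ConnectionPool('Keyspace1', ['localhost:9160'])
events = pycassa.ColumnFamily(pool, 'Events')

# (row key, column names to delete) pairs -- placeholder data for the sketch
rows_to_trim = [
    ('2011-12-01', ['evt-001', 'evt-002']),
    ('2011-12-02', ['evt-003']),
]

b = Mutator(pool, queue_size=500)              # mutations flush every 500 operations
for key, old_columns in rows_to_trim:
    b.remove(events, key, columns=old_columns)  # writes tombstones for those columns
b.send()                                        # flush any remaining mutations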
If you have the time, turn logging up to DEBUG and start again; it will log
where it failed. Put the logs aside in case there is a bug there.
To get things running again: Move the commit log segment out of the directory
and try the restart. Then run a repair from the node.
Have you made any re
Sounds good.
You can take some extra steps when doing a rolling restart; see
http://blog.milford.io/2011/11/rolling-upgrades-for-cassandra/
Also make sure repair *does not* run until all the nodes have been upgraded.
> Do I miss something (I will back up everything before the
> upgrade)?
I'm
I know it's a throwaway example, but I would probably structure your
column names the other way around in that case,
i.e. steve.4, steve.5, steve.6, greg.4, greg.6, greg.9,
and then do two slice queries, steve.4-steve.10 and greg.4-greg.10.
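A minimal pycassa sketch of those two slices; the keyspace 'Keyspace1', column family 'Scores', and row key 'rowkey' are assumptions, and the numbers are zero-padded here so the string comparator orders them numerically:

import pycassa

pool = pycassa.ConnectionPool('Keyspace1', ['localhost:9160'])
scores = pycassa.ColumnFamily(pool, 'Scores')

# Column names like "steve.04" keep lexicographic order equal to numeric order,
# so each name's columns form one contiguous slice.
steve_cols = scores.get('rowkey', column_start='steve.04', column_finish='steve.10')
greg_cols = scores.get('rowkey', column_start='greg.04', column_finish='greg.10')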
On 04/01/2012 15:41, Jeremiah Jordan wrote:
You can't use a slic
Will be fixed in 1.0.7 https://issues.apache.org/jira/browse/CASSANDRA-3656
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 4/01/2012, at 11:26 PM, Michael Vaknine wrote:
> Hi,
>
> I have a 4-node cluster on version 1.0.3 which was upgraded from
Hi,
I can't start a node of my cluster. Could someone help me track down the
problem?
Using: debian with cassandra 1.0.6.
root@carlo-laptop:/etc/cassandra# cat /var/log/cassandra/output.log
INFO 13:46:00,596 JVM vendor/version: Java HotSpot(TM) 64-Bit Server
VM/1.6.0_26
INFO 13:46:00,600 Heap si
Unless you are running into an issue with super columns that makes
composite columns a better fit for what you are trying to do, I would just
stick with super columns. "If it ain't broke, don't fix it."
-Jeremiah
On 01/03/2012 11:21 PM, Asil Klin wrote:
@Stephan: in that case, you can easil
You can't use a slice range. But you can query for the specific
columns. "4.steve", "5.steve", "6.steve" ... "4.greg", "5.greg",
"6.greg". Just have to ask for all of the possible columns you want.
On 01/03/2012 04:31 PM, Stephen Pope wrote:
The bonus you're talking about here, how do I a
On 04.01.12 14:25, Radim Kolar wrote:
> So, what are cassandra memory requirement? Is it 1% or 2% of disk data?
It depends on the number of rows you have. If you have a lot of rows, the
primary memory eaters are index sampling data and bloom filters. I use
index sampling 512 and bloom filters set
Hi,
I'm going to migrate from Cassandra 0.7 to 1.0 in production and I'd like to
know the best way to do it ...
"Upgrading from version 0.7.1+ or 0.8.2+ can be done with a rolling restart,
one node at a time. (0.8.0 or 0.8.1 are NOT network-compatible with 1.0:
upgrade to the most recent 0.8 r
I don't think I can tell my exact column names in many cases. For example, most
of our queries are for specific keys and an unknown range of numbers (like
key1, key where number > 1). How can I set up my slice in this case to
retrieve only the columns that match both criteria?
Cheers,
Steve
Hi,
On Tue, Jan 3, 2012 at 8:19 PM, aaron morton wrote:
> Running a time based rolling window of data can be done using the TTL.
> Backing up the nodes for disaster recover can be done using snapshots.
> Restoring any point in time will be tricky because you may restore columns
> where the TTL h
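As a side note, a minimal sketch of the TTL-based rolling window mentioned in the quote above; the keyspace 'Keyspace1', column family 'Events', row key, and the 7-day window are all assumptions for illustration:

import pycassa

pool = pycassa.ConnectionPool('Keyspace1', ['localhost:9160'])
events = pycassa.ColumnFamily(pool, 'Events')

# Columns written with a TTL expire on their own, so old data rolls off
# the window without an explicit delete job.
events.insert('2012-01-04', {'event-001': 'payload'}, ttl=7 * 24 * 3600)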
> Looking at heap dumps, a lot of memory is taken by memtables, much
more than 1/3 of the heap. At the same time, the logs say that there is nothing to
flush since there are no dirty memtables.
I've seen this too.
> So, what are cassandra memory requirement? Is it 1% or 2% of disk data?
It depends on numbe
Ed,
thanks for a dose of common sense, I should have thunk about it.
In fact, I only have 2 columns in that one particular CF, but one of
these can get really fat (for a good reason). So the CLI just plain runs
out of memory when pulling the default 100 rows (with a little help from
various o
Hello.
BTW: It would be great for Cassandra to shut down on errors like OOM,
because right now I am not sure whether the problem described in the previous
email is the root cause, or whether one of the OOM errors found in the log made
some "writer" stop.
I am now looking at different OOMs in my cluster. Currently each node
h
Hi,
I have a 4-node cluster on version 1.0.3 which was upgraded from 0.7.6 in 2 stages:
upgrade to 1.0.0 and run scrub on all nodes,
then upgrade to 1.0.3.
I keep getting these errors from time to time on all 4 nodes.
Is there any maintenance I can do to fix the problem?
I tried to run repair on the clu
Look at nodetool tpstats when you get the TimedOutException to work out
which nodes are backing up with pending messages. Then try to identify why.
Check the server logs for GC activity, and the CPU and IO usage.
Somehow the cluster is getting overwhelmed and cannot respond. Either the
clients hav
You do not need a restart if you use nodetool refresh
Otherwise a rolling restart will do the trick.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
On 4/01/2012, at 12:31 PM, Jim Newsham wrote:
>
> Thanks, that's very helpful. I'm assuming
I've not spent much time with secondary indexes, so a couple of questions.
What is the output of nodetool ring?
Which node were you connected to when you did the get?
If you enable DEBUG logging, what do the log messages from StorageProxy say that
contain the string "scan ranges are" and
> I was able to scrub the node that the failed repair was running on. Are you
> saying the error could be displayed on that node but the bad data could be coming from
> another node?
Yes. The error occurred while the node was receiving a data stream from another node; you
will need to clean the source of the data.