the former, but also see http://issues.apache.org/jira/browse/CASSANDRA-1530
On Wed, Oct 6, 2010 at 9:22 PM, MK wrote:
> Say I have a cluster of N nodes and I have started all the nodes with
> a replication factor of N. So effectively all data is being mirrored
> everywhere.
>
> Now, when I write
Say I have a cluster of N nodes and I have started all the nodes with
a replication factor of N. So effectively all data is being mirrored
everywhere.
Now, when I write to a node, how does this data get propagated to the
remaining N-1 nodes?
1) Does this one origin node do N-1 network operations
Creating indexes takes extra space (does in MySQL, PGSQL, etc too).
https://issues.apache.org/jira/browse/CASSANDRA-749 has quite a bit of
detail about how the secondary indexes currently work.
On Wed, Oct 6, 2010 at 7:17 PM, Alvin UW wrote:
> Hello,
>
> Before 0.7, actually we can create an ex
Rob is correct.
drain is really only there for when you need the commit log to be empty (some
upgrades or a complete backup of a shutdown cluster).
There really is no point in using it to shut down C* normally, just kill it...
On Wed, Oct 6, 2010 at 4:18 PM, Rob Coli wrote:
> On 10/6/10 1:13 PM, Aar
I'm seeing cases where the count in the SliceRange predicate is not
respected. This only happens for super columns. I'm running
Cassandra 0.6.4 on a single node.
Steps to reproduce, using the Keyspace1.Super1 CF:
* insert three super columns, bar1, bar2, and bar3, under the same key
* delete bar1
Hello,
Before 0.7, we could actually create an extra ColumnFamily as a secondary
index, if we needed one.
I was wondering whether the secondary index mechanism in 0.7 is just like
creating an extra ColumnFamily as an index.
The only difference is that users don't have to take care of the maintenance
of the secondary index.
can you tar.gz the filter/index/data files for this sstable and attach
it to a ticket so we can debug?
if you can't make the data public you can send it to me off list and I
can have a look.
On Wed, Oct 6, 2010 at 11:37 AM, Narendra Sharma
wrote:
> Has any one used sstable2json on 0.6.5 and noti
Ryan,
Independent of this ambiguous requirement, what were you thinking about? What I
am trying to ask is: can you be more specific/concrete about when you can
Simon Reavely
On Oct 5, 2010, at 11:30 AM, Ryan King wrote:
> On Tue, Oct 5, 2010 at 8:23 AM, Ian Rogers
> wrote:
>>
>> Does Cassan
On 10/6/10 1:13 PM, Aaron Morton wrote:
To shut down cleanly, say in a production system, use nodetool drain
first. This will flush the memtables and put the node into a read-only
mode. AFAIK this also gives the other nodes a faster way of detecting
that the node is down, via the drained node gossiping
That's a lot of questions, I'll try to answer some...
Read/Write latency as reported for a CF is the time taken to perform a local
read on that node. Read/Write latency reported on the
o.a.c.service.StorageProxy is the time taken to process a complete request,
including local and remote reads when C
The SCs are stored on disk in the order defined by the compareWith setting,
so if you want them back in a different order either someone is sorting them
(C*, which doesn't sort them right now, or the client; which doesn't make
much of a difference, it's just moving the load around) or you're
denormalizing
>
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
>
>
Yes, please file it to Jira. It seems like it would be pretty useful for
various things and fairly easy to change the code to move it to another
directory whenever C* thinks it should be deleted...
To shut down cleanly, say in a production system, use nodetool drain first.
This will flush the memtables and put the node into a read-only mode. AFAIK
this also gives the other nodes a faster way of detecting that the node is
down, via the drained node gossiping its new status. Then kill.
Aaron
On 07 Oct
Some relevant reading if you're interested:
http://dslab.epfl.ch/pubs/crashonly/
http://web.archive.org/web/20060426230247/http://crash.stanford.edu/
On Wed, Oct 6, 2010 at 1:46 PM, Scott Mann wrote:
> Yes. ctrl-C if running in the foreground. Use kill , if running
> in the background (see the
There is an explanation of how to lock the JVM into memory here:
http://www.riptano.com/blog/whats-new-cassandra-065
However, from the JVM Heap Size section here:
http://wiki.apache.org/cassandra/MemtableThresholds
For a rough rule of thumb, Cassandra's internal datastructures will require
about memtabl
As jbellis mentioned, the secondary indexes in 0.7 will work for this, but in
the meantime you can still index this manually in .6 (which will continue
to work in .7 if need be).
There are several ways to attack this now. If you don't have too many users
you can have a row with "age" as the row k
Ok, Thank you all. More reading to do :)
On Oct 6, 2010, at 3:21 PM, Jonathan Ellis wrote:
> On Wed, Oct 6, 2010 at 1:49 PM, Brayton Thompson
> wrote:
>> Ok, let me tweak the scenario a tiny bit. What if I wanted something
>> extremely arbitrary, for instance... simple comparisons like a WHERE
On Wed, Oct 6, 2010 at 1:49 PM, Brayton Thompson wrote:
> Ok, let me tweak the scenario a tiny bit. What if I wanted something
> extremely arbitrary, for instance... simple comparisons like a WHERE clause
> in SQL
> get Users.someuser['uuid'] where Users.someuser['age'] > 33
>
> From what i'
So would my best bet be to simply get ALL of my users' uuids and ages,
then throw away those that do not meet the required test?
And in fact this is also what a traditional database does when you need a
table scan. And this will happen if you have not prepared an index on
that column. (
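The scan-and-filter approach described above can be sketched as follows. This is a hypothetical stand-in: rows are plain dicts, where a real client would page through keys with get_range_slices.

```python
# Hypothetical sketch of a client-side "table scan": fetch every row,
# then keep only those passing the predicate. With a real Cassandra 0.6
# client you would page through keys (e.g. get_range_slices) instead of
# holding a dict in memory.

def scan_users(rows, predicate):
    """Return the rows whose columns satisfy the predicate."""
    return {uuid: cols for uuid, cols in rows.items() if predicate(cols)}

users = {
    "u1": {"age": 41, "email": "a@example.com"},
    "u2": {"age": 25, "email": "b@example.com"},
    "u3": {"age": 37, "email": "c@example.com"},
}

# The WHERE age > 33 case from the thread, done entirely client-side.
over_33 = scan_users(users, lambda cols: cols["age"] > 33)
print(sorted(over_33))  # ['u1', 'u3']
```

The cost is proportional to the total number of rows, which is exactly why an index (manual in 0.6, built-in in 0.7) is preferable for large user counts.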
Commitlog segments remain until all the data in them has been flushed.
Reduce MemtableFlushAfterMinutes.
If I had to guess why the node went down without seeing your error log, I
would guess you exceeded the open file handle allowance. You can
increase that with the standard ulimit or /etc/security/lim
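A quick way to inspect the current file-handle limit from Python (raising it is done with ulimit or limits.conf as the post says); `resource` is POSIX-only:

```python
# Check the open-file-handle limit the process is running under.
# Cassandra keeps a handle per open sstable component, so a node with
# many sstables can exhaust a low default (often 1024).
import resource

soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
print(f"open file handles: soft={soft} hard={hard}")
```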
Hello Experts,
I see some queer behavior from one of the Cassandra nodes in my cluster:
the data is not flushed off the commitlogs, and the number of commitlog files
keeps growing. I was inserting data into the cluster, and since yesterday this
node has had more than 900 commitlog files.
-rw-r--r-- 1 dev dev
Ok, let me tweak the scenario a tiny bit. What if I wanted something extremely
arbitrary, for instance... simple comparisons like a WHERE clause in SQL:
get Users.someuser['uuid'] where Users.someuser['age'] > 33
From what I've read, this functionality defeats the point of Cassandra becau
Yes. Ctrl-C if running in the foreground. Use kill if running
in the background (see the man page for kill if you are unfamiliar
with it). Killing Cassandra is the only way to terminate it.
On Wed, Oct 6, 2010 at 11:03 AM, Alberto Velandia wrote:
> So, is ctrl + C how you stop cassandra? or I'm
On 10/6/10 9:05 AM, Utku Can Topçu wrote:
The nodes are still swapping, even though the swappiness is set to zero
right now. After swapping comes the OOM.
https://issues.apache.org/jira/browse/CASSANDRA-1214
?
=Rob
Hi,
On a first pass, that patch seems to have solved the problem.
I'll be testing that functionality repeatedly over the next day or so; I'll
let you know how it fares.
Thanks
Jason
On Wed, Oct 6, 2010 at 4:06 PM, Stu Hood wrote:
> Hey JT,
>
> I believe this issue should be fixed by CASSANDRA-15
As Norman said, secondary indexes are only in .7, but you can create standard
indexes yourself in both .6 and .7.
Basically, have an email_domain_idx CF where the row key is the domain and the
column names hold the row ids of the users (the column value is unused in this
scenario). This sounds basically like wh
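The manual-index pattern described here can be sketched with plain dicts standing in for the two column families. The names (users, email_domain_idx, add_user) are illustrative, not a real client API:

```python
# Sketch of a manual secondary index: one "CF" for users, one "CF"
# keyed by email domain whose column names are the matching user row
# keys (the column value is unused, as in the post).

users = {}             # row key -> columns
email_domain_idx = {}  # domain  -> {user row key: ''}

def add_user(user_id, email):
    users[user_id] = {"email": email}
    domain = email.split("@", 1)[1]
    # Index write must accompany every data write, since the client
    # (not Cassandra 0.6) maintains this index.
    email_domain_idx.setdefault(domain, {})[user_id] = ""

def users_in_domain(domain):
    # One read of the index row yields every matching user key.
    return sorted(email_domain_idx.get(domain, {}))

add_user("u1", "alice@aol.com")
add_user("u2", "bob@example.com")
add_user("u3", "carol@aol.com")
print(users_in_domain("aol.com"))  # ['u1', 'u3']
```

The trade-off, as the thread notes, is that the application must keep the index row in sync on every insert and delete; 0.7's built-in secondary indexes move that maintenance into Cassandra.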
Only in 0.7
Bye,
Norman
2010/10/6 Brayton Thompson :
> Are secondary indexes available in .6.5? Or are they only in .7?
> On Oct 6, 2010, at 1:15 PM, Tyler Hobbs wrote:
>
> If you're interested in only checking part of a column's value, you can
> generally
> just store that part of the value in a
Are secondary indexes available in .6.5? Or are they only in .7?
On Oct 6, 2010, at 1:15 PM, Tyler Hobbs wrote:
> If you're interested in only checking part of a column's value, you can
> generally
> just store that part of the value in a different column. So, have an
> "email_addr" column
> a
Hmm, I thought the Thrift API was moved to 18 before beta2 was released.
I'll make a matching release for pycassa in just a moment. Thanks for the
notice.
By the way, there is a pycassa specific mailing list,
pycassa-disc...@googlegroups.com
- Tyler
On Wed, Oct 6, 2010 at 12:13 PM, Dipti Mathu
If you're interested in only checking part of a column's value, you can
generally just store that part of the value in a different column. So, have
an "email_addr" column and an "email_domain" column, which stores "aol.com",
for example.
Then you can just use a secondary index on the "email_domain
Hi All,
I was trying to connect to Cassandra using the pycassa module. Looks like
there is an API version mismatch. Any ideas where I can get the right version
of the APIs?
I am using:
INFO 22:11:50,860 Cassandra version: 0.7.0-beta2
INFO 22:11:50,861 Thrift API version: 17.1.0
Error message on p
So, is Ctrl + C how you stop Cassandra? Or am I better off doing it another way?
Thanks
On Oct 6, 2010, at 11:59 AM, Norman Maurer wrote:
> Ctrl + Z does not stop a program, it just suspends it. You will need to
> resume it with "fg" and then hit Ctrl + C to stop it.
>
> For some basic background:
Ctrl + Z does not stop a program, it just suspends it. You will need to
resume it with "fg" and then hit Ctrl + C to stop it.
For some basic background:
http://linuxreviews.org/beginner/jobs/
Bye,
Norman
2010/10/6 Alberto Velandia :
> Hi I've stopped cassandra hitting Ctrl + Z and tried to rest
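The suspend-versus-terminate distinction Norman describes can be demonstrated with signals on a POSIX system (Ctrl+Z sends SIGTSTP, `fg` sends SIGCONT, Ctrl+C sends SIGINT; SIGTERM is used here as the "really stop it" signal):

```python
# Show that a stop signal only suspends a process, while a terminating
# signal actually ends it. Uses a throwaway `sleep` child process.
import signal
import subprocess

child = subprocess.Popen(["sleep", "60"])

child.send_signal(signal.SIGTSTP)   # like Ctrl+Z: suspended, not gone
assert child.poll() is None         # the process still exists

child.send_signal(signal.SIGCONT)   # like `fg`: resume it
child.terminate()                   # like Ctrl+C / kill: really stop it
child.wait()
print(child.returncode)             # negative signal number on POSIX
```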
Hi, I've stopped Cassandra by hitting Ctrl + Z, tried to restart it, and got this
message:
INFO 11:46:16,039 JNA not found. Native methods will be disabled.
INFO 11:46:16,159 DiskAccessMode 'auto' determined to be mmap, indexAccessMode
is mmap
ERROR 11:46:16,449 Fatal exception during initializa
Thanks Oleg!
Could you please share the patch? I have built Cassandra from source before.
I can definitely give it a try.
-Naren
On Wed, Oct 6, 2010 at 3:55 AM, Oleg Anastasyev wrote:
> > Is it possible to retain the commit logs?
>
> In off-the-shelf cassandra 0.6.5 this is not possible, AFAIK.
Has anyone used sstable2json on 0.6.5 and noticed the issue I described in
my email below? This doesn't look like a data corruption issue, as sstablekeys
shows the keys.
Thanks,
Naren
On Tue, Oct 5, 2010 at 8:09 PM, Narendra Sharma
wrote:
> 0.6.5
>
> -Naren
>
>
> On Tue, Oct 5, 2010 at 6:56 PM, J
Hi Oleg,
I've been also looking into these after some research.
I've been tackling it with:
1. Setting the default max and min heap from 1G to 1500M.
2. I'm not using row caches, and the key caches are set to 1000; before they
were the default of 200K.
3. I've lowered the memtable throughput to 32MB
4. We
> PS. Are other ppl interested in this functionality ?
> I could file it to JIRA as well...
I was about to post that such a thing was useful for point-in-time
recovery before reading your post, so yes :)
--
/ Peter Schuller
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Ok, I am VERY new to Cassandra and trying to get my head around its core ideas.
So let's say I have a CF of Users that contains all the info I would ever want
to know about them. One day I decide (for some reason) that I want to send a
mass email to
Hey JT,
I believe this issue should be fixed by CASSANDRA-1571... if you're able to
test that patch, it would be very helpful.
Thanks,
Stu
-----Original Message-----
From: "J T"
Sent: Tuesday, October 5, 2010 9:50pm
To: cassandra-u...@incubator.apache.org
Subject: Null Pointer Exception / Seco
Ah, great, thanks. I was looking under trunk/src/java/... instead of
trunk/interface/...
Dan
From: Michal Augustýn [mailto:augustyn.mic...@gmail.com]
Sent: October-06-10 10:38
To: user@cassandra.apache.org
Subject: Re: Column TTL
Hi,
I checked Cassandra.thrift file and found:
@
Hi,
I checked Cassandra.thrift file and found:
@param ttl. An optional, positive delay (in seconds) after which the column
will be automatically deleted.
Augi
2010/10/6 Dan Hendry
> Hi,
>
>
>
> I have a quick and quite frankly ridiculous question regarding the column
> TTL value; what are t
Hi,
I have a quick and quite frankly ridiculous question regarding the column
TTL value: what are the time units? Milliseconds, seconds, or something else?
I initially thought milliseconds, given that it is Java and that is what
timestamps are in, but the data type used in the setTtl() Java thrif
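Per the Cassandra.thrift comment quoted earlier in the thread, TTL is a delay in seconds. A toy model of the semantics (names and the dict-based store are illustrative, not a client API):

```python
# Toy model of column TTL: a column written at time t with ttl n is
# treated as deleted once now >= t + n. All times are in seconds.
import time

def write(store, name, value, ttl, now=None):
    now = time.time() if now is None else now
    store[name] = (value, now + ttl)   # record the expiry timestamp

def read(store, name, now=None):
    now = time.time() if now is None else now
    value, expires = store.get(name, (None, 0.0))
    return value if now < expires else None   # expired reads as absent

cols = {}
write(cols, "session", "abc123", ttl=30, now=1000.0)
print(read(cols, "session", now=1020.0))  # abc123 (still live)
print(read(cols, "session", now=1031.0))  # None (expired)
```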
I have been seeing some strange trends in read latency that I wanted to
throw out there to find some explanations. We are running .6.5 in a 10-node
cluster with rf=3. We find that the read latency reported by cfstats is
always about 1/4 of the actual time it takes to get the data back to the
python
> Is it possible to retain the commit logs?
In off-the-shelf Cassandra 0.6.5 this is not possible, AFAIK.
I developed a patch we use internally in our company for commit
log archiving and replay.
I can share the patch with you, if you dare to patch the Cassandra
sources yourself ;-)
PS. Are o
>
> Hi All,
> We're currently starting to get OOM exceptions in our cluster. I'm trying
> to push the limitations of our machines. Currently we have 1.7G of memory
> (ec2-medium). I'm wondering, by tweaking some of Cassandra's configuration
> settings, is it possible to make it live in peace with less memory?
Yes - the HadoopSupport should be updated for the functionality that was added
in 0.7. It's still a little in flux. There is an output format and output
streaming support on trunk/0.7 beta2. The output format has a Java example in
the contrib/word_count example code. The output streaming, whi
AFAIK you can submit a Pig job to the Hadoop job server via the Pig command
line interface. If you have not done so already, have a read of the Hadoop
book; it discusses Pig as well:
http://bit.ly/9gGRyH
Not sure how you go about monitoring the Hadoop job, though.
There is support for hadoop to o
> PHP: I basicaly need to start pig program from a php script (via thrift or
> something..?)
Can't you just execute a Pig script from PHP by calling Pig with a PHP exec
function call? I'm not sure what you're trying to do with it, but that's one
way you could do it.
> PIG: there is a LoadFunc
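The shell-out approach suggested above looks the same in any language; here Python's subprocess stands in for PHP's exec, and `script.pig` is a hypothetical path. The guard keeps the sketch runnable on machines without Pig installed:

```python
# Launch a Pig script as an external process, the way PHP's exec would.
# "script.pig" is a placeholder; -x local runs Pig in local mode.
import shutil
import subprocess

cmd = ["pig", "-x", "local", "script.pig"]

if shutil.which("pig") is None:
    print("pig not on PATH; would run:", " ".join(cmd))
else:
    result = subprocess.run(cmd, capture_output=True, text=True)
    print("pig exited with", result.returncode)
```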
If you turn the log level up to DEBUG, that will include information about
each request. Would that help? You could restrict it by setting a logging
configuration for the specific classes that output the messages you are
interested in.
Not sure about retaining the commit logs.
Aaron
On 6 Oct 20
You're sort of right on point two. The comparators you define in the keyspace
definition are for the names of the columns (or super columns), not their
values. So it's not possible to sort by the value of your name column; you'll
need to do it client side.
The indexing features in 0.7 can sort the value
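The name-order versus value-order distinction above can be sketched client-side; a dict of column name to value stands in for a fetched slice:

```python
# Columns come back from Cassandra ordered by *name* (compareWith);
# ordering by *value* has to happen client-side after fetching a slice.

columns = {"u1": "Smith", "u2": "Adams", "u3": "Jones"}  # name -> value

by_name = sorted(columns)                    # the order Cassandra returns
by_value = sorted(columns, key=columns.get)  # the re-sort you do yourself

print(by_name)   # ['u1', 'u2', 'u3']
print(by_value)  # ['u2', 'u3', 'u1']
```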
Cassandra Version: 0.6.5
I am running a long-duration test and I need to keep the commit log to see
the sequence of operations to debug a few application issues. Is it possible
to retain the commit logs? Apart from increasing the value of
CommitLogRotationThresholdInMB,
what is the other way to achie
Pig does not have a Thrift interface, but I believe you could create one.
Another option is to create a web service for your Pig job and
call the web service from your PHP.
On Wed, Oct 6, 2010 at 4:17 PM, Petr Odut wrote:
> Hi,
> PHP: I basically need to start a pig program from a php script (vi
Hi,
I've been battling some errors that only seem to crop up when I'm
messing around with secondary indices in 0.7-beta2.
Namely, errors like this start to happen after I 'delete' a
row in a CF that has a couple of secondary indices on it and then at some
point later try to
Hi,
PHP: I basically need to start a Pig program from a PHP script (via Thrift or
something..?)
PIG: there is a LoadFunc that loads data from Cassandra; is there also a
StoreFunc?
On Tue, Oct 5, 2010 at 9:22 PM, Aaron Morton wrote:
> There is an example for pig in contrib/pig and a hadoop example i
Aaron,
first of all, thanks for your time.
1. You cannot return just the super columns, you have to get their sub columns
as well. The returned data is ordered; please provide an example of where it
is not.
I don't know what I did before, but now I checked and the data is sorted as I
expected th