There is a current strategy to use Cassandra for data storage, and it makes
sense to have user management and roster management exist in the same place
for all the different services that we provide.
Specific to user interaction, I started looking at ejabberd because Apache
Vysper is not as featur
Is there any reason why you would be interested in using Erlang with
Cassandra instead of another Erlang-based database [e.g. Couchbase, Riak]?
I am interested to know the reason.
Kind regards,
Joshua
On Sat, Feb 19, 2011 at 9:39 AM, Sasha Dolgy wrote:
> hi,
> does anyone have an erlang example for
Dude, I never mentioned the server side, sorry if it wasn't obvious.
As for python being slow, I'm not going away from it. It performs
amazingly well in other circumstances.
Jonathan Ellis-3 wrote:
>
> That doesn't make sense to me. IntegerType validation is a no-op and
> LongType validation i
That doesn't make sense to me. IntegerType validation is a no-op and
LongType validation is pretty close (just a size check).
If you meant that the conversion is killing performance on your
client, you should switch to a more performant client language. :)
On Fri, Feb 18, 2011 at 9:56 PM, buddha
I've been too smart for my own good trying to type columns, on the theory
that it would later increase performance by having more efficient
comparators in place. So if a string represents an integer, I would convert
it to an integer and declare the column as such. Same for LONG.
What I found is t
Forgot to mention replication factor is 1 and I am running Cassandra 0.7.0.
It's using SimpleStrategy
This is a test cluster of 3 nodes.
This is test code that does the following:
1) The first 4 lines physically drop and create the keyspace, and then create the CF and
column definition on the server.
2) Right after, from the 5th line onwards, it gets a reference to the keyspace
and tries to insert a row and colu
Why don't you post some details about your Cassandra cluster: version,
information about the keyspace you are creating (for example, what is the
replication factor)? It might be of help.
Besides, I don't fully understand your code. First you drop the KEYSPACE, then
create it again with a column
I have the code below, and what I see is that when I run it, a timeout
occurs when I try to insert a column. But when I
comment out the first 4 lines (drop to display) it works without any
issues. I am trying to understand why. If required I can sleep and then
insert. Is
If you know you will have 3 nodes, you should set the initial token inside
the cassandra.yaml for each node.
Then you won't need to run nodetool move.
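For example, a sketch of what that line looks like in each node's cassandra.yaml
for a 3-node ring, with tokens computed the same way as the BigInteger snippet
later in this digest (i * (2^127 / 3)):

# node 1
initial_token: 0
# node 2
initial_token: 56713727820156410577229101238628035242
# node 3
initial_token: 113427455640312821154458202477256070484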
Regards,
Chen
www.evidentsoftware.com
On Fri, Feb 18, 2011 at 5:24 PM, mcasandra wrote:
>
> Thanks! I feel so horrible after realizing what m
Hi,
does anyone have an Erlang example for connecting to Cassandra and
performing an operation like a get?
I'm not having much luck with: \thrift-0.5.0\test\erl\src\* as a reference
point.
I generated all of the Erlang files using Thrift and have successfully
compiled them but am having a pretty
Cassandra as dessert topping? Cassandra as floor-wax?
I do apologize for this basket of clueless questions, but I'm
exploring new territory for me.
Overall problem has two datasets with distinct storage characteristics.
The first is a set of data that can fit in memory, but which needs
reliable
Thanks! I feel so horrible after realizing what mistake I made :)
After I bring up the new node I just need to run the following on old nodes?
1) On the new node, set the initial token to 56713727820156410577229101238628035242
2) Start the new node
3) On the second node, run nodetool move 1134274556403128211544
Hi,
I see "The CompareWith attribute tells Cassandra how to sort the columns for
slicing operations." on wiki (
http://wiki.apache.org/cassandra/StorageConfiguration). So the CompareWith
defines how to sort columns (or super-columns) within the scope of one row. So this
option relates to (multi)get_slice
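For instance, a minimal cassandra-cli sketch (the column family name is made up)
of the 0.7 equivalent of CompareWith, so that columns within each row come back
time-ordered from slice calls:

create column family Comments with comparator = TimeUUIDType;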
try this (needs java.math.BigInteger):
BigInteger bi = new BigInteger("2");
BigInteger or = new BigInteger("2");            // will end up holding 2^127
for (int i = 1; i < 127; i++) {                 // multiply by 2 another 126 times
    or = or.multiply(bi);
}
or = or.divide(new BigInteger("3"));            // 2^127 / 3, the token spacing for 3 nodes
for (int i = 0; i < 3; i++) {                   // token for node i is i * (2^127 / 3)
    System.out.println(or.multiply(new BigInteger("" + i)));
}
which generates
0
56713727820156410577229101
Also, ^ means xor in Java, not exponentiation.
Just use the Python Eric linked. :)
On Fri, Feb 18, 2011 at 3:24 PM, Ching-Cheng Chen
wrote:
> 41
> 82
> 123
> These certainly not correct. Can't just use 2 ^ 127, will overflow
> You can't use Java's primitive type to do this calculation. long o
I'm not sure I can say exactly why, but I'm sure those numbers can't be
correct. One node should be zero and the other values should be very long
numbers like 85070591730234615865843651857942052863.
We need another Java expert's opinion here, but it looks like your snippet
may have "integer
overf
Facts as I understand them:
- A write call to db triggers a number of async writes to all nodes where
the particular write should be recorded (and the nodes are up per Gossip and
so on)
- Once the desired CL number of writes acknowledge, the call returns
So your issue is moot. That is what is happeni
41
82
123
These are certainly not correct. You can't just use 2 ^ 127; it will overflow.
You can't use Java's primitive types to do this calculation. long only uses
64 bits.
You'd need to use BigInteger class to do this calculation.
Regards,
Chen
www.evidentsoftware.com
On Fri, Feb 18, 2011 at 4:04 PM,
No. CompareWith is for columns.
On Fri, Feb 18, 2011 at 3:16 PM, cbert...@libero.it wrote:
> Hi all,
> I created a CF in which i need to get, sorted by time, the Rows inside. Each
> Row represents a comment.
>
>
>
> I've created a few rows using as Row Key a generated TimeUUID but when I call
>
W always stands for the number of sync writes; N-W is the number of async writes.
Note, N decides the number of replicas. W only decides, out of those N
replicas, how many must be written synchronously before returning
success of the write to the client. All writes always happen to a total of N
nodes (W right awa
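To make the N/W bookkeeping concrete, a tiny standalone Java sketch (plain
arithmetic, not any Cassandra API; the N/W pairs are just examples):

public class WriteCounts {
    public static void main(String[] args) {
        int[][] settings = { {3, 1}, {3, 2}, {3, 3} };   // {N, W} pairs
        for (int[] s : settings) {
            int n = s[0], w = s[1];
            // W replicas must acknowledge before the call returns;
            // the remaining N - W replicas are still written, in the background.
            System.out.println("N=" + n + " W=" + w
                    + " -> sync acks required: " + w
                    + ", background writes: " + (n - w));
        }
    }
}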
Hi all,
I created a CF in which I need to get, sorted by time, the Rows inside. Each
Row represents a comment.
I've created a few rows using as Row Key a generated TimeUUID but when I call
the Pelops method "GetColumnsFromRows" I don't get the data back as I expect:
rows are not sorted by Tim
On Thu, Feb 17, 2011 at 12:22 PM, Aaron Morton wrote:
> Messages been dropped means the machine node is overloaded. Look at the
> thread pool stats to see which thread pools have queues. It may be IO
> related, so also check the read and write latency on the CF and use iostat.
>
> i would try th
So does it mean there is no way to use sync + async? I am thinking that if I
have to write across data centers, doing it synchronously is going to be
very slow and will be bad for clients who have to wait. What are my options
or alternatives?
Use N=3 and W=2? And the 3rd one (assuming will be a
Thanks! This is what I got. Is this right?
public class TokenCalc{
    public static void main(String ...args){
        int nodes = 3;
        for(int i = 1; i <= nodes; i++) {
            System.out.println( (2 ^ 127) / nodes * i);
        }
    }
}
41
82
123
This is transparent!
Essentially - when enough writes are acknowledged to meet the desired
Consistency Level - it returns.
On Fri, Feb 18, 2011 at 2:48 PM, mcasandra wrote:
>
> I am still trying to understand how writes work. Is there any concept of
> sync
> and async writes? For eg:
>
> If I w
A Java program should work fine. The Wiki and the DataStax documentation
use a python program for the same purpose:
http://www.datastax.com/docs/0.7/operations/clustering#calculating-tokens
On Fri, Feb 18, 2011 at 12:45 PM, mcasandra wrote:
>
> Yes I had set the first node to token 0. I think
I am still trying to understand how writes work. Is there any concept of sync
and async writes? For example:
If I want to have W=2 but 1 write as sync and the 2nd as async.
Or say I want to have W=3 with network topology, with DC1 getting 1 sync write
+ 1 async write and DC2 always getting async write
Yes I had set the first node to token 0. I think I read it somewhere in the
docs. What should I do? Should I write a Java program to calculate the hash
for 3 nodes and distribute it across the 3 nodes?
It sounds like one of your existing nodes already has the initial token
zero. Did you set the initial token of the first node you brought online to
zero?
On Fri, Feb 18, 2011 at 12:35 PM, mcasandra wrote:
>
> I see following error. Is it because I have initial token defined? What
> token
> shoul
I see the following error. Is it because I have the initial token defined? What token
should I use as initial token?
INFO 12:31:36,689 Finished hinted handoff of 0 rows to endpoint
/172.16.208.12
INFO 12:32:58,448 Joining: getting bootstrap token
ERROR 12:32:58,451 Fatal error: Bootstraping to existing
Nick,
Assuming I have a tenant that has only one CF, and I am using NetworkAware
replication strategy where the keys of this CF are replicated 3 times, each
copy in a different DC (DC1, DC2, DC3).
Now let's assume the cluster holds 5 DCs. As far as I understand, only the
servers that belong to the thre
If I wish to find the names of all the keys in all the column families
along with other related metadata (such as last updated, size of
column value field), is there an additional solution that caches this
metadata OR do I have to always perform range queries and get the
information ?
I am not interest
Hi there,
there is no such operation in Cassandra. The only thing that comes
"close" is the TTL support, which will "delete" columns after a
given time. See:
http://www.datastax.com/dev/blog/whats-new-cassandra-07-expiring-columns
Bye,
Norman
2011/2/18 Benson Margulies :
> The following is
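For reference, a minimal sketch of writing an expiring column through the raw
0.7 Thrift API (the keyspace, column family, key and TTL value here are made up
for illustration; higher-level clients such as Hector and pycassa expose the
same ttl field):

import java.nio.ByteBuffer;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.Column;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class TtlInsert {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        client.set_keyspace("Keyspace1");                      // assumed keyspace
        Column col = new Column(
                ByteBuffer.wrap("status".getBytes("UTF-8")),   // column name
                ByteBuffer.wrap("online".getBytes("UTF-8")),   // column value
                System.currentTimeMillis() * 1000);            // microsecond timestamp
        col.setTtl(3600);                                      // column expires ~1 hour after the write
        client.insert(ByteBuffer.wrap("user1".getBytes("UTF-8")),
                new ColumnParent("Standard1"),                 // assumed column family
                col, ConsistencyLevel.ONE);
        transport.close();
    }
}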
#1, R=2, so if only one machine is up, by definition R cannot be
satisfied. So it will not return.
#2, consistency is an involved topic with no quick and easy
explanation or answers. My 2 cents:
The question of eventual consistency comes up in distributed systems, where
you can write to one machine but
On Fri, Feb 18, 2011 at 6:19 PM, Aklin_81 wrote:
> Sylvain,
> I also need to store data that is frequently updated, same column
> being updated several times during each user session, at each action
> by user, But, this data is not very fresh and hence when I update this
> column frequently, ther
The following is derived from the Redis list operations.
The data model is that a key maps to a list of items. The operation
is to push a new item onto the front, and discard any items from the
end above a threshold number of items.
Of course, this can be done by reading a value, fiddling with i
Again, my understanding!
1. Writes will go through with hinted handoff; reads will fail
2. Yes - but Oracle and others have no partition tolerance and lower levels
of availability. To build in partition tolerance and high availability and
still be shared nothing to avoid SPOF (to cover the RAC implementa
I have a couple more questions:
1. What happens when RF = 3, R = 2 and W = 2 and 2 machines go down? Would
reads and writes fail, or return results from the one machine that is up?
2. Someone in this thread mentioned that writes are eventually consistent. Is
it because the response is returned to the c
OK - let me state the facts first (as I know them):
- I do not know the inner workings, so interpret my response with that
caveat. Although, at an architectural level, one should be able to keep
detailed implementation at bay
- Quorum is (N+1)/2 where N is the Replication Factor (RF)
- And consis
Sylvain,
I also need to store data that is frequently updated, same column
being updated several times during each user session, at each action
by user, But, this data is not very fresh and hence when I update this
column frequently, there would be many versions of the same column in
several sst fi
Typical experiment.
Redis 2.0.4 deployed on my macbook pro.
Saves enabled.
appendfsync off.
vm enabled, 1g max memory.
72 databases. Each database asked to store 13*N key-value pairs with
lpush, bucket size not very big, N -> 500,000.
Client jredis.
Start running against a stream of inputs.
John,
Just wondering what you are using if not phpcassa?
Thanks!
David
From: John Lennard [mailto:j...@gravitate.co.nz]
Sent: Thursday, February 17, 2011 6:41 PM
To: user@cassandra.apache.org
Subject: Re: cassandra & php
Hi,
How does this connection pooling fit in with the
Benson,
I was considering using Redis for a specific project. Can you
elaborate a bit on your problem with it? What were the circumstances,
loading factors, etc?
On Fri, Feb 18, 2011 at 9:19 AM, Benson Margulies wrote:
> redis times out at random regardless of what we configure for client
> timeo
On Fri, Feb 18, 2011 at 9:59 AM, Benson Margulies wrote:
> I want to package some schema with a library.
>
> I could use the hector API to create the schema if not found.
That's probably simplest for your users. (This is what stress.java
does, for instance.)
Otherwise, I'd recommend bundling a
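A minimal sketch of that "create the schema only if it isn't already there"
idea against the raw 0.7 Thrift API (keyspace and CF names are invented; Hector
wraps the same system_add_keyspace call):

import java.util.Collections;
import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.CfDef;
import org.apache.cassandra.thrift.KsDef;
import org.apache.cassandra.thrift.NotFoundException;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TFramedTransport;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

public class EnsureSchema {
    public static void main(String[] args) throws Exception {
        TTransport transport = new TFramedTransport(new TSocket("localhost", 9160));
        transport.open();
        Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
        try {
            client.describe_keyspace("MyLibraryKeyspace");      // already there? nothing to do
        } catch (NotFoundException absent) {
            CfDef cf = new CfDef("MyLibraryKeyspace", "MyLibraryCF");
            KsDef ks = new KsDef("MyLibraryKeyspace",
                    "org.apache.cassandra.locator.SimpleStrategy",
                    1,                                          // replication_factor, still required in 0.7
                    Collections.singletonList(cf));
            client.system_add_keyspace(ks);                     // returns the new schema version id
        }
        transport.close();
    }
}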
A couple more related questions:
5. For reads, does Cassandra first read N nodes or just the R nodes it
selects? I am thinking that unless it reads all N nodes, how will it
know which node has the latest write?
6. Who decides the timestamp that gets inserted into the timestamp
field of every col
I want to package some schema with a library.
I could use the hector API to create the schema if not found. Or I
could, what, stuff a yaml file into something? Is there an API for
that, or do I end up where I started?
On Fri, Feb 18, 2011 at 12:00 AM, Stu Hood wrote:
> But, the reason that it isn't safe to say that we are a strongly consistent
> store is that if 2 of your 3 replicas were to die and come back with no
> data, QUORUM might return the wrong result.
Not so. If you allow vaporizing arbitrary numbe
Hi,
Are there any blogs/writeups anyone is aware of that talks of using
primary replica as coordinator node (rather than a random coordinator
node) in production scenarios ?
Thank you.
On Wed, Feb 16, 2011 at 10:53 AM, A J wrote:
> Thanks for the confirmation. Interesting alternatives to avoid
ok great, thanks for the exact clarification
On 18 Feb 2011, at 14:11, Aklin_81 wrote:
> Compaction does not 'mutate' the sst files, it 'merges' several sst files
> into one with new indexes, merged data rows & deleting tombstones. Thus you
> reclaim your disk space.
>
>
> On Fri, Feb 18, 201
Questions about R and N (and W):
1. If I set R to Quorum and Cassandra identifies a need for read
repair before returning, would the read repair happen on R nodes (I
mean subset of R that needs repair) or N nodes before the data is
delivered to the client ?
2. Also does the repair happen at level o
Redis times out at random regardless of what we configure for client
timeouts; the platform-sensitive binaries are painful for us since we
support many platforms; just to name two reasons.
On Fri, Feb 18, 2011 at 10:04 AM, Joshua Partogi wrote:
> Any reason why you want to do that?
>
> On Sat, Feb
Any reason why you want to do that?
On Sat, Feb 19, 2011 at 1:32 AM, Benson Margulies wrote:
> I'm about to launch off on replacing redis with cassandra. I wonder if
> anyone else has ever been there and done that.
>
--
http://twitter.com/jpartogi
I'm about to launch off on replacing Redis with Cassandra. I wonder if
anyone else has ever been there and done that.
Compaction does not 'mutate' the sst files; it 'merges' several sst files
into one, with new indexes, merged data rows, and tombstones deleted. Thus you
reclaim your disk space.
On Fri, Feb 18, 2011 at 7:34 PM, James Churchman
wrote:
> but a compaction will mutate the sstables and reclaim the
> spa
But a compaction will mutate the sstables and reclaim the space (eventually)?
james
On 18 Feb 2011, at 08:36, Sylvain Lebresne wrote:
> On Fri, Feb 18, 2011 at 8:14 AM, Aklin_81 wrote:
> Are the very freshly written columns to a row in memtables, efficiently
> updated/overwritten by edited
Related question: Is it a good idea to specify ConsistencyLevels on a
per-operation basis? For example: Read ONE Write ALL would deliver
consistent read results, just like Read ALL Write ONE. However, if you
specify Read ONE Write QUORUM you cannot give such guarantees anymore.
Should there be (is
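To make that claim checkable, a small standalone Java sketch (plain arithmetic,
not a Cassandra API) applying the usual overlap rule that a read is guaranteed
to see the latest acknowledged write whenever R + W > N (N is assumed to be 3
here):

public class ConsistencyCheck {
    public static void main(String[] args) {
        int n = 3;                                   // replication factor, assumed for illustration
        check("Read ONE  / Write ALL   ", 1, n, n);
        check("Read ALL  / Write ONE   ", n, 1, n);
        check("Read ONE  / Write QUORUM", 1, n / 2 + 1, n);
    }

    static void check(String label, int r, int w, int n) {
        // R + W > N means the read and write replica sets must overlap,
        // so at least one replica that is read holds the latest acknowledged write.
        System.out.println(label + " -> consistent reads: " + (r + w > n));
    }
}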
At Quorum - if 2 of 3 nodes are down, a read should not be returned, right ?
But yes - if single node READs are opted for, it will go through.
The original question was - "Why is Cassandra called an eventually consistent
data store?"
Because at write time, there is not a guarantee that all replicas
With this schema:
create column family Userstream with comparator=UTF8Type and rows_cached =
1 and keys_cached = 10
and column_metadata=[{column_name:account_id, validation_class:IntegerType,
index_type: 0, index_name:UserstreamAccountidIdx},
{column_name:from_id, validation_class:IntegerT
Hi!
Can somebody give me some hints about how to configure a keyspace with
NetworkTopologyStrategy via cassandra-cli? Or what is the preferred
method to do so?
Thanks!
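For what it's worth, a sketch of what that could look like in the 0.7
cassandra-cli (the keyspace name and per-DC replica counts are invented, the DC
names must match what your snitch reports, and the exact strategy_options
syntax can vary between 0.7.x cli builds):

create keyspace MyKeyspace
    with placement_strategy = 'org.apache.cassandra.locator.NetworkTopologyStrategy'
    and strategy_options = [{DC1:2, DC2:1}];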
Thanks a lot for your suggestions,
I will check the virtual keyspace solution - btw, currently I am using the
Thrift client with Pycassa, and I am not familiar with Hector - does it mean
we'll need to move to the Hector client?
I thought of using keyspaces for each tenant, but I don't understand how to
define t
On Fri, Feb 18, 2011 at 8:14 AM, Aklin_81 wrote:
> Are the very freshly written columns to a row in memtables, efficiently
> updated/overwritten by edited/new column values.
>
> After flushing of memtable, are those(edited + unedited ones) columns
> stored together on disk (in same blocks!?) as i
> main argument for using mmap() instead of standard I/O is the fact
> that reading entails just touching memory - in the case of the memory
> being resident, you just read it - you don't even take a page fault
> (so no overhead in entering the kernel and doing a semi-context
> switch).
Oh and in
> Jonathan,
> When you get time could you please explain that a little more. Got a feeling
> I'm about to learn something :)
I'm not Jonathan, but: The operating system's virtual memory system
supports mapping files into a process' address space. This will "use"
virtual memory; i.e. address space.