Hi all,
I have a problem with the bloom filter. When I ran a test that tried to get
some nonexistent keys, the bloom filter did not seem to work. The
'BloomFilterFalseRatio' was 1.0, the 'BloomFilterFalsePositives' count kept
rising, and disk I/O utilization reached 100% according to 'iostat'.
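For reference, here is a minimal sketch of how a bloom filter is expected to behave on lookups for absent keys. It is a generic Python illustration, not Cassandra's implementation; the class name, the sizes, and the salted SHA-256 hashing scheme are all assumptions made for the example.

```python
import hashlib


class BloomFilter:
    """Tiny illustrative Bloom filter: k hash probes into an m-slot array."""

    def __init__(self, num_bits: int, num_hashes: int) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per "bit", for clarity

    def _positions(self, key: str):
        # Derive k positions by salting the key with the probe index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(key))


def false_positive_rate(bf: BloomFilter, absent_keys) -> float:
    """Fraction of known-absent keys that the filter fails to reject."""
    hits = sum(1 for key in absent_keys if bf.might_contain(key))
    return hits / len(absent_keys)


bf = BloomFilter(num_bits=10_000, num_hashes=5)
present = [f"key-{i}" for i in range(1000)]
for key in present:
    bf.add(key)

absent = [f"missing-{i}" for i in range(1000)]
rate = false_positive_rate(bf, absent)
print(f"false positives among absent keys: {rate:.3f}")
```

With a filter sized like this, most absent keys are rejected without touching the data, so only a small fraction of such lookups should reach the disk. A filter that lets nearly every absent key through behaves as if it were not there at all.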
On 6/21/2011 3:36 PM, Stephen Connolly wrote:
writes are not atomic.
the first side can succeed at quorum, and the second side can fail
completely... you'll know it failed, but now what... you retry, still
failed... erh I'll store it somewhere and retry it later... where do I
store it?
the
On 6/21/2011 3:14 PM, Anand Somani wrote:
Not sure if it is that simple, a quorum can fail with writes happening
on some nodes (there is no rollback). Also there is no concept of
atomic compare-and-swap.
Good points. I suppose what I need is for the client to implement the
part of ACID tha
And I was thinking of using JTA for transaction processing. I have no
experience with it but on the surface it looks like it should work.
On 6/21/2011 3:31 PM, AJ wrote:
What's the best accepted way to handle that 100% in the client? Retries?
On 6/21/2011 3:14 PM, Anand Somani wrote:
Not sur
writes are not atomic.
the first side can succeed at quorum, and the second side can fail
completely... you'll know it failed, but now what... you retry, still
failed... erh I'll store it somewhere and retry it later... where do I store
it?
the consistency level is about tuning whether reads and
What's the best accepted way to handle that 100% in the client? Retries?
On 6/21/2011 3:14 PM, Anand Somani wrote:
Not sure if it is that simple, a quorum can fail with writes happening
on some nodes (there is no rollback). Also there is no concept of
atomic compare-and-swap.
On Tue, Jun 21,
Not sure if it is that simple, a quorum can fail with writes happening on
some nodes (there is no rollback). Also there is no concept of atomic
compare-and-swap.
On Tue, Jun 21, 2011 at 2:03 PM, AJ wrote:
> On 6/21/2011 2:50 PM, Stephen Connolly wrote:
>
> how important are things like tran
On 6/21/2011 2:50 PM, Stephen Connolly wrote:
how important are things like transactional consistency for you?
would you have issues if only one side of a transfer was recorded?
Right. Both of those questions are about consistency. Isn't the simple
solution to use QUORUM reads/writes?
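A small sketch of the arithmetic behind QUORUM, assuming a replication factor N with R replicas consulted per read and W per write (the function names are hypothetical). It shows why quorum reads and writes overlap on at least one replica. It does not show atomicity: two separate writes, such as the two sides of a transfer, can still succeed and fail independently.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of replicas for a given replication factor."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return r + w > n


n = 3
q = quorum(n)
print(f"N={n}: quorum is {q}")
print("QUORUM/QUORUM overlaps:", read_sees_latest_write(n, q, q))
print("ONE/ONE overlaps:", read_sees_latest_write(n, 1, 1))
```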
how important are things like transactional consistency for you?
would you have issues if only one side of a transfer was recorded?
Cassandra, out of the box, on its own, would not be ideal if the above two
things are important for you.
you can add components to a system to help address these t
Right, Solr will not do anything other than basic aggregations (facets) and
range queries.
On Tue, Jun 21, 2011 at 3:16 PM, Dan Kuebrich wrote:
> Solandra is indeed distributed search, not distributed number-crunching.
> As a previous poster said, you could imagine structuring the data in a
> s
Also
https://issues.apache.org/jira/browse/HADOOP-7206
Now part of brisk
http://www.datastax.com/dev/blog/brisk-1-0-beta-2-released
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle.com
On 22 Jun 2011, at 04:04, Vijay wrote:
> You might w
Solandra is indeed distributed search, not distributed number-crunching. As
a previous poster said, you could imagine structuring the data in a series
of documents with fields containing playername, teamname, position,
location, day, time, inning, at bat, outcome, etc. Then you could query to
get
Use nodetool cfstats or show keyspaces; in cassandra-cli to see the flush
settings. The default is (I think) 60 minutes, 0.1 million "ops", or 1/16th of
the heap size when the CF was created.
But under 0.8 there is an automagical global memory manager, see
https://github.com/apache/cassandra/blob/cas
If I may ask Sasha, what exactly are you trying to achieve using SolR
(or Solandra, I guess it's about the same) ?
Because from what I understood of your problem, you need to do statistics
on your matches, players, etc. Or do you just want to retrieve
information that has already been computed?
Your application isn't aware of Cassandra only Solr.
The idea of Solandra is to use Cassandra as a backend for Solr.
Solr has a distributed search mechanism already so by making Solr Cassandra
aware
it can auto-shard and manage distributed queries for you, with replication
and failover etc
As for
Without getting overly complicated and long-winded ... are there
practical references / examples I can review that demonstrate the
Cassandra/Solandra benefits? I had a quick look at
https://github.com/tjake/Solandra/wiki/Solandra-Wiki and it wasn't
dead obvious to me
On Tue, Jun 21, 2011 at
Just wanted to mention that there is also a #solandra irc channel on freenode
in case people are interested.
On Jun 21, 2011, at 1:26 PM, Mark Kerzner wrote:
> Me too!
>
> I would be interested to know how such queries are done in Solandra. I would
> understand it if it creates a complete Luce
Me too!
I would be interested to know how such queries are done in Solandra. I would
understand it if it creates a complete Lucene index of everything that's in
Cassandra, and adds the text search. Then your query goes against Lucene.
But if some data is found in column families in Cassandra, and
Solandra can answer the question you used as an example, and it's more of a
fit for low-latency ad-hoc reporting than Pig. Pig queries will take
minutes, not seconds.
On Tue, Jun 21, 2011 at 12:12 PM, Sasha Dolgy wrote:
> Folks,
>
> Simple question ... Assuming my current use case is the ability
Is C* suitable for storing customer account (financial) data, as well as
billing, payroll, etc? This is a new company so migration is not an
issue... starting from scratch.
Thanks!
If you're OOMing on restart you WILL OOM during normal usage given
heavy enough write load. Definitely adjust memtable thresholds down
or, as Dominic suggests, upgrade to 0.8.
On Tue, Jun 21, 2011 at 12:02 PM, Dominic Williams
wrote:
> Hi gabe,
> What you need to do is the following:
> 1. Adjust
I can speak for what I know:
Pig: I have only taken a quick look, and maybe some guys from Twitter can
answer better than me on that particular program. Pig is not for "on demand"
queries: they are quite slow, and as you said, you extract relevant
information and append it to another CF where you can
Hi gabe,
What you need to do is the following:
1. Adjust cassandra.yaml so when this node is starting up it is not
contacted by other nodes e.g. set thrift to 9061 and storage to 7001
2. Copy your commit logs to tmp sub-folder e.g. commitLog/tmp
3. Copy a small number of commit logs back into m
Also - there is an open ticket to create a .NET CQL driver - may be worth
watching or if you'd like to help out with it somehow:
https://issues.apache.org/jira/browse/CASSANDRA-2634
On Jun 21, 2011, at 9:31 AM, Stephen Pope wrote:
> We just recently switched to 0.8 (from 0.7.4), and it looks lik
This is a known issue and is being tracked on the following:
https://issues.apache.org/jira/browse/CASSANDRA-2653
On Tue, Jun 21, 2011 at 9:31 AM, Stephen Pope wrote:
> We just recently switched to 0.8 (from 0.7.4), and it looks like key-only
> queries are broken (number of columns = 0). The same
Could you verify any security settings that may come into play with Elastic
IPs? You should make sure the appropriate ports are open.
See: http://www.datastax.com/docs/0.8/brisk/install_brisk_ami
for a list of ports in the first chart.
Joaquin Casares
DataStax
Software Engineer/Support
On Mon,
Folks,
Simple question ... Assuming my current use case is the ability to log
lots of trivial and seemingly useless sports statistics ... I want a
user to be able to query / compare. For example:
--> Show me all baseball players in cheektowaga and ontario,
california who have hit a grandslam
You might want to watch https://issues.apache.org/jira/browse/CASSANDRA-47
Regards,
On Tue, Jun 21, 2011 at 5:14 AM, Timo Nentwig wrote:
> Hi!
>
> Just wondering why this doesn't already exist: wouldn't it make sense to
> have
> decorating data types that compress (gzip, snappy) other data ty
bang on ... no idea why ... a new day a fresh login ... environment
variables gone. working now with cassandra 0.8.0 and pig 0.8.1
went through all my steps and all is working ... except line 45 in the
bin/pig_cassandra is not proper when there are multiple pig*.jar
files.
On Mon, Jun 20, 2011 a
We just recently switched to 0.8 (from 0.7.4), and it looks like key-only
queries are broken (number of columns = 0). The same query works if we switch
the number of columns to 1. Is there a new mechanism for getting key-only? We
can't use CQL yet since we're using .NET for our development.
Che
I've only got one CF, and haven't changed the default flush expiry period. I'm
not sure whether the node had fully started. I had to restart my data insertion
(for other reasons), so I can check the system log upon restart when the data
is finished inserting.
Do you know off-hand how long the
> I’ve got a single node deployment of 0.8 set up on my windows box. When I
> insert a bunch of data into it, the commitlogs directory doesn’t clear upon
> completion (should it?).
It is expected that commit logs are retained for a while, and that
there is replay going on when restarting a node. Th
Hi there. This is my first message to the mailing list, so let me know if I'm
doing it wrong. :)
I've got a single node deployment of 0.8 set up on my windows box. When I
insert a bunch of data into it, the commitlogs directory doesn't clear upon
completion (should it?). As a result, when I sto
Hi Daniel,
Just saw your email regarding kundera download.
Kundera snapshot jar is available at:
http://kundera.googlecode.com/svn/maven2/maven-missing-resources/com/impetus/kundera/1.1.1-SNAPSHOT/
In addition,
If you want to download source code then it is at:
https://github.com/impetus-openso
Hi!
Just wondering why this doesn't already exist: wouldn't it make sense to have
decorating data types that compress (gzip, snappy) other data types (esp.
UTF8Type,
AsciiType) transparently?
-tcn
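To make the idea concrete, here is a minimal sketch of such a decorating type: a wrapper that compresses UTF-8 text on the way in and decompresses it on the way out. It is written in Python with the standard gzip module and is purely illustrative; the class name `CompressedText` is hypothetical and nothing here corresponds to an existing Cassandra type.

```python
import gzip


class CompressedText:
    """Hypothetical decorator type: stores UTF-8 text gzip-compressed."""

    @staticmethod
    def encode(value: str) -> bytes:
        # What would be written to storage.
        return gzip.compress(value.encode("utf-8"))

    @staticmethod
    def decode(stored: bytes) -> str:
        # What the reader gets back; compression stays invisible to callers.
        return gzip.decompress(stored).decode("utf-8")


text = "a fairly repetitive column value " * 100
stored = CompressedText.encode(text)
print(len(text.encode("utf-8")), "bytes raw ->", len(stored), "bytes stored")
```

The gain depends heavily on the data: short or high-entropy values can come out larger than the input because of the gzip header overhead.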
The new memtable_total_space_in_mb option is kicking in
https://github.com/apache/cassandra/blob/cassandra-0.8.0/NEWS.txt#L34
http://thelastpickle.com/2011/05/04/How-are-Memtables-measured/
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaronmorton
http://www.thelastpickle
You've set a comparator for the super column names, but not the sub columns.
e.g.
[default@dev] set data['31']['address']['city']='noida';
org.apache.cassandra.db.marshal.MarshalException: cannot parse 'city' as hex
bytes
[default@dev] set data['31']['address'][utf8('city')]='noida';
Value
Thanks Aaron. It is really a great pointer to solution.
-Vivek
From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, June 20, 2011 12:51 AM
To: user@cassandra.apache.org
Subject: Re: Cassandra.yaml
The change to remove the calls to DatabaseDescriptor was in this commit on
the 0
AFAIK the node will not announce itself in the ring until the log replay is
complete, so it will not get the schema update until after log replay. If
possible I'd avoid making the schema change until you have solved this problem.
My theory on OOM during log replay is that the high speed inserts
Personally speaking, I do not run JMX on 8080, and never have. The
tools, like cassandra-cli and nodetool expect it to be on the default
port, but you can override with -p or -jmxport
-sd
On Tue, Jun 21, 2011 at 1:33 PM, osishkin osishkin wrote:
> I did, and everything seemed to work fine.
> Bu
I did, and everything seemed to work fine.
But I saw a reference here
http://www.onemanclapping.org/2010/03/running-multiple-cassandra-nodes-on.html
That said "make sure you have at least one node listening on 8080
since all the Cassandra tools assume JMX is listening there", and then
remembered th
it's defined in $CASSANDRA_HOME/conf/cassandra-env.sh
JMX_PORT=
Have it different for each instance ...
On Tue, Jun 21, 2011 at 1:24 PM, osishkin osishkin wrote:
> I want to have several daemons running on a machine, each belonging to
> a multi-node cluster.
> Is that a problem in concern to po
Can you provide some more information on the query you are running? How many
terms are you selecting with?
How long does it take to return 1024 rows? IMHO that's a reasonably big slice
to get.
The server will pick the most selective equality predicate, and then filter the
results from that
I want to have several daemons running on a machine, each belonging to
a multi-node cluster.
Is that a problem with regard to port 8080, for JMX monitoring?
Is it hardcoded somewhere, so that changing it in the configuration
files is not enough?
Thank you
osi
I'm trying to understand the flushing behavior in Cassandra 0.8.
When I create rows, after a few seconds, I see the following line in the log:
INFO 11:18:46,470 flushing high-traffic column family
ColumnFamilyStore(table='Traxis', columnFamily='Customers')
INFO 11:18:46,471 Enqueuing flush of
Memtabl
Thanks Richard.
You are right. I missed that in key validation class.
-Original Message-
From: Richard Low [mailto:r...@acunu.com]
Sent: Tuesday, June 21, 2011 12:44 PM
To: user@cassandra.apache.org
Subject: Re: issue with querying SuperColumn
You have key validation class UTF8Type for th
You have key validation class UTF8Type for the standard CF, but
BytesType for the super. This is why the key is "1" for standard, but
printed as "31" for super, which is the hex ASCII code for 1. In your
Java code, use "1".getBytes() as your key and it should work.
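The relationship between the two key forms can be checked with a short sketch. It is in Python rather than Java, and the helper name `to_hex` is hypothetical. It only illustrates the byte encoding: the key "1" is the single byte 0x31, which a BytesType column family displays as hex.

```python
def to_hex(text: str) -> str:
    """Hex form of the UTF-8 bytes of a string, as a bytes-typed key is shown."""
    return text.encode("utf-8").hex()


key = "1"
hex_key = to_hex(key)
print(repr(key), "->", hex_key)

# Going the other way recovers the original string key.
round_trip = bytes.fromhex(hex_key).decode("utf-8")
```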
Richard.
--
Richard Low
Acun
I understand that I might be missing something on my end. But somehow I cannot
get this working using Cassandra-cli:
[default@key1] create column family supusers with comparator=UTF8Type and
default_validation_class=UTF8Type and key_validation_class=UTF8Type and
column_type=Super;
59e2e950-9bd