For posterity, I ended up hacking around this by renaming the repeated
'value' alias in CassandraStorage and rebuilding it. Here's the patch:
--- src/java/org/apache/cassandra/hadoop/pig/CassandraStorage.java.original 2011-10-11 23:42:19.0 -0700
+++ src/java/org/apache/cassandra/hadoop/pig
Thanks for the reply, Ben.
Actually, the problem is that I am not able to run a basic Hector example from
Eclipse. It's throwing "me.prettyprint.hector.api.exceptions.HectorException:
All host pools marked down. Retry burden pushed out to client".
Can you please let me know why I am getting this?
- RF is 1. We have a few keyspaces, and only this one is not replicated; this
data is not very important. In case of an error the customer will have to
execute the process again. But again, I would like to persist it.
- Serializing data is not an option, because I would like to have
possibility to access data
Thanks for all your help Brandon and Jeremy, that got me to the point where
I could load data.
I'm now hitting a new issue that seems like it could possibly be related.
When I try to access the data like this:
grunt> rows = LOAD 'cassandra://Frap/FriendsAlreadyRanked' USING
CassandraStorage();
gr
The OpsCenter graph you're referring to basically does the following:
1. For each node, find out how much the WriteOperations attribute of the
StorageProxy increased during the last minute.
2. Sum these values to get a total for the cluster.
3. Divide by 60 to get an average number of WriteOperati
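As a rough illustration of the three steps above (not OpsCenter's actual code), here is a minimal Python sketch. It assumes each node's StorageProxy WriteOperations counter has already been sampled twice, one minute apart; the function name, the dict layout and the node names are hypothetical.

```python
def cluster_writes_per_second(previous, current, interval_seconds=60):
    """Average cluster-wide write rate over one sampling interval.

    previous/current map a node name to the WriteOperations counter
    read at the start and at the end of the interval (hypothetical layout).
    """
    # Step 1: per-node increase of the counter during the interval.
    deltas = [current[node] - previous[node] for node in current]
    # Step 2: sum the increases to get a total for the cluster.
    total = sum(deltas)
    # Step 3: divide by the interval length to get writes per second.
    return total / interval_seconds


# Two made-up samples taken 60 seconds apart.
before = {"node1": 100, "node2": 200}
after = {"node1": 160, "node2": 260}
rate = cluster_writes_per_second(before, after)
```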
Hello,
If you set a ttl and expire a column, I've read that this eventually turns
into a tombstone and will be cleaned out by the GC. Are expirations
considered a form of delete that still requires a node repair to be run in
gc_grace_period seconds? The operations guide says you have to run node
r
as I asked earlier:
http://cassandra-user-incubator-apache-org.3065146.n2.nabble.com/how-does-compaction-throughput-kb-per-sec-affect-disk-io-td6831711.html
might not directly throttle the disk I/O?
it would be easy if ionice could work with cassandra. not sure it is because
of jvm or something e
Are all 3 CFs using compression?
On Tue, Oct 11, 2011 at 4:43 PM, Günter Ladwig wrote:
> Hi all,
>
> I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I do not have
> 5000, but just three (compressed) CFs.
>
> The exception does not happen for the Migrations CF, but for one of mine:
grep -i 'killed process' /var/log/messages
On Tue, Oct 11, 2011 at 5:25 PM, Ashley Martens wrote:
> So we created a script to check if Cassandra is alive and run it every two
> minutes. Here are some results for today:
>
> Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
Hi Aaron,
I got an account to the wiki, logged in, and claimed the 'Configuration'
page a.k.a 'Storage Configuration' for now. I will let you know when done or
if I get stumped. Will also work on "Setting up Eclipse" page and put it
somewhere.
Hani
On Mon, Oct 10, 2011 at 4:24 PM, aaron morton wro
simple, elegant, and less performant than just doing a range scan
without the index. :)
On Tue, Oct 11, 2011 at 4:06 PM, Sasha Dolgy wrote:
> ah, hadn't even thought of that. simple. elegant.
> cheers.
>
> On Tue, Oct 11, 2011 at 11:01 PM, Jake Luciani wrote:
>>
>> This hasn't changed in AFAIK
On Tue, Oct 11, 2011 at 4:24 PM, Pete Warden wrote:
> I'm trying to run the most basic example for pig_cassandra, counting the
> number of rows in a column family, and I'm hitting the following error:
> 2011-10-11 14:13:32,321 [main] ERROR org.apache.pig.tools.grunt.Grunt -
> ERROR 1031: Incompata
Just for informational purposes, Pete and I tried to troubleshoot it via
twitter. I was able to do the following with Cassandra 0.8.1 and Pig 0.9.1.
He's going to dig in to see if there's something else going on.
// Cassandra-cli stuff
// bin/cassandra-cli -h localhost -p 9160
create keyspace
So we created a script to check if Cassandra is alive and run it every two
minutes. Here are some results for today:
Tue Oct 11 18:28:09 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:00:10 UTC 2011 - F this Cassandra bullshit... it died again
Tue Oct 11 19:30:10 UTC 2011 - F
While on the topic of the wiki ... it's not entirely pleasing to the senses
or at all user friendly ... hacking around on it earlier today, there aren't
that many options on how to give it some flair ... shame really that for
such a cool piece of software, the wiki doesn't scream the same level of
Sounds like a good place to start!
Thanks for taking the lead and please let me know how I can help!
Daria
On Tue, Oct 11, 2011 at 2:20 PM, aaron morton wrote:
> Thanks Daria, I'll have a look at what's there and get in touch.
>
> Right now I'm not thinking beyond getting the wiki complete (e.g. it
Hi all,
I'm seeing the same problem on my 1.0.0-rc2 cluster. However, I do not have
5000, but just three (compressed) CFs.
The exception does not happen for the Migrations CF, but for one of mine:
Keyspace: KeyspaceCumulus
Read Count: 816
Read Latency: 8.926029411764706 ms.
I'm trying to run the most basic example for pig_cassandra, counting the
number of rows in a column family, and I'm hitting the following error:
2011-10-11 14:13:32,321 [main] ERROR org.apache.pig.tools.grunt.Grunt -
ERROR 1031: Incompatable field schema: left is
"columns:bag{:tuple(name:bytearray
Thanks Daria, I'll have a look at what's there and get in touch.
Right now I'm not thinking beyond getting the wiki complete (e.g. it lists all
the command line tools) and correct for version 1.0. My main concern was people
coming away from the site with incorrect information and having a bad out of
ah, hadn't even thought of that. simple. elegant.
cheers.
On Tue, Oct 11, 2011 at 11:01 PM, Jake Luciani wrote:
> This hasn't changed, AFAIK. In Brisk we had the same problem in CFS, so
> we created a sentinel value that all rows shared, and then it works.
> CASSANDRA-2915 should fix it.
>
> On
This hasn't changed, AFAIK. In Brisk we had the same problem in CFS, so we
created a sentinel value that all rows shared, and then it works. CASSANDRA-2915
should fix it.
On Tue, Oct 11, 2011 at 4:48 PM, Sasha Dolgy wrote:
> I was trying to get a range of rows based on a secondary_index that was
>
It's the number of mutations; a mutation is a collection of changes for a single
row across one or more column families.
Take a look at nodetool cfstats; this is where I assume OpsCenter is
getting its data from.
Cheers
-
Aaron Morton
Freelance Cassandra Developer
@aaro
Some thoughts…
> non replicated Key Space
Not sure what you mean here. Do you mean RF 1? I would consider using 3.
Consider what happens when you want to do a rolling upgrade of the cluster.
> single Column Family, where key is session ID and each column within row
> stores single key/value -
I was trying to get a range of rows based on a secondary_index that was
defined. Any rows where age was greater than or equal to ... it didn't
work. Is this a continued limitation? Did a quick look in JIRA, couldn't
find anything.
The output from "help get;" on the cli contains the following, w
I am running an embedded Cassandra (0.8.7), and
calling CassandraDaemon.deactivate() after I write rows (at least 1)
doesn't shut down Cassandra.
If I run only reads, it does shut down even without
calling CassandraDaemon.deactivate().
Does anyone have any idea what can cause this problem?
Shimi
On Tue, Oct 11, 2011 at 11:05 AM, Eric Evans wrote:
> On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs
> wrote:
>> ApacheCon NA is coming up next month. I suspect there will be at least
>> a few Cassandra users there (yeah new release!). Would anyone be
>> interested in getting together and shar
DataStax would like to help with the wiki update effort. For example, we
have a start on updates for 1.0, such as the storage configuration.
http://www.datastax.com/docs/1.0/configuration/storage_configuration
Let me know how we can help.
Cheers,
Daria (DataStax Tech Writer)
Question - Are you
Hi, I'm having what I think is a fairly uncommon schema issue --
My situation is that I had a cluster with 10 nodes and a consistent schema.
Then, in an experiment to set up a second cluster with the same information
(by copying the raw sstables), I left the LocationInfo* sstables in the
system ke
"46e70d80":
[["0132f3726cbb30303030303030303030303030303030303030303030303030303030303030303030316431636633","4e945b0e",1318344486784,"d"]
for the timestamp
perl -e 'print gmtime(1318344486)."\n" '
Tue Oct 11 14:48:06 2011
$ TZ=GMT date
Tue Oct 11 17:40:31 GMT 2011
so it's almost 3 hou
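A minimal Python sketch of the same check, using only the two values quoted above: the column timestamp 1318344486784 (assumed here to be milliseconds since the epoch, which matches the perl output) and the time printed by `date`. The variable names are illustrative.

```python
from datetime import datetime, timezone

# Column timestamp from the sstable2json output, assumed to be in milliseconds.
column_ts_ms = 1318344486784

# Drop the millisecond part and interpret the rest as seconds since the epoch.
written_at = datetime.fromtimestamp(column_ts_ms // 1000, tz=timezone.utc)

# Wall-clock time reported by `TZ=GMT date` in the message above.
checked_at = datetime(2011, 10, 11, 17, 40, 31, tzinfo=timezone.utc)

# Gap between when the column was written and when the clock was checked.
lag = checked_at - written_at

print(written_at.isoformat())
print(lag)
```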
After I did a major compaction on both nodes in my test cluster,
I found that for the same CF, one node has a 100MB sstable file, while
the other has a 1GB one.
Since GC_grace is set in the schema, and both nodes have the same
config, how could this happen?
I'm still going through sstable2json to f
On Tue, Oct 11, 2011 at 12:19 PM, Yang wrote:
> I find the info about bloomfilter very helpful, could we add that to NodeCmd ?
Feel free to create a ticket and tag it 'lhf'
-Brandon
I find the info about bloomfilter very helpful, could we add that to NodeCmd ?
Thanks
Yang
Sounds good. I'll be giving a talk there about Cassandra 1.0
http://na11.apachecon.com/talks/19500
On Tue, Oct 11, 2011 at 12:05 PM, Eric Evans wrote:
> On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs
> wrote:
> > ApacheCon NA is coming up next month. I suspect there will be at least
> > a fe
Just a FYI:
http://hector-client.org is requesting a username/pass
http://www.hector-client.org is working fine
On Fri, Oct 7, 2011 at 12:51 AM, aaron morton wrote:
> Thanks, will be handy for new peeps.
> A
> -
> Aaron Morton
> Freelance Cassandra Developer
> @aaronmorton
> http
On Tue, Oct 4, 2011 at 2:44 PM, Chris Burroughs
wrote:
> ApacheCon NA is coming up next month. I suspect there will be at least
> a few Cassandra users there (yeah new release!). Would anyone be
> interested in getting together and sharing some stories? This could
> either be a "official" [1] m
Hi,
From time to time discussions pop up here regarding the transactional or
atomic capabilities of Cassandra (or lack thereof). There is at least one
project dedicated to solving this problem (i.e., Cages). Unfortunately, in
pretty much every discussion or blog post I’ve come across on this subj
Hello everyone,
I was trying to get some cluster wide statistics of the total insertions
performed in my 3 node Cassandra 0.8.6 cluster. So I wrote a nice little
program that gets the CompletedTasks attribute of
org.apache.cassandra.db:type=Commitlog from every node, sums up the values
and records
We already have two separate rings. Idea of bidirectional sync is, if one
ring is down, we can still send the traffic to other ring. When original
cluster comes back, it will pick up the data from available cluster. I'm not
sure if it makes sense to have separate rings or combine these two rings
On Tue, Oct 11, 2011 at 2:36 AM, Peter Schuller
wrote:
> Google/check wiki/read docs about NetworkTopologyStrategy and
> PropertyFileSnitch. I don't have a good link to multi-dc off hand
> (anyone got a good link to suggest that goes through this?).
http://www.datastax.com/docs/0.8/cluster_archit
Hi *,
I would like to use Cassandra to store session-related information. I do
not have a real HTTP session; it's a different protocol, but the same concept.
Memcached would be fine, but I would like to additionally persist the data.
Cassandra setup:
- non replicated Key Space
- single Column F
kewl,
> * Row is not deleted (other columns can be read, row survives compaction
> with GCGraceSeconds=0)
IIRC row tombstones can hang around for a while (until gc grace has passed),
and they only have an effect on columns that have a lower timestamp. So it's
possible to read columns from a row
Hey,
We had this one. Even though the Hector documentation says that it
retries failed servers every 30 seconds by default, it doesn't.
Once we explicitly set it to X seconds, whenever there is a failure,
e.g. with the network (AWS), it will retry and add the host back into the pool.
Ben
On 11 October 2011 11:0
Hi Aaron,
I invalidated the caches but nothing changed. I didn't get the mentioned
log line either, but as I read the code SliceByNamesReadCommand uses
NamesQueryFilter and not SliceQueryFilter.
Next, there is only one SSTable.
I can rule out that the row is deleted because I deleted all other r
Hi everyone,
I was using Cassandra a long time back, and when I tried it again today I
ran into a problem from Eclipse. When I try to run a basic Hector
(Java) example, I get an exception:
me.prettyprint.hector.api.exceptions.HectorException: All host pools marked
down. Retry burden p
@maki thanks,
Could you take a look at the cli page
http://wiki.apache.org/cassandra/CassandraCli ? There are a lot of online docs
in the tool, so we don't need to replicate that. Just a simple getting started
guide, some examples and a few tips about what to do if things don't
wo
Nothing jumps out. The obvious answer is that the column has been deleted. Did
you check all the SSTables?
It looks like the query was returned from the row cache, otherwise you would
see this as well…
DEBUG [ReadStage:34] 2011-10-11 21:11:11,484 SliceQueryFilter.java (line 123)
collecting 0 of 214748364
> We already have two separate rings. Idea of bidirectional sync is, if one
> ring is down, we can still send the traffic to other ring. When original
> cluster comes back, it will pick up the data from available cluster. I'm not
> sure if it makes sense to have separate rings or combine these two
> So how about disk I/O? Is there any way to use ionice to control it?
> I have tried to adjust the priority with "ionice -c3 -p [cassandra pid]".
> It does not seem to work...
Compaction throttling (and in 1.0 internode streaming throttling) both
address disk I/O.
--
/ Peter Schuller (@scode on twitter)
Hi Aaron,
I think the CommitLog section is outdated
(http://wiki.apache.org/cassandra/ArchitectureCommitLog):
the CommitLogHeader no longer exists since this ticket:
https://issues.apache.org/jira/browse/CASSANDRA-2419
Regards,
Jérémy
2011/10/11 Sasha Dolgy
> maybe that should be the fi