I just started moving our scripts to Pig 0.11.1 from 0.9.2 and I see the same
issue: it fails about 75-80% of the time. So I'm not moving :-/.
I am using OSX + Oracle Java7 and CassandraStorage, but I did not see any
difference between CassandraStorage and CqlStorage.
Cassandra 1.2.9, though 1.1.10
hi,
I am using Astyanax to access a multi-node Cassandra cluster.
In my connection configuration setup, I already declared a global
read/write consistency level by setting:
AstyanaxConfiguration.setDefaultWriteConsistencyLevel()
AstyanaxConfiguration.setDefaultReadConsistencyLevel()
however, fro
A snapshot just creates hard links to all your sstables, so there is no
control over the file sizes at snapshot time. You can control the sstable
size if you are on leveled compaction. I don't know about size-tiered.
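To illustrate the hard-link point above, here is a minimal Python sketch
(not Cassandra's own code). The file name example-Data.db and the helper
snapshot_by_hard_link are hypothetical, made up for this illustration. It
only shows that a hard link is a second name for the same file, so the
"snapshot" has exactly the size of the original:

```python
import os
import tempfile


def snapshot_by_hard_link(src_path, snapshot_dir):
    """Hypothetical helper: 'snapshot' one file by hard-linking it."""
    os.makedirs(snapshot_dir, exist_ok=True)
    dst_path = os.path.join(snapshot_dir, os.path.basename(src_path))
    os.link(src_path, dst_path)  # new directory entry, same data on disk
    return dst_path


with tempfile.TemporaryDirectory() as data_dir:
    # Stand-in for an sstable data file (the name is an assumption).
    sstable = os.path.join(data_dir, "example-Data.db")
    with open(sstable, "wb") as f:
        f.write(b"x" * 1024)

    snap = snapshot_by_hard_link(sstable, os.path.join(data_dir, "snapshots"))

    same_file = os.path.samefile(sstable, snap)  # both names, one file
    link_count = os.stat(sstable).st_nlink       # number of names for the file
    snap_size = os.path.getsize(snap)            # identical to the source size
```

Because both names refer to the same data, the size of a snapshot file is
whatever size compaction gave the original sstable.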
On Fri, Sep 20, 2013 at 6:56 PM, java8964 java8964 wrote:
> Hi,
>
> Our production environment currently uses Cassandra 1.0, and w
Hi,
Our production environment currently uses Cassandra 1.0 and will be upgraded
to 1.1 next week.
I noticed that the sizes of the snapshot and incremental backup sstable files
generated from our production environment vary dramatically. Some files can be
hundreds of MB, or even close to a GB, but a lot of files are even
Yeah, I know it was vague, but that is due to the fact that I'm still coming up
to speed on the project and have yet to hear some of the details. Since I had
heard that there has always been a requirement for ad-hoc queries against the
Oracle DB for data-mining purposes, that was the best I coul
On Fri, Sep 20, 2013 at 4:20 PM, Hartzman, Leslie <
leslie.d.hartz...@medtronic.com> wrote:
> Thanks Rob. I thought that might have been the situation but wasn’t
> sure. So does this negate the use of cqlsh to do this then? I’d hate to
> have to provide custom code to support ad-hoc queries.
>
T
Cool! Thanks for the suggestions.
From: Peter Lin [mailto:wool...@gmail.com]
Sent: Friday, September 20, 2013 4:52 PM
To: user@cassandra.apache.org
Subject: Re: Ad-hoc queries question
there are several ways of handling these types of use cases. Some people take a
soft real-time approach by cal
There are several ways of handling these types of use cases. Some people
take a soft real-time approach by calculating aggregates in-memory and
saving them to tables periodically. One example of this is Twitter and Storm.
Other techniques include using batch processes to extract summaries and
storing
By ad-hoc queries I mean exactly what you've described. The need to access data
from multiple column families, typically addressed in RDBs with JOINs.
I haven't really become familiar enough with MapReduce yet, so I'll have to
delve deeper into that. I'm hoping that the de-normalized nature of t
What do you mean by ad-hoc queries?
Most NoSQL databases do not support cross-table joins, due to the
distributed nature of NoSQL databases. If we compare this to partitioned
databases in the RDB world, cross-partition joins are also more expensive
there than joins in non-partitioned databases.
you can do ad-hoc
Thanks Rob. I thought that might have been the situation but wasn't sure. So
does this negate the use of cqlsh to do this then? I'd hate to have to provide
custom code to support ad-hoc queries.
Les
From: Robert Coli [mailto:rc...@eventbrite.com]
Sent: Friday, September 20, 2013 4:06 PM
To: use
On Fri, Sep 20, 2013 at 3:25 PM, Hartzman, Leslie <
leslie.d.hartz...@medtronic.com> wrote:
> So are ad-hoc queries more awkward or not feasible?
>
Yes.
To expand slightly, you will probably end up querying multiple
columnfamilies and doing the ad-hoc JOIN-esque aspect in application code.
=Ro
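As a rough illustration of doing the JOIN-esque part in application code,
here is a minimal Python sketch. The rows are plain dictionaries standing in
for the results of two separate column family queries; the names users,
orders, user_id and join_rows are all assumptions made up for this example,
and no Cassandra client API is used or implied:

```python
from collections import defaultdict


def join_rows(left_rows, right_rows, key):
    """Hash join: index one result set by key, then probe with the other."""
    index = defaultdict(list)
    for right in right_rows:
        index[right[key]].append(right)

    joined = []
    for left in left_rows:
        # Rows without a match are dropped, like an inner join.
        for right in index.get(left[key], []):
            joined.append({**left, **right})
    return joined


# Stand-ins for rows fetched from two column families (hypothetical data).
users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "bob"}]
orders = [{"user_id": 1, "item": "book"}, {"user_id": 1, "item": "pen"}]

result = join_rows(users, orders, key="user_id")
```

This keeps one whole result set in memory, so it only suits cases where at
least one side of the join is reasonably small.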
On Fri, Sep 20, 2013 at 3:42 PM, Suruchi Deodhar <
suruchi.deod...@generalsentiment.com> wrote:
> Using the nodes in the same availability zone(us-east-1b), we still get a
> highly imbalanced cluster. The nodetool status and ring output is attached.
> Even after running repairs, the cluster does n
Did you start out your cluster after wiping all the sstables and commit
logs?
On Fri, Sep 20, 2013 at 3:42 PM, Suruchi Deodhar <
suruchi.deod...@generalsentiment.com> wrote:
> We have been trying to resolve this issue to find a stable configuration
> that can give us a balanced cluster with equal
I know that for NoSQL the idea is to figure out your queries beforehand and
then plan your data architecture to support them. And this typically is
accomplished with a denormalized database.
So are ad-hoc queries more awkward or not feasible?
Thanks.
Les
On Fri, Sep 13, 2013 at 7:48 AM, Dave Cowen wrote:
> We've been running Cassandra 1.1.12 in production since February, and have
> experienced a vexing problem with an arbitrary node "falling out of" or
> separating from the ring on occasion.
>
> Has anyone else seen similar behavior? For obviou
On Fri, Sep 20, 2013 at 1:22 PM, Chad Johnston wrote:
> I've checked out and built the 1.2.10-tentative branch, and I've noticed
> that all of my CQL prepared statements are now broken.
>
> Looking into the code, it looks like the "#" -> "=" and "@" -> "?"
> translations were removed. I tried to r
Can we have composite values in each column of a Cassandra column family?
user-id column1-name
123 (Column1-Value Column1-SchemaName Column1-LastModifiedDate)
userId is the row key here, and the same will apply to the other columns as well.
Each column value will contain below three
I've checked out and built the 1.2.10-tentative branch, and I've noticed
that all of my CQL prepared statements are now broken.
Looking into the code, it looks like the "#" -> "=" and "@" -> "?"
translations were removed. I tried to replace these in one of my scripts
with "=" and "?", but there's
Hello,
I have tried unsuccessfully to stream snapshots from a 4-node cluster to a
2-node cluster. I have set up a separate machine to run sstableloader on and
I can see in the logs that it starts the stream and that the tmp files are
created in the correct columnfamily folders.
we are on 1.2.5 o
On Thu, Sep 19, 2013 at 9:13 PM, Keith Bogs wrote:
> I've been playing with Cassandra and have a few questions that I've been
> stuck on for awhile, and Googling around didn't seem to help much:
>
> 1. What's the quickest way to import a bunch of data from PostgreSQL? I
> have ~20M rows with most
On Fri, Sep 20, 2013 at 9:24 AM, Jayadev Jayaraman wrote:
> As a follow-up, is operating a Cassandra cluster with machines on multiple
> racks and vnodes bound to cause load imbalance? Shouldn't token-ranges
> assigned to individual machines via their vnodes be approximately
> balanced? We're ot
As a follow-up, is operating a Cassandra cluster with machines on multiple
racks and vnodes bound to cause load imbalance? Shouldn't token-ranges
assigned to individual machines via their vnodes be approximately
balanced? We're otherwise unable to explain why this imbalance occurs. (it
shouldn't
Like I said in my previous reply, I am not sure if that is the
problem, which is why I thought it would be a good idea to run your test
with the cluster in one rack only.
I'll take a look at your ring output today. Did you also post cfstats
output?
On Fri, Sep 20, 2013 at 9:24 AM, Jayadev Jayaram
Nice! That explains it.
2013/9/19 Robert Coli
> On Thu, Sep 19, 2013 at 3:08 AM, Rene Kochen wrote:
>
>> And how does cfstats track the maximum size? What does "Compacted" mean
>> in "Compacted row maximum size".
>>
>
> That maximum size is "the largest row that I have encountered in the
> cou
Hi,
I get a lot of exceptions when using Pig scripts over Cassandra. I have to
launch them again and again until they work. You can find a sample of the
stacks when it works (twice) and when it fails (3 times) at
http://pastebin.com/yWsTHbix. I use the following sample script (there are only
a
27 matches