Thanks Jeremy. These will be really useful.
On Wed, Aug 31, 2011 at 12:12 AM, Jeremy Hanna wrote:
> I've tried to help out with some UDFs and references that help with our use
> case: https://github.com/jeromatron/pygmalion/
>
There are some Brisk docs on Pig as well that might be helpful:
http://www.datastax.com/docs/0.8/brisk/about_pig
NEWS.txt covers upgrading.
[moving to user list.]
On Tue, Aug 30, 2011 at 8:47 PM, 邓志远 wrote:
> Hi All:
> We currently use Cassandra 0.7.5 in the cluster. How do we upgrade to
> Cassandra 0.8.4? There is a large amount of data in Cassandra 0.7.5. Can you
> tell me how to upgrade?
>
> Thanks!
--
Jonathan Ellis
Pr
Okay, I figured this out: the default for MemtableFlushAfterMins is not 60
minutes, as some here said and as the DataStax docs state
(http://www.datastax.com/docs/0.8/configuration/storage_configuration); it's 24
hours (1440 minutes). I changed it to 60 for every CF and now commit logs only
hang aro
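The per-CF change described above can be made live via cassandra-cli. A minimal sketch, assuming 0.8-era CLI syntax and a hypothetical column family name (the attribute name matches the memtable_flush_after setting discussed in this thread):

```
update column family MyColumnFamily with memtable_flush_after = 60;
```

Repeat for each column family whose memtables should flush more often.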
> > 86GB in commitlog and 42GB in data
>
> Whoa, that seems really wrong, particularly given your data spans 13 months.
> Have you changed any of the default cassandra.yaml settings? What is the
> maximum memtable_flush_after across all your CFs? Any warnings/errors in the
> Cassandra log?
>
Sorry - misread your earlier email. I would log in to IRC and ask in
#cassandra. I would think, given the nature of nanoTime, you'll run into
harder-to-track-down problems, but it may be fine.
On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
> Do you see any problem with my approach to derive the current time in
> nanoseconds though?
Do you see any problem with my approach to derive the current time in
nanoseconds though?
On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna wrote:
> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because
> System.nanoTime javadoc says "This method can only be used to measure elapsed
> time and is not related to any other notion of system or wall-clock time."
I've tried to help out with some UDFs and references that help with our use
case: https://github.com/jeromatron/pygmalion/
There are some brisk docs on pig as well that might be helpful:
http://www.datastax.com/docs/0.8/brisk/about_pig
On Aug 30, 2011, at 1:30 PM, Tharindu Mathew wrote:
> Than
Yes - the reason why internally Cassandra uses milliseconds * 1000 is because
System.nanoTime javadoc says "This method can only be used to measure elapsed
time and is not related to any other notion of system or wall-clock time."
http://download.oracle.com/javase/6/docs/api/java/lang/System.html
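As the javadoc quoted above says, nanoTime is only meaningful for measuring elapsed time. A minimal sketch of the distinction (the class name is mine, not from the thread):

```java
public class NanoTimeDemo {
    public static void main(String[] args) {
        // Wall-clock time scaled to microseconds: comparable across
        // processes and machines, suitable as a Cassandra timestamp.
        long wallMicros = System.currentTimeMillis() * 1000L;

        // nanoTime has an arbitrary per-JVM origin; the raw value is NOT
        // a wall-clock timestamp. Its only valid use is as a difference:
        long t0 = System.nanoTime();
        long elapsedNanos = System.nanoTime() - t0;

        System.out.println("wall-clock micros: " + wallMicros);
        System.out.println("elapsed nanos:     " + elapsedNanos);
    }
}
```

Two raw nanoTime values from different JVMs (or a nanoTime value and a wall-clock timestamp) are not comparable, which is why mixing them as Cassandra timestamps goes wrong.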
Indeed it's microseconds. We are talking about how to achieve microsecond
precision. One way is System.currentTimeMillis() * 1000, but that is only
precise to milliseconds: if there is more than one update in the same
millisecond, the second one may be lost. That's my original problem.
The oth
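One common workaround for the lost-update problem described above is to keep the milliseconds * 1000 base but force uniqueness within a process with a counter. A sketch under my own naming (this helper is hypothetical, not part of Pelops or Pycassa; assumes Java 8+ for AtomicLong.updateAndGet):

```java
import java.util.concurrent.atomic.AtomicLong;

public class MicrosecondClock {
    private static final AtomicLong last = new AtomicLong();

    // Returns a strictly increasing microsecond-scale timestamp.
    // If two calls land in the same millisecond, the second gets
    // the previous value + 1 instead of a duplicate.
    public static long next() {
        return last.updateAndGet(prev -> {
            long now = System.currentTimeMillis() * 1000L;
            return now > prev ? now : prev + 1;
        });
    }

    public static void main(String[] args) {
        long a = next();
        long b = next();
        System.out.println(b > a); // strictly increasing even within one ms
    }
}
```

This only guarantees uniqueness within a single process; concurrent writers on different clients can still collide in the same microsecond.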
Thanks Jeremy for your response. That gives me some encouragement that I
might be on the right track.
I think I need to try out more stuff before coming to a conclusion on Brisk.
For Pig operations over Cassandra, I could only find
http://svn.apache.org/repos/asf/cassandra/trunk/contrib/pig. Ar
Ed- you're right - milliseconds * 1000. That's right. The other stuff about
nano time still stands, but you're right - microseconds. Sorry about that.
On Aug 30, 2011, at 1:20 PM, Edward Capriolo wrote:
>
>
> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna wrote:
> I would not use nano time with cassandra.
On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna wrote:
> I would not use nano time with cassandra. Internally and throughout the
> clients, milliseconds is pretty much a standard. You can get into trouble
> because when comparing nanoseconds with milliseconds as long numbers,
> nanoseconds will always win.
I would not use nano time with cassandra. Internally and throughout the
clients, milliseconds is pretty much a standard. You can get into trouble
because when comparing nanoseconds with milliseconds as long numbers,
nanoseconds will always win. That bit us a while back when we deleted
something.
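The long-comparison trap is easy to see with concrete numbers: the same instant expressed in nanoseconds is a million times larger than in milliseconds, so a nanosecond-stamped write shadows every later millisecond-stamped write in a last-write-wins comparison. A small illustration (the class name is mine):

```java
public class TimestampUnitsDemo {
    public static void main(String[] args) {
        long millisStamp = System.currentTimeMillis();   // ~1.3e12 in 2011
        long nanosStamp  = millisStamp * 1_000_000L;     // same instant in ns

        // Stored as plain longs with no unit attached, the nanosecond
        // value is a million times larger, so it beats every
        // millisecond-stamped write that comes after it.
        System.out.println(nanosStamp > millisStamp);    // always true
    }
}
```

A delete stamped in milliseconds after a write stamped in nanoseconds is silently ignored, which matches the "deleted something" incident described above.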
Looks like the theory is correct for the Java case at least.
The default timestamp precision of Pelops is milliseconds, hence the problem
as explained by Peter. Once I supplied timestamps precise to the microsecond
(using System.nanoTime()), the problem went away.
I previously stated that sleeping for
FWIW, we are using Pig (and Hadoop) with Cassandra and are looking to
potentially move to Brisk because of the simplicity of operations there.
Not sure what you mean about the true power of Hadoop. In my mind the true
power of Hadoop is the ability to parallelize jobs and send each task to wher
Could you reproduce it?
No problems.
Anthony
On Tue, Aug 30, 2011 at 9:31 AM, Jonathan Ellis wrote:
> Sounds like a bug. Can you create a ticket on
> https://issues.apache.org/jira/browse/CASSANDRA ?
>
> On Tue, Aug 30, 2011 at 11:28 AM, Anthony Ikeda wrote:
> > One thing I have noticed is that when you query via the cli with an invalid
> > "assume" you no longer get the MarshalException beyond 0.8.1, it just states
> > "null"
Sounds like a bug. Can you create a ticket on
https://issues.apache.org/jira/browse/CASSANDRA ?
On Tue, Aug 30, 2011 at 11:28 AM, Anthony Ikeda wrote:
> One thing I have noticed is that when you query via the cli with an invalid
> "assume" you no longer get the MarshalException beyond 0.8.1, it just states
> "null"
One thing I have noticed is that when you query via the cli with an invalid
"assume" you no longer get the MarshalException beyond 0.8.1, it just states
"null"
Any chance this could be more user friendly? It kind of stumped me when I
switched to 0.8.4.
Anthony
On Mon, Aug 29, 2011 at 2:35 PM, A
Thank you. Problem solved.
On Aug 30, 2011, at 9:12 PM, Jonathan Ellis wrote:
> The right way to do this is to use a script of "create" commands:
>
> bin/cassandra-cli -f my-schema-creation-script
>
> On Tue, Aug 30, 2011 at 1:00 AM, Jenny wrote:
>> Hi
>> I notice that schematool was removed from the release of Cassandra 0.8. I
>> would like to know the reason of doing that and how i
> 86GB in commitlog and 42GB in data
Whoa, that seems really wrong, particularly given your data spans 13 months.
Have you changed any of the default cassandra.yaml settings? What is the
maximum memtable_flush_after across all your CFs? Any warnings/errors in the
Cassandra log?
> Out of curi
It's a single node. Thanks for the theory. I suspect part of it may
still be right. Will dig more.
On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller wrote:
>> The problem still happens with very high probability even when it
>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>
> The problem still happens with very high probability even when it
> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
> it can't be the cause. Also I have the same problem with a Java client
> using Pelops.
You connect to localhost, but is that a single node or part of a
cluster?
The problem still happens with very high probability even when it
pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
it can't be the cause. Also I have the same problem with a Java client
using Pelops.
On Tue, Aug 30, 2011 at 12:14 AM, Tyler Hobbs wrote:
>
> On Mon, Aug 29, 201
On Tue, Aug 30, 2011 at 6:54 AM, Sylvain Lebresne wrote:
> If you don't want to wait for the write to be applied by Cassandra before
> doing something else, then you can do that easily[1] client side.
Right.
Also consider that if you did have local replicas in each DC you could
get low-latency reads.
The right way to do this is to use a script of "create" commands:
bin/cassandra-cli -f my-schema-creation-script
On Tue, Aug 30, 2011 at 1:00 AM, Jenny wrote:
> Hi
> I notice that schematool was removed from the release of Cassandra 0.8. I
> would like to know the reason of doing that and how i
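For reference, such a script is just a sequence of CLI statements in a plain text file. A minimal sketch, assuming 0.8-era cassandra-cli syntax; the keyspace and column family names here are made up:

```
create keyspace MyKeyspace
  with placement_strategy = 'org.apache.cassandra.locator.SimpleStrategy'
  and strategy_options = [{replication_factor:1}];
use MyKeyspace;
create column family Users with comparator = UTF8Type;
```

Saved as my-schema-creation-script, it would be loaded with the bin/cassandra-cli -f invocation shown above.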
Look for a file called schema-sample.txt under the conf folder.
You'll find a sample schema and the command to load the same.
On Tue, Aug 30, 2011 at 11:30 AM, Jenny wrote:
> Hi
>
> I notice that schematool was removed from the release of Cassandra 0.8. I
> would like to know the reason of doin
There used to be a ZERO consistency level but it was removed because
it was harming more people than it was helping.
If what you want is very high availability, i.e. being able to write even
if the sole replica (in your RF=1 case) is down, then what you want to
use is CL ANY.
If you don't want to wait for the write to be applied by Cassandra before
doing something else, then you can do that easily[1] client side.
Is there any mechanism that would allow me to write to Cassandra with
no blocking at all?
I spent a long time figuring out a problem I encountered with one node
in each datacenter (LA and NY), using SS RF=1 and write consistency 1.
My row keys are yyyy-mm-dd-h, so basically for every hour a row woul