Reconfiguring nodes - getting bootstrap error

2010-06-22 Thread Anthony Ikeda
I had to reconfigure my Cassandra nodes today to allow us to use
Lucandra and made the following changes:

* Shutdown ALL Cassandra instances

* For each node:

o   Added in Lucandra Keyspace

o   Changed the Partitioner to OrderPreservingPartitioner

o   Deleted the folders in my Data File Directory

o   Deleted the files in my Commit Log Directory

* Started each node individually

 

Now it seems that I'm getting bootstrap errors cascading to each node,
starting with the first node started (error below)

 

I understand this is because they are not new nodes, but I thought
deleting the data and commit log files would correct this - as there is
no data to get. Are there some other files to remove?

 

The system had been running with the RandomPartitioner without problem
for the last week.

 

 

INFO 16:56:47,322 Auto DiskAccessMode determined to be mmap

 INFO 16:56:48,045 Saved Token not found. Using ADZAODw5LiJt6juc

 INFO 16:56:48,045 Saved ClusterName not found. Using MAMBO Space

 INFO 16:56:48,053 Creating new commitlog segment
/var/cassandra/log/CommitLog-1277189808053.log

 INFO 16:56:48,096 Starting up server gossip

 INFO 16:56:48,119 Joining: getting load information

 INFO 16:56:48,119 Sleeping 9 ms to wait for load information...

 INFO 16:56:48,165 Node /172.28.1.138 is now part of the cluster

 INFO 16:56:48,170 Node /172.28.1.139 is now part of the cluster

 INFO 16:56:48,171 Node /172.28.2.136 is now part of the cluster

 INFO 16:56:48,172 Node /172.28.1.141 is now part of the cluster

 INFO 16:56:49,136 InetAddress /172.28.1.141 is now UP

 INFO 16:56:49,145 InetAddress /172.28.2.136 is now UP

 INFO 16:56:49,146 InetAddress /172.28.1.138 is now UP

 INFO 16:56:49,147 InetAddress /172.28.1.139 is now UP

 INFO 16:57:06,772 Node /172.28.2.138 is now part of the cluster

 INFO 16:57:07,538 InetAddress /172.28.2.138 is now UP

 INFO 16:57:12,179 InetAddress /172.28.1.138 is now dead.

 INFO 16:57:19,195 error writing to /172.28.1.138

 INFO 16:57:24,205 error writing to /172.28.1.139

 INFO 16:57:26,209 InetAddress /172.28.1.139 is now dead.

 INFO 16:57:43,242 error writing to /172.28.1.141

 INFO 16:57:50,254 InetAddress /172.28.1.141 is now dead.

 INFO 16:57:59,271 error writing to /172.28.2.136

 INFO 16:58:05,280 InetAddress /172.28.2.136 is now dead.

 INFO 16:58:18,136 Joining: getting bootstrap token

ERROR 16:58:18,139 Exception encountered during startup.

java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
        at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
        at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
        at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:356)
        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
Exception encountered during startup.
java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
        at org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
        at org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
        at org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
        at org.apache.cassandra.service.StorageService.initServer(StorageService.java:356)
        at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
        at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)

 

 

Anthony Ikeda

Java Analyst/Programmer

Cardlink Services Limited

Level 4, 3 Rider Boulevard

Rhodes NSW 2138

 

Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283

 

 



Deletion and batch_mutate

2010-06-22 Thread Ron
Hi everyone,
I'm a new user of Cassandra, and during my tests, I've encountered a problem
with deleting rows from CFs.
I'm using Cassandra 0.6.2 and coding in Java against the native Thrift API.

The way my application works, I need to delete multiple rows at a time (just
like reads and writes).
Obviously, in terms of performance, I'd rather use batch_mutate and delete
several rows and not issue a remove command on each and every row.
So far, all attempts at doing so have failed.
The following command configurations have been tested:

   1. Deletion, without a Supercolumn or SlicePredicate set. I get this
   error: InvalidRequestException(why:A Deletion must have a SuperColumn, a
   SlicePredicate or both.)
   2. Deletion, with a SlicePredicate set. The SlicePredicate is without
   column names or SliceRange set. I get this error:
   InvalidRequestException(why:A SlicePredicate must be given a list of
   Columns, a SliceRange, or both)
   3. Deletion, with a SlicePredicate set. The SlicePredicate is set with
   SliceRange. The SliceRange is set with empty start and finish values. I get
   this error: InvalidRequestException(why:Deletion does not yet support
   SliceRange predicates.)

At this point I'm left with no other alternatives (since I want to delete a
whole row and not specific columns/supercolumns within a row).
Using the remove command in a loop has serious implications in terms of
performance.
Is there any solution to this problem?
Thanks,
Ron


Re: java.lang.OutOfMemoryError: Map failed

2010-06-22 Thread Oleg Anastasjev
> Daniel:
>  
> Thanks. That thread helped me solve my problem.
>  
> I was able to run a 700k MySQL record import without a single memory error. 
>  
> I changed the following sections in storage-conf.xml to fix the OutofMemory
errors:
>  
>  standard
>  batch 
>  1

Going to standard mode is not ideal: it imposes a performance penalty of about
30% on uncached reads, according to my own homegrown stress test.

Do you use a 32-bit or 64-bit JVM? On a 32-bit JVM you have no choice but to use
standard mode. If you're using a 64-bit JVM and still getting this error, it looks
like your virtual memory space is limited. If you're under Linux you can fix it with
the command
ulimit -v unlimited
in the same shell just before launching the Cassandra node.




Re: Deletion and batch_mutate

2010-06-22 Thread Mishail
Take a look at

https://issues.apache.org/jira/browse/CASSANDRA-494

https://issues.apache.org/jira/browse/CASSANDRA-1027


On 22.06.2010 19:00, Ron wrote:
> Hi everyone,
> I'm a new user of Cassandra, and during my tests, I've encountered a
> problem with deleting rows from CFs.
> I use Cassandra 0.6.2 and coding in Java, using the native Java Thrift API.
> 
> The way my application works, I need to delete multiple rows at a time
> (just like reads and writes).
> Obviously, in terms of performance, I'd rather use batch_mutate and
> delete several rows and not issue a remove command on each and every row.
> So far, all attempts doing so have failed.
> The following command configurations have been tested:
> 
>1. Deletion, without a Supercolumn or SlicePredicate set. I get this
>   error: InvalidRequestException(why:A Deletion must have a
>   SuperColumn, a SlicePredicate or both.)
>2. Deletion, with a SlicePredicate set. The SlicePredicate is without
>   column names or SliceRange set. I get this error:
>   InvalidRequestException(why:A SlicePredicate must be given a list
>   of Columns, a SliceRange, or both)
>3. Deletion, with a SlicePredicate set. The SlicePredicate is set
>   with SliceRange. The SliceRange is set with empty start and finish
>   values. I get this error: InvalidRequestException(why:Deletion
>   does not yet support SliceRange predicates.)
> 
> At this point I'm left with no other alternatives (since I want to
> delete a whole row and not specific columns/supercolumns within a row).
> Using the remove command in a loop has serious implications in terms of
> performance.
> Is there any solution for this problems?
> Thanks,
> Ron




unsubscribe

2010-06-22 Thread Dean Steele
unsubscribe
d...@dintran.com 
Dean Steele 
Reason: too much mail volume, I would prefer a weekly case study
review. 



[OT] Re: unsubscribe

2010-06-22 Thread Torsten Curdt
Hey Dean ...and everyone else not managing to unsubscribe (and sending
mails to the list instead):

If you don't know how to unsubscribe you can always look at the

 List-Unsubscribe:

header of any of the list emails.

These days most of the time you will find that an "-unsubscribe"
suffix is used instead of sending it in the subject or body.

Many of the archives provide RSS feeds for mailing lists in case you
just want to read.

HTH

cheers
--
Torsten

On Tue, Jun 22, 2010 at 13:10, Dean Steele  wrote:
> unsubscribe
> d...@dintran.com
> Dean Steele
> Reason: too much mail volume, I would prefer an weekly case study
> review.
>
>


OrderPreservingPartitioner and manual token assignment

2010-06-22 Thread Maxim Kramarenko

Hello!

I use OrderPreservingPartitioner and assign tokens manually.

Questions are:

1) Why is the range sorted in alphabetical order rather than numeric order?
It was OK with RandomPartitioner.

Address       Status  Load      Range   Ring
                                84
172.19.0.35   Up      2.47 GB   0       |<--|
172.19.0.31   Up      1.85 GB   112     |   ^
172.19.0.33   Up      1.46 GB   142     v   |
172.19.0.30   Up      1.44 GB   28      |   ^
172.19.0.32   Up      2.63 GB   56      v   |
172.19.0.34   Up      3.29 GB   84      |-->|


2) What is the token range? For example, all our keys start with a customer
number (a few digits), but digits are only a small part of the ASCII table.


What is the best way to assign tokens manually when using
OrderPreservingPartitioner?


--
Best regards,
 Maximmailto:maxi...@trackstudio.com

LinkedIn Profile: http://www.linkedin.com/in/maximkr
Google Talk/Jabber: maxi...@gmail.com
ICQ number: 307863079
Skype Chat: maxim.kramarenko
Yahoo! Messenger: maxim_kramarenko


Re: New to cassandra

2010-06-22 Thread yaw
And this one is useful:

https://wiki.fourkitchens.com/display/PF/Using+Cassandra+with+PHP




2010/6/22 Shahan Khan 

> The wiki is a great place:
>
> http://wiki.apache.org/cassandra/FrontPage
>
> Getting Started: http://wiki.apache.org/cassandra/GettingStarted
>
> Cassandra interfaces with PHP via thrift
>
> http://wiki.apache.org/cassandra/ThriftExamples
>
> Shahan
>
> On Mon, 21 Jun 2010 15:16:51 +0530, Ajay Singh 
> wrote:
>
> Hi
>
> I am a php developer, I am new to cassandra. Is there any starting guide or
> tutorial  from where i can begin
>
> Thanks
> Ajay
>
>
>


Re: OrderPreservingPartitioner and manual token assignment

2010-06-22 Thread Sylvain Lebresne
2010/6/22 Maxim Kramarenko :
> Hello!
>
> I use OrderPreservingPartitioner and assign tokens manually.
>
> Questions are:
>
> 1) Why range sorted in alphabetical order, not numeric order ?
> It was ok with RandomPartitioner

With RandomPartitioner, tokens are MD5 hashes, thus numbers, and the
comparison between two tokens is numeric.

With OrderPreservingPartitioner, tokens are the keys themselves, that is
to say strings, and the comparison is (UTF-8) string comparison (hence the
alphabetical sorting). Note that, as such, when switching from RP to OPP,
you most certainly don't want to keep the same tokens, as they represent
very different things (MD5 hashes vs. string keys).
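
A tiny, self-contained illustration of the difference (plain Java, nothing
Cassandra-specific; the numbers are just examples):

public class TokenOrder
{
    public static void main(String[] args)
    {
        // With OrderPreservingPartitioner the tokens are the key strings themselves,
        // so "112" sorts before "28" lexicographically even though 112 > 28 numerically.
        System.out.println("112".compareTo("28") < 0); // true: string order
        System.out.println(112 > 28);                  // true: numeric order is the reverse
    }
}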

>
> Address       Status     Load          Range           Ring
>
> 84
> 172.19.0.35   Up         2.47 GB       0           |<--|
> 172.19.0.31   Up         1.85 GB 112
>  |   ^
> 172.19.0.33   Up         1.46 GB 142
>  v   |
> 172.19.0.30   Up         1.44 GB 28
> |   ^
> 172.19.0.32   Up         2.63 GB 56
> v   |
> 172.19.0.34   Up         3.29 GB 84
> |-->|
>
> 2) what is the token range ? For example, all our keys starts with customer
> number (a few digits), but number is only small part of ASCII table.
>
> What is the best way to assign tokens manually when using
> OrderPreservingPartitioner ?

The first thing is to find (most probably estimate) the domain and distribution
of the keys you will use. Note that this is really the hard part, as most of the
time you can only guess what the distribution will be, and most of the time you
will be wrong anyway and get bad load balancing.
But once you know that, you just assign as tokens the particular keys that split
this distribution as evenly as possible (where "split" is with respect to (UTF-8)
string comparison).

--
Sylvain

>
> --
> Best regards,
>  Maxim                            mailto:maxi...@trackstudio.com
>
> LinkedIn Profile: http://www.linkedin.com/in/maximkr
> Google Talk/Jabber: maxi...@gmail.com
> ICQ number: 307863079
> Skype Chat: maxim.kramarenko
> Yahoo! Messenger: maxim_kramarenko
>


Re: django or pylons

2010-06-22 Thread Jonathan Ellis
What problems did you run into?

On Mon, Jun 21, 2010 at 6:32 AM, Eugenio Minardi
 wrote:
> Hi, I had gave a look to django + cassandra I found the twissandra project
> (a django version of twitter based on cassandra).
> But since I am new to django I couldnt make it work. If you find it
> interesting please give me a hint on how to proceed to make it work :)
> Eugenio
>
> On Mon, Jun 21, 2010 at 3:01 AM, S Ahmed  wrote:
>>
>> aren't you guys using django though? :)
>>
>> On Sun, Jun 20, 2010 at 7:40 PM, Joe Stump  wrote:
>>>
>>> A lot of the magic that Django brings to the table is derived from the
>>> ORM. If you're skipping that then Pylons likely makes more sense.
>>>
>>> --Joe
>>> On Jun 20, 2010, at 5:08 PM, Charles Woerner 
>>> wrote:
>>>
>>> I recently looked into this and came to the same conclusion, but I'm not
>>> an expert in either Django or Pylons so I'd also be interested in hearing
>>> what someone with more Python experience would say.
>>>
>>> On Sun, Jun 20, 2010 at 1:42 PM, S Ahmed  wrote:

 Seeing as I will be using a different ORM, would it make more sense to
 use pylons over django?
 From what I understand, pylons assumes less as compared to django.
>>>
>>>
>>> --
>>> ---
>>> Thanks,
>>>
>>> Charles Woerner
>>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
I want to use UUIDs whose alphanumeric order is the same as their
chronological order. So I'm generating Version 4 UUIDs (
http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29)
as follows:

import java.util.Random;
import java.util.UUID;

public class Id
{
   static Random random = new Random();

   public static String next()
   {
      // Target layout: xxxxxxxx-xxxx-4xxx-8xxx-xxxxxxxxxxxx (version 4, variant 8)

      // High 64 bits: current time in ms shifted left 16, with the version nibble
      // forced to 4 and the low 12 bits filled with randomness.
      long high = (System.currentTimeMillis() << 16) | 0x4000 | random.nextInt(4096);

      // Low 64 bits: random, with the top nibble forced to 8 (the variant bits).
      long low = (random.nextLong() >>> 4) | 0x8000000000000000L;

      UUID uuid = new UUID(high, low);

      return uuid.toString();
   }
}

Is there anything wrong with this idea?


Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Jonathan Ellis
What I would expect to have happen is for the removed node to
disappear from the ring and for nodes that are supposed to get more
data to start streaming it over.  I would expect it to be hours before
any new data started appearing anywhere when you are anticompacting
80+GB prior to the streaming part.
http://wiki.apache.org/cassandra/Streaming

On Tue, Jun 22, 2010 at 12:57 AM, Joost Ouwerkerk  wrote:
> Yes, although "forget" implies that we once knew we were supposed to do so.
> Given the following before-and-after states, on which nodes are we supposed
> to run repair?  Should the cluster be restarted?  Is there anything else we
> should be doing, or not doing?
>
> 1. Node is down due to hardware failure
>
> 192.168.1.104 Up 111.75 GB
> 8954799129498380617457226511362321354  |   ^
> 192.168.1.106 Up 113.25 GB
> 17909598258996761234914453022724642708 v   |
> 192.168.1.107 Up 75.65 GB
> 22386997823745951543643066278405803385 |   ^
> 192.168.1.108 Down    75.77 GB
> 26864397388495141852371679534086964062 v   |
> 192.168.1.109 Up 76.14 GB
> 35819196517993522469828906045449285416 |   ^
> 192.168.1.110 Up 75.9 GB
> 40296596082742712778557519301130446093 v   |
> 192.168.1.111 Up 95.21 GB
> 49251395212241093396014745812492767447 |   ^
>
> 2. nodetool removetoken 26864397388495141852371679534086964062
>
> 192.168.1.104 Up 111.75 GB
> 8954799129498380617457226511362321354  |   ^
> 192.168.1.106 Up 113.25 GB
> 17909598258996761234914453022724642708 v   |
> 192.168.1.107 Up 75.65 GB
> 22386997823745951543643066278405803385 |   ^
> 192.168.1.109 Up 76.14 GB
> 35819196517993522469828906045449285416 |   ^
> 192.168.1.110 Up 75.9 GB
> 40296596082742712778557519301130446093 v   |
> 192.168.1.111 Up 95.21 GB
> 49251395212241093396014745812492767447 |   ^
>
> At this point we're expecting 192.168.1.107 to pick up the slack for the
> removed token, and for 192.168.1.109 and/or 192.168.1.110 to start streaming
> data to 192.168.1.107 since they are holding the replicated data for that
> range.
>
> 3. nodetool repair ?
>
> On Tue, Jun 22, 2010 at 12:03 AM, Benjamin Black  wrote:
>>
>> Did you forget to run repair?
>>
>> On Mon, Jun 21, 2010 at 7:02 PM, Joost Ouwerkerk 
>> wrote:
>> > I believe we did nodetool removetoken on nodes that were already down
>> > (due
>> > to hardware failure), but I will check to make sure. We're running
>> > Cassandra
>> > 0.6.2.
>> >
>> > On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouwerkerk 
>> > wrote:
>> >>
>> >> Greg, can you describe the steps we took to decommission the nodes?
>> >>
>> >> -- Forwarded message --
>> >> From: Rob Coli 
>> >> Date: Mon, Jun 21, 2010 at 8:08 PM
>> >> Subject: Re: get_range_slices confused about token ranges after
>> >> decommissioning a node
>> >> To: user@cassandra.apache.org
>> >>
>> >>
>> >> On 6/21/10 4:57 PM, Joost Ouwerkerk wrote:
>> >>>
>> >>> We're seeing very strange behaviour after decommissioning a node: when
>> >>> requesting a get_range_slices with a KeyRange by token, we are getting
>> >>> back tokens that are out of range.
>> >>
>> >> What sequence of actions did you take to "decommission" the node? What
>> >> version of Cassandra are you running?
>> >>
>> >> =Rob
>> >>
>> >
>> >
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Deletion and batch_mutate

2010-06-22 Thread Jonathan Ellis
right.

in other words, you can delete entire rows w/ batch_mutate in 0.6.3 or
trunk, but for 0.6.2 the best workaround is to issue multiple remove
commands.
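
For reference, a minimal sketch of the 0.6.2 workaround against the raw Thrift
API (the keyspace and column family names are placeholders, and the client is
assumed to be already connected):

import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;

public class RowDeleter
{
    // Issues one remove() per row. A ColumnPath with only the column family set
    // (no super column, no column) removes the entire row.
    public static void deleteRows(Cassandra.Client client, String keyspace,
                                  String columnFamily, List<String> keys) throws Exception
    {
        ColumnPath wholeRow = new ColumnPath(columnFamily);
        long timestamp = System.currentTimeMillis() * 1000; // microseconds, like the writes
        for (String key : keys)
        {
            client.remove(keyspace, key, wholeRow, timestamp, ConsistencyLevel.QUORUM);
        }
    }
}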

On Tue, Jun 22, 2010 at 5:09 AM, Mishail  wrote:
> Take a look at
>
> https://issues.apache.org/jira/browse/CASSANDRA-494
>
> https://issues.apache.org/jira/browse/CASSANDRA-1027
>
>
> On 22.06.2010 19:00, Ron wrote:
>> Hi everyone,
>> I'm a new user of Cassandra, and during my tests, I've encountered a
>> problem with deleting rows from CFs.
>> I use Cassandra 0.6.2 and coding in Java, using the native Java Thrift API.
>>
>> The way my application works, I need to delete multiple rows at a time
>> (just like reads and writes).
>> Obviously, in terms of performance, I'd rather use batch_mutate and
>> delete several rows and not issue a remove command on each and every row.
>> So far, all attempts doing so have failed.
>> The following command configurations have been tested:
>>
>>    1. Deletion, without a Supercolumn or SlicePredicate set. I get this
>>       error: InvalidRequestException(why:A Deletion must have a
>>       SuperColumn, a SlicePredicate or both.)
>>    2. Deletion, with a SlicePredicate set. The SlicePredicate is without
>>       column names or SliceRange set. I get this error:
>>       InvalidRequestException(why:A SlicePredicate must be given a list
>>       of Columns, a SliceRange, or both)
>>    3. Deletion, with a SlicePredicate set. The SlicePredicate is set
>>       with SliceRange. The SliceRange is set with empty start and finish
>>       values. I get this error: InvalidRequestException(why:Deletion
>>       does not yet support SliceRange predicates.)
>>
>> At this point I'm left with no other alternatives (since I want to
>> delete a whole row and not specific columns/supercolumns within a row).
>> Using the remove command in a loop has serious implications in terms of
>> performance.
>> Is there any solution for this problems?
>> Thanks,
>> Ron
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: Reconfiguring nodes - getting bootstrap error

2010-06-22 Thread Jonathan Ellis
sounds like a problem with your seed configuration

On Tue, Jun 22, 2010 at 3:06 AM, Anthony Ikeda <
anthony.ik...@cardlink.com.au> wrote:

>  I had to reconfigure my Cassandra nodes today to allow us to use Lucandra
> and made the following changes:
>
> · Shutdown ALL Cassandra instances
>
> · For each node:
>
> o   Added in Lucandra Keyspace
>
> o   Changed the Partitioner to OrderPreservingPartitioner
>
> o   Deleted the folders in my Data File Directory
>
> o   Deleted the files in my Commit Log Directory
>
> · Started each node individually
>
>
>
> Now it seems that I’m getting bootstrap errors coalescing to each node
> starting with the first node started (Error below)
>
>
>
> I understand this is because they are not new nodes but I thought deleting
> the data and commit log files would correct this – as there is no data to
> get. Is there some other files to remove?
>
>
>
> The system had been running with the RandomPartitioner without problem for
> the last week.
>
>
>
>
>
> INFO 16:56:47,322 Auto DiskAccessMode determined to be mmap
>
>  INFO 16:56:48,045 Saved Token not found. Using ADZAODw5LiJt6juc
>
>  INFO 16:56:48,045 Saved ClusterName not found. Using MAMBO Space
>
>  INFO 16:56:48,053 Creating new commitlog segment
> /var/cassandra/log/CommitLog-1277189808053.log
>
>  INFO 16:56:48,096 Starting up server gossip
>
>  INFO 16:56:48,119 Joining: getting load information
>
>  INFO 16:56:48,119 Sleeping 9 ms to wait for load information...
>
>  INFO 16:56:48,165 Node /172.28.1.138 is now part of the cluster
>
>  INFO 16:56:48,170 Node /172.28.1.139 is now part of the cluster
>
>  INFO 16:56:48,171 Node /172.28.2.136 is now part of the cluster
>
>  INFO 16:56:48,172 Node /172.28.1.141 is now part of the cluster
>
>  INFO 16:56:49,136 InetAddress /172.28.1.141 is now UP
>
>  INFO 16:56:49,145 InetAddress /172.28.2.136 is now UP
>
>  INFO 16:56:49,146 InetAddress /172.28.1.138 is now UP
>
>  INFO 16:56:49,147 InetAddress /172.28.1.139 is now UP
>
>  INFO 16:57:06,772 Node /172.28.2.138 is now part of the cluster
>
>  INFO 16:57:07,538 InetAddress /172.28.2.138 is now UP
>
>  INFO 16:57:12,179 InetAddress /172.28.1.138 is now dead.
>
>  INFO 16:57:19,195 error writing to /172.28.1.138
>
>  INFO 16:57:24,205 error writing to /172.28.1.139
>
>  INFO 16:57:26,209 InetAddress /172.28.1.139 is now dead.
>
>  INFO 16:57:43,242 error writing to /172.28.1.141
>
>  INFO 16:57:50,254 InetAddress /172.28.1.141 is now dead.
>
>  INFO 16:57:59,271 error writing to /172.28.2.136
>
>  INFO 16:58:05,280 InetAddress /172.28.2.136 is now dead.
>
>  INFO 16:58:18,136 Joining: getting bootstrap token
>
> ERROR 16:58:18,139 Exception encountered during startup.
>
> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>
> at
> org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
>
> at
> org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
>
> at
> org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
>
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:356)
>
> at
> org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
>
> at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
>
> Exception encountered during startup.
>
> java.lang.RuntimeException: No other nodes seen!  Unable to bootstrap
>
> at
> org.apache.cassandra.dht.BootStrapper.getBootstrapSource(BootStrapper.java:120)
>
> at
> org.apache.cassandra.dht.BootStrapper.getBalancedToken(BootStrapper.java:102)
>
> at
> org.apache.cassandra.dht.BootStrapper.getBootstrapToken(BootStrapper.java:97)
>
> at
> org.apache.cassandra.service.StorageService.initServer(StorageService.java:356)
>
> at
> org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:99)
>
> at
> org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:177)
>
>
>
>
>
> Anthony Ikeda
>
> Java Analyst/Programmer
>
> Cardlink Services Limited
>
> Level 4, 3 Rider Boulevard
>
> Rhodes NSW 2138
>
>
>
> Web: www.cardlink.com.au | Tel: + 61 2 9646 9221 | Fax: + 61 2 9646 9283
>
>
>
>

Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Jonathan Ellis
Why not just use version 1 UUIDs and TimeUUIDType?

On Tue, Jun 22, 2010 at 8:58 AM, David Boxenhorn  wrote:
> I want to use UUIDs whose alphanumeric order is the same as their
> chronological order. So I'm generating Version 4 UUIDs (
> http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29
> ) as follows:
>
> public class Id
> {
>    static Random random = new Random();
>
>    public static String next()
>    {
>   // Format: --4xxx-8xxx-
>
>   long high = (System.currentTimeMillis() << 16) | 0x4000 |
> random.nextInt(4096);
>   long low = (random.nextLong() >>> 4) | 0x8000L;
>
>   UUID uuid = new UUID(high, low);
>
>   return uuid.toString();
>    }
> }
>
> Is there anything wrong with this idea?
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
As I understand it, the string value of TimeUUIDType does not sort
alphanumerically in chronological order. Isn't that right?

I want to use these ids in Oracle as well as Cassandra, and I want them to
sort in chronological order. In Oracle they will have to be varchars (I
think).

Even in Cassandra alone, what is the advantage of TimeUUIDType over
UTF8Type, if done this way? Is TimeUUIDType faster than UTF8Type? As far as
I can tell, the class I give below looks much easier and faster than the one
recommended here:
http://wiki.apache.org/cassandra/FAQ#working_with_timeuuid_in_java - which
looks really cumbersome, in addition to the fact that it uses a 3rd party
library and presumably machine dependent code!

On Tue, Jun 22, 2010 at 5:18 PM, Jonathan Ellis  wrote:

> Why not just use version 1 UUIDs and TimeUUIDType?
>
> On Tue, Jun 22, 2010 at 8:58 AM, David Boxenhorn 
> wrote:
> > I want to use UUIDs whose alphanumeric order is the same as their
> > chronological order. So I'm generating Version 4 UUIDs (
> >
> http://en.wikipedia.org/wiki/Universally_Unique_Identifier#Version_4_.28random.29
> > ) as follows:
> >
> > public class Id
> > {
> >static Random random = new Random();
> >
> >public static String next()
> >{
> >   // Format: --4xxx-8xxx-
> >
> >   long high = (System.currentTimeMillis() << 16) | 0x4000 |
> > random.nextInt(4096);
> >   long low = (random.nextLong() >>> 4) | 0x8000L;
> >
> >   UUID uuid = new UUID(high, low);
> >
> >   return uuid.toString();
> >}
> > }
> >
> > Is there anything wrong with this idea?
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-22 Thread Julie
Gary Dusbabek  gmail.com> writes:

> 
> *Hopefully* fixed.  I was never able to duplicate the problem on my
> workstation, but I had a pretty good idea what was causing the
> problem.  Julie, if you're in a position to apply and test the fix, it
> would help help us make sure we've got this one nailed down.
> 
> Gary.
> 
> On Thu, Jun 17, 2010 at 00:33, Jonathan Ellis  gmail.com> wrote:
> > That is consistent with the
> > https://issues.apache.org/jira/browse/CASSANDRA-1169 bug I mentioned,
> > that is fixed in the 0.6 svn branch.
> >
> > On Wed, Jun 16, 2010 at 10:51 PM, Julie  nextcentury.com>
wrote:
> >> The loop is in IncomingStreamReader.java, line 62, a 3-line while loop.
> >> bytesRead is not changing.  pendingFile.getExpectedBytes() returns
> >> 7,161,538,639 but bytesRead is stuck at 2,147,483,647.
> >>
> 
> 

Thanks for your help, Gary and Jonathan.

We updated the JVM to get rid of the Value too large exception (which it did -
yay!) and are still running with Cassandra 0.6.2. I have not been able to
duplicate the tight loop problem.

We did try out the in-progress Cassandra 0.6.3 (off the 0.6 SVN) yesterday, but
unfortunately I can't tell you whether the problem is gone, because I am seeing a
lot more timeouts on reads (retrying up to 10 times and then quitting), so I haven't
been able to get the database fully populated. I'm trying it again this morning. If
problems persist with the in-progress version, I'll drop back to 0.6.2, do some
torture writing, and see if the problem truly went away just by updating the JVM.

Thanks for all of your suggestions!
Julie 



Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Joost Ouwerkerk
I don't mind missing data for a few hours, it's the weird behaviour of
get_range_slices that's bothering me.  I added some logging to
ColumnFamilyRecordReader to see what's going on:

Split startToken=67160993471237854630929198835217410155,
endToken=68643623863384825230116928934887817211

...

Getting batch for range: 67965855060996012099315582648654139032 to
68643623863384825230116928934887817211

Token for last row is: 50448492574454416067449808504057295946

Getting batch for range: 50448492574454416067449808504057295946 to
68643623863384825230116928934887817211

...


Notice how the get_range_slices response is invalid since it returns an
out-of-range row.  This poisons the batching loop and causes the task to
spin out of control.

/joost
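
For illustration, here is a small sketch of the kind of range check that would
catch this (not part of ColumnFamilyRecordReader; the tokens are the ones from
the log above):

import java.math.BigInteger;

public class SplitCheck
{
    // RandomPartitioner ranges are (start, end]; wrap-around is ignored for brevity.
    public static boolean inSplit(BigInteger start, BigInteger end, BigInteger lastRowToken)
    {
        return lastRowToken.compareTo(start) > 0 && lastRowToken.compareTo(end) <= 0;
    }

    public static void main(String[] args)
    {
        BigInteger start = new BigInteger("67160993471237854630929198835217410155");
        BigInteger end   = new BigInteger("68643623863384825230116928934887817211");
        BigInteger last  = new BigInteger("50448492574454416067449808504057295946");
        System.out.println(inSplit(start, end, last)); // false: the returned row is out of range
    }
}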

On Tue, Jun 22, 2010 at 9:09 AM, Jonathan Ellis  wrote:

> What I would expect to have happen is for the removed node to
> disappear from the ring and for nodes that are supposed to get more
> data to start streaming it over.  I would expect it to be hours before
> any new data started appearing anywhere when you are anticompacting
> 80+GB prior to the streaming part.
> http://wiki.apache.org/cassandra/Streaming
>
> On Tue, Jun 22, 2010 at 12:57 AM, Joost Ouwerkerk 
> wrote:
> > Yes, although "forget" implies that we once knew we were supposed to do
> so.
> > Given the following before-and-after states, on which nodes are we
> supposed
> > to run repair?  Should the cluster be restarted?  Is there anything else
> we
> > should be doing, or not doing?
> >
> > 1. Node is down due to hardware failure
> >
> > 192.168.1.104 Up 111.75 GB
> > 8954799129498380617457226511362321354  |   ^
> > 192.168.1.106 Up 113.25 GB
> > 17909598258996761234914453022724642708 v   |
> > 192.168.1.107 Up 75.65 GB
> > 22386997823745951543643066278405803385 |   ^
> > 192.168.1.108 Down75.77 GB
> > 26864397388495141852371679534086964062 v   |
> > 192.168.1.109 Up 76.14 GB
> > 35819196517993522469828906045449285416 |   ^
> > 192.168.1.110 Up 75.9 GB
> > 40296596082742712778557519301130446093 v   |
> > 192.168.1.111 Up 95.21 GB
> > 49251395212241093396014745812492767447 |   ^
> >
> > 2. nodetool removetoken 26864397388495141852371679534086964062
> >
> > 192.168.1.104 Up 111.75 GB
> > 8954799129498380617457226511362321354  |   ^
> > 192.168.1.106 Up 113.25 GB
> > 17909598258996761234914453022724642708 v   |
> > 192.168.1.107 Up 75.65 GB
> > 22386997823745951543643066278405803385 |   ^
> > 192.168.1.109 Up 76.14 GB
> > 35819196517993522469828906045449285416 |   ^
> > 192.168.1.110 Up 75.9 GB
> > 40296596082742712778557519301130446093 v   |
> > 192.168.1.111 Up 95.21 GB
> > 49251395212241093396014745812492767447 |   ^
> >
> > At this point we're expecting 192.168.1.107 to pick up the slack for the
> > removed token, and for 192.168.1.109 and/or 192.168.1.110 to start
> streaming
> > data to 192.168.1.107 since they are holding the replicated data for that
> > range.
> >
> > 3. nodetool repair ?
> >
> > On Tue, Jun 22, 2010 at 12:03 AM, Benjamin Black  wrote:
> >>
> >> Did you forget to run repair?
> >>
> >> On Mon, Jun 21, 2010 at 7:02 PM, Joost Ouwerkerk 
> >> wrote:
> >> > I believe we did nodetool removetoken on nodes that were already down
> >> > (due
> >> > to hardware failure), but I will check to make sure. We're running
> >> > Cassandra
> >> > 0.6.2.
> >> >
> >> > On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouwerkerk <
> jo...@openplaces.org>
> >> > wrote:
> >> >>
> >> >> Greg, can you describe the steps we took to decommission the nodes?
> >> >>
> >> >> -- Forwarded message --
> >> >> From: Rob Coli 
> >> >> Date: Mon, Jun 21, 2010 at 8:08 PM
> >> >> Subject: Re: get_range_slices confused about token ranges after
> >> >> decommissioning a node
> >> >> To: user@cassandra.apache.org
> >> >>
> >> >>
> >> >> On 6/21/10 4:57 PM, Joost Ouwerkerk wrote:
> >> >>>
> >> >>> We're seeing very strange behaviour after decommissioning a node:
> when
> >> >>> requesting a get_range_slices with a KeyRange by token, we are
> getting
> >> >>> back tokens that are out of range.
> >> >>
> >> >> What sequence of actions did you take to "decommission" the node?
> What
> >> >> version of Cassandra are you running?
> >> >>
> >> >> =Rob
> >> >>
> >> >
> >> >
> >
> >
>
>
>
> --
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
>


Re: get_range_slices confused about token ranges after decommissioning a node

2010-06-22 Thread Jonathan Ellis
Ah, that sounds like
https://issues.apache.org/jira/browse/CASSANDRA-1198.  That it
happened after removetoken is just that that happened to change your
ring topology enough to make your queries start hitting it.

On Tue, Jun 22, 2010 at 10:39 AM, Joost Ouwerkerk  wrote:
> I don't mind missing data for a few hours, it's the weird behaviour of
> get_range_slices that's bothering me.  I added some logging to
> ColumnFamilyRecordReader to see what's going on:
>
> Split startToken=67160993471237854630929198835217410155,
> endToken=68643623863384825230116928934887817211
>
> ...
>
> Getting batch for range: 67965855060996012099315582648654139032 to
> 68643623863384825230116928934887817211
>
> Token for last row is: 50448492574454416067449808504057295946
>
> Getting batch for range: 50448492574454416067449808504057295946 to
> 68643623863384825230116928934887817211
>
> ...
>
> Notice how the get_range_slices response is invalid since it returns an
> out-of-range row.  This poisons the batching loop and causes the task to
> spin out of control.
> /joost
>
> On Tue, Jun 22, 2010 at 9:09 AM, Jonathan Ellis  wrote:
>>
>> What I would expect to have happen is for the removed node to
>> disappear from the ring and for nodes that are supposed to get more
>> data to start streaming it over.  I would expect it to be hours before
>> any new data started appearing anywhere when you are anticompacting
>> 80+GB prior to the streaming part.
>> http://wiki.apache.org/cassandra/Streaming
>>
>> On Tue, Jun 22, 2010 at 12:57 AM, Joost Ouwerkerk 
>> wrote:
>> > Yes, although "forget" implies that we once knew we were supposed to do
>> > so.
>> > Given the following before-and-after states, on which nodes are we
>> > supposed
>> > to run repair?  Should the cluster be restarted?  Is there anything else
>> > we
>> > should be doing, or not doing?
>> >
>> > 1. Node is down due to hardware failure
>> >
>> > 192.168.1.104 Up 111.75 GB
>> > 8954799129498380617457226511362321354  |   ^
>> > 192.168.1.106 Up 113.25 GB
>> > 17909598258996761234914453022724642708 v   |
>> > 192.168.1.107 Up 75.65 GB
>> > 22386997823745951543643066278405803385 |   ^
>> > 192.168.1.108 Down    75.77 GB
>> > 26864397388495141852371679534086964062 v   |
>> > 192.168.1.109 Up 76.14 GB
>> > 35819196517993522469828906045449285416 |   ^
>> > 192.168.1.110 Up 75.9 GB
>> > 40296596082742712778557519301130446093 v   |
>> > 192.168.1.111 Up 95.21 GB
>> > 49251395212241093396014745812492767447 |   ^
>> >
>> > 2. nodetool removetoken 26864397388495141852371679534086964062
>> >
>> > 192.168.1.104 Up 111.75 GB
>> > 8954799129498380617457226511362321354  |   ^
>> > 192.168.1.106 Up 113.25 GB
>> > 17909598258996761234914453022724642708 v   |
>> > 192.168.1.107 Up 75.65 GB
>> > 22386997823745951543643066278405803385 |   ^
>> > 192.168.1.109 Up 76.14 GB
>> > 35819196517993522469828906045449285416 |   ^
>> > 192.168.1.110 Up 75.9 GB
>> > 40296596082742712778557519301130446093 v   |
>> > 192.168.1.111 Up 95.21 GB
>> > 49251395212241093396014745812492767447 |   ^
>> >
>> > At this point we're expecting 192.168.1.107 to pick up the slack for the
>> > removed token, and for 192.168.1.109 and/or 192.168.1.110 to start
>> > streaming
>> > data to 192.168.1.107 since they are holding the replicated data for
>> > that
>> > range.
>> >
>> > 3. nodetool repair ?
>> >
>> > On Tue, Jun 22, 2010 at 12:03 AM, Benjamin Black  wrote:
>> >>
>> >> Did you forget to run repair?
>> >>
>> >> On Mon, Jun 21, 2010 at 7:02 PM, Joost Ouwerkerk 
>> >> wrote:
>> >> > I believe we did nodetool removetoken on nodes that were already down
>> >> > (due
>> >> > to hardware failure), but I will check to make sure. We're running
>> >> > Cassandra
>> >> > 0.6.2.
>> >> >
>> >> > On Mon, Jun 21, 2010 at 9:59 PM, Joost Ouwerkerk
>> >> > 
>> >> > wrote:
>> >> >>
>> >> >> Greg, can you describe the steps we took to decommission the nodes?
>> >> >>
>> >> >> -- Forwarded message --
>> >> >> From: Rob Coli 
>> >> >> Date: Mon, Jun 21, 2010 at 8:08 PM
>> >> >> Subject: Re: get_range_slices confused about token ranges after
>> >> >> decommissioning a node
>> >> >> To: user@cassandra.apache.org
>> >> >>
>> >> >>
>> >> >> On 6/21/10 4:57 PM, Joost Ouwerkerk wrote:
>> >> >>>
>> >> >>> We're seeing very strange behaviour after decommissioning a node:
>> >> >>> when
>> >> >>> requesting a get_range_slices with a KeyRange by token, we are
>> >> >>> getting
>> >> >>> back tokens that are out of range.
>> >> >>
>> >> >> What sequence of actions did you take to "decommission" the node?
>> >> >> What
>> >> >> version of Cassandra are you running?
>> >> >>
>> >> >> =Rob
>> >> >>
>> >> >
>> >> >
>> >
>> >
>>
>>
>>
>> --
>> Jonathan Ellis
>> Project Chair, Apache Cassandra
>> co-founder of Riptano, the source for professional Cassandra support
>> http://riptano.com
>
>




Finding new Cassandra data

2010-06-22 Thread David Boxenhorn
In my system, I have a Cassandra front end, and an Oracle back end. Some
information is created in the back end, and pushed out to the front end, and
some information is created in the front end and pulled into the back end.

Question: How do I locate new rows that have been created in Cassandra, for
import into Oracle?

I'm thinking of having a special column family "newRows" that contains only
the keys of the new rows. The offline process would look there to see what's
new, then delete those rows. The "newRows" CF would have no data! (The data
would be in the "real" CF.)

Is this a good solution? It seems weird to have a CF with rows but no data.
But I can't think of a better way.

Any thoughts?


Re: Finding new Cassandra data

2010-06-22 Thread Gary Dusbabek
On Tue, Jun 22, 2010 at 09:59, David Boxenhorn  wrote:
> In my system, I have a Cassandra front end, and an Oracle back end. Some
> information is created in the back end, and pushed out to the front end, and
> some information is created in the front end and pulled into the back end.
>
> Question: How do I locate new rows that have been crated in Cassandra, for
> import into Oracle?
>
> I'm thinking of having a special column family "newRows" that contains only
> the keys of the new rows. The offline process would look there to see what's
> new, then delete those rows. The "newRows" CF would have no data! (The data
> would be in the "real" CF.)

I've never tried an empty row, but I'm pretty sure you need at least one column.

>
> Is this a good solution? It seems weird to have a CF with rows but no data.
> But I can't think of a better way.
>
> Any thoughts?

Another approach would be to have a CF with a single row whose column
names refer to the new row ids.  This would allow you efficient
slicing.  The downside is that you'd need to make sure the row doesn't
get too wide.  So depending on your throughput and application
behavior, this may or may not work.

Gary.
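
A rough sketch of Gary's single-index-row idea against the 0.6 Thrift API (the
keyspace, column family, and row key names below are made up, and the index CF
is assumed to be CompareWith UTF8Type):

import java.util.List;

import org.apache.cassandra.thrift.Cassandra;
import org.apache.cassandra.thrift.ColumnOrSuperColumn;
import org.apache.cassandra.thrift.ColumnParent;
import org.apache.cassandra.thrift.ColumnPath;
import org.apache.cassandra.thrift.ConsistencyLevel;
import org.apache.cassandra.thrift.SlicePredicate;
import org.apache.cassandra.thrift.SliceRange;

public class NewRowIndex
{
    private static final String KEYSPACE  = "Keyspace1";   // placeholder
    private static final String INDEX_CF  = "NewRowIndex"; // placeholder, CompareWith UTF8Type
    private static final String INDEX_ROW = "new";         // the single wide index row

    // Record a newly created row key as a column name (with an empty value) in the index row.
    public static void recordNewRow(Cassandra.Client client, String newRowKey) throws Exception
    {
        ColumnPath path = new ColumnPath(INDEX_CF);
        path.setColumn(newRowKey.getBytes("UTF-8"));
        client.insert(KEYSPACE, INDEX_ROW, path, new byte[0],
                      System.currentTimeMillis() * 1000, ConsistencyLevel.QUORUM);
    }

    // Page through the index row; pass "" to start, then the last column name seen.
    public static List<ColumnOrSuperColumn> nextBatch(Cassandra.Client client, String lastSeen)
            throws Exception
    {
        SliceRange range = new SliceRange(lastSeen.getBytes("UTF-8"), new byte[0], false, 1000);
        SlicePredicate predicate = new SlicePredicate();
        predicate.setSlice_range(range);
        return client.get_slice(KEYSPACE, INDEX_ROW, new ColumnParent(INDEX_CF),
                                predicate, ConsistencyLevel.QUORUM);
    }
}

The offline importer would slice in batches, import the referenced rows into Oracle,
and then remove the processed column names from the index row.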


Re: Hector vs cassandra-java-client

2010-06-22 Thread Bjorn Borud
"Dop Sun"  writes:

> Updated.

the first Cassandra client lib to make it into the Maven repositories 
will probably end up with a big audience.  :-)

-Bjørn



Re: Finding new Cassandra data

2010-06-22 Thread Phil Stanhope
I can envision two fundamentally different approaches:

1. A CF that is CompareWith LONG ... use microsecond timestamps as your keys 
... then you can filter by time ranges.

This implies that you are willing to do a double write (once for the original 
data and then again for the logging). And a third read of a range_slice (which 
will most likely require pagination) to determine what to then push into your 
other system.

Which begs a question ... if you know you are inserting and generating keys ... 
and you know the keyname ... why not simply push the key into a queue 
(non-Cassandra) and do processing against that. So ...

2. Don't store new row keys in a CF ... at the point of using the thrift API 
simply build a log of new keys and process that log asynchronously.

This approach causes you to ask yourself another question: of the nodes in my 
cluster, am I willing to declare that some of those nodes are only available 
for write-thru processing? It's not Cassandra's job to make these decisions for 
you ... it's an application decision. If you allow all nodes to perform 
writes, then you'll either have to consolidate logs or introduce some form of 
common queue to coordinate the async updates to non-Cassandra data stores.

-phil

On Jun 22, 2010, at 11:18 AM, Gary Dusbabek wrote:

> On Tue, Jun 22, 2010 at 09:59, David Boxenhorn  wrote:
>> In my system, I have a Cassandra front end, and an Oracle back end. Some
>> information is created in the back end, and pushed out to the front end, and
>> some information is created in the front end and pulled into the back end.
>> 
>> Question: How do I locate new rows that have been crated in Cassandra, for
>> import into Oracle?
>> 
>> I'm thinking of having a special column family "newRows" that contains only
>> the keys of the new rows. The offline process would look there to see what's
>> new, then delete those rows. The "newRows" CF would have no data! (The data
>> would be in the "real" CF.)
> 
> I've never tried an empty row, but I'm pretty sure you need at least one 
> column.
> 
>> 
>> Is this a good solution? It seems weird to have a CF with rows but no data.
>> But I can't think of a better way.
>> 
>> Any thoughts?
> 
> Another approach would be to have a CF with a single row whose column
> names refer to the new row ids.  This would allow you efficient
> slicing.  The downside is that you'd need to make sure the row doesn't
> get too wide.  So depending on your throughput and application
> behavior, this may or may not work.
> 
> Gary.



Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Tatu Saloranta
On Tue, Jun 22, 2010 at 5:58 AM, David Boxenhorn  wrote:
> I want to use UUIDs whose alphanumeric order is the same as their
> chronological order. So I'm generating Version 4 UUIDs (
...
> Is there anything wrong with this idea?

If you want to keep it completely ordered, it's probably not enough to
rely on System.currentTimeMillis(), since it seems likely that it would
sometimes return the same value for two calls. This is easy to solve
locally (just use an additional counter, which is what UUID packages do
to get to 100-nanosecond resolution), and it might not matter in the
concurrent case (intra-node ordering is arbitrary but close enough).
The other theoretical problem is the reduction in random value space, but
75 bits of randomness may well be enough.
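
A minimal sketch of the "additional counter" idea for a single JVM (the class
and method names here are made up):

import java.util.concurrent.atomic.AtomicLong;

public class OrderedStamp
{
    private static final AtomicLong lastStamp = new AtomicLong();

    // Returns a strictly increasing value: the current millisecond shifted left to
    // leave 16 low bits free, or the previous value + 1 if the clock hasn't advanced.
    public static long next()
    {
        while (true)
        {
            long now = System.currentTimeMillis() << 16;
            long prev = lastStamp.get();
            long candidate = (now > prev) ? now : prev + 1;
            if (lastStamp.compareAndSet(prev, candidate))
            {
                return candidate;
            }
        }
    }
}

Folding something like this into the UUID's high bits (while keeping the version
nibble clear of the counter) would make the ordering strict within one process
rather than merely mostly right.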

-+ Tatu +-


Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
A little bit of time fuzziness on the order of a few milliseconds is fine
with me. This is user-generated data, so it only has to be time-ordered at
the level that a user can perceive.

I have no worries about my solution working - I'm sure it will work. I just
wonder if TimeUUIDType isn't superior for some reason that I don't know
about. (TimeUUIDType seems so bad in so many ways that I wonder why anyone
uses it. There must be some reason!)

On Tue, Jun 22, 2010 at 7:04 PM, Tatu Saloranta wrote:

> On Tue, Jun 22, 2010 at 5:58 AM, David Boxenhorn 
> wrote:
> > I want to use UUIDs whose alphanumeric order is the same as their
> > chronological order. So I'm generating Version 4 UUIDs (
> ...
> > Is there anything wrong with this idea?
>
> If you want to keep it completely ordered, it's probably not enough to
> rely on System.currentTimeMillis(). It seems likely that it would
> sometimes be called twice for same clock value?  This is easy to solve
> locally (just use an additional counter, that's what UUID packages do
> to get to 100 nanosecond resolution); and it might not matter in
> concurrent case (intra-node ordering is arbitrary but close enough).
> The other theoretical problem is reduction in random value space, but
> 75 bits of randomness may be well is enough.
>
> -+ Tatu +-
>


Re: Uneven distribution using RP

2010-06-22 Thread James Golick
This node's load is now growing at a ridiculous rate. It is at 105GB, with
the next most loaded node at 70.63GB.

Given that RF=3, I would assume that the replicas' nodes would grow
relatively quickly too?

On Mon, Jun 21, 2010 at 6:44 AM, aaron morton wrote:

> According to http://wiki.apache.org/cassandra/Operations nodetool repair
> is used to perform a major compaction and compare data between the nodes,
> repairing any conflicts. Not sure that would improve the load balance,
> though it may reduce some wasted space on the nodes.
>
> nodetool loadbalance will remove the node from the ring after streaming
> it's data to the remaining nodes and the add it back in the busiest part.
> I've used it before and it seems to do the trick.
>
> Also consider the size of the rows. Are they generally similar or do you
> have some that are much bigger? The keys will be distributed without
> considering the size of the data.
>
> The RP is random though, i do not think it tries to evenly distribute the
> keys. So some variance with a small number of nodes should be expected IMHO.
>
> Aaron
>
> On 21 Jun 2010, at 02:31, James Golick wrote:
>
> I ran cleanup on all of them and the distribution looked roughly even after
> that, but a couple of days later, it's looking pretty uneven.
>
> On Sun, Jun 20, 2010 at 10:21 AM, Jordan Pittier - Rezel  > wrote:
>
>> Hi,
>> Have you tried nodetool repair (or cleanup) on your nodes ?
>>
>>
>> On Sun, Jun 20, 2010 at 4:16 PM, James Golick wrote:
>>
>>> I just increased my cluster from 2 to 4 nodes, and RF=2 to RF=3, using
>>> RP.
>>>
>>> The tokens seem pretty even on the ring, but two of the nodes are far
>>> more heavily loaded than the others. I understand that there are a variety
>>> of possible reasons for this, but I'm wondering whether anybody has
>>> suggestions for now to tweak the tokens such that this problem is
>>> alleviated. Would it be better to just add 2 more nodes?
>>>
>>> Address   Status Load  Range
>>>  Ring
>>>
>>> 170141183460469231731687303715884105728
>>> 10.36.99.140  Up 61.73 GB
>>>  43733172796241720623128947447312912170 |<--|
>>> 10.36.99.134  Up 69.7 GB
>>> 85070591730234615865843651857942052864 |   |
>>> 10.36.99.138  Up 54.08 GB
>>>  128813844387867495544257452469445200073|   |
>>> 10.36.99.136  Up 54.75 GB
>>>  170141183460469231731687303715884105728|-->|
>>>
>>>
>>
>>
>
>


Re: Uneven distribution using RP

2010-06-22 Thread Robert Coli

On 6/22/10 10:07 AM, James Golick wrote:
This node's load is now growing at a ridiculous rate. It is at 105GB, 
with the next most loaded node at 70.63GB.


Given that RF=3, I would assume that the replicas' nodes would grow 
relatively quickly too?
What Replica Placement Strategy are you using (Rackunaware, Rackaware, 
etc?)? The current implementation of Rackaware is pretty simple and 
relies on careful placement of nodes in multiple DCs along the ring to 
avoid hotspots.


http://wiki.apache.org/cassandra/Operations#Replication
"
RackAwareStrategy: replica 2 is placed in the first node along the ring 
that belongs in another data center than the first; the remaining N-2 
replicas, if any, are placed on the first nodes along the ring in the 
same rack as the first


Note that with RackAwareStrategy, succeeding nodes along the ring should 
alternate data centers to avoid hot spots. For instance, if you have 
nodes A, B, C, and D in increasing Token order, and instead of 
alternating you place A and B in DC1, and C and D in DC2, then nodes C 
and A will have disproportionately more data on them because they will 
be the replica destination for every Token range in the other data center.

"

https://issues.apache.org/jira/browse/CASSANDRA-785

Is also related, and marked Fix Version 0.8.

=Rob



Re: Uneven distribution using RP

2010-06-22 Thread James Golick
RackUnaware, currently

On Tue, Jun 22, 2010 at 1:26 PM, Robert Coli  wrote:

> On 6/22/10 10:07 AM, James Golick wrote:
>
>> This node's load is now growing at a ridiculous rate. It is at 105GB, with
>> the next most loaded node at 70.63GB.
>>
>> Given that RF=3, I would assume that the replicas' nodes would grow
>> relatively quickly too?
>>
> What Replica Placement Strategy are you using (Rackunaware, Rackaware,
> etc?)? The current implementation of Rackaware is pretty simple and relies
> on careful placement of nodes in multiple DCs along the ring to avoid
> hotspots.
>
> http://wiki.apache.org/cassandra/Operations#Replication
> "
> RackAwareStrategy: replica 2 is placed in the first node along the ring the
> belongs in another data center than the first; the remaining N-2 replicas,
> if any, are placed on the first nodes along the ring in the same rack as the
> first
>
> Note that with RackAwareStrategy, succeeding nodes along the ring should
> alternate data centers to avoid hot spots. For instance, if you have nodes
> A, B, C, and D in increasing Token order, and instead of alternating you
> place A and B in DC1, and C and D in DC2, then nodes C and A will have
> disproportionately more data on them because they will be the replica
> destination for every Token range in the other data center.
> "
>
> https://issues.apache.org/jira/browse/CASSANDRA-785
>
> Is also related, and marked Fix Version 0.8.
>
> =Rob
>
>


forum application data model conversion

2010-06-22 Thread S Ahmed
Converting a Forum application to cassandra's data model.

Tables:

Posts [postID, threadID, userID, subject, body, created, lastmodified]

So this table contains the actual question subject and body.

When a user logs in, they want to see a list of their questions, ordered by
the last-modified date (to see if people responded to their
question).

How would you best do this in Cassandra, seeing as the question/answer text
is stored in another table?

I know you could make a CF like:

userID { postID1, postID2, ...}

And somehow order by last-modified, but then on the actual web page you
would have to first query for postIDs owned by the user, ordered by
last-modified.

THEN you would have to fetch the post data from the posts collection.

Is this the only way?  I mean other than repeating the post subject+body in
the user-to-postID index CF.


Write Rate / Second

2010-06-22 Thread Mubarak Seyed
How do I find out performance metrics such as the write rate per second and the 
read rate per second? I could not find them in the output of the tpstats and cfstats commands.

Are there any attributes in JMX? Can someone please help me.

Thanks,
Mubarak







Re: Write Rate / Second

2010-06-22 Thread Jonathan Ellis
rate = operations / latency
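
One way to turn those counters into a per-second figure is to sample a cumulative
operation count over JMX twice and divide by the elapsed time. A rough sketch
follows; the MBean name, attribute name, and port are what I'd expect for the 0.6
StorageProxy bean, so treat them as assumptions and verify them in jconsole first:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class WriteRateSampler
{
    public static void main(String[] args) throws Exception
    {
        // Default 0.6 JMX port is 8080 (assumption - adjust to your configuration).
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:8080/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        MBeanServerConnection mbs = connector.getMBeanServerConnection();
        ObjectName proxy = new ObjectName("org.apache.cassandra.db:type=StorageProxy");

        // Sample the cumulative write counter twice and compute a rate from the delta.
        long ops1 = ((Number) mbs.getAttribute(proxy, "WriteOperations")).longValue();
        long t1 = System.currentTimeMillis();
        Thread.sleep(10000); // sample window
        long ops2 = ((Number) mbs.getAttribute(proxy, "WriteOperations")).longValue();
        long t2 = System.currentTimeMillis();

        System.out.printf("writes/sec: %.1f%n", (ops2 - ops1) * 1000.0 / (t2 - t1));
        connector.close();
    }
}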

On Tue, Jun 22, 2010 at 2:50 PM, Mubarak Seyed  wrote:
> How to find out the performance metrics such as write rate per second, and 
> read rate per second. I could not find out from tpstats and cfstats command.
>
> Are there any attributes in JMX? Can someone please help me.
>
> Thanks,
> Mubarak
>
>
>
>
>
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: how to implement the function similar to inbox search?

2010-06-22 Thread Jonathan Ellis
Not having an index doesn't matter if you're going to read all the
subcolumns back at once, which IIANM is the idea here.

On Mon, Jun 21, 2010 at 12:20 PM, hu wei  wrote:
> in datamodel wiki:
>  You can think of each super column name as a term and the columns within as
> the docids with rank info and other attributes being a part of it. If you
> have keys as the userids then you can have a per-user index stored in this
> form. This is how the per user index for term search is laid out for Inbox
> search at Facebook.
> a question: because subcolumn does't has index ,does it has performance
> bottleneck?



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com


Re: bulk loading

2010-06-22 Thread Torsten Curdt
I looked at the thrift service implementation and got it working.
(Much faster import!)

Thanks!

On Mon, Jun 21, 2010 at 13:09, Oleg Anastasjev  wrote:
> Torsten Curdt  vafer.org> writes:
>
>>
>> First I tried with my one "cassandra -f" instance then I saw this
>> requires a separate IP. (Why?)
>
> This is because your import program becomes a special member of the cassandra
> cluster, so that it can speak the internal protocol. And each member of the
> cassandra cluster must have its own IP.
>
>> But even with a separate IPs
>> "StorageService.instance.getNaturalEndpoints" does not return an
>> endpoint.
>
> Did you defined -Dstorage-config for your import program to point to the same
> configuration your normal cassandra nodes use ?
>
> Did you initialized client-mode storage service, like below ?
>        // init cassandra proxy
>        try
>        {
>            StorageService.instance.initClient();
>        }
>        catch (IOException e)
>        {
>            throw new RuntimeException(e);
>        }
>        try
>        {
>            Thread.sleep(10*1000);
>        }
>        catch (InterruptedException e)
>        {
>            throw new RuntimeException(e);
>        }
>
>
>
>


Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread Tatu Saloranta
On Tue, Jun 22, 2010 at 9:12 AM, David Boxenhorn  wrote:
> A little bit of time fuzziness on the order of a few milliseconds is fine
> with me. This is user-generated data, so it only has to be time-ordered at
> the level that a user can perceive.

Ok, so mostly ordered. :-)

> I have no worries about my solution working - I'm sure it will work. I just
> wonder if TimeUUIDType isn't superior for some reason that I don't know
> about. (TimeUUIDType seems so bad in so many ways that I wonder why anyone
> uses it. There must be some reason!)

I think that, rationally speaking, a random-number-based UUID is the best,
provided one has a good random number generator.
But there is something intuitive about preferring the location + time-based
alternative, given the tiny chance of collision that any (pseudo-)random
number based system has.
So it just seems intuitively safer to use time-UUIDs, I think -- it
isn't, it just feels that way. :-)

Secondary reason is probably the ordering, and desire to stay
standards compliant.
As to ordering, if you wanted to use time-uuids, comparators that do
give time-based ordering are trivial, and no slower than lexical
sorting.
Java Uuid Generator (2.0) defaults to such comparator, as I agree that
this makes more sense than whatever sorting you would otherwise get.
It is unfortunate that clock chunks are ordered in weird way by uuid
specification; there is no reason it couldn't have been made "right
way" so that hex representation would sort nicely.

-+ Tatu +-


SQL Server to Cassandra Schema Design - Ideas Anyone?

2010-06-22 Thread Craig Faulkner
I'm having a little block in converting an existing SQL Server schema that we 
have into Cassandra Keyspace(s).  The whole key-value thing has just not 
clicked yet.  Do any of you know of any good examples that are more complex 
than the example in the readme file?

We are looking to report on web traffic so things like hits, page views, unique 
visitors,...  All the basic web stuff.  I'm very sure that one of you, likely 
many more, is already doing this.

Here are two queries, just to give you a few keywords related to the metrics 
that we want to move into Cassandra:


/* Data logged */
select t.Datetime,c.CustomerNumber,ct.cust_type,ws.SiteNumber,ws.SiteName
,f.Session,wa.page,wa.Note,f.CPUTime,f.DCLWaitTime,f.DCLRequestCount,'clientip' 
= dbo.u_IpInt2Str(ClientIP)
from warehouse.dbo.fact_WebHit f
join Warehouse.dbo.dim_Time t
  on t.ID = f.TimeID
join Warehouse.dbo.dim_CustomerType ct
  on ct.ID = f.CustomerTypeID
join Warehouse.dbo.dim_Customer c
  on c.ID = f.CustomerID
join Warehouse.dbo.dim_Symbol s
  on s.ID = f.SymbolID
join Warehouse.dbo.dim_WebAction wa
  on wa.ID = f.WebActionID
join Warehouse.dbo.dim_WebSite ws
  on ws.ID = f.WebSiteID

/* Data with surrogate keys */
select f.Timeid,f.CustomerID,f.CustomerTypeID,f.WebSiteID 
,f.Session,f.WebActionID,f.CPUTime
,f.DCLWaitTime,f.DCLRequestCount,ClientIP
from warehouse.dbo.fact_WebHit f


Any good info would be appreciated.  I have of course checked the main web 
sites but I could have missed something along the way.

Craig


Cassandra Health Monitoring

2010-06-22 Thread Andrew Psaltis
All,
We have been working through some operations scenarios so that we are ready to 
deploy our first Cassandra cluster into production in the coming months. During 
this process our operations folks have asked us to provide a health check 
service. I am using the word service very liberally here - really we just need 
to give the folks in our NOC a way to know not only that the Cassandra process 
is running (which they will get from their monitoring tools), but that it is 
actually alive and well. We do not intend to verify that the data is valid, 
just that every node in the cluster that is known to be running is actually 
alive and healthy. My questions are: What does it mean for a Cassandra node to 
be healthy? What are the lightest-weight checks (in terms of impact on the 
node's performance) we can run to make sure that a node is not a zombie?
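
The lightest-weight thing we have come up with so far is simply asking the node
a trivial Thrift question within a short timeout, along the lines of the sketch
below (I believe describe_cluster_name and port 9160 are part of the 0.6 Thrift
interface, but please correct me if that is wrong). Does checking more than
this buy us anything?

import org.apache.cassandra.thrift.Cassandra;
import org.apache.thrift.protocol.TBinaryProtocol;
import org.apache.thrift.transport.TSocket;
import org.apache.thrift.transport.TTransport;

// Cheap liveness probe: can the node answer a trivial Thrift call quickly?
// Exits 0 on success, 1 on failure, so the NOC tooling can alert on it.
public class NodeHealthCheck {
    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        int exitCode = 1;
        TTransport transport = new TSocket(host, 9160, 2000); // 2 second timeout
        try {
            transport.open();
            Cassandra.Client client = new Cassandra.Client(new TBinaryProtocol(transport));
            // Answered by the node itself and touches no data, so it is about
            // the cheapest "are you really alive?" question we can ask.
            System.out.println("OK: " + host + " reports cluster '"
                    + client.describe_cluster_name() + "'");
            exitCode = 0;
        } catch (Exception e) {
            System.out.println("FAIL: " + host + " -- " + e);
        } finally {
            transport.close();
        }
        System.exit(exitCode);
    }
}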

Any and all input is greatly appreciated.

Thanks,
Andrew



Re: java.lang.RuntimeException: java.io.IOException: Value too large for defined data type

2010-06-22 Thread Julie
Gary Dusbabek  gmail.com> writes:

> 
> *Hopefully* fixed.  I was never able to duplicate the problem on my
> workstation, but I had a pretty good idea what was causing the
> problem.  Julie, if you're in a position to apply and test the fix, it
> would help us make sure we've got this one nailed down.
> 
> Gary.
 
Gary,
I have run a full write test with the SVN 0.6 Cassandra from yesterday which
I'll call 0.6.3 beta. I am definitely not seeing the problem in 0.6.3 beta.  I
am seeing something different from what I saw in 0.6.2, which is probably
totally unrelated. I can tell you more if you are interested and want details.

The headline is that when I run my 8 write clients (all on separate nodes) with
10 cassandra nodes and my clients are writing as fast as they can with
consistency=ALL with 0.6.3 beta, I get timeouts within 2 minutes of starting my
run.  When I drop back to 0.6.2, I do not get the timeouts even after 30 minutes
of running, all else the same. I have cpu usage and disk io stats on all my
cassandra nodes during the 0.6.3 beta run if they would be helpful.

I am going to go back to 0.6.2 until 0.6.3 is officially released.  Just wanted
to try it out and let you know if I saw the problem.  Interestingly, since
updating our JVM, I'm not seeing the tight-loop problem in 0.6.2 either.

Thank you for your help!
Julie




Re: Uneven distribution using RP

2010-06-22 Thread James Golick
Turns out that this is due to a larger proportion of the wide rows in the
system being located on that node. I moved its token over a little to
compensate for it, but it doesn't seem to have helped at this point.

What's confusing about this is that RF=3 and no other node's load is growing
as quickly as that one.

- James

On Tue, Jun 22, 2010 at 1:31 PM, James Golick  wrote:

> RackUnaware, currently
>
>
> On Tue, Jun 22, 2010 at 1:26 PM, Robert Coli  wrote:
>
>> On 6/22/10 10:07 AM, James Golick wrote:
>>
>>> This node's load is now growing at a ridiculous rate. It is at 105GB,
>>> with the next most loaded node at 70.63GB.
>>>
>>> Given that RF=3, I would assume that the replicas' nodes would grow
>>> relatively quickly too?
>>>
>> What Replica Placement Strategy are you using (Rackunaware, Rackaware,
>> etc?)? The current implementation of Rackaware is pretty simple and relies
>> on careful placement of nodes in multiple DCs along the ring to avoid
>> hotspots.
>>
>> http://wiki.apache.org/cassandra/Operations#Replication
>> "
>> RackAwareStrategy: replica 2 is placed on the first node along the ring
>> that belongs in a different data center than the first; the remaining N-2
>> replicas, if any, are placed on the first nodes along the ring in the same
>> rack as the first
>>
>> Note that with RackAwareStrategy, succeeding nodes along the ring should
>> alternate data centers to avoid hot spots. For instance, if you have nodes
>> A, B, C, and D in increasing Token order, and instead of alternating you
>> place A and B in DC1, and C and D in DC2, then nodes C and A will have
>> disproportionately more data on them because they will be the replica
>> destination for every Token range in the other data center.
>> "
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-785
>>
>> Is also related, and marked Fix Version 0.8.
>>
>> =Rob
>>
>>
>


Re: Uneven distribution using RP

2010-06-22 Thread Jeremy Dunck
On Tue, Jun 22, 2010 at 4:08 PM, James Golick  wrote:
> Turns out that this is due to a larger proportion of the wide rows in the
> system being located on that node. I moved its token over a little to
> compensate for it, but it doesn't seem to have helped at this point.
> What's confusing about this is that RF=3 and no other node's load is growing
> as quickly as that one.

Maybe it's failing to compact for some reason?


Re: Uneven distribution using RP

2010-06-22 Thread James Golick
It's compacting at a ridiculously fast rate. The pending compactions have
been growing for a while.

It's also flushing memtables really quickly for a particular CF. Like,
really quickly. Like, one every minute. I increased the thresholds by 10x
and it's still going fast.

On Tue, Jun 22, 2010 at 5:27 PM, Jeremy Dunck  wrote:

> On Tue, Jun 22, 2010 at 4:08 PM, James Golick 
> wrote:
> > Turns out that this is due to a larger proportion of the wide rows in the
> > system being located on that node. I moved its token over a little to
> > compensate for it, but it doesn't seem to have helped at this point.
> > What's confusing about this is that RF=3 and no other node's load is
> growing
> > as quickly as that one.
>
> Maybe it's failing to compact for some reason?
>


nodetool loadbalance: Streams Continue on Non-Acceptance of New Token

2010-06-22 Thread Arya Goudarzi
Hi,

Please confirm whether this is an issue that should be reported, or whether I 
am doing something wrong. I could not find anything relevant in JIRA:

Playing with 0.7 nightly (today's build), I setup a 3 node cluster this way:

 - Added one node;
 - Loaded default schema with RF 1 from YAML using JMX;
 - Loaded 2M keys using py_stress;
 - Bootstrapped a second node;
 - Cleaned up the first node;
 - Bootstrapped a third node;
 - Cleaned up the second node;

I got the following ring:

Address   Status Load  Range
  Ring
   154293670372423273273390365393543806425  
  
10.50.26.132  Up 518.63 MB 69164917636305877859094619660693892452   
  |<--|
10.50.26.134  Up 234.8 MB  111685517405103688771527967027648896391  
  |   |
10.50.26.133  Up 235.26 MB 154293670372423273273390365393543806425  
  |-->|

Now I ran:

nodetool --host 10.50.26.132 loadbalance

It's been going for a while. I checked the streams:

nodetool --host 10.50.26.134 streams
Mode: Normal
Not sending any streams.
Streaming from: /10.50.26.132
   Keyspace1: 
/var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-3-Data.db/[(0,22206096), 
(22206096,27271682)]
   Keyspace1: 
/var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-4-Data.db/[(0,15180462), 
(15180462,18656982)]
   Keyspace1: 
/var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-5-Data.db/[(0,353139829), 
(353139829,433883659)]
   Keyspace1: 
/var/lib/cassandra/data/Keyspace1/Standard1-tmp-d-6-Data.db/[(0,366336059), 
(366336059,450095320)]

nodetool --host 10.50.26.132 streams
Mode: Leaving: streaming data to other nodes
Streaming to: /10.50.26.134
   /var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), 
(366336059,450095320)]
Not receiving any streams.

These have been going for the past 2 hours.

I see in the logs of the node with 134 IP address and I saw this:

INFO [GOSSIP_STAGE:1] 2010-06-22 16:30:54,679 StorageService.java (line 603) 
Will not change my token ownership to /10.50.26.132

So, to my understanding from the wiki, loadbalance is supposed to decommission 
the node, sending its ranges to other nodes, and then bootstrap it again. It 
has been stuck streaming for the past 2 hours and the size of the ring has not 
changed. The log on the first node shows it started streaming hours ago:

INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 72) 
Beginning transfer process to /10.50.26.134 for ranges 
(154293670372423273273390365393543806425,69164917636305877859094619660693892452]
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,255 StreamOut.java (line 82) 
Flushing memtables for Keyspace1...
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,266 StreamOut.java (line 128) Stream 
context metadata 
[/var/lib/cassandra/data/Keyspace1/Standard1-d-48-Data.db/[(0,366336059), 
(366336059,450095320)]] 1 sstables.
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 135) 
Sending a stream initiate message to /10.50.26.134 ...
 INFO [STREAM-STAGE:1] 2010-06-22 16:35:56,267 StreamOut.java (line 140) 
Waiting for transfer to /10.50.26.134 to complete
 INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 359) 
LocationInfo has reached its threshold; switching in a fresh Memtable at 
CommitLogContext(file='/var/lib/cassandra/commitlog/CommitLog-1277249454413.log',
 position=720)
 INFO [FLUSH-TIMER] 2010-06-22 17:36:53,370 ColumnFamilyStore.java (line 622) 
Enqueuing flush of Memtable(LocationInfo)@1637794189
 INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,370 Memtable.java (line 149) 
Writing Memtable(LocationInfo)@1637794189
 INFO [FLUSH-WRITER-POOL:1] 2010-06-22 17:36:53,528 Memtable.java (line 163) 
Completed flushing /var/lib/cassandra/data/system/LocationInfo-d-9-Data.db
 INFO [MEMTABLE-POST-FLUSHER:1] 2010-06-22 17:36:53,529 ColumnFamilyStore.java 
(line 374) Discarding 1000


Nothing more after this line.

Am I doing something wrong? 


Best Regards,
-Arya



Hector - Java doc

2010-06-22 Thread Mubarak Seyed
Where can I find the Javadoc for the Hector Java client? Do I need to build it
from source?

-- 
Thanks,
Mubarak Seyed.


Never ending compaction

2010-06-22 Thread James Golick
We had to take a node down for an upgrade last night. When we brought it
back online in the morning, it got slammed by hinted handoff (HH) data all day, 
so badly that it was compacting nearly constantly, and the pending compactions 
pool was
piling up. I shut most of the writes down to let things catch up, which they
mostly have, but in an effort to minimize downtime of the component that
relies on cassandra, I restarted the reading and writing with 4 pending
compactions.

Now, the writing is taking place quickly enough that compactions just keep
queueing up. That basically means that at this pace, compactions will
*never* complete. And compactions are expensive. They essentially make a
node useless. So, we're left with 3/4 of a cluster, since we only have 4
nodes.

Since then, another node in the cluster has started queueing up compactions.

This is on pretty beefy hardware, too:

2 x E5620, 24GB, 2 x 15kRPM SAS disks in RAID1 for data, and 1 x 7200RPM
SATA for commit logs.

I guess we need more nodes? But, we only have about 80GB total per node,
which doesn't really seem like that much for that kind of hardware?

- James


Re: Hector - Java doc

2010-06-22 Thread Jonathan Holloway
I couldn't find the docs online but the Ant build script here in the source:

http://github.com/rantav/hector/blob/master/build.xml

has a javadoc target you can run to generate them... hope that helps...

Jon.

On 22 June 2010 21:25, Mubarak Seyed  wrote:

> Where can i find the java doc for Hector java client? Do i need to build
> one from source?
>
> --
> Thanks,
> Mubarak Seyed.


Re: Hector - Java doc

2010-06-22 Thread Ran Tavory
There isn't an online javadoc page, but the code is online and well
documented and there's a wiki and all sorts of documents and examples
http://github.com/rantav/hector/blob/master/src/main/java/me/prettyprint/cassandra/service/Keyspace.java
http://wiki.github.com/rantav/hector/

On Wed, Jun 23, 2010 at 8:11 AM, Jonathan Holloway <
jonathan.hollo...@gmail.com> wrote:

> I couldn't find the docs online but the Ant build script here in the
> source:
>
> http://github.com/rantav/hector/blob/master/build.xml
>
> has a javadoc target you can run to generate them... hope that helps...
>
> Jon.
>
>
> On 22 June 2010 21:25, Mubarak Seyed  wrote:
>
>> Where can i find the java doc for Hector java client? Do i need to build
>> one from source?
>>
>> --
>> Thanks,
>> Mubarak Seyed.
>
>


Re: UUIDs whose alphanumeric order is the same as their chronological order

2010-06-22 Thread David Boxenhorn
Having a physical location encoded in the UUID *increases* the chance of a
collision, because it means fewer random bits. There definitely will be more
than one UUID created in the same clock unit on the same machine! The same
bits that you use to encode your few servers can be used for over 100
trillion random numbers!

"As to ordering, if you wanted to use time-uuids, comparators that do
give time-based ordering are trivial, and no slower than lexical
sorting."

"No slower" isn't a good reason to use it! I am willing to take a
(reasonable) time *penalty* to use lexically ordered UUIDs that will work
both in Cassandra and Oracle (and which are human-readable - always good for
debugging)!

I am also willing to take a reasonable penalty to avoid using weird
third-party code for generating UUIDs in the first place.
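
For what it's worth, here is a minimal sketch of the kind of lexically ordered
id I have in mind (not a standard UUID; the format and field widths are
arbitrary choices):

import java.security.SecureRandom;

// An identifier whose lexical order tracks creation order: a zero-padded hex
// millisecond timestamp followed by a random hex suffix.
public class SortableId {
    private static final SecureRandom RANDOM = new SecureRandom();

    public static String next() {
        long now = System.currentTimeMillis();
        long rnd = RANDOM.nextLong() & Long.MAX_VALUE;  // keep the suffix non-negative
        // 16 hex chars of time, then 16 hex chars of randomness.
        return String.format("%016x-%016x", now, rnd);
    }

    public static void main(String[] args) {
        System.out.println(next());
        System.out.println(next());
    }
}

Within the same millisecond the order is only as good as the random part, which
is exactly the level of fuzziness I said I can live with.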

On Tue, Jun 22, 2010 at 10:05 PM, Tatu Saloranta wrote:

> On Tue, Jun 22, 2010 at 9:12 AM, David Boxenhorn 
> wrote:
> > A little bit of time fuzziness on the order of a few milliseconds is fine
> > with me. This is user-generated data, so it only has to be time-ordered
> at
> > the level that a user can perceive.
>
> Ok, so mostly ordered. :-)
>
> > I have no worries about my solution working - I'm sure it will work. I
> just
> > wonder if TimeUUIDType isn't superior for some reason that I don't know
> > about. (TimeUUIDType seems so bad in so many ways that I wonder why
> anyone
> > uses it. There must be some reason!)
>
> I think that, rationally, a random-number based UUID is the best, provided
> one has a good random number generator.
> But there is something intuitive about preferring the location +
> time-based alternative, given the tiny chance of collision that any
> (pseudo) random number based system has.
> So it just seems intuitively safer to use time-uuids, I think -- it
> isn't, it just feels that way. :-)
>
> A secondary reason is probably the ordering, and the desire to stay
> standards compliant.
> As to ordering, if you wanted to use time-uuids, comparators that give
> time-based ordering are trivial, and no slower than lexical
> sorting.
> Java Uuid Generator (2.0) defaults to such a comparator, as I agree that
> this makes more sense than whatever sorting you would otherwise get.
> It is unfortunate that the clock chunks are ordered in a weird way by the
> uuid specification; there is no reason it couldn't have been done the
> "right way" so that the hex representation would sort nicely.
>
> -+ Tatu +-
>