Since Twitter is everyone's favorite analogy: it's like Twitter, but faster and with bigger messages that I may need to go back and replay later to mine for more details. Thus I call it a queue, because the order of messages is important, but it's not anything like a message broker/pub-sub/topic, etc.
-JD

On Thu, Apr 1, 2010 at 9:43 PM, Jeremy Davis <jerdavis.cassan...@gmail.com> wrote:
>
> You are correct, it is not a queue in the classic sense... I'm storing the
> entire "conversation" with a client in perpetuity, and then playing it back
> in the order received.
>
> Rabbitmq/activemq etc. all have about the same throughput, 3-6K persistent
> messages/sec, and are not good for storing the conversation forever... Also
> I can easily scale Cassandra past that message rate and not have to worry
> about which message broker/cluster I'm connecting to/has the
> conversation/etc.
>
>
> On Thu, Apr 1, 2010 at 7:02 PM, Keith Thornhill <ke...@raptr.com> wrote:
>
>> you mention never deleting from the queue, so what purpose is this
>> serving? (if you don't pop off the front, is it really a queue?)
>>
>> seems if guaranteed order of messages is required, there are many
>> other projects which are focused towards that problem (rabbitmq,
>> kestrel, activemq, etc)
>>
>> or am i misunderstanding your needs here?
>>
>> -keith
>>
>> On Thu, Apr 1, 2010 at 6:32 PM, Jeremy Davis
>> <jerdavis.cassan...@gmail.com> wrote:
>> > I'm in the process of implementing a Totally Ordered Queue in Cassandra,
>> > and wanted to bounce my ideas off the list and also see if there are any
>> > other suggestions.
>> >
>> > I've come up with an external source of IDs that are always increasing
>> > (but not monotonic), and I've also used external synchronization to
>> > ensure only one writer to a given queue. And I handle de-duping in the app.
>> >
>> > My current solution is (simplified):
>> >
>> > Use the "QueueId" to key into a row of a CF.
>> > Then, every column in that CF corresponds to a new entry in the Queue,
>> > with a custom Comparator to sort the columns by my external ID that is
>> > always increasing.
>> >
>> > Technically I never delete data from the Queue, and I just page through
>> > it from a given ID using a SliceRange, etc.
>> >
>> > Obviously the problem being that the row needs to get compacted, so then
>> > I started bucketizing with multiple rows for a given queue (for example,
>> > one per day (again, I'm simplifying))... (so the Key is now "QueueId+Day"...)
>> >
>> > Does this seem reasonable? It's solvable, but is starting to seem
>> > complicated to implement... It would be very easy if I didn't have to
>> > have multiple buckets.
>> >
>> > My other thought is to store one entry per row, and perform
>> > get_range_slices and specify a KeyRange, with the OrderPreservingPartitioner.
>> > But it isn't exactly clear to me what the order of the keys is in this
>> > system, so I don't know how to construct my key and queries
>> > appropriately... Is this lexical string order? Or?
>> >
>> > So for example... Assuming my QueueIds are longs, and my IDs are also
>> > longs, my key would be (in Java):
>> >
>> > long queueId;
>> > long msgId;
>> >
>> > key = "" + queueId + ":" + msgId;
>> >
>> > And if I wanted to do a query, my key range might be from:
>> > start = "" + queueId + ":0"
>> > end = "" + queueId + ":" + Long.MAX_VALUE;
>> >
>> > (Will I have to left pad the msgIds with 0's?)
>> >
>> > And is this going to be efficient if my msgId isn't monotonically
>> > increasing?
>> >
>> > Thanks,
>> > -JD
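
P.S. On the zero-padding question in the quoted mail below: here's a minimal Java sketch (class and method names are made up for illustration), assuming the OrderPreservingPartitioner compares row keys as plain strings. Without a fixed width, "42:9" sorts after "42:10" lexically, which would break the start/end key range; left-padding both parts to 19 digits keeps string order in line with numeric order.

    public class QueueKeys {

        // Left-pad a non-negative long to 19 digits (the width of Long.MAX_VALUE)
        // so that lexical string order matches numeric order.
        static String pad(long value) {
            return String.format("%019d", value);
        }

        // Row key of the form "<queueId>:<msgId>", both parts zero-padded.
        static String rowKey(long queueId, long msgId) {
            return pad(queueId) + ":" + pad(msgId);
        }

        public static void main(String[] args) {
            long queueId = 42L;

            // Without padding, lexical order disagrees with numeric order:
            // "42:9" sorts after "42:10".
            System.out.println(("" + queueId + ":9").compareTo("" + queueId + ":10") > 0); // true

            // With padding, msgIds sort numerically and the start/end keys
            // bracket every msgId in the queue.
            String start = rowKey(queueId, 0L);
            String end   = rowKey(queueId, Long.MAX_VALUE);
            System.out.println(rowKey(queueId, 9L).compareTo(rowKey(queueId, 10L)) < 0); // true
            System.out.println(start.compareTo(rowKey(queueId, 9L)) < 0);                // true
            System.out.println(rowKey(queueId, 10L).compareTo(end) < 0);                 // true
        }
    }

This only covers the ordering part, of course; it doesn't say anything about whether the range scan stays efficient when msgIds arrive out of order.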