Guaranteeing globally unique TimeUUID's in a high throughput distributed system

Josh Dzielak Sat, 16 Mar 2013 14:25:10 -0700

I have a system where a client sends me arbitrary JSON events containing a 
timestamp at millisecond resolution. The timestamp is used to generate column 
names of type TimeUUIDType.


The problem I run into is this: if a client sends me 2 events with the same 
timestamp, the TimeUUID generated for each is the same, and we get 1 insert 
and 1 update instead of 2 inserts. I might be running many processes (in my 
case Storm supervisors) on the same node, so the machine-specific part of the 
UUID doesn't help.
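
To make the collision concrete, here is a minimal sketch of a deterministic version-1 UUID built from a millisecond timestamp. It is not Cassandra's code; the class `NaiveTimeUuid` and the `clockSeqAndNode` parameter (standing in for the machine-specific part) are hypothetical names used only for illustration.

```java
import java.util.UUID;

public class NaiveTimeUuid {
    // Milliseconds between the UUID epoch (1582-10-15) and the Unix epoch (1970-01-01).
    static final long EPOCH_OFFSET_MILLIS = 12_219_292_800_000L;

    /** Deterministic version-1 UUID: the same inputs always give the same UUID. */
    public static UUID fromMillis(long unixMillis, long clockSeqAndNode) {
        // Version-1 timestamps count 100 ns intervals since the UUID epoch.
        long ticks = (unixMillis + EPOCH_OFFSET_MILLIS) * 10_000L;
        long msb = (ticks & 0xFFFFFFFFL) << 32        // time_low
                 | ((ticks >>> 32) & 0xFFFFL) << 16   // time_mid
                 | 0x1000L                            // version 1
                 | ((ticks >>> 48) & 0x0FFFL);        // time_hi
        // Top two bits set to binary 10 (RFC 4122 variant); the rest is clock sequence + node.
        long lsb = (clockSeqAndNode & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L;
        return new UUID(msb, lsb);
    }
}
```

Two events with the same millisecond and the same machine-specific bits produce equal UUIDs, so the second write lands on the first one's column.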

I have noticed that the Cassandra UUIDGen class lets you work around this. It 
has a 'createTimeSafe' method that adds extra precision to the timestamp, so 
you can get up to 10k unique UUIDs for the same millisecond (one per 100 ns 
interval). That works pretty well for a single process (although it's still 
possible to go over 10k, it's unlikely in our actual production scenario). It 
does make searches at boundary conditions a little unpredictable ('equal' may 
or may not work depending on whether extra ns intervals were added), but I can 
live with that.
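
The per-process idea can be sketched roughly as follows. This is not the actual UUIDGen source, only an illustration of the technique under the assumption that a single shared counter per JVM is acceptable; `TickClock` and `nextTick` are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch: hands out unique, strictly increasing 100 ns ticks. */
public class TickClock {
    // One millisecond holds 10,000 ticks of 100 ns each.
    static final long TICKS_PER_MILLI = 10_000L;

    private final AtomicLong lastTick = new AtomicLong(Long.MIN_VALUE);

    /** Returns a tick count strictly greater than any value returned earlier. */
    public long nextTick(long nowMillis) {
        long candidate = nowMillis * TICKS_PER_MILLI;
        // If the clock has not moved past the last issued tick, advance by one tick.
        return lastTick.updateAndGet(last -> candidate > last ? candidate : last + 1);
    }
}
```

Note that if `nextTick` is called with a millisecond older than the last one issued, it returns the last tick plus one, which no longer reflects the requested time. That is why the approach depends on timestamps that only move forward.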

However, this still leaves a vulnerability across a distributed system. If 2 
events arrive in 2 processes at the exact same millisecond, one will overwrite 
the other. If events keep flowing to each process evenly over the course of the 
millisecond, we'll be left with roughly half the events we should have. To work 
around this, I add a distinct 'component id' to my row keys that roughly 
equates to a Storm worker or a JVM process I can cheaply synchronize.

The real problem is that this trick of adding ns intervals only works when you 
are generating timestamps from the current time (or any time that's always 
increasing). As I mentioned before, my client might be providing a past or 
future timestamp, and I have to find a way to make sure each one is unique.

For example, a client might send me 10k events with the same millisecond 
timestamp today, and 10k again tomorrow. Using the standard Java library 
classes to generate UUIDs, I'd end up with only 1 event stored, not 20,000. 
The warning in UUIDGen.getTimeUUIDBytes is clear about this.

Adapting the ns-adding 'trick' to this problem requires synchronized external 
state (e.g. storing that the current ns interval for millisecond 12330982383 
is 1234, etc.), which is definitely a non-starter.

So, my dear and far more seasoned Cassandra users, do you have any suggestions 
for me?

Should I drop TimeUUID altogether and just make column names a combination of 
millisecond and a big enough random part to be safe? e.g. 
'1363467790212-a6c334fefda'. Would I be able to run proper slice queries if I 
did this? What other problems might crop up? (It seems too easy :)  
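
As a rough sketch of the millisecond-plus-random idea, assuming the column comparator sorts names as plain strings (for example UTF8Type) and that timestamps are non-negative and fit in 13 decimal digits: the class `EventColumnName` and its methods are hypothetical, and the 64-bit random suffix is an arbitrary choice rather than a recommendation.

```java
import java.security.SecureRandom;
import java.util.Locale;

/** Hypothetical sketch: column names of the form "<13-digit millis>-<16 hex chars>". */
public class EventColumnName {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Zero-padded millis keep string order equal to time order. */
    public static String create(long timestampMillis) {
        // %016x prints a negative long as its unsigned two's-complement hex value.
        return String.format(Locale.ROOT, "%013d-%016x", timestampMillis, RANDOM.nextLong());
    }

    /** Sorts at or before every name generated for timestampMillis. */
    public static String sliceStart(long timestampMillis) {
        return String.format(Locale.ROOT, "%013d-", timestampMillis);
    }

    /** Sorts after every name generated for timestampMillis. */
    public static String sliceEnd(long timestampMillis) {
        return String.format(Locale.ROOT, "%013d-", timestampMillis + 1);
    }
}
```

With fixed-width names like these, a slice from `sliceStart(t1)` to `sliceEnd(t2)` would cover every event from millisecond `t1` through `t2`. Uniqueness is only probabilistic, so the random part has to be wide enough for the expected number of events per millisecond.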

Or should I just create a normal random UUID for every event as the column key 
and create the non-unique index by time in some other way?  

Would appreciate any thoughts, suggestions, and off-the-wall ideas!  

PS- I assume this could be a problem in any system (not just Cassandra) where 
you want to use 'time' as a unique index yet might have multiple records for 
the same time. So any solutions from other realms could be useful too.   

--
Josh Dzielak     
VP Engineering • Keen IO
Twitter • @dzello (https://twitter.com/dzello)
Mobile • 773-540-5264
