Todd Lipcon created KUDU-1945:
---------------------------------

             Summary: Support generation of surrogate primary keys (or tables with no PK)
                 Key: KUDU-1945
                 URL: https://issues.apache.org/jira/browse/KUDU-1945
             Project: Kudu
          Issue Type: New Feature
          Components: client, master, tablet
            Reporter: Todd Lipcon


Many use cases have data where there is no "natural" primary key. For example, 
a web log use case mostly cares about partitioning and not about precise 
sorting by timestamp, and timestamps themselves are not necessarily unique. 
Rather than forcing users to come up with their own surrogate primary keys, 
Kudu should support some kind of "auto_increment" equivalent which generates 
primary keys on insertion. Alternatively, Kudu could support tables which are 
partitioned but not internally sorted.

The advantages would be:

- Kudu can pick primary keys on insertion so that no compaction is required on 
the table (e.g. by always assigning a new key higher than any existing key in 
the local tablet). This can improve write throughput substantially, especially 
compared to naive PK generation schemes that a user might pick, such as UUIDs, 
which generate a uniform random-insert workload (the worst case for 
performance).
- Kudu becomes easier to use for such use cases (no extra client code 
necessary).
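
To make the first point concrete, here is a minimal Python sketch, not Kudu code and not a proposed API. It assumes a hypothetical per-tablet counter (TabletKeyGenerator) that always hands out a key higher than any it has issued before, and compares that with client-generated UUID keys. The helper count_out_of_order is also made up for illustration: it counts inserts whose key sorts below the highest key seen so far, a rough proxy for writes that land in the middle of already-written data.

```python
import uuid
from dataclasses import dataclass


@dataclass
class TabletKeyGenerator:
    """Hypothetical per-tablet counter; an assumption, not a Kudu API."""
    next_key: int = 0

    def assign(self) -> int:
        # Always return a key higher than any key issued so far.
        key = self.next_key
        self.next_key += 1
        return key


def count_out_of_order(keys):
    """Count inserts whose key is lower than the highest key seen so far."""
    highest = None
    out_of_order = 0
    for key in keys:
        if highest is not None and key < highest:
            out_of_order += 1
        else:
            highest = key
    return out_of_order


gen = TabletKeyGenerator()
sequential_keys = [gen.assign() for _ in range(1000)]

# Naive client-side scheme: a random UUID per row.
random_keys = [uuid.uuid4().hex for _ in range(1000)]

sequential_out_of_order = count_out_of_order(sequential_keys)
random_out_of_order = count_out_of_order(random_keys)

print("sequential keys, out-of-order inserts:", sequential_out_of_order)
print("UUID keys, out-of-order inserts:", random_out_of_order)
```

With generated keys every insert is an append in key order, whereas almost every UUID insert falls below some earlier key. How much this matters for real write throughput depends on the storage engine and is not measured here.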



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
