Hi, Could you please advise whether a narrow partition is a good choice for the use case below?
1) Write-heavy event log table with 50 million inserts per day and a peak load of 20K transactions per second. Records are never updated or deleted once inserted. Records are inserted with a TTL of 60 days (the retention period).
2) The table has a single primary key, a 27-digit sequence number generated by the source application.
3) There are only two access patterns: one by sequence number, the other by sequence number + event date (range scans are also possible).
4) My target data model in Cassandra uses the sequence number as the partition key and the event date as a clustering column, to enable range scans on date.
5) The table has 120+ columns and the average row size is close to 32K bytes.
6) Reads are rare and account for <5% of the workload, while inserts account for close to 95%.
7) From a functional standpoint, I do not see any other columns that could be part of the primary key to keep the partition size reasonable (<100MB).

Questions:
1) Is a narrow partition an ideal choice for the above use case?
2) Is artificial bucketing an alternative for keeping the partition size reasonable?
3) We are using varint as the data type for the sequence number, which is 27 digits long. Is the DECIMAL data type a suitable alternative?
4) Any suggestions regarding performance impact during compaction?

Regards,
Chandra Sekar KR
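
P.S. For reference, here is a rough back-of-envelope sizing sketch based only on the figures quoted above. It is an estimate, not a measurement. It assumes that "32K bytes" means 32,000 bytes and "100MB" means 100,000,000 bytes, and it ignores replication, compression and Cassandra's own storage overhead. The function names are mine and purely illustrative.

```python
# Back-of-envelope sizing from the figures quoted in this post.
# Assumptions (not measured): 32K bytes = 32,000 bytes, 100MB = 100,000,000 bytes;
# replication factor, compression and storage overhead are ignored.

INSERTS_PER_DAY = 50_000_000           # 50 million inserts per day
PEAK_TPS = 20_000                      # peak inserts per second
AVG_ROW_BYTES = 32_000                 # average row size
TTL_DAYS = 60                          # retention period
PARTITION_LIMIT_BYTES = 100_000_000    # target upper bound per partition


def daily_bytes() -> int:
    """Raw data written per day."""
    return INSERTS_PER_DAY * AVG_ROW_BYTES


def retained_bytes() -> int:
    """Raw data held at steady state, once the 60-day TTL starts expiring rows."""
    return daily_bytes() * TTL_DAYS


def max_rows_per_partition() -> int:
    """Rows of average size that fit under the partition size limit."""
    return PARTITION_LIMIT_BYTES // AVG_ROW_BYTES


def min_buckets_per_day() -> int:
    """Fewest buckets per day that keep every bucket under the limit
    (ceiling division, assuming rows spread evenly across buckets)."""
    rows = max_rows_per_partition()
    return -(-INSERTS_PER_DAY // rows)


def seconds_to_fill_partition_at_peak() -> float:
    """Time for a single bucket to reach the limit at peak load."""
    return max_rows_per_partition() / PEAK_TPS


if __name__ == "__main__":
    print("bytes per day:", daily_bytes())
    print("bytes retained:", retained_bytes())
    print("max rows per partition:", max_rows_per_partition())
    print("min buckets per day:", min_buckets_per_day())
    print("seconds to fill one bucket at peak:", seconds_to_fill_partition_at_peak())
```

Under these assumptions the table takes about 1.6 TB of raw data per day and about 96 TB over the 60-day retention period. With the sequence number alone as the partition key, each partition would hold roughly one row of about 32 KB, provided sequence numbers are unique. A 100MB partition holds about 3,125 average rows, so any bucketing scheme would need on the order of 16,000 buckets per day, and a single bucket would fill in roughly 0.16 seconds at peak load.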