Thousands of partitions performance questions

Shai Cantor Sun, 28 Apr 2019 22:20:31 -0700

   - *General*
      - Our system gathers data from CI builds regarding a customer's code
      base
   - *Data includes*
      - type
         - methods/lines/branches A.K.A - code elements
         - files
      - *Queries*
      - The data is queried for a specific build only
      - Queries are aggregations on the code elements and files with some
      filtering and sorting
   - *Volume*
      - I expect to have about 1500 builds a day, 45000 builds a month
      - A build can have around 300000 code elements and around 30000 files
   - *Retention*
      - Thought about keeping 90 days of builds as retention
   - *Plan*
      - 2 tables
         - code elements
         - files
      - create a partition for each build
      - each day delete partitions older than 90 days
      - create 2 schemas for each client
         - <customer-id> schema which holds the parent tables
         - <customer-id>_partitions schema that will hold the partitions
      - upon a new build
         - create a partition for the code elements table and for the files
         table in the "<customer-id>_partitions" schema
      - *Questions*
      - Will the db hold 135000 (45000 * 3 months) partitions under the
      assumption that *I query only 1 partition?*
      - Should I model it differently in terms of schema, partitions etc.?

Thousands of partitions performance questions

Reply via email to