I have to move data from table A -> table B. 3 different queries running parallel. Below are the details.
Table A has 90G or 7.2bn data with below columns.. C1, C2, C3, C4, C5, custID, amt Table A is partitioned on C1 and 256 bucketing on C2. C1 can have max 10 different values and C2 cardinality is going to be very high..(it is like an address that will be different most of the time) I need to group by data based on below column combinations and calculate distinct custID count for each combination and total amount.. Group by combinations are C1, C2 C1, C2, C3, C4 C1, C2, C3, C4, C5 I have made 3 queries (one for each group by combination) to move data from table a to table b. All queries run parallel and takes about 3.5 hours in total. I have disabled concurrency on table b and enabled vectorization.. I need to finish this job of moving data in less than max 1 hour or if possible in 40 mins.. should I use union all and run one query, will that help improve the job completion time? Or any other good suggestions here will be much appreciated.. I am open to design suggestions as well to make it as fast as possible. Thanks a lot
