Re: How to manage huge partitioned table with 1000+ columns in Hive

2019-11-26 Thread Furcy Pin
Hello, sorry for the late reply, but this problem is very interesting. How did you end up solving it? I have an idea that is very ugly but might work: create a big view that is a union of all partitions: SELECT '2019-10-01' AS ds, * FROM test_1 a JOIN test_2 b ON a.id = b.id JOIN te

Re: How to manage huge partitioned table with 1000+ columns in Hive

2019-10-02 Thread Pau Tallada
Hi, I would say the most efficient way would be option (3), where all the subtables are partitioned by date, and clustered+**sorted** by id. This way, efficient SMB map joins can be performed over the 10 tables of the same partition. Unfortunately, I haven't found a way to achieve SMB map joins*
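For readers following the thread, the layout described as option (3) could look roughly like the sketch below. It is only an illustration under stated assumptions: the table names test_1 and test_2 and the ds partition column are borrowed from the other reply in this thread, while the payload columns (col_a, col_b), the bucket count of 32 and the ORC storage format are made up for the example. The SET lines are the standard Hive settings for sort-merge-bucket joins, but whether the planner actually picks an SMB map join for a given query and Hive version is not guaranteed, and the message above reports trouble with exactly that, so the EXPLAIN output should be checked.

```sql
-- Hypothetical sub-tables: column lists, bucket count and storage format are assumptions.
-- Both tables must use the same bucketing column and compatible bucket counts.
CREATE TABLE test_1 (
  id    BIGINT,
  col_a STRING
)
PARTITIONED BY (ds STRING)
CLUSTERED BY (id) SORTED BY (id ASC) INTO 32 BUCKETS
STORED AS ORC;

CREATE TABLE test_2 (
  id    BIGINT,
  col_b STRING
)
PARTITIONED BY (ds STRING)
CLUSTERED BY (id) SORTED BY (id ASC) INTO 32 BUCKETS
STORED AS ORC;

-- On Hive 1.x, hive.enforce.bucketing and hive.enforce.sorting also had to be
-- set to true before loading data; from Hive 2.0 on they are always enforced.

-- Settings that allow the planner to convert the join to a sort-merge-bucket join.
SET hive.auto.convert.sortmerge.join=true;
SET hive.optimize.bucketmapjoin=true;
SET hive.optimize.bucketmapjoin.sortedmerge=true;

-- Inspect the plan to see whether an SMB join operator was actually chosen
-- for the two sub-tables of the same date partition.
EXPLAIN
SELECT a.id, a.col_a, b.col_b
FROM test_1 a
JOIN test_2 b ON a.id = b.id
WHERE a.ds = '2019-10-01'
  AND b.ds = '2019-10-01';
```

If the plan still shows a regular shuffle join, the layout at least keeps each partition's files bucketed and sorted by id, so it remains a reasonable starting point for further experiments.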