Ádám Szita created HIVE-25651:
---------------------------------

             Summary: Enable LLAP cache affinity for Iceberg ORC splits
                 Key: HIVE-25651
                 URL: https://issues.apache.org/jira/browse/HIVE-25651
             Project: Hive
          Issue Type: Improvement
            Reporter: Ádám Szita
            Assignee: Ádám Szita


Since HiveIcebergInputFormat doesn't implement any of the LLAP marker interfaces, 
cache affinity is never attempted, so any split containing ORC file parts may 
be sent to an arbitrary LLAP daemon, resulting in a subpar cache hit ratio later.

So we should:
 * let HS2 know that cache affinity is required for this input format
 * prevent Iceberg from grouping separate files together into one combined split 
when executing on LLAP
 * provide a proper getPath() result for HiveIcebergSplit, so that 
HostAffinitySplitLocationProvider calculates different hashes for different 
files (right now getPath() returns only the table location)
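
To illustrate the last point, here is a minimal sketch of why a per-file path matters for hash-based affinity. It is not the actual HostAffinitySplitLocationProvider code: the class AffinitySketch, the method pickHost, the host names, and the file paths are all hypothetical, and String.hashCode() with a simple modulo stands in for whatever hashing Hive really uses. The only point is that identical paths always land on the same daemon, so if every split reports the table location, path-based hashing cannot tell the files apart.

```java
import java.util.List;

public class AffinitySketch {

    // Hypothetical stand-in for affinity-based location selection:
    // map the path reported by a split onto exactly one daemon.
    public static String pickHost(String splitPath, List<String> hosts) {
        // floorMod keeps the index non-negative even for negative hash codes.
        return hosts.get(Math.floorMod(splitPath.hashCode(), hosts.size()));
    }

    public static void main(String[] args) {
        // Assumed daemon names and paths, for illustration only.
        List<String> hosts = List.of("llap-0", "llap-1", "llap-2");
        String tableLocation = "/warehouse/db/tbl";
        String[] dataFiles = {
            "/warehouse/db/tbl/data/00000-0.orc",
            "/warehouse/db/tbl/data/00001-0.orc",
            "/warehouse/db/tbl/data/00002-0.orc"
        };

        for (String file : dataFiles) {
            // Current behaviour: every split reports the table location,
            // so every split hashes to the same daemon.
            String byTable = pickHost(tableLocation, hosts);
            // Proposed behaviour: each split reports its own file path,
            // so the hash input differs from file to file.
            String byFile = pickHost(file, hosts);
            System.out.println(file + " -> table-based: " + byTable
                + ", file-based: " + byFile);
        }
    }
}
```

With file-based paths, the same file keeps mapping to the same daemon across queries, which is what allows that daemon's cache to be reused.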



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
