[ https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sergey Shelukhin updated HIVE-14589:
------------------------------------
    Attachment: HIVE-14589.04.patch

> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>
>                 Key: HIVE-14589
>                 URL: https://issues.apache.org/jira/browse/HIVE-14589
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch,
>                      HIVE-14589.03.patch, HIVE-14589.04.patch, HIVE-14589.patch
>
>
> See HIVE-14574.
> (copied from the comment below)
> This creates nodes in ZK for "slots" in the cluster. Each LLAP tries to take
> the lowest available slot, starting from 0. Unlike worker-... nodes, the slots
> are reused, which is the intent. The LLAPs are always sorted by slot number
> for splits.
> The idea is that as long as an LLAP is running, it retains the same position
> in the ordering, regardless of other LLAPs restarting, and without the LLAPs
> knowing about each other, their predecessors' locations (if restarted in a
> different place), or the total size of the cluster.
> Restarting LLAPs may not take the same positions as their predecessors
> (i.e. if two LLAPs restart they can swap slots), but it shouldn't matter
> because they have lost their cache anyway.
> I.e. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4,
> they will take whatever slots, but 3 will stay the 3rd and retain cache
> locality.
> This also handles size increase, as new LLAPs will always be added to the end
> of the sequence, which is what consistent hashing needs.
> One case it doesn't handle is permanent cluster size reduction. If the removed
> LLAPs held slots in the middle, a permanent gap remains; until some LLAPs are
> restarted, it will result in cache misses.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
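
The slot scheme described in the issue can be illustrated with a small in-memory sketch. This is not the code from the patch: SlotRegistry, claim, release, and ordered_daemons are hypothetical names, and a plain dict stands in for the ZK slot nodes. A real version would need an atomic create-if-absent on each slot node, which is not shown here.

```python
class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (hypothetical, not Hive code)."""

    def __init__(self):
        self._slots = {}  # slot number -> daemon id

    def claim(self, daemon_id):
        """Take the lowest free slot, counting up from 0."""
        slot = 0
        while slot in self._slots:
            slot += 1
        self._slots[slot] = daemon_id
        return slot

    def release(self, daemon_id):
        """Free the slot held by a daemon that has stopped."""
        for slot, owner in list(self._slots.items()):
            if owner == daemon_id:
                del self._slots[slot]

    def ordered_daemons(self):
        """Daemons sorted by slot number, the order used for splits."""
        return [self._slots[s] for s in sorted(self._slots)]


def demo():
    registry = SlotRegistry()
    for name in ["a", "b", "c", "d"]:
        registry.claim(name)  # slots 0, 1, 2, 3
    # Stop every daemon except "c", then start replacements in another order.
    for name in ["a", "b", "d"]:
        registry.release(name)
    for name in ["d2", "a2", "b2"]:
        registry.claim(name)
    return registry.ordered_daemons()
```

In demo(), the daemon "c" never restarts, so it keeps its slot and its position in the ordering while the replacements fill whatever slots are free around it. Releasing a middle slot without starting a replacement leaves a gap in the slot numbers, which is the cluster size reduction case the description says is not handled.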