[jira] [Commented] (HIVE-14680) retain consistent splits /during/ (as opposed to across) LLAP failures on top of HIVE-14589

Siddharth Seth (JIRA) Wed, 14 Sep 2016 19:08:58 -0700

    [ 
https://issues.apache.org/jira/browse/HIVE-14680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15492075#comment-15492075
 ]


Siddharth Seth commented on HIVE-14680:
---------------------------------------

Mostly looks good.
Slots start at 0, correct?

{code}// Since our probing method is totally bogus, give up after some time and return everything.{code}
I think it would be better to return nothing here. That will cause the
scheduler to pick a location at random. Returning everything has a good chance
of overwhelming a single box, at least without grouping. With grouping, the
load may spread out.
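
To make the suggestion concrete, here is a rough sketch of a bounded probe that returns an empty location list when it gives up. It is not the code from the patch: the class name, MAX_PROBES, the step formula, and the use of null for an inactive slot are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class ProbeSketch {
    // Hypothetical cap on probe attempts; the patch may use a different limit.
    static final int MAX_PROBES = 8;

    // slots.get(i) is the host for slot i, or null if that slot is inactive (assumption).
    static List<String> locate(List<String> slots, int hash1, int hash2) {
        int n = slots.size();
        if (n == 0) {
            return Collections.emptyList();
        }
        // Step derived from the second hash; always in 1..max(1, n - 1), never zero.
        int step = 1 + Math.floorMod(hash2, Math.max(1, n - 1));
        int index = Math.floorMod(hash1, n);
        for (int probe = 0; probe < MAX_PROBES; probe++) {
            String host = slots.get(index);
            if (host != null) {
                return Collections.singletonList(host);
            }
            index = (index + step) % n;
        }
        // Give up: return no locations so the scheduler can place the split anywhere,
        // instead of listing every host.
        return Collections.emptyList();
    }

    public static void main(String[] args) {
        List<String> slots = Arrays.asList("host0", null, "host2", null);
        System.out.println(locate(slots, 1, 0)); // [host2]
        System.out.println(locate(slots, 1, 1)); // [] (step 2 only ever visits slots 1 and 3)
    }
}
```

The second call also shows one way a probe sequence can miss live slots entirely, which is the case where the give-up behaviour matters.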

It would be good to display the number of expected instances on the
LlapWebService (it currently shows only the ones that are up and running). It
could probably also show the ones that have gone down, with isAlive set to
false. Separate jira?

[~gopalv] - any comments on the second-level hash function, and how it moves 
between hosts with the (+1 * hash2)?

{code}startOffset >> 2{code}
This is unrelated to this patch specifically, but related to the series of 
patches. This is trying to avoid differences in the start position, correct? 
Up to a difference of 4. Will it work for splits that don't start at 0?
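
For reference, a small sketch of what the shift does on its own. The helper name is hypothetical, and how the patch feeds the result into the hash is not shown here. The shift drops the two low bits, so offsets are grouped into aligned buckets of four rather than compared by distance:

```java
final class OffsetBucketSketch {
    // Hypothetical helper mirroring the quoted expression.
    static long bucket(long startOffset) {
        return startOffset >> 2;
    }

    public static void main(String[] args) {
        // Offsets 0..3 fall into the same bucket, but 3 and 4 do not,
        // even though they differ by only 1.
        System.out.println(bucket(0) == bucket(3));       // true
        System.out.println(bucket(3) == bucket(4));       // false
        // The same holds away from zero: 1000..1003 share bucket 250.
        System.out.println(bucket(1000) == bucket(1003)); // true
        System.out.println(bucket(1003) == bucket(1004)); // false
    }
}
```

So the behaviour is the same for splits that do not start at 0, but two start positions that straddle a multiple of 4 still end up in different buckets.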



> retain consistent splits /during/ (as opposed to across) LLAP failures on top 
> of HIVE-14589
> -------------------------------------------------------------------------------------------
>
>                 Key: HIVE-14680
>                 URL: https://issues.apache.org/jira/browse/HIVE-14680
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>         Attachments: HIVE-14680.01.patch, HIVE-14680.patch
>
>
> see HIVE-14589.
> Basic idea (spent about 7 minutes thinking about this based on RB comment ;)) 
> is to return locations for all slots to HostAffinitySplitLocationProvider, 
> the missing slots being inactive locations (based solely on the last slot 
> actually present). For the splits mapped to these locations, fall back via 
> different hash functions, or some sort of probing.
> This still doesn't handle all the cases, namely when the last slots are gone 
> (consistent hashing is supposed to be good for this?); however for that we'd 
> need more involved coordination between nodes or a central updater to 
> indicate the number of nodes



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)