[ https://issues.apache.org/jira/browse/HIVE-19508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16493897#comment-16493897 ]
Sahil Takiar commented on HIVE-19508:
-------------------------------------

A single Spark stage can be attempted multiple times, e.g. something like {{Stage-1_0: ... Stage-1_1 ... Stage-2_0 ...}}. The comparator needs to compare based on both stage id and attempt id.

If you are up for a bit of refactoring, the implementation of {{getStageNum}} isn't ideal. We shouldn't rely on string parsing to extract the stage id and attempt id. {{SparkJobStatus#getSparkStageProgress}} should return a {{Map}} whose key isn't a string; instead, the key should be a POJO that contains the stage id and the attempt id.

Please add a unit test for this.

> SparkJobMonitor getReport doesn't print stage progress in order
> ---------------------------------------------------------------
>
>                 Key: HIVE-19508
>                 URL: https://issues.apache.org/jira/browse/HIVE-19508
>             Project: Hive
>          Issue Type: Sub-task
>          Components: Spark
>            Reporter: Sahil Takiar
>            Assignee: Bharathkrishna Guruvayoor Murali
>            Priority: Major
>         Attachments: HIVE-19508.1.patch
>
>
> You can end up with a progress output like this:
> {code}
> Stage-10_0: 0/29 Stage-11_0: 0/44 Stage-12_0: 0/11
> Stage-13_0: 0/1 Stage-8_0: 258(+76)/468 Stage-9_0: 0/165
> {code}
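
For illustration, the POJO key suggested in the comment above could look roughly like the sketch below. This is not the patch itself: the names {{StageKey}}, {{StageOrderDemo}} and {{render}} are hypothetical, and the sketch assumes stage id and attempt id are plain ints. It orders by stage id first and attempt id second, so no string parsing is needed and Stage-8_0 sorts before Stage-10_0.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical key type (name and fields are assumptions, not Hive's actual API).
final class StageKey implements Comparable<StageKey> {
  // Numeric ordering: stage id first, then attempt id.
  private static final Comparator<StageKey> ORDER =
      Comparator.comparingInt(StageKey::getStageId)
          .thenComparingInt(StageKey::getAttemptId);

  private final int stageId;
  private final int attemptId;

  StageKey(int stageId, int attemptId) {
    this.stageId = stageId;
    this.attemptId = attemptId;
  }

  int getStageId() { return stageId; }

  int getAttemptId() { return attemptId; }

  @Override
  public int compareTo(StageKey other) {
    return ORDER.compare(this, other);
  }

  // equals/hashCode are required so the key also works in hash-based maps.
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof StageKey)) return false;
    StageKey that = (StageKey) o;
    return stageId == that.stageId && attemptId == that.attemptId;
  }

  @Override
  public int hashCode() {
    return 31 * stageId + attemptId;
  }

  @Override
  public String toString() {
    return "Stage-" + stageId + "_" + attemptId;
  }
}

// Hypothetical helper showing how a report line could be built in sorted order.
final class StageOrderDemo {
  static String render(Map<StageKey, String> progress) {
    StringBuilder sb = new StringBuilder();
    // Copying into a TreeMap sorts entries by StageKey's natural ordering.
    for (Map.Entry<StageKey, String> e : new TreeMap<>(progress).entrySet()) {
      if (sb.length() > 0) sb.append(' ');
      sb.append(e.getKey()).append(": ").append(e.getValue());
    }
    return sb.toString();
  }
}
```

A unit test along these lines would put keys such as (10, 0), (8, 0), (8, 1) and (9, 0) into an unordered map and check that the rendered report lists them as 8_0, 8_1, 9_0, 10_0.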