Hi Got my fingers crossed that there's a work around for this as this really is a big problem for us
We are using: Amazon EMR Release label:emr-6.1.0 Hadoop distribution:Amazon Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0 Thin Client version: phoenix-5.0.0-HBase-2.0-thin-client.jar We get the following error when doing an LAST_VALUE aggregation where 1. A JOIN is empty 2. The column is INTEGER or DATETIME Remote driver error: IllegalArgumentException: offset (25) + length (4) exceed the capacity of the array: 25 The query that breaks is: SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX ON DOCID = OBSERVATION_VALUE_ID AND TAGNAME = 'TRIAL_ID' AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe' GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; I can refactor this using EXIST but get same error, presumably the driver knows to treat them the same: SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION WHERE EXISTS ( SELECT DOCID FROM VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX WHERE DOCID = OBSERVATION_VALUE_ID AND TAGNAME = 'TRIAL_ID' AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe') GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; If we remove the external reference we get no error, regardless of whether there are any hits or not -- these all work There are no hits for this query SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION WHERE BIOMATERIAL_TYPE = 'aardvark' GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; Lots of hits for this query: SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; I've tried weird things like: Comparing exists to TRUE to try and force it into a normal BOOLEAN value, same IllegalArgumentException. SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION WHERE (EXISTS ( SELECT DOCID FROM VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX WHERE DOCID = OBSERVATION_VALUE_ID AND TAGNAME = 'TRIAL_ID' AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')) = TRUE GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; And are you prepared for this one, which throws exact same error: SELECT "BIOMATERIAL_NAME", FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" FROM VARIABLE_VALUES_QA.OBSERVATION WHERE TRUE GROUP BY "BIOMATERIAL_NAME" LIMIT 10 OFFSET 0; Change FIRST_VALUE to AVG and it works fine.