git clone https://github.com/apache/phoenix
mvn clean package
bin/phoenix_sandbox.py
use the host and port displayed to start sqlline.py

On Thu, Sep 30, 2021 at 4:21 AM Simon Mottram <simon.mott...@cucumber.co.nz>
wrote:

> Sounds like a plan thanks.
>
> Is there any doco to help with getting setup?  I assume the phoenix dev
> channel is the place to ask questions.
>
> Cheers
>
> Simon
> ------------------------------
> *From:* Istvan Toth <st...@cloudera.com>
> *Sent:* 29 September 2021 7:37 PM
> *To:* user@phoenix.apache.org <user@phoenix.apache.org>
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> Please try to reproduce with the latest released Phoenix version, or git
> master HEAD before opening the JIRA.
> You can use bin/phoenix_sandbox.py in the source repo/distribution to
> quickly start a test environment without installing Phoenix in a real
> cluster.
>
>
> On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Ok, should I bug this in your JIRA?
>
> I will try and setup a more reproducible series of steps
>
> Cheers
>
> S
> ------------------------------
> *From:* Kadir Ozdemir <kozde...@salesforce.com>
> *Sent:* 29 September 2021 11:32 AM
> *To:* Geoffrey Jacoby <gjac...@salesforce.com>
> *Cc:* user <user@phoenix.apache.org>
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> Geoffrey,
>
> Thank you for pointing out this. I mixed up release numbers. Yes, the
> paging feature cannot be the reason here. My comment for disabling the
> paging feature should be ignored.
>
> Kadir
>
> On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby <gjac...@salesforce.com>
> wrote:
>
> Kadir,
>
> The server-side paging changes are in Phoenix 5.1, which is at feature
> parity with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is
> roughly equivalent to Phoenix 4.14).
>
> So if Simon's environment is Phoenix 5.0, it can't be affected by the
> server-side changes. (Though I see that he's running HBase 2.2, which I
> thought wasn't supported until Phoenix 5.1?)
>
> Geoffrey
>
> On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi thanks for the quick response
>
> I'm going to appear thick now, brace yourself.   By client do you mean the
> queryserver running in EMR which does have an hbase-site.xml or my Java API
> as a property, we don't have an hbase-site.xml (unless it's embedded in the
> thin client jar.
>
> Cheers
>
> S
> ------------------------------
> *From:* Kadir Ozdemir <kozde...@salesforce.com>
> *Sent:* 28 September 2021 7:45 PM
> *To:* user <user@phoenix.apache.org>
> *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins
>
> In Phoenix 5.0 we introduced a new server side paging feature to eliminate
> timeouts due long running server side computations such aggregation and
> joins. I wonder if this new feature caused your issue. If so, the work
> around is to disable it by setting phoenix.server.paging.enabled to false
> in hbase-site.xml. This is a client side config param so you just need to
> restart your client. Hope this will fix your issue.
>
> On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram <
> simon.mott...@cucumber.co.nz> wrote:
>
> Hi
>
> Got my fingers crossed that there's a work around for this as this really
> is a big problem for us
>
> We are using:
>
> Amazon EMR
>
> Release label:emr-6.1.0
> Hadoop distribution:Amazon
> Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0
>
> Thin Client version:
> phoenix-5.0.0-HBase-2.0-thin-client.jar
>
> We get the following error when doing an LAST_VALUE aggregation where
>
>   1.  A JOIN is empty
>   2.  The column is INTEGER or DATETIME
>
> Remote driver error: IllegalArgumentException: offset (25) + length (4)
> exceed the capacity of the array: 25
>
> The query that breaks is:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> ON DOCID = OBSERVATION_VALUE_ID
>   AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe'
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> I can refactor this using EXIST but get same error, presumably the driver
> knows to treat them the same:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> EXISTS (
> SELECT
> DOCID
> FROM
> VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> WHERE
> DOCID = OBSERVATION_VALUE_ID
> AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> If we remove the external reference we get no error, regardless of whether
> there are any hits or not
>
> -- these all work
> There are no hits for this query
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> BIOMATERIAL_TYPE = 'aardvark'
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> Lots of hits for this query:
>
> SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10  OFFSET 0;
>
> I've tried weird things like:
>
> Comparing exists to TRUE to try and force it into a normal BOOLEAN value,
> same IllegalArgumentException.
>
>  SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> (EXISTS (
> SELECT
> DOCID
> FROM
> VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX
> WHERE
> DOCID = OBSERVATION_VALUE_ID
> AND TAGNAME = 'TRIAL_ID'
> AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')) = TRUE
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> And are you prepared for this one, which throws exact same error:
>
>  SELECT
> "BIOMATERIAL_NAME",
> FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP (
> ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count"
> FROM
> VARIABLE_VALUES_QA.OBSERVATION
> WHERE
> TRUE
> GROUP BY
> "BIOMATERIAL_NAME"
> LIMIT 10
>  OFFSET 0;
>
> Change FIRST_VALUE to AVG and it works fine.
>
>
>
> --
> *István Tóth* | Staff Software Engineer
> st...@cloudera.com <https://www.cloudera.com>
> [image: Cloudera] <https://www.cloudera.com/>
> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
> Cloudera on Facebook] <https://www.facebook.com/cloudera> [image:
> Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera>
> <https://www.cloudera.com/>
> ------------------------------
>


-- 
*István Tóth* | Staff Software Engineer
st...@cloudera.com <https://www.cloudera.com>
[image: Cloudera] <https://www.cloudera.com/>
[image: Cloudera on Twitter] <https://twitter.com/cloudera> [image:
Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera
on LinkedIn] <https://www.linkedin.com/company/cloudera>
<https://www.cloudera.com/>
------------------------------

Reply via email to