git clone https://github.com/apache/phoenix mvn clean package bin/phoenix_sandbox.py use the host and port displayed to start sqlline.py
On Thu, Sep 30, 2021 at 4:21 AM Simon Mottram <simon.mott...@cucumber.co.nz> wrote: > Sounds like a plan thanks. > > Is there any doco to help with getting setup? I assume the phoenix dev > channel is the place to ask questions. > > Cheers > > Simon > ------------------------------ > *From:* Istvan Toth <st...@cloudera.com> > *Sent:* 29 September 2021 7:37 PM > *To:* user@phoenix.apache.org <user@phoenix.apache.org> > *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins > > Please try to reproduce with the latest released Phoenix version, or git > master HEAD before opening the JIRA. > You can use bin/phoenix_sandbox.py in the source repo/distribution to > quickly start a test environment without installing Phoenix in a real > cluster. > > > On Wed, Sep 29, 2021 at 12:36 AM Simon Mottram < > simon.mott...@cucumber.co.nz> wrote: > > Ok, should I bug this in your JIRA? > > I will try and setup a more reproducible series of steps > > Cheers > > S > ------------------------------ > *From:* Kadir Ozdemir <kozde...@salesforce.com> > *Sent:* 29 September 2021 11:32 AM > *To:* Geoffrey Jacoby <gjac...@salesforce.com> > *Cc:* user <user@phoenix.apache.org> > *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins > > Geoffrey, > > Thank you for pointing out this. I mixed up release numbers. Yes, the > paging feature cannot be the reason here. My comment for disabling the > paging feature should be ignored. > > Kadir > > On Tue, Sep 28, 2021 at 7:44 AM Geoffrey Jacoby <gjac...@salesforce.com> > wrote: > > Kadir, > > The server-side paging changes are in Phoenix 5.1, which is at feature > parity with 4.16 for HBase 1.x, not 5.0 (which was released in 2018 and is > roughly equivalent to Phoenix 4.14). > > So if Simon's environment is Phoenix 5.0, it can't be affected by the > server-side changes. (Though I see that he's running HBase 2.2, which I > thought wasn't supported until Phoenix 5.1?) > > Geoffrey > > On Tue, Sep 28, 2021 at 4:07 AM Simon Mottram < > simon.mott...@cucumber.co.nz> wrote: > > Hi thanks for the quick response > > I'm going to appear thick now, brace yourself. By client do you mean the > queryserver running in EMR which does have an hbase-site.xml or my Java API > as a property, we don't have an hbase-site.xml (unless it's embedded in the > thin client jar. > > Cheers > > S > ------------------------------ > *From:* Kadir Ozdemir <kozde...@salesforce.com> > *Sent:* 28 September 2021 7:45 PM > *To:* user <user@phoenix.apache.org> > *Subject:* Re: Urgent problem with Phoenix function LAST_VALUE and joins > > In Phoenix 5.0 we introduced a new server side paging feature to eliminate > timeouts due long running server side computations such aggregation and > joins. I wonder if this new feature caused your issue. If so, the work > around is to disable it by setting phoenix.server.paging.enabled to false > in hbase-site.xml. This is a client side config param so you just need to > restart your client. Hope this will fix your issue. > > On Mon, Sep 27, 2021 at 9:23 PM Simon Mottram < > simon.mott...@cucumber.co.nz> wrote: > > Hi > > Got my fingers crossed that there's a work around for this as this really > is a big problem for us > > We are using: > > Amazon EMR > > Release label:emr-6.1.0 > Hadoop distribution:Amazon > Applications:Hbase 2.2.5, Hive 3.1.2, Phoenix 5.0.0, Pig 0.17.0 > > Thin Client version: > phoenix-5.0.0-HBase-2.0-thin-client.jar > > We get the following error when doing an LAST_VALUE aggregation where > > 1. A JOIN is empty > 2. The column is INTEGER or DATETIME > > Remote driver error: IllegalArgumentException: offset (25) + length (4) > exceed the capacity of the array: 25 > > The query that breaks is: > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > JOIN VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX > ON DOCID = OBSERVATION_VALUE_ID > AND TAGNAME = 'TRIAL_ID' > AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe' > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 > OFFSET 0; > > I can refactor this using EXIST but get same error, presumably the driver > knows to treat them the same: > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > WHERE > EXISTS ( > SELECT > DOCID > FROM > VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX > WHERE > DOCID = OBSERVATION_VALUE_ID > AND TAGNAME = 'TRIAL_ID' > AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe') > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 > OFFSET 0; > > If we remove the external reference we get no error, regardless of whether > there are any hits or not > > -- these all work > There are no hits for this query > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > WHERE > BIOMATERIAL_TYPE = 'aardvark' > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 > OFFSET 0; > > Lots of hits for this query: > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 OFFSET 0; > > I've tried weird things like: > > Comparing exists to TRUE to try and force it into a normal BOOLEAN value, > same IllegalArgumentException. > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > WHERE > (EXISTS ( > SELECT > DOCID > FROM > VARIABLE_VALUES_QA.OBSERVATION_TAG_INDEX > WHERE > DOCID = OBSERVATION_VALUE_ID > AND TAGNAME = 'TRIAL_ID' > AND TAGVALUE = 'f62dd8e0-d2ea-4d9a-9ab6-2049601bb9fe')) = TRUE > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 > OFFSET 0; > > And are you prepared for this one, which throws exact same error: > > SELECT > "BIOMATERIAL_NAME", > FIRST_VALUE("PLANT_FRUIT_COUNT") WITHIN GROUP ( > ORDER BY OBSERVATION_DATE DESC) AS "Plant Fruit Count" > FROM > VARIABLE_VALUES_QA.OBSERVATION > WHERE > TRUE > GROUP BY > "BIOMATERIAL_NAME" > LIMIT 10 > OFFSET 0; > > Change FIRST_VALUE to AVG and it works fine. > > > > -- > *István Tóth* | Staff Software Engineer > st...@cloudera.com <https://www.cloudera.com> > [image: Cloudera] <https://www.cloudera.com/> > [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image: > Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: > Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera> > <https://www.cloudera.com/> > ------------------------------ > -- *István Tóth* | Staff Software Engineer st...@cloudera.com <https://www.cloudera.com> [image: Cloudera] <https://www.cloudera.com/> [image: Cloudera on Twitter] <https://twitter.com/cloudera> [image: Cloudera on Facebook] <https://www.facebook.com/cloudera> [image: Cloudera on LinkedIn] <https://www.linkedin.com/company/cloudera> <https://www.cloudera.com/> ------------------------------