Hello James, Thanks for the response. I will look into adding relevant TPC-H indexes.
We are doing a performance study comparing Impala, Phoenix using TPC-H queries. For scale factor 1, I am not able to get numbers exhibiting trend similar to performance graph on our website comparing Impala and Phoenix . I have few questions regarding the performance graph comparing Phoenix and Impala for aggregation query (select count(1)) Q1: Was the Impala data stored in parquet/text format? Q2: What and how performance parameters (hadoop, hbase) were tuned for phoenix. Q3. Can I get the same population script so that I can report numbers from the local cluster. Thanks, Ashish On Thu, Dec 3, 2015 at 5:35 PM, James Taylor <jamestay...@apache.org> wrote: > Hi Ashish, > Please make sure to add a secondary index on l_shipdate + l_discount + > l_quantity > for that query. There's a kind of well-known, canonical description of the > optimal indexes to add for the TPC benchmarks that I can't seem to find > (but maybe someone else can point you to it). You should include the time > to call resultset.next() as that call will block until the result is > calculated. Also, subsequent runs of the query will always be faster due to > various caching that happens (OS, HBase block cache, etc.) as well as the > JIT compiler kicking in. > > Thanks, > James > > On Thu, Dec 3, 2015 at 12:34 PM, ashish tapdiya <ashishtapd...@gmail.com> > wrote: > >> Hi, >> >> I am profiling TPC-H queries using Phoenix. For query no. 6 and db size >> 1GB (lineitem table size is around 760 MB), >> >> Query 6 : select sum(l_extendedprice * l_discount) as revenue from >> lineitem_sf1 where l_shipdate >= TO_DATE('1993-01-01') and l_shipdate < >> TO_DATE('1994-01-01') and l_discount between 0.06 and 0.08 and l_quantity < >> 24 >> >> the execution time recorded using following code: >> >> long startTime = System.currentTimeMillis(); >> rset = stmt.executeQuery(); >> long stopTime = System.currentTimeMillis(); >> long elapsedTime = stopTime - startTime; >> >> first run - 240 ms >> second run onwards - 80 ms >> >> However, when I iterate the result set (single row), query response time >> including result set iteration shoots up to 19 seconds. >> >> Does the query gets executed with stmt.executeQuery() or phoenix does not >> execute query until first resultset.next() is invoked? >> >> Cluster includes 4 slaves nodes. Phoenix version is 4.3.0 >> >> Thanks, >> ~Ashish >> > >