Of course, I can do a lot of optimizations. However, my concern is what I am missing that causes Phoenix to perform so badly while, at the very same time and on the same machine, HBase returns results amazingly fast.
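To put the gap in numbers, here is a rough sketch that only does arithmetic on the timings already reported in this thread (the row-key query times, the HBase shell time, and the count query). The function and variable names are made up for illustration, and the results are only as good as those one-off measurements.

```python
# Timings reported in this thread, in seconds. Labels are informal.
phoenix_row_key_seconds = {
    "Storm running, 2 CPU / 15 GB RAM": 50.0,
    "Storm stopped, 2 CPU / 15 GB RAM": 18.0,
    "Storm stopped, 4 CPU / 30 GB RAM": 8.0,
}
hbase_row_key_seconds = 0.0150  # HBase shell query by row key, Storm stopped

# SELECT count(1) result and elapsed time as shown by sqlline.
count_rows = 4_667_515
count_seconds = 132.11


def slowdown(phoenix_seconds, hbase_seconds=hbase_row_key_seconds):
    """How many times slower the Phoenix query was than the HBase one."""
    return phoenix_seconds / hbase_seconds


def rows_per_second(rows, seconds):
    """Average number of rows counted per second."""
    return rows / seconds


if __name__ == "__main__":
    for label, seconds in phoenix_row_key_seconds.items():
        print(f"{label}: about {slowdown(seconds):,.0f}x slower than HBase")
    rate = rows_per_second(count_rows, count_seconds)
    print(f"count(1): about {rate:,.0f} rows/second")
```

Even in the best case (8 sec on the bigger machine) the row-key lookup comes out several hundred times slower than the HBase shell, which is why I don't think this is only a matter of tuning.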

On Sat, Sep 6, 2014 at 12:41 PM, Alex Kamil <alex.ka...@gmail.com> wrote:

> well it is still network attached, If you allocate enough heap to fit the
> whole thing in memory (in hbase/conf/hbase-env.sh) you could probably
> eliminate this as a possible reason
>
> On Sat, Sep 6, 2014 at 2:43 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>
>> EBS but with new generation SSD not magnetic one.
>>
>> On Sat, Sep 6, 2014 at 12:11 PM, Alex Kamil <alex.ka...@gmail.com> wrote:
>>
>>> do you use EBS or ephemeral storage, I found EBS performance to be
>>> somewhat unpredictable
>>>
>>> On Sat, Sep 6, 2014 at 2:37 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>
>>>> Hbase is 0.98.0
>>>> Phoenix is 4.0
>>>>
>>>> On Sat, Sep 6, 2014 at 12:04 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>>
>>>>> Yes, that is why it is a trouble for me. However, on contrary, HBase
>>>>> shell is also on the same machine and same environment, so if it is an
>>>>> issue of resource (CPU or memory) it should have affected the HBase too,
>>>>> but HBase is able to give me results within 0.0150 seconds. :(
>>>>>
>>>>> No, I haven't tested it outside AWS. I guess, it should not be the
>>>>> case due to much better performance by native HBase query on HBase shell.
>>>>>
>>>>> On Sat, Sep 6, 2014 at 11:59 AM, James Taylor <jamestay...@apache.org> wrote:
>>>>>
>>>>>> Something is up in your environment. What version of Phoenix and HBase
>>>>>> are you using and in what environment? Have you tried this locally,
>>>>>> outside of AWS to compare?
>>>>>>
>>>>>> Take a look at our perf numbers, generated more-or-less daily, and
>>>>>> which run over more data than what you're testing against:
>>>>>>
>>>>>> http://phoenix-bin.github.io/client/performance/phoenix-20140904095313.htm
>>>>>>
>>>>>> Some of these are point queries and they take in the neighborhood of
>>>>>> 0.01 seconds.
>>>>>>
>>>>>> Thanks,
>>>>>> James
>>>>>>
>>>>>> On Fri, Sep 5, 2014 at 10:48 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>>>> > Missed to mention that count query (posted in my last mail) is also
>>>>>> > taking very long time to return the count.
>>>>>> >
>>>>>> > On Sat, Sep 6, 2014 at 11:17 AM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>>>> >>
>>>>>> >> As I mentioned, schema is nothing but bunch of fields (some being
>>>>>> >> integers, longs and text) along with primary key (row key) and I am
>>>>>> >> making simple query to get result for a particular primary key,
>>>>>> >> nothing more than that.
>>>>>> >>
>>>>>> >> 0: jdbc:phoenix:localhost> SELECT count(1) FROM table_name;
>>>>>> >> +------------+
>>>>>> >> | COUNT(1)   |
>>>>>> >> +------------+
>>>>>> >> | 4667515    |
>>>>>> >> +------------+
>>>>>> >> 1 row selected (132.11 seconds)
>>>>>> >>
>>>>>> >> On Sat, Sep 6, 2014 at 11:09 AM, Puneet Kumar Ojha
>>>>>> >> <puneet.ku...@pubmatic.com> wrote:
>>>>>> >>>
>>>>>> >>> If you can share the schema,data type,cardinality of each dimension
>>>>>> >>> and usual queries, I can help to design a schema with performance of
>>>>>> >>> less than 1 sec using Phoenix.
>>>>>> >>>
>>>>>> >>> Thanks
>>>>>> >>>
>>>>>> >>> ------ Original message------
>>>>>> >>> From: James Taylor
>>>>>> >>> Date: Sat, Sep 6, 2014 10:15 AM
>>>>>> >>> To: user;
>>>>>> >>> Subject:Re: Phoenix response time
>>>>>> >>>
>>>>>> >>> Vikas,
>>>>>> >>> Please post your schema and query.
>>>>>> >>> Thanks,
>>>>>> >>> James
>>>>>> >>>
>>>>>> >>> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>>>> >>> > Ours is also a single node setup right now and as of now there are
>>>>>> >>> > less than 1 million rows which is expected to grow around 100m at
>>>>>> >>> > minimum.
>>>>>> >>> >
>>>>>> >>> > I am aware of secondary indexes but when I am querying on
>>>>>> >>> > primary/row key, why would it take so much time?
>>>>>> >>> >
>>>>>> >>> > I am directly querying using sqlline for Phoenix and hbase shell
>>>>>> >>> > for HBase query. I am not expecting to do any fine tuning for such
>>>>>> >>> > small dataset. I am assuming a minimum performance level out of
>>>>>> >>> > the box.
>>>>>> >>> >
>>>>>> >>> > On Friday, September 5, 2014, yeshwanth kumar <yeshwant...@gmail.com> wrote:
>>>>>> >>> >>
>>>>>> >>> >> hi vikas,
>>>>>> >>> >>
>>>>>> >>> >> we used phoenix on a 4 core/23Gb machine, as a single node setup.
>>>>>> >>> >> used HDP 2.1
>>>>>> >>> >> our table has 50-70M rows,
>>>>>> >>> >> select on that table took less than 2 seconds.
>>>>>> >>> >> Aggregation queries took less than 8 seconds.
>>>>>> >>> >> for achieving good performance we created secondary index on the
>>>>>> >>> >> table.
>>>>>> >>> >>
>>>>>> >>> >> make sure you finetuned hbase,
>>>>>> >>> >> enabling compression on the data makes a difference in response.
>>>>>> >>> >> if u distribute the data and load over all regions in hbase,
>>>>>> >>> >> look at the performance tips mentioned in phoenix blog
>>>>>> >>> >>
>>>>>> >>> >> -yeshwanth
>>>>>> >>> >>
>>>>>> >>> >> Cheers,
>>>>>> >>> >> Yeshwanth
>>>>>> >>> >>
>>>>>> >>> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
>>>>>> >>> >>>
>>>>>> >>> >>> Hi,
>>>>>> >>> >>>
>>>>>> >>> >>> Preface: We are testing phoenix using Hortonworks distribution for
>>>>>> >>> >>> HBase on Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
>>>>>> >>> >>>
>>>>>> >>> >>> With contrast to performance benchmarks, I found Phoenix to be very
>>>>>> >>> >>> slow in querying even on primary key or row key. So, tried to
>>>>>> >>> >>> increase the RAM for HBase and Phoenix and increasing the CPU and
>>>>>> >>> >>> RAM by upgrading the EC2 machine type to r3.xlarge (4 CPU, 30 GB
>>>>>> >>> >>> RAM). Results were like this:
>>>>>> >>> >>>
>>>>>> >>> >>> Time takes in returning result of query on row key:
>>>>>> >>> >>>
>>>>>> >>> >>> With Storm running and very less RAM available: 50 sec
>>>>>> >>> >>>
>>>>>> >>> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
>>>>>> >>> >>>
>>>>>> >>> >>> With new machine of next higher category (4 CPU and 30 GB RAM): 8 sec
>>>>>> >>> >>>
>>>>>> >>> >>> Pure HBase query by row key with Storm stopped and (2 CPU, 15 GB
>>>>>> >>> >>> RAM): 0.0150 seconds. :)
>>>>>> >>> >>>
>>>>>> >>> >>> So, the difference seems to be many fold of what native HBase is
>>>>>> >>> >>> providing to us. I am not able to understand how it can be
>>>>>> >>> >>> possible? What I am missing here?
>>>>>> >>> >>>
>>>>>> >>> >>> --
>>>>>> >>> >>> Regards,
>>>>>> >>> >>> Vikas Agarwal
>>>>>> >>> >>> 91 – 9928301411
>>>>>> >>> >>>
>>>>>> >>> >>> InfoObjects, Inc.
>>>>>> >>> >>> Execution Matters
>>>>>> >>> >>> http://www.infoobjects.com
>>>>>> >>> >>> 2041 Mission College Boulevard, #280
>>>>>> >>> >>> Santa Clara, CA 95054
>>>>>> >>> >>> +1 (408) 988-2000 Work
>>>>>> >>> >>> +1 (408) 716-2726 Fax

--
Regards,
Vikas Agarwal