We are generating the DDL using Java string concatenation, and the table is created when the topology is submitted. I missed copying the exact statement, so it will take me some time to get it again because I have to resubmit the topology.
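Roughly, though, the code builds the statement like the sketch below. The exact column list, the primary key constraint, and the JDBC URL here are my approximations from the column metadata quoted further down in this thread, not the real generated statement:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class CreateTableOnSubmit {
    public static void main(String[] args) throws Exception {
        // Approximation only: the topology builds the DDL the same way (string
        // concatenation), but the full column list and the primary key below
        // are guesses based on the column metadata, not the actual statement.
        StringBuilder ddl = new StringBuilder("CREATE TABLE IF NOT EXISTS table_name (");
        ddl.append("\"TIMESTAMP\" BIGINT NOT NULL, ");   // quoted since TIMESTAMP is also a type name
        ddl.append("ID VARCHAR(255) NOT NULL, ");
        ddl.append("USER_ID VARCHAR(255) NOT NULL, ");
        ddl.append("TEXT_FIELD VARCHAR(255), ");
        ddl.append("TAGS VARCHAR ARRAY, ");
        ddl.append("UPDATED BIGINT, ");
        // ... the remaining VARCHAR/INTEGER/BIGINT columns are appended the same way ...
        ddl.append("CONSTRAINT PK PRIMARY KEY (\"TIMESTAMP\", ID, USER_ID))"); // assumed from the NOT NULL columns

        // "jdbc:phoenix:localhost" stands in for our actual ZooKeeper quorum.
        try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(ddl.toString());
        }
    }
}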
On Sat, Sep 6, 2014 at 10:50 AM, James Taylor <jamestay...@apache.org> wrote:
> Would you mind posting the CREATE TABLE statement you used to create
> the table, as it's a little easier to read?
> Thanks,
> James
>
> On Fri, Sep 5, 2014 at 10:19 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
> > James,
> >
> > The schema is pretty simple, I guess. Here it is (I have renamed some
> > actual column names):
> >
> > TABLE_NAME | COLUMN_NAME | DATA_TYPE | TYPE_NAME     | COLUMN_SIZE | NULLABLE
> > -----------+-------------+-----------+---------------+-------------+---------
> > table_name | TIMESTAMP   | -5        | BIGINT        | null        | 0
> > table_name | ID          | 12        | VARCHAR       | 255         | 0
> > table_name | TEXT_FIELD  | 12        | VARCHAR       | 255         | 1
> > table_name | USER_ID     | 12        | VARCHAR       | 255         | 0
> > table_name | TEXT_FIELD  | 12        | VARCHAR       | 25523       | 1
> > table_name | TYPE        | 12        | VARCHAR       | 255         | 1
> > table_name | COUNT_1     | 4         | INTEGER       | null        | 1
> > table_name | COUNT_2     | 4         | INTEGER       | null        | 1
> > table_name | COUNT_3     | 4         | INTEGER       | null        | 1
> > table_name | COUNT_4     | -5        | BIGINT        | null        | 1
> > table_name | COUNT_5     | -5        | BIGINT        | null        | 1
> > table_name | COUNT_6     | -5        | BIGINT        | null        | 1
> > table_name | TAGS        | 2003      | VARCHAR_ARRAY | null        | 1
> > table_name | UPDATED     | -5        | BIGINT        | null        | 1
> > table_name | SOME_FIELD  | 12        | VARCHAR       | 255         | 1
> > table_name | LOCATIONS   | 12        | VARCHAR       | 255         | 1
> >
> > (TABLE_CAT, TABLE_SCHEM, BUFFER_LENGTH, DECIMAL_DIGITS, NUM_PREC_RADIX
> > and COLUMN_DEF are null for every column.)
> >
> > Query:
> >
> > SELECT USER_ID FROM HJK_SI_LEAD_FEED WHERE ID='507449491025170432';
> >
> > On Sat, Sep 6, 2014 at 10:15 AM, James Taylor <jamestay...@apache.org> wrote:
> >>
> >> Vikas,
> >> Please post your schema and query.
> >> Thanks,
> >> James
> >>
> >> On Fri, Sep 5, 2014 at 9:18 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
> >> > Ours is also a single-node setup right now, and as of now there are
> >> > fewer than 1 million rows, which is expected to grow to around 100M
> >> > at minimum.
> >> >
> >> > I am aware of secondary indexes, but when I am querying on the
> >> > primary/row key, why would it take so much time?
> >> >
> >> > I am querying directly, using sqlline for Phoenix and the hbase shell
> >> > for the HBase query. I am not expecting to do any fine-tuning for such
> >> > a small dataset. I am assuming a minimum performance level out of the box.
> >> >
> >> > On Friday, September 5, 2014, yeshwanth kumar <yeshwant...@gmail.com> wrote:
> >> >>
> >> >> hi vikas,
> >> >>
> >> >> we used phoenix on a 4 core/23 GB machine, as a single node setup,
> >> >> on HDP 2.1. our table has 50-70M rows.
> >> >> a select on that table took less than 2 seconds,
> >> >> and aggregation queries took less than 8 seconds.
> >> >> for achieving good performance we created a secondary index on the table.
> >> >>
> >> >> make sure you fine-tune hbase;
> >> >> enabling compression on the data makes a difference in response time.
> >> >> distribute the data and load over all regions in hbase, and
> >> >> look at the performance tips mentioned in the phoenix blog.
> >> >>
> >> >> Cheers,
> >> >> Yeshwanth
> >> >>
> >> >> On Fri, Sep 5, 2014 at 5:42 PM, Vikas Agarwal <vi...@infoobjects.com> wrote:
> >> >>>
> >> >>> Hi,
> >> >>>
> >> >>> Preface: We are testing Phoenix using the Hortonworks distribution for
> >> >>> HBase on an Amazon EC2 instance (r3.large, 2 CPU/15 GB RAM).
> >> >>>
> >> >>> In contrast to the performance benchmarks, I found Phoenix to be very
> >> >>> slow in querying even on the primary key or row key. So, I tried to give
> >> >>> more RAM to HBase and Phoenix, and then increased the CPU and RAM by
> >> >>> upgrading the EC2 machine type to r3.xlarge (4 CPU, 30 GB RAM). The
> >> >>> results were like this:
> >> >>>
> >> >>> Time taken to return the result of a query on the row key:
> >> >>> With Storm running and very little RAM available: 50 sec
> >> >>> With Storm stopped and RAM available to Phoenix and HBase: 18 sec
> >> >>> With the new machine of the next higher category (4 CPU and 30 GB RAM): 8 sec
> >> >>> Pure HBase query by row key with Storm stopped (2 CPU, 15 GB RAM): 0.0150 seconds. :)
> >> >>>
> >> >>> So, the difference seems to be many-fold compared to what native HBase
> >> >>> gives us. I am not able to understand how this can be possible. What am
> >> >>> I missing here?
> >> >>>
> >> >>> --
> >> >>> Regards,
> >> >>> Vikas Agarwal
> >
--
Regards,
Vikas Agarwal
91 – 9928301411

InfoObjects, Inc.
Execution Matters
http://www.infoobjects.com
2041 Mission College Boulevard, #280
Santa Clara, CA 95054
+1 (408) 988-2000 Work
+1 (408) 716-2726 Fax