Re: pagination

2017-05-24 Thread Ciureanu Constantin
sk to delete older cached results) 2017-05-18 22:02 GMT+02:00 James Taylor : > HBase does not lend itself to that pattern. Rows overlap in HFiles (by > design). There's no facility to jump to the Nth row. Best to use the RVC > mechanism. > > On Thu, May 18, 2017 at 12:03 PM Ciurean
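
The RVC mechanism James refers to is Phoenix's row value constructor comparison, which restarts a scan from the last row of the previous page instead of skipping N rows. A minimal sketch, assuming a hypothetical EVENTS table whose primary key is (TENANT_ID, EVENT_TIME) with EVENT_TIME stored as epoch milliseconds; the names and types are illustrative only:

  -- Page 2 and onwards: pass back the PK of the last row seen on the previous page.
  -- The RVC comparison lets Phoenix seek directly to that point in the key space.
  SELECT TENANT_ID, EVENT_TIME, PAYLOAD
  FROM EVENTS
  WHERE (TENANT_ID, EVENT_TIME) > ('acme', 1495584000000)
  ORDER BY TENANT_ID, EVENT_TIME
  LIMIT 20;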

Re: pagination

2017-05-18 Thread Ciureanu Constantin
What about using the VLH pattern? And keep the offsets for each page on the server side for a while... (the client might not need all of them, and might never ask for the next page) http://www.oracle.com/technetwork/java/valuelisthandler-142464.html On May 18, 2017 20:02, "James Taylor" wrote: >

Re: How to map sparse hbase table with dynamic columns into Phoenix

2016-12-12 Thread Ciureanu Constantin
Not sure if this works for the view use-case you have, but it works for a Phoenix table. The table create statement should have just the stable columns. CREATE TABLE IF NOT EXISTS TESTC ( TIMESTAMP BIGINT NOT NULL, NAME VARCHAR NOT NULL CONSTRAINT PK PRIMARY KEY (TIMESTAMP, NAME) ); -- insert
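
A minimal sketch of how Phoenix dynamic columns are declared at write and read time, continuing the TESTC example above; the column VAL1 and the sample values are assumptions for illustration:

  -- Dynamic columns are declared inline in the UPSERT, next to the stable PK columns.
  UPSERT INTO TESTC (TIMESTAMP, NAME, VAL1 VARCHAR)
  VALUES (1481500800000, 'sensor-1', 'some value');

  -- They must be re-declared in the FROM clause when querying.
  SELECT TIMESTAMP, NAME, VAL1
  FROM TESTC (VAL1 VARCHAR)
  WHERE NAME = 'sensor-1';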

Re: how to filter date type column ?

2016-12-02 Thread Ciureanu Constantin
es as timestamps (long) 2016-12-02 9:11 GMT+01:00 Ciureanu Constantin : > Try using WHERE clause... > > ... FROM FARM_PRODUCT_PRICE > WHERE date=TO_DATE('2015-06-01','yyyy-MM-dd') > LIMIT 100; > > 2016-12-02 6:43 GMT+01:00 lk_phoenix : > >> hi,all:

Re: how to filter date type column ?

2016-12-02 Thread Ciureanu Constantin
Try using WHERE clause... ... FROM FARM_PRODUCT_PRICE WHERE date=TO_DATE('2015-06-01','yyyy-MM-dd') LIMIT 100; 2016-12-02 6:43 GMT+01:00 lk_phoenix : > hi,all: > I have a table with a column of date type, I try to use it as a where > condition, but it does not work. > > select date,TO_DATE('2015-

Re: [jira] Vivek Paranthaman shared "PHOENIX-3395: ResultSet .next() throws commons-io exception" with you

2016-10-22 Thread Ciureanu Constantin
Not sure what to say, check your apache-commons version, perhaps it's picking up an older one from the classpath. On Fri, 21 Oct 2016, 09:36, Vivek Paranthaman (JIRA) wrote: > Vivek Paranthaman shared an issue with you > > > > > > ResultSet .ne

Re: sc.phoenixTableAsRDD number of initial partitions

2016-10-14 Thread Ciureanu Constantin
Then please post a small part of your code (the part reading from Phoenix & processing the RDD contents) 2016-10-14 11:12 GMT+02:00 Antonio Murgia : > For the record, autocommit was set to true. > > On 10/14/2016 10:08 AM, James Taylor wrote: > > > > On Fri, Oct 14, 2016 at 12:37 AM, Antonio Murg

Re: sc.phoenixTableAsRDD number of initial partitions

2016-10-13 Thread Ciureanu Constantin
Hi Antonio, Reading the whole table is not a good use-case for Phoenix / HBase or any DB. You should never store the whole content read from the DB / disk in memory; that's definitely wrong. Spark doesn't do that by itself, no matter what "they" told you it does in order to be fast

Re: Accessing phoenix tables in Spark 2

2016-10-07 Thread Ciureanu Constantin
In Spark 1.4 it worked via JDBC - I'm sure it would work in 1.6 / 2.0 without issues. Here's some sample code I used (it fetched data in parallel across 24 partitions) import org.apache.spark.SparkConf import org.apache.spark.SparkContext import org.apache.spark.rdd.JdbcRDD import java.sql.{Connection, D

Re: Question regarding designing row keys

2016-10-04 Thread Ciureanu Constantin
ingle region if keys are monotonically increasing. > > On Tue, Oct 4, 2016 at 8:04 AM, Ciureanu Constantin < > ciureanu.constan...@gmail.com> wrote: > > select * from metric_table where metric_type='x' > -- so far so good > > and timestamp > 'start_da

Re: Question regarding designing row keys

2016-10-04 Thread Ciureanu Constantin
select * from metric_table where metric_type='x' -- so far so good and timestamp > 'start_date' and timestamp < 'end_date'. -- here, in case the timestamp is a long (BIGINT in Phoenix), it should work fine! Try also with "timestamp BETWEEN x AND y". Anyway - my proposal would be to reverse the key
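
A minimal sketch of the proposed reversed key, with the low-cardinality metric_type leading so the date range becomes one contiguous scan per type; the column types are assumptions:

  CREATE TABLE IF NOT EXISTS METRIC_TABLE (
      METRIC_TYPE VARCHAR NOT NULL,
      TIMESTAMP   BIGINT  NOT NULL,   -- epoch milliseconds
      VALUE       DOUBLE
      CONSTRAINT PK PRIMARY KEY (METRIC_TYPE, TIMESTAMP)
  );

  SELECT * FROM METRIC_TABLE
  WHERE METRIC_TYPE = 'x'
    AND TIMESTAMP BETWEEN 1443916800000 AND 1444003200000;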

Re: Primary key with a separator

2016-09-22 Thread Ciureanu Constantin
But Phoenix does this for you (it creates a composite key with a special separator) - you just have to specify the PK while creating the table. CREATE TABLE IF NOT EXISTS us_population ( state CHAR(2) NOT NULL, city VARCHAR NOT NULL, population BIGINT CONSTRAINT my_pk PRIMARY KEY (s
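
For reference, the complete form of that statement (the standard example from the Phoenix getting-started documentation); state and city together form the row key, and Phoenix inserts a separator after variable-length parts for you:

  CREATE TABLE IF NOT EXISTS us_population (
      state      CHAR(2) NOT NULL,
      city       VARCHAR NOT NULL,
      population BIGINT
      CONSTRAINT my_pk PRIMARY KEY (state, city)
  );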

Re: is there a way to Join two big tables?

2016-05-23 Thread Ciureanu Constantin
Yes, of course it's possible. Just not using Phoenix - try writing a Spark job (or MapReduce); if you pick the right join condition it might actually not be that slow at all (time to read the 2 tables in Spark included). If you still want to do it in Phoenix - try to increase those limits (hash

Re: PHOENIX SPARK - Load Table as DataFrame

2016-05-18 Thread Ciureanu Constantin
Hello Mohan, Since you haven't mentioned anything about any tricks for the join, I would assume the entire table would be streamed through Spark and joined there with your file. As for tricks to improve the speed - you can pick one that works in your case, or come up with something of your own since you know

Re: FOREIGN KEY

2016-05-12 Thread Ciureanu Constantin
umber ) 2016-05-12 15:14 GMT+02:00 Ciureanu Constantin < ciureanu.constan...@gmail.com>: > Just create a new first unique field CustomerID + TelephoneType to play > the PK role, something has to be unique there and a HBase table needs a Key > (this concatenation of 2 or more values i

Re: FOREIGN KEY

2016-05-12 Thread Ciureanu Constantin
Just create a new first unique field, CustomerID + TelephoneType, to play the PK role; something has to be unique there and an HBase table needs a key (this concatenation of 2 or more values is valid as long as it's unique, otherwise invent some other 3rd part or risk losing phone numbers that are "dupli
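
A minimal sketch of that suggestion as a Phoenix table; the table and column names, and the assumption that CustomerID + TelephoneType is unique per phone number, are illustrative only. If a customer can hold several numbers of the same type, a third PK component is needed, as the reply cautions.

  CREATE TABLE IF NOT EXISTS CUSTOMER_PHONE (
      CUSTOMER_ID    BIGINT  NOT NULL,
      TELEPHONE_TYPE VARCHAR NOT NULL,
      PHONE_NUMBER   VARCHAR
      CONSTRAINT PK PRIMARY KEY (CUSTOMER_ID, TELEPHONE_TYPE)
  );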

Re: Region Server Crash On Upsert Query Execution

2016-04-01 Thread Ciureanu Constantin
Hi Amit, I guess processing with HBase + Phoenix is not working for your use-case; it needs a lot of memory and of course swap. I imagine there's no direct solution - but post here if you find one (some options worth trying: splitting the query into smaller ones, salt the table in more bu

Re: How To Count Rows In Large Phoenix Table?

2015-06-22 Thread Ciureanu Constantin
Hive can connect to HBase and insert directly in either direction. Don't know if it also works via Phoenix... Counting is too slow as a single-threaded job / from the command line - you should write a map-reduce job with some filter to load just the keys; that is really fast. A map-reduce job is also the

Re: CsvBulkLoadTool hanging

2015-04-11 Thread Ciureanu Constantin
Hi Marek, Change the permissions on the output folder. I had the same problem; initially I solved it by manually changing the permissions to 777 from a console once the application had finished the MR job and was stuck in the infinite loop. My current working solution was to code a sleep of 5 sec and then change the rig

RE: Update statistics made query 2-3x slower

2015-03-04 Thread Ciureanu, Constantin (GfK)
on your other questions. Thanks, James On Tue, Mar 3, 2015 at 4:23 AM, Ciureanu, Constantin (GfK) <constantin.ciure...@gfk.com> wrote: Hello James, Btw, I noticed some other issues: - My table key is (DATUM, … ) ordered ascending by key (LONG, in milliseconds) – I have changed th

RE: Update statistics made query 2-3x slower

2015-03-03 Thread Ciureanu, Constantin (GfK)
ll? Thanks, James On Mon, Feb 16, 2015 at 8:44 AM, Vasudevan, Ramkrishna S <ramkrishna.s.vasude...@intel.com> wrote: Without update statistics – if we run select count(*) what is the PLAN that it executes? One of the RS has got more data I believe. Regards Ram From: Ciureanu,

RE: Update statistics made query 2-3x slower

2015-03-03 Thread Ciureanu, Constantin (GfK)
Hello James, Sorry, no – it’s not my case. I haven’t run any (minor/major) compaction for my table. Regards, Constantin From: James Taylor [mailto:jamestay...@apache.org] Sent: Tuesday, March 03, 2015 2:20 AM To: user; Ciureanu, Constantin (GfK) Subject: Re: Update statistics made query 2-3x

RE: Inner Join not returning any results in Phoenix

2015-02-20 Thread Ciureanu, Constantin (GfK)
Matt From: Ciureanu, Constantin (GfK) [mailto:constantin.ciure...@gfk.com] Sent: 20 February 2015 14:40 To: user@phoenix.apache.org Subject: RE: Inner Join not returning any results in Phoenix Hi Matthew, Is it wor

RE: Inner Join not returning any results in Phoenix

2015-02-20 Thread Ciureanu, Constantin (GfK)
Hi Matthew, Is it working without the quotes “ / " ? (I see you are using 2 types of quotes, which is weird) I guess they're not needed, and probably causing trouble. I don’t have to use quotes anyway. Alternatively, check the types of data in those 2 tables (if the field types are not the same in

RE: Update statistics made query 2-3x slower

2015-02-16 Thread Ciureanu, Constantin (GfK)
statistics tableX; Error: ERROR 6000 (TIM01): Operation timed out . Query couldn't be completed in the alloted time: 60 ms (state=TIM01,code=6000) Thank you, Constantin From: Ciureanu, Constantin (GfK) [mailto:constantin.ciure...@gfk.com] Sent: Monday, February 16, 2015 10:31 AM To: us

RE: Update statistics made query 2-3x slower

2015-02-16 Thread Ciureanu, Constantin (GfK)
are your rows and how much memory is available on your RS/HBase heap? 3. Can you also send output of explain select count(*) from tablex for this case? Thanks, Mujtaba On Fri, Feb 13, 2015 at 12:34 AM, Ciureanu, Constantin (GfK) <constantin.ciure...@gfk.com> wrote: Hello Mujtaba,

RE: Update statistics made query 2-3x slower

2015-02-13 Thread Ciureanu, Constantin (GfK)
mns width, number of region servers in your cluster plus their heap size, HBase/Phoenix version and any default property overrides so we can identify why stats are slowing things down in your case. Thanks, Mujtaba On Thu, Feb 12, 2015 at 12:56 AM, Ciureanu, Constantin (GfK) mailto:constantin.ci

RE: Update statistics made query 2-3x slower

2015-02-12 Thread Ciureanu, Constantin (GfK)
nks of 1 rows within that region. Have you modified any of the parameters related to statistics like this one ‘phoenix.stats.guidepost.width’. Regards Ram From: Ciureanu, Constantin (GfK) [mailto:constantin.ciure...@gfk.com] Sent: Wednesday, Februar

RE: Cascading / Scalding Tap to read / write into Phoenix

2015-02-12 Thread Ciureanu, Constantin (GfK)
oenix. Phoenix-specific InputFormat and OutputFormat implementations were recently added to Phoenix, so if there's an easy way to wrap an existing InputFormat and OutputFormat as a Tap in Cascading, then this would probably be the easiest way to go. - Gabriel On Tue, Feb 10, 2015

Update statistics made query 2-3x slower

2015-02-11 Thread Ciureanu, Constantin (GfK)
Hello all, 1. Is there a good explanation why updating the statistics: update statistics tableX; made this query 2-3x slower? (it was 27 seconds before, now it’s somewhere between 60 – 90 seconds) select count(*) from tableX;
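
For context, the statements being compared in this thread (tableX is the placeholder used by the original poster); the later replies ask for exactly this EXPLAIN output before and after collecting statistics:

  -- Plan before running statistics
  EXPLAIN SELECT COUNT(*) FROM tableX;

  -- Collect guideposts, then compare the plan and the runtime again
  UPDATE STATISTICS tableX;
  EXPLAIN SELECT COUNT(*) FROM tableX;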

Cascading / Scalding Tap to read / write into Phoenix

2015-02-10 Thread Ciureanu, Constantin (GfK)
Hello all, Is there any Cascading / Scalding Tap to read / write data from and to Phoenix? I couldn’t find anything on the internet so far. I know that there is a Cascading Tap to read from HBase and Cascading integration with JDBC. Thank you, Constantin

RE: Pig vs Bulk Load record count

2015-02-03 Thread Ciureanu, Constantin (GfK)
Hello Ralph, Try to check whether the PIG script produces keys that overlap (that would explain the reduced number of rows). Good luck, Constantin From: Ravi Kiran [mailto:maghamraviki...@gmail.com] Sent: Tuesday, February 03, 2015 2:42 AM To: user@phoenix.apache.org Subject: Re: Pig vs

RE: MapReduce bulk load into Phoenix table

2015-01-16 Thread Ciureanu, Constantin (GfK)
ven better, give me a pointer to it on GitHub or something similar)? - Gabriel On Thu, Jan 15, 2015 at 2:19 PM, Ciureanu, Constantin (GfK) wrote: > Hello all, > > I finished the MR Job - for now it just failed a few times since the Mappers > gave some weird timeout (600 s

RE: MapReduce bulk load into Phoenix table

2015-01-16 Thread Ciureanu, Constantin (GfK)
r to first determine what the real issue is, could you give a general overview of how your MR job is implemented (or even better, give me a pointer to it on GitHub or something similar)? - Gabriel On Thu, Jan 15, 2015 at 2:19 PM, Ciureanu, Constantin (GfK) wrote: > Hello all, > > I

RE: MapReduce bulk load into Phoenix table

2015-01-15 Thread Ciureanu, Constantin (GfK)
machines, 24 tasks can run at the same time). Can this be because of some limitation on the number of connections to Phoenix? Regards, Constantin -Original Message- From: Ciureanu, Constantin (GfK) [mailto:constantin.ciure...@gfk.com] Sent: Wednesday, January 14, 2015 9:44 AM To: user

RE: MapReduce bulk load into Phoenix table

2015-01-14 Thread Ciureanu, Constantin (GfK)
multiple results of > PDataType.TYPE.toBytes() as rowkey. For values use same logic. Data > types are defined as enums at this class: > org.apache.phoenix.schema.PDataType. > > Good luck, > Vaclav; > > On 01/13/2015 10:58 AM, Ciureanu, Constantin (GfK) wrote: >> Thank you Vac

RE: MapReduce bulk load into Phoenix table

2015-01-13 Thread Ciureanu, Constantin (GfK)
ase table. Then you should hit bottleneck of HBase itself. It should be from 10 to 30+ times faster than your current solution. Depending on HW of course. I'd prefer this solution for stream writes. Vaclav On 01/13/2015 10:12 AM, Ciureanu, Constantin (GfK) wrote: > Hello all, > >

MapReduce bulk load into Phoenix table

2015-01-13 Thread Ciureanu, Constantin (GfK)
Hello all, (Due to the slow speed of Phoenix JDBC – a single machine does ~1000-1500 rows/sec) I am also reading up on loading data into Phoenix via MapReduce. So far I understood that the Key + List<[Key,Value]> to be inserted into the HBase table is obtained via a “dummy” Phoenix connection

RE: sqlline.py operation error

2014-12-19 Thread Ciureanu, Constantin (GfK)
Hello, Check the Java version. Phoenix was compiled with JDK 7.0 and you are probably using JDK 6.0 (runtime). From: 聪聪 [mailto:175998...@qq.com] Sent: Friday, December 19, 2014 9:39 AM To: user Subject: sqlline.py operation error I use HBase version hbase-0.98.6-cdh5.2.0, so I download phoenix