Re: Upsert is EXTREMELY slow

2018-07-13 Thread alchemist
Thanks so much for your response. Now I am getting better perforamnce i.e 15K per minute, I made two changes. I disabled pheonix transaction. phoenix.transactions.enabled false And I removed the transaction connection.commit(); Logically this should not make any difference because by de

Re: Upsert is EXTREMELY slow

2018-07-13 Thread Josh Elser
Sorry, I was brief and didn't get my point across. I meant to say the same thing you did. Someone manually submitting two updates to an index is naively faster that what Phoenix goes through to automatically (and safely) do this. On 7/13/18 12:07 PM, James Taylor wrote: Phoenix won’t be slowe

Re: Upsert is EXTREMELY slow

2018-07-13 Thread James Taylor
Phoenix won’t be slower to update secondary indexes than a use case would be. Both have to do the writes to a second table to keep it in sync. On Fri, Jul 13, 2018 at 8:39 AM Josh Elser wrote: > Also, they're relying on Phoenix to do secondary index updates for them. > > Obviously, you can do th

Re: Upsert is EXTREMELY slow

2018-07-13 Thread Josh Elser
Also, they're relying on Phoenix to do secondary index updates for them. Obviously, you can do this faster than Phoenix can if you know the exact use-case. On 7/12/18 6:31 PM, Pedro Boado wrote: A tip for performance is reusing the same preparedStatement , just clearParameters() , set values

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Pedro Boado
A tip for performance is reusing the same preparedStatement , just clearParameters() , set values and executeUpdate() over and over again. Don't close the statement or connections after each upsert. Also, I haven't seen any noticeable benefit on using jdbc batches as Phoenix controls batching by wh

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Pedro Boado
I believe it's related to your client code - In our use case we do easily 15k writes/sec in a cluster lower specced than yours. Check that your jdbc connection has autocommit off so Phoenix can batch writes and that table has a reasonable UPDATE_CACHE_FREQUENCY ( more than 6 ). On Thu, 12 J

Re: Upsert is EXTREMELY slow

2018-07-12 Thread alchemist
Thanks a lot for your help. Our test is inserting new rows individually. For our use case, we are benchmarking that we could be able to get 10,000 new rows in a minute, using a cluster of writers if needed. When executing the inserts with Phoenix API (UPSERT) we have been able to get up to 6,000 ne

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Josh Elser
HBase must grab a lock for the row which is being updated. Normally, for a batch of updates sent to a region server, the RS will grab as many row locks as it can at once. If you only send one row to update at a time, you obviously get no amortization. It's just the normal semantics of batching

Re: Upsert is EXTREMELY slow

2018-07-12 Thread Josh Elser
Phoenix does not recommend connection pooling because Phoenix Connections are not expensive to create as most DB connections are. The first connection you make from a JVM is expensive. Every subsequent one is cheap. On 7/11/18 2:55 PM, alchemist wrote: Since Phoenix does not recommend connec

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Josh Elser-2 wrote > Josh thanks so much for all your help. I do not understand why you > "However, you are still fighting yourself when you have threads all trying > to grab the same lock to write their data." My understanding is if we > have many processes running many threads, the data will be

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Since Phoenix does not recommend connection pooling. Then even if we have multiple threads and processes, each thread will take time to get connection to Phoenix server, execute upsert operation then commit the operation. This whole process will take sometime. And no matter what level of threadi

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Thanks Josh! My use case is very simple, I have data in S3 that is backed by Hbase in EMR. I need to access this data from outside EMR cluster, in this case I created three EC2 nodes that executes multithreaded Java program in parallel. These multithreaded program from different EC2 nodes are t

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
Your real-world situation is not a single-threaded application, is it? You will have multiple threads which are all updating Phoenix concurrently. Given the semantics that your application needs from the requirements you stated, I'm not sure what else you can do differently. You can get low-la

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Thanks so much Josh! I am unable to understand why performance is extremely slow. 1. If I perform update using PreparedStatement addBatch and executeBatch then I get nearly 6000 transactions per minute. 2. But in our case we need to save each transaction so cannot perform update batch, so I

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
Some thoughts: * Please _remove_ commented lines before sharing configuration next time. We don't need to see all of the things you don't have set :) * 100 salt buckets is really excessive for a 4 node cluster. Salt buckets are not synonymous with pre-splitting HBase tables. This many salt b

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Thanks so much for all your help and response. Here are the recommended details. I have deployed Hbase backed by S3 on EMR cluster. And performing upsert from EC2 node outside EMR using simple java multithreaded client to perform atomic update. I have created a table like following CREATE TABL

Re: Upsert is EXTREMELY slow

2018-07-11 Thread Josh Elser
The explain plan for your tables isn't a substitute for the DDLs. Please provide those. How about sharing your completely hbase-site.xml and hbase-env.sh files, rather than just snippets like you have. A full picture is often needed. Given that HBase cannot directly run on S3, please also des

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
Is there any configurations I am missing? -- Sent from: http://apache-phoenix-user-list.1124778.n5.nabble.com/

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
I am doing a POC where I am using Pheonix for single write that updates the database after the write. So I cannot batch update the write. I am getting 1 transaction per second as TPS. I have 3 node EMR cluster. I am using HBase with S3 as backend. I tried tuning parameter I found online I cr

Re: Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
I have also tried adding cache frequency. phoenix.default.update.cache.frequency 30 http://apache-phoenix-user-list.1124778.n5.nabble.com/

Upsert is EXTREMELY slow

2018-07-11 Thread alchemist
I have written a threaded program to batch upsert data into Phoenix. I am using Phoenix because of secondary index capability. I am getting extremely slow performance in write. Explain query looks like following 0: jdbc:phoenix:localhost:2181:/hbase> EXPLAIN select count(1) from VBQL_PHOENIX_