Is 1000 a good default?
On Wed, Feb 17, 2016 at 9:42 AM, Nick Dimiduk <ndimi...@gmail.com> wrote:

Thanks for the context, Arun.

For what it's worth, I greatly increased the batch size (from the default 1,000 to 500,000), which I believe reduced contention on the lock and allowed ingest to catch up.
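A minimal standalone-JDBC sketch of the batch-size change described above. The phoenix.mutate.batchSize property name, the JDBC URL, the table, and the row counts are illustrative assumptions taken from the standard Phoenix client tuning knobs; the MapReduce path (PhoenixRecordWriter) reads its batch size from the job configuration, so verify the exact key against your Phoenix version.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.PreparedStatement;
    import java.util.Properties;

    public class LargerBatchIngest {
        public static void main(String[] args) throws Exception {
            long batchSize = 500_000L;

            Properties props = new Properties();
            // Assumed client property controlling how many mutations Phoenix
            // groups per server round trip on commit (default is 1,000).
            props.setProperty("phoenix.mutate.batchSize", Long.toString(batchSize));

            try (Connection conn = DriverManager.getConnection(
                    "jdbc:phoenix:zk-host:2181", props)) {
                conn.setAutoCommit(false);
                try (PreparedStatement ps = conn.prepareStatement(
                        "UPSERT INTO MY_SCHEMA.MY_TABLE (ID, VAL) VALUES (?, ?)")) {
                    for (int i = 0; i < 1_000_000; i++) {
                        ps.setInt(1, i);
                        ps.setString(2, "value-" + i);
                        ps.executeUpdate();
                        // Commit every batchSize rows, roughly what
                        // PhoenixRecordWriter does for a MapReduce load.
                        if ((i + 1) % batchSize == 0) {
                            conn.commit();
                        }
                    }
                }
                conn.commit(); // flush the final partial batch
            }
        }
    }

The same commit cadence is what triggers the per-batch metadata lookups discussed below, so fewer, larger batches means fewer trips through doGetTable().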
On Tue, Feb 16, 2016 at 9:14 PM, Thangamani, Arun <arun.thangam...@cdk.com> wrote:

Sorry, I had pressed Control + Enter a little earlier than I wanted to; corrections inline. Thanks.

From: "Thangamani, Arun" <arun.thangam...@cdk.com>
Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Date: Tuesday, February 16, 2016 at 8:38 PM
To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Cc: James Taylor <jamestay...@apache.org>, Lars Hofhansl <la...@apache.org>
Subject: Re: Write path blocked by MetaDataEndpoint acquiring region lock

Hey Nick,

It looks like you are failing to find your table in the metadata cache. When the table is not found in the metadata cache, we end up rebuilding the metadata from both the SYSTEM.CATALOG and SYSTEM.STATS tables; the rebuild is a scan on both tables.

So we end up going to ZooKeeper to find the region and then executing the scan on that region. It is an expensive operation, which explains the calls to one specific region server and to ZooKeeper.

A similar thing happens in PHOENIX-2607, but that case is specifically related to timestamps: if the client timestamp is less than or equal to the stats timestamp of the table, we end up repeating the rebuild above for every batch of inserts from PhoenixMapReduceUtil (the default batch size is 1,000 rows). I don't believe you have a timestamp issue, but it looks like you have the same repeated lookups and the same rebuild scans that happen in PHOENIX-2607.

James has suggested a workaround for the metadata cache refresh using the UPDATE_CACHE_FREQUENCY property when defining the table, and it will probably help (we are trying it out for the timestamp issue).

Hope this helps.

Thanks
Arun

Please look at the following lines in MetaDataEndpointImpl (doGetTable):

private PTable doGetTable(byte[] key, long clientTimeStamp, RowLock rowLock) throws IOException, SQLException {
    ImmutableBytesPtr cacheKey = new ImmutableBytesPtr(key);
    Cache<ImmutableBytesPtr, PMetaDataEntity> metaDataCache =
            GlobalCache.getInstance(this.env).getMetaDataCache();
    PTable table = (PTable) metaDataCache.getIfPresent(cacheKey);
    // We only cache the latest, so we'll end up building the table with every call if the
    // client connection has specified an SCN.
    // TODO: If we indicate to the client that we're returning an older version, but there's a
    // newer version available, the client can safely not call this, since we only allow
    // modifications to the latest.
    if (table != null && table.getTimeStamp() < clientTimeStamp) {
        // Table on client is up-to-date with table on server, so just return
        if (isTableDeleted(table)) {
            return null;
        }
        return table;
    }
    // Ask Lars about the expense of this call - if we don't take the lock, we still won't get
    // partial results
    // get the co-processor environment
    // TODO: check that key is within region.getStartKey() and region.getEndKey()
    // and return special code to force client to lookup region from meta.
    Region region = env.getRegion();
    /*
     * Lock directly on key, though it may be an index table. This will just prevent a table
     * from getting rebuilt too often.
     */
    final boolean wasLocked = (rowLock != null);
    if (!wasLocked) {
        rowLock = region.getRowLock(key, true);
        if (rowLock == null) {
            throw new IOException("Failed to acquire lock on " + Bytes.toStringBinary(key));
        }
    }
    try {
        // Try cache again in case we were waiting on a lock
        table = (PTable) metaDataCache.getIfPresent(cacheKey);
        // We only cache the latest, so we'll end up building the table with every call if the
        // client connection has specified an SCN.
        // TODO: If we indicate to the client that we're returning an older version, but there's
        // a newer version available, the client can safely not call this, since we only allow
        // modifications to the latest.
        if (table != null && table.getTimeStamp() < clientTimeStamp) {
            // Table on client is up-to-date with table on server, so just return
            if (isTableDeleted(table)) {
                return null;
            }
            return table;
        }
        // Query for the latest table first, since it's not cached
        table = buildTable(key, cacheKey, region, HConstants.LATEST_TIMESTAMP);
        if (table != null && table.getTimeStamp() < clientTimeStamp) {
            return table;
        }
        // Otherwise, query for an older version of the table - it won't be cached
        return buildTable(key, cacheKey, region, clientTimeStamp);
    } finally {
        if (!wasLocked) rowLock.release();
    }
}
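A minimal sketch of the UPDATE_CACHE_FREQUENCY workaround mentioned above, issued over JDBC. The table name, ZooKeeper quorum, and 15-minute interval are illustrative, and the property may require a Phoenix release newer than the 4.6.0 discussed in this thread, so treat it as a sketch rather than a drop-in fix.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class UpdateCacheFrequencyWorkaround {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:phoenix:zk-host:2181");
                 Statement stmt = conn.createStatement()) {
                // Allow clients to reuse cached table metadata for up to 15 minutes
                // (value is milliseconds) instead of resolving the table from
                // SYSTEM.CATALOG on every commit or query.
                stmt.execute("ALTER TABLE MY_SCHEMA.MY_TABLE SET UPDATE_CACHE_FREQUENCY = 900000");
            }
        }
    }

The same property can also be set in the CREATE TABLE statement; either way it trades metadata freshness for far fewer getTable() calls against the SYSTEM.CATALOG region.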
From: Nick Dimiduk <ndimi...@gmail.com>
Reply-To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Date: Tuesday, February 16, 2016 at 5:58 PM
To: "user@phoenix.apache.org" <user@phoenix.apache.org>
Cc: James Taylor <jamestay...@apache.org>, Lars Hofhansl <la...@apache.org>
Subject: Write path blocked by MetaDataEndpoint acquiring region lock

Hello,

I have a high-throughput ingest pipeline that has seized up. My ingest application ultimately crashes with the following stack trace [0].

Independently, I noticed that the RPC call time on one of the machines was significantly higher than on the others (95th percentile at multiple seconds vs. tens of milliseconds). I grabbed the RS log and a couple of jstacks from the process. In the logs I see handler threads creating ZK connections excessively (~50 INFO lines per second). The jstacks show handler threads parked while taking region row locks, calling HRegion.getRowLockInternal() via MetaDataEndpointImpl.doGetTable() [1]. The one handler thread I see that is in the same MetaDataEndpointImpl area but not under the lock appears to be making an RPC to read the statistics table [2].

I believe these two occurrences are related.

My working theory is that the metaDataCache object is performing poorly for some reason. This results in excessive metadata lookups, some of which appear to require making an RPC call while holding the lock.

What can I do to relieve pressure on this row lock? Looking at the code around the lock, it looks like it's populating a connection-level cache of schema metadata. The host with the high RPC call time is hosting the SYSTEM.CATALOG table region. I see some configuration settings that may be related (e.g., phoenix.coprocessor.maxMetaDataCacheSize), but I see no way to get debug information about cache size or evictions from this Guava cache instance.

I'll be digging into this further, but I appreciate any pointers you may have.

BTW, this is Phoenix 4.6.0 + HBase 1.1.2.

Thanks a lot,
-n
[0]: client-side stack

Caused by: org.apache.phoenix.exception.PhoenixIOException: Interrupted calling coprocessor service org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService for row \x00<schema>\x00<user_table>
    at org.apache.phoenix.util.ServerUtil.parseServerException(ServerUtil.java:108)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1053)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.metaDataCoprocessorExec(ConnectionQueryServicesImpl.java:1016)
    at org.apache.phoenix.query.ConnectionQueryServicesImpl.getTable(ConnectionQueryServicesImpl.java:1289)
    at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:446)
    at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:389)
    at org.apache.phoenix.schema.MetaDataClient.updateCache(MetaDataClient.java:385)
    at org.apache.phoenix.execute.MutationState.validate(MutationState.java:369)
    at org.apache.phoenix.execute.MutationState.commit(MutationState.java:417)
    at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:482)
    at org.apache.phoenix.jdbc.PhoenixConnection$3.call(PhoenixConnection.java:479)
    at org.apache.phoenix.call.CallRunner.run(CallRunner.java:53)
    at org.apache.phoenix.jdbc.PhoenixConnection.commit(PhoenixConnection.java:479)
    at org.apache.phoenix.mapreduce.PhoenixRecordWriter.write(PhoenixRecordWriter.java:84)

[1]: rs handlers blocked stack

"B.defaultRpcServer.handler=48,queue=3,port=16020" #91 daemon prio=5 os_prio=0 tid=0x00007ff4ce458000 nid=0xca0a waiting on condition [0x00007ff47a607000]
   java.lang.Thread.State: TIMED_WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for <0x0000000708ef59c0> (a java.util.concurrent.CountDownLatch$Sync)
    at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedNanos(AbstractQueuedSynchronizer.java:1037)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer.tryAcquireSharedNanos(AbstractQueuedSynchronizer.java:1328)
    at java.util.concurrent.CountDownLatch.await(CountDownLatch.java:277)
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5047)
    at org.apache.hadoop.hbase.regionserver.HRegion.getRowLock(HRegion.java:5013)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2397)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2365)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:440)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:11609)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)

[2]: rs handler stats rpc stack

"B.defaultRpcServer.handler=19,queue=4,port=16020" #62 daemon prio=5 os_prio=0 tid=0x00007ff4ce420000 nid=0xc9ea in Object.wait() [0x00007ff47c323000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
    at java.lang.Object.wait(Native Method)
    at java.lang.Object.wait(Object.java:460)
    at java.util.concurrent.TimeUnit.timedWait(TimeUnit.java:348)
    at org.apache.hadoop.hbase.client.ResultBoundedCompletionService.poll(ResultBoundedCompletionService.java:155)
    - locked <0x00000007a3f5e030> (a [Lorg.apache.hadoop.hbase.client.ResultBoundedCompletionService$QueueingFuture;)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:168)
    at org.apache.hadoop.hbase.client.ScannerCallableWithReplicas.call(ScannerCallableWithReplicas.java:59)
    at org.apache.hadoop.hbase.client.RpcRetryingCaller.callWithoutRetries(RpcRetryingCaller.java:200)
    at org.apache.hadoop.hbase.client.ClientScanner.call(ClientScanner.java:320)
    at org.apache.hadoop.hbase.client.ClientScanner.loadCache(ClientScanner.java:403)
    at org.apache.hadoop.hbase.client.ClientScanner.next(ClientScanner.java:364)
    at org.apache.phoenix.schema.stats.StatisticsUtil.readStatistics(StatisticsUtil.java:99)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:836)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.buildTable(MetaDataEndpointImpl.java:472)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2418)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.doGetTable(MetaDataEndpointImpl.java:2365)
    at org.apache.phoenix.coprocessor.MetaDataEndpointImpl.getTable(MetaDataEndpointImpl.java:440)
    at org.apache.phoenix.coprocessor.generated.MetaDataProtos$MetaDataService.callMethod(MetaDataProtos.java:11609)
    at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:7435)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:1875)
    at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:1857)
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32209)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2114)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:101)
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
    at java.lang.Thread.run(Thread.java:745)
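On the point Nick raises above about having no visibility into the coprocessor's Guava cache: a generic sketch of what recordStats()/stats() expose on a Guava cache. This is not how Phoenix builds its server-side metaDataCache (surfacing these numbers there would need a code or logging change); it only illustrates the kind of debug information being asked for, and the keys and sizes below are made up.

    import com.google.common.cache.Cache;
    import com.google.common.cache.CacheBuilder;
    import com.google.common.cache.CacheStats;

    public class CacheVisibilitySketch {
        public static void main(String[] args) {
            // A standalone Guava cache built with stats recording enabled.
            Cache<String, String> cache = CacheBuilder.newBuilder()
                    .maximumSize(1_000)   // stand-in for the byte-based limit behind
                                          // phoenix.coprocessor.maxMetaDataCacheSize
                    .recordStats()        // must be enabled at build time to collect stats
                    .build();

            cache.put("SYSTEM.CATALOG", "ptable-bytes");
            cache.getIfPresent("SYSTEM.CATALOG");       // hit
            cache.getIfPresent("MY_SCHEMA.MY_TABLE");   // miss

            CacheStats stats = cache.stats();
            System.out.printf("size=%d hitRate=%.2f evictions=%d%n",
                    cache.size(), stats.hitRate(), stats.evictionCount());
        }
    }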
--
Best regards,
- Andy

Problems worthy of attack prove their worth by hitting back. - Piet Hein (via Tom White)