Whether you need to obtain a read lock depends on the guarantees you want
to make to your readers. Obtaining the lock does a couple of things your
users might want (a sketch follows the list):
1) It will prevent DDL statements such as DROP TABLE from removing the
data while they are reading it.
2) It will prevent the compactor from removing the versions of the delta
files they are reading.
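For what it's worth, here is a minimal sketch of taking that shared read
lock through the metastore client. The ReadLockExample class name, the
one-second polling interval, and the error handling are all illustrative,
not anything Hive prescribes:

    import java.net.InetAddress;
    import java.util.Collections;

    import org.apache.hadoop.hive.metastore.IMetaStoreClient;
    import org.apache.hadoop.hive.metastore.api.LockComponent;
    import org.apache.hadoop.hive.metastore.api.LockLevel;
    import org.apache.hadoop.hive.metastore.api.LockRequest;
    import org.apache.hadoop.hive.metastore.api.LockResponse;
    import org.apache.hadoop.hive.metastore.api.LockState;
    import org.apache.hadoop.hive.metastore.api.LockType;

    public class ReadLockExample {
      // Takes a SHARED_READ lock at TABLE level: other readers and writers
      // can proceed, but DROP TABLE and the compactor's cleaner are held off.
      public static long acquireReadLock(IMetaStoreClient client,
                                         String db, String table) throws Exception {
        LockComponent component =
            new LockComponent(LockType.SHARED_READ, LockLevel.TABLE, db);
        component.setTablename(table);

        LockRequest request = new LockRequest(
            Collections.singletonList(component),
            System.getProperty("user.name"),
            InetAddress.getLocalHost().getHostName());

        LockResponse response = client.lock(request);
        // The request may queue behind other locks; poll until it is granted.
        while (response.getState() == LockState.WAITING) {
          Thread.sleep(1000);
          response = client.checkLock(response.getLockid());
        }
        if (response.getState() != LockState.ACQUIRED) {
          throw new IllegalStateException("Lock not acquired: " + response.getState());
        }
        return response.getLockid();
      }
    }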
The other step you'll want to take is to heartbeat the lock. To stop dead
clients from holding locks forever, the DbLockManager times locks out after
300 seconds (the default; it's configurable via hive.txn.timeout). To keep
your lock alive you'll need to call IMetaStoreClient.heartbeat on a regular
basis, as sketched below.
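A sketch of that heartbeat, assuming the lock was taken without opening a
transaction (hence the txnid argument of 0) and using a 75 second interval
chosen to stay well inside the default timeout; the LockHeartbeater class
name is illustrative:

    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    import org.apache.hadoop.hive.metastore.IMetaStoreClient;

    public class LockHeartbeater {
      // Heartbeats the lock every 75 seconds, comfortably inside the 300
      // second default timeout. Call shutdownNow() on the returned executor
      // once the read has finished.
      public static ScheduledExecutorService start(final IMetaStoreClient client,
                                                   final long lockId) {
        ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(new Runnable() {
          public void run() {
            try {
              // txnid is 0 because the lock was taken without a transaction.
              client.heartbeat(0, lockId);
            } catch (Exception e) {
              // If heartbeats fail the lock may time out and be released; a
              // real implementation should abort the read rather than carry on.
              throw new RuntimeException("Lock heartbeat failed", e);
            }
          }
        }, 75, 75, TimeUnit.SECONDS);
        return executor;
      }
    }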
Alan.
Elliot West <tea...@gmail.com>
April 17, 2015 at 8:05
Hi, I'm working on a Cascading Tap that reads the data that backs a
transactional Hive table. I've successfully utilised the in-built
OrcInputFormat functionality to read and merge the deltas with the
base and optionally pull in the RecordIdentifiers. However, I'm now
considering what other steps I may need to take to collaborate with an
active Hive instance that could be writing to or compacting the table
as I'm trying to read it.
I recently became aware of the need to obtain a list of valid
transaction IDs, but now wonder whether I must also acquire a read lock
on the table. I'm thinking that the set of interactions for reading this
data may look something like the following (a combined sketch follows
the list):
1. Obtain ValidTxnList from the meta store:
org.apache.hadoop.hive.metastore.IMetaStoreClient.getValidTxns()
2. Set the ValidTxnList in the Configuration:
conf.set(ValidTxnList.VALID_TXNS_KEY, validTxnList.toString());
3. Acquire a read lock:
org.apache.hadoop.hive.metastore.IMetaStoreClient.lock(LockRequest)
4. Use OrcInputFormat to read the data
5. Finally, release the lock:
org.apache.hadoop.hive.metastore.IMetaStoreClient.unlock(long)
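Putting those five steps together, a minimal sketch might look like the
following, reusing the hypothetical ReadLockExample and LockHeartbeater
helpers sketched earlier in the thread; 'my_db' and 'my_table' are
placeholders, and the OrcInputFormat read itself is elided:

    import java.util.concurrent.ScheduledExecutorService;

    import org.apache.hadoop.hive.common.ValidTxnList;
    import org.apache.hadoop.hive.conf.HiveConf;
    import org.apache.hadoop.hive.metastore.HiveMetaStoreClient;
    import org.apache.hadoop.hive.metastore.IMetaStoreClient;

    public class TransactionalTableRead {
      public static void main(String[] args) throws Exception {
        HiveConf conf = new HiveConf();
        IMetaStoreClient client = new HiveMetaStoreClient(conf);

        // 1. Obtain the set of transactions that are valid to read.
        ValidTxnList validTxns = client.getValidTxns();

        // 2. Publish it in the Configuration so OrcInputFormat can filter
        //    out deltas from open or aborted transactions.
        conf.set(ValidTxnList.VALID_TXNS_KEY, validTxns.toString());

        // 3. Acquire the read lock and keep it alive with heartbeats.
        long lockId = ReadLockExample.acquireReadLock(client, "my_db", "my_table");
        ScheduledExecutorService heartbeater = LockHeartbeater.start(client, lockId);
        try {
          // 4. Read the base and deltas with OrcInputFormat using 'conf' (elided).
        } finally {
          // 5. Stop heartbeating and release the lock, even if the read failed.
          heartbeater.shutdownNow();
          client.unlock(lockId);
          client.close();
        }
      }
    }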
Can you advise on whether the lock is needed, whether this is the
correct way of managing the lock, and whether there are any other
steps I need to take to interact appropriately with the data
underpinning a 'live' transactional table?
Thanks - Elliot.