Hi Peter,
I have started a standalone metastore server, and it does indeed shorten that 
part of the run: hcat now connects to the metastore instead of initializing 
one. But I still have some questions.
First, I believe HCatalog should be fast, because it is a mature product and 
I have not seen others complain about this problem. Is there some 
configuration that controls whether a new session is started, or that keeps a 
session connected to the HMS? In the log below it started a new session and 
connected to the metastore twice. 
Second, I am very interested in using the HMS Thrift API, but I could not find 
an example of how to use it from C/C++ to access Hive table info. Do you know 
of a link about it?
Thank you very much for your time!

Best regards,
Hou

$ time ./hcat.py -e "use default; show table extended like haha;"
18/04/24 15:47:08 INFO conf.HiveConf: Found configuration file 
file:/usr/local/hive/conf/hive-site.xml
18/04/24 15:47:10 WARN util.NativeCodeLoader: Unable to load native-hadoop 
library for your platform... using builtin-java classes where applicable
18/04/24 15:47:10 INFO session.SessionState: Created HDFS directory: 
/tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f
18/04/24 15:47:10 INFO session.SessionState: Created local directory: 
/tmp/hive/java/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f
18/04/24 15:47:10 INFO session.SessionState: Created HDFS directory: 
/tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f/_tmp_space.db
18/04/24 15:47:10 INFO ql.Driver: Compiling 
command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5): 
use default
18/04/24 15:47:12 INFO hive.metastore: Trying to connect to metastore with URI 
thrift://localhost:9083
18/04/24 15:47:12 INFO hive.metastore: Opened a connection to metastore, 
current connections: 1
18/04/24 15:47:12 INFO hive.metastore: Connected to metastore.
18/04/24 15:47:12 INFO ql.Driver: Semantic Analysis Completed
18/04/24 15:47:12 INFO ql.Driver: Returning Hive schema: 
Schema(fieldSchemas:null, properties:null)
18/04/24 15:47:12 INFO ql.Driver: Completed compiling 
command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5); 
Time taken: 1.591 seconds
18/04/24 15:47:12 INFO ql.Driver: Concurrency mode is disabled, not creating a 
lock manager
18/04/24 15:47:12 INFO ql.Driver: Executing 
command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5): 
use default
18/04/24 15:47:12 INFO sqlstd.SQLStdHiveAccessController: Created 
SQLStdHiveAccessController for session context : HiveAuthzSessionContext 
[sessionString=6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f, clientType=HIVECLI]
18/04/24 15:47:12 WARN session.SessionState: METASTORE_FILTER_HOOK will be 
ignored, since hive.security.authorization.manager is set to instance of 
HiveAuthorizerFactory.
18/04/24 15:47:12 INFO hive.metastore: Mestastore configuration 
hive.metastore.filter.hook changed from 
org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to 
org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
18/04/24 15:47:12 INFO hive.metastore: Closed a connection to metastore, 
current connections: 0
18/04/24 15:47:12 INFO hive.metastore: Trying to connect to metastore with URI 
thrift://localhost:9083
18/04/24 15:47:12 INFO hive.metastore: Opened a connection to metastore, 
current connections: 1
18/04/24 15:47:12 INFO hive.metastore: Connected to metastore.
18/04/24 15:47:12 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
18/04/24 15:47:12 INFO ql.Driver: Completed executing 
command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5); 
Time taken: 0.119 seconds
OK
18/04/24 15:47:12 INFO ql.Driver: OK
Time taken: 1.728 seconds
18/04/24 15:47:12 INFO ql.Driver: Compiling 
command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59): 
show table extended like haha
18/04/24 15:47:12 INFO ql.Driver: Semantic Analysis Completed
18/04/24 15:47:12 INFO ql.Driver: Returning Hive schema: 
Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from 
deserializer)], properties:null)
18/04/24 15:47:12 INFO exec.ListSinkOperator: Initializing operator LIST_SINK[0]
18/04/24 15:47:12 INFO ql.Driver: Completed compiling 
command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59); 
Time taken: 0.166 seconds
18/04/24 15:47:12 INFO ql.Driver: Concurrency mode is disabled, not creating a 
lock manager
18/04/24 15:47:12 INFO ql.Driver: Executing 
command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59): 
show table extended like haha
18/04/24 15:47:12 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
18/04/24 15:47:12 INFO exec.DDLTask: pattern: haha
18/04/24 15:47:12 INFO exec.DDLTask: results : 1
18/04/24 15:47:12 INFO ql.Driver: Completed executing 
command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59); 
Time taken: 0.187 seconds
OK
18/04/24 15:47:12 INFO ql.Driver: OK
18/04/24 15:47:12 INFO Configuration.deprecation: mapred.input.dir is 
deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
18/04/24 15:47:12 INFO mapred.FileInputFormat: Total input paths to process : 1
18/04/24 15:47:12 INFO exec.ListSinkOperator: Closing operator LIST_SINK[0]
tableName:haha
owner:kousouda
location:hdfs://localhost:8020/user/hive/warehouse/haha
inputformat:org.apache.hadoop.mapred.TextInputFormat
outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
columns:struct columns { i32 id}
partitioned:false
partitionColumns:
totalNumberFiles:2
totalFileSize:4
maxFileSize:2
minFileSize:2
lastAccessTime:1524535110334
lastUpdateTime:1524535113101

Time taken: 0.394 seconds
18/04/24 15:47:12 INFO session.SessionState: Deleted directory: 
/tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f on fs with scheme hdfs
18/04/24 15:47:12 INFO session.SessionState: Deleted directory: 
/tmp/hive/java/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f on fs with scheme 
file
18/04/24 15:47:12 INFO hive.metastore: Closed a connection to metastore, 
current connections: 0 

real    0m5.593s
user    0m8.645s
sys     0m0.523s
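
To see where the remaining time goes, the gaps between consecutive log 
timestamps can be tallied. The following is a minimal Python sketch, not part 
of Hive or HCatalog: the function name slow_steps and the one-second threshold 
are made up for illustration, and it assumes every log line of interest starts 
with the yy/MM/dd HH:mm:ss prefix seen in the output above.

```python
import re
from datetime import datetime

# Matches the "yy/MM/dd HH:mm:ss" prefix used by the log lines above
# (an assumption about the log layout; adjust if your log4j pattern differs).
STAMP = re.compile(r"^(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)$")


def slow_steps(log_text, min_gap=1.0):
    """Return (gap_seconds, message) pairs for slow steps.

    A step is reported when the next timestamped line appears at least
    min_gap seconds after it. Lines without a timestamp (wrapped
    continuation lines, command output) are skipped.
    """
    steps = []
    prev = None  # (timestamp, message) of the last timestamped line
    for line in log_text.splitlines():
        match = STAMP.match(line.strip())
        if not match:
            continue
        now = datetime.strptime(match.group(1), "%y/%m/%d %H:%M:%S")
        if prev is not None:
            gap = (now - prev[0]).total_seconds()
            if gap >= min_gap:
                # Attribute the pause to the line logged before it.
                steps.append((gap, prev[1]))
        prev = (now, match.group(2))
    return steps
```

Called on the output above, for example as slow_steps(open("hcat.log").read()) 
with a hypothetical file name, this should flag the two-second pauses after the 
HiveConf line and after the first "Compiling" line. The timestamps only have 
one-second resolution, and anything that happens before the first log line, 
such as JVM startup, is not visible to it.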
> On Apr 23, 2018, at 3:57 PM, Peter Vary <pv...@cloudera.com> wrote:
> 
> Hi,
> 
> Disclaimer: I am not too familiar with the webhcat yet.
> From the logs, I see that:
> - the first 3 seconds are spent starting a new session, and maybe a driver. 
> This can be reduced if the session already exists and HiveServer2 is 
> running (but I do not know whether webhcat can use HS2 or reuse sessions). 
> This delay could be avoided by using any of the 3 solutions suggested in my 
> last mail.
> - the next 3 seconds are spent initializing the metastore. This can be 
> reduced if a standalone metastore is started and webhcat is configured to 
> access this metastore.
> 
> Hope this helps,
> Peter
> 
>> On Apr 23, 2018, at 9:27 AM, 侯宗田 <zongtian...@icloud.com> wrote:
>> 
>> Thank you very much for your reply. I am wondering whether I am using webhcat 
>> correctly. I don't think it is normal to create all those directories and 
>> objects just to get a table description, and to take 8 seconds doing it. 
>> Webhcat should not be so slow. Or is it because I forgot to start some server 
>> that could respond immediately?
>>> On Apr 23, 2018, at 3:06 PM, Peter Vary <pv...@cloudera.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Alexander Kolbasov has a project which might interest you (keeping in mind,
>>> that this is not production ready - more like a proof of concept):
>>> https://github.com/akolb1/gometastore/blob/master/hmstool/doc/hmstool.md
>>> 
>>> Also, you can use the HMS Thrift API directly to access the metastore, or,
>>> if you can and want to write Java code, you can use the HiveMetaStoreClient
>>> class to do it in Java.
>>> 
>>> I am not sure about the performance gains compared to HCat, but currently
>>> there are no faster interfaces for HMS that I know of.
>>> 
>>> Regards,
>>> Peter
>>> 
>>> 
>>> 侯宗田 <zongtian...@icloud.com> wrote (on Mon, Apr 23, 2018, 2:40):
>>> 
>>>> Can anyone give me some suggestions? I have been stuck on this problem for
>>>> several days. Need help!!
>>>>> On Apr 22, 2018, at 9:38 PM, 侯宗田 <zongtian...@icloud.com> wrote:
>>>>> 
>>>>> 
>>>>> Hi,
>>>>> 
>>>>> I am writing an application which needs the metadata of Hive tables.
>>>> I have used webhcat to get the information about tables and process it.
>>>> But a simple request takes over eight seconds to respond on localhost. Why
>>>> is this so slow, and how can I fix it? Or is there another way to extract
>>>> the metadata in C?
>>>>> 
>>>>> $ time curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/haha?user.name=ctdean'
>>>>> {"columns":
>>>>> [{"name":"id","type":"int"}],
>>>>> "database":"default",
>>>>> "table":"haha"}
>>>>> 
>>>>> real    0m8.400s
>>>>> user    0m0.053s
>>>>> sys     0m0.019s
>>>>> It seems to run hcat.py, which creates a bunch of things and then clears
>>>> them, and that takes a very long time. Does anyone have any ideas about
>>>> it? Any suggestions would be much appreciated!
>>>>> 
>>>>> $ hcat.py -e "use default; desc haha; "
>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>> SLF4J: Found binding in
>>>> [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>> SLF4J: Found binding in
>>>> [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an
>>>> explanation.
>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>> 18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file
>>>> file:/usr/local/hive/conf/hive-site.xml
>>>>> 18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load
>>>> native-hadoop library for your platform... using builtin-java classes where
>>>> applicable
>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory:
>>>> /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created local directory:
>>>> /tmp/hive/java/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory:
>>>> /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668/_tmp_space.db
>>>>> 18/04/21 16:38:16 INFO ql.Driver: Compiling
>>>> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62):
>>>> use default
>>>>> 18/04/21 16:38:17 INFO metastore.HiveMetaStore: 0: Opening raw store
>>>> with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>> 18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize
>>>> called
>>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property
>>>> hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property
>>>> datanucleus.cache.level2 unknown - will be ignored
>>>>> 18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object
>>>> pin classes with
>>>> hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL,
>>>> underlying DB is MYSQL
>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added admin role in
>>>> metastore
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added public role in
>>>> metastore
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: No user is added in
>>>> admin role, since config is empty
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_all_functions
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=get_all_functions
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=get_database: default
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Semantic Analysis Completed
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Returning Hive schema:
>>>> Schema(fieldSchemas:null, properties:null)
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed compiling
>>>> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62);
>>>> Time taken: 3.936 seconds
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Concurrency mode is disabled, not
>>>> creating a lock manager
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Executing
>>>> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62):
>>>> use default
>>>>> 18/04/21 16:38:20 INFO sqlstd.SQLStdHiveAccessController: Created
>>>> SQLStdHiveAccessController for session context : HiveAuthzSessionContext
>>>> [sessionString=05096382-f9b6-4dae-aee2-dfa6750c0668, clientType=HIVECLI]
>>>>> 18/04/21 16:38:20 WARN session.SessionState: METASTORE_FILTER_HOOK will
>>>> be ignored, since hive.security.authorization.manager is set to instance of
>>>> HiveAuthorizerFactory.
>>>>> 18/04/21 16:38:20 INFO hive.metastore: Mestastore configuration
>>>> hive.metastore.filter.hook changed from
>>>> org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to
>>>> org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Cleaning up thread
>>>> local RawStore...
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=Cleaning up thread local RawStore...
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Done cleaning up
>>>> thread local RawStore
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=Done cleaning up thread local RawStore
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Starting task [Stage-0:DDL] in serial
>>>> mode
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=get_database: default
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Opening raw store
>>>> with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: ObjectStore, initialize
>>>> called
>>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL,
>>>> underlying DB is MYSQL
>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda
>>>> ip=unknown-ip-addr      cmd=get_database: default
>>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed executing
>>>> command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62);
>>>> Time taken: 0.202 seconds
>>>>> OK
>>>> 
>>>> 
>> 
> 
