Hi Hou,

Kudu uses the Thrift HMS interface and is written in C++. An example can be found here: https://github.com/apache/kudu/tree/master/src/kudu/hms

As for parametrizing HCatalog, I have only found this: https://cwiki.apache.org/confluence/display/Hive/HCatalog+Configuration+Properties

But I have not found anything there that might help you.

Peter

> On Apr 24, 2018, at 10:51 AM, 侯宗田 <zongtian...@icloud.com> wrote:
>
> Hi, Peter:
> I have started a standalone metastore server, and it did indeed cut that part of the time: hcat now connects to the metastore instead of initializing it. But I still have some questions.
> First, I believe HCatalog must be quick, because it is a mature product and I have not seen others complaining about this problem. Is there some configuration that controls starting a new session, or a way to keep a session connected to the HMS? In the log below it started a new session and connected twice.
> Second, I am very interested in using the HMS Thrift API, but I could not find an example of how to use it from C/C++ to access Hive table info. Do you know of a link about it?
> Thank you very much for your time!
>
> Best regards,
> Hou
>
> $ time ./hcat.py -e "use default; show table extended like haha;"
> 18/04/24 15:47:08 INFO conf.HiveConf: Found configuration file file:/usr/local/hive/conf/hive-site.xml
> 18/04/24 15:47:10 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
> 18/04/24 15:47:10 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f
> 18/04/24 15:47:10 INFO session.SessionState: Created local directory: /tmp/hive/java/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f
> 18/04/24 15:47:10 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f/_tmp_space.db
> 18/04/24 15:47:10 INFO ql.Driver: Compiling command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5): use default
> 18/04/24 15:47:12 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost:9083
> 18/04/24 15:47:12 INFO hive.metastore: Opened a connection to metastore, current connections: 1
> 18/04/24 15:47:12 INFO hive.metastore: Connected to metastore.
> 18/04/24 15:47:12 INFO ql.Driver: Semantic Analysis Completed
> 18/04/24 15:47:12 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
> 18/04/24 15:47:12 INFO ql.Driver: Completed compiling command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5); Time taken: 1.591 seconds
> 18/04/24 15:47:12 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
> 18/04/24 15:47:12 INFO ql.Driver: Executing command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5): use default
> 18/04/24 15:47:12 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f, clientType=HIVECLI]
> 18/04/24 15:47:12 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
> 18/04/24 15:47:12 INFO hive.metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
> 18/04/24 15:47:12 INFO hive.metastore: Closed a connection to metastore, current connections: 0
> 18/04/24 15:47:12 INFO hive.metastore: Trying to connect to metastore with URI thrift://localhost:9083
> 18/04/24 15:47:12 INFO hive.metastore: Opened a connection to metastore, current connections: 1
> 18/04/24 15:47:12 INFO hive.metastore: Connected to metastore.
> 18/04/24 15:47:12 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
> 18/04/24 15:47:12 INFO ql.Driver: Completed executing command(queryId=kousouda_20180424154710_e0443fb2-3930-4dc3-9965-25a9f98807a5); Time taken: 0.119 seconds
> OK
> 18/04/24 15:47:12 INFO ql.Driver: OK
> Time taken: 1.728 seconds
> 18/04/24 15:47:12 INFO ql.Driver: Compiling command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59): show table extended like haha
> 18/04/24 15:47:12 INFO ql.Driver: Semantic Analysis Completed
> 18/04/24 15:47:12 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:tab_name, type:string, comment:from deserializer)], properties:null)
> 18/04/24 15:47:12 INFO exec.ListSinkOperator: Initializing operator LIST_SINK[0]
> 18/04/24 15:47:12 INFO ql.Driver: Completed compiling command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59); Time taken: 0.166 seconds
> 18/04/24 15:47:12 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
> 18/04/24 15:47:12 INFO ql.Driver: Executing command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59): show table extended like haha
> 18/04/24 15:47:12 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
> 18/04/24 15:47:12 INFO exec.DDLTask: pattern: haha
> 18/04/24 15:47:12 INFO exec.DDLTask: results : 1
> 18/04/24 15:47:12 INFO ql.Driver: Completed executing command(queryId=kousouda_20180424154712_99e6e25d-0505-44f1-a429-5ce45b0cae59); Time taken: 0.187 seconds
> OK
> 18/04/24 15:47:12 INFO ql.Driver: OK
> 18/04/24 15:47:12 INFO Configuration.deprecation: mapred.input.dir is deprecated. Instead, use mapreduce.input.fileinputformat.inputdir
> 18/04/24 15:47:12 INFO mapred.FileInputFormat: Total input paths to process : 1
> 18/04/24 15:47:12 INFO exec.ListSinkOperator: Closing operator LIST_SINK[0]
> tableName:haha
> owner:kousouda
> location:hdfs://localhost:8020/user/hive/warehouse/haha
> inputformat:org.apache.hadoop.mapred.TextInputFormat
> outputformat:org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
> columns:struct columns { i32 id}
> partitioned:false
> partitionColumns:
> totalNumberFiles:2
> totalFileSize:4
> maxFileSize:2
> minFileSize:2
> lastAccessTime:1524535110334
> lastUpdateTime:1524535113101
>
> Time taken: 0.394 seconds
> 18/04/24 15:47:12 INFO session.SessionState: Deleted directory: /tmp/hive/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f on fs with scheme hdfs
> 18/04/24 15:47:12 INFO session.SessionState: Deleted directory: /tmp/hive/java/kousouda/6c7e97ad-c9dd-4c5e-9636-ab9d4e47d76f on fs with scheme file
> 18/04/24 15:47:12 INFO hive.metastore: Closed a connection to metastore, current connections: 0
>
> real 0m5.593s
> user 0m8.645s
> sys 0m0.523s
>> On Apr 23, 2018, at 3:57 PM, Peter Vary <pv...@cloudera.com> wrote:
>>
>> Hi,
>>
>> Disclaimer: I am not too familiar with WebHCat yet.
>> From the logs, I see that:
>> - The first 3 seconds are spent on starting a new session, and maybe a driver. This can be reduced if the session is already there and HiveServer2 is started (but I do not know whether WebHCat can use HS2 or reuse sessions). This delay can be avoided if you use any of the 3 solutions suggested in my last mail.
>> - The next 3 seconds are spent on initializing the metastore. This can be reduced if a standalone metastore is started and WebHCat is configured to access it.
>>
>> Hope this helps,
>> Peter
>>
>>> On Apr 23, 2018, at 9:27 AM, 侯宗田 <zongtian...@icloud.com> wrote:
>>>
>>> Thank you very much for your reply. I am wondering whether I am using WebHCat correctly. I don't think it is normal to create all those directories and objects just to get a table description, and to take 8 seconds doing it. WebHCat should not be so slow. Or is it because I forgot to start some server that could respond immediately?
>>>> On Apr 23, 2018, at 3:06 PM, Peter Vary <pv...@cloudera.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Alexander Kolbasov has a project which might interest you (keeping in mind that this is not production ready - more like a proof of concept):
>>>> https://github.com/akolb1/gometastore/blob/master/hmstool/doc/hmstool.md
>>>>
>>>> You can also use the HMS Thrift API directly to access the metastore, or, if you can and want to write Java code, you can use the HiveMetaStoreClient class to do it in Java.
>>>>
>>>> I am not sure about the performance gains compared to HCat, but currently there are no faster interfaces for HMS that I know of.
>>>>
>>>> Regards,
>>>> Peter
>>>>
>>>>
>>>> 侯宗田 <zongtian...@icloud.com> wrote (on Mon, Apr 23, 2018, 2:40):
>>>>
>>>>> Can anyone give me some suggestions? I have been stuck on this problem for several days. Need help!!
>>>>>> On Apr 22, 2018, at 9:38 PM, 侯宗田 <zongtian...@icloud.com> wrote:
>>>>>>
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I am writing an application that needs the metadata about Hive tables. I have used WebHCat to get the information about tables and process them. But a simple request takes over eight seconds to respond on localhost. Why is this so slow, and how can I fix it? Or is there another way I can extract the metadata in C?
>>>>>>
>>>>>> $ time curl -s 'http://localhost:50111/templeton/v1/ddl/database/default/table/haha?user.name=ctdean'
>>>>>> {"columns":
>>>>>> [{"name":"id","type":"int"}],
>>>>>> "database":"default",
>>>>>> "table":"haha"}
>>>>>>
>>>>>> real 0m8.400s
>>>>>> user 0m0.053s
>>>>>> sys 0m0.019s
>>>>>> It seems to run hcat.py, which creates a bunch of things and then clears them. It takes a very long time. Does anyone have any ideas about it? Any suggestions would be much appreciated!
>>>>>>
>>>>>> $ hcat.py -e "use default; desc haha; "
>>>>>> SLF4J: Class path contains multiple SLF4J bindings.
>>>>>> SLF4J: Found binding in [jar:file:/usr/local/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: Found binding in [jar:file:/usr/local/hive/lib/log4j-slf4j-impl-2.6.2.jar!/org/slf4j/impl/StaticLoggerBinder.class]
>>>>>> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
>>>>>> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
>>>>>> 18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file file:/usr/local/hive/conf/hive-site.xml
>>>>>> 18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
>>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created local directory: /tmp/hive/java/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668
>>>>>> 18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory: /tmp/hive/kousouda/05096382-f9b6-4dae-aee2-dfa6750c0668/_tmp_space.db
>>>>>> 18/04/21 16:38:16 INFO ql.Driver: Compiling command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): use default
>>>>>> 18/04/21 16:38:17 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>>> 18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property hive.metastore.integral.jdo.pushdown unknown - will be ignored
>>>>>> 18/04/21 16:38:18 INFO DataNucleus.Persistence: Property datanucleus.cache.level2 unknown - will be ignored
>>>>>> 18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object pin classes with hive.metastore.cache.pinobjtypes="Table,StorageDescriptor,SerDeInfo,Partition,Database,Type,FieldSchema,Order"
>>>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
>>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added admin role in metastore
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: Added public role in metastore
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: No user is added in admin role, since config is empty
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_all_functions
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_all_functions
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Semantic Analysis Completed
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Returning Hive schema: Schema(fieldSchemas:null, properties:null)
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed compiling command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62); Time taken: 3.936 seconds
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Concurrency mode is disabled, not creating a lock manager
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Executing command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62): use default
>>>>>> 18/04/21 16:38:20 INFO sqlstd.SQLStdHiveAccessController: Created SQLStdHiveAccessController for session context : HiveAuthzSessionContext [sessionString=05096382-f9b6-4dae-aee2-dfa6750c0668, clientType=HIVECLI]
>>>>>> 18/04/21 16:38:20 WARN session.SessionState: METASTORE_FILTER_HOOK will be ignored, since hive.security.authorization.manager is set to instance of HiveAuthorizerFactory.
>>>>>> 18/04/21 16:38:20 INFO hive.metastore: Mestastore configuration hive.metastore.filter.hook changed from org.apache.hadoop.hive.metastore.DefaultMetaStoreFilterHookImpl to org.apache.hadoop.hive.ql.security.authorization.plugin.AuthorizationMetaStoreFilterHook
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Cleaning up thread local RawStore...
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=Cleaning up thread local RawStore...
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Done cleaning up thread local RawStore
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=Done cleaning up thread local RawStore
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Starting task [Stage-0:DDL] in serial mode
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: Opening raw store with implementation class:org.apache.hadoop.hive.metastore.ObjectStore
>>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: ObjectStore, initialize called
>>>>>> 18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL
>>>>>> 18/04/21 16:38:20 INFO metastore.ObjectStore: Initialized ObjectStore
>>>>>> 18/04/21 16:38:20 INFO metastore.HiveMetaStore: 0: get_database: default
>>>>>> 18/04/21 16:38:20 INFO HiveMetaStore.audit: ugi=kousouda ip=unknown-ip-addr cmd=get_database: default
>>>>>> 18/04/21 16:38:20 INFO ql.Driver: Completed executing command(queryId=kousouda_20180421163816_58c38a44-25e3-4665-8bb5-a9b17fdf2d62); Time taken: 0.202 seconds
>>>>>> OK
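
The thread locates the slow phases by reading the log timestamps by eye. The same check can be scripted by computing the gaps between consecutive timestamps. What follows is only a minimal sketch in Python, not part of Hive or HCatalog: the helper name largest_gaps is hypothetical, the sample lines are abridged from the first hcat.py run quoted in this thread, and it assumes every line of interest starts with a yy/MM/dd HH:mm:ss timestamp, optionally preceded by mail quoting.

```python
import re
from datetime import datetime

# Matches the "yy/MM/dd HH:mm:ss" prefix used by the Hive/Hadoop log lines in this thread.
TS_RE = re.compile(r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (.*)")


def largest_gaps(lines, top=3):
    """Return (gap_seconds, message_after_gap) pairs, largest gap first.

    Lines without a timestamp (e.g. SLF4J notices or "OK") are skipped.
    Hypothetical helper, for illustration only.
    """
    events = []
    for line in lines:
        # Drop mail quoting such as ">>>>>> " before looking for the timestamp.
        m = TS_RE.match(line.lstrip("> "))
        if m:
            ts = datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S")
            events.append((ts, m.group(2)))
    gaps = [
        ((cur[0] - prev[0]).total_seconds(), cur[1])
        for prev, cur in zip(events, events[1:])
    ]
    # sorted() is stable, so equal gaps keep their original log order.
    return sorted(gaps, key=lambda g: g[0], reverse=True)[:top]


# Abridged lines from the first hcat.py run quoted in this thread.
sample = [
    "18/04/21 16:38:13 INFO conf.HiveConf: Found configuration file",
    "18/04/21 16:38:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library",
    "18/04/21 16:38:16 INFO session.SessionState: Created HDFS directory",
    "18/04/21 16:38:17 INFO metastore.ObjectStore: ObjectStore, initialize called",
    "18/04/21 16:38:18 INFO metastore.ObjectStore: Setting MetaStore object pin classes",
    "18/04/21 16:38:20 INFO metastore.MetaStoreDirectSql: Using direct SQL, underlying DB is MYSQL",
]

for seconds, message in largest_gaps(sample):
    print(f"{seconds:4.0f}s before: {message}")
```

Because the timestamps have one-second resolution, the gaps are only approximate; they point at which phase to look at rather than give exact timings.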