Re: HCatalog access from a Java app

2014-06-23 Thread Dmitry Vasilenko
Missed the last line for #2:
HCatSchema schema = HCatInputFormat.getTableSchema(job.getConfiguration());
On Mon, Jun 23, 2014 at 8:11 AM, Dmitry Vasilenko wrote:
> Hi Brian,
>
> 1. To enumerate databases and tables and to get the Hive table schema you
> can use the code I provided earlier.

Re: HCatalog access from a Java app

2014-06-23 Thread Dmitry Vasilenko
Hi Brian,
1. To enumerate databases and tables and to get the Hive table schema you can use the code I provided earlier.
2. To get the HCatalog flavor of the table schema you will use something like this:
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.InputSplit;

Re: HCatalog access from a Java app

2014-06-21 Thread Brian Jeltema
I’m also experimenting with version 0.13, and see that it differs from 0.12 significantly. Can you give me a code example for 0.13?
Thanks
Brian
On Jun 13, 2014, at 9:25 AM, Brian Jeltema wrote:
> Version 0.12.0.
>
> I’d like to obtain the table’s schema, scan a table partition, and use the

Re: HCatalog access from a Java app

2014-06-16 Thread Dmitry Vasilenko
> - does each slave read all of the splits, or is the master process responsible for obtaining the...
Here is the workflow I am using. I cannot provide implementation details, but I hope it will give you enough information to proceed:
1. On the master node you will have to create a custom Hadoop
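To make the master/slave question concrete, here is a minimal plain-Java sketch of one possible arrangement: the master plans the read once, ships the serialized plan, and each slave reads only its own share of the splits. It does not use HCatalog at all. `ReadPlan`, `serializePlan`, `deserializePlan`, `splitsForSlave`, the string split descriptors, and the round-robin assignment are all hypothetical choices made for illustration, not the HCatalog API and not necessarily the workflow described in the message.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in types: none of these names come from HCatalog.
final class SplitAssignmentSketch {

    /** What the master ships to every slave: the split descriptors it planned. */
    static final class ReadPlan implements Serializable {
        private static final long serialVersionUID = 1L;
        final ArrayList<String> splits;

        ReadPlan(List<String> splits) {
            this.splits = new ArrayList<>(splits);
        }
    }

    /** Master side: serialize the plan so it can be sent over any transport. */
    static byte[] serializePlan(ReadPlan plan) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(plan);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Slave side: rebuild the plan from the bytes received from the master. */
    static ReadPlan deserializePlan(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ReadPlan) in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Slave side: slave k of n takes splits k, k+n, k+2n, ... (round-robin, an assumption). */
    static List<String> splitsForSlave(ReadPlan plan, int slaveIndex, int slaveCount) {
        if (slaveCount <= 0 || slaveIndex < 0 || slaveIndex >= slaveCount) {
            throw new IllegalArgumentException("invalid slave index or count");
        }
        List<String> mine = new ArrayList<>();
        for (int i = slaveIndex; i < plan.splits.size(); i += slaveCount) {
            mine.add(plan.splits.get(i));
        }
        return mine;
    }

    public static void main(String[] args) {
        // The master plans once; the split names are placeholders.
        ReadPlan plan = new ReadPlan(Arrays.asList("split-0", "split-1", "split-2", "split-3", "split-4"));
        byte[] wire = serializePlan(plan);

        // Each slave deserializes the same plan but reads only its own splits.
        int slaveCount = 2;
        for (int slave = 0; slave < slaveCount; slave++) {
            System.out.println("slave " + slave + " reads " + splitsForSlave(deserializePlan(wire), slave, slaveCount));
        }
    }
}
```

In this arrangement no slave enumerates splits on its own: every slave sees the same plan, and the index it is given decides which part of the plan it reads.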

Re: HCatalog access from a Java app

2014-06-16 Thread Brian Jeltema
Thanks. I’d already implemented something like this based on some docs I found. I’m a little confused about the scenario for reading the splits on slaves:
- does each slave read all of the splits, or is the master process responsible for obtaining the list of splits and then modifying the

Re: HCatalog access from a Java app

2014-06-16 Thread Dmitry Vasilenko
Here is the code sketch to get you started:
Step 1. Create a builder:
ReadEntity.Builder builder = new ReadEntity.Builder();
String database = ...
builder.withDatabase(database);
String table = ...
builder.withTable(table);
String filter = ...
if (filter != null) {
    builder.withFilter(filter);
}
S

Re: HCatalog access from a Java app

2014-06-16 Thread Brian Jeltema
regarding:
> 3. To read the HCat records
>
> It depends on how you'd like to read the records ... will you be reading ALL
> the records remotely from the client app
> or you will get input splits and read the records on mappers???
>
> The code will be different (somewhat)... let me kn

Re: HCatalog access from a Java app

2014-06-14 Thread Lefty Leverenz
Excluding HCatalog JavaDocs was a production error in some of the Hive releases after HCatalog graduated from the Apache incubator and merged with Hive, but the HCatalog API has always been public.
- Pre-merge HCatalog 0.5.0 JavaDocs are here: http://hive.apache.org/javadocs/hcat-r0.5.0/api/

Re: HCatalog access from a Java app

2014-06-13 Thread Dmitry Vasilenko
BTW, you can also get the Hive schema and partitions (using the code from #1):
Table table = hiveMetastoreClient.getTable(databaseName, tableName);
List<FieldSchema> schema = hiveMetastoreClient.getSchema(databaseName, tableName);
List<FieldSchema> partitions = table.getPartitionKeys();
The HCat and Hive APIs for the schem

Re: HCatalog access from a Java app

2014-06-13 Thread Dmitry Vasilenko
Please take a look at http://stackoverflow.com/questions/22630323/hadoop-java-lang-incompatibleclasschangeerror-found-interface-org-apache-hadoo
On Fri, Jun 13, 2014 at 9:53 AM, Brian Jeltema <brian.jelt...@digitalenvoy.net> wrote:
> Doing this, with the appropriate substitutions for my table

Re: HCatalog access from a Java app

2014-06-13 Thread Brian Jeltema
Doing this, with the appropriate substitutions for my table, jarClass, etc:
> 2. To get the table schema... I assume that you are after HCat schema
>
>
> import org.apache.hadoop.conf.Configuration;
> import org.apache.hadoop.mapreduce.InputSplit;
> import org.apache.hadoop.mapreduce.Job;
> im

Re: HCatalog access from a Java app

2014-06-13 Thread Dmitry Vasilenko
I am not sure about java docs... ;-] I have spent the last three years integrating with HCat, and to make it work I had to go through the code... So here are some samples that can be helpful to start with. If you are using Hive 0.12.0 I would not bother with the new APIs... I had to create some shim cla

Re: HCatalog access from a Java app

2014-06-13 Thread Brian Jeltema
Version 0.12.0. I’d like to obtain the table’s schema, scan a table partition, and use the schema to parse the rows. I can probably figure this out by looking at the HCatalog source. My concern was that the HCatalog packages in the Hive distributions are excluded in the JavaDoc, which implies

Re: HCatalog access from a Java app

2014-06-13 Thread Dmitry Vasilenko
You should be able to access this information. The exact API depends on the version of Hive/HCat. As you know, the earlier HCat API is being deprecated and will be removed in Hive 0.14.0. I can provide you with a code sample if you tell me what you are trying to do and what version of Hive you are usi

HCatalog access from a Java app

2014-06-13 Thread Brian Jeltema
I’m experimenting with HCatalog, and would like to be able to access tables and their schema from a Java application (not Hive/Pig/MapReduce). However, the API seems to be hidden, which leads me to believe that this is not a supported use case. Is HCatalog use limited to one of the support