Hi,

I am a beginner with cascading-hive. Using cascading-hive, I want to load data into a Hive table that is stored in Parquet format. My data contains one field, which is a date. I have created the Hive table in Parquet format, but when I try to load date data into this table, it does not work. In the sink I use a HiveTap, and I have mapped the field as Binary (String) because a Date datatype is not available in Cascading-parquet. I have tried the sample code below.
Code:

    public class ReadText_StoredIn_Parquet_Date {

        static String inpath = "parquet_input/ReadText_StoredIn_Parquet_Date.txt";

        public static void main(String[] args) {
            Properties properties = new Properties();
            AppProps.setApplicationJarClass(properties, TestExample.class);
            AppProps.addApplicationTag(properties, "Cascading-HiveDemoPart1");

            // Source: text file with a header line and a single "dob" column
            Scheme sourceSch = new TextDelimited(new Fields("dob"), true, "\n");
            Tap inTapCallCenter = new Hfs(sourceSch, inpath);

            // Sink: Hive table hive_parquet.parquet_date with one column of type date
            String columnFields[] = {"dob"};
            String columnType[] = {"date"};
            String databaseName = "hive_parquet";
            String tableName = "parquet_date";
            HiveTableDescriptor sinkTableDescriptor =
                new HiveTableDescriptor(databaseName, tableName, columnFields, columnType);

            // The Parquet schema declares dob as Binary, because no Date type is available
            ParquetTupleScheme scheme = new ParquetTupleScheme(
                new Fields(columnFields), new Fields(columnFields),
                "message ReadText_Parquet_string_int{optional Binary dob; }");
            HiveTap sinkTap = new HiveTap(sinkTableDescriptor, scheme, SinkMode.REPLACE, true);

            Pipe copyPipe = new Pipe("copyPipe");
            FlowDef def = FlowDef.flowDef()
                .addSource(copyPipe, inTapCallCenter)
                .addTailSink(copyPipe, sinkTap);
            new Hadoop2MR1FlowConnector(properties).connect(def).complete();
        }
    }

This code runs without errors and loads data into the table (stored in Parquet format). But when I read the data back, Hive throws an exception. I have used ParquetTupleScheme to generate the scheme for the HiveTap. I have used the following queries.

Query:

    hive (hive_parquet)> create table parquet_date(dob date) stored as parquet;
    hive (hive_parquet)> select * from parquet_date;
    Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable

Can you please advise how to store a date value in a Hive table (stored as Parquet), using ParquetTupleScheme or any other way?

Currently I am using: hive-1.2.0 hadoop

Thanks,
Santlal J.
Gupta
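
A note on the type mismatch in the exception above: in the Parquet format, the DATE logical type annotates a 32-bit integer holding the number of days since the Unix epoch (1970-01-01), which in a schema string would be written as `optional int32 dob (DATE);`, whereas the sample writes the column as Binary. The following is only a minimal sketch of that day-count conversion, assuming the input dates are ISO `yyyy-MM-dd` strings and that Java 8's `java.time` is available; the class and method names are hypothetical, and whether ParquetTupleScheme in the version used here honours the DATE annotation is not something this sketch verifies.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of Cascading, Hive, or Parquet.
public class ParquetDateSketch {

    // Converts an ISO date string (yyyy-MM-dd) to days since 1970-01-01,
    // the integer representation that Parquet's DATE annotation describes.
    public static int toEpochDays(String isoDate) {
        return Math.toIntExact(LocalDate.parse(isoDate).toEpochDay());
    }

    // Converts a day count back to an ISO date string.
    public static String fromEpochDays(int days) {
        return LocalDate.ofEpochDay(days).toString();
    }

    public static void main(String[] args) {
        int days = toEpochDays("2000-01-01");
        System.out.println(days + " -> " + fromEpochDays(days));
    }
}
```

In a flow like the one above, such a conversion would have to run before the sink (for example in a custom function on the pipe) so that the tuple carries an integer instead of a string; that wiring is left out here because it depends on the Cascading and parquet-cascading versions in use.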