Hi,

I am a beginner with cascading-hive.
Using cascading-hive, I want to load data into a Hive table that is stored in
Parquet format. My data contains one field, which is a date. I have created
the Hive table in Parquet format, but when I tried to load the date data into
it, the load failed. In the sink I use a HiveTap, and I have mapped the field
to Binary (String), because a Date datatype is not available in
Cascading-parquet.
I have tried the following sample code.

Code:

public class ReadText_StoredIn_Parquet_Date {

        static String inpath = "parquet_input/ReadText_StoredIn_Parquet_Date.txt";

        public static void main(String[] args) {

                Properties properties = new Properties();
                AppProps.setApplicationJarClass( properties, TestExample.class );
                AppProps.addApplicationTag( properties, "Cascading-HiveDemoPart1" );

                Scheme sourceSch = new TextDelimited( new Fields("dob"), true, "\n" );
                Tap inTapCallCenter = new Hfs( sourceSch, inpath );

                String columnFields[] = {"dob"};
                String columnType[] = {"date"};
                String databaseName = "hive_parquet";
                String tableName = "parquet_date";

                HiveTableDescriptor sinkTableDescriptor = new HiveTableDescriptor(
                                databaseName, tableName, columnFields, columnType );

                ParquetTupleScheme scheme = new ParquetTupleScheme(
                                new Fields(columnFields), new Fields(columnFields),
                                "message ReadText_Parquet_string_int{optional Binary dob; }" );

                HiveTap sinkTap = new HiveTap( sinkTableDescriptor, scheme, SinkMode.REPLACE, true );
                Pipe copyPipe = new Pipe( "copyPipe" );

                FlowDef def = FlowDef.flowDef()
                                .addSource( copyPipe, inTapCallCenter )
                                .addTailSink( copyPipe, sinkTap );
                new Hadoop2MR1FlowConnector( properties ).connect( def ).complete();
        }
}

This code runs fine and loads the data into the table (stored in Parquet
format), but when I read the data back, Hive throws an exception.
I used ParquetTupleScheme to build the scheme for the HiveTap.

I used the following queries.

Query:

hive (hive_parquet)> create table parquet_date(dob date) stored as parquet;
hive (hive_parquet)> select * from parquet_date;

Failed with exception java.io.IOException:org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.ClassCastException: org.apache.hadoop.io.BytesWritable cannot be cast to org.apache.hadoop.hive.serde2.io.DateWritable


Can you please help me store date values in a Hive table (stored as Parquet)
by using ParquetTupleScheme, or in any other way?
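
For context, the Parquet format specification defines DATE as an annotation on
int32 that holds the number of days since the Unix epoch (1970-01-01), whereas
the schema above declares dob as binary. The sketch below only shows how a date
string could be converted to that day count in plain Java. The class and method
names are hypothetical, the input is assumed to be in ISO yyyy-MM-dd form, and
whether ParquetTupleScheme and Hive 1.2.0 accept such an int32 (DATE) column
has not been verified here.

```java
import java.time.LocalDate;

// Hypothetical helper, not part of Cascading, cascading-hive, or Parquet.
public class DateToEpochDays {

    // Converts an ISO-8601 date string (yyyy-MM-dd) to the number of days
    // since 1970-01-01, which is the value Parquet's DATE annotation stores.
    // Throws java.time.format.DateTimeParseException on malformed input.
    static int toEpochDays(String isoDate) {
        LocalDate date = LocalDate.parse(isoDate.trim());
        // toEpochDay() returns a long; toIntExact fails loudly on overflow.
        return Math.toIntExact(date.toEpochDay());
    }

    public static void main(String[] args) {
        // Example input is an assumption; the real file contents are not shown.
        String dob = "2000-01-01";
        System.out.println(dob + " -> " + toEpochDays(dob));
    }
}
```

The resulting integer is what an int32 column annotated with DATE would be
expected to contain, in place of the raw string bytes.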


Currently I am using:
hive-1.2.0
hadoop

Thanks,
Santlal J. Gupta