Hi Roshan,

The following snippet summarizes the delimiters for your Hive table:

colelction.delim        \u0002
field.delim             \u0001
mapkey.delim            \u0003
serialization.format    \u0001

Your fields are delimited by \u0001, collections are delimited by \u0002 and the delimiter between the key and value in any maps is \u0003. Can you verify that your XML content doesn't contain any of these characters?

If this still doesn't help, could you pick an affected row and share what the XML appears as in Hive and what it is expected to be?

Good luck!
Mark

Mark Grover, Business Intelligence Analyst
OANDA Corporation
www: oanda.com www: fxtrade.com

----- Original Message -----
From: "mperformer" <codevally.mail.l...@gmail.com>
To: user@hive.apache.org
Sent: Sunday, May 6, 2012 11:34:55 PM
Subject: Re: Data are not displayed correctly on hive tables

Hi Mark

Many thanks for your reply. Please find the below output.

hive> describe formatted messagetemplate;
OK
# col_name                    data_type    comment

messagetemplateid             bigint       None
messagetemplatename           string       None
datacol                       string       None
messagetemplatetype           string       None
messagetype                   string       None
messagetemplatedescription    string       None
originatingtemplateid         bigint       None
edited                        boolean      None
userid                        bigint       None
projectid                     bigint       None
responsetemplateid            bigint       None

# Detailed Table Information
Database:           default
Owner:              root
CreateTime:         Mon May 07 12:06:59 EST 2012
LastAccessTime:     UNKNOWN
Protect Mode:       None
Retention:          0
Location:           hdfs://app6:9100/mnt/hive-test/warehouse/messagetemplate
Table Type:         MANAGED_TABLE
Table Parameters:
    comment                  This is the messagetemplate table
    transient_lastDdlTime    1336356473

# Storage Information
SerDe Library:      org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
InputFormat:        org.apache.hadoop.mapred.TextInputFormat
OutputFormat:       org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
Compressed:         No
Num Buckets:        -1
Bucket Columns:     []
Sort Columns:       []
Storage Desc Params:
    colelction.delim        \u0002
    field.delim             \u0001
    mapkey.delim            \u0003
    serialization.format    \u0001
Time taken: 3.2 seconds

Thanks again.
./Roshan.

On Mon, May 7, 2012 at 1:06 PM, Mark Grover <mgro...@oanda.com> wrote:

Could you share the output of the following command in Hive:

describe formatted messagetemplate

My hunch is that your Hive table is using a delimiter (e.g. '\t') that appears in the content of your XML.

Mark Grover, Business Intelligence Analyst
OANDA Corporation
www: oanda.com www: fxtrade.com

----- Original Message -----
From: "mperformer" <codevally.mail.l...@gmail.com>
To: user@hive.apache.org
Sent: Sunday, May 6, 2012 8:34:27 PM
Subject: Data are not displayed correctly on hive tables

Hi

I am using

• Hadoop 0.20.2
• Hive 0.8.1
• Sqoop 1.4.1-incubating

in my sample project.

Currently I am importing data from PostgreSQL into a Hive table using Sqoop. My database table in PostgreSQL has 4 columns, and one column stores a fairly large XML document as the TEXT data type. The same column is defined in Hive as string, but the data after that column is not imported and shows as null.

Table structure in PostgreSQL

CREATE TABLE public.messagetemplate (
    messagetemplateid BIGSERIAL,
    messagetemplatename TEXT,
    data TEXT,
    messagetemplatetype TEXT,
    CONSTRAINT pk_messagetemplate PRIMARY KEY(messagetemplateid)
) WITHOUT OIDS;

Table structure in Hive

hive> desc messagetemplate;
OK
messagetemplateid      bigint
messagetemplatename    string
data                   string
messagetemplatetype    string

The data column stores the XML document as text, and during the import into Hive all the data is imported properly (I checked the files in HDFS). But a Hive SELECT statement shows only a small part of the XML text, and the remaining column (the last column) is null.

Could someone please help me to sort this out?

Thanks.
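
The check suggested in the reply at the top of this thread (verifying that the XML content contains none of the table's delimiter characters) could be sketched roughly as below. This is only an illustration, not something posted in the thread: the function name `find_delimiters` and the sample string are hypothetical, and the newline entry is an added assumption based on the table using TextInputFormat, which reads one record per line. The three \u000N characters are taken from the "Storage Desc Params" output.

```python
# Characters that can break a row under this table's storage settings.
# The first three come from the "Storage Desc Params" output in the thread.
# The newline entry is an assumption added here: TextInputFormat reads one
# record per line, so an embedded newline would also split a row.
HIVE_DELIMITERS = {
    "\u0001": "field.delim",
    "\u0002": "colelction.delim",  # spelled as Hive itself prints it
    "\u0003": "mapkey.delim",
    "\n": "row terminator (assumed)",
}


def find_delimiters(text):
    """Return {delimiter name: occurrence count} for delimiters found in text."""
    return {
        name: text.count(char)
        for char, name in HIVE_DELIMITERS.items()
        if char in text
    }


if __name__ == "__main__":
    # Hypothetical XML value standing in for one row of the source column.
    sample_xml = "<template>\n  <name>greeting</name>\n</template>"
    print(find_delimiters(sample_xml))
```

An empty result for every value of the XML column would suggest the delimiters are not the cause; a non-empty result points at the rows worth inspecting first.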