ClassCastException: org.apache.hadoop.io.LongWritable cannot be cast to org.apache.hadoop.hive.serde2.io.TimestampWritable

2018-07-25 Thread Dmitry Goldenberg
Hi, I apologize for the wide distribution and if this is not the right mailing list for this. We write Avro files to Parquet and load them to HDFS so they can be accessed via an EXTERNAL Hive table. These records have two timestamp fields which are expressed in the Avro schema as type = long and
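
The exception in the subject line means Hive read a 64-bit integer where the table definition expects a TIMESTAMP. The snippet above is cut off, so the unit of the two long fields is not stated. The Python sketch below assumes they hold epoch milliseconds in UTC and shows the conversion to the `yyyy-MM-dd HH:mm:ss.SSS` text form used for Hive timestamps. The function name `epoch_millis_to_hive_timestamp` is hypothetical, and this is only an illustration of the conversion, not the poster's actual fix.

```python
from datetime import datetime, timezone


def epoch_millis_to_hive_timestamp(millis: int) -> str:
    """Format epoch milliseconds (ASSUMED to be UTC) as 'YYYY-MM-DD HH:MM:SS.mmm'."""
    # Integer arithmetic avoids float rounding on the millisecond part.
    seconds, ms = divmod(millis, 1000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%d %H:%M:%S}.{ms:03d}"


if __name__ == "__main__":
    # Hypothetical sample value, not taken from the original message.
    print(epoch_millis_to_hive_timestamp(1532476800123))
```

If the longs were written in another unit (seconds or microseconds, say), the divisor would need to change accordingly.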

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-06 Thread Dmitry Goldenberg
d how would it match content_type='Presentation' -- would the file just need to be named "Presentation"? On Thu, Apr 6, 2017 at 5:05 PM, Dmitry Goldenberg wrote: > >> properly split and partition your data before using LOAD if you want > hive to be able to find it again.
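
On the file-naming question: Hive maps a partition to a directory named `column=value` under the table location, so the data file's own name does not decide which partition it belongs to. A minimal Python sketch of that layout, and of a LOAD DATA statement that names the target partition explicitly, follows. The table location, staging path, and date value are hypothetical, and the helper does no escaping of special characters in partition values.

```python
def partition_dir(table_location: str, partition: dict) -> str:
    """Build the 'col=value/col=value' directory Hive uses for a partition.

    ASSUMPTION: values contain no characters that Hive would escape.
    """
    parts = [f"{col}={val}" for col, val in partition.items()]
    return "/".join([table_location.rstrip("/"), *parts])


def load_data_statement(src: str, table: str, partition: dict) -> str:
    """Compose a LOAD DATA INPATH statement with an explicit PARTITION clause."""
    spec = ", ".join(f"`{col}`='{val}'" for col, val in partition.items())
    return f"LOAD DATA INPATH '{src}' INTO TABLE {table} PARTITION ({spec})"


if __name__ == "__main__":
    # Hypothetical paths and values, for illustration only.
    part = {"date": "2017-04-06", "content_type": "Presentation"}
    print(partition_dir("/warehouse/mytable", part))
    print(load_data_statement("/staging/part-00000.parquet", "db.mytable", part))
```

The partition columns are listed in the order declared in PARTITIONED BY, which is why the sketch relies on the insertion order of the dictionary.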

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-06 Thread Dmitry Goldenberg
essing step to convert > the data from delimited text to hive-compatible parquet, I don’t see a > reason to use any tool OTHER than hive to perform that conversion. > > > > LOAD DATA is generally used in situations where you **know** that the > data format is already 100% exactl

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-06 Thread Dmitry Goldenberg
ion. > > > > LOAD DATA is generally used in situations where you **know** that the > data format is already 100% exactly compatible with your destination > table, which most often occurs when the source of the data is the raw data > backing an existing hive managed table (possibly

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-06 Thread Dmitry Goldenberg
files to the duplicate cluster using > LOAD DATA to ensure the metadata is recorded in hive metastore. > > > > *From:* Dmitry Goldenberg [mailto:dgoldenb...@hexastax.com] > *Sent:* Tuesday, April 04, 2017 3:31 PM > *To:* user@hive.apache.org > *Subject:* Re: Is it po

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-04 Thread Dmitry Goldenberg
ion) from their > current HDFS location to the location defined for the table. > > Later on when you query the table the files will be scanned. If they are > in the right format you’ll get results. If not, then no. > > > > *From:* Dmitry Goldenberg [mailto:dgoldenb...@hexastax.

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-04 Thread Dmitry Goldenberg
ame results by using hdfs dfs -mv … > > · LOAD DATA LOCAL INPATH is just a file copying operation from the > shell to the HDFS. > You can achieve the same results by using hdfs dfs -put … > > > From: Dmitry Goldenberg [mailto:dgoldenb...@hexastax.com] > Sent: Tuesday, Ap
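
The correspondence described in this reply can be sketched in Python as a small helper that returns the `hdfs dfs` command matching each form of LOAD DATA. The function name and the paths are hypothetical. The mapping reflects only the file-movement claim made in the reply; another message in this thread points out that LOAD DATA also records metadata in the Hive metastore, which a plain file command does not do.

```python
def equivalent_hdfs_command(src: str, dest_dir: str, local: bool) -> list:
    """Return the 'hdfs dfs' argument list that moves the same bytes.

    local=True  ~ LOAD DATA LOCAL INPATH (copy from the local filesystem)
    local=False ~ LOAD DATA INPATH       (move within HDFS)
    """
    action = "-put" if local else "-mv"
    return ["hdfs", "dfs", action, src, dest_dir]


if __name__ == "__main__":
    # Hypothetical source files and table directory.
    print(" ".join(equivalent_hdfs_command("/staging/f.parquet", "/warehouse/mytable", local=False)))
    print(" ".join(equivalent_hdfs_command("./f.parquet", "/warehouse/mytable", local=True)))
```

The argument list could be handed to `subprocess.run` on a machine with the Hadoop client installed; the sketch only builds the command and does not execute it.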

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-04 Thread Dmitry Goldenberg
IONED table in Hive. Thanks, - Dmitry On Tue, Apr 4, 2017 at 12:20 PM, Markovitz, Dudu wrote: > Are your files already in Parquet format? > > > > *From:* Dmitry Goldenberg [mailto:dgoldenb...@hexastax.com] > *Sent:* Tuesday, April 04, 2017 7:03 PM > *To:* user@hive.apache.

Re: Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-04 Thread Dmitry Goldenberg
erface to read the > files and use it in an INSERT operation. > > > > Dudu > > > > *From:* Dmitry Goldenberg [mailto:dgoldenb...@hexastax.com] > *Sent:* Tuesday, April 04, 2017 4:52 PM > *To:* user@hive.apache.org > *Subject:* Is it possible to use LOAD DATA INP

Is it possible to use LOAD DATA INPATH with a PARTITIONED, STORED AS PARQUET table?

2017-04-04 Thread Dmitry Goldenberg
We have a table such as the following defined:
CREATE TABLE IF NOT EXISTS db.mytable (
  `item_id` string,
  `timestamp` string,
  `item_comments` string)
PARTITIONED BY (`date`, `content_type`)
STORED AS PARQUET;
Currently we insert data into this PARQUET, PARTITIONED table as follows, using an