Sergio Peña created HIVE-11131:
----------------------------------

             Summary: Get row information on DataWritableWriter once for better 
writing performance
                 Key: HIVE-11131
                 URL: https://issues.apache.org/jira/browse/HIVE-11131
             Project: Hive
          Issue Type: Sub-task
    Affects Versions: 1.2.0
            Reporter: Sergio Peña
            Assignee: Sergio Peña


DataWritableWriter is the class that writes Hive records to Parquet files. 
Every time a record is written (that is, on every write() call), it looks up 
everything it needs to parse the record, such as the schema and the object 
inspector.

We can make this class perform better by initializing the per-data-type 
writers only once and storing the corresponding object inspectors in each 
writer.

The class expects every subsequent record to have the same object inspectors 
and schema, so it does not need to check for changes on each write. When a 
new schema is written, Parquet creates a new DataWritableWriter.
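
The idea can be illustrated with a minimal sketch. This is not the actual 
DataWritableWriter code or the Hive/Parquet API: CachedRecordWriter, 
FieldWriter and FieldType are hypothetical names, and a StringBuilder stands 
in for the Parquet record consumer. It only shows per-type writers being 
resolved once in the constructor, so that write() does no type inspection.

```java
/** Hypothetical sketch only; none of these types exist in Hive or Parquet. */
class CachedRecordWriter {

    /** Stand-in for a per-field writer with its type handling bound up front. */
    interface FieldWriter {
        void write(Object value, StringBuilder out);
    }

    /** Simplified stand-in for the column types described by the schema. */
    enum FieldType { INT, STRING, BOOLEAN }

    private final FieldWriter[] fieldWriters;
    private int writersCreated = 0;

    /** Resolves one writer per column, exactly once, when the writer is built. */
    CachedRecordWriter(FieldType... schema) {
        fieldWriters = new FieldWriter[schema.length];
        for (int i = 0; i < schema.length; i++) {
            fieldWriters[i] = createWriter(schema[i]);
        }
    }

    /** The type dispatch lives here, so it never runs on the write() path. */
    private FieldWriter createWriter(FieldType type) {
        writersCreated++;
        switch (type) {
            case INT:
                return (value, out) -> out.append("i:").append((Integer) value);
            case STRING:
                return (value, out) -> out.append("s:").append((String) value);
            case BOOLEAN:
                return (value, out) -> out.append("b:").append((Boolean) value);
            default:
                throw new IllegalArgumentException("Unsupported type: " + type);
        }
    }

    /** Writes one record by delegating each value to its cached writer. */
    String write(Object... record) {
        if (record.length != fieldWriters.length) {
            throw new IllegalArgumentException("Record does not match schema");
        }
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < fieldWriters.length; i++) {
            if (i > 0) {
                out.append(',');
            }
            fieldWriters[i].write(record[i], out);
        }
        return out.toString();
    }

    /** Exposed only so the sketch can show writers are not rebuilt per record. */
    int writersCreated() {
        return writersCreated;
    }
}
```

With a schema of (INT, STRING), calling write(1, "a") and then write(2, "b") 
on the same instance reuses the two writers built in the constructor. A 
different schema would require constructing a new instance, which mirrors 
the assumption above that Parquet recreates the writer when the schema 
changes.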



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
