[ 
https://issues.apache.org/jira/browse/HIVE-2370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083835#comment-13083835
 ] 

Tim Armstrong commented on HIVE-2370:
-------------------------------------

I'm not sure exactly what you mean about read and write speeds.

I tested it by reading a file from a remote DFS instance and redirecting the 
output to a local file.  The time spent writing the output is negligible.  

The largest part of the time is spent doing Unicode conversions to get the data 
into a Java CharBuffer, and then writing it to the console.  Decompression and 
deserialisation also take up a large part of the CPU time.
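
To illustrate the conversion cost, here is a minimal sketch, not the code in 
the attached patch.  It assumes the field data is already valid UTF-8.  The 
class and method names (RawByteOutputSketch, writeDecoded, writeRaw) are 
hypothetical.  The first method decodes every field into a CharBuffer and 
re-encodes it on output.  The second copies the bytes through a buffered 
stream and skips the conversion entirely.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: compares two ways of emitting UTF-8 field data.
class RawByteOutputSketch {

    // Slow path: decode each field into a CharBuffer, then let the
    // writer encode it back to UTF-8 bytes on the way out.
    static void writeDecoded(byte[][] fields, OutputStream out) throws IOException {
        Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8);
        for (byte[] field : fields) {
            CharBuffer chars = StandardCharsets.UTF_8.decode(ByteBuffer.wrap(field));
            writer.append(chars);
            writer.write('\n');
        }
        writer.flush();
    }

    // Faster path (assumes the input is already valid UTF-8): copy the
    // bytes straight into a buffered stream with no charset conversion.
    static void writeRaw(byte[][] fields, OutputStream out) throws IOException {
        BufferedOutputStream buffered = new BufferedOutputStream(out, 64 * 1024);
        for (byte[] field : fields) {
            buffered.write(field, 0, field.length);
            buffered.write('\n');
        }
        buffered.flush();
    }

    public static void main(String[] args) throws IOException {
        byte[][] fields = {
            "1\tfoo".getBytes(StandardCharsets.UTF_8),
            "2\tbar".getBytes(StandardCharsets.UTF_8),
        };
        writeRaw(fields, System.out);
    }
}
```

For valid UTF-8 input both methods produce the same bytes, so the raw path 
only removes work.  It would not be appropriate if the output had to be 
re-encoded to a different charset.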


> Improve RCFileCat performance significantly
> -------------------------------------------
>
>                 Key: HIVE-2370
>                 URL: https://issues.apache.org/jira/browse/HIVE-2370
>             Project: Hive
>          Issue Type: Improvement
>          Components: CLI
>    Affects Versions: 0.8.0
>            Reporter: Tim Armstrong
>            Assignee: Tim Armstrong
>            Priority: Minor
>         Attachments: rcfilecat_2011-08-11.patch
>
>
> The rcfilecat utility is extraordinarily slow: the throughput can be < 0.5 
> MB/s of compressed RCFile.  We can implement a much faster version to enable 
> faster export of data from Hive.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
