[ 
https://issues.apache.org/jira/browse/HIVE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13905058#comment-13905058
 ] 

Puneet Gupta commented on HIVE-5994:
------------------------------------

Hi Prasanth

This is the code I used to reproduce the issue.
1. I am using the Hive binary from "hive-0.12.0.tar.gz".
2. I am using an old Hadoop version, "hadoop-core-1.0.0.jar":
http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core
3. In the code below, if ROWS_TO_TEST is set to 1 or to a value greater than 10,
the problem does not occur.

---------------------------
package hive;

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hive.ql.io.orc.CompressionKind;
import org.apache.hadoop.hive.ql.io.orc.OrcFile;
import org.apache.hadoop.hive.ql.io.orc.Reader;
import org.apache.hadoop.hive.ql.io.orc.RecordReader;
import org.apache.hadoop.hive.ql.io.orc.Writer;
import org.apache.hadoop.hive.ql.io.orc.OrcFile.WriterOptions;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;

public class TestLong {

        /**
         * @param args
         * @throws IOException 
         */
        public static void main(String[] args) throws IOException
        {
                int ROWS_TO_TEST = 10;
                Path path = new Path("E:/Test/file.orc");
                Configuration conf = new Configuration();
                FileSystem fs = FileSystem.getLocal(conf);
                if(fs.exists(path))
                        fs.delete(path,true);
                
                ObjectInspector inspector = ObjectInspectorFactory
                                .getReflectionObjectInspector(MyData.class,
                                                ObjectInspectorFactory.ObjectInspectorOptions.JAVA);

                WriterOptions options = OrcFile.writerOptions(conf)
                                .inspector(inspector).compress(CompressionKind.SNAPPY);

                Writer writer = OrcFile.createWriter(path, options);

                for (int i = 0; i < ROWS_TO_TEST; i++) {
                        writer.addRow(new MyData());
                }
                writer.close();

                Reader reader = OrcFile.createReader(fs, path);
                RecordReader rows = reader.rows(null);
                Object row = null;
                while (rows.hasNext()) {
                        row = rows.next(row);
                        System.out.println(row);
                }
        }
        
        
        private static class MyData
        {
                long data = 4703275633953830000L;
        }
}
-----------
OUTPUT (every row should print {4703275633953830000})
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}
{112}


> ORC RLEv2 encodes wrongly for large negative BIGINTs  (64 bits )
> ----------------------------------------------------------------
>
>                 Key: HIVE-5994
>                 URL: https://issues.apache.org/jira/browse/HIVE-5994
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>              Labels: orcfile
>             Fix For: 0.13.0
>
>         Attachments: HIVE-5994.1.patch
>
>
> For large negative BIGINTs, zigzag encoding yields a large 64-bit value 
> with the MSB set to 1. This value is interpreted as negative by the 
> SerializationUtils.findClosestNumBits(long value) function. As a result, the 
> total number of bits required is computed incorrectly, which leads to wrong 
> encoding/decoding of values.
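
The effect described above can be sketched in isolation. This is only an illustration, not the actual Hive code: the class ZigZagBitWidth and the helpers bitsSignedCompare and bitsUnsigned are hypothetical names, and the signed-comparison loop is an assumption about how a bit counter could misread a value whose MSB is set. Note that the value used in the repro, 4703275633953830000L, is positive, but doubling it during zigzag encoding exceeds 2^63, so its encoded form also has the MSB set.

```java
public class ZigZagBitWidth {

    // Standard 64-bit zigzag mapping: values of small magnitude get small codes.
    public static long zigzagEncode(long value) {
        return (value << 1) ^ (value >> 63);
    }

    // Hypothetical bit counter that compares the value as a signed long.
    // If the MSB is set, the value looks negative and the loop never runs.
    public static int bitsSignedCompare(long value) {
        int count = 0;
        while (value > 0) {
            count++;
            value >>>= 1;
        }
        return count;
    }

    // Bit counter that treats the value as an unsigned 64-bit pattern.
    public static int bitsUnsigned(long value) {
        return 64 - Long.numberOfLeadingZeros(value);
    }

    public static void main(String[] args) {
        long[] samples = { 4703275633953830000L, Long.MIN_VALUE, -1L };
        for (long v : samples) {
            long z = zigzagEncode(v);
            System.out.println(v + " -> zigzag 0x" + Long.toHexString(z)
                    + ", signed-compare bits=" + bitsSignedCompare(z)
                    + ", unsigned bits=" + bitsUnsigned(z));
        }
    }
}
```

For the first two samples the signed comparison reports 0 bits while the unsigned count reports 64. For -1, which zigzag-encodes to 1, both report 1 bit.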



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
