[ https://issues.apache.org/jira/browse/HIVE-12537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bogdan Raducanu updated HIVE-12537:
-----------------------------------
    Description:
Perhaps I'm doing something wrong, or this is actually working as expected.
Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produce a much smaller file.
Code and FileDump attached.

  was:
Putting 1 million constant int32 values produces an ORC file of 1MB. Perhaps I'm doing something wrong, or this is actually working as expected.
Will attach code.

Output from FileDump:
Rows: 1000000
Compression: NONE
Type: int

Stripe Statistics:
  Stripe 1:
    Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

File Statistics:
  Column 0: count: 1000000 hasNull: false min: 123 max: 123 sum: 123000000

Stripes:
  Stripe: offset: 3 data: 1003847 rows: 1000000 tail: 41 index: 2871
    Stream: column 0 section ROW_INDEX start: 3 length 2871
    Stream: column 0 section DATA start: 2874 length 1003847
    Encoding column 0: DIRECT_V2

File length: 1006860 bytes
Padding length: 0 bytes
Padding ratio: 0%


> RLEv2 doesn't seem to work
> --------------------------
>
>                 Key: HIVE-12537
>                 URL: https://issues.apache.org/jira/browse/HIVE-12537
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 1.2.1
>            Reporter: Bogdan Raducanu
>              Labels: orc, orcfile
>         Attachments: Main.java, orcdump.txt
>
>
> Perhaps I'm doing something wrong, or this is actually working as expected.
> Putting 1 million constant int32 values produces an ORC file of 1MB. Surprisingly, 1 million consecutive ints produce a much smaller file.
> Code and FileDump attached.
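
The figures in the FileDump output can be checked with a little arithmetic. The sketch below is illustrative only: it is not the attached Main.java and it does not reproduce ORC's RLEv2 algorithm. It computes the bytes per value in the DATA stream from the numbers reported above, then uses a naive run count (the hypothetical helpers `count_runs` and `deltas`) to show why both a constant column and a consecutive column look highly compressible to any run-length or delta scheme.

```python
from itertools import groupby

def count_runs(values):
    """Count maximal runs of equal adjacent values (naive RLE view, not ORC's RLEv2)."""
    return sum(1 for _ in groupby(values))

def deltas(values):
    """Differences between adjacent values."""
    return [b - a for a, b in zip(values, values[1:])]

# Figures copied from the FileDump output in the issue description.
rows = 1_000_000
data_stream_bytes = 1_003_847   # Stream: column 0 section DATA ... length 1003847

# Roughly one byte is stored per value, so the constant column was
# evidently not collapsed into a small number of runs.
bytes_per_value = data_stream_bytes / rows

constant = [123] * rows          # the "1 million constant int32 values" case
consecutive = list(range(rows))  # the "1 million consecutive ints" case

constant_runs = count_runs(constant)                      # one run of 123
consecutive_delta_runs = count_runs(deltas(consecutive))  # one run of delta 1

print(f"bytes per value in DATA stream: {bytes_per_value:.6f}")
print(f"runs in constant column: {constant_runs}")
print(f"runs of deltas in consecutive column: {consecutive_delta_runs}")
```

Under this simplified view the constant column is the easier of the two inputs, which is why a DATA stream of about one byte per value looks unexpected.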