Thanks, Owen.
I tried to run from HDFS (not from S3); the problem is the same.
Could you please share your hive-site.xml? Which environment variables and parameters should I check?
I would gladly use structor, but I need to use EMR for this project.
Thanks
Oleg
On Thu, Oct 26, 2017 at 12:22 AM, Owen O
I'm not sure. Using a virtual environment with Hortonworks' version (2.6.1)
and HDFS instead of S3, it works:
hive> CREATE EXTERNAL TABLE Table1 (Id INT, Name STRING) STORED AS ORC
> LOCATION 'hdfs://nn.example.com/user/vagrant/country/';
> OK
> Time taken: 4.073 seconds
> hive> Select * from Table
Yes, that is exactly my point. Since the file has the data (the ORC is valid),
why does Hive return NULLs?
I tested it with S3, HDFS, Hive, and Beeline; the behavior is the same:
select count(*) returns 10.
select * returns NULLs ...
What is the way to debug this problem? Any configuration or logging I should check?
The file has the data. I'm not sure what Hive is doing wrong.
owen@laptop> java -jar ../tools/target/orc-tools-1.5.0-SNAPSHOT-uber.jar
> data ~/Downloads/Country.orc
> Processing data file /Users/owen/Downloads/Country.orc [length: 392]
> {"Id":1,"Name":"Singapore"}
> {"Id":2,"Name":"Malaysia"}
>
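Since the dump above shows the file itself reads back correctly, one common cause of all-NULL results (an assumption here, not something confirmed in this thread) is a mismatch between the column names/types embedded in the ORC file and the table DDL. A debugging sketch, assuming the Hive CLI is available and using the illustrative path and table name from earlier in the thread:

```shell
# Dump the ORC file's embedded schema and stripe metadata;
# compare the column names and types against the table DDL.
hive --orcfiledump hdfs://nn.example.com/user/vagrant/country/Country.orc

# Show the table's declared schema and SerDe/location for comparison:
hive -e 'DESCRIBE FORMATTED Table1;'
```

If the two schemas disagree (for example, different column names or types), Hive can read the rows but resolve every projected column to NULL, which would match the count(*)-works-but-select-*-is-NULL symptom.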
Please refer to the document below as well:
Hive on Tez Performance Tuning - Determining Reducer Counts
https://community.hortonworks.com/articles/22419/hive-on-tez-performance-tuning-determining-reducer.html
I hope it gives you some clues to understanding Tez internals.
2017-01-21 23:35 GMT+09:00 M
Yes, I tried the option below, but I'm not sure about the workload (data
ingestion). I can't go with a fixed hard-coded value; I would like to know the
reason for getting 1009 reducer tasks.
On 1/20/2017 7:45 PM, goun na wrote:
Hi Mahender Sarangam,
1st :
Didn't the following option work in Tez?
set mapreduce.job.reduces=100
or
set mapred.reduce.tasks=100 (deprecated)
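On the Tez engine, the reducer count is normally estimated from data size rather than taken from mapreduce.job.reduces. A sketch of the commonly used knobs (the values shown are illustrative defaults, not recommendations):

```sql
-- Target bytes per reducer; Tez estimates the reducer count as
-- roughly (input size / this value).
SET hive.exec.reducers.bytes.per.reducer=256000000;

-- Upper bound on the estimated reducer count. The default is 1009 in
-- recent Hive releases, which is likely where the observed 1009 tasks
-- come from.
SET hive.exec.reducers.max=1009;

-- Let Tez shrink the reducer count at runtime based on actual data volume.
SET hive.tez.auto.reducer.parallelism=true;
```

In other words, hitting exactly 1009 reducers usually means the size-based estimate exceeded hive.exec.reducers.max and was capped there.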
2nd :
There is a possibility of data skew; it sometimes happens when handling NULLs.
Goun
2017-01-21 9:58 GMT+09:00 Mahender Sarangam :
> Hi All,
>
> We have ORC