Hi Guys,

I am trying to fetch data from Hive through Python code, filtered by date
and id.

Fetching the last 20 days up to the current day for 7 tables together takes
30 seconds.
Fetching a year's worth of data for the same 7 tables together takes
3 minutes 26 seconds.
My tables are stored as ORC with transactional set to true.

Our goal is to fetch a year's worth of data within 1 or 2 seconds.

I have tried it two ways:
1.  cursor.execute("set hive.support.concurrency=true")
    cursor.execute("set hive.exec.dynamic.partition.mode=nonstrict")
    cursor.execute("SET hive.exec.parallel=true")
    cursor.execute("set tez.grouping.split-count=85")

2.  cursor.execute("set hive.fetch.task.conversion=more")
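
For context, the fetch looks roughly like the sketch below. This is only an
illustration: the function name timed_fetch, the table argument, and the
column names event_date and id are placeholders rather than our real schema,
and the %(name)s parameter style is an assumption about the driver behind
the cursor.

```python
import time

# Session settings from approach 1 and approach 2 above.
APPROACH_1 = [
    "set hive.support.concurrency=true",
    "set hive.exec.dynamic.partition.mode=nonstrict",
    "SET hive.exec.parallel=true",
    "set tez.grouping.split-count=85",
]
APPROACH_2 = ["set hive.fetch.task.conversion=more"]


def timed_fetch(cursor, settings, table, start_date, end_date, record_id):
    """Apply the settings, run one date/id filtered query, return (rows, seconds)."""
    for stmt in settings:
        cursor.execute(stmt)
    # Placeholder column names; pyformat-style parameters are assumed.
    query = (
        f"SELECT * FROM {table} "
        "WHERE event_date BETWEEN %(start)s AND %(end)s AND id = %(id)s"
    )
    started = time.perf_counter()
    cursor.execute(query, {"start": start_date, "end": end_date, "id": record_id})
    rows = cursor.fetchall()
    # Elapsed time covers query execution plus fetching all rows.
    return rows, time.perf_counter() - started
```

The same call is repeated once per table, so the timings above are the sum
over all 7 tables.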

Both approaches perform the same. Is there a better way to reach our
goal?

Any help is appreciated.


Thanks
Sowjanya
