Hi guys, I am trying to fetch data from Hive through Python code, filtered by date and id.
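To make the access pattern concrete, here is a minimal sketch of how a date-and-id fetch across seven tables could be built. Everything named here is a placeholder for illustration and not the real schema: the table names, the columns `event_date` and `id`, and the helpers `date_window` and `build_query`. It only builds the query text and parameters; running them assumes a DB-API cursor that accepts pyformat-style parameters.

```python
from datetime import date, timedelta

# Placeholder table names; the real seven tables are not shown in this post.
TABLES = [f"table_{n}" for n in range(1, 8)]


def date_window(days_back, today=None):
    """Return (start, end) ISO date strings spanning days_back days up to today."""
    end = today or date.today()
    start = end - timedelta(days=days_back)
    return start.isoformat(), end.isoformat()


def build_query(table, days_back, record_id, today=None):
    """Build a parameterized date-range query for one table.

    `event_date` and `id` are assumed column names. Filtering directly on the
    date column (no function wrapped around it) keeps the predicate simple
    for the engine to push down.
    """
    start, end = date_window(days_back, today)
    sql = (
        f"SELECT * FROM {table} "
        "WHERE event_date BETWEEN %(start)s AND %(end)s AND id = %(id)s"
    )
    return sql, {"start": start, "end": end, "id": record_id}


# One (sql, params) pair per table, for a 20-day window and a sample id.
queries = [build_query(t, 20, 42, today=date(2020, 1, 21)) for t in TABLES]
```

Each pair would then be passed to the cursor as `cursor.execute(sql, params)`, one call per table.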
Fetching the last 20 days up to the current day for 7 tables together takes 30 seconds. Fetching a year's worth of data for the same 7 tables takes 3 minutes 26 seconds. My tables are stored as ORC with transactional set to true. Our goal is to fetch a year of data within 1 or 2 seconds.

I have tried two approaches:

1.
cursor.execute("set hive.support.concurrency=true")
cursor.execute("set hive.exec.dynamic.partition.mode=nonstrict")
cursor.execute("SET hive.exec.parallel=true")
cursor.execute("set tez.grouping.split-count=85")

2.
cursor.execute("set hive.fetch.task.conversion=more")

Both approaches perform the same. Is there a better way to reach our goal? Any help is appreciated.

Thanks,
Sowjanya