Hi Community,

Many users run ad hoc Hive queries on our platform.
Some rogue queries have filled up the HDFS space, causing mainstream
queries to fail.

We want to limit the data generated by these ad hoc queries.
We are aware of the strict mode parameter, which limits the data being
scanned, but it is of little help because a large number of user tables
aren't partitioned.

Is there a way to limit the data generated by Hive per query, such as a
Hive parameter that sets an HDFS quota on the job-level *scratch* directory,
or any other approach?
What is the general approach to putting guardrails around such multi-tenant
cases?

Thanks in advance,
Ravi
