Hi Community, Many users run ad hoc Hive queries on our platform. Some rogue queries filled up the HDFS space, causing mainstream queries to fail.
We want to limit the data generated by these ad hoc queries. We are aware of the strict mode parameter, which limits the data being scanned, but it is of little help because a large number of user tables aren't partitioned. Is there a way to limit the data generated by Hive per query, such as a Hive parameter for setting HDFS quotas on the job-level *scratch* directory, or any other approach? What's the general approach to putting guardrails on such multi-tenant cases? Thanks in advance, Ravi