[ 
https://issues.apache.org/jira/browse/HIVE-28212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

László Bodor updated HIVE-28212:
--------------------------------
    Description: 
we hardcode an HDFS session dir like below:
https://github.com/apache/hive/blob/2d855b27d31db6476f18870651db6987816bb5e3/itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java#L307
{code}
      baseFsDir = new Path(new Path(fs.getUri()), "/base");
{code}

this can lead to problems with Tez local mode under MiniHS2, as Tez mirrors the 
HDFS contents to a local folder, and this later leads to a confusing message 
like:
{code}
2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error starting DAGAppMaster
java.io.FileNotFoundException: /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb (No such file or directory)
        at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
        at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
        at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_292]
        at org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84) ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
{code}
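
for illustration, a minimal sketch of why this message is confusing: the trace bottoms out in a plain local FileInputStream open, so a path that exists only on HDFS fails like any other missing local file. MissingLocalFileSketch and tryOpen are hypothetical names for this sketch, not Tez code:

```java
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

/** Hypothetical illustration only; not part of Tez or Hive. */
final class MissingLocalFileSketch {

  /**
   * Tries to open the given path on the local file system.
   * Returns the exception message if the open (or close) fails, or null on success.
   */
  static String tryOpen(String localPath) {
    try (FileInputStream in = new FileInputStream(localPath)) {
      return null; // the file exists locally and is readable
    } catch (FileNotFoundException e) {
      // The message carries only the path and the OS reason, with no hint
      // that the path was originally an HDFS location.
      return e.getMessage();
    } catch (IOException e) {
      // Raised only if closing the stream fails.
      return e.getMessage();
    }
  }
}
```

calling tryOpen with a path under a folder that does not exist locally returns a non-null message, which is the same kind of failure as in the trace above.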

btw, this confusing message will be fixed in TEZ-4555, but we still need to use 
something other than /base.
it doesn't make sense to hack a different folder into Tez for local mode; 
instead we should change the hardcoded "/base" in MiniHS2, which might be more 
durable and solves the abovementioned problem

currently, hive's default scratch dir is 
[/tmp/hive|https://github.com/apache/hive/blob/2d855b27d31db6476f18870651db6987816bb5e3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L498]
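
a minimal sketch of what "more likely writable on the local FS" could mean in practice, written with plain Java NIO instead of the Hadoop FileSystem API. BaseDirSketch and creatableLocally are hypothetical names, and this is not the actual MiniHS2 change; it only checks whether a candidate path could be created locally by the current user:

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper for illustration; not part of MiniHS2. */
final class BaseDirSketch {

  /**
   * Returns true if the candidate path either exists as a writable directory
   * or could be created, i.e. its nearest existing ancestor is a writable
   * directory for the current user.
   */
  static boolean creatableLocally(Path candidate) {
    Path p = candidate.toAbsolutePath().normalize();
    // Walk up until we reach something that actually exists on the local FS.
    while (p != null && !Files.exists(p)) {
      p = p.getParent();
    }
    return p != null && Files.isDirectory(p) && Files.isWritable(p);
  }
}
```

for a regular user, a check like creatableLocally(Paths.get("/tmp/hive")) would typically pass, while a root-level folder such as /base usually would not; the result depends on local permissions.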

  was:
we hardcode an HDFS session dir like below:
https://github.com/apache/hive/blob/2d855b27d31db6476f18870651db6987816bb5e3/itests/util/src/main/java/org/apache/hive/jdbc/miniHS2/MiniHS2.java#L307
{code}
      baseFsDir = new Path(new Path(fs.getUri()), "/base");
{code}

this can lead to problems with Tez local mode under MiniHS2, as Tez mirrors the 
HDFS contents to a local folder, and this later leads to a confusing message 
like:
{code}
2024-04-24T02:03:52,101 ERROR [DAGAppMaster Thread] client.LocalClient: Error starting DAGAppMaster
java.io.FileNotFoundException: /base/scratch/laszlobodor/_tez_session_dir/b76689bc-d25e-4d65-a339-44206ff57ce2/.tez/application_1713949431891_0001_wd/tez-conf.pb (No such file or directory)
        at java.io.FileInputStream.open0(Native Method) ~[?:1.8.0_292]
        at java.io.FileInputStream.open(FileInputStream.java:195) ~[?:1.8.0_292]
        at java.io.FileInputStream.<init>(FileInputStream.java:138) ~[?:1.8.0_292]
        at org.apache.tez.common.TezUtilsInternal.readUserSpecifiedTezConfiguration(TezUtilsInternal.java:84) ~[tez-common-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at org.apache.tez.client.LocalClient.createDAGAppMaster(LocalClient.java:394) ~[tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at org.apache.tez.client.LocalClient$1.run(LocalClient.java:357) [tez-dag-0.9.1.2024.0.19.0-3.jar:0.9.1.2024.0.19.0-3]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_292]
{code}

btw, this confusing message will be fixed in TEZ-4555, but we still need to use 
something other than /base.
currently, hive's default scratch dir is 
[/tmp/hive|https://github.com/apache/hive/blob/2d855b27d31db6476f18870651db6987816bb5e3/common/src/java/org/apache/hadoop/hive/conf/HiveConf.java#L498]


> MiniHS2: use a base folder which is more likely writable on the local FS
> ------------------------------------------------------------------------
>
>                 Key: HIVE-28212
>                 URL: https://issues.apache.org/jira/browse/HIVE-28212
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: László Bodor
>            Assignee: László Bodor
>            Priority: Major
>              Labels: pull-request-available


