Re:Re: How the actual "sample data" are implemented when using tez reduce auto-parallelism

2016-03-01 Thread Maria
 Thank you very much for your patient answers. I understand now; this is very helpful for understanding the auto-parallelism optimization. At 2016-02-29 12:50:04, "Rajesh Balamohan" wrote: "tez.shuffle-vertex-manager.desired-task-input-size" - Determines the amount of desired task input siz

Re: How the actual "sample data" are implemented when using tez reduce auto-parallelism

2016-02-28 Thread Rajesh Balamohan
"tez.shuffle-vertex-manager.desired-task-input-size" - Determines the desired amount of input per reduce task. The default is around 100 MB. "tez.shuffle-vertex-manager.min-task-parallelism" - The minimum task parallelism that ShuffleVertexManager should honor. I.e., if the client has set it as 100,
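
To make the interaction of these two settings concrete, here is a rough sketch in Python. It is not the actual ShuffleVertexManager code. It assumes that the output sizes reported by the source tasks that have already finished are extrapolated to all source tasks, and that the result is divided by the desired input size per reducer. The function name and its parameters are hypothetical and chosen only for illustration.

```python
import math

MB = 1024 * 1024


def estimate_reducer_count(completed_output_bytes, total_source_tasks,
                           configured_parallelism,
                           desired_task_input_size=100 * MB,
                           min_task_parallelism=1):
    """Hypothetical sketch of reduce auto-parallelism.

    completed_output_bytes: output sizes of the source tasks finished so far
    total_source_tasks:     number of source tasks in the upstream vertex
    configured_parallelism: reducer count originally set by the client
    """
    if not completed_output_bytes:
        # No statistics yet, so keep what the client configured.
        return configured_parallelism

    # Assumption: the finished tasks are representative of the rest,
    # so their average output is scaled up to all source tasks.
    average = sum(completed_output_bytes) / len(completed_output_bytes)
    expected_total = average * total_source_tasks

    # One reducer per desired-task-input-size bytes of expected input.
    desired = math.ceil(expected_total / desired_task_input_size)

    # Honor the minimum, and only ever shrink the configured value.
    desired = max(desired, min_task_parallelism, 1)
    return min(desired, configured_parallelism)


if __name__ == "__main__":
    # 10 of 40 source tasks finished, each producing 50 MB.
    print(estimate_reducer_count([50 * MB] * 10, 40, 200))
```

In this sketch the minimum acts as a floor on the computed value, while the client-configured parallelism acts as a ceiling, so the reducer count is only ever reduced, never increased.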

How the actual "sample data" are implemented when using tez reduce auto-parallelism

2016-02-26 Thread LLBian
Hello, respected experts: Recently I have been studying Tez reduce auto-parallelism. I read the article "Apache Tez: Dynamic Graph Reconfiguration", TEZ-398, and HIVE-7158. HIVE-7158 says that "Tez can optionally sample data from a fraction  of the tasks of a vertex and use that informati