[
https://issues.apache.org/jira/browse/HIVE-6109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13893691#comment-13893691
]
Satish Mittal commented on HIVE-6109:
-------------------------------------
Copy-pasting my response to review comments from review-board:
1)Sushanth: "_DYN" is already defined in FosterStorageHandler, needs to have
one place where it's defined. I'm okay with it being defined here if the
FosterStorageHandler constant is removed and references to that are changed to
reference this.
Satish Mittal: Will refactor it now. I didn't want to touch unrelated code in
the first cut.
2) Sushanth: whitespace errors - git refers to a bunch of these through the
patch when we try to apply, please correct for final patch upload.
Satish Mittal: Will do.
3) Sushanth: A bit about code readability - if we add a special case, then it
makes sense to add the special case as an else, rather than as an if - that
way, the default behaviour is visible first, and then the special case - please
swap this around so that this is a if (!customDynamicLocationUsed) structure.
Satish Mittal: Fine, will do that.
4) Sushanth: This is now significant amount of code repetition from line
720-741 above, please see if we can refactor this into a separate method.
Satish Mittal: I will see if it can be easily refactored into a private method.
5) Sushanth: This becomes the primary API point with this change, wherein, a
user that is using HCatOutputFormat will generate an OutputJobInfo, and then
call setCustomDynamicLocation on it. This is fine for M/R users of HCat, but is
something that will wind up having to be implemented for each M/R user. It
might have been better to define a constant in HCatConstants, say
"hcat.dynamic.partitioning.custom.pattern", and to use that as a JobInfo
parameter. That makes it easier for other tools to integrate with this feature.
For example, with your patch, we still do not support the ability for the
HCatStorer from pig to be able to write to custom dynamic partitions, while we
do want to keep feature parity where possible between HCatOutputFormat and
HCatStorer.
In fact, as a design goal for HCat, we're trying to move away from
letting(requiring) users explicitly muck around with OutputJobInfo and
InputJobInfo, and stick to static calls to HCatInputFormat/HCatOutputFormat.
I would like to see this call be something the HCatOutputFormat automatically
calls if a jobConf parameter(as above) is set. That way, we can solve pig
compatibility as well easily.
Satish Mittal: Thanks for this feedback Sushanth. I had realized that with the
current patch, HCatStorer is still not covered (only M/R jobs are covered), and
was thinking of exposing equivalent APIs in HCatStorer to achieve it. However
your suggestion sounds cleaner to me. Will do that!
> Support customized location for EXTERNAL tables created by Dynamic
> Partitioning
> -------------------------------------------------------------------------------
>
> Key: HIVE-6109
> URL: https://issues.apache.org/jira/browse/HIVE-6109
> Project: Hive
> Issue Type: Improvement
> Components: HCatalog
> Reporter: Satish Mittal
> Assignee: Satish Mittal
> Attachments: HIVE-6109.1.patch.txt, HIVE-6109.2.patch.txt,
> HIVE-6109.3.patch.txt, HIVE-6109.pdf
>
>
> Currently when dynamic partitions are created by HCatalog, the underlying
> directories for the partitions are created in a fixed 'Hive-style' format,
> i.e. root_dir/key1=value1/key2=value2/.... and so on. However in case of
> external table, user should be able to control the format of directories
> created for dynamic partitions.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)