Hi Attila,

Adding new commands could be an option. Honestly, I don't often use BRANCH or TAG, and I don't have a strong opinion on either approach.
Off topic: if I understand correctly, Dremio's branching semantics are different from what Hive provides. Dremio supports versioning not per table but per entire namespace. I wonder if Apache Hive has a plan to support those semantics.

Regards,
Okumin

On Mon, Nov 11, 2024 at 4:11 PM Butao Zhang <butaozha...@163.com> wrote:

> Thanks Attila for starting the hive-iceberg branch/tag discussion.
>
> In HIVE-27233 <https://issues.apache.org/jira/browse/HIVE-27233>, we
> first introduced the branch/tag syntax in Hive by referring to the
> Spark-Iceberg branch/tag syntax. Spark-Iceberg uses the ALTER
> <https://iceberg.apache.org/docs/1.7.0/branching/#historical-tags> syntax
> to express branch/tag operations, and I think most users and engines are
> used to this syntax. So following the Spark-Iceberg syntax is important for
> users who work with multiple engines (Spark & Hive, or others).
>
> But what you said is also reasonable. Sometimes the CREATE & DROP syntax is
> more straightforward for new users. I have also seen the Dremio-Iceberg
> doc <https://docs.dremio.com/cloud/reference/sql/commands/create-branch>,
> which shows that Dremio uses CREATE & DROP syntax instead of ALTER. For
> example:
>
> CREATE BRANCH [ IF NOT EXISTS ] <branch_name>
> [ { FROM | AT } { REF[ERENCE] | BRANCH | TAG | COMMIT } <reference_name> ]
> [ IN <catalog_name> ]
>
> But this Dremio CREATE syntax is more like a dialect than the Spark-Iceberg
> ALTER, as we subconsciously think of the Spark-Iceberg syntax as right and
> official.
>
> IMO, I am not against implementing the new branch/tag
> syntax (CREATE & DROP), as long as there is strong demand from community
> users. But the new syntax will be a Hive-style dialect, which other
> engines (Spark, Trino, etc.) will not accept.
>
> I would like to hear opinions from other folks.
> :)
>
> Thanks,
> Butao Zhang
>
> ---- Replied Message ----
> From: Attila Turoczy <aturo...@cloudera.com.INVALID>
> Date: 11/6/2024 21:51
> To: dev <dev@hive.apache.org>
> Subject: Iceberg branching tagging syntax
>
> Dear Hive community,
>
> I would like to hear your feedback about some syntactic sugar for Iceberg
> branching. If you are not aware of this cool feature, please read
> the following blog post
> <https://medium.com/@ayushtkn/apache-hive-4-x-with-iceberg-branches-tags-3d52293ac0bf>.
>
> Currently, Hive implements branching as per the official
> <https://iceberg.apache.org/docs/1.6.1/branching/> recommendation, which
> is fine, but there's something about the syntax that feels out of place in
> modern SQL linguistics. Today, any new functionality tends to be added
> under the ALTER TABLE umbrella. However, ALTER TABLE is increasingly
> overloaded with diverse functionality across different engines, making it
> less intuitive. (To me, ALTER TABLE is kinda the "etc." of SQL linguistics.)
>
> Many DBAs I’ve worked with are comfortable with commands like CREATE,
> SELECT, and INSERT, but when it comes to ALTER, things often get more
> complex and everybody starts to google it.
>
> This is particularly true now with the introduction of Iceberg branching
> and tagging, which are some of the most exciting developments
> since somebody invented Spotify! :) But from a usability perspective, this
> syntax is challenging to remember and use.
>
> In customer demos, I've been asked why the syntax is so complicated. In my
> view, these key features deserve dedicated verbs, making them distinct and
> straightforward.
>
> As a proposal, I’d suggest introducing new syntax options specifically for
> branching and tagging.
> *This wouldn’t replace the current approach but
> could be an alternative that enhances clarity and ease of use.*
>
> Create branch:
>
> CREATE BRANCH audit_branch FROM audit;
>
> From snapshot:
>
> CREATE BRANCH audit_branch FROM audit AS OF VERSION 1234; **
>
> ** Maybe the FROM here could be *AT <CommitID>*
>
> Create tag:
>
> CREATE TAG historical_tag FROM audit;
>
> The same applies for AS OF.
>
> Drop branch:
>
> DROP BRANCH audit_branch;
>
> Drop tag:
>
> DROP TAG audit_branch;
>
> Your opinion is very important to us, as it helps determine whether this
> is primarily a usability concern for a handful of EU customers, or if it
> might be better overall to stick with the classic ALTER approach.
>
> -Attila
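
To make the "alternative, not replacement" idea in the quoted proposal concrete, here is a minimal sketch of how the proposed CREATE statements might be desugared into the ALTER TABLE form Hive accepts today. Everything in it is an assumption for illustration: the function name `desugar_create` is hypothetical, the accepted grammar covers only the three CREATE examples from the proposal, and the `FOR SYSTEM_VERSION AS OF` target form is my reading of the current Hive syntax rather than something stated in this thread.

```python
import re

# Hypothetical grammar: only the CREATE forms shown in the proposal.
# CREATE {BRANCH|TAG} <name> FROM <table> [AS OF VERSION <snapshot_id>]
_CREATE = re.compile(
    r"^CREATE\s+(BRANCH|TAG)\s+(\w+)\s+FROM\s+(\w+)"
    r"(?:\s+AS\s+OF\s+VERSION\s+(\d+))?\s*;?\s*$",
    re.IGNORECASE,
)


def desugar_create(stmt: str) -> str:
    """Rewrite a proposed CREATE BRANCH/TAG statement as ALTER TABLE.

    Assumes the existing Hive form is
    ALTER TABLE <table> CREATE {BRANCH|TAG} <name>
        [FOR SYSTEM_VERSION AS OF <snapshot_id>].
    """
    m = _CREATE.match(stmt.strip())
    if m is None:
        raise ValueError(f"not a supported CREATE BRANCH/TAG statement: {stmt!r}")
    kind, ref_name, table, snapshot_id = m.groups()
    sql = f"ALTER TABLE {table} CREATE {kind.upper()} {ref_name}"
    if snapshot_id is not None:
        sql += f" FOR SYSTEM_VERSION AS OF {snapshot_id}"
    return sql + ";"


if __name__ == "__main__":
    for s in (
        "CREATE BRANCH audit_branch FROM audit;",
        "CREATE BRANCH audit_branch FROM audit AS OF VERSION 1234;",
        "CREATE TAG historical_tag FROM audit;",
    ):
        print(desugar_create(s))
```

The proposed DROP BRANCH and DROP TAG forms are left out of the sketch because, as written, they do not name a table, so there is nothing to map them onto in a per-table ALTER TABLE statement. That gap may be worth settling if the new syntax moves forward.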