[ https://issues.apache.org/jira/browse/HIVE-18685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16362962#comment-16362962 ]
Alan Gates commented on HIVE-18685: ----------------------------------- This comment deals with all of the database questions. These are the changes to the Derby upgrade script that I made: {code:java} -- Create new Catalog table CREATE TABLE "APP"."CTLGS" ( "CTLG_ID" BIGINT NOT NULL, "NAME" VARCHAR(256) UNIQUE, "DESC" VARCHAR(4000), "LOCATION_URI" VARCHAR(4000) NOT NULL); ALTER TABLE "APP"."CTLGS" ADD CONSTRAINT "CTLGS_PK" PRIMARY KEY ("CTLG_ID"); -- Insert a default value. The location is TBD. Hive will fix this when it starts INSERT INTO "APP"."CTLGS" VALUES (1, 'Hive', 'Default catalog for Hive', 'TBD'); -- Drop the unique index on DBS DROP INDEX "APP"."UNIQUE_DATABASE"; -- Add the new column to the DBS table, can't put in the not null constraint yet ALTER TABLE "APP"."DBS" ADD COLUMN "CTLG_NAME" VARCHAR(256); -- Update all records in the DBS table to point to the Hive catalog UPDATE "APP"."DBS" SET "CTLG_NAME" = 'hive'; -- Add the not null constraint --ALTER TABLE "APP"."DBS" ADD CONSTRAINT "DBS_CTLG_NN" NOT NULL ("CTLG_NAME"); ALTER TABLE "APP"."DBS" ALTER COLUMN "CTLG_NAME" NOT NULL; -- Put back the unique index CREATE UNIQUE INDEX "APP"."UNIQUE_DATABASE" ON "APP"."DBS" ("NAME", "CTLG_NAME"); -- Add the foreign key ALTER TABLE "APP"."DBS" ADD CONSTRAINT "DBS_FK1" FOREIGN KEY ("CTLG_NAME") REFERENCES "APP"."CTLGS" ("NAME") ON DELETE NO ACTION ON UPDATE NO ACTION;{code} Regarding the location, we need to store that because we need to use it when we create databases in a catalog. Currently Hive creates database locations by adding a directory named <dbname.db> default warehouse location (from the config file). But that won't work once we have multiple catalogs because two databases of the same name may exist in separate catalogs. So my plan is for each catalog to have a location (by default an HDFS directory, though of course it could be an S3 bucket or whatever) where database directories will be created. For the default 'hive' catalog that location will be the default warehouse location from the config file. I don't think there's any need to tie the catalog name and HDFS location. Unlike database and table I am not planning to allow the location to default to something, the user must specify it when creating a catalog. {quote}Is there a need to explicitly create 'hive' catalog - can catalogs be created on demand? {quote} Yes, because of the constraints being added to the RDBMS each database will have to be associated with a catalog. Plus it seems cleaner to explicitly have everything in a catalog. {quote} * When the administrator defines the security model, how does it stored/retrieved? Maybe it should be a catalog level information * It might be difficult to have the same user base / security model working for every connecting application, especially with transient clusters - maybe it is not an immediate concern, but it might be good to keep in mind.{quote} My plan is to store the security model for the catalog in the CTLGS table, though as you see above I haven't added that yet. I haven't finished the design on the security piece yet, and I agree that having varying security models inside the system, especially once we allow users to do cross catalog operations, will be challenging. But I believe it is a compelling enough feature that we will want it. > Add catalogs to metastore > ------------------------- > > Key: HIVE-18685 > URL: https://issues.apache.org/jira/browse/HIVE-18685 > Project: Hive > Issue Type: New Feature > Components: Metastore > Affects Versions: 3.0.0 > Reporter: Alan Gates > Assignee: Alan Gates > Priority: Major > Attachments: HMS Catalog Design Doc.pdf > > > SQL supports two levels of namespaces, called in the spec catalogs and > schemas (with schema being equivalent to Hive's database). I propose to add > the upper level of catalog. The attached design doc covers the use cases, > requirements, and brief discussion of how it will be implemented in a > backwards compatible way. -- This message was sent by Atlassian JIRA (v7.6.3#76005)