This is an automated email from the ASF dual-hosted git repository. stigahuang pushed a commit to branch master in repository https://gitbox.apache.org/repos/asf/impala.git
commit 3a2f5f28c9709664ef31ea9b2b3675eba31f2d15 Author: Fang-Yu Rao <[email protected]> AuthorDate: Mon Mar 18 17:00:56 2024 -0700 IMPALA-12921, IMPALA-12985: Support running Impala with locally built Ranger The goals and non-goals of this patch could be summarized as follows. Goals: - Add changes to the minicluster configuration that allow a non-default version of Ranger (possibly built locally) to run in the context of the minicluster, and to be used as the authorization server by Impala. - Switch to the new constructor when instantiating RangerAccessRequestImpl. This resolves IMPALA-12985 and also makes Impala compatible with Apache Ranger if RangerAccessRequestImpl from Apache Ranger is consumed. - Prepare Ranger and Impala patches as supplemental material to verify what authorization-related tests could be passed if Apache Ranger is the authorization provider. Merging IMPALA-12921_addendum.diff to the Impala repository is not in the scope of this patch in that the diff file changes the behavior of Impala and thus more discussion is required if we'd like to merge it in the future. Non-goals: - Set up any automation for building Ranger from source. - Pass all Impala authorization-related tests with a non-default version of Ranger. Instructions on running Impala with locally built Ranger: Suppose the Ranger project is under the folder $RANGER_SRC_DIR. We could execute the following to build Apache Ranger for easy reference. By default, the compressed tarball is produced under $RANGER_SRC_DIR/target. mvn clean compile -B -nsu -DskipCheck=true -Dcheckstyle.skip=true \ package install -DskipITs -DskipTests -Dmaven.javadoc.skip=true After building Ranger, we need to build Impala's Java code so that Impala's Java code could consume the locally produced Ranger classes. We will need to export the following environment variables before building Impala. This prevents bootstrap_toolchain.py from trying to download the compressed Ranger tarball. 1. export RANGER_VERSION_OVERRIDE=\ $(mvn -f $RANGER_SRC_DIR/pom.xml -q help:evaluate \ -Dexpression=project.version -DforceStdout) 2. export RANGER_HOME_OVERRIDE=$RANGER_SRC_DIR/target/\ ranger-${RANGER_VERSION_OVERRIDE}-admin It then suffices to execute the following to point Impala to the locally built Ranger server before starting Impala. 1. source $IMPALA_HOME/bin/impala-config.sh 2. tar zxv -f $RANGER_SRC_DIR/target/\ ranger-${IMPALA_RANGER_VERSION}-admin.tar.gz \ -C $RANGER_SRC_DIR/target/ 3. $IMPALA_HOME/bin/create-test-configuration.sh 4. $IMPALA_HOME/bin/create-test-configuration.sh \ -create_ranger_policy_db 5. $IMPALA_HOME/testdata/bin/run-ranger.sh (run-all.sh has to be executed instead if other underlying services have not been started) 6. $IMPALA_HOME/testdata/bin/setup-ranger.sh Testing: - Manually verified that we could point Impala to a locally built Apache Ranger on the master branch (with tip being https://github.com/apache/ranger/commit/4abb993). - Manually verified that with RANGER-4771.diff and IMPALA-12921_addendum.diff, only 3 authorization-related tests failed. They failed because the resource type of 'storage-type' is not supported in Apache Ranger yet and thus the test cases added in IMPALA-10436 could fail. - Manually verified that the log files of Apache and CDP Ranger's Admin server could be created under ${RANGER_LOG_DIR} after we start the Ranger service. - Verified that this patch passed the core tests when CDP Ranger is used. Change-Id: I268d6d4d6e371da7497aac8d12f78178d57c6f27 Reviewed-on: http://gerrit.cloudera.org:8080/21160 Reviewed-by: Impala Public Jenkins <[email protected]> Tested-by: Impala Public Jenkins <[email protected]> --- README-build.md | 11 ++ bin/bootstrap_toolchain.py | 9 +- bin/create-test-configuration.sh | 18 ++ bin/impala-config.sh | 5 +- bin/rat_exclude_files.txt | 1 + .../ranger/RangerAuthorizationChecker.java | 21 ++- .../ranger/RangerImpaladAuthorizationManager.java | 8 +- testdata/bin/setup-ranger.sh | 46 ++++- testdata/cluster/ranger/IMPALA-12921_addendum.diff | 199 +++++++++++++++++++++ testdata/cluster/ranger/RANGER-4771.diff | 29 +++ testdata/cluster/ranger/README | 39 ++++ ...n => all_database_policy_revised.json.template} | 2 +- .../ranger/setup/impala_group_non_owner_2.json | 4 + .../setup/impala_user_non_owner.json.template | 1 + .../setup/impala_user_non_owner_2.json.template | 7 + .../ranger/setup/impala_user_owner.json.template | 1 + 16 files changed, 385 insertions(+), 16 deletions(-) diff --git a/README-build.md b/README-build.md index 798b7f7d0..89e30c7ad 100644 --- a/README-build.md +++ b/README-build.md @@ -75,7 +75,18 @@ impala-config.sh, they may cause confusion when switching between branches or ve Apache Impala. | Environment variable | Description | +|----------------------|-------------| | HIVE_VERSION_OVERRIDE | Used to specify different Hive version from default | | HIVE_STORAGE_API_VERSION_OVERRIDE | Used to specify different Hive Storage API version from default | | HIVE_METASTORE_THRIFT_DIR_OVERRIDE | Used to specify location of metastore thrift files to use during Thrift compilation | | HIVE_HOME_OVERRIDE | Used to specify location of Hive | + +## Ranger Dependency Overrides +Typically used together to specify a local build of Apache Ranger. Care should be taken +while using these variables since they take precedence over the defaults in +impala-config.sh. + +| Environment variable | Description | +|----------------------|-------------| +| RANGER_VERSION_OVERRIDE | Used to specify different Ranger version from default | +| RANGER_HOME_OVERRIDE | Used to specify location of Ranger | diff --git a/bin/bootstrap_toolchain.py b/bin/bootstrap_toolchain.py index 171aec497..670e2f447 100755 --- a/bin/bootstrap_toolchain.py +++ b/bin/bootstrap_toolchain.py @@ -526,12 +526,17 @@ def get_hadoop_downloads(): ranger = CdpComponent("ranger", archive_basename_tmpl="ranger-${version}-admin") use_override_hive = \ "HIVE_VERSION_OVERRIDE" in os.environ and os.environ["HIVE_VERSION_OVERRIDE"] != "" + use_override_ranger = \ + "RANGER_VERSION_OVERRIDE" in os.environ and \ + os.environ["RANGER_VERSION_OVERRIDE"] != "" # If we are using a locally built Hive we do not have a need to pull hive as a - # dependency + # dependency. The same applies to Ranger. cluster_components.extend([hadoop, hbase, ozone]) if not use_override_hive: cluster_components.extend([hive, hive_src]) - cluster_components.extend([tez, ranger]) + if not use_override_ranger: + cluster_components.extend([ranger]) + cluster_components.extend([tez]) return cluster_components diff --git a/bin/create-test-configuration.sh b/bin/create-test-configuration.sh index e8fbde807..cb4489149 100755 --- a/bin/create-test-configuration.sh +++ b/bin/create-test-configuration.sh @@ -243,6 +243,9 @@ popd RANGER_SERVER_CONF_DIR="${RANGER_HOME}/ews/webapp/WEB-INF/classes/conf" RANGER_SERVER_CONFDIST_DIR="${RANGER_HOME}/ews/webapp/WEB-INF/classes/conf.dist" RANGER_SERVER_LIB_DIR="${RANGER_HOME}/ews/webapp/WEB-INF/lib" +RANGER_ADMIN_LOGBACK_CONF_FILE="${RANGER_SERVER_CONFDIST_DIR}/logback.xml" +RANGER_ADMIN_LOG4J2_CONF_FILE="${RANGER_HOME}/ews/webapp/WEB-INF/log4j2.properties" +RANGER_LOG_DIR="${IMPALA_CLUSTER_LOGS_DIR}/ranger" if [[ ! -d "${RANGER_SERVER_CONF_DIR}" ]]; then mkdir -p "${RANGER_SERVER_CONF_DIR}" fi @@ -252,6 +255,21 @@ cp -f "${RANGER_TEST_CONF_DIR}/ranger-admin-env-logdir.sh" "${RANGER_SERVER_CONF cp -f "${RANGER_TEST_CONF_DIR}/ranger-admin-env-piddir.sh" "${RANGER_SERVER_CONF_DIR}" cp -f "${RANGER_SERVER_CONFDIST_DIR}/security-applicationContext.xml" \ "${RANGER_SERVER_CONF_DIR}" +# For Apache Ranger, we need logback.xml under ${RANGER_SERVER_CONF_DIR} so that the log +# files like ranger-admin-$(hostname)-$(whoami).log could be created under +# ${RANGER_LOG_DIR}. +if [[ -f ${RANGER_ADMIN_LOGBACK_CONF_FILE} ]]; then + cp -f ${RANGER_ADMIN_LOGBACK_CONF_FILE} ${RANGER_SERVER_CONF_DIR} +fi +# For CDP Ranger, we change the value of the property 'log.dir' in the corresponding +# log4j2.properties so that the log files like ranger-admin-server.log could be created +# under ${RANGER_LOG_DIR}. +if [[ -f ${RANGER_ADMIN_LOG4J2_CONF_FILE} ]]; then + # Use vertical bar instead of slash as the separator to prevent the slash(es) in + # ${RANGER_LOG_DIR} from interfering with the parsing of sed. + sed -i "s|property\.log\.dir=.*|property.log.dir=${RANGER_LOG_DIR}|g" \ + ${RANGER_ADMIN_LOG4J2_CONF_FILE} +fi # Prepend the following 5 URL's to the line starting with "<intercept-url pattern="/**"". # Before the end-to-end tests could be performed in a Kerberized environment diff --git a/bin/impala-config.sh b/bin/impala-config.sh index dc4277882..24676f998 100755 --- a/bin/impala-config.sh +++ b/bin/impala-config.sh @@ -383,7 +383,7 @@ export IMPALA_ICEBERG_VERSION=${CDP_ICEBERG_VERSION} export IMPALA_ICEBERG_URL=${CDP_ICEBERG_URL-} export IMPALA_KNOX_VERSION=${CDP_KNOX_VERSION} export IMPALA_PARQUET_VERSION=${CDP_PARQUET_VERSION} -export IMPALA_RANGER_VERSION=${CDP_RANGER_VERSION} +export IMPALA_RANGER_VERSION=${RANGER_VERSION_OVERRIDE:-"$CDP_RANGER_VERSION"} export IMPALA_RANGER_URL=${CDP_RANGER_URL-} export IMPALA_TEZ_VERSION=${CDP_TEZ_VERSION} export IMPALA_TEZ_URL=${CDP_TEZ_URL-} @@ -947,7 +947,8 @@ HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:${HADOOP_HOME}/share/hadoop/tools/lib/*" export PATH="$HADOOP_HOME/bin:$PATH" -export RANGER_HOME="${CDP_COMPONENTS_HOME}/ranger-${IMPALA_RANGER_VERSION}-admin" +export RANGER_HOME=\ +${RANGER_HOME_OVERRIDE:-"${CDP_COMPONENTS_HOME}/ranger-${IMPALA_RANGER_VERSION}-admin"} export RANGER_CONF_DIR="$IMPALA_HOME/fe/src/test/resources" # To configure Hive logging, there's a hive-log4j2.properties[.template] diff --git a/bin/rat_exclude_files.txt b/bin/rat_exclude_files.txt index 0ebe532d9..ea6b4882e 100644 --- a/bin/rat_exclude_files.txt +++ b/bin/rat_exclude_files.txt @@ -150,6 +150,7 @@ testdata/cluster/node_templates/cdh5/etc/hadoop/conf/*.xml.tmpl testdata/cluster/node_templates/common/etc/kudu/*.conf.tmpl testdata/cluster/node_templates/common/etc/hadoop/conf/*.xml.tmpl testdata/cluster/ranger/setup/*.json.template +testdata/cluster/ranger/*.diff testdata/data/binary_tbl/000000_0.txt testdata/data/chars-formats.txt testdata/data/chars-tiny.txt diff --git a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java index 56f009f0c..b5dadfdd6 100644 --- a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java +++ b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java @@ -496,8 +496,11 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker { .table(tableName) .column(columnName) .build(); - RangerAccessRequest req = new RangerAccessRequestImpl(resource, - SELECT_ACCESS_TYPE, user.getShortName(), getUserGroups(user)); + RangerAccessRequestImpl req = new RangerAccessRequestImpl(); + req.setResource(resource); + req.setAccessType(SELECT_ACCESS_TYPE); + req.setUser(user.getShortName()); + req.setUserGroups(getUserGroups(user)); // The method evalDataMaskPolicies() only checks whether there is a corresponding // column masking policy on the Ranger server and thus does not check whether the // requesting user/group is granted the necessary privilege on the specified @@ -517,8 +520,11 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker { .database(dbName) .table(tableName) .build(); - RangerAccessRequest req = new RangerAccessRequestImpl(resource, - SELECT_ACCESS_TYPE, user.getShortName(), getUserGroups(user)); + RangerAccessRequestImpl req = new RangerAccessRequestImpl(); + req.setResource(resource); + req.setAccessType(SELECT_ACCESS_TYPE); + req.setUser(user.getShortName()); + req.setUserGroups(getUserGroups(user)); return plugin_.evalRowFilterPolicies(req, auditHandler); } @@ -666,8 +672,11 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker { accessType = privilege.name().toLowerCase(); } } - RangerAccessRequestImpl request = new RangerAccessRequestImpl(resource, - accessType, user.getShortName(), getUserGroups(user)); + RangerAccessRequestImpl request = new RangerAccessRequestImpl(); + request.setResource(resource); + request.setAccessType(accessType); + request.setUser(user.getShortName()); + request.setUserGroups(getUserGroups(user)); request.setClusterName(plugin_.getClusterName()); if (authzCtx.getSessionState() != null) { request.setClientIPAddress( diff --git a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java index ea69df169..3747c5d74 100644 --- a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java +++ b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerImpaladAuthorizationManager.java @@ -336,9 +336,11 @@ public class RangerImpaladAuthorizationManager implements AuthorizationManager { List<RangerAccessRequest> requests = new ArrayList<>(); for (Map<String, String> resource : resources) { - requests.add(new RangerAccessRequestImpl( - new RangerAccessResourceImpl(Collections.unmodifiableMap(resource)), - RangerPolicyEngine.ANY_ACCESS, null, null)); + RangerAccessRequestImpl request = new RangerAccessRequestImpl(); + request.setResource( + new RangerAccessResourceImpl(Collections.unmodifiableMap(resource))); + request.setAccessType(RangerPolicyEngine.ANY_ACCESS); + requests.add(request); } return requests; } diff --git a/testdata/bin/setup-ranger.sh b/testdata/bin/setup-ranger.sh index 47309cc71..c8645dbed 100755 --- a/testdata/bin/setup-ranger.sh +++ b/testdata/bin/setup-ranger.sh @@ -47,6 +47,14 @@ function setup-ranger { python -c "import sys, json; print(json.load(sys.stdin)['id'])") export GROUP_ID_NON_OWNER + GROUP_ID_NON_OWNER_2=$(wget -qO - --auth-no-challenge --user=admin \ + --password=admin --post-file="${RANGER_SETUP_DIR}/impala_group_non_owner_2.json" \ + --header="accept:application/json" \ + --header="Content-Type:application/json" \ + http://localhost:6080/service/xusers/secure/groups | + python -c "import sys, json; print(json.load(sys.stdin)['id'])") + export GROUP_ID_NON_OWNER_2 + perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \ "${RANGER_SETUP_DIR}/impala_user_owner.json.template" > \ "${RANGER_SETUP_DIR}/impala_user_owner.json" @@ -55,6 +63,10 @@ function setup-ranger { "${RANGER_SETUP_DIR}/impala_user_non_owner.json.template" > \ "${RANGER_SETUP_DIR}/impala_user_non_owner.json" + perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \ + "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json.template" > \ + "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json" + if grep "\${[A-Z_]*}" "${RANGER_SETUP_DIR}/impala_user_owner.json"; then echo "Found undefined variables in ${RANGER_SETUP_DIR}/impala_user_owner.json." exit 1 @@ -65,6 +77,11 @@ function setup-ranger { exit 1 fi + if grep "\${[A-Z_]*}" "${RANGER_SETUP_DIR}/impala_user_non_owner_2.json"; then + echo "Found undefined variables in ${RANGER_SETUP_DIR}/impala_user_non_owner_2.json." + exit 1 + fi + wget -O /dev/null --auth-no-challenge --user=admin --password=admin \ --post-file="${RANGER_SETUP_DIR}/impala_user_owner.json" \ --header="Content-Type:application/json" \ @@ -75,15 +92,40 @@ function setup-ranger { --header="Content-Type:application/json" \ http://localhost:6080/service/xusers/secure/users + wget -O /dev/null --auth-no-challenge --user=admin --password=admin \ + --post-file="${RANGER_SETUP_DIR}/impala_user_non_owner_2.json" \ + --header="Content-Type:application/json" \ + http://localhost:6080/service/xusers/secure/users + wget -O /dev/null --auth-no-challenge --user=admin --password=admin \ --post-file="${RANGER_SETUP_DIR}/impala_service.json" \ --header="Content-Type:application/json" \ http://localhost:6080/service/public/v2/api/service + # The policy id corresponding to all the databases and tables is 4 in Apache Ranger, + # whereas it is 5 in CDP Ranger. Getting the policy id via the following API call + # makes this script more resilient to the change in the policy id. + ALL_DATABASE_POLICY_ID=$(curl -u admin:admin -X GET \ + http://localhost:6080/service/public/v2/api/service/test_impala/policy/\ +all%20-%20database \ + -H 'accept: application/json' | \ + python -c "import sys, json; print(json.load(sys.stdin)['id'])") + export ALL_DATABASE_POLICY_ID + + perl -wpl -e 's/\$\{([^}]+)\}/defined $ENV{$1} ? $ENV{$1} : $&/eg' \ + "${RANGER_SETUP_DIR}/all_database_policy_revised.json.template" > \ + "${RANGER_SETUP_DIR}/all_database_policy_revised.json" + + if grep "\${[A-Z_]*}" "${RANGER_SETUP_DIR}/all_database_policy_revised.json"; then + echo "Found undefined variables in \ + ${RANGER_SETUP_DIR}/all_database_policy_revised.json." + exit 1 + fi + curl -f -u admin:admin -H "Accept: application/json" \ -H "Content-Type: application/json" \ - -X PUT http://localhost:6080/service/public/v2/api/policy/5 \ - -d @"${RANGER_SETUP_DIR}/policy_5_revised.json" + -X PUT http://localhost:6080/service/public/v2/api/policy/${ALL_DATABASE_POLICY_ID} \ + -d @"${RANGER_SETUP_DIR}/all_database_policy_revised.json" } setup-ranger diff --git a/testdata/cluster/ranger/IMPALA-12921_addendum.diff b/testdata/cluster/ranger/IMPALA-12921_addendum.diff new file mode 100644 index 000000000..aea42e01c --- /dev/null +++ b/testdata/cluster/ranger/IMPALA-12921_addendum.diff @@ -0,0 +1,199 @@ +diff --git a/be/src/util/backend-gflag-util.cc b/be/src/util/backend-gflag-util.cc +index f1b94562b..c7368ef90 100644 +--- a/be/src/util/backend-gflag-util.cc ++++ b/be/src/util/backend-gflag-util.cc +@@ -272,6 +272,9 @@ DEFINE_double_hidden(query_cpu_root_factor, 1.5, + "(Advance) The Nth root value to control sublinear scale down of unbounded " + "cpu requirement for executor group set selection."); + ++DEFINE_bool(support_storage_type_in_ranger, true, ++ "If true, support the resource type of 'storage-type' added in IMPALA-10436."); ++ + using strings::Substitute; + + namespace impala { +@@ -486,6 +489,7 @@ Status PopulateThriftBackendGflags(TBackendGflags& cfg) { + cfg.__set_dbcp_max_conn_pool_size(FLAGS_dbcp_max_conn_pool_size); + cfg.__set_dbcp_max_wait_millis_for_conn(FLAGS_dbcp_max_wait_millis_for_conn); + cfg.__set_dbcp_data_source_idle_timeout(FLAGS_dbcp_data_source_idle_timeout_s); ++ cfg.__set_support_storage_type_in_ranger(FLAGS_support_storage_type_in_ranger); + return Status::OK(); + } + +diff --git a/common/thrift/BackendGflags.thrift b/common/thrift/BackendGflags.thrift +index e51b6bad4..5ca65000f 100644 +--- a/common/thrift/BackendGflags.thrift ++++ b/common/thrift/BackendGflags.thrift +@@ -304,4 +304,6 @@ struct TBackendGflags { + 136: required i32 dbcp_max_wait_millis_for_conn + + 137: required i32 dbcp_data_source_idle_timeout ++ ++ 138: required bool support_storage_type_in_ranger + } +diff --git a/fe/pom.xml b/fe/pom.xml +index 9dd29855a..3227b5c09 100644 +--- a/fe/pom.xml ++++ b/fe/pom.xml +@@ -221,6 +221,19 @@ under the License. + <groupId>com.amazonaws</groupId> + <artifactId>aws-java-sdk-bundle</artifactId> + </exclusion> ++ <!-- Exclude the hive-storage-api dependency. If we are using a locally built ++ ranger-plugins-audit that may pull in a hive-storage-api of incompatible ++ version, the compilation of MetastoreShim.java could fail because the class ++ TableName is not defined in this version of hive-storage-api. This is the case ++ if the version is 2.7.2, which is still used by Apache Ranger on the master ++ branch (with tip being RANGER-4745). A compatible hive-storage-api of version ++ ${hive.storage.api.version} will still be pulled in via hive-standalone-metastore ++ since this specific version is added as an exception to the excludes in the ++ banned dependencies. --> ++ <exclusion> ++ <groupId>org.apache.hive</groupId> ++ <artifactId>hive-storage-api</artifactId> ++ </exclusion> + </exclusions> + </dependency> + +diff --git a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java +index b5dadfdd6..5f98a8449 100644 +--- a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java ++++ b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerAuthorizationChecker.java +@@ -104,8 +104,13 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker { + resources.add(new RangerImpalaResourceBuilder().uri("*").build()); + if (privilege == Privilege.ALL || privilege == Privilege.OWNER || + privilege == Privilege.RWSTORAGE) { +- resources.add(new RangerImpalaResourceBuilder() +- .storageType("*").storageUri("*").build()); ++ // Do not add a resource of type 'storage-type' so that the privilege on this ++ // resource type will not be checked if this type is not supported by the ++ // Ranger service. ++ if (BackendConfig.INSTANCE.supportStorageTypeInRanger()) { ++ resources.add(new RangerImpalaResourceBuilder() ++ .storageType("*").storageUri("*").build()); ++ } + } + break; + case DB: +@@ -159,10 +164,15 @@ public class RangerAuthorizationChecker extends BaseAuthorizationChecker { + .build()); + break; + case STORAGEHANDLER_URI: +- resources.add(new RangerImpalaResourceBuilder() +- .storageType(authorizable.getStorageType()) +- .storageUri(authorizable.getStorageUri()) +- .build()); ++ // Do not add a resource of type 'storage-type' so that the privilege on this ++ // resource type will not be checked if this type is not supported by the Ranger ++ // service. ++ if (BackendConfig.INSTANCE.supportStorageTypeInRanger()) { ++ resources.add(new RangerImpalaResourceBuilder() ++ .storageType(authorizable.getStorageType()) ++ .storageUri(authorizable.getStorageUri()) ++ .build()); ++ } + break; + default: + throw new IllegalArgumentException(String.format("Invalid authorizable type: %s", +diff --git a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java +index 809f2928b..7082b638f 100644 +--- a/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java ++++ b/fe/src/main/java/org/apache/impala/authorization/ranger/RangerCatalogdAuthorizationManager.java +@@ -29,6 +29,7 @@ import org.apache.impala.catalog.CatalogServiceCatalog; + import org.apache.impala.common.ImpalaException; + import org.apache.impala.common.InternalException; + import org.apache.impala.common.UnsupportedFeatureException; ++import org.apache.impala.service.BackendConfig; + import org.apache.impala.thrift.TCatalogServiceRequestHeader; + import org.apache.impala.thrift.TCreateDropRoleParams; + import org.apache.impala.thrift.TDdlExecResponse; +@@ -409,13 +410,25 @@ public class RangerCatalogdAuthorizationManager implements AuthorizationManager + requests.add(createRequest.apply(RangerUtil.createColumnResource(p))); + requests.add(createRequest.apply(RangerUtil.createFunctionResource(p))); + } else if (p.getStorage_url() != null || p.getStorage_type() != null) { +- requests.add(createRequest.apply(RangerUtil.createStorageHandlerUriResource(p))); ++ // Do not send a GrantRevokeRequest for a resource of type 'storage-type' if such ++ // a type is not supported by the Ranger service. This prevents the Ranger ++ // service from throwing an exception. ++ if (BackendConfig.INSTANCE.supportStorageTypeInRanger()) { ++ requests.add(createRequest.apply(RangerUtil.createStorageHandlerUriResource(p))); ++ } else { ++ LOG.warn("Resource type of 'storage-type' in Ranger is not supported"); ++ } + } else { + // Server is used by column, function, URI, and storage handler URI resources. + requests.add(createRequest.apply(RangerUtil.createColumnResource(p))); + requests.add(createRequest.apply(RangerUtil.createFunctionResource(p))); + requests.add(createRequest.apply(RangerUtil.createUriResource(p))); +- requests.add(createRequest.apply(RangerUtil.createStorageHandlerUriResource(p))); ++ // Do not send a GrantRevokeRequest for a resource of type 'storage-type' if such ++ // a type is not supported by the Ranger service. This prevents the Ranger ++ // service from throwing an exception. ++ if (BackendConfig.INSTANCE.supportStorageTypeInRanger()) { ++ requests.add(createRequest.apply(RangerUtil.createStorageHandlerUriResource(p))); ++ } + } + } + +diff --git a/fe/src/main/java/org/apache/impala/service/BackendConfig.java b/fe/src/main/java/org/apache/impala/service/BackendConfig.java +index a050fe016..4a362784d 100644 +--- a/fe/src/main/java/org/apache/impala/service/BackendConfig.java ++++ b/fe/src/main/java/org/apache/impala/service/BackendConfig.java +@@ -518,4 +518,10 @@ public class BackendConfig { + public int getDbcpDataSourceIdleTimeoutInSeconds() { + return backendCfg_.dbcp_data_source_idle_timeout; + } ++ ++ public boolean supportStorageTypeInRanger() { return backendCfg_.support_storage_type_in_ranger; } ++ ++ public void setSupportStorageTypeInRanger(boolean supportStorageTypeInRanger) { ++ backendCfg_.setSupport_storage_type_in_ranger(supportStorageTypeInRanger); ++ } + } +diff --git a/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java b/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java +index 364628456..cdff0d5b0 100644 +--- a/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java ++++ b/fe/src/test/java/org/apache/impala/authorization/AuthorizationTestBase.java +@@ -32,6 +32,7 @@ import org.apache.impala.catalog.Type; + import org.apache.impala.common.AnalysisException; + import org.apache.impala.common.FrontendTestBase; + import org.apache.impala.common.ImpalaException; ++import org.apache.impala.service.BackendConfig; + import org.apache.impala.service.Frontend; + import org.apache.impala.testutil.CatalogServiceTestCatalog; + import org.apache.impala.testutil.ImpaladTestCatalog; +@@ -135,6 +136,7 @@ public abstract class AuthorizationTestBase extends FrontendTestBase { + rangerRestClient_ = new RangerRESTClient(RANGER_ADMIN_URL, null, + rangerImpalaPlugin_.getConfig()); + rangerRestClient_.setBasicAuthInfo(RANGER_USER, RANGER_PASSWORD); ++ BackendConfig.INSTANCE.setSupportStorageTypeInRanger(false); + break; + default: + throw new IllegalArgumentException(String.format( +diff --git a/tests/authorization/test_ranger.py b/tests/authorization/test_ranger.py +index f83680614..a6fcf39bb 100644 +--- a/tests/authorization/test_ranger.py ++++ b/tests/authorization/test_ranger.py +@@ -47,14 +47,18 @@ RANGER_AUTH = ("admin", "admin") + RANGER_HOST = "http://localhost:6080" + REST_HEADERS = {"Content-Type": "application/json", "Accept": "application/json"} + IMPALAD_ARGS = "--server-name=server1 --ranger_service_type=hive " \ +- "--ranger_app_id=impala --authorization_provider=ranger" ++ "--ranger_app_id=impala --authorization_provider=ranger " \ ++ "--support_storage_type_in_ranger=false" + CATALOGD_ARGS = "--server-name=server1 --ranger_service_type=hive " \ +- "--ranger_app_id=impala --authorization_provider=ranger" ++ "--ranger_app_id=impala --authorization_provider=ranger " \ ++ "--support_storage_type_in_ranger=false" + + LOCAL_CATALOG_IMPALAD_ARGS = "--server-name=server1 --ranger_service_type=hive " \ +- "--ranger_app_id=impala --authorization_provider=ranger --use_local_catalog=true" ++ "--ranger_app_id=impala --authorization_provider=ranger --use_local_catalog=true " \ ++ "--support_storage_type_in_ranger=false" + LOCAL_CATALOG_CATALOGD_ARGS = "--server-name=server1 --ranger_service_type=hive " \ +- "--ranger_app_id=impala --authorization_provider=ranger --catalog_topic_mode=minimal" ++ "--ranger_app_id=impala --authorization_provider=ranger " \ ++ "--catalog_topic_mode=minimal --support_storage_type_in_ranger=false" + + LOG = logging.getLogger('impala_test_suite') + diff --git a/testdata/cluster/ranger/RANGER-4771.diff b/testdata/cluster/ranger/RANGER-4771.diff new file mode 100644 index 000000000..ee94b6a85 --- /dev/null +++ b/testdata/cluster/ranger/RANGER-4771.diff @@ -0,0 +1,29 @@ +diff --git a/security-admin/src/main/java/org/apache/ranger/rest/ServiceREST.java b/security-admin/src/main/java/org/apache/ranger/rest/ServiceREST.java +index a6c759234..e85cf1a3c 100644 +--- a/security-admin/src/main/java/org/apache/ranger/rest/ServiceREST.java ++++ b/security-admin/src/main/java/org/apache/ranger/rest/ServiceREST.java +@@ -1283,7 +1283,6 @@ public RESTResponse grantAccess(@PathParam("serviceName") String serviceName, Gr + + if(policyUpdated) { + policy.setZoneName(zoneName); +- ensureAdminAccess(policy); + svcStore.updatePolicy(policy); + } else { + LOG.error("processGrantRequest processing failed"); +@@ -1321,7 +1320,6 @@ public RESTResponse grantAccess(@PathParam("serviceName") String serviceName, Gr + policy.getPolicyItems().add(policyItem); + policy.setZoneName(zoneName); + +- ensureAdminAccess(policy); + svcStore.createPolicy(policy); + } + } catch(WebApplicationException excp) { +@@ -1525,8 +1523,6 @@ public RESTResponse revokeAccess(@PathParam("serviceName") String serviceName, G + if(policyUpdated) { + policy.setZoneName(zoneName); + +- ensureAdminAccess(policy); +- + svcStore.updatePolicy(policy); + } else { + LOG.error("processRevokeRequest processing failed"); diff --git a/testdata/cluster/ranger/README b/testdata/cluster/ranger/README new file mode 100644 index 000000000..e66c4f9b4 --- /dev/null +++ b/testdata/cluster/ranger/README @@ -0,0 +1,39 @@ +RANGER-4771.diff and IMPALA-12921_addendum.diff are provided with IMPALA-12921 to +facilitate the verification of the results of authorization-related tests when we build +Apache Impala on the master branch (with tip being +https://github.com/apache/impala/commit/1233ac3) with Apache Ranger on the master branch +(with tip being https://github.com/apache/ranger/commit/4abb993). +After applying these 2 patches, we only found the following 3 failed tests. They failed +because the support for storage handler privileges is not yet supported by Apache Ranger +and hence the test cases added in IMPALA-10436 could fail. + +1. TestRanger::test_show_grant() in test_ranger.py. +2. TestRanger::test_grant_revoke_with_role() in test_ranger.py. +3. AuthorizationStmtTest#testCreateTable() in AuthorizationStmtTest.java. + +In what follows we give more details regarding why these 2 patches are required to verify +the results of authorization-related tests when we build Apache Impala with Apache +Ranger. + +RANGER-4771.diff should be applied to Apache Ranger on the master branch (with tip being +https://github.com/apache/ranger/commit/4abb993) so that an Impala administrator could +manage policies using GRANT/REVOKE statements in the Impala shell in the non-Kerberized +environment. Recall that currently in Impala's development environment, we run the +authorization-related tests without Kerberos. + +IMPALA-12921_addendum.diff should be applied due to the following reasons. + +1. fe/pom.xml needs to be patched to prevent the compilation of Impala's frontend from + failing. This is done by excluding the hive-storage-api dependency that is + transitively pulled in by ranger-plugins-audit because the locally built + ranger-plugins-audit could pull in an incompatible version of hive-storage-api. +2. Various files have to be patched to add a startup flag + 'support_storage_type_in_ranger'. This flag is needed because when it is set to false, + a) the catalog server will not send to the Ranger server a GrantRevokeRequest + involving the resource type of 'storage-type' (e.g., in GRANT ALL ON SERVER TO USER + <grantee>), and b) the coordinator will skip the check for the privileges on resources + of type 'storage-type' (e.g., when the ALL privilege on SERVER is registered in a + query). +3. AuthorizationTestBase.java and test_ranger.py have to be patched to set + 'support_storage_type_in_ranger' to false when we start the coordinators and the + catalog server. diff --git a/testdata/cluster/ranger/setup/policy_5_revised.json b/testdata/cluster/ranger/setup/all_database_policy_revised.json.template similarity index 98% rename from testdata/cluster/ranger/setup/policy_5_revised.json rename to testdata/cluster/ranger/setup/all_database_policy_revised.json.template index 762e7eace..a8ac6232f 100644 --- a/testdata/cluster/ranger/setup/policy_5_revised.json +++ b/testdata/cluster/ranger/setup/all_database_policy_revised.json.template @@ -5,7 +5,7 @@ "denyExceptions": [], "denyPolicyItems": [], "description": "Policy for all - database", - "id": 5, + "id": ${ALL_DATABASE_POLICY_ID}, "isAuditEnabled": true, "isDenyAllElse": false, "isEnabled": true, diff --git a/testdata/cluster/ranger/setup/impala_group_non_owner_2.json b/testdata/cluster/ranger/setup/impala_group_non_owner_2.json new file mode 100644 index 000000000..c031a0d8a --- /dev/null +++ b/testdata/cluster/ranger/setup/impala_group_non_owner_2.json @@ -0,0 +1,4 @@ +{ + "name" : "non_owner_2", + "description" : "" +} diff --git a/testdata/cluster/ranger/setup/impala_user_non_owner.json.template b/testdata/cluster/ranger/setup/impala_user_non_owner.json.template index fad287ea6..f0003a95a 100644 --- a/testdata/cluster/ranger/setup/impala_user_non_owner.json.template +++ b/testdata/cluster/ranger/setup/impala_user_non_owner.json.template @@ -1,5 +1,6 @@ { "name" : "non_owner", + "firstName" : "First name of non_owner", "password" : "Password123", "userRoleList" : [ "ROLE_USER" ], "groupIdList" : [ "${GROUP_ID_NON_OWNER}" ] diff --git a/testdata/cluster/ranger/setup/impala_user_non_owner_2.json.template b/testdata/cluster/ranger/setup/impala_user_non_owner_2.json.template new file mode 100644 index 000000000..c3fc4dfa4 --- /dev/null +++ b/testdata/cluster/ranger/setup/impala_user_non_owner_2.json.template @@ -0,0 +1,7 @@ +{ + "name" : "non_owner_2", + "firstName" : "First name of non_owner_2", + "password" : "Password123", + "userRoleList" : [ "ROLE_USER" ], + "groupIdList" : [ "${GROUP_ID_NON_OWNER_2}" ] +} diff --git a/testdata/cluster/ranger/setup/impala_user_owner.json.template b/testdata/cluster/ranger/setup/impala_user_owner.json.template index bcb368e38..8646ae69b 100644 --- a/testdata/cluster/ranger/setup/impala_user_owner.json.template +++ b/testdata/cluster/ranger/setup/impala_user_owner.json.template @@ -1,5 +1,6 @@ { "name" : "${USER}", + "firstName" : "First name of ${USER}", "password" : "Password123", "userRoleList" : [ "ROLE_USER" ], "groupIdList" : [ "${GROUP_ID_OWNER}" ]
