[
https://issues.apache.org/jira/browse/IMPALA-14129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17966293#comment-17966293
]
Joe McDonnell commented on IMPALA-14129:
----------------------------------------
Merged
[https://github.com/cloudera/native-toolchain/commit/a1f257d5b75745670d43af20d254f3f3260e7070]
to native-toolchain:
{noformat}
commit a1f257d5b75745670d43af20d254f3f3260e7070
Author: Joe McDonnell <[email protected]>
Date:   Thu Jun 5 16:42:08 2025 -0700

    IMPALA-14129: Patch hadoop-client to disable repository.apache.org

    This applies a patch on top of hadoop that disables two
    Maven repositories: repository.jboss.org and repository.apache.org

    The build does not actually need those repositories. All the
    artifacts needed for this build are available from central.

    Repeated requests to repository.apache.org are discouraged and
    can lead to an IP being banned.

    Applying a patch changes some of the directories names, so this
    needed further adjustments to handle that.

    Testing:
     - Ran ARM build and used that to build Impala on ARM
     - Verified that the hadoop-client build did not access
       repository.apache.org based on the logs

    Change-Id: I2a441c1dc2c43e5fdcd467486b50e531daff62eb
    Reviewed-on: http://gerrit.cloudera.org:8080/22992
    Reviewed-by: Michael Smith <[email protected]>
    Tested-by: Joe McDonnell <[email protected]>
{noformat}
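
The log check mentioned under Testing could be automated along the following lines. This is only a sketch, not the procedure actually used for the commit: the function name, the `ALLOWED_REPO_IDS` set, and the assumption that every fetch shows up as a Maven "Downloading from <repo-id>:" line (the format quoted in the issue description) are all illustrative.

```python
import re
from collections import Counter

# Matches the repository id in Maven lines such as
#   [INFO] Downloading from central: https://repo.maven.apache.org/...
# "Downloaded from ..." lines are deliberately not matched, so each
# request is counted once.
DOWNLOAD_RE = re.compile(r"Downloading from ([^\s:]+):")

# Assumption: "central" is the only repository id the build should use.
ALLOWED_REPO_IDS = {"central"}


def find_unexpected_downloads(log_text):
    """Count download attempts per repository id outside the allowed set."""
    hits = Counter()
    for match in DOWNLOAD_RE.finditer(log_text):
        repo_id = match.group(1)
        if repo_id not in ALLOWED_REPO_IDS:
            hits[repo_id] += 1
    return hits
```

Passing the text of a build log to `find_unexpected_downloads` returns an empty `Counter` when only central was contacted; any other repository id appears with the number of attempts made against it.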
> Native-toolchain's hadoop-client build should not contact Apache servers
> ------------------------------------------------------------------------
>
> Key: IMPALA-14129
> URL: https://issues.apache.org/jira/browse/IMPALA-14129
> Project: IMPALA
> Issue Type: Task
> Components: Infrastructure
> Affects Versions: Impala 5.0.0
> Reporter: Joe McDonnell
> Assignee: Joe McDonnell
> Priority: Major
>
> The hadoop-client build (needed for ARM) does not limit its dependency
> lookups to the central repository. It also attempts to download from
> repository.jboss.org and repository.apache.org:
> {noformat}
> [INFO] Downloading from apache.snapshots.https: https://repository.apache.org/content/repositories/snapshots/org/apache/apache/24/apache-24.pom
> [INFO] Downloading from repository.jboss.org: https://repository.jboss.org/nexus/content/groups/public/org/apache/apache/24/apache-24.pom
> [INFO] Downloading from central: https://repo.maven.apache.org/maven2/org/apache/apache/24/apache-24.pom
> [INFO] Downloaded from central: https://repo.maven.apache.org/maven2/org/apache/apache/24/apache-24.pom (20 kB at 140 kB/s)
> {noformat}
> Everything that it needs to download is available in central, so these extra
> requests serve no purpose. We should find a way to avoid contacting
> repository.apache.org. One option is to apply a patch to hadoop that
> disables those repositories so that the build only uses central.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)