This is an automated email from the ASF dual-hosted git repository.

michaelsmith pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 7b502f7c965fe9236b034b2ef0ec86d9055c3901
Author: Joe McDonnell <[email protected]>
AuthorDate: Fri Jun 23 13:48:07 2023 -0700

    IMPALA-12240: Put gcc on the PATH when building the impala-python venv
    
    On some systems, we have seen the build for the impala-python
    virtualenv refer to system gcc directly, even though we have
    specified Impala toolchain's gcc via CC. When the system gcc
    is newer than Impala's gcc, it fails to execute because it needs
    symbols that are not present in Impala's libstdc++:
    
    gcc: /home/joe/impala/toolchain/toolchain-packages-gcc10.4.0/gcc-10.4.0/lib64/libstdc++.so.6: version `GLIBCXX_3.4.29' not found (required by gcc)
    
    This adds the toolchain gcc to the PATH when building the impala-python
    virtualenv. This means that any direct reference to gcc will use our
    compiler rather than system gcc. We continue to have CC pointed to
    our compiler.
    
    Testing:
     - Ran a build on Redhat 9 where the issue presented
    
    Change-Id: Ia5ddd6a88b41a3f8ba04d13538b3de2d9499cbf5
    Reviewed-on: http://gerrit.cloudera.org:8080/20114
    Reviewed-by: Michael Smith <[email protected]>
    Tested-by: Impala Public Jenkins <[email protected]>
---
 infra/python/bootstrap_virtualenv.py | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/infra/python/bootstrap_virtualenv.py b/infra/python/bootstrap_virtualenv.py
index 5a96ad3c6..f26bb84b2 100644
--- a/infra/python/bootstrap_virtualenv.py
+++ b/infra/python/bootstrap_virtualenv.py
@@ -177,6 +177,13 @@ def exec_pip_install(venv_dir, is_py3, args, cc="no-cc-available", env=None):
   toolchain_binutils_dir = toolchain_pkg_dir("binutils")
   binutils_bin_dir = os.path.join(toolchain_binutils_dir, "bin")
   env["PATH"] = "{0}:{1}".format(binutils_bin_dir, env["PATH"])
+  # Sometimes pip install invokes gcc directly without using the CC environment
+  # variable. If system GCC is too new, then it will fail, because it needs symbols
+  # that are not in Impala's libstdc++. To avoid this, we add GCC to the PATH,
+  # so any direct reference will use our GCC rather than the system GCC.
+  toolchain_gcc_dir = toolchain_pkg_dir("gcc")
+  gcc_bin_dir = os.path.join(toolchain_gcc_dir, "bin")
+  env["PATH"] = "{0}:{1}".format(gcc_bin_dir, env["PATH"])
 
   # Parallelize the slow numpy build.
   # Use getconf instead of nproc because it is supported more widely, e.g. on older
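
For readers who want to see why prepending a directory to PATH changes which gcc a direct lookup finds, here is a minimal standalone sketch. It is not part of the commit: the helper names (prepend_to_path, make_fake_tool, demo) and the two temporary directories standing in for the "system" and "toolchain" bin directories are hypothetical, and it assumes a POSIX-style system where an extensionless executable can be found on PATH.

```python
import os
import shutil
import stat
import tempfile


def prepend_to_path(env, bin_dir):
    """Return a copy of env with bin_dir placed first on PATH (hypothetical helper)."""
    new_env = dict(env)
    old_path = new_env.get("PATH", "")
    new_env["PATH"] = bin_dir + os.pathsep + old_path if old_path else bin_dir
    return new_env


def make_fake_tool(directory, name):
    """Create a tiny executable file so that a PATH lookup can find it."""
    path = os.path.join(directory, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n")
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path


def demo():
    """Return (resolved_path, toolchain_path) for a lookup of a stand-in 'gcc'."""
    with tempfile.TemporaryDirectory() as system_dir, \
         tempfile.TemporaryDirectory() as toolchain_dir:
        # Both directories contain a 'gcc'; only PATH order decides which one wins.
        make_fake_tool(system_dir, "gcc")
        toolchain_gcc = make_fake_tool(toolchain_dir, "gcc")
        env = {"PATH": system_dir}
        env = prepend_to_path(env, toolchain_dir)
        # shutil.which searches the given path string from left to right.
        resolved = shutil.which("gcc", path=env["PATH"])
        return resolved, toolchain_gcc


if __name__ == "__main__":
    resolved, expected = demo()
    print("lookup resolved to the prepended directory:", resolved == expected)
```

The sketch returns a modified copy of the environment rather than mutating it in place; the patch itself updates env["PATH"] directly, which is equally valid when the dict is already private to the pip invocation.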
