Title: [163570] trunk/Tools
Revision: 163570
Author: [email protected]
Date: 2014-02-06 16:03:15 -0800 (Thu, 06 Feb 2014)

Log Message

Add support for multiple sources for AutoInstaller
https://bugs.webkit.org/show_bug.cgi?id=124848

Patch by Jozsef Berta <[email protected]> on 2014-02-06
Reviewed by Ryosuke Niwa.

The autoinstaller in webkitpy currently fails if the download source of a package is unavailable.
This patch adds support for multiple sources to the script. The sources are provided in three environment variables.
The script first looks in a local cache, if one is configured. If the package is not there, it tries to download it from the original URL.
If that fails, it takes a mirror from the corresponding environment variable (one for sourceforge.net and one for pypi.python.org).
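
As a rough illustration of the three environment variables described above, the following Python 3 sketch reads them from a mapping. The variable names and mirror hostnames come from the patch; the helper `read_sources`, the example cache path, and the filtering of empty entries are assumptions made for illustration and are not part of the committed code.

```python
import os

# Environment variable names as defined in the patch.
PYPI_ENV_VAR = 'PYPI_MIRRORS'
SOURCEFORGE_ENV_VAR = 'SOURCEFORGE_MIRRORS'
CACHE_ENV_VAR = 'LOCAL_AUTOINSTALL_CACHE'


def read_sources(env=os.environ):
    """Hypothetical helper: return (cache_dir, sourceforge_mirrors, pypi_mirrors).

    cache_dir is None when no local cache is configured. Mirror lists are
    colon-separated hostnames; empty entries are dropped (an assumption of
    this sketch, the patch itself does not filter them).
    """
    def split(name):
        return [host for host in env.get(name, '').split(':') if host]

    return env.get(CACHE_ENV_VAR), split(SOURCEFORGE_ENV_VAR), split(PYPI_ENV_VAR)


# Example values; the cache path is made up for this sketch.
example_env = {
    'PYPI_MIRRORS': 'pypi.hustunique.com',
    'SOURCEFORGE_MIRRORS': 'aarnet.dl.sourceforge.net:citylan.dl.sourceforge.net',
    'LOCAL_AUTOINSTALL_CACHE': '/tmp/autoinstall-cache',
}
cache_dir, sourceforge_mirrors, pypi_mirrors = read_sources(example_env)
```

Passing the environment as a parameter keeps the sketch testable without touching the real process environment.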

* Scripts/webkitpy/common/system/autoinstall.py:
(AutoInstaller._copy_unpackaged_files_from_local_cache): If the package is not an archive (neither zipped nor tarred),
this method copies it from the cache to the scratch directory. Without the copy, the cached file would be deleted, which we do not want.
(AutoInstaller._prepare_package): If the package is not zipped or tarred, and the file is in the cache, the
_copy_unpackaged_files_from_local_cache function is called.
(AutoInstaller._parse_colon_separated_mirrors_from_env): This reads the mirrors from the environment variables, if they are set,
and prepares them for further use.
(AutoInstaller):
(AutoInstaller._replace_domain_with_next_mirror): If the original download url fails, it is replaced by a mirror provided
in the environment variables. The function identifies the original source, and replaces it with a mirror. If it can't be done,
the return url will be None, indicating that no mirrors are provided, or none of them could be reached.
(AutoInstaller._download_to_stream): The timeout for one try is now limited to 30 seconds. Without this, the script waited for
roughly 4 minutes before retrying. After three failures the script will try to switch to a mirror.
(AutoInstaller._check_package_in_local_autoinstall_cache): This method searches the cache for the module that is about to be downloaded.
If it's found there, its path is returned.
(AutoInstaller._download): Before downloading, the module is looked up in the cache. If it's not found there,
the script will continue with the download, and cache the module.
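
The mirror switching described for `_replace_domain_with_next_mirror` can be sketched in Python 3 as follows. This is a re-expression under stated assumptions, not the committed code: the patch targets Python 2 (`urlparse`, `iterator.next()`), while this sketch uses `urllib.parse` and `next()`, returns as soon as a matching source is handled, and uses a made-up download URL.

```python
import re
from urllib.parse import urlparse, urlunparse

# Same patterns as _MIRROR_REGEXS in the patch: SourceForge first, then PyPI.
MIRROR_REGEXES = (re.compile(r'.*sourceforge.*'), re.compile(r'.*pypi.*'))


def build_mirror_table(sourceforge_mirrors, pypi_mirrors):
    """Pair each source pattern with an iterator over its mirror hostnames."""
    # list() is needed because zip() is lazy in Python 3 and the table is
    # scanned once per failed source.
    iterators = (iter(sourceforge_mirrors), iter(pypi_mirrors))
    return list(zip(MIRROR_REGEXES, iterators))


def replace_domain_with_next_mirror(url, mirrors):
    """Return url with its host swapped for the next unused mirror, or None."""
    parts = list(urlparse(url))
    for regex, hosts in mirrors:
        if regex.match(parts[1]):  # parts[1] is the network location (host)
            next_host = next(hosts, None)
            if next_host is None:
                return None  # ran out of mirrors for this source
            parts[1] = next_host
            return urlunparse(parts)
    return None  # the URL does not belong to a known source


# Hypothetical download URL, used only for illustration.
mirrors = build_mirror_table(['aarnet.dl.sourceforge.net'], [])
first = replace_domain_with_next_mirror(
    'http://downloads.sourceforge.net/project/example/pkg.zip', mirrors)
second = replace_domain_with_next_mirror(first, mirrors)
```

Because each source keeps a single iterator, every mirror is tried at most once per download; once the iterator is exhausted the function returns `None`, which is the signal to stop retrying and report the error.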

Modified Paths

trunk/Tools/ChangeLog
trunk/Tools/Scripts/webkitpy/common/system/autoinstall.py

Diff

Modified: trunk/Tools/ChangeLog (163569 => 163570)


--- trunk/Tools/ChangeLog	2014-02-06 23:45:57 UTC (rev 163569)
+++ trunk/Tools/ChangeLog	2014-02-07 00:03:15 UTC (rev 163570)
@@ -1,3 +1,33 @@
+2014-02-06  Jozsef Berta  <[email protected]>
+
+        Add support for multiple sources for AutoInstaller
+        https://bugs.webkit.org/show_bug.cgi?id=124848
+
+        Reviewed by Ryosuke Niwa.
+
+        The autoinstaller in webkitpy currently fails if the download source of a package is unavailable.
+        This patch adds support for multiple sources to the script. The sources are provided in three environment variables.
+        The script first looks in a local cache, if one is configured. If the package is not there, it tries to download it from the original URL.
+        If that fails, it takes a mirror from the corresponding environment variable (one for sourceforge.net and one for pypi.python.org).
+
+        * Scripts/webkitpy/common/system/autoinstall.py:
+        (AutoInstaller._copy_unpackaged_files_from_local_cache):  If the package is not packaged in its original form,
+        this method will copy it to the scratch directory. Otherwise it would be deleted from the cache, which we do not want.
+        (AutoInstaller._prepare_package): If the package is not zipped or tarred, and the file is in the cache, the
+        _copy_unpackaged_files_from_local_cache function is called.
+        (AutoInstaller._parse_colon_separated_mirrors_from_env): This will read the mirrors from the environment variables if possible,
+        and prepares them for further use.
+        (AutoInstaller):
+        (AutoInstaller._replace_domain_with_next_mirror): If the original download url fails, it is replaced by a mirror provided
+        in the environment variables. The function identifies the original source, and replaces it with a mirror. If it can't be done,
+        the return url will be None, indicating that no mirrors are provided, or none of them could be reached.
+        (AutoInstaller._download_to_stream): The timeout for one try is now limited to 30 seconds. Without this, the script waited for
+        roughly 4 minutes before retrying. After three failures the script will try to switch to a mirror.
+        (AutoInstaller._check_package_in_local_autoinstall_cache): This method searches the cache for the currently downloaded module.
+        If it's found there, its path is returned.
+        (AutoInstaller._download):  Before downloading the module, it is looked up in the cache. If it's not found there,
+        the script will continue with the download, and cache the module.
+
 2014-02-06  Benjamin Poulain  <[email protected]>
 
         Remove run-test-webkit-api

Modified: trunk/Tools/Scripts/webkitpy/common/system/autoinstall.py (163569 => 163570)


--- trunk/Tools/Scripts/webkitpy/common/system/autoinstall.py	2014-02-06 23:45:57 UTC (rev 163569)
+++ trunk/Tools/Scripts/webkitpy/common/system/autoinstall.py	2014-02-07 00:03:15 UTC (rev 163570)
@@ -42,10 +42,17 @@
 import urllib2
 import urlparse
 import zipfile
+import re
+from distutils import dir_util
+from glob import glob
+import urlparse
 
 _log = logging.getLogger(__name__)
+_MIRROR_REGEXS = re.compile('.*sourceforge.*'), re.compile('.*pypi.*')
+_PYPI_ENV_VAR = 'PYPI_MIRRORS'
+_SOURCEFORGE_ENV_VAR = 'SOURCEFORGE_MIRRORS'
+_CACHE_ENV_VAR = 'LOCAL_AUTOINSTALL_CACHE'
 
-
 class AutoInstaller(object):
 
     """Supports automatically installing Python packages from an URL.
@@ -259,6 +266,14 @@
 
         return target_path
 
+    def _copy_unpackaged_files_from_local_cache(self, path, scratch_dir):
+
+        target_basename = os.path.basename(path)
+        target_path = os.path.join(scratch_dir, target_basename)
+
+        shutil.copy(path, target_path)
+        return target_path
+
     def _prepare_package(self, path, scratch_dir):
         """Prepare a package for use, if necessary, and return the new path.
 
@@ -277,26 +292,68 @@
             new_path = self._unzip(path, scratch_dir)
         elif path.endswith(".tar.gz"):
             new_path = self._extract_targz(path, scratch_dir)
+        elif _CACHE_ENV_VAR in os.environ:
+            new_path = path
+            if os.path.dirname(path) == os.path.normpath(os.environ[_CACHE_ENV_VAR]):
+                new_path = self._copy_unpackaged_files_from_local_cache(path, scratch_dir)
         else:
             # No preparation is needed.
             new_path = path
 
         return new_path
 
+    def _parse_colon_separated_mirrors_from_env(self):
+        """
+        Pypi mirror example: PYPI_MIRRORS=pypi.hustunique.com...
+        Sourceforge mirror example: SOURCEFORGE_MIRRORS=aarnet.dl.sourceforge.net:citylan.dl.sourceforge.net...
+        Mirror sources: http://www.pypi-mirrors.org/, http://sourceforge.net/apps/trac/sourceforge/wiki/Mirrors
+        """
+        try:
+            pypi_mirrors_list = os.environ[_PYPI_ENV_VAR].split(':')
+        except(KeyError):
+            pypi_mirrors_list = ()
+
+        try:
+            sourceforge_mirrors_list = os.environ[_SOURCEFORGE_ENV_VAR].split(':')
+        except(KeyError):
+            sourceforge_mirrors_list = ()
+
+        mirroriterators = iter(sourceforge_mirrors_list), iter(pypi_mirrors_list)
+        return zip(_MIRROR_REGEXS, mirroriterators)
+
+    def _replace_domain_with_next_mirror(self, url, mirrors):
+        parsed_url = list(urlparse.urlparse(url))
+        new_url = None
+        try:
+            for regex, addresses in mirrors:
+                if regex.match(parsed_url[1]):
+                    parsed_url[1] = addresses.next()
+                    new_url = urlparse.urlunparse(parsed_url)
+        except StopIteration, e:
+            _log.info('Ran out of mirrors.')
+
+        return new_url
+
     def _download_to_stream(self, url, stream):
+        mirrors = self._parse_colon_separated_mirrors_from_env()
         failures = 0
         while True:
             try:
-                netstream = urllib2.urlopen(url)
+                netstream = urllib2.urlopen(url, timeout=30)
                 break
             except IOError, err:
                 # Try multiple times
-                if failures < 5:
+                if failures < 2:
                     _log.warning("Failed to download %s, %s retrying" % (
                         url, err))
                     failures += 1
                     continue
 
+                url = self._replace_domain_with_next_mirror(url, mirrors)
+                if url:
+                    failures = 0
+                    continue
+
                 # Append existing Error message to new Error.
                 message = ('Could not download Python modules from URL "%s".\n'
                            " Make sure you are connected to the internet.\n"
@@ -319,15 +376,31 @@
             stream.write(data)
         netstream.close()
 
+    def _check_package_in_local_autoinstall_cache(self, filename):
+        if _CACHE_ENV_VAR not in os.environ:
+            return False
+        path = glob(os.path.join(os.environ[_CACHE_ENV_VAR], filename) + '*')
+        if not path:
+            return False
+
+        return path[0]
+
     def _download(self, url, scratch_dir):
         url_path = urlparse.urlsplit(url)[2]
         url_path = os.path.normpath(url_path)  # Removes trailing slash.
         target_filename = os.path.basename(url_path)
+
+        cache = self._check_package_in_local_autoinstall_cache(target_filename)
+        if cache:
+            return cache
+
         target_path = os.path.join(scratch_dir, target_filename)
-
         with open(target_path, "wb") as stream:
             self._download_to_stream(url, stream)
 
+        if _CACHE_ENV_VAR in os.environ:
+            dir_util.copy_tree(scratch_dir, os.environ[_CACHE_ENV_VAR])
+
         return target_path
 
     def _install(self, scratch_dir, package_name, target_path, url, url_subpath, files_to_remove):
