author: Adam J. Stewart <ajstewart426@gmail.com> 2017-01-31 10:14:52 -0600
committer: Todd Gamblin <tgamblin@llnl.gov> 2017-01-31 11:14:52 -0500
commit: 123f057089547d79d6f308bc47698be936aa1cb5
tree: e8a8e5b7da2e974a69d42126fb5a9bedb95c57d8 /lib
parent: 2e81fe4fb3a8982932f16c212f2a0732ee1766ea
Refactor Spack's URL parsing commands (#2938)
* Replace `spack urls` and `spack url-parse` with `spack url`
* Allow spack url list to only list incorrect parsings
* Add spack url test reporting
* Add unit tests for new URL commands
Diffstat (limited to 'lib')
-rw-r--r-- lib/spack/docs/developer_guide.rst 102
-rw-r--r-- lib/spack/docs/packaging_guide.rst 197
-rw-r--r-- lib/spack/spack/cmd/url.py 319
-rw-r--r-- lib/spack/spack/cmd/url_parse.py 79
-rw-r--r-- lib/spack/spack/cmd/urls.py 59
-rw-r--r-- lib/spack/spack/test/cmd/url.py 116
-rw-r--r-- lib/spack/spack/url.py 237
7 files changed, 797 insertions, 312 deletions
diff --git a/lib/spack/docs/developer_guide.rst b/lib/spack/docs/developer_guide.rst
index 5ddbaf2478..dbb9a670b4 100644
--- a/lib/spack/docs/developer_guide.rst
+++ b/lib/spack/docs/developer_guide.rst
@@ -300,6 +300,42 @@ Stage objects
Writing commands
----------------
+Adding a new command to Spack is easy. Simply add a ``<name>.py`` file to
+``lib/spack/spack/cmd/``, where ``<name>`` is the name of the subcommand.
+At the bare minimum, two functions are required in this file:
+
+^^^^^^^^^^^^^^^^^^
+``setup_parser()``
+^^^^^^^^^^^^^^^^^^
+
+Unless your command doesn't accept any arguments, a ``setup_parser()``
+function is required to define what arguments and flags your command takes.
+See the `Argparse documentation <https://docs.python.org/2.7/library/argparse.html>`_
+for more details on how to add arguments.
+
+Some commands have a set of subcommands, like ``spack compiler find`` or
+``spack module refresh``. You can add subparsers to your parser to handle
+this. Check out ``spack edit --command compiler`` for an example of this.
+
+A lot of commands take the same arguments and flags. These arguments should
+be defined in ``lib/spack/spack/cmd/common/arguments.py`` so that they don't
+need to be redefined in multiple commands.
+
+^^^^^^^^^^^^
+``<name>()``
+^^^^^^^^^^^^
+
+In order to run your command, Spack searches for a function with the same
+name as your command in ``<name>.py``. This is the main method for your
+command, and can call other helper methods to handle common tasks.
+
+Remember, before adding a new command, think to yourself whether or not this
+new command is actually necessary. Sometimes, the functionality you desire
+can be added to an existing command. Also remember to add unit tests for
+your command. If it isn't used very frequently, changes to the rest of
+Spack can cause your command to break without sufficient unit tests to
+prevent this from happening.
+
----------
Unit tests
----------
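The "Writing commands" text added in the hunk above describes a command module as a ``setup_parser()`` function plus a function named after the command. A minimal standalone sketch of that shape follows. The command name ``hello``, its ``--shout`` flag, and the ``run_command`` driver are hypothetical and exist only for illustration; inside Spack the module would live in ``lib/spack/spack/cmd/`` and Spack itself would locate and call the functions.

```python
import argparse

description = "print a greeting (hypothetical example command)"


def setup_parser(subparser):
    """Declare the arguments and flags the command accepts."""
    subparser.add_argument('name', help='who to greet')
    subparser.add_argument(
        '-s', '--shout', action='store_true',
        help='print the greeting in upper case')


def hello(parser, args):
    """Main function of the command; it shares its name with the command."""
    greeting = 'Hello, {0}!'.format(args.name)
    if args.shout:
        greeting = greeting.upper()
    print(greeting)
    return greeting


def run_command(argv):
    """Stand-in for Spack's dispatcher (an assumption, not Spack code)."""
    parser = argparse.ArgumentParser(prog='hello', description=description)
    setup_parser(parser)
    args = parser.parse_args(argv)
    return hello(parser, args)


if __name__ == '__main__':
    run_command(['world', '--shout'])
```

Returning the greeting is not something the guide asks for; it is done here only so the sketch is easy to test, in the same spirit as ``url_list`` returning a count "only for testing purposes".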
@@ -312,14 +348,80 @@ Unit testing
Developer commands
------------------
+.. _cmd-spack-doc:
+
^^^^^^^^^^^^^
``spack doc``
^^^^^^^^^^^^^
+.. _cmd-spack-test:
+
^^^^^^^^^^^^^^
``spack test``
^^^^^^^^^^^^^^
+.. _cmd-spack-url:
+
+^^^^^^^^^^^^^
+``spack url``
+^^^^^^^^^^^^^
+
+A package containing a single URL can be used to download several different
+versions of the package. If you've ever wondered how this works, all of the
+magic is in :mod:`spack.url`. This module contains methods for extracting
+the name and version of a package from its URL. The name is used by
+``spack create`` to guess the name of the package. By determining the version
+from the URL, Spack can replace it with other versions to determine where to
+download them from.
+
+The regular expressions in ``parse_name_offset`` and ``parse_version_offset``
+are used to extract the name and version, but they aren't perfect. In order
+to debug Spack's URL parsing support, the ``spack url`` command can be used.
+
+"""""""""""""""""""
+``spack url parse``
+"""""""""""""""""""
+
+If you need to debug a single URL, you can use the following command:
+
+.. command-output:: spack url parse http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz
+
+The name and version in this URL are detected correctly, and the output
+shows which regular expressions matched. However, when Spack substitutes
+the version number, it does not replace the ``2.2`` directory component
+with ``9.9``, so the substituted URL does not point to where version
+``9.9.9b`` would actually live. A package like this may require a
+``list_url`` attribute or a ``url_for_version`` function.
+
+This command also accepts a ``--spider`` flag. If provided, Spack searches
+for other versions of the package and prints the matching URLs.
+
+""""""""""""""""""
+``spack url list``
+""""""""""""""""""
+
+This command lists every URL in every package in Spack. If given the
+``--color`` flag, it also colors the parts of each URL that it detected
+to be the name and version; ``--extrapolation`` additionally colors the
+parts used for version extrapolation. The ``--incorrect-name`` and
+``--incorrect-version`` flags, which cannot be combined, restrict the
+output to URLs whose name or version was parsed incorrectly.
+
+""""""""""""""""""
+``spack url test``
+""""""""""""""""""
+
+This command attempts to parse every URL for every package in Spack
+and prints a summary of how many of them are being correctly parsed.
+It also prints a histogram showing which regular expressions are being
+matched and how frequently:
+
+.. command-output:: spack url test
+
+This command is essential for anyone adding or changing the regular
+expressions that parse names and versions. By running this command
+before and after the change, you can make sure that your regular
+expression fixes more packages than it breaks.
+
---------
Profiling
---------
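The ``spack url`` text added above says that the name and version are extracted from a URL with regular expressions, and that the version is then replaced to guess where other releases live. A much-simplified sketch of that idea follows. The single pattern and the ``guess_name_version`` and ``substitute`` helpers are assumptions made for illustration; the real logic in ``spack.url`` tries many patterns and is considerably more careful.

```python
import re

# One illustrative pattern: "<name>-<version>.tar.gz" as the file name.
ARCHIVE = re.compile(
    r'(?P<name>[A-Za-z][A-Za-z0-9_+]*)-(?P<version>\d+(?:\.\d+)*)\.tar\.gz$')


def guess_name_version(url):
    """Return (name, version) guessed from the final path component."""
    match = ARCHIVE.search(url.rsplit('/', 1)[-1])
    if match is None:
        raise ValueError('cannot parse {0}'.format(url))
    return match.group('name'), match.group('version')


def substitute(url, new_version):
    """Replace the version in the file name only (hypothetical helper)."""
    name, version = guess_name_version(url)
    head, tail = url.rsplit('/', 1)
    return head + '/' + tail.replace(version, new_version)


url = 'http://cache.ruby-lang.org/pub/ruby/2.2/ruby-2.2.0.tar.gz'
print(guess_name_version(url))
print(substitute(url, '9.9.9b'))
```

In this sketch the ``2.2`` directory component is left untouched after substitution, which mirrors the limitation the guide points out for the Ruby URL and is the kind of case where ``list_url`` or ``url_for_version`` helps.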
diff --git a/lib/spack/docs/packaging_guide.rst b/lib/spack/docs/packaging_guide.rst
index 75546d943e..41d4289636 100644
--- a/lib/spack/docs/packaging_guide.rst
+++ b/lib/spack/docs/packaging_guide.rst
@@ -712,8 +712,8 @@ is at ``http://example.com/downloads/foo-1.0.tar.gz``, Spack will look
in ``http://example.com/downloads/`` for links to additional versions.
If you need to search another path for download links, you can supply
some extra attributes that control how your package finds new
-versions. See the documentation on `attribute_list_url`_ and
-`attribute_list_depth`_.
+versions. See the documentation on :ref:`attribute_list_url` and
+:ref:`attribute_list_depth`.
.. note::
@@ -728,6 +728,102 @@ versions. See the documentation on `attribute_list_url`_ and
syntax errors, or the ``import`` will fail. Use this once you've
got your package in working order.
+--------------------
+Finding new versions
+--------------------
+
+You've already seen the ``homepage`` and ``url`` package attributes:
+
+.. code-block:: python
+ :linenos:
+
+ from spack import *
+
+
+ class Mpich(Package):
+ """MPICH is a high performance and widely portable implementation of
+ the Message Passing Interface (MPI) standard."""
+ homepage = "http://www.mpich.org"
+ url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
+
+These are class-level attributes used by Spack to show users
+information about the package, and to determine where to download its
+source code.
+
+Spack uses the tarball URL to extrapolate where to find other tarballs
+of the same package (e.g. in :ref:`cmd-spack-checksum`), but
+this does not always work. This section covers ways you can tell
+Spack to find tarballs elsewhere.
+
+.. _attribute_list_url:
+
+^^^^^^^^^^^^
+``list_url``
+^^^^^^^^^^^^
+
+When Spack tries to find available versions of packages (e.g. with
+:ref:`cmd-spack-checksum`), it spiders the parent directory
+of the tarball in the ``url`` attribute. For example, for libelf, the
+url is:
+
+.. code-block:: python
+
+ url = "http://www.mr511.de/software/libelf-0.8.13.tar.gz"
+
+Here, Spack spiders ``http://www.mr511.de/software/`` to find similar
+tarball links and ultimately to make a list of available versions of
+``libelf``.
+
+For many packages, the tarball's parent directory may be unlistable,
+or it may not contain any links to source code archives. In fact,
+many times additional package downloads aren't even available in the
+same directory as the download URL.
+
+For these, you can specify a separate ``list_url`` indicating the page
+to search for tarballs. For example, ``libdwarf`` has the homepage as
+the ``list_url``, because that is where links to old versions are:
+
+.. code-block:: python
+ :linenos:
+
+ class Libdwarf(Package):
+ homepage = "http://www.prevanders.net/dwarf.html"
+ url = "http://www.prevanders.net/libdwarf-20130729.tar.gz"
+ list_url = homepage
+
+.. _attribute_list_depth:
+
+^^^^^^^^^^^^^^
+``list_depth``
+^^^^^^^^^^^^^^
+
+``libdwarf`` and many other packages have a listing of available
+versions on a single webpage, but not all do. For example, ``mpich``
+has a tarball URL that looks like this:
+
+.. code-block:: python
+
+ url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
+
+But its downloads are in many different subdirectories of
+``http://www.mpich.org/static/downloads/``. So, we need to add a
+``list_url`` *and* a ``list_depth`` attribute:
+
+.. code-block:: python
+ :linenos:
+
+ class Mpich(Package):
+ homepage = "http://www.mpich.org"
+ url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
+ list_url = "http://www.mpich.org/static/downloads/"
+ list_depth = 2
+
+By default, Spack only looks at the top-level page available at
+``list_url``. ``list_depth`` tells it to follow up to 2 levels of
+links from the top-level page. Note that here, this implies two
+levels of subdirectories, as the ``mpich`` website is structured much
+like a filesystem. But ``list_depth`` really refers to link depth
+when spidering the page.
.. _vcs-fetch:
@@ -1241,103 +1337,6 @@ RPATHs in Spack are handled in one of three ways:
links. You can see this how this is used in the :ref:`PySide
example <pyside-patch>` above.
---------------------
-Finding new versions
---------------------
-
-You've already seen the ``homepage`` and ``url`` package attributes:
-
-.. code-block:: python
- :linenos:
-
- from spack import *
-
-
- class Mpich(Package):
- """MPICH is a high performance and widely portable implementation of
- the Message Passing Interface (MPI) standard."""
- homepage = "http://www.mpich.org"
- url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
-
-These are class-level attributes used by Spack to show users
-information about the package, and to determine where to download its
-source code.
-
-Spack uses the tarball URL to extrapolate where to find other tarballs
-of the same package (e.g. in :ref:`cmd-spack-checksum`, but
-this does not always work. This section covers ways you can tell
-Spack to find tarballs elsewhere.
-
-.. _attribute_list_url:
-
-^^^^^^^^^^^^
-``list_url``
-^^^^^^^^^^^^
-
-When spack tries to find available versions of packages (e.g. with
-:ref:`cmd-spack-checksum`), it spiders the parent directory
-of the tarball in the ``url`` attribute. For example, for libelf, the
-url is:
-
-.. code-block:: python
-
- url = "http://www.mr511.de/software/libelf-0.8.13.tar.gz"
-
-Here, Spack spiders ``http://www.mr511.de/software/`` to find similar
-tarball links and ultimately to make a list of available versions of
-``libelf``.
-
-For many packages, the tarball's parent directory may be unlistable,
-or it may not contain any links to source code archives. In fact,
-many times additional package downloads aren't even available in the
-same directory as the download URL.
-
-For these, you can specify a separate ``list_url`` indicating the page
-to search for tarballs. For example, ``libdwarf`` has the homepage as
-the ``list_url``, because that is where links to old versions are:
-
-.. code-block:: python
- :linenos:
-
- class Libdwarf(Package):
- homepage = "http://www.prevanders.net/dwarf.html"
- url = "http://www.prevanders.net/libdwarf-20130729.tar.gz"
- list_url = homepage
-
-.. _attribute_list_depth:
-
-^^^^^^^^^^^^^^
-``list_depth``
-^^^^^^^^^^^^^^
-
-``libdwarf`` and many other packages have a listing of available
-versions on a single webpage, but not all do. For example, ``mpich``
-has a tarball URL that looks like this:
-
-.. code-block:: python
-
- url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
-
-But its downloads are in many different subdirectories of
-``http://www.mpich.org/static/downloads/``. So, we need to add a
-``list_url`` *and* a ``list_depth`` attribute:
-
-.. code-block:: python
- :linenos:
-
- class Mpich(Package):
- homepage = "http://www.mpich.org"
- url = "http://www.mpich.org/static/downloads/3.0.4/mpich-3.0.4.tar.gz"
- list_url = "http://www.mpich.org/static/downloads/"
- list_depth = 2
-
-By default, Spack only looks at the top-level page available at
-``list_url``. ``list_depth`` tells it to follow up to 2 levels of
-links from the top-level page. Note that here, this implies two
-levels of subdirectories, as the ``mpich`` website is structured much
-like a filesystem. But ``list_depth`` really refers to link depth
-when spidering the page.
-
.. _attribute_parallel:
---------------
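The relocated ``list_depth`` section explains that Spack follows links from the ``list_url`` page only up to a limited depth. The sketch below illustrates depth-limited link following on a small in-memory link table instead of a real website. The ``LINKS`` table, the ``spider`` and ``tarballs`` helpers, and the convention that a depth of 1 means "read only the top-level page" are assumptions for illustration, not Spack's implementation.

```python
from collections import deque

# Hypothetical site layout: page -> links found on that page.
LINKS = {
    'downloads/': ['downloads/3.0.4/', 'downloads/3.1/'],
    'downloads/3.0.4/': ['downloads/3.0.4/mpich-3.0.4.tar.gz'],
    'downloads/3.1/': ['downloads/3.1/mpich-3.1.tar.gz'],
}


def spider(root, depth):
    """Collect links seen while reading pages at most ``depth`` levels deep.

    A depth of 1 reads only ``root``; a depth of 2 also reads the pages
    that ``root`` links to, and so on.
    """
    found = set()
    visited = {root}
    queue = deque([(root, 1)])
    while queue:
        page, level = queue.popleft()
        for link in LINKS.get(page, []):
            found.add(link)
            if level < depth and link not in visited:
                visited.add(link)
                queue.append((link, level + 1))
    return found


def tarballs(links):
    """Keep only links that look like source archives."""
    return sorted(link for link in links if link.endswith('.tar.gz'))


print(tarballs(spider('downloads/', 1)))
print(tarballs(spider('downloads/', 2)))
```

With this layout, reading only the top-level page finds the version subdirectories but no archives; one more level is needed to reach the tarballs, which is the situation the ``mpich`` example describes.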
diff --git a/lib/spack/spack/cmd/url.py b/lib/spack/spack/cmd/url.py
new file mode 100644
index 0000000000..6823f0febd
--- /dev/null
+++ b/lib/spack/spack/cmd/url.py
@@ -0,0 +1,319 @@
+##############################################################################
+# Copyright (c) 2013-2016, Lawrence Livermore National Security, LLC.
+# Produced at the Lawrence Livermore National Laboratory.
+#
+# This file is part of Spack.
+# Created by Todd Gamblin, tgamblin@llnl.gov, All rights reserved.
+# LLNL-CODE-647188
+#
+# For details, see https://github.com/llnl/spack
+# Please also see the LICENSE file for our notice and the LGPL.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU Lesser General Public License (as
+# published by the Free Software Foundation) version 2.1, February 1999.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the IMPLIED WARRANTY OF
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the terms and
+# conditions of the GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+##############################################################################
+from __future__ import division, print_function
+
+from collections import defaultdict
+
+import spack
+
+from llnl.util import tty
+from spack.url import *
+from spack.util.web import find_versions_of_archive
+
+description = "debugging tool for url parsing"
+
+
+def setup_parser(subparser):
+ sp = subparser.add_subparsers(metavar='SUBCOMMAND', dest='subcommand')
+
+ # Parse
+ parse_parser = sp.add_parser('parse', help='attempt to parse a url')
+
+ parse_parser.add_argument(
+ 'url',
+ help='url to parse')
+ parse_parser.add_argument(
+ '-s', '--spider', action='store_true',
+ help='spider the source page for versions')
+
+ # List
+ list_parser = sp.add_parser('list', help='list urls in all packages')
+
+ list_parser.add_argument(
+ '-c', '--color', action='store_true',
+ help='color the parsed version and name in the urls shown '
+ '(versions will be cyan, name red)')
+ list_parser.add_argument(
+ '-e', '--extrapolation', action='store_true',
+ help='color the versions used for extrapolation as well '
+ '(additional versions will be green, names magenta)')
+
+ excl_args = list_parser.add_mutually_exclusive_group()
+
+ excl_args.add_argument(
+ '-n', '--incorrect-name', action='store_true',
+ help='only list urls for which the name was incorrectly parsed')
+ excl_args.add_argument(
+ '-v', '--incorrect-version', action='store_true',
+ help='only list urls for which the version was incorrectly parsed')
+
+ # Test
+ sp.add_parser(
+ 'test', help='print a summary of how well we are parsing package urls')
+
+
+def url(parser, args):
+ action = {
+ 'parse': url_parse,
+ 'list': url_list,
+ 'test': url_test
+ }
+
+ action[args.subcommand](args)
+
+
+def url_parse(args):
+ url = args.url
+
+ tty.msg('Parsing URL: {0}'.format(url))
+ print()
+
+ ver, vs, vl, vi, vregex = parse_version_offset(url)
+ tty.msg('Matched version regex {0:>2}: r{1!r}'.format(vi, vregex))
+
+ name, ns, nl, ni, nregex = parse_name_offset(url, ver)
+ tty.msg('Matched name regex {0:>2}: r{1!r}'.format(ni, nregex))
+
+ print()
+ tty.msg('Detected:')
+ try:
+ print_name_and_version(url)
+ except UrlParseError as e:
+ tty.error(str(e))
+
+ print(' name: {0}'.format(name))
+ print(' version: {0}'.format(ver))
+ print()
+
+ tty.msg('Substituting version 9.9.9b:')
+ newurl = substitute_version(url, '9.9.9b')
+ print_name_and_version(newurl)
+
+ if args.spider:
+ print()
+ tty.msg('Spidering for versions:')
+ versions = find_versions_of_archive(url)
+
+ max_len = max(len(str(v)) for v in versions)
+
+ for v in sorted(versions):
+ print('{0:{1}} {2}'.format(v, max_len, versions[v]))
+
+
+def url_list(args):
+ urls = set()
+
+ # Gather set of URLs from all packages
+ for pkg in spack.repo.all_packages():
+ url = getattr(pkg.__class__, 'url', None)
+ urls = url_list_parsing(args, urls, url, pkg)
+
+ for params in pkg.versions.values():
+ url = params.get('url', None)
+ urls = url_list_parsing(args, urls, url, pkg)
+
+ # Print URLs
+ for url in sorted(urls):
+ if args.color or args.extrapolation:
+ print(color_url(url, subs=args.extrapolation, errors=True))
+ else:
+ print(url)
+
+ # Return the number of URLs that were printed, only for testing purposes
+ return len(urls)
+
+
+def url_test(args):
+ # Collect statistics on how many URLs were correctly parsed
+ total_urls = 0
+ correct_names = 0
+ correct_versions = 0
+
+ # Collect statistics on which regexes were matched and how often
+ name_regex_dict = dict()
+ name_count_dict = defaultdict(int)
+ version_regex_dict = dict()
+ version_count_dict = defaultdict(int)
+
+ tty.msg('Generating a summary of URL parsing in Spack...')
+
+ # Loop through all packages
+ for pkg in spack.repo.all_packages():
+ urls = set()
+
+ url = getattr(pkg.__class__, 'url', None)
+ if url:
+ urls.add(url)
+
+ for params in pkg.versions.values():
+ url = params.get('url', None)
+ if url:
+ urls.add(url)
+
+ # Calculate statistics
+ for url in urls:
+ total_urls += 1
+
+ # Parse versions
+ version = None
+ try:
+ version, vs, vl, vi, vregex = parse_version_offset(url)
+ version_regex_dict[vi] = vregex
+ version_count_dict[vi] += 1
+ if version_parsed_correctly(pkg, version):
+ correct_versions += 1
+ except UndetectableVersionError:
+ pass
+
+ # Parse names
+ try:
+ name, ns, nl, ni, nregex = parse_name_offset(url, version)
+ name_regex_dict[ni] = nregex
+ name_count_dict[ni] += 1
+ if name_parsed_correctly(pkg, name):
+ correct_names += 1
+ except UndetectableNameError:
+ pass
+
+ print()
+ print(' Total URLs found: {0}'.format(total_urls))
+ print(' Names correctly parsed: {0:>4}/{1:>4} ({2:>6.2%})'.format(
+ correct_names, total_urls, correct_names / total_urls))
+ print(' Versions correctly parsed: {0:>4}/{1:>4} ({2:>6.2%})'.format(
+ correct_versions, total_urls, correct_versions / total_urls))
+ print()
+
+ tty.msg('Statistics on name regular expressions:')
+
+ print()
+ print(' Index Count Regular Expression')
+ for ni in name_regex_dict:
+ print(' {0:>3}: {1:>6} r{2!r}'.format(
+ ni, name_count_dict[ni], name_regex_dict[ni]))
+ print()
+
+ tty.msg('Statistics on version regular expressions:')
+
+ print()
+ print(' Index Count Regular Expression')
+ for vi in version_regex_dict:
+ print(' {0:>3}: {1:>6} r{2!r}'.format(
+ vi, version_count_dict[vi], version_regex_dict[vi]))
+ print()
+
+ # Return statistics, only for testing purposes
+ return (total_urls, correct_names, correct_versions,
+ name_count_dict, version_count_dict)
+
+
+def print_name_and_version(url):
+ """Prints a URL. Underlines the detected name with dashes and
+ the detected version with tildes.
+
+ :param str url: The url to parse
+ """
+ name, ns, nl, ntup, ver, vs, vl, vtup = substitution_offsets(url)
+ underlines = [' '] * max(ns + nl, vs + vl)
+ for i in range(ns, ns + nl):
+ underlines[i] = '-'
+ for i in range(vs, vs + vl):
+ underlines[i] = '~'
+
+ print(' {0}'.format(url))
+ print(' {0}'.format(''.join(underlines)))
+
+
+def url_list_parsing(args, urls, url, pkg):
+ """Helper function for :func:`url_list`.
+
+ :param argparse.Namespace args: The arguments given to ``spack url list``
+ :param set urls: Set of URLs that have already been added
+ :param url: A URL to potentially add to ``urls`` depending on ``args``
+ :type url: str or None
+ :param spack.package.PackageBase pkg: The Spack package
+ :returns: The updated ``urls`` set
+ :rtype: set
+ """
+ if url:
+ if args.incorrect_name:
+ # Only add URLs whose name was incorrectly parsed
+ try:
+ name = parse_name(url)
+ if not name_parsed_correctly(pkg, name):
+ urls.add(url)
+ except UndetectableNameError:
+ urls.add(url)
+ elif args.incorrect_version:
+ # Only add URLs whose version was incorrectly parsed
+ try:
+ version = parse_version(url)
+ if not version_parsed_correctly(pkg, version):
+ urls.add(url)
+ except UndetectableVersionError:
+ urls.add(url)
+ else:
+ urls.add(url)
+
+ return urls
+
+
+def name_parsed_correctly(pkg, name):
+ """Determine if the name of a package was correctly parsed.
+
+ :param spack.package.PackageBase pkg: The Spack package
+ :param str name: The name that was extracted from the URL
+ :returns: True if the name was correctly parsed, else False
+ :rtype: bool
+ """
+ pkg_name = pkg.name
+
+ # After determining a name, `spack create` determines a build system.
+ # Some build systems prepend a special string to the front of the name.
+ # Since this can't be guessed from the URL, it would be unfair to say
+ # that these names are incorrectly parsed, so we remove them.
+ if pkg_name.startswith('r-'):
+ pkg_name = pkg_name[2:]
+ elif pkg_name.startswith('py-'):
+ pkg_name = pkg_name[3:]
+ elif pkg_name.startswith('octave-'):
+ pkg_name = pkg_name[7:]
+
+ return name == pkg_name
+
+
+def version_parsed_correctly(pkg, version):
+ """Determine if the version of a package was correctly parsed.
+
+ :param spack.package.PackageBase pkg: The Spack package
+ :param str version: The version that was extracted from the URL
+ :returns: True if the version was correctly parsed, else False
+ :rtype: bool
+ """
+ # If the version parsed from the URL is listed in a version()
+ # directive, we assume it was correctly parsed
+ for pkg_version in pkg.versions:
+ if str(pkg_version) == str(version):
+ return True
+ return False
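``print_name_and_version`` in the file above underlines the detected name with dashes and the detected version with tildes, using the offsets returned by ``substitution_offsets``. The sketch below reproduces only that underlining step. The ``underline`` helper is hypothetical, and the offsets are computed with ``str.index`` on a made-up URL instead of Spack's regular expressions, so it illustrates the output format and not the parser.

```python
def underline(url, name_start, name_len, ver_start, ver_len):
    """Build the marker line that is printed beneath a URL."""
    marks = [' '] * max(name_start + name_len, ver_start + ver_len)
    for i in range(name_start, name_start + name_len):
        marks[i] = '-'
    for i in range(ver_start, ver_start + ver_len):
        marks[i] = '~'
    return ''.join(marks)


# Made-up URL; the offsets are found by plain substring search.
url = 'http://example.com/foo-1.0.tar.gz'
ns, nl = url.index('foo'), len('foo')
vs, vl = url.index('1.0'), len('1.0')

print(url)
print(underline(url, ns, nl, vs, vl))
```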
diff --git a/lib/spack/spack/cmd/url_parse.py b/lib/spack/spack/cmd/url_parse.py
deleted file mode 100644
index b33d96299f..0000000000
--- a/lib/spack/spack/cmd/url_parse.py
+++ /dev/null
@@ -1,79 +0,0 @@
-##############################################################################
-# Copyright (c) 2013-2016, Lawrence Livermore National Security, LLC.
-# Produced at the Lawrence Livermore National Laboratory.
-#
-# This file is part of Spack.
-# Created by Todd Gamblin, tgamblin@llnl.gov, All rights reserved.
-# LLNL-CODE-647188
-#
-# For details, see https://github.com/llnl/spack
-# Please also see the LICENSE file for our notice and the LGPL.
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU Lesser General Public License (as
-# published by the Free Software Foundation) version 2.1, February 1999.
-#
-# This program is distributed in the hope that it will be useful, but
-# WITHOUT ANY WARRANTY; without even the IMPLIED WARRANTY OF
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the terms and
-# conditions of the GNU Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with this program; if not, write to the Free Software
-# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-##############################################################################
-import llnl.util.tty as tty
-
-import spack
-import spack.url
-from spack.util.web import find_versions_of_archive
-
-description = "show parsing of a URL, optionally spider web for versions"
-
-
-def setup_parser(subparser):
- subparser.add_argument('url', help="url of a package archive")
- subparser.add_argument(
- '-s', '--spider', action='store_true',
- help="spider the source page for versions")
-
-
-def print_name_and_version(url):
- name, ns, nl, ntup, ver, vs, vl, vtup = spack.url.substitution_offsets(url)
- underlines = [" "] * max(ns + nl, vs + vl)
- for i in range(ns, ns + nl):
- underlines[i] = '-'
- for i in range(vs, vs + vl):
- underlines[i] = '~'
-
- print " %s" % url
- print " %s" % ''.join(underlines)
-
-
-def url_parse(parser, args):
- url = args.url
-
- ver, vs, vl = spack.url.parse_version_offset(url, debug=True)
- name, ns, nl = spack.url.parse_name_offset(url, ver, debug=True)
- print
-
- tty.msg("Detected:")
- try:
- print_name_and_version(url)
- except spack.url.UrlParseError as e:
- tty.error(str(e))
-
- print ' name: %s' % name
- print ' version: %s' % ver
-
- print
- tty.msg("Substituting version 9.9.9b:")
- newurl = spack.url.substitute_version(url, '9.9.9b')
- print_name_and_version(newurl)
-
- if args.spider:
- print
- tty.msg("Spidering for versions:")
- versions = find_versions_of_archive(url)
- for v in sorted(versions):
- print "%-20s%s" % (v, versions[v])
diff --git a/lib/spack/spack/cmd/urls.py b/lib/spack/spack/cmd/urls.py
deleted file mode 100644
index 4ff23e69c1..0000000000
--- a/lib/spack/spack/cmd/urls.py
+++ /dev/null
@@ -1,59 +0,0 @@
-##############################################################################
-# Copyright (c) 2013-2016, Lawrence Livermore National Security, LLC.
-# Produced at the Lawrence Livermore National Laboratory.
-#
-# This file is part of Spack.
-# Created by Todd Gamblin, tgamblin@llnl.gov, All rights reserved.
-# LLNL-CODE-647188
-#
-# For details, see https://github.com/llnl/spack
-# Please also see the LICENSE file for our notice and the LGPL.
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU Lesser General Public License (as
-# published by the Free Software Foundation) version 2.1, February 1999.
-#
-# This program is distributed in the hope that it will be useful, but
-# WITHOUT ANY WARRANTY; without even the IMPLIED WARRANTY OF
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the terms and
-# conditions of the GNU Lesser General Public License for more details.
-#
-# You should have received a copy of the GNU Lesser General Public
-# License along with this program; if not, write to the Free Software
-# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
-##############################################################################
-import spack
-import spack.url
-
-description = "inspect urls used by packages in spack"
-
-
-def setup_parser(subparser):
- subparser.add_argument(
- '-c', '--color', action='store_true',
- help="color the parsed version and name in the urls shown. "
- "version will be cyan, name red")
- subparser.add_argument(
- '-e', '--extrapolation', action='store_true',
- help="color the versions used for extrapolation as well. "
- "additional versions are green, names magenta")
-
-
-def urls(parser, args):
- urls = set()
- for pkg in spack.repo.all_packages():
- url = getattr(pkg.__class__, 'url', None)
- if url:
- urls.add(url)
-
- for params in pkg.versions.values():
- url = params.get('url', None)
- if url:
- urls.add(url)
-
- for url in sorted(urls):
- if args.color or args.extrapolation:
- print spack.url.color_url(
- url, subs=args.extrapolation, errors=True)
- else:
- print url
diff --git a/lib/spack/spack/test/cmd/url.py b/lib/spack/spack/test/cmd/url.py
new file mode 100644
index 0000000000..4c60d814ce
--- /dev/null
+++ b/lib/spack/spack/test/cmd/url.py
@@ -0,0 +1,116 @@
+##############################################################################
+# Copyright (c) 2013-2016, Lawrence Livermore National Security, LLC.
+# Produced at the Lawrence Livermore National Laboratory.
+#
+# This file is part of Spack.
+# Created by Todd Gamblin, tgamblin@llnl.gov, All rights reserved.
+# LLNL-CODE-647188
+#
+# For details, see https://github.com/llnl/spack
+# Please also see the LICENSE file for our notice and the LGPL.
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU Lesser General Public License (as
+# published by the Free Software Foundation) version 2.1, February 1999.
+#
+# This program is distributed in the hope that it will be useful, but
+# WITHOUT ANY WARRANTY; without even the IMPLIED WARRANTY OF
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the terms and
+# conditions of the GNU Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this program; if not, write to the Free Software
+# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+##############################################################################
+import argparse
+import pytest
+
+from spack.cmd.url import *
+
+
+@pytest.fixture(scope='module')
+def parser():
+ """Returns the parser for the ``url`` command"""
+ parser = argparse.ArgumentParser()
+ setup_parser(parser)
+ return parser
+
+
+class MyPackage:
+ def __init__(self, name, versions):
+ self.name = name
+ self.versions = versions
+
+
+def test_name_parsed_correctly():
+ # Expected True
+ assert name_parsed_correctly(MyPackage('netcdf', []), 'netcdf')
+ assert name_parsed_correctly(MyPackage('r-devtools', []), 'devtools')
+ assert name_parsed_correctly(MyPackage('py-numpy', []), 'numpy')
+ assert name_parsed_correctly(MyPackage('octave-splines', []), 'splines')
+
+ # Expected False
+ assert not name_parsed_correctly(MyPackage('', []), 'hdf5')
+ assert not name_parsed_correctly(MyPackage('hdf5', []), '')
+ assert not name_parsed_correctly(MyPackage('imagemagick', []), 'ImageMagick') # noqa
+ assert not name_parsed_correctly(MyPackage('yaml-cpp', []), 'yamlcpp')
+ assert not name_parsed_correctly(MyPackage('yamlcpp', []), 'yaml-cpp')
+ assert not name_parsed_correctly(MyPackage('r-py-parser', []), 'parser')
+ assert not name_parsed_correctly(MyPackage('oce', []), 'oce-0.18.0') # noqa
+
+
+def test_version_parsed_correctly():
+ # Expected True
+ assert version_parsed_correctly(MyPackage('', ['1.2.3']), '1.2.3')
+ assert version_parsed_correctly(MyPackage('', ['5.4a', '5.4b']), '5.4a')
+ assert version_parsed_correctly(MyPackage('', ['5.4a', '5.4b']), '5.4b')
+
+ # Expected False
+ assert not version_parsed_correctly(MyPackage('', []), '1.2.3')
+ assert not version_parsed_correctly(MyPackage('', ['1.2.3']), '')
+ assert not version_parsed_correctly(MyPackage('', ['1.2.3']), '1.2.4')
+ assert not version_parsed_correctly(MyPackage('', ['3.4a']), '3.4')
+ assert not version_parsed_correctly(MyPackage('', ['3.4']), '3.4b')
+ assert not version_parsed_correctly(MyPackage('', ['0.18.0']), 'oce-0.18.0') # noqa
+
+
+def test_url_parse(parser):
+ args = parser.parse_args(['parse', 'http://zlib.net/fossils/zlib-1.2.10.tar.gz'])
+ url(parser, args)
+
+
+@pytest.mark.xfail
+def test_url_parse_xfail(parser):
+ # No version in URL
+ args = parser.parse_args(['parse', 'http://www.netlib.org/voronoi/triangle.zip'])
+ url(parser, args)
+
+
+def test_url_list(parser):
+ args = parser.parse_args(['list'])
+ total_urls = url_list(args)
+
+ # The following two options should not change the number of URLs printed.
+ args = parser.parse_args(['list', '--color', '--extrapolation'])
+ colored_urls = url_list(args)
+ assert colored_urls == total_urls
+
+ # The following two options should print fewer URLs than the default.
+ # If they print the same number of URLs, something is horribly broken.
+ # If they say we missed 0 URLs, something is probably broken too.
+ args = parser.parse_args(['list', '--incorrect-name'])
+ incorrect_name_urls = url_list(args)
+ assert 0 < incorrect_name_urls < total_urls
+
+ args = parser.parse_args(['list', '--incorrect-version'])
+ incorrect_version_urls = url_list(args)
+ assert 0 < incorrect_version_urls < total_urls
+
+
+def test_url_test(parser):
+ args = parser.parse_args(['test'])
+ (total_urls, correct_names, correct_versions,
+ name_count_dict, version_count_dict) = url_test(args)
+
+ assert 0 < correct_names <= sum(name_count_dict.values()) <= total_urls # noqa
+ assert 0 < correct_versions <= sum(version_count_dict.values()) <= total_urls # noqa
diff --git a/lib/spack/spack/url.py b/lib/spack/spack/url.py
index 93c443fde8..65f8e12e58 100644
--- a/lib/spack/spack/url.py
+++ b/lib/spack/spack/url.py
@@ -28,17 +28,17 @@ The idea is to allow package creators to supply nothing more than the
download location of the package, and figure out version and name information
from there.
-Example: when spack is given the following URL:
+**Example:** when spack is given the following URL:
- ftp://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.9.1-p243.tar.gz
+ https://www.hdfgroup.org/ftp/HDF/releases/HDF4.2.12/src/hdf-4.2.12.tar.gz
-It can figure out that the package name is ruby, and that it is at version
-1.9.1-p243. This is useful for making the creation of packages simple: a user
+It can figure out that the package name is ``hdf``, and that it is at version
+``4.2.12``. This is useful for making the creation of packages simple: a user
just supplies a URL and skeleton code is generated automatically.
-Spack can also figure out that it can most likely download 1.8.1 at this URL:
+Spack can also figure out that it can most likely download 4.2.6 at this URL:
- ftp://ftp.ruby-lang.org/pub/ruby/1.9/ruby-1.8.1.tar.gz
+ https://www.hdfgroup.org/ftp/HDF/releases/HDF4.2.6/src/hdf-4.2.6.tar.gz
This is useful if a user asks for a package at a particular version number;
spack doesn't need anyone to tell it where to get the tarball even though
@@ -104,24 +104,23 @@ def strip_query_and_fragment(path):
def split_url_extension(path):
"""Some URLs have a query string, e.g.:
- 1. https://github.com/losalamos/CLAMR/blob/packages/PowerParser_v2.0.7.tgz?raw=true
- 2. http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-rc2-bin.tar.gz
- 3. https://gitlab.kitware.com/vtk/vtk/repository/archive.tar.bz2?ref=v7.0.0
+ 1. https://github.com/losalamos/CLAMR/blob/packages/PowerParser_v2.0.7.tgz?raw=true
+ 2. http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-rc2-bin.tar.gz
+ 3. https://gitlab.kitware.com/vtk/vtk/repository/archive.tar.bz2?ref=v7.0.0
- In (1), the query string needs to be stripped to get at the
- extension, but in (2) & (3), the filename is IN a single final query
- argument.
+ In (1), the query string needs to be stripped to get at the
+ extension, but in (2) & (3), the filename is IN a single final query
+ argument.
- This strips the URL into three pieces: prefix, ext, and suffix.
- The suffix contains anything that was stripped off the URL to
- get at the file extension. In (1), it will be '?raw=true', but
- in (2), it will be empty. In (3) the suffix is a parameter that follows
- after the file extension, e.g.:
+ This strips the URL into three pieces: ``prefix``, ``ext``, and ``suffix``.
+ The suffix contains anything that was stripped off the URL to
+ get at the file extension. In (1), it will be ``'?raw=true'``, but
+ in (2), it will be empty. In (3) the suffix is a parameter that follows
+ after the file extension, e.g.:
- 1. ('https://github.com/losalamos/CLAMR/blob/packages/PowerParser_v2.0.7', '.tgz', '?raw=true')
- 2. ('http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-rc2-bin',
- '.tar.gz', None)
- 3. ('https://gitlab.kitware.com/vtk/vtk/repository/archive', '.tar.bz2', '?ref=v7.0.0')
+ 1. ``('https://github.com/losalamos/CLAMR/blob/packages/PowerParser_v2.0.7', '.tgz', '?raw=true')``
+ 2. ``('http://www.apache.org/dyn/closer.cgi?path=/cassandra/1.2.0/apache-cassandra-1.2.0-rc2-bin', '.tar.gz', None)``
+ 3. ``('https://gitlab.kitware.com/vtk/vtk/repository/archive', '.tar.bz2', '?ref=v7.0.0')``
"""
prefix, ext, suffix = path, '', ''
@@ -149,7 +148,7 @@ def determine_url_file_extension(path):
"""This returns the type of archive a URL refers to. This is
sometimes confusing because of URLs like:
- (1) https://github.com/petdance/ack/tarball/1.93_02
+ (1) https://github.com/petdance/ack/tarball/1.93_02
Where the URL doesn't actually contain the filename. We need
to know what type it is so that we can appropriately name files
@@ -166,19 +165,44 @@ def determine_url_file_extension(path):
return ext
-def parse_version_offset(path, debug=False):
- """Try to extract a version string from a filename or URL. This is taken
- largely from Homebrew's Version class."""
+def parse_version_offset(path):
+ """Try to extract a version string from a filename or URL.
+
+ :param str path: The filename or URL for the package
+
+ :return: A tuple containing:
+ version of the package,
+ first index of version,
+ length of version string,
+ the index of the matching regex,
+ the matching regex
+
+ :rtype: tuple
+
+ :raises UndetectableVersionError: If the URL does not match any regexes
+ """
original_path = path
+ # path: The prefix of the URL, everything before the ext and suffix
+ # ext: The file extension
+ # suffix: Any kind of query string that begins with a '?'
path, ext, suffix = split_url_extension(path)
- # Allow matches against the basename, to avoid including parent
- # dirs in version name Remember the offset of the stem in the path
+ # stem: Everything from path after the final '/'
stem = os.path.basename(path)
offset = len(path) - len(stem)
- version_types = [
+ # List of the following format:
+ #
+ # [
+ # (regex, string),
+ # ...
+ # ]
+ #
+ # The first regex that matches string will be used to determine
+ # the version of the package. Therefore, hyperspecific regexes should
+ # come first while generic, catch-all regexes should come last.
+ version_regexes = [
# GitHub tarballs, e.g. v1.2.3
(r'github.com/.+/(?:zip|tar)ball/v?((\d+\.)+\d+)$', path),
@@ -258,16 +282,13 @@ def parse_version_offset(path, debug=False):
(r'\/(\d\.\d+)\/', path),
# e.g. http://www.ijg.org/files/jpegsrc.v8d.tar.gz
- (r'\.v(\d+[a-z]?)', stem)]
+ (r'\.v(\d+[a-z]?)', stem)
+ ]
- for i, vtype in enumerate(version_types):
- regex, match_string = vtype
+ for i, version_regex in enumerate(version_regexes):
+ regex, match_string = version_regex
match = re.search(regex, match_string)
if match and match.group(1) is not None:
- if debug:
- tty.msg("Parsing URL: %s" % path,
- " Matched regex %d: r'%s'" % (i, regex))
-
version = match.group(1)
start = match.start(1)
@@ -275,30 +296,74 @@ def parse_version_offset(path, debug=False):
if match_string is stem:
start += offset
- return version, start, len(version)
+ return version, start, len(version), i, regex
raise UndetectableVersionError(original_path)
-def parse_version(path, debug=False):
- """Given a URL or archive name, extract a version from it and return
- a version object.
+def parse_version(path):
+ """Try to extract a version string from a filename or URL.
+
+ :param str path: The filename or URL for the package
+
+ :return: The version of the package
+ :rtype: spack.version.Version
+
+ :raises UndetectableVersionError: If the URL does not match any regexes
"""
- ver, start, l = parse_version_offset(path, debug=debug)
- return Version(ver)
+ version, start, length, i, regex = parse_version_offset(path)
+ return Version(version)
-def parse_name_offset(path, v=None, debug=False):
- if v is None:
- v = parse_version(path, debug=debug)
+def parse_name_offset(path, v=None):
+ """Try to determine the name of a package from its filename or URL.
+
+ :param str path: The filename or URL for the package
+ :param str v: The version of the package
+
+ :return: A tuple containing:
+ name of the package,
+ first index of name,
+ length of name,
+ the index of the matching regex,
+ the matching regex
+
+ :rtype: tuple
+
+ :raises UndetectableNameError: If the URL does not match any regexes
+ """
+ original_path = path
+ # We really need to know the version of the package
+ # This helps us prevent collisions between the name and version
+ if v is None:
+ try:
+ v = parse_version(path)
+ except UndetectableVersionError:
+ # Not all URLs contain a version. We still want to be able
+ # to determine a name if possible.
+ v = ''
+
+ # path: The prefix of the URL, everything before the ext and suffix
+ # ext: The file extension
+ # suffix: Any kind of query string that begins with a '?'
path, ext, suffix = split_url_extension(path)
- # Allow matching with either path or stem, as with the version.
+ # stem: Everything from path after the final '/'
stem = os.path.basename(path)
offset = len(path) - len(stem)
- name_types = [
+ # List of the following format:
+ #
+ # [
+ # (regex, string),
+ # ...
+ # ]
+ #
+ # The first regex that matches string will be used to determine
+ # the name of the package. Therefore, hyperspecific regexes should
+ # come first while generic, catch-all regexes should come last.
+ name_regexes = [
(r'/sourceforge/([^/]+)/', path),
(r'github.com/[^/]+/[^/]+/releases/download/%s/(.*)-%s$' %
(v, v), path),
@@ -316,10 +381,11 @@ def parse_name_offset(path, v=None, debug=False):
(r'/([^/]+)%s' % v, path),
(r'^([^/]+)[_.-]v?%s' % v, path),
- (r'^([^/]+)%s' % v, path)]
+ (r'^([^/]+)%s' % v, path)
+ ]
- for i, name_type in enumerate(name_types):
- regex, match_string = name_type
+ for i, name_regex in enumerate(name_regexes):
+ regex, match_string = name_regex
match = re.search(regex, match_string)
if match:
name = match.group(1)
@@ -333,17 +399,38 @@ def parse_name_offset(path, v=None, debug=False):
name = name.lower()
name = re.sub('[_.]', '-', name)
- return name, start, len(name)
+ return name, start, len(name), i, regex
- raise UndetectableNameError(path)
+ raise UndetectableNameError(original_path)
def parse_name(path, ver=None):
- name, start, l = parse_name_offset(path, ver)
+ """Try to determine the name of a package from its filename or URL.
+
+ :param str path: The filename or URL for the package
+ :param str ver: The version of the package
+
+ :return: The name of the package
+ :rtype: str
+
+ :raises UndetectableNameError: If the URL does not match any regexes
+ """
+ name, start, length, i, regex = parse_name_offset(path, ver)
return name
def parse_name_and_version(path):
+ """Try to determine the name of a package and extract its version
+ from its filename or URL.
+
+ :param str path: The filename or URL for the package
+
+ :return: A tuple containing:
+ The name of the package
+ The version of the package
+
+ :rtype: tuple
+ """
ver = parse_version(path)
name = parse_name(path, ver)
return (name, ver)
@@ -371,12 +458,12 @@ def cumsum(elts, init=0, fn=lambda x: x):
def substitution_offsets(path):
"""This returns offsets for substituting versions and names in the
- provided path. It is a helper for substitute_version().
+ provided path. It is a helper for :func:`substitute_version`.
"""
# Get name and version offsets
try:
- ver, vs, vl = parse_version_offset(path)
- name, ns, nl = parse_name_offset(path, ver)
+ ver, vs, vl, vi, vregex = parse_version_offset(path)
+ name, ns, nl, ni, nregex = parse_name_offset(path, ver)
except UndetectableNameError:
return (None, -1, -1, (), ver, vs, vl, (vs,))
except UndetectableVersionError:
@@ -444,21 +531,22 @@ def wildcard_version(path):
def substitute_version(path, new_version):
"""Given a URL or archive name, find the version in the path and
- substitute the new version for it. Replace all occurrences of
- the version *if* they don't overlap with the package name.
+ substitute the new version for it. Replace all occurrences of
+ the version *if* they don't overlap with the package name.
- Simple example::
- substitute_version('http://www.mr511.de/software/libelf-0.8.13.tar.gz', '2.9.3')
- ->'http://www.mr511.de/software/libelf-2.9.3.tar.gz'
+ Simple example:
- Complex examples::
- substitute_version('http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.0.tar.gz', 2.1)
- -> 'http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.1.tar.gz'
+ .. code-block:: python
- # In this string, the "2" in mvapich2 is NOT replaced.
- substitute_version('http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.tar.gz', 2.1)
- -> 'http://mvapich.cse.ohio-state.edu/download/mvapich/mv2/mvapich2-2.1.tar.gz'
+ substitute_version('http://www.mr511.de/software/libelf-0.8.13.tar.gz', '2.9.3')
+ >>> 'http://www.mr511.de/software/libelf-2.9.3.tar.gz'
+ Complex example:
+
+ .. code-block:: python
+
+ substitute_version('https://www.hdfgroup.org/ftp/HDF/releases/HDF4.2.12/src/hdf-4.2.12.tar.gz', '2.3')
+ >>> 'https://www.hdfgroup.org/ftp/HDF/releases/HDF2.3/src/hdf-2.3.tar.gz'
"""
(name, ns, nl, noffs,
ver, vs, vl, voffs) = substitution_offsets(path)
@@ -477,17 +565,16 @@ def substitute_version(path, new_version):
def color_url(path, **kwargs):
"""Color the parts of the url according to Spack's parsing.
- Colors are:
- Cyan: The version found by parse_version_offset().
- Red: The name found by parse_name_offset().
-
- Green: Instances of version string from substitute_version().
- Magenta: Instances of the name (protected from substitution).
+ Colors are:
+ | Cyan: The version found by :func:`parse_version_offset`.
+ | Red: The name found by :func:`parse_name_offset`.
- Optional args:
- errors=True Append parse errors at end of string.
- subs=True Color substitutions as well as parsed name/version.
+ | Green: Instances of version string from :func:`substitute_version`.
+ | Magenta: Instances of the name (protected from substitution).
+ :param str path: The filename or URL for the package
+ :keyword bool errors: Append parse errors at end of string.
+ :keyword bool subs: Color substitutions as well as parsed name/version.
"""
errors = kwargs.get('errors', False)
subs = kwargs.get('subs', False)