summaryrefslogtreecommitdiff
path: root/lib/spack/docs/build_systems/rpackage.rst
diff options
context:
space:
mode:
Diffstat (limited to 'lib/spack/docs/build_systems/rpackage.rst')
-rw-r--r--lib/spack/docs/build_systems/rpackage.rst346
1 files changed, 346 insertions, 0 deletions
diff --git a/lib/spack/docs/build_systems/rpackage.rst b/lib/spack/docs/build_systems/rpackage.rst
new file mode 100644
index 0000000000..5e44b2135e
--- /dev/null
+++ b/lib/spack/docs/build_systems/rpackage.rst
@@ -0,0 +1,346 @@
+.. Copyright 2013-2018 Lawrence Livermore National Security, LLC and other
+ Spack Project Developers. See the top-level COPYRIGHT file for details.
+
+ SPDX-License-Identifier: (Apache-2.0 OR MIT)
+
+.. _rpackage:
+
+--------
+RPackage
+--------
+
+Like Python, R has its own built-in build system.
+
+The R build system is remarkably uniform and well-tested.
+This makes it one of the easiest build systems to create
+new Spack packages for.
+
+^^^^^^
+Phases
+^^^^^^
+
+The ``RPackage`` base class has a single phase:
+
+#. ``install`` - install the package
+
+By default, this phase runs the following command:
+
+.. code-block:: console
+
+ $ R CMD INSTALL --library=/path/to/installation/prefix/rlib/R/library .
+
+
+^^^^^^^^^^^^^^^^^^
+Finding R packages
+^^^^^^^^^^^^^^^^^^
+
+The vast majority of R packages are hosted on CRAN - The Comprehensive
+R Archive Network. If you are looking for a particular R package, search
+for "CRAN <package-name>" and you should quickly find what you want.
+If it isn't on CRAN, try Bioconductor, another common R repository.
+
+For the purposes of this tutorial, we will be walking through
+`r-caret <https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/r-caret/package.py>`_
+as an example. If you search for "CRAN caret", you will quickly find what
+you are looking for at https://cran.r-project.org/web/packages/caret/index.html.
+If you search for "Package source", you will find the download URL for
+the latest release. Use this URL with ``spack create`` to create a new
+package.
+
+^^^^^^^^^^^^
+Package name
+^^^^^^^^^^^^
+
+The first thing you'll notice is that Spack prepends ``r-`` to the front
+of the package name. This is how Spack separates R package extensions
+from the rest of the packages in Spack. Without this, we would end up
+with package name collisions more frequently than we would like. For
+instance, there are already packages for both:
+
+* ``ape`` and ``r-ape``
+* ``curl`` and ``r-curl``
+* ``gmp`` and ``r-gmp``
+* ``jpeg`` and ``r-jpeg``
+* ``openssl`` and ``r-openssl``
+* ``uuid`` and ``r-uuid``
+* ``xts`` and ``r-xts``
+
+Many popular programs written in C/C++ are later ported to R as a
+separate project.
+
+^^^^^^^^^^^
+Description
+^^^^^^^^^^^
+
+The first thing you'll need to add to your new package is a description.
+The top of the homepage for ``caret`` lists the following description:
+
+ caret: Classification and Regression Training
+
+ Misc functions for training and plotting classification and regression models.
+
+You can either use the short description (first line), long description
+(second line), or both depending on what you feel is most appropriate.
+
+^^^^^^^^
+Homepage
+^^^^^^^^
+
+If you look at the bottom of the page, you'll see:
+
+ Linking:
+
+ Please use the canonical form https://CRAN.R-project.org/package=caret to link to this page.
+
+Please uphold the wishes of the CRAN admins and use
+https://CRAN.R-project.org/package=caret as the homepage instead of
+https://cran.r-project.org/web/packages/caret/index.html. The latter may
+change without notice.
+
+^^^
+URL
+^^^
+
+As previously mentioned, the download URL for the latest release can be
+found by searching "Package source" on the homepage.
+
+^^^^^^^^
+List URL
+^^^^^^^^
+
+CRAN maintains a single webpage containing the latest release of every
+single package: https://cran.r-project.org/src/contrib/
+
+Of course, as soon as a new release comes out, the version you were using
+in your package is no longer available at that URL. It is moved to an
+archive directory. If you search for "Old sources", you will find:
+https://cran.r-project.org/src/contrib/Archive/caret
+
+If you only specify the URL for the latest release, your package will
+no longer be able to fetch that version as soon as a new release comes
+out. To get around this, add the archive directory as a ``list_url``.
+
+^^^^^^^^^^^^^^^^^^^^^^^^^
+Build system dependencies
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+As an extension of the R ecosystem, your package will obviously depend
+on R to build and run. Normally, we would use ``depends_on`` to express
+this, but for R packages, we use ``extends``. ``extends`` is similar to
+``depends_on``, but adds an additional feature: the ability to "activate"
+the package by symlinking it to the R installation directory. Since
+every R package needs this, the ``RPackage`` base class contains:
+
+.. code-block:: python
+
+ extends('r')
+ depends_on('r', type=('build', 'run'))
+
+
+Take a close look at the homepage for ``caret``. If you look at the
+"Depends" section, you'll notice that ``caret`` depends on "R (≥ 2.10)".
+You should add this to your package like so:
+
+.. code-block:: python
+
+ depends_on('r@2.10:', type=('build', 'run'))
+
+
+^^^^^^^^^^^^^^
+R dependencies
+^^^^^^^^^^^^^^
+
+R packages are often small and follow the classic Unix philosophy
+of doing one thing well. They are modular and usually depend on
+several other packages. You may find a single package with over a
+hundred dependencies. Luckily, CRAN packages are well-documented
+and list all of their dependencies in the following sections:
+
+* Depends
+* Imports
+* LinkingTo
+
+As far as Spack is concerned, all 3 of these dependency types
+correspond to ``type=('build', 'run')``, so you don't have to worry
+about them. If you are curious what they mean,
+https://github.com/spack/spack/issues/2951 has a pretty good summary:
+
+ ``Depends`` is required and will cause those R packages to be *attached*,
+ that is, their APIs are exposed to the user. ``Imports`` *loads* packages
+ so that *the package* importing these packages can access their APIs,
+ while *not* being exposed to the user. When a user calls ``library(foo)``
+ s/he *attaches* package ``foo`` and all of the packages under ``Depends``.
+ Any function in one of these package can be called directly as ``bar()``.
+ If there are conflicts, user can also specify ``pkgA::bar()`` and
+ ``pkgB::bar()`` to distinguish between them. Historically, there was only
+ ``Depends`` and ``Suggests``, hence the confusing names. Today, maybe
+ ``Depends`` would have been named ``Attaches``.
+
+ The ``LinkingTo`` is not perfect and there was recently an extensive
+ discussion about API/ABI among other things on the R-devel mailing
+ list among very skilled R developers:
+
+ * https://stat.ethz.ch/pipermail/r-devel/2016-December/073505.html
+ * https://stat.ethz.ch/pipermail/r-devel/2017-January/073647.html
+
+Some packages also have a fourth section:
+
+* Suggests
+
+These are optional, rarely-used dependencies that a user might find
+useful. You should **NOT** add these dependencies to your package.
+R packages already have enough dependencies as it is, and adding
+optional dependencies can really slow down the concretization
+process. They can also introduce circular dependencies.
+
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Core, recommended, and non-core packages
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you look at "Depends", "Imports", and "LinkingTo", you will notice
+3 different types of packages:
+
+"""""""""""""
+Core packages
+"""""""""""""
+
+If you look at the ``caret`` homepage, you'll notice a few dependencies
+that don't have a link to the package, like ``methods``, ``stats``, and
+``utils``. These packages are part of the core R distribution and are
+tied to the R version installed. You can basically consider these to be
+"R itself". These are so essential to R so it would not make sense that
+they could be updated via CRAN. If so, you would basically get a different
+version of R. Thus, they're updated when R is updated.
+
+You can find a list of these core libraries at:
+https://github.com/wch/r-source/tree/trunk/src/library
+
+""""""""""""""""""""
+Recommended packages
+""""""""""""""""""""
+
+When you install R, there is an option called ``--with-recommended-packages``.
+This flag causes the R installation to include a few "Recommended" packages
+(legacy term). They are for historical reasons quite tied to the core R
+distribution, developed by the R core team or people closely related to it.
+The R core distribution "knows" about these package, but they are indeed
+distributed via CRAN. Because they're distributed via CRAN, they can also be
+updated between R version releases.
+
+Spack explicitly adds the ``--without-recommended-packages`` flag to prevent
+the installation of these packages. Due to the way Spack handles package
+activation (symlinking packages to the R installation directory),
+pre-existing recommended packages will cause conflicts for already-existing
+files. We could either not include these recommended packages in Spack and
+require them to be installed through ``--with-recommended-packages``, or
+we could not install them with R and let users choose the version of the
+package they want to install. We chose the latter.
+
+Since these packages are so commonly distributed with the R system, many
+developers may assume these packages exist and fail to list them as
+dependencies. Watch out for this.
+
+You can find a list of these recommended packages at:
+https://github.com/wch/r-source/blob/trunk/share/make/vars.mk
+
+"""""""""""""""""
+Non-core packages
+"""""""""""""""""
+
+These are packages that are neither "core" nor "recommended". There are more
+than 10,000 of these packages hosted on CRAN alone.
+
+For each of these package types, if you see that a specific version is
+required, for example, "lattice (≥ 0.20)", please add this information to
+the dependency:
+
+.. code-block:: python
+
+ depends_on('r-lattice@0.20:', type=('build', 'run'))
+
+
+^^^^^^^^^^^^^^^^^^
+Non-R dependencies
+^^^^^^^^^^^^^^^^^^
+
+Some packages depend on non-R libraries for linking. Check out the
+`r-stringi <https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/r-stringi/package.py>`_
+package for an example: https://CRAN.R-project.org/package=stringi.
+If you search for the text "SystemRequirements", you will see:
+
+ ICU4C (>= 52, optional)
+
+This is how non-R dependencies are listed. Make sure to add these
+dependencies. The default dependency type should suffice.
+
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Passing arguments to the installation
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Some R packages provide additional flags that can be passed to
+``R CMD INSTALL``, often to locate non-R dependencies.
+`r-rmpi <https://github.com/spack/spack/blob/develop/var/spack/repos/builtin/packages/r-rmpi/package.py>`_
+is an example of this, and flags for linking to an MPI library. To pass
+these to the installation command, you can override ``configure_args``
+like so:
+
+.. code-block:: python
+
+ def configure_args(self, spec, prefix):
+ mpi_name = spec['mpi'].name
+
+ # The type of MPI. Supported values are:
+ # OPENMPI, LAM, MPICH, MPICH2, or CRAY
+ if mpi_name == 'openmpi':
+ Rmpi_type = 'OPENMPI'
+ elif mpi_name == 'mpich':
+ Rmpi_type = 'MPICH2'
+ else:
+ raise InstallError('Unsupported MPI type')
+
+ return [
+ '--with-Rmpi-type={0}'.format(Rmpi_type),
+ '--with-mpi={0}'.format(spec['mpi'].prefix),
+ ]
+
+
+There is a similar ``configure_vars`` function that can be overridden
+to pass variables to the build.
+
+^^^^^^^^^^^^^^^^^^^^^
+Alternatives to Spack
+^^^^^^^^^^^^^^^^^^^^^
+
+CRAN hosts over 10,000 R packages, most of which are not in Spack. Many
+users may not need the advanced features of Spack, and may prefer to
+install R packages the normal way:
+
+.. code-block:: console
+
+ $ R
+ > install.packages("ggplot2")
+
+
+R will search CRAN for the ``ggplot2`` package and install all necessary
+dependencies for you. If you want to update all installed R packages to
+the latest release, you can use:
+
+.. code-block:: console
+
+ > update.packages(ask = FALSE)
+
+
+This works great for users who have internet access, but those on an
+air-gapped cluster will find it easier to let Spack build a download
+mirror and install these packages for you.
+
+Where Spack really shines is its ability to install non-R dependencies
+and link to them properly, something the R installation mechanism
+cannot handle.
+
+^^^^^^^^^^^^^^^^^^^^^^
+External documentation
+^^^^^^^^^^^^^^^^^^^^^^
+
+For more information on installing R packages, see:
+https://stat.ethz.ch/R-manual/R-devel/library/utils/html/INSTALL.html