 lib/spack/docs/developer_guide.rst | 442
 1 file changed, 437 insertions(+), 5 deletions(-)
diff --git a/lib/spack/docs/developer_guide.rst b/lib/spack/docs/developer_guide.rst
index 9964bb6e51..4e349cff45 100644
--- a/lib/spack/docs/developer_guide.rst
+++ b/lib/spack/docs/developer_guide.rst
@@ -106,11 +106,21 @@ with a high level view of Spack's directory structure:
external/ <- external libs included in Spack distro
llnl/ <- some general-use libraries
- spack/ <- spack module; contains Python code
- cmd/ <- each file in here is a spack subcommand
- compilers/ <- compiler description files
- test/ <- unit test modules
- util/ <- common code
+ spack/ <- spack module; contains Python code
+ analyzers/ <- modules to run analysis on installed packages
+ build_systems/ <- modules for different build systems
+ cmd/ <- each file in here is a spack subcommand
+ compilers/ <- compiler description files
+ container/ <- module for spack containerize
+ hooks/ <- hook modules to run at different points
+ modules/ <- modules for lmod, tcl, etc.
+ operating_systems/ <- operating system modules
+ platforms/ <- different spack platforms
+ reporters/ <- reporters like cdash, junit
+ schema/ <- schemas to validate data structures
+ solver/ <- the spack solver
+ test/ <- unit test modules
+ util/ <- common code
Spack is designed so that it could live within a `standard UNIX
directory hierarchy <http://linux.die.net/man/7/hier>`_, so ``lib``,
@@ -251,6 +261,22 @@ Unit tests
This is a fake package hierarchy used to mock up packages for
Spack's test suite.
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+Research and Monitoring Modules
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+:mod:`spack.monitor`
+ Contains :class:`SpackMonitor <spack.monitor.SpackMonitor>`. This is accessed
+ from the ``spack install`` and ``spack analyze`` commands to send build
+ and package metadata up to a `Spack Monitor <https://github.com/spack/spack-monitor>`_ server.
+
+
+:mod:`spack.analyzers`
+ A module folder with a :class:`AnalyzerBase <spack.analyzers.analyzer_base.AnalyzerBase>`
+ that provides base functions to run, save, and (optionally) upload analysis
+ results to a `Spack Monitor <https://github.com/spack/spack-monitor>`_ server.
+
+
^^^^^^^^^^^^^
Other Modules
^^^^^^^^^^^^^
@@ -299,6 +325,235 @@ Conceptually, packages are overloaded. They contain:
Stage objects
-------------
+
+.. _writing-analyzers:
+
+-----------------
+Writing analyzers
+-----------------
+
+To write an analyzer, add a new Python file to the
+analyzers module directory at ``lib/spack/spack/analyzers``.
+Your analyzer should be a subclass of
+:class:`AnalyzerBase <spack.analyzers.analyzer_base.AnalyzerBase>`. For example, if you want
+to add an analyzer class ``Myanalyzer``, you would write it to
+``spack/analyzers/myanalyzer.py`` and import and use the base class as follows:
+
+.. code-block:: python
+
+ from .analyzer_base import AnalyzerBase
+
+ class Myanalyzer(AnalyzerBase):
+     ...  # your analyzer's attributes and functions go here
+
+
+Note that the class name matches your module file name, with only the first
+letter capitalized. You can look at other analyzers in
+the analyzers directory for examples. This guide covers the basic functions needed.
+
+^^^^^^^^^^^^^^^^^^^^^^^^^
+Analyzer Output Directory
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+By default, when you run ``spack analyze run``, an analyzer output directory is
+created in your spack user directory under ``$HOME``. We write output here
+because the install directory might not always be writable.
+
+.. code-block:: console
+
+ ~/.spack/
+ analyzers
+
+Result files will be written here, organized in subfolders in the same structure
+as the package, with each analyzer owning its own subfolder. For example:
+
+
+.. code-block:: console
+
+ $ tree ~/.spack/analyzers/
+ /home/spackuser/.spack/analyzers/
+ └── linux-ubuntu20.04-skylake
+ └── gcc-9.3.0
+ └── zlib-1.2.11-sl7m27mzkbejtkrajigj3a3m37ygv4u2
+ ├── environment_variables
+ │   └── spack-analyzer-environment-variables.json
+ ├── install_files
+ │   └── spack-analyzer-install-files.json
+ └── libabigail
+ └── lib
+ └── spack-analyzer-libabigail-libz.so.1.2.11.xml
+
+
+Notice that because the libabigail analyzer generates results per object, we
+preserve the object's folder in case there are equivalently named files in
+different folders. The result files are typically written as JSON so they can
+be easily read and uploaded in a future interaction with a monitor.
+
+
+^^^^^^^^^^^^^^^^^
+Analyzer Metadata
+^^^^^^^^^^^^^^^^^
+
+Your analyzer is required to have the class attributes ``name``, ``outfile``,
+and ``description``. These are printed to the user when they use the subcommand
+``spack analyze list-analyzers``. Here is an example.
+As mentioned above, this analyzer would live in a module named
+``libabigail.py`` in the analyzers folder so that the class can be discovered.
+
+
+.. code-block:: python
+
+ class Libabigail(AnalyzerBase):
+
+ name = "libabigail"
+ outfile = "spack-analyzer-libabigail.json"
+ description = "Application Binary Interface (ABI) features for objects"
+
+
+The name and output file should be unique to your analyzer.
+Note that "all" cannot be the name of an analyzer, as this key is used to indicate
+that the user wants to run all analyzers.
+
+.. _analyzer_run_function:
+
+
+^^^^^^^^^^^^^^^^^^^^^^^^
+An analyzer run Function
+^^^^^^^^^^^^^^^^^^^^^^^^
+
+The core of an analyzer is its ``run()`` function, which should accept no
+arguments. You can assume the package spec of interest is available at
+``self.spec``. It's up to the run function to generate whatever analysis data
+you need and then return a dictionary keyed by the analyzer name. The result
+data should be a list of objects, each with a ``name``, ``analyzer_name``,
+``install_file``, and one of ``value`` or ``binary_value``. The install file
+should be a relative path, not an absolute path. For example, let's say we
+extract a metric called ``metric`` for ``bin/wget`` using our analyzer
+``thebest-analyzer``. We might have data that looks like this:
+
+.. code-block:: python
+
+ result = {"name": "metric", "analyzer_name": "thebest-analyzer", "value": "1", "install_file": "bin/wget"}
+
+
+We'd then return it as follows - note that the key is the analyzer name at ``self.name``.
+
+.. code-block:: python
+
+ return {self.name: result}
+
+This will save the complete result to the analyzer metadata folder, as described
+previously. If you want support for adding a different kind of metadata (e.g.,
+not associated with an install file) then the monitor server would need to be updated
+to support this first.
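+
+Putting these pieces together, a minimal ``run()`` for the hypothetical
+``Myanalyzer`` above might look like the sketch below; the metric name, value,
+and ``bin/wget`` path are made up for illustration:
+
+.. code-block:: python
+
+    def run(self):
+        # Generate whatever analysis data you need for self.spec here.
+        # Each result has a name, the analyzer name, a relative install
+        # file path, and a value (or binary_value).
+        result = {
+            "name": "metric",
+            "analyzer_name": self.name,
+            "install_file": "bin/wget",
+            "value": "1",
+        }
+
+        # Return the data keyed by the analyzer name
+        return {self.name: result}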
+
+
+^^^^^^^^^^^^^^^^^^^^^^^^^
+An analyzer init Function
+^^^^^^^^^^^^^^^^^^^^^^^^^
+
+If you don't need any extra dependencies or checks, you can skip defining an
+analyzer init function, as the base class will handle it. Typically, it accepts
+a spec and an optional output directory (if the user does not want the default
+metadata folder for analyzer results). The analyzer init function should call
+its parent init, and then do any extra checks or validation required for it to
+work. For example:
+
+.. code-block:: python
+
+ def __init__(self, spec, dirname=None):
+ super(Myanalyzer, self).__init__(spec, dirname)
+
+ # install extra dependencies, do extra preparation and checks here
+
+
+At the end of the init, you will have available to you:
+
+ - **self.spec**: the spec object
+ - **self.dirname**: an optional directory name the user has provided at init for saving results
+ - **self.output_dir**: the analyzer metadata directory, where we save by default
+ - **self.meta_dir**: the path to the package metadata directory (.spack) if you need it
+
+You can then proceed to write your analyzer.
+
+
+^^^^^^^^^^^^^^^^^^^^^^^
+Saving Analyzer Results
+^^^^^^^^^^^^^^^^^^^^^^^
+
+The analyzer's ``save_result`` function is called with the result object
+generated by ``run()`` to save it to the filesystem, and, if the user has added
+the ``--monitor`` flag, to upload it to a monitor server. If your result follows
+an accepted result format and you don't need to parse it further, you don't
+need to add this function to your class. However, if your result data is large
+or otherwise needs additional parsing, you can define it. If you define the
+function, it is useful to know about the ``output_dir`` property, which you can
+join with your output file relative path of choice:
+
+.. code-block:: python
+
+ outfile = os.path.join(self.output_dir, "my-output-file.txt")
+
+
+The directory will be provided by the ``output_dir`` property but it won't exist,
+so you should create it:
+
+
+.. code-block:: python
+
+    # Create the output directory if it does not exist
+    if not os.path.exists(self.output_dir):
+        os.makedirs(self.output_dir)
+
+
+If you are generating results that correspond to specific files in the package
+install directory, you should try to maintain those paths in case there are
+equivalently named files in different directories that would otherwise
+overwrite one another. As an example of an analyzer with a custom save,
+the Libabigail analyzer saves ``*.xml`` files to the analyzer metadata
+folder in ``run()``, because they are either binaries or XML (text) that would
+usually be too big to pass in one request. For this reason, the files
+are saved during ``run()`` and the filenames added to the result object,
+and then when the result object is passed back into ``save_result()``,
+we skip saving to the filesystem, and instead read each file and send
+it (separately) to the monitor:
+
+
+.. code-block:: python
+
+ def save_result(self, result, monitor=None, overwrite=False):
+ """ABI results are saved to individual files, so each one needs to be
+ read and uploaded. Result here should be the lookup generated in run(),
+ the key is the analyzer name, and each value is the result file.
+ We currently upload the entire xml as text because libabigail can't
+ easily read gzipped xml, but this will be updated when it can.
+ """
+ if not monitor:
+ return
+
+ name = self.spec.package.name
+
+ for obj, filename in result.get(self.name, {}).items():
+
+ # Don't include the prefix
+ rel_path = obj.replace(self.spec.prefix + os.path.sep, "")
+
+ # We've already saved the results to file during run
+ content = spack.monitor.read_file(filename)
+
+ # A result needs an analyzer, value or binary_value, and name
+ data = {"value": content, "install_file": rel_path, "name": "abidw-xml"}
+ tty.info("Sending result for %s %s to monitor." % (name, rel_path))
+ monitor.send_analyze_metadata(self.spec.package, {"libabigail": [data]})
+
+
+
+Notice that this function, if you define it, requires a result object (generated
+by ``run()``), a monitor (if you want to send results), and a boolean ``overwrite``
+used to check if a result exists first, and to not write to it if it exists and
+overwrite is False. Also notice that since we already saved these files to the
+analyzer metadata folder during ``run()``, we return early if a monitor isn't
+defined, because this function then only serves to send results to the monitor.
+If you haven't saved anything to the analyzer metadata folder yet, you might
+want to do that here. You should also use ``tty.info`` to give the user a
+message of "Writing result to $DIRNAME."
+
+
.. _writing-commands:
----------------
@@ -345,6 +600,183 @@ Whenever you add/remove/rename a command or flags for an existing command,
make sure to update Spack's `Bash tab completion script
<https://github.com/adamjstewart/spack/blob/develop/share/spack/spack-completion.bash>`_.
+
+-------------
+Writing Hooks
+-------------
+
+A hook is a callback that makes it easy to run a function whenever a
+particular event occurs. We do this by defining hook types, and then inserting
+calls to them at different places in the spack code base. Whenever a hook type
+is triggered by way of a function call, we find all the hooks of that type,
+and run them.
+
+Spack defines hooks by way of a module at ``lib/spack/spack/hooks`` where we
+can define types of hooks in the ``__init__.py``, and then Python files in that
+folder can define hook functions. The files are automatically parsed, so if you
+write a new file for some integration (e.g., ``lib/spack/spack/hooks/myintegration.py``),
+you can then write hook functions in that file that will be automatically
+detected and run whenever your hook is called. This section covers the basic
+kinds of hooks and how to write them.
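+
+As a minimal, hypothetical sketch, a file ``lib/spack/spack/hooks/myintegration.py``
+could define a function named after an existing hook type, and it would be
+picked up automatically:
+
+.. code-block:: python
+
+    # lib/spack/spack/hooks/myintegration.py (hypothetical integration file)
+
+    def post_install(spec):
+        """Run automatically after each successful package install."""
+        print("myintegration: installed %s" % spec.name)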
+
+^^^^^^^^^^^^^^
+Types of Hooks
+^^^^^^^^^^^^^^
+
+The following hooks are currently implemented to make it easy for you,
+the developer, to add hooks at different stages of a spack install or similar.
+If there is a hook that you would like and is missing, you can propose to add a new one.
+
+"""""""""""""""""""""
+``pre_install(spec)``
+"""""""""""""""""""""
+
+A ``pre_install`` hook is run within an install subprocess, directly before
+the install starts. It expects a single argument, a spec, and is run in
+a multiprocessing subprocess. Note that if you see ``pre_install`` functions
+associated with packages, these are not hooks as we have defined them here,
+but rather callback functions associated with a package install.
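+
+For example, a minimal ``pre_install`` hook might look like the following sketch:
+
+.. code-block:: python
+
+    def pre_install(spec):
+        """Run in the install subprocess, right before the install starts."""
+        print("About to install %s" % spec.name)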
+
+
+""""""""""""""""""""""
+``post_install(spec)``
+""""""""""""""""""""""
+
+A ``post_install`` hook is run within an install subprocess, directly after
+the install finishes, but before the build stage is removed. If you
+write one of these hooks, you should expect it to accept a spec as the only
+argument. This is run in a multiprocessing subprocess. A ``post_install``
+function is also seen in packages, but in that context it is not related to
+the hooks described here.
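+
+A short sketch of a ``post_install`` hook that reports where the spec was
+installed (``spec.prefix`` is the install prefix):
+
+.. code-block:: python
+
+    def post_install(spec):
+        """Run in the install subprocess, right after the install finishes."""
+        print("%s was installed to %s" % (spec.name, spec.prefix))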
+
+
+""""""""""""""""""""""""""
+``on_install_start(spec)``
+""""""""""""""""""""""""""
+
+This hook is run at the beginning of the install function of a
+``PackageInstaller`` in ``lib/spack/spack/installer.py``, and importantly it is
+not part of a build process, but runs before it. This is when we have just
+newly grabbed the task and are preparing to install. If you write a hook of
+this type, it should expect the spec as its only argument.
+
+.. code-block:: python
+
+ def on_install_start(spec):
+ """On start of an install, we want to...
+ """
+ print('on_install_start')
+
+
+""""""""""""""""""""""""""""
+``on_install_success(spec)``
+""""""""""""""""""""""""""""
+
+This hook is run on a successful install, and is also run inside the build
+process, akin to ``post_install``. The main difference is that this hook
+is run outside of the context of the stage directory, meaning after the
+build stage has been removed and the user is alerted that the install was
+successful. If you need to write a hook that is run on success of a particular
+phase, you should use ``on_phase_success``.
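+
+A minimal ``on_install_success`` sketch:
+
+.. code-block:: python
+
+    def on_install_success(spec):
+        """Run after a spec has been successfully installed."""
+        print("Successfully installed %s" % spec.name)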
+
+""""""""""""""""""""""""""""
+``on_install_failure(spec)``
+""""""""""""""""""""""""""""
+
+This hook is run when an install failure occurs outside of the build
+subprocess, somewhere in ``installer.py``, when something else goes wrong.
+If you need to write a hook that is relevant to a failure within a build
+process, you would want to instead use ``on_phase_error``.
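+
+A minimal ``on_install_failure`` sketch:
+
+.. code-block:: python
+
+    def on_install_failure(spec):
+        """Run when an install fails outside of the build subprocess."""
+        print("Install of %s failed" % spec.name)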
+
+
+"""""""""""""""""""""""""""""""""""""""""""""""
+``on_phase_success(pkg, phase_name, log_file)``
+"""""""""""""""""""""""""""""""""""""""""""""""
+
+This hook is run within the install subprocess, and specifically when a phase
+successfully finishes. Since we are interested in the package, the name of
+the phase, and any output from it, we require:
+
+ - **pkg**: the package variable, which also has the attached spec at ``pkg.spec``
+ - **phase_name**: the name of the phase that was successful (e.g., configure)
+ - **log_file**: the path to the file with output, in case you need to inspect or otherwise interact with it.
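+
+A sketch of an ``on_phase_success`` hook using all three arguments:
+
+.. code-block:: python
+
+    def on_phase_success(pkg, phase_name, log_file):
+        """Run inside the install subprocess when a phase succeeds."""
+        print("%s: phase %s succeeded (log at %s)" % (pkg.name, phase_name, log_file))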
+
+"""""""""""""""""""""""""""""""""""""""""""""
+``on_phase_error(pkg, phase_name, log_file)``
+"""""""""""""""""""""""""""""""""""""""""""""
+
+In the case of an error during a phase, we might want to trigger some event
+with a hook, and this is the purpose of this particular hook. Akin to
+``on_phase_success``, we require the same variables: the package that failed,
+the name of the phase, and the log file where we might find errors.
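+
+And a corresponding ``on_phase_error`` sketch:
+
+.. code-block:: python
+
+    def on_phase_error(pkg, phase_name, log_file):
+        """Run inside the install subprocess when a phase fails."""
+        print("%s: phase %s failed, see %s" % (pkg.name, phase_name, log_file))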
+
+"""""""""""""""""""""""""""""""""
+``on_analyzer_save(pkg, result)``
+"""""""""""""""""""""""""""""""""
+
+After an analyzer has saved some result for a package, this hook is called,
+and it provides the package that we just ran the analysis for, along with
+the loaded result. Typically, a result is structured to have the name
+of the analyzer as key, and the result object that is defined in detail in
+:ref:`analyzer_run_function`.
+
+.. code-block:: python
+
+ def on_analyzer_save(pkg, result):
+ """given a package and a result...
+ """
+ print('Do something extra with a package analysis result here')
+
+
+^^^^^^^^^^^^^^^^^^^^^^
+Adding a New Hook Type
+^^^^^^^^^^^^^^^^^^^^^^
+
+Adding a new hook type is very simple! In ``lib/spack/spack/hooks/__init__.py``
+you can simply create a new ``HookRunner`` that is named to match your new hook.
+For example, let's say you want to add a new hook called ``post_log_write``
+to trigger after anything is written to a logger. You would add it as follows:
+
+.. code-block:: python
+
+ # pre/post install and run by the install subprocess
+ pre_install = HookRunner('pre_install')
+ post_install = HookRunner('post_install')
+
+ # hooks related to logging
+ post_log_write = HookRunner('post_log_write') # <- here is my new hook!
+
+
+You then need to decide what arguments your hook would expect. Since this is
+related to logging, let's say that you want a message and level. You can then
+add a Python file to the ``lib/spack/spack/hooks``
+folder with one or more callbacks intended to be triggered by this hook. You might
+use the new hook as follows:
+
+.. code-block:: python
+
+ def post_log_write(message, level):
+ """Do something custom with the messsage and level every time we write
+ to the log
+ """
+ print('running post_log_write!')
+
+
+To use the hook, we would call it as follows somewhere in the logic to do logging.
+In this example, we use it outside of a logger that is already defined:
+
+.. code-block:: python
+
+ import spack.hooks
+
+ # We do something here to generate a logger and message
+ spack.hooks.post_log_write(message, logger.level)
+
+
+This is not to say that this would be the best way to implement an integration
+with the logger (you'd probably want to write a custom logger, or you could
+have the hook defined within the logger) but serves as an example of writing a hook.
+
----------
Unit tests
----------