summaryrefslogtreecommitdiff
path: root/lib/spack/docs/developer_guide.rst
diff options
context:
space:
mode:
authorMassimiliano Culpo <massimiliano.culpo@gmail.com>2022-12-07 23:56:53 +0100
committerGitHub <noreply@github.com>2022-12-07 14:56:53 -0800
commitab6499ce1e4c8c6f043c52226b9eb50dc96ddfc9 (patch)
tree14c064f6d008e17d29af898233a7ef5f12eb17fb /lib/spack/docs/developer_guide.rst
parent412bec45aa2457aa181271afbafd912f9551ee20 (diff)
downloadspack-ab6499ce1e4c8c6f043c52226b9eb50dc96ddfc9.tar.gz
spack-ab6499ce1e4c8c6f043c52226b9eb50dc96ddfc9.tar.bz2
spack-ab6499ce1e4c8c6f043c52226b9eb50dc96ddfc9.tar.xz
spack-ab6499ce1e4c8c6f043c52226b9eb50dc96ddfc9.zip
parser: refactor with coarser token granularity (#34151)
## Motivation Our parser grew to be quite complex, with a 2-state lexer and logic in the parser that has up to 5 levels of nested conditionals. In the future, to turn compilers into proper dependencies, we'll have to increase the complexity further as we foresee the need to add: 1. Edge attributes 2. Spec nesting to the spec syntax (see https://github.com/spack/seps/pull/5 for an initial discussion of those changes). The main attempt here is thus to _simplify the existing code_ before we start extending it later. We try to do that by adopting a different token granularity, and by using more complex regexes for tokenization. This allow us to a have a "flatter" encoding for the parser. i.e., it has fewer nested conditionals and a near-trivial lexer. There are places, namely in `VERSION`, where we have to use negative lookahead judiciously to avoid ambiguity. Specifically, this parse is ambiguous without `(?!\s*=)` in `VERSION_RANGE` and an extra final `\b` in `VERSION`: ``` @ 1.2.3 : develop # This is a version range 1.2.3:develop @ 1.2.3 : develop=foo # This is a version range 1.2.3: followed by a key-value pair ``` ## Differences with the previous parser ~There are currently 2 known differences with the previous parser, which have been added on purpose:~ - ~No spaces allowed after a sigil (e.g. `foo @ 1.2.3` is invalid while `foo @1.2.3` is valid)~ - ~`/<hash> @1.2.3` can be parsed as a concrete spec followed by an anonymous spec (before was invalid)~ ~We can recover the previous behavior on both ones but, especially for the second one, it seems the current behavior in the PR is more consistent.~ The parser is currently 100% backward compatible. ## Error handling Being based on more complex regexes, we can possibly improve error handling by adding regexes for common issues and hint users on that. I'll leave that for a following PR, but there's a stub for this approach in the PR. ## Performance To be sure we don't add any performance penalty with this new encoding, I measured: ```console $ spack python -m timeit -s "import spack.spec" -c "spack.spec.Spec(<spec>)" ``` for different specs on my machine: * **Spack:** 0.20.0.dev0 (c9db4e50ba045f5697816187accaf2451cb1aae7) * **Python:** 3.8.10 * **Platform:** linux-ubuntu20.04-icelake * **Concretizer:** clingo results are: | Spec | develop | this PR | | ------------- | ------------- | ------- | | `trilinos` | 28.9 usec | 13.1 usec | | `trilinos @1.2.10:1.4.20,2.0.1` | 131 usec | 120 usec | | `trilinos %gcc` | 44.9 usec | 20.9 usec | | `trilinos +foo` | 44.1 usec | 21.3 usec | | `trilinos foo=bar` | 59.5 usec | 25.6 usec | | `trilinos foo=bar ^ mpich foo=baz` | 120 usec | 82.1 usec | so this new parser seems to be consistently faster than the previous one. ## Modifications In this PR we just substituted the Spec parser, which means: - [x] Deleted in `spec.py` the `SpecParser` and `SpecLexer` classes. deleted `spack/parse.py` - [x] Added a new parser in `spack/parser.py` - [x] Hooked the new parser in all the places the previous one was used - [x] Adapted unit tests in `test/spec_syntax.py` ## Possible future improvements Random thoughts while working on the PR: - Currently we transform hashes and files into specs during parsing. I think we might want to introduce an additional step and parse special objects like a `FileSpec` etc. in-between parsing and concretization.
Diffstat (limited to 'lib/spack/docs/developer_guide.rst')
-rw-r--r--lib/spack/docs/developer_guide.rst9
1 files changed, 3 insertions, 6 deletions
diff --git a/lib/spack/docs/developer_guide.rst b/lib/spack/docs/developer_guide.rst
index 6b67ef9f77..93bbce5198 100644
--- a/lib/spack/docs/developer_guide.rst
+++ b/lib/spack/docs/developer_guide.rst
@@ -175,14 +175,11 @@ Spec-related modules
^^^^^^^^^^^^^^^^^^^^
:mod:`spack.spec`
- Contains :class:`~spack.spec.Spec` and :class:`~spack.spec.SpecParser`.
- Also implements most of the logic for normalization and concretization
+ Contains :class:`~spack.spec.Spec`. Also implements most of the logic for concretization
of specs.
-:mod:`spack.parse`
- Contains some base classes for implementing simple recursive descent
- parsers: :class:`~spack.parse.Parser` and :class:`~spack.parse.Lexer`.
- Used by :class:`~spack.spec.SpecParser`.
+:mod:`spack.parser`
+ Contains :class:`~spack.parser.SpecParser` and functions related to parsing specs.
:mod:`spack.concretize`
Contains :class:`~spack.concretize.Concretizer` implementation,