Improve parsing of quoted flags and variants in specs (#41529)

This PR does several things: - [x] Allow any character to appear in the quoted values of variants and flags. - [x] Allow easier passing of quoted flags on the command line, e.g. `cflags="-O2 -g"`. - [x] Handle quoting better in spec output, using single quotes around double quotes and vice versa. - [x] Disallow spaces around `=` and `==` when parsing variants and flags. ## Motivation This PR is motivated by the issues above and by ORNL's [tips for launching at scale on Frontier](https://docs.olcf.ornl.gov/systems/frontier_user_guide.html#tips-for-launching-at-scale). ORNL recommends using `sbcast --send-libs` to broadcast executables and their libraries to compute nodes when running large jobs (e.g., 80k ranks). For an executable named `exe`, `sbcast --send-libs` stores the needed libraries in a directory alongside the executable called `exe_libs`. ORNL recommends pointing `LD_LIBRARY_PATH` at that directory so that `exe` will find the local libraries and not overwhelm the filesystem. There are other ways to mitigate this problem: * You could build with `RUNPATH` using `spack config add config:shared_linking:type:runpath`, which would make `LD_LIBRARY_PATH` take precedence over Spack's `RUNPATHs`. I don't recommend this one because `RUNPATH` can cause many other things to go wrong. * You could use `spack config add config:shared_linking:bind:true`, added in #31948, which will greatly reduce the filesystem load for large jobs by pointing `DT_NEEDED` entries in ELF *directly* at the needed `.so` files instead of relying on `RPATH` search via soname. I have not experimented with this at 80,000 ranks, but it should help quite a bit. * You could use [Spindle](https://github.com/hpc/Spindle) (as LLNL does on its machines) which should transparently fix this without any changes to your executable and without any need to use `sbcast` or other tools. But we want to support the `sbcast` use case as well. ## `sbcast` and Spack Spack's `RPATHs` break the `sbcast` fix because they're considered with higher precedence than `LD_LIBRARY_PATH`. So Spack applications will still end up hitting the shared filesystem when searching for libraries. We can avoid this by injecting some `ldflags` in to the build, e.g., if were were going to launch, say, `LAMMPS` at scale, we could add another `RPATH` specifically for use with `sbcast`: spack install lammps ldflags='-Wl,-rpath=$ORIGIN/lmp_libs' This will put the `lmp_libs` directory alongside `LAMMPS`'s `lmp` executable first in the `RPATH`, so it will be searched before any directories on the shared filesystem. ## Issues with quoting Before this PR, the command above would've errored out for two reasons: 1. `$` wasn't an allowed character in our spec parser. 2. You would've had to double quote the flags to get them to pass through correctly: spack install lammps ldflags='"-Wl,-rpath=$ORIGIN/lmp_libs"' This is ugly and I don't think many users will easily figure it out. The behavior was added in #29282, and it improved parsing of specs passed as a single string, e.g.: spack install 'lammps ldflags="-Wl,-rpath=$ORIGIN/lmp_libs"' but a lot of users are naturally going to try to quote arguments *directly* on the command line, without quoting their entire spec. #29282 used a heuristic to detect unquoted flags and warn the user, but the warning could be confusing. In particular, if you wrote `cflags="-O2 -g"` on the command line, it would break the flags up, warn, and tell you that you could fix the issue by writing `cflags="-O2 -g"` even though you just wrote that. It's telling you to *quote* that value, but the user has to know to double quote. ## New heuristic for quoted arguments from the CLI There are only two places where we allow arbitrary quoted strings in specs: flags and variant values, so this PR adds a simpler heuristic to the CLI parser: if an argument in `sys.argv` starts with `name=...`, then we assume the whole argument is quoted. This means you can write: spack install bzip2 cflags="-O2 -g" directly on the command line, without multiple levels of quoting. This also works: spack install 'bzip2 cflags="-O2 -g"' The only place where this heuristic runs into ambiguity is if you attempt to pass anonymous specs that start with `name=...` as one large string. e.g., this will be interpreted as one large flag value: spack find 'cflags="-O2 -g" ~bar +baz' This sets `cflags` to `"-O2 -g" ~bar +baz`, which is likely not what you wanted. You can fix this easily by either removing the quotes: spack find cflags="-O2 -g" ~bar +baz Or by adding a space at the start, which has the same effect: spack find ' cflags="-O2 -g" ~bar +baz' You may wonder why we don't just look for quotes inside of flag arguments, and the reason is that you *might* want them there. If you are passing arguments like: spack install zlib cppflags="-D DEBUG_MSG1='quick fox' -D DEBUG_MSG2='lazy dog'" You *need* the quotes there. So we've opted for one potentially confusing, but easily fixed outcome vs. limiting what you can put in your quoted strings. ## Quotes in formatted spec output In addition to being more lenient about characters accepted in quoted strings, this PR fixes up spec formatting a bit. We now format quoted strings in specs with single quotes, unless the string has a single quote in it, in which case we JSON-escape the string (i.e., we add `\` before `"` and `\`). zlib cflags='-D FOO="bar"' zlib cflags="-D FOO='bar'" zlib cflags="-D FOO='bar' BAR=\"baz\""
author: Todd Gamblin <tgamblin@llnl.gov> 2023-12-13 16:36:22 -0800
committer: GitHub <noreply@github.com> 2023-12-13 16:36:22 -0800
commit: a690b8c27ccf33745afbdf7b1ce800fa07510b78 (patch)
tree: 2a190da3f952b575040490e11942a756549e304a /lib
parent: a1fa862c3f3334055344c6cc77368f7ed4f7c976 (diff)
download: spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.gz
spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.bz2
spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.xz
spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.zip
7 files changed, 198 insertions, 188 deletions
diff --git a/lib/spack/spack/cmd/__init__.py b/lib/spack/spack/cmd/__init__.py
index c26ab181c1..cb4083ed8f 100644
--- a/lib/spack/spack/cmd/__init__.py
+++ b/lib/spack/spack/cmd/__init__.py
@@ -6,10 +6,8 @@
 import argparse
 import os
 import re
-import shlex
 import sys
-from textwrap import dedent
-from typing import List, Match, Tuple
+from typing import List, Union
 
 import llnl.string
 import llnl.util.tty as tty
@@ -147,89 +145,37 @@ def get_command(cmd_name):
     return getattr(get_module(cmd_name), pname)
 
 
-class _UnquotedFlags:
-    """Use a heuristic in `.extract()` to detect whether the user is trying to set
-    multiple flags like the docker ENV attribute allows (e.g. 'cflags=-Os -pipe').
+def quote_kvp(string: str) -> str:
+    """For strings like ``name=value`` or ``name==value``, quote and escape the value if needed.
 
-    If the heuristic finds a match (which can be checked with `__bool__()`), a warning
-    message explaining how to quote multiple flags correctly can be generated with
-    `.report()`.
+    This is a compromise to respect quoting of key-value pairs on the CLI. The shell
+    strips quotes from quoted arguments, so we cannot know *exactly* how CLI arguments
+    were quoted. To compensate, we re-add quotes around anything staritng with ``name=``
+    or ``name==``, and we assume the rest of the argument is the value. This covers the
+    common cases of passign flags, e.g., ``cflags="-O2 -g"`` on the command line.
     """
+    match = re.match(spack.parser.SPLIT_KVP, string)
+    if not match:
+        return string
 
-    flags_arg_pattern = re.compile(
-        r'^({0})=([^\'"].*)$'.format("|".join(spack.spec.FlagMap.valid_compiler_flags()))
-    )
+    key, delim, value = match.groups()
+    return f"{key}{delim}{spack.parser.quote_if_needed(value)}"
 
-    def __init__(self, all_unquoted_flag_pairs: List[Tuple[Match[str], str]]):
-        self._flag_pairs = all_unquoted_flag_pairs
-
-    def __bool__(self) -> bool:
-        return bool(self._flag_pairs)
-
-    @classmethod
-    def extract(cls, sargs: str) -> "_UnquotedFlags":
-        all_unquoted_flag_pairs: List[Tuple[Match[str], str]] = []
-        prev_flags_arg = None
-        for arg in shlex.split(sargs):
-            if prev_flags_arg is not None:
-                all_unquoted_flag_pairs.append((prev_flags_arg, arg))
-            prev_flags_arg = cls.flags_arg_pattern.match(arg)
-        return cls(all_unquoted_flag_pairs)
-
-    def report(self) -> str:
-        single_errors = [
-            "({0}) {1} {2} => {3}".format(
-                i + 1,
-                match.group(0),
-                next_arg,
-                '{0}="{1} {2}"'.format(match.group(1), match.group(2), next_arg),
-            )
-            for i, (match, next_arg) in enumerate(self._flag_pairs)
-        ]
-        return dedent(
-            """\
-        Some compiler or linker flags were provided without quoting their arguments,
-        which now causes spack to try to parse the *next* argument as a spec component
-        such as a variant instead of an additional compiler or linker flag. If the
-        intent was to set multiple flags, try quoting them together as described below.
-
-        Possible flag quotation errors (with the correctly-quoted version after the =>):
-        {0}"""
-        ).format("\n".join(single_errors))
-
-
-def parse_specs(args, **kwargs):
+
+def parse_specs(
+    args: Union[str, List[str]], concretize: bool = False, tests: bool = False
+) -> List[spack.spec.Spec]:
     """Convenience function for parsing arguments from specs.  Handles common
     exceptions and dies if there are errors.
     """
-    concretize = kwargs.get("concretize", False)
-    normalize = kwargs.get("normalize", False)
-    tests = kwargs.get("tests", False)
-
-    sargs = args
-    if not isinstance(args, str):
-        sargs = " ".join(args)
-    unquoted_flags = _UnquotedFlags.extract(sargs)
+    args = [args] if isinstance(args, str) else args
+    arg_string = " ".join([quote_kvp(arg) for arg in args])
 
-    try:
-        specs = spack.parser.parse(sargs)
-        for spec in specs:
-            if concretize:
-                spec.concretize(tests=tests)  # implies normalize
-            elif normalize:
-                spec.normalize(tests=tests)
-        return specs
-
-    except spack.error.SpecError as e:
-        msg = e.message
-        if e.long_message:
-            msg += e.long_message
-        # Unquoted flags will be read as a variant or hash
-        if unquoted_flags and ("variant" in msg or "hash" in msg):
-            msg += "\n\n"
-            msg += unquoted_flags.report()
-
-        raise spack.error.SpackError(msg) from e
+    specs = spack.parser.parse(arg_string)
+    for spec in specs:
+        if concretize:
+            spec.concretize(tests=tests)
+    return specs
 
 
 def matching_spec_from_env(spec):
diff --git a/lib/spack/spack/parser.py b/lib/spack/spack/parser.py
index b0a8d8ad83..63c139b9f0 100644
--- a/lib/spack/spack/parser.py
+++ b/lib/spack/spack/parser.py
@@ -58,6 +58,7 @@ specs to avoid ambiguity.  Both are provided because ~ can cause shell
 expansion when it is the first character in an id typed on the command line.
 """
 import enum
+import json
 import pathlib
 import re
 import sys
@@ -95,13 +96,55 @@ if not IS_WINDOWS:
 else:
     FILENAME = WINDOWS_FILENAME
 
+#: These are legal values that *can* be parsed bare, without quotes on the command line.
 VALUE = r"(?:[a-zA-Z_0-9\-+\*.,:=\~\/\\]+)"
-QUOTED_VALUE = r"[\"']+(?:[a-zA-Z_0-9\-+\*.,:=\~\/\\\s]+)[\"']+"
+
+#: Variant/flag values that match this can be left unquoted in Spack output
+NO_QUOTES_NEEDED = r"^[a-zA-Z0-9,/_.-]+$"
+
+#: Quoted values can be *anything* in between quotes, including escaped quotes.
+QUOTED_VALUE = r"(?:'(?:[^']|(?<=\\)')*'|\"(?:[^\"]|(?<=\\)\")*\")"
 
 VERSION = r"=?(?:[a-zA-Z0-9_][a-zA-Z_0-9\-\.]*\b)"
 VERSION_RANGE = rf"(?:(?:{VERSION})?:(?:{VERSION}(?!\s*=))?)"
 VERSION_LIST = rf"(?:{VERSION_RANGE}|{VERSION})(?:\s*,\s*(?:{VERSION_RANGE}|{VERSION}))*"
 
+#: Regex with groups to use for splitting (optionally propagated) key-value pairs
+SPLIT_KVP = rf"^({NAME})(==?)(.*)$"
+
+#: Regex to strip quotes. Group 2 will be the unquoted string.
+STRIP_QUOTES = r"^(['\"])(.*)\1$"
+
+
+def strip_quotes_and_unescape(string: str) -> str:
+    """Remove surrounding single or double quotes from string, if present."""
+    match = re.match(STRIP_QUOTES, string)
+    if not match:
+        return string
+
+    # replace any escaped quotes with bare quotes
+    quote, result = match.groups()
+    return result.replace(rf"\{quote}", quote)
+
+
+def quote_if_needed(value: str) -> str:
+    """Add quotes around the value if it requires quotes.
+
+    This will add quotes around the value unless it matches ``NO_QUOTES_NEEDED``.
+
+    This adds:
+    * single quotes by default
+    * double quotes around any value that contains single quotes
+
+    If double quotes are used, we json-escpae the string. That is, we escape ``\\``,
+    ``"``, and control codes.
+
+    """
+    if re.match(spack.parser.NO_QUOTES_NEEDED, value):
+        return value
+
+    return json.dumps(value) if "'" in value else f"'{value}'"
+
 
 class TokenBase(enum.Enum):
     """Base class for an enum type with a regex value"""
@@ -138,8 +181,8 @@ class TokenType(TokenBase):
     # Variants
     PROPAGATED_BOOL_VARIANT = rf"(?:(?:\+\+|~~|--)\s*{NAME})"
     BOOL_VARIANT = rf"(?:[~+-]\s*{NAME})"
-    PROPAGATED_KEY_VALUE_PAIR = rf"(?:{NAME}\s*==\s*(?:{VALUE}|{QUOTED_VALUE}))"
-    KEY_VALUE_PAIR = rf"(?:{NAME}\s*=\s*(?:{VALUE}|{QUOTED_VALUE}))"
+    PROPAGATED_KEY_VALUE_PAIR = rf"(?:{NAME}==(?:{VALUE}|{QUOTED_VALUE}))"
+    KEY_VALUE_PAIR = rf"(?:{NAME}=(?:{VALUE}|{QUOTED_VALUE}))"
     # Compilers
     COMPILER_AND_VERSION = rf"(?:%\s*(?:{NAME})(?:[\s]*)@\s*(?:{VERSION_LIST}))"
     COMPILER = rf"(?:%\s*(?:{NAME}))"
@@ -351,12 +394,14 @@ class SpecNodeParser:
         # accept another package name afterwards in a node
         if self.ctx.accept(TokenType.UNQUALIFIED_PACKAGE_NAME):
             initial_spec.name = self.ctx.current_token.value
+
         elif self.ctx.accept(TokenType.FULLY_QUALIFIED_PACKAGE_NAME):
             parts = self.ctx.current_token.value.split(".")
             name = parts[-1]
             namespace = ".".join(parts[:-1])
             initial_spec.name = name
             initial_spec.namespace = namespace
+
         elif self.ctx.accept(TokenType.FILENAME):
             return FileParser(self.ctx).parse(initial_spec)
 
@@ -370,6 +415,7 @@ class SpecNodeParser:
                 compiler_name = self.ctx.current_token.value[1:]
                 initial_spec.compiler = spack.spec.CompilerSpec(compiler_name.strip(), ":")
                 self.has_compiler = True
+
             elif self.ctx.accept(TokenType.COMPILER_AND_VERSION):
                 if self.has_compiler:
                     raise spack.spec.DuplicateCompilerSpecError(
@@ -381,6 +427,7 @@ class SpecNodeParser:
                     compiler_name.strip(), compiler_version
                 )
                 self.has_compiler = True
+
             elif (
                 self.ctx.accept(TokenType.VERSION_HASH_PAIR)
                 or self.ctx.accept(TokenType.GIT_VERSION)
@@ -395,31 +442,39 @@ class SpecNodeParser:
                 )
                 initial_spec.attach_git_version_lookup()
                 self.has_version = True
+
             elif self.ctx.accept(TokenType.BOOL_VARIANT):
                 variant_value = self.ctx.current_token.value[0] == "+"
                 initial_spec._add_flag(
                     self.ctx.current_token.value[1:].strip(), variant_value, propagate=False
                 )
+
             elif self.ctx.accept(TokenType.PROPAGATED_BOOL_VARIANT):
                 variant_value = self.ctx.current_token.value[0:2] == "++"
                 initial_spec._add_flag(
                     self.ctx.current_token.value[2:].strip(), variant_value, propagate=True
                 )
+
             elif self.ctx.accept(TokenType.KEY_VALUE_PAIR):
-                name, value = self.ctx.current_token.value.split("=", maxsplit=1)
-                name = name.strip("'\" ")
-                value = value.strip("'\" ")
-                initial_spec._add_flag(name, value, propagate=False)
+                match = re.match(SPLIT_KVP, self.ctx.current_token.value)
+                assert match, "SPLIT_KVP and KEY_VALUE_PAIR do not agree."
+
+                name, delim, value = match.groups()
+                initial_spec._add_flag(name, strip_quotes_and_unescape(value), propagate=False)
+
             elif self.ctx.accept(TokenType.PROPAGATED_KEY_VALUE_PAIR):
-                name, value = self.ctx.current_token.value.split("==", maxsplit=1)
-                name = name.strip("'\" ")
-                value = value.strip("'\" ")
-                initial_spec._add_flag(name, value, propagate=True)
+                match = re.match(SPLIT_KVP, self.ctx.current_token.value)
+                assert match, "SPLIT_KVP and PROPAGATED_KEY_VALUE_PAIR do not agree."
+
+                name, delim, value = match.groups()
+                initial_spec._add_flag(name, strip_quotes_and_unescape(value), propagate=True)
+
             elif self.ctx.expect(TokenType.DAG_HASH):
                 if initial_spec.abstract_hash:
                     break
                 self.ctx.accept(TokenType.DAG_HASH)
                 initial_spec.abstract_hash = self.ctx.current_token.value[1:]
+
             else:
                 break
 
diff --git a/lib/spack/spack/spec.py b/lib/spack/spack/spec.py
index 88184de30f..600ce988d7 100644
--- a/lib/spack/spack/spec.py
+++ b/lib/spack/spack/spec.py
@@ -910,19 +910,23 @@ class FlagMap(lang.HashableMap):
             yield flags
 
     def __str__(self):
-        sorted_keys = [k for k in sorted(self.keys()) if self[k] != []]
-        cond_symbol = " " if len(sorted_keys) > 0 else ""
-        return (
-            cond_symbol
-            + " ".join(
-                key
-                + ('=="' if True in [f.propagate for f in self[key]] else '="')
-                + " ".join(self[key])
-                + '"'
-                for key in sorted_keys
-            )
-            + cond_symbol
-        )
+        sorted_items = sorted((k, v) for k, v in self.items() if v)
+
+        result = ""
+        for flag_type, flags in sorted_items:
+            normal = [f for f in flags if not f.propagate]
+            if normal:
+                result += f" {flag_type}={spack.parser.quote_if_needed(' '.join(normal))}"
+
+            propagated = [f for f in flags if f.propagate]
+            if propagated:
+                result += f" {flag_type}=={spack.parser.quote_if_needed(' '.join(propagated))}"
+
+        # TODO: somehow add this space only if something follows in Spec.format()
+        if sorted_items:
+            result += " "
+
+        return result
 
 
 def _sort_by_dep_types(dspec: DependencySpec):
diff --git a/lib/spack/spack/test/cmd/common/arguments.py b/lib/spack/spack/test/cmd/common/arguments.py
index e889fe55d8..dcfe86aa51 100644
--- a/lib/spack/spack/test/cmd/common/arguments.py
+++ b/lib/spack/spack/test/cmd/common/arguments.py
@@ -50,8 +50,8 @@ def test_negative_integers_not_allowed_for_parallel_jobs(job_parser):
     [
         (['coreutils cflags="-O3 -g"'], ["-O3", "-g"], [False, False], []),
         (['coreutils cflags=="-O3 -g"'], ["-O3", "-g"], [True, True], []),
-        (["coreutils", "cflags=-O3 -g"], ["-O3"], [False], ["g"]),
-        (["coreutils", "cflags==-O3 -g"], ["-O3"], [True], ["g"]),
+        (["coreutils", "cflags=-O3 -g"], ["-O3", "-g"], [False, False], []),
+        (["coreutils", "cflags==-O3 -g"], ["-O3", "-g"], [True, True], []),
         (["coreutils", "cflags=-O3", "-g"], ["-O3"], [False], ["g"]),
     ],
 )
diff --git a/lib/spack/spack/test/cmd/spec.py b/lib/spack/spack/test/cmd/spec.py
index 763d83bf0a..39fff0bbd4 100644
--- a/lib/spack/spack/test/cmd/spec.py
+++ b/lib/spack/spack/test/cmd/spec.py
@@ -4,7 +4,6 @@
 # SPDX-License-Identifier: (Apache-2.0 OR MIT)
 
 import re
-from textwrap import dedent
 
 import pytest
 
@@ -74,41 +73,6 @@ def test_spec_parse_cflags_quoting():
     assert ["-flto", "-Os"] == gh_flagged.compiler_flags["cxxflags"]
 
 
-def test_spec_parse_unquoted_flags_report():
-    """Verify that a useful error message is produced if unquoted compiler flags are
-    provided."""
-    # This should fail during parsing, since /usr/include is interpreted as a spec hash.
-    with pytest.raises(spack.error.SpackError) as cm:
-        # We don't try to figure out how many following args were intended to be part of
-        # cflags, we just explain how to fix it for the immediate next arg.
-        spec("gcc cflags=-Os -pipe -other-arg-that-gets-ignored cflags=-I /usr/include")
-    # Verify that the generated error message is nicely formatted.
-
-    expected_message = dedent(
-        '''\
-    Some compiler or linker flags were provided without quoting their arguments,
-    which now causes spack to try to parse the *next* argument as a spec component
-    such as a variant instead of an additional compiler or linker flag. If the
-    intent was to set multiple flags, try quoting them together as described below.
-
-    Possible flag quotation errors (with the correctly-quoted version after the =>):
-    (1) cflags=-Os -pipe => cflags="-Os -pipe"
-    (2) cflags=-I /usr/include => cflags="-I /usr/include"'''
-    )
-
-    assert expected_message in str(cm.value)
-
-    # Verify that the same unquoted cflags report is generated in the error message even
-    # if it fails during concretization, not just during parsing.
-    with pytest.raises(spack.error.SpackError) as cm:
-        spec("gcc cflags=-Os -pipe")
-    cm = str(cm.value)
-    assert cm.startswith(
-        'trying to set variant "pipe" in package "gcc", but the package has no such variant'
-    )
-    assert cm.endswith('(1) cflags=-Os -pipe => cflags="-Os -pipe"')
-
-
 def test_spec_yaml():
     output = spec("--yaml", "mpileaks")
 
diff --git a/lib/spack/spack/test/spec_syntax.py b/lib/spack/spack/test/spec_syntax.py
index 3cbb59e69f..e2282df745 100644
--- a/lib/spack/spack/test/spec_syntax.py
+++ b/lib/spack/spack/test/spec_syntax.py
@@ -2,6 +2,7 @@
 # Spack Project Developers. See the top-level COPYRIGHT file for details.
 #
 # SPDX-License-Identifier: (Apache-2.0 OR MIT)
+import inspect
 import itertools
 import os
 import re
@@ -9,6 +10,7 @@ import sys
 
 import pytest
 
+import spack.cmd
 import spack.platforms.test
 import spack.spec
 import spack.variant
@@ -203,7 +205,8 @@ def specfile_for(default_mock_concretization):
             "mvapich_foo ^_openmpi@1.2:1.4,1.6%intel@12.1~qt_4 debug=2 ^stackwalker@8.1_1e",
         ),
         (
-            "mvapich_foo ^_openmpi@1.2:1.4,1.6%intel@12.1 cppflags=-O3 +debug~qt_4 ^stackwalker@8.1_1e",  # noqa: E501
+            "mvapich_foo ^_openmpi@1.2:1.4,1.6%intel@12.1 cppflags=-O3 +debug~qt_4 "
+            "^stackwalker@8.1_1e",
             [
                 Token(TokenType.UNQUALIFIED_PACKAGE_NAME, value="mvapich_foo"),
                 Token(TokenType.DEPENDENCY, value="^"),
@@ -217,7 +220,8 @@ def specfile_for(default_mock_concretization):
                 Token(TokenType.UNQUALIFIED_PACKAGE_NAME, value="stackwalker"),
                 Token(TokenType.VERSION, value="@8.1_1e"),
             ],
-            'mvapich_foo ^_openmpi@1.2:1.4,1.6%intel@12.1 cppflags="-O3" +debug~qt_4 ^stackwalker@8.1_1e',  # noqa: E501
+            "mvapich_foo ^_openmpi@1.2:1.4,1.6%intel@12.1 cppflags=-O3 +debug~qt_4 "
+            "^stackwalker@8.1_1e",
         ),
         # Specs containing YAML or JSON in the package name
         (
@@ -424,7 +428,7 @@ def specfile_for(default_mock_concretization):
         compiler_with_version_range("%gcc@10.1.0,12.2.1:"),
         compiler_with_version_range("%gcc@:8.4.3,10.2.1:12.1.0"),
         # Special key value arguments
-        ("dev_path=*", [Token(TokenType.KEY_VALUE_PAIR, value="dev_path=*")], "dev_path=*"),
+        ("dev_path=*", [Token(TokenType.KEY_VALUE_PAIR, value="dev_path=*")], "dev_path='*'"),
         (
             "dev_path=none",
             [Token(TokenType.KEY_VALUE_PAIR, value="dev_path=none")],
@@ -444,33 +448,28 @@ def specfile_for(default_mock_concretization):
         (
             "cflags=a=b=c",
             [Token(TokenType.KEY_VALUE_PAIR, value="cflags=a=b=c")],
-            'cflags="a=b=c"',
+            "cflags='a=b=c'",
         ),
         (
             "cflags=a=b=c",
             [Token(TokenType.KEY_VALUE_PAIR, value="cflags=a=b=c")],
-            'cflags="a=b=c"',
+            "cflags='a=b=c'",
         ),
         (
             "cflags=a=b=c+~",
             [Token(TokenType.KEY_VALUE_PAIR, value="cflags=a=b=c+~")],
-            'cflags="a=b=c+~"',
+            "cflags='a=b=c+~'",
         ),
         (
             "cflags=-Wl,a,b,c",
             [Token(TokenType.KEY_VALUE_PAIR, value="cflags=-Wl,a,b,c")],
-            'cflags="-Wl,a,b,c"',
+            "cflags=-Wl,a,b,c",
         ),
         # Multi quoted
         (
-            "cflags=''-Wl,a,b,c''",
-            [Token(TokenType.KEY_VALUE_PAIR, value="cflags=''-Wl,a,b,c''")],
-            'cflags="-Wl,a,b,c"',
-        ),
-        (
             'cflags=="-O3 -g"',
             [Token(TokenType.PROPAGATED_KEY_VALUE_PAIR, value='cflags=="-O3 -g"')],
-            'cflags=="-O3 -g"',
+            "cflags=='-O3 -g'",
         ),
         # Whitespace is allowed in version lists
         ("@1.2:1.4 , 1.6 ", [Token(TokenType.VERSION, value="@1.2:1.4 , 1.6")], "@1.2:1.4,1.6"),
@@ -485,22 +484,6 @@ def specfile_for(default_mock_concretization):
             "a@1:",
         ),
         (
-            "@1.2:   develop   = foo",
-            [
-                Token(TokenType.VERSION, value="@1.2:"),
-                Token(TokenType.KEY_VALUE_PAIR, value="develop   = foo"),
-            ],
-            "@1.2: develop=foo",
-        ),
-        (
-            "@1.2:develop   = foo",
-            [
-                Token(TokenType.VERSION, value="@1.2:"),
-                Token(TokenType.KEY_VALUE_PAIR, value="develop   = foo"),
-            ],
-            "@1.2: develop=foo",
-        ),
-        (
             "% intel @ 12.1:12.6 + debug",
             [
                 Token(TokenType.COMPILER_AND_VERSION, value="% intel @ 12.1:12.6"),
@@ -587,8 +570,8 @@ def specfile_for(default_mock_concretization):
 )
 def test_parse_single_spec(spec_str, tokens, expected_roundtrip):
     parser = SpecParser(spec_str)
-    assert parser.tokens() == tokens
-    assert str(parser.next_spec()) == expected_roundtrip
+    assert tokens == parser.tokens()
+    assert expected_roundtrip == str(parser.next_spec())
 
 
 @pytest.mark.parametrize(
@@ -655,19 +638,79 @@ def test_parse_multiple_specs(text, tokens, expected_specs):
 
 
 @pytest.mark.parametrize(
+    "args,expected",
+    [
+        # Test that CLI-quoted flags/variant values are preserved
+        (["zlib", "cflags=-O3 -g", "+bar", "baz"], "zlib cflags='-O3 -g' +bar baz"),
+        # Test that CLI-quoted propagated flags/variant values are preserved
+        (["zlib", "cflags==-O3 -g", "+bar", "baz"], "zlib cflags=='-O3 -g' +bar baz"),
+        # An entire string passed on the CLI with embedded quotes also works
+        (["zlib cflags='-O3 -g' +bar baz"], "zlib cflags='-O3 -g' +bar baz"),
+        # Entire string *without* quoted flags splits -O3/-g (-g interpreted as a variant)
+        (["zlib cflags=-O3 -g +bar baz"], "zlib cflags=-O3 +bar~g baz"),
+        # If the entirety of "-O3 -g +bar baz" is quoted on the CLI, it's all taken as flags
+        (["zlib", "cflags=-O3 -g +bar baz"], "zlib cflags='-O3 -g +bar baz'"),
+        # If the string doesn't start with key=, it needs internal quotes for flags
+        (["zlib", " cflags=-O3 -g +bar baz"], "zlib cflags=-O3 +bar~g baz"),
+        # Internal quotes for quoted CLI args are considered part of *one* arg
+        (["zlib", 'cflags="-O3 -g" +bar baz'], """zlib cflags='"-O3 -g" +bar baz'"""),
+        # Use double quotes if internal single quotes are present
+        (["zlib", "cflags='-O3 -g' +bar baz"], '''zlib cflags="'-O3 -g' +bar baz"'''),
+        # Use single quotes and escape single quotes with internal single and double quotes
+        (["zlib", "cflags='-O3 -g' \"+bar baz\""], 'zlib cflags="\'-O3 -g\' \\"+bar baz\\""'),
+        # Ensure that empty strings are handled correctly on CLI
+        (["zlib", "ldflags=", "+pic"], "zlib+pic"),
+        # These flags are assumed to be quoted by the shell, but the space doesn't matter because
+        # flags are space-separated.
+        (["zlib", "ldflags= +pic"], "zlib ldflags='+pic'"),
+        (["ldflags= +pic"], "ldflags='+pic'"),
+        # If the name is not a flag name, the space is preserved verbatim, because variant values
+        # are comma-separated.
+        (["zlib", "foo= +pic"], "zlib foo=' +pic'"),
+        (["foo= +pic"], "foo=' +pic'"),
+        # You can ensure no quotes are added parse_specs() by starting your string with space,
+        # but you still need to quote empty strings properly.
+        ([" ldflags= +pic"], SpecTokenizationError),
+        ([" ldflags=", "+pic"], SpecTokenizationError),
+        ([" ldflags='' +pic"], "+pic"),
+        ([" ldflags=''", "+pic"], "+pic"),
+        # Ensure that empty strings are handled properly in quoted strings
+        (["zlib ldflags='' +pic"], "zlib+pic"),
+        # Ensure that $ORIGIN is handled correctly
+        (["zlib", "ldflags=-Wl,-rpath=$ORIGIN/_libs"], "zlib ldflags='-Wl,-rpath=$ORIGIN/_libs'"),
+        # Ensure that passing escaped quotes on the CLI raises a tokenization error
+        (["zlib", '"-g', '-O2"'], SpecTokenizationError),
+    ],
+)
+def test_cli_spec_roundtrip(args, expected):
+    if inspect.isclass(expected) and issubclass(expected, BaseException):
+        with pytest.raises(expected):
+            spack.cmd.parse_specs(args)
+        return
+
+    specs = spack.cmd.parse_specs(args)
+    output_string = " ".join(str(spec) for spec in specs)
+    assert expected == output_string
+
+
+@pytest.mark.parametrize(
     "text,expected_in_error",
     [
-        ("x@@1.2", "x@@1.2\n ^^^^^"),
-        ("y ^x@@1.2", "y ^x@@1.2\n   ^^^^^"),
-        ("x@1.2::", "x@1.2::\n      ^"),
-        ("x::", "x::\n ^^"),
+        ("x@@1.2", r"x@@1.2\n ^"),
+        ("y ^x@@1.2", r"y ^x@@1.2\n    ^"),
+        ("x@1.2::", r"x@1.2::\n      ^"),
+        ("x::", r"x::\n ^^"),
+        ("cflags=''-Wl,a,b,c''", r"cflags=''-Wl,a,b,c''\n            ^ ^ ^ ^^"),
+        ("@1.2:   develop   = foo", r"@1.2:   develop   = foo\n                  ^^"),
+        ("@1.2:develop   = foo", r"@1.2:develop   = foo\n               ^^"),
     ],
 )
 def test_error_reporting(text, expected_in_error):
     parser = SpecParser(text)
     with pytest.raises(SpecTokenizationError) as exc:
         parser.tokens()
-        assert expected_in_error in str(exc), parser.tokens()
+
+    assert expected_in_error in str(exc), parser.tokens()
 
 
 @pytest.mark.parametrize(
diff --git a/lib/spack/spack/variant.py b/lib/spack/spack/variant.py
index 9bea903aac..f9f6f8b960 100644
--- a/lib/spack/spack/variant.py
+++ b/lib/spack/spack/variant.py
@@ -19,6 +19,7 @@ from llnl.string import comma_or
 
 import spack.directives
 import spack.error as error
+import spack.parser
 
 special_variant_values = [None, "none", "*"]
 
@@ -399,13 +400,12 @@ class AbstractVariant:
         return item in self._value
 
     def __repr__(self):
-        cls = type(self)
-        return "{0.__name__}({1}, {2})".format(cls, repr(self.name), repr(self._original_value))
+        return f"{type(self).__name__}({repr(self.name)}, {repr(self._original_value)})"
 
     def __str__(self):
-        if self.propagate:
-            return "{0}=={1}".format(self.name, ",".join(str(x) for x in self.value))
-        return "{0}={1}".format(self.name, ",".join(str(x) for x in self.value))
+        delim = "==" if self.propagate else "="
+        values = spack.parser.quote_if_needed(",".join(str(v) for v in self.value))
+        return f"{self.name}{delim}{values}"
 
 
 class MultiValuedVariant(AbstractVariant):
@@ -443,15 +443,14 @@ class MultiValuedVariant(AbstractVariant):
         self._original_value = ",".join(self._value)
 
     def __str__(self):
-        # Special-case patches to not print the full 64 character hashes
+        # Special-case patches to not print the full 64 character sha256
         if self.name == "patches":
             values_str = ",".join(x[:7] for x in self.value)
         else:
             values_str = ",".join(str(x) for x in self.value)
 
-        if self.propagate:
-            return "{0}=={1}".format(self.name, values_str)
-        return "{0}={1}".format(self.name, values_str)
+        delim = "==" if self.propagate else "="
+        return f"{self.name}{delim}{spack.parser.quote_if_needed(values_str)}"
 
 
 class SingleValuedVariant(AbstractVariant):
@@ -467,9 +466,8 @@ class SingleValuedVariant(AbstractVariant):
         self._value = str(self._value[0])
 
     def __str__(self):
-        if self.propagate:
-            return "{0}=={1}".format(self.name, self.value)
-        return "{0}={1}".format(self.name, self.value)
+        delim = "==" if self.propagate else "="
+        return f"{self.name}{delim}{spack.parser.quote_if_needed(self.value)}"
 
     @implicit_variant_conversion
     def satisfies(self, other):
author	Todd Gamblin <tgamblin@llnl.gov>	2023-12-13 16:36:22 -0800
committer	GitHub <noreply@github.com>	2023-12-13 16:36:22 -0800
commit	a690b8c27ccf33745afbdf7b1ce800fa07510b78 (patch)
tree	2a190da3f952b575040490e11942a756549e304a /lib
parent	a1fa862c3f3334055344c6cc77368f7ed4f7c976 (diff)
download	spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.gz spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.bz2 spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.tar.xz spack-a690b8c27ccf33745afbdf7b1ce800fa07510b78.zip