summaryrefslogtreecommitdiff
path: root/.github/CONTRIBUTING.md
diff options
context:
space:
mode:
authorTodd Gamblin <tgamblin@llnl.gov>2021-09-26 16:20:26 -0700
committerTodd Gamblin <tgamblin@llnl.gov>2021-10-04 18:30:19 -0700
commit052b2e1b08d54374694abca2321c789a98101243 (patch)
treeb804401f2fdb4042452744b0cf93b967a0d61541 /.github/CONTRIBUTING.md
parent472638f025abfd08c9ed94be4af2d475c09c32dd (diff)
downloadspack-052b2e1b08d54374694abca2321c789a98101243.tar.gz
spack-052b2e1b08d54374694abca2321c789a98101243.tar.bz2
spack-052b2e1b08d54374694abca2321c789a98101243.tar.xz
spack-052b2e1b08d54374694abca2321c789a98101243.zip
cc: convert compiler wrapper to posix shell
This converts everything in cc to POSIX sh, except for the parts currently handled with bash arrays. Tests are still passing. This version tries to be as straightforward as possible. Specifically, most conversions are kept simple -- convert ifs to ifs, handle indirect expansion the way we do in `setup-env.sh`, only mess with the logic in `cc`, and don't mess with the python code at all. The big refactor is for arrays. We can't rely on bash's nice arrays and be ignorant of separators anymore. So: 1. To avoid complicated separator logic, there are three types of lists. They are: * `$lsep`-separated lists, which end with `_list`. `lsep` is customizable, but we picked `^G` (alarm bell) for `$lsep` because it's ASCII and it's unlikely that it would actually appear in any arguments. If we need to get fancier (and I will lose faith in the world if we do) then we could consider XON or XOFF. * `:`-separated directory lists, which end with `_dirs`, `_DIRS`, `PATH`, or `PATHS` * Whitespace-separated lists (like flags), which can have any other name. Whitespace and colon-separated lists come with the territory with PATHs from env vars and lists of flags. `^G` separated lists are what we use for most internal variables, b/c it's more likely to work. 2. To avoid subshells, use a bunch of functions that do dirty `eval` stuff instead. This adds 3 functions to deal with lists: * `append LISTNAME ELEMENT [SEP]` will put `ELEMENT` at the end of the list called `LISTNAME`. You can optionally say what separator you expect to use. Note that we are taking advantage of everything being global and passing lists by name. * `prepend LISTNAME ELEMENT [SEP]` like append, but puts `ELEMENT` at the start of `LISTNAME` * `extend LISTNAME1 LISTNAME2 [PREFIX]` appends everything in LISTNAME2 to LISTNAME1, and optionally prepends `PREFIX` to every element (this is useful for things like `-I`, `-isystem `, etc. * `preextend LISTNAME1 LISTNAME2 [PREFIX]` prepends everything in LISTNAME2 to LISTNAME1 in order, and optionally prepends `PREFIX` to every element. The routines determine the separator for each argument by its name, so we don't have to pass around separators everywhere. Amazingly, as long as you do not expand variables' values within an `eval` environment, you can do all this and still preserve quoting. When iterating over lists, the user of this API still has to set and unset `IFS` properly. We ended up having to ignore shellcheck SC2034 (unused variable), because using evals all over the place means that shellcheck doesn't notice that our list variables are actually used. So far this is looking pretty good. I took the most complex unit test I could find (which runs a sample link line) and ran the same command line 200 times in a shell script. Times are roughly as follows: For this invocation: ```console $ bash -c 'time (for i in `seq 1 200`; do ~/test_cc.sh > /dev/null; done)' ``` I get the following performance numbers (the listed shells are what I put in `cc`'s shebang): **Original** * Old version of `cc` with arrays and `bash v3.2.57` (macOS builtin): `4.462s` (`.022s` / call) * Old version of `cc` with arrays and `bash v5.1.8` (Homebrew): `3.267s` (`.016s` / call) **Using many subshells (#26408)** * with `bash v3.2.57`: `25.302s` (`.127s` / call) * with `bash v5.1.8`: `27.801s` (`.139s` / call) * with `dash`: `15.302s` (`.077s` / call) This version didn't seem to work with zsh. **This PR (no subshells)** * with `bash v3.2.57`: `4.973s` (`.025s` / call) * with `bash v5.1.8`: `4.984s` (`.025s` / call) * with `zsh`: `2.995s` (`.015s` / call) * with `dash`: `1.890s` (`.0095s` / call) Dash, with the new posix design, is easily the winner. So there are several interesting things to note here: 1. Running the posix version in `bash` is slower than using `bash` arrays. That is to be expected because it's doing a bunch of string processing where it likely did not have to before, at least in `bash`. 2. `zsh`, at least on macOS, is significantly faster than the ancient `bash` they ship with the system. Using `zsh` with the new version also makes the posix wrappers faster than `develop`. So it's worth preferring `zsh` if we have it. I suppose we should also try this with newer `bash` on Linux. 3. `bash v5.1.8` seems to be significantly faster than the old system `bash v3.2.57` for arrays. For straight POSIX stuff, it's a little slower. It did not seem to matter whether `--posix` was used. 4. `dash` is way faster than `bash` or `zsh`, so the real payoff just comes from being able to use it. I am not sure if that is mostly startup time, but it's significant. `dash` is ~2.4x faster than the original `bash` with arrays. So, doing a lot of string stuff is slower than arrays, but converting to posix seems worth it to be able to exploit `dash`. - [x] Convert all but array-related portions to sh - [x] Fix basic shellcheck issues. - [x] Convert arrays to use a few convenience functions: `append` and `extend` - [x] Get `cc` tests passing. - [x] Add `cc` tests where needed passing. - [x] Benchmarking. Co-authored-by: Tom Scogland <scogland1@llnl.gov> Co-authored-by: Danny McClanahan <1305167+cosmicexplorer@users.noreply.github.com>
Diffstat (limited to '.github/CONTRIBUTING.md')
0 files changed, 0 insertions, 0 deletions