author | Todd Gamblin <tgamblin@llnl.gov> | 2018-01-31 21:57:56 -0800 |
---|---|---|
committer | becker33 <becker33@llnl.gov> | 2018-01-31 21:57:56 -0800 |
commit | df758e1cfc2e25d08eb25f57c2add3548bedfc14 (patch) | |
tree | 1845767cce1bd0706dd9ce17ecb8af61a3dc21a0 /CONTRIBUTING.md | |
parent | 514f0bf5c546d666f32c02ea652658dcf3b85c8c (diff) | |
Improve log parsing performance (#7093)
* Allow dashes in command names and fix command name handling
- Commands should allow dashes in their names like the rest of spack,
e.g. `spack log-parse`
- It might be too late for `spack build-cache` (since it is already
called `spack buildcache`), but we should try a bit to avoid
inconsistencies in naming conventions
- The code was inconsistent about where commands should be called by
their python module name (e.g. `log_parse`) and where the actual
command name should be used (e.g. `log-parse`).
- This made it hard to make a command with a dash in the name, and it
made `SpackCommand` fail to recognize commands with dashes.
- The code now uses the user-facing name with dashes for function
parameters, then converts that to the module name when needed.
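
The conversion described above can be sketched as follows. This is a minimal illustration, assuming the mapping is a plain dash/underscore swap; the helper names `to_module_name` and `to_cmd_name` are hypothetical and are not taken from the spack source.

```python
def to_module_name(cmd_name):
    """Convert a user-facing command name (e.g. 'log-parse') to a module name."""
    # Python module names cannot contain dashes, so swap them for underscores.
    return cmd_name.replace("-", "_")


def to_cmd_name(module_name):
    """Convert a Python module name (e.g. 'log_parse') to the user-facing name."""
    return module_name.replace("_", "-")
```

Taking the dashed name everywhere and converting only at import time keeps the two spellings from leaking into each other.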
* Improve performance of log parsing
- A number of regular expressions from ctest_log_parser have really poor
performance, mostly due to unanchored expressions with * or + (i.e., they
don't start with ^, so the repetition has to be checked for every
position in the string with Python's backtracking regex implementation).
- I can't verify that CTest's regexes work with an added ^, so I don't
really want to touch them. I tried adding this and found that it
caused some tests to break.
- Instead of restricting the parser to "efficient" regular expressions,
added a prefilter() class that allows the parser to quickly check a
precondition before evaluating any of the expensive regexes.
- Preconditions do things like check whether the string contains "error"
or "warning" (linear time things) before evaluating regexes that would
require them. It's sad that Python doesn't use Thompson's NFA-based
regex matching (see https://swtch.com/~rsc/regexp/regexp1.html).
- Even with Python's slow implementation, this makes the parser ~200x
faster on the input we tried it on.
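
A rough sketch of the prefilter idea, for illustration only. The class name `Prefilter`, its constructor signature, and the two patterns are assumptions made for this example; they are not the actual `prefilter()` class or the CTest regexes.

```python
import re


class Prefilter:
    """Guard expensive regexes with a cheap, linear-time precondition."""

    def __init__(self, precondition, *patterns):
        self.precondition = precondition
        self.patterns = [re.compile(p) for p in patterns]

    def search(self, line):
        # Skip the regexes entirely unless the cheap check passes.
        if not self.precondition(line):
            return False
        return any(p.search(line) for p in self.patterns)


# Illustrative patterns only; both can match only if the line contains "error"
# in some capitalization, so the substring check is a safe precondition.
error_filter = Prefilter(
    lambda line: "error" in line.lower(),
    r"[^ :]+:[0-9]+: .*[Ee]rror",
    r"^[Ee]rror:",
)
```

Most lines of build output fail the substring check, so under this scheme the backtracking regexes would run only on the small fraction of lines that could possibly match.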
* Add `spack log-parse` command and improve the display of parsed logs
- Add better coloring and line wrapping to the log parse output. This
makes nasty build output look better with the line numbers.
- `spack log-parse` allows the log parsing logic used at the end of
builds to be executed on arbitrary files, which is handy even outside
of spack.
- Also provides a profile option -- we can profile arbitrary files and
show which regular expressions in the magic CTest parser take the most
time.
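
One way such per-regex profiling could work is sketched below. The function `profile_patterns` is hypothetical and is not the implementation behind the profile option; it simply accumulates wall-clock time per pattern.

```python
import re
import time
from collections import defaultdict


def profile_patterns(patterns, lines):
    """Return (pattern, seconds) pairs sorted by total search time, slowest first."""
    compiled = [(p, re.compile(p)) for p in patterns]
    totals = defaultdict(float)
    for line in lines:
        for source, regex in compiled:
            start = time.perf_counter()
            regex.search(line)
            # Charge the elapsed time to the pattern that was just evaluated.
            totals[source] += time.perf_counter() - start
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```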
* Parallelize log parsing
- Log parsing now uses multiple threads for long logs
- Lines from logs are divided into chunks and farmed out to <ncpus>
- Add -j option to `spack log-parse`
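
The chunking scheme can be sketched as below. This is a simplified illustration, assuming a thread pool from `concurrent.futures` and a stand-in per-line check; `parse_chunk` and `parse_parallel` are hypothetical names, and the real parser's worker function and pool type may differ.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parse_chunk(args):
    """Return (line_number, line) for each line in the chunk mentioning 'error'."""
    start, lines = args
    return [(start + i, line) for i, line in enumerate(lines)
            if "error" in line.lower()]


def parse_parallel(lines, jobs=None):
    """Split lines into roughly equal chunks and parse them with `jobs` workers."""
    jobs = max(1, jobs or os.cpu_count() or 1)
    chunk_size = max(1, -(-len(lines) // jobs))  # ceiling division
    # Each chunk carries its 1-based starting line number so hits stay addressable.
    chunks = [(i + 1, lines[i:i + chunk_size])
              for i in range(0, len(lines), chunk_size)]
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        results = list(pool.map(parse_chunk, chunks))  # map preserves chunk order
    return [hit for chunk_hits in results for hit in chunk_hits]
```

Passing the starting line number with each chunk lets the merged output report the original line numbers. In CPython, regex matching holds the global interpreter lock, so a process pool may be needed to see a real speedup on CPU-bound parsing.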