summaryrefslogtreecommitdiff
path: root/share
diff options
context:
space:
mode:
authorTodd Gamblin <tgamblin@llnl.gov>2018-01-31 21:57:56 -0800
committerbecker33 <becker33@llnl.gov>2018-01-31 21:57:56 -0800
commitdf758e1cfc2e25d08eb25f57c2add3548bedfc14 (patch)
tree1845767cce1bd0706dd9ce17ecb8af61a3dc21a0 /share
parent514f0bf5c546d666f32c02ea652658dcf3b85c8c (diff)
downloadspack-df758e1cfc2e25d08eb25f57c2add3548bedfc14.tar.gz
spack-df758e1cfc2e25d08eb25f57c2add3548bedfc14.tar.bz2
spack-df758e1cfc2e25d08eb25f57c2add3548bedfc14.tar.xz
spack-df758e1cfc2e25d08eb25f57c2add3548bedfc14.zip
Improve log parsing performance (#7093)
* Allow dashes in command names and fix command name handling - Command should allow dashes in their names like the reest of spack, e.g. `spack log-parse` - It might be too late for `spack build-cache` (since it is already called `spack buildcache`), but we should try a bit to avoid inconsistencies in naming conventions - The code was inconsistent about where commands should be called by their python module name (e.g. `log_parse`) and where the actual command name should be used (e.g. `log-parse`). - This made it hard to make a command with a dash in the name, and it made `SpackCommand` fail to recognize commands with dashes. - The code now uses the user-facing name with dashes for function parameters, then converts that the module name when needed. * Improve performance of log parsing - A number of regular expressions from ctest_log_parser have really poor performance, most due to untethered expressions with * or + (i.e., they don't start with ^, so the repetition has to be checked for every position in the string with Python's backtracking regex implementation) - I can't verify that CTest's regexes work with an added ^, so I don't really want to touch them. I tried adding this and found that it caused some tests to break. - Instead of using only "efficient" regular expressions, Added a prefilter() class that allows the parser to quickly check a precondition before evaluating any of the expensive regexes. - Preconditions do things like check whether the string contains "error" or "warning" (linear time things) before evaluating regexes that would require them. It's sad that Python doesn't use Thompson string matching (see https://swtch.com/~rsc/regexp/regexp1.html) - Even with Python's slow implementation, this makes the parser ~200x faster on the input we tried it on. * Add `spack log-parse` command and improve the display of parsed logs - Add better coloring and line wrapping to the log parse output. This makes nasty build output look better with the line numbers. - `spack log-parse` allows the log parsing logic used at the end of builds to be executed on arbitrary files, which is handy even outside of spack. - Also provides a profile option -- we can profile arbitrary files and show which regular expressions in the magic CTest parser take the most time. * Parallelize log parsing - Log parsing now uses multiple threads for long logs - Lines from logs are divided into chnks and farmed out to <ncpus> - Add -j option to `spack log-parse`
Diffstat (limited to 'share')
0 files changed, 0 insertions, 0 deletions