author | iximeow <me@iximeow.net> | 2024-06-24 15:39:30 -0700 |
---|---|---|
committer | iximeow <me@iximeow.net> | 2024-06-24 15:39:30 -0700 |
commit | 681262f4472ba4f452446e86012ce629b849d8d9 (patch) | |
tree | ef002c3ca42199e6ec49ce16f78bc1ec7afd8a9a /data/generate_opcode.py | |
parent | 24b33d5fdc9513c1b46e99b526d21e0a7b5eea38 (diff) |
summary description of opt work [HEAD, 2.0.0, no-gods-no-]
this empty commit reproduces a github comment that describes the work on
commits from this point back to, roughly, 1.2.2. since many commits
between these two points are interesting in the context of performance
optimization (especially uarch-relevant tweaks), many WIP commits are
preserved. as a result there is no clear squash merge, and this commit
will be the next best thing.
on Rust 1.68.0 and a Xeon E3-1230 V2, relative changes are measured
roughly as:
starting at ed4f238a4c2d860e6fadc8abeaa0cba36ed1df8a:
- non-fmt ns/decode: 15ns
- non-fmt instructions/decode: 94.6
- non-fmt IPC: 1.71
- fmt ns/decode+display: 91ns
- fmt instructions/decode+display: 683.8
- fmt IPC: 2.035
ending at 6a5ea107475284756070614a566970fbb383c4e6:
- non-fmt ns/decode: 15ns
- non-fmt instructions/decode: 94.6
- non-fmt IPC: 1.71
- fmt ns/decode+display: 47ns
- fmt instructions/decode+display: 329.6
- fmt IPC: 1.898
for an overall ~50% reduction in runtime to decode and display
instructions (91ns to 47ns); the decode-only (non-fmt) numbers are
unchanged. writing into InstructionTextBuffer reduces overhead by
another ~10%.
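as a sanity check on the fmt figures above, here is a small sketch of the arithmetic. the helper names `reduction` and `cycles` are made up for illustration and are not part of yaxpeax-x86; the inputs are the numbers listed above.

```rust
/// Fractional reduction going from `before` to `after` (0.5 means 50% less).
fn reduction(before: f64, after: f64) -> f64 {
    (before - after) / before
}

/// Cycles per operation implied by an instruction count and an IPC figure.
fn cycles(instructions: f64, ipc: f64) -> f64 {
    instructions / ipc
}
```

with the fmt numbers, `reduction(91.0, 47.0)` is about 0.48 and `reduction(683.8, 329.6)` is about 0.52. `cycles(683.8, 2.035)` is about 336 and `cycles(329.6, 1.898)` is about 174, so the implied cycle count drops by roughly the same ~48% as the wall-clock time even though IPC went down slightly.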
-- original message follows --
this is where much of https://github.com/iximeow/yaxpeax-arch/pull/7
originated.
`std::fmt` as a primary writing mechanism has.. some limitations:
* https://github.com/rust-lang/rust/issues/92993#issuecomment-2028915232
* https://github.com/llvm/llvm-project/issues/87440
* https://github.com/rust-lang/rust/pull/122770
and some more interesting, more fundamental limitations: writing to a
`T: fmt::Write` means implementations don't know if it's possible to
write bytes in reverse order (useful for printing digits) or if it's OK
to write too many bytes and then only advance `len` by the correct
amount (useful for copying variable-length-but-short strings like
register names). both are perfectly fine to do to a `String` or `Vec`,
less fine to do to a file descriptor like stdout.
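to make those two tricks concrete, here is a minimal safe-Rust sketch against a `Vec<u8>`. it is not the code in yaxpeax-x86 or yaxpeax-arch; `push_decimal`, `push_short_name`, and the 8-byte padded name layout are assumptions made up for this example.

```rust
/// Append `value` in decimal. Digits are produced least-significant first,
/// so space is reserved in the buffer and filled from the back. This needs
/// random access to the output, which a bare `fmt::Write` does not offer.
fn push_decimal(out: &mut Vec<u8>, mut value: u32) {
    // Count the digits so the right amount of space can be reserved.
    let mut digits = 1;
    let mut rest = value;
    while rest >= 10 {
        rest /= 10;
        digits += 1;
    }
    let start = out.len();
    out.resize(start + digits, b'0');
    // Fill the reserved slots in reverse order.
    for slot in out[start..].iter_mut().rev() {
        *slot = b'0' + (value % 10) as u8;
        value /= 10;
    }
}

/// Append a short name that is stored padded to a fixed 8 bytes (a
/// hypothetical layout): always copy all 8 bytes, then keep only `len`.
fn push_short_name(out: &mut Vec<u8>, padded: &[u8; 8], len: usize) {
    assert!(len <= padded.len());
    let start = out.len();
    out.extend_from_slice(padded); // writes "too many" bytes
    out.truncate(start + len); // advance the length by the real amount
}
```

calling `push_short_name(&mut out, b"rax\0\0\0\0\0", 3)` and then `push_decimal(&mut out, 1234)` leaves `out` holding `rax1234`. the fixed-size copy has no length-dependent branch, which is the appeal; neither function could be written this way if `out` were a stream that cannot take bytes back.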
at the same time, `Colorize` and traits depending on it are very broken,
for reasons described in yaxpeax-arch.
so, this adapts `yaxpeax-x86` to use the new `DisplaySink` type for
writing, with optimizations where appropriate and output spans for
certain kinds of tokens - registers, integers, opcodes, etc. it's not
a perfect replacement for Colorize-to-ANSI-supporting-outputs but it's
more flexible and i think can be made right.
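to illustrate the idea of a sink that receives output spans, here is a rough sketch. this is NOT the real `DisplaySink` API from yaxpeax-arch: `SpanSink`, `TokenKind`, `AnsiSink`, `write_token`, and `render_example` are hypothetical names invented for this example, and the ANSI codes are an arbitrary choice.

```rust
use std::fmt;

/// Hypothetical token kinds a disassembler might want to mark up.
#[derive(Clone, Copy)]
enum TokenKind {
    Opcode,
    Register,
    Integer,
}

/// Hypothetical sink: ordinary text output plus span begin/end hooks.
/// The default hooks do nothing, so plain sinks pay nothing for spans.
trait SpanSink: fmt::Write {
    fn span_start(&mut self, _kind: TokenKind) -> fmt::Result {
        Ok(())
    }
    fn span_end(&mut self, _kind: TokenKind) -> fmt::Result {
        Ok(())
    }
}

/// A plain `String` ignores spans and just collects the text.
impl SpanSink for String {}

/// A sink that turns spans into ANSI color escapes.
struct AnsiSink {
    out: String,
}

impl fmt::Write for AnsiSink {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        self.out.push_str(s);
        Ok(())
    }
}

impl SpanSink for AnsiSink {
    fn span_start(&mut self, kind: TokenKind) -> fmt::Result {
        self.out.push_str(match kind {
            TokenKind::Opcode => "\x1b[1m",
            TokenKind::Register => "\x1b[36m",
            TokenKind::Integer => "\x1b[33m",
        });
        Ok(())
    }
    fn span_end(&mut self, _kind: TokenKind) -> fmt::Result {
        self.out.push_str("\x1b[0m");
        Ok(())
    }
}

/// Write one token wrapped in its span markers.
fn write_token<S: SpanSink>(sink: &mut S, kind: TokenKind, text: &str) -> fmt::Result {
    sink.span_start(kind)?;
    sink.write_str(text)?;
    sink.span_end(kind)
}

/// Render a fixed example instruction, `add eax, 0x10`, into any sink.
fn render_example<S: SpanSink>(sink: &mut S) -> fmt::Result {
    write_token(sink, TokenKind::Opcode, "add")?;
    sink.write_str(" ")?;
    write_token(sink, TokenKind::Register, "eax")?;
    sink.write_str(", ")?;
    write_token(sink, TokenKind::Integer, "0x10")
}
```

rendering into a `String` yields plain `add eax, 0x10`, while rendering into `AnsiSink` yields the same words wrapped in escapes. the point of the shape is that the formatting code only announces what kind of token it is writing, and the sink decides whether that means ANSI color, HTML, or nothing at all.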
along the way this completes the move of `safer_unchecked` out to
yaxpeax-arch (ty @5225225 it's still so useful), cleans up some docs,
and comes with a few new test cases.
because of the major version bump of yaxpeax-arch, and because this
removes most functionality of the Colorize impl - it prints the
correct words, just without coloring - this is itself a major version
bump to 2.0.0. yay! this in turn is a good point to change the
`Opcode` enums from being tuple-like to struct-like, and i've done so
in
https://github.com/iximeow/yaxpeax-x86/commit/1b8019d5b39a05c109399b8628a1082bfec79755.
full notes in CHANGELOG ofc. these are notes for myself when i'm trying
to remember any of this in two years :)
Diffstat (limited to 'data/generate_opcode.py')
0 files changed, 0 insertions, 0 deletions