author | iximeow <me@iximeow.net> | 2022-04-21 02:31:40 -0700 |
---|---|---|
committer | iximeow <me@iximeow.net> | 2022-05-30 11:16:52 -0700 |
commit | f338c74656f6eef8b3080fa9f249b1cb733fd1a9 (patch) | |
tree | a7aaa075893b66516f4a10935a81a3e6d0b7556b /ffi/multiarch/src/long_mode.rs | |
parent | e7f4950985ab9976e9d00599c9225327c64f6439 (diff) |
reorder prefix checks, extract vex/evex prefix handling
sharing vex/evex invalid prefix checks improves codegen a bit, but
ordering prefix checks by likeliest prefix first reduces time falling
through prefix handling arms. both together are a notable improvement in
throughput on typical x86 code.
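
a minimal sketch of the two ideas above, not the code from this commit: every name here (`Prefixes`, `Lead`, `PrefixError`, `check_vex_like_allowed`, `read_prefixes`) is hypothetical, and the ordering (REX, then `66`, then the rest) is an assumption about what is likeliest in 64-bit code. the shared check encodes the architectural rule that VEX and EVEX are invalid after REX, `66`, `f2`, `f3`, or `lock`. an if/else chain only suggests an order to the compiler, which may still lay the branches out differently.

```rust
/// Hypothetical prefix state; field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
struct Prefixes {
    rex: u8,
    operand_size: bool,
    address_size: bool,
    rep: bool,
    repnz: bool,
    lock: bool,
    segment: u8,
}

/// What ended the prefix run: a plain opcode byte or a VEX/EVEX lead byte.
#[derive(Debug, PartialEq)]
enum Lead {
    Opcode(u8),
    Vex2,
    Vex3,
    Evex,
}

#[derive(Debug, PartialEq)]
enum PrefixError {
    Exhausted,
    InvalidVexLike,
}

/// One check shared by the VEX and EVEX paths instead of a copy in each arm.
fn check_vex_like_allowed(p: &Prefixes) -> Result<(), PrefixError> {
    if p.rex != 0 || p.lock || p.operand_size || p.rep || p.repnz {
        Err(PrefixError::InvalidVexLike)
    } else {
        Ok(())
    }
}

/// Consume prefixes, testing the (assumed) likeliest bytes first.
/// Returns the prefix state, what ended the run, and bytes consumed.
fn read_prefixes(bytes: &[u8]) -> Result<(Prefixes, Lead, usize), PrefixError> {
    let mut p = Prefixes::default();
    let mut i = 0;
    loop {
        let b = *bytes.get(i).ok_or(PrefixError::Exhausted)?;
        i += 1;
        if b & 0xf0 == 0x40 {
            // REX: assumed most common in 64-bit code, so checked first.
            p.rex = b;
        } else if b == 0x66 {
            // A legacy prefix after REX cancels the REX.
            p.rex = 0;
            p.operand_size = true;
        } else if b == 0xc5 || b == 0xc4 || b == 0x62 {
            // In long mode these bytes always begin VEX/EVEX.
            check_vex_like_allowed(&p)?;
            let lead = match b {
                0xc5 => Lead::Vex2,
                0xc4 => Lead::Vex3,
                _ => Lead::Evex,
            };
            return Ok((p, lead, i));
        } else if b == 0xf3 {
            p.rex = 0;
            p.rep = true;
        } else if b == 0xf2 {
            p.rex = 0;
            p.repnz = true;
        } else if matches!(b, 0x26 | 0x2e | 0x36 | 0x3e | 0x64 | 0x65) {
            p.rex = 0;
            p.segment = b;
        } else if b == 0x67 {
            p.rex = 0;
            p.address_size = true;
        } else if b == 0xf0 {
            p.rex = 0;
            p.lock = true;
        } else {
            // Not a prefix: hand the opcode byte back to the caller.
            return Ok((p, Lead::Opcode(b), i));
        }
    }
}
```

for example, `read_prefixes(&[0x48, 0x89, 0xc3])` stops at opcode `0x89` with `rex == 0x48`, while `read_prefixes(&[0x66, 0xc5, 0xf8])` is rejected by the shared check.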
bundled in here is some code motion changing where `mem_size = 0` and
`operand_count = 2` are executed; this is because, at least on zen2 and
cascade lake parts, bunching all stores to the instruction together
caused small stalls getting into the decoder. spreading out stores seems
to mix these assignments with parts of the code that were not using memory
anyway, and pipelines better.
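
a rough illustration of the kind of code motion described above, under loud assumptions: the `Instruction` struct, its `opcode` and `prefix_count` fields, and both functions are hypothetical, and the toy "decoder" only skips `66` prefixes. the two functions produce the same result on success; only the placement of the default stores differs. whether the compiler keeps that placement, and whether it helps on a given core, would need measuring.

```rust
/// Hypothetical instruction record; only `mem_size` and `operand_count`
/// are names taken from the commit message.
#[derive(Debug, Default, PartialEq)]
struct Instruction {
    opcode: u8,
    mem_size: u8,
    operand_count: u8,
    prefix_count: u8,
}

/// Defaults are stored back to back before any other work happens.
fn decode_bunched(bytes: &[u8], inst: &mut Instruction) -> Option<()> {
    inst.mem_size = 0;
    inst.operand_count = 2;
    let mut i = 0;
    while *bytes.get(i)? == 0x66 {
        i += 1;
    }
    inst.prefix_count = i as u8;
    inst.opcode = bytes[i];
    Some(())
}

/// The same stores, moved so they sit between other steps of decoding.
/// Note one behavioral difference: on truncated input this version
/// returns before touching `inst` at all.
fn decode_spread(bytes: &[u8], inst: &mut Instruction) -> Option<()> {
    let mut i = 0;
    while *bytes.get(i)? == 0x66 {
        i += 1;
    }
    inst.mem_size = 0;
    inst.prefix_count = i as u8;
    inst.opcode = bytes[i];
    inst.operand_count = 2;
    Some(())
}
```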
Diffstat (limited to 'ffi/multiarch/src/long_mode.rs')
0 files changed, 0 insertions, 0 deletions