author iximeow <me@iximeow.net> 2022-04-21 02:31:40 -0700
committer iximeow <me@iximeow.net> 2022-05-30 11:16:52 -0700
commit f338c74656f6eef8b3080fa9f249b1cb733fd1a9
tree a7aaa075893b66516f4a10935a81a3e6d0b7556b /ffi
parent e7f4950985ab9976e9d00599c9225327c64f6439
reorder prefix checks, extract vex/evex prefix handling
sharing the vex/evex invalid-prefix checks improves codegen a bit, but ordering prefix checks by likeliest prefix first reduces time spent falling through prefix-handling arms. both together are a notable improvement in throughput on typical x86 code.

bundled in here is some code motion changing where `mem_size = 0` and `operand_count = 2` are executed. this is because, at least on zen2 and cascade lake parts, bunching all stores to the instruction together caused small stalls getting into the decoder. spreading the stores out seems to mix these assignments with parts of the code that were not using memory anyway, and pipelines better.
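
a minimal sketch of the two prefix ideas above, not the code from this commit. `Prefixes`, `Scan`, `scan_prefixes`, and `vex_evex_prefixes_ok` are hypothetical names, the arm order is only an assumed likelihood ranking for 64-bit code, and the 15-byte instruction length limit is not modeled. the compiler is free to reorder arms or emit a jump table, so the ordering here illustrates intent rather than guaranteeing any codegen.

```rust
/// Hypothetical prefix state; field names are illustrative only.
#[derive(Debug, Default, PartialEq)]
pub struct Prefixes {
    pub rex: Option<u8>,
    pub operand_size: bool, // 0x66
    pub rep: bool,          // 0xf3
    pub repnz: bool,        // 0xf2
    pub lock: bool,         // 0xf0
}

#[derive(Debug, PartialEq)]
pub enum Scan {
    /// No more prefixes; the opcode starts at this offset.
    Opcode(usize),
    /// A vex (0xc4/0xc5) or evex (0x62) prefix starts at this offset.
    VexOrEvex(usize),
    /// Invalid prefix combination, or input ended before an opcode.
    Invalid,
}

/// One shared check for both vex and evex: neither may follow
/// rex, 0x66, 0xf2, 0xf3, or lock.
pub fn vex_evex_prefixes_ok(p: &Prefixes) -> bool {
    p.rex.is_none() && !(p.operand_size || p.rep || p.repnz || p.lock)
}

/// Scan leading prefix bytes, assuming 64-bit mode.
pub fn scan_prefixes(bytes: &[u8], p: &mut Prefixes) -> Scan {
    for (i, &b) in bytes.iter().enumerate() {
        // Arms are written likeliest-first (an assumption, not measured here).
        match b {
            0x40..=0x4f => p.rex = Some(b),
            // A legacy prefix after rex makes the earlier rex ineffective.
            0x66 => { p.operand_size = true; p.rex = None; }
            0xc4 | 0xc5 | 0x62 => {
                // vex and evex share the same invalid-prefix check.
                return if vex_evex_prefixes_ok(p) {
                    Scan::VexOrEvex(i)
                } else {
                    Scan::Invalid
                };
            }
            0xf3 => { p.rep = true; p.rex = None; }
            0xf2 => { p.repnz = true; p.rex = None; }
            0xf0 => { p.lock = true; p.rex = None; }
            // Segment overrides and address-size: not recorded in this sketch.
            0x26 | 0x2e | 0x36 | 0x3e | 0x64 | 0x65 | 0x67 => p.rex = None,
            _ => return Scan::Opcode(i),
        }
    }
    Scan::Invalid
}
```

for example, scanning `[0x48, 0x89, 0xc3]` stops at offset 1 with `rex` set to `0x48`, while `[0x66, 0xc5, 0xf8, 0x77]` is rejected by the shared check because 0x66 precedes the vex prefix.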
Diffstat (limited to 'ffi')
0 files changed, 0 insertions, 0 deletions