path: root/src/long_mode/uarch.rs
author: iximeow <me@iximeow.net> 2022-04-21 02:31:40 -0700
committer: iximeow <git@iximeow.net> 2022-12-03 15:11:09 -0800
commit: 76418a5a934c99ef918070c3c740ce3eceb6c5bb
tree: 2f460feff6a349b6fc126f2f1ab3854a16856dca /src/long_mode/uarch.rs
parent: f5cfe59ce7b7a62ec57325d4d742608b9ae20929
reorder prefix checks, extract vex/evex prefix handling
sharing vex/evex invalid prefix checks improves codegen a bit, but ordering prefix checks by likeliest prefix first reduces time spent falling through prefix handling arms. both together are a notable improvement in throughput on typical x86 code.

bundled in here is some code motion for where `mem_size = 0` and `operand_count = 2` are executed; this is because, at least on zen2 and cascade lake parts, bunching all stores to the instruction together caused small stalls getting into the decoder. spreading out the stores seems to mix these assignments with parts of the code that were not using memory anyway, and pipelines better.
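
to illustrate the "likeliest prefix first" idea, here is a minimal sketch, not the code from this commit. `Prefixes` and `scan_prefixes` are hypothetical names, segment override prefixes are left out for brevity, and the assumption that rex and `0x66` are the most common prefixes in 64-bit code is mine, not something the commit states. the compiler is also free to lower a `match` however it likes, so source order is only a hint and any gain needs to be measured.

```rust
/// Hypothetical summary of the prefixes seen before an opcode.
#[derive(Debug, Default, PartialEq, Eq)]
pub struct Prefixes {
    pub rex: Option<u8>,    // 0x40..=0x4f
    pub operand_size: bool, // 0x66
    pub address_size: bool, // 0x67
    pub rep: bool,          // 0xf3
    pub repne: bool,        // 0xf2
    pub lock: bool,         // 0xf0
}

/// Consume leading prefix bytes and return the prefixes seen, plus the
/// offset of the first byte that is not a recognized prefix.
///
/// Arms are written in an assumed likeliest-first order. Segment override
/// prefixes are deliberately not handled in this sketch.
pub fn scan_prefixes(bytes: &[u8]) -> (Prefixes, usize) {
    let mut p = Prefixes::default();
    let mut i = 0;
    while let Some(&b) = bytes.get(i) {
        match b {
            // Assumed most common in long mode: REX.
            0x40..=0x4f => p.rex = Some(b),
            // A legacy prefix after REX means the earlier REX is ignored,
            // so every legacy arm also clears it.
            0x66 => { p.operand_size = true; p.rex = None; }
            0xf3 => { p.rep = true; p.rex = None; }
            0xf2 => { p.repne = true; p.rex = None; }
            0xf0 => { p.lock = true; p.rex = None; }
            0x67 => { p.address_size = true; p.rex = None; }
            // Anything else is treated as the start of the opcode.
            _ => break,
        }
        i += 1;
    }
    (p, i)
}
```

calling `scan_prefixes(&[0x48, 0x89, 0xc3])` consumes one byte and records `rex = Some(0x48)`; the opcode byte `0x89` ends the loop on the first iteration after it.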
Diffstat (limited to 'src/long_mode/uarch.rs')
0 files changed, 0 insertions, 0 deletions