Age | Commit message | Author |
|
this measures a bit faster. it doesn't seem like it should. the rex
prefix checks compile identically, but a lea for a later expression gets
moved up and pipelines better?
|
also remove redundant assignments of operand_count and some OperandSpec,
and bulk-assign all registers and operands on entry to `read_instr`.
taken together, this shaves off about 7 cycles per decode.
|
vex decoding is really intended to avoid explosions in code size more than anything...
|
also some long-mode cleanup in corresponding areas
|
i really didn't know rust could do this
|
instructions
|
in the future these can and will change (new operands, new instructions) and i would prefer they not be major breaking changes. applications can ignore undesired variants, and probably do anyway.
if you want to write a 1120-variant match, are you me? why would you do this
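
a minimal sketch of one way rust lets an enum grow without a major breaking change: the `#[non_exhaustive]` attribute. whether this commit uses that attribute is an assumption, and the `Opcode` variants and `is_data_move` helper below are hypothetical stand-ins, not the crate's actual definitions.

```rust
// Hypothetical stand-in for a decoder's opcode enum. The real enum is far
// larger; these three variants are illustrative only.
#[non_exhaustive]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Opcode {
    Add,
    Mov,
    Lea,
}

// Hypothetical helper showing how an application handles only the variants
// it cares about and ignores everything else.
pub fn is_data_move(op: Opcode) -> bool {
    match op {
        Opcode::Mov | Opcode::Lea => true,
        // In a downstream crate, this wildcard arm is mandatory for a
        // #[non_exhaustive] enum, so variants added later do not break
        // the build. Inside the defining crate it is optional.
        _ => false,
    }
}
```

with this shape, adding a variant later only changes which inputs fall into the wildcard arm; downstream matches keep compiling.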
|
the in-repo benchmark got better with this inlined but it's probably
better to leave it up to the compiler when finally stitching stuff
together. i suspect that having read_operands inlined resulted in just
too many live values, and the compiler was inspired to play hijinks that
pipelined poorly. disas-bench shows a ~15% improvement from this change.
|
vmov* are.. somehow messed up too
|
does intel know no bounds
|