| Age | Commit message (Collapse) | Author | 
|---|
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | this measures a bit faster. it doesn't seem like it should be. the rex
prefix checks compile identically but move a lea for a later expression
up and pipelines better? | 
|  | also remove redundant assignments of operand_count and some OperandSpec,
bulk-assign all registers and operands on entry to `read_instr`. this
all, taken together, shaves off about 7 cycles per decode. | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | vex decoding is really intended to avoid explosions in code size more than anything... | 
|  |  | 
|  | also some long-mode cleanup in corresponding areas | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | i really didnt know rust could do this | 
|  |  | 
|  |  | 
|  |  | 
|  | instructions | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | in the future these can and will change (new operands, new instructions) and i would prefer they not be major breaking changes. applications can ignore them and probably do undesired variants anyway.
if you want to write a 1120-variant match, are you me? why would you do this | 
|  | the in-repo benchmark got better with this inlined but it's probably
better to leave it up to the compiler when finally stitching stuff
together. i suspect that having read_operands inlined resulted in just
too many live values, and the compiler was inspired to play hijinks that
pipelined poorly. disas-bench shows a ~15% improvement from this change. | 
|  |  | 
|  | vmov* are.. somehow messed up too | 
|  | does intel know no bounds | 
|  |  | 
|  |  | 
|  |  | 
|  |  | 
|  | decoder flag to come | 
|  | this is... a more significant rewrite than i expected yaxpeax-x86 to
ever need. it turns out that capstone is extremely permissive about
duplicative 66/f2/f3 prefixes to the point that the implemented prefex
handling was unsalvageable.
while this replaces the *0f* opcode tables, i haven't profiled these
changes. it's possible this is a net improvement for single-byte
opcodes, it could be a net loss. code size may be severely impacted.
there is still work to do.
but this in total gets very close to iced/xed/zydis parity, far more
than before.
also adds several small extensions, gfni, 3dnow, enqcmd, invpcid, some
of cet, and a few missing avx instructions. | 
|  |  | 
|  |  | 
|  | initial work to optionally discard any instruction printing support
when using `-Z build-std` to fully remove .eh_frame, a stripped
long_mode_no_fmt .so is 61kb! | 
|  |  | 
|  | clearing reg_rrr and reg_mmm more efficiently is an extremely small win,
but a win
read_imm_signed generally should inline well and runs afoul of some
heuristic. inlining gets about 8% improved throughput on the
(unrealistic) in-repo benchmark
it would be great to be able to avoid bounds checks somehow; it looks
like they alone are another ~10% of decode time. i'm not sure how to
pull that off while retaining the generic iterator parameter. might just
not be possible. | 
|  | * `mwaitx`, `monitorx`, `rdpru`, and `clzero` are now supported
* swapgs is no longer decoded in protected mode
* rdpkru and wrpkru are no longer decoded if mod bits != 11 | 
|  | base 0b101
for memory operands with a base, index, and displacement either
the wrong base would be selected (register number ignored, so only
`*ax` or `r8*` would be reported), or yaxpeax-x86 would report a
base register is present when it is not (`RegIndexBaseScaleDisp`
when the operand is actually `RegScaleDisp`)
thank you to Evan Johnson for catching and reporting this bug!
also bump crate version to 0.1.4 as this will be immediately tagged and
released. | 
|  |  | 
|  |  |