path: root/src
Age | Commit message | Author
2021-07-03 | clean up x86_32 and make interfaces match x86_64 | iximeow
2021-07-03 | prefixes on 0f01-series opcodes are more strict | iximeow
2021-07-03 | add hreset | iximeow
2021-07-03 | port over x86_64 improvements to x86_32 | iximeow
2021-07-03 | vbroadcastsd requires W | iximeow
2021-07-03 | support pconfig/tme | iximeow
2021-07-03 | reject instructions when their opcode is `Invalid` | iximeow
    the evex route would allow "valid" instructions that have the opcode `invalid`. this is.. not correct.
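    A minimal sketch of the kind of check this describes, using simplified stand-in types rather than the crate's real `Opcode`/`DecodeError` definitions:

        // illustrative stand-ins, not yaxpeax-x86's actual types
        #[derive(PartialEq)]
        enum Opcode { Invalid, Nop }
        struct Instruction { opcode: Opcode }
        enum DecodeError { InvalidOpcode }

        fn finish_decode(inst: Instruction) -> Result<Instruction, DecodeError> {
            // a decode path (e.g. evex) may have filled in operands but left the
            // opcode as Invalid; reject it here instead of reporting success
            if inst.opcode == Opcode::Invalid {
                return Err(DecodeError::InvalidOpcode);
            }
            Ok(inst)
        }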
2021-07-03 | fix incorrect rex prefix selection | iximeow
2021-07-02 | adjust decode logic for better pipelining | iximeow
    at least on my zen2. when reading prefixes, optimize for the likely case of reading an instruction rather than an invalid run of prefixes. checking if we've exceeded the x86 length bound immediately after reading the byte is only a benefit if we'd otherwise read an impossibly-long instruction; in this case we can exit exactly at prefix byte 15 rather than potentially later at byte 16 (assuming a one-byte instruction like `c3`), or byte ~24 (a more complex store with immediate and displacement). these cases are extremely unlikely in practice. more likely, a prefix byte is one of the first two or three bytes of an instruction, and we will never benefit from checking the x86 length bound at that point. instead, only check length bounds after decoding the entire instruction. this penalizes the slowest path through the decoder but speeds up the likely path about 5% on my zen2 processor.

    additionally, begin reading instruction bytes as soon as we enter the decoder, and before the initial clearing of instruction data. again, this is for zen2 pipeline reasons. reading the first byte and its corresponding `OPCODES` entry improves the odds that this data is available by the time we check for `Interpretation::Prefix` in the opcode scanning loop. then, if we did *not* load an instruction, we immediately know another byte must be read; begin reading this byte before applying `rex` prefixes, and as soon as a prefix is known not to be one of the escape-code prefix bytes (c5, c4, 62, 0f). this clocked in at another ~5% in total.

    i've found that `read_volatile` is necessary to force rust to begin the load where it's written, rather than reordering it over other data. i'm not committed to this being a guaranteed truth.

    also, don't bother checking for `Invalid`. again, `Opcode::Invalid` is a relatively unlikely path through the decoder, and `Nothing` is already optimized for `None` cases. this appears to be another small improvement in throughput, but i wouldn't want to give it a number - it was relatively small and may not be attributable to this effect.
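    A small sketch of the `read_volatile` trick described above, assuming a plain byte-slice input rather than whatever reader the decoder actually uses; this is only an illustration of the idea, not the crate's code:

        // start the load of the first instruction byte as early as possible;
        // read_volatile discourages the compiler from sinking this load below
        // later bookkeeping such as clearing instruction state
        fn peek_first_byte(bytes: &[u8]) -> Option<u8> {
            if bytes.is_empty() {
                return None;
            }
            let byte = unsafe { core::ptr::read_volatile(bytes.as_ptr()) };
            Some(byte)
        }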
2021-07-02 | intel keylocker instructions that access memory have memory access sizes | iximeow
2021-07-02 | fix several strict rejection for several | iximeow
2021-07-02 | consolidate length/extension checks to help compilers DCE | iximeow
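    As a generic illustration of why consolidating checks helps dead-code elimination (not the crate's actual code): doing one length check up front lets the compiler prove the later, implicit checks redundant and drop them.

        // hypothetical helper: one consolidated bounds check up front...
        fn read_le_u16(bytes: &[u8], offset: usize) -> Option<u16> {
            let end = offset.checked_add(2)?;
            let window = bytes.get(offset..end)?;
            // ...so the implicit per-index bounds checks below are provably
            // redundant and can be eliminated as dead code
            Some(u16::from(window[0]) | (u16::from(window[1]) << 8))
        }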
2021-07-02 | `Nothing` operand code must be decoded with operand_count=0 | iximeow
2021-07-01 | fix warnings | iximeow
2021-07-01 | reorder prefix checks | iximeow
    this measures a bit faster. it doesn't seem like it should be. the rex prefix checks compile identically, but a `lea` for a later expression moves up and pipelines better?
2021-07-01 | reallocate OperandCode, convert disparate registers to array | iximeow
    also remove redundant assignments of operand_count and some OperandSpec, bulk-assign all registers and operands on entry to `read_instr`. this all, taken together, shaves off about 7 cycles per decode.
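    A rough sketch of the bulk-assignment idea, with made-up field names rather than the crate's real `Instruction` layout: writing whole arrays once on entry compiles to a few wide stores instead of many scattered ones.

        #[derive(Clone, Copy)]
        enum OperandSpec { Nothing, RegRRR }

        struct Instruction {
            regs: [u8; 4],
            operands: [OperandSpec; 4],
        }

        fn reset_on_entry(inst: &mut Instruction) {
            // assign everything in one shot; per-OperandCode paths no longer
            // need their own redundant register/operand resets
            inst.regs = [0; 4];
            inst.operands = [OperandSpec::Nothing; 4];
        }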
2021-07-01 | making opcode u32 reduces a stall? | iximeow
2021-07-01 | complete yaxpeax-arch 0.1.0 adaptation, shore up .mem_size() | iximeow
2021-07-01 | update yaxpeax-x86 to yaxpeax-arch 0.1.0 interfaces | iximeow
2021-06-29 | fix several lingering mem_size discrepancies | iximeow
2021-06-28 | remove old movsx/movzx-related memory size hacks | iximeow
2021-06-28 | remove unused evex variants from generated code | iximeow
2021-06-28 | clean up protected mode vex-related warnings | iximeow
2021-06-28 | remove a few operand cases | iximeow
    vex decoding is really intended to avoid explosions in code size more than anything...
2021-06-28 | round out x86_32 support - avx2, avx, memory sizes | iximeow
2021-06-28 | protected mode memory sizes | iximeow
    also some long-mode cleanup in corresponding areas
2021-06-27 | protected-mode avx512 | iximeow
2021-06-27 | remove support for nonexistent prefixes | iximeow
2021-06-27 | PartialEq impls for data in instructions, and Instruction itself | iximeow
2021-06-27 | all tests now passing for long mode | iximeow
2021-06-27 | report memory sizes for all long-mode instructions | iximeow
2021-06-26 | awkward | iximeow
    i really didn't know rust could do this
2021-06-26 | clean up avx2-related warnings | iximeow
2021-06-26 | add long-mode avx512 support, except for compressed displacements | iximeow
2021-06-12 | finish up long mode avx2 | iximeow
2021-06-11 | add extensive avx and initial avx2 tests, fix several bugs and missing instructions | iximeow
2021-06-11 | remove vex ops file, didn't mean to track that in the first place | iximeow
2021-05-31 | fix typo | iximeow
2021-05-16 | fix ShowContextual rendering error with stale data and operands, publish 0.2.2 | iximeow
2021-05-07 | remove dead OperandSpec variants | iximeow
2021-05-07 | update yaxpeax-arch to 0.0.5, fix interface breakages | iximeow
2021-03-22 | and clean up some warnings | iximeow
2021-03-22 | port long-mode decoder updates to protected-mode | iximeow
2021-03-21 | remove some forgotten println comments | iximeow
2021-03-21 | include memory sizes on inc/dec in C format | iximeow
2021-03-21 | make Opcode, Operand, and DecodeError non_exhaustive | iximeow
    in the future these can and will change (new operands, new instructions) and i would prefer they not be major breaking changes. applications can ignore them, and probably ignore undesired variants anyway. if you want to write a 1120-variant match, are you me? why would you do this
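    A short sketch of what `#[non_exhaustive]` buys here, using a trimmed-down stand-in enum rather than the crate's real, much larger `Opcode`:

        #[non_exhaustive]
        pub enum Opcode {
            NOP,
            ADD,
            HRESET,
        }

        // downstream crates cannot match a non_exhaustive enum exhaustively, so a
        // wildcard arm is required; adding opcodes in a later release is then a
        // minor, not major, version bump
        pub fn mnemonic(op: &Opcode) -> &'static str {
            match op {
                Opcode::NOP => "nop",
                Opcode::ADD => "add",
                _ => "(unrecognized)",
            }
        }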
2021-03-21 | in real programs, having read_operands inlined hurts performance! | iximeow
    the in-repo benchmark got better with this inlined but it's probably better to leave it up to the compiler when finally stitching stuff together. i suspect that having read_operands inlined resulted in just too many live values, and the compiler was inspired to play hijinks that pipelined poorly. disas-bench shows a ~15% improvement from this change.
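    The shape of the change described above, as a generic sketch; the log doesn't say which inline attribute (if any) was involved, so the names, attributes, and body here are illustrative only.

        // before: a forced-inline hint pulled a large function into every caller,
        // keeping many values live at once in the hot decode loop
        // #[inline(always)]
        //
        // after: a plain hint (or none at all) leaves the decision to the compiler
        #[inline]
        fn read_operands(modrm: u8) -> (u8, u8, u8) {
            // stand-in body; the real function decodes operand encodings
            (modrm >> 6, (modrm >> 3) & 0b111, modrm & 0b111)
        }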
2021-03-21 | fuzzing shows resetting operands is not beneficial | iximeow
2021-03-21 | fix potential successful decodes with Opcode::Invalid | iximeow
    vmov* are.. somehow messed up too
2021-03-21 | add tsxldtrk | iximeow
    does intel know no bounds