aboutsummaryrefslogtreecommitdiff
path: root/src/long_mode
AgeCommit message (Collapse)Author
2021-08-21improve relative branch offset formatting for DisplayStyle::Ciximeow
2021-08-21maintain pre-annotation inlining propertiesiximeow
this gets yaxpeax-x86 in no-inline configurations back to building as it did before, but is quite a blunt hammer. it seems that extra calls to `sink.record` trips the inlining thresholds for `read_with_annotation`, and then its caller, and its caller, even when one of them is just a delegation to its inner call. this is particularly unfortunate because yaxpeax-x86 is now making a decision about the inlining of a rather large function at the public edge of its API, but these attributes match the inlining decisions that LLVM was making before adding `DescriptionSink`. hopefully not too bad. not sure how to handle this in the future.
2021-08-21add descriptions for other prefixes, 16-bit addressingiximeow
2021-08-21add description reporting for segment prefixes and opcodes for 32-bit and 16-bitiximeow
2021-08-21report barebones decoder annotation for vex-coded instructionsiximeow
2021-08-21extend decoder annotation through all of 64-, 32-, and 16-bit modesiximeow
2021-08-21force read_sib inlining in 64-bit modeiximeow
even though NullSink is no-ops, it causes llvm to not inline this function, for a net perf reduction
2021-08-21extend annotation reporting to 32- and 16-bit modes, kindaiximeow
2021-08-21wipiximeow
2021-08-21fix negative relative branches (again!!! +- is bad!!!)iximeow
2021-08-21fix incorrect decoding of 0x9*-series instructions with rex.biximeow
2021-08-21report memory sizes for push, pop, call, retiximeow
these instructions had memory sizes reported for the operand, if it was a memory operand, but for versions with non-memory operands the decoded `Instruction` would imply that non memory access would happen at all. now, decoded instructions in these cases will report a more useful memory size.
2021-08-14relative branches should be shown as $+offset, not just plain offsetiximeow
while x86 branches of immediates are all relative to PC, other architectures may have absolute branches to immediate addresses, leaving this syntax ambiguous and potentially confusing. yaxpeax prefers to write relative offsets `$+...` as a rule, so uphold that here.
2021-08-14delcare pub const fn constructors for all gp registers, segment registers, ↵iximeow
and ip/flags
2021-08-12add RegSpec::rbx() helper (#6)chc4
2021-07-22fix incorrect decodes with scas and 67-prefixes1.0.4iximeow
2021-07-04update yaxpeax-arch to 0.2.0 and update DecodeError implsiximeow
2021-07-04update crate to rust 2018iximeow
2021-07-04support avx512 registers >=16iximeow
2021-07-04handle vzeroupper/vzeroall, reject vzero* with nonzero vvvviximeow
2021-07-04support xacquire/xrelease prefixingiximeow
2021-07-04fix several incorrect tests and docs in 64- and 32-bit modesiximeow
2021-07-03update protected_mode to match long_mode docs, apisiximeow
2021-07-03update DecodeError implsiximeow
2021-07-03document public members in long_modeiximeow
2021-07-03write some dang docs, export `MemoryAccessSize` where you'll look for itiximeow
2021-07-03more carefully test mmx operand sizesiximeow
2021-07-03factor out MemoryAccessSizeiximeow
2021-07-03add tests for MemoryAccessSize, consistentify style on docsiximeow
2021-07-03be more strict about denying invalid operandsiximeow
2021-07-03do not reject prefixed sgdt, add a TODO for xopiximeow
not that xop will ever be wanted, rip
2021-07-03support AMD `sev_snp`iximeow
2021-07-03defer checking invalid lengths for multi-prefix instructionsiximeow
this profiles slightly better? not entirely sure why...
2021-07-03document some of the weird decisions in read_instriximeow
2021-07-03clean up x86_32 and make interfaces match x86_64iximeow
2021-07-03prefixes on 0f01-series opcodes are more strictiximeow
2021-07-03add hresetiximeow
2021-07-03port over x86_64 improvements to x86_32iximeow
2021-07-03support pconfig/tmeiximeow
2021-07-03reject instructions when their opcode is `Invalid`iximeow
the evex route would allow "valid" instructions that have the opcode `invalid`. this is.. not correct.
2021-07-03fix incorrect rex prefix selectioniximeow
2021-07-02adjust decode logic for better pipeliningiximeow
at least on my zen2. when reading prefixes, optimize for the likely case of reading an instruction rather than an invalid run of prefixes. checking if we've exceeded the x86 length bound immediately after reading the byte is only a benefit if we'd otherwise read an impossibly-long instruction; in this case we can exit exactly at prefix byte 15 rather than potentially later at byte 16 (assuming a one-byte instruction like `c3`), or byte ~24 (a more complex store with immediate and displacement). these casese are extremely unlikely in practice. more likely is that reading a prefix byte is one of the first two or three bytes in an instruction, and we will never benefit from checking the x86 length bound at this point. instead, only check length bounds after decoding the entire instruction. this penalizes the slowest path through the decoder but speeds up the likely path about 5% on my zen2 processor. additionally, begin reading instruction bytes as soon as we enter the decoder, and before initial clearing of instruction data. again, this is for zen2 pipeline reasons. reading the first byte and corresponding `OPCODES` entry improves the odds that this data is available by the time we check for `Interpretation::Prefix` in the opcode scanning loop. then, if we did *not* load an instruction, we immediately know another byte must be read; begin reading this byte before applying `rex` prefixes, and as soon as a prefix is known to not be one of the escape-code prefix byte (c5, c4, 62, 0f). this clocked in at another ~5% in total. i've found that `read_volatile` is necessary to force rust to begin the loadwhere it's written, rather than reordering it over other data. i'm not committed to this being a guaranteed truth. also, don't bother checking for `Invalid`. again, `Opcode::Invalid` is a relatively unlikely path through the decoder and `Nothing` is already optiimized for `None` cases. this appears to be another small improvement in throughput but i wouldn't want to give it a number - it was relatively small and may not be attributable to this effect.
2021-07-02intel keylocker instructions that access memory have memory access sizesiximeow
2021-07-02fix several strict rejection for severaliximeow
2021-07-02consolidate length/extension checks to help compilers DCEiximeow
2021-07-01fix warningsiximeow
2021-07-01reorder prefix checksiximeow
this measures a bit faster. it doesn't seem like it should be. the rex prefix checks compile identically but move a lea for a later expression up and pipelines better?
2021-07-01reallocate OperandCode, convert disparate registers to arrayiximeow
also remove redundant assignments of operand_count and some OperandSpec, bulk-assign all registers and operands on entry to `read_instr`. this all, taken together, shaves off about 7 cycles per decode.
2021-07-01making opcode u32 reduces a stall?iximeow
2021-07-01complete yaxpeax-arch 0.1.0 adaptation, shore up .mem_size()iximeow