yaxpeax-x86 - yaxpeax x86 decoder

Age	Commit message	Author
2023-01-02	TEMP generate InstDecoder bitsx86-generic	iximeow

2023-01-02	add a `generic` module for x86 disassembly	iximeow
	this module generally attempts to decode as 64-bit x86 instructions, on the assumption they are the most likely-desired instructions, falling back to 32-bit and then 16-bit decoding, in order. translation from a 64-bit `long_mode::Instruction` to `generic::Instruction` is close to free, where `protected_mode::Instruction` and `real_mode::Instruction` may be a little more costly in time but should still not be too bad. docs still need much touching up. most docs reference the `long_mode` structures and enums they're strongly inspired by.
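The fallback order described in this commit can be sketched as below. This is a minimal illustration, not the crate's `generic` module: `Mode`, `Decoded`, `toy_decode`, and `decode_generic` are hypothetical names, and the toy decoder only recognizes two one-byte opcodes (`0x90`, `nop`, valid in every mode; `0x06`, `push es`, invalid in 64-bit mode).

```rust
/// Decode modes (names are assumptions for this sketch).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Mode {
    Long,      // 64-bit
    Protected, // 32-bit
    Real,      // 16-bit
}

/// The order the fallback tries modes in: 64-bit, then 32-bit, then 16-bit.
const FALLBACK_ORDER: [Mode; 3] = [Mode::Long, Mode::Protected, Mode::Real];

/// Hypothetical mode-independent result: which mode succeeded and how many
/// bytes the instruction occupied.
#[derive(Debug, PartialEq)]
struct Decoded {
    mode: Mode,
    length: usize,
}

/// Toy stand-in for a per-mode decoder. It only knows 0x90 (`nop`) and
/// 0x06 (`push es`, which does not exist in 64-bit mode).
fn toy_decode(mode: Mode, bytes: &[u8]) -> Option<usize> {
    match (mode, bytes.first()) {
        (_, Some(0x90)) => Some(1),
        (Mode::Protected | Mode::Real, Some(0x06)) => Some(1),
        _ => None,
    }
}

/// Try each mode in order and report the first one that decodes.
fn decode_generic(bytes: &[u8]) -> Option<Decoded> {
    FALLBACK_ORDER
        .iter()
        .find_map(|&mode| toy_decode(mode, bytes).map(|length| Decoded { mode, length }))
}
```

With these toy rules, `decode_generic(&[0x06])` is rejected by the 64-bit attempt and accepted by the 32-bit one, which is the kind of case the fallback exists for.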
2023-01-02	remove a few duplicate impls, add stubs for generic translations	iximeow
	generate_opcode.py has quickly grown into generating much more than just opcode definitions, and now handles a few duplicate impls across the different decode modes as well. some of the added impl generation conflicts with still-existing hand-written impls from yore, so they needed a bit of removing. next will be the addition of a generic module for "probably what you want" disassembly of x86, avoiding the 64-/32-/16-bitness of the architecture family with an attempt to decode "probably what you wanted" from a byte sequence. it needs a little more work still, but TODO stubs added here support that new module.
2023-01-02	building out the generic x86 type	iximeow

2023-01-02	codegen `Colorized` impl and normalize `name()` implementation	iximeow
	unfortunately because of the layout of instruction information this adds lines rather than removes them..
2023-01-02	yax builds again with opcodes generated by type	iximeow

2023-01-02	do benchmarking in ci too	iximeow

2023-01-02	add a goodfile, will this.. work?	iximeow

2022-12-24	update old yaxpeax-arch versions in ffi crates to compatible versions	iximeow

2022-12-03	bump Cargo.toml to 1.1.5 (tag: 1.1.5)	iximeow

2022-12-03	include typo fixes in the changelog!	iximeow

2022-12-03	describe optimizations included in 1.1.5	iximeow

2022-12-03	roll up decoding loop changes for 16-bit and 32-bit decoders	iximeow
	this applies
	* f338c74656f6eef8b3080fa9f249b1cb733fd1a9
	* bece19e6a69b158893abbf56a6cac25eb25d9a32
	* 6353f58170d28a142e3b012c2c86f684d50dea45
	* 67be1c0983244645a3c762b7aa0601f0d0ba4bb3
	* 091f1d66ef853d6339a96e43d71c137ee7d3907a
	as one unit to both the 16-bit and 32-bit decoders.
2022-12-03	apply e7f49509 to 16-bit and 32-bit decoders	iximeow

2022-12-03	apply 2444de11 to 16-bit and 32-bit decoders	iximeow
	these don't need the extra `rex`-supporting index space, so they don't have it.
2022-12-03	fix incorrect rex selection and field description offsets	iximeow

2022-12-03	66 prefixes are common, 0f opcodes are common	iximeow

2022-12-03	support a fast path through the decoder for [rex-prefixed]opcode insts	iximeow
	the overwhelming majority of x86 instructions are either a single-byte opcode or a single-byte opcode with a rex prefix. supporting these specially means that we don't have to length-check on every byte or go through the full decode loop while reading the most likely instructions. this is a significant improvement on typical x86 streams, but comes at a moderate penalty for crafted x86 instructions. the penalty is still not very bad, as the fast path is exited in favor of the full decode loop as soon as we see a non-rex prefix byte; this adds maybe a dozen instructions to the slow path.
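The shape of such a fast path might look like the sketch below. It is an assumption-laden simplification, not the decoder's actual code: `FastPath`, `is_rex`, `needs_full_decode`, and `try_fast_path` are hypothetical names, the sketch assumes 64-bit mode, and it sends `0x0f`-escaped opcodes to the slow path even though the real decoder may treat them differently.

```rust
/// What the hypothetical fast path reports back to its caller.
#[derive(Debug, PartialEq)]
enum FastPath {
    /// A one-byte opcode, optionally preceded by a single REX prefix.
    Opcode { rex: Option<u8>, opcode: u8, consumed: usize },
    /// Anything else: the caller should restart in the full decode loop.
    Fallback,
}

/// In 64-bit mode, bytes 0x40..=0x4f are REX prefixes.
fn is_rex(byte: u8) -> bool {
    byte & 0xf0 == 0x40
}

/// Bytes this sketch refuses to treat as a plain one-byte opcode: legacy
/// prefixes, the 0x0f escape, and the VEX/EVEX lead bytes.
fn needs_full_decode(byte: u8) -> bool {
    matches!(
        byte,
        0x26 | 0x2e | 0x36 | 0x3e | 0x64 | 0x65 | 0x66 | 0x67 | 0xf0 | 0xf2 | 0xf3
            | 0x0f | 0x62 | 0xc4 | 0xc5
    )
}

fn try_fast_path(bytes: &[u8]) -> FastPath {
    // One length check covers both bytes this path may read.
    if bytes.len() < 2 {
        return FastPath::Fallback;
    }
    let (rex, opcode, consumed) = if is_rex(bytes[0]) {
        (Some(bytes[0]), bytes[1], 2)
    } else {
        (None, bytes[0], 1)
    };
    // Leave the fast path as soon as the candidate opcode is really a prefix.
    if is_rex(opcode) || needs_full_decode(opcode) {
        return FastPath::Fallback;
    }
    FastPath::Opcode { rex, opcode, consumed }
}
```

In this sketch inputs shorter than two bytes also take the slow path, which keeps the fast path down to a single bounds check.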
2022-12-03	just a bit more code motion that seemed to help things sometimes	iximeow

2022-12-03	reorder prefix checks, extract vex/evex prefix handling	iximeow
	sharing vex/evex invalid prefix checks improves codegen a bit, but ordering prefix checks by likeliest prefix first reduces time falling through prefix handling arms. both together are a notable improvement in throughput on typical x86 code. bundled in here is some code motion to where `mem_size = 0` and `operand_count = 2` are executed; this is because, at least on zen2 and cascade lake parts, bunching all stores to the instruction together caused small stalls getting into the decoder. spreading out stores seems to mix these assignments with parts of code that was not using memory anyway, and pipelines better.
2022-12-03	move opcode lookup tables into const arrays	iximeow
	cleanliness, but also slightly better codegen somehow?
2022-12-03	replace size lookup logic with a LUT	iximeow
	the match compiled into some indirect branch awfulness!! no thank you
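For illustration, replacing a `match` with a lookup table can be as small as the sketch below. `SizeCode`, `SIZE_LUT`, and both functions are hypothetical and much simpler than the decoder's real size logic; for a `match` this small the compiler may well emit a table or arithmetic on its own, so this shows only the shape of the change, not its measured effect.

```rust
/// Hypothetical operand-size codes; the discriminants double as table indices.
#[derive(Debug, Clone, Copy)]
#[repr(u8)]
enum SizeCode {
    Byte = 0,
    Word = 1,
    Dword = 2,
    Qword = 3,
}

/// Sizes in bytes, indexed by `SizeCode as usize`.
const SIZE_LUT: [u8; 4] = [1, 2, 4, 8];

/// The `match` form: the compiler is free to lower this to a jump table.
fn size_by_match(code: SizeCode) -> u8 {
    match code {
        SizeCode::Byte => 1,
        SizeCode::Word => 2,
        SizeCode::Dword => 4,
        SizeCode::Qword => 8,
    }
}

/// The table form: a single indexed load from a const array.
fn size_by_lut(code: SizeCode) -> u8 {
    SIZE_LUT[code as usize]
}
```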
2022-09-23	Fix some typos.	Bruce Mitchener

2022-05-30	pshufb annotations use incorrect register banks (for now?)	iximeow
	the correct bank is applied far after register numbers are read. a correct annotation would need to know to defer emission until setting register banks, but also would need to work backwards for the number of bits between the current byte and modrm. not impossible, but substantial refactoring.
2022-05-07	more annotation fixes?	iximeow

2022-05-01	add testing setup for field descriptions	iximeow

2022-04-30	support 0x9a callf in 16/32-bit modes	iximeow

2022-04-24	fix a few issues preventing no-std builds from ... building	iximeow
	this includes a `Makefile` that exercises the various crate configs. most annoyingly, several doc comments needed to grow `#[cfg(feature="fmt")]` blocks so docs continue to build with that feature enabled or disabled. carved out a way to run exhaustive tests; they should be written as `#[ignore]`, and then the makefile will run even ignored tests on the expectation that this will run the exhaustive (but slower) suite. exhaustive tests are not yet written. they'll probably involve spanning 4 byte sequences from 0 to 2^32-1.
2022-01-12	fuzz DisplayStyle::C and fix corresponding issues (tag: 1.1.4)	iximeow

2022-01-02	update changelog	iximeow

2022-01-02	fix incorrect decoder used in docs test	iximeow

2022-01-02	actually include a link	iximeow

2022-01-02	explicit inline annotations for kinda_uncheckeds	iximeow
	unfortunately something about the wrapper functions adjusts codegen even when the wrapper functions themselves are just calls to inner functions. the in-tree benchmark (known to not be comprehensive, but enough to spot a difference), showed a ~3.5% regression in throughput with the prior commit, even though it doesn't change behavior at all. explicit #[inline(always)] gets things to a state where the wrapper functions do not penalize performance. for an example of the differences in codegen, see below.
	before:
	```
	< 141d4: 48 39 fa cmp %rdi,%rdx
	< 141d7: 0f 84 0b 4d 00 00 je 18ee8 <_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4e58>
	< 141dd: 0f b6 0f movzbl (%rdi),%ecx
	< 141e0: 48 83 c7 01 add $0x1,%rdi
	< 141e4: 48 89 7c 24 38 mov %rdi,0x38(%rsp)
	... snip ...
	```
	after:
	```
	> 141d4: 48 39 ea cmp %rbp,%rdx
	> 141d7: 0f 84 97 4c 00 00 je 18e74 <_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4de4>
	> 141dd: 0f b6 4d 00 movzbl 0x0(%rbp),%ecx
	> 141e1: 48 83 c5 01 add $0x1,%rbp
	> 141e5: 48 89 6c 24 38 mov %rbp,0x38(%rsp)
	... snip ...
	```
	there are several spans of code with this kind of change involved; there are no explicit calls to `get_kinda_unchecked` or `unreachable_kinda_unchecked` but clearly a difference did make it through to the benchmark's code.
	while the choice of `rbp` instead of `rdi` wouldn't seem very interesting, the instructions themselves are more substantially different. `0fb60f` vs `0fb64d00`; to encode `[rbp + 0]`, the instruction requires a displacement, and is one byte longer as a result. there are several instructions so-impacted, and i suspect the increased code size is what ended up changing benchmark behavior.
	after adding these `#[inline(always)]` annotations, there is no difference in generated code with or without the `kinda_unchecked` helpers!
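A wrapper of the kind this commit discusses could look roughly like the sketch below. The name `get_checked_in_debug` and its signature are assumptions made for illustration; the crate's own `get_kinda_unchecked` may be written differently. The idea is that debug builds keep the bounds check, release builds skip it, and `#[inline(always)]` asks the compiler not to leave the wrapper as a separate call.

```rust
/// Hypothetical debug-checked accessor (not the crate's actual helper).
///
/// # Safety
/// `idx` must be less than `slice.len()`.
#[inline(always)]
unsafe fn get_checked_in_debug<T>(slice: &[T], idx: usize) -> &T {
    if cfg!(debug_assertions) {
        // Debug builds: an out-of-bounds index panics instead of being UB.
        &slice[idx]
    } else {
        // Release builds: skip the bounds check entirely.
        unsafe { slice.get_unchecked(idx) }
    }
}

/// Example caller: index 2 is always in bounds for a four-element array.
fn third_byte(bytes: &[u8; 4]) -> u8 {
    unsafe { *get_checked_in_debug(&bytes[..], 2) }
}
```

Whether the annotation changes generated code depends on the compiler version and the surrounding code, so a benchmark or disassembly diff like the one in the commit message is the way to confirm it.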
2022-01-02	Wrap unsafe functions to catch errors in debug	5225225
	Closes https://github.com/iximeow/yaxpeax-x86/issues/16
2021-12-19	prep for 1.1.3 release	iximeow
	actual release is being held until cargo fuzz runs a while without a panic
2021-12-19	add in-tree cargo fuzz targets for decode and display impls	iximeow

2021-12-19	fix incorrect memory size for f30f1e-style nop	iximeow
	not only did the instruction have wrong data, but if displayed, the formatter would panic.
2021-12-19	test that invalid RegSpec constructions panic as expected	iximeow
	in the process, fix 64-bit rex-byte limit, 32/16-bit mode mask reg limit
2021-12-17	write `apply_disp_scale` in a mode-agnostic way	iximeow
	`apply_disp_scale` forgot that `wrapping_mul` exists, so we don't need to explicitly write the size of value that `mem_size` should be cast to, in casting to/from a signed integer. taken with `.into()`, we don't need per-architecture stubs to make evex decoding work.
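The arithmetic behind this can be sketched as follows. `scale_disp32` and `scale_disp64` are hypothetical stand-ins, not the crate's `apply_disp_scale`; they assume the displacement is a sign-extended value stored in an unsigned integer and that `mem_size` is a `u8` scale factor. Because two's complement multiplication produces the same low bits for signed and unsigned operands, `wrapping_mul` on the unsigned storage gives the signed result without any casts, and `.into()` picks the operand width from the displacement type.

```rust
/// Scale a 32-bit displacement stored as unsigned but meant as signed.
fn scale_disp32(disp: u32, mem_size: u8) -> u32 {
    // `.into()` widens the u8 scale to u32; wrapping avoids overflow panics.
    disp.wrapping_mul(mem_size.into())
}

/// Same body for a 64-bit displacement; only the types differ.
fn scale_disp64(disp: u64, mem_size: u8) -> u64 {
    disp.wrapping_mul(mem_size.into())
}
```

For example, a stored displacement of `-1` (as `0xffff_ffff`) scaled by 64 wraps to the bit pattern of `-64` rather than panicking in a debug build.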
2021-12-17	do not panic on negative compressed displacements, i mean it!!	iximeow

2021-12-16	bump version to 1.1.2 (tag: 1.1.2)	iximeow

2021-12-16	displacements are stored as unsigned, but are functionally signed ints	iximeow
	so multiplying to expand EVEX compressed offsets can overflow, and that needs to be okay.
2021-10-10	bump version to 1.1.1 (tag: 1.1.1)	iximeow

2021-10-10	talk about contribution policy a little	iximeow

2021-10-10	downgrade "most hardware" to "some hardware"	iximeow
	alas
2021-10-10	add `InstructionDisplayer` export to changelog	iximeow

2021-10-10	support endbr{32,64}	iximeow

2021-10-10	consistentify doc style	iximeow

2021-10-10	export `InstructionDisplayer` (#9)	i509VCB
	This makes generated docs refer to a type and show said type in the list of all structs rather than rustdoc showing gray text in return types.
	quote doc references
2021-08-22	bump to yaxpeax-arch 0.2.7 and proper field description support (tag: 1.1.0)	iximeow