yaxpeax-x86 - yaxpeax x86 decoder

Age	Commit message	Author
2021-07-04	remove stale `! user beware !` line from readme [1.0.2]	iximeow

2021-07-04	fix docs link [1.0.1]	iximeow

2021-07-04	update changelog for 1.0 release [1.0.0]	iximeow

2021-07-04	update yaxpeax-arch to 0.2.0 and update DecodeError impls	iximeow

2021-07-04	update crate to rust 2018	iximeow

2021-07-04	add ffi wrappers for real mode, protected mode, and a multiarch build	iximeow

2021-07-04	support vpscatter{dd,dq,qd,qq}	iximeow

2021-07-04	support avx512 registers >=16	iximeow

2021-07-04	handle vzeroupper/vzeroall, reject vzero* with nonzero vvvv	iximeow

2021-07-04	support xacquire/xrelease prefixing	iximeow

2021-07-04	add real-mode decoder	iximeow

2021-07-04	16-bit addressing in protected mode may see avx512 masks too	iximeow

2021-07-04	fix several incorrect tests and docs in 64- and 32-bit modes	iximeow

2021-07-03	update protected_mode to match long_mode docs, apis	iximeow

2021-07-03	update yaxpeax-arch deps in ffi builds	iximeow

2021-07-03	update DecodeError impls	iximeow

2021-07-03	bump yaxpeax-arch to 0.1.0 and add badges	iximeow

2021-07-03	document public members in long_mode	iximeow

2021-07-03	write some dang docs, export `MemoryAccessSize` where you'll look for it	iximeow

2021-07-03	fix yaxpeax_arch use in ffi packaging	iximeow

2021-07-03	update readme reference to sizes, use correct measurements	iximeow

2021-07-03	export long_mode as amd64, a more recognizable name	iximeow

2021-07-03	more carefully test mmx operand sizes	iximeow

2021-07-03	factor out MemoryAccessSize	iximeow

2021-07-03	add tests for MemoryAccessSize, consistentify style on docs	iximeow

2021-07-03	be more strict about denying invalid operands	iximeow

2021-07-03	do not reject prefixed sgdt, add a TODO for xop	iximeow
	not that xop will ever be wanted, rip
2021-07-03	support AMD `sev_snp`	iximeow

2021-07-03	instructions with evex-coded registers may have registers other than 0	iximeow

2021-07-03	defer checking invalid lengths for multi-prefix instructions	iximeow
	this profiles slightly better? not entirely sure why...
2021-07-03	document some of the weird decisions in read_instr	iximeow

2021-07-03	enforce reserved evex prefix bits	iximeow

2021-07-03	clean up x86_32 and make interfaces match x86_64	iximeow

2021-07-03	prefixes on 0f01-series opcodes are more strict	iximeow

2021-07-03	add hreset	iximeow

2021-07-03	update readme in preparation for a 1.0!	iximeow

2021-07-03	port over x86_64 improvements to x86_32	iximeow

2021-07-03	vbroadcastsd requires W	iximeow

2021-07-03	support pconfig/tme	iximeow

2021-07-03	reject instructions when their opcode is `Invalid`	iximeow
	the evex route would allow "valid" instructions that have the opcode `invalid`. this is.. not correct.
2021-07-03	fix incorrect rex prefix selection	iximeow

2021-07-02	adjust decode logic for better pipelining	iximeow
	at least on my zen2. when reading prefixes, optimize for the likely case of reading an instruction rather than an invalid run of prefixes. checking if we've exceeded the x86 length bound immediately after reading the byte is only a benefit if we'd otherwise read an impossibly-long instruction; in this case we can exit exactly at prefix byte 15 rather than potentially later at byte 16 (assuming a one-byte instruction like `c3`), or byte ~24 (a more complex store with immediate and displacement). these cases are extremely unlikely in practice. more likely is that reading a prefix byte is one of the first two or three bytes in an instruction, and we will never benefit from checking the x86 length bound at this point. instead, only check length bounds after decoding the entire instruction. this penalizes the slowest path through the decoder but speeds up the likely path about 5% on my zen2 processor.
	additionally, begin reading instruction bytes as soon as we enter the decoder, and before initial clearing of instruction data. again, this is for zen2 pipeline reasons. reading the first byte and corresponding `OPCODES` entry improves the odds that this data is available by the time we check for `Interpretation::Prefix` in the opcode scanning loop. then, if we did not load an instruction, we immediately know another byte must be read; begin reading this byte before applying `rex` prefixes, and as soon as a prefix is known to not be one of the escape-code prefix bytes (c5, c4, 62, 0f). this clocked in at another ~5% in total.
	i've found that `read_volatile` is necessary to force rust to begin the load where it's written, rather than reordering it over other data. i'm not committed to this being a guaranteed truth.
	also, don't bother checking for `Invalid`. again, `Opcode::Invalid` is a relatively unlikely path through the decoder and `Nothing` is already optimized for `None` cases. this appears to be another small improvement in throughput but i wouldn't want to give it a number - it was relatively small and may not be attributable to this effect.
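The deferred length check described in the commit message above can be illustrated with a toy sketch. This is not the crate's decoder: `toy_decode_len`, `is_prefix`, `ToyError`, and the reduced prefix set are all hypothetical names made up for illustration, and the "instruction" here is just a run of legacy prefixes followed by a single opcode byte. The only point is the shape of the control flow: the prefix loop does no length comparison, and the 15-byte bound is checked once after decoding finishes.

```rust
/// x86 instructions may be at most 15 bytes long.
const MAX_INSTRUCTION_LEN: usize = 15;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum ToyError {
    ExhaustedInput,
    TooLong,
}

/// Hypothetical prefix test covering only a few legacy prefix bytes.
fn is_prefix(b: u8) -> bool {
    matches!(
        b,
        0x26 | 0x2e | 0x36 | 0x3e | 0x64 | 0x65 | 0x66 | 0x67 | 0xf0 | 0xf2 | 0xf3
    )
}

/// Toy decoder: consume prefix bytes, then treat the next byte as a
/// complete one-byte instruction. Returns the total encoded length.
fn toy_decode_len(bytes: &[u8]) -> Result<usize, ToyError> {
    let mut len = 0;
    loop {
        // No length-bound check here: the hot loop only reads and classifies.
        let b = *bytes.get(len).ok_or(ToyError::ExhaustedInput)?;
        len += 1;
        if !is_prefix(b) {
            break;
        }
    }
    // Deferred check: only the rare over-long case pays for reading extra bytes.
    if len > MAX_INSTRUCTION_LEN {
        return Err(ToyError::TooLong);
    }
    Ok(len)
}
```

With this shape, an input of fifteen `66` prefixes followed by `c3` is rejected only after all sixteen bytes have been read, which is the slow path the commit message accepts in exchange for a cheaper common case.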
2021-07-02	intel keylocker instructions that access memory have memory access sizes	iximeow

2021-07-02	fix several strict rejection for several	iximeow

2021-07-02	consolidate length/extension checks to help compilers DCE	iximeow

2021-07-02	`Nothing` operand code must be decoded with operand_count=0	iximeow

2021-07-01	fix warnings	iximeow

2021-07-01	[DROP] fix up tests to match newer operand width interfaces	iximeow

2021-07-01	reorder prefix checks	iximeow
	this measures a bit faster. it doesn't seem like it should be. the rex prefix checks compile identically but move a lea for a later expression up and pipelines better?
2021-07-01	reallocate OperandCode, convert disparate registers to array	iximeow
	also remove redundant assignments of operand_count and some OperandSpec, bulk-assign all registers and operands on entry to `read_instr`. this all, taken together, shaves off about 7 cycles per decode.