describe optimizations included in 1.1.5opts

author: iximeow <me@iximeow.net> 2022-12-03 15:03:13 -0800
committer: iximeow <me@iximeow.net> 2022-12-03 15:03:13 -0800
commit: 8eef2114ece0b4a96866f075e87f195a804d61cb (patch)
tree: f4cdbf7303d88df14e35f678f1bca8c00b8f0630
parent: 64abb4e439230c8b4b8a6534989784e362efb12d (diff)
1 files changed, 14 insertions, 0 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 1d676be..016c7bd 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -1,4 +1,18 @@
 ## 1.1.5
+* optimizations (mostly code motion) for hot codepaths
+  - large `match`-based decode tables have been outlined to 256-entry arrays.
+    this makes for slightly nicer inlining in `read_with_annotations`.
+  - vex/evex decoding in 64-bit decoding now shares more code. this seems to
+    aid code cache friendliness when prefixes must be read.
+  - added a fast path for operand reading for the more-likely cases of
+    [64-bit]: {0x66,rex}{<opcode>,0x0f-<opcode>}
+    [32-bit]: {0x66}{<opcode>,0x0f-<opcode>}
+    [16-bit]: {0x66}{<opcode>,0x0f-<opcode>}
+
+    in particular, this avoids checking for instruction length overflows and
+    some bounds checks when we aren't handling a pessimal case of many-prefixed
+    instructions. if an instruction has multiple prefixes, decoders fall back
+    to normal read-in-a-loop-until-length-limit-reached decoding.
 * `Makefile` at the crate root now exercises `yaxpeax-x86` builds and tests under:
   - default features (fmt, std)
   - no-std + fmt