| author | iximeow <me@iximeow.net> | 2021-01-15 18:15:04 -0800 | 
|---|---|---|
| committer | iximeow <me@iximeow.net> | 2021-01-15 18:21:19 -0800 | 
| commit | d8083b08dc987adeda73fb13298383c6cf519596 (patch) | |
| tree | e6e4f19b631d26c547551d9168fda81476de2c1e | |
| parent | dd88df2b9769630f8eafe909e00eb3a3a21c954e (diff) | |
small perf tweaks

clearing reg_rrr and reg_mmm more efficiently is an extremely small win,
but a win

read_imm_signed generally should inline well and runs afoul of some
heuristic. inlining gets about 8% improved throughput on the
(unrealistic) in-repo benchmark

it would be great to be able to avoid bounds checks somehow; it looks
like they alone are another ~10% of decode time. i'm not sure how to
pull that off while retaining the generic iterator parameter. might just
not be possible.

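
for illustration, here is a minimal sketch of the kind of helper the `read_imm_signed` note is about: a small reader, generic over `Iterator<Item=u8>`, marked `#[inline(always)]` so it is expanded at each call site instead of being left to the inliner's heuristics. the body and the `DecodeError` type below are hypothetical stand-ins, not the code in `src/long_mode/mod.rs`; only the signature shape and the attribute come from the diff.

```rust
/// Hypothetical stand-in for the decoder's error type.
#[derive(Debug, PartialEq)]
enum DecodeError {
    ExhaustedInput,
}

/// Sketch: read a little-endian signed immediate of `num_width` bytes
/// (1, 2, 4 or 8) from a generic byte iterator, sign-extended to i64.
/// `#[inline(always)]` asks the compiler to inline at every call site.
#[inline(always)]
fn read_imm_signed<T: Iterator<Item = u8>>(
    bytes: &mut T,
    num_width: u8,
    length: &mut u8,
) -> Result<i64, DecodeError> {
    debug_assert!(matches!(num_width, 1 | 2 | 4 | 8));
    let mut raw: u64 = 0;
    for i in 0..num_width {
        // each `next()` is a potential early exit when input runs out
        let b = bytes.next().ok_or(DecodeError::ExhaustedInput)?;
        raw |= (b as u64) << (8 * i as u32);
    }
    *length += num_width;
    // sign-extend from `num_width` bytes up to 64 bits
    let shift = 64 - 8 * num_width as u32;
    Ok(((raw << shift) as i64) >> shift)
}
```

when `num_width` is a constant at the call site, inlining gives the optimizer a chance to unroll the loop and fold the shifts, which is one plausible reason forcing the inline helps here. the 8% figure is from the in-repo benchmark, not from this sketch.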
| -rw-r--r-- | CHANGELOG | 2 | ||||
| -rw-r--r-- | src/long_mode/mod.rs | 8 | 
2 files changed, 7 insertions, 3 deletions
@@ -3,6 +3,8 @@
   - AMD-only `monitorx`, `mwaitx`, `clzero`, and `rdpru` are now supported
   - `swapgs` is invalid in non-64-bit modes
   - `rdpkru` and `wrpkru` were incorrectly decoded when modrm bits were not `11`
+* small performance tweaks. read_imm_signed is now inline(always) and some
+  pre-decode initialization is a bit better-packed
 
 ## 0.1.4
 * [long mode only]: fix decoding of rex-prefixed modrm+sib operands selecting index 0b100 and base 0b101
diff --git a/src/long_mode/mod.rs b/src/long_mode/mod.rs
index f15e2a1..f9be9ab 100644
--- a/src/long_mode/mod.rs
+++ b/src/long_mode/mod.rs
@@ -5835,8 +5835,10 @@ fn read_instr<T: Iterator<Item=u8>>(decoder: &InstDecoder, mut bytes_iter: T, in
 //    use core::intrinsics::unlikely;
     let mut prefixes = Prefixes::new(0);
 
-    instruction.modrm_mmm.bank = RegisterBank::Q;
-    instruction.sib_index.bank = RegisterBank::Q;
+    // ever so slightly faster than just setting .bank: this allows the two assignments to merge
+    // into one `mov 0, dword [instruction + modrm_mmm_offset]`
+    instruction.modrm_mmm = RegSpec::rax();
+    instruction.sib_index = RegSpec::rax();
 
     fn escapes_are_prefixes_actually(prefixes: &mut Prefixes, opc_map: &mut Option<OpcodeMap>) {
         match opc_map {
@@ -8881,7 +8883,7 @@ fn read_imm_ivq<T: Iterator<Item=u8>>(bytes: &mut T, width: u8, length: &mut u8)
     }
 }
 
-#[inline]
+#[inline(always)]
 fn read_imm_signed<T: Iterator<Item=u8>>(bytes: &mut T, num_width: u8, length: &mut u8) -> Result<i64, DecodeError> {
     if num_width == 1 {
         *length += 1;
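
the store-merging comment in the `read_instr` hunk can be sketched in isolation. everything below is a simplified assumption made for illustration: a two-byte `RegSpec` (`num` plus `bank`), a `RegisterBank::Q` discriminant of 0, and the two fields sitting next to each other in a `#[repr(C)]` struct. the real types in the crate may be laid out differently.

```rust
// Hypothetical, simplified types; not the crate's actual definitions.
#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(u8)]
enum RegisterBank {
    Q = 0, // assumed to be the all-zero discriminant
    D = 1,
}

#[derive(Clone, Copy, Debug, PartialEq)]
#[repr(C)]
struct RegSpec {
    num: u8,
    bank: RegisterBank,
}

impl RegSpec {
    const fn rax() -> Self {
        RegSpec { num: 0, bank: RegisterBank::Q }
    }
}

// repr(C) keeps the two fields adjacent in this sketch.
#[repr(C)]
struct Instruction {
    modrm_mmm: RegSpec,
    sib_index: RegSpec,
}

/// Before: two one-byte writes to non-adjacent bytes (offsets 1 and 3 here).
fn reset_banks_only(inst: &mut Instruction) {
    inst.modrm_mmm.bank = RegisterBank::Q;
    inst.sib_index.bank = RegisterBank::Q;
}

/// After: four contiguous zero bytes, which the optimizer is free to
/// emit as a single 32-bit store.
fn reset_whole_regs(inst: &mut Instruction) {
    inst.modrm_mmm = RegSpec::rax();
    inst.sib_index = RegSpec::rax();
}
```

the two functions are not equivalent: the second also clears `num`. that is only acceptable if decoding always rewrites `num` before it is read, which the commit appears to rely on. whether the stores actually merge depends on the field layout and the optimizer, so it is worth checking the generated assembly rather than assuming it.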
