add `AnnotatingDecoder`, supporting definitions, and a doc about it

author: iximeow <me@iximeow.net> 2021-08-06 00:23:46 -0700
committer: iximeow <me@iximeow.net> 2021-08-13 22:14:08 -0700
commit: b85a987a56602c7733e3f8a86990a33644a97bf5 (patch)
tree: d42f582f84095341de1b1917ec8e1e1b463a49fa
parent: 6390ab5811143671058407681aa58e18a6042836 (diff)
5 files changed, 176 insertions, 1 deletions
diff --git a/CHANGELOG b/CHANGELOG
index 1683fec..2ffb6d6 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -2,6 +2,17 @@
 
 TODO: Reader::next_n should return the number of items read as Err(ReadError::Incomplete(n)) if the buffer is exhausted
 TODO: Reader::offset should return an AddressDiff<Address>, not a bare Address
+TODO: impls of `fn one` and `fn zero` so downstream users don't have to import num_traits directly
+
+## 0.2.6
+
+added `AnnotatingDecoder` and associated traits `FieldDescription` and `DescriptionSink` for architectures to report meanings for bit ranges in decoded instructions.
+
+added `NullSink`, with an `impl<T> DescriptionSink<T> for NullSink` - `NullSink` can always be used to discard instruction annotations. this is mostly useful for shared annotating and non-annotating decode logic.
+
+added a `docs/` directory for `yaxpeax-arch`: trip reports for `yaxpeax-arch` design. if `yaxpeax` eventually grows an RFC process one day, these are the kind of changes that would get RFC'd.
+
+added `docs/0001-AnnotatingDecoder.md`, describing motivation and implementation notes of `AnnotatingDecoder`.
 
 ## 0.2.5
 
diff --git a/Cargo.toml b/Cargo.toml
index 07f4e04..a8352f6 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -7,7 +7,7 @@ keywords = ["disassembly", "disassembler"]
 license = "0BSD"
 name = "yaxpeax-arch"
 repository = "https://git.iximeow.net/yaxpeax-arch/"
-version = "0.2.5"
+version = "0.2.6"
 
 [dependencies]
 "num-traits" = { version = "0.2", default-features = false }
diff --git a/README.md b/README.md
index b11f5b8..c331f02 100644
--- a/README.md
+++ b/README.md
@@ -37,6 +37,19 @@ there are numerous architectures for which decoders are implemented, at varying
 | `MOS 6502` | [yaxpeax-6502](https://github.com/cr1901/yaxpeax-6502) | ⚠️ | ❓ | ❓ | contributed by [@cr1901](https://www.twitter.com/cr1901) |
 | `lc87` | [yaxpeax-lc87](https://www.github.com/iximeow/yaxpeax-lc87) | 🥳 | ⚠️ | ❓ | |
 
+#### features
+
+`yaxpeax-arch` defines a few typically-optional features that decoders can also implement, in addition to simple `(bytes) -> instruction` decoding. these are not crate features, but `yaxpeax-arch` trait impls or collections thereof.
+
+`description_spans`: implementation of [`AnnotatingDecoder`](https://docs.rs/yaxpeax-arch/latest/yaxpeax_arch/trait.AnnotatingDecoder.html), to decode instructions with bit-level details of what incoming bitstreams mean.
+`contextualize`: implementation of [`ShowContextual`](https://docs.rs/yaxpeax-arch/latest/yaxpeax_arch/trait.ShowContextual.html), to display instructions with user-defined information in place of default instruction data. typically expected to show label names instead of relative branch addresses. **i do not recommend implementing this trait**, it needs significant reconsideration.
+
+| architecture | `description_spans` | `contextualize` |
+| ------------ | ------------------- | --------------- |
+| `x86_64` | 🥳 | ❓ |
+| `ia64` | 🥳 | ❓ |
+| `msp430` | 🥳 | ❓ |
+
 ### mirrors
 
 the canonical copy of `yaxpeax-arch` is at [https://git.iximeow.net/yaxpeax-arch](https://git.iximeow.net/yaxpeax-arch).
diff --git a/docs/0001-AnnotatingDecoder.md b/docs/0001-AnnotatingDecoder.md
new file mode 100644
index 0000000..ed544f7
--- /dev/null
+++ b/docs/0001-AnnotatingDecoder.md
@@ -0,0 +1,85 @@
+## `DescriptionSink`
+
+most architectures' machine code packs interesting meanings into specific bit fields, and one of the more important tasks of the yaxpeax decoders is to unpack these into opcodes, operands, and other instruction data for later use. in the worst case, some architectures - typically interpreted bytecodes - do less bit-packing and simply map bytes to instructions.
+
+the yaxpeax decoders' primary role is to handle this unpacking into user code-friendly structs. i want decoders to be able to report the meaning of bitfields too, so user code can mark up bit streams.
+
+implementing this capability should (borderline-"must") not regress performance for decoders that do not use it. as a constraint, this is surprisingly restrictive!
+
+a. it rules out a parameter to [`Decoder::decode_into`](https://docs.rs/yaxpeax-arch/0.2.5/yaxpeax_arch/trait.Decoder.html#tymethod.decode_into): an ignored or unused parameter can still change how `decode_into` inlines.  
+b. it rules out extra state on `Decoder` impls: writing to an unread `Vec` is still extra work at decode time.  
+
+decoders other than x86 are less performance-sensitive, so **light** regressions in performance may be tolerable.
+
+i would also like to:
+
+c. not require decoders implement this to participate in code analysis [`yaxpeax-core`](https://github.com/iximeow/yaxpeax-core/) provides.  
+d. re-use existing decode logic -- requiring myself and other decoder authors to write everything twice would be miserable.  
+
+the point `c` suggests not adding this capability to existing traits. taken together, these constraints point towards a _new_ trait that _could_ be implemented as an independent copy of decode logic, like:
+
+```rust
+trait AnnotatingDecoder<A: Arch + ?Sized> {
+    fn decode_with_annotation<
+        T: Reader<A::Address, A::Word>,
+    >(&mut self, inst: &mut A::Instruction, words: &mut T) -> Result<(), A::DecodeError>;
+}
+```
+
+but for implementations, it's easiest to tack this onto an existing `Arch`'s `InstDecoder`. point `b` means no new state, so wherever details about a span of bits are recorded, it should be an additional `&mut` parameter. then, if that parameter is an impl of some `Sink` trait, `yaxpeax_arch` can provide a no-op implementation of the `Sink` and let call sites be eliminated for non-annotating decodes.
+
+taken together, this ends up adding three traits:
+
+```rust
+pub trait DescriptionSink<Descriptor> {
+    fn record(&mut self, start: u32, end: u32, description: Descriptor);
+}
+
+pub trait FieldDescription {
+    fn id(&self) -> u32;
+}
+
+pub trait AnnotatingDecoder<A: Arch + ?Sized> {
+    type FieldDescription: FieldDescription + Clone + Display + PartialEq;
+
+    fn decode_with_annotation<
+        T: Reader<A::Address, A::Word>,
+        S: DescriptionSink<Self::FieldDescription>
+    >(&self, inst: &mut A::Instruction, words: &mut T, sink: &mut S) -> Result<(), A::DecodeError>;
+}
+```
+
+where `FieldDescription` lets callers that operate generically over spans do *something* with them. implementations can use `id` to tag descriptions that should be ordered together, regardless of the actual order the decoder reports them in. for some architectures, fields parsed later in decoding may influence the understanding of earlier fields, so reporting spans in `id`-order up front is an unreasonable burden. consider an x86 instruction, `660f6ec0` - the leading `66` is an operand-size override, but only after reading `0f6e` is it known that that prefix is changing the operands from `mm`/`dword` registers to `xmm`/`qword` registers. in fact this is only known _after_ reporting the opcode of `0f6e`, too.
+
+`start` and `end` are bit offsets where a `description` applies. `description`s can overlap in part, or in full. exact bit order is known only by the architecture being decoded; is the order `0-7,8-15,16-23,24-31`, `7-0,15-8,23-16,31-24`, or something else? i'm not sure trying to encode that in `yaxpeax-arch` traits is useful right now. `start` and `end` are `u32` because in my professional opinion, `u16` is cursed, `u8` isn't large enough, and `u32` is the next smallest size. `id()` returns a `u32` because i never want to think of `id` space constraints; even if `id` encoded a `major.minor`-style pair of ordering components, the most constrained layout would be `u16.u16` for at most 65536 values in major or minor. that's a big instruction.
+
+### implementation
+
+i've added WIP support for span reporting to `msp430`, `ia64`, and `x86` decoders. i extended `yaxpeax-dis` to [make pretty lines](https://twitter.com/iximeow/status/1423930207614889984). more could be said about that; `id`-order is expected to be, roughtly, the order an instruction is decoded. some instructions sets keep the "first" bits as the low-order bits, some others use the higher bits. so respecting `id`-order necessarily means some instruction sets will have fields "backwards" and make lines extra confusing.
+
+decoders probably ought to indicate boundaries for significant parts of decoding, lest large instructions [like itanium](https://twitter.com/iximeow/status/1424092536071618561) be a nebulous mess. maybe `FieldDescription` could have a `is_separator()` to know when an element (and its bit range) indicates the end of part of an instruction?
+
+for the most part, things work great. `yaxpeax-x86` had a minor performance regression. tracking it down wasn't too bad: the first one was because `sink` is a fifth argument for a non-inlined function. at this point most ABIs start spilling to memory. so an unused `sink` caused an extra stack write. this was a measurable overhead. the second regression was again pretty simple looking at `disas-bench` builds:
+
+```sh
+diff \
+  ` # a typical non-formatting build, from cratesio yaxpeax-x86 1.0.4 ` \
+  <(objdump -d bench-yaxpeax-no-fmt | grep -o ' .*long_mode.*>:')
+  ` # a non-formatting build, from the local patch of yaxpeax-x86 with annotation reported to a no-op sink ` \
+  <(objdump -d bench-yaxpeax-no-fmt-no-annotation | grep -o ' .*long_mode.*>:')
+```
+
+the entire diff output:
+```diff
+>  <_ZN11yaxpeax_x869long_mode8read_sib17hdc339ef7a182098aE>:
+```
+
+indeed, [`read_sib`](https://github.com/iximeow/yaxpeax-x86/blob/4371ed02ac30cb56ec4ddbf60c87e85c183d860b/src/long_mode/mod.rs#L5769-L5770) is not written as `inline(always)`, so it's possible this might not get inlined sometimes. since the only difference to `read_sib` is an extra parameter, for which all calls are no-ops that ignore arguments, i'm surprised to see the change, anyway. adding `#[inline(always)]` to `read_sib` returned `yaxpeax-x86` to "same-as-before" decode throughput.
+
+in the process, i found a slight optimization for `read_sib` that removed a few extra branches from the function. the scrutiny was good after all.
+
+### conclusion
+
+in summary, it works. it doesn't slow down callers that don't need spans of information. decoders can implement it optionally and at their leisure, without being ineligible for analysis-oriented libraries.
+
+this is almost certainly going to be in `yaxpeax-arch 0.2.6` with implementations trickling into decoders whenever it seems like fun.
diff --git a/src/lib.rs b/src/lib.rs
index edc0742..0fe090e 100644
--- a/src/lib.rs
+++ b/src/lib.rs
@@ -124,6 +124,15 @@ impl DecodeError for StandardPartialDecoderError {
     }
 }
 
+#[derive(Copy, Clone)]
+struct NoDescription {}
+
+impl fmt::Display for NoDescription {
+    fn fmt(&self, _: &mut fmt::Formatter) -> fmt::Result {
+        Ok(())
+    }
+}
+
 /// an interface to decode [`Arch::Instruction`] words from a reader of [`Arch::Word`]s. errors are
 /// the architecture-defined [`DecodeError`] implemention.
 pub trait Decoder<A: Arch + ?Sized> {
@@ -146,6 +155,63 @@ pub trait Decoder<A: Arch + ?Sized> {
     fn decode_into<T: Reader<A::Address, A::Word>>(&self, inst: &mut A::Instruction, words: &mut T) -> Result<(), A::DecodeError>;
 }
 
+/// implementors of `DescriptionSink` receive descriptions of an instruction's disassembly process
+/// and relevant offsets in the bitstream being decoded. descriptions are archtecture-specific, and
+/// architectures are expected to be able to turn the bit-level `start` and `width` values into a
+/// meaningful description of bits in the original instruction stream.
+pub trait DescriptionSink<Descriptor> {
+    /// inform this `DescriptionSink` of a `description` that was informed by bits `start` to
+    /// `end` from the start of an instruction's decoding. `start` and `end` are only relative the
+    /// instruction being decoded when this sink `DescriptionSink` provided, so they will have no
+    /// relation to the position in an underlying data stream used for past or future instructions.
+    fn record(&mut self, start: u32, end: u32, description: Descriptor);
+}
+
+pub struct NullSink;
+
+impl<T> DescriptionSink<T> for NullSink {
+    fn record(&mut self, _start: u32, _end: u32, _description: T) { }
+}
+
+#[cfg(feature = "std")]
+pub struct VecSink<T: Clone + Display> {
+    pub records: std::vec::Vec<(u32, u32, T)>
+}
+
+#[cfg(feature = "std")]
+impl<T: Clone + Display> VecSink<T> {
+    pub fn new() -> Self {
+        VecSink { records: std::vec::Vec::new() }
+    }
+}
+
+#[cfg(feature = "std")]
+impl<T: Clone + Display> DescriptionSink<T> for VecSink<T> {
+    fn record(&mut self, start: u32, end: u32, description: T) {
+        self.records.push((start, end, description));
+    }
+}
+
+pub trait FieldDescription {
+    fn id(&self) -> u32;
+    fn is_separator(&self) -> bool;
+}
+
+/// an interface to decode [`Arch::Instruction`] words from a reader of [`Arch::Word`]s, with the
+/// decoder able to report descriptions of bits or fields in the instruction to a sink implementing
+/// [`DescriptionSink`]. the sink may be [`NullSink`] which discards provided data. decoding with a
+/// `NullSink` should behave identically to `Decoder::decode_into`. implementors are recommended to
+/// implement `Decoder::decode_into` as a call to `AnnotatingDecoder::decode_with_fields` if
+/// implementing both traits.
+pub trait AnnotatingDecoder<A: Arch + ?Sized> {
+    type FieldDescription: FieldDescription + Clone + Display + PartialEq;
+
+    fn decode_with_annotation<
+        T: Reader<A::Address, A::Word>,
+        S: DescriptionSink<Self::FieldDescription>
+    >(&self, inst: &mut A::Instruction, words: &mut T, sink: &mut S) -> Result<(), A::DecodeError>;
+}
+
 #[cfg(feature = "use-serde")]
 pub trait AddressBounds: Address + Debug + Hash + PartialEq + Eq + Serialize + for<'de> Deserialize<'de> {}
 #[cfg(not(feature = "use-serde"))]
author	iximeow <me@iximeow.net>	2021-08-06 00:23:46 -0700
committer	iximeow <me@iximeow.net>	2021-08-13 22:14:08 -0700
commit	b85a987a56602c7733e3f8a86990a33644a97bf5 (patch)
tree	d42f582f84095341de1b1917ec8e1e1b463a49fa
parent	6390ab5811143671058407681aa58e18a6042836 (diff)