<feed xmlns='http://www.w3.org/2005/Atom'>
<title>yaxpeax-x86/src/safer_unchecked.rs, branch 2.1.1</title>
<subtitle>yaxpeax x86 decoder</subtitle>
<link rel='alternate' type='text/html' href='http://git.iximeow.net/yaxpeax-x86/'/>
<entry>
<title>remove yaxpeax-x86 safer_unchecked.rs, it is now in yaxpeax-arch</title>
<updated>2024-06-23T22:43:54+00:00</updated>
<author>
<name>iximeow</name>
<email>me@iximeow.net</email>
</author>
<published>2024-06-23T22:41:40+00:00</published>
<link rel='alternate' type='text/html' href='http://git.iximeow.net/yaxpeax-x86/commit/?id=09dcfca94240b6c18fbaa1186781dac0d436e500'/>
<id>09dcfca94240b6c18fbaa1186781dac0d436e500</id>
<content type='text'>
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
</pre>
</div>
</content>
</entry>
<entry>
<title>explicit inline annotations for kinda_uncheckeds</title>
<updated>2022-01-02T22:40:07+00:00</updated>
<author>
<name>iximeow</name>
<email>me@iximeow.net</email>
</author>
<published>2022-01-02T22:36:15+00:00</published>
<link rel='alternate' type='text/html' href='http://git.iximeow.net/yaxpeax-x86/commit/?id=3df0790c898d480eda6a906cbc9a3d3d6749a140'/>
<id>3df0790c898d480eda6a906cbc9a3d3d6749a140</id>
<content type='text'>
unfortunately something about the wrapper functions adjusts codegen even when
the wrapper functions themselves are just calls to inner functions. the in-tree
benchmark (known to not be comprehensive, but enough to spot a difference)
showed a ~3.5% regression in throughput with the prior commit, even though it
doesn't change behavior at all.

explicit #[inline(always)] gets things to a state where the wrapper functions
do not penalize performance. for an example of the differences in codegen, see
below.

before:
```
&lt;    141d4:	48 39 fa             	cmp    %rdi,%rdx
&lt;    141d7:	0f 84 0b 4d 00 00    	je     18ee8 &lt;_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4e58&gt;
&lt;    141dd:	0f b6 0f             	movzbl (%rdi),%ecx
&lt;    141e0:	48 83 c7 01          	add    $0x1,%rdi
&lt;    141e4:	48 89 7c 24 38       	mov    %rdi,0x38(%rsp)
... snip ...
```

after:
```
&gt;    141d4:	48 39 ea             	cmp    %rbp,%rdx
&gt;    141d7:	0f 84 97 4c 00 00    	je     18e74 &lt;_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4de4&gt;
&gt;    141dd:	0f b6 4d 00          	movzbl 0x0(%rbp),%ecx
&gt;    141e1:	48 83 c5 01          	add    $0x1,%rbp
&gt;    141e5:	48 89 6c 24 38       	mov    %rbp,0x38(%rsp)
... snip ...
```

there are several spans of code with this kind of change involved; there are no
explicit calls to `get_kinda_unchecked` or `unreachable_kinda_unchecked` but
clearly a difference did make it through to the benchmark's code.

while the choice of `rbp` instead of `rdi` wouldn't seem very interesting, the
instructions themselves are more substantially different: `0fb60f` vs
`0fb64d00`. to encode `[rbp + 0]`, the instruction requires a displacement,
and is one byte longer as a result. there are several instructions
affected this way, and i suspect the increased code size is what ended up
changing benchmark behavior.
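
as a rough illustration of the kind of wrapper discussed here (a hypothetical
sketch, not the code in safer_unchecked.rs; the function body and the
`scale_from_bits` caller are assumptions made for the example), a helper that
panics in debug builds and stays unchecked in release builds might look like:

```rust
// hypothetical sketch: the name mirrors the helper mentioned above, the body is assumed.
// debug builds panic loudly, release builds keep the unchecked fast path.
#[inline(always)]
unsafe fn unreachable_kinda_unchecked() -> ! {
    if cfg!(debug_assertions) {
        panic!("entered code that was promised to be unreachable");
    } else {
        // SAFETY: the caller guarantees this point is never reached.
        unsafe { core::hint::unreachable_unchecked() }
    }
}

// hypothetical caller: `bits % 4` is always in 0..=3, so the last arm is dead.
fn scale_from_bits(bits: u8) -> u8 {
    match bits % 4 {
        0 => 1,
        1 => 2,
        2 => 4,
        3 => 8,
        _ => unsafe { unreachable_kinda_unchecked() },
    }
}
```

with `#[inline(always)]` the wrapper is expected to fold into its caller, so a
release build should see only the inner unchecked hint.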

after adding these `#[inline(always)]` annotations, there is no difference in
generated code with or without the `kinda_unchecked` helpers!</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
unfortunately something about the wrapper functions adjusts codegen even when
the wrapper functions themselves are just calls to inner functions. the in-tree
benchmark (known to not be comprehensive, but enough to spot a difference)
showed a ~3.5% regression in throughput with the prior commit, even though it
doesn't change behavior at all.

explicit #[inline(always)] gets things to a state where the wrapper functions
do not penalize performance. for an example of the differences in codegen, see
below.

before:
```
&lt;    141d4:	48 39 fa             	cmp    %rdi,%rdx
&lt;    141d7:	0f 84 0b 4d 00 00    	je     18ee8 &lt;_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4e58&gt;
&lt;    141dd:	0f b6 0f             	movzbl (%rdi),%ecx
&lt;    141e0:	48 83 c7 01          	add    $0x1,%rdi
&lt;    141e4:	48 89 7c 24 38       	mov    %rdi,0x38(%rsp)
... snip ...
```

after:
```
&gt;    141d4:	48 39 ea             	cmp    %rbp,%rdx
&gt;    141d7:	0f 84 97 4c 00 00    	je     18e74 &lt;_ZN5bench16do_decode_swathe17h694154735739ce4cE+0x4de4&gt;
&gt;    141dd:	0f b6 4d 00          	movzbl 0x0(%rbp),%ecx
&gt;    141e1:	48 83 c5 01          	add    $0x1,%rbp
&gt;    141e5:	48 89 6c 24 38       	mov    %rbp,0x38(%rsp)
... snip ...
```

there are several spans of code with this kind of change involved; there are no
explicit calls to `get_kinda_unchecked` or `unreachable_kinda_unchecked` but
clearly a difference did make it through to the benchmark's code.

while the choice of `rbp` instead of `rdi` wouldn't seem very interesting, the
instructions themselves are more substantially different: `0fb60f` vs
`0fb64d00`. to encode `[rbp + 0]`, the instruction requires a displacement,
and is one byte longer as a result. there are several instructions
affected this way, and i suspect the increased code size is what ended up
changing benchmark behavior.

after adding these `#[inline(always)]` annotations, there is no difference in
generated code with or without the `kinda_unchecked` helpers!</pre>
</div>
</content>
</entry>
<entry>
<title>Wrap unsafe functions to catch errors in debug</title>
<updated>2022-01-02T22:40:07+00:00</updated>
<author>
<name>5225225</name>
<email>5225225@mailbox.org</email>
</author>
<published>2021-12-19T20:34:52+00:00</published>
<link rel='alternate' type='text/html' href='http://git.iximeow.net/yaxpeax-x86/commit/?id=dd1e281c85cb047c6a4a05a4af0314e064cba088'/>
<id>dd1e281c85cb047c6a4a05a4af0314e064cba088</id>
<content type='text'>
Closes https://github.com/iximeow/yaxpeax-x86/issues/16
</content>
<content type='xhtml'>
<div xmlns='http://www.w3.org/1999/xhtml'>
<pre>
Closes https://github.com/iximeow/yaxpeax-x86/issues/16
</pre>
</div>
</content>
</entry>
</feed>
