Diffstat (limited to 'source/notes/pic-mcu/pickit2/pickit2_firmware_notes.md')
-rw-r--r-- | source/notes/pic-mcu/pickit2/pickit2_firmware_notes.md | 311 |
1 files changed, 311 insertions, 0 deletions
diff --git a/source/notes/pic-mcu/pickit2/pickit2_firmware_notes.md b/source/notes/pic-mcu/pickit2/pickit2_firmware_notes.md
new file mode 100644
index 0000000..f0c99aa
--- /dev/null
+++ b/source/notes/pic-mcu/pickit2/pickit2_firmware_notes.md
@@ -0,0 +1,311 @@
+# PICKit2 Firmware
+
+Trying to make sense of [pk2cmd scripts](pk2cmd/pk2cmd_notes.html#scripts) led me to reverse engineer some of the firmware on the programmer, so here we are.
+
+## Recon
+
+The PICKit2 consists of a `PIC18F2550` and some supporting logic to program devices over ICSP. I have the `TODO: canakit, images` programmer which includes a ZIF socket, but seems to be functionally equivalent to connecting the same ICSP pins to the programmer.
+
+Hardware-wise, it's worth noting the LEDs are driven directly by the controller on the programmer, and of course that the USB connector is wired to the `PIC18F2550` as well. This is informative because in addition to programming and handling pk2cmd scripts, the microcontroller is responsible for USB traffic as well.
+
+## The Good Stuff
+
+The [firmware](PK2V023200.hex) that is freely available is distributed as an Intel HEX format file. An additional annoyance: Radare2 doesn't seem to support the PIC18F, PIC24, or PIC16F architectures I expect to see.
+
+So I wrote [an interactive disassembler](`TODO link to pydare`) to help keep track of notes as I walk through this firmware.
+
+Starting with the first few bytes of the firmware is as good as anywhere. As PIC18 instructions:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 16 @ 0' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+`goto`s are interesting, as are the `BAD` - those are bytes that don't map to any PIC18 instruction.
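For reference, `goto` on the PIC18 is a two-word instruction, and the target is split across both words. A minimal sketch of decoding one by hand, independent of pydare. The function names are mine, and the example bytes are constructed by hand so that they land on `0xf0a`; they are not copied out of the HEX file.

```python
def words_le(data):
    """Group raw bytes into little-endian 16-bit instruction words."""
    return [data[i] | (data[i + 1] << 8) for i in range(0, len(data) - 1, 2)]


def decode_goto(first_word, second_word):
    """Return the byte address targeted by a two-word PIC18 `goto`."""
    # First word: 1110 1111 kkkk kkkk, carrying the low 8 bits of k.
    if (first_word & 0xFF00) != 0xEF00:
        raise ValueError("first word is not a goto")
    # Second word: 1111 kkkk kkkk kkkk, carrying the high 12 bits of k.
    if (second_word & 0xF000) != 0xF000:
        raise ValueError("second word is not a goto operand word")
    k = ((second_word & 0x0FFF) << 8) | (first_word & 0x00FF)
    # k counts 16-bit words, program memory is byte-addressed.
    return k << 1


# Hand-built example bytes (an assumption for illustration, not firmware data).
example = bytes([0x85, 0xEF, 0x07, 0xF0])
w0, w1 = words_le(example)
print(hex(decode_goto(w0, w1)))  # 0xf0a
```

The second word has its top nibble set to `1111`, which on its own executes as a `nop`, so a disassembler that starts in the middle of a `goto` will not trip over it.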
+The actual bytes there:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'px 32 @ 0' 'q'
+</pre></div>
+
+These can be compared with the memory map in the datasheet for the `PIC18F2550`, reproduced partially in the table below:
+```
+---------------------------------------
+0000h | Reset Vector
+0002h | -
+0004h | -
+0006h | -
+0008h | High Priority Interrupt Vector
+000ah | -
+..
+0018h | Low Priority Interrupt Vector
+---------------------------------------
+```
+
+Following the goto at the reset vector leads to more setup code:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 8 @ 0xf0a' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This loads [File Select Registers](../pic18.html#pic18f2550_fsr) 1 and 2 to defaults of 0x300, clears the table pointer, clears a bit, and calls some initialization functions.
+
+Continuing through initialization functions from here seemed problematic, and it wasn't immediately obvious how this even leads to reading USB traffic from pk2cmd.
+
+So while this might eventually lead to making sense of the firmware, trying to find USB accesses seems like a faster path to the script processing logic. The firmware is small, so disassembling the whole thing and searching for relevant SFRs and instructions is quite tractable. From there it might be easy to reach the script interpreter, since that should be adjacent to the USB request handling logic.
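A rough sketch of what such a search could look like on raw instruction words, without a full disassembler. It only recognises the bit-oriented instructions (`btg`, `bsf`, `bcf`, `btfss`, `btfsc`) in access-bank mode. The `0x68` low byte used for `UIR` is my reading of the SFR map (i.e. `0xF68`), so treat it as an assumption to check against the datasheet; the function name and the sample words are made up for illustration.

```python
# Top nibble of the instruction word -> mnemonic, for bit-oriented operations.
# Layout: oooo bbba ffff ffff (b = bit number, a = access-bank flag, f = file).
BIT_OPS = {0x7: "btg", 0x8: "bsf", 0x9: "bcf", 0xA: "btfss", 0xB: "btfsc"}


def find_access_bank_bit_ops(words, base_addr, sfr_low):
    """List (address, mnemonic, bit) for bit ops on one access-bank register."""
    hits = []
    for i, word in enumerate(words):
        mnemonic = BIT_OPS.get(word >> 12)
        if mnemonic is None:
            continue
        uses_access_bank = ((word >> 8) & 1) == 0  # a = 0 selects the access bank
        if uses_access_bank and (word & 0xFF) == sfr_low:
            hits.append((base_addr + 2 * i, mnemonic, (word >> 9) & 0x7))
    return hits


# Sample words (hypothetical): movlw 0x68, btfss 0x68,0, nop, bcf 0x68,3
sample = [0x0E68, 0xA068, 0x0000, 0x9668]
for addr, mnemonic, bit in find_access_bank_bit_ops(sample, 0xB2A, 0x68):
    print(f"{addr:#06x}: {mnemonic} bit {bit}")
```

Scanning word by word like this will also match data bytes that merely look like instructions, so hits still need a manual look.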
+
+Searching for USB-relevant SFRs such as
+[UIR](../pic18.html#pic18f2550_fsr_uir),
+[UIE](../pic18.html#pic18f2550_fsr_uie),
+or [UCON](../pic18.html#pic18f2550_fsr_ucon) leads to the code around 0xb2a (notes included for reference):
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 50 @ 0xb2a' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+In this code `UIE` and `UIR` are both tested to ensure that if an interrupt is seen by `UIR` it's also enabled in `UIE`. Then a call is made to the corresponding handler for the USB event that was seen. These handlers are all reasonable and mostly short. They also don't seem to provide much additional information on how data gets to the script interpreter, or where the script interpreter is.
+
+In the process, this leads to some USB controller setup, which is still interesting:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 50 @ urstif_handler' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This is only really interesting for knowing how the USB controller is configured. Nothing here seems relevant to the script interpreter or the code that stores and indexes scripts.
+
+Having failed to find it by following the USB processing logic, the next thought is to just look for what might be dispatch tables associated with the contiguous blocks of script command opcodes.
+
+As luck would have it, there are three sequences of unconditional branches at these addresses:
+* `0x2334`
+* `0x3b0a`
+* `0x429e`
+
+These are good to keep in mind, and there's some logic that updates PC around them, which is a good sign these are in fact switch tables.
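Runs like these can be found mechanically. A minimal sketch, assuming the tables are built from single-word `bra` instructions (encoding `1101 0nnn nnnn nnnn`, with a signed 11-bit word offset); if they turn out to be two-word `goto`s the matching would need to change. Function names, the `min_len` threshold, and the sample words are all mine.

```python
def is_bra(word):
    """True if the 16-bit word encodes an unconditional relative branch."""
    return (word & 0xF800) == 0xD000


def bra_target(addr, word):
    """Byte address a `bra` located at `addr` jumps to."""
    offset = word & 0x07FF
    if offset & 0x0400:       # sign-extend the 11-bit offset
        offset -= 0x0800
    return addr + 2 + 2 * offset


def find_bra_runs(words, base_addr=0, min_len=4):
    """Return (start_address, length) for each run of consecutive `bra` words."""
    runs = []
    start = None
    # A trailing 0x0000 (nop) acts as a sentinel so a run at the very end is closed.
    for i, word in enumerate(list(words) + [0x0000]):
        if is_bra(word):
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((base_addr + 2 * start, i - start))
            start = None
    return runs


# Hypothetical word stream: one nop, four branches, one unrelated word.
sample = [0x0000, 0xD001, 0xD002, 0xD7FF, 0xD000, 0x0012]
print(find_bra_runs(sample, base_addr=0x100))
```

A long run is only a hint: the stronger evidence is still the nearby code that adds a computed offset to PC.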
+
+Also seen, and possibly handy later, is bit-bang-looking logic around `0x525c`, through `LATA` bit 3 and something to do with `TRISA` bit 4:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 40 @ 0x525c' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+With some good leads, it should be easy to correlate these tables with script command implementations as a sanity check. The simplest script commands seem like `SCMD_BUSY_LED_OFF` and `SCMD_BUSY_LED_ON`, which are `0xf4` and `0xf5` respectively. Following traces on the board, these should just map to setting a bit in the appropriate latch high or low.
+
+There are in fact several branch targets that seem like they would fill this role: everything between `0x3b60` and `0x3b92`:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 26 @ 0x3b60' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+From looking at the board, it looks like the busy LED is pin 4 in `PORTB`. Searching `PORTB` yields a lot of `btfss` and `btfsc`, but searching `LATB` turns up only `bsf` and `bcf`. Further refining to `LATB, 4` yields 12 possibly interesting instructions.
+
+Because these operations would be the result of command dispatch, and are adjacent codes, the implementations are probably close together.
The only close pairs:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 6 @ 0x26ca' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+and
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 4 @ 0x3b60' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+Reading further around the jump tables, I noticed some interesting constants before the first jump table:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 7 @ 0x22ae' 'q' | aha --no-header --stylesheet
+</pre></div>
+This is immediately eye-catching on account of 0x42 being `FWCMD_ENTER_BOOTLOADER`.
+The subsequent `xorlw #0x34` functionally results in `W == W_original ^ 0x42 ^ 0x34`. Because xor is associative that can be read as `W == W_original ^ (0x42 ^ 0x34)`, or `W == W_original ^ (0x76)`. `0x76` is, itself, also interesting, as that's the opcode for `FWCMD_FIRMWARE_VERSION`.
+The third xor at `0x22b6` is equivalent to `W == W_original ^ (0x42 ^ 0x34 ^ 0x2c)` aka `W == W_original ^ (0x5a)`, since `0x76 ^ 0x2c == 0x5a`. `0x5a` maps to `FWCMD_NO_OPERATION`, so at this point this is definitely the start of the command interpreter loop.
+
+At this point there's something to closely inspect, so it's worth investigating a structure seen many times in this firmware:
+
+<div class="codebox"><pre>
+; a call target
+0x222c: movff FSR2L, POSTINC1
+0x2230: movff FSR1L, FSR2L
+; ^ what's up with this and why?
+; and at the return... (different function)
+0x7532: movf POSTDEC1, POSTDEC1
+0x7534: movff INDF1, FSR2L
+0x7536: return
+</pre></div>
+
+So at function entry the code stores the previous FSR2L to `[FSR1++]`, then `FSR1 => FSR2`. Because the PIC18F2550 has no general-purpose data stack (only a hardware return-address stack), this seems like a decent approximation of the same context-saving behavior. And of course, at return, context is restored.
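To convince myself the entry and exit sequences really balance, here is a toy model of them. It is a sketch of my reading of the four instructions above, not of anything else the compiler does: `FSR1` is treated as an upward-growing stack pointer, `FSR2` as a frame pointer, and only the low byte of `FSR2` is saved because that is all the quoted code touches. The class and method names are made up.

```python
class SoftStack:
    """Toy model of the FSR1/FSR2 software stack convention."""

    def __init__(self, base=0x300):
        self.ram = {}
        self.fsr1 = base  # stack pointer, grows upward
        self.fsr2 = base  # frame pointer

    def prologue(self):
        # movff FSR2L, POSTINC1 : save the caller's frame pointer low byte, bump FSR1
        self.ram[self.fsr1] = self.fsr2 & 0xFF
        self.fsr1 += 1
        # movff FSR1L, FSR2L : the new frame starts at the current stack top
        self.fsr2 = (self.fsr2 & 0xFF00) | (self.fsr1 & 0xFF)

    def epilogue(self):
        # movf POSTDEC1 : the value read is discarded, only the decrement matters
        self.fsr1 -= 1
        # movff INDF1, FSR2L : restore the caller's frame pointer low byte
        self.fsr2 = (self.fsr2 & 0xFF00) | self.ram[self.fsr1]


stack = SoftStack()
stack.prologue()   # outer call
stack.prologue()   # nested call
stack.epilogue()   # nested return
stack.epilogue()   # outer return
print(hex(stack.fsr1), hex(stack.fsr2))  # 0x300 0x300
```

The `0x300` base matches the defaults the reset code loads into both registers.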
+
+This provides some illumination on another interesting idiom:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 6 @ 0x22d2' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+Again, FSR1 is being used akin to a stack to pass parameters.
+
+`setf POSTINC1` stores `0xFF` through `FSR1` and increments; the subsequent `movlw #0x4; movwf POSTINC1` provides a second argument, which aligns with the two `movf POSTDEC1` after the call (undoing the increments that provided the parameters).
+
+Returning to the jump table-style code, register 0x54 is compared with 0xb9 (in the range of opcodes, though well into `SCMD_` space), and if the opcode is less than 0xb9, a branch is taken to `0x2306`:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 30 @ 0x2306' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+At `0x2306` the opcode is turned into an offset into the subsequent jump table and added to PC, implementing the switch by opcode. At this point it's pretty reasonable to add labels for the cases to indicate what command they correspond to. Out of an abundance of caution though, it's worthwhile verifying at least a simple command.
+
+`FWCMD_RESET` gives us exactly that:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 3 @ FWCMD_RESET' 'q' | aha --no-header --stylesheet
+</pre></div>
+This case is the "reset" instruction plus logic to advance to the next opcode in the script. Of course, that never gets reached because the chip has reset, but that's a clear artifact of compilation.
+
+#### FWCMD instructions to revisit
+* `FWCMD_EXECUTE_SCRIPT`: this is relevant for programmer operation. all operations seem to be scripts.
+* `FWCMD_RUN_SCRIPT`: how is this different from "execute"?
+* `FWCMD_DOWNLOAD_SCRIPT`: this is the command to download a script to the programmer. can we read script memory? how many scripts can there be? what's their size?
+* `FWCMD_DOWNLOAD_DATA`: seems interesting. download to programmer or to target?
+* `FWCMD_UPLOAD_DATA`: same as above
+* `FWCMD_END_OF_BUFFER`: might be interesting to know later
+
+Will come back to these in the future, now that there's a plausible starting point to find their logic.
+
+With `FWCMD` addressed, it's still not clear how `SCMD` commands are dispatched, which is probably a different table. Time to keep looking. Onward to the table around `0x3b0a`...
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 58 @ 0x3aec' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+At `0x3aec` the opcode (again in register `0x54`) is compared against `0xd5`. If the opcode is that or above, `0x3af0` branches to a different but similar switch on opcode value. Since it's then safe to assume the entries are in order from command `0xd5` up, script command names apply cleanly to each branch:
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 58 @ 0x3aec' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+If instead the opcode was below `0xd5`, the branch at `0x3af2` kicks in and takes us to...
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 54 @ 0x3af2' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+Again, very similar structure to other tables, but starts with `0xb3`. `0xb3` is not in the header file and is unknown, but the rest can be filled in.
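Filling in the names is mechanical once the first opcode of a table is known. A small sketch of the bookkeeping, assuming one fixed-size entry per consecutive opcode. The table addresses in the example are placeholders rather than the real ones, `entry_size=2` assumes single-word branches, and the function name is mine; only the two LED opcodes come from the notes above.

```python
def label_table(table_addr, first_opcode, names, count, entry_size=2):
    """Map each jump-table entry address to the command name for its opcode."""
    labels = {}
    for i in range(count):
        opcode = first_opcode + i
        # Opcodes missing from the header get a placeholder name.
        labels[table_addr + i * entry_size] = names.get(opcode, f"unknown_{opcode:#04x}")
    return labels


# Opcode values taken from the notes; the 0x1000 table address is hypothetical.
names = {0xF4: "SCMD_BUSY_LED_OFF", 0xF5: "SCMD_BUSY_LED_ON"}
for addr, name in label_table(0x1000, 0xF4, names, 2).items():
    print(f"{addr:#06x}: {name}")

# An opcode absent from the header, like 0xb3, falls back to a placeholder.
print(label_table(0x2000, 0xB3, {}, 1))
```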
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 54 @ 0x3af2' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+(note: at this point as i was keymashing i accidentally hit ctrl+c, which my script doesn't handle, and lost three hours of notes give or take. largely looking through the interpreter cases, and easy enough to rebuild, but beware sigint)
+
+With the script command dispatcher found and marked up, it's time to return to the original questions:
+* Which script opcodes take how many parameters?
+* How does programming actually take place?
+
+## Script Opcodes
+Easiest to start with obvious opcodes where the intended functionality is straightforward, to find unknown details about implementation. `SCMD_NOP24` is an obvious candidate here: functionally it *probably* does nothing, but does likely increment some pointers on the programmer side (current script opcode pointer, if such a thing exists), and possibly on the device side (to cause the target device to execute a no-op).
+
+### `SCMD_NOP24`
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 25 @ 0x4112' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+`SCMD_NOP24` makes four calls to the same function, with parameters that vary:
+* `0x5066(4, 0)`
+* `0x5066(8, 0)`
+* `0x5066(8, 0)`
+* `0x5066(8, 0)`
+
+The sequence of three calls with `(8, 0)` probably corresponds to the bytes `00 00 00`, which decode under PIC24 to a NOP. So this seems sane! The remaining question, if this is true, is why there is a call with arguments `(4, 0)`.
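One way to make that hypothesis concrete is to treat each call as "shift out `count` bits of `value`" and see what the four calls add up to. This is only a working model: that the first argument is a bit count, and that bits go out least-significant first, are both assumptions at this point, and the helper name is mine. For an all-zero payload the bit order makes no difference anyway.

```python
def bits_lsb_first(count, value):
    """Bits of `value` in the order they would be shifted out, LSB first."""
    return [(value >> i) & 1 for i in range(count)]


# The four calls made by SCMD_NOP24, as (count, value) pairs.
calls = [(4, 0), (8, 0), (8, 0), (8, 0)]
stream = [bit for count, value in calls for bit in bits_lsb_first(count, value)]
print(len(stream), set(stream))  # 28 {0}
```

So under this model the command clocks out 28 zero bits: 24 for the three payload bytes, plus four more that still need explaining.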
+
+The answer to that is a little further, at `0x5066`:
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 53 @ 0x5066' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This function reads two parameters, stores the second parameter to `0x54`, and the first to `0x55`. As the first parameter for these calls is either 8 or 4, and the latter parameter is all 0's, this continues to make sense.
+
+The function then tests if the value at `0x2e1` is greater than `0x2` and branches below if so.
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 12 @ 0x5084' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This sequence of instructions is responsible for putting pin 2 of latch A into a high or low state. Bit 0 in `0x54` can only be 1 or 0, so exactly one of `btfss` or `btfsc` will result in a fallthrough. As a result, only one of `bcf LATA, 2; bsf LATA, 2` will be executed. Afterward comes `nop; bsf LATA, 3; nop; bcf LATA, 3`, which raises, waits, then lowers another GPIO pin. This makes sense given that the programmer is responsible for directly driving clock signals on the remote chip in programming modes, so this is the logic that actually sends data to the target chip, one bit at a time.
+
+There is another region of this function that's similar save for one change. If the value in `0x2e1` was above 2, a different bit banging loop is reached, down at `0x509c`:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 20 @ 0x509c' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This loop is the same as above, as far as setting the data pin (LATA bit 2). The difference is that before alternating the clock three nops are used (rather than the earlier one nop), and those nops are executed in a loop.
The loop repeats `[0x2e1]` times and functionally is the same bitbanging loop with longer periods between clock cycles. This may be relevant for some target devices that have different tolerances in serial programming, and makes sense to keep as a single "send these bits" abstraction.
+
+Back to the question of the two arguments: while one is sent one bit at a time, the other seems to count the number of bits to send:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 2 @ 0x50c8' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+This function is noted down as `tx_bits` for later reference.
+
+So `(4, 0)` sends the bits `0000`, while `(8, 0)` sends `00000000`. Taken together `SCMD_NOP24` sends `0000b` followed by 24 zero bits (`000000h`), 28 bits in all, and inferring from the PIC18 ICSP guide a leading `0000` prefix should indicate the following word is an instruction, yielding the expected `NOP` on the target device.
+
+### `SCMD_COREINST24`
+
+Moving on to `SCMD_COREINST24`, which functionally should send three user-specified bytes to the target...
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 2 @ 0x41b2' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+As expected, this calls the same transmit function, and passes 4 and 8 in the same places.
The major difference is how the second parameter is provided, where this version involves a bit of an incantation:
+
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 14 @ 0x41c6' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+Since the three bytes for a PIC24 instruction are part of the script (in fact, the three bytes after the `SCMD_COREINST24`), it stands to reason that the value passed is read out of the script buffer. Tugging this thread should lead to the script buffer and eventually to where it's selected. Interestingly, the logic appears to be:
+* read the current script buffer offset (through `INDF2` at `0x41c6`)
+* increment the offset (`0x41c8`)
+* load the pointer to the script array (`0x41cc`-`0x41d4`) into `FSR0` - this comes from an argument!
+* add the offset into the pointer (`0x41d8`)
+* read that address (`0x41e0`)
+* "push" the parameter (`0x41e2`)
+
+This confirms that we can track `INDF2` modification to determine the number of parameters for the various commands!
+
+The first unknown script commands of interest: `SCMD_WRITE_BUFBYTE_W` and `SCMD_WRITE_BUFWORD_W`
+
+### `SCMD_WRITE_BUFWORD_W`
+
+```
+TODO: SCMD_WRITE_BUFWORD_W
+```
+
+This handler involves the same transmit function from earlier and also another function, `0x5006`.
+
+Guessing from context, it appears to read from a global address and return the byte in W. `SCMD_WRITE_BUFWORD_W` then uses that as bits to send down the wire.
+
+### `SCMD_WRITE_BUFBYTE_W`
+
+`SCMD_WRITE_BUFBYTE_W` does the same, but only reads one byte; the second value transmitted is a 0.
+
+### Miscellaneous Script Commands
+
+`SCMD_WRITE_BITS_LITERAL` takes two parameters, the number of bits and the pattern to send.
+`SCMD_WRITE_BYTE_LITERAL` takes one parameter, the pattern, and assumes the number of bits to send is 8.
+TODO: `SCMD_RD2_BYTE_BUFFER`.
+TODO: `SCMD_READ_BYTE`.
+TODO: `SCMD_VISI24`. (transmits (4, 1), (8, 0), then some unknown calls.)
+
+### `SCMD_DELAY_LONG`
+
+Passes the single byte parameter along to the function at `0x4966`, which...
+<div class="codebox"><pre>
+#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 11 @ 0x4966' 'q' | aha --no-header --stylesheet
+</pre></div>
+
+At `0x496e` the flag indicating `timer0` has fired is cleared (since it may have in the past), followed by multiplying the argument by `0xff` and loading the low byte of the result into `tmr0h`? Modulo 256, multiplying by `0xff` is the same as negating (for example `3 * 0xff = 0x2fd`, whose low byte `0xfd` is `256 - 3`), so this really has no effect other than to negate the argument before setting the timer, presumably because the timer counts up and fires on overflow. From there, the timer is enabled and the controller spins until the timer fires (`0x497e`).
+
+`delay`. Dunno what else to have expected.
+
+### `SCMD_LOOP`
+
+TODO, eventually.