# PICKit2 Firmware

Trying to make sense of [pk2cmd scripts](pk2cmd/pk2cmd_notes.html#scripts) led me to needing to reverse engineer some of the firmware on the programmer, so here we are.

## Recon

The PICKit2 consists of a `PIC18F2550` and some supporting logic to program devices over ICSP. I have the `TODO: canakit, images` programmer, which includes a ZIF socket but seems to be functionally equivalent to connecting the same ICSP pins to the programmer.

Hardware-wise, it's worth noting the LEDs are driven directly by the controller on the programmer, and of course that the USB connector is wired to the `PIC18F2550` as well. This is informative because in addition to programming and handling pk2cmd scripts, the microcontroller is responsible for USB traffic as well.

## The Good Stuff

The [firmware](PK2V023200.hex) that is freely available is distributed as an Intel HEX format file. An additional annoyance: Radare2 doesn't seem to support the PIC18F, or the PIC24 and PIC16F I expect to see. So I wrote [an interactive disassembler](`TODO link to pydare`) to help keep track of notes as I walk through this firmware.

Starting with the first few bytes of the firmware is as good as anywhere. As PIC18 instructions:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 16 @ 0' 'q' | aha --no-header --stylesheet
`goto`s are interesting, as are the `BAD` - those are bytes that don't map to any PIC18 instruction. The actual bytes there:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'px 32 @ 0' 'q'
Compare these with the memory map in the datasheet for the `PIC18F2550`, reproduced partially in the table below:

```
---------------------------------------
0000h | Reset Vector
0002h | -
0004h | -
0006h | -
0008h | High Priority Interrupt Vector
000ah | -
..
0018h | Low Priority Interrupt Vector
---------------------------------------
```

Following the goto at the reset vector leads to more setup code:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 8 @ 0xf0a' 'q' | aha --no-header --stylesheet
This loads [File Select Registers](../pic18.html#pic18f2550_fsr) 1 and 2 with defaults of 0x300, clears the table pointer, clears a bit, and calls some initialization functions.

Continuing through the initialization functions from here seemed problematic, and it wasn't immediately obvious how this even leads to reading USB traffic from pk2cmd. So while this might eventually lead to making sense of the firmware, trying to find USB accesses seems like a faster path to the script processing logic. The firmware is small, so disassembling the whole thing and searching for relevant SFRs and instructions is quite tractable. From there it might be easy to reach the script interpreter, since that should be adjacent to the USB request handling logic.

Searching for USB-relevant SFRs like [UIR](../pic18.html#pic18f2550_fsr_uir), [UIE](../pic18.html#pic18f2550_fsr_uie), or [UCON](../pic18.html#pic18f2550_fsr_ucon) leads to the code around 0xb2a (notes included for reference):
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 50 @ 0xb2a' 'q' | aha --no-header --stylesheet
In this code `UIE` and `UIR` are both tested, to ensure that an interrupt flagged in `UIR` is also enabled in `UIE`. Then a call is made to the corresponding handler for the USB event that was seen. These handlers are all reasonable and mostly short. They also don't seem to provide much additional information on how data gets to the script interpreter, or where the script interpreter is.

In the process, this leads to some USB controller setup, which is still interesting:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 50 @ urstif_handler' 'q' | aha --no-header --stylesheet
This is only really interesting for knowing how the USB controller is configured. Nothing here seems to be relevant to the script interpreter or to the code that stores and indexes scripts.

Failing to find it by following the USB processing logic, the next thought is to just look for what might be dispatch tables associated with the contiguous blocks of script command opcodes. As luck would have it, there are three sequences of unconditional branches at these addresses:

* `0x2334`
* `0x3b0a`
* `0x429e`

These are good to keep in mind, and there's some logic that updates PC around them, which is a good sign these are in fact switch tables.

Also seen, and possibly handy later, is bit bang-looking logic around `0x525c`, through `LATA` 3 and something to do with `TRISA` 4:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 40 @ 0x525c' 'q' | aha --no-header --stylesheet
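The hunt for dispatch tables described above can be approximated mechanically. What follows is a minimal sketch, not what was actually run: it assumes the firmware has already been unpacked into a list of 16-bit instruction words, and that the tables are built from `bra` instructions (encoded `1101 0nnn nnnn nnnn`) rather than two-word `goto`s. The names `find_branch_runs` and `min_run` are made up for illustration.

```python
def is_bra(word):
    """True if a 16-bit PIC18 word encodes `bra` (1101 0nnn nnnn nnnn)."""
    return (word & 0xF800) == 0xD000


def find_branch_runs(words, base_addr=0, min_run=4):
    """Return (byte_address, length) for each run of >= min_run consecutive `bra` words."""
    runs = []
    start = None
    # A trailing nop (0x0000) acts as a sentinel so a run at the very end is flushed.
    for i, word in enumerate(list(words) + [0x0000]):
        if is_bra(word):
            if start is None:
                start = i
        elif start is not None:
            if i - start >= min_run:
                runs.append((base_addr + 2 * start, i - start))
            start = None
    return runs


# Made-up words for illustration: nop, four bra, return, one stray bra.
sample = [0x0000, 0xD001, 0xD002, 0xD7FF, 0xD003, 0x0012, 0xD000]
for addr, length in find_branch_runs(sample, base_addr=0x100):
    print(f"candidate table at {addr:#06x}, {length} entries")
```

Runs that sit directly after code adding an offset to PC are the likeliest switch tables; isolated `bra`s fall below `min_run` and are ignored.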
With some good leads, it should be easy to correlate these tables with script command implementations as a sanity check. The simplest script commands seem to be `SCMD_BUSY_LED_OFF` and `SCMD_BUSY_LED_ON`, which are `0xf4` and `0xf5` respectively. Following traces on the board, these should just map to setting a bit in the appropriate latch high or low. There are in fact several branch targets that seem like they would fill this role: everything between `0x3b60` and `0x3b92`:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 26 @ 0x3b60' 'q' | aha --no-header --stylesheet
From looking at the board, it looks like the busy LED is pin 4 in `PORTB`. Searching `PORTB` yields a lot of `btfss` and `btfsc`, but searching `LATB` is entirely `bsf` and `bcf`. Further refining to `LATB, 4` yields 12 possibly interesting instructions. Because these operations would be the result of command dispatch, and are adjacent codes, the implementations are probably close together. The only close pairs:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 6 @ 0x26ca' 'q' | aha --no-header --stylesheet
and
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 4 @ 0x3b60' 'q' | aha --no-header --stylesheet
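The `LATB, 4` search above boils down to matching two instruction words and keeping hits that sit near each other. A rough sketch follows, assuming access-bank addressing (so `LATB` at `0xF8A` is encoded as `0x8A`), the standard PIC18 `bsf`/`bcf` encodings, and a hypothetical `max_gap` threshold for what counts as close. `led_ops` and `close_pairs` are illustrative names, not part of pydare.

```python
LATB_ACCESS = 0x8A   # low byte of LATB (0xF8A) in access-bank encodings
BUSY_LED_BIT = 4     # busy LED pin, per the board tracing above

# bsf f,b,a = 1000 bbba ffff ffff ; bcf f,b,a = 1001 bbba ffff ffff (a = 0: access bank)
BSF_WORD = 0x8000 | (BUSY_LED_BIT << 9) | LATB_ACCESS
BCF_WORD = 0x9000 | (BUSY_LED_BIT << 9) | LATB_ACCESS


def led_ops(words, base_addr=0):
    """List (byte_address, mnemonic) for every bsf/bcf on LATB bit 4."""
    names = {BSF_WORD: "bsf", BCF_WORD: "bcf"}
    return [(base_addr + 2 * i, names[w]) for i, w in enumerate(words) if w in names]


def close_pairs(words, base_addr=0, max_gap=16):
    """Return address pairs of neighbouring hits that differ in kind and sit within max_gap bytes."""
    hits = led_ops(words, base_addr)
    pairs = []
    for (addr_a, op_a), (addr_b, op_b) in zip(hits, hits[1:]):
        if op_a != op_b and addr_b - addr_a <= max_gap:
            pairs.append((addr_a, addr_b))
    return pairs
```

Requiring the two hits to differ in kind (one set, one clear) is a design choice here: an on/off command pair should produce exactly that pattern, while two nearby `bsf`s are more likely unrelated housekeeping.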
Reading further around the jump tables, I noticed some interesting constants before the first jump table:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 7 @ 0x22ae' 'q' | aha --no-header --stylesheet
This is immediately eye-catching on account of 0x42 being `FWCMD_ENTER_BOOTLOADER`. The subsequent `xorlw #0x34` functionally results in `W == W_original ^ 0x42 ^ 0x34`. Because xor is associative that can be read as `W == W_original ^ (0x42 ^ 0x34)`, or `W == W_original ^ (0x76)`. W ends up zero after a given xor exactly when the original opcode equals the accumulated constant, so each xor amounts to an equality test against one more command. `0x76` is, itself, also interesting, as that's the opcode for `FWCMD_FIRMWARE_VERSION`. The third xor at `0x22b6` is equivalent to `W == W_original ^ (0x42 ^ 0x34 ^ 0x2c)` aka `W == W_original ^ (0x5a)`. `0x5a` maps to `FWCMD_NO_OPERATION`, so at this point this is definitely the start of the command interpreter loop.

At this point there's something to closely inspect, so it's worth investigating a structure seen many times in this firmware:

    ; a call target
    0x222c: movff FSR2L, POSTINC1
    0x2230: movff FSR1L, FSR2L
    ; ^ what's up with this and why?
    ; and at the return... (different function)
    0x7532: movf POSTDEC1, POSTDEC1
    0x7534: movff INDF1, FSR2L
    0x7536: return

So at function entry the code stores the previous FSR2L to `[FSR1++]`, then copies `FSR1 => FSR2`. Because the PIC18F2550 has only a hardware return-address stack and no data stack, this seems like a decent approximation of the same context-saving behavior, with FSR1 acting as a stack pointer and FSR2 as a frame pointer. And of course, at return, context is restored.

This provides some illumination on another interesting idiom:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 6 @ 0x22d2' 'q' | aha --no-header --stylesheet
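The FSR1/FSR2 bookkeeping is easier to follow with a toy model. This sketch assumes FSR1 acts as a stack pointer and FSR2 as a frame pointer, both starting at the 0x300 default seen in the reset code. The class and method names are invented for illustration, and only the low byte of FSR2 is saved, mirroring the `movff FSR2L, POSTINC1` entry sequence.

```python
class SoftStack:
    """Toy model of a software data stack built on FSR1 and FSR2."""

    def __init__(self, base=0x300):
        self.ram = bytearray(0x800)  # the PIC18F2550 has 2 KiB of data RAM
        self.fsr1 = base
        self.fsr2 = base

    def push(self, value):
        # Write through POSTINC1: store at [FSR1], then increment FSR1.
        self.ram[self.fsr1] = value & 0xFF
        self.fsr1 += 1

    def drop(self):
        # Read through POSTDEC1 and discard: only the decrement of FSR1 matters.
        self.fsr1 -= 1

    def enter(self):
        # movff FSR2L, POSTINC1 ; movff FSR1L, FSR2L
        self.push(self.fsr2 & 0xFF)
        self.fsr2 = (self.fsr2 & 0xFF00) | (self.fsr1 & 0xFF)

    def leave(self):
        # movf POSTDEC1 ; movff INDF1, FSR2L
        self.drop()
        self.fsr2 = (self.fsr2 & 0xFF00) | self.ram[self.fsr1]


# A caller pushes two bytes, calls a function, then undoes its own pushes.
stack = SoftStack()
stack.enter()                 # outer function prologue
stack.push(0xFF)
stack.push(0x04)
stack.enter()                 # callee prologue
stack.leave()                 # callee epilogue
stack.drop()
stack.drop()
stack.leave()                 # outer function epilogue
print(hex(stack.fsr1), hex(stack.fsr2))
```

After a balanced sequence like this both pointers return to where they started, which is the property the entry and exit sequences exist to maintain.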
Again, FSR1 is being used akin to a stack to pass parameters. `setf POSTINC1` stores `0xFF` through `FSR1` and increments; the subsequent `movlw #0x4; movwf POSTINC1` provides a second argument. That aligns with the two `movf POSTDEC1` after the call, which undo the increments made to provide parameters.

Returning to the jump table-style code, register 0x54 is compared with 0xb9 (in the range of opcodes, though well into `SCMD_` space), and if the opcode is less than 0xb9, a branch is taken to `0x2306`:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 30 @ 0x2306' 'q' | aha --no-header --stylesheet
At `0x2306` the opcode is turned into an offset into the subsequent jump table and added to PC, implementing the switch by opcode. At this point it's pretty reasonable to add labels for the cases to indicate what command they correspond to. Out of an abundance of caution though, it's worthwhile verifying at least a simple command. `FWCMD_RESET` gives us exactly that:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 3 @ FWCMD_RESET' 'q' | aha --no-header --stylesheet
This case is the "reset" instruction plus logic to advance to the next opcode in the script. Of course, that never gets reached because the chip has reset, but that's a clear artifact of compilation.

#### FWCMD instructions to revisit

* `FWCMD_EXECUTE_SCRIPT`: this is relevant for programmer operation. all operations seem to be scripts.
* `FWCMD_RUN_SCRIPT`: how is this different from "execute"?
* `FWCMD_DOWNLOAD_SCRIPT`: this is the command to download a script to the programmer. can we read script memory? how many scripts can there be? what's their size?
* `FWCMD_DOWNLOAD_DATA`: seems interesting. download to programmer or to target?
* `FWCMD_UPLOAD_DATA`: same as above
* `FWCMD_END_OF_BUFFER`: might be interesting to know later

Will come back to these in the future, now that there's a plausible starting point to find their logic.

With `FWCMD` addressed, it's still not clear how `SCMD` commands are dispatched, which is probably a different table. Time to keep looking. Onward to the table around `0x3b0a`...
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 58 @ 0x3aec' 'q' | aha --no-header --stylesheet
At `0x3aec` the opcode (again in register `0x54`) is compared against `0xd5`. If the opcode is that or above, `0x3af0` branches to a different but similar switch on opcode value. Since it's then safe to assume the entries are in order from command `0xd5` up, script command names apply cleanly to each branch:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 58 @ 0x3aec' 'q' | aha --no-header --stylesheet
If instead the opcode was below `0xd5`, the branch at `0x3af2` kicks in and takes us to...
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 54 @ 0x3af2' 'q' | aha --no-header --stylesheet
Again, very similar structure to other tables, but starts with `0xb3`. `0xb3` is not in the header file and is unknown, but the rest can be filled in.
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'loaddb pk2cmd-stuff/pk2cmd_firmware' 'd 54 @ 0x3af2' 'q' | aha --no-header --stylesheet
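Taken together, the script command dispatch looks like a two-level switch. Below is a minimal model of it, assuming each table entry is a single two-byte branch and that opcodes below `0xb3` are handled elsewhere (neither is confirmed above). `select_table` and `table_offset` are hypothetical names.

```python
SPLIT = 0xD5      # opcodes at or above this use the second table
LOW_BASE = 0xB3   # first opcode covered by the lower table


def select_table(opcode):
    """Return (table_name, index) for a script opcode, or None if neither table covers it."""
    if opcode >= SPLIT:
        return ("high", opcode - SPLIT)
    if opcode >= LOW_BASE:
        return ("low", opcode - LOW_BASE)
    return None  # assumption: anything lower is not a table-dispatched script command


def table_offset(index, entry_size=2):
    """Byte offset added to PC to land on the table entry (assumes fixed-size entries)."""
    return index * entry_size


for name, opcode in [("SCMD_BUSY_LED_OFF", 0xF4), ("SCMD_BUSY_LED_ON", 0xF5)]:
    table, index = select_table(opcode)
    print(f"{name}: {table} table, entry {index}, offset {table_offset(index):#x}")
```

Because the entries are in opcode order, labelling a table is just a matter of walking it and adding the base opcode back to each index.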
(note: at this point as i was keymashing i accidentally hit ctrl+c, which my script doesn't handle, and lost three hours of notes give or take. largely looking through the interpreter cases, and easy enough to rebuild, but beware sigint)

With the script command dispatcher found and marked up, it's time to return to the original questions:

* What script opcodes take how many parameters?
* How does programming actually take place?

## Script Opcodes

Easiest to start with obvious opcodes where the intended functionality is straightforward, to find unknown details about implementation. `SCMD_NOP24` is an obvious candidate here: functionally it *probably* does nothing, but does likely increment some pointers on the programmer side (current script opcode pointer, if such a thing exists), and possibly on the device side (to cause the target device to execute a no-op).

### `SCMD_NOP24`
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 25 @ 0x4112' 'q' | aha --no-header --stylesheet
`SCMD_NOP24` makes four calls to the same function, with parameters that vary:

* `0x5066(4, 0)`
* `0x5066(8, 0)`
* `0x5066(8, 0)`
* `0x5066(8, 0)`

The sequence of three calls with `(8, 0)` probably corresponds to the bytes `00 00 00`, which decode under PIC24 to a NOP. So this seems sane! The remaining question, if this is true, is why there is a call with arguments `(4, 0)`. The answer to that is a little further, at `0x5066`:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 53 @ 0x5066' 'q' | aha --no-header --stylesheet
This function reads two parameters, stores the second parameter to `0x54`, and the first to `0x55`. As the first parameter for these calls is either 8 or 4, and the second is all 0's, this continues to make sense. The function then tests whether the value at `0x2e1` is greater than `0x2` and, if so, branches further down. Otherwise it falls through to:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 12 @ 0x5084' 'q' | aha --no-header --stylesheet
This sequence of instructions is responsible for putting pin 2 of latch A into a high or low state. Bit 0 in `0x54` can only be 1 or 0, so exactly one of `btfss` or `btfsc` will result in a fallthrough. As a result, only one of `bcf LATA, 2; bsf LATA, 2` will be executed. Afterward comes `nop; bsf LATA, 3; nop; bcf LATA, 3`, which raises, waits, then lowers another GPIO pin. This makes sense given that the programmer is responsible for directly driving clock signals on the remote chip in programming modes, so this is the logic that actually sends data to the target chip, one bit at a time.

There is another region of this function that's similar save for one change. If the value in `0x2e1` was above 2, a different bit banging loop is reached, down at `0x509c`:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 20 @ 0x509c' 'q' | aha --no-header --stylesheet
This loop is the same as above, as far as setting the data pin (LATA bit 2). The difference is that before alternating the clock three nops are used (rather than the earlier one nop), and those nops are executed in a loop. The loop repeats `[0x2e1]` times and functionally is the same bitbanging loop with longer periods between clock cycles. This may be relevant for some target devices that have different tolerances in serial programming, and makes sense to keep as a single "send these bits" abstraction.

Back to the question of the two arguments: while one is sent one bit at a time, the other seems to count the number of bits to send:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'd 2 @ 0x50c8' 'q' | aha --no-header --stylesheet
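Pulling the pieces of `0x5066` together, a rough Python model of the routine might look like the following. It assumes the value is shifted right after each bit so the least significant bit goes out first, and it reduces the slow path to a `delay` count of extra samples; neither detail is confirmed by the listings, and the function names are made up.

```python
def tx_bits_model(nbits, value, delay=0):
    """Return the (data, clock) pin states produced while clocking out nbits of value."""
    trace = []
    for _ in range(nbits):
        data = value & 1                      # bit 0 picks bcf or bsf on LATA bit 2
        trace.append((data, 0))               # data pin set while the clock is low
        trace.extend([(data, 0)] * delay)     # stand-in for the slow path's nop loop
        trace.append((data, 1))               # bsf LATA, 3
        trace.extend([(data, 1)] * delay)
        trace.append((data, 0))               # bcf LATA, 3
        value >>= 1                           # assumption: LSB first
    return trace


def clocked_bits(trace):
    """Collect one data bit per clock pulse, taken at each rising edge."""
    bits = []
    prev_clock = 0
    for data, clock in trace:
        if clock == 1 and prev_clock == 0:
            bits.append(data)
        prev_clock = clock
    return bits


# The four calls made by SCMD_NOP24: one 4-bit send, then three 8-bit sends.
nop24 = tx_bits_model(4, 0) + tx_bits_model(8, 0) * 3
print(len(clocked_bits(nop24)), "bits clocked out")
```

The `delay` parameter changes how long each level is held but not which bits are seen, which matches the reading of the two loops as the same transfer at different speeds.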
The function at `0x5066` is noted down as `tx_bits` for later reference. So `(4, 0)` sends the bits `0000`, while `(8, 0)` sends `00000000`. Taken together, `SCMD_NOP24` sends `0000b` followed by 24 zero bits (`000000h`). Inferring from the PIC18 ICSP guide, a leading `0000` prefix should indicate the following word is an instruction, yielding the expected `NOP` on the target device.

### `SCMD_COREINST24`

Moving on to `SCMD_COREINST24`, which functionally should send three user-specified bytes to the target...
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 2 @ 0x41b2' 'q' | aha --no-header --stylesheet
As expected, this calls the same transmit function, and passes 4 and 8 in the same places. The major difference is how the second parameter is provided, where this version involves a bit of an incantation:
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 14 @ 0x41c6' 'q' | aha --no-header --stylesheet
Since the three bytes for a PIC24 instruction are part of the script (in fact, the three bytes after the `SCMD_COREINST24`), it stands to reason that the value passed is read out of the script buffer. Tugging this thread should lead to the script buffer and eventually where it's selected. Interestingly, the logic appears to be:

* read the current script buffer offset (through `INDF2` at `0x41c6`)
* increment the offset (`0x41c8`)
* load the pointer to the script array (`0x41cc`-`0x41d4`) into `FSR0` - this comes from an argument!
* add the offset into the pointer (`0x41d8`)
* read that address (`0x41e0`)
* "push" the parameter (`0x41e2`)

This confirms that we can track `INDF2` modification to determine the number of parameters for the various commands!

The first unknown script commands of interest: `SCMD_WRITE_BUFBYTE_W` and `SCMD_WRITE_BUFWORD_W`

### `SCMD_WRITE_BUFWORD_W`

```
TODO: SCMD_WRITE_BUFWORD_W
```

This handler involves the same transmit function from earlier and also another function, `0x5006`. Guessing from context, it appears to read from a global address and return the byte in W. `SCMD_WRITE_BUFWORD_W` then uses that as bits to send down the wire.

### `SCMD_WRITE_BUFBYTE_W`

`SCMD_WRITE_BUFBYTE_W` does the same, but only reads one byte; the second value transmitted is a 0.

### Miscellaneous Script Commands

`SCMD_WRITE_BITS_LITERAL` takes two parameters, the number of bits and the pattern to send.

`SCMD_WRITE_BYTE_LITERAL` takes one parameter, the pattern, and assumes the number of bits to send is 8.

TODO: `SCMD_RD2_BYTE_BUFFER`.

TODO: `SCMD_READ_BYTE`.

TODO: `SCMD_VISI24`. (transmits (4, 1), (8, 0), then some unknown calls.)

### `SCMD_DELAY_LONG`

Passes the single byte parameter along to the function at `0x4966`, which...
#eval python pk2cmd-stuff/pydare.py 'o pk2cmd-stuff/PK2V023200.hex' 'arch pic18' 'define-function "tx_bits" @ 0x5066' 'd 11 @ 0x4966' 'q' | aha --no-header --stylesheet
At `0x496e` the flag indicating `timer0` has fired is cleared (since it may have fired in the past), followed by multiplying the argument by `0xff` and loading the low byte of the result into `tmr0h`. Since `0xff` is congruent to -1 modulo 256, the low byte of that product is just the negated argument, so this really has no effect other than to negate it before setting the timer. From there, the timer is enabled and the controller spins until the timer fires (`0x497e`). `delay`. Dunno what else to have expected.

### `SCMD_LOOP`

TODO, eventually.