Kotcrab.com

Assembly Programming in Kotlin

I needed a way to patch game executables. More specifically, PSP game executables.

The only alternative I knew of at the time was to use the assembler built into PPSSPP (a PSP emulator), copy the hex value of each new instruction, and apply it to the target executable by hand while keeping track of what I changed in some other file. The problems with this approach are obvious, which led me to think: why not write my own simple assembler?

Quick research into MIPS showed that writing one would be easy. It’s a RISC architecture with a fixed instruction size and good documentation online. The instruction set is rather small, and there are only 3 types of instruction to implement: R, I and J (ignoring FPU instructions, which I added much later).
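To give a feel for how little there is to encode, here is a minimal sketch of the three instruction types as plain Kotlin functions. The function names are my own for illustration and are not taken from kmips; the field layouts and the opcodes used in the example follow the standard MIPS encoding.

```kotlin
// Field layouts of the three classic MIPS instruction types (32 bits each).

// R-type: opcode is 0, the operation is selected by the funct field.
fun encodeR(rs: Int, rt: Int, rd: Int, shamt: Int, funct: Int): Int =
    (rs shl 21) or (rt shl 16) or (rd shl 11) or (shamt shl 6) or funct

// I-type: two registers and a 16-bit immediate.
fun encodeI(opcode: Int, rs: Int, rt: Int, imm: Int): Int =
    (opcode shl 26) or (rs shl 21) or (rt shl 16) or (imm and 0xFFFF)

// J-type: the low 26 bits hold the target address divided by 4.
fun encodeJ(opcode: Int, address: Int): Int =
    (opcode shl 26) or ((address ushr 2) and 0x3FFFFFF)

fun main() {
    val v0 = 2; val a0 = 4; val a1 = 5; val t1 = 9   // standard register numbers
    println("%08X".format(encodeR(a0, a1, v0, 0, 0x21)))   // addu $v0, $a0, $a1 -> 00851021
    println("%08X".format(encodeI(0x09, t1, t1, 1)))       // addiu $t1, $t1, 1  -> 25290001
    println("%08X".format(encodeJ(0x03, 0x0898CF24)))      // jal 0x0898CF24     -> 0E2633C9
}
```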

Introducing kmips

kmips is a MIPS assembler that is invoked directly from Kotlin code. It doesn’t parse an external file with asm code. Instead, it’s a Kotlin DSL: each asm instruction you write is a call to a standard Kotlin method.
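The general pattern behind such a DSL is a lambda with receiver. The sketch below is a hypothetical miniature (MiniAssembler, MiniReg and miniAssemble are made-up names, not the kmips source) that supports just two instructions, to show how method calls inside a block can turn into machine words.

```kotlin
// Hypothetical miniature assembler, NOT the kmips source. It only illustrates the
// "lambda with receiver" pattern that lets instructions look like plain calls.
enum class MiniReg(val id: Int) { zero(0), t1(9), t2(10) }

class MiniAssembler {
    val words = mutableListOf<Int>()

    // addiu rt, rs, imm (I-type, opcode 0x09)
    fun addiu(rt: MiniReg, rs: MiniReg, imm: Int) {
        words += (0x09 shl 26) or (rs.id shl 21) or (rt.id shl 16) or (imm and 0xFFFF)
    }

    // nop is the all-zero word
    fun nop() {
        words += 0
    }
}

// The block runs with a MiniAssembler as `this`, so addiu(...) needs no qualifier.
fun miniAssemble(block: MiniAssembler.() -> Unit): List<Int> =
    MiniAssembler().apply(block).words

fun main() {
    val code = miniAssemble {
        addiu(MiniReg.t1, MiniReg.t1, 1)
        nop()
    }
    code.forEach { println("%08X".format(it)) }
}
```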

The whole assembler, which implements the MIPS II instruction set including FPU instructions, is just a single file. One other file holds unit tests for each instruction; they verify correctness against the PPSSPP assembler.

The code is available on GitHub. kmips can be included in other projects from the Maven Central repository:

compile "com.kotcrab.kmips:kmips:1.2"

Example

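This code writes 32 bytes to memory starting at 0x08804100. As written, it stores the low byte of the loop counter, so the bytes end up holding the values 0 through 31; storing the MIPS $zero register instead would clear the memory.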

import kmips.Label
import kmips.Reg.*
import kmips.assemble
//...
val asm = assemble(startPc = 0x8804000) {
    val loop = Label()
    val target = 0x08804100
    val bytes = 32

    la(s0, target)     // write target address to s0
    li(t1, 0)          // set up loop counter
    li(t2, bytes)      // how many bytes to write
    label(loop)        // 'loop' label will be placed here
    sb(t1, 0, s0)      // store byte in memory at register s0 with offset 0
    addiu(t1, t1, 1)   // increment loop counter
    addiu(s0, s0, 1)   // increment memory address pointer
    bne(t1, t2, loop)  // jump to `loop` branch if t1 != t2
    nop()              // ignoring branch delay slot for simplicity
}

Branch delay slot

MIPS has an oddity: after encountering a branch or jump instruction, the CPU executes one more instruction before changing the instruction pointer. This is a consequence of how the pipeline is built, and the slot after the branch is called the branch delay slot. kmips doesn’t do anything about it; you are expected to handle it manually.
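To make the delay slot behaviour concrete, here is a toy model of it. This is not a real MIPS emulator and has nothing to do with kmips; the Op types and the execute function are invented for this sketch, and the program counter is just a list index.

```kotlin
// Toy model of branch delay slot behaviour; not a real MIPS emulator.
sealed interface Op
data class AddImm(val reg: String, val value: Int) : Op
data class BranchNe(val a: String, val b: String, val target: Int) : Op
object Nop : Op

fun execute(program: List<Op>, regs: MutableMap<String, Int>) {
    var pc = 0                       // index into program, not a byte address
    var pendingTarget: Int? = null   // set by a taken branch, applied one instruction later
    while (pc < program.size) {
        val jumpAfterThis = pendingTarget
        pendingTarget = null
        when (val op = program[pc]) {
            is AddImm -> regs[op.reg] = (regs[op.reg] ?: 0) + op.value
            is BranchNe -> if (regs[op.a] != regs[op.b]) pendingTarget = op.target
            is Nop -> {}
        }
        // The instruction after a taken branch still runs before the jump happens.
        pc = jumpAfterThis ?: (pc + 1)
    }
}

fun main() {
    val regs = mutableMapOf("t1" to 0, "t2" to 3, "s0" to 0)
    execute(
        listOf(
            AddImm("t1", 1),          // 0: increment loop counter
            BranchNe("t1", "t2", 0),  // 1: loop while t1 != t2
            AddImm("s0", 1),          // 2: delay slot, runs on every pass
        ),
        regs
    )
    println(regs)
}
```

In this run the instruction in the delay slot executes on all three passes, so s0 ends up equal to t1. Putting useful work there instead of a nop saves one instruction per iteration.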

Syntax differences

Naturally the syntax is different; after all, it’s just calling Kotlin methods. What would normally be:

li $a0, 0x42
sw $a0, 0x0($s0)

becomes:

li(a0, 0x42)
sw(a0, 0x0, s0)

It’s actually possible to make Kotlin syntax more natural thanks to extension functions. By adding this:

fun Assembler.sw(rt: Reg, dest: Pair<Int, Reg>) = sw(rt, dest.first, dest.second)
operator fun Int.invoke(reg: Reg): Pair<Int, Reg> {
    return Pair(this, reg)
}

It’s possible to write:

assemble {
    sw(a0, 0x0(s0))
}

That’s an idea to include in a future version. I’m not really keen on writing aliases for every store and load method, though. Maybe that won’t be needed in the future.

FPU instructions would normally look like this:

add.s $f0, $f12, $f13
c.eq.s $f0, $f1

In kmips it would be:

add.s(f0, f12, f13)
c.eq.s(f0, f1)

I managed to keep the dot from the normal syntax by using inner classes: c is an object which has an eq field, and eq is another object with an s method.
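A minimal sketch of that trick could look like the following. MiniFpuAssembler and its members are hypothetical names, registers are plain numbers for brevity, and only two instructions are covered; the real kmips code may be structured differently.

```kotlin
// Hypothetical sketch (not the kmips source): nested inner classes expose
// add.s(...) and c.eq.s(...) while still writing into the outer assembler.
class MiniFpuAssembler {
    val words = mutableListOf<Int>()

    val add = Add()
    val c = C()

    // Shared COP1 layout for single-precision operations: opcode 0x11, fmt 0x10.
    private fun cop1S(ft: Int, fs: Int, fd: Int, funct: Int): Int =
        (0x11 shl 26) or (0x10 shl 21) or (ft shl 16) or (fs shl 11) or (fd shl 6) or funct

    inner class Add {
        // add.s fd, fs, ft (funct 0x00)
        fun s(fd: Int, fs: Int, ft: Int) {
            words += cop1S(ft, fs, fd, 0x00)
        }
    }

    inner class C {
        val eq = Eq()

        inner class Eq {
            // c.eq.s fs, ft (funct 0x32)
            fun s(fs: Int, ft: Int) {
                words += cop1S(ft, fs, 0, 0x32)
            }
        }
    }
}

fun main() {
    val asm = MiniFpuAssembler()
    with(asm) {
        add.s(0, 12, 13)   // add.s $f0, $f12, $f13
        c.eq.s(0, 1)       // c.eq.s $f0, $f1
    }
    asm.words.forEach { println("%08X".format(it)) }
}
```

Because the classes are marked inner, every level keeps a reference to the enclosing assembler, which is what lets the innermost s method append to the shared output list.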

Advantages

Since this is a fully featured programming language, kmips has the nice advantage that everything can be scripted.

For example, how about a helper method that automatically creates a function prologue and epilogue? The prologue is boilerplate code responsible for allocating stack space for the function call (by moving the stack pointer register down) and saving the registers modified by the function. The epilogue restores those registers and moves the stack pointer back up. I can generate both by doing something like this:

assemble {
    val ctx = preserve(arrayOf(s0, s1))
    // ... some function code that modifies s0 and s1 registers
    ctx.restoreAndReturn()
}

See the implementation.
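As a rough, independent illustration of what such a helper has to produce, the sketch below generates the prologue and epilogue as assembly text. SavedContext is a hypothetical class invented for this example and is not the linked implementation; it also ignores any stack alignment the ABI may require.

```kotlin
// Hypothetical helper (names are illustrative): produces the textual prologue and
// epilogue for a function that must preserve the given registers.
class SavedContext(private val regs: List<String>) {
    private val frameSize = regs.size * 4   // one 32-bit stack slot per register

    // Move the stack pointer down, then store each register into its slot.
    fun prologue(): List<String> =
        listOf("addiu \$sp, \$sp, -$frameSize") +
            regs.mapIndexed { i, r -> "sw \$$r, ${i * 4}(\$sp)" }

    // Reload each register, then return; the stack pointer is restored
    // in the delay slot of the jr instruction.
    fun epilogue(): List<String> =
        regs.mapIndexed { i, r -> "lw \$$r, ${i * 4}(\$sp)" } +
            listOf("jr \$ra", "addiu \$sp, \$sp, $frameSize")
}

fun main() {
    val ctx = SavedContext(listOf("s0", "s1"))
    ctx.prologue().forEach(::println)
    println("# ... function body that modifies s0 and s1 ...")
    ctx.epilogue().forEach(::println)
}
```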

The standard MIPS calling convention requires only some registers to be saved by the called function (callee-saved registers, s0-s7). Another class of registers is saved before calling another function (caller-saved registers, t0-t9); it’s up to the caller whether it wants to preserve its temporary registers. But that’s just a convention. There’s nothing special about those registers. When writing patches I pretty much always ignored it: it’s simpler to save all modified registers than to modify the original function even more. In the original function I only place a jump to the new code.

Real life use

Speaking of new code, where to place it? What if you need to inject more code than just simple in-line patches? Modifying the ELF file to allocate more space for code, or finding unused space in the executable, seems possible, but I’ve been using a different approach.

It requires finding a file that always gets loaded into memory in the early stages of game launch. It can be a loading screen texture or an entire archive of files that the game loads first and never unloads (in my case conveniently named PRELOAD.pak). Just append the new code there and you’re good to go.

This is where another kmips strength comes in. For this auxiliary code I can set startPc to the address where the game will load the injected file, and kmips will take care of calculating target addresses. Moving the code to a different spot is easy. Of course, this is done at assembly time, so it wouldn’t be useful on systems that use ASLR or where memory addresses are unpredictable for some other reason. Fortunately, that wasn’t a problem on PSP.
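The sketch below shows one way such address calculation can work; it is a hypothetical TinyAsm class written for this post, not the kmips source. Labels are stored as absolute addresses derived from startPc, and each instruction is encoded only once all labels are known. Assembling the same program at two different start addresses shows that the absolute jump changes while the relative branch does not.

```kotlin
// Hypothetical sketch of address resolution (not the kmips source).
class TinyAsm(private val startPc: Int) {
    private val labels = mutableMapOf<String, Int>()
    private val emitters = mutableListOf<(pc: Int) -> Int>()

    private fun nextPc() = startPc + emitters.size * 4

    // A label is simply the address of the next instruction.
    fun label(name: String) {
        labels[name] = nextPc()
    }

    // j target (opcode 0x02): absolute word address in the low 26 bits
    fun j(name: String) {
        emitters.add { _ -> (0x02 shl 26) or ((labels.getValue(name) ushr 2) and 0x3FFFFFF) }
    }

    // bne rs, rt, target (opcode 0x05): offset in words, relative to the delay slot
    fun bne(rs: Int, rt: Int, name: String) {
        emitters.add { pc ->
            val offset = (labels.getValue(name) - (pc + 4)) shr 2
            (0x05 shl 26) or (rs shl 21) or (rt shl 16) or (offset and 0xFFFF)
        }
    }

    fun nop() {
        emitters.add { _ -> 0 }
    }

    // Encoding is deferred until here, so forward references to labels also resolve.
    fun build(): List<Int> = emitters.mapIndexed { i, emit -> emit(startPc + i * 4) }
}

// The same program assembled at an arbitrary start address.
fun sampleProgram(startPc: Int): List<Int> = TinyAsm(startPc).apply {
    label("loop")
    nop()
    bne(9, 10, "loop")   // bne $t1, $t2, loop
    nop()
    j("loop")
    nop()
}.build()

fun main() {
    for (pc in listOf(0x08804000, 0x08900000)) {
        println(sampleProgram(pc).joinToString(" ") { "%08X".format(it) })
    }
}
```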

PSP executables can be relocatable, but so far I’ve found no evidence of them being loaded at an address other than 0x8804000. I verified that in the PPSSPP emulator code.

kmips is a great help for assembling, but that is all it does. In my project I’m using more of Kotlin’s DSL capabilities to make patches look a bit prettier. This is the syntax I’m using:

// assembling an external patch that will be placed in PRELOAD.pak
private fun createAuxPatch(patchStart: Int) = assembleAsHexString(patchStart) {
    run {
        // allocate section for new variables
        vars.baseAddress = virtualPc
        vars.sCustomText = zeroTerminatedString("Some text")
        //...
        nop()
    }

    run {
        // start of a custom function
        funcs.customFunc = virtualPc
        val ctx = preserve(arrayOf(a0, a1, a2, ra))
        //...
        la(a0, t0)
        la(a1, vars.sCustomText)
        jal(funcs.memcmp) // call to original function from executable
        li(a2, 10)        // executes in the jal delay slot, before memcmp runs
        ctx.restoreAndExit()
    }
}

// assembling patches applied directly to main executable
private fun createEbootPatches() {
    with(ebootPatches) {
        patch("Patch name #1") {
            change(0x0) { li(a0, 1) }
            change(0x4) { li(a0, 1) }
        }
        patch("Patch name #2", active = !debugBuild) {
            change(0x20) { li(t0, 0x10) }
            change(0x30) {
                lw(s0, 0x0, s0)
                jal(funcs.customFunc)
                nop()
            }
        }
    }
}

// class storing address to global variables
class Var {
    var baseAddress: Int = 0
    var sCustomText = 0
    //...
}

// class storing address to function already present in main executable
// as well as to those added by external patch
class Func {
    val memcmp = 0x0898CF24
    var customFunc = 0
    //...
}
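For readers wondering what might sit behind a patch/change syntax like this, here is a minimal sketch under my own assumptions. PatchSet, Patch, Change and applyTo are hypothetical names, not the classes from my project, and to stay self-contained the sketch takes already encoded instruction words instead of an assembler block.

```kotlin
// Hypothetical sketch of a patch/change DSL; names are illustrative only.
class Change(val offset: Int, val words: List<Int>)

class Patch(val name: String, val active: Boolean) {
    val changes = mutableListOf<Change>()

    // Records instruction words to be written at the given file offset.
    fun change(offset: Int, vararg words: Int) {
        changes += Change(offset, words.toList())
    }
}

class PatchSet {
    val patches = mutableListOf<Patch>()

    fun patch(name: String, active: Boolean = true, init: Patch.() -> Unit) {
        patches += Patch(name, active).apply(init)
    }

    // Writes every active change into a copy of the executable, little-endian.
    fun applyTo(original: ByteArray): ByteArray {
        val out = original.copyOf()
        for (p in patches.filter { it.active }) {
            for (c in p.changes) {
                c.words.forEachIndexed { i, w ->
                    for (b in 0 until 4) {
                        out[c.offset + i * 4 + b] = (w ushr (8 * b)).toByte()
                    }
                }
            }
        }
        return out
    }
}

fun main() {
    val patches = PatchSet().apply {
        patch("Patch name #1") { change(0x4, 0x24040001) }                   // addiu $a0, $zero, 1
        patch("Patch name #2", active = false) { change(0x0, 0x24040001) }   // skipped
    }
    val patched = patches.applyTo(ByteArray(8))
    println(patched.joinToString(" ") { "%02X".format(it) })
}
```

Keeping the changes as data until applyTo runs makes it easy to switch individual patches on and off, as the active flag does here.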

Not bad, eh?

I know that there already are assemblers that could do similar things. Still, I like the uniqueness of how this one works. It was a nice learning experience in both low-level assembly and exploring the capabilities of high-level Kotlin. That’s it for now. Maybe in the future I will write some more about reverse engineering and how I’m using Kotlin for it.
