### Who?

• Sebastian Millius
• Software engineer at Google (but I did this in my spare time); malware analysis
• MSc Computer Science ETHZ - focus on theoretical computer science
• Solution received: Tue, 14 Apr 2015 08:48:03 -0700 (PDT)

### How?

I wrote a disassembler & "decompiler" for the VM (decompiler.zip). I manually (using IDA Pro) mapped out the opcodes and extracted the embedded program. The disassembler really just outputs, e.g.:

```0x9:	push(0);

0x12:	push(8);

0x1b:	push(0);

0x1c:	a1 = pop();
a0 = pop();
push(a1 * a0);

0x1d:	a1 = pop();
a0 = pop();
push(a1 + a0);

0x22:	push(r_16);

0x23:	a1 = pop();
a0 = pop();
push(a0 + a1);

0x2c:	push(8);
0x35:	push(0);
```

In order to get closer to the original code, I just did constant propagation. This produces some "pseudo" C, that I manually fixed to get the source code, e.g (challenge 3):

```r_16[0] = (1103356944384 + ((in << 1) | (in >> 63)));
r_16[1] = ((in >> (((7 & (r_16[0] >> 4)) | 1) & 0xff)) | 1021570277);
r_16[2] = ((in >> 1) | (in << 63));
r_16[3] = (191560872 + in);
r_16[0] = (r_16[2] * r_16[2]);
r_16[2] = (r_16[2] + r_16[2]);
out[0] = (((r_16[0] - r_16[1]) << (((15 & ((r_16[2] << (((7 & (((r_16[3] >> 3) | (r_16[3] << 61)) >> 1)) | 1) & 0xff)) >> 1)) | 1) & 0xff)) | ((r_16[0] - r_16[1]) >> ((64 - ((15 & ((r_16[2] << (((7 & (((r_16[3] >> 3) | (r_16[3] << 61)) >> 1)) | 1) & 0xff)) >> 1)) | 1)) & 0xff)));
```

I didn't cleanup the generated codes too much (this looks close enough from what I can see comparing to the examples posted on your website), but did manually transform the jumps into loops - doing some graph/flow analysis should be easily possible, but I didn't implement this for these simple challenges in this set. I attached all the scripts that produce the pseudo C output for all 5 challenges (decompiler_0.py, decompiler_1.py, decompiler_2.py, decompiler_3.py, decompiler_3.py). I wrote the code mostly in one sitting and I didn't clean-up too much (my appologies for that). Maybe I try to improve it, if I find some time looking at the other harder sets.

### Tools used

• IDA Pro - manually map the opcodes
• Python - "decompiler" script
• Debugger/gdb - debugging

### Time spent

• Roughly 8 hours
• 4 hours to the basic "decompiler" script working.
• ~ 2 hours getting it to work for all 5 challenges, clean-up and improvements.
• ~ 2 hours debugging (it's easy to get the parameter ordering in the wrong way), fixing, write-up