So I've slowly been working on converting the emulation core behind Commodore Clicker to hand-written WebAssembly, to see if it helps with performance at all. Writing WASM by hand is something that's been covered excellently by others (I first picked up the concepts from Colin at Scott Logic), but one thing that's been a bugbear recently is an inability to name constants.
The Problem
The Commodore 64 has, as the name implies, 64kB of static RAM that can be written to at any time; however, its processor only has access to a 64kB address space, and somehow needs to fit the BASIC and Kernal DOS ROMs, as well as peripheral access, into that same space. Commodore achieved this by overloading the processor's onboard I/O port so you can "switch in" the ROMs and peripheral space, and switch them out if you need access to all the RAM.
For example, the Kernal ROM maps into memory at $E000
, but only if it's been enabled at the CPU port level by switching on bit 1 (HIRAM
). In the emulator's memory controller, this particular mapping might read as follows in JavaScript.
MMU.js: Kernal area mapping
const CPUPORT = 0x0001; const HIRAM = 2; export const readByte = (addr) => { switch (addr & 0xF000) {// RAM, BASIC, peripheral areas...// Kernal ROMcase 0xE000: case 0xF000: if (memory[CPUPORT] & HIRAM) { return ROM.kernal[addr & 0x1FFF]; } else { return memory[addr]; } break; } };
If we're to translate this to WAT, we'll first convert the switch
statement to a function table:
MMU.wat: Indirect call table
(memory (import "mem") 2) (table 16 anyfunc) (elem (i32.const 0);; RAM, BASIC, peripheral areas ;; ...$read_kernal $read_kernal ) (type $readfunc (func (param i32) (result i32))) (func $read (param $addr i32) (result i32);; table[(addr & 0xF000) >> 12]()(call_indirect (type $readfunc) (get_local $addr) (i32.shr_u (i32.and (get_local $addr) (i32.const 0xF000)) (i32.const 12) ) ) )
Now, this isn't terrible so far. We have two contiguous 64k-value blocks of memory (one for RAM, one for the mapped ROMs), and there are some magic constants in the $read
main handler, but they make sense in the context of needing to extract four bits from the address and using those to index the function table. Where the constants start to make less sense is in $read_kernal
:
MMU.wat: Read from Kernal if it's mapped in
(func $read_kernal (param $addr i32) (result i32) (i32.load8_u (if (result i32) (i32.and (i32.load (i32.const 0x0001)) (i32.const 2)) (then (i32.add (i32.and (get_local $addr) (i32.const 0xFFFF)) (i32.const 0x10000) ) ) (else (i32.and (get_local $addr) (i32.const 0xFFFF)) ) ) ) )
This is more inscrutable, especially if (as was the case for me) this code has been written and then left to marinate for a year before coming back to attempt to read it again. It would be much more readable if the above function could instead use defined constants:
MMU.wat: Read from Kernal, but with constants
(defineCPUPORT(i32.const 0x0001)) (defineHIRAM(i32.const 2)) (defineROM_MEMORY_START(i32. 0x10000)) (func $read_kernal (param $addr i32) (result i32) (i32.load8_u (if (result i32) (i32.and (i32.loadCPUPORT)HIRAM) (then (i32.add (i32.and (get_local $addr) (i32.const 0xFFFF))ROM_MEMORY_START) ) (else (i32.and (get_local $addr) (i32.const 0xFFFF)) ) ) ) )
Handling S-expressions
The above code makes use of a define
keyword that doesn't exist in WASM, so we need something akin to the C preprocessor to parse through the WAT file picking up definitions and replacing their occurrences. Fortunately, WAT was designed to be a format that's quick and easy to handle programmatically (while still being at least halfway usable by human standards): a WebAssembly Text file is one big Lisp-style S-expression, and each element within the file is itself a nested S-expression.
This means we can use S-expression handling libraries to quickly move from the WAT file to an internal representation that can be worked with. One such library is sexpdata by Joshua Boyd, which is an S-expression parser and dumper for Python. If we point our fledgling MMU file at sexpdata.load
, something usable starts to fall out:
mwat.py: Loading the WAT file
import sys from sexpdata import load def main(): filename = sys.argv[1] print(load(open(filename))) if __name__ == "__main__": main()
Initial debug output
[Symbol('module'), [Symbol('define'), Symbol('CPUPORT'), [Symbol('i32.const'), Symbol('0x0001')]], [Symbol('define'), ...
Arrays with Symbol
objects (which are an internal construct to sexpdata
) and other arrays inside. We can work with this by using a recursive function to handle the array representing the whole file: if an array is found inside, it will need to be processed recursively, unless it's an array that holds a define
statement.
Definitions can be made at any level of the WAT file, so any extracted definitions will need to be passed down to deeper levels of recursion, and merged with any definitions that are passed in from higher levels. So that means the list of definitions will need to be a parameter to the preprocessor:
mwat.py: Preprocessor function call
import sys from sexpdata import load, dumps def process(sexp, defines): new_defines = {}# First pass: Find immediate descendent s-exp's which define macros# TODO# Macros defined at this level override any of the same name higher up# NOTE: We don't want to mutate the parent's dict, we want our own copymerged_defines = defines.copy() | new_defines# Second pass: Replace at this level, recurse if deeper levels encountered# TODOreturn [i for i in sexp] def main(): filename = sys.argv[1]# Top-level process, with no existing definitions in the dictprint(dumps(process(load(open(filename)), {}))) if __name__ == "__main__": main()
Detecting and replacing definitions
We come to the crux of the problem: filling in the TODOs above. In the first pass, this is fairly simple: we'd like to find any arrays where the first element is a Symbol('define')
, and pull out the associated definition. Let's take the first line of debug output again.
Initial debug output
[Symbol('module'), [Symbol('define'), Symbol('CPUPORT'), [Symbol('i32.const'), Symbol('0x0001')]], [Symbol('define'), Symbol('HIRAM'), [Symbol('i32.const'), 2]], ...
We see that some elements in the file have been parsed as scalar values, and some as Symbol's. As Python doesn't have a native is-array
, we can't directly find arrays in order to perform the first-element check mentioned above; we can, however, use numpy's isscalar
to detect scalar values, and isinstance
for Symbol's. From here, it's fairly simple to detect and extract definition clauses.
Extracting definitions
from numpy import isscalar from sexpdata import Symbol def is_scalar_or_symbol(i): return isscalar(i) or isinstance(i, Symbol) def process(sexp, defines): new_defines = {} for i in sexp: if not is_scalar_or_symbol(i): if i[0] == Symbol('define'): new_defines[i[1]] = i[2] ...
And once we have the definitions to be used on this level, replacement is fairly simple. There are four types of item that will need action:
- Arrays where the first element is
Symbol('define')
: Exclude; - Arrays where the first element is not
Symbol('define')
: Recurse; - Scalars or plain Symbol's which are defined: Replace;
- Scalars or plain Symbol's which aren't defined: Copy.
List comprehension for replacing definitions
return [ merged_defines[i] if ( is_scalar_or_symbol(i) and merged_defines.get(i) )# Replaceelse i if is_scalar_or_symbol(i)# Copyelse process(i, merged_defines)# Recursefor i in sexp if is_scalar_or_symbol(i) or i[0] != Symbol('define')# Exclude]
Putting everything together, the following GitHub Gist contains the final preprocessor script, as well as a sample input WAT file and the generated output:
https://gist.github.com/Two9A/427985064d360342caaf4f7d5769aeef
Caveats and improvements
The astute observer will already have noticed that circular definitions are not handled at all well by this script: if one define
contains a keyword that's handled by another define
, the final output is dependent on the order of definition. In addition, this replicates the behaviour of C's #define
but it doesn't help with any of the other C-style preprocessor directives, notably #include
; support for these is a matter for future expansion.